Data Quality Mining using Genetic Algorithm
Sufal Das, Banani Saha
Pages - 105 - 112     |    Revised - 05-05-2009     |    Published - 18-05-2009
Volume - 3   Issue - 2    |    Publication Date - April 2009
Data Quality, Genetic Algorithms, Association Rule Mining, Multi-objective Optimization
Data quality mining (DQM) is a new and promising data mining approach from both the academic and the business point of view. Data quality is important to organizations, and people use information attributes as a tool for assessing it. The goal of DQM is to employ data mining methods to detect, quantify, explain and correct data quality deficiencies in very large databases. Data quality is crucial for many applications of knowledge discovery in databases (KDD). In this work, we consider four data quality measures: accuracy, comprehensibility, interestingness and completeness. We attempt to develop a multi-objective genetic algorithm (GA) based approach that utilizes the linkage between feature selection and association rule mining. The main motivation for using a GA in the discovery of high-level prediction rules is that GAs perform a global search and cope better with attribute interaction than the greedy rule induction algorithms often used in data mining.
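The abstract names four rule-quality objectives but does not give their formulas. The sketch below shows one way such objectives might be scored for a single association rule and compared by Pareto dominance, as a multi-objective GA would do when ranking candidates. The formulas are assumptions based on definitions commonly used in the GA rule-mining literature, not confirmed by this page; the toy transactions and the names `support_count`, `rule_measures` and `dominates` are hypothetical.

```python
from math import log

# Toy transaction database (hypothetical data, for illustration only).
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support_count(itemset, transactions):
    """Count the transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def rule_measures(antecedent, consequent, transactions):
    """Score the rule antecedent -> consequent on four objectives.

    All four formulas are assumed definitions, not taken from the paper.
    """
    a = frozenset(antecedent)
    c = frozenset(consequent)
    if not c:
        raise ValueError("consequent must be non-empty")

    n = len(transactions)
    sup_a = support_count(a, transactions)
    sup_c = support_count(c, transactions)
    sup_ac = support_count(a | c, transactions)

    # Accuracy: fraction of antecedent matches that also match the consequent.
    accuracy = sup_ac / sup_a if sup_a else 0.0
    # Completeness: fraction of consequent matches covered by the rule.
    completeness = sup_ac / sup_c if sup_c else 0.0
    # Comprehensibility: rules with fewer antecedent items score higher.
    comprehensibility = log(1 + len(c)) / log(1 + len(a | c))
    # Interestingness: penalizes rules that hold in almost every transaction.
    interestingness = accuracy * completeness * (1 - sup_ac / n) if n else 0.0

    return {
        "accuracy": accuracy,
        "completeness": completeness,
        "comprehensibility": comprehensibility,
        "interestingness": interestingness,
    }


def dominates(m1, m2):
    """Pareto dominance: m1 is no worse on all objectives, better on one."""
    return all(m1[k] >= m2[k] for k in m1) and any(m1[k] > m2[k] for k in m1)


if __name__ == "__main__":
    r1 = rule_measures({"bread"}, {"milk"}, TRANSACTIONS)
    r2 = rule_measures({"butter"}, {"bread"}, TRANSACTIONS)
    print(r1)
    print(r2)
    print("r2 dominates r1:", dominates(r2, r1))
```

Keeping the objectives separate and comparing rules by dominance, rather than collapsing them into one weighted score, is what makes the search multi-objective; a full GA would add rule encoding, selection, crossover and mutation on top of this scoring step.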
CITED BY (31)  
1 Mayilvaganan, M., & Geethamani, G. S. A Preliminary Survey on Genetic Algorithm Techniques.
2 Babu, K. R., & Sunitha, K. V. N. (2014). An effective hybrid technique for image enhancement through Genetic Algorithm-morphological operations. International Journal of Signal and Imaging Systems Engineering, 7(2), 83-91.
3 Barai, A. K. Performance Based Association Rule-Mining Technique Using Genetic Algorithm. algorithms, 1, 2.
4 Asha, T., Natarajan, S., & Murthy, K. N. B. (2014). optimization of association rules for tuberculosis using genetic algorithm. international journal of computing, 12(2), 151-159.
5 Abubacker, N. F., Azman, A., Doraisamy, S., Murad, M. A. A., Elmanna, M. E. M., & Saravanan, R. (2014). Correlation-Based Feature Selection for Association Rule Mining in Semantic Annotation of Mammographic Medical Images. In Information Retrieval Technology (pp. 482-493). Springer International Publishing.
6 Zhao, D. (2014). International Journal of "Computing" Research Institute of Intelligent Computer Systems Ternopil National Economic University. Computing, 13(2).
7 Shankar, S., & Purusothaman, T. (2013). A new utility-emphasized analysis for stock trading rules. Intelligent Data Analysis, 17(2), 271-294.
8 Dhanore, G., & Chaturvedi, S. K. (2013). An Optimization of Association Rule Mining using K-Map and Genetic Algorithm for Large Database. International Journal of Computer Applications, 84(17), 26-31.
9 Taleb, A. M., Yahya, A. A., & Taleb, N. M. Parallel Genetic Algorithm Model to Extract Association Rules.
10 Chadokar, S. K., Singh, D., & Singh, A. (2013, July). Optimizing network traffic by generating association rules using hybrid apriori-genetic algorithm. In Wireless and Optical Communications Networks (WOCN), 2013 Tenth International Conference on (pp. 1-5). IEEE.
11 Krishnamoorthy, k. pagerank algorithm and genetic algorithm in web mining.
12 Singh, M., Sharma, S., & Kaur, A. (2013). Performance Analysis of Decision Trees. International Journal of Computer Applications, 71(19), 10-14.
13 Poornamala, K., & Lawrance, R. (2012). A Frequent Pattern Tree Algorithm for Mining Association Rule Using Genetic Algorithm. Data Mining and Knowledge Engineering, 4(7), 357-360.
14 Kumar, K. D., & Kumar, K. P. (2012). Performance Evolutions of All Positive and Negative Association Rules.
15 Nayak, K., Deccaraman, M., & Nayak, V. (2012). Formulating a Mathematical Model for the E-Nose Application through Genetic Algorithm (GA). International Journal of Computer Applications, 51(1), 7-13.
16 Wangjian Ling, Li Wenbin, & Jia Chunsheng. (2012). Based on Data Mining acupuncture and moxibustion literature data warehouse building. Acupuncture study, 37 (1), 67-71.
17 Bharathi, C. R., & Shanthi, V. (2012). An effective system for acute spotting aberration in the speech of abnormal children via artificial neural network and genetic algorithm. American Journal of Applied Sciences, 9(10), 1561.
18 Poornamala, K., & Lawrance, R. a general survey on frequent pattern mining using genetic algorithm.
19 Gupta, R., & Satsangi, C. S. (2012). An efficient range partitioning method for finding frequent patterns from huge database. International Journal of Advanced Computer Research, 2(2), 62-69.
20 Wangjian Ling, Li Wenbin, & Jia Chunsheng. (2011). Design and build a data warehouse literature acupuncture and moxibustion acupuncture magazine world: English, 21 (3), 41-45.
21 Omara, E., El Said, T., & Mousa, M. (2011). Employing Neural Networks for Assessment of Data Quality with Emphasis on Data Completeness. ICGST International Journal on Artificial Intelligence and Machine Learning, AIML, 11(1), 21-28.
22 Dewang, R., & Agarwal, J. (2011). A New Method for Generating All Positive and Negative Association Rules. International Journal on Computer Science and Engineering, 3(4), 1649-1657.
23 Kumar, M. R., & Iyakutti, D. K. (2011). Application of Genetic algorithms for the prioritization of Association Rules. IJCA Special Issue on Artificial Intelligence Techniques-Novel Approaches and Practical Applications, 1-3.
24 Oyelade, O. J., & Oladipupo, O. O. (2010). Knowledge Discovery from Students’ Result Repository: Association Rule Mining Approach.
25 Awad, M., & Occupied, P. (2010). Optimization RBFNNs Parameters using Genetic Algorithms: Applied on Function Approximation Full text.
26 Awad, M. (2010). Optimization RBFNNs parameters using genetic algorithms: applied on function approximation. International Journal of Computer Science and Security (IJCSS), 4(3), 295.
27 Miao, H. (2010). A multi-operator based simulated annealing approach for robot navigation in uncertain environments. International Journal of Computer Science and Security, 4(1), 50-61.
28 Garofalo, M. GPU Computing for Machine Learning Algorithms.
29 Divya, P., & Rajalakshmi, M. S. Classifying Spyware Files Using Data Mining Algorithms and Hexadecimal Representation.
30 Chandra, E., & Nandhini, K. (2010). Knowledge mining from student data. European journal of scientific research, 47(1), 156-163.
31 OO, O. Knowledge Discovery from Students’ Result Repository: Association Rule Mining Approach. International Journal of Computer Science and Security (IJCSS), 149(2), 199.
Mr. Sufal Das
- India
Dr. Banani Saha
Calcutta University - India