Enhancement of Mahalanobis–Taguchi System via Rough Sets based Feature Selection
Affiliation:1. Department of Industrial Engineering & Management, IIT Kharagpur, Kharagpur, 721302, India;2. WMG, University of Warwick, Coventry, CV4 7AL, United Kingdom;1. Electrical and Computer Engineering Department, University of Miami, Coral Gables, FL 33146, United States;2. Evelyn F. McKnight Brain Institute, University of Miami, Miller School of Medicine, Miami, FL 33136, United States;1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China;2. State Key Laboratory of Information Security, Institute of Information Engineering, Chinese Academy of Sciences, Beijing 100093, China;3. Department of Computing, The Hong Kong Polytechnic University, Hong Kong;1. Gradiant Research Centre, Vigo, Spain;2. AtlantTIC Research Center for Information and Communication Technologies, Department of Telematics Engineering, University of Vigo, Spain;1. Indian Institute of Technology Banaras Hindu University, Varanasi 221005, Uttar Pradesh, India;2. Malaviya National Institute of Technology, Jaipur 302017, Rajasthan, India
Abstract: This research presents a classification methodology based on Mahalanobis Distance (MD) and association mining using Rough Set Theory (RST). MD is used in the Mahalanobis Taguchi System (MTS) to develop a classification scheme for systems with dichotomous states or categories. In MTS, the important features or variables are selected, with the aim of improving classification accuracy, using Signal-to-Noise (S/N) ratios and Orthogonal Arrays (OAs). This approach has three shortcomings. First, OAs have known limitations in handling a large number of variables. Second, the feature selection process for the MTS classifier includes no penalty for over-fitting, that is, no regularization. Third, there is scope to extend MTS into a classification-cum-causality analysis method by adding comprehensive information about the underlying process that generated the data. This paper proposes to select variables by maximizing the degree-of-dependency between a Subset of System Variables (SSV) and the system classes or categories (R). The degree-of-dependency, which reflects the goodness-of-model and hence the goodness of the SSV, is measured by the conditional probability of the system states given the subset of variables. In addition, a regularization factor equivalent to the L0 norm is introduced into an optimization problem that jointly maximizes the goodness-of-model and the effect of regularization. The dependency between SSVs and R is modeled via the equivalence classes of Rough Set Theory. Two new variants of the MTS classifier are developed, and their classification accuracy is evaluated on test datasets from five case studies. The proposed variants are observed to perform better than existing MTS methods and other classification techniques found in the literature.
Keywords: Data mining; Mahalanobis Taguchi System; Feature Selection; Orthogonal Arrays; Rough Sets; Over-fitting; Regularization; Conditional probability; IF–THEN rules
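
The selection criterion described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the classical rough-set degree-of-dependency (the fraction of samples whose equivalence class under the chosen variables maps to a single system state), an exhaustive search over subsets, and a hypothetical objective `gamma - lam * k` in which the subset size `k` plays the role of the L0 penalty. The function names, the weight `lam`, and the toy data are illustrative only, and the variables are assumed to be already discretized.

```python
from collections import defaultdict
from itertools import combinations


def degree_of_dependency(rows, labels, subset):
    """Classical rough-set dependency of the labels on the variables in `subset`.

    Samples are grouped into equivalence classes by their values on `subset`;
    the result is the fraction of samples that fall in classes containing
    exactly one label (the positive region).
    """
    classes = defaultdict(list)
    for row, label in zip(rows, labels):
        key = tuple(row[i] for i in subset)
        classes[key].append(label)
    consistent = sum(len(ls) for ls in classes.values() if len(set(ls)) == 1)
    return consistent / len(rows)


def best_subset(rows, labels, lam=0.1):
    """Exhaustively search for the subset maximizing gamma - lam * |subset|.

    The term lam * |subset| is an assumed L0-style penalty: it counts the
    selected variables. Exhaustive search is only feasible for few variables.
    """
    n_features = len(rows[0])
    best, best_score = (), float("-inf")
    for k in range(1, n_features + 1):
        for subset in combinations(range(n_features), k):
            gamma = degree_of_dependency(rows, labels, subset)
            score = gamma - lam * k
            if score > best_score:  # strict '>' keeps the smaller subset on ties
                best, best_score = subset, score
    return best, best_score


# Toy data (made up): three discretized variables, two system states.
rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (0, 0, 0), (1, 1, 1)]
labels = ["normal", "normal", "abnormal", "abnormal", "normal", "abnormal"]

subset, score = best_subset(rows, labels, lam=0.1)
print(subset, score)
```

In this toy data the first variable alone separates the two states, so adding further variables cannot raise the dependency and only increases the penalty. The search is exponential in the number of variables, so a practical version would need a greedy or heuristic search instead.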
This article is indexed in ScienceDirect and other databases.