Microarray gene expression classification with few genes: Criteria to combine attribute selection and classification methods
Authors:Carlos J Alonso-González, Q Isaac Moro-Sancho, Arancha Simon-Hurtado, Ricardo Varela-Arrabal
Affiliation:Intelligent Systems Group (GSI), Departamento de Informática, Escuela Técnica Superior de Ingeniería Informática, Universidad de Valladolid, Campus Miguel Delibes s/n, 47011 Valladolid, Spain
Abstract:Microarray data classification is a task involving high dimensionality and small sample sizes. A common criterion for deciding on the number of selected genes is to maximize accuracy, which risks overfitting and usually selects more genes than are actually needed. We propose relaxing the maximum-accuracy criterion: select the combination of attribute selection and classification algorithms that, using fewer attributes, achieves an accuracy not statistically significantly worse than the best. We also give some advice on choosing a suitable combination of attribute selection and classification algorithms to obtain good accuracy with a low number of gene expressions. We used several well-known attribute selection methods (FCBF, ReliefF and SVM-RFE, plus a Random selection used as a baseline technique) and classification techniques (Naive Bayes, 3-Nearest Neighbor and SVM with a linear kernel), applied to 30 data sets involving different cancer types.
Keywords:Microarray data classification; Feature selection; Machine learning; Efficient classification with few genes
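
The relaxed criterion described in the abstract can be illustrated with a minimal sketch. The abstract does not say which statistical test the authors use, so this example substitutes an exact paired sign-flip permutation test on per-fold accuracies; that choice, the function names, the `alpha` threshold, and the toy numbers are all assumptions made for illustration, not the paper's procedure or results.

```python
from itertools import product
from statistics import mean


def paired_permutation_pvalue(best, other):
    """One-sided exact sign-flip test on paired per-fold accuracies.

    A small p-value suggests `other` is worse than `best`.
    (Assumed stand-in for whatever test the paper actually uses.)
    """
    diffs = [b - o for b, o in zip(best, other)]
    observed = sum(diffs)
    hits = 0
    total = 0
    # Enumerate every sign assignment; feasible for a handful of folds.
    for signs in product((1, -1), repeat=len(diffs)):
        total += 1
        if sum(s * d for s, d in zip(signs, diffs)) >= observed - 1e-12:
            hits += 1
    return hits / total


def select_fewest_genes(results, alpha=0.05):
    """Pick the configuration with the fewest genes among those whose
    accuracy is not significantly worse than the most accurate one.

    `results` maps (selector, classifier, n_genes) -> per-fold accuracies.
    """
    best_key = max(results, key=lambda k: mean(results[k]))
    best_scores = results[best_key]
    candidates = [
        key for key, scores in results.items()
        if key == best_key
        or paired_permutation_pvalue(best_scores, scores) > alpha
    ]
    # Fewest genes first; ties broken by higher mean accuracy.
    return min(candidates, key=lambda k: (k[2], -mean(results[k])))


# Made-up per-fold accuracies, for illustration only (not from the paper).
toy_results = {
    ("SVM-RFE", "SVM-linear", 100): [0.95, 0.96, 0.94, 0.97, 0.95],
    ("FCBF", "Naive Bayes", 10): [0.94, 0.96, 0.93, 0.97, 0.95],
    ("Random", "3-NN", 5): [0.70, 0.72, 0.69, 0.75, 0.71],
}
chosen = select_fewest_genes(toy_results)
print(chosen)
```

With these toy numbers the 10-gene configuration is kept because its small accuracy gap to the 100-gene one is not significant at the assumed level, while the 5-gene random baseline is rejected. Exhaustive sign enumeration grows as 2^n in the number of folds, so a sampled permutation test or a standard paired test would be a more practical choice for larger evaluations.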
This article is indexed in ScienceDirect and other databases.