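Optimization and Application of the C4.5 Decision Tree Algorithm
Cite this article: MIAO Yufei, ZHANG Xiaohong. Optimization and application of the C4.5 decision tree algorithm[J]. Computer Engineering and Applications, 2015, 51(13): 255-258.
Authors: MIAO Yufei, ZHANG Xiaohong
Affiliations: 1. College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China 2. Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
Funding: National Natural Science Foundation of China, General Program (No.51274088); Henan Provincial Department of Education funded project (No.ITE12103); Henan Polytechnic University Doctoral Fund (No.B2012-099); funded project of the Henan Polytechnic University Provincial Key Laboratory of Mine Informatization (No.KY2012-05).
Abstract: C4.5 is currently the most influential decision tree classification algorithm, yet it still has shortcomings. To address the time C4.5 spends discretizing continuous-valued attributes, the algorithm is simplified, based on the boundary theorem of Fayyad and Irani, by using the Gini index in place of information entropy after the continuous attributes are discretized. To address overfitting in decision tree algorithms, the algorithm is improved by adopting the resubstitution estimate, following Occam’s razor. These ideas are applied to financial loan data. The experimental results show that, with accuracy preserved, the improved C4.5 algorithm reduces execution time by 8.74% on average and model complexity by 6.26% on average, which demonstrates the effectiveness of the algorithm.

Keywords: C4.5 algorithm; boundary theorem; Gini index; Occam’s razor; resubstitution estimate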
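
The abstract's first idea, restricting candidate cut points of a continuous attribute to boundary points and scoring them with the Gini index rather than entropy, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`gini`, `boundary_cut_points`, `best_gini_split`), the use of midpoints as cut values, and the binary split are all assumptions made for illustration.

```python
from collections import Counter
from itertools import groupby


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def boundary_cut_points(values, labels):
    """Midpoints between adjacent distinct values whose examples differ in class.

    A gap between two adjacent distinct values is skipped only when both
    values are pure and carry the same class (assumed reading of the
    boundary-point definition).
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    # For each distinct attribute value, collect the set of classes seen there.
    groups = [(v, {lab for _, lab in grp})
              for v, grp in groupby(pairs, key=lambda p: p[0])]
    cuts = []
    for (v1, c1), (v2, c2) in zip(groups, groups[1:]):
        if len(c1) > 1 or len(c2) > 1 or c1 != c2:
            cuts.append((v1 + v2) / 2)
    return cuts


def best_gini_split(values, labels):
    """Return (cut, weighted Gini) for the best binary split at a boundary point."""
    n = len(labels)
    best_cut, best_score = None, float("inf")
    for cut in boundary_cut_points(values, labels):
        left = [lab for v, lab in zip(values, labels) if v <= cut]
        right = [lab for v, lab in zip(values, labels) if v > cut]
        score = len(left) / n * gini(left) + len(right) / n * gini(right)
        if score < best_score:
            best_cut, best_score = cut, score
    return best_cut, best_score


if __name__ == "__main__":
    # Hypothetical toy data: six attribute values with two classes.
    values = [1, 2, 3, 4, 5, 6]
    labels = ["a", "a", "a", "b", "b", "b"]
    print(boundary_cut_points(values, labels))
    print(best_gini_split(values, labels))
```

In the toy data only one of the five gaps between adjacent values separates different classes, so a single candidate is scored instead of five. The Gini index also avoids the logarithms that entropy requires, which is one plausible source of the reported time savings.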

Improvement and application of C4.5 decision tree algorithm
MIAO Yufei,ZHANG Xiaohong.Improvement and application of C4.5 decision tree algorithm[J].Computer Engineering and Applications,2015,51(13):255-258.
Authors:MIAO Yufei  ZHANG Xiaohong
Affiliation:1.College of Computer Science and Technology, Henan Polytechnic University, Jiaozuo, Henan 454000, China 2.Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, Guangdong 518055, China
Abstract:C4.5 is currently the most influential decision tree classification algorithm, but it still has some deficiencies. To reduce the time C4.5 spends discretizing continuous-valued attributes, a simplified algorithm is proposed that, based on the boundary theorem of Fayyad and Irani, uses the Gini index in place of information entropy after the continuous-valued attributes are discretized. To address the overfitting problem in decision tree methods, the algorithm is further improved by applying the resubstitution estimate, following Occam’s razor. Applied to financial loan data, the improved C4.5 algorithm reduces execution time by an average of 8.74% and model complexity by an average of 6.26% while maintaining accuracy, which verifies the effectiveness of the algorithm.
Keywords:C4.5 algorithm  boundary theorem  Gini index  Occam’s razor  resubstitution estimate
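
The abstract does not say how the resubstitution estimate is used to curb overfitting, so the following Python sketch is only one plausible reading, not the authors' procedure. It assumes a CART-style penalized cost: the resubstitution estimate (the error rate on the training data itself) plus a hypothetical per-leaf penalty `alpha`, so that, in the spirit of Occam’s razor, a subtree is collapsed whenever a single leaf is no worse. The `Node` class and all function names are illustrative.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # counts: class label -> number of training samples reaching this node
    counts: dict
    children: list = field(default_factory=list)


def leaf_errors(node):
    """Training samples misclassified if the node predicted its majority class."""
    return sum(node.counts.values()) - max(node.counts.values())


def subtree_stats(node):
    """Return (resubstitution error count, number of leaves) for a subtree."""
    if not node.children:
        return leaf_errors(node), 1
    errors, leaves = 0, 0
    for child in node.children:
        e, l = subtree_stats(child)
        errors += e
        leaves += l
    return errors, leaves


def prune(node, n_total, alpha):
    """Bottom-up pruning with cost = resubstitution error rate + alpha * leaves.

    alpha is an assumed complexity penalty; with alpha = 0 nothing that
    reduces training error is ever pruned.
    """
    for child in node.children:
        prune(child, n_total, alpha)
    if node.children:
        errors, leaves = subtree_stats(node)
        cost_subtree = errors / n_total + alpha * leaves
        cost_leaf = leaf_errors(node) / n_total + alpha
        if cost_leaf <= cost_subtree:
            node.children = []  # collapse the subtree into a single leaf
    return node


if __name__ == "__main__":
    # Hypothetical tree over 10 training samples.
    right = Node({"a": 1, "b": 3}, [Node({"a": 1}), Node({"b": 3})])
    root = Node({"a": 6, "b": 4}, [Node({"a": 5, "b": 1}), right])
    print(subtree_stats(root))
    prune(root, n_total=10, alpha=0.15)
    print(subtree_stats(root))
```

Because the resubstitution error of a subtree can never exceed that of the leaf replacing it, some explicit preference for smaller trees is needed before any pruning occurs; the `alpha` term plays that role here.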