Research on information system data mining based on decision tree algorithm
Cite this article: LI Ying. Research on information system data mining based on decision tree algorithm[J]. Information Technology, 2022(2).
Author: LI Ying
Affiliation: Economic Management Department, Qingdao Special Servicemen Recuperation Center of PLA Navy, Qingdao 266071, Shandong Province, China
Abstract: To improve the accuracy and efficiency of data mining, this paper proposes an information system data mining method based on the decision tree algorithm. Building on the C4.5 decision tree algorithm's computation of the information gain ratio of each attribute and the information entropy of each attribute value, it proposes an improved C4.5 algorithm based on cosine similarity. If the difference between the information entropies of any two attribute values falls within a threshold, the cosine similarity of those values is computed and the attribute values within the threshold range are merged. The information gain ratio of the merged attribute is then recalculated, which realizes data mining for the information system. Experimental results show that the proposed method achieves a classification accuracy above 95% on different data sets, with high data mining efficiency.

Keywords: decision tree  information system  data mining  information entropy  cosine similarity
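
The merging step described in the abstract can be illustrated with a minimal Python sketch. It is not the author's implementation: the abstract does not say which vectors the cosine similarity is computed on, so the sketch assumes per-value class-count vectors, and the function names and the thresholds `entropy_eps` and `cos_min` are hypothetical. Merging is done greedily in a single pass, which is a simplification.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def entropy(items):
    """Shannon entropy (base 2) of a list of discrete items."""
    total = len(items)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(items).values())


def gain_ratio(values, labels):
    """C4.5 gain ratio of one categorical attribute."""
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    total = len(labels)
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    split_info = entropy(values)  # entropy of the attribute's own values
    if split_info == 0:
        return 0.0  # a single-valued attribute cannot split the data
    return (entropy(labels) - conditional) / split_info


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_similar_values(values, labels, entropy_eps=0.1, cos_min=0.95):
    """Merge attribute values with close entropies and similar class counts.

    Assumption: each attribute value is represented by its class-count
    vector; both thresholds are hypothetical illustration values.
    """
    classes = sorted(set(labels))
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[v].append(y)
    vec = {v: [g.count(c) for c in classes] for v, g in groups.items()}
    ent = {v: entropy(g) for v, g in groups.items()}
    mapping = {v: v for v in groups}
    # Greedy pass: a later value is relabelled to an earlier one it matches.
    for a, b in combinations(sorted(groups), 2):
        if abs(ent[a] - ent[b]) <= entropy_eps and cosine(vec[a], vec[b]) >= cos_min:
            mapping[b] = mapping[a]
    return [mapping[v] for v in values]


# Toy data (invented for illustration): values "a" and "b" behave alike.
values = ["a"] * 4 + ["b"] * 4 + ["c"] * 4
labels = ["yes", "yes", "yes", "no"] * 2 + ["no", "no", "no", "yes"]
merged = merge_similar_values(values, labels)
print(gain_ratio(values, labels), gain_ratio(merged, labels))
```

On this toy data, "a" and "b" have identical class distributions and are merged, while "c" has the same entropy but a dissimilar class-count vector and stays separate. This is why the entropy check alone would not be enough to decide a merge in this sketch.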
This article is indexed in VIP (维普) and other databases.