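基于互信息的文本特征加权方法 (Text feature weighting method based on mutual information)
Cite this article: 樊小超, 张重阳, 邓雄伟. 基于互信息的文本特征加权方法 [J]. 计算机工程与应用 (Computer Engineering and Applications), 2015, 51(13): 145-148.
Authors: 樊小超 (FAN Xiaochao), 张重阳 (ZHANG Chongyang), 邓雄伟 (DENG Xiongwei)
Affiliations: 1. College of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210018; 2. College of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054
Abstract: Feature weighting is an important step in text classification. An examination of traditional feature selection functions shows that mutual information performs particularly well when used for feature weighting. To improve its performance in this role, term frequency information, document frequency information, and a category relevance factor are incorporated, yielding an improved mutual-information-based feature weighting method. Experimental results show that the method achieves better classification performance than traditional feature weighting methods.

Keywords: text classification; feature selection; feature weighting; mutual information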

Text feature weighting method based on mutual information
FAN Xiaochao, ZHANG Chongyang, DENG Xiongwei. Text feature weighting method based on mutual information [J]. Computer Engineering and Applications, 2015, 51(13): 145-148.
Authors: FAN Xiaochao, ZHANG Chongyang, DENG Xiongwei
Affiliation: 1. College of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210018, China; 2. College of Computer Science and Technology, Xinjiang Normal University, Urumqi 830054, China
Abstract: Feature weighting is an important step in text categorization. By examining traditional feature selection functions, the paper finds that mutual information performs particularly well when used for feature weighting. To improve the performance of mutual information in this role, the paper adds term frequency information, document frequency information, and a category correlation factor, and proposes an improved feature weighting method based on mutual information. Experiments show that this method achieves better classification performance than traditional feature weighting methods.
Keywords: text categorization; feature selection; feature weighting; mutual information
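
For readers unfamiliar with the baseline the abstract refers to, the following is a minimal sketch of the classical mutual information score between a term t and a category c, MI(t, c) = log(P(t, c) / (P(t) P(c))), estimated from document counts. It is not the improved method proposed in the paper, whose formula is not given on this page. The function name `mi_weights` and the toy corpus are hypothetical, chosen only for illustration.

```python
import math
from collections import Counter

def mi_weights(docs, labels):
    """Return {(term, category): MI score} estimated from document counts.

    Illustrative baseline only; hypothetical helper, not the paper's method.
    """
    n = len(docs)
    cat_count = Counter(labels)   # number of documents per category
    term_count = Counter()        # number of documents containing each term
    joint_count = Counter()       # documents of category c that contain term t
    for tokens, label in zip(docs, labels):
        for term in set(tokens):  # count each term once per document
            term_count[term] += 1
            joint_count[(term, label)] += 1
    weights = {}
    for (term, cat), n_tc in joint_count.items():
        # MI(t, c) = log(P(t, c) / (P(t) * P(c))) = log(n_tc * n / (n_t * n_c))
        weights[(term, cat)] = math.log(
            n_tc * n / (term_count[term] * cat_count[cat])
        )
    return weights

# Toy corpus (invented for this example): four tokenized documents, two categories.
docs = [
    ["goal", "match", "team"],
    ["team", "coach"],
    ["stock", "market"],
    ["market", "team"],
]
labels = ["sports", "sports", "finance", "finance"]

w = mi_weights(docs, labels)
for key in sorted(w):
    print(key, round(w[key], 3))
```

In this toy corpus, "goal" appears in a single sports document yet scores as high for its category as "market", which appears in both finance documents. This tendency of the plain score to favor rare terms is a plausible reason for adding the term frequency and document frequency information that the abstract mentions.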
This article is indexed in Wanfang Data (万方数据) and other databases.