
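A Text Classification System Based on a Class-Based Feature Selection Algorithm
Cite this article: JIANG Wei-zhen, TAO Hong-cai. A text classification system based on a class-based feature selection algorithm[J]. Journal of Computer Applications, 2005, 25(11): 2658-2660.
Authors: JIANG Wei-zhen, TAO Hong-cai
Affiliations: 1. School of Information Science and Technology, Jinan University; 2. School of Information Science and Technology, Southwest Jiaotong University
Abstract: Most current index-term selection algorithms are based on term frequency and make no use of the class information in the training samples. To address this, a new class-based feature selection algorithm is proposed. The algorithm determines a word's power to discriminate between documents from the difference in similarity among the documents of a class that results from whether the word is present in a document, and it uses this discriminative power as the word's importance when selecting keywords. Building on this algorithm, an automatic classification system for English text was designed, and the system was tested and its results analyzed.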

Keywords: automatic text classification; feature selection; vector space model; naive Bayes; discriminative power
Article ID: 1001-9081(2005)11-2658-03
Received: 2005-05-24
Revised: 2005-08-08

An automatic text classifier of class-based feature selection algorithm
JIANG Wei-zhen, TAO Hong-cai. An automatic text classifier of class-based feature selection algorithm[J]. Journal of Computer Applications, 2005, 25(11): 2658-2660.
Authors:JIANG Wei-zhen  TAO Hong-cai
Affiliation: 1. School of Information Science and Technology, Jinan University, Guangzhou Guangdong 510084, China; 2. School of Information Science and Technology, Southwest Jiaotong University, Chengdu Sichuan 610031, China
Abstract: Most current feature selection algorithms are based on term frequency and ignore the class information in the training sample set. A new feature selection algorithm based on class information was put forward. The principle of the algorithm is as follows: the discriminative power with which a word distinguishes different documents is determined from the difference in similarity caused by whether or not the word exists in a document. This discriminative power was then taken as the importance for selecting keywords. Based on this algorithm, an automatic classification system for English text was designed, and the system was tested and its results analyzed.
Keywords:automatic text classification  feature selection  VSM model  naive Bayes  discriminative power
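
The abstract states the selection principle only in words, so the following is a minimal sketch of one possible reading, not the authors' published formula. It assumes that a word's discriminative power for a class is the mean pairwise cosine similarity of that class's documents with the word present, minus the same quantity with the word removed. The function names (cosine, mean_pairwise_similarity, discriminative_power, select_features), the raw term-frequency weighting, and the top-k cutoff are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(w * b.get(t, 0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def mean_pairwise_similarity(vectors):
    """Average cosine similarity over all unordered pairs of vectors."""
    n = len(vectors)
    if n < 2:
        return 0.0
    total = sum(
        cosine(vectors[i], vectors[j])
        for i in range(n)
        for j in range(i + 1, n)
    )
    return total / (n * (n - 1) / 2)


def discriminative_power(term, class_docs):
    """Assumed score: within-class similarity with the term minus without it."""
    with_term = [Counter(doc) for doc in class_docs]
    without_term = [Counter(t for t in doc if t != term) for doc in class_docs]
    return (mean_pairwise_similarity(with_term)
            - mean_pairwise_similarity(without_term))


def select_features(labeled_docs, k):
    """Return the k highest-scoring terms for each class label."""
    by_class = {}
    for label, tokens in labeled_docs:
        by_class.setdefault(label, []).append(tokens)
    selected = {}
    for label, docs in by_class.items():
        vocabulary = sorted({t for doc in docs for t in doc})
        ranked = sorted(
            vocabulary,
            key=lambda t: discriminative_power(t, docs),
            reverse=True,
        )
        selected[label] = ranked[:k]
    return selected


# Toy usage with made-up, pre-tokenized documents.
toy_corpus = [
    ("sports", ["ball", "team", "win"]),
    ("sports", ["ball", "team", "lose"]),
    ("sports", ["ball", "score", "the"]),
    ("finance", ["stock", "market", "the"]),
    ("finance", ["stock", "price", "rise"]),
    ("finance", ["stock", "bank", "fall"]),
]
print(select_features(toy_corpus, 1))
```

Under this assumed score, a word shared by every document of a class (such as "ball" in the toy sports class) raises within-class similarity and ranks high, while a word found in only one document lowers it and ranks low. The selected terms could then serve as the dimensions of a vector space model or as the features of a naive Bayes classifier, the two techniques named in the keywords.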
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.