Text categorization method based on improved KNN algorithm (基于改进KNN算法的中文文本分类方法)
Cite this article: Wang Aiping, Xu Xiaoyan, Guo Weiwei, Li Fanghua. Text categorization method based on improved KNN algorithm[J]. Microcomputer & its Applications (微型机与应用), 2011, 30(18).
Authors: Wang Aiping (王爱平)  Xu Xiaoyan (徐晓艳)  Guo Weiwei (国玮玮)  Li Fanghua (李仿华)
Affiliation: Ministry of Education Key Laboratory of Intelligent Computing & Signal Processing, Anhui University, Hefei 230039, Anhui, China
Abstract: This paper introduces two classification methods, the central vector algorithm and the KNN algorithm. To address a shortcoming of the KNN method in computing text similarity, it proposes an improved scheme that incorporates the idea of the central vector method. Experiments compare the improved KNN algorithm, the central vector algorithm, and the traditional KNN algorithm on text classification. The results show that the improved KNN algorithm classifies Chinese text better than both the central vector method and the traditional KNN algorithm, which supports the effectiveness and feasibility of the improvement.

Keywords: text categorization  central vector method  KNN  similarity
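
Note: the abstract names the two baseline methods but does not say how the improved scheme combines them. The sketch below is only an illustration under stated assumptions: documents are already term-weight vectors, similarity is cosine similarity, and the function `hybrid_knn_classify` is a hypothetical combination (each neighbor's vote is scaled by the query's similarity to that neighbor's class centroid). It is not the authors' published algorithm.

```python
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two equal-length term-weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def centroids(docs, labels):
    """Mean vector of each class, used as the class prototype."""
    sums, counts = {}, defaultdict(int)
    for vec, lab in zip(docs, labels):
        if lab not in sums:
            sums[lab] = [0.0] * len(vec)
        sums[lab] = [s + v for s, v in zip(sums[lab], vec)]
        counts[lab] += 1
    return {lab: [s / counts[lab] for s in total] for lab, total in sums.items()}

def centroid_classify(query, cents):
    """Central vector method: pick the class whose centroid is most similar."""
    return max(cents, key=lambda lab: cosine(query, cents[lab]))

def nearest_neighbors(query, docs, labels, k):
    """The k training documents most similar to the query."""
    ranked = sorted(zip(docs, labels), key=lambda p: cosine(query, p[0]), reverse=True)
    return ranked[:k]

def knn_classify(query, docs, labels, k=3):
    """Traditional KNN with similarity-weighted votes."""
    votes = defaultdict(float)
    for vec, lab in nearest_neighbors(query, docs, labels, k):
        votes[lab] += cosine(query, vec)
    return max(votes, key=votes.get)

def hybrid_knn_classify(query, docs, labels, k=3):
    """ASSUMPTION: hypothetical blend, not taken from the paper.
    Each neighbor's vote is scaled by query-to-class-centroid similarity."""
    cents = centroids(docs, labels)
    votes = defaultdict(float)
    for vec, lab in nearest_neighbors(query, docs, labels, k):
        votes[lab] += cosine(query, vec) * cosine(query, cents[lab])
    return max(votes, key=votes.get)

# Toy data (made up for illustration): three term weights per document.
docs = [[3, 0, 1], [2, 0, 0], [0, 3, 1], [0, 2, 0]]
labels = ["sports", "sports", "finance", "finance"]
query = [1, 0, 0]
```

Calling `knn_classify(query, docs, labels)`, `centroid_classify(query, centroids(docs, labels))`, and `hybrid_knn_classify(query, docs, labels)` on the toy data lets the three decision rules be compared side by side; real Chinese text would first need word segmentation and term weighting, which are outside this sketch.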
This article is indexed in CNKI, Wanfang Data (万方数据), and other databases.