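Research on a Web Text Classification Algorithm Based on Lexical Relatedness
Cite this article: Qiu Qianzhi, Liu Zhong. Research on a Web Text Classification Algorithm Based on Lexical Relatedness [J]. 网络安全技术与应用 (Net Security Technologies and Application), 2012(5): 33-34, 40.
Authors: Qiu Qianzhi, Liu Zhong
Affiliation: Guilin University of Technology, Guangxi 541004, China
Abstract: In the feature selection stage, traditional text classification algorithms take a purely statistical view and handle the association between words and categories mechanically: they assume that words are mutually independent and ignore the semantic relationships among feature keywords. This paper proposes a new feature selection method that uses a lexical relatedness measure based on context statistics to compute the relatedness between feature words, and then performs feature selection against a preset relatedness threshold. The method reduces the high-dimensional sparsity of the feature space, effectively reduces noise, and improves classification accuracy and algorithm efficiency.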

Keywords: text classification; feature selection; lexical relatedness

Research on a Web Text Classification Algorithm Based on Lexical Relatedness
Qiu Qianzhi, Liu Zhong. Research on a Web Text Classification Algorithm Based on Lexical Relatedness [J]. Net Security Technologies and Application, 2012(5): 33-34, 40.
Authors: Qiu Qianzhi, Liu Zhong
Affiliation: Guilin University of Technology, Guangxi 541004, China
Abstract: In the feature selection stage, traditional text classification algorithms handle the links between words and categories with purely statistical methods; they assume that words are independent and ignore the semantic relationships between keywords. This paper presents a new feature selection method that uses lexical relatedness based on context statistics: it calculates the lexical relatedness between feature words and sets a relatedness threshold for feature selection. This reduces the high-dimensional sparsity of the feature space, effectively reduces noise, and improves the classification accuracy and the efficiency of the algorithm.
Keywords: Text Categorization; Feature Selection; Lexical Relatedness
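
The abstract does not say how relatedness is computed from context statistics or how the threshold is applied, so the following is only a rough sketch under stated assumptions, not the authors' algorithm. It assumes that the "context" is a whole document, that relatedness is the cosine of two words' document-incidence vectors, and that a candidate word is dropped when it is related at or above the threshold to a word already kept. The function names `relatedness_scores` and `select_features` are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def relatedness_scores(docs):
    """Cosine-style relatedness from document-level co-occurrence counts.

    Assumption: the context of a word is the document it appears in.
    """
    doc_freq = Counter()   # number of documents containing each word
    pair_freq = Counter()  # number of documents containing both words
    for tokens in docs:
        vocab = sorted(set(tokens))
        doc_freq.update(vocab)
        # combinations() on a sorted list yields pairs (a, b) with a < b.
        pair_freq.update(combinations(vocab, 2))
    scores = {
        pair: n / math.sqrt(doc_freq[pair[0]] * doc_freq[pair[1]])
        for pair, n in pair_freq.items()
    }
    return doc_freq, scores


def select_features(docs, threshold=0.8):
    """Greedy selection (an assumed rule): skip words too related to kept ones."""
    doc_freq, scores = relatedness_scores(docs)
    # Most frequent words first; ties broken alphabetically for determinism.
    candidates = sorted(doc_freq, key=lambda w: (-doc_freq[w], w))
    kept = []
    for word in candidates:
        redundant = any(
            scores.get(tuple(sorted((word, k))), 0.0) >= threshold
            for k in kept
        )
        if not redundant:
            kept.append(word)
    return kept


if __name__ == "__main__":
    corpus = [
        ["stock", "market", "price"],
        ["stock", "market", "trade"],
        ["football", "match"],
        ["football", "goal", "match"],
    ]
    print(select_features(corpus, threshold=0.8))
```

In this toy corpus, "stock" always co-occurs with "market" and "match" always co-occurs with "football", so one word of each pair is dropped. This is the kind of dimensionality reduction the abstract describes, though the paper's actual relatedness measure and selection rule may differ.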
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.