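中文垃圾邮件的索引分词法的研究与设计 (Research and Design of an Index-Based Word Segmentation Method for Chinese Spam)
Cite this article: 强永妍 (QIANG Yong-yan), 杨庚 (YANG Geng). 中文垃圾邮件的索引分词法的研究与设计[J]. 计算机应用 (Journal of Computer Applications), 2007, 27(9): 2334-2336.
Authors: QIANG Yong-yan  YANG Geng
Affiliation: School of Computer Science, Nanjing University of Posts and Telecommunications, Nanjing, Jiangsu 210003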
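Abstract (translated from the Chinese): To improve performance in the preprocessing stage of Chinese spam filtering and to speed up word lookup during segmentation, the authors construct a novel index dictionary based on the idea of hash functions and design an index-based Chinese word segmentation method aimed at Chinese spam. Experiments show that the method improves on the efficiency and accuracy of traditional mechanical word segmentation, improves the performance of the mail preprocessing stage, and can be applied widely in the field of Chinese word segmentation.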

Keywords: anti-spam  Chinese word segmentation  hash function
Article ID: 1001-9081(2007)09-2334-03
Received: 2007-03-26
Revised: 2007-03-26

Research and design of Chinese-spam's phrase segmentation based on indexing
QIANG Yong-yan, YANG Geng. Research and design of Chinese-spam's phrase segmentation based on indexing[J]. Journal of Computer Applications, 2007, 27(9): 2334-2336.
Authors:QIANG Yong-yan  YANG Geng
Abstract: To improve preprocessing performance in anti-spam filtering and to look up phrases more efficiently, this paper constructs an indexing dictionary based on a hash algorithm and designs a Chinese phrase segmentation method, built on this dictionary, aimed at filtering Chinese spam. Experimental data show that the method is more efficient and more accurate than traditional mechanical segmentation, that it improves preprocessing performance, and that it can be widely used in the field of Chinese phrase segmentation.
Keywords:anti-spam  Chinese phrase segmentation  hash algorithm
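
The abstract does not describe the layout of the indexing dictionary, so the following is only a minimal sketch of one common way to pair a hash index with mechanical segmentation; it is not the authors' design. It assumes the dictionary is bucketed by the first character of each word and that segmentation uses forward maximum matching. The names `build_index`, `segment`, `SAMPLE_WORDS`, and `INDEX` are hypothetical, and the word list is invented for illustration.

```python
from collections import defaultdict


def build_index(words):
    """Group dictionary words by their first character.

    Python's dict is a hash table, so finding the bucket for a
    character is an average O(1) lookup. (Assumed layout, not the
    paper's actual index structure.)
    """
    index = defaultdict(lambda: {"max_len": 1, "words": set()})
    for word in words:
        if not word:
            continue
        bucket = index[word[0]]
        bucket["words"].add(word)
        bucket["max_len"] = max(bucket["max_len"], len(word))
    return dict(index)


def segment(text, index):
    """Forward maximum matching driven by the first-character index."""
    tokens = []
    i = 0
    while i < len(text):
        bucket = index.get(text[i])
        match = text[i]  # fall back to a single character
        if bucket is not None:
            # Try the longest possible candidate first, then shrink.
            longest = min(bucket["max_len"], len(text) - i)
            for length in range(longest, 1, -1):
                candidate = text[i:i + length]
                if candidate in bucket["words"]:
                    match = candidate
                    break
        tokens.append(match)
        i += len(match)
    return tokens


# Hypothetical mini-dictionary, for illustration only.
SAMPLE_WORDS = ["垃圾", "邮件", "垃圾邮件", "免费", "发票"]
INDEX = build_index(SAMPLE_WORDS)
```

Calling `segment("免费垃圾邮件", INDEX)` consults only the buckets for 免 and 垃, so each position is checked against the words that share its first character rather than against the whole dictionary. Storing the longest word length per bucket bounds the number of candidates tried at each position.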
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.