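A Two-Stage Spam Email Filtering Method Based on Naive Bayes and Hierarchical Clustering
Cite this article: LIAO Ming-tao, ZHANG De-yun, LI Jin-ku. A Two-Stage Spam Email Filtering Method Based on Naive Bayes and Hierarchical Clustering[J]. Microelectronics & Computer, 2007, 24(8): 1-3, 7.
Authors: LIAO Ming-tao (廖明涛), ZHANG De-yun (张德运), LI Jin-ku (李金库)
Affiliation: Network Institute, Xi'an Jiaotong University, Xi'an, Shaanxi 710049
Funding: National High Technology Research and Development Program of China (863 Program); National "Torch Program"
Abstract: To reduce the misclassification of legitimate email, a two-stage spam filtering method based on naive Bayes and hierarchical clustering is proposed. The method divides email into three classes: legitimate, unsure, and spam. In the first stage, it exploits the speed and good classification performance of the naive Bayes algorithm to classify email preliminarily. In the second stage, based on the sending characteristics of spam, it uses a hierarchical clustering algorithm to perform similarity comparison. Experiments show that the method significantly improves the precision of spam detection and reduces the misclassification of legitimate email, making it better suited to practical use.

Keywords: naive Bayes; hierarchical clustering; spam email filtering
Article ID: 1000-7180(2007)08-0001-03
Revised manuscript received: 2006-09-23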
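
The first stage described in the abstract can be illustrated with a minimal sketch. The abstract does not give the features, smoothing, or decision threshold the authors used, so everything below is an assumption for illustration: a multinomial naive Bayes model with Laplace smoothing over word tokens, and a hypothetical `legit_threshold` under which a message is accepted as legitimate. Anything else is only marked unsure and left for the second stage.

```python
import math
from collections import Counter

def train_naive_bayes(docs, labels):
    """Count words per class. docs: list of token lists; labels: "spam" or "ham".

    Assumes both classes occur at least once in the training data.
    """
    word_counts = {"spam": Counter(), "ham": Counter()}
    class_counts = Counter(labels)
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
    vocab = set(word_counts["spam"]) | set(word_counts["ham"])
    return word_counts, class_counts, vocab

def spam_posterior(model, tokens):
    """Return P(spam | tokens) under multinomial naive Bayes with Laplace smoothing."""
    word_counts, class_counts, vocab = model
    total_docs = sum(class_counts.values())
    log_scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        score = math.log(class_counts[label] / total_docs)  # log prior
        for tok in tokens:
            if tok not in vocab:
                continue  # words never seen in training carry no evidence
            score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        log_scores[label] = score
    # Normalise in log space to avoid underflow on long messages.
    top = max(log_scores.values())
    exp_scores = {k: math.exp(v - top) for k, v in log_scores.items()}
    return exp_scores["spam"] / (exp_scores["spam"] + exp_scores["ham"])

def first_stage(model, tokens, legit_threshold=0.1):
    """Hypothetical decision rule: accept as legitimate only when the spam
    posterior is very low; otherwise defer to the second stage."""
    return "legitimate" if spam_posterior(model, tokens) < legit_threshold else "unsure"

# Toy training set (made up for illustration only).
train_docs = [
    ["cheap", "pills", "buy"],
    ["buy", "cheap", "now"],
    ["meeting", "agenda", "notes"],
    ["project", "meeting", "tomorrow"],
]
train_labels = ["spam", "spam", "ham", "ham"]
model = train_naive_bayes(train_docs, train_labels)
```

With this toy model, `first_stage(model, ["meeting", "agenda", "project"])` gives `"legitimate"`, while `first_stage(model, ["cheap", "pills", "buy"])` gives `"unsure"`. In this sketch the first stage never labels a message as spam by itself, which is one way to read the stated goal of reducing misclassification of legitimate email.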

A Two-Stage Spam Email Filtering Method Based on Naive Bayes and Hierarchical Clustering
LIAO Ming-tao, ZHANG De-yun, LI Jin-ku. A Two-Stage Spam Email Filtering Method Based on Naive Bayes and Hierarchical Clustering[J]. Microelectronics & Computer, 2007, 24(8): 1-3, 7.
Authors: LIAO Ming-tao, ZHANG De-yun, LI Jin-ku
Abstract: To reduce the misclassification of legitimate emails, a two-stage spam email filtering method based on naive Bayes and hierarchical clustering is proposed. The method classifies emails as Legitimate, Unsure, or Spam. In the first stage, a naive Bayesian classifier labels each email as Legitimate or Unsure. In the second stage, a hierarchical clustering method is used to find similar emails in a pre-collected set of spam emails. Experiments show that the method increases the precision of spam detection and lowers the misclassification of legitimate emails, which makes it more viable in practice.
Keywords: naive Bayes; hierarchical clustering; spam email filtering
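
The second stage, which looks for similar messages in a pre-collected spam set, might be sketched as follows. The abstract does not state the distance measure, linkage criterion, or cut-off, so this sketch assumes Jaccard distance over word sets, single-linkage agglomerative clustering, and a hypothetical `max_distance` cut-off. The function names are illustrative and not taken from the paper.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| over the word sets of two messages."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)

def single_linkage_clusters(items, max_distance):
    """Agglomerative clustering: repeatedly merge the two closest clusters
    until the closest pair is farther apart than max_distance."""
    clusters = [[i] for i in range(len(items))]

    def linkage(c1, c2):
        # Single linkage: distance between the closest members.
        return min(jaccard_distance(items[i], items[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        d, x, y = best
        if d > max_distance:
            break
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]  # y > x, so index x stays valid
    return clusters

def second_stage(unsure_tokens, known_spam, max_distance=0.5):
    """Label an unsure message as spam only if it clusters with known spam."""
    items = list(known_spam) + [unsure_tokens]
    target = len(items) - 1  # index of the message under test
    for cluster in single_linkage_clusters(items, max_distance):
        if target in cluster:
            return "spam" if len(cluster) > 1 else "unsure"
    return "unsure"

# Toy pre-collected spam set (made up for illustration only).
known_spam = [
    ["cheap", "pills", "buy", "now"],
    ["win", "free", "prize", "today"],
]
```

Here `second_stage(["cheap", "pills", "buy", "today"], known_spam)` returns `"spam"` because the message is within distance 0.4 of the first known spam, while `second_stage(["project", "meeting", "tomorrow"], known_spam)` stays `"unsure"`. The quadratic search for the closest pair is kept for clarity; a production system would likely use an optimized library routine instead.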
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.