
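Application of a binary-tree-based multiclass SVM algorithm to e-mail filtering
Cite this article:Yi Zhi-an, Liu Yang. Application of a binary-tree-based multiclass SVM algorithm to e-mail filtering[J]. Journal of Computer Applications (计算机应用), 2007, 27(11): 2860-2862.
Authors:Yi Zhi-an (衣治安)  Liu Yang (刘杨)
Affiliation:School of Computer and Information Technology, Daqing Petroleum Institute, Daqing, Heilongjiang 163318, China
Funding:Heilongjiang Province Graduate Innovative Research Project
Abstract:The better-performing multiclass algorithms currently available include the 1-v-r support vector machine (SVM), 1-1-1 SVM and DDAG SVM, but they leave large inseparable regions and take a long time to train. This paper proposes a binary-tree-based multiclass SVM algorithm for classifying and filtering e-mail, which converts the multiclass problem into a series of binary classifications by building a binary tree. The algorithm clusters first and classifies afterwards: it computes the maximum similarity between a test sample and the sub-class centers, together with the separation measure between sub-classes, to construct the optimal separating hyperplane at each decision node. A C-class problem needs only C-1 decision functions, which saves training time. Experiments show that the algorithm achieves high recall and precision.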

Keywords:binary tree  multiclass SVM  e-mail filtering  clustering
Article ID:1001-9081(2007)11-2860-03
Received:2007-05-18
Revised:2007-05-18

Application of multiclass SVM based on binary tree in E-mail filtering
YI Zhi-an, LIU Yang. Application of multiclass SVM based on binary tree in E-mail filtering[J]. Journal of Computer Applications, 2007, 27(11): 2860-2862.
Authors:YI Zhi-an  LIU Yang
Abstract:Several well-performing multiclass algorithms, such as the 1-v-r support vector machine (SVM), 1-1-1 SVM and DDAG SVM, suffer from large inseparable regions and long training times. A new multiclass SVM algorithm based on a binary tree is introduced for E-mail filtering. It converts the multiclass problem into binary classifications by constructing a binary tree. The algorithm clusters first and classifies afterwards: the largest similarity between a testing sample and the sub-category centers and the separation measure between sub-categories are calculated in order to construct the optimal separating hyperplane at each decision node. Only C-1 decision functions are needed for C classes, so training time is saved. The experimental results show that the new algorithm achieves high filtering recall and precision.
Keywords:binary tree  multiclass Support Vector Machine (SVM)  E-mail filtering  clustering
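
The tree structure described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not define its similarity or separation measures, so this sketch assumes Euclidean distance between class centroids and groups classes around the two most distant centroids. It also assumes a linear SVM trained by sub-gradient descent on the hinge loss at each decision node. The 2-D feature vectors, the class names and all function names are hypothetical.

```python
import math

def centroid(points):
    """Mean vector of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]

def split_classes(classes, centers):
    """Assumed grouping rule: seed two groups with the two most distant
    class centroids, then attach every other class to the nearer seed."""
    seed_a, seed_b = max(
        ((a, b) for i, a in enumerate(classes) for b in classes[i + 1:]),
        key=lambda pair: math.dist(centers[pair[0]], centers[pair[1]]),
    )
    left, right = [seed_a], [seed_b]
    for c in classes:
        if c in (seed_a, seed_b):
            continue
        d_a = math.dist(centers[c], centers[seed_a])
        d_b = math.dist(centers[c], centers[seed_b])
        (left if d_a <= d_b else right).append(c)
    return left, right

def train_linear_svm(samples, lr=0.001, lam=0.01, epochs=2000):
    """Linear soft-margin SVM in the primal: sub-gradient descent on the
    L2-regularised hinge loss. samples is a list of (x, y), y in {+1, -1}."""
    w, b = [0.0] * len(samples[0][0]), 0.0
    for _ in range(epochs):
        for x, y in samples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            w = [wi * (1 - lr * lam) for wi in w]      # regularisation shrink
            if margin < 1:                             # hinge loss is active
                w = [wi + lr * y * xi for wi, xi in zip(w, x)]
                b += lr * y
    return w, b

def build_tree(data, classes=None, centers=None):
    """Recursively build the tree; a leaf is a class label, an internal
    node is a dict holding one binary decision function."""
    classes = sorted(data) if classes is None else classes
    centers = centers or {c: centroid(pts) for c, pts in data.items()}
    if len(classes) == 1:
        return classes[0]
    left, right = split_classes(classes, centers)
    samples = [(x, +1) for c in left for x in data[c]]
    samples += [(x, -1) for c in right for x in data[c]]
    w, b = train_linear_svm(samples)
    return {"w": w, "b": b,
            "left": build_tree(data, left, centers),
            "right": build_tree(data, right, centers)}

def predict(node, x):
    """Walk from the root to a leaf, one decision function per level."""
    while isinstance(node, dict):
        score = sum(wi * xi for wi, xi in zip(node["w"], x)) + node["b"]
        node = node["left"] if score >= 0 else node["right"]
    return node

def count_decision_nodes(node):
    """Number of internal nodes, i.e. trained binary classifiers."""
    if not isinstance(node, dict):
        return 0
    return 1 + count_decision_nodes(node["left"]) + count_decision_nodes(node["right"])

# Hypothetical toy data: four e-mail classes, 2-D features for readability.
emails = {
    "work":     [(0.0, 0.0), (0.3, -0.2), (-0.2, 0.3)],
    "personal": [(0.0, 2.0), (0.2, 2.3), (-0.3, 1.8)],
    "ads":      [(6.0, 0.0), (6.3, 0.2), (5.8, -0.3)],
    "phishing": [(6.0, 2.0), (5.7, 2.2), (6.2, 1.7)],
}
tree = build_tree(emails)
print(count_decision_nodes(tree))
print(predict(tree, (5.9, 1.9)))
```

Every internal node splits its class set into two non-empty groups and every leaf holds exactly one class, so a tree over C classes always contains C-1 decision nodes. That is the count given in the abstract, and a test sample is evaluated only by the classifiers on its path from the root to a leaf.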
This article is indexed in VIP (维普), Wanfang Data (万方数据) and other databases.