首页 | 官方网站   微博 | 高级检索  
     

面向超大数据集的SVM近似训练算法
引用本文:曾志强,廖备水,高济.面向超大数据集的SVM近似训练算法[J].计算机科学,2009,36(11):208-212.
作者姓名:曾志强  廖备水  高济
作者单位:1. 厦门理工学院计算机科学与技术系,厦门,361024;浙江大学计算机科学与技术学院,杭州,310027
2. 浙江大学计算机科学与技术学院,杭州,310027
基金项目:国家自然科学基金,福建省青年人才项目,厦门理丁学院引进人才项目 
摘    要:标准SVM学习算法运行所需的时间和空间复杂度分别为O(l~3)和O(l~2),l为训练样本的数量,因此不适用于对超大数据集进行训练.提出一种基于近似解的SVM训练算法:Approximate Vector Machine(AVM).AVM采用增量学习的策略来寻找近似最优分类超平面,并且在迭代过程中采用热启动及抽样技巧来加快训练速度.理论分析表明,该算法的计算复杂度与训练样本的数量无关,因此具有良好的时间与空间扩展性.在超大数据集上的实验结果表明,该算法在极大提高训练速度的同时,仍然保持了原始分类器的泛化性能,并且训练完毕具有较少的支持向量,因此结果分类器具有更快的分类速度.

关 键 词:支持向量机  核函数  增量学习  近似解  核心集
收稿时间:2008/12/24 0:00:00
修稿时间:2009/3/12 0:00:00

Approximate Approach to Train SVM on Very Large Data Sets
ZENG Zhi-qiang,LIAO Bei shui,GAO Ji.Approximate Approach to Train SVM on Very Large Data Sets[J].Computer Science,2009,36(11):208-212.
Authors:ZENG Zhi-qiang  LIAO Bei shui  GAO Ji
Affiliation:(Department of Computer Science and Technology,Xiamen University of Technology,Xiamen 361024,China);(College of Computer Science and Technology,Zhejiang University, Hangzhou 310027,China)
Abstract:Standard Support Vector Machine (SVM) training has O(l~3) time and O(l~2) space complexities,where l is the training set size.It is thus computationally infeasible on very large data sets.A novel SVM training method, Approx-imate Vector Machine (AVM), based on approximate solution was presented to scale up kernel methods on very large data sets.This approach only obtains an approximately optimal hyper plane by incremental learning, and uses probabilis-tic speedup and hot start tricks to accelerate training speed during each iterative stage.Theoretical analysis indicates that AVM has the time and space complexities that are independent of training set size.Experiments on very large data sets show that the proposed method not only preserves the generalization performance of the original SVM classifiers, but outperforms existing scale-up methods in terms of training time and number of support vectors.
Keywords:Support vector machine  Kernel function  Incremental learning  Approximate solution  Core set
本文献已被 万方数据 等数据库收录!
点击此处可从《计算机科学》浏览原始摘要信息
点击此处可从《计算机科学》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号