Similar Literature
 19 similar documents found (search time: 156 ms)
1.
Support vector machines (SVMs) are widely used in machine learning and pattern recognition for their good generalization performance. However, when the training set is large, training an SVM incurs a very high time and space cost. On the other hand, the decision function obtained by SVM training depends only on the support vectors, so learning from the support vector set instead of the full training set can shorten training time without degrading the accuracy of the resulting classifier. A hybrid method is used to reduce the training set and select potential support vectors, thereby lowering the time and space complexity of SVM training. Experimental results show that the algorithm greatly accelerates SVM training while essentially preserving the generalization performance of the original classifier.

2.
The support vector machine (SVM) is an effective pattern classification method, but on large datasets its training time grows long and its generalization ability degrades. The time complexity of the core vector machine (CVM) classification algorithm is independent of sample size, yet as the number of support vectors grows, CVM training time increases quickly. To address these problems, a two-stage fast learning algorithm combining CVM and SVM (CCS) is proposed. First, CVM performs a preliminary pass over the samples and, based on the minimum enclosing ball (MEB), screens out potential core vectors to build a new training set of the samples most likely to affect the solution, thereby reducing the sample size; a labeling scheme extracts the new samples quickly. The resulting training set is then trained with a standard SVM. Comparisons with SVM and CVM on six datasets show that CCS preserves classification accuracy while cutting training time by more than 30% on average, making it an effective learning algorithm for large-scale classification.
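A minimal sketch of the two-stage idea, with one loud assumption: since no off-the-shelf CVM/MEB implementation is assumed available, a fast linear SVM stands in for the first-stage screen, keeping only the samples closest to its hyperplane (the region where support vectors live) before the expensive kernel SVM is trained. The 30% retention quantile is an illustrative choice, not the paper's.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC, LinearSVC

X, y = make_classification(n_samples=2000, n_features=10, random_state=0)

# Stage 1 (stand-in for the MEB-based core-vector screening): a cheap linear
# model ranks samples by distance to its hyperplane.
stage1 = LinearSVC(dual=False).fit(X, y)
margin = np.abs(stage1.decision_function(X))

# Keep only the ~30% of samples nearest the boundary (illustrative threshold).
keep = margin < np.quantile(margin, 0.3)

# Stage 2: the kernel SVM is trained on the reduced set only.
stage2 = SVC(kernel="rbf", gamma="scale").fit(X[keep], y[keep])
```

The point of the design is that stage 1 is cheap enough to run over the whole dataset, so the quadratic-programming cost of stage 2 is paid only on the candidate support vectors.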

3.
To selectively forget historical training data while losing as little useful information from the training set as possible, the relationship between the KKT conditions and the sample distribution is analyzed, and from the resulting conclusions the composition of the current training set for incremental learning is derived. To further speed up incremental SVM training, the geometric structure of the training set is used to prune the current training set, and incremental SVM training is performed on the pruned set. This yields a fast incremental SVM learning algorithm based on the KKT conditions and class-boundary hull vectors. Experimental results show that the algorithm speeds up incremental SVM learning while maintaining high classification accuracy.
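A hedged sketch of the KKT-based retention rule described above (the geometric hull-vector pruning step is omitted): after training on the old batch, samples that strictly satisfy the KKT conditions, i.e. lie outside the margin and are correctly classified, contribute nothing to the current solution and can be forgotten; only margin and violating samples are carried into the next increment. The tolerance is an illustrative choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=1500, n_features=8, random_state=1)
X_old, y_old = X[:1000], y[:1000]      # historical batch
X_new, y_new = X[1000:], y[1000:]      # incremental batch

clf = SVC(kernel="rbf", gamma="scale").fit(X_old, y_old)

# y*f(x) > 1 means the sample strictly satisfies KKT with alpha = 0; retain
# only samples on or inside the margin (map labels {0,1} -> {-1,+1} first).
f = clf.decision_function(X_old)
retain = (2 * y_old - 1) * f <= 1 + 1e-6

X_inc = np.vstack([X_old[retain], X_new])
y_inc = np.concatenate([y_old[retain], y_new])
clf_inc = SVC(kernel="rbf", gamma="scale").fit(X_inc, y_inc)
```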

4.
To address the slow learning and classification of the SVM method on large sample sets, a new iterative SVM training algorithm is proposed. The algorithm compresses the training set with k-means clustering and uses the cluster centres as the initial training set, reducing redundancy among samples and speeding up learning. To preserve accuracy, the training set is then updated by adding boundary samples and misclassified samples, and training is iterated until the number of misclassified samples no longer changes. The proposed k-means-based iterative SVM algorithm maintains learning accuracy while shrinking both the training set and the support vector set of the decision function, thereby accelerating learning and classification.
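The iterate-until-stable loop above can be sketched as follows. This is a simplified reading: only misclassified samples are folded back in (the paper also adds boundary samples), and the cluster count, iteration cap, and dataset are illustrative assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, n_features=10, random_state=0)

# Compress each class to its k-means centres to form the initial training set.
parts = []
for c in (0, 1):
    km = KMeans(n_clusters=50, n_init=5, random_state=0).fit(X[y == c])
    parts.append((km.cluster_centers_, np.full(50, c)))
X_tr = np.vstack([p[0] for p in parts])
y_tr = np.concatenate([p[1] for p in parts])

# Train, fold misclassified originals back in, repeat until the error count
# stops changing (a bounded loop stands in for the convergence test).
prev_err = -1
for _ in range(10):
    clf = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)
    wrong = clf.predict(X) != y
    if wrong.sum() == prev_err:
        break
    prev_err = wrong.sum()
    X_tr = np.vstack([X_tr, X[wrong]])
    y_tr = np.concatenate([y_tr, y[wrong]])
```

Because the loop only ever adds the currently misclassified samples, the working set stays far smaller than the original 3000 points.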

5.
To address shortcomings of ensemble learning algorithms, a novel ensemble method, the Maximum Margin Ensemble Algorithm (MMEA), is proposed. Its time and space complexity are both O(N), whereas the standard SVM algorithm has time complexity O(N³) and space complexity O(N²), where N is the number of data samples; the convergence of MMEA is also proved theoretically. MMEA is compared with the popular ensemble methods Bagging-LibSVM, AdaBoost-LibSVM, Bagging-Liblinear and AdaBoost-Liblinear on the extended MIT face dataset. Experimental results show that the proposed MMEA algorithm achieves the best results on multiple metrics.

6.
To address the high computational complexity and slow training of the SVM (support vector machine) algorithm when applied to large-scale network traffic classification, a parallel SVM traffic classification method on a cloud computing platform is proposed to speed up classifier training on large datasets. The method builds a multi-level SVM with a MapReduce model on the cloud platform: the training set is partitioned into several subsets, all subsets are trained in parallel to obtain their support vector sets, and the traffic classification model is then trained from them. Experimental results show that, compared with the conventional SVM method, the parallel SVM traffic classifier maintains high classification accuracy while effectively reducing training time, speeding up large-scale network traffic classification.
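A single-machine sketch of the partition-train-merge scheme, assuming the simplest multi-level variant: the "map" loop below runs sequentially where a real MapReduce deployment would run one task per partition, and the partition count is an illustrative choice.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC

X, y = make_classification(n_samples=4000, n_features=10, random_state=0)

# "Map" step: train a sub-SVM on each partition and collect its support
# vectors (each iteration would be an independent map task on a cluster).
sv_X, sv_y = [], []
for part in np.array_split(np.arange(len(X)), 4):
    sub = SVC(kernel="rbf", gamma="scale").fit(X[part], y[part])
    sv_X.append(X[part][sub.support_])
    sv_y.append(y[part][sub.support_])

# "Reduce" step: the union of sub-model support vectors trains the final model.
X_sv = np.vstack(sv_X)
y_sv = np.concatenate(sv_y)
final = SVC(kernel="rbf", gamma="scale").fit(X_sv, y_sv)
```

The speed-up comes from the sub-problems being quadratic in partition size rather than total size, while the reduce step sees only the (much smaller) support vector union.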

7.
A Semantic SVM for Text Classification and Its Online Learning Algorithm
Exploiting the SVM's high generalization ability even with small training sets, together with the observation that features of same-category texts form clustered distributions in feature space, this paper proposes the semantic SVM: an SVM that uses a set of semantic centres in place of the original training samples as both training samples and support vectors. The paper gives the procedure for generating the semantic centre set, an online learning framework for the semantic SVM (online accumulation of classification knowledge), and an implementation of the online learning algorithm based on SMO. Experimental results suggest that the semantic SVM and its online learning algorithm have great application potential: online learning and classification are an order of magnitude faster than a standard SVM and its simple incremental variant, with some advantage in classification accuracy as well.

8.
To address the huge size of network datasets and the resulting slow learning, an SVM algorithm based on spatial blocks and sample density is proposed and applied to intrusion detection. The algorithm selects training samples according to their local density, reducing the number of samples involved in training and speeding up learning. Experimental results show that the algorithm learns faster than conventional SVM-based intrusion detection while preserving detection accuracy.
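One plausible reading of density-driven sample selection, sketched under explicit assumptions: local density is approximated by the inverse distance to the k-th nearest neighbour (the paper's actual block/density criterion is not specified here), and the lowest-density 40% of samples are retained as the candidates closest to sparse boundary regions. Both the estimator and the quantile are hypothetical choices.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, n_features=10, random_state=0)

# Inverse k-th-neighbour distance as a crude local density estimate
# (assumed stand-in for the paper's spatial-block density measure).
k = 10
dist, _ = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
density = 1.0 / (dist[:, k] + 1e-12)

# Drop the densest 60% of samples; train on the sparse remainder.
keep = density < np.quantile(density, 0.4)
clf = SVC(kernel="rbf", gamma="scale").fit(X[keep], y[keep])
```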

9.
When learning on large-scale training sets, support vector machines suffer from long training times and degraded generalization. The path-following algorithm has time complexity O(nL) and can solve large-scale QP problems in polynomial time. This paper analyzes the main factors affecting the SVM separating hyperplane, uses a path-following interior-point algorithm together with the kernel distance matrix to rapidly reduce the training set, and then retrains the SVM on the reduced set. Experimental results show that the retrained SVM model is simplified and its generalization ability is improved.

10.
A Fast Incremental SVM Learning Algorithm Based on Class-Boundary Shell Vectors
To further speed up incremental SVM training, the current incremental training set is reduced while historical samples carrying important classification information are effectively retained, yielding a fast incremental SVM learning algorithm based on class-boundary shell vectors, which the paper defines. The incremental training set consists of the shell vector set and the newly added samples. In each incremental round, the shell vectors of the current training set are first computed from geometric considerations, the class-boundary shell vectors are then selected by the centre-distance-ratio method, and incremental SVM training is performed. Experiments on both synthetic datasets and the UCI repository demonstrate the effectiveness of the method.

11.
1 Introduction Based on recent advances in statistical learning theory, Support Vector Machines (SVMs) compose a new class of learning systems for pattern classification. Training an SVM amounts to solving a quadratic programming (QP) problem with a dense matrix. Standard QP solvers require full storage of this matrix, and their efficiency relies on its sparseness, which makes their application to SVM training with large training sets intractable. The SVM, pioneered by Vapnik and his te…

12.
Traditional Support Vector Machine (SVM) solutions suffer from O(n²) time complexity, which makes them impractical for very large datasets. To reduce this high computational complexity, several data reduction methods have been proposed in previous studies. However, such methods are not effective at extracting informative patterns. In this paper, a two-stage informative pattern extraction approach is proposed. The first stage of our approach is data cleaning based on bootstrap sampling. A bundle of weak SVM classifiers is constructed on the sampled datasets. Training data correctly classified by all the weak classifiers are cleaned out, as they lack useful information for training. To extract still more informative training data, two informative pattern extraction algorithms are proposed in the second stage. As most training data are eliminated and only the more informative samples remain, the final SVM training time is reduced significantly. Contributions of this paper are three-fold. (1) First, a parallelized bootstrap-sampling-based method is proposed to clean the initial training data, eliminating a large number of training samples that carry little information. (2) Then, we present two algorithms to effectively extract more informative training data. Both algorithms are based on maximum information entropy, computed from the empirical misclassification probability of each sample estimated in the first stage; the further reduction of the training data reduces training time further. (3) Finally, empirical studies on four large datasets show the effectiveness of our approach in reducing the training data size and the computational cost, compared with state-of-the-art algorithms including PEGASOS, LIBLINEAR SVM and RSVM. Meanwhile, the generalization performance of our approach is comparable with the baseline methods.
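The first-stage cleaning step can be sketched directly (the second, entropy-based extraction stage is omitted). Assumptions are labeled in the comments: the weak-classifier count and the use of linear SVMs as the weak learners are illustrative choices, and the bootstrap loop runs sequentially here rather than in parallel.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
X, y = make_classification(n_samples=3000, n_features=10, random_state=0)

# Stage 1: train several weak classifiers on bootstrap samples. A point that
# every weak model already classifies correctly carries little information
# for the final SVM and is cleaned out. (5 weak LinearSVCs are an assumed
# configuration; the paper runs this step in parallel.)
all_correct = np.ones(len(X), dtype=bool)
for _ in range(5):
    idx = rng.integers(0, len(X), size=len(X))        # bootstrap resample
    weak = LinearSVC(dual=False).fit(X[idx], y[idx])
    all_correct &= weak.predict(X) == y

X_kept, y_kept = X[~all_correct], y[~all_correct]     # survivors of cleaning
```

Note that the per-sample disagreement counts gathered in this loop are exactly the empirical misclassification probabilities the paper's second stage feeds into its entropy criterion.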

13.
The support vector machine is one of the most effective classification techniques, with high classification accuracy and good generalization ability, but its training process is still very expensive when applied to large datasets. A classification method based on the one-class support vector machine is therefore proposed: a random selection algorithm reduces the training set to speed up training, while classification accuracy is preserved by restoring, from the original data, the neighbourhoods of the samples lying in the intersection of the hyperspheres. Experiments show that the method reduces computational complexity considerably, speeding up training on large datasets.

14.
The increasing size and dimensionality of real-world datasets make it necessary to design efficient algorithms not only in the training process but also in the prediction phase. In applications such as credit card fraud detection, the classifier needs to predict an event in 10 ms at most. In these environments the speed constraints on prediction heavily outweigh the training costs. We propose a new classification method, called a Hierarchical Linear Support Vector Machine (H-LSVM), based on the construction of an oblique decision tree in which each node split is obtained as a Linear Support Vector Machine. Although other methods have been proposed to break the data space down into subregions to speed up Support Vector Machines, the H-LSVM algorithm represents a very simple and efficient model in training but above all in prediction for large-scale datasets. Only a few hyperplanes need to be evaluated in the prediction step, no kernel computation is required, and the tree structure makes parallelization possible. In experiments with medium and large datasets, the H-LSVM reduces the prediction cost considerably while achieving classification results closer to those of the non-linear SVM than to those of the linear case.
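A toy sketch of the H-LSVM structure: each internal node holds a linear SVM whose hyperplane routes samples left or right, and leaves predict the majority class, so prediction evaluates at most depth-many dot products and no kernels. The depth limit, minimum node size, and recursion scheme are hypothetical simplifications, not the paper's actual construction.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

def build(X, y, depth=3):
    """Recursively grow an oblique tree with a LinearSVC at each split."""
    if depth == 0 or len(np.unique(y)) == 1 or len(y) < 10:
        return ("leaf", np.bincount(y).argmax())
    svm = LinearSVC(dual=False).fit(X, y)
    side = svm.decision_function(X) >= 0
    if side.all() or (~side).all():          # hyperplane failed to split
        return ("leaf", np.bincount(y).argmax())
    return ("node", svm,
            build(X[side], y[side], depth - 1),
            build(X[~side], y[~side], depth - 1))

def predict_one(tree, x):
    """Walk the tree: one hyperplane evaluation per level, no kernels."""
    while tree[0] == "node":
        _, svm, pos, neg = tree
        tree = pos if svm.decision_function(x[None])[0] >= 0 else neg
    return tree[1]

X, y = make_classification(n_samples=1000, n_features=6, random_state=0)
tree = build(X, y)
pred = np.array([predict_one(tree, x) for x in X])
```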

15.
A Parameter Selection Method for Support Vector Regression
闫国华, 朱永生 《计算机工程》2009, 35(14): 218-220
Combining the strengths of four parameter selection methods for support vector regression, a method is proposed that analyzes the training samples and determines the parameters directly. Experiments on standard benchmark datasets show that, compared with the conventional grid search, the method achieves better results in both time and prediction accuracy, alleviating the difficulty and expense of parameter selection that support vector machines face in practical applications.
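For context, the grid-search baseline the paper improves on looks like this: every (C, gamma, epsilon) combination is cross-validated, which is precisely what makes conventional parameter selection slow. The grid values and dataset below are illustrative assumptions, not the paper's experimental setup.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVR

X, y = make_regression(n_samples=300, n_features=5, noise=5.0, random_state=0)

# Exhaustive cross-validated search over an illustrative parameter grid;
# cost grows multiplicatively with the number of values per parameter.
grid = GridSearchCV(
    SVR(kernel="rbf"),
    {"C": [1, 10, 100], "gamma": [0.01, 0.1, 1.0], "epsilon": [0.01, 0.1]},
    cv=3,
)
grid.fit(X, y)
best = grid.best_params_
```

Here 3 × 3 × 2 = 18 candidate models are each fitted 3 times; the paper's method replaces this search with parameters computed directly from training-sample statistics.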

16.
This study evaluates the potential of object-based image analysis in combination with supervised machine learning to identify urban structure type patterns from Landsat Thematic Mapper (TM) images. The main aim is to assess the influence of several critical choices commonly made during the training stage of a learning machine on the classification performance and to give recommendations for classifier-dependent intelligent training. Particular emphasis is given to assess the influence of size and class distribution of the training data, the approach of training data sampling (user-guided or random) and the type of training samples (squares or segments) on the classification performance of a Support Vector Machine (SVM). Different feature selection algorithms are compared and segmentation and classifier parameters are dynamically tuned for the specific image scene, classification task, and training data. The performance of the classifier is measured against a set of reference data sets from manual image interpretation and furthermore compared on the basis of landscape metrics to a very high resolution reference classification derived from light detection and ranging (lidar) measurements. The study highlights the importance of a careful design of the training stage and dynamically tuned classifier parameters, especially when dealing with noisy data and small training data sets. For the given experimental set-up, the study concludes that given optimized feature space and classifier parameters, training an SVM with segment-shaped samples that were sampled in a guided manner and are balanced between the classes provided the best classification results. If square-shaped samples are used, a random sampling provided better results than a guided selection. Equally balanced sample distributions outperformed unbalanced training sets.  相似文献   

17.
SVM-Based Sub-Pixel Image Registration and Super-Resolution Reconstruction
陈浩, 胡暾 《计算机应用》2010, 30(3): 628-631
Super-resolution reconstruction recovers a high-resolution image from a set of low-resolution images of the same scene, and sub-pixel registration between the low-resolution images is an important part of reconstruction algorithms. A sub-pixel registration method based on support vector machines is proposed: the relative rotation and translation parameters between low-resolution images are treated as the target set of the support vector machine, and support vector regression learns the mapping from image features to these targets, from which the relative motion parameters between images are computed. Experiments show that the proposed algorithm is more accurate than existing ones.

18.
Because of the huge volume of image data, applying a standard support vector machine to image segmentation incurs a high training time complexity. A ball vector machine is therefore used to segment images, reducing the time consumed in training. Experiments show that, both with and without noise, the segmentation quality and noise robustness of the ball vector machine are essentially the same as those of the standard support vector machine, while its training time is significantly shorter. Using a ball vector machine for image segmentation can thus significantly improve overall segmentation performance.

19.
To address the large memory footprint and slow training of support vector machines on large sample sets, an improved SVM algorithm is proposed: a KNN step first finds the possible support vectors, and the SVM is then trained on this candidate set to obtain the classifier. Experiments show that the improved algorithm trains markedly faster.
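The KNN pre-selection step can be sketched as follows, under one assumed criterion: a sample whose k nearest neighbours are not all of its own class lies near the class boundary and is a possible support vector, so only those samples reach the SVM. The value of k and the dataset are illustrative.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.neighbors import NearestNeighbors
from sklearn.svm import SVC

X, y = make_classification(n_samples=3000, n_features=10, random_state=0)

# Flag samples with at least one opposite-class point among their k nearest
# neighbours (column 0 of the neighbour list is the sample itself).
k = 10
_, nbr = NearestNeighbors(n_neighbors=k + 1).fit(X).kneighbors(X)
mixed = (y[nbr[:, 1:]] != y[:, None]).any(axis=1)

# Train the SVM only on the boundary candidates.
clf = SVC(kernel="rbf", gamma="scale").fit(X[mixed], y[mixed])
```

Interior samples with homogeneous neighbourhoods never enter the quadratic program, which is where the reported memory and speed gains come from.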


Copyright © 北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号