Similar Documents (20 results)
1.
Sparse kernel SVMs via cutting-plane training
We explore an algorithm for training SVMs with kernels that can represent the learned rule using arbitrary basis vectors, not just the support vectors (SVs) from the training set. This yields two benefits. First, the added flexibility makes it possible to find sparser solutions of good quality, substantially speeding up prediction. Second, the improved sparsity can also make training of kernel SVMs more efficient, especially for high-dimensional and sparse data (e.g. text classification). This has the potential to make training of kernel SVMs tractable for large training sets, where conventional methods scale quadratically due to the linear growth of the number of SVs. In addition to a theoretical analysis of the algorithm, we also present an empirical evaluation.
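To make the flexibility concrete, here is a minimal sketch (my own notation, not the paper's code) of a kernel decision function built from arbitrary basis vectors rather than training-set support vectors:

```python
import numpy as np

def rbf_kernel(x, z, gamma=0.5):
    return np.exp(-gamma * np.sum((x - z) ** 2))

def decision_function(x, basis_vectors, weights, bias, gamma=0.5):
    # f(x) = sum_i w_i * K(b_i, x) + b. Because the basis vectors b_i
    # need not come from the training set, far fewer of them may give a
    # solution of comparable quality, which speeds up prediction.
    return sum(w * rbf_kernel(x, b, gamma)
               for w, b in zip(weights, basis_vectors)) + bias
```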

2.
Antenna design is a complicated and time-consuming procedure. This work explores using support vector machines (SVMs), a statistical learning method based on the structural risk minimization principle with strong generalization capability, as a fast and accurate tool in antenna design. As examples, SVMs are used to design a rectangular patch antenna and a rectangular patch antenna array. Results show that, after appropriate training, SVMs can effectively design antennas with high accuracy. © 2010 Wiley Periodicals, Inc. Int J RF and Microwave CAE, 2011.
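As a hedged illustration of the surrogate-modeling idea (the data and the toy formula below are invented stand-ins, not the paper's antenna models), support vector regression can map design dimensions to a predicted response after training on simulated examples:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
# Hypothetical training set: patch length/width (mm) -> resonant
# frequency (GHz); the toy formula merely stands in for the output of
# a full-wave electromagnetic simulation.
X = rng.uniform(low=[20.0, 25.0], high=[40.0, 50.0], size=(200, 2))
y = 300.0 / (2.0 * X[:, 0]) + rng.normal(0.0, 0.01, size=200)

model = SVR(kernel="rbf", C=10.0, epsilon=0.01).fit(X, y)
print(model.predict([[30.0, 37.5]]))   # fast prediction for a new design
```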

3.
Support vector machines (SVMs) have good accuracy and generalization properties, but they tend to be slow to classify new examples. In contrast to previous work that aims to reduce the time required to fully classify all examples, we present a method that provides the best-possible classification given a specific amount of computational time. We construct two SVMs: a “full” SVM that is optimized for high accuracy, and an approximation SVM (via reduced-set or subset methods) that provides extremely fast, but less accurate, classifications. We apply the approximate SVM to the full data set, estimate the posterior probability that each classification is correct, and then use the full SVM to reclassify items in order of their likelihood of misclassification. Our experimental results show that this method rapidly achieves high accuracy, by selectively devoting resources (reclassification) only where needed. It also provides the first such progressive SVM solution that can be applied to multiclass problems.
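A hedged sketch of the progressive scheme as I read it (function names mine; the absolute decision value stands in for the paper's estimated posterior probability of correctness):

```python
import numpy as np

def progressive_classify(X, fast_svm, full_svm, budget):
    labels = fast_svm.predict(X)
    # Smallest |decision value| = least confident = likeliest to be a
    # misclassification, so those items are reclassified first.
    confidence = np.abs(fast_svm.decision_function(X))
    for i in np.argsort(confidence):       # most doubtful first
        if budget <= 0:
            break                          # anytime: stop when time runs out
        labels[i] = full_svm.predict(X[i:i + 1])[0]
        budget -= 1
    return labels
```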

4.
Support vector machines (SVMs) are among the most popular classification tools and show great potential for under-sampled noisy data (a large number of features and a relatively small number of samples). However, their computational cost is high even at modern sample scales, and performance depends heavily on proper parameter settings. As the data scale increases, improving training speed becomes increasingly challenging; as the dimension (number of features) grows while the sample size remains small, avoiding overfitting becomes a significant challenge. In this study, we propose a two-phase sequential minimal optimization (TSMO) to largely reduce the training cost for large-scale data (tested with datasets of 3186–70,000 samples) and a two-phased-in differential-learning particle swarm optimization (tDPSO) to ensure accuracy for under-sampled data (tested with datasets of 2000–24,481 features). Because the purpose of training SVMs is to identify the support vectors that define a hyperplane, TSMO quickly selects support vector candidates from the entire dataset and then identifies the support vectors among those candidates, largely reducing the computational burden (a 29.4%–65.3% reduction rate). The proposed tDPSO uses topology variation and differential learning to address PSO's premature convergence. Population diversity is maintained through dynamic topology until a ring connection is reached (topology-variation phases); particles then initiate chemo-type simulated-annealing operations, and the global-best particle takes a two-turn diversion in response to stagnation (event-induced phases). The proposed tDPSO-embedded SVMs were tested on several under-sampled noisy cancer datasets and outperformed various methods, even methods that use feature selection to preprocess the data.
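The candidate-then-refine idea behind TSMO can be sketched roughly as follows (a deliberate simplification with invented parameters, not the published algorithm):

```python
import numpy as np
from sklearn.svm import SVC

def two_phase_svm(X, y, subsample=0.2, margin_band=1.5, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), int(subsample * len(X)), replace=False)
    coarse = SVC(kernel="rbf").fit(X[idx], y[idx])   # phase 1: cheap pass
    # Phase 2: keep only points near the coarse margin as support-vector
    # candidates, then train the final machine on that much smaller set.
    candidates = np.abs(coarse.decision_function(X)) < margin_band
    return SVC(kernel="rbf").fit(X[candidates], y[candidates])
```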

5.
Multi-Class Image Classification Based on Active Learning and Semi-Supervised Learning
陈荣, 曹永锋, 孙洪. 《自动化学报》 (Acta Automatica Sinica), 2011, 37(8): 954-962
Most image classification algorithms require a large number of training samples to train the classifier model, yet in practice labeling large sample sets is tedious and time-consuming. For some special imagery, such as synthetic aperture radar (SAR) images, interpreting the content is very difficult, so the number of labeled samples that can be obtained is extremely limited. This paper introduces best-versus-second-best (BvSB) active learning and constrained self-training (CST) into an image classification algorithm based on a support vector machine (SVM) classifier, yielding a new image classification method. BvSB active learning mines the samples most valuable to the current classifier model for manual labeling, while CST semi-supervised learning further exploits the large number of unlabeled samples in the set, so that good classification performance is obtained at a small labeling cost. The new method is compared with random sample selection, entropy-based uncertainty-sampling active learning, and BvSB active learning alone. Experimental results on three optical image sets and one SAR image set show that the new method effectively reduces the number of manually labeled samples required to train the classifier while achieving high accuracy and good robustness.
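For reference, a minimal sketch of the BvSB selection criterion as commonly formulated (my code; the paper's pipeline adds CST self-training on top of this):

```python
import numpy as np

def bvsb_select(proba, n_queries):
    # proba: (n_samples, n_classes) class probabilities, e.g. from an
    # SVM with probability calibration. The smaller the gap between the
    # best and second-best class, the more informative the sample.
    top_two = np.sort(proba, axis=1)[:, -2:]
    margin = top_two[:, 1] - top_two[:, 0]
    return np.argsort(margin)[:n_queries]   # query the smallest margins
```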

6.
Benchmarking Least Squares Support Vector Machine Classifiers
In Support Vector Machines (SVMs), the solution of the classification problem is characterized by a (convex) quadratic programming (QP) problem. In a modified version of SVMs, called Least Squares SVM classifiers (LS-SVMs), a least squares cost function is proposed so as to obtain a linear set of equations in the dual space. While the SVM classifier has a large margin interpretation, the LS-SVM formulation is related in this paper to a ridge regression approach for classification with binary targets and to Fisher's linear discriminant analysis in the feature space. Multiclass categorization problems are represented by a set of binary classifiers using different output coding schemes. While regularization is used to control the effective number of parameters of the LS-SVM classifier, the sparseness property of SVMs is lost due to the choice of the 2-norm. Sparseness can be imposed in a second stage by gradually pruning the support value spectrum and optimizing the hyperparameters during the sparse approximation procedure. In this paper, twenty public domain benchmark datasets are used to evaluate the test set performance of LS-SVM classifiers with linear, polynomial and radial basis function (RBF) kernels. Both the SVM and LS-SVM classifier with RBF kernel in combination with standard cross-validation procedures for hyperparameter selection achieve comparable test set performances. These SVM and LS-SVM performances are consistently very good when compared to a variety of methods described in the literature including decision tree based algorithms, statistical algorithms and instance based learning methods. We show on ten UCI datasets that the LS-SVM sparse approximation procedure can be successfully applied.
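The linear set of equations mentioned above is the standard LS-SVM dual system; a minimal solver sketch (generic textbook formulation, my code):

```python
import numpy as np

def lssvm_fit(K, y, gamma=1.0):
    # Solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]:
    # one linear system replaces the QP of the ordinary SVM.
    n = len(y)
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    b_alpha = np.linalg.solve(A, np.concatenate(([0.0], y)))
    return b_alpha[0], b_alpha[1:]   # bias b and support values alpha
```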

7.
Support vector machines (SVMs) are essentially binary classifiers. To improve their applicability, several methods have been suggested for extending SVMs for multi-classification, including one-versus-one (1-v-1), one-versus-rest (1-v-r) and DAGSVM. In this paper, we first describe how binary classification with SVMs can be interpreted using rough sets. A rough set approach to SVM classification removes the necessity of exact classification and is especially useful when dealing with noisy data. Next, by utilizing the boundary region in rough sets, we suggest two new approaches, extensions of 1-v-r and 1-v-1, to SVM multi-classification that allow for an error rate. We explicitly demonstrate how our extended 1-v-r may shorten the training time of the conventional 1-v-r approach. In addition, we show that our 1-v-1 approach may have reduced storage requirements compared to the conventional 1-v-1 and DAGSVM techniques. Our techniques also provide better semantic interpretations of the classification process. The theoretical conclusions are supported by experimental findings involving a synthetic dataset.
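A hedged illustration of the boundary-region idea only (my simplification; the paper develops it formally within rough set theory): a one-vs-rest SVM that defers low-confidence points to a boundary region rather than forcing an exact class:

```python
import numpy as np

def rough_ovr_predict(decision_values, tau=0.2):
    # decision_values: (n_samples, n_classes) one-vs-rest SVM scores.
    best = np.argmax(decision_values, axis=1).astype(object)
    # Points whose top score is weak fall into the boundary region,
    # trading exact classification for robustness to noisy data.
    best[np.max(decision_values, axis=1) < tau] = "BOUNDARY"
    return best
```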

8.
When classifying imbalanced data, conventional support vector machines (SVMs) recognize minority-class samples poorly. Inspired by the ability of SVM+ to exploit hidden information shared among samples, this paper proposes a multi-task learning SVM+ algorithm for imbalanced classification (MTL-IC-SVM+). Building on SVM+, MTL-IC-SVM+ casts imbalanced-data classification as a multi-task learning problem. To correct the shift of the separating surface, it assigns different misclassification penalty factors to the majority and minority classes, and requires minority-class samples to lie farther from the separating surface than majority-class samples. Experimental results on UCI datasets show that MTL-IC-SVM+ achieves high classification accuracy on imbalanced data.
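The core idea of unequal misclassification penalties can be illustrated with plain cost-sensitive SVC in scikit-learn (a sketch of the underlying mechanism, not MTL-IC-SVM+ itself; the data below are synthetic):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (950, 2)), rng.normal(2, 1, (50, 2))])
y = np.array([0] * 950 + [1] * 50)          # 19:1 class imbalance

# Weight minority-class errors by the imbalance ratio so the separating
# surface is pushed back toward the majority class.
clf = SVC(kernel="rbf", class_weight={0: 1.0, 1: 950 / 50}).fit(X, y)
```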

9.
Adaptive binary tree for fast SVM multiclass classification
Jin, Cheng, Runsheng. 《Neurocomputing》, 2009, 72(13-15): 3370
This paper presents an adaptive binary tree (ABT) to reduce the test-time computational complexity of multiclass support vector machines (SVMs). It achieves fast classification by (1) reducing the number of binary SVMs evaluated per classification, reusing the separating planes of some binary SVMs to discriminate other binary problems, and (2) selecting the binary SVMs with the fewest average number of support vectors (SVs), where the average number of SVs denotes the computational cost of excluding one class. Compared with five well-known methods, experiments on many benchmark data sets demonstrate that our method speeds up the test phase while retaining the high accuracy of SVMs.
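A hedged sketch of per-test class exclusion (my rendering of the general mechanism; the actual ABT additionally reuses separating planes across binary problems and prefers SVMs with few SVs):

```python
def exclude_predict(x, classes, binary_svms):
    # binary_svms[(a, b)]: a binary SVM assumed to return 1 for class a
    # and 0 for class b (a convention of this sketch).
    remaining = list(classes)
    while len(remaining) > 1:
        a, b = remaining[0], remaining[1]
        # Each binary evaluation excludes exactly one candidate class.
        loser = b if binary_svms[(a, b)].predict([x])[0] == 1 else a
        remaining.remove(loser)
    return remaining[0]
```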

10.
Generalized linear models with L1 and L2 regularization are a widely used technique for solving classification, class-probability estimation and regression problems. With the numbers of both features and examples growing rapidly in fields like text mining and clickstream data analysis, parallelization and the use of cluster architectures become important. We present a novel algorithm for fitting regularized generalized linear models in a distributed environment. The algorithm splits data between nodes by features, uses coordinate descent on each node, and merges results globally via line search. A convergence proof is provided. A modification of the algorithm addresses the slow-node problem. For the important particular case of logistic regression, we empirically compare our program with several state-of-the-art approaches that rely on different algorithmic and data-splitting methods. Experiments demonstrate that our approach is scalable and superior when training on large and sparse datasets.
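A single-machine sketch of the feature-split scheme (squared loss for brevity instead of a general GLM; block layout and step grid are my choices):

```python
import numpy as np

def block_cd_step(X, y, w, blocks, lam=0.1):
    # Each block of coordinates stands in for one cluster node running
    # coordinate descent on its own slice of the features.
    proposal = w.copy()
    for block in blocks:
        wb = w.copy()                 # each node starts from the shared model
        for j in block:
            r = y - X @ wb + X[:, j] * wb[j]           # partial residual
            rho = X[:, j] @ r
            # Soft-thresholding: the L1-regularized coordinate update.
            wb[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / (X[:, j] @ X[:, j])
        proposal[block] = wb[block]
    # A line search merges the nodes' proposals into one global update.
    d = proposal - w
    steps = np.array([0.25, 0.5, 1.0])
    losses = [0.5 * np.sum((y - X @ (w + t * d)) ** 2)
              + lam * np.sum(np.abs(w + t * d)) for t in steps]
    return w + steps[int(np.argmin(losses))] * d
```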

11.
Sharma, Archit; Saxena, Siddhartha; Rai, Piyush. 《Machine Learning》, 2019, 108(8-9): 1369-1393

Mixture-of-Experts (MoE) models enable learning highly nonlinear models by combining simple expert models. Each expert handles a small region of the data space, as dictated by the gating network, which generates the (soft) assignment of inputs to the corresponding experts. Despite their flexibility and renewed interest lately, existing MoE constructions pose several difficulties during model training. Crucially, neither of the two popular gating networks used in MoE, the softmax gating network and the hierarchical gating network (the latter used in the hierarchical mixture of experts), has an efficient inference algorithm. The problem is further exacerbated if the experts do not have a conjugate likelihood and lack a natural probabilistic formulation (e.g., logistic regression or large-margin classifiers such as the SVM). To address these issues, we develop novel inference algorithms with closed-form parameter updates, leveraging recent advances in data augmentation techniques. We also present a novel probabilistic framework for MoE, consisting of a range of gating networks with efficient inference made possible through our proposed algorithms. We exploit this framework by using Bayesian linear SVMs (which otherwise have a non-conjugate likelihood) as experts on various classification problems, giving our final model attractive large-margin properties. We show that our models are significantly more efficient to train than other algorithms for MoE while outperforming traditional nonlinear models like kernel SVMs and Gaussian processes on several benchmark datasets.
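For orientation, here is the generic softmax-gated MoE prediction that the paper starts from (a sketch of the standard construction, not the proposed Bayesian inference machinery):

```python
import numpy as np

def moe_predict(x, gate_W, experts):
    # gate_W: (n_experts, d) gating parameters; experts: list of callables,
    # each responsible for a small region of the input space.
    scores = gate_W @ x
    g = np.exp(scores - scores.max())
    g /= g.sum()                       # soft assignment of x to experts
    return sum(g_k * expert(x) for g_k, expert in zip(g, experts))
```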


12.
A parallel mixture of SVMs for very large scale problems
Support vector machines (SVMs) are the state-of-the-art models for many classification problems, but they suffer from the complexity of their training algorithm, which is at least quadratic with respect to the number of examples. Hence, it is hopeless to try to solve real-life problems having more than a few hundred thousand examples with SVMs. This article proposes a new mixture of SVMs that can be easily implemented in parallel and where each SVM is trained on a small subset of the whole data set. Experiments on a large benchmark data set (Forest) yielded significant time improvement (time complexity appears empirically to locally grow linearly with the number of examples). In addition, and surprisingly, a significant improvement in generalization was observed.
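The parallel structure can be sketched as follows (simple voting in place of the paper's trained gater; labels assumed to be ±1):

```python
import numpy as np
from sklearn.svm import SVC

def train_svm_mixture(X, y, n_shards=8, seed=0):
    # Each SVM sees only a small shard, so per-shard training is cheap
    # and all shards can be trained in parallel.
    rng = np.random.default_rng(seed)
    shards = np.array_split(rng.permutation(len(X)), n_shards)
    return [SVC(kernel="rbf").fit(X[s], y[s]) for s in shards]

def mixture_predict(models, X):
    votes = np.stack([m.predict(X) for m in models])
    return np.sign(votes.sum(axis=0))   # majority vote over the experts
```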

13.
We propose a hybrid SVM-based decision tree to speed up SVMs in the testing phase for binary classification tasks. While most existing methods for this task aim at reducing the number of support vectors, we focus on reducing the number of test datapoints that need the SVM’s help to be classified. The central idea is to approximate the decision boundary of the SVM using decision trees. The resulting tree is a hybrid in the sense that it has both univariate and multivariate (SVM) nodes. The hybrid tree calls on the SVM only to classify crucial datapoints lying near the decision boundary; the remaining, less crucial datapoints are classified by fast univariate nodes. The classification accuracy of the hybrid tree is guaranteed by tuning a threshold parameter. Extensive computational comparisons on 19 publicly available datasets indicate that the proposed method achieves significant speedup over SVMs without compromising classification accuracy.
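A hedged sketch of the delegation logic (my simplification using leaf purity as the confidence signal; the paper embeds SVM nodes directly in the tree rather than post-filtering):

```python
import numpy as np

def hybrid_predict(X, tree, svm, threshold=0.9):
    proba = tree.predict_proba(X)
    labels = tree.classes_[np.argmax(proba, axis=1)]
    # Low-purity leaves approximate the region near the SVM decision
    # boundary; only those crucial points pay for an SVM evaluation.
    crucial = np.max(proba, axis=1) < threshold
    labels[crucial] = svm.predict(X[crucial])
    return labels
```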

14.
In this paper, we present a novel nearest neighbor rule-based implementation of the structural risk minimization principle to address a generic classification problem. We propose a fast reference set thinning algorithm on the training data set similar to a support vector machine (SVM) approach. We then show that the nearest neighbor rule based on the reduced set implements the structural risk minimization principle, in a manner which does not involve selection of a convenient feature space. Simulation results on real data indicate that this method significantly reduces the computational cost of the conventional SVMs, and achieves a nearly comparable test error performance.
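For flavor, here is the classic condensed-nearest-neighbor heuristic, a simple instance of reference-set thinning (not the authors' SVM-like algorithm):

```python
import numpy as np

def condense(X, y):
    keep = [0]                           # seed the reference set
    changed = True
    while changed:
        changed = False
        for i in range(len(X)):
            if i in keep:
                continue
            nearest = np.argmin(np.linalg.norm(X[keep] - X[i], axis=1))
            if y[keep][nearest] != y[i]:
                keep.append(i)           # keep points the thinned set gets wrong
                changed = True
    return np.asarray(keep)
```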

15.
Gaussian processes are powerful modeling tools in machine learning which offer wide applicability for regression and classification tasks due to their non-parametric and non-linear behavior. However, one of their main drawbacks is the training time complexity which scales cubically with the number of examples. Our work addresses this issue by combining Gaussian processes with random decision forests to enable fast learning. An important advantage of our method is its simplicity and the ability to directly control the tradeoff between classification performance and computational speed. Experiments on an indoor place recognition task and on standard machine learning benchmarks show that our method can handle large training sets of up to three million examples in reasonable time while retaining good classification accuracy.
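One plausible, clearly-hedged reading of the combination (my sketch, not the authors' exact scheme): partition the data with a tree and fit an exact GP only on each leaf's small subset, sidestepping the cubic cost of one GP over all examples:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.gaussian_process import GaussianProcessClassifier

def fit_leaf_gps(X, y, max_leaf_nodes=8):
    tree = DecisionTreeClassifier(max_leaf_nodes=max_leaf_nodes).fit(X, y)
    leaves = tree.apply(X)
    models = {}
    for leaf in np.unique(leaves):
        mask = leaves == leaf
        # A GP needs at least two classes present; pure leaves just
        # store their constant label.
        models[leaf] = (GaussianProcessClassifier().fit(X[mask], y[mask])
                        if len(np.unique(y[mask])) > 1 else int(y[mask][0]))
    return tree, models

def predict_leaf_gps(tree, models, X):
    out = []
    for i, leaf in enumerate(tree.apply(X)):
        m = models[leaf]
        out.append(m.predict(X[i:i + 1])[0] if hasattr(m, "predict") else m)
    return np.asarray(out)
```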

16.
Support vector machines (SVMs), initially proposed for two-class classification problems, have been very successful in pattern recognition. For multi-class problems, standard hyperplane-based SVMs are built by constructing and combining several maximal-margin hyperplanes, with each class of data confined to a region bounded by those hyperplanes. Instead of hyperplanes, hyperspheres that tightly enclose the data of each class can be used. Since a class-specific hypersphere is constructed for each class separately, spherical-structured SVMs handle the multi-class classification problem naturally. In addition, the center and radius of a class-specific hypersphere characterize the distribution of examples from that class, and may be useful for dealing with imbalance problems. In this paper, we incorporate the concept of maximal margin into spherical-structured SVMs. The proposed approach also has the advantage of a new parameter for controlling the number of support vectors. Experimental results show that the proposed method performs well on both artificial and benchmark datasets.
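A hedged sketch of hypersphere classification in its crudest form (mean centers and max radii, without the maximal-margin refinement the paper adds):

```python
import numpy as np

def fit_spheres(X, y):
    spheres = {}
    for c in np.unique(y):
        pts = X[y == c]
        center = pts.mean(axis=0)        # crude center estimate
        radius = np.max(np.linalg.norm(pts - center, axis=1))
        spheres[c] = (center, radius)    # one sphere per class
    return spheres

def sphere_predict(x, spheres):
    # Smaller distance-to-radius ratio = deeper inside that class's sphere.
    return min(spheres,
               key=lambda c: np.linalg.norm(x - spheres[c][0]) / spheres[c][1])
```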

17.
In the machine learning literature, deep learning methods have reached new heights by giving due importance to both data representation and classification. The recently developed multilayered arc-cosine kernel brings deep learning features into kernel machines. Although this kernel has been widely used in conjunction with support vector machines (SVMs) on small datasets, it does not seem feasible for modern real-world applications involving very large datasets, and the scalability of deep kernel machines on large datasets remains largely unevaluated. In the machine learning literature, the core vector machine (CVM) serves as a scaling-up mechanism for traditional SVMs: the quadratic programming problem in the SVM is reformulated as an equivalent minimum enclosing ball problem and then solved using a subset of the training samples (the core set) obtained by a faster (1+ε)-approximation algorithm. This paper explores using the principles of core vector machines to scale up deep support vector machines with the arc-cosine kernel. Experiments on different datasets show that the proposed system gives high classification accuracy with reasonable training time compared to traditional core vector machines, deep support vector machines with the arc-cosine kernel, and deep convolutional neural networks.
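For concreteness, the degree-1 arc-cosine kernel referenced above has a closed form (Cho and Saul's formulation; the code is my sketch, and the deep variants compose this kernel layer by layer):

```python
import numpy as np

def arc_cosine_kernel_deg1(x, y):
    # k1(x, y) = (1/pi) * ||x|| * ||y|| * (sin(theta) + (pi - theta) * cos(theta)),
    # where theta is the angle between x and y.
    nx, ny = np.linalg.norm(x), np.linalg.norm(y)
    theta = np.arccos(np.clip(x @ y / (nx * ny), -1.0, 1.0))
    return (nx * ny / np.pi) * (np.sin(theta) + (np.pi - theta) * np.cos(theta))
```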

18.
Data may be afflicted with uncertainty. Uncertain data may be represented by an interval value or, more generally, by a fuzzy set. A number of classification methods have considered uncertainty in the features of samples. Some of these methods are extended versions of the support vector machine (SVM), such as the Interval-SVM (ISVM), Holder-ISVM and Distance-ISVM, which are used to obtain classifiers separating samples whose features are interval values. In this paper, we extend the SVM for robust classification of linearly/non-linearly separable data whose features are fuzzy numbers. The support of such a training sample is a hypercube. Our proposed method seeks a hyperplane (in the input space or in a high-dimensional feature space) such that, for each training sample, the point of its hypercube nearest to the hyperplane is separated with the widest symmetric margin. This strategy can reduce the misclassification probability of our proposed method. Experimental results on six real datasets show that the classification rate of our novel method is better than or equal to that of the well-known SVM, ISVM, Holder-ISVM and Distance-ISVM on all of these datasets.
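The nearest-point idea can be made concrete for interval features (a minimal sketch of the worst-case margin over a hypercube; notation mine):

```python
import numpy as np

def worst_case_margin(w, b, lo, hi, label):
    # For a sample whose j-th feature lies in [lo[j], hi[j]], the corner
    # minimizing the signed margin picks lo[j] where label*w[j] > 0 and
    # hi[j] otherwise -- the hypercube point nearest the wrong side of
    # the hyperplane. Separating this point separates the whole box.
    x_worst = np.where(label * w > 0, lo, hi)
    return label * (w @ x_worst + b)
```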

19.
A Clustering-Based PU Active Text Classification Method
刘露, 彭涛, 左万利, 戴耀康. 《软件学报》 (Journal of Software), 2013, 24(11): 2571-2583
Text classification is one of the key problems in information retrieval. Extracting more reliable negative examples and building accurate, efficient classifiers are the two central issues in PU (positive and unlabeled) text classification. However, many existing extraction methods yield few reliable negatives, and the quality of the resulting classifiers leaves room for improvement. This paper presents a clustering-based semi-supervised active classification method that addresses both steps. Unlike traditional negative-extraction methods, it uses clustering, together with the observation that positive documents should share as few terms as possible with negative documents, to remove as many positives as possible from the unlabeled set, thereby obtaining more reliable negatives. A classifier is then built by combining SVM active learning with an improved Rocchio method, using an improved TFIDF (term frequency inverse document frequency) scheme for feature extraction, which significantly improves accuracy. Classification was evaluated on three datasets (RCV1, Reuters-21578, 20 Newsgroups). Experimental results show that clustering-based extraction obtains more reliable negatives while keeping the error rate low, and that introducing active learning markedly improves classification precision.
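A hedged sketch of the reliable-negative idea in PU learning (my simplification, with invented parameters): cluster the unlabeled documents and treat the clusters least similar to the positive set as reliable negatives:

```python
import numpy as np
from sklearn.cluster import KMeans

def reliable_negatives(U_tfidf, P_tfidf, n_clusters=10, keep=3):
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(U_tfidf)
    p_profile = np.asarray(P_tfidf.mean(axis=0)).ravel()   # mean positive vector
    # Rank clusters by the similarity of their centroid to the positive
    # profile; the least similar clusters share the fewest terms with the
    # positives and are taken as reliable negatives.
    sims = km.cluster_centers_ @ p_profile
    neg_clusters = np.argsort(sims)[:keep]
    return np.flatnonzero(np.isin(km.labels_, neg_clusters))
```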

20.
Large-margin methods, such as support vector machines (SVMs), have been very successful in classification problems. Recently, maximum margin discriminant analysis (MMDA) was proposed, extending the large-margin idea to feature extraction. It often outperforms traditional methods such as kernel principal component analysis (KPCA) and kernel Fisher discriminant analysis (KFD). However, as in the SVM, its time complexity is cubic in the number of training points m, and it is thus computationally inefficient on massive data sets. In this paper, we propose a (1+ε)²-approximation algorithm for obtaining the MMDA features by extending the core vector machine. The resulting time complexity is only linear in m, while its space complexity is independent of m. Extensive comparisons with the original MMDA, KPCA, and KFD on a number of large data sets show that the proposed feature extractor can improve classification accuracy, and is also faster than these kernel-based methods by over an order of magnitude.
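The core-vector extension rests on an approximate minimum enclosing ball; the classic Badoiu-Clarkson iteration it builds on looks like this (generic input-space version, my code, not the MMDA-specific feature-space form):

```python
import numpy as np

def approx_meb(X, eps=0.1):
    c = X[0].astype(float)
    # O(1/eps^2) iterations suffice for a (1+eps)-approximate ball.
    for t in range(1, int(np.ceil(1.0 / eps ** 2)) + 1):
        far = np.argmax(np.linalg.norm(X - c, axis=1))  # furthest point
        c += (X[far] - c) / (t + 1)                     # pull center toward it
    radius = np.max(np.linalg.norm(X - c, axis=1))
    return c, radius
```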
