Similar Documents
20 similar documents found
1.
A Web Document Classification Algorithm Based on Manifold Learning and SVM   Cited by: 7 (self-citations: 4, others: 3)
王自强  钱旭 《计算机工程》2009,35(15):38-40
To address the Web document classification problem, a classification algorithm combining manifold learning and SVM is proposed. The algorithm applies the manifold learning method LPP (Locality Preserving Projections) to nonlinearly reduce the dimensionality of the high-dimensional Web document space of the training set, uncovering the meaningful low-dimensional structure hidden in the high-dimensional observations, and then performs classification in the reduced feature space with an SVM optimized by multiplicative update rules. Experimental results show that the algorithm achieves higher classification accuracy with less running time.
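The reduction step of this pipeline can be sketched in numpy/scipy. This is a minimal illustration of standard LPP (heat-kernel k-NN weights, generalized eigenproblem), not the authors' implementation; the toy data, neighbor count, and kernel width are all assumptions, and the multiplicative-update SVM step is omitted (the output `Z` would simply be fed to any SVM).

```python
import numpy as np
from scipy.linalg import eigh

def lpp(X, n_components=2, n_neighbors=5):
    """Locality Preserving Projections: a linear approximation of the
    Laplacian-eigenmaps embedding (illustrative sketch only)."""
    n = X.shape[0]
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise squared distances
    t = d2.mean()                                        # heat-kernel width (heuristic)
    W = np.zeros((n, n))
    idx = np.argsort(d2, axis=1)[:, 1:n_neighbors + 1]   # k nearest neighbors
    for i in range(n):
        W[i, idx[i]] = np.exp(-d2[i, idx[i]] / t)
    W = np.maximum(W, W.T)                               # symmetrize the graph
    D = np.diag(W.sum(1))
    L = D - W                                            # graph Laplacian
    # generalized eigenproblem X^T L X a = lam X^T D X a, smallest eigenvalues
    A = X.T @ L @ X
    B = X.T @ D @ X + 1e-9 * np.eye(X.shape[1])          # small ridge for stability
    vals, vecs = eigh(A, B)
    return vecs[:, :n_components]                        # projection matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(60, 10))    # stand-in for high-dimensional document vectors
P = lpp(X)
Z = X @ P                        # low-dimensional representation fed to the SVM
```

Unlike nonlinear embeddings, LPP yields an explicit projection matrix `P`, so new documents are mapped by a single matrix product.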

2.
李勇  李应  余清清 《计算机工程》2011,37(7):288-290
To exploit the information carried by the various sounds in an ecological environment, a sound classification technique combining manifold learning with support vector machines (SVM) is proposed. Feature sets covering audio intensity, timbre, pitch, and rhythm are extracted and the corresponding feature vectors computed; an improved Laplacian eigenmaps manifold learning algorithm then reduces the dimensionality of these vectors, lowering the complexity of subsequent processing. The reduced feature vectors are classified with an SVM, exploiting its strengths on small-sample, nonlinear, high-dimensional data to improve accuracy. Experimental results show that the technique classifies environmental sounds quickly and accurately.

3.
Since traditional manifold learning algorithms cannot effectively reduce the dimensionality of covariance descriptors, which lie on a Riemannian manifold, a generalized manifold learning algorithm is proposed: Log-Euclidean Riemannian kernel-based adaptive semi-supervised orthogonal locality preserving projection (LRK-ASOLPP), applied successfully to object classification in high-resolution remote sensing images. First, geometric structure features are extracted at every pixel and the covariance descriptors of the image features are computed. Second, the covariance descriptors are projected into a reproducing kernel Hilbert space via the Log-Euclidean Riemannian kernel. Then, drawing on manifold learning theory, a semi-supervised orthogonal locality preserving projection model on the Riemannian manifold is built, and an alternating iterative update algorithm optimizes the objective function, yielding both the similarity weight matrix and the low-dimensional projection matrix. Finally, the learned projection matrix maps the test samples to low dimensions, where they are classified with k-nearest neighbors, support vector machines (SVM), and other classifiers. Experimental results on three high-resolution remote sensing image data sets demonstrate the effectiveness and feasibility of the algorithm.
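The Log-Euclidean Riemannian kernel at the heart of this abstract maps symmetric positive-definite covariance descriptors into a reproducing kernel Hilbert space via the matrix logarithm. A minimal numpy sketch, with toy covariance descriptors and an assumed bandwidth `sigma` (not the paper's settings):

```python
import numpy as np

def logm_spd(C):
    """Matrix logarithm of a symmetric positive-definite matrix via eigendecomposition."""
    w, U = np.linalg.eigh(C)
    return (U * np.log(w)) @ U.T          # U diag(log w) U^T

def log_euclidean_kernel(covs, sigma=1.0):
    """Gram matrix of k(X, Y) = exp(-||log(X) - log(Y)||_F^2 / (2 sigma^2))."""
    logs = [logm_spd(C) for C in covs]
    n = len(logs)
    K = np.empty((n, n))
    for i in range(n):
        for j in range(n):
            diff = np.linalg.norm(logs[i] - logs[j], 'fro')
            K[i, j] = np.exp(-diff ** 2 / (2 * sigma ** 2))
    return K

rng = np.random.default_rng(0)
# toy covariance descriptors: one 5x5 SPD matrix per "image region"
covs = [np.cov(rng.normal(size=(5, 40))) + 1e-6 * np.eye(5) for _ in range(8)]
K = log_euclidean_kernel(covs)
```

Because the kernel operates on matrix logarithms, Euclidean operations (means, projections) in the mapped space respect the Riemannian geometry of the descriptors.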

4.
A Text Classification Algorithm Based on Nonlinear Manifold Learning and Support Vector Machines   Cited by: 2 (self-citations: 1, others: 1)
To address automatic text classification, a text classification algorithm combining manifold learning and support vector machines (LLE-LSSVM) is proposed. LLE-LSSVM applies the nonlinear manifold learning algorithm LLE to reduce the dimensionality of high-dimensional text features, uncovering their intrinsic regularities and information to obtain a low-dimensional feature space, which is then fed into an LSSVM for learning; a chaotic particle swarm optimization algorithm tunes the LSSVM parameters to build the text classification model. Simulation results show that LLE-LSSVM improves classification accuracy while reducing running time, making it an effective text classification algorithm.

5.
With the rapid development and wide application of optical remote sensing imagery, its accurate classification is of far-reaching research significance. The high-dimensional features produced by traditional extraction methods carry much redundant information, which can cause overfitting during classification, and traditional linear dimensionality reduction cannot preserve the internal structure of the original data, easily distorting it. To address these problems, a classification algorithm for optical remote sensing images based on manifold learning is proposed. The algorithm first extracts SIFT features from the images, then applies manifold learning for dimensionality reduction, and finally trains and recognizes with a support vector machine. Experimental results on the Satellite, NWPU, and UCMerced data sets show that classification accuracy for glaciers, building clusters, and beaches improves effectively, reaching about 85%; for remote sensing images of special environments such as deserts, rocks, and water bodies, accuracy improves by about 10%. In summary, the manifold-learning-based algorithm preserves, after reduction, the topological structure the data had in the original high-dimensional space, so that similar feature points aggregate effectively; it guards against the "curse of dimensionality", reduces computation, and maintains classification accuracy.

6.
Semi-supervised Proximal Support Vector Machine via Generalized Eigenvalues   Cited by: 1 (self-citations: 0, others: 1)
The proximal support vector machine via generalized eigenvalues (GEPSVM) is a recently proposed binary classification method. Combining the planar nature of GEPSVM with manifold learning, this paper presents a semi-supervised learning algorithm, SemiGEPSVM. The method retains GEPSVM's ability to classify problems such as XOR, and remains applicable even in the extreme case where each class has only one labeled sample. When the labeled samples are insufficient to construct a hyperplane, a k-nearest-neighbor method selects and labels samples; once enough samples are labeled to construct a hyperplane, the selection method of this paper is used for labeling. The paper also proves theoretically that the algorithm has a globally optimal solution. Finally, the effectiveness of SemiGEPSVM is verified on artificial and benchmark data sets.

7.
Manifold-Embedded Support Vector Data Description   Cited by: 3 (self-citations: 0, others: 3)
Geodesic distance reflects, at a macroscopic level, the geometric structure hidden in data fairly faithfully, but support vector data description (SVDD) based on it cannot be optimized directly. This paper therefore proposes a design framework for manifold classification learning: the geodesic distance in the original space is approximated by the Euclidean distance in the space produced by isometric feature mapping (ISOMAP), i.e., the original learning algorithm is implicitly executed in the ISOMAP-reduced space. Following this framework, SVDD is extended into mSVDD, an SVDD on the low-dimensional manifold discovered by the embedded ISOMAP, thereby solving the optimization problem of geodesic-distance-based SVDD. Experiments on the USPS handwritten digit data set show that the one-class performance of mSVDD improves significantly over SVDD.
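The geodesic-distance approximation underlying ISOMAP (and hence mSVDD) can be sketched with scipy: Euclidean distances on a k-nearest-neighbor graph, completed by graph shortest paths. A toy illustration on a noisy arc (the neighbor count and data are assumptions, not taken from the paper):

```python
import numpy as np
from scipy.sparse.csgraph import shortest_path
from scipy.spatial.distance import pdist, squareform

def geodesic_distances(X, n_neighbors=6):
    """ISOMAP-style geodesic approximation: keep only k-NN edges,
    then run all-pairs shortest paths on the resulting graph."""
    D = squareform(pdist(X))
    n = len(X)
    G = np.full((n, n), np.inf)               # inf = no edge (dense csgraph convention)
    nn = np.argsort(D, axis=1)[:, 1:n_neighbors + 1]
    for i in range(n):
        G[i, nn[i]] = D[i, nn[i]]
    G = np.minimum(G, G.T)                    # symmetrize the k-NN graph
    return shortest_path(G, method='D', directed=False)

# points on a curved 1-D manifold: a half-circle arc
t = np.linspace(0, np.pi, 40)
X = np.c_[np.cos(t), np.sin(t)]
Dg = geodesic_distances(X)
```

For the arc endpoints, the straight-line (Euclidean) distance is 2, while the graph shortest path recovers a length close to the true arc length pi, which is exactly the macroscopic structure the abstract says geodesic distance captures.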

8.
A recursive dimensionality reduction algorithm based on margin discriminant analysis is proposed. The algorithm adds, as a new constraint to the support vector machine (SVM) optimization problem, the requirement that previously obtained margin discriminant vectors be orthogonal to the normal vector of the hyperplane being sought; the modified SVM is then solved recursively to obtain an orthogonal basis of margin discriminant vectors, and dimensionality reduction is achieved by projecting the data samples onto these vectors. The algorithm not only overcomes the difficulties existing reduction algorithms have with small-sample data sets and their sensitivity to sample distribution, but also extracts feature vectors with better classification performance. Simulation experiments demonstrate its effectiveness.

9.
In standard SVM classification, a large number of support vectors participate in the computation, slowing classification down. To speed up SVM classification, this paper proposes a fast classification algorithm for SVMs with polynomial kernels: the decision function of a polynomial-kernel SVM is expanded into a polynomial in the components of the vector to be classified, and classification is performed by evaluating these polynomials, so that the computational cost becomes independent of the number of support vectors while all support vector information is retained. When the polynomial degree or the input dimension is low and the number of support vectors is large, the algorithm speeds up SVM classification dramatically. Experiments on real data sets demonstrate its effectiveness.
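The expansion idea is easy to verify for a degree-2 polynomial kernel: sum_i a_i (s_i . x + 1)^2 collapses algebraically to x^T M x + w . x + c, whose evaluation cost depends only on the input dimension, not on the number of support vectors. A numpy sketch with random stand-in coefficients (not a trained SVM):

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_sv = 3, 200
S = rng.normal(size=(n_sv, d))       # support vectors (toy)
a = rng.normal(size=n_sv)            # stand-in for alpha_i * y_i coefficients
b = 0.7                              # bias

def f_direct(x):
    """Standard decision value: one kernel evaluation per support vector."""
    return a @ (S @ x + 1.0) ** 2 + b

# Collapse the kernel sum into fixed-size polynomial coefficients:
# sum_i a_i (s_i.x)^2 + 2 sum_i a_i s_i.x + sum_i a_i
M = S.T @ (S * a[:, None])           # d x d quadratic coefficients
w = 2.0 * (a @ S)                    # linear coefficients
c = a.sum() + b                      # constant term

def f_fast(x):
    """Same decision value; cost independent of the number of support vectors."""
    return x @ M @ x + w @ x + c

x = rng.normal(size=d)
```

For degree `p` and input dimension `d` the collapsed form has O(d^p) coefficients, which is why the abstract notes the speedup applies when the degree or the input dimension is low while the support vector count is large.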

10.
An Isospectral Manifold Learning Algorithm   Cited by: 1 (self-citations: 0, others: 1)
黄运娟  李凡长 《软件学报》2013,24(11):2656-2666
Spectral manifold learning algorithms aim to discover low-dimensional representations embedded in high-dimensional data spaces, and have been widely applied in recent years. Isospectral manifold learning, a major topic within spectral methods, stems from the observation that two manifolds with the same spectrum share the same internal structure. The difficulties in spectral computation are choosing the neighborhood parameter and constructing reasonable adjacency weights. To address these, an isospectral manifold learning algorithm (IMLA) is proposed. By directly modifying the sparse reconstruction weight matrix, it injects both within-class and between-class discriminative supervision into the adjacency graph, preserving the sparse reconstruction relations among the data while exploiting supervision, which gives it a clear advantage over algorithms such as PCA. The algorithm is validated on three widely used face data sets (Yale, ORL, Extended Yale B), further demonstrating the effectiveness of IMLA.

11.
Manifold learning methods for unsupervised nonlinear dimensionality reduction have proven effective in the visualization of high-dimensional data sets. When dealing with classification tasks, supervised extensions of manifold learning techniques, in which class labels are used to improve the embedding of the training points, require an appropriate method for out-of-sample mapping. In this paper we propose multi-output kernel ridge regression (KRR) for out-of-sample mapping in supervised manifold learning, in place of the general regression neural networks (GRNN) adopted by previous studies on the subject. Specifically, we consider a supervised agglomerative variant of Isomap and compare the performance of classification methods when the out-of-sample embedding is based on KRR and GRNN, respectively. Extensive computational experiments, using support vector machines and k-nearest neighbors as base classifiers, provide statistical evidence that out-of-sample mapping based on KRR consistently dominates its GRNN counterpart, and that supervised agglomerative Isomap with KRR achieves higher accuracy than direct classification methods on most data sets.
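Multi-output KRR for out-of-sample mapping has a one-line closed form: fit dual coefficients A = (K + lambda I)^{-1} Y on the training embedding Y, then map new points with their kernel row. A numpy sketch in which the "learned embedding" is a random stand-in and the RBF bandwidth and ridge are assumed values:

```python
import numpy as np

def rbf(A, B, gamma=0.5):
    """RBF kernel matrix between row-sets A and B."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 8))    # high-dimensional training points
Y = rng.normal(size=(50, 2))    # their low-dim embedding (stand-in for supervised Isomap output)

# multi-output kernel ridge regression: one linear solve covers all output dimensions
lam = 1e-3
K = rbf(X, X)
A = np.linalg.solve(K + lam * np.eye(len(X)), Y)   # dual coefficients

def embed(X_new):
    """Out-of-sample mapping: project new points into the learned space."""
    return rbf(X_new, X) @ A
```

Because the solve is shared across output dimensions, the cost of mapping a test point is a single kernel row times `A`, which is what makes KRR attractive against iterative alternatives such as GRNN.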

12.
The paper presents an empirical comparison of the most prominent nonlinear manifold learning techniques for dimensionality reduction in the context of high-dimensional microarray data classification. In particular, we assessed the performance of six methods: isometric feature mapping, locally linear embedding, Laplacian eigenmaps, Hessian eigenmaps, local tangent space alignment and maximum variance unfolding. Unlike previous studies on the subject, the experimental framework adopted in this work properly extends the supervised learning paradigm to dimensionality reduction, by regarding the test set as an out-of-sample set of new points which are excluded from the manifold learning process. This is in order to avoid a possible overestimate of the classification accuracy, which may yield misleading comparative results. This different empirical approach requires a fast and effective out-of-sample embedding method for mapping new high-dimensional data points into an existing reduced space. To this aim, we propose to apply multi-output kernel ridge regression, an extension of linear ridge regression based on kernel functions, which has recently been presented as a powerful method for out-of-sample projection when combined with a variant of isometric feature mapping. Computational experiments on a wide collection of cancer microarray data sets show that classifiers based on Isomap, LLE and LE were consistently more accurate than those relying on HE, LTSA and MVU. In particular, under different experimental conditions the LLE-based classifier emerged as the most effective method, whereas the Isomap algorithm turned out to be the second best alternative for dimensionality reduction.

13.
To effectively handle speech data lying on a nonlinear manifold embedded in a high-dimensional acoustic space, this paper proposes an adaptive supervised manifold learning algorithm based on locally linear embedding (LLE) for nonlinear dimensionality reduction, which extracts low-dimensional embedded data representations for phoneme recognition. The proposed method aims to maximize the interclass dissimilarity while minimizing the intraclass dissimilarity, in order to promote the discriminating power and generalization ability of the low-dimensional embedded data representations. Its performance is compared with five well-known dimensionality reduction methods: principal component analysis, linear discriminant analysis, isometric mapping (Isomap), LLE, and the original supervised LLE. Experimental results on three benchmark speech databases, the Deterding database, the DARPA TIMIT database, and the ISOLET E-set database, demonstrate that the proposed method obtains promising performance on the phoneme recognition task, outperforming the other methods.

14.
To address the fact that traditional semi-supervised SVM training spends most of its time optimizing non-support vectors, this paper applies genetic FCM (genetic fuzzy c-means) to preselect working-set samples for the concave semi-supervised support vector machine. During semi-supervised SVM learning, a working set (unlabeled data) is added to the original training set (labeled data) to form a new training set. The method first partitions the unlabeled data into a number of subsets with genetic FCM, then trains the concave semi-supervised SVM on the new data to obtain the decision boundary and support vectors, and finally classifies the unlabeled data. By shrinking the working set and selecting those boundary vectors likely to become support vectors to join the training set, the total number of training samples and hence the memory overhead are reduced. Experiments on random three-dimensional data show that, when the working set is reduced to within a certain fraction of the original, classification accuracy and the number of support vectors differ little from those obtained with the full working set, while classification time drops sharply, achieving the desired sample preselection effect.
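The FCM partitioning step can be sketched in numpy. This shows only the plain fuzzy c-means membership/centroid updates, not the genetic search the paper wraps around them; the toy data, fuzzifier `m=2`, iteration count, and deterministic initialization (first and last data points as centers) are all assumptions for the demo.

```python
import numpy as np

def fcm(X, n_clusters=2, m=2.0, n_iter=50):
    """Plain fuzzy c-means: alternate membership and centroid updates."""
    centers = X[[0, -1]]                               # deterministic demo init
    for _ in range(n_iter):
        # distances to centers (small epsilon avoids division by zero)
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        U = inv / inv.sum(1, keepdims=True)            # memberships, rows sum to 1
        Um = U ** m
        centers = (Um.T @ X) / Um.sum(0)[:, None]      # fuzzy-weighted centroids
    return U, centers

rng = np.random.default_rng(1)
# two well-separated toy clusters standing in for the unlabeled working set
X = np.vstack([rng.normal(0.0, 0.3, (30, 2)), rng.normal(3.0, 0.3, (30, 2))])
U, centers = fcm(X)
```

In the paper's scheme, samples whose memberships place them near a cluster boundary (i.e., no membership dominates) would be the candidates kept in the reduced working set.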

15.
A Survey of Manifold Learning   Cited by: 39 (self-citations: 2, others: 37)
Manifold learning is a new unsupervised learning method that has attracted growing attention from researchers in machine learning and cognitive science. To deepen understanding of manifold learning, this survey starts from its topological concepts and traces its development. After clarifying the different formulations of manifold learning, it analyzes the strengths and weaknesses of the main manifold algorithms, and then presents application examples of Isomap and LLE. The results show that, compared with traditional linear dimensionality reduction, manifold learning can effectively discover the intrinsic dimensionality of nonlinear high-dimensional data, facilitating dimensionality reduction and data analysis. Finally, future research directions of manifold learning are discussed, with a view to further broadening its application domains.

16.
Manifold learning has become a research focus in machine learning and data mining. For example, LLE (Locally Linear Embedding), a nonlinear dimensionality reduction algorithm with good generalization performance, is widely used in image classification and object recognition, but it assumes the data set lies on a single manifold. MM-LLE (Multiple Manifold Locally Linear Embedding), an improved algorithm that accounts for multiple manifolds, still has several shortcomings. This paper therefore proposes an improved MM-LLE algorithm that raises classification accuracy by combining the local low-dimensional manifolds of every pair of classes and building a classifier on them, and also improves the original method for computing the optimal dimensionality. Comparisons of classification accuracy with ISOMAP, LLE, and MM-LLE verify the effectiveness of the improved algorithm.

17.
Manifold learning is a well-known dimensionality reduction scheme that can detect intrinsic low-dimensional structures in nonlinear high-dimensional data. It has recently been widely employed in data analysis, pattern recognition, and machine learning applications. Isomap is one of the most promising manifold learning algorithms; it extends metric multi-dimensional scaling by using approximate geodesic distance. However, when applied to real-world problems, Isomap may have difficulty dealing with noisy data. Moreover, although many applications represent a single sample by multiple feature vectors in different spaces, Isomap operates on samples in a single observation space. In this paper, two extensions of Isomap to the multiple-feature-space problem, namely fusion of dissimilarities and fusion of geodesic distances, are presented. We exploit the advantages of several spaces so that the Euclidean distance on the learned manifold is more compatible with the semantic distance. To show the effectiveness and validity of the proposed method, experiments have been carried out on shape analysis with the MPEG7 CE Part B and Fish data sets.

18.
曹路 《计算机科学》2016,43(12):97-100
Traditional support vector machines perform poorly on imbalanced data. To improve recognition accuracy on the minority class, an upsampling method based on support vectors is proposed. First, noise is removed from the original data set following the idea of k-nearest neighbors; then an SVM is trained on the training set to obtain the support vectors, and noise following a certain distribution is added to each minority-class support vector, increasing the number of minority samples to obtain a relatively balanced data set; finally, an SVM is trained on the new data set. Experimental results show the method is effective on both artificial and UCI benchmark data sets.
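The core of this upsampling scheme can be sketched with scikit-learn: fit an SVM, take the minority-class support vectors, and jitter them with Gaussian noise until the classes balance. The toy data, noise scale, and RBF kernel are assumptions for illustration (the paper's k-NN noise-cleaning step is omitted):

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)
# imbalanced toy data: 200 majority vs 20 minority samples
X_maj = rng.normal(0.0, 1.0, (200, 2))
X_min = rng.normal(2.5, 0.5, (20, 2))
X = np.vstack([X_maj, X_min])
y = np.r_[np.zeros(200), np.ones(20)]

# 1) learn the boundary and extract the minority-class support vectors
svc = SVC(kernel='rbf').fit(X, y)
sv_idx = svc.support_[y[svc.support_] == 1]          # indices of minority SVs
X_sv = X[sv_idx]

# 2) jitter minority support vectors with Gaussian noise until balanced
n_new = 200 - 20
synth = X_sv[rng.integers(0, len(X_sv), n_new)] + rng.normal(0, 0.1, (n_new, 2))
X_bal = np.vstack([X, synth])
y_bal = np.r_[y, np.ones(n_new)]

# 3) retrain on the balanced set
svc_bal = SVC(kernel='rbf').fit(X_bal, y_bal)
```

Jittering only the support vectors, rather than all minority samples, concentrates the synthetic points near the decision boundary, where they most affect the learned separator.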

19.
To effectively improve spoken emotion recognition performance, it is necessary to perform nonlinear dimensionality reduction for speech data lying on a nonlinear manifold embedded in a high-dimensional acoustic space. In this paper, a new supervised manifold learning algorithm for nonlinear dimensionality reduction, called the modified supervised locally linear embedding algorithm (MSLLE), is proposed for spoken emotion recognition. MSLLE aims at enlarging the interclass distance while shrinking the intraclass distance, in an effort to promote the discriminating power and generalization ability of low-dimensional embedded data representations. To benchmark MSLLE, three unsupervised dimensionality reduction methods, principal component analysis (PCA), locally linear embedding (LLE) and isometric mapping (Isomap), as well as five supervised dimensionality reduction methods, linear discriminant analysis (LDA), supervised locally linear embedding (SLLE), local Fisher discriminant analysis (LFDA), neighborhood component analysis (NCA) and maximally collapsing metric learning (MCML), are used to perform dimensionality reduction on spoken emotion recognition tasks. Experimental results on two emotional speech databases, the spontaneous Chinese database and the acted Berlin database, confirm the validity and promising performance of the proposed method.

20.
The LLE algorithm in manifold learning can reduce high-dimensional data to a low-dimensional manifold subspace while preserving local neighborhood structure, yielding a set of embedded vectors whose local structure resembles that of the original sample set. However, LLE ignores the class information of the samples during dimensionality reduction. Addressing this problem, an improved supervised locally linear embedding algorithm (MSLLE) is proposed, and its performance is compared experimentally with LLE in MATLAB. The experiments show that MSLLE preserves the internal structure of the data points better than LLE.


Copyright © 北京勤云科技发展有限公司
