Similar Documents
 20 similar documents found (search time: 31 ms)
1.
Subspace Semi-supervised Fisher Discriminant Analysis   Total citations: 3 (self: 2, other: 1)
杨武夷  梁伟  辛乐  张树武 《自动化学报》2009,35(12):1513-1519
Fisher discriminant analysis seeks a subspace that maximizes the ratio of between-class scatter to within-class scatter of the sample data, and is a popular supervised feature dimensionality reduction method. Labeling the class of each sample usually requires substantial manual effort, time, and cost. To exploit both labeled and unlabeled samples when searching for the reduced subspace, we propose a subspace semi-supervised Fisher discriminant analysis method. It seeks a subspace that preserves both the class-discriminative structure learned from the labeled samples and the sample-structure information learned jointly from the labeled and unlabeled samples. We also derive a kernel-based version of the method. Face recognition experiments verify the effectiveness of the proposed algorithm.
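For context only: the criterion the abstract refers to is the standard Fisher trace-ratio objective. The notation below (between-class scatter $S_b$, within-class scatter $S_w$, projection $W$, class means $\mu_c$, global mean $\mu$) is textbook notation, not taken from the paper itself.

$$
W^{*}=\arg\max_{W}\;\frac{\operatorname{tr}\!\left(W^{\top}S_{b}W\right)}{\operatorname{tr}\!\left(W^{\top}S_{w}W\right)},\qquad
S_{b}=\sum_{c}n_{c}\,(\mu_{c}-\mu)(\mu_{c}-\mu)^{\top},\quad
S_{w}=\sum_{c}\sum_{x_{i}\in c}(x_{i}-\mu_{c})(x_{i}-\mu_{c})^{\top}
$$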

2.
When only a small number of labeled samples are available, supervised dimensionality reduction methods tend to perform poorly because of overfitting. In such cases, unlabeled samples could be useful in improving the performance. In this paper, we propose a semi-supervised dimensionality reduction method which preserves the global structure of unlabeled samples in addition to separating labeled samples in different classes from each other. The proposed method, which we call SEmi-supervised Local Fisher discriminant analysis (SELF), has an analytic form of the globally optimal solution and it can be computed based on eigen-decomposition. We show the usefulness of SELF through experiments with benchmark and real-world document classification datasets.
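As a hedged illustration of how such an analytic, globally optimal solution is typically computed via eigen-decomposition (a generic sketch, not the authors' code; the two scatter-like matrices are assumed to be supplied by the SELF construction):

```python
import numpy as np
from scipy.linalg import eigh

def projection_from_scatters(S_between_like, S_within_like, n_components):
    """Solve S_between_like @ w = lambda * S_within_like @ w and keep the
    eigenvectors with the largest eigenvalues as the projection matrix."""
    eigvals, eigvecs = eigh(S_between_like, S_within_like)  # ascending eigenvalues
    top = np.argsort(eigvals)[::-1][:n_components]
    return eigvecs[:, top]
```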

3.
Automatic diagnosis of breast cancer from ultrasound images has important clinical value. However, building a high-accuracy automatic diagnosis method is challenging because large amounts of manually annotated data are lacking. In recent years, self-supervised contrastive learning has shown great potential for producing discriminative and highly generalizable features from unlabeled natural images. However, the way positive and negative samples are constructed from natural images does not transfer to breast ultrasound. To address this, this paper introduces elastography ultrasound (EUS) and, exploiting the multimodal nature of ultrasound imaging, proposes a self-supervised contrastive learning method that fuses multimodal information. The method constructs positive samples from multimodal ultrasound images of the same patient and negative samples from multimodal images of different patients, and builds the contrastive learning objective on modality consistency, rotation invariance, and sample separability. By learning a unified feature representation of the two modalities in the embedding space, EUS information is fused into the model, improving performance on the downstream B-mode ultrasound classification task. Experimental results show that the proposed method can fully mine high-level semantic features in multimodal breast ultrasound images without labels and effectively improves breast cancer diagnosis accuracy.
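A minimal sketch of the same-patient-positive contrastive idea described above, written as a standard InfoNCE-style loss; it omits the rotation-invariance and sample-separability terms and is not the paper's exact objective, and all names are generic:

```python
import torch
import torch.nn.functional as F

def multimodal_info_nce(z_bmode, z_eus, temperature=0.1):
    """B-mode and elastography (EUS) embeddings of the same patient form the
    positive pair; embeddings of other patients in the batch act as negatives.
    z_bmode, z_eus: (batch, dim) tensors from the two modality encoders."""
    z1 = F.normalize(z_bmode, dim=1)
    z2 = F.normalize(z_eus, dim=1)
    logits = z1 @ z2.t() / temperature                         # (batch, batch) similarities
    targets = torch.arange(z1.size(0), device=z1.device)       # diagonal = same patient
    # symmetric cross-entropy: B-mode -> EUS and EUS -> B-mode
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))
```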

4.
In this paper, a novel unsupervised dimensionality reduction algorithm, unsupervised Globality-Locality Preserving Projections in Transfer Learning (UGLPTL), is proposed, based on the conventional Globality-Locality Preserving dimensionality reduction algorithm (GLPP), which does not work well in real-world Transfer Learning (TL) applications. In TL applications, one application (the source domain) contains sufficient labeled data, while the related application (the target domain) contains only unlabeled data. Compared to existing TL methods, the proposed method incorporates all the objectives essential for transfer learning applications: minimizing the marginal and conditional distribution discrepancies between the two domains, maximizing the variance of the target domain, and performing geometrical diffusion on manifolds. UGLPTL seeks a projection that maps the source- and target-domain data into a common subspace where both the labeled source data and the unlabeled target data can be used to perform dimensionality reduction. Comprehensive experiments verify that the proposed method outperforms many state-of-the-art non-transfer and transfer learning methods on two popular real-world cross-domain visual transfer learning data sets. The proposed UGLPTL approach achieved 82.18% and 87.14% mean accuracies over all the tasks of the PIE Face and Office-Caltech data sets, respectively.
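For background on the marginal-distribution matching mentioned above: in transfer subspace learning this discrepancy is commonly measured, after projection, by the empirical maximum mean discrepancy (MMD). The form below is the standard one and not necessarily UGLPTL's exact term; $W$ is the sought projection, $x_i^s$ and $x_j^t$ the source and target samples, and $n_s$, $n_t$ their counts.

$$
\mathrm{MMD}(X_{s},X_{t})=\left\lVert \frac{1}{n_{s}}\sum_{i=1}^{n_{s}}W^{\top}x_{i}^{s}-\frac{1}{n_{t}}\sum_{j=1}^{n_{t}}W^{\top}x_{j}^{t}\right\rVert^{2}
$$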

5.
To effectively improve the performance of spoken emotion recognition, nonlinear dimensionality reduction is needed for speech data lying on a nonlinear manifold embedded in a high-dimensional acoustic space. In this paper, a new supervised manifold learning algorithm for nonlinear dimensionality reduction, called the modified supervised locally linear embedding algorithm (MSLLE), is proposed for spoken emotion recognition. MSLLE aims to enlarge the inter-class distance while shrinking the intra-class distance, so as to promote the discriminating power and generalization ability of the low-dimensional embedded representations. To benchmark MSLLE, three unsupervised dimensionality reduction methods, i.e., principal component analysis (PCA), locally linear embedding (LLE) and isometric mapping (Isomap), as well as five supervised dimensionality reduction methods, i.e., linear discriminant analysis (LDA), supervised locally linear embedding (SLLE), local Fisher discriminant analysis (LFDA), neighborhood component analysis (NCA) and maximally collapsing metric learning (MCML), are used to perform dimensionality reduction on spoken emotion recognition tasks. Experimental results on two emotional speech databases, the spontaneous Chinese database and the acted Berlin database, confirm the validity and promising performance of the proposed method.
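As orientation on the family MSLLE belongs to: supervised LLE variants typically bias the pairwise distances used for neighbor selection by class membership. A commonly cited form is the plain SLLE adjustment below, shown only for background (MSLLE modifies this scheme); $d(x_i,x_j)$ is the original distance, $c_i$ the class label of $x_i$, and $\alpha\in[0,1]$ controls the strength of the supervision.

$$
d'(x_{i},x_{j})=d(x_{i},x_{j})+\alpha\,\max_{k,l}d(x_{k},x_{l})\,\Lambda_{ij},\qquad
\Lambda_{ij}=\begin{cases}0,& c_{i}=c_{j}\\ 1,& c_{i}\neq c_{j}\end{cases}
$$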

6.
Transductive Multi-modal Video Semantic Concept Detection Based on Tensor Representation   Total citations: 4 (self: 0, other: 4)
吴飞  刘亚楠  庄越挺 《软件学报》2008,19(11):2853-2868
This paper proposes a framework for video semantic analysis and understanding based on higher-order tensor representation. In this framework, a video shot is first represented as a third-order tensor built from the multimodal data it contains (text, visual, and audio). Second, based on this third-order tensor representation and the temporal co-occurrence characteristics of video, a subspace-embedding dimensionality reduction method called the "tensor shot" (张量镜头) is designed. Because transductive learning can learn and recognize specific unknown samples starting from known ones, the framework finally introduces a transductive support tensor machine algorithm based on tensor shots: it preserves the intrinsic structure of the manifold space in which the tensor shots lie, can map out-of-sample data directly into the manifold subspace, and fully exploits unlabeled samples to improve the classifier's learning performance. Experimental results show that the method can effectively detect semantic concepts in video shots.

7.
In practice, many applications require a dimensionality reduction method that can deal with partially labeled data. In this paper, we propose a semi-supervised dimensionality reduction framework which can efficiently handle unlabeled data. Under this framework, several classical methods, such as principal component analysis (PCA), linear discriminant analysis (LDA), maximum margin criterion (MMC), locality preserving projections (LPP) and their corresponding kernel versions, can be seen as special cases. For high-dimensional data, we can obtain a low-dimensional embedding that both discriminates multi-class sub-manifolds and preserves local manifold structure. Experiments show that our algorithms can significantly improve on the accuracy of the corresponding supervised and unsupervised approaches.

8.
Semi-supervised Local Dimensionality Reduction   Total citations: 1 (self: 1, other: 0)
In high-dimensional data mining and analysis tasks, sometimes only limited pairwise constraint information (must-link and cannot-link constraints) is available. Lacking class labels, supervised dimensionality reduction methods often fail to produce satisfactory results. In this situation, exploiting large numbers of unlabeled samples can improve performance. Using the pairwise constraints together with abundant unlabeled samples, this paper proposes a semi-supervised local dimensionality reduction method (SLDR). SLDR integrates the local structure of the data with the pairwise constraints to find an optimal projection: when the data are projected into the low-dimensional space, sample pairs in cannot-link constraints are pushed farther apart, sample pairs in must-link constraints are pulled closer together, and the intrinsic geometric structure of the data is preserved. SLDR can also be extended to a nonlinear method, making it suitable for dimensionality reduction of nonlinear data. Experimental results on various data sets fully verify the effectiveness of the proposed algorithm.
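For orientation, a widely used pairwise-constraint criterion in this family scores a projection $w$ by rewarding spread over all sample pairs and over cannot-link pairs $C$ while penalizing spread over must-link pairs $M$ ($\alpha,\beta$ are trade-off weights). It is given here only as background; SLDR additionally preserves local structure, so its exact objective differs.

$$
J(w)=\frac{1}{2n^{2}}\sum_{i,j}\bigl(w^{\top}x_{i}-w^{\top}x_{j}\bigr)^{2}
+\frac{\alpha}{2|C|}\sum_{(x_{i},x_{j})\in C}\bigl(w^{\top}x_{i}-w^{\top}x_{j}\bigr)^{2}
-\frac{\beta}{2|M|}\sum_{(x_{i},x_{j})\in M}\bigl(w^{\top}x_{i}-w^{\top}x_{j}\bigr)^{2}
$$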

9.
Objective: Canonical correlation analysis (CCA) is a classic multi-view learning method. To improve the discriminative power of the learned projection directions, existing CCA methods usually introduce sample label information. However, obtaining label information costs considerable human and material resources; this paper therefore proposes a semi-supervised canonical correlation analysis algorithm that jointly performs label prediction and discriminative projection learning. Method: Label prediction is fused with model construction; specifically, label prediction is embedded in the CCA framework, the label matrix learned by the joint framework is used to update the projection directions, and the learned projections in turn update the label matrix. Label prediction and projection learning depend on each other and are updated alternately, so the predicted labels steadily approach the true labels, which helps learn optimal projection directions. Results: The method is evaluated on the AR, Extended Yale B, Multi-PIE, and ORL face data sets. With a feature dimension of 20, it achieves recognition rates of 87%, 55%, 83%, and 85% on AR, Extended Yale B, Multi-PIE, and ORL, respectively. Taking 2 (3, 4, 5) face images per person from the training samples as supervised samples, the proposed method achieves higher recognition rates than the other methods on all four data sets. With 5 supervised face images per person, it achieves 94.67%, 68%, 83%, and 85% on AR, Extended Yale B, Multi-PIE, and ORL, respectively. The results show that, when label information in the training samples is scarce and the reduced dimensionality is low, the joint learning model preserves as much useful information as possible after dimensionality reduction and yields good recognition results. Conclusion: The proposed joint learning method improves the discriminative power of the learned projection directions, can effectively handle the case of few labeled and many unlabeled samples, and overcomes the drawbacks of two-step learning strategies.

10.
A Text Classification Method Based on a Semi-supervised Locally Linear Embedding Algorithm   Total citations: 3 (self: 0, other: 3)
To address the limitation that the locally linear embedding (LLE) algorithm is restricted to unsupervised machine learning, this paper combines it with semi-supervised ideas and proposes a text classification method based on a semi-supervised locally linear embedding algorithm. Using the manifold structure of the text data and a small number of labeled samples, the distance matrix in LLE is adjusted in a piecewise manner, and the adjusted matrix is used for linear reconstruction to achieve dimensionality reduction. To overcome the drawback of using the Euclidean distance in semi-supervised LLE, the Euclidean distance is transformed with a Gaussian kernel function and replaced by the resulting kernel distance, yielding a kernel-based semi-supervised locally linear embedding algorithm. Simulation experiments verify the effectiveness of the improved algorithm.
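For reference, the kernel substitution mentioned above is usually realized with the standard feature-space distance identity for a Gaussian kernel ($\sigma$ is the kernel width); this is textbook material, not a formula quoted from the paper:

$$
d_{\phi}^{2}(x_{i},x_{j})=k(x_{i},x_{i})-2k(x_{i},x_{j})+k(x_{j},x_{j})
=2-2\exp\!\left(-\frac{\lVert x_{i}-x_{j}\rVert^{2}}{2\sigma^{2}}\right)
$$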

11.
Derived from the traditional manifold learning algorithms, local discriminant analysis methods identify the underlying submanifold structures while employing discriminative information for dimensionality reduction. Mathematically, they can all be unified into a graph embedding framework with different construction criteria. However, such learning algorithms are limited by the curse-of-dimensionality if the original data lie on the high-dimensional manifold. Different from the existing algorithms, we consider the discriminant embedding as a kernel analysis approach in the sample space, and a kernel-view based discriminant method is proposed for the embedded feature extraction, where both PCA pre-processing and the pruning of data can be avoided. Extensive experiments on the high-dimensional data sets show the robustness and outstanding performance of our proposed method.

12.
Semi-supervised dimensionality reduction methods play an important role in pattern recognition and are likely to be well suited to plant leaf and palmprint classification, since labeling plant leaf and palmprint images often requires expensive human labor, whereas unlabeled images are far easier to obtain at very low cost. In this paper, we attempt to use unlabeled data to aid plant leaf and palmprint classification when only a limited number of labeled plant leaf or palmprint samples are available, and propose a semi-supervised locally discriminant projection (SSLDP) algorithm for plant leaf and palmprint classification. By making use of both labeled and unlabeled data in learning a transformation for dimensionality reduction, the proposed method can overcome the small-sample-size (SSS) problem when labeled data are scant. In SSLDP, the labeled data points, combined with the unlabeled ones, are used to construct the within-class and between-class weight matrices incorporating the neighborhood information of the data set. Experiments on plant leaf and palmprint databases demonstrate that SSLDP is effective and feasible for plant leaf and palmprint classification.

13.
Band selection is an effective means of data dimensionality reduction, but the limited number of labeled samples hampers the performance of supervised band selection. This paper proposes a semi-supervised band selection method based on the graph Laplacian and a self-training strategy. The method first defines a graph-based semi-supervised feature scoring criterion to produce an initial band subset, then performs classification on this subset; following the self-training strategy, unlabeled samples with high confidence are promoted into the labeled set, and the band subset is updated with the scoring criterion. This process is repeated to obtain the final band subset. Hyperspectral band selection and classification experiments compare several unsupervised, supervised, and semi-supervised methods, and the results show that the proposed algorithm selects better band subsets.
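A minimal sketch of the self-training loop described above, under the assumption that the graph-based scoring criterion is supplied as a function; all names here are placeholders and the classifier choice is illustrative, not the paper's exact setup:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

def self_train_band_selection(X, y, score_bands, n_bands=20, n_rounds=5, n_promote=50):
    """X: (n_pixels, n_total_bands); y uses -1 for unlabeled pixels.
    score_bands(X, y) is a placeholder for the paper's graph-based
    semi-supervised band-scoring criterion (higher score = better band)."""
    y = y.copy()
    for _ in range(n_rounds):
        bands = np.argsort(score_bands(X, y))[-n_bands:]          # current best bands
        labeled = y != -1
        clf = KNeighborsClassifier().fit(X[labeled][:, bands], y[labeled])
        unlabeled_idx = np.flatnonzero(~labeled)
        if unlabeled_idx.size == 0:
            break
        proba = clf.predict_proba(X[unlabeled_idx][:, bands])
        confident = unlabeled_idx[np.argsort(proba.max(axis=1))[-n_promote:]]
        y[confident] = clf.predict(X[confident][:, bands])         # promote confident samples
    return np.argsort(score_bands(X, y))[-n_bands:]                # final band subset
```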

14.
Most manifold learning algorithms adopt the k nearest neighbors function to construct the adjacency graph. However, severe bias may be introduced in this case if the samples are not uniformly distributed in the ambient space. In this paper a semi-supervised dimensionality reduction method is proposed to alleviate this problem. Based on the notion of local margin, we simultaneously maximize the separability between different classes and estimate the intrinsic geometric structure of the data by both the labeled and unlabeled samples. For high-dimensional data, a discriminant subspace is derived via maximizing the cumulative local margins. Experimental results on high-dimensional classification tasks demonstrate the efficacy of our algorithm.

15.
Locally linear embedding (LLE) is a recently proposed nonlinear dimensionality reduction method. It can reveal the intrinsic distribution of data, which classical linear dimensionality reduction methods cannot. The application of LLE is limited, however, because it lacks a parametric mapping between the observations and the low-dimensional output, and because it requires a large data set to be reduced. In this paper, we propose methods to establish the mapping from the low-dimensional embedded space back to the high-dimensional space for LLE and validate their efficiency on the reconstruction of multi-pose face images. Furthermore, we observe that the high-dimensional structure of multi-pose face images is similar across different persons for the same mode of pose change. Given the structural information of the data distribution, obtained by learning a large number of multi-pose images in a training set, the support vector regression (SVR) method from statistical learning theory is used to learn the high-dimensional structure of an individual from a small sample set. The detailed learning method and algorithm are given and applied to reconstruct and synthesize face images in small-sample cases. The experiments confirm the validity of the idea and method.
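A hedged sketch of the regression step described above (not the paper's exact procedure; hyperparameters and helper name are illustrative): given low-dimensional LLE coordinates and the corresponding high-dimensional images from a large training set, fit one support vector regressor per output dimension, then reconstruct images for new embedded points.

```python
from sklearn.svm import SVR
from sklearn.multioutput import MultiOutputRegressor

def fit_embedding_to_image_map(Z_train, X_train):
    """Z_train: (n_samples, d_low) LLE coordinates;
    X_train: (n_samples, d_high) vectorized face images.
    Returns a regressor mapping embedded coordinates back to image space."""
    model = MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=0.01))
    return model.fit(Z_train, X_train)

# usage sketch:
#   reconstructor = fit_embedding_to_image_map(Z_train, X_train)
#   X_synth = reconstructor.predict(Z_new_pose)   # synthesized face images
```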

16.
张晨光  张燕  张夏欢 《自动化学报》2015,41(9):1577-1588
To address the problem that most existing multi-label learning methods are supervised and cannot effectively exploit the large amounts of unlabeled samples that are relatively cheap and easy to obtain, this paper proposes a new multi-label semi-supervised learning method, called the normalized dependence maximization multi-label semi-supervised learning method. The method treats the known labels as constraints and uses all samples, both labeled and unlabeled, to estimate the normalized dependence between the feature set and the label set; taking maximization of this estimate as the objective, it finally labels the unlabeled samples by solving a bounded trace-ratio problem. Comparative experiments with other classical multi-label learning methods on several real multi-label data sets show that the method can learn effectively from labeled and unlabeled samples, with especially significant improvement when labeled samples are relatively scarce.

17.
An Active Co-training Semi-supervised Rough Set Classification Model   Total citations: 1 (self: 0, other: 1)
Rough set theory is a supervised learning model and generally requires a moderate amount of labeled data to train a classifier. In many real problems, however, there are large amounts of unlabeled data, while labeled data are scarce because labeling is costly. Combining active learning and co-training theory, this paper proposes a semi-supervised rough set model that can effectively use unlabeled data to improve classification performance. The model uses a semi-supervised attribute reduction algorithm to extract two reducts with large mutual differences and builds a base classifier on each; following the idea of active learning, it then selects from the unlabeled data the samples on which the two classifiers disagree most for manual labeling, and the updated classifiers are trained collaboratively. Comparative experiments on UCI data sets show that the model clearly improves classification performance and can even reach the optimal values for the data sets.
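A minimal sketch of the disagreement-based active selection step described above, under simplifying assumptions (a generic probabilistic disagreement measure; the paper's rough-set formulation is not reproduced here):

```python
import numpy as np

def select_disagreement_queries(clf1, clf2, X_unlabeled, n_queries=10):
    """Among unlabeled samples, pick those on which the two base classifiers
    disagree most strongly; these are the candidates to label manually and
    then share between the classifiers in the co-training round."""
    p1 = clf1.predict_proba(X_unlabeled)
    p2 = clf2.predict_proba(X_unlabeled)
    disagreement = np.abs(p1 - p2).sum(axis=1)      # L1 gap between the two posteriors
    return np.argsort(disagreement)[-n_queries:]    # indices to query for labels
```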

18.
Supervised learning requires large numbers of labeled samples to train a model, but in practice collecting labeled samples is time-consuming and laborious. Unsupervised learning uses no prior information, but its accuracy is hard to guarantee. Semi-supervised learning breaks through the limitation of traditional methods that consider only one type of sample: it can mine the information hidden in large amounts of unlabeled data to assist training with a small number of labeled samples, and has become a research hotspot in machine learning. This paper surveys the overall trends and concrete research topics of semi-supervised learning, covering six areas: semi-supervised clustering, classification, regression, dimensionality reduction, imbalanced-data classification, and noise reduction. It finds that although semi-supervised methods are numerous, they have the following shortcomings: (1) some newly proposed methods are effective but are validated only on specific data sets and lack theoretical justification; (2) semi-supervised models built for complex data have many parameters, their results are unstable, and there is little guidance for parameter selection; (3) supervision information mostly takes the form of sample labels or pairwise constraints, and semi-supervised learning with mixed constraints needs further study; (4) research on semi-supervised regression is scarce, and little work addresses how to exploit supervision information from continuous variables.

19.
Semi-supervised learning methods are conventionally conducted by simultaneously using abundant unlabeled samples and the few labeled samples that are given. However, the unlabeled samples are usually adopted under assumptions, e.g., the cluster and manifold assumptions, which degrade performance when the assumptions become invalid. The reliable hidden features embedded in both the labeled and the unlabeled samples can potentially be used to tackle this issue. In this regard, we investigate a feature augmentation technique to improve the robustness of semi-supervised learning. By introducing an orthonormal projection matrix, we first transform both the unlabeled and labeled samples into a shared hidden subspace to determine the connections between the samples. Then we combine the hidden features, the raw features, and zero vectors to develop a novel feature augmentation strategy. Finally, a hidden feature transformation (HTF) model is proposed to compute the desired projection matrix by applying the maximum joint probability distribution principle in the augmented feature space. The effectiveness of the proposed method is evaluated with the hinge and square loss functions respectively, based on two types of semi-supervised classification formulations developed using only the labeled samples with their original and hidden features. The experimental results demonstrate the effectiveness of the proposed feature augmentation technique for semi-supervised learning.

20.
To address the insufficient diversity among classifiers in ensemble learning and the scarcity of labeled samples, this paper proposes a new semi-supervised ensemble learning algorithm that introduces semi-supervised methods into ensemble learning, uses the information in large amounts of unlabeled samples to refine each base classifier, and constructs base classifiers with greater diversity. First, suitable unlabeled samples are selected with a multi-view method, and the many miscellaneous feature attributes are grouped into views; different dimensionality reduction methods are applied to different views to facilitate input into the learning models, and mutually independent learning models are used to increase ensemble diversity. Experimental results on UCI data sets show that using multi-view data achieves more accurate classification than single-view data, and that, compared with existing algorithms such as Boosting and tri-training, using more diverse base learners and introducing semi-supervised methods effectively improves ensemble learning performance.
