Found 20 similar documents; search took 15 ms
1.
Zero-shot learning (ZSL) aims to recognize unseen image classes without requiring any training samples of those classes. ZSL is typically addressed by building a semantic embedding space, such as attributes, to bridge the visual features and class labels of images. Currently, most ZSL approaches focus on learning a visual-semantic alignment from seen classes using only human-designed attributes, and then solve the ZSL problem by transferring semantic knowledge from the seen classes to the unseen classes. However, few works examine whether the human-designed attributes are discriminative enough for image class prediction. To address this issue, we propose a semantic-aware dictionary learning (SADL) framework to explore discriminative visual attributes across seen and unseen classes. Furthermore, the semantic cues are integrated into the feature representations via the learned visual attributes for the recognition task. Experiments conducted on two challenging benchmark datasets show that our approach outperforms other state-of-the-art ZSL methods.
2.
In this paper, a group-sensitive multiple kernel learning (GS-MKL) method is proposed for object recognition to accommodate intraclass diversity and interclass correlation. By introducing the "group" between the object category and individual images as an intermediate representation, GS-MKL attempts to learn group-sensitive multikernel combinations together with the associated classifier. For each object category, the image corpus from that category is partitioned into groups. Images with similar appearance are placed in the same group, which corresponds to a subcategory of the object category. Accordingly, intraclass diversity can be represented by the set of groups from the same category but with diverse appearances; interclass correlation can be represented by the correlation between groups from different categories. GS-MKL provides a tractable way to adapt the multikernel combination to the local data distribution and to seek a tradeoff between capturing the diversity and keeping the invariance for each object category. Unlike the simple hybrid grouping strategy, which solves sample grouping and GS-MKL training independently, two sample grouping strategies are proposed that integrate sample grouping with GS-MKL training. The first is a looping hybrid grouping method, in which a global kernel clustering method and GS-MKL interact with each other by sharing the group-sensitive multikernel combination. The second is a dynamic divisive grouping method, in which a hierarchical kernel-based grouping process interacts with GS-MKL. Experimental results show that the performance of GS-MKL does not vary significantly with the grouping strategy, but the looping hybrid grouping method produces slightly better results. On four challenging data sets, the proposed method achieved encouraging performance comparable to the state of the art and outperformed several existing MKL methods.
3.
4.
Canonical correlation analysis (CCA) is a popular method that has been widely used in information fusion. However, CCA requires the data from the two views to be paired, which is hard to satisfy in real applications. Moreover, it considers only the correlated information of the paired data. Thus, it cannot be used when there is little or no paired data. In this paper, we propose a novel method named Canonical Principal Angles Correlation Analysis (CPACA), which does not need paired data during the training stage. It frees classic CCA from the limitation of paired information. Its objective function is constructed as follows. First, the correlation of the two views is represented by the similarity between the two subspaces spanned by the principal components, which makes CPACA compare favorably with CCA when paired data are limited. Second, to increase the discriminative information of CPACA, we use manifold regularization to exploit the geometry of the marginal distribution. To optimize the objective function, we propose a new method to calculate the projection vectors. The experimental results show that the performance of CPACA is superior to that of traditional CCA and its variants.
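The subspace-similarity idea in the abstract above can be illustrated with principal angles between PCA subspaces. The following is only a minimal sketch, not the CPACA objective itself: it assumes both views live in the same ambient dimension (the abstract does not say so), uses NumPy, and the names `pca_basis`, `principal_angle_cosines`, and the sum-of-squared-cosines `similarity` score are hypothetical choices made for illustration.

```python
import numpy as np

def pca_basis(X, k):
    """Return a (d, k) orthonormal basis of the top-k principal subspace of X."""
    Xc = X - X.mean(axis=0)                      # center the samples
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[:k].T                              # rows of vt are principal directions

def principal_angle_cosines(A, B):
    """Cosines of the principal angles between span(A) and span(B).

    A and B must have orthonormal columns; the cosines are then the
    singular values of A^T B.
    """
    return np.linalg.svd(A.T @ B, compute_uv=False)

rng = np.random.default_rng(0)
view_x = rng.normal(size=(50, 5))   # 50 samples from view 1
view_y = rng.normal(size=(80, 5))   # 80 samples from view 2, not paired with view 1

cosines = principal_angle_cosines(pca_basis(view_x, 2), pca_basis(view_y, 2))
similarity = float(np.sum(cosines ** 2))   # one possible subspace similarity score
print(cosines, similarity)
```

Note that the two views have different sample counts: each subspace is estimated from its own view, so no sample-level pairing is needed.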
5.
6.
7.
Pang Ying Han, Andrew Teoh Beng Jin 《Journal of Visual Communication and Image Representation》2011,22(7):634-642
In this paper, we present a novel and effective feature extraction technique for face recognition. The proposed technique combines the kernel trick with Graph Embedding and Fisher's criterion; we call it Kernel Discriminant Embedding (KDE). It projects the original face samples onto a low-dimensional subspace such that, following Fisher's criterion, the within-class scatter of the face samples is minimized and the between-class scatter is maximized. Applying the kernel trick and the Graph Embedding criterion in the proposed technique reveals the underlying structure of the data. Our experimental results on face recognition using the ORL, FRGC and FERET databases validate the effectiveness of KDE for face feature extraction.
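As a rough illustration of Fisher's criterion mentioned above, the sketch below computes the ratio of between-class to within-class scatter for one-dimensional projected features. It is not the KDE algorithm; the function name `fisher_ratio` and the toy data are assumptions introduced for illustration only.

```python
def fisher_ratio(classes):
    """Between-class scatter divided by within-class scatter for 1-D features.

    `classes` is a list of lists, one list of projected values per class.
    """
    all_values = [x for values in classes for x in values]
    global_mean = sum(all_values) / len(all_values)
    between = 0.0
    within = 0.0
    for values in classes:
        mean = sum(values) / len(values)
        between += len(values) * (mean - global_mean) ** 2   # class mean vs. global mean
        within += sum((x - mean) ** 2 for x in values)       # spread inside the class
    return between / within

# Well-separated classes give a large ratio; overlapping classes give a small one.
separated = fisher_ratio([[1.0, 2.0, 3.0], [7.0, 8.0, 9.0]])
overlapping = fisher_ratio([[1.0, 5.0, 9.0], [2.0, 6.0, 10.0]])
print(separated, overlapping)
```

A projection that scores higher on this ratio keeps samples of the same class close together while pushing class means apart.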
8.
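To address feature extraction in face recognition, a kernel discriminant locality preserving projection algorithm (KDLPP) is proposed. The algorithm maps face samples into a high-dimensional space via the kernel trick and, in that space, builds a new objective function that effectively combines the local manifold structure of faces with their discriminant information. Its advantages are that it fully exploits the class information of the samples while preserving the face manifold structure, and that it extracts nonlinear face features through the kernel method. Experiments on the ORL and UMIST face databases...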
9.
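Transfer learning can use prior experience to assist the current task and has been widely applied in computer vision and speech recognition, but it has not yet achieved notable results in the electromagnetic domain. The electromagnetic environment changes quickly, so the performance of source data or classifier models degrades significantly in a new environment, and retraining not only requires large amounts of data but is also time-consuming and laborious. Transfer learning is highly relevant to electromagnetic target recognition. Using a measured electromagnetic target dataset, this paper explores transfer learning for solving few-shot electromagnetic target...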
10.
Capturing comprehensive information about image content of various sizes and shapes within the same convolution layer is a challenging task in computer vision. There are two main kinds of methods for capturing such features. The first uses the inception structure and its variants. The second uses larger convolution kernels on specific layers or stacks more convolution blocks. However, these methods can be computationally intensive or suffer from vanishing gradients. In this paper, to accommodate feature distributions of different sizes and shapes while reducing computational cost, we propose a width- and depth-aware module, named the WD-module, to match feature distributions. Moreover, the proposed WD-module requires less computation and fewer parameters than traditional residual convolution layers. To verify the effectiveness of the proposed method, a size- and shape-aware backbone network named S2A-Net was built by stacking WD-modules. Visualizations of heat maps and features show that the proposed S2A-Net can adapt to objects of different sizes and shapes in visual recognition tasks and learn more comprehensive characteristics. Experimental results show that the proposed method achieves higher accuracy in image recognition and outperforms other state-of-the-art networks with the same number of layers.
11.
12.
13.
Ouyang Shan, Bao Zheng, Liao Guisheng 《电子科学学刊(英文版)》2000,17(3):270-278
Based on least-squares minimization, a computationally efficient learning algorithm for Principal Component Analysis (PCA) is derived. Dual learning rate parameters are introduced adaptively so that the proposed algorithm achieves fast convergence and high accuracy when extracting all the principal components. It is shown that all the information needed for PCA can be completely represented by the unnormalized weight vector, which is updated based only on the corresponding neuron input-output product. The convergence performance of the proposed algorithm is briefly analyzed. The relation between Oja's rule and the least-squares learning rule is also established. Finally, a simulation example is given to illustrate the effectiveness of this algorithm for PCA.
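For context on the Oja's rule mentioned above, here is a minimal sketch of the classical single-neuron Oja update, which estimates the leading principal direction from the input-output product. It is not the least-squares algorithm derived in the paper; the fixed learning rate, the function name `oja_first_component`, and the synthetic data are assumptions made for illustration.

```python
import math
import random

def oja_first_component(samples, lr=0.01, epochs=50, seed=0):
    """Estimate the leading principal direction of zero-mean samples with Oja's rule."""
    rng = random.Random(seed)
    w = [rng.uniform(-1.0, 1.0) for _ in range(len(samples[0]))]
    for _ in range(epochs):
        for x in samples:
            y = sum(wi * xi for wi, xi in zip(w, x))          # neuron output y = w . x
            # Oja update: w <- w + lr * y * (x - y * w)
            w = [wi + lr * y * (xi - y * wi) for wi, xi in zip(w, x)]
    return w

# Synthetic zero-mean data: large variance along (1, 1)/sqrt(2), small variance across it.
c = 1.0 / math.sqrt(2.0)
data_rng = random.Random(1)
samples = []
for _ in range(200):
    a = data_rng.gauss(0.0, 2.0)    # coordinate along the dominant direction
    b = data_rng.gauss(0.0, 0.3)    # coordinate along the orthogonal direction
    samples.append((a * c - b * c, a * c + b * c))

w = oja_first_component(samples)
norm = math.sqrt(sum(wi * wi for wi in w))
alignment = abs(w[0] * c + w[1] * c) / norm   # |cosine| with the true direction
print(w, norm, alignment)
```

The decay term `- y * y * w` keeps the weight vector near unit length, so no explicit normalization step is needed.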
14.
Although multiple methods have been proposed for human action recognition, existing multi-view approaches cannot adequately discover meaningful relationships among multiple action categories from different views. To handle this problem, this paper proposes a multi-view learning approach for multi-view action recognition. First, the proposed method uses a popular visual representation, bag-of-visual-words (BoVW) or Fisher vector (FV), to represent individual videos in each view. Second, a sparse coding algorithm is used to map the low-level features of the various views into a discriminative, high-level semantic space. Third, we employ multi-task learning (MTL) for joint action modeling and discovery of latent relationships among different action categories. Extensive experimental results on the M2I and IXMAS datasets demonstrate the effectiveness of the proposed approach. Moreover, the experiments further demonstrate that the discovered latent relationships benefit multi-view model learning and improve action recognition performance.
15.
16.
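To mine the local spectral features of hyperspectral data, and starting from the intrinsic nonlinear structure of hyperspectral remote sensing data, a manifold learning dimensionality reduction method for hyperspectral imagery based on the spectral gradient angle is proposed. The localized manifold learning algorithm locality preserving projection (LPP) is used for nonlinear dimensionality reduction of hyperspectral remote sensing data, and its distance metric is improved: the spectral gradient angle similarity measure, which better characterizes the local spectral features of hyperspectral imagery, is applied within LPP. Dimensionality reduction experiments on real hyperspectral images yield results better than those of LPP and of LPP with the spectral angle. The results show that, in terms of spectral normalized eigenvalues, the proposed method outperforms both LPP and LPP with the spectral angle, and that, in terms of information preservation, it retains more local detail. Manifold learning with the spectral gradient angle can achieve good dimensionality reduction for hyperspectral imagery.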
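The two similarity measures contrasted in the abstract above can be sketched as follows. This is an illustration under stated assumptions, not the paper's definition: the spectral gradient is taken here as the plain first difference between adjacent bands, and the function names and toy spectra are hypothetical.

```python
import math

def vector_angle(u, v):
    """Angle in radians between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))   # clamp rounding error
    return math.acos(cosine)

def spectral_angle(u, v):
    """Spectral angle: angle between the two spectra themselves."""
    return vector_angle(u, v)

def spectral_gradient(s):
    """First difference between adjacent bands (assumed gradient definition)."""
    return [b - a for a, b in zip(s, s[1:])]

def spectral_gradient_angle(u, v):
    """Angle between the band-to-band gradients of the two spectra."""
    return vector_angle(spectral_gradient(u), spectral_gradient(v))

spectrum = [0.2, 0.4, 0.6, 0.8]
shifted = [0.7, 0.9, 1.1, 1.3]      # same shape, constant offset of 0.5
reversed_shape = [0.8, 0.6, 0.4, 0.2]

print(spectral_angle(spectrum, shifted))                   # nonzero: offset changes direction
print(spectral_gradient_angle(spectrum, shifted))          # near zero: same local shape
print(spectral_gradient_angle(spectrum, reversed_shape))   # near pi: opposite slopes
```

Under this definition the gradient angle ignores a constant offset between spectra and responds to their band-to-band shape, which is the kind of local spectral feature the abstract refers to.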
17.
18.
The current study puts forward a supervised within-class-similar discriminative dictionary learning (SCDDL) algorithm for face recognition. Popular discriminative dictionary learning schemes for recognition tasks typically incorporate a linear classification error term into the objective function or impose discriminative restrictions on the representation coefficients. In the presented SCDDL algorithm, we propose to directly restrict the representation coefficients to be similar within the same class and simultaneously include the linear classification error term in the supervised dictionary learning scheme, in order to derive a more discriminative dictionary for face recognition. The experimental results on three large, well-known face databases suggest that our approach can increase the Fisher ratio of the representation coefficients compared with several dictionary learning algorithms that incorporate linear classifiers. In addition, the learned discriminative dictionary, the large Fisher ratio of the representation coefficients and the simultaneously learned classifier can improve the recognition rate compared with some state-of-the-art dictionary learning algorithms.
19.
Appearance-based methods have proven useful for face recognition tasks. The main problem with appearance-based methods originates from the multimodality of face images. It is known that images of different people in the original data space can lie closer to each other than images of the same person under different imaging conditions. In this paper, we propose a novel approach based on nonlinear manifold embedding to define a linear subspace for illumination variations. This embedding-based framework uses an optimization scheme to calculate the bases of the subspace. Since the optimization problem does not rely on the physical properties of the factor, the framework can also be used for other types of factors such as pose and expression. We obtained promising recognition results under changing illumination conditions. Our error rates are comparable with those of state-of-the-art methods.
20.
Individual recognition from locomotion is a challenging task owing to large intra-class and small inter-class variations. In this article, we present a novel metric learning method for individual recognition from skeleton sequences. First, we model the articulated body on a Riemannian manifold to describe the essence of human motion, which can reflect the biometric signatures of the enrolled individuals. Then two spatio-temporal metric learning approaches are proposed, namely Spatio-Temporal Large Margin Nearest Neighbor (ST-LMNN) and Spatio-Temporal Multi-Metric Learning (STMM), to learn discriminant bilinear metrics that can encode the spatio-temporal structure of human motion. Specifically, the ST-LMNN algorithm incorporates the bilinear model into the classical Large Margin Nearest Neighbor method and learns a low-dimensional local linear embedding in the spatial and temporal domains, respectively. To further capture the unique motion pattern of each individual, the proposed STMM algorithm learns a set of individual-specific spatio-temporal metrics, which make the projected features of the same person closer to their class mean than to those of different classes by a large margin. Beyond that, we present a new publicly available dataset for locomotion recognition to evaluate the influence of both internal and external covariate factors. According to the experimental results on the three public datasets, we believe that both proposed approaches are able to achieve competitive results in individual recognition.