Similar literature
1.
This paper presents a model-based vision recognition engine for planar contours that are scale-invariant instances of known models. Features are obtained by using a constant-curvature criterion and used to carry out efficient coarse-to-fine recognition. A robust shape matching is proposed for comparing contour fragments from scenes with partial occlusion. In order to prune a large portion of the models early, hypotheses are generated only for a subset of contours with enough discriminative information. Poor scene contours are used later to validate or invalidate a relatively small set of hypotheses. Since hypotheses are selectively verified, blocking is avoided by extending the current matching through pairing of hypotheses, predictive matching, and retrieving the next weighted hypotheses. This avoids the processing of a large number of initial hypotheses. The authors' evaluation shows that too small a bucket size results in a high recognition error because the indexes may fall at random, producing nonrepeatable results. They use a multidimensional hashing scheme with space separation between dense parameter areas to create additional hashing tables. The robustness of the recognition is based on engineering a coarse bucket size to the best tolerance with respect to various sources of noise. Partially occluded scenes having three objects can be recognized with a success rate of 84%. The results are reproducible against changes in scale, rotation, and translation. Due to the selection of robust initial hypotheses and the structure of the selective matching system, the processing time depends essentially on scene complexity, with only a marginal dependence on database size.

2.
Tracking low-resolution (LR) targets is a practical yet quite challenging problem in real video analysis applications. The lack of discriminative details in the visual appearance of the LR target leads to matching ambiguity, which confronts most existing tracking methods. Although artificially enhancing the video resolution by superresolution (SR) techniques before analysis might be an option, its high computational cost can hardly meet the requirements of the tracking scenario. This paper presents a novel solution to track LR targets without explicitly performing SR. This new approach is based on discriminative metric preservation that preserves the data affinity structure in the high-resolution (HR) feature space for effective and efficient matching of LR images. In addition, we substantialize this new approach in a solid case study of differential tracking under metric preservation and derive a closed-form solution to motion estimation for LR video. Furthermore, this paper extends the basic linear metric preservation method to a more powerful nonlinear kernel metric preservation method. Such a solution to LR target tracking is discriminative, robust, and efficient. Extensive experiments validate the entrustments and effectiveness of the proposed approach and demonstrate the improved performance of the proposed method in tracking LR targets.

3.
A discriminative temporal feature processing method for robust speech recognition is presented that combines knowledge-based and statistical methods. The cepstral features are first filtered by a RASTA method based on human hearing perception and then processed using the minimum classification error algorithm. Improved recognition performance can be achieved in both quiet and noisy environments.
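
The RASTA filtering step mentioned above can be illustrated with a minimal sketch. It assumes the classic RASTA band-pass filter with numerator coefficients 0.1 × (2, 1, 0, -1, -2) and a single pole at 0.98; the abstract does not say which variant the paper uses, and the function name `rasta_filter` is hypothetical.

```python
def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one cepstral-coefficient trajectory (one value per frame).

    Assumed transfer function (classic RASTA, causal form):
        H(z) = 0.1 * (2 + z^-1 - z^-3 - 2 z^-4) / (1 - pole * z^-1)
    """
    # The numerator taps sum to zero, so constant (channel-like) offsets are removed.
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]
    out = []
    prev_y = 0.0
    for n in range(len(trajectory)):
        # FIR part: weighted sum of the current and four previous frames.
        fir = sum(b * trajectory[n - k] for k, b in enumerate(numer) if n - k >= 0)
        # IIR part: leaky integrator with the assumed pole.
        prev_y = fir + pole * prev_y
        out.append(prev_y)
    return out


if __name__ == "__main__":
    constant_offset = [1.0] * 500
    filtered = rasta_filter(constant_offset)
    print(filtered[0], filtered[-1])  # the offset decays toward zero over time
```

Because the filter acts along time for each coefficient separately, a stationary offset in the cepstral domain is suppressed while speech-rate modulations pass through.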

4.
The performance of fine-grained recognition has been greatly improved thanks to the rapid development of deep convolutional neural networks (DCNN). However, DCNN methods often treat each image region equally, and researchers often rely only on visual information for classification. To solve these problems, we propose a novel discriminative semantic region selection method for fine-grained recognition (DSRS). We first select a few image regions and then use pre-trained DCNN models to predict their semantic correlations with the corresponding classes. We use both visual and semantic representations to represent image regions. The visual and semantic representations are then linearly combined into a joint representation. The combination parameters are determined by considering both semantic distinctiveness and spatial-semantic correlations. We use the joint representations for classifier training. A test image is classified by obtaining its visual and semantic representations and encoding them into the joint representation. Experiments on several publicly available datasets demonstrate the proposed method's superiority.

5.
朱二莉  彭波  刘志中 《电视技术》2015,39(11):77-82
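To address the problem of noisy labels in spontaneous facial expression recognition, an adaptive robust online metric learning method is proposed. First, a new metric space is learned to increase the discriminability of different facial expressions. Then, sensitivity and specificity are defined to characterize each annotator. Finally, a latent variable representing the true class label is introduced, and the distance metric and the reliability of the annotators are solved iteratively within an expectation-maximization framework. Experimental results on the MFP and AR face databases show that, compared with several other recent methods, the proposed method achieves higher accuracy in spontaneous expression recognition, with the recognition rate for the happy expression reaching 99.7%, and that it reduces the computational cost to some extent.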

6.
Constructing the bag-of-features model from space-time interest points (STIPs) has been successfully utilized for human action recognition. However, how to eliminate the large number of STIPs that are irrelevant for representing a specific action in realistic scenarios, and how to select discriminative codewords for an effective bag-of-features model, still need further investigation. In this paper, we propose to select more representative codewords based on our pruned interest points algorithm so as to reduce computational cost as well as improve recognition performance. By taking human perception into account, an attention-based saliency map is employed to choose salient interest points which fall into salient regions, since visual saliency can provide strong evidence for the location of acting subjects. After salient interest points are identified, each human action is represented with the bag-of-features model. In order to obtain more discriminative codewords, an unsupervised codeword selection algorithm is utilized. Finally, the Support Vector Machine (SVM) method is employed to perform human action recognition. Comprehensive experimental results on the widely used and challenging Hollywood-2 Human Action (HOHA-2) dataset and YouTube dataset demonstrate that our proposed method is computationally efficient while achieving improved performance in recognizing realistic human actions.

7.
The need for movement smoothness quantification to assess motor learning and recovery has resulted in various measures that look at different aspects of a movement's profile. This paper first shows that most of the previously published smoothness measures lack validity, consistency, sensitivity, or robustness. It then introduces and evaluates the spectral arc-length metric that uses a movement speed profile's Fourier magnitude spectrum to quantify movement smoothness. This new metric is systematically tested and compared to other smoothness metrics, using experimental data from stroke and healthy subjects as well as simulated movement data. The results indicate that the spectral arc-length metric is a valid and consistent measure of movement smoothness, which is both sensitive to modifications in motor behavior and robust to measurement noise. We hope that the systematic analysis of this paper is a step toward the standardization of the quantitative assessment of movement smoothness.
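
As a rough illustration of a spectral arc-length computation, the sketch below takes the Fourier magnitude spectrum of a speed profile, normalizes it by its DC value, and returns the negative length of the resulting curve up to a cutoff frequency. The 20 Hz cutoff, the zero-padded DFT length, and all function names are assumptions made for this example and are not taken from the abstract.

```python
import cmath
import math


def magnitude_spectrum(speed, fs, n_fft, f_cut):
    """Return (freqs, DC-normalized magnitudes) of the speed profile up to f_cut Hz."""
    freqs, mags = [], []
    for k in range(n_fft // 2 + 1):
        f = k * fs / n_fft
        if f > f_cut:
            break
        # Naive DFT bin; samples beyond len(speed) act as implicit zero padding.
        acc = sum(v * cmath.exp(-2j * math.pi * k * n / n_fft)
                  for n, v in enumerate(speed))
        freqs.append(f)
        mags.append(abs(acc))
    dc = mags[0]
    return freqs, [m / dc for m in mags]


def spectral_arc_length(speed, fs, n_fft=512, f_cut=20.0):
    """Negative arc length of the normalized magnitude spectrum (closer to 0 = smoother)."""
    freqs, mags = magnitude_spectrum(speed, fs, n_fft, f_cut)
    span = freqs[-1] - freqs[0]
    length = 0.0
    for i in range(1, len(freqs)):
        d_freq = (freqs[i] - freqs[i - 1]) / span  # frequency axis scaled to [0, 1]
        d_mag = mags[i] - mags[i - 1]
        length += math.hypot(d_freq, d_mag)
    return -length


if __name__ == "__main__":
    fs = 100  # samples per second (assumed)
    t = [n / fs for n in range(fs)]
    smooth = [math.sin(math.pi * x) ** 2 for x in t]  # bell-shaped speed profile
    jerky = [s * (1 + 0.5 * math.sin(2 * math.pi * 5 * x)) for s, x in zip(smooth, t)]
    print(spectral_arc_length(smooth, fs), spectral_arc_length(jerky, fs))
```

A profile with superimposed oscillations adds bumps to the spectrum, which lengthens the curve and makes the value more negative than for the single bell-shaped movement.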

8.
The main issue in personal authentication systems for military, security, industrial and social applications is accuracy. This paper presents a finger knuckle print (FKP) recognition approach to identity authentication. It applies a discriminative common vectors (DCV) based method to obtain the unique feature vectors, called discriminative common vectors, and uses the Euclidean distance as the matching strategy for the identification and verification tasks. The recognition process can be divided into the following phases: capturing the image; pre-processing; extracting the discriminative common vectors; matching; and, finally, making a decision. In order to test and evaluate the proposed approach, both the most representative public FKP databases and an established non-uniform FKP database were used. Experiments with these databases confirm that the DCV-based FKP recognition method achieves the authentication tasks effectively. The results show a recognition rate of 100% on both training data and unseen test data.

9.
A new rotation-invariant pattern recognition system is proposed and analyzed. In this system, silicon retina cells capable of image sensing and edge extraction are used so that the system can directly process images from the real world without an extra edge detector. The rotation-invariant discrete correlation function is modified and implemented in the silicon retina structure by using current summation. Simulation results have verified the correct function of the proposed system. Moreover, an experimental chip implementing the proposed system with a 32×32 cell array has been designed and fabricated in a 0.8-μm n-well CMOS process. Experimental results show that the system works well for recognizing patterns of arbitrary orientation.

10.
Individual recognition from locomotion is a challenging task owing to large intra-class and small inter-class variations. In this article, we present a novel metric learning method for individual recognition from skeleton sequences. First, we propose to model the articulated body on a Riemannian manifold to describe the essence of human motion, which can reflect biometric signatures of the enrolled individuals. Then two spatio-temporal metric learning approaches are proposed, namely Spatio-Temporal Large Margin Nearest Neighbor (ST-LMNN) and Spatio-Temporal Multi-Metric Learning (STMM), to learn discriminant bilinear metrics which can encode the spatio-temporal structure of human motion. Specifically, the ST-LMNN algorithm extends the bilinear model into the classical Large Margin Nearest Neighbor method, which learns a low-dimensional local linear embedding in the spatial and temporal domains, respectively. To further capture the unique motion pattern of each individual, the proposed STMM algorithm learns a set of individual-specific spatio-temporal metrics, which make the projected features of the same person closer to their class mean than those of different classes by a large margin. Beyond that, we present a new publicly available dataset for locomotion recognition to evaluate the influence of both internal and external covariant factors. According to the experimental results on three public datasets, we believe that both proposed approaches are able to achieve competitive results in individual recognition.

11.
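Traditional face recognition methods place high demands on the quality of the face image to be recognized, and require that the illumination of the captured face image not differ too much from that of the face training set. This restricts the environmental conditions under which a face recognition system can operate and thus limits the application of face recognition. To relax these requirements on environmental conditions and overcome the influence of illumination on face recognition, this paper analyzes the amplitude-frequency and phase-frequency characteristics of face images and proposes a face recognition method based on frequency-domain illumination normalization. After normalization, an image captured under any illumination condition has exactly the same illumination as the images in the training set, while the discriminability of the face is retained. Because the amount of information that distinguishes one face from another is generally small, the paper uses the eigenvectors corresponding to the smallest nonzero eigenvalues as face features. Simulation experiments show that, compared with traditional methods, the proposed face recognition method is robust to illumination changes.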

12.
Techniques currently under investigation for use in the processing of 2-dimensional patterns are reported. Particular reference is made to the design of a parallel-processing computer which is being developed for this task, and some preliminary applications in the segmentation and feature-detection phases of the recognition of handwritten alphanumeric characters are outlined.

13.
Novel sets of operators for assigning direction codes to black-white boundaries in 2-dimensional binary pictures are proposed. These have application in pattern recognition, and are particularly suitable for use in parallel-processing systems.

14.
White  I. 《Electronics letters》1971,7(16):458-460
A 1-dimensional shift invariant is proposed which is a mapping of the real function g(t) such that, for g(t) ≠ h(t+τ) for any τ, the invariant function Rg(·) is not equal to Rh(·). It is a special case of the 2nd-order autocorrelation discussed by McLaughlin et al.
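
The property described above can be illustrated on discrete circular sequences with the general second-order autocorrelation R(a, b) = Σ_t g(t) g(t+a) g(t+b). This is only a sketch of the general idea, not the specific invariant proposed in the letter; the function names and the test sequence are assumptions. It also shows that the ordinary first-order autocorrelation is shift invariant but cannot tell a sequence from its mirror image.

```python
def first_order_autocorr(g):
    """Circular autocorrelation A(a) = sum_t g[t] * g[t+a]."""
    n = len(g)
    return [sum(g[t] * g[(t + a) % n] for t in range(n)) for a in range(n)]


def second_order_autocorr(g):
    """Circular second-order autocorrelation R(a, b) = sum_t g[t] * g[t+a] * g[t+b]."""
    n = len(g)
    return [[sum(g[t] * g[(t + a) % n] * g[(t + b) % n] for t in range(n))
             for b in range(n)]
            for a in range(n)]


def circular_shift(g, tau):
    """Return h with h[t] = g[t + tau] (indices taken modulo the length)."""
    n = len(g)
    return [g[(t + tau) % n] for t in range(n)]


if __name__ == "__main__":
    g = [1, 2, 0, 3, 0]
    shifted = circular_shift(g, 2)
    mirrored = g[::-1]  # not a circular shift of g

    # Both invariants ignore the shift.
    print(second_order_autocorr(g) == second_order_autocorr(shifted))
    # The first-order autocorrelation cannot separate g from its mirror image ...
    print(first_order_autocorr(g) == first_order_autocorr(mirrored))
    # ... but the second-order autocorrelation can.
    print(second_order_autocorr(g) == second_order_autocorr(mirrored))
```

Integer sequences are used so that the comparisons are exact rather than subject to floating-point rounding.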

15.
The wavelet transform has been found to be an effective tool for the time-frequency analysis of non-stationary and quasi-stationary signals. In recent years, the wavelet transform has been used for feature extraction in speech recognition applications. In this paper, a sub-band feature extraction technique based on an admissible wavelet transform is proposed, and the features are modified to make them robust to additive white Gaussian noise. The performance of this system is compared with that of conventional mel frequency cepstral coefficients (MFCC) under various signal-to-noise ratios. The recognition performance based on the eight sub-band features is found to be superior to that of MFCC features under noisy conditions.

16.
Iris recognition is a rapidly developing biometric technology. In this paper, speeded up robust features (SURFs) are used for detecting and describing iris keypoints. For feature matching, simple fusion rules are applied at different levels. Contrast-limited adaptive histogram equalization (CLAHE) is applied to the normalized image and is compared with histogram equalization (HE) and adaptive histogram equalization (AHE). The aim is to find the best enhancement technique to use with SURF and to verify the necessity of iris image enhancement. The recognition accuracy in each case is calculated. Experimental results demonstrate that CLAHE is a crucial enhancement step for SURF-based iris recognition. More keypoints can be extracted after enhancement with CLAHE than with HE or AHE. This alleviates the problem of feature loss and increases the recognition accuracy. The recognition accuracies using left and right iris images are 99% and 99.5%, respectively. Fusing local distances and choosing suitable fusion rules noticeably affect the recognition accuracy. The proposed SURF-based algorithm is compared with scale-invariant feature transform, histogram of oriented gradients, maximally stable extremal regions and DAISY. Results show that the proposed algorithm is robust to different image variations and gives the highest recognition accuracy.

17.
To utilize the supra-segmental nature of Mandarin tones, this article proposes a feature extraction method for hidden Markov model (HMM) based tone modeling. The method uses linear transforms to project F0 (fundamental frequency) features of neighboring syllables as compensations, and adds them to the original F0 features of the current syllable. The transforms are discriminatively trained using an objective function termed "minimum tone error", which is a smooth approximation of tone recognition accuracy. Experiments show that the new tonal features achieve a 3.82% improvement in tone recognition rate over the baseline, which uses a maximum-likelihood-trained HMM on the normal F0 features. Further experiments show that discriminative HMM training on the new features is 8.78% better than the baseline.
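
The compensation step can be sketched as follows: the F0 features of the previous and next syllables are each projected by a linear transform and the results are added to the current syllable's features. The feature dimension, the matrices, and the names `compensate_f0`, `a_prev` and `a_next` are hypothetical; in the article the transforms are trained discriminatively, which this sketch does not attempt.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def compensate_f0(prev_f0, cur_f0, next_f0, a_prev, a_next):
    """Add linearly projected neighbor F0 features to the current syllable's features."""
    from_prev = matvec(a_prev, prev_f0)
    from_next = matvec(a_next, next_f0)
    return [c + p + q for c, p, q in zip(cur_f0, from_prev, from_next)]


if __name__ == "__main__":
    # Toy 2-dimensional F0 features per syllable (illustrative values only).
    prev_f0, cur_f0, next_f0 = [0.5, 0.5], [1.0, 2.0], [3.0, 4.0]
    identity = [[1.0, 0.0], [0.0, 1.0]]
    zeros = [[0.0, 0.0], [0.0, 0.0]]
    # With zero transforms the original features are returned unchanged.
    print(compensate_f0(prev_f0, cur_f0, next_f0, zeros, zeros))
    # With an identity transform on the previous syllable, its features are added directly.
    print(compensate_f0(prev_f0, cur_f0, next_f0, identity, zeros))
```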

18.
The accuracy of face alignment greatly affects the performance of a face recognition system. Since face alignment is usually conducted using eye positions, an algorithm for accurate eye localization is essential for accurate face recognition. In this paper, an algorithm is proposed for eye localization. First, a suitable AdaBoost detector is adaptively trained to segment the eye region based on the special gray-level distribution in the region. After that, a fast radial symmetry operator is used to precisely locate the centers of the eyes. Experimental results show that the method can accurately locate the eyes and that it is robust to variations in face pose, illumination, expression, and accessories.

19.
In this paper, we propose a new multi-manifold metric learning (MMML) method for the task of face recognition based on image sets. Unlike most existing metric learning algorithms, which learn a distance metric for measuring single images, our method aims to learn distance metrics that measure the similarity between manifold pairs. In our method, each image set is modeled as a manifold and then multiple distance metrics among different manifolds are learned. With these distance metrics, the intra-class manifold variations are minimized and the inter-class manifold variations are maximized simultaneously. For each person, we learn a distance metric under this criterion, so that all the learned distance metrics are person-specific and thus more discriminative. Our method is extensively evaluated on three widely studied face databases, i.e., the Honda/UCSD database, the CMU MoBo database and the YouTube Celebrities database, and compared to the state of the art. Experimental results are presented to show the effectiveness of the proposed method.
