Similar documents
 Found 20 similar documents (search time: 31 ms)
1.
XML documents have recently become ubiquitous because they are used in a wide range of applications. Classification is an important problem in the data mining domain, but current classification methods for XML documents use IR-based methods in which each document is treated as a bag of words. Such techniques ignore a significant amount of information hidden inside the documents. In this paper we discuss the problem of rule-based classification of XML data by using frequent discriminatory substructures within XML documents. Such a technique is more capable of finding the classification characteristics of documents. In addition, the technique can also be extended to cost-sensitive classification. We show the effectiveness of the method with respect to other classifiers. We note that the methodology discussed in this paper is applicable to any kind of semi-structured data. Editors: Hendrik Blockeel, David Jensen and Stefan Kramer. An erratum to this article is available.

2.
3.
This paper studies regularized discriminant analysis (RDA) in the context of face recognition. We check RDA sensitivity to different photometric preprocessing methods and compare its performance to other classifiers. Our study shows that RDA is better able to extract the relevant discriminatory information from training data than the other classifiers tested, thus obtaining a lower error rate. Moreover, RDA is robust under various lighting conditions while the other classifiers perform badly when no photometric method is applied.

4.
In dimensional affect recognition, the machine learning methods used to model and predict affect are mostly classification and regression. However, annotations in the dimensional affect space usually take the form of continuous real values, which have an ordinal property. The aforementioned methods do not take advantage of this important information. Therefore, we propose an affective rating ranking framework for affect recognition based on face images in the valence and arousal dimensional space. Our approach can appropriately use the ordinal information among affective ratings, which are generated by discretizing continuous annotations. Specifically, we first train a series of basic cost-sensitive binary classifiers, each of which uses all samples relabeled according to the comparison between their ratings and the rank assigned to that binary classifier. We obtain the final affective ratings by aggregating the outputs of the binary classifiers. By comparing the experimental results with the baseline and with deep-learning-based classification and regression methods on the benchmarking database of the AVEC 2015 Challenge and a selected subset of the SEMAINE database, we find that our ordinal ranking method is effective in both the arousal and valence dimensions.
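
The relabel-then-aggregate idea in this abstract can be illustrated with a minimal sketch. It assumes integer ratings and a common aggregation rule (predicted rank = lowest rank plus the number of binary classifiers that answer "greater than k"); the function names `relabel` and `aggregate` are hypothetical, and the cost-sensitive weighting and the actual binary learners described in the abstract are deliberately left out.

```python
from typing import List, Sequence


def relabel(ratings: Sequence[int], k: int) -> List[int]:
    """Binary targets for the k-th sub-problem: 1 if rating > k, else 0."""
    return [1 if r > k else 0 for r in ratings]


def aggregate(binary_outputs: Sequence[int], lowest_rank: int = 1) -> int:
    """Predicted rank = lowest rank + number of 'rating > k' votes."""
    return lowest_rank + sum(binary_outputs)


# Ratings on a 1..4 scale give three binary sub-problems (k = 1, 2, 3).
ratings = [1, 3, 2, 4]
targets = {k: relabel(ratings, k) for k in (1, 2, 3)}

# If the three binary classifiers output [1, 1, 0] for a sample
# (rating > 1, rating > 2, but not > 3), the aggregated rank is 3.
predicted_rank = aggregate([1, 1, 0])
```

Every sample takes part in every binary sub-problem, which is how the ordinal structure of the ratings enters training.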

5.
Automatic analysis of head gestures and facial expressions is a challenging research area with significant applications in human-computer interfaces. We develop a face and head gesture detector for video streams. The detector follows the face landmark paradigm in that it uses the appearance and configuration information of landmarks. First, we accurately detect and track facial landmarks using adaptive templates, a Kalman predictor and subspace regularization. Then the trajectories (time series) of facial landmark positions during the course of the head gesture or facial expression are converted into various discriminative features. Features can be landmark coordinate time series, facial geometric features or patches on expressive regions of the face. We compare two feature-sequence classifiers, namely Hidden Markov Models (HMM) and Hidden Conditional Random Fields (HCRF), and various feature-subspace classifiers, namely ICA (Independent Component Analysis) and NMF (Non-negative Matrix Factorization), on the spatiotemporal data. We achieve 87.3% correct gesture classification on a seven-gesture test database, and the performance reaches 98.2% correct detection under a fusion scheme. Promising and competitive results are also achieved on the classification of naturally occurring gesture clips from the LILiR TwoTalk Corpus.

6.
We present a two-step method to speed up object detection systems in computer vision that use support vector machines as classifiers. In the first step we build a hierarchy of classifiers. On the bottom level, a simple and fast linear classifier analyzes the whole image and rejects large parts of the background. On the top level, a slower but more accurate classifier performs the final detection. We propose a new method for automatically building and training a hierarchy of classifiers. In the second step we apply feature reduction to the top-level classifier by choosing relevant image features according to a measure derived from statistical learning theory. Experiments with a face detection system show that combining feature reduction with hierarchical classification leads to a speed-up by a factor of 335 with similar classification performance.

7.
Automatic classification is one of the basic tasks required in any pattern recognition and human-computer interaction application. In this paper, we discuss training probabilistic classifiers with labeled and unlabeled data. We provide a new analysis that shows under what conditions unlabeled data can be used in learning to improve classification performance. We also show that, if the conditions are violated, using unlabeled data can be detrimental to classification performance. We discuss the implications of this analysis for a specific type of probabilistic classifier, Bayesian networks, and propose a new structure learning algorithm that can utilize unlabeled data to improve classification. Finally, we show how the resulting algorithms are successfully employed in two applications related to human-computer interaction and pattern recognition: facial expression recognition and face detection.

8.
Time-series classification (TSC) problems present a specific challenge for classification algorithms: how to measure similarity between series. A shapelet is a time-series subsequence that allows for TSC based on local, phase-independent similarity in shape. Shapelet-based classification uses the similarity between a shapelet and a series as a discriminatory feature. One benefit of the shapelet approach is that shapelets are comprehensible and can offer insight into the problem domain. The original shapelet-based classifier embeds the shapelet-discovery algorithm in a decision tree and uses information gain to assess the quality of candidates, finding a new shapelet at each node of the tree through an enumerative search. Subsequent research has focused mainly on techniques to speed up the search. We examine how best to use the shapelet primitive to construct classifiers. We propose a single-scan shapelet algorithm that finds the best $k$ shapelets, which are used to produce a transformed dataset, where each of the $k$ features represents the distance between a time series and a shapelet. The primary advantages over the embedded approach are that the transformed data can be used in conjunction with any classifier, and that there is no recursive search for shapelets. We demonstrate that the transformed data, in conjunction with more complex classifiers, gives greater accuracy than the embedded shapelet tree. We also evaluate three similarity measures that produce equivalent results to information gain in less time. Finally, we show that by conducting post-transform clustering of shapelets, we can enhance the interpretability of the transformed data. We conduct our experiments on 29 datasets: 17 from the UCR repository, and 12 we provide ourselves.
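
The transformed dataset described here, in which each of the $k$ features is the distance between a series and a shapelet, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the distance is taken to be the minimum Euclidean distance over all sliding windows of the series, no subsequence normalization is applied, the shapelets are assumed to be already chosen, and the names `shapelet_distance` and `shapelet_transform` are hypothetical.

```python
import math
from typing import List, Sequence


def shapelet_distance(series: Sequence[float], shapelet: Sequence[float]) -> float:
    """Minimum Euclidean distance between the shapelet and any
    equal-length window of the series (assumed distance definition)."""
    m = len(shapelet)
    if m == 0 or m > len(series):
        raise ValueError("shapelet must be non-empty and no longer than the series")
    best = math.inf
    for start in range(len(series) - m + 1):
        squared = sum((series[start + i] - shapelet[i]) ** 2 for i in range(m))
        best = min(best, squared)
    return math.sqrt(best)


def shapelet_transform(
    dataset: Sequence[Sequence[float]],
    shapelets: Sequence[Sequence[float]],
) -> List[List[float]]:
    """Map each series to a k-dimensional vector of shapelet distances."""
    return [[shapelet_distance(s, sh) for sh in shapelets] for s in dataset]


# Two toy series and two toy shapelets give a 2 x 2 feature matrix.
features = shapelet_transform([[0, 0, 3, 4], [0, 0, 0, 0]], [[0, 0], [3, 4]])
```

The resulting rows are ordinary feature vectors, so they can be passed to any classifier, which is the advantage over the embedded tree that the abstract points out.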

9.
Classification using adaptive wavelets for feature extraction   Total citations: 8 (self-citations: 0, citations by others: 8)
A major concern arising from the classification of spectral data is that the number of variables, or dimensionality, often exceeds the number of available spectra. This leads to a substantial deterioration in the performance of traditionally favoured classifiers. It becomes necessary to decrease the number of variables to a manageable size whilst retaining as much discriminatory information as possible. A new technique based on adaptive wavelets, which aims to reduce the dimensionality and optimize the discriminatory information, is presented. The discrete wavelet transform is utilized to produce wavelet coefficients which are used for classification. Rather than using one of the standard wavelet bases, we generate the wavelet which optimizes specified discriminant criteria.

10.
This paper presents combinations of classifiers that achieve high classification accuracy. Traditionally, maximum likelihood classification is used as the initial classification for a contextual classifier. We show that by using different non-parametric spectral classifiers to obtain the initial classification, we can significantly improve the accuracy of the classification at a reasonable computational cost. In this work we propose the use of different spectral classifications as initial maps for a contextual classifier (ICM) in order to obtain some interesting combinations of spectral-contextual classifiers for remote sensing image classification with an acceptable trade-off between the accuracy of the final classification and the computational effort required.

11.
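郑豪 (Zheng Hao), 金忠 (Jin Zhong). 《计算机工程》 (Computer Engineering), 2011, 37(16): 155-157
To make full use of the class information of samples, a supervised sparsity-preserving neighborhood embedding algorithm (SSNPE) is proposed. The algorithm combines the ideas of sparse representation and neighborhood preservation, and preserves the intrinsic geometric relations of local neighborhoods according to prior class-label information. The recognition rate is estimated with a nearest-neighbor classifier, and the test results show that SSNPE achieves a high recognition rate under variations in pose, illumination, and expression.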

12.
Subspace face recognition often suffers from two problems: (1) the training sample set is small compared with the high dimensional feature vector; (2) the performance is sensitive to the subspace dimension. Instead of pursuing a single optimal subspace, we develop an ensemble learning framework based on random sampling on all three key components of a classification system: the feature space, training samples, and subspace parameters. Fisherface and Null Space LDA (N-LDA) are two conventional approaches to address the small sample size problem. But in many cases, these LDA classifiers are overfitted to the training set and discard some useful discriminative information. By analyzing different overfitting problems for the two kinds of LDA classifiers, we use random subspace and bagging to improve them respectively. By random sampling on feature vectors and training samples, multiple stabilized Fisherface and N-LDA classifiers are constructed and the two groups of complementary classifiers are integrated using a fusion rule, so nearly all the discriminative information is preserved. In addition, we further apply random sampling on parameter selection in order to overcome the difficulty of selecting optimal parameters in our algorithms. Then, we use the developed random sampling framework for the integration of multiple features. A robust random sampling face recognition system integrating shape, texture, and Gabor responses is finally constructed.

13.
14.
In this paper, we present a detailed study and comparison of different classification algorithms. Our main purpose is to study the Vicinal Support Vector Classifier (VSVC) and its relations to the other state-of-the-art classifiers. To this end, we start with the historical development of each classifier, derive the mathematics behind it, and describe the relations that exist between some of them, in particular the relation between the VSVC and the other classifiers. Thereafter, we apply them to two well-known learning datasets widely used by the research community, namely the MIT-CBCL face and the Wisconsin Diagnostic Breast Cancer (WDBC) datasets. We show that, despite its simplicity compared to the other state-of-the-art classifiers, the VSVC leads to very robust classification results and provides some practical advantages over the other classifiers.

15.
Face recognition using fuzzy Integral and wavelet decomposition method   Total citations: 2 (self-citations: 0, citations by others: 2)
In this paper, we develop a method for recognizing face images by combining wavelet decomposition, the Fisherface method, and the fuzzy integral. The proposed approach comprises four main stages. The first stage uses wavelet decomposition, which helps extract intrinsic features of face images. As a result of this decomposition, we obtain four subimages (namely the approximation, horizontal, vertical, and diagonal detail images). The second stage applies the Fisherface method to these four subimages. The choice of the Fisherface method in this setting is motivated by its insensitivity to large variations in light direction, face pose, and facial expression. The last two stages are concerned with the aggregation of the individual classifiers by means of the fuzzy integral. Both the Sugeno and Choquet types of fuzzy integral are considered as the aggregation method. In the experiments we use n-fold cross-validation to ensure high consistency of the classification outcomes. The experimental results obtained for the Chungbuk National University (CNU) and Yale University face databases reveal that the approach presented in this paper yields better classification performance than other classifiers.

16.
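Ensemble learning is widely used to improve classification accuracy, and recent studies show that constructing ensemble classifiers with a multimodal perturbation strategy can further improve classification performance. This paper proposes an ensemble pruning algorithm based on approximate reducts and optimal sampling (EPA_AO). In EPA_AO, we design a multimodal perturbation strategy to construct diverse individual classifiers. The strategy perturbs the attribute space and the training set simultaneously, which increases the diversity of the individual classifiers. We train the individual classifiers with the evidential KNN (K-nearest neighbor) algorithm and compare the performance of EPA_AO with existing algorithms of the same type on multiple UCI datasets. The experimental results show that EPA_AO is an effective ensemble learning method.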

17.
When using traditional digital classification algorithms, a researcher typically encounters serious difficulties in identifying urban land cover classes from high resolution data. A common approach is to use spectral information alone, ignoring spatial information and groups of pixels that need to be considered together as an object. We used QuickBird image data over a central region in the city of Phoenix, Arizona to examine whether an object-based classifier can accurately identify urban classes. To test whether spectral information alone is practical in urban classification, we used spectra of the selected classes from randomly selected points to examine whether they can be effectively discriminated. The overall accuracy based on spectral information alone reached only about 63.33%. We employed five different classification procedures with the object-based paradigm, which separates spatially and spectrally similar pixels at different scales. The classifiers used in the study to assign land covers to segmented objects include membership functions and the nearest neighbor classifier. The object-based classifier achieved a high overall accuracy (90.40%), whereas the most commonly used decision rule, namely the maximum likelihood classifier, produced a lower overall accuracy (67.60%). This study demonstrates that the object-based classifier is a significantly better approach than the classical per-pixel classifiers. Further, this study reviews the application of different parameters for segmentation and classification, the combined use of composite and original bands, the selection of different scale levels, and the choice of classifiers. Strengths and weaknesses of the object-based prototype are presented, and we provide suggestions to avoid or minimize the uncertainties and limitations associated with the approach.

18.
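张丹 (Zhang Dan), 杨斌 (Yang Bin), 张瑞禹 (Zhang Ruiyu). 《遥感信息》 (Remote Sensing Information), 2009, (5): 41-43, 55
In remote sensing image classification, different classifiers achieve different accuracies, and the same classifier achieves different accuracies on different classes. The idea of combining multiple classifiers is to exploit the complementarity among existing classifiers: by combining their strengths with a suitable method, one can often obtain better classification results than with any single classifier. This paper studies how to classify imagery in Matlab with a minimum-distance classifier, a Bayesian classifier, and a BP neural network classifier, combines the classifiers by voting, and finally applies post-classification processing. The experimental results show that combining multiple classifiers yields higher accuracy than a single classifier.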
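
The voting step that combines several classifiers can be sketched as a per-pixel majority vote. This is a minimal illustration in Python rather than the Matlab used in the paper; it assumes each classifier has already produced one label per pixel, the function name `majority_vote` and the class labels are hypothetical, and ties are resolved in favor of the label that appears first in classifier order, which is an assumption and not something the abstract specifies.

```python
from collections import Counter
from typing import List, Sequence


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Fuse per-pixel labels from several classifiers by plurality vote.

    predictions[i][j] is the label classifier i assigns to pixel j.
    On a tie, the label seen first (earliest classifier) wins.
    """
    fused = []
    for labels in zip(*predictions):
        fused.append(Counter(labels).most_common(1)[0][0])
    return fused


# Hypothetical outputs of three classifiers for three pixels.
minimum_distance = ["water", "forest", "urban"]
bayes = ["water", "urban", "urban"]
bp_network = ["forest", "forest", "urban"]

fused_map = majority_vote([minimum_distance, bayes, bp_network])
```

With three classifiers and more than two classes, a three-way disagreement is possible, so a real pipeline would need an explicit tie-breaking policy.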

19.
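This paper proposes an ensemble learning framework for sentiment classification of Chinese online reviews. First, to address the complexity and diversity of Chinese online reviews, part-of-speech combination patterns, frequent word sequence patterns, and order-preserving submatrix patterns are used as input features. Then, a random subspace algorithm based on information gain is used to handle the large number of text features while improving the classification performance of the base classifiers. Finally, base classifiers are constructed based on product attributes to integrate the sentiment information of each attribute in the review text and thereby determine the sentence-level sentiment polarity of a review. Experimental results demonstrate the effectiveness of the framework for sentiment classification of Chinese online reviews; in particular, accuracy reaches 90.3% with the Logistic Regression classification algorithm.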

20.
In this paper, we describe a novel multiclass boosting algorithm, EDBoost, to achieve robust face recognition directly in the JPEG compressed domain. In comparison with existing boosting algorithms, the proposed EDBoost exploits Euclidean distance (ED) to eliminate non-effective weak classifiers in each iteration of the boosted learning, and hence improves both feature selection and classifier learning by using fewer weak classifiers and producing lower error rates. When applied to face recognition, the EDBoost algorithm is capable of selecting the most discriminative DCT features directly in the JPEG compressed domain to achieve high recognition performance. In addition, a new DC replacement scheme is proposed to reduce the effect of illumination changes. In comparison with existing techniques, the proposed scheme achieves robust face recognition without losing the important information carried by the DC coefficients. Extensive experiments support the conclusion that the proposed algorithm outperforms all representative existing techniques in terms of boosted learning, multiclass classification, lighting effect reduction and face recognition rates.
