Similar Documents
20 similar documents found; search time: 187 ms
1.
Zhou  Enwang  Khotanzad  Alireza   《Pattern Recognition》2007,40(12):3401-3414
A new method for designing a fuzzy-rule-based classifier using genetic algorithms (GAs) is discussed. The optimal parameters of the fuzzy classifier, including the fuzzy membership functions and the size and structure of the fuzzy rules, are extracted from the training data using GAs. This is done by introducing new representation schemes for fuzzy membership functions and fuzzy rules. An effectiveness measure for fuzzy rules is developed that allows for systematic addition or deletion of rules during the GA optimization process. A clustering method is utilized for generating new rules to be added when additions are required. The performance of the classifier is tested on two real-world databases (Iris and Wine) and a simulated Gaussian database. The results indicate that highly accurate classifiers can be designed with relatively few fuzzy rules. The performance is also compared with that of other fuzzy classifiers tested on the same databases.
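A minimal sketch of the idea, assuming a toy 1-D two-class problem, two Gaussian-membership fuzzy rules, and a bare-bones GA (tournament selection, blend crossover, Gaussian mutation). The paper's chromosome encoding, rule add/delete mechanism, and clustering step are not reproduced here; all names and parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 1-D data: class 0 centered at -1, class 1 centered at +1
x = np.concatenate([rng.normal(-1, 0.6, 100), rng.normal(1, 0.6, 100)])
y = np.concatenate([np.zeros(100), np.ones(100)])

def classify(params, x):
    """Two fuzzy rules with Gaussian membership functions:
    IF x is A0 THEN class 0; IF x is A1 THEN class 1.
    params = (c0, s0, c1, s1): centers and widths, the chromosome."""
    c0, s0, c1, s1 = params
    mu0 = np.exp(-(x - c0) ** 2 / (2 * s0 ** 2))
    mu1 = np.exp(-(x - c1) ** 2 / (2 * s1 ** 2))
    return (mu1 > mu0).astype(int)

def fitness(params):
    return (classify(params, x) == y).mean()  # training accuracy as fitness

# Bare-bones GA loop over a population of 40 chromosomes
pop = rng.uniform([-3, 0.1, -3, 0.1], [3, 2, 3, 2], size=(40, 4))
for gen in range(30):
    fit = np.array([fitness(p) for p in pop])
    # tournament selection: each slot keeps the fitter of two random candidates
    idx = rng.integers(0, 40, (40, 2))
    parents = pop[np.where(fit[idx[:, 0]] > fit[idx[:, 1]], idx[:, 0], idx[:, 1])]
    # blend crossover with shuffled mates, plus Gaussian mutation
    mates = parents[rng.permutation(40)]
    alpha = rng.random((40, 1))
    pop = alpha * parents + (1 - alpha) * mates + rng.normal(0, 0.05, (40, 4))
    pop[:, [1, 3]] = np.abs(pop[:, [1, 3]]) + 1e-3   # keep widths positive

best = max(pop, key=fitness)
print("best rule params:", np.round(best, 2), "accuracy:", fitness(best))
```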

2.
Mixed group ranks: preference and confidence in classifier combination
Classifier combination holds the potential of improving performance by combining the results of multiple classifiers. For domains with very large numbers of classes, such as biometrics, we present an axiomatic framework of desirable mathematical properties for combination functions of rank-based classifiers. This framework represents a continuum of combination rules, including the Borda Count, Logistic Regression, and Highest Rank combination methods as extreme cases. Intuitively, the framework captures how the two complementary concepts of a combiner's general preference for specific classifiers and its confidence in any specific result (as indicated by ranks) can be balanced while maintaining a consistent rank interpretation. Mixed Group Ranks (MGR) is a new combination function that balances preference and confidence by generalizing these other functions. We demonstrate that MGR is an effective combination approach through multiple experiments on data sets with large numbers of classes and classifiers from the FERET face recognition study.
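To make the extreme cases concrete, here is a hedged sketch of Borda count and highest-rank fusion over rank matrices, plus a naive preference-weighted variant loosely in the spirit of balancing preference (weights) and confidence (ranks). MGR itself is parameterized differently and is not reproduced here; all data and weights are illustrative.

```python
import numpy as np

# ranks[i, j] = rank that classifier i assigns to class j (rank 1 = most preferred)
ranks = np.array([
    [1, 3, 2, 4],   # classifier A
    [2, 1, 4, 3],   # classifier B
    [1, 2, 3, 4],   # classifier C
])

def borda_count(ranks):
    """Sum of ranks per class; the smallest total wins."""
    return ranks.sum(axis=0).argmin()

def highest_rank(ranks):
    """The best (minimum) rank any classifier gives each class wins."""
    return ranks.min(axis=0).argmin()

def weighted_rank_fusion(ranks, weights):
    """Naive variant: trusted classifiers contribute more to the rank sum."""
    return (weights[:, None] * ranks).sum(axis=0).argmin()

print(borda_count(ranks), highest_rank(ranks),
      weighted_rank_fusion(ranks, np.array([0.5, 0.3, 0.2])))
```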

3.
Several researchers have shown that substantial improvements can be achieved in difficult pattern recognition problems by combining the outputs of multiple neural networks. In this work, we present and test a pattern classification multi-net system based on both supervised and unsupervised learning. Following the ‘divide-and-conquer’ framework, the input space is partitioned into overlapping subspaces, and neural networks are subsequently used to solve the respective classification subtasks. Finally, the outputs of the individual classifiers are appropriately combined to obtain the final classification decision. Two clustering methods have been applied for input space partitioning, and two schemes have been considered for combining the outputs of the multiple classifiers. Experiments on well-known data sets indicate that the multi-net classification system exhibits promising performance compared with single network training, both in terms of error rates and in terms of training speed (especially if the classifiers are trained in parallel).
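A minimal sketch of the divide-and-conquer scheme, assuming hard (non-overlapping) KMeans partitions and one MLP per subspace, with each test point routed to the network of its nearest cluster. The paper uses overlapping subspaces and more elaborate combination schemes; dataset and model choices here are illustrative.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_classification(n_samples=1500, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Partition the input space and train one network per subspace
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(Xtr)
nets = []
for c in range(3):
    mask = km.labels_ == c
    nets.append(MLPClassifier(max_iter=500, random_state=0).fit(Xtr[mask], ytr[mask]))

# Combine: each test point is classified by its own subspace's network
assignments = km.predict(Xte)
preds = np.array([nets[c].predict(x.reshape(1, -1))[0]
                  for c, x in zip(assignments, Xte)])
print("multi-net accuracy:", (preds == yte).mean())
```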

4.
Analysis of a Plurality Voting-based Combination of Classifiers
In various studies, it has been demonstrated that combining the decisions of multiple classifiers can lead to better recognition results. Plurality voting is one of the most widely used combination strategies. In this paper, we analyze, both theoretically and experimentally, the performance of a plurality-voting-based ensemble classifier. Theoretical expressions for system performance are derived as a function of the model parameters: N (number of classifiers), m (number of classes), and p (probability that a single classifier is correct). Experimental results on the human face recognition problem show that the voting strategy can achieve high detection and identification rates and, simultaneously, low false acceptance rates.
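The paper's closed-form expressions are not reproduced here, but the quantity they describe is easy to estimate by simulation. A sketch, assuming independent classifiers whose errors are uniform over the remaining m-1 classes and random tie-breaking:

```python
import numpy as np

def plurality_accuracy(N, m, p, trials=200_000, seed=0):
    """Monte Carlo estimate of the probability that plurality voting over
    N independent classifiers, each correct with probability p, picks the
    true class (class 0). Errors fall uniformly on the other m-1 classes."""
    rng = np.random.default_rng(seed)
    probs = np.full(m, (1 - p) / (m - 1))
    probs[0] = p                                  # class 0 is the true class
    votes = rng.multinomial(N, probs, size=trials)
    top = votes.max(axis=1, keepdims=True)
    winners = votes == top
    # credit 1/k when the true class ties with k-1 others (random tie-break)
    return float((winners[:, 0] / winners.sum(axis=1)).mean())

print(plurality_accuracy(N=9, m=4, p=0.6))  # well above the single-classifier p
```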

5.
Stacking is a general ensemble method in which a number of base classifiers are combined using one meta-classifier which learns from their outputs. Such an approach provides certain advantages: simplicity; performance that is similar to the best classifier; and the capability of combining classifiers induced by different inducers. The disadvantage of stacking is that, on multiclass problems, it seems to perform worse than other meta-learning approaches. In this paper we present Troika, a new stacking method for improving ensemble classifiers. The new scheme is built from three layers of combining classifiers. The method was tested on various datasets, and the results indicate the superiority of the proposed method over other legacy ensemble schemes, Stacking and StackingC, especially when the classification task involves more than two classes.
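For reference, plain single-meta-layer stacking looks like the following sklearn sketch; Troika's three-layer architecture is not available in sklearn, so this shows only the baseline scheme that Troika extends. Dataset and model choices are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import StackingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, n_classes=3,
                           n_informative=6, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Base classifiers induced by different inducers, combined by a meta-classifier
stack = StackingClassifier(
    estimators=[("rf", RandomForestClassifier(random_state=0)),
                ("nb", GaussianNB())],
    final_estimator=LogisticRegression(max_iter=1000),  # the meta-classifier
    stack_method="predict_proba",  # meta-learner sees class probabilities
)
print(stack.fit(Xtr, ytr).score(Xte, yte))
```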

6.
The problem of classifier combination is considered in the context of the two main fusion scenarios: fusion of opinions based on identical and on distinct representations. We develop a theoretical framework for classifier combination for these two scenarios. For multiple experts using distinct representations we argue that many existing schemes such as the product rule, sum rule, min rule, max rule, majority voting, and weighted combination, can be considered as special cases of compound classification. We then consider the effect of classifier combination in the case of multiple experts using a shared representation where the aim of fusion is to obtain a better estimate of the appropriate a posteriori class probabilities. We also show that the two theoretical frameworks can be used for devising fusion strategies when the individual experts use features some of which are shared and the remaining ones distinct. We show that in both cases (distinct and shared representations), the expert fusion involves the computation of a linear or nonlinear function of the a posteriori class probabilities estimated by the individual experts. Classifier combination can therefore be viewed as a multistage classification process whereby the a posteriori class probabilities generated by the individual classifiers are considered as features for a second stage classification scheme. Most importantly, when the linear or nonlinear combination functions are obtained by training, the distinctions between the two scenarios fade away, and one can view classifier fusion in a unified way.
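The named special cases are one-liners over a matrix of per-classifier posteriors. A sketch with illustrative numbers:

```python
import numpy as np

# posteriors[k, c]: classifier k's estimate of P(class c | x); numbers illustrative
posteriors = np.array([
    [0.6, 0.3, 0.1],
    [0.5, 0.4, 0.1],
    [0.2, 0.5, 0.3],
])

rules = {
    "sum":      lambda P: P.sum(axis=0),
    "product":  lambda P: P.prod(axis=0),
    "max":      lambda P: P.max(axis=0),
    "min":      lambda P: P.min(axis=0),
    # majority voting: count each classifier's top class, then take the plurality
    "majority": lambda P: np.bincount(P.argmax(axis=1), minlength=P.shape[1]),
}

for name, rule in rules.items():
    print(f"{name:8s} -> class {rule(posteriors).argmax()}")
```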

7.
Improved Generalization Through Explicit Optimization of Margins
Mason  Llew  Bartlett  Peter L.  Baxter  Jonathan 《Machine Learning》2000,38(3):243-255
Recent theoretical results have shown that the generalization performance of thresholded convex combinations of base classifiers is greatly improved if the underlying convex combination has large margins on the training data (i.e., correct examples are classified well away from the decision boundary). Neural network algorithms and AdaBoost have been shown to implicitly maximize margins, thus providing some theoretical justification for their remarkably good generalization performance. In this paper we are concerned with maximizing the margin explicitly. In particular, we prove a theorem bounding the generalization performance of convex combinations in terms of general cost functions of the margin, in contrast to previous results, which were stated in terms of the particular cost function sgn(θ – margin). We then present a new algorithm, DOOM, for directly optimizing a piecewise-linear family of cost functions satisfying the conditions of the theorem. Experiments on several of the datasets in the UC Irvine database are presented in which AdaBoost was used to generate a set of base classifiers and DOOM was then used to find the optimal convex combination of those classifiers. In all but one case the convex combination generated by DOOM had lower test error than AdaBoost's combination. In many cases DOOM achieves these lower test errors by sacrificing training error in the interest of reducing the new cost function. In our experiments the margin plots suggest that the size of the minimum margin is not the critical factor in determining generalization performance.
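A sketch of the central quantities, assuming ±1 labels and ±1 base classifier outputs: the margins of a convex combination, and a piecewise-linear margin cost with an illustrative threshold theta. DOOM's actual cost family and optimizer are more elaborate.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.choice([-1, 1], size=200)
# 5 weak base classifiers, each agreeing with the label ~70% of the time
H = np.where(rng.random((200, 5)) < 0.7, y[:, None], -y[:, None])
alpha = np.full(5, 1 / 5)            # a convex combination (weights sum to 1)
margins = y * (H @ alpha)            # in [-1, 1]; positive = correctly classified

def piecewise_linear_cost(m, theta=0.2):
    """A DOOM-style piecewise-linear margin cost (illustrative): equals 1 for
    m <= 0, decreases linearly on (0, theta), and is 0 for m >= theta."""
    return np.clip(1 - m / theta, 0.0, 1.0)

print("train error:", (margins <= 0).mean(),
      "margin cost:", piecewise_linear_cost(margins).mean())
```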

8.
Various fusion functions for classifier combination have been designed to optimize the results of ensembles of classifiers (EoC). We propose a pairwise fusion matrix (PFM) transformation, which produces reliable probabilities for use in classifier combination and can be amalgamated with most existing fusion functions for combining classifiers. The PFM requires only crisp class-label outputs from the classifiers and is suitable for problems with many classes or with few training samples. Experimental results suggest that the performance of a PFM can be a notch above that of the simple majority voting rule (MAJ), and that a PFM can work on problems where a behavior-knowledge space (BKS) might not be applicable.

9.

Repeat buyer prediction is crucial for e-commerce companies seeking to enhance their customer services and product sales. In particular, knowing which factors or rules drive repeat purchases is as significant in the business field as knowing the outcomes of the predictions themselves. An interpretable model with excellent prediction performance is therefore required. Many classifiers, such as the multilayer perceptron, have exceptional predictive ability but lack interpretability; tree-based models are interpretable, but their predictive performance usually cannot reach the same level. Based on these observations, we design an approach that balances the predictive and interpretable performance of a decision tree using model distillation and heterogeneous classifier fusion. Specifically, we first train multiple heterogeneous classifiers and integrate them through diverse combination operators; the classifier combination then plays the role of the teacher model. Soft targets obtained from the teacher subsequently guide the training of the decision tree. A real-world repeat buyer prediction dataset is utilized in this paper, with features covering three aspects: users, merchants, and user–merchant pairs. Our experimental results show that both the accuracy and the AUC of the decision tree are improved, and we provide model interpretations from these three aspects.
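One simple form of the teacher-student step might look like the following sketch: a soft-voting ensemble of heterogeneous classifiers as teacher, and a shallow tree trained on the teacher's labels with the teacher's confidence as sample weights. This is an assumption-laden stand-in on synthetic data, not the authors' pipeline.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import VotingClassifier, GradientBoostingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2000, random_state=0)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Teacher: heterogeneous classifiers fused by soft voting
teacher = VotingClassifier(
    [("mlp", MLPClassifier(max_iter=500, random_state=0)),
     ("gbm", GradientBoostingClassifier(random_state=0)),
     ("lr", LogisticRegression(max_iter=1000))],
    voting="soft").fit(Xtr, ytr)

# Student: a shallow, interpretable tree trained on the teacher's labels,
# with the teacher's confidence as sample weights (one simple distillation form)
soft = teacher.predict_proba(Xtr)
student = DecisionTreeClassifier(max_depth=5, random_state=0)
student.fit(Xtr, soft.argmax(1), sample_weight=soft.max(1))

plain = DecisionTreeClassifier(max_depth=5, random_state=0).fit(Xtr, ytr)
print("plain tree:", plain.score(Xte, yte),
      "distilled tree:", student.score(Xte, yte))
```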


10.
This paper investigates the effects of confidence transformation in combining multiple classifiers using various combination rules. The combination methods were tested in handwritten digit recognition by combining varying classifier sets. The classifier outputs are transformed into confidence measures by combining three scaling functions (global normalization, Gaussian density modeling, and logistic regression) with three confidence types (linear, sigmoid, and evidence). The combination rules include fixed rules (sum rule, product rule, median rule, etc.) and trained rules (linear discriminants and weighted combination with various parameter estimation techniques). The experimental results show that confidence transformation benefits the combination performance of both fixed and trained rules. Trained rules mostly outperform fixed rules, especially when the classifier set contains weak classifiers. Among the trained rules, the support vector machine with linear kernel (linear SVM) performs best, while weighted combination with optimized weights performs comparably well. I also attempted joint optimization of the confidence parameters and combination weights, but its performance was inferior to that of cascaded confidence transformation-combination. This suggests that the cascaded strategy is a sound approach to multiple classifier combination.
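As a small illustration of the scaling-function idea, a logistic transformation followed by the sum rule; here the parameters a and b are fixed for illustration, whereas the paper fits them (e.g., by logistic regression). All numbers are made up.

```python
import numpy as np

def logistic_confidence(scores, a=1.0, b=0.0):
    """Sigmoid confidence transformation of raw classifier scores."""
    return 1.0 / (1.0 + np.exp(-(a * scores + b)))

# scores[k, c]: raw (uncalibrated) output of classifier k for class c
scores = np.array([[2.0, -1.0, 0.5],
                   [0.3,  0.1, -0.2]])

conf = logistic_confidence(scores)
conf /= conf.sum(axis=1, keepdims=True)   # normalize per classifier
print("sum rule decision:", conf.sum(axis=0).argmax())
```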

11.
Classifier combination methods have proved to be an effective tool for increasing the performance of classification techniques in any pattern recognition application. Despite a significant number of publications describing successful classifier combination implementations, the theoretical basis is still not mature enough, and the achieved improvements are inconsistent. In this paper, we propose a novel statistical validation technique, the correlation-based classifier combination technique, for combining classifiers in any pattern recognition problem. Such validation has a significant influence on the performance of combinations, and its utilization is necessary for a complete theoretical understanding of combination algorithms. The analysis presented is statistical in nature but promises to lead to a class of algorithms for rank-based decision combination. The theoretical and practical issues of implementation are illustrated by applying the technique to two standard pattern recognition datasets, the handwritten digit recognition and letter image recognition datasets from the UCI Machine Learning Repository ( http://www.ics.uci.edu/_mlearn ). An empirical evaluation using 8 well-known distinct classifiers confirms the validity of our approach compared to other multiple-classifier combination algorithms. Finally, we also suggest a methodology for determining the best mix of individual classifiers.

12.
《Information Fusion》2002,3(4):245-258
In classifier combination, it is believed that diverse ensembles have a better potential for improvement in accuracy than non-diverse ensembles. We put this hypothesis to the test for two methods of building ensembles, Bagging and Boosting, with two linear classifier models: the nearest mean classifier and the pseudo-Fisher linear discriminant classifier. To estimate diversity, we apply nine measures proposed in the recent literature on combining classifiers. Eight combination methods were used: minimum, maximum, product, average, simple majority, weighted majority, naive Bayes, and decision templates. We carried out experiments on seven data sets for different sample sizes, different numbers of classifiers in the ensembles, and the two linear classifiers. Altogether, we created 1364 ensembles by the Bagging method and the same number by the Boosting method. On each of these, we calculated the nine measures of diversity and the accuracy of the eight combination methods, averaged over 50 runs. The results confirmed in a quantitative way the intuitive explanation for the success of Boosting for linear classifiers at increasing training sizes, and the poor performance of Bagging in this case. The diversity measures indicated that Boosting succeeds in inducing diversity even for stable classifiers, whereas Bagging does not.
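Two of the commonly used pairwise diversity measures, Yule's Q statistic and the disagreement measure, can be computed directly from correctness vectors; a sketch with synthetic data (the nine measures studied in the paper also include non-pairwise ones):

```python
import numpy as np

def q_statistic(correct_i, correct_j):
    """Yule's Q between two classifiers' correctness indicators: near +1 for
    classifiers that err together, lower (down to -1) for diverse ones."""
    a = np.sum(correct_i & correct_j)        # both correct
    b = np.sum(correct_i & ~correct_j)       # only i correct
    c = np.sum(~correct_i & correct_j)       # only j correct
    d = np.sum(~correct_i & ~correct_j)      # both wrong
    return (a * d - b * c) / (a * d + b * c)

def disagreement(correct_i, correct_j):
    """Fraction of samples on which exactly one of the two is correct."""
    return np.mean(correct_i != correct_j)

rng = np.random.default_rng(0)
ci = rng.random(500) < 0.8                   # illustrative correctness vectors
cj = rng.random(500) < 0.8
print(q_statistic(ci, cj), disagreement(ci, cj))
```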

13.
Cascade Generalization
Using multiple classifiers to increase learning accuracy is an active research area. In this paper we present two related methods for merging classifiers. The first method, Cascade Generalization, couples classifiers loosely; it belongs to the family of stacking algorithms. The basic idea of Cascade Generalization is to apply the set of classifiers sequentially, at each step extending the original data by the insertion of new attributes. The new attributes are derived from the class probability distribution given by a base classifier. This constructive step extends the representational language for the high-level classifiers, relaxing their bias. The second method exploits tight coupling of classifiers by applying Cascade Generalization locally: at each iteration of a divide-and-conquer algorithm, the instance space is reconstructed by the addition of new attributes, each representing the probability that an example belongs to a class as given by a base classifier. We have implemented three Local Generalization Algorithms: the first merges a linear discriminant with a decision tree, the second merges a naive Bayes with a decision tree, and the third merges a linear discriminant and a naive Bayes with a decision tree. All the algorithms show an increase in performance when compared with the corresponding single models. Cascade also outperforms other methods for combining classifiers, such as Stacked Generalization, and competes well against Boosting at statistically significant confidence levels.
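A minimal sketch of the constructive step, assuming naive Bayes as the base classifier and a decision tree as the high-level one. Out-of-fold predictions are used here to build the new attributes without label leakage; that is an implementation choice of this sketch, not a detail taken from the paper.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
Xtr, Xte, ytr, yte = train_test_split(X, y, random_state=0)

# Step 1: the base classifier produces class-probability attributes
base = GaussianNB()
new_attrs_tr = cross_val_predict(base, Xtr, ytr, method="predict_proba", cv=5)
base.fit(Xtr, ytr)
new_attrs_te = base.predict_proba(Xte)

# Step 2: the high-level classifier learns on the extended representation
tree = DecisionTreeClassifier(max_depth=4, random_state=0)
tree.fit(np.hstack([Xtr, new_attrs_tr]), ytr)
print(tree.score(np.hstack([Xte, new_attrs_te]), yte))
```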

14.
Due to the wide variety of fusion techniques available for combining multiple classifiers into a more accurate classifier, a number of good studies have been devoted to determining in which situations some fusion methods should be preferred over others. However, the sample-size behavior of the various fusion methods has hitherto received little attention in the literature on multiple classifier systems. The main contribution of this paper is thus to investigate the effect of training sample size on the relative performance of the fusion methods and to gain more insight into the conditions for the superiority of some combination rules. A large experiment is conducted to study the performance of some fixed and trainable combination rules for executing one- and two-level classifier fusion at different training sample sizes. The experimental results yield the following conclusions: when implementing one-level fusion to combine homogeneous or heterogeneous base classifiers, fixed rules outperform trainable ones in nearly all cases, the single exception being the merging of heterogeneous classifiers at large sample sizes. Moreover, the best classification for any considered sample size is generally achieved by a second level of combination (namely, utilizing one fusion rule to further combine a set of ensemble classifiers, each of which is constructed by fusing base classifiers). Under these circumstances, it seems appropriate to adopt different types of fusion rules (fixed or trainable) as the combiners for the two levels of fusion.

15.
We consider ensemble classification for the case where there is no common labeled training data for jointly designing the individual classifiers and the function that aggregates their decisions. This problem, which we call distributed ensemble classification, applies when individual classifiers operate (perhaps remotely) on different sensing modalities and when combining proprietary or legacy classifiers. The conventional wisdom in this case is to apply fixed rules of combination such as voting methods or rules for aggregating probabilities. Alternatively, we take a transductive approach, optimizing the combining rule for an objective function measured on the unlabeled batch of test data. We propose maximum likelihood (ML) objectives that are shown to yield well-known forms of probabilistic aggregation, albeit with iterative, expectation-maximization-based adjustment to account for mismatch between the class priors used by the individual classifiers and those reflected in the new data batch. These methods are extensions, for the ensemble case, of the work of Saerens, Latinne, and Decaestecker (2002). We also propose an information-theoretic method that generally outperforms the ML methods, better handles classifier redundancies, and addresses some scenarios where the ML methods are not applicable. This method also handles well the case of missing classes in the test batch. On UC Irvine benchmark data, all our methods improve classification accuracy over the use of fixed rules when there is prior mismatch.
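The prior-adjustment step inherited from Saerens, Latinne, and Decaestecker (2002) is compact enough to sketch: EM alternates between reweighting the classifier's posteriors by the ratio of estimated to training priors and re-estimating the test-batch priors. The numbers below are illustrative.

```python
import numpy as np

def adjust_priors(posteriors, train_priors, n_iter=100, tol=1e-6):
    """EM adjustment of classifier posteriors to an unlabeled test batch whose
    class priors differ from the training priors (after Saerens et al., 2002).
    posteriors[i, c] = P_train(c | x_i) output by the trained classifier."""
    new_priors = train_priors.copy()
    for _ in range(n_iter):
        # E-step: reweight posteriors by the ratio of current to training priors
        adjusted = posteriors * (new_priors / train_priors)
        adjusted /= adjusted.sum(axis=1, keepdims=True)
        # M-step: re-estimate the test-batch priors from the adjusted posteriors
        updated = adjusted.mean(axis=0)
        if np.abs(updated - new_priors).max() < tol:
            break
        new_priors = updated
    return adjusted, new_priors

# A classifier trained with equal priors scores a small test batch
post = np.array([[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4], [0.9, 0.1]])
adj, priors = adjust_priors(post, train_priors=np.array([0.5, 0.5]))
print(priors)   # EM shifts the priors toward class 0 for this batch
```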

16.
The problem of handwritten digit recognition has long been an open problem in the field of pattern classification and is of great importance in industry. The heart of the problem lies in the ability to design an efficient algorithm that can recognize digits written and submitted by users via tablets, scanners, and other digital devices. From an engineering point of view, it is desirable to achieve good performance within limited resources. To this end, we have developed a new approach for handwritten digit recognition that uses a small number of patterns in the training phase. To improve the overall classification performance, the literature suggests combining the decisions of multiple classifiers rather than using the output of the best classifier in the ensemble; accordingly, in this new approach, an ensemble of classifiers is used for the recognition of handwritten digits. The classifiers used in the proposed system are based on the singular value decomposition (SVD) algorithm. The experimental results and the literature show that the SVD algorithm is suitable for handling sparse matrices such as handwritten digit data. The decisions obtained by the SVD classifiers are combined by a novel combination rule which we named reliable multi-phase particle swarm optimization. We call the method “reliable” because we introduce a novel reliability parameter to tackle the problem of PSO becoming trapped in local minima. In comparison with previous methods, one significant advantage of the proposed method is that it is not sensitive to the size of the training set: it uses just 15% of the dataset for training, while other methods usually use 60–75% of the whole dataset. To evaluate the proposed method, we tested our algorithm on a Farsi/Arabic handwritten digit dataset. What makes the recognition of handwritten Farsi/Arabic digits more challenging is that some of the digits can legitimately be written in different shapes. Therefore, 6000 hard samples (600 samples per class) were chosen by the K-nearest neighbor algorithm from the HODA dataset, a standard Farsi/Arabic digit dataset. Experimental results show that the proposed method is fast, accurate, and robust against the local minima of PSO. Finally, the proposed method is compared with state-of-the-art methods and with ensemble classifiers based on MLP, RBF, and ANFIS with various combination rules.
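The SVD base classifier can be sketched as the classic subspace-residual scheme: build a truncated SVD basis per digit class and assign each test image to the class with the smallest reconstruction residual. The PSO-based combination rule is not reproduced here, and the dataset and basis size k are illustrative (the paper uses the HODA Farsi/Arabic dataset).

```python
import numpy as np
from sklearn.datasets import load_digits

X, y = load_digits(return_X_y=True)
Xtr, ytr, Xte, yte = X[:1400], y[:1400], X[1400:], y[1400:]

# One SVD basis per digit class: the columns of U span that class's "digit space"
k = 12
bases = {}
for c in range(10):
    A = Xtr[ytr == c].T                     # training samples as columns
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    bases[c] = U[:, :k]

def classify(x):
    """Assign the class whose subspace leaves the smallest residual."""
    residuals = [np.linalg.norm(x - U @ (U.T @ x)) for U in bases.values()]
    return int(np.argmin(residuals))

preds = np.array([classify(x) for x in Xte])
print("accuracy:", (preds == yte).mean())
```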

17.
Segmentation using an ensemble of classifiers (or committee machine) combines multiple classifiers' results to increase performance compared with single classifiers. In this paper, we propose new concepts for combining rules. They are based (1) on the uncertainties of the individual classifiers, (2) on combining the results of existing combining rules, (3) on combining local class probabilities with the existing segmentation probabilities at each individual segmentation, and (4) on using uncertainty-based weights for the weighted majority rule. The results show that the proposed local-statistics-aware combining rules can reduce the effect of noise in the individual segmentation results and consequently improve the performance of the final (combined) segmentation. Combining existing combining rules and using the proposed uncertainty-based weights can further improve performance.
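One plausible reading of concept (4), uncertainty-based weights for the weighted majority rule, is sketched below with entropy as the uncertainty measure. This is an assumption about the form of the rule, not the authors' exact formula; all data are synthetic.

```python
import numpy as np

def entropy(p, eps=1e-12):
    """Shannon entropy of each classifier's class distribution (per voxel)."""
    return -(p * np.log(p + eps)).sum(axis=-1)

def uncertainty_weighted_vote(probs):
    """Weighted majority where each segmenter's vote is down-weighted by its
    own uncertainty. probs[k, i, c] = classifier k's probability of class c
    at voxel i."""
    K, N, C = probs.shape
    w = 1.0 - entropy(probs) / np.log(C)       # in [0, 1]; confident -> heavier
    votes = np.zeros((N, C))
    for k in range(K):
        hard = probs[k].argmax(axis=1)
        # each classifier adds its per-voxel weight to the class it voted for
        votes[np.arange(N), hard] += w[k]
    return votes.argmax(axis=1)

# 4 segmenters, 10 voxels, 3 tissue classes (synthetic probabilities)
probs = np.random.default_rng(0).dirichlet(np.ones(3), size=(4, 10))
print(uncertainty_weighted_vote(probs))
```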

18.
Using neural network ensembles for bankruptcy prediction and credit scoring
Bankruptcy prediction and credit scoring have long been regarded as critical topics and have been studied extensively in the accounting and finance literature. Artificial intelligence and machine learning techniques have been used to solve these financial decision-making problems. The multilayer perceptron (MLP) network trained by the back-propagation learning algorithm is the most widely used technique for such problems, and it is usually superior to traditional statistical models. Recent studies suggest that combining multiple classifiers (classifier ensembles) should perform better than single classifiers. However, the performance of multiple classifiers in bankruptcy prediction and credit scoring is not fully understood. In this paper, we use neural networks on three datasets to compare a single baseline classifier with multiple classifiers and with diversified multiple classifiers. Measured by average prediction accuracy against the single-classifier benchmark, the multiple classifiers perform better on only one of the three datasets. The diversified multiple classifiers, trained with not only different classifier parameters but also different sets of training data, perform worse on all datasets. For the Type I and Type II errors, however, there is no clear winner. We suggest considering all three classifier architectures when making the optimal financial decision.

19.
Hidden Markov models (HMMs) have been shown to provide a high level of performance for detecting anomalies in sequences of system calls to the operating system kernel. Using Boolean conjunction and disjunction functions to combine the responses of multiple HMMs in the ROC space may significantly improve performance over a “single best” HMM. However, these techniques assume that the classifiers are conditionally independent and that their ROC curves are convex. These assumptions are violated in most real-world applications, especially when classifiers are designed using limited and imbalanced training data. In this paper, the iterative Boolean combination (IBC) technique is proposed for efficient fusion of the responses from multiple classifiers in the ROC space. It applies all Boolean functions to combine the ROC curves corresponding to multiple classifiers, requires no prior assumptions, and has time complexity linear in the number of classifiers. The results of computer simulations conducted on both synthetic and real-world host-based intrusion detection data indicate that the IBC of responses from multiple HMMs can achieve a significantly higher level of performance than the Boolean conjunction and disjunction combinations, especially when training data are limited and imbalanced. The proposed IBC is general in that it can be employed to combine the diverse responses of any crisp or soft one- or two-class classifiers, and in a wide range of application domains.
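The basic conjunction/disjunction step is easy to sketch: combine two detectors' crisp decisions at fixed operating points and read off the resulting ROC point (AND tends to lower the false-positive rate, OR to raise the true-positive rate). IBC additionally sweeps all Boolean functions across threshold pairs and iterates; the scores below are synthetic stand-ins for HMM outputs.

```python
import numpy as np

def roc_point(decisions, labels):
    """(false-positive rate, true-positive rate) of a binary decision vector."""
    pos, neg = labels == 1, labels == 0
    return decisions[neg].mean(), decisions[pos].mean()

rng = np.random.default_rng(0)
labels = (rng.random(5000) < 0.1).astype(int)     # imbalanced: ~10% anomalies
s1 = rng.normal(labels * 1.5, 1.0)                # detector 1 soft scores
s2 = rng.normal(labels * 1.2, 1.0)                # detector 2 soft scores

d1, d2 = s1 > 1.0, s2 > 0.8                       # two fixed operating points
for name, d in [("det1", d1), ("det2", d2),
                ("AND", d1 & d2), ("OR", d1 | d2)]:
    fpr, tpr = roc_point(d.astype(float), labels)
    print(f"{name:4s} fpr={fpr:.3f} tpr={tpr:.3f}")
```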

20.
This paper presents a new loss function for neural network classification, inspired by the recently proposed similarity measure called Correntropy. We show that this function essentially behaves like the conventional square loss for samples that are well within the decision boundary and have small errors, and like the L0 or counting norm for samples that are outliers or are difficult to classify. Depending on the value of the kernel size parameter, the proposed loss function moves smoothly from convex to non-convex and becomes a close approximation to the misclassification loss (the ideal 0–1 loss). We show that the discriminant function obtained by training a neural network with the proposed loss function, optimized in the neighborhood of the ideal 0–1 loss, is resistant to overfitting, more robust to outliers, and has consistently better generalization performance than other commonly used loss functions, even after prolonged training. The results also show that it is a close competitor to the SVM. Since the proposed method is compatible with simple gradient-based online learning, it is a practical way of improving the performance of neural network classifiers.
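The described behavior follows from the correntropy-induced loss on the error e = y − f(x); a sketch, with the normalization constant chosen so the loss equals 1 at |e| = 1 (one common convention, assumed here):

```python
import numpy as np

def c_loss(e, sigma=1.0):
    """Correntropy-induced loss: approximately quadratic for small errors and
    saturating (counting-norm-like) for large ones. Smaller sigma moves the
    loss closer to the 0-1 misclassification loss."""
    beta = 1.0 / (1.0 - np.exp(-1.0 / (2.0 * sigma ** 2)))
    return beta * (1.0 - np.exp(-e ** 2 / (2.0 * sigma ** 2)))

# Square loss grows without bound on outliers; the c-loss saturates
for e in (0.1, 0.5, 1.0, 3.0, 10.0):
    print(f"e={e:5.1f}  square={e**2:7.2f}  c-loss(sigma=1)={c_loss(e):.3f}")
```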
