Similar Documents
1.
The naive Bayes classifier is a widely used, simple, and effective classification algorithm, but its "naive" conditional independence assumption differs from reality, and this limits the classifier's accuracy. To weaken the assumption, an improved bat algorithm is used to optimize the naive Bayes classifier. The improved bat algorithm introduces a tabu-search mechanism and a random perturbation operator, which keep it out of local optima and accelerate convergence. It automatically searches for a weight for each attribute; assigning different weights to different attributes weakens the class-conditional independence assumption and improves the accuracy of naive Bayes without a large increase in computational cost. Experimental results show that the algorithm classifies more accurately than both traditional naive Bayes and the weighted Bayesian classification algorithm of reference [6].
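Concretely, the weighted model replaces the usual product of likelihoods with a weighted one, scoring each class as log P(c) + sum_i w_i log P(x_i | c). A minimal Python sketch of this scoring rule, assuming Gaussian per-attribute likelihoods and a given weight vector w (searching for w is the bat algorithm's job; all names and numbers here are illustrative):

    # Attribute-weighted naive Bayes scoring: the weight vector w is what a
    # metaheuristic such as the improved bat algorithm would tune.
    import numpy as np

    def weighted_nb_log_scores(x, priors, means, stds, w):
        """log P(c) + sum_i w_i * log P(x_i | c) for every class c."""
        # means/stds: (n_classes, n_features) per-attribute Gaussian estimates
        log_lik = (-0.5 * ((x - means) / stds) ** 2
                   - np.log(stds * np.sqrt(2 * np.pi)))
        return np.log(priors) + (w * log_lik).sum(axis=1)

    # Toy usage with hypothetical two-class, three-attribute estimates.
    priors = np.array([0.6, 0.4])
    means = np.array([[0.0, 1.0, 2.0], [1.0, 0.0, 1.0]])
    stds = np.ones((2, 3))
    w = np.array([1.0, 0.3, 0.8])   # one candidate weight vector from the search
    x = np.array([0.2, 0.5, 1.5])
    print(weighted_nb_log_scores(x, priors, means, stds, w).argmax())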

2.
Application of a Bayesian classifier based on a complete undirected graph to intrusion detection (cited 2 times: 0 self, 2 by others)
Because of its strong independence assumption, the naive Bayes classifier does not consider the relationships among attributes, yet intrusion-detection datasets do not satisfy this assumption well. To address this, a Bayesian classifier based on a complete directed graph is proposed: the relationships among attributes are built into the classifier's construction, relaxing the strong independence assumption of naive Bayes, and the classifier is applied to intrusion detection. Experiments on the MIT intrusion-detection dataset show that the algorithm improves detection accuracy and performs well.

3.
郑芸芸, 王萍, 游强华. 《福建电脑》, 2013, (11): 106-107, 124
The naive Bayes classifier is built on the conditional independence assumption, but in practical use this assumption usually does not hold. To address this problem, an improved naive Bayes classifier is constructed with the help of k-means clustering: attributes with large pairwise correlation coefficients are merged by the k-means algorithm into a single composite attribute, so that the attributes entering the subsequent Bayesian classification are as close to independent as naive Bayes requires. Experiments show that this method improves the naive Bayes classifier and broadens its range of application.
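A hedged sketch of one way to realize this, assuming k-means clusters attributes by their correlation profiles and that each cluster is then averaged into one composite attribute (the averaging is an illustrative stand-in for the paper's composite-attribute construction):

    # Cluster correlated attributes with k-means, merge each cluster into a
    # composite attribute, then classify with naive Bayes.
    import numpy as np
    from sklearn.cluster import KMeans
    from sklearn.naive_bayes import GaussianNB

    def merge_correlated(X, k):
        corr = np.abs(np.corrcoef(X, rowvar=False))   # (d, d) attribute similarity
        labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(corr)
        # One composite attribute per cluster: the mean of its members.
        return np.column_stack([X[:, labels == c].mean(axis=1) for c in range(k)])

    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 6))
    X[:, 1] = X[:, 0] + 0.1 * rng.normal(size=200)    # a strongly correlated pair
    y = (X[:, 0] + X[:, 3] > 0).astype(int)
    Xm = merge_correlated(X, k=4)
    print(GaussianNB().fit(Xm, y).score(Xm, y))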

4.
The naive Bayes classifier (NBC) is a concise and effective classification model. This paper introduces the basic principle of the NBC model and analyzes its independence assumption in detail. Building on a survey of existing research on the independence assumption, it concludes from examples and experiments that the performance of the NBC model has no necessary connection with whether the independence assumption is satisfied.
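To make the basic principle concrete, here is a minimal from-scratch naive Bayes over discrete attributes, implementing the decision rule argmax_c P(c) * prod_i P(x_i | c) with simple Laplace-style smoothing (the data and names are illustrative):

    from collections import Counter, defaultdict

    def train_nb(rows, labels):
        priors = Counter(labels)
        cond = defaultdict(Counter)        # (attribute index, class) -> value counts
        for row, c in zip(rows, labels):
            for i, v in enumerate(row):
                cond[(i, c)][v] += 1
        return priors, cond, len(labels)

    def predict_nb(x, priors, cond, n):
        best = None
        for c, nc in priors.items():
            p = nc / n                     # class prior P(c)
            for i, v in enumerate(x):      # conditional P(x_i | c), smoothed
                p *= (cond[(i, c)][v] + 1) / (nc + len(cond[(i, c)]) + 1)
            best = max(best, (p, c)) if best else (p, c)
        return best[1]

    rows = [("sunny", "hot"), ("rain", "cool"), ("sunny", "cool"), ("rain", "hot")]
    labels = ["no", "yes", "yes", "no"]
    print(predict_nb(("sunny", "cool"), *train_nb(rows, labels)))   # -> "yes"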

5.
Factor-analysis-based NBC and its application to slope stability recognition (cited once: 0 self, 1 by others)
高岩. 《计算机工程与设计》, 2011, 32(11): 3828-3831
When the conditional independence assumption is satisfied, the naive Bayes classifier theoretically achieves higher classification accuracy than other methods, but the assumption does not hold in many practical situations. To address this, a factor-analysis-based naive Bayes model, FA-NBC, is proposed and applied to slope stability recognition. To preserve the structural simplicity of the naive Bayes classifier, FA-NBC builds a new attribute set guided by variance contribution; the new attributes retain most of the information of the original attributes and satisfy the conditional independence assumption. Experimental results on UCI datasets demonstrate the effectiveness of the FA-NBC model.
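A hedged scikit-learn approximation of the FA-NBC idea: extract factor scores (which the factor model treats as uncorrelated) and feed them to naive Bayes. The paper selects factors by variance contribution; the fixed n_components and the iris data here are illustrative assumptions:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import FactorAnalysis
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline

    X, y = load_iris(return_X_y=True)
    fa_nbc = make_pipeline(FactorAnalysis(n_components=2, random_state=0),
                           GaussianNB())
    print(cross_val_score(fa_nbc, X, y, cv=5).mean())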

6.
Using copula theory, a copula-based Bayesian classification algorithm is proposed. It overcomes the shortcoming that the ordinary naive Bayes classifier requires the attribute-independence assumption, thereby extending the naive Bayes classifier. Experimental results show that the copula-based Bayesian algorithm achieves good classification performance.
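The appeal of a copula here is that it factors a joint class-conditional density into the naive product of marginals times a correction term for dependence. A minimal sketch with a Gaussian copula (the Gaussian marginals and the correlation matrix R are illustrative assumptions; R = I recovers plain naive Bayes):

    import numpy as np
    from scipy.stats import multivariate_normal, norm

    def copula_nb_logdensity(x, mean, std, R):
        """log f(x|c) = log c_R(u) + sum_i log f_i(x_i|c), with u_i = F_i(x_i|c)."""
        u = norm.cdf(x, mean, std).clip(1e-9, 1 - 1e-9)
        z = norm.ppf(u)
        log_copula = multivariate_normal.logpdf(z, cov=R) - norm.logpdf(z).sum()
        return log_copula + norm.logpdf(x, mean, std).sum()

    # Two correlated attributes; the copula term restores the dependence that
    # the plain naive product of marginals would ignore.
    R = np.array([[1.0, 0.7], [0.7, 1.0]])
    print(copula_nb_logdensity(np.array([0.5, 0.4]), 0.0, 1.0, R))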

7.
The naive Bayes classifier is simple and efficient, but its attribute-independence assumption limits its application to real data. This paper proposes a new algorithm: to avoid the direct impact that attribute reduction during data preprocessing has on classification, it generates a number of attribute subsets from the training set by random attribute selection, builds a naive Bayes classifier on each subset, and selects among them with a simulated-annealing genetic algorithm. Experiments show that the method performs better than traditional naive Bayes.
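A minimal sketch of the random-subspace part, with simple held-out selection standing in for the paper's simulated-annealing genetic algorithm (the dataset, subset size, and candidate count are illustrative):

    import numpy as np
    from sklearn.datasets import load_breast_cancer
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB

    X, y = load_breast_cancer(return_X_y=True)
    Xtr, Xva, ytr, yva = train_test_split(X, y, random_state=0)
    rng = np.random.default_rng(0)

    best_acc, best_subset = 0.0, None
    for _ in range(30):                    # candidate attribute subsets
        subset = rng.choice(X.shape[1], size=10, replace=False)
        acc = GaussianNB().fit(Xtr[:, subset], ytr).score(Xva[:, subset], yva)
        if acc > best_acc:                 # keep the best subset seen so far
            best_acc, best_subset = acc, subset
    print(best_acc, sorted(best_subset))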

8.
Different intrusion-detection systems use different data attributes. Because of its strong independence assumption, the naive Bayes (NB) classifier does not consider the relationships among attributes, and intrusion-detection datasets do not satisfy this assumption well. This paper introduces the hidden naive Bayes classifier and applies it to intrusion detection. The model creates a hidden parent for each attribute, through which the classifier's other attributes can exert their influence. Experiments show that the algorithm improves on the naive Bayes model, raises the overall performance of the intrusion-detection system, and gives better results.
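In the standard hidden naive Bayes formulation (assuming the paper follows it), the hidden parent hp_i of attribute x_i blends the influence of all the other attributes:

    P(x_i \mid hp_i, c) = \sum_{j \neq i} W_{ij}\, P(x_i \mid x_j, c),
    \qquad W_{ij} \propto I(X_i; X_j \mid C)

so each attribute is conditioned on a weighted mixture of the rest instead of being treated as independent of them.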

9.
By analyzing the naive Bayes classifier (NBC) and the traditional tree-augmented naive Bayes (TAN) classifier, this paper improves TAN and proposes the CTAN classifier. Naive Bayes assumes the non-class attributes are completely independent, while traditional TAN weakens the independence of all attributes; CTAN instead weakens it selectively, for only some of the correlated attributes, by making partial use of the attribute-relationship tree. Experiments show that CTAN outperforms the traditional TAN classifier.

10.
秦锋, 任诗流, 程泽凯, 罗慧. 《计算机工程与设计》, 2007, 28(20): 4873-4874, 4877
The naive Bayes classifier is simple and efficient, but it requires the attribute-independence assumption and therefore cannot represent the dependencies among real-world attributes, which hurts its classification performance. Independent component analysis (ICA) is used to improve naive Bayes: samples are projected into the feature space determined by the independent components, raising the classifier's performance. Experimental results show that this ICA-based naive Bayes classifier performs well.
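A minimal sketch of the pipeline, assuming scikit-learn's FastICA for the projection (the component count and dataset are illustrative):

    from sklearn.datasets import load_wine
    from sklearn.decomposition import FastICA
    from sklearn.model_selection import cross_val_score
    from sklearn.naive_bayes import GaussianNB
    from sklearn.pipeline import make_pipeline

    X, y = load_wine(return_X_y=True)
    ica_nb = make_pipeline(FastICA(n_components=8, random_state=0, max_iter=1000),
                           GaussianNB())
    print(cross_val_score(ica_nb, X, y, cv=5).mean())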

11.
On the selection and classification of independent features (cited 2 times: 0 self, 2 by others)
This paper is focused on the problems of feature selection and classification when classes are modeled by statistically independent features. We show that, under the assumption of class-conditional independence, the class separability measure of divergence is greatly simplified, becoming a sum of unidimensional divergences, providing a feature selection criterion where no exhaustive search is required. Since the hypothesis of independence is infrequently met in practice, we also provide a framework making use of class-conditional Independent Component Analyzers where this assumption can be held on stronger grounds. Divergence and the Bayes decision scheme are adapted to this class-conditional representation. An algorithm that integrates the proposed representation, feature selection technique, and classifier is presented. Experiments on artificial, benchmark, and real-world data illustrate our technique and evaluate its performance.
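A minimal sketch of the decomposition the paper exploits: under class-conditional independence the divergence between two classes is a sum of per-feature divergences, so features can be ranked one at a time instead of searched exhaustively (the Gaussian marginals are an assumption of this sketch):

    import numpy as np

    def feature_divergences(X, y):
        """Per-feature symmetric KL divergence between two class-conditional Gaussians."""
        X0, X1 = X[y == 0], X[y == 1]
        m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
        v0, v1 = X0.var(axis=0) + 1e-9, X1.var(axis=0) + 1e-9
        return (0.5 * (v0 / v1 + v1 / v0) - 1
                + 0.5 * (m0 - m1) ** 2 * (1 / v0 + 1 / v1))

    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, size=300)
    X = rng.normal(size=(300, 5))
    X[:, 2] += 2.0 * y                                  # only feature 2 separates the classes
    print(feature_divergences(X, y).argsort()[::-1])    # feature 2 ranked first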

12.
On the Optimality of the Simple Bayesian Classifier under Zero-One Loss (cited 67 times: 0 self, 67 by others)
Domingos, Pedro; Pazzani, Michael. Machine Learning, 1997, 29(2-3): 103-130
The simple Bayesian classifier is known to be optimal when attributes are independent given the class, but the question of whether other sufficient conditions for its optimality exist has so far not been explored. Empirical results showing that it performs surprisingly well in many domains containing clear attribute dependences suggest that the answer to this question may be positive. This article shows that, although the Bayesian classifier's probability estimates are only optimal under quadratic loss if the independence assumption holds, the classifier itself can be optimal under zero-one loss (misclassification rate) even when this assumption is violated by a wide margin. The region of quadratic-loss optimality of the Bayesian classifier is in fact a second-order infinitesimal fraction of the region of zero-one optimality. This implies that the Bayesian classifier has a much greater range of applicability than previously thought. For example, in this article it is shown to be optimal for learning conjunctions and disjunctions, even though they violate the independence assumption. Further, studies in artificial domains show that it will often outperform more powerful classifiers for common training set sizes and numbers of attributes, even if its bias is a priori much less appropriate to the domain. This article's results also imply that detecting attribute dependence is not necessarily the best way to extend the Bayesian classifier, and this is also verified empirically.
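The conjunction result is easy to check empirically; a small sketch (smoothing is kept near zero so it does not mask the effect; this reproduces the idea, not the article's experiments):

    import itertools
    import numpy as np
    from sklearn.naive_bayes import BernoulliNB

    X = np.array(list(itertools.product([0, 1], repeat=3)))   # all Boolean triples
    y = X.all(axis=1).astype(int)                             # y = x1 AND x2 AND x3
    clf = BernoulliNB(alpha=1e-10).fit(X, y)                  # features dependent given y
    print((clf.predict(X) == y).all())                        # True: zero-one optimal here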

13.
Naive Bayes classification is built on Bayesian theory and is extremely widely used; it approximates Bayesian inference with the class-conditional independence assumption. Bayesian networks remedy the shortcomings of the independence assumption with a graphical model, but they also reveal that classification with them can lead to NP-hard problems. This paper adopts a compromise: combining association rules with ABN classification techniques to construct a Bayesian classifier, which remedies the independence assumption while avoiding the NP-hard problem. Finally, experimental results show that it far outperforms the naive Bayes classifier in several domains.

14.
Lazy Learning of Bayesian Rules (cited 19 times: 0 self, 19 by others)
The naive Bayesian classifier provides a simple and effective approach to classifier learning, but its attribute independence assumption is often violated in the real world. A number of approaches have sought to alleviate this problem. A Bayesian tree learning algorithm builds a decision tree, and generates a local naive Bayesian classifier at each leaf. The tests leading to a leaf can alleviate attribute inter-dependencies for the local naive Bayesian classifier. However, Bayesian tree learning still suffers from the small disjunct problem of tree learning. While inferred Bayesian trees demonstrate low average prediction error rates, there is reason to believe that error rates will be higher for those leaves with few training examples. This paper proposes the application of lazy learning techniques to Bayesian tree induction and presents the resulting lazy Bayesian rule learning algorithm, called LBR. This algorithm can be justified by a variant of Bayes theorem which supports a weaker conditional attribute independence assumption than is required by naive Bayes. For each test example, it builds a most appropriate rule with a local naive Bayesian classifier as its consequent. It is demonstrated that the computational requirements of LBR are reasonable in a wide cross-section of natural domains. Experiments with these domains show that, on average, this new algorithm obtains lower error rates significantly more often than the reverse in comparison to a naive Bayesian classifier, C4.5, a Bayesian tree learning algorithm, a constructive Bayesian classifier that eliminates attributes and constructs new attributes using Cartesian products of existing nominal attributes, and a lazy decision tree learning algorithm. It also outperforms, although the result is not statistically significant, a selective naive Bayesian classifier.

15.
Although naive Bayes is simple and works well on many datasets, its attribute-independence assumption does not always hold in the real world, and when it fails the results can be poor. Through analysis and study, this paper proposes a new algorithm that relaxes the independence assumption: the lazy-learning two-layer naive Bayes classifier L^2DLNB. Using a lazy-learning method based on conditional mutual information, the algorithm applies different attribute-dependence relations when computing the likelihood of different class labels, so the likelihood of each label can be computed more accurately. Experimental results show that the algorithm achieves better classification accuracy on some datasets.

16.
A learning method for the Augmented Bayes classifier (cited once: 0 self, 1 by others)
The naive Bayes classifier, being computationally simple and fairly accurate, has been widely applied, but its assumption that all attributes are mutually independent is very easily violated in practice, which blocks further improvement of the classifier's accuracy. Bayesian networks account well for the dependencies among attributes, but their computation is quite complex. The Augmented Bayes classifier combines the advantages of both: it considers the dependencies among attributes while keeping the algorithm simple. Starting from the amount of information each attribute carries, this paper proposes an entropy-based learning method for the Augmented Bayes classifier. Finally, the method is compared on test data with the naive Bayes classifier and the SuperParent algorithm.
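A natural entropy-based quantity for ranking augmenting arcs is the conditional mutual information I(X_i; X_j | C) between attribute pairs given the class; the paper's exact criterion may differ, so this sketch is an assumption:

    import numpy as np
    from collections import Counter

    def cond_mutual_info(xi, xj, c):
        """I(Xi; Xj | C) from empirical counts."""
        n = len(c)
        nijc = Counter(zip(xi, xj, c))
        nic, njc, nc = Counter(zip(xi, c)), Counter(zip(xj, c)), Counter(c)
        return sum(k / n * np.log(k * nc[cc] / (nic[(a, cc)] * njc[(b, cc)]))
                   for (a, b, cc), k in nijc.items())

    rng = np.random.default_rng(0)
    c = rng.integers(0, 2, 500)
    xi = rng.integers(0, 2, 500)
    xj = xi ^ rng.integers(0, 2, 500) * (c == 0)   # xj copies xi only when c == 1
    print(cond_mutual_info(xi, xj, c))             # clearly positive dependence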

17.
In this paper, we describe three Bayesian classifiers for mineral potential mapping: (a) a naive Bayesian classifier that assumes complete conditional independence of input predictor patterns, (b) an augmented naive Bayesian classifier that recognizes and accounts for conditional dependencies amongst input predictor patterns and (c) a selective naive classifier that uses only conditionally independent predictor patterns. We also describe methods for training the classifiers, which involves determining dependencies amongst predictor patterns and estimating conditional probability of each predictor pattern given the target deposit-type. The output of a trained classifier determines the extent to which an input feature vector belongs to either the mineralized class or the barren class and can be mapped to generate a favorability map. The procedures are demonstrated by an application to base metal potential mapping in the Proterozoic Aravalli Province (western India). The results indicate that although the naive Bayesian classifier performs well and shows significant tolerance for the violation of the conditional independence assumption, the augmented naive Bayesian classifier performs better and exhibits finer generalization capability. The results also indicate that the rejection of conditionally dependent predictor patterns degrades the performance of a naive classifier.

18.
Accurate classification of microarray data plays a vital role in cancer prediction and diagnosis. Previous studies have demonstrated the usefulness of naïve Bayes classifier in solving various classification problems. In microarray data analysis, however, the conditional independence assumption embedded in the classifier itself and the characteristics of microarray data, e.g. the extremely high dimensionality, may severely affect the classification performance of naïve Bayes classifier. This paper presents a sequential feature extraction approach for naïve Bayes classification of microarray data. The proposed approach consists of feature selection by stepwise regression and feature transformation by class-conditional independent component analysis. Experimental results on five microarray datasets demonstrate the effectiveness of the proposed approach in improving the performance of naïve Bayes classifier in microarray data analysis.

19.
王峻, 周孟然. 《微机发展》, 2007, 17(7): 35-37
The naive Bayes classifier is simple and efficient, but its conditional independence assumption prevents it from representing dependencies among attributes. The TAN classifier extends the structure of naive Bayes by adding augmenting arcs under certain structural constraints. In a TAN classifier the class variable is the parent of every attribute variable, but the presence of some attributes lowers its classification accuracy. This paper proposes a selective augmented naive Bayes classifier (SANC) based on the MDL metric: guided by MDL, it removes the attribute variables and augmenting arcs that harm classification performance. Experimental results show that SANC achieves higher classification accuracy than NBC and TANC.

20.
We propose a probabilistic framework for classifier combination, which gives rigorous optimality conditions (minimum classification error) for four combination methods: majority vote, weighted majority vote, recall combiner and the naive Bayes combiner. The framework is based on two assumptions: class-conditional independence of the classifier outputs and an assumption about the individual accuracies. The four combiners are derived subsequently from one another, by progressively relaxing and then eliminating the second assumption. In parallel, the number of the trainable parameters increases from one combiner to the next. Simulation studies reveal that if the parameter estimates are accurate and the first assumption is satisfied, the order of preference of the combiners is: naive Bayes, recall, weighted majority and majority. By inducing label noise, we expose a caveat coming from the stability-plasticity dilemma. Experimental results with 73 benchmark data sets reveal that there is no definitive best combiner among the four candidates, giving a slight preference to naive Bayes. This combiner was better for problems with a large number of fairly balanced classes while weighted majority vote was better for problems with a small number of unbalanced classes.
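A minimal sketch of two of the four combiners for label outputs from L base classifiers: plain majority vote, and the naive Bayes combiner built from per-classifier confusion matrices on validation data (shapes and toy data are illustrative):

    import numpy as np

    def majority_vote(preds, n_classes):
        """preds: (L, n_samples) array of predicted labels."""
        votes = np.apply_along_axis(np.bincount, 0, preds, minlength=n_classes)
        return votes.argmax(axis=0)

    def nb_combiner_fit(preds_val, y_val, n_classes):
        L = preds_val.shape[0]
        cms = np.ones((L, n_classes, n_classes))     # Laplace-smoothed counts
        for l in range(L):
            np.add.at(cms[l], (y_val, preds_val[l]), 1)
        cms /= cms.sum(axis=2, keepdims=True)        # per-classifier P(pred | true)
        prior = np.bincount(y_val, minlength=n_classes) / len(y_val)
        return prior, cms

    def nb_combiner_predict(preds, prior, cms):
        logp = np.log(prior)[:, None] + sum(np.log(cms[l][:, preds[l]])
                                            for l in range(len(cms)))
        return logp.argmax(axis=0)

    # Three noisy voters on a two-class problem.
    rng = np.random.default_rng(0)
    y_val = rng.integers(0, 2, 200)
    preds_val = np.array([np.where(rng.random(200) < 0.8, y_val, 1 - y_val)
                          for _ in range(3)])        # each about 80% accurate
    prior, cms = nb_combiner_fit(preds_val, y_val, n_classes=2)
    preds = np.array([[0, 1, 1], [0, 1, 0], [1, 1, 0]])
    print(majority_vote(preds, n_classes=2))         # [0 1 0]
    print(nb_combiner_predict(preds, prior, cms))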
