Similar literature
 Found 17 similar documents (search time: 562 ms)
1.
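Construction and study of a multivariate decision tree   Total citations: 3, self-citations: 0, citations by others: 3
Univariate decision tree algorithms produce trees that are large, with complex rules that are hard to understand, whereas multivariate decision trees are an effective data mining method for classification. The key to constructing one is to select, based on the correlation among attributes, a suitable combination of attributes to form a new attribute that serves as a node. Combining the knowledge dependency measure from rough set theory with the concept of the dispersion of the condition attribute set in an information system, a construction algorithm for multivariate decision trees (RD) is proposed. Experimental results on several UCI data sets show that the proposed algorithm improves classification performance to some extent compared with the traditional ID3 algorithm and with a multivariate decision tree built by the core-based method.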
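The knowledge dependency measure mentioned above is commonly defined in rough set theory as gamma_C(D) = |POS_C(D)| / |U|, the fraction of objects whose condition-attribute values determine the decision unambiguously. The sketch below computes only that standard quantity; it is not the RD algorithm, whose details the abstract does not give, and the function name and toy table are hypothetical.

```python
from collections import defaultdict

def dependency_degree(rows, cond_attrs, decision_attr):
    """Return gamma_C(D) = |POS_C(D)| / |U| for a decision table.

    rows: list of dicts (the universe U); cond_attrs: names of the
    condition attributes C; decision_attr: name of the decision attribute D.
    """
    # Group objects into equivalence classes of the indiscernibility relation on C.
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in cond_attrs)
        classes[key].append(row[decision_attr])
    # A class lies in the positive region if all of its objects share one decision.
    positive = sum(len(ds) for ds in classes.values() if len(set(ds)) == 1)
    return positive / len(rows)

# Hypothetical toy decision table (not taken from the paper).
table = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
]
```

On this toy table, attributes `a` and `b` together determine the decision for two of the four objects, while `a` alone determines none, which is the kind of comparison an attribute-combination heuristic could build on.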

2.
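Existing multivariate decision trees outperform univariate decision trees in both classification accuracy and tree-structure complexity, but they take longer to train, which makes them unsuitable for classification tasks that require a fast response. To address this long training time, a multivariate decision tree based on information entropy and geometric outline similarity (IEMDT) is proposed. The algorithm uses the one-to-one mapping property of the geometric outline similarity function to project sample points in n-dimensional space onto a one-dimensional number axis, forming an ordered set of projected points. It then computes an optimal set of split points from the class boundaries and the information gain, partitions the ordered projected points into several subsets, and continues to project and split each subset until the decision tree is complete. Experimental results on 8 data sets show that IEMDT has a low training time and high classification accuracy.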
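The project-then-split step can be sketched as follows. This is a simplified illustration under stated assumptions: the projection direction is passed in by the caller (the geometric outline similarity function of the paper is not reproduced), and only a single binary cut is chosen rather than a set of split points. All function names are hypothetical.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def project(points, direction):
    """Project n-dimensional points onto one axis via a dot product (assumed projection)."""
    return [sum(x * w for x, w in zip(p, direction)) for p in points]

def best_split(points, labels, direction):
    """Return (threshold, information_gain) of the best binary cut on the projected values."""
    pairs = sorted(zip(project(points, direction), labels), key=lambda p: p[0])
    values = [v for v, _ in pairs]
    ys = [y for _, y in pairs]
    base = entropy(ys)
    best = (None, 0.0)
    for i in range(1, len(ys)):
        # Only class boundaries between distinct projected values are candidate cuts.
        if ys[i] == ys[i - 1] or values[i] == values[i - 1]:
            continue
        left, right = ys[:i], ys[i:]
        gain = base - (len(left) * entropy(left) + len(right) * entropy(right)) / len(ys)
        if gain > best[1]:
            best = ((values[i - 1] + values[i]) / 2, gain)
    return best

# Hypothetical toy data: two well-separated classes in the plane.
points = [(0, 0), (1, 0), (0, 1), (3, 3), (4, 3), (3, 4)]
labels = ["a", "a", "a", "b", "b", "b"]
threshold, gain = best_split(points, labels, direction=(1, 1))
```

Applying `best_split` recursively to each resulting subset would grow a tree; how IEMDT chooses its projection and its full split-point set is left to the paper.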

3.
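Research on a multivariate decision tree method   Total citations: 1, self-citations: 1, citations by others: 0
Univariate decision tree algorithms produce trees that are large, with complex rules that are hard to understand. Combining the relative core and the weighted roughness from rough set theory, this paper proposes a new multivariate decision tree algorithm. An example shows that the decision tree produced by this method is simpler than the one constructed by the traditional ID3 algorithm and classifies well.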

4.
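The discernibility matrix of a decision table in rough set theory is used to select the root attributes of a multivariate decision tree, and results from the study of information-entropy-based attribute reduction are applied to testing and selecting node attributes, thereby building the multivariate decision tree. An example verifies that, compared with a univariate decision tree diagnosis model, the multivariate decision tree diagnosis model reduces the redundancy of fault information, diagnoses efficiently, and yields results that are easy to understand.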

5.
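Univariate decision trees have difficulty reflecting the interactions among the attributes of an information system, and the resulting trees are often large. Multivariate decision trees reflect the relationships among attributes well and yield very simple trees, but those trees are hard to understand. In view of the characteristics of these two kinds of decision tree, a method for constructing a mixed-variable decision tree based on knowledge roughness is proposed, which selects the classification attributes with lower knowledge roughness to build the tree. Experimental results show that this is a simple and highly efficient way to generate decision trees.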

6.
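Multivariate decision tree for big data classification of distributed data streams   Total citations: 1, self-citations: 0, citations by others: 1
Zhang Yu  Bao Yanke  Shao Liangshan  Liu Wei  Acta Automatica Sinica (《自动化学报》) 2018, 44(6): 1115-1127
Class boundaries in distributed data stream big data are irregular and change easily, so an ensemble classifier based on univariate decision trees needs a large number of base classifiers to approximate the class boundaries accurately, which degrades the learning and classification performance of the ensemble. This paper therefore proposes a multivariate decision tree based on geometric outline similarity. Guided by an optimal reference vector, the algorithm projects sample points in n-dimensional space onto one-dimensional space to build an ordered set of projected points, partitions that set into several subsets at the projected class boundaries, and then recursively projects and splits the intersections of the different class sets until the decision tree is complete. Experiments show that the proposed multivariate decision tree, GODT, has high classification accuracy and a low training time, effectively combining the learning efficiency of univariate decision trees with the representational power of multivariate decision trees.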

7.
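Chen Jiajun  Su Shoubao  Xu Huali  Journal of Computer Applications (《计算机应用》) 2011, 31(12): 3243-3246
To address limitations of classical decision tree algorithms, such as complex tree structure and poor adaptability to noisy data, a new decision tree construction algorithm based on a multi-scale rough set model is proposed. The algorithm introduces the concepts of a scale variable and a scale function, selects test attributes according to the approximate classification accuracy at different scales, and prunes the tree with an inhibitory factor, which effectively removes noisy rules. The results show that the constructed decision trees are simple and effective, resist interference from noisy data to some degree, and can meet different users' requirements for decision accuracy.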

8.
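Univariate decision trees have difficulty reflecting the interactions among the attributes of an information system, and the resulting trees are often large. Multivariate decision trees reflect the relationships among attributes well and yield very simple trees, but those trees are hard to understand. In view of the characteristics of these two kinds of decision tree, a method for constructing a mixed-variable decision tree based on knowledge roughness is proposed, which selects the classification attributes with lower knowledge roughness to build the tree. Experimental results show that this is a simple and highly efficient way to generate decision trees.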

9.
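Lazy decision tree classification is a very effective classification method. Conceptually, it builds an "optimal" decision tree for each test instance. Most studies, however, are based on small data sets. On large data sets, its slow classification speed, high memory consumption and susceptibility to noise impair its classification performance. By analyzing the classification principles of lazy decision trees and ordinary decision trees, a new decision tree classification model, Semi-LDtree, is proposed. The internal nodes of the tree it generates contain univariate splits, as in an ordinary decision tree, but each leaf node acts as a lazy decision tree classifier. This model retains the good interpretability of ordinary decision trees. Experimental results show that it improves classification speed and accuracy, and that on some classification tasks it frequently outperforms both of the other methods, especially on large data sets.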

10.
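Latent attribute space tree classifier   Total citations: 2, self-citations: 0, citations by others: 2
He Ping  Xu Xiaohua  Chen Ling  Journal of Software (《软件学报》) 2009, 20(7): 1735-1745
A latent attribute space tree classifier (LAST) framework is proposed. By transforming the original attribute space into a latent attribute space in which the data are easier to separate or which better suits the characteristics of decision tree classification, it overcomes the decision-surface limitations of traditional decision tree algorithms and improves the generalization performance of tree classifiers. Within the LAST framework, two SVD (singular value decomposition) oblique decision tree (SODT) algorithms are proposed. They apply singular value decomposition to global or local data to construct an orthogonal latent attribute space and then build a traditional univariate decision tree or tree nodes in that space, thereby indirectly obtaining a near-optimal oblique decision tree in the original space. The SODT algorithms can handle data sets whose global and local data distributions are the same or different, make full use of the structural information of both labeled and unlabeled data, give classification results that are unaffected by random reordering of the samples, and have the same time complexity as univariate decision tree algorithms. Experiments on complex data sets show that, compared with traditional univariate decision tree algorithms and other oblique decision tree algorithms, the SODT algorithms achieve higher classification accuracy, build trees of more stable size, and are more robust in overall classification performance; their tree construction time is close to that of C4.5 and far shorter than that of other oblique decision tree algorithms.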

11.
We explore the effect of using bagged decision trees (BDT) as an ensemble learning method, together with the proposed time-domain feature extraction methods, on electrocardiogram (ECG) arrhythmia beat classification, in comparison with a single decision tree (DT) classifier. The RR interval is the main property that defines irregular heart rhythm, and its ratio to the previous value and its difference from the mean value are used as morphological feature extraction methods. Form factor, its ratio to the previous value and its difference from the mean value are used to express ECG waveform complexity. In addition, skewness and second-order linear predictive coding coefficients are added to the feature vector of 56,569 ECG heart beats obtained from the MIT–BIH arrhythmia database as time-domain feature extraction methods. A quarter of the ECG heart beat samples are used as test data for DT and BDT. Both classifiers are evaluated using accuracy, sensitivity, specificity and the Kappa coefficient, and the performance of the BDT classifier is examined for up to 75 base learners. According to these measures, BDT achieves better predictive performance than DT. BDT with 69 base learners has 99.51 % accuracy, 97.50 % sensitivity, 99.80 % specificity and a Kappa coefficient of 0.989, while DT gives 98.78 %, 96.05 %, 99.57 % and 0.975, respectively. These metrics show that the suggested BDT increases the number of successfully identified arrhythmia beats. Moreover, BDT with at least three base learners has higher distinguishing capability than DT.
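For reference, the four reported measures can be computed from confusion-matrix counts as in the sketch below. It assumes the binary case (arrhythmia beat versus normal beat), whereas the study may use more classes; the function name and the counts in the example are hypothetical and are not the paper's data.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, specificity and Cohen's kappa from binary confusion counts."""
    n = tp + fn + fp + tn
    accuracy = (tp + tn) / n
    sensitivity = tp / (tp + fn)   # true positive rate
    specificity = tn / (tn + fp)   # true negative rate
    # Agreement expected by chance, from the row and column marginals.
    p_e = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / (n * n)
    kappa = (accuracy - p_e) / (1 - p_e)
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
    }

# Hypothetical counts for illustration only.
m = binary_metrics(tp=40, fn=10, fp=5, tn=45)
```

Kappa corrects accuracy for chance agreement, which is why it is reported as a coefficient between -1 and 1 rather than as a percentage.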

12.
Automatic and accurate arrhythmia diagnosis from electrocardiogram (ECG) signals is significant for cardiac health. Typically, arrhythmia is detected automatically from single-lead signals or from a simple combination of multilead ECG signals. However, this ignores the inter-lead correlation and the significance of different leads for detecting different heart beats, which degrades diagnostic performance. In this paper, arrhythmia diagnosis is converted into a multigranulation computing problem from the viewpoint of granular computing, so that different lead signals can be captured to improve the effectiveness of abnormal heart beat detection. To this end, multilead ECG signals are first granulated into different fuzzy information granules by a fuzzy equivalence relation. An objective decision-making model based on fuzzy set theory is then proposed for describing and analyzing these granulated multilead ECG signals, which yields self-adaptive and unsupervised decision making. As a result, the significance and correlation of different leads are analyzed through granularity selection and granular structures to make a better decision for arrhythmia diagnosis. Extensive experimental results show that the proposed algorithm significantly improves the performance of arrhythmia diagnosis, with notably better robustness across several types of cardiac arrhythmia.

13.
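For incremental data sets, an incremental algorithm for constructing multivariate decision trees is presented that combines the advantages of rough set theory and multivariate decision trees. When a newly added sample conflicts with the existing rule set, that is, when the condition attributes match but the decision attribute does not, the algorithm computes the core of the condition attributes relative to the decision attribute. If the core is not empty, it computes the relative generalization of the core with respect to the decision attribute, forms different subsets according to the results, and finally forms different branches of the decision tree. The algorithm avoids repeatedly rebuilding the decision tree when processing incremental data sets. An example demonstrates the correctness of the algorithm, which performs well on small incremental data sets.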

14.
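Construction of a parallel virtual machine for heterogeneous clusters based on the bus bridge protocol   Total citations: 2, self-citations: 0, citations by others: 2
Jin Lijie  Zhang Jianjun  Li Wei  Journal of Software (《软件学报》) 1997, 8(6): 417-424
BBP_PVM is the PVM version developed for BBP_SPC (bus bridge protocol-scalable parallel computer), the heterogeneous, scalable parallel computer cluster system based on the bus bridge protocol at the Department of Computer Science, Beijing University of Aeronautics and Astronautics (北京航空航天大学). BBP_PVM uses the message passing layer sub-protocol of the bus bridge multi-machine interconnection protocol (BBP_MPL) as the communication protocol among the processors within the virtual machine. BBP_MPL is a compact and reliable inter-machine communication protocol implemented on top of BBP reliable links, and its adoption effectively reduces the overhead of message acknowledgment, retransmission and dynamic buffer management during communication. BBP_PVM is compatible with PVM 3.3.4 and later versions.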

15.
Multivariate Decision Trees   Total citations: 24, self-citations: 0, citations by others: 24
Unlike a univariate decision tree, a multivariate decision tree is not restricted to splits of the instance space that are orthogonal to the features' axes. This article addresses several issues for constructing multivariate decision trees: representing a multivariate test, including symbolic and numeric features, learning the coefficients of a multivariate test, selecting the features to include in a test, and pruning of multivariate decision trees. We present several new methods for forming multivariate decision trees and compare them with several well-known methods. We compare the different methods across a variety of learning tasks, in order to assess each method's ability to find concise, accurate decision trees. The results demonstrate that some multivariate methods are in general more effective than others (in the context of our experimental assumptions). In addition, the experiments confirm that allowing multivariate tests generally improves the accuracy of the resulting decision tree over a univariate tree.
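The difference between the two kinds of test can be made concrete with a short sketch. It assumes numeric features and a linear multivariate test of the form w·x <= t; the function names, weights and thresholds are hypothetical choices for illustration and are not taken from the article.

```python
def univariate_test(x, feature, threshold):
    """Axis-parallel test: compares a single feature with a threshold."""
    return x[feature] <= threshold

def multivariate_test(x, weights, threshold):
    """Oblique test: compares a linear combination of features with a threshold."""
    return sum(w * xi for w, xi in zip(weights, x)) <= threshold

# Hypothetical points; the positive class is the region x0 + x1 <= 1.
inside = [(0.2, 0.2), (0.9, 0.0), (0.0, 0.9)]
outside = [(0.9, 0.9)]

# One oblique test separates the two groups exactly.
oblique_ok = (
    all(multivariate_test(p, (1, 1), 1.0) for p in inside)
    and not any(multivariate_test(p, (1, 1), 1.0) for p in outside)
)
# A single axis-parallel cut on x0 at 0.5 misclassifies (0.9, 0.0).
axis_ok = all(univariate_test(p, 0, 0.5) for p in inside)
```

A univariate tree would need several stacked axis-parallel cuts to approximate this diagonal boundary, which is the conciseness argument made in the abstract.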

16.
Functional Trees   Total citations: 1, self-citations: 0, citations by others: 1
In the context of classification problems, algorithms that generate multivariate trees are able to explore multiple representation languages by using decision tests based on a combination of attributes. In the regression setting, model tree algorithms explore multiple representation languages by using linear models at leaf nodes. In this work we study the effects of using combinations of attributes at decision nodes, leaf nodes, or both nodes and leaves in regression and classification tree learning. In order to study the use of functional nodes at different places and for different types of modeling, we introduce a simple unifying framework for multivariate tree learning. This framework combines a univariate decision tree with a linear function by means of constructive induction. Decision trees derived from the framework are able to use decision nodes with multivariate tests, and leaf nodes that make predictions using linear functions. Multivariate decision nodes are built when growing the tree, while functional leaves are built when pruning the tree. We experimentally evaluate a univariate tree, a multivariate tree using linear combinations at inner and leaf nodes, and two simplified versions restricting linear combinations to inner nodes and leaves. The experimental evaluation shows that all functional tree variants exhibit similar performance, with advantages in different datasets. In this study there is a marginal advantage of the full model. These results lead us to study the role of functional leaves and nodes. We use the bias-variance decomposition of the error, cluster analysis, and learning curves as tools for analysis. We observe that in the datasets under study and for classification and regression, the use of multivariate decision nodes has more impact on the bias component of the error, while the use of multivariate decision leaves has more impact on the variance component.

17.
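Zhang Longfei  Zhang Yue  Computer Engineering (《计算机工程》) 2011, 37(16): 282-284
To meet the QRS-complex analysis needs of multi-lead ECG monitors, a real-time multi-lead QRS detection algorithm is proposed. The raw ECG signal is processed with power-line-frequency filtering and low-pass filtering, each lead is judged for QRS complexes according to single-lead pre-detection rules, and the final result is obtained by decision fusion of the results from the multiple leads. Validation on the St. Petersburg INCART 12-lead arrhythmia database shows that the algorithm achieves an average recognition rate of 99.88% and an accuracy of 99.73%.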
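The abstract does not say which fusion rule combines the per-lead results. As one plausible reading, the sketch below assumes simple majority voting over boolean per-lead QRS flags for each candidate time window; the function name, the `min_votes` parameter and the sample flags are hypothetical.

```python
def fuse_lead_decisions(lead_flags, min_votes=None):
    """Fuse per-lead QRS detections by voting (assumed rule, not the paper's).

    lead_flags: one list of booleans per lead, with one flag per candidate window.
    min_votes: votes needed to accept a window; defaults to a strict majority.
    """
    n_leads = len(lead_flags)
    if min_votes is None:
        min_votes = n_leads // 2 + 1
    # zip(*lead_flags) yields, for each window, the tuple of flags across leads.
    return [sum(window) >= min_votes for window in zip(*lead_flags)]

# Hypothetical flags from three leads over four candidate windows.
flags = [
    [True, False, True, False],
    [True, False, False, False],
    [False, False, True, True],
]
fused = fuse_lead_decisions(flags)
```

Lowering `min_votes` trades specificity for sensitivity, so a real monitor would tune it (or weight leads by signal quality) against an annotated database.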
