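A Multivariate Decision Tree for Big Data Classification of Distributed Data Streams
Cite this article: Zhang Yu, Bao Yanke, Shao Liangshan, Liu Wei. A Multivariate Decision Tree for Big Data Classification of Distributed Data Streams[J]. Acta Automatica Sinica, 2018, 44(6): 1115-1127.
Authors: Zhang Yu (张宇), Bao Yanke (包研科), Shao Liangshan (邵良杉), Liu Wei (刘威)
Affiliation: 1. School of Science, Liaoning Technical University, Fuxin 123000
Funding: National Natural Science Foundation of China, grant 71371091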
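Abstract: Class boundaries in distributed big data streams are irregular and change over time, so an ensemble classifier built on univariate decision trees needs a large number of base classifiers to approximate those boundaries accurately, which degrades the ensemble's learning and classification performance. This paper therefore proposes a multivariate decision tree based on geometric outline similarity. Guided by an optimal reference vector, the sample points in n-dimensional space are projected onto a one-dimensional space to form an ordered set of projection points. The ordered set is then partitioned into subsets by the projection boundaries of the classes, and the intersections of the sets belonging to different classes are recursively projected and split until a decision tree is obtained. Experiments show that the proposed multivariate decision tree, GODT, achieves high classification accuracy with low training time, effectively combining the high learning efficiency of univariate decision trees with the strong representational power of multivariate decision trees.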

Keywords: distributed data streams; big data; classification; geometric outline similarity; multivariate decision tree
Received: 2016-12-14

A Multivariate Decision Tree for Big Data Classification of Distributed Data Streams
Affiliation:1. School of Science, Liaoning Technical University, Fuxin 123000; 2. Research Institute of System Engineering, Liaoning Technical University, Fuxin 123000
Abstract:The class boundaries of distributed big data streams are irregular and variable, so when univariate decision trees are used as the base classifiers of an ensemble classifier, a large number of base classifiers is needed to approximate the class boundaries accurately. This reduces the learning and classification performance of the ensemble. This article proposes a multivariate decision tree based on geometric outline similarity (GODT). First, guided by the optimal reference vector, the n-dimensional data points are projected onto a one-dimensional space to establish an ordered set of projection points. Second, the ordered set is divided into several subsets by the class projection boundaries, and the intersections of the subsets belonging to different classes are recursively projected and split. Finally, a decision tree is built. Experimental results show that GODT achieves high classification accuracy and requires little training time. It combines the high learning efficiency of univariate decision trees with the strong representation power of multivariate decision trees.
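Keywords:distributed data streams; big data; classification; geometric outline similarity; multivariate decision tree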
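The project-and-split procedure outlined in the abstract can be illustrated with a small sketch. It is not the authors' GODT implementation: the abstract does not say how the optimal reference vector is derived from geometric outline similarity, so the hypothetical `reference_vector` below simply uses the difference between the means of the two most frequent classes as a stand-in. The node layout, the function names, and the toy data are likewise assumptions made for illustration.

```python
from collections import Counter

def mean(points):
    """Component-wise mean of a list of equal-length points."""
    return [sum(col) / len(points) for col in zip(*points)]

def reference_vector(points, labels):
    # ASSUMPTION: stand-in for the paper's optimal reference vector.
    # Uses the difference between the means of the two most frequent classes.
    (a, _), (b, _) = Counter(labels).most_common(2)
    ma = mean([p for p, y in zip(points, labels) if y == a])
    mb = mean([p for p, y in zip(points, labels) if y == b])
    return [u - v for u, v in zip(ma, mb)]

def project(point, ref):
    """Map an n-dimensional point to one dimension (dot product with ref)."""
    return sum(p * r for p, r in zip(point, ref))

def signature(v, intervals):
    """Classes whose projected [min, max] interval contains the value v."""
    return frozenset(c for c, (lo, hi) in intervals.items() if lo <= v <= hi)

def build(points, labels, depth=0, max_depth=8):
    counts = Counter(labels)
    majority = counts.most_common(1)[0][0]
    if len(counts) == 1 or depth == max_depth:
        return {"leaf": majority}
    ref = reference_vector(points, labels)
    if not any(ref):  # class means coincide, so this stand-in cannot split
        return {"leaf": majority}
    proj = [project(p, ref) for p in points]
    # Class projection boundaries: min and max projected value per class.
    intervals = {}
    for v, y in zip(proj, labels):
        lo, hi = intervals.get(y, (v, v))
        intervals[y] = (min(lo, v), max(hi, v))
    # Group points by which class intervals cover their projection.
    groups = {}
    for p, y, v in zip(points, labels, proj):
        ps, ys = groups.setdefault(signature(v, intervals), ([], []))
        ps.append(p)
        ys.append(y)
    if len(groups) == 1:  # the projection separated nothing
        return {"leaf": majority}
    # Single-class regions become leaves; overlap regions are split again.
    children = {sig: build(ps, ys, depth + 1, max_depth)
                for sig, (ps, ys) in groups.items()}
    return {"ref": ref, "intervals": intervals,
            "children": children, "default": majority}

def predict(node, x):
    while "leaf" not in node:
        sig = signature(project(x, node["ref"]), node["intervals"])
        node = node["children"].get(sig, {"leaf": node["default"]})
    return node["leaf"]

# Toy data (made up): the classes overlap along the first projection.
points = [(0, 0), (1, 1), (2, 0), (5, 1), (3, 1), (4, 0), (6, 0), (7, 1)]
labels = ["a", "a", "a", "a", "b", "b", "b", "b"]
tree = build(points, labels)
predictions = [predict(tree, p) for p in points]
```

On this toy data the first projection leaves one region where the intervals of both classes overlap. Only the points in that region are projected again, along a new vector, which mirrors the recursive treatment of class intersections described in the abstract.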