
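Design and Performance Analysis of a Parallel Decision Tree Algorithm on a Data Mining Grid
Cite this article: CHEN Ping, QIAO Xiu-quan, LIU Zhen, TIAN Xiao-ping. Design and Performance Analysis of a Parallel Decision Tree Algorithm on Data Mining Grid[J]. Journal of Beijing University of Posts and Telecommunications, 2009, 32(z1): 49-52.
Authors: CHEN Ping  QIAO Xiu-quan  LIU Zhen  TIAN Xiao-ping
Affiliations: Information Network Center, Beijing Normal University, Beijing 100875; State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876
Funding: National Natural Science Foundation of China; Specialized Research Fund for the Doctoral Program of Higher Education; Beijing Nova Program
Abstract: A parallel version of the C4.5 decision tree algorithm is proposed so that the traditionally serial classification algorithm can mine data in parallel on a data mining grid composed of multiple PCs and servers. By partitioning the data both vertically and horizontally and parallelizing the recursive procedure, the algorithm achieves scalable, high-performance parallel computation and addresses the lack of a good parallel classification algorithm for massive data sets. Methods for guiding efficient execution of the parallel algorithm are also given. Experiments and algorithm analysis show that the performance of the parallel algorithm is affected by several factors and that it achieves high parallel efficiency and speedup.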

Keywords: data mining  grid computing  decision tree  parallel performance
Received: 2009-04-13

Design and Performance Analysis of a Parallel Decision Tree Algorithm on Data Mining Grid
CHEN Ping,QIAO Xiu-quan,LIU Zhen,TIAN Xiao-ping.Design and Performance Analysis of a Parallel Decision Tree Algorithm on Data Mining Grid[J].Journal of Beijing University of Posts and Telecommunications,2009,32(z1):49-52.
Authors:CHEN Ping  QIAO Xiu-quan  LIU Zhen  TIAN Xiao-ping
Abstract:A parallel C4.5 decision tree algorithm is proposed for a grid of personal computers and servers. It allows parallel data mining to run efficiently on the data mining grid. Vertical and horizontal data partitioning is combined with parallelization of the recursive procedure. The algorithm is scalable and addresses the lack of an efficient parallel classification algorithm for massive data. Analysis and experiments on the parallel decision tree show that computing efficiency is affected by several parameters and that the algorithm achieves high performance and high speedup. Guidelines for improving efficiency are also proposed.
Keywords:data mining  grid computing  decision tree  parallel performance
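
The abstract names horizontal data partitioning but gives no implementation details, so the following is only a minimal sketch of that one idea, not the authors' algorithm. It assumes each grid node holds a subset of the rows, counts (attribute value, class label) pairs locally, and sends the counts to a coordinator that merges them and computes the C4.5 gain ratio. All function names (`local_counts`, `gain_ratio`, `parallel_gain_ratio`) are hypothetical, and the nodes are simulated by a plain loop.

```python
from collections import Counter
from math import log2


def local_counts(rows, attr_index, label_index=-1):
    """Count (attribute value, class label) pairs in one horizontal partition."""
    return Counter((row[attr_index], row[label_index]) for row in rows)


def entropy(counts):
    """Shannon entropy (in bits) of a list of non-negative counts."""
    total = sum(counts)
    return -sum(c / total * log2(c / total) for c in counts if c > 0)


def gain_ratio(pair_counts):
    """C4.5 gain ratio of one attribute from merged (value, label) counts."""
    total = sum(pair_counts.values())
    by_label = Counter()
    by_value = Counter()
    for (value, label), n in pair_counts.items():
        by_label[label] += n
        by_value[value] += n
    # Entropy of the class labels before splitting.
    base = entropy(list(by_label.values()))
    # Weighted entropy of the class labels after splitting on the attribute.
    remainder = 0.0
    for value, n_value in by_value.items():
        label_counts = [n for (v, _), n in pair_counts.items() if v == value]
        remainder += n_value / total * entropy(label_counts)
    # Split information penalizes attributes with many distinct values.
    split_info = entropy(list(by_value.values()))
    return (base - remainder) / split_info if split_info > 0 else 0.0


def parallel_gain_ratio(partitions, attr_index):
    """Merge per-partition counts, then evaluate the attribute once.

    Assumption: each loop iteration stands in for work done on a separate node.
    """
    merged = Counter()
    for rows in partitions:
        merged.update(local_counts(rows, attr_index))
    return gain_ratio(merged)
```

Because the counts are additive, merging them gives the same gain ratio as a serial pass over all rows, and only small count tables need to travel between nodes rather than the rows themselves.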
This article is indexed in Wanfang Data and other databases.