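A Method to Improve the Efficiency of Parallel Data Mining
Cite this article: She Chundong, Fan Zhihua, Sun Shixin, Che Zhuming, Tang Jian. A Method to Improve the Efficiency of Parallel Data Mining [J]. Computer Science (计算机科学), 2004, 31(2): 132-134.
Authors: She Chundong (佘春东)  Fan Zhihua (范植华)  Sun Shixin (孙世新)  Che Zhuming (车著明)  Tang Jian (唐剑)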
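Affiliations: 1. School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054
2. Institute of Software, Chinese Academy of Sciences, Beijing 100080
3. Technical Department, Xichang Satellite Launch Center, Xichang 615000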
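Funding: Knowledge Innovation Program of the Chinese Academy of Sciences, directional research project fund (project title: Large-Scale Digital Object Application Environment and Its Parallel Simulation; grant no. KGCX2-JG-09); Experimental Technology Project Fund of the Xichang Satellite Launch Center, General Armament Department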
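Abstract: Discovering association rules is an important data mining task. This paper introduces several serial and parallel data mining algorithms. Among them, the IDD algorithm is an efficient and scalable parallel algorithm for discovering association rules. However, as the number of processors increases, load imbalance causes its efficiency to drop severely. We resolve this problem by introducing approximation algorithms. We present two approximation algorithms, one online and one offline, together with proofs of their performance. Finally, we analyze the complexity of the improved IDD algorithm.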

Keywords: Database  Knowledge discovery  Parallel data mining efficiency  Association rules  Data sets  Data-driven  Computer

A Method to Improve the Performance of Parallel Data Mining
SHE Chun-Dong FAN Zhi-Hua SUN Shi-Xing TANG Jian CHE Zhu-Ming. A Method to Improve the Performance of Parallel Data Mining [J]. Computer Science, 2004, 31(2): 132-134.
Authors: SHE Chun-Dong  FAN Zhi-Hua  SUN Shi-Xing  TANG Jian  CHE Zhu-Ming
Abstract: Discovery of association rules is an important data mining task. This paper describes several parallel and sequential algorithms for the problem. The IDD algorithm is an efficient and scalable parallel method for discovering association rules. However, it becomes less effective as the number of processors increases, because the load becomes imbalanced. IDD is therefore improved by introducing approximation algorithms that address the load-balancing problem. Two approximation algorithms are given, one online and one offline, along with proofs of their performance ratios. The last part presents a complexity analysis of the improved IDD algorithm.
Keywords: Data mining  Parallel processing  Association rules  Load balance  Scalability  Approximate algorithm  Online algorithm  Offline algorithm
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.