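Updating Frequent Itemsets in Distributed Database Systems Based on DDMINER
Cite this article: JI Gen-Lin, YANG Ming, ZHAO Bin, SUN Zhi-Hui. Updating Frequent Itemsets in Distributed Database Systems Based on DDMINER[J]. Chinese Journal of Computers, 2003, 26(10): 1387-1392.
Authors: JI Gen-Lin  YANG Ming  ZHAO Bin  SUN Zhi-Hui
Affiliations: 1. Department of Computer Science and Engineering, Southeast University, Nanjing 210096; Department of Computer Science, Nanjing Normal University, Nanjing 210097
2. Department of Computer Science and Engineering, Southeast University, Nanjing 210096
3. Department of Computer Science, Nanjing Normal University, Nanjing 210097
Funding: Supported by the National Natural Science Foundation of China (79970092) and the Natural Science Foundation of the Jiangsu Provincial Department of Education (2001SXXTSJB12)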
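Abstract: This paper presents DDMINER, an architecture for a distributed data mining system, and examines the problem of updating frequent itemsets in distributed database systems, covering both the insertion and the deletion of transactions. Based on DDMINER, it proposes ULF, an algorithm for updating local frequent itemsets, and UGF, an algorithm for updating global frequent itemsets. The algorithms generate a small number of candidate frequent itemsets, and when the global frequent itemsets are computed, the communication cost of transmitting the support counts of candidate local frequent itemsets is O(n). The algorithms were implemented in Java and their performance was studied; the experimental results show that they are correct, feasible, and efficient.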

Keywords: distributed database system  frequent itemsets  distributed data mining system  architecture  DDMINER
Revision received: July 22, 2002

Updating Technique for Frequent Itemsets in Distributed Database Systems Based on DDMINER
JI Gen-Lin, YANG Ming, ZHAO Bin, SUN Zhi-Hui. Updating Technique for Frequent Itemsets in Distributed Database Systems Based on DDMINER[J]. Chinese Journal of Computers, 2003, 26(10): 1387-1392.
Authors: JI Gen-Lin, YANG Ming, ZHAO Bin, SUN Zhi-Hui
Affiliation: JI Gen-Lin 1),2), YANG Ming 1), ZHAO Bin 2), SUN Zhi-Hui 1) 1)
Abstract: Several algorithms have been proposed for maintaining association rules mined from databases, but very little research has addressed updating techniques for association rules in distributed databases. Directly applying association rule maintenance algorithms to distributed databases is not effective. Updating frequent itemsets is the key problem in updating association rules in distributed database systems. This paper presents DDMINER, a system architecture for mining association rules in distributed database systems, and studies techniques for updating frequent itemsets in distributed databases. Two more general incremental updating algorithms, ULF and UGF, are presented for maintaining the frequent itemsets discovered in a distributed database system based on DDMINER; they handle both transaction insertion and transaction deletion. ULF updates the local frequent itemsets discovered at a site. UGF computes the global frequent itemsets after transactions are inserted into or deleted from a distributed database. It requires only O(n) communication messages for the support count of each candidate frequent itemset, where n is the number of sites in the distributed database. The algorithms produce a small number of candidate frequent itemsets. The algorithms have been implemented in Java and their performance has been studied. The experimental results show that the algorithms are valid and fast. UGF outperforms the direct application of DMA to mining the global frequent itemsets.
Keywords: frequent itemsets  association rule  frequent itemsets updating  distributed data mining  KDD
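
The abstract states that the global support count of each candidate itemset can be refreshed with O(n) messages, where n is the number of sites. The sketch below illustrates that general idea only; it is not the paper's ULF or UGF, whose details are not given on this page. It assumes each site reports one message containing its inserted and deleted transactions, and that the old global support count and database size are already known from the previous mining run. The names `SiteUpdate`, `local_delta`, and `update_global` are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Tuple

Transaction = FrozenSet[str]


@dataclass
class SiteUpdate:
    """Hypothetical per-site message: changes at one site since the last mining run."""
    inserted: List[Transaction]  # transactions added at this site
    deleted: List[Transaction]   # transactions removed from this site


def support_count(itemset: Transaction, transactions: Iterable[Transaction]) -> int:
    """Number of transactions that contain every item of the itemset."""
    return sum(1 for t in transactions if itemset <= t)


def local_delta(itemset: Transaction, update: SiteUpdate) -> Tuple[int, int]:
    """Return (change in support count, change in transaction count) at one site.

    Only the inserted and deleted transactions are scanned, not the whole
    local database.
    """
    count_change = (support_count(itemset, update.inserted)
                    - support_count(itemset, update.deleted))
    size_change = len(update.inserted) - len(update.deleted)
    return count_change, size_change


def update_global(itemset: Transaction, old_count: int, old_size: int,
                  updates: Iterable[SiteUpdate],
                  min_support: float) -> Tuple[int, int, bool]:
    """Combine one delta per site into the new global count, size, and status."""
    new_count, new_size = old_count, old_size
    for update in updates:  # one message per site, so n messages per candidate
        count_change, size_change = local_delta(itemset, update)
        new_count += count_change
        new_size += size_change
    is_frequent = new_count >= min_support * new_size
    return new_count, new_size, is_frequent
```

This shortcut applies only to itemsets whose old support count was recorded. An itemset that was not counted before the update would still require scanning the unchanged part of each local database.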
This article is indexed in CNKI, VIP (维普), Wanfang Data, and other databases.