首页 | 官方网站   微博 | 高级检索  
     

基于图的频繁闭项集挖掘算法
引用本文:李力,翟东海,靳蕃.基于图的频繁闭项集挖掘算法[J].西南交通大学学报,2004,39(3):385-389.
作者姓名:李力  翟东海  靳蕃
作者单位:西南交通大学计算机与通信工程学院,四川,成都,610031
基金项目:国家中西药管理局项目(2000 J P 54)
摘    要:为了提高数据挖掘效率,提出了一种基于图的频繁闭项集挖掘算法GFCG(graph—based frequent closed itemset generation).该算法采用位矢量技术构造有向图,表示项与项之间的频繁关系,并在有向图的基础上递归产生频繁闭项集,从而只需扫描数据库2次,不产生候选集;引入扩展频繁项集的概念,大大减小了检查频繁项集是否闭的搜索空间.用1个真实数据库和2个合成数据库对GFCG进行了测试,并与A-close和CLOSET算法的结果进行了比较,结果表明,该算法具有良好的速度和可伸缩性性能.

关 键 词:数据库    数据挖掘  频繁闭集  位向量
文章编号:0258-2724(2004)03-0385-05

Graph-Based Algorithm for Mining Frequent Closed Itemsets
LI Li,ZHAI Dong-hai,JIN Fan.Graph-Based Algorithm for Mining Frequent Closed Itemsets[J].Journal of Southwest Jiaotong University,2004,39(3):385-389.
Authors:LI Li  ZHAI Dong-hai  JIN Fan
Abstract:To raise the efficiency of data mining, a graph-based algorithm for mining frequent closed itemsets, called GFCG (graph-based frequent closed itemset generation) was proposed. In this algorithm, the bit vector technique is used to construct a directed graph to represent the frequent relationship between items, and frequent closed itemsets are generated recursively from this graph. As a result, the GFCG scans the database for only two times, and does not generate candidate sets. Furthermore, the concept of an expanded frequent itemset is introduced to greatly decrease the searching range for adjusting whether a frequent itemset is closed or not. In addition, this algorithm was tested by using one actual and two synthetical databases. The testing result shows that compared with the A-close and CLOSET algorithms, the proposed algorithm has good speed and scale-up properties.
Keywords:database  graph  data mining  frequent closed itemset  bit vector
本文献已被 CNKI 维普 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号