基于格的快速频繁项集挖掘算法 |
| |
引用本文: | 刘彩苹,毛建频,毛建旭,屈卫兰,蔡玉武.基于格的快速频繁项集挖掘算法[J].湖南大学学报(自然科学版),2013,40(10):52-57. |
| |
作者姓名: | 刘彩苹 毛建频 毛建旭 屈卫兰 蔡玉武 |
| |
作者单位: | (1.湖南大学 信息科学与工程学院,湖南 长沙410082;2.抚州职业技术学院 信息工程系,江西 抚州344000;3.湖南大学 电气与信息工程学院,湖南 长沙410082) |
| |
摘 要: | 随着数据库规模的增加或支持度阈值的减少,频繁模式的数量将以指数形式增长,FP-growth算法运行的时空效率将大为降低.本文提出一种基于格的快速频繁项集挖掘算法LFP-growth,算法利用等价关系将原来的搜索空间(格)划分成若干个较小的子空间(子格),通过子格间的迭代分解,将对网格P(I)的频繁项集挖掘转化为对多个子格的并集进行的约束频繁项集挖掘.实验结果和理论分析表明,在挖掘大型数据库时,LFP-growth算法的时间和空间性能均优于FP-growth算法.
|
关 键 词: | 数据挖掘 FP-树 频繁项集 格 |
Lattice-based Algorithm for Fast Mining Frequent Itemsets |
| |
Abstract: | Along with the increasing size of database and the reduction of support threshold, the number of frequent patterns will grow exponentially, and the time and space efficiency of the FP-growth algorithm will greatly reduce. The cause of low efficiency was analyzed, and according to the analysis, a lattice-based algorithm for fast mining frequent itemsets (LFP-growth) was presented. The proposed algorithm divided a large lattice into many sub-lattices by using equivalence relation. Through iterativing decomposition of sublattices, frequent itemset mining in lattice was transformed into frequent itemsets mining in a union set of multiple sublattices. Experiments have shown that the time and space performance of LFP-growth algorithm is superior to that of FP-growth algorithm in mining large database. |
| |
Keywords: | data mining FP-tree frequent itemsets lattice |
|
| 点击此处可从《湖南大学学报(自然科学版)》浏览原始摘要信息 |
|
点击此处可从《湖南大学学报(自然科学版)》下载全文 |