首页 | 官方网站   微博 | 高级检索  
     

基于倒排索引位运算的深度优先频繁项集挖掘
引用本文:傅向华,陈冬剑,王志强.基于倒排索引位运算的深度优先频繁项集挖掘[J].小型微型计算机系统,2012,33(8):1747-1751.
作者姓名:傅向华  陈冬剑  王志强
作者单位:深圳大学计算机与软件学院,广东深圳,518060
基金项目:国家自然科学基金项目,深圳市科技计划项目
摘    要:频繁项集挖掘是关联规则挖掘中的关键任务,非常耗费时间.为提高频繁项集的产生效率,提出一种基于倒排索引位运算的深度优先频繁项集挖掘算法(DF-FIMBII).该算法以二进制数组存储项目到事务的倒排索引,通过位运算计算两个项目的支持计数,并采用深度优先搜索策略递归地挖掘不同的k-频繁项集.在chess、mushroom、pumb_star、T40I10D100K等数据集上,对DF-FIMBII、Apriori、ECLAT、BitTableFI、Index-BitTableFI等算法进行了实验比较.实验结果表明,在数据规模不是非常巨大和支持度较小的情况下,无论数据集的稠密程度如何,DF-FIMBII均具有较好的时间优越性.

关 键 词:频繁项集  二进制数组  倒排索引  深度优先搜索

Depth First Frequent Itemset Mining Based on Bittable and Inverted Index
FU Xiang-hua , CHEN Dong-jian , WANG Zhi-qiang.Depth First Frequent Itemset Mining Based on Bittable and Inverted Index[J].Mini-micro Systems,2012,33(8):1747-1751.
Authors:FU Xiang-hua  CHEN Dong-jian  WANG Zhi-qiang
Affiliation:(College of Computer Science and Software Engineering,Shenzhen University,Shenzhen 518060,China)
Abstract:Frequent itemset mining is a very important procedure in the association rule mining.A Depth First Frequent Itemset Mining Based on Bittable and Inverted Index(DF-FIMBII) is proposed in this paper.DF-FIMBII stores the inverted index of the vertical structure of items to transactions with bittables arrays,and calculates the support counts of two items through the bit operations.Furthermore,DF-FIMBII computes different k-frequent itemsets with a depth first search strategy.We compared the execution time of DF-FIMBII,Apriori,ECLAT,BitTableFI and Index-BitTableFI in four datasets such as chess,mushroom,pumb_star and T40I10D100K.The experiment results show that DF-FIMBII can improve the time consumption when the volume of dataset is not very large and the support value is relative small.
Keywords:frequent itemset  bittable array  inverted index  depth first search
本文献已被 CNKI 万方数据 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号