An Algorithm for Mining Frequent Closed Itemsets in Data Streams over Sliding Window
Citation: Huang Guoyan, Wang Libo, Ren Jiadong. An Algorithm for Mining Frequent Closed Itemsets in Data Streams over Sliding Window [J]. Journal of Computer Research and Development, 2009, 46(Z2).
Authors: Huang Guoyan, Wang Libo, Ren Jiadong
Affiliations: 1. College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004
2. College of Information Science and Engineering, Yanshan University, Qinhuangdao, Hebei 066004; School of Computer Science and Technology, Beijing Institute of Technology, Beijing 100081
Funding: Natural Science Foundation of Hebei Province
Abstract: Frequent itemset mining is a hot topic in data stream mining. This paper proposes a new algorithm, MFCI-SW, for mining frequent closed itemsets in data streams. Two new data structures are designed first: a frequent closed itemsets list (FCIL) and a frequent closed pattern tree (MFCI-SW-Tree). The sliding window is divided into several basic windows, and the basic window serves as the unit of update. In each basic window, the data items of the frequent closed itemsets are extracted, and their support F and window sequence number K are stored in FCIL. When a new basic window arrives, FCIL is updated and MFCI-SW-Tree is pruned by deleting the data items with the smallest K from FCIL and inserting the new data items. Finally, the frequent closed itemsets that satisfy the user's query are mined quickly from MFCI-SW-Tree. Experimental results show that MFCI-SW is clearly more efficient in execution than the DS-CFI algorithm.
Keywords: data stream; frequent closed itemsets; sliding window
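
The abstract gives no implementation details, so the following is only a minimal Python sketch of the general idea, not MFCI-SW itself. It assumes a window made of a fixed number of basic windows, a simplified FCIL-like table mapping each item to its per-window support (K to F), and eviction of the entries with the smallest K when a new basic window arrives. The class name `SlidingWindowFCI`, its methods, and the brute-force closure step (intersecting transactions) are all illustrative assumptions. The MFCI-SW-Tree is omitted entirely.

```python
from collections import deque


class SlidingWindowFCI:
    """Toy miner (hypothetical, not the paper's MFCI-SW): a sliding window
    made of a fixed number of basic windows."""

    def __init__(self, num_basic_windows, min_support):
        self.num_basic_windows = num_basic_windows
        self.min_support = min_support  # absolute count over the whole window
        self.windows = deque()          # (K, transactions) pairs, oldest first
        self.fcil = {}                  # item -> {K: F}, F = count in window K
        self.next_k = 0

    def add_basic_window(self, transactions):
        """Insert a new basic window; evict the one with the smallest K if full."""
        k = self.next_k
        self.next_k += 1
        batch = [frozenset(t) for t in transactions]
        self.windows.append((k, batch))
        for t in batch:
            for item in t:
                per_window = self.fcil.setdefault(item, {})
                per_window[k] = per_window.get(k, 0) + 1
        if len(self.windows) > self.num_basic_windows:
            oldest_k, _ = self.windows.popleft()
            for item in list(self.fcil):
                self.fcil[item].pop(oldest_k, None)
                if not self.fcil[item]:
                    del self.fcil[item]

    def frequent_closed_itemsets(self):
        """Return {itemset: support} for frequent closed itemsets in the window."""
        # Only items that are frequent on their own can appear in a result.
        frequent_items = {i for i, f in self.fcil.items()
                          if sum(f.values()) >= self.min_support}
        projected = [t & frequent_items
                     for _, batch in self.windows for t in batch]
        projected = [t for t in projected if t]
        # Closed itemsets are exactly the non-empty intersections of transactions.
        closed = set()
        for t in projected:
            closed |= {t} | {t & c for c in closed}
        closed.discard(frozenset())
        result = {}
        for c in closed:
            support = sum(1 for t in projected if c <= t)
            if support >= self.min_support:
                result[c] = support
        return result


miner = SlidingWindowFCI(num_basic_windows=2, min_support=2)
miner.add_basic_window([{"a", "b", "c"}, {"a", "b"}])
miner.add_basic_window([{"a", "c"}, {"b", "c"}])
miner.add_basic_window([{"a", "b", "c"}, {"a"}])  # evicts basic window K=0
for itemset, support in sorted(miner.frequent_closed_itemsets().items(),
                               key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), support)
```

After the third basic window arrives, the window holds the transactions {a, c}, {b, c}, {a, b, c}, and {a}, and the sketch reports {a} and {c} with support 3 and {a, c} and {b, c} with support 2. The itemset {b} is frequent but not closed, because every transaction containing b also contains c. Intersecting transactions is exponential in the worst case; a tree structure such as the one the paper describes would presumably avoid that cost.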