首页 | 官方网站   微博 | 高级检索  
     


Fast incremental mining of web sequential patterns with PLWAP tree
Authors:C I Ezeife  Yi Liu
Affiliation:(1) Faculty of Information Technology, Monash University, Clayton Campus, Wellington Road, Clayton, Vic, 3800, Australia
Abstract:Point and click at web pages generate continuous data sequences, which flow into the web log data, causing the need to update previously mined web sequential patterns. Algorithms for mining web sequential patterns from scratch include WAP, PLWAP and Apriori-based GSP. Reusing old patterns with only recent additional data sequences in an incremental fashion, when updating patterns, would achieve fast response time with reasonable memory space usage. This paper proposes two algorithms, RePL4UP (Revised PLWAP For UPdate), and PL4UP (PLWAP For UPdate), which use the PLWAP tree structure to incrementally update web sequential patterns efficiently without scanning the whole database even when previous small items become frequent. The RePL4UP concisely stores the position codes of small items in the database sequences in its metadata during tree construction. During mining, RePL4UP scans only the new additional database sequences, revises the old PLWAP tree to restore information on previous small items that have become frequent, while it deletes previous frequent items that have become small using the small item position codes. PL4UP initially builds a bigger PLWAP tree that includes all sequences in the database using a tolerance support, t, that is lower than the regular minimum support, s. The position code features of the PLWAP tree are used to efficiently mine these trees to extract current frequent patterns when the database is updated. These approaches more quickly update old frequent patterns without the need to re-scan the entire updated database.
Keywords:
本文献已被 SpringerLink 等数据库收录!
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号