Fast incremental mining of web sequential patterns with PLWAP tree |
| |
Authors: | C. I. Ezeife, Yi Liu |
| |
Affiliation: | (1) Faculty of Information Technology, Monash University, Clayton Campus, Wellington Road, Clayton, Vic, 3800, Australia |
| |
Abstract: | Pointing and clicking on web pages generates continuous data sequences, which flow into the web log data and make it necessary to update
previously mined web sequential patterns. Algorithms for mining web sequential patterns from scratch include WAP, PLWAP and
the Apriori-based GSP. Updating patterns incrementally, by reusing the old patterns together with only the recently added data sequences,
would achieve fast response time with reasonable memory usage. This paper proposes two algorithms, RePL4UP
(Revised PLWAP For UPdate) and PL4UP (PLWAP For UPdate), which use the PLWAP tree structure to incrementally update web sequential
patterns efficiently without scanning the whole database, even when previously small items become frequent. RePL4UP concisely
stores the position codes of small items in the database sequences in its metadata during tree construction. During mining,
RePL4UP scans only the newly added database sequences and revises the old PLWAP tree to restore information on previously small
items that have become frequent, and deletes previously frequent items that have become small, using the small-item position
codes. PL4UP initially builds a bigger PLWAP tree that includes all sequences in the database, using a tolerance support, t, that is lower than the regular minimum support, s. The position code features of the PLWAP tree are used to mine these trees efficiently and extract the current frequent patterns
when the database is updated. Both approaches update old frequent patterns more quickly, without re-scanning the entire
updated database. |
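The tolerance-support idea behind PL4UP can be illustrated with a minimal sketch. It is not the PLWAP tree or the authors' algorithm: it tracks only single-item supports, and the function names (`item_support`, `build_with_tolerance`, `frequent_after_update`) and the toy sequences are assumptions made for illustration. It shows how keeping counts for items that meet the lower threshold t lets an item that was small under s be recognized as frequent after an update, using only the new sequences.

```python
from collections import Counter

def item_support(sequences):
    """Count, for each item, how many sequences contain it at least once."""
    counts = Counter()
    for seq in sequences:
        counts.update(set(seq))  # one count per sequence, not per occurrence
    return counts

def build_with_tolerance(sequences, t):
    """Keep counts for every item whose support meets the tolerance support t."""
    threshold = t * len(sequences)
    return {item: c for item, c in item_support(sequences).items() if c >= threshold}

def frequent_after_update(kept, old_total, new_sequences, s):
    """Merge the kept counts with counts from the new sequences only, then apply s.

    Simplification: items that fell below t in the old data are ignored here.
    """
    new_total = old_total + len(new_sequences)
    delta = item_support(new_sequences)
    merged = {item: c + delta.get(item, 0) for item, c in kept.items()}
    return {item for item, c in merged.items() if c >= s * new_total}

# Hypothetical toy data: each sequence is a list of page identifiers.
old_db = [list("abdac"), list("eaebcac"), list("babfaec"), list("afbacfc")]
new_db = [list("aef"), list("bef"), list("cef"), list("ef")]

kept = build_with_tolerance(old_db, t=0.5)       # e and f are kept although small under s
old_frequent = frequent_after_update(kept, len(old_db), [], s=0.75)
new_frequent = frequent_after_update(kept, len(old_db), new_db, s=0.75)
```

In this toy run, `e` and `f` fall below s = 0.75 in the old data but meet t = 0.5, so their counts are retained; after the update they become frequent while `a`, `b` and `c` become small, and the old sequences are never read again.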
| |
This article is indexed in SpringerLink and other databases. |