
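Improvement of GITC Algorithm on Web Log Mining
Cite this article: GUO Wei. Improvement of GITC Algorithm on Web Log Mining[J]. Computer Engineering, 2008, 34(4): 60-62
Author: GUO Wei
Affiliation: Department of Computer Science and Technology, Anhui University of Science and Technology, Huainan 232001
Abstract: GITC and Tree-DM are both mining algorithms based on the intersection relation. This paper analyzes the performance characteristics of the two algorithms and proposes GI, an improved version of GITC. GI keeps support-count information in a suitable data structure, which avoids the large amount of time otherwise spent scanning the original database to count support. It also resolves problems in the Tree-DM algorithm such as repeated intersection and redundant intersection. Experiments confirm that GI mines users' frequent access patterns more efficiently than GITC.

Keywords: Web log mining; frequent access pattern; intersection relation
Article ID: 1000-3428(2008)04-0060-03
Received: 2007-03-23
Revised: 2007-03-23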

Improvement of GITC Algorithm on Web Log Mining
GUO Wei. Improvement of GITC Algorithm on Web Log Mining[J]. Computer Engineering, 2008, 34(4): 60-62
Authors:GUO Wei
Affiliation:(Dept. of Computer Science and Technology, Anhui University of Science and Technology, Huainan 232001)
Abstract: The GITC algorithm and the Tree-DM algorithm are both based on the intersection relation. The paper analyzes the performance of both algorithms in depth and puts forward an improved algorithm named GI. GI stores support-count information in an appropriate data structure, which saves the considerable time otherwise needed to obtain the support count of each candidate by scanning the original database. It also solves the Tree-DM algorithm's problems of computing intersections repeatedly and redundantly. Experimental results show that the GI algorithm discovers users' frequent access patterns more efficiently than GITC.
Keywords:Web log mining   frequent access pattern   intersection relation
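
The abstract's central idea, keeping support information in a data structure so that candidates need not be counted by rescanning the database, can be illustrated with a small sketch. The code below is not the GI, GITC, or Tree-DM algorithm, whose details the abstract does not give. It substitutes a generic Eclat-style technique: each page is mapped to the set of session ids that contain it, and the support of a larger page set is obtained by intersecting those id sets. The function names, the session format (a list of page lists), and the choice to ignore visit order are all assumptions made for illustration.

```python
from itertools import combinations


def build_tid_index(sessions):
    """Scan the sessions once and map each page to the ids of sessions containing it."""
    index = {}
    for sid, pages in enumerate(sessions):
        for page in set(pages):
            index.setdefault(page, set()).add(sid)
    return index


def frequent_page_sets(sessions, min_support):
    """Return {page set: support count} for every page set meeting min_support.

    Hypothetical helper: support counts come from intersecting stored
    session-id sets, so the sessions are scanned only once.
    """
    index = build_tid_index(sessions)
    # Level 1: single pages that are frequent on their own.
    current = {
        frozenset([page]): tids
        for page, tids in index.items()
        if len(tids) >= min_support
    }
    result = {}
    while current:
        result.update({pages: len(tids) for pages, tids in current.items()})
        next_level = {}
        for (a, tids_a), (b, tids_b) in combinations(current.items(), 2):
            candidate = a | b
            # Keep only candidates exactly one page larger than their parents.
            if len(candidate) != len(a) + 1 or candidate in next_level:
                continue
            # Sessions containing the candidate are those containing both parents.
            tids = tids_a & tids_b
            if len(tids) >= min_support:
                next_level[candidate] = tids
        current = next_level
    return result
```

Calling `frequent_page_sets(sessions, min_support)` returns a dictionary keyed by `frozenset` page sets. Because every support count is the size of a stored id-set intersection, the original sessions are read only while building the index.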
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.