Research on Web Log Analysis System
Citation: YANG Xiao-lan, QIAN Cheng, ZHAO Hai-ting. Research on Web Log Analysis System[J]. Microcomputer Development (微机发展), 2011, 0(9): 211-215
Authors: YANG Xiao-lan  QIAN Cheng  ZHAO Hai-ting
Affiliation: College of Information Engineering, Zhongnan Branch, Wuhan University of Science and Technology, Wuhan 430223, Hubei, China
Funding: Natural Science Foundation of Hubei Province (2010CDB11102)
Abstract: A Web log analysis system can not only improve Web site structure and Web server performance, but also identify users' preferences and satisfaction, discover potential users, and strengthen the competitiveness of a site's services. This paper describes the stages of Web log mining and designs and implements a Web log analysis system. It analyzes the shortcomings of traditional frequent itemset mining and sequential pattern mining algorithms. Based on the characteristics of log data, user attributes are introduced into the generation of frequent itemsets, which effectively reduces the number of candidate itemsets, and the database is compressed round by round according to the characteristics of the candidate sets. Continuous sequences are introduced into the candidate-set merging step of the AprioriAll algorithm, yielding an improved algorithm. Experiments compare the efficiency of the improved and traditional algorithms and demonstrate the effectiveness of the improved algorithm.

Keywords: log analysis  data preprocessing  frequent itemsets  sequential patterns
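
The abstract mentions compressing the database round by round during frequent itemset mining. The sketch below illustrates that general idea with textbook Apriori plus transaction reduction: after each round, sessions that contain no candidate itemset are dropped, since they cannot support any larger candidate. It is not the authors' improved algorithm. The abstract does not specify how user attributes enter candidate generation or how AprioriAll is modified, so neither is shown. The names `apriori_with_reduction`, `sessions`, and `min_support` are hypothetical.

```python
from itertools import combinations


def apriori_with_reduction(sessions, min_support):
    """Return {frozenset(pages): support_count} for every frequent page set.

    Illustrative only: classic Apriori with per-round transaction
    reduction, not the method described in the paper.
    """
    # Each session is treated as the set of pages one visitor requested.
    db = [frozenset(s) for s in sessions]

    # Round 1: count individual pages.
    counts = {}
    for session in db:
        for page in session:
            key = frozenset([page])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        # Join step: merge frequent (k-1)-sets into k-sets, then prune any
        # candidate that has an infrequent (k-1)-subset.
        candidates = set()
        for a, b in combinations(list(frequent), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in frequent
                for sub in combinations(union, k - 1)
            ):
                candidates.add(union)

        # Count step, with database reduction: keep only the sessions
        # that contain at least one candidate of this round.
        counts = {c: 0 for c in candidates}
        reduced = []
        for session in db:
            hit = False
            for c in candidates:
                if c <= session:
                    counts[c] += 1
                    hit = True
            if hit:
                reduced.append(session)
        db = reduced

        frequent = {s: c for s, c in counts.items() if c >= min_support}
        result.update(frequent)
        k += 1

    return result
```

Dropping a session is safe here because every candidate of size k+1 has all of its k-item subsets among the current round's candidates, so a session containing none of them cannot contain a larger one.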
This paper is indexed in VIP (维普) and other databases.