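A Heuristic Topic-Oriented Crawling Algorithm
Cite this article:Liu Xinyu, Tang Xuewen, Deng Yigui. A Heuristic Topic-Oriented Crawling Algorithm[J]. World Sci-tech R & D (世界科技研究与发展), 2012, 34(5): 723-725
Authors:Liu Xinyu  Tang Xuewen  Deng Yigui
Affiliations:1. College of Computer Science, Chongqing University, Chongqing 400030
2. Information and Network Management Center, Chongqing University, Chongqing 400030
Abstract:To overcome the shortcomings of traditional topic crawlers in crawling speed and topic-prediction accuracy, and to improve the crawler's precision and recall, this paper builds on the characteristics of commonly used topic-crawling strategies. It introduces a page radiation space to combine the link-analysis-based and content-analysis-based approaches, embeds a heuristic algorithm, and proposes a heuristic-based topic-crawling algorithm. Experimental results show that the algorithm achieves better crawling efficiency than commonly used crawling algorithms.

Keywords:topic crawling  heuristic algorithm  page radiation space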

Algorithm of Topic-oriented Crawling Based on Heuristic Search
LIU Xinyu, TANG Xuewen, DENG Yigui. Algorithm of Topic-oriented Crawling Based on Heuristic Search[J]. World Sci-tech R & D, 2012, 34(5): 723-725
Authors:LIU Xinyu    TANG Xuewen    DENG Yigui
Affiliation:1. School of Computer Science, Chongqing University, Chongqing 400030; 2. Center of Information and Network, Chongqing University, Chongqing 400030
Abstract:To address the shortcomings of traditional topic crawlers in efficiency and topic-prediction precision, and to improve the crawler's precision and recall, a new topic-oriented crawling algorithm is proposed, based on the characteristics of current topic-oriented crawling methods. The link-analysis-based and content-analysis-based methods are combined through a page radiation space, with a heuristic algorithm embedded. Experimental results show that the algorithm is more efficient than commonly used crawling algorithms.
Keywords:topic-crawler  heuristic algorithm  page radiation space
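
The abstract describes combining link-based and content-based relevance cues under a heuristic search. A minimal sketch of that general idea is a best-first crawler whose frontier is ordered by a weighted score. This is not the authors' algorithm: the abstract does not define the page radiation space, so the sketch substitutes a simple stand-in (the parent page's relevance, inherited along each link, mixed with anchor-text relevance). The names `content_score`, `crawl`, `alpha`, `threshold`, and `TOY_WEB` are all hypothetical, and `fetch` is an assumed callback returning a page's text and its outgoing `(anchor_text, url)` pairs.

```python
import heapq
import re
from typing import Callable, Dict, List, Set, Tuple

Page = Tuple[str, List[Tuple[str, str]]]  # (page text, [(anchor text, target url)])


def content_score(text: str, topic_terms: Set[str]) -> float:
    """Fraction of topic terms found in the text (a crude relevance proxy)."""
    if not topic_terms:
        return 0.0
    words = set(re.findall(r"\w+", text.lower()))
    return len(words & topic_terms) / len(topic_terms)


def crawl(
    seeds: List[str],
    fetch: Callable[[str], Page],
    topic_terms: Set[str],
    alpha: float = 0.5,      # assumed weight between the two cues
    threshold: float = 0.3,  # assumed cut-off for calling a page on-topic
    max_pages: int = 100,
) -> List[str]:
    """Best-first crawl: always expand the highest-priority frontier URL."""
    # heapq is a min-heap, so priorities are stored negated.
    frontier = [(-1.0, url) for url in seeds]
    heapq.heapify(frontier)
    seen = set(seeds)
    relevant: List[str] = []
    visited = 0
    while frontier and visited < max_pages:
        _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        visited += 1
        page_rel = content_score(text, topic_terms)
        if page_rel >= threshold:
            relevant.append(url)
        for anchor, target in links:
            if target in seen:
                continue
            seen.add(target)
            # Link cue: relevance inherited from the linking page.
            # Content cue: relevance of the anchor text itself.
            priority = alpha * page_rel + (1 - alpha) * content_score(anchor, topic_terms)
            heapq.heappush(frontier, (-priority, target))
    return relevant


# Toy in-memory "web" so the sketch runs without network access.
TOY_WEB: Dict[str, Page] = {
    "a": ("crawler topic search", [("topic crawler", "b"), ("cooking recipes", "c")]),
    "b": ("topic crawler heuristic", [("heuristic search", "d")]),
    "c": ("pasta recipes", [("more pasta", "e")]),
    "d": ("heuristic topic crawler", []),
    "e": ("dessert", []),
}

if __name__ == "__main__":
    topic = {"topic", "crawler", "heuristic"}
    print(crawl(["a"], lambda u: TOY_WEB[u], topic, max_pages=3))  # ['a', 'b', 'd']
```

With a budget of three pages, the toy crawl visits the on-topic chain `a`, `b`, `d` and never spends a fetch on the off-topic branch `c`, `e`, which is the kind of efficiency gain a heuristic ordering is meant to provide over breadth-first crawling.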
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).