
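Improved web linkages topic distillation algorithm (改进的Web链接主题提取算法)
Cite this article: WANG Wei-ling, LIU Pei-yu, LIU Ke-fei. Improved web linkages topic distillation algorithm [J]. Computer Engineering and Design (计算机工程与设计), 2007, 28(2): 294-296.
Authors: WANG Wei-ling (王卫玲), LIU Pei-yu (刘培玉), LIU Ke-fei (刘克非)
Affiliation: School of Information Science and Engineering, Shandong Normal University, Jinan, Shandong 250014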
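Abstract: HITS is a widely influential link-analysis algorithm, but closer study shows that it is prone to topic drift. Much of this drift arises because pages are projected onto the wrong latent semantic basis. This paper proposes a weight-adjustment-based hyperlink topic distillation algorithm (weighted adjustments based hyperlinks topic distillation). While obtaining the root set, the algorithm computes similarity with improved weights to obtain a more accurate, personalized root set; it then uses HITS to compute the authority and hub values of the Web pages. Experimental results show that the algorithm effectively mitigates the topic drift caused by HITS and is better suited to Web queries.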
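
The two stages named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the weight-adjustment formula, so the per-term `weights` dictionary, the cosine similarity measure, the `threshold` parameter, and all function names here are assumptions chosen for illustration. Base-set expansion is omitted. Only the second stage, the mutual reinforcement of authority and hub scores, follows the standard HITS iteration.

```python
import math


def weighted_cosine(query, page, weights):
    """Cosine similarity of two term-frequency dicts after per-term scaling.

    `weights` is a hypothetical stand-in for the paper's adjusted weights;
    terms missing from it keep a weight of 1.0.
    """
    terms = set(query) | set(page)
    q = {t: query.get(t, 0.0) * weights.get(t, 1.0) for t in terms}
    p = {t: page.get(t, 0.0) * weights.get(t, 1.0) for t in terms}
    dot = sum(q[t] * p[t] for t in terms)
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in p.values())
    )
    return dot / norm if norm else 0.0


def select_root_set(query, pages, weights, threshold):
    """Keep pages whose weighted similarity to the query reaches the threshold."""
    return {
        page_id
        for page_id, vector in pages.items()
        if weighted_cosine(query, vector, weights) >= threshold
    }


def hits(links, iterations=50):
    """Standard HITS iteration.

    `links` maps each page to the set of pages it links to.
    Returns (authority, hub) dicts, each L2-normalized.
    """
    nodes = set(links) | {v for targets in links.values() for v in targets}
    auth = {n: 1.0 for n in nodes}
    hub = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority of a page: sum of hub scores of the pages linking to it.
        new_auth = {n: 0.0 for n in nodes}
        for source, targets in links.items():
            for target in targets:
                new_auth[target] += hub[source]
        # Hub of a page: sum of authority scores of the pages it links to.
        new_hub = {n: sum(new_auth[v] for v in links.get(n, ())) for n in nodes}
        # Normalize so the scores stay bounded across iterations.
        a_norm = math.sqrt(sum(x * x for x in new_auth.values())) or 1.0
        h_norm = math.sqrt(sum(x * x for x in new_hub.values())) or 1.0
        auth = {n: x / a_norm for n, x in new_auth.items()}
        hub = {n: x / h_norm for n, x in new_hub.items()}
    return auth, hub


# Toy data: the term vectors, weights, and link graph are made up.
query = {"web": 1, "link": 1}
pages = {"p1": {"web": 2, "link": 1}, "p2": {"cooking": 3}}
root = select_root_set(query, pages, weights={"link": 2.0}, threshold=0.5)

links = {"a": {"c"}, "b": {"c"}, "c": set()}
authority, hub = hits(links)
```

In this toy run the off-topic page `p2` is filtered out before link analysis, which is the intuition behind building a more accurate root set: pages that never enter the graph cannot pull the authority and hub scores toward an unrelated topic.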

Keywords: link analysis; topic distillation; vector space model; weight adjustment; resource discovery; topic drift; authority value; hub value; similarity computation; personalization; hyperlinks; query
Article ID: 1000-7024(2007)02-0294-03
Revised: 2006-05-01

Improved web linkages topic distillation algorithm
WANG Wei-ling, LIU Pei-yu, LIU Ke-fei. Improved web linkages topic distillation algorithm [J]. Computer Engineering and Design, 2007, 28(2): 294-296.
Authors:WANG Wei-ling  LIU Pei-yu  LIU Ke-fei
Affiliation:College of Computer Science and Engineering, Shandong Normal University, Jinan 250014, China
Abstract: HITS (hypertext-induced topic search) is one of the most important link-analysis algorithms, but it is prone to topic drift. A major cause of this drift is that web pages are projected onto the wrong latent semantic basis. A new algorithm, WAHTD (weighted adjustments based hyperlinks topic distillation), is presented. It constructs a personalized root set and base set using weighted adjustments, then computes the authority and hub values of web pages with HITS to distill the topic. The experimental results show that WAHTD performs better than HITS in topic distillation quality and mitigates the topic drift problem, so it is more appropriate for Web queries.
Keywords:link analysis  topic distillation  VSM  weighted adjustments  resource discovery
This article is indexed in CNKI, VIP (维普), Wanfang Data (万方数据), and other databases.