
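An integrated Web document clustering algorithm fusing SOM and improved PSO
Cite this article: SONG Jian-jie, WANG Wei. An integrated Web document clustering algorithm fusing SOM and improved PSO[J]. Computer Engineering and Applications (计算机工程与应用), 2010, 46(34): 111-114. DOI: 10.3778/j.issn.1002-8331.2010.34.034
Authors: SONG Jian-jie (宋剑杰)  WANG Wei (王伟)
Affiliations: 1. Department of Electronics Technology and Information, Science and Technology College of Hunan, Changsha 410004; 2. School of Information Science and Engineering, Central South University, Changsha 410083
Abstract: With the explosive growth of information, existing search engines fail to meet users' needs in many respects. Web document clustering can shrink the search space, speed up retrieval, and improve query precision. This paper proposes an integrated Web document clustering algorithm that fuses coarse clustering by SOM (Self-Organizing Maps) with fine clustering by an improved PSO (Particle Swarm Optimization). First, following the vector space model, each Web document is represented by feature terms and their weights. Next, the SOM algorithm performs coarse clustering on the document feature set and yields a set of output weights. These weights are then used to initialize the improved PSO algorithm, which refines the coarse clustering result and produces the final Web document clusters. Simulation results show that the algorithm effectively improves the precision and recall of document queries and has some practical value.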

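Keywords: Web document clustering  self-organizing feature map  coarse clustering  improved PSO algorithm  fine clustering  integrated clustering algorithm
Received: 2010-04-20
Revised: 2010-06-30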
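
The first step named in the abstract, representing each document by feature terms and their weights under the vector space model, might look roughly like the sketch below. The abstract does not say which weighting scheme the authors use; TF-IDF with whitespace tokenization and L2 normalization is assumed here purely for illustration, and the function name `tfidf_vectors` is hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return (vocabulary, vectors): one L2-normalized TF-IDF vector per document.

    Assumption: TF-IDF weighting and whitespace tokenization stand in for the
    paper's unspecified feature extraction.
    """
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        length = max(len(tokens), 1)  # guard against empty documents
        vec = [(tf[t] / length) * math.log(n_docs / df[t]) for t in vocab]
        norm = math.sqrt(sum(w * w for w in vec)) or 1.0
        vectors.append([w / norm for w in vec])
    return vocab, vectors
```

With this weighting, a term that occurs in every document receives weight zero, so only terms that help tell documents apart contribute to the vectors passed on to clustering.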

Integrated clustering algorithm based on hybrid of SOM and improved PSO for Web document
SONG Jian-jie,WANG Wei. Integrated clustering algorithm based on hybrid of SOM and improved PSO for Web document[J]. Computer Engineering and Applications, 2010, 46(34): 111-114. DOI: 10.3778/j.issn.1002-8331.2010.34.034
Authors:SONG Jian-jie  WANG Wei
Affiliation: 1. Department of Electronics Technology and Information, Science and Technology College of Hunan, Changsha 410004, China 2. School of Information Science and Engineering, Central South University, Changsha 410083, China
Abstract: With the explosive growth of information on the Web, current search engines fail to meet users' requirements in many respects. Grouping similar Web documents into clusters reduces the search space, speeds up the search, and improves its precision. This paper proposes an integrated clustering algorithm for Web documents that combines SOM (Self-Organizing Maps) for coarse clustering with an improved PSO (Particle Swarm Optimization) for fine clustering. First, each Web document is represented by feature terms and their weights under the vector space model. Second, the SOM algorithm performs coarse clustering of the document feature set and produces a set of output weights. The improved PSO algorithm is then initialized with these output weights and refines the coarse result through its evolution, which yields the final Web document clustering. Simulation results show that the algorithm can greatly improve the precision and recall of document searching and has some practical value.
Keywords: Web document clustering  self-organizing maps  coarse clustering  improved Particle Swarm Optimization (PSO) algorithm  fine clustering  integrated clustering algorithm
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.
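
The two-stage procedure described in the abstract (SOM coarse clustering, then PSO refinement seeded with the SOM output weights) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not describe what the "improved" PSO changes, so a standard PSO with a linearly decreasing inertia weight is used as a stand-in. The 1-D SOM layout, the quantization-error fitness, and all function names and parameter values are likewise assumptions.

```python
import math
import random


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(x, centroids):
    """Index of the centroid closest to vector x."""
    return min(range(len(centroids)), key=lambda j: sq_dist(x, centroids[j]))


def quantization_error(data, centroids):
    """Sum of squared distances from each vector to its nearest centroid."""
    return sum(sq_dist(x, centroids[nearest(x, centroids)]) for x in data)


def som_coarse(data, k, epochs=20, seed=0):
    """Coarse stage: train a 1-D SOM with k nodes and return its weights."""
    rng = random.Random(seed)
    weights = [list(x) for x in rng.sample(data, k)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs
        lr = 0.5 * frac                   # decaying learning rate
        sigma = max(k / 2.0 * frac, 0.5)  # shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = nearest(x, weights)          # best-matching unit
            for j, w in enumerate(weights):
                h = math.exp(-((j - bmu) ** 2) / (2 * sigma ** 2))
                for d in range(len(w)):
                    w[d] += lr * h * (x[d] - w[d])
    return weights


def pso_refine(data, init, n_particles=10, iters=50, seed=0):
    """Fine stage: PSO over centroid sets, seeded with the SOM weights.

    Assumption: standard PSO with decreasing inertia, not the paper's variant.
    """
    rng = random.Random(seed)
    k, dim = len(init), len(init[0])
    flat = [v for c in init for v in c]

    def unflat(p):
        return [p[i * dim:(i + 1) * dim] for i in range(k)]

    def fitness(p):
        return quantization_error(data, unflat(p))

    # Particle 0 is exactly the SOM result; the others are noisy copies of it.
    pos = [flat[:]] + [[v + rng.gauss(0, 0.1) for v in flat]
                       for _ in range(n_particles - 1)]
    vel = [[0.0] * len(flat) for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]

    for t in range(iters):
        w = 0.9 - 0.5 * t / iters  # inertia decreases from 0.9 towards 0.4
        for i in range(n_particles):
            for d in range(len(flat)):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + 2.0 * r1 * (pbest[i][d] - pos[i][d])
                             + 2.0 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f < pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f < gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return unflat(gbest)


def cluster_documents(vectors, k, seed=0):
    """Run both stages; return (labels, coarse centroids, refined centroids)."""
    coarse = som_coarse(vectors, k, seed=seed)
    fine = pso_refine(vectors, coarse, seed=seed)
    return [nearest(x, fine) for x in vectors], coarse, fine
```

Because one particle starts exactly at the SOM weights and the global best is replaced only by strictly better positions, the refined centroids in this sketch never have a higher quantization error than the coarse ones. That matches the role the abstract assigns to the fine-clustering stage, although the actual gain depends on the data and on the authors' unspecified improvements.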