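Data stream clustering algorithm based on semi-supervised affinity propagation
Cite this article: WANG Wenshuai, CHEN Gang. Data stream clustering algorithm based on semi-supervised affinity propagation[J]. Computer Engineering and Applications, 2013, 49(8): 6-8.
Authors: WANG Wenshuai, CHEN Gang
Affiliations: 1. Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049 2. University of Chinese Academy of Sciences, Beijing 100049
Abstract: To improve the clustering quality of evolving data streams, a data stream clustering algorithm based on semi-supervised affinity propagation (SAPStream) is proposed. Borrowing the idea of semi-supervised clustering, the algorithm constructs a similarity matrix for the initial data stream, performs affinity propagation clustering on it, and builds an online clustering model. As the data stream evolves, a decay window technique is applied to adjust the clustering model in a timely manner, and the resulting exemplars are clustered again together with newly arrived data points to obtain the clustering result for the stream. Experiments on dynamic clustering of data streams show that the algorithm is effective and produces high-quality clusters.

Keywords: data stream; semi-supervised; affinity propagation clustering; decay window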

Data stream clustering algorithm based on semi-supervised affinity propagation
WANG Wenshuai, CHEN Gang. Data stream clustering algorithm based on semi-supervised affinity propagation[J]. Computer Engineering and Applications, 2013, 49(8): 6-8.
Authors:WANG Wenshuai  CHEN Gang
Affiliation:1.Computing Center, Institute of High Energy Physics, Chinese Academy of Sciences, Beijing 100049, China 2.University of Chinese Academy of Sciences, Beijing 100049, China
Abstract: To improve the clustering quality of evolving data streams, this paper proposes SAPStream, a data stream clustering algorithm based on semi-supervised affinity propagation. Drawing on the idea of semi-supervised clustering, the algorithm computes a similarity matrix for the initial data, runs affinity propagation (AP) clustering on it, and builds an online clustering model. As the data stream evolves, the model is adjusted using a decay window technique, and the clustering result for the stream is obtained by clustering the exemplars again together with newly arrived data points. SAPStream can analyze and process large-scale evolving data streams. Its performance is tested on both real and synthetic datasets, and the experimental results show that the algorithm achieves high clustering quality.
Keywords:data stream  semi-supervised  affinity propagation clustering  decay windows  
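
The abstract names two building blocks without giving formulas: a similarity matrix informed by semi-supervised information, and a decay window that ages the online model. The following is a minimal Python sketch of how such pieces are commonly written, not the authors' implementation. Everything specific in it is an assumption: the function names, the negative squared Euclidean similarity, the use of must-link and cannot-link pairs as the supervision, the constant used for cannot-link pairs, the exponential decay form 2^(-rate * elapsed), and the pruning threshold.

```python
def semi_supervised_similarity(points, must_link=(), cannot_link=(),
                               cannot_link_value=-1e9):
    """Build a similarity matrix and adjust it with pairwise constraints.

    Assumed conventions (not taken from the paper):
    - base similarity is the negative squared Euclidean distance;
    - must-link pairs get the maximum similarity, 0.0;
    - cannot-link pairs get a very low similarity.
    The diagonal (the AP "preference") is left at 0.0 as a placeholder.
    """
    n = len(points)
    sim = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                sim[i][j] = -sum((a - b) ** 2
                                 for a, b in zip(points[i], points[j]))
    for i, j in must_link:
        sim[i][j] = sim[j][i] = 0.0
    for i, j in cannot_link:
        sim[i][j] = sim[j][i] = cannot_link_value
    return sim


def apply_decay(exemplars, now, decay_rate=0.1, min_weight=0.05):
    """Age exemplar weights and drop the ones that have faded.

    `exemplars` maps an exemplar id to (weight, last_update_time).
    Each weight is multiplied by 2 ** (-decay_rate * elapsed), an assumed
    exponential decay; exemplars below `min_weight` are removed.
    """
    kept = {}
    for key, (weight, last_update) in exemplars.items():
        decayed = weight * 2.0 ** (-decay_rate * (now - last_update))
        if decayed >= min_weight:
            kept[key] = (decayed, now)
    return kept


if __name__ == "__main__":
    pts = [(0.0, 0.0), (3.0, 4.0), (1.0, 0.0)]
    s = semi_supervised_similarity(pts, must_link=[(0, 1)], cannot_link=[(0, 2)])
    for row in s:
        print(row)
    print(apply_decay({"a": (1.0, 0), "b": (0.04, 0)}, now=10))
```

In a pipeline of the kind the abstract describes, the adjusted matrix would be passed to an affinity propagation routine, and the surviving exemplars would be clustered again together with newly arrived points; both of those steps are outside this sketch.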