
A Spectral Clustering Algorithm Based on SimRank Score
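Cite this article: LI Peng-qing, LI Yang-ding, DENG Xue-lian, LI Yong-gang, FANG Yue. Spectral Clustering Algorithm Based on SimRank Score[J]. Computer Science, 2018, 45(Z11): 458-461, 467.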
Authors: LI Peng-qing  LI Yang-ding  DENG Xue-lian  LI Yong-gang  FANG Yue
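Affiliations: Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004; School of Public Health and Management, Guangxi University of Chinese Medicine, Nanning 530200; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004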
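Funding: This work was supported by the National Key R&D Program of China (2016YFB1000905), the National Natural Science Foundation of China (61363009,7,61573270,0), the Guangxi Natural Science Foundation / Youth Fund (2015GXNSFCB139011,7GXNSFBA198221), the Open Fund of the Guangxi Key Lab of Multi-source Information Mining & Security (16-A-01-01, 16-A-01-02), the Innovation Project of Guangxi Graduate Education (XYCSZ2017064, XYCSZ2017067, YCSW2017065), and the Guangxi Graduate Innovation Program (YCSW2018094).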
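Abstract: Traditional spectral clustering algorithms consider only the distances between data points when building the similarity matrix, ignoring the implicit intrinsic relations among the points. To address this problem, a spectral clustering algorithm based on SimRank is proposed. The algorithm first builds an adjacency matrix from undirected graph data and computes a SimRank-based similarity matrix; it then constructs a Laplacian matrix from the similarity matrix, normalizes it, and performs spectral decomposition; finally, it applies k-means clustering to the resulting eigenvectors. Experimental results on UCI benchmark datasets such as Zoo show that the proposed algorithm outperforms existing spectral clustering algorithms based on distance similarity, such as LRR (Low-Rank Representation), on three evaluation metrics: clustering accuracy, normalized mutual information, and purity.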

Keywords: Spectral clustering  Similarity matrix  SimRank score  Adjacency matrix  Laplacian matrix  k-means clustering
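
For orientation, the two matrix constructions named in the abstract can be written out as below. This is a sketch under assumptions: the recurrence is the standard SimRank definition, with a decay constant C in (0, 1) and N(a) denoting the neighbors of node a in the undirected graph, and the symmetric normalization of the Laplacian is one common choice. The abstract does not state which exact formulas the paper uses.

```latex
% Standard SimRank recurrence (assumed form; C is a decay constant in (0, 1))
s(a,a) = 1, \qquad
s(a,b) = \frac{C}{|N(a)|\,|N(b)|} \sum_{i \in N(a)} \sum_{j \in N(b)} s(i,j)
\quad (a \neq b)

% Symmetric normalized Laplacian of the similarity matrix S = [s(a,b)] (assumed normalization)
D_{ii} = \sum_{j} S_{ij}, \qquad
L_{\mathrm{sym}} = I - D^{-1/2} S D^{-1/2}
```

Under this reading, the spectral decomposition step takes the eigenvectors of L_sym with the k smallest eigenvalues, and k-means is run on the rows of the resulting matrix.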

Spectral Clustering Algorithm Based on SimRank Score
LI Peng-qing, LI Yang-ding, DENG Xue-lian, LI Yong-gang and FANG Yue. Spectral Clustering Algorithm Based on SimRank Score[J]. Computer Science, 2018, 45(Z11): 458-461, 467.
Authors: LI Peng-qing  LI Yang-ding  DENG Xue-lian  LI Yong-gang and FANG Yue
Affiliation: Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004, China; School of Public Health and Management, Guangxi University of Chinese Medicine, Nanning 530200, China; Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004, China; and Guangxi Key Lab of Multi-source Information Mining & Security, Guangxi Normal University, Guilin, Guangxi 541004, China
Abstract: Traditional spectral clustering algorithms consider only the distance between data points, ignoring their intrinsic relations. To deal with this problem, a spectral clustering method based on SimRank score is proposed. First, the method computes the adjacency matrix of the undirected graph data and obtains a similarity matrix based on SimRank. Second, a Laplacian matrix is constructed from the similarity matrix, normalized, and then spectrally decomposed. Finally, k-means clustering is performed on the resulting eigenvectors to obtain the final clustering. Experimental results on benchmark datasets from the UCI data repository show that the proposed algorithm is superior to existing spectral clustering algorithms based on distance similarity in terms of clustering accuracy, normalized mutual information and purity.
Keywords: Spectral clustering  Similarity matrix  SimRank score  Adjacency matrix  Laplacian matrix  k-means clustering
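
The three steps listed in the abstract (SimRank similarity from an adjacency matrix, normalized Laplacian with spectral decomposition, k-means on the eigenvectors) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names, the decay constant `decay=0.8`, the fixed iteration count, the symmetric normalization of the Laplacian, and the farthest-first k-means initialization are all assumptions, since the abstract does not specify them.

```python
import numpy as np


def simrank_matrix(adj, decay=0.8, n_iter=10):
    """Iterative SimRank scores for an undirected graph (0/1 adjacency matrix).

    decay and n_iter are illustrative defaults, not values from the paper.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]
    deg = adj.sum(axis=0)
    # Column-normalise so that column j averages over the neighbours of node j;
    # isolated nodes keep an all-zero column.
    w = np.divide(adj, deg, out=np.zeros_like(adj), where=deg > 0)
    s = np.eye(n)
    for _ in range(n_iter):
        s = decay * (w.T @ s @ w)
        np.fill_diagonal(s, 1.0)  # s(a, a) = 1 by definition
    return s


def spectral_embedding(sim, k):
    """Row-normalised eigenvectors of L_sym = I - D^{-1/2} S D^{-1/2}
    belonging to the k smallest eigenvalues (assumed normalisation)."""
    sim = (sim + sim.T) / 2.0              # remove floating-point asymmetry
    deg = sim.sum(axis=1)                  # positive, since the diagonal is 1
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    lap = np.eye(len(sim)) - d_inv_sqrt[:, None] * sim * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(lap)          # eigenvalues in ascending order
    emb = vecs[:, :k]
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    return emb / np.where(norms > 0, norms, 1.0)


def kmeans(points, k, n_iter=100):
    """Lloyd's k-means with deterministic farthest-first initialisation."""
    centers = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def simrank_spectral_clustering(adj, k, decay=0.8, n_iter=10):
    """SimRank similarity -> normalised Laplacian eigenvectors -> k-means."""
    sim = simrank_matrix(adj, decay=decay, n_iter=n_iter)
    return kmeans(spectral_embedding(sim, k), k)


# Toy usage: two disjoint triangles (nodes 0-2 and 3-5).
adj = np.zeros((6, 6))
for a, b in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5)]:
    adj[a, b] = adj[b, a] = 1.0
print(simrank_spectral_clustering(adj, k=2))
```

On this toy graph the SimRank score between nodes in different triangles stays exactly zero, so the two components separate cleanly. For real data such as the UCI sets mentioned in the abstract, a graph would first have to be built from the feature vectors, a step the abstract does not describe.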