A Sampling-Based Graph Clustering Algorithm for Large-Scale Networks
Cite this article: ZHANG Jian-peng, CHEN Hong-chang, WANG Kai, ZHU Kai-jie, WANG Ya-wen. A Sampling-Based Graph Clustering Algorithm for Large-Scale Networks[J]. Acta Electronica Sinica, 2019, 47(8): 1731-1737.
Authors: ZHANG Jian-peng  CHEN Hong-chang  WANG Kai  ZHU Kai-jie  WANG Ya-wen
Affiliation: 1. National Digital Switching System Engineering & Technological R&D Center, Zhengzhou, Henan 450000, China; 2. Department of Computer Science, Eindhoven University of Technology, Eindhoven, North Brabant 5600 MB, Netherlands
Funding: National Natural Science Foundation of China; National Key Research and Development Program of China
Abstract: Existing clustering methods, such as the classic GN algorithm, have computational complexity that is too high for clustering large-scale graphs. To address this, the paper first studies sampling algorithms for large-scale graphs and proposes Clustering-structure Representative Sampling (CRS), a graph sampling algorithm that effectively preserves the clustering structure of the original graph. CRS produces high-quality clustering-representative nodes in the sampled graph and expands the sample according to corresponding expansion criteria. Second, the paper proposes a fast Population Clustering Inference (PCI) algorithm, which infers the clustering structure of the whole graph from the clustering labels of the sampled subgraph. Experimental results show that, compared with state-of-the-art methods, the proposed algorithm achieves better efficiency and clustering accuracy on large-scale graphs.
Keywords: large-scale graphs; graph sampling; graph clustering; population inference; clustering-representative nodes; expansion criteria
Received: 2017-10-09
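
The abstract names the two stages (sample a subgraph that keeps the clustering structure, then infer labels for the whole graph) but does not state the CRS representative-selection rule, the expansion criteria, or the PCI inference rule. The sketch below is therefore only an illustration of the general sample-then-infer pattern, not the paper's algorithm. Every rule in it is an assumption made for the example: seeds are the highest-degree nodes, the sample grows by adding the frontier node with the most links into the sample, sampled nodes take the label of the nearest seed, and the remaining nodes are labeled by a majority vote over already-labeled neighbors. All function names are hypothetical.

```python
from collections import Counter, deque


def build_graph(edges):
    """Undirected graph as {node: set_of_neighbors}."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def sample_nodes(adj, num_seeds, sample_size):
    """Assumed sampling rule (not CRS): high-degree seeds, greedy expansion."""
    # Seeds: highest-degree nodes, ties broken by node id for determinism.
    seeds = sorted(adj, key=lambda n: (-len(adj[n]), n))[:num_seeds]
    sample = set(seeds)
    while len(sample) < min(sample_size, len(adj)):
        # Frontier: unsampled nodes adjacent to the current sample.
        frontier = {v for u in sample for v in adj[u]} - sample
        if not frontier:
            break
        # Assumed expansion rule: most links into the sample wins.
        best = max(sorted(frontier), key=lambda v: len(adj[v] & sample))
        sample.add(best)
    return seeds, sample


def cluster_sample(adj, seeds, sample):
    """Stand-in clustering: multi-source BFS from the seeds inside the sample."""
    labels = {s: s for s in seeds}
    queue = deque(seeds)
    while queue:
        u = queue.popleft()
        for v in sorted(adj[u] & sample):
            if v not in labels:
                labels[v] = labels[u]
                queue.append(v)
    return labels


def infer_population(adj, sample_labels):
    """Assumed inference rule (not PCI): majority vote of labeled neighbors."""
    labels = dict(sample_labels)
    pending = sorted(set(adj) - set(labels))
    progressed = True
    while pending and progressed:
        progressed = False
        still_pending = []
        for node in pending:
            votes = Counter(labels[v] for v in sorted(adj[node]) if v in labels)
            if votes:
                labels[node] = votes.most_common(1)[0][0]
                progressed = True
            else:
                # No labeled neighbor yet; retry in the next round.
                still_pending.append(node)
        pending = still_pending
    return labels


# Toy graph: two 4-node cliques joined by the single bridge edge (3, 4).
edges = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3),
         (4, 5), (4, 6), (4, 7), (5, 6), (5, 7), (6, 7),
         (3, 4)]
adj = build_graph(edges)
seeds, sample = sample_nodes(adj, num_seeds=2, sample_size=4)
labels = infer_population(adj, cluster_sample(adj, seeds, sample))
print(sorted(labels.items()))
```

On this toy graph only four of the eight nodes are sampled and clustered directly; the other four receive labels from their neighbors, and each clique ends up with a single label. Nodes that are unreachable from the sample stay unlabeled in this sketch.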
This article is indexed in Wanfang Data and other databases.