
Parallel fuzzy C-means algorithm based on MapReduce
Original title: 基于MapReduce的并行模糊C均值算法
Citation: YU Qianqian, DAI Yueming. Parallel fuzzy C-means algorithm based on MapReduce[J]. Computer Engineering and Applications (计算机工程与应用), 2013, 49(14): 133-137.
Authors: YU Qianqian (虞倩倩), DAI Yueming (戴月明)
Affiliation: School of IOT Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
Abstract: Fuzzy C-means is an important soft-clustering algorithm, but its time cost becomes too high as the volume of data grows. To address this drawback, the paper proposes a parallel fuzzy C-means algorithm based on MapReduce. Fuzzy C-means is redesigned to fit MapReduce's key/value-based programming model: the membership degrees of the data points to the cluster centers are computed in parallel, and the new cluster centers are then recalculated, which improves the efficiency of fuzzy C-means on large data sets. Experimental results show that the parallel fuzzy C-means algorithm based on MapReduce achieves high speedup and good scalability.

Keywords: fuzzy C-means; parallel computing; MapReduce programming model; data mining; cloud computing
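
The abstract names two steps per iteration: computing membership degrees in parallel and recalculating the cluster centers. The following is a minimal single-process sketch of how those steps could map onto key/value pairs, assuming the standard fuzzy C-means update rules with fuzzifier m. The function names (`memberships`, `map_point`, `reduce_cluster`, `fcm_iteration`, `fcm`) and the choice of cluster index as the key are illustrative assumptions, not the authors' design, which the abstract does not spell out.

```python
import math
from collections import defaultdict

def memberships(point, centers, m=2.0):
    """Standard FCM membership of one point in every cluster (assumed formula)."""
    dists = [math.dist(point, c) for c in centers]
    # A point lying exactly on one or more centers belongs fully to them.
    zeros = [i for i, d in enumerate(dists) if d == 0.0]
    if zeros:
        return [1.0 / len(zeros) if i in zeros else 0.0
                for i in range(len(centers))]
    exp = 2.0 / (m - 1.0)
    return [1.0 / sum((d_i / d_k) ** exp for d_k in dists) for d_i in dists]

def map_point(point, centers, m=2.0):
    """Map step: emit (cluster index, (weighted point, weight)) pairs."""
    for i, u in enumerate(memberships(point, centers, m)):
        w = u ** m
        yield i, ([w * x for x in point], w)

def reduce_cluster(values):
    """Reduce step: weighted mean of all contributions for one cluster key."""
    total_w = sum(w for _, w in values)
    if total_w == 0.0:
        return None  # no point contributes; the caller keeps the old center
    dim = len(values[0][0])
    return [sum(vec[d] for vec, _ in values) / total_w for d in range(dim)]

def fcm_iteration(points, centers, m=2.0):
    """One map/shuffle/reduce round, simulated in a single process."""
    grouped = defaultdict(list)
    for p in points:  # on a cluster, each mapper would see a split of the data
        for key, value in map_point(p, centers, m):
            grouped[key].append(value)
    new_centers = []
    for i, old in enumerate(centers):
        updated = reduce_cluster(grouped[i])
        new_centers.append(updated if updated is not None else list(old))
    return new_centers

def fcm(points, centers, m=2.0, tol=1e-6, max_iter=100):
    """Repeat rounds until no center moves more than tol."""
    for _ in range(max_iter):
        new_centers = fcm_iteration(points, centers, m)
        shift = max(math.dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return centers
```

In this sketch each map call depends only on one point and the current centers, so the map work can be split across nodes, while the reduce step only needs per-cluster sums. A real Hadoop job would typically also add a combiner to pre-sum contributions on each mapper, which is omitted here.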