首页 | 官方网站   微博 | 高级检索  
     

一种基于Hadoop的大规模Internet频繁闭图挖掘算法
引用本文:张岩庆.一种基于Hadoop的大规模Internet频繁闭图挖掘算法[J].四川大学学报(工程科学版),2016,48(4):136-143.
作者姓名:张岩庆
作者单位:电子工程学院
基金项目:国家自然科学基金(61503394)面向逆反射体检测的特征显著性研究;国家自然科学基金(61405248)基于引射技术的特种车辆尾气红外抑制特性研究;安徽省青年科学基金(1408085QF131)主被动成像车牌识别关键技术;安徽省青年科学基金(1508085QF121)面向交通标志检测的特征显著性研究
摘    要:针对当前单机模式下频繁闭图挖掘算法无法处理大规模Internet数据集的问题,通过改进Apriori算法,提出了基于Hadoop的迭代式频繁闭图挖掘算法AMR(Apriori based on MapReduce)。首先将动态网络的边集存储在键值表中,并设计了序列化子图编码方案以确保频繁子图的唯一性;然后提出了一种传递子图编码的通信机制,通过整合每个分片的支持度得到全局支持度,从而确保了频繁闭图的准确性;最后通过剪枝得到动态网络的频繁闭图。将AMR算法分别运用于国家级和AS级Internet的动态网络中,结果表明,频繁闭图能够准确表征Internet骨干网络的拓扑结构,说明AMR算法能够快速且有效地挖掘大规模动态网络的频繁闭图。

关 键 词:Hadoop  动态网络  大规模Internet  频繁闭图
收稿时间:1/3/2016 12:00:00 AM
修稿时间:2016/4/14 0:00:00

A Frequent Closed Graph Mining Algorithm in Large Scale Internet based on Hadoop
Zhang Yanqing.A Frequent Closed Graph Mining Algorithm in Large Scale Internet based on Hadoop[J].Journal of Sichuan University (Engineering Science Edition),2016,48(4):136-143.
Authors:Zhang Yanqing
Abstract:The existing frequent closed graph(FCG) mining methods under standalone mode can not process massive Internet datasets, by improving the Apriori algorithm, an iterative algorithm AMR(Apriori based on MapReduce) was proposed to mine FCGs of large scale dynamic networks based on Hadoop. First of all, the edge sets of dynamic networks were stored in the key-value table, and a serialized subgraph coding mechanism was designed to ensure the uniqueness of frequent subgraphs(FSG). Secondly, a communication mechanism was proposed to pass subgraph codes, and to ensure the accuracy of FCG, local supports in each partition were aggregated to a global one. Finally, FCGs were achieved by pruning the FSGs. AMR was used in the dynamic networks of both country and AS level Internet, and experimental results showed that FCG can accurately characterize the topology of backbone Internet, thus verifying the efficiency and effectiveness of AMR in mining FCG of large scale dynamic networks.
Keywords:Hadoop  dynamic networks  large scale Internet  frequent closed graph
点击此处可从《四川大学学报(工程科学版)》浏览原始摘要信息
点击此处可从《四川大学学报(工程科学版)》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号