首页 | 官方网站   微博 | 高级检索  
     

大规模生物网络马尔可夫聚类的并行化算法
引用本文:孙佳敏,朱嘉富,杨伏长,谢江.大规模生物网络马尔可夫聚类的并行化算法[J].计算机应用,2019,39(1):66-71.
作者姓名:孙佳敏  朱嘉富  杨伏长  谢江
作者单位:上海大学计算机工程与科学学院,上海,200444;上海大学计算机工程与科学学院,上海,200444;上海大学计算机工程与科学学院,上海,200444;上海大学计算机工程与科学学院,上海,200444
基金项目:国家重点研发计划重点专项(2016YFC1401900);上海市自然科学基金资助项目(17ZR1409900)。
摘    要:马尔可夫聚类算法(MCL)是在大规模生物网络中寻找模块的一个有效方法,能够挖掘网络结构和功能影响力较大的模块。算法涉及到大规模矩阵计算,因此复杂度可达立方阶次。针对复杂度高的问题,提出了基于消息传递接口(MPI)的并行化马尔可夫聚类算法以提高算法的计算性能。首先,生物网络转化成邻接矩阵;然后,根据算法的特性,按照矩阵的规模判断并重新生成新矩阵以处理非平方倍数矩阵的计算;其次,并行计算通过按块分配的方式能够有效地实现任意规模矩阵的运算;最后,循环并行计算直至收敛,得到网络聚类结果。通过模拟网络和真实生物网络数据集的实验结果表明,与全块集体式通信(FCC)并行方法相比,平均并行效率提升了10个百分点以上,因此可以将该优化算法应用在不同类型的大规模生物网络中。

关 键 词:消息传递接口  并行化  马尔可夫聚类  Cannon算法  大规模生物网络
收稿时间:2018-07-19
修稿时间:2018-08-17

Parallel algorithm of Markov clustering for large-scale biological networks
SUN Jiamin,ZHU Jiafu,YANG Fuzhang,XIE Jiang.Parallel algorithm of Markov clustering for large-scale biological networks[J].journal of Computer Applications,2019,39(1):66-71.
Authors:SUN Jiamin  ZHU Jiafu  YANG Fuzhang  XIE Jiang
Affiliation:School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
Abstract:Markov Clustering Algorithm (MCL) is an effective method to find modules in large-scale biological networks. It can mine modules that have significant influence on network structure and function. The algorithm involves large-scale matrix calculations, so its complexity can reach cubic orders. For the problem of high complexity, a parallel algorithm of Markov clustering based on Message Passing Interface (MPI) was proposed to improve computational performance of algorithm. Firstly, a biological network was transformed into an adjacency matrix. Secondly, according to the characteristics of the algorithm, the matrix size was judged and a new matrix was regenerated to handle the calculation of non-square multiple matrix. Thirdly, the algorithm was calculated in parallel by means of block allocation, which could effectively implement the operation of matrix of any size. Finally, the loop was parallelized until the matrix was converged to obtain network clustering results. The experimental results on simulated network and real biological network datasets show that compared with Full-block Collective Communication (FCC) parallel method, the average parallel efficiency is improved by more than 10 percentage points, so the optimization algorithm can be applied in different types of large-scale biological networks.
Keywords:Message Passing Interface (MPI)                                                                                                                        parallelization                                                                                                                        Markov clustering                                                                                                                        Cannon algorithm                                                                                                                        large-scale biological network
本文献已被 维普 万方数据 等数据库收录!
点击此处可从《计算机应用》浏览原始摘要信息
点击此处可从《计算机应用》下载全文
设为首页 | 免责声明 | 关于勤云 | 加入收藏

Copyright©北京勤云科技发展有限公司    京ICP备09084417号-23

京公网安备 11010802026262号