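POM: A Process Optimization Mapping Tool for MPI Programs
Cite this article: LU Xing-jing, SHANG Lei, CHEN Li. POM: A Process Optimization Mapping Tool for MPI Programs[J]. Computer Engineering & Science, 2009, 31(Z1).
Authors: LU Xing-jing  SHANG Lei  CHEN Li
Affiliations: 1. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
2. Key Laboratory of Computer System and Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; University of New South Wales, Sydney 2052, Australia
Funding: Supported by the National Natural Science Foundation of China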
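Abstract (translated from the Chinese): Modern supercomputers have a growing number of compute nodes, and each node contains multiple processor cores. Because interconnect bandwidth differs, inter-node and intra-node communication form two levels with different performance, and the intra-node level is the faster one. However, the default process mapping of MPI programs ignores this difference between communication levels and cannot exploit the better intra-node bandwidth, which severely limits the performance of the supercomputer. To address this problem, this paper designs and implements POM, an automatic process optimization mapping tool for MPI programs that exploits the difference between communication levels. POM provides an efficient, low-overhead method for obtaining the communication information of an MPI program. By optimizing how communication is distributed across the communication levels, it improves the program's communication efficiency and therefore application performance. The paper solves three problems: abstracting the communication hierarchy of the hardware platform, obtaining MPI communication information at low overhead, and computing the mapping scheme. First, according to differences in communication capability, the supercomputer architecture is abstracted into two levels: distinct compute nodes connected by a high-speed interconnect, and multiple processor cores on the same node. Second, a simple method is proposed for converting collective communication into point-to-point communication. Finally, an undirected graph with weighted edges represents the inter-process communication of an MPI program, which turns the process mapping problem into a graph partitioning problem. Experimental results on the Dawning 5000A and Dawning 4000A show that POM can significantly improve the performance of MPI programs.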

Keywords (translated from the Chinese): process mapping  message passing interface (MPI)  graph partitioning

POM: A Process Optimization Mapping Tool for MPI Programs
LU Xing-jing, SHANG Lei, CHEN Li. POM: A Process Optimization Mapping Tool for MPI Programs[J]. Computer Engineering & Science, 2009, 31(Z1).
Authors: LU Xing-jing  SHANG Lei  CHEN Li
Abstract: Modern supercomputers contain a growing number of computing nodes, each with several multi-core processors. Inter-node and intra-node links have different bandwidths and therefore form two distinct communication layers, of which the intra-node layer performs better. The default process mapping of MPI does not consider this difference in bandwidth, so it reduces the performance of the computing platform. To resolve the problem, this paper introduces an automatic tool that optimizes process mapping for MPI programs. The tool supplies a low-cost method of collecting communication information and optimizes how the program's communication is distributed across the system, which makes better use of the platform's communication performance and improves the performance of the program. First, to represent the communication layers of the computing platform, the supercomputer is simplified into two layers: the top layer consists of different computing nodes connected by a high-speed network, and the bottom layer consists of the multi-core processors on the same node, which share a wider bandwidth. Second, we introduce a method that transforms collective communication into point-to-point communication and adds it to the communication information. Finally, an undirected graph with weighted edges represents the communication relationships among processes, so the process mapping problem becomes a graph partitioning problem. This paper uses the open source software Chaco to solve the graph partitioning problem. Experiments show that POM effectively improves the performance of MPI programs.
Keywords: process mapping  message passing interface (MPI)  graph partitioning
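
The pipeline outlined in the abstract (fold collective communication into point-to-point traffic, build an undirected weighted graph, then partition it across nodes) can be illustrated with a small sketch. This is not POM's implementation and it does not call Chaco: the flat-tree model of a broadcast, the greedy grouping heuristic, and every function name below are assumptions made only for illustration.

```python
from collections import defaultdict


def add_p2p(graph, src, dst, nbytes):
    """Accumulate traffic on the undirected edge {src, dst}."""
    if src != dst:
        graph[frozenset((src, dst))] += nbytes


def add_broadcast(graph, root, ranks, nbytes):
    """Assumption: model a broadcast as one message from root to every other rank."""
    for r in ranks:
        add_p2p(graph, root, r, nbytes)


def weight(graph, a, b):
    """Edge weight between two ranks, 0 if they never communicate."""
    return graph.get(frozenset((a, b)), 0)


def greedy_map(graph, nranks, cores_per_node):
    """Hypothetical stand-in for a graph partitioner: fill each node with the
    ranks that exchange the most data with ranks already placed on it."""
    unplaced = set(range(nranks))
    nodes = []
    while unplaced:
        seed = min(unplaced)  # start each node with the lowest unplaced rank
        group = [seed]
        unplaced.remove(seed)
        while len(group) < cores_per_node and unplaced:
            best = max(
                sorted(unplaced),
                key=lambda r: sum(weight(graph, r, g) for g in group),
            )
            group.append(best)
            unplaced.remove(best)
        nodes.append(group)
    return nodes


def inter_node_traffic(graph, nodes):
    """Total bytes carried by edges whose endpoints sit on different nodes."""
    node_of = {r: i for i, group in enumerate(nodes) for r in group}
    return sum(
        w for edge, w in graph.items()
        if len({node_of[r] for r in edge}) == 2
    )


# Toy workload: 4 ranks, 2 cores per node (made-up numbers).
graph = defaultdict(int)
add_p2p(graph, 0, 2, 100)
add_p2p(graph, 1, 3, 100)
add_p2p(graph, 0, 1, 1)
add_p2p(graph, 2, 3, 1)
add_broadcast(graph, root=0, ranks=range(4), nbytes=10)

block = [[0, 1], [2, 3]]          # default block mapping
mapped = greedy_map(graph, 4, 2)  # communication-aware mapping

print(mapped)
print(inter_node_traffic(graph, block), inter_node_traffic(graph, mapped))
```

In this toy workload the block mapping places the heavily communicating pairs on different nodes, while the greedy mapping co-locates them, so far less traffic crosses the inter-node layer. A real partitioner such as Chaco would replace `greedy_map`.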
This article is indexed in Wanfang Data and other databases.