

Hierarchical parallelization and optimization of high-order stencil computations on multicore clusters
Authors: Hikmet Dursun, Manaschai Kunaseth, Ken-ichi Nomura, Jacqueline Chame, Robert F. Lucas, Chun Chen, Mary Hall, Rajiv K. Kalia, Aiichiro Nakano, Priya Vashishta
Affiliation: 1. Collaboratory for Advanced Computing and Simulations, Department of Computer Science, Department of Physics & Astronomy, Department of Chemical Engineering & Materials Science, University of Southern California, Los Angeles, CA, 90089, USA
2. Information Sciences Institute, University of Southern California, Suite 1001, 4676 Admiralty Way, Marina del Rey, CA, 90292, USA
3. School of Computing, University of Utah, Salt Lake City, UT, 84112, USA
Abstract: We present a scalable parallelization scheme for high-order stencil computations that also optimizes memory behavior on multicore clusters. Our multilevel approach combines: (i) inter-node parallelization via spatial decomposition; (ii) inter-core parallelization via multithreading and explicit non-uniform memory access (NUMA) control; (iii) data locality optimizations through auto-tuned tiling for efficient use of hierarchical memory; and (iv) register blocking and data parallelism via single-instruction multiple-data techniques to utilize registers and exploit data locality. The scheme is applied to a sixth-order stencil-based finite-difference time-domain code. Weak-scaling parallel efficiency is over 98% on 32,768 BlueGene/P processors. Multithreading with explicit NUMA control attains 9.9-fold speedup on a dual 12-core AMD Opteron system. Data locality optimizations achieve 7.7-fold reduction of the last level cache miss rate of Intel Nehalem, whereas register blocking increases data parallelism and thereby achieves 5.9 Gflops performance on a single core. Register blocking + multithreading optimizations achieve 5.8-fold speedup on a single quadcore Nehalem.
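As a rough illustration of item (iii), the sketch below applies a sixth-order central-difference Laplacian on a small 2D grid, once with a plain loop nest and once with a tiled loop nest that visits the same points block by block. It is not the authors' code: their application is a finite-difference time-domain code with auto-tuned tile sizes, whereas the function names (`laplacian_point`, `laplacian_naive`, `laplacian_tiled`), the 2D grid, and the fixed tile sizes here are all assumptions chosen for brevity. Only the stencil coefficients are standard.

```python
# Sixth-order central-difference coefficients for the second derivative,
# for offsets 0, +/-1, +/-2, +/-3, in units of 1/h^2.
C = (-49.0 / 18.0, 3.0 / 2.0, -3.0 / 20.0, 1.0 / 90.0)
R = 3  # stencil radius: three neighbors on each side per axis


def laplacian_point(u, i, j, h):
    """Sixth-order 2D Laplacian of grid u at interior point (i, j)."""
    acc = 2.0 * C[0] * u[i][j]  # center weight counted once per axis
    for k in range(1, R + 1):
        acc += C[k] * (u[i - k][j] + u[i + k][j] + u[i][j - k] + u[i][j + k])
    return acc / (h * h)


def laplacian_naive(u, h):
    """Plain row-by-row sweep over all interior points."""
    n, m = len(u), len(u[0])
    out = [[0.0] * m for _ in range(n)]
    for i in range(R, n - R):
        for j in range(R, m - R):
            out[i][j] = laplacian_point(u, i, j, h)
    return out


def laplacian_tiled(u, h, tile_i=8, tile_j=8):
    """Same sweep, but visiting the interior in tile_i x tile_j blocks.

    The tile sizes are hypothetical placeholders; an auto-tuner would
    search over them for a given cache hierarchy.
    """
    n, m = len(u), len(u[0])
    out = [[0.0] * m for _ in range(n)]
    for ii in range(R, n - R, tile_i):
        for jj in range(R, m - R, tile_j):
            for i in range(ii, min(ii + tile_i, n - R)):
                for j in range(jj, min(jj + tile_j, m - R)):
                    out[i][j] = laplacian_point(u, i, j, h)
    return out
```

Tiling changes only the order in which points are visited, so both functions produce identical results; in a compiled language the blocked order is what lets neighboring stencil reads reuse data already in cache. Pure Python will not show that speedup, so this sketch is meant to convey the loop structure only.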
This article is indexed in SpringerLink and other databases.