Accelerating LARED-P Algorithm Based on GPU
Citation: LIU Lai-guo, XU Wei-xia, YANG Can-qun, CHEN Juan. Accelerating LARED-P Algorithm Based on GPU[J]. Computer Engineering & Science, 2009, 31(Z1).
Authors: LIU Lai-guo  XU Wei-xia  YANG Can-qun  CHEN Juan
Affiliation: College of Computer, National University of Defense Technology, Changsha 410073, Hunan, China
Funding: National High Technology Research and Development Program of China (863 Program); National Natural Science Foundation of China
Abstract: GPUs deliver hundreds of GFlops, and in some cases more than a TFlops, of floating-point performance. Applying GPUs to particle simulation can substantially speed up large-scale particle simulations and reduce their computational cost. This paper uses a GPU to accelerate LARED-P, a three-dimensional laser-plasma simulation algorithm. It proposes methods for partitioning work between the CPU and the GPU, for decomposing tasks on the GPU, and for splitting large compute kernels, and combines them with the use of registers and texture memory to accelerate the algorithm. In double precision, the ported algorithm running on a single GPU of an NVIDIA Tesla S1070 clocked at 1.44 GHz achieves a 6x speedup over a single core of an Intel(R) Core(TM)2 Quad CPU Q6600 clocked at 2.4 GHz.

Keywords: GPU; particle simulation; LARED-P; acceleration
This article is indexed in Wanfang Data and other databases.