
Research on GPU Acceleration of Implicit Schemes Based on Unstructured Grids

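Citation:	CHEN Long, XU Tian-Hao, TIAN Shu-Ling. Research on GPU Acceleration of Implicit Schemes Based on Unstructured Grids[J]. Computer Systems & Applications, 2018, 27(5): 238-243.

Authors:	CHEN Long, XU Tian-Hao, TIAN Shu-Ling

Affiliation:	College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China (all three authors)

Funding:	Priority Academic Program Development of Jiangsu Higher Education Institutions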

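Abstract:	To address the poor acceleration of implicit algorithms for unstructured grids on GPUs, this work analyzes the GPU architecture and its parallel model, and then studies and implements GPU-parallel acceleration of the implicit LU-SGS algorithm for a cell-vertex scheme on unstructured grids. The RCM and Metis grid reordering (regrouping) methods are used to optimize the data locality of the unstructured grid and so improve the parallel speedup of the implicit algorithm on the GPU. A three-dimensional wing test case verifies the correctness and efficiency of the implementation. The results show that the two grid reordering (regrouping) methods improve the acceleration by 63% and 69%, respectively. The optimized implicit LU-SGS GPU-parallel algorithm achieves a 27-fold speedup over the serial CPU algorithm, which demonstrates the efficiency of the method.
Keywords:	GPU acceleration; parallel computing; grid reordering; computational fluid dynamics; implicit schemes
Received:	2017/9/19
Revised:	2017/10/10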
Research on GPU Acceleration of Implicit Schemes Based on Unstructured Grids

CHEN Long, XU Tian-Hao, TIAN Shu-Ling. Research on GPU Acceleration of Implicit Schemes Based on Unstructured Grids[J]. Computer Systems & Applications, 2018, 27(5): 238-243.

Authors:	CHEN Long, XU Tian-Hao, TIAN Shu-Ling

Affiliation:	College of Aerospace Engineering, Nanjing University of Aeronautics and Astronautics, Nanjing 210016, China (all three authors)

Abstract:	To address the poor GPU acceleration of implicit methods on unstructured grids, this study implements GPU acceleration of the implicit LU-SGS method on unstructured grids with a cell-vertex scheme. After analyzing the GPU architecture and its parallelization model, two grid reordering methods, based on RCM and METIS, are applied to improve the data locality of unstructured grids and hence the GPU acceleration of the implicit method. The ONERA M6 wing test case is used to verify and validate the implementation. The two grid reordering methods improve the GPU acceleration by 63% and 69%, respectively. The optimized GPU implementation achieves a 27-fold speedup over the CPU version running on a single core, which indicates that the proposed GPU implementation is efficient.

Keywords:	GPU acceleration; parallel computing; grid reordering; computational fluid dynamics; implicit schemes
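
The abstract names RCM (Reverse Cuthill-McKee) as one of the two grid reordering methods used to improve data locality. The paper's own implementation is not shown on this page, so the following is only a minimal sketch of the classic RCM idea in plain Python, under the assumption that the grid is given as an undirected edge list between node indices. The function names (`build_adjacency`, `rcm_order`, `renumber_edges`, `bandwidth`) and the small chain-shaped example grid are hypothetical and chosen for illustration. The largest index difference across any edge (the bandwidth) is used here as a rough proxy for how far apart neighbouring nodes sit in memory.

```python
from collections import deque


def build_adjacency(num_nodes, edges):
    """Build a symmetric adjacency list from an undirected edge list."""
    adj = [set() for _ in range(num_nodes)]
    for a, b in edges:
        if a != b:
            adj[a].add(b)
            adj[b].add(a)
    return [sorted(neigh) for neigh in adj]


def rcm_order(adj):
    """Return a Reverse Cuthill-McKee ordering: order[new_index] = old_index."""
    n = len(adj)
    visited = [False] * n
    order = []
    # Handle every connected component; each BFS starts from a low-degree node.
    for start in sorted(range(n), key=lambda v: (len(adj[v]), v)):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Enqueue unvisited neighbours in order of increasing degree.
            for w in sorted(adj[v], key=lambda u: (len(adj[u]), u)):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    order.reverse()  # the "reverse" step of Reverse Cuthill-McKee
    return order


def renumber_edges(edges, order):
    """Rewrite the edge list using the new node numbering."""
    new_index = {old: new for new, old in enumerate(order)}
    return [(new_index[a], new_index[b]) for a, b in edges]


def bandwidth(edges):
    """Largest index distance between two connected nodes."""
    return max((abs(a - b) for a, b in edges), default=0)


# Hypothetical example: a chain of 8 nodes whose labels are scrambled.
edges = [(0, 5), (5, 2), (2, 7), (7, 1), (1, 6), (6, 3), (3, 4)]
adj = build_adjacency(8, edges)
order = rcm_order(adj)
new_edges = renumber_edges(edges, order)

print("bandwidth before:", bandwidth(edges))      # 6
print("bandwidth after: ", bandwidth(new_edges))  # 1
```

After renumbering, connected nodes receive nearby indices, so arrays indexed by node number tend to be read from nearby memory locations. Whether this translates into the improvements reported in the abstract depends on the solver's data layout and the GPU, which this sketch does not model.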
