
Research on Image Remap Algorithm Optimization Based on OpenCL

Authors:	Wu Zailong, Zhang Yunquan, Long Guoping, Xu Jianliang, Jia Haipeng

Affiliation:	1. College of Information Science and Engineering, Ocean University of China, Qingdao, Shandong 266100, China; 2. Laboratory of Parallel Software and Computational Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China; 3. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

Abstract:	Remap is a typical image transformation algorithm, widely used in image scaling, warping, and rotation. As image sizes and resolutions keep increasing, the performance demands on remap algorithms grow accordingly. Taking full account of the hardware architecture differences between GPU platforms, this paper systematically studies how to implement the remap algorithm efficiently on different GPU platforms under the OpenCL framework. It examines how several optimizations (off-chip global memory access optimization, vectorized computation, and reducing dynamic instructions such as conditional branches) affect performance on each platform, and discusses the possibility of performance portability across GPU platforms. Experimental results show that, excluding data transfer time, the optimized algorithm on an AMD HD5850 GPU achieves a speedup of 114.3 to 491.5 times over the CPU version and 1.01 to 1.86 times over the CUDA version (the existing GPU implementation). On an NVIDIA C2050 GPU it achieves a speedup of 100.7 to 369.8 times over the CPU version and 0.95 to 1.58 times over the CUDA version. These results support the effectiveness and performance portability of the proposed optimization methods.

Keywords:	OpenCL; general-purpose parallel computing; image remap algorithm; cross-platform

Received:	2012-10-22
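
For readers unfamiliar with the remap operation the abstract refers to, the following is a minimal CPU reference sketch in Python, not the authors' OpenCL implementation. It assumes bilinear interpolation and a constant border value for out-of-range coordinates; the abstract does not say which interpolation or border mode the paper uses. The function name `remap_bilinear` and its parameters are hypothetical. Each output pixel is computed independently of the others, which is the property that lets a GPU version assign one work-item per output pixel.

```python
import math


def remap_bilinear(src, map_x, map_y, border=0.0):
    """Return dst where dst[y][x] samples src at (map_x[y][x], map_y[y][x]).

    Assumptions (not taken from the paper): single-channel image stored as a
    list of rows, bilinear interpolation, constant border for coordinates
    that fall outside the source image.
    """
    src_h, src_w = len(src), len(src[0])
    dst_h, dst_w = len(map_x), len(map_x[0])
    dst = [[border] * dst_w for _ in range(dst_h)]

    for y in range(dst_h):
        for x in range(dst_w):
            sx, sy = map_x[y][x], map_y[y][x]
            # Out-of-range source coordinates keep the border value.
            if not (0.0 <= sx <= src_w - 1 and 0.0 <= sy <= src_h - 1):
                continue
            # Integer corners of the cell that contains (sx, sy).
            x0, y0 = int(math.floor(sx)), int(math.floor(sy))
            x1, y1 = min(x0 + 1, src_w - 1), min(y0 + 1, src_h - 1)
            fx, fy = sx - x0, sy - y0
            # Interpolate along x on both rows, then along y.
            top = src[y0][x0] * (1.0 - fx) + src[y0][x1] * fx
            bottom = src[y1][x0] * (1.0 - fx) + src[y1][x1] * fx
            dst[y][x] = top * (1.0 - fy) + bottom * fy
    return dst


# Usage: a horizontal flip expressed as a remap of a 2x3 image.
src = [[1.0, 2.0, 3.0],
       [4.0, 5.0, 6.0]]
flip_x = [[2.0, 1.0, 0.0],
          [2.0, 1.0, 0.0]]
flip_y = [[0.0, 0.0, 0.0],
          [1.0, 1.0, 1.0]]
flipped = remap_bilinear(src, flip_x, flip_y)
```

The bounds check inside the inner loop is the kind of per-pixel conditional that the abstract's "reducing dynamic instructions" optimization would plausibly target, though the paper's actual approach is not described in this record.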
