Improvement of Sparse Matrix-Vector Multiplication on GPU
Cite this article: MA Chao, WEI Gang, PEI Song-Wen, WU Bai-Feng. Improvement of Sparse Matrix-Vector Multiplication on GPU [J]. Computer Systems & Applications, 2010, 19(5): 116-120.
Authors: MA Chao (1), WEI Gang (1), PEI Song-Wen (2), WU Bai-Feng (1)
Affiliations: 1. School of Computer Science, Fudan University, Shanghai 200433, China
2. Department of Computer Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
Abstract: Sparse matrix-vector multiplication (SpMV) is one of the most frequently used kernels in engineering practice and scientific computing. As matrix size grows, the sheer volume of computation limits the performance of the whole system, so the high computing power of the GPU can be used to accelerate SpMV. This paper analyzes the problems of existing SpMV implementations on the GPU and designs two optimizations: row partitioning and use of the float4 data type. Experimental results show that the proposed approach improves performance by a factor of 2 to 8.

Keywords: GPU, sparse matrix, CSR, CUDA
Received: 2009/9/10
Revised: 2009/10/25
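
For readers unfamiliar with the CSR keyword, the following is a minimal CPU reference sketch of SpMV over a matrix in CSR (compressed sparse row) form. It is written in Python for illustration only and is not the paper's GPU kernel. The abstract does not describe how the row partitioning or float4 optimizations work, so neither is reproduced here. The function name `spmv_csr` and the array names `row_ptr`, `col_idx`, and `values` are assumptions chosen for this example.

```python
from typing import List, Sequence


def spmv_csr(
    row_ptr: Sequence[int],
    col_idx: Sequence[int],
    values: Sequence[float],
    x: Sequence[float],
) -> List[float]:
    """Compute y = A @ x for a sparse matrix A stored in CSR form.

    row_ptr[i] .. row_ptr[i + 1] delimits the nonzeros of row i;
    col_idx[k] and values[k] give the column and value of nonzero k.
    (All names here are hypothetical, chosen for this sketch.)
    """
    n_rows = len(row_ptr) - 1
    y = [0.0] * n_rows
    for row in range(n_rows):
        acc = 0.0
        # Dot product of the nonzeros in this row with the matching x entries.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Example matrix:
#   [4 0 1]
#   [0 2 0]
#   [3 0 5]
row_ptr = [0, 2, 3, 5]
col_idx = [0, 2, 1, 0, 2]
values = [4.0, 1.0, 2.0, 3.0, 5.0]
x = [1.0, 2.0, 3.0]

y = spmv_csr(row_ptr, col_idx, values, x)
print(y)
```

In a GPU implementation the outer loop over rows is typically the part distributed across threads. Rows with very different numbers of nonzeros then produce uneven work per thread, which is one kind of problem that row-oriented optimizations aim to address.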
This article is indexed in CNKI, VIP, Wanfang Data, and other databases.