Automatic Vectorization Transplant and Optimization of LLVM for Domestic Processors (面向国产平台的LLVM自动向量化移植与优化)

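Cite this article:	LI Jia'nan, HAN Lin, CHAI Yunda. Automatic Vectorization Transplant and Optimization of LLVM for Domestic Processors[J]. Computer Engineering (计算机工程), 2022, 48(1): 142-148.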

Authors:	LI Jia'nan (李嘉楠), HAN Lin (韩林), CHAI Yunda (柴赟达)

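Affiliation:	1. School of Information Engineering, Zhengzhou University, Zhengzhou 450000, China; 2. National Supercomputing Center in Zhengzhou, Zhengzhou 450000, China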

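Funding:	National Key Research and Development Program of China, "Research on Key Technologies of a Management and Sharing Service System for Global Earth Observation Results" (2018YFB0505000)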

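Abstract:	As an important means of vectorizing code for SIMD extension units, automatic vectorization has been implemented in the LLVM compiler. However, differences in vector length and instruction set functionality mean that automatic vectorization on domestic platforms tends to miss vectorization opportunities or to run slower after vectorization than before. To make full use of SIMD, this paper refines the instruction cost information according to the instruction set characteristics of the domestic platform, which improves the accuracy of the benefit analysis, so that automatic vectorization generates concise and efficient vector instructions that the back end supports. On this basis, an improved control flow vectorization method is proposed, and adding instruction cost information improves the adaptability of automatic vectorization, yielding an LLVM automatic vectorization system for domestic platforms. Experimental results show that, compared with the state before the automatic vectorization port, porting and optimizing with this method improves overall performance on the SPEC tests by 10.8%, raises the speedup on the TSVC test suite by 16%, raises the speedup under precise cost guidance by 42%, and raises the speedup under control flow vectorization by 51%.
Keywords:	automatic vectorization; vectorization benefit; transplant; LLVM compiler; domestic platform
Received:	2020-12-09
Revised:	2021-01-19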
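
The abstract's point about cost information steering the benefit analysis can be illustrated with a small model. The sketch below is not the paper's implementation and not LLVM's actual cost model; the names `LoopCost` and `choose_vf` and all cost numbers are hypothetical. It only assumes the common scheme in which a vectorization factor (VF) is accepted when the vector cost divided by VF is lower than the scalar cost per iteration.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class LoopCost:
    """Hypothetical per-iteration instruction costs for one loop."""
    scalar_cost: float             # cost of one scalar iteration
    vector_cost: Dict[int, float]  # cost of one vector iteration, keyed by VF


def choose_vf(loop: LoopCost) -> int:
    """Return the VF with the lowest cost per original iteration (1 = stay scalar)."""
    best_vf, best_cost = 1, loop.scalar_cost
    for vf, cost in sorted(loop.vector_cost.items()):
        per_iteration = cost / vf  # one vector iteration covers vf scalar iterations
        if per_iteration < best_cost:
            best_vf, best_cost = vf, per_iteration
    return best_vf


# Made-up generic costs that underestimate an operation the target has to
# expand into several instructions: vectorization looks profitable.
generic = LoopCost(scalar_cost=4.0, vector_cost={2: 5.0, 4: 6.0})

# Made-up target-specific costs for the same loop: the vector forms are
# expensive, so the loop should stay scalar rather than slow down.
refined = LoopCost(scalar_cost=4.0, vector_cost={2: 9.0, 4: 20.0})
```

With these illustrative numbers, `choose_vf(generic)` selects VF 4 while `choose_vf(refined)` keeps the loop scalar, which is the kind of slowdown after vectorization that more accurate cost information is meant to prevent.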
Automatic Vectorization Transplant and Optimization of LLVM for Domestic Processors

LI Jia'nan, HAN Lin, CHAI Yunda. Automatic Vectorization Transplant and Optimization of LLVM for Domestic Processors[J]. Computer Engineering, 2022, 48(1): 142-148.

Authors:	LI Jia'nan, HAN Lin, CHAI Yunda

Affiliation:	1. School of Information Engineering, Zhengzhou University, Zhengzhou 450000, China; 2. National Supercomputing Center in Zhengzhou, Zhengzhou 450000, China

Abstract:	Automatic vectorization is an important way to exploit SIMD extension units and has been implemented in the LLVM compiler. However, differences in vector length and instruction set functionality can cause automatic vectorization on domestic processors to miss vectorization opportunities or to produce a slowdown after vectorization. To make full use of SIMD, this paper refines the instruction cost information according to the instruction set features of domestic processors, which increases the accuracy of the benefit analysis, so that automatic vectorization generates concise and efficient vector instructions supported by the back end. On this basis, the paper proposes an improved control flow vectorization method; adding instruction cost information improves the adaptability of automatic vectorization. The result is an LLVM-based automatic vectorization system for domestic platforms. The experimental results show that, compared with the state before the automatic vectorization transplant, the proposed method provides a 10.8% overall performance improvement in the SPEC tests, a 16% speedup improvement on the TSVC test suite, a 42% speedup improvement under precise cost guidance, and a 51% speedup improvement under control flow vectorization.

Keywords:	automatic vectorization; vectorization cost; transplant; LLVM compiler; domestic processor
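
The abstract does not describe the improved control flow vectorization method itself, so the following is only a generic illustration of the underlying idea, usually called if-conversion: a branch in the loop body is replaced by a per-lane mask and a select, so that every lane executes the same instructions. The function names and the chunk width `vf` are hypothetical, and Python lists stand in for vector registers.

```python
from typing import List


def scalar_loop(a: List[int], b: List[int]) -> List[int]:
    """Reference loop with a branch in the body."""
    out = []
    for x, y in zip(a, b):
        if x > y:
            out.append(x - y)
        else:
            out.append(y)
    return out


def if_converted_loop(a: List[int], b: List[int], vf: int = 4) -> List[int]:
    """Same loop after if-conversion, processed in chunks of vf lanes."""
    out: List[int] = []
    for i in range(0, len(a), vf):
        va, vb = a[i:i + vf], b[i:i + vf]
        mask = [x > y for x, y in zip(va, vb)]    # vector compare yields a lane mask
        then_v = [x - y for x, y in zip(va, vb)]  # "then" arm computed for every lane
        else_v = vb                               # "else" arm computed for every lane
        # Per-lane select: take the "then" value where the mask is set.
        out.extend(t if m else e for m, t, e in zip(mask, then_v, else_v))
    return out
```

Both arms are evaluated for all lanes, which is why the profitability of this transformation depends on accurate costs for the compare, select, and masked operations on the target.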
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).