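AutoConfig: A Configuration-Driven Automated Mechanism for Deep Learning Compilation Optimization

Citation:	ZHANG Hong-Bin, ZHOU Xu-Lin, XING Ming-Jie, WU Yan-Jun, ZHAO Chen. AutoConfig: A Configuration-Driven Automated Mechanism for Deep Learning Compilation Optimization[J]. Journal of Software, 2024, 35(6).

Authors:	ZHANG Hong-Bin, ZHOU Xu-Lin, XING Ming-Jie, WU Yan-Jun, ZHAO Chen

Affiliation:	University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

Funding:	National Key Research and Development Program of China (2022YFB4401402)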

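Abstract:	With the rapid development of deep learning models and hardware architectures, deep learning compilers have been widely adopted. At present, compilation optimization and tuning of deep learning models rely mainly on manual tuning based on high-performance operator libraries and on search-based auto-tuning strategies. However, given changing target operators and the need to adapt to multiple hardware platforms, high-performance operator libraries often have to be re-implemented for each architecture. In addition, existing auto-tuning schemes suffer from high search overhead and a lack of interpretability. To address these problems, this paper proposes AutoConfig, an automatic configuration mechanism for deep learning compilation optimization. For different deep learning workloads and specific hardware platforms, AutoConfig builds interpretable analysis models of optimization algorithms, analyzes them using both static information extraction and dynamic overhead measurement, and, based on the results, uses configurable code generation to perform algorithm selection and tuning automatically. The paper's contribution is to combine optimization analysis models with configurable code generation strategies, which preserves the performance speedup while reducing repeated development overhead and simplifying the tuning process. On this basis, the paper integrates AutoConfig into the deep learning compiler Buddy Compiler, builds analysis models for several optimization algorithms for matrix multiplication and convolution, and evaluates the automatically configured code generation strategies on multiple SIMD hardware platforms. The experimental results confirm that AutoConfig effectively completes parameter configuration and algorithm selection within the code generation strategies. Compared with manually or automatically optimized code, the code generated by AutoConfig achieves similar execution performance without incurring the repeated implementation overhead of manual tuning or the search overhead of auto-tuning.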
Keywords:	deep learning compiler; compilation optimization; code generation; automatic configuration mechanism
Received:	2023/9/11
Revised:	2023/10/30
AutoConfig: A Configuration-Driven Automated Mechanism for Deep Learning Compilation Optimization

ZHANG Hong-Bin, ZHOU Xu-Lin, XING Ming-Jie, WU Yan-Jun, ZHAO Chen. AutoConfig: A Configuration-Driven Automated Mechanism for Deep Learning Compilation Optimization[J]. Journal of Software, 2024, 35(6).

Authors:	ZHANG Hong-Bin, ZHOU Xu-Lin, XING Ming-Jie, WU Yan-Jun, ZHAO Chen

Affiliation:	University of Chinese Academy of Sciences, Beijing 100049, China; Institute of Software, Chinese Academy of Sciences, Beijing 100190, China

Abstract:	Deep learning compilers have been widely used with the rapid development of deep learning models and hardware architectures. At present, compilation optimization and tuning of deep learning models rely mainly on high-performance operator libraries and compiler auto-tuning. However, challenges arise when mapping diverse deep learning workloads to multiple platforms. Optimized libraries often require re-implementation for each architecture, which duplicates development effort. In addition, auto-tuning techniques incur substantial search overhead and offer little interpretability. To address these problems, this paper proposes AutoConfig, a configuration-driven automated mechanism for deep learning compilation optimization. Targeting different deep learning workloads and multiple hardware platforms, AutoConfig constructs interpretable performance analysis models, assesses them through static information extraction and dynamic cost measurement, and automates algorithm selection and configuration tuning for code generation. The key innovation of this work is integrating the optimization analysis model into a configurable code generation approach. This enables AutoConfig to preserve performance acceleration, reduce repeated development overhead, and simplify the tuning process. Furthermore, this paper integrates AutoConfig into a deep learning compiler, Buddy Compiler, constructs analysis models for convolution and matrix multiplication optimization, and evaluates the optimization on multiple SIMD hardware platforms. Experimental results indicate that AutoConfig is effective at rewrite pattern configuration and algorithm selection. In addition, without laborious manual re-implementation or expensive auto-tuning search overhead, the performance of code generated by AutoConfig can match the manual and automatic optimizations of other deep learning libraries and compilers.
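
The selection step described in the abstract (static information plus measured costs feeding an interpretable model that picks an algorithm and its parameters) can be pictured with a minimal sketch. Everything below is an illustrative assumption: the names `MatmulShape`, `MeasuredCosts`, and `auto_configure`, the two candidate vectorization strategies, and the cost formulas are hypothetical and are not the paper's analysis models or any Buddy Compiler API.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class MatmulShape:
    """Static information extracted from the workload (C[m,n] += A[m,k] * B[k,n])."""
    m: int
    n: int
    k: int


@dataclass(frozen=True)
class MeasuredCosts:
    """Dynamic per-instruction costs, e.g. from microbenchmarks (hypothetical units)."""
    vector_bits: int
    load: float
    fma: float
    broadcast: float
    reduce: float


def broadcast_cost(s: MatmulShape, hw: MeasuredCosts, lanes: int) -> float:
    # Vectorize along n: broadcast A[i,k] once, then load + FMA per vector of B's row.
    vecs = math.ceil(s.n / lanes)
    return s.m * s.k * (hw.broadcast + vecs * (hw.load + hw.fma))


def inner_product_cost(s: MatmulShape, hw: MeasuredCosts, lanes: int) -> float:
    # Vectorize along k: two loads + FMA per vector, then one horizontal reduction
    # for each output element.
    vecs = math.ceil(s.k / lanes)
    return s.m * s.n * (vecs * (2 * hw.load + hw.fma) + hw.reduce)


# Each candidate algorithm has a closed-form model, so the choice is explainable.
MODELS = {"broadcast": broadcast_cost, "inner_product": inner_product_cost}


def auto_configure(shape: MatmulShape, hw: MeasuredCosts, elem_bits: int = 32) -> dict:
    """Pick the cheapest candidate and the vector width to configure code generation."""
    lanes = hw.vector_bits // elem_bits
    costs = {name: model(shape, hw, lanes) for name, model in MODELS.items()}
    best = min(costs, key=costs.get)
    return {"algorithm": best, "vector_lanes": lanes, "estimated_costs": costs}


if __name__ == "__main__":
    hw = MeasuredCosts(vector_bits=256, load=1.0, fma=1.0, broadcast=1.0, reduce=4.0)
    print(auto_configure(MatmulShape(64, 64, 64), hw)["algorithm"])
    print(auto_configure(MatmulShape(64, 1, 64), hw)["algorithm"])
```

Under these assumed costs the square matrix multiplication selects the broadcast strategy, while the matrix-vector shape selects the inner-product strategy, and no search over candidate schedules is needed. In a real system the returned configuration would parameterize a code generation pass rather than be printed.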

Keywords:	deep learning compiler; compilation optimization; code generation; automatic configuration mechanism
