Research Progress on Batch Normalization of Deep Learning and Its Related Algorithms

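Cite this article:	Liu Jianwei, Zhao Huidan, Luo Xionglin, Xu Jun. Research progress on batch normalization of deep learning and its related algorithms[J]. Acta Automatica Sinica, 2020, 46(6): 1090-1120.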

Authors:	Liu Jianwei (刘建伟), Zhao Huidan (赵会丹), Luo Xionglin (罗雄麟), Xu Jun (许鋆)

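Affiliations:	1. Department of Automation, China University of Petroleum (Beijing), Beijing 102249; 2. School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, Shenzhen 518055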

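Funding:	Supported by the National Key Research and Development Program of China (2016YFC0303703) and the Annual Forward-Looking Orientation and Cultivation Project Fund of China University of Petroleum (Beijing) (2462018QZDX02)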

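Abstract:	Deep learning has been widely applied in many fields, such as computer vision and natural language processing, and has achieved results clearly superior to those of earlier machine learning algorithms. With the rapid development of information technology, training data sets keep growing and deep neural networks keep getting larger, which makes training increasingly difficult and leaves both speed and accuracy in need of improvement. In 2015, Ioffe et al. pointed out a serious problem in training deep neural networks, internal covariate shift, which makes training sensitive to the initial parameter values and slows convergence. They proposed batch normalization (BN) to reduce internal covariate shift and speed up the convergence of training. Many networks now use BN as an important means of accelerating training. Given the practical value of BN, this paper systematically reviews the research progress on BN and its related algorithms. First, the principle of BN is analyzed in detail. Although BN is simple and practical, it has some problems, such as its dependence on the mini-batch size and the different treatment of data during training and inference. Many researchers have therefore proposed related structures and algorithms, and this paper analyzes and summarizes their principles, advantages, and the main problems they address. Next, the ways BN is applied across different areas of neural networks are summarized, along with other techniques commonly used to improve training performance. Finally, the paper concludes and discusses future research directions for BN.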
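Keywords:	batch normalization; whitening; internal covariate shift; stochastic gradient descent; normalization propagation; batch renormalization; diminishing batch normalization (逐步归纳批量归一化); layer normalization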
Research Progress on Batch Normalization of Deep Learning and Its Related Algorithms

Affiliation:	1. Department of Automation, China University of Petroleum (Beijing), Beijing 102249; 2. School of Mechanical Engineering and Automation, Harbin Institute of Technology, Shenzhen, Shenzhen 518055

Abstract:	Deep learning has been widely applied in many fields, such as computer vision and natural language processing, and has achieved much better results than earlier machine learning methods. With the rapid development of information technology, deep neural networks are trained on ever larger data sets and the networks themselves keep growing, which makes training harder and leaves both speed and accuracy in need of improvement. In 2015, Ioffe et al. pointed out a serious problem in the training of deep neural networks, namely internal covariate shift, which slows training by requiring careful parameter initialization and smaller learning rates. They proposed batch normalization (BN) to reduce the effect of internal covariate shift and accelerate the convergence of training. At present, many networks use BN as an important means of accelerating training. In view of the practical value of BN, this paper systematically reviews the research progress on BN and its related algorithms. First, the principle of BN is analyzed. Although BN is simple and useful, it has some problems, such as its dependence on the mini-batch size and the different processing of data during training and inference. Many researchers have therefore proposed algorithms based on BN, and the advantages and main functions of these algorithms are analyzed and summarized. Then, the applications of BN in various areas of neural networks are summarized, together with other methods for improving the training performance of neural networks. Finally, we conclude the paper and point out future trends and research directions for BN.
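
The abstract notes that BN depends on the mini-batch size and treats data differently during training and inference. A minimal sketch of that difference for a single feature is given below. The function names, the momentum of 0.9, the epsilon of 1e-5, and the use of an exponential moving average for the inference statistics are illustrative assumptions (a common convention in practice), not details taken from the paper.

```python
import math


def batch_norm_train(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature using the statistics of the current mini-batch."""
    m = len(batch)
    mean = sum(batch) / m
    # Biased (population) variance of the mini-batch.
    var = sum((x - mean) ** 2 for x in batch) / m
    out = [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]
    return out, mean, var


def update_running(running, new, momentum=0.9):
    """Exponential moving average used to accumulate inference statistics.

    Assumed convention: running = momentum * running + (1 - momentum) * new.
    """
    return momentum * running + (1.0 - momentum) * new


def batch_norm_infer(x, running_mean, running_var, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize a single activation with fixed statistics; no batch is needed."""
    return gamma * (x - running_mean) / math.sqrt(running_var + eps) + beta


# Training step: statistics come from the mini-batch itself.
batch = [1.0, 2.0, 3.0, 4.0]
normalized, batch_mean, batch_var = batch_norm_train(batch)

# The running statistics are updated once per training step (hypothetical start values).
running_mean = update_running(0.0, batch_mean)
running_var = update_running(1.0, batch_var)

# Inference: the same input is normalized with the accumulated statistics instead.
y = batch_norm_infer(3.0, running_mean, running_var)
```

In this sketch, a very small mini-batch gives noisy estimates of the mean and variance in `batch_norm_train`, which is one way to see why the mini-batch size matters, while `batch_norm_infer` produces the same output for a given input regardless of what else is in the batch.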
