A Combinatory Form Learning Rate Scheduling for Deep Learning Model

Citation: HE Yu-Yao, LI Bao-Qi. A combinatory form learning rate scheduling for deep learning model[J]. Acta Automatica Sinica, 2016, 42(6): 953-958.
Authors: HE Yu-Yao, LI Bao-Qi
Affiliation: School of Marine Science and Technology, Northwestern Polytechnical University, Xi'an 710072, China
Fund: Supported by National Natural Science Foundation of China (61271143)
Received: 2015-10-20

Abstract: A well-designed learning rate schedule can significantly improve the convergence rate of a deep learning model and reduce its training time. The AdaGrad and AdaDec strategies provide only a single learning rate form for all of a model's parameters. To address this, and according to the characteristics of the two kinds of model parameters, this paper proposes a combined learning rate strategy, AdaMix: the connection weights are given a learning rate that depends only on the current gradient, while the biases use a power-exponential learning rate. To compare how the strategies affect model convergence, an Autoencoder is trained to reconstruct the MNIST image database, with the test-set reconstruction error during the back-propagation fine-tuning phase used as the evaluation index. The experimental results show that AdaMix attains a lower reconstruction error and a lower computational cost than AdaGrad and AdaDec, and thus converges faster.

Keywords: deep learning, learning rate, combined learning scheduling, image reconstruction
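
To make the contrast concrete, the following NumPy sketch sets the single-form AdaGrad rule beside a split weight/bias treatment of the kind the abstract attributes to AdaMix. The exact AdaMix formulas are not reproduced in this record, so the specific update forms and the constants eta, eta0, and kappa below are illustrative assumptions, not the authors' equations; only the AdaGrad rule is the standard published one.

    # Sketch: single-form AdaGrad vs. an assumed AdaMix-style split rule.
    import numpy as np

    def adagrad_update(param, grad, hist, eta=0.01, eps=1e-8):
        """AdaGrad: param -= eta / sqrt(sum of past grad**2 + eps) * grad.
        The same accumulated-history rate form is applied to every parameter,
        which is the 'single learning rate form' the abstract criticizes."""
        hist += grad ** 2                    # per-parameter gradient history
        return param - eta / np.sqrt(hist + eps) * grad, hist

    def adamix_weight_update(W, grad_W, eta=0.01, eps=1e-8):
        """Assumed AdaMix weight rule: the rate uses only the CURRENT
        gradient, so no history buffer is stored or updated, which is one
        plausible reading of the abstract's lower-computation claim."""
        return W - eta / np.sqrt(grad_W ** 2 + eps) * grad_W

    def adamix_bias_update(b, grad_b, t, eta0=0.1, kappa=0.75):
        """Assumed AdaMix bias rule: power-exponential decay eta0 * t**(-kappa),
        a schedule that depends only on the epoch counter t >= 1."""
        return b - eta0 * t ** (-kappa) * grad_b

    # Example step with arbitrary shapes (784 inputs, 256 hidden units).
    W, b = 0.01 * np.random.randn(784, 256), np.zeros(256)
    gW, gb = np.random.randn(784, 256), np.random.randn(256)
    W = adamix_weight_update(W, gW)
    b = adamix_bias_update(b, gb, t=1)

Under these assumed forms, the weight rule needs no stored state while AdaGrad must keep and update one history value per parameter, which is consistent with the abstract's claim that AdaMix has a lower computational cost.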