
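Survey of Research on Deep Reinforcement Learning
Cite this article: YANG Siming, SHAN Zheng, DING Yu, LI Gangwei. Survey of Research on Deep Reinforcement Learning[J]. Computer Engineering, 2021, 47(12): 19-29.
Authors: YANG Siming  SHAN Zheng  DING Yu  LI Gangwei
Affiliations: 1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001; 2. Unit 94162 of the Chinese People's Liberation Army, Xi'an 710600; 3. Unit 78100 of the Chinese People's Liberation Army, Chengdu 610031
Funding: National Natural Science Foundation of China (61971092, 61701503).
Abstract: Deep reinforcement learning uses the feature representation capability of deep neural networks to fit the state, action, value, and other functions of reinforcement learning in order to improve the performance of reinforcement learning models. It is widely applied in video games, mechanical control, recommender systems, financial investment, and other fields. This paper reviews the main development history of deep reinforcement learning methods and classifies them according to current research goals. It analyzes and discusses four problems: algorithm convergence on tasks with high-dimensional state-action spaces, improvement of sample efficiency in complex application scenarios, exploration when the reward function is sparse or not explicitly defined, and enhancement of generalization performance in multi-task scenarios. It summarizes the state of research on these four classes of deep reinforcement learning methods and gives an outlook on future directions for deep reinforcement learning technology.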

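Keywords: deep learning  reinforcement learning  deep reinforcement learning  inverse reinforcement learning  model-based meta-learning
Received: 2021-03-12
Revised: 2021-05-15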

Survey of Research on Deep Reinforcement Learning
YANG Siming, SHAN Zheng, DING Yu, LI Gangwei. Survey of Research on Deep Reinforcement Learning[J]. Computer Engineering, 2021, 47(12): 19-29.
Authors:YANG Siming  SHAN Zheng  DING Yu  LI Gangwei
Affiliation:1. State Key Laboratory of Mathematical Engineering and Advanced Computing, Zhengzhou 450001, China;2. 94162 Troops of PLA, Xi'an 710600, China;3. 78100 Troops of PLA, Chengdu 610031, China
Abstract: Deep Reinforcement Learning (DRL) uses the feature representation capabilities of deep neural networks to fit Reinforcement Learning (RL) functions, including the state, action, and value functions, so that the performance of RL models can be improved. It has been widely used in video games, mechanical control, recommendation systems, financial investment, and other fields. This article reviews the development history of DRL methods and categorizes them by current research goals. It then analyzes four problems: algorithm convergence in tasks with high-dimensional state-action spaces, improvement of sample efficiency in complex application scenarios, exploration in scenarios where the reward function is sparse or not explicitly defined, and enhancement of generalization ability in multi-task scenarios. Finally, the article summarizes the current state of research on these four kinds of DRL methods and discusses future development trends of DRL technology.
Keywords: Deep Learning (DL)  Reinforcement Learning (RL)  Deep Reinforcement Learning (DRL)  Inverse Reinforcement Learning (IRL)  Model-Based Meta-Learning (MBML)
This article is indexed in Wanfang Data and other databases.