
Research on Unmanned Tank Battle Simulation Based on Reinforcement Learning (基于强化学习的无人坦克对战仿真研究)
Cite this article: XU Zhixiong, CAO Lei, CHEN Xiliang. Research on unmanned tank battle simulation based on reinforcement learning[J]. Computer Engineering and Applications, 2018, 54(8): 166-171.
Authors: XU Zhixiong (徐志雄)  CAO Lei (曹雷)  CHEN Xiliang (陈希亮)
Affiliation: Institute of Command Information System, PLA University of Science and Technology, Nanjing 210000, China
Abstract: This work improves standard reinforcement learning by adding a motivation layer that injects prior knowledge and speeds up learning. For policy iteration, it replaces the traditional off-policy Q-learning algorithm with the on-policy Sarsa learning algorithm. The resulting multi-motivation-guided Sarsa learning algorithm (MMSarsa) is compared experimentally with the Q-learning and Sarsa learning algorithms on a tank battle simulation problem. The experimental results show that MMSarsa converges quickly and learns efficiently in this comparison.

Keywords: multi-motivation guidance  Q-learning  Sarsa learning  unmanned tank  battle simulation
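
The on-policy/off-policy distinction the abstract relies on can be illustrated with the two standard tabular update rules. The following is a minimal sketch, not the authors' implementation: the state and action names, the table layout, and the values of `ALPHA` and `GAMMA` are assumptions chosen for illustration, and the motivation layer is left out because the abstract does not specify it.

```python
from collections import defaultdict

ALPHA = 0.1  # learning rate (illustrative value, not from the paper)
GAMMA = 0.9  # discount factor (illustrative value, not from the paper)


def q_learning_update(Q, s, a, r, s_next, actions):
    """Off-policy update: bootstrap from the best action available in s_next."""
    target = r + GAMMA * max(Q[(s_next, b)] for b in actions)
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])


def sarsa_update(Q, s, a, r, s_next, a_next):
    """On-policy update: bootstrap from the action the agent actually takes next."""
    target = r + GAMMA * Q[(s_next, a_next)]
    Q[(s, a)] += ALPHA * (target - Q[(s, a)])


# Hypothetical two-action toy problem; names are placeholders.
actions = ["advance", "fire"]


def fresh_table():
    Q = defaultdict(float)        # unseen (state, action) pairs default to 0.0
    Q[("s1", "advance")] = 1.0
    Q[("s1", "fire")] = 5.0
    return Q


Q_q = fresh_table()
Q_s = fresh_table()

# Same transition for both learners; the behavior policy picks "advance" in s1.
q_learning_update(Q_q, "s0", "advance", 0.0, "s1", actions)
sarsa_update(Q_s, "s0", "advance", 0.0, "s1", "advance")

print(Q_q[("s0", "advance")], Q_s[("s0", "advance")])
```

On the same transition, Q-learning backs up the value of the greedy action in the next state, while Sarsa backs up the value of the action the behavior policy actually chose, so the two estimates differ whenever the next action is not greedy.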