
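Research on an Intelligent Decision-Making System Based on the Double Deep Q-Network
Cite this article: KUANG Li-qun (况立群), FENG Li (冯利), HAN Xie (韩燮), JIA Jiong-hao (贾炅昊), GUO Guang-xing (郭广行). Research on Intelligent Decision-making System Based on Double Deep Q-Network[J]. Computer Technology and Development (计算机技术与发展), 2022(2).
Authors: KUANG Li-qun (况立群)  FENG Li (冯利)  HAN Xie (韩燮)  JIA Jiong-hao (贾炅昊)  GUO Guang-xing (郭广行)
Affiliations: School of Data Science and Technology, North University of China; North Automatic Control Technology Institute; School of Geography Science, Taiyuan Normal University
Funding: National Equipment Pre-research Project (41401020402).
Abstract: The classical algorithms used in current intelligent decision-making systems offer a low degree of intelligence, while applying more advanced reinforcement learning algorithms to complex decision-making tasks leads to a curse of dimensionality in storage. To address this problem, an intelligent decision-making algorithm based on the double deep Q-network is proposed. It improves the way the target Q value is computed and performs action selection and policy evaluation separately, which yields a more stable and effective policy. The agent is trained on input states and outputs a near-optimal action that drives its behavior, including environment perception, action perception, and task coordination, so that it can complete a given task in a highly complex decision-making environment. A verification system for virtual intelligent confrontation drills was developed on the Unity3D game engine. It visualizes the real-time state of the drill and the agent's training results, verifies the correctness and stability of the double deep Q-network model, and effectively solves the curse-of-dimensionality problem of reinforcement learning algorithms. The algorithm is expected to be useful in strategy games, confrontation drills, mission plan evaluation, and related fields.

Keywords: deep reinforcement learning  deep Q-network  confrontation drill  simulation training  Unity3D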

Research on Intelligent Decision-making System Based on Double Deep Q-Network
KUANG Li-qun, FENG Li, HAN Xie, JIA Jiong-hao, GUO Guang-xing. Research on Intelligent Decision-making System Based on Double Deep Q-Network[J]. Computer Technology and Development, 2022(2).
Authors: KUANG Li-qun  FENG Li  HAN Xie  JIA Jiong-hao  GUO Guang-xing
Affiliation: School of Data Science and Technology, North University of China, Taiyuan 030051, China; North Automatic Control Technology Institute, Taiyuan 030006, China; School of Geography Science, Taiyuan Normal University, Taiyuan 030006, China
Abstract: Classical algorithms in current intelligent decision-making systems offer a low degree of intelligence, while applying more advanced reinforcement learning algorithms to complex decision-making tasks leads to a curse of dimensionality in storage. To address this problem, an intelligent decision-making algorithm based on the double deep Q-network is proposed. The calculation of the target Q value is improved, and action selection and policy evaluation are carried out separately, so as to obtain more stable and effective policies. The agent is trained on input states and outputs a near-optimal action that drives its behavior, including environment perception, action perception, and task coordination, and thereby completes the given task in a highly complex decision-making environment. A verification system for virtual intelligent confrontation drills is developed on the Unity3D game engine. It visualizes the real-time state of the drill and the agent training results, verifies the correctness and stability of the double deep Q-network model, and effectively solves the curse-of-dimensionality problem of reinforcement learning algorithms. The proposed algorithm is expected to play a role in strategy games, confrontation drills, mission plan evaluation, and other fields.
Keywords: deep reinforcement learning  deep Q-network  confrontation drill  simulation training  Unity3D
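
The abstract's central technical point, separating action selection from evaluation when computing the target Q value, can be illustrated with the standard Double DQN target: y = r + gamma * Q_target(s', argmax_a Q_online(s', a)). The sketch below is not the authors' implementation; the abstract does not give their exact formula, network architecture, or hyperparameters. The function names, the plain-list representation of Q-values, and the gamma value are assumptions made for illustration only. A plain DQN target is included for contrast.

```python
from typing import List, Sequence


def argmax(values: Sequence[float]) -> int:
    """Index of the largest value (first one wins on ties)."""
    return max(range(len(values)), key=lambda i: values[i])


def double_dqn_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_online_next: Sequence[Sequence[float]],
    q_target_next: Sequence[Sequence[float]],
    gamma: float = 0.99,  # assumed discount factor, not taken from the paper
) -> List[float]:
    """Standard Double DQN targets for a batch of transitions.

    q_online_next[i][a]: online-network Q-value of action a in next state i.
    q_target_next[i][a]: target-network Q-value of action a in next state i.
    """
    targets = []
    for r, done, q_on, q_tg in zip(rewards, dones, q_online_next, q_target_next):
        if done:
            # Terminal transition: no bootstrapping from the next state.
            targets.append(r)
            continue
        best_action = argmax(q_on)                     # selection: online network
        targets.append(r + gamma * q_tg[best_action])  # evaluation: target network
    return targets


def dqn_targets(
    rewards: Sequence[float],
    dones: Sequence[bool],
    q_target_next: Sequence[Sequence[float]],
    gamma: float = 0.99,
) -> List[float]:
    """Plain DQN targets: the target network both selects and evaluates."""
    return [
        r if done else r + gamma * max(q_tg)
        for r, done, q_tg in zip(rewards, dones, q_target_next)
    ]


# Hypothetical batch of two transitions with two actions each.
rewards = [1.0, 0.0]
dones = [False, True]
q_online_next = [[1.0, 2.0], [0.5, 0.1]]
q_target_next = [[3.0, 0.5], [4.0, 5.0]]

print(double_dqn_targets(rewards, dones, q_online_next, q_target_next, gamma=0.9))
print(dqn_targets(rewards, dones, q_target_next, gamma=0.9))
```

In the first transition the online network prefers action 1, so the Double DQN target uses the target network's value for that action rather than the target network's own maximum. This is the mechanism usually credited with reducing overestimated Q-values and stabilizing training.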
This article is indexed in VIP (维普) and other databases.