Multi-Agent Path Planning Based on Reinforcement Learning

Cite this article:	Wang Yiran, Jing Xiaochuan, Tian Tao, Sun Yunqian, Cong Shuaijun. Multi-Agent Path Planning Based on Reinforcement Learning[J]. Computer Applications and Software, 2019, 36(8): 165-171.

Authors:	Wang Yiran, Jing Xiaochuan, Tian Tao, Sun Yunqian, Cong Shuaijun

Affiliation:	China Academy of Aerospace System Science and Engineering, Beijing 100048, China (all five authors)

Funding:	Applied R&D Fund of the Department of Science and Technology of Guangdong Province

Abstract:	This paper studies path planning for multiple agents under complex tasks and proposes a multi-agent path planning method based on reinforcement learning. The method uses a model-free, online Q-learning algorithm: the agents repeat an "exploration-learning-utilization" cycle, accumulate historical experience to evaluate action strategies and refine their decisions, and thereby complete the multi-agent path planning task in an unknown environment. Simulation results show that, compared with a single-agent path planning method based on reinforcement learning, the proposed method reduces the total number of exploration steps by 17.4% and finds the shortest path to the target point, while the agents avoid colliding with one another and successfully avoid obstacles.

Keywords:	Multi-agent; Reinforcement learning; Path planning; Q-learning algorithm; Unknown environment
This article is indexed in databases including VIP and Wanfang Data.
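The abstract names the ingredients of the method (model-free online Q-learning, a repeated explore-learn-use cycle, collision and obstacle avoidance) but this record does not include the algorithm itself. The following is only a minimal sketch of how such a loop might look for two agents with independent tabular Q-learning. It is not the authors' implementation: the 4x4 grid, the obstacle cells, the reward values, the hyperparameters and every function name are assumptions made for illustration.

```python
import random

# Illustrative assumptions (not from the paper): grid layout, rewards,
# learning rate, discount factor and exploration rate.
SIZE = 4
OBSTACLES = {(1, 1), (2, 2)}
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
STARTS = [(0, 0), (3, 0)]   # one start cell per agent
GOALS = [(3, 3), (0, 3)]    # one goal cell per agent


def move(pos, action):
    """Return (next_pos, blocked); a blocked move leaves the agent in place."""
    r, c = pos[0] + action[0], pos[1] + action[1]
    if not (0 <= r < SIZE and 0 <= c < SIZE) or (r, c) in OBSTACLES:
        return pos, True
    return (r, c), False


def new_q_table():
    """Q-values for every cell and action, initialised to zero."""
    return {(r, c): [0.0] * len(ACTIONS)
            for r in range(SIZE) for c in range(SIZE)}


def q_update(q, s, a, reward, s_next, alpha=0.5, gamma=0.9):
    """Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))."""
    q[s][a] += alpha * (reward + gamma * max(q[s_next]) - q[s][a])


def greedy(q, s):
    """Use what was learned: index of the action with the highest Q-value."""
    return max(range(len(ACTIONS)), key=lambda a: q[s][a])


def joint_step(positions, actions):
    """Advance both agents one step; return (new_positions, rewards)."""
    proposals, rewards = [], []
    for i, (pos, a) in enumerate(zip(positions, actions)):
        if pos == GOALS[i]:            # an agent that has arrived waits
            proposals.append(pos)
            rewards.append(0.0)
            continue
        nxt, blocked = move(pos, ACTIONS[a])
        proposals.append(nxt)
        rewards.append(-5.0 if blocked else 10.0 if nxt == GOALS[i] else -1.0)
    if proposals[0] == proposals[1]:   # same target cell: both stay, both penalised
        return list(positions), [-10.0, -10.0]
    return proposals, rewards


def train(episodes=2000, max_steps=50, epsilon=0.2, seed=0):
    """Each agent learns its own Q-table from its own position and reward."""
    rng = random.Random(seed)
    tables = [new_q_table() for _ in STARTS]
    for _ in range(episodes):
        positions = list(STARTS)
        for _ in range(max_steps):
            # Explore with probability epsilon, otherwise act greedily
            actions = [rng.randrange(len(ACTIONS)) if rng.random() < epsilon
                       else greedy(q, s) for q, s in zip(tables, positions)]
            nxt, rewards = joint_step(positions, actions)
            for i, q in enumerate(tables):
                if positions[i] != GOALS[i]:   # arrived agents stop learning
                    q_update(q, positions[i], actions[i], rewards[i], nxt[i])
            positions = nxt
            if positions == GOALS:
                break
    return tables


def greedy_paths(tables, max_steps=50):
    """Roll out the learned greedy policies; one position list per agent."""
    positions = list(STARTS)
    paths = [[p] for p in positions]
    for _ in range(max_steps):
        if positions == GOALS:
            break
        actions = [greedy(q, s) for q, s in zip(tables, positions)]
        positions, _ = joint_step(positions, actions)
        for path, p in zip(paths, positions):
            path.append(p)
    return paths


if __name__ == "__main__":
    for i, path in enumerate(greedy_paths(train())):
        print(f"agent {i}: {path}")
```

In this sketch each agent observes only its own cell, so the other agent appears as part of the environment, and a collision is modelled only as two agents choosing the same target cell. Whether the paper uses a shared or per-agent Q-table, a joint state, or a different reward design cannot be determined from the abstract.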
