
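Path Planning for Indoor Mobile Robots Using Improved Deep Reinforcement Learning
Cite this article: CHENG Yi, HAO Mimi. Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2021, 57(21): 256-262.
Authors: CHENG Yi  HAO Mimi
Affiliation: School of Control Science and Engineering, Tiangong University, Tianjin 300387, China
Abstract: To address the poor exploration ability and the sparse rewards in the environment state space that traditional deep reinforcement learning exhibits when planning paths for mobile robots in unknown indoor environments, an improved deep reinforcement learning algorithm based on depth image information is proposed. Depth images acquired directly by a Kinect vision sensor, together with the target position, serve as the network input, and the robot's linear and angular velocities are output as the next action command. An improved reward and punishment function is designed, which raises the reward value obtained by the algorithm, optimizes the state space, and alleviates the sparse-reward problem to some extent. Simulation results show that the improved algorithm enhances the robot's exploration ability and optimizes its trajectory, so that the robot avoids obstacles effectively and plans shorter paths: compared with the DQN algorithm, the average path length is 21.4% shorter in the simple environment and 11.3% shorter in the complex environment.

Keywords: path planning  depth image information  Kinect vision sensor  deep reinforcement learning  reward and punishment function  exploration ability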

Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning
CHENG Yi,HAO Mimi.Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning[J].Computer Engineering and Applications,2021,57(21):256-262.
Authors:CHENG Yi  HAO Mimi
Affiliation:School of Control Science and Engineering, Tiangong University, Tianjin 300387, China
Abstract: An improved deep reinforcement learning algorithm based on depth image information is proposed to address the poor exploration ability and sparse rewards in the environment state space that traditional deep reinforcement learning exhibits in path planning for mobile robots in unknown indoor environments. The depth image information obtained directly from a Kinect visual sensor and the target position information are used as the network input, and the linear and angular velocities of the robot are output as the next action command. An improved reward and punishment function is designed, which increases the reward value of the algorithm, optimizes the state space, and alleviates the reward sparsity problem to a certain extent. Simulation results show that the improved algorithm improves the exploration ability of the robot and optimizes the path trajectory, so that the robot effectively avoids obstacles and plans a shorter path. Compared with the DQN algorithm, the average path length is shortened by 21.4% in the simple environment and by 11.3% in the complex environment.
Keywords:path planning  depth image information  Kinect visual sensor  deep reinforcement learning  reward and punishment function  exploration ability  
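
The abstract does not give the form of the improved reward and punishment function. As a rough illustration of how a reward can be made denser so that sparse-reward problems are eased, the following Python sketch adds a per-step progress term between the terminal goal and collision rewards. It is not the authors' function: the function names, the thresholds, and every constant are assumptions chosen only for illustration.

```python
import math

# All constants are illustrative assumptions, not values from the paper.
GOAL_RADIUS = 0.3        # metres: closer than this counts as reaching the target
COLLISION_DIST = 0.2     # metres: nearer obstacle than this counts as a collision
R_GOAL = 100.0           # terminal reward for reaching the target
R_COLLISION = -100.0     # terminal penalty for a collision
PROGRESS_SCALE = 10.0    # weight on per-step progress toward the target
STEP_PENALTY = 0.1       # small cost per step to discourage wandering


def goal_distance(robot_xy, goal_xy):
    """Euclidean distance from the robot position to the target position."""
    return math.hypot(goal_xy[0] - robot_xy[0], goal_xy[1] - robot_xy[1])


def shaped_reward(prev_goal_dist, goal_dist, min_obstacle_dist):
    """Return (reward, done) for one step of a hypothetical navigation episode.

    prev_goal_dist:    distance to the target before the step
    goal_dist:         distance to the target after the step
    min_obstacle_dist: smallest depth reading after the step
    """
    if min_obstacle_dist < COLLISION_DIST:
        return R_COLLISION, True
    if goal_dist < GOAL_RADIUS:
        return R_GOAL, True
    # Dense term: positive when the robot moved closer to the target,
    # negative when it moved away, so every step carries a learning signal.
    progress = prev_goal_dist - goal_dist
    return PROGRESS_SCALE * progress - STEP_PENALTY, False
```

With a purely terminal reward, most transitions would return zero. The progress term is one common way to give non-terminal steps a signal, and the design actually used in the paper may differ.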
This article is indexed in Wanfang Data and other databases.