Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning

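Cite this article:	CHENG Yi, HAO Mimi. Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2021, 57(21): 256-262.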

Authors:	CHENG Yi, HAO Mimi

Affiliation:	School of Control Science and Engineering, Tiangong University, Tianjin 300387, China

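Abstract:	Traditional deep reinforcement learning, when applied to mobile robot path planning in unknown indoor environments, suffers from poor exploration ability and sparse rewards in the environment state space. To address these problems, an improved deep reinforcement learning algorithm based on depth image information is proposed. Depth image information acquired directly from a Kinect vision sensor, together with the target position information, serves as the network input, and the robot's linear and angular velocities are output as the next action command. An improved reward and punishment function is designed, which raises the reward value of the algorithm, optimizes the state space, and alleviates the reward sparsity problem to a certain extent. Simulation results show that the improved algorithm enhances the robot's exploration ability and optimizes the path trajectory, so that the robot avoids obstacles effectively and plans shorter paths. Relative to the DQN algorithm, the average path length is shortened by 21.4% in the simple environment and by 11.3% in the complex environment.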
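Keywords:	path planning; depth image information; Kinect vision sensor; deep reinforcement learning; reward and punishment function; exploration ability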
Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning

CHENG Yi, HAO Mimi. Path Planning for Indoor Mobile Robot with Improved Deep Reinforcement Learning[J]. Computer Engineering and Applications, 2021, 57(21): 256-262.

Authors:	CHENG Yi, HAO Mimi

Affiliation:	School of Control Science and Engineering, Tiangong University, Tianjin 300387, China

Abstract:	An improved deep reinforcement learning algorithm based on depth image information is proposed to solve the problems of poor exploration ability and sparse rewards in the environment state space that traditional deep reinforcement learning faces in path planning for mobile robots in unknown indoor environments. The depth image information obtained directly by the Kinect visual sensor and the target position information are used as the input of the network, and the linear velocity and angular velocity of the robot are output as the next action command. An improved reward and punishment function is designed that increases the reward value of the algorithm and optimizes the state space, which alleviates the reward sparsity problem to a certain extent. The simulation results show that the improved algorithm improves the exploration ability of the robot and optimizes the path trajectory, so that the robot effectively avoids obstacles and plans a shorter path. Compared with the DQN algorithm, the average path length is shortened by 21.4% in the simple environment and by 11.3% in the complex environment.

Keywords:	path planning; depth image information; Kinect visual sensor; deep reinforcement learning; reward and punishment function; exploration ability
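
The abstract does not give the form of the improved reward and punishment function. As a rough illustration only of how a denser reward can ease reward sparsity in navigation tasks, the Python sketch below adds a per-step term proportional to the progress made toward the target, alongside terminal rewards for reaching the goal or colliding. The function name `shaped_reward`, every constant, and the use of the minimum depth reading as a collision test are assumptions made for this example and are not taken from the paper.

```python
import math

# All constants are illustrative assumptions, not values from the paper.
GOAL_REWARD = 100.0         # terminal bonus for reaching the target
COLLISION_PENALTY = -100.0  # terminal penalty for hitting an obstacle
GOAL_RADIUS = 0.3           # metres: closer than this counts as arrival
COLLISION_DISTANCE = 0.2    # metres: smaller depth reading counts as a collision
PROGRESS_GAIN = 10.0        # weight on the distance gained toward the target
STEP_COST = -0.1            # small per-step cost that favours shorter paths


def shaped_reward(prev_pos, pos, goal, min_depth):
    """Return (reward, done) for one transition from prev_pos to pos.

    prev_pos, pos, goal: (x, y) coordinates in metres.
    min_depth: smallest valid reading in the current depth image, in metres.
    """
    dist = math.dist(pos, goal)
    if min_depth < COLLISION_DISTANCE:
        return COLLISION_PENALTY, True
    if dist < GOAL_RADIUS:
        return GOAL_REWARD, True
    # Dense term: positive when the step moved the robot closer to the
    # target, negative when it moved away, so most steps carry a signal.
    progress = math.dist(prev_pos, goal) - dist
    return PROGRESS_GAIN * progress + STEP_COST, False


if __name__ == "__main__":
    goal = (5.0, 0.0)
    print(shaped_reward((0.0, 0.0), (1.0, 0.0), goal, min_depth=2.0))
    print(shaped_reward((1.0, 0.0), (0.0, 0.0), goal, min_depth=2.0))
```

With a purely sparse reward, both example transitions would return zero. Here the step toward the target scores positive and the step away scores negative, which gives the agent feedback long before it reaches the goal.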
This article is indexed in Wanfang Data and other databases.