A Self-supervised Learning Method of Target Pushing-Grasping Skills Based on Affordance Map


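Cite this article:	WU Peiliang, LIU Ruijun, MAO Bingyi, SHI Haoyang, CHEN Wenbai, GAO Guowei. A Self-supervised Learning Method of Target Pushing-Grasping Skills Based on Affordance Map[J]. Robot, 2022, 44(4): 385-398. DOI: 10.13973/j.cnki.robot.210265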

Authors:	WU Peiliang, LIU Ruijun, MAO Bingyi, SHI Haoyang, CHEN Wenbai, GAO Guowei

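Affiliations:	1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, Hebei, China; 2. The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, Hebei, China; 3. School of Automation, Beijing Information Science & Technology University, Beijing 100192, China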

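Funding:	National Key R&D Program of China (2018YFB1308300); Regional Joint Fund of the National Natural Science Foundation of China (U20A20167); Beijing Natural Science Foundation (4202026); Natural Science Foundation of Hebei Province (F202103079)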

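Abstract:	This paper proposes a self-supervised method for learning target pushing-grasping skills based on affordance maps. First, the problem of self-supervised robot skill learning for target pushing-grasping tasks in cluttered environments is formulated: the decision process of the robot's pushing and grasping operations in the workspace is defined as a new Markov decision process (MDP), and the vision mechanism module and the action mechanism module are trained separately. Second, in the vision mechanism module, adaptive parameters are combined with a group split attention module to design the feature extraction network RGSA-Net, which generates an affordance map from the raw state image fed to the network and so provides a sound basis for target pushing-grasping operations. Then, in the action mechanism module, DQAC, a self-supervised deep reinforcement learning training framework based on the actor-critic framework, is built. After the robot executes an action according to the affordance map, this framework evaluates the action, which improves the coordination between pushing and grasping. Finally, comparative experiments and analysis verify the effectiveness of the proposed method.
Keywords:	pushing-grasping skill learning; affordance map; self-supervised learning; adaptive parameter; split attention mechanism
Received:	2021-06-18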
A Self-supervised Learning Method of Target Pushing-Grasping Skills Based on Affordance Map

WU Peiliang, LIU Ruijun, MAO Bingyi, SHI Haoyang, CHEN Wenbai, GAO Guowei. A Self-supervised Learning Method of Target Pushing-Grasping Skills Based on Affordance Map[J]. Robot, 2022, 44(4): 385-398. DOI: 10.13973/j.cnki.robot.210265

Authors:	WU Peiliang, LIU Ruijun, MAO Bingyi, SHI Haoyang, CHEN Wenbai, GAO Guowei

Affiliation:	1. School of Information Science and Engineering, Yanshan University, Qinhuangdao 066004, China; 2. The Key Laboratory for Computer Virtual Technology and System Integration of Hebei Province, Qinhuangdao 066004, China; 3. School of Automation, Beijing Information Science & Technology University, Beijing 100192, China

Abstract:	A self-supervised learning method for target pushing-grasping skills based on affordance maps is presented. First, the self-supervised learning problem of a robot acquiring target pushing-grasping skills in a cluttered environment is formulated. The decision process of the robot's pushing and grasping operations in the workspace is defined as a new Markov decision process (MDP), in which the vision mechanism module and the action mechanism module are trained separately. Second, adaptive parameters and a group split attention module are combined in the vision mechanism module to design the feature extraction network RGSA-Net, which generates an affordance map from the raw state image fed to the network and provides a sound basis for the target pushing-grasping operation. Then, in the action mechanism module, a self-supervised deep reinforcement learning training framework, DQAC, is built on the actor-critic framework. After the robot performs an action according to the affordance map, the DQAC framework evaluates the action, which improves the coordination between pushing and grasping. Finally, comparative experiments and analysis are carried out to verify the effectiveness of the proposed method.

Keywords:	pushing-grasping skill learning; affordance map; self-supervised learning; adaptive parameter; split attention mechanism
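
Illustrative sketch:	The abstract states that the robot acts according to an affordance map but does not say how an action is read off the map. The Python sketch below shows one generic possibility, assuming (hypothetically) that each primitive has a score map indexed by rotation, row, and column, and that the highest-scoring entry is executed. The function name `select_action`, the map layout, and the greedy rule are assumptions made for illustration; they are not taken from RGSA-Net or DQAC.

```python
def select_action(push_map, grasp_map):
    """Greedily pick the best-scoring action from two affordance maps.

    Each map is a nested list indexed as [rotation][row][col]; this
    layout is a hypothetical choice for the sketch, not the paper's.
    Returns (primitive, rotation, row, col, score).
    """
    maps = {"push": push_map, "grasp": grasp_map}
    best = None
    for primitive, score_map in maps.items():
        for rotation, plane in enumerate(score_map):
            for row, line in enumerate(plane):
                for col, score in enumerate(line):
                    # Strict comparison keeps the first entry on ties.
                    if best is None or score > best[0]:
                        best = (score, primitive, rotation, row, col)
    if best is None:
        raise ValueError("affordance maps are empty")
    score, primitive, rotation, row, col = best
    return primitive, rotation, row, col, score


# Two rotations over a 2x2 workspace grid, with made-up scores.
push_scores = [[[0.1, 0.2], [0.0, 0.3]], [[0.0, 0.0], [0.5, 0.1]]]
grasp_scores = [[[0.2, 0.1], [0.4, 0.0]], [[0.0, 0.9], [0.1, 0.2]]]
choice = select_action(push_scores, grasp_scores)
```

With these made-up scores, the largest entry lies in the grasp map at rotation 1, row 0, column 1, so a grasp is selected. A learned policy would normally add exploration during training instead of always acting greedily.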
