A learning strategy from demonstration for the operation tasks of space manipulators (面向空间机械臂操作任务的模仿学习策略)

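Cite this article:	LI Chongyang (李重阳), JIANG Zainan (蒋再男), LIU Hong (刘宏), CAI Hegao (蔡鹤皋). A learning strategy from demonstration for the operation tasks of space manipulators [J]. Journal of Harbin Institute of Technology (哈尔滨工业大学学报), 2020, 52(6): 111-118.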

Authors:	李重阳 (LI Chongyang), 蒋再男 (JIANG Zainan), 刘宏 (LIU Hong), 蔡鹤皋 (CAI Hegao)

Affiliation:	State Key Laboratory of Robotics and System (Harbin Institute of Technology), Harbin 150001 (all four authors)

Funding:	National Natural Science Foundation of China (91848202)

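Abstract:	To improve the ability of space manipulators to overcome spatial disturbances and to reduce joint torque fluctuations and energy consumption, a learning-from-demonstration strategy for space manipulators based on dynamics constraints is proposed. The strategy has two phases. Phase 1 is Gaussian-process-based learning from demonstration: a Gaussian process is applied to kinesthetic demonstrations to build a motion model of the current task, and the model then generates the desired trajectory distribution for the current task according to the current environment. Phase 2 is the design of a dynamics-constraint-based controller, which takes the desired trajectory distribution from phase 1 as input and outputs the desired joint torques; it generates smoother joint control torques while ensuring that the trajectory meets the task requirements. The strategy was validated in an experiment in which the Tiangong-2 space manipulator operated a power tool on orbit to locate screws. The results show that, compared with traditional learning from demonstration combined with computed torque control, the proposed strategy can reduce the peak-to-peak torque fluctuation of the heavily loaded joint by 45%, the number of peaks by 40%, and the energy consumption by 31%, and that the joint torques, accelerations, and velocities are smoother. The strategy not only overcomes the adverse effect of changes in the position of the environment but also reduces joint torque fluctuation and energy consumption and improves the smoothness of manipulator motion, which is significant for on-orbit servicing applications of high-performance space manipulators.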
Keywords:	space manipulator; learning from demonstration; Gaussian process; linear quadratic tracking; Mahalanobis norm
Received:	2020/4/8
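
Phase 1 of the abstract turns demonstrations into a trajectory distribution, that is, a mean and an uncertainty at each point of the motion. The following is a minimal sketch of that idea using plain Gaussian process regression on a single joint. It is not the authors' implementation: the squared-exponential kernel, the hyperparameters (`length`, `signal`, `noise`), and the synthetic demonstration data are all assumptions made for illustration.

```python
import math

def rbf(a, b, length=0.3, signal=1.0):
    """Squared-exponential kernel between two scalar inputs (assumed kernel)."""
    return signal ** 2 * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular L with L L^T = A, for symmetric positive-definite A."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_lower(L, b):
    """Forward substitution: solve L x = b."""
    x = []
    for i in range(len(b)):
        x.append((b[i] - sum(L[i][k] * x[k] for k in range(i))) / L[i][i])
    return x

def solve_upper_t(L, b):
    """Back substitution: solve L^T x = b."""
    n = len(b)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (b[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(ts, ys, noise=1e-2):
    """Factor K + noise^2 I and precompute alpha = (K + noise^2 I)^-1 y."""
    K = [[rbf(a, b) + (noise ** 2 if i == j else 0.0)
          for j, b in enumerate(ts)] for i, a in enumerate(ts)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, ys))
    return ts, L, alpha

def gp_predict(model, t):
    """Posterior mean and variance of the latent trajectory at phase t."""
    ts, L, alpha = model
    k = [rbf(t, a) for a in ts]
    mean = sum(ki * ai for ki, ai in zip(k, alpha))
    v = solve_lower(L, k)
    var = rbf(t, t) - sum(vi * vi for vi in v)
    return mean, max(var, 0.0)  # clamp tiny negative round-off

# Hypothetical demonstration: one joint angle sampled over a phase in [0, 1].
ts = [i / 10 for i in range(11)]
ys = [math.sin(0.5 * math.pi * t) for t in ts]

model = gp_fit(ts, ys)
mean_train, var_train = gp_predict(model, ts[5])  # at a demonstrated phase
mean_mid, var_mid = gp_predict(model, 0.55)       # between demonstrated phases
mean_far, var_far = gp_predict(model, 5.0)        # far from any demonstration
```

The predicted variance is small near the demonstrated phases and returns to the prior variance far from them. A downstream controller can exploit this by tracking tightly only where the demonstrations agree.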
A learning strategy from demonstration for the operation tasks of space manipulators

LI Chongyang, JIANG Zainan, LIU Hong, CAI Hegao. A learning strategy from demonstration for the operation tasks of space manipulators [J]. Journal of Harbin Institute of Technology, 2020, 52(6): 111-118.

Authors:	LI Chongyang, JIANG Zainan, LIU Hong, CAI Hegao

Affiliation:	State Key Laboratory of Robotics and System (Harbin Institute of Technology), Harbin 150001, China

Abstract:	To improve the ability of space manipulators to overcome spatial disturbances and to reduce joint torque fluctuations and energy consumption during operation, a learning-from-demonstration strategy based on dynamics constraints is proposed. The strategy is divided into two phases. Phase 1 is Gaussian-process-based learning from demonstration, in which a motion model of the task is obtained by applying a Gaussian process to kinesthetic demonstrations; the model then generates the desired trajectory distribution of the current task according to the environment. Phase 2 is the design of a dynamics-constraint-based controller. The input of this controller is the trajectory distribution from phase 1, and its outputs are the desired joint torques. The controller generates smoother joint control torques while ensuring that the manipulator trajectory meets the task requirements. Finally, the strategy is verified on an on-orbit bolt-locating task with the Tiangong-2 space manipulator. Compared with traditional learning from demonstration combined with a computed torque controller, the peak-to-peak joint torque of the heavily loaded joint is reduced by 45%, the number of peaks by 40%, and the energy consumption by 31%. In addition, the joint torques, accelerations, and velocities are much smoother.

Keywords:	space manipulator; learning from demonstration; Gaussian process; linear quadratic tracking; Mahalanobis norm
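
The keywords "linear quadratic tracking" and "Mahalanobis norm" suggest how Phase 2 can use a trajectory distribution: tracking error is weighted by the inverse variance, so the controller follows the mean closely where the demonstrations agree and spends less effort where they vary. The sketch below shows this on a toy problem and is not the authors' controller. It replaces the manipulator dynamics with an assumed scalar integrator `x[t+1] = a*x[t] + b*u[t]`, and the reference `mu`, spread `sigma`, and effort weight `r` are hypothetical values.

```python
import math

def lqt_scalar(mu, sigma, r, a=1.0, b=0.02, x0=0.0):
    """Scalar linear quadratic tracking by backward Riccati recursion.

    Minimises sum_t ((x_t - mu_t) / sigma_t)^2 + r * sum_t u_t^2
    subject to x_{t+1} = a * x_t + b * u_t (assumed toy dynamics).
    In one dimension the Mahalanobis distance is |x - mu| / sigma.
    """
    T = len(mu) - 1
    q = [1.0 / s ** 2 for s in sigma]   # precision weights
    p, s = q[T], q[T] * mu[T]           # terminal value-function terms
    gains = [None] * T
    for t in reversed(range(T)):
        d = r + b * b * p
        gains[t] = (a * b * p / d, b * s / d)  # feedback, feedforward
        s = q[t] * mu[t] + a * r * s / d
        p = q[t] + a * a * r * p / d
    xs, us = [x0], []
    for K, kff in gains:                # forward rollout
        u = -K * xs[-1] + kff
        us.append(u)
        xs.append(a * xs[-1] + b * u)
    return xs, us

def cost(xs, us, mu, sigma, r):
    """Mahalanobis tracking cost plus control-effort cost."""
    track = sum(((x - m) / s) ** 2 for x, m, s in zip(xs, mu, sigma))
    return track + r * sum(u * u for u in us)

# Hypothetical reference: smooth move from 0 to 1 over T steps, with large
# spread mid-motion and small spread at the start and at the goal.
T, r = 50, 1e-3
mu = [0.5 * (1.0 - math.cos(math.pi * t / T)) for t in range(T + 1)]
sigma = [0.01 + 0.2 * math.sin(math.pi * t / T) for t in range(T + 1)]

xs, us = lqt_scalar(mu, sigma, r)
cost_opt = cost(xs, us, mu, sigma, r)
cost_zero = cost([0.0] * (T + 1), [0.0] * T, mu, sigma, r)  # no motion
final_error = abs(xs[-1] - mu[-1])
```

Because the spread at the goal is small, the terminal weight is large and the rollout ends close to the reference. The larger mid-motion spread lets the controller trade tracking accuracy for lower effort there.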
This article is indexed in Wanfang Data and other databases.