
6D pose estimation incorporating attentional features for occluded objects
Cite this article: MA Kangzhe, PI Jiatian, XIONG Zhoubing, LYU Jia. 6D pose estimation incorporating attentional features for occluded objects[J]. Journal of Computer Applications, 2022, 42(12): 3715-3722.
Authors: Kangzhe MA  Jiatian PI  Zhoubing XIONG  Jia LYU
Affiliations: College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
National Center for Applied Mathematics in Chongqing (Chongqing Normal University), Chongqing 401331, China
Beijing Institute of Technology Chongqing Innovation Center, Chongqing 401120, China
Funding: Key Project of the Science and Technology Research Program of Chongqing Municipal Education Commission (KJZD-K202114802); Youth Project of the Science and Technology Research Program of Chongqing Municipal Education Commission (KJQN201800116); Innovation Research Group Project of Chongqing Universities (CXQT20015); Chongqing Graduate Scientific Research and Innovation Project of 2021 (CYS21272)
Abstract: In robotic-arm visual grasping, existing algorithms struggle to estimate the pose of a target object in real time, accurately and robustly under complex backgrounds, insufficient illumination, occlusion and similar conditions. To address these problems, a keypoint-based object 6D pose estimation network that fuses attention features was proposed. First, a Convolutional Block Attention Module (CBAM), which focuses on channel and spatial information, was introduced at the skip-connection stage, so that shallow features from the encoding stage were effectively fused with deep features from the decoding stage, enhancing the spatial-domain information and precise positional channel information of the feature maps. Second, a normalized loss function was used to regress an attention map for each keypoint in a weakly supervised way, and each attention map served as the weight score of the keypoint offset at the corresponding pixel position. Finally, the keypoint coordinates were obtained by weighted accumulation. Experimental results show that the proposed network reaches 91.3% and 46.3% on the ADD(-S) metric on the LINEMOD and Occlusion LINEMOD datasets respectively, improvements of 5.0 and 5.5 percentage points over the keypoint-based Pixel-wise Voting Network (PVNet), verifying that the proposed network is more robust in occlusion scenes.
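The channel-then-spatial gating idea behind the CBAM fusion described in the abstract can be illustrated with a minimal NumPy sketch. This is a deliberate simplification, not the paper's implementation: the shared MLP of the channel branch and the 7×7 convolution of the spatial branch in the full CBAM are replaced here by direct pooling, and the function name `cbam_weights` is our own.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cbam_weights(feat):
    """Simplified CBAM-style attention on a (C, H, W) feature map.

    Channel attention: sigmoid of the global average- and max-pooled
    channel descriptors (the shared MLP of full CBAM is omitted).
    Spatial attention: sigmoid of the channel-wise mean and max
    (the 7x7 convolution of full CBAM is omitted).
    Returns the feature map reweighted channel-first, then spatially.
    """
    c, h, w = feat.shape
    # Channel attention gate, shape (C, 1, 1)
    avg_c = feat.mean(axis=(1, 2))
    max_c = feat.max(axis=(1, 2))
    ch_att = sigmoid(avg_c + max_c).reshape(c, 1, 1)
    feat = feat * ch_att
    # Spatial attention gate, shape (1, H, W)
    avg_s = feat.mean(axis=0)
    max_s = feat.max(axis=0)
    sp_att = sigmoid(avg_s + max_s)[None, ...]
    return feat * sp_att
```

Because both gates are sigmoid outputs in (0, 1), the module can only attenuate, never amplify, each feature value; in the network it is applied to the encoder features before they are concatenated with the decoder features.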

Keywords: object 6D pose estimation  attention mechanism  Convolutional Block Attention Module (CBAM)  occluded object  keypoint
Received: 2021-10-28
Revised: 2021-12-06
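The final aggregation step described in the abstract (each keypoint's attention map used as per-pixel weight scores on the predicted offsets, then summed) can be sketched as follows. The function name and array shapes are illustrative assumptions, not the paper's code:

```python
import numpy as np

def keypoint_from_attention(offsets, attention):
    """Aggregate per-pixel keypoint votes with attention weights.

    offsets:   (H, W, 2) array; offsets[y, x] is the predicted (dx, dy)
               from pixel (x, y) to the keypoint.
    attention: (H, W) array of non-negative attention scores for this
               keypoint (one map is regressed per keypoint).
    Returns the (x, y) keypoint coordinate as the attention-weighted
    sum of the per-pixel estimates.
    """
    h, w = attention.shape
    # Pixel coordinate grid: grid[y, x] = (x, y)
    xs, ys = np.meshgrid(np.arange(w), np.arange(h))
    grid = np.stack([xs, ys], axis=-1).astype(float)
    # Normalize the attention map so the weights sum to 1
    weights = attention / attention.sum()
    votes = grid + offsets  # each pixel's estimate of the keypoint
    return (weights[..., None] * votes).sum(axis=(0, 1))
```

If every pixel's offset points exactly at the keypoint, the weighted sum recovers it regardless of how the attention mass is distributed; with noisy offsets, the attention map downweights unreliable (e.g. occluded) pixels.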

