
RGB-D Salient Object Detection with Three-Branch Multi-Level Transformer Feature Interaction
Cite this article: Meng Lingbing, Yuan Mengya, Shi Xuehan, Liu Qingqing, Cheng Fei, Li Lingli, He Shufeng. RGB-D Salient Object Detection with Three-Branch Multi-Level Transformer Feature Interaction[J]. Journal of Sichuan University (Engineering Science Edition), 2023, 55(6): 245-256.
Authors: Meng Lingbing  Yuan Mengya  Shi Xuehan  Liu Qingqing  Cheng Fei  Li Lingli  He Shufeng
Affiliations: School of Computer Science and Technology, Anhui Institute of Information Technology; School of Electrical and Electronic Engineering, Anhui Institute of Information Technology; School of Computer Science and Technology, Anhui Institute of Information Technology; School of Computer Science and Technology, Anhui Institute of Information Technology; Hangzhou Dianzi University; School of Computer Science and Technology, Heilongjiang University; Institute of Ecology and Environment, Nanjing Hydraulic Research Institute
Funding: Excellent Youth Project of the Heilongjiang Provincial Natural Science Foundation (YQ2019F016); General Program of the Anhui Provincial Natural Science Foundation (2008085MF201); High-Level Talent Research Start-up Project of Anhui Institute of Information Technology (rckj2021A002); Key Natural Science Research Project of the Anhui Provincial Department of Education (KJ2020a0824)
Abstract: RGB-D salient object detection (SOD) is one of the research tasks in computer vision. Many models achieve good detection results in simple scenes, but they cannot effectively handle complex scenes with multiple objects, low-quality depth maps, or salient objects whose colors are similar to the background. To address these problems, this paper proposes an RGB-D salient object detection model with three-branch multi-level Transformer feature interaction. First, a coordinate attention module is used to suppress noise in the RGB image and the depth map and to extract more salient features for subsequent decoding. Second, a feature fusion module adjusts the three high-level feature maps to the same resolution and feeds them into a Transformer layer, which effectively captures the relationships between distant salient objects and the global information of the whole image. Then, a multi-level feature interaction module is proposed to refine the locations and boundaries of salient objects by making effective use of both high-level and low-level features. Finally, a dense dilated feature refinement module is designed to obtain rich multi-scale features with dense dilated convolutions, effectively coping with variations in the number and size of salient objects. In comparisons with 19 mainstream models on five public benchmark datasets, the experimental results show that the proposed method improves multiple evaluation metrics and raises detection accuracy in the targeted complex scenes; the P-R curves, F-measure curves, and saliency maps also show that it achieves better detection results, producing saliency maps that are more complete, sharper, and closer to the ground truth than those of other models.

Keywords: salient object detection  coordinate attention  Transformer  feature interaction  dense dilated convolution  saliency map
Received: 2022-05-31
Revised: 2022-11-27
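The record above only names the model's building blocks, so the sketches that follow illustrate how such components are commonly implemented; none of them is the authors' code. First, a minimal PyTorch sketch of a coordinate attention block in the spirit of Hou et al. (CVPR 2021), which the abstract uses to suppress noise in the RGB and depth branches. The class name CoordinateAttention and the reduction ratio of 16 are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class CoordinateAttention(nn.Module):
    """Sketch of coordinate attention: pool along H and W separately,
    encode the two positional descriptors jointly, then reweight the input."""
    def __init__(self, channels: int, reduction: int = 16):  # reduction ratio is assumed
        super().__init__()
        mid = max(8, channels // reduction)
        self.pool_h = nn.AdaptiveAvgPool2d((None, 1))  # pool over width  -> (B, C, H, 1)
        self.pool_w = nn.AdaptiveAvgPool2d((1, None))  # pool over height -> (B, C, 1, W)
        self.conv1 = nn.Conv2d(channels, mid, kernel_size=1)
        self.bn1 = nn.BatchNorm2d(mid)
        self.act = nn.ReLU(inplace=True)
        self.conv_h = nn.Conv2d(mid, channels, kernel_size=1)
        self.conv_w = nn.Conv2d(mid, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        x_h = self.pool_h(x)                          # (B, C, H, 1)
        x_w = self.pool_w(x).permute(0, 1, 3, 2)      # (B, C, W, 1)
        y = torch.cat([x_h, x_w], dim=2)              # (B, C, H+W, 1)
        y = self.act(self.bn1(self.conv1(y)))
        y_h, y_w = torch.split(y, [h, w], dim=2)
        a_h = torch.sigmoid(self.conv_h(y_h))                      # attention along H
        a_w = torch.sigmoid(self.conv_w(y_w.permute(0, 1, 3, 2)))  # attention along W
        return x * a_h * a_w

if __name__ == "__main__":
    feat = torch.randn(2, 64, 56, 56)                 # a batch of 64-channel feature maps
    print(CoordinateAttention(64)(feat).shape)        # torch.Size([2, 64, 56, 56])
```

A block like this would typically be applied to the encoder features of both the RGB and the depth branch before the two modalities are fused, which matches the noise-suppression role described in the abstract.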

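The abstract also states that the three high-level feature maps are adjusted to the same resolution and passed through a Transformer layer to capture long-range relations and global context. Below is a hedged sketch of that idea; the channel widths (256/512/1024), embedding dimension, head count, and the class name TransformerFusion are assumptions rather than the paper's specification.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransformerFusion(nn.Module):
    """Sketch: project three high-level feature maps to a common width and
    resolution, then run global self-attention over the flattened tokens."""
    def __init__(self, in_channels=(256, 512, 1024), dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.proj = nn.ModuleList([nn.Conv2d(c, dim, kernel_size=1) for c in in_channels])
        self.encoder = nn.TransformerEncoderLayer(
            d_model=dim, nhead=num_heads, dim_feedforward=4 * dim, batch_first=True)

    def forward(self, f3: torch.Tensor, f4: torch.Tensor, f5: torch.Tensor) -> torch.Tensor:
        # Resize every level to the spatial size of the largest (shallowest) map.
        target = f3.shape[-2:]
        feats = [proj(F.interpolate(f, size=target, mode="bilinear", align_corners=False))
                 for proj, f in zip(self.proj, (f3, f4, f5))]
        fused = sum(feats)                              # (B, dim, H, W)
        b, c, h, w = fused.shape
        tokens = fused.flatten(2).transpose(1, 2)       # (B, H*W, dim)
        tokens = self.encoder(tokens)                   # global self-attention
        return tokens.transpose(1, 2).reshape(b, c, h, w)
```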
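Finally, the dense dilated feature refinement module is described only as using dense dilated convolutions to gather rich multi-scale features. The sketch below shows one common way to combine dense connectivity with increasing dilation rates; the dilation schedule (1, 2, 4, 8) and the class name DenseDilatedRefinement are assumptions, not the authors' design.

```python
import torch
import torch.nn as nn

class DenseDilatedRefinement(nn.Module):
    """Sketch of dense dilated refinement: each branch sees the input plus all
    previous branch outputs, and a growing dilation rate enlarges the receptive
    field without reducing resolution."""
    def __init__(self, channels: int, dilations=(1, 2, 4, 8)):  # dilation rates are assumed
        super().__init__()
        self.branches = nn.ModuleList()
        in_ch = channels
        for d in dilations:
            self.branches.append(nn.Sequential(
                nn.Conv2d(in_ch, channels, kernel_size=3, padding=d, dilation=d),
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
            ))
            in_ch += channels  # dense connectivity: later branches take all earlier outputs

        self.fuse = nn.Conv2d(in_ch, channels, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats = [x]
        for branch in self.branches:
            feats.append(branch(torch.cat(feats, dim=1)))
        return self.fuse(torch.cat(feats, dim=1)) + x   # residual fusion of all scales
```

Mixing several dilation rates in this densely connected way is a standard route to handling large variations in the number and size of salient objects, which is the role the abstract assigns to this module.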