RGB-D Image Saliency Detection Based on Multi-modal Feature-fused Supervision

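Cite this article:	Zhengyi LIU, Quntao DUAN, Song SHI, Peng ZHAO. RGB-D Image Saliency Detection Based on Multi-modal Feature-fused Supervision[J]. Journal of Electronics & Information Technology, 2020, 42(4): 997-1004.

Authors:	Zhengyi LIU, Quntao DUAN, Song SHI, Peng ZHAO

Affiliation:	School of Computer Science and Technology, Anhui University, Hefei 230601

Funding:	Natural Science Foundation of Anhui Province (1908085MF182); National Natural Science Foundation of China (61602004); Natural Science Research Project of Anhui Higher Education Institutions (KJ2019A0034)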

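Abstract:	RGB-D image saliency detection identifies the most visually salient target regions in a pair of RGB and Depth images. Existing two-stream networks treat the multi-modal RGB and Depth data equally and extract features from both in nearly the same way. However, low-level Depth features are noisy and do not represent the image well. This paper therefore proposes an RGB-D image saliency detection network with multi-modal feature-fused supervision. Two independent streams learn the RGB and Depth data separately, a two-stream side supervision module produces saliency maps from the RGB features and from the Depth features at each layer of the network, and a multi-modal feature fusion module then fuses the high-dimensional RGB and Depth information of the last three layers to generate a high-level saliency prediction. The network first generates the RGB and Depth features of each modality layer by layer, from layer 1 to layer 5. Then, from layer 5 down to layer 3, it produces multi-modal fused features, with higher layers guiding lower ones. Next, from layer 2 down to layer 1, it uses the fused features produced at layer 3 to progressively refine the RGB features of the first two layers. The final output is a saliency map that contains low-level RGB information and also incorporates high-level RGB-D multi-modal information. Experiments on three public datasets show that, owing to the two-stream side supervision module and the multi-modal feature fusion module, the proposed network outperforms current mainstream RGB-D saliency detection models and is robust.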
Keywords:	RGB-D saliency detection; convolutional neural network; multi-modal; supervision
Received:	2019-04-29
RGB-D Image Saliency Detection Based on Multi-modal Feature-fused Supervision

Zhengyi LIU, Quntao DUAN, Song SHI, Peng ZHAO. RGB-D Image Saliency Detection Based on Multi-modal Feature-fused Supervision[J]. Journal of Electronics & Information Technology, 2020, 42(4): 997-1004.

Authors:	Zhengyi LIU, Quntao DUAN, Song SHI, Peng ZHAO

Affiliation:	School of Computer Science and Technology, Anhui University, Hefei 230601, China

Abstract:	RGB-D saliency detection identifies the most visually salient target regions in a pair of RGB and Depth images. Existing two-stream networks treat RGB and Depth data equally and extract features from both in almost the same way. However, the Depth features of the lower layers are noisy and therefore characterize the image poorly. This paper proposes an RGB-D saliency detection network with multi-modal feature-fused supervision. Two independent streams learn the RGB and Depth data separately, a two-stream side supervision module obtains a saliency map at each layer of each stream, and a multi-modal feature fusion module then fuses the high-dimensional RGB and Depth information of the last three layers to generate a saliency prediction. Finally, the information from the lower layers is fused in to generate the final saliency map. Experiments on three public datasets show that the proposed network achieves better performance and stronger robustness than current RGB-D saliency detection models.
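
The layer schedule described in the abstract can be laid out as a small data-flow listing. The following is only a sketch under stated assumptions, not the authors' implementation: the operation names (`extract`, `side_saliency`, `fuse`, `refine`, `predict`), the tensor names, and the function `build_dataflow` are hypothetical labels introduced here, and the reading that layer 2 is refined with the layer-3 fused features while layer 1 is refined with the layer-2 result is one interpretation of "progressively". The sketch records which inputs feed which step; it does not compute any features.

```python
from typing import List, Optional, Tuple

# (output name, operation, input names); all names are illustrative only.
Step = Tuple[str, str, Tuple[str, ...]]


def build_dataflow(num_layers: int = 5, first_fused_layer: int = 3) -> List[Step]:
    """List the steps of the two-stream schedule outlined in the abstract."""
    steps: List[Step] = []

    # Bottom-up: each stream extracts its own features from layer 1 to
    # layer 5, and every layer of every stream gets a side-output saliency map.
    for modality in ("rgb", "depth"):
        prev = f"{modality}_image"
        for k in range(1, num_layers + 1):
            feat = f"{modality}{k}"
            steps.append((feat, "extract", (prev,)))
            steps.append((f"side_{feat}", "side_saliency", (feat,)))
            prev = feat

    # Top-down fusion over the last three layers (5 -> 3): RGB and Depth
    # features are fused, guided by the fused result of the layer above.
    guide: Optional[str] = None
    for k in range(num_layers, first_fused_layer - 1, -1):
        inputs = (f"rgb{k}", f"depth{k}") + ((guide,) if guide else ())
        guide = f"fused{k}"
        steps.append((guide, "fuse", inputs))

    # Refinement over layers 2 -> 1: only RGB features are used here, since
    # the abstract describes low-level Depth features as noisy (assumed reading).
    for k in range(first_fused_layer - 1, 0, -1):
        out = f"refined{k}"
        steps.append((out, "refine", (f"rgb{k}", guide)))
        guide = out

    steps.append(("saliency_map", "predict", (guide,)))
    return steps


if __name__ == "__main__":
    for output, op, inputs in build_dataflow():
        print(f"{output} = {op}({', '.join(inputs)})")
```

Running the module prints one line per step, which makes it easy to check that no low-level Depth feature (`depth1`, `depth2`) enters a `fuse` or `refine` step.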
