
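Transformer-based Cross Scale Interactive Learning for Camouflage Object Detection
Cite this article: LI Jian-Dong, WANG Yan, QU Hai-Cheng. Transformer-based Cross Scale Interactive Learning for Camouflage Object Detection [J]. Computer Systems & Applications, 2024, 33(2): 115-124.
Authors: LI Jian-Dong  WANG Yan  QU Hai-Cheng
Affiliation: School of Software, Liaoning Technical University, Huludao 125105, China
Funding: General Program of the National Natural Science Foundation of China (42271409); Basic Scientific Research Project of Higher Education Institutions of Liaoning Province (LIKMZ20220699)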
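Abstract: Camouflaged object detection (COD) aims to accurately and efficiently detect camouflaged objects that closely resemble their background. COD methods can support species protection, medical disease detection, and military surveillance, and therefore have high practical value. In recent years, applying deep learning to COD has become an emerging research direction. However, most existing COD algorithms use a convolutional neural network (CNN) as the feature extraction network and, when combining multi-level features, overlook how feature representation and fusion methods affect detection performance. To address the weak ability of CNN-based COD models to extract global features of the target, this paper proposes a Transformer-based cross-scale interactive learning method for COD. First, a dual-branch feature fusion module fuses features that have passed through iterative attention, so that high- and low-level features are combined more effectively. Second, a multi-scale global context information module is introduced to fully connect context information and enhance the features. Finally, a multi-channel pooling module is proposed to focus on the local information of the target and improve detection accuracy. Experimental results on the CHAMELEON, CAMO, and COD10K datasets show that, compared with current mainstream COD algorithms, the proposed method produces clearer prediction maps and achieves higher accuracy.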

Keywords: deep learning  camouflaged object detection  visual feature pyramid  convolutional neural network  feature fusion
Received: 2023/8/6
Revised: 2023/9/9

Transformer-based Cross Scale Interactive Learning for Camouflage Object Detection
LI Jian-Dong, WANG Yan, QU Hai-Cheng. Transformer-based Cross Scale Interactive Learning for Camouflage Object Detection [J]. Computer Systems & Applications, 2024, 33(2): 115-124.
Authors: LI Jian-Dong  WANG Yan  QU Hai-Cheng
Affiliation: School of Software, Liaoning Technical University, Huludao 125105, China
Abstract: Camouflaged object detection (COD) aims to accurately and efficiently detect objects that are highly similar to their background. COD methods can assist in species protection, medical patient detection, and military monitoring, and thus have high practical value. In recent years, deep learning has become an emerging research direction for COD. However, most existing COD algorithms use a convolutional neural network (CNN) as the feature extraction network and, when combining multi-level features, ignore the influence of feature representation and fusion methods on detection performance. Because CNN-based COD models are weak at extracting the global features of the detected object, this study proposes a Transformer-based cross-scale interactive learning method for COD. First, a dual-branch feature fusion module fuses features that have undergone iterative attention, so that high- and low-level features are combined more effectively. Second, a multi-scale global context information module is introduced to fully exploit context information and enhance the features. Finally, a multi-channel pooling module is proposed to focus on the local information of the detected object and improve detection accuracy. Experimental results on the CHAMELEON, CAMO, and COD10K datasets show that, compared with current mainstream COD algorithms, the proposed method generates clearer prediction maps and achieves higher accuracy.
Keywords: deep learning  camouflage object detection (COD)  visual feature pyramid  convolutional neural network (CNN)  feature fusion
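
The abstract says that high- and low-level features are fused after passing through iterative attention, but it does not describe the internals of the dual-branch feature fusion module. The following is only a minimal sketch of the general idea, under stated assumptions: attention is reduced to one sigmoid weight per channel computed from global average pooling, each feature map is a list of channels holding flat lists of values, and the names `channel_attention`, `fuse_once`, and `iterative_fusion` are hypothetical. It is not the authors' module.

```python
import math


def sigmoid(value):
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-value))


def channel_attention(feature):
    """Assumed attention: one weight per channel from global average pooling."""
    return [sigmoid(sum(channel) / len(channel)) for channel in feature]


def fuse_once(low, high, weights):
    """Blend each channel as w * low + (1 - w) * high."""
    return [
        [w * l + (1.0 - w) * h for l, h in zip(low_ch, high_ch)]
        for low_ch, high_ch, w in zip(low, high, weights)
    ]


def iterative_fusion(low, high, steps=2):
    """Recompute attention from the current fused map, then re-blend the inputs."""
    # Initial integration: plain element-wise sum of the two branches.
    fused = [
        [l + h for l, h in zip(low_ch, high_ch)]
        for low_ch, high_ch in zip(low, high)
    ]
    for _ in range(steps):
        weights = channel_attention(fused)
        fused = fuse_once(low, high, weights)
    return fused


# Toy inputs: 2 channels, 2 spatial positions each (made-up values).
low_level = [[1.0, 1.0], [0.0, 0.0]]
high_level = [[0.0, 0.0], [1.0, 1.0]]
fused_map = iterative_fusion(low_level, high_level)
```

Because every blending step is a convex combination, each fused value stays between the corresponding low- and high-level values. A real model would learn the attention weights with convolutional or Transformer layers rather than derive them from a fixed pooling rule.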
