
Semantic Segmentation of Street View Image Based on Attention and Multi-scale Features
Citation: HONG Jun, LIU Xiao-Nan, LIU Zhen-Yu. Semantic Segmentation of Street View Image Based on Attention and Multi-scale Features[J]. Computer Systems & Applications, 2024, 33(5): 94-102.
Authors: HONG Jun, LIU Xiao-Nan, LIU Zhen-Yu
Affiliation: School of Information Science and Engineering, Shenyang University of Technology, Shenyang 110870, China
Fund project: Applied Basic Research Plan of Liaoning Province (2023JH2/101300225)
Abstract: To address the low segmentation accuracy on objects of widely varying scales and the weak modeling of image context exhibited by the traditional U-Net in street scene semantic segmentation, this study proposes AS-UNet, an improved U-Net that achieves accurate segmentation of street scene images. First, the spatial and channel squeeze & excitation (scSE) attention module is integrated into U-Net to guide the convolutional network, along both the channel and spatial dimensions, toward the semantic categories relevant to the segmentation task, so that more effective semantic information is extracted. Second, to capture global context, multi-scale feature maps are aggregated for feature enhancement by embedding the atrous spatial pyramid pooling (ASPP) multi-scale feature fusion module into U-Net. Finally, the cross-entropy loss and the Dice loss are combined to counter the class imbalance among street scene targets and further improve segmentation accuracy. Experiments show that, compared with the traditional U-Net, AS-UNet raises the mean intersection over union (MIoU) by 3.9% on the Cityscapes dataset and by 3.0% on the CamVid dataset, markedly improving the segmentation of street scene images.
Keywords: image semantic segmentation; street scene; U-Net; attention mechanism; multi-scale feature fusion
Received: December 6, 2023
Revised: January 9, 2024
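The three components named in the abstract can be made more concrete with short, self-contained PyTorch sketches. The first is the scSE attention block: a channel squeeze & excitation branch and a spatial squeeze & excitation branch that each gate the same feature map. This is a generic illustration of the mechanism, not the authors' implementation; the reduction ratio of 16 and the additive fusion of the two gates are common choices assumed here.

```python
# A minimal scSE (spatial and channel squeeze & excitation) block in PyTorch.
import torch
import torch.nn as nn


class SCSEBlock(nn.Module):
    """Concurrent channel and spatial squeeze & excitation.

    Illustrative sketch only: the reduction ratio and the additive fusion
    of the two gates are assumptions, not values taken from the paper.
    """

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        hidden = max(channels // reduction, 1)
        # Channel gate (cSE): global average pooling -> bottleneck -> per-channel weights.
        self.cse = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, hidden, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(hidden, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial gate (sSE): 1x1 convolution -> per-pixel weights.
        self.sse = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Recalibrate the feature map along both dimensions and merge the branches.
        return x * self.cse(x) + x * self.sse(x)
```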

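The second component is the ASPP multi-scale fusion module that AS-UNet embeds to aggregate global context. The sketch below follows the standard ASPP layout of parallel dilated convolutions plus an image-level pooling branch; the branch width and the dilation rates (1, 6, 12, 18) are the common DeepLab-style choice, assumed rather than quoted from the paper.

```python
# A minimal ASPP (atrous spatial pyramid pooling) module in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class ASPP(nn.Module):
    """Parallel dilated convolutions plus an image-level pooling branch,
    concatenated and projected back to a single feature map.
    Illustrative sketch only; the rates are assumptions.
    """

    def __init__(self, in_channels: int, out_channels: int, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Sequential(
                nn.Conv2d(in_channels, out_channels,
                          kernel_size=1 if r == 1 else 3,
                          padding=0 if r == 1 else r,
                          dilation=r, bias=False),
                nn.BatchNorm2d(out_channels),
                nn.ReLU(inplace=True),
            )
            for r in rates
        ])
        # Image-level branch: global pooling captures whole-image context.
        self.image_pool = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(in_channels, out_channels, kernel_size=1, bias=False),
            nn.ReLU(inplace=True),
        )
        # Fuse all branches back to out_channels.
        self.project = nn.Sequential(
            nn.Conv2d(out_channels * (len(rates) + 1), out_channels, kernel_size=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h, w = x.shape[-2:]
        feats = [branch(x) for branch in self.branches]
        # Upsample the pooled context back to the input resolution before concatenation.
        pooled = F.interpolate(self.image_pool(x), size=(h, w),
                               mode="bilinear", align_corners=False)
        return self.project(torch.cat(feats + [pooled], dim=1))
```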
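The third component is the combined loss used against class imbalance: cross-entropy plus Dice. A minimal sketch, assuming equal weights for the two terms and a small smoothing constant, neither of which is reported in the abstract:

```python
# A minimal combined cross-entropy + Dice loss in PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F


class CEDiceLoss(nn.Module):
    """Weighted sum of pixel-wise cross-entropy and multi-class soft Dice loss.

    Illustrative sketch only: the 1:1 weighting and the smoothing constant
    are assumptions, not values reported in the paper.
    """

    def __init__(self, num_classes: int, ce_weight: float = 1.0,
                 dice_weight: float = 1.0, smooth: float = 1e-6):
        super().__init__()
        self.num_classes = num_classes
        self.ce_weight = ce_weight
        self.dice_weight = dice_weight
        self.smooth = smooth

    def forward(self, logits: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        # logits: (N, C, H, W) raw scores; target: (N, H, W) integer class labels.
        ce = F.cross_entropy(logits, target)

        probs = F.softmax(logits, dim=1)
        one_hot = F.one_hot(target, self.num_classes).permute(0, 3, 1, 2).float()
        dims = (0, 2, 3)  # sum over batch and spatial positions, keep classes
        intersection = (probs * one_hot).sum(dims)
        cardinality = probs.sum(dims) + one_hot.sum(dims)
        dice = (2.0 * intersection + self.smooth) / (cardinality + self.smooth)
        dice_loss = 1.0 - dice.mean()

        return self.ce_weight * ce + self.dice_weight * dice_loss
```

For Cityscapes-style training the loss would be called as, e.g., `CEDiceLoss(num_classes=19)(logits, labels)`; the 19-class count is the dataset's usual evaluation convention, not a detail taken from the paper.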