     

Semantic Segmentation of High-Resolution Remote Sensing Images Based on Improved FuseNet Combined with Atrous Convolution
Cite this article: 杨军, 于茜子. Semantic Segmentation of High-Resolution Remote Sensing Images Based on Improved FuseNet Combined with Atrous Convolution[J]. Geomatics and Information Science of Wuhan University, 2022, 47(7): 1071-1080.
Authors: 杨军, 于茜子
Affiliation: 1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, Gansu, China
Funding: National Natural Science Foundation of China (61862039); Science and Technology Program of Gansu Province (20JR5RA429); 2021 Central Government Guided Local Science and Technology Development Fund (2021-51); Lanzhou Jiaotong University Excellent Platform Support Project (201806)
Abstract: To address the segmentation of multimodal, multiscale high-resolution remote sensing images, a FuseNet-variant network architecture combined with atrous convolution is proposed for the semantic segmentation of common land-cover classes. First, the FuseNet variant fuses the elevation information contained in digital surface model (DSM) images with the color information of red-green-blue (RGB) images. Second, atrous convolutions are used in both the encoder and the decoder to enlarge the receptive field of the convolution kernels. Finally, the remote sensing image is classified pixel by pixel to produce the semantic segmentation result. Experimental results show that the proposed algorithm achieves mF1 scores of 91.6% and 90.4% on the Potsdam and Vaihingen datasets provided by the International Society for Photogrammetry and Remote Sensing (ISPRS), outperforming existing mainstream algorithms.
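As a hedged illustration of the receptive-field effect described above (a sketch of the standard dilated-convolution arithmetic, not code from the paper): a k × k kernel with dilation rate d inserts d − 1 gaps between taps, so it covers an effective span of k + (k − 1)(d − 1) pixels per axis without adding parameters.

```python
def effective_kernel_size(k: int, d: int) -> int:
    """Effective span of a k x k convolution kernel with dilation rate d.

    Inserting d - 1 zeros between adjacent taps stretches the kernel
    without adding weights: span = k + (k - 1) * (d - 1).
    """
    return k + (k - 1) * (d - 1)

# A standard 3x3 kernel (d = 1) spans 3 pixels per axis; the same
# 9-weight kernel spans 5 pixels with dilation 2 and 9 with dilation 4,
# which is how atrous convolution enlarges the receptive field cheaply.
for d in (1, 2, 4):
    print(f"dilation {d}: effective span {effective_kernel_size(3, d)}")
```

Stacking such layers with growing dilation rates lets the encoder and decoder see multiscale context while keeping feature-map resolution.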

Keywords: high-resolution remote sensing image; deep convolutional neural network; atrous convolution; semantic segmentation; FuseNet
Received: 2020-09-24

Semantic Segmentation of High-Resolution Remote Sensing Images Based on Improved FuseNet Combined with Atrous Convolution
Affiliations: 1. School of Electronic and Information Engineering, Lanzhou Jiaotong University, Lanzhou 730070, China; 2. Faculty of Geomatics, Lanzhou Jiaotong University, Lanzhou 730070, China; 3. National-Local Joint Engineering Research Center of Technologies and Applications for National Geographic State Monitoring, Lanzhou 730070, China; 4. Gansu Provincial Engineering Laboratory for National Geographic State Monitoring, Lanzhou 730070, China
Abstract:  Objectives  With the development and popularization of deep learning, deep neural networks are widely used in image analysis and interpretation. High-resolution remote sensing images carry a large amount of information, complex data, and rich feature information, yet most current semantic segmentation networks are designed for natural images rather than for the characteristics of high-resolution remote sensing images. As a result, they cannot effectively extract the detailed features of ground objects in remote sensing images, and their segmentation accuracy needs improvement.  Methods  We propose an improved FuseNet with atrous convolution (IFA-CNN). First, the improved FuseNet fuses the elevation information of digital surface model (DSM) images with the color information of red-green-blue (RGB) images; a multimodal data fusion scheme is proposed to address the poor fusion between the RGB branch and the DSM branch. Second, atrous convolution captures multiscale features by flexibly adjusting the receptive field, and a decoder built from deconvolution and upsampling restores the feature maps. Finally, a Softmax classifier produces the semantic segmentation results.  Results  Compared with relevant algorithms, IFA-CNN effectively reduces edge burrs and thinned boundaries in segmented images, segments larger objects such as buildings and trees more accurately, reduces missed segmentation, and segments shadow-covered areas nearly perfectly. The mF1 scores achieved on the open ISPRS (International Society for Photogrammetry and Remote Sensing) Potsdam and Vaihingen datasets are 91.6% and 90.4%, respectively, exceeding relevant algorithms by a considerable margin.
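The fusion-then-classification pipeline described in the Methods can be sketched in a minimal, hedged form (the shapes, the element-wise-sum fusion, and the toy class count are illustrative assumptions, not the paper's exact implementation):

```python
import numpy as np

def fuse(rgb_feat: np.ndarray, dsm_feat: np.ndarray) -> np.ndarray:
    """FuseNet-style fusion: add DSM-branch features element-wise into
    the RGB branch at matching resolution (shape: classes x H x W)."""
    assert rgb_feat.shape == dsm_feat.shape
    return rgb_feat + dsm_feat

def per_pixel_softmax(logits: np.ndarray) -> np.ndarray:
    """Softmax over the class axis of a (num_classes, H, W) logit map."""
    z = logits - logits.max(axis=0, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=0, keepdims=True)

# Toy example: 6 land-cover classes on a 4x4 tile (random stand-ins
# for the encoder outputs of the RGB and DSM branches).
rng = np.random.default_rng(0)
rgb = rng.normal(size=(6, 4, 4))
dsm = rng.normal(size=(6, 4, 4))
probs = per_pixel_softmax(fuse(rgb, dsm))
labels = probs.argmax(axis=0)  # per-pixel class map, shape (4, 4)
```

Element-wise summation keeps the RGB branch's channel count unchanged, so elevation cues from the DSM branch can be injected at every encoder stage without widening the network.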
Conclusions  (1) The virtual fusion (V-Fusion) unit of the proposed multimodal data fusion strategy segments more accurately than the fusion unit used by the FuseNet network. (2) The encoder-decoder structure effectively improves the segmentation accuracy of small target features and reduces the loss of detailed information. (3) While IFA-CNN carries out multimodal data fusion, the atrous convolution expands the receptive field to extract multiscale information.
Keywords: high-resolution remote sensing image; deep convolutional neural network; atrous convolution; semantic segmentation; FuseNet