Hyperspectral image classification model based on 3D convolutional auto-encoder
Cite this article:Shi Yanxin, He Jinrong, Li Zhaokui, Zeng Zhigao. Hyperspectral image classification model based on 3D convolutional auto-encoder[J]. Journal of Image and Graphics, 2021, 26(8): 2021-2036.
Authors:Shi Yanxin  He Jinrong  Li Zhaokui  Zeng Zhigao
Affiliation:College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China; School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China; College of Computer and Communication, Hunan University of Technology, Zhuzhou 412000, China
Funding:National Natural Science Foundation of China (61902339); Natural Science Basic Research Program of Shaanxi Province (2021JM-418); Shaanxi Provincial-Municipal Joint Key Laboratory for Intelligent Processing of Energy Big Data (IPBED14); Yan'an Science and Technology Special Project (2019-01, 2019-13); Google-supported Ministry of Education Industry-University Collaborative Education Program, student project (202002107065); Science and Technology Innovation 2030 "New Generation Artificial Intelligence" Major Project (2018AAA0100400); Natural Science Foundation of Hunan Province (2018JJ2098); Yan'an University 2020 Provincial Innovation and Entrepreneurship Training Program (S202010719116, S202010719068)
Abstract:Objective Hyperspectral image classification is a fundamental problem in remote sensing. Hyperspectral images contain both rich spectral information and spatial information, but traditional models struggle to fully exploit the correlation between the two, while supervised deep learning models, dominated by convolutional neural networks, require large amounts of labeled data that are difficult and costly to annotate. To address these shortcomings, this paper proposes an unsupervised spatial-spectral fusion method for hyperspectral images and establishes a hyperspectral image classification model based on a 3D convolutional auto-encoder (3D-CAE). Method The 3D convolutional auto-encoder consists of an encoder, a decoder, and a classifier. After pre-processing, the hyperspectral data are fed into the encoder for unsupervised feature extraction, producing a set of feature maps. The encoder is a 3D convolutional neural network composed of three convolution blocks, with batch normalization added to each block to prevent over-fitting. The decoder is an inverted encoder that reconstructs the extracted feature maps into the original data; the mean squared error is used as the loss function to measure the reconstruction error, and the parameters are optimized with the Adam algorithm. The classifier consists of three fully connected layers and discriminates the features extracted by the encoder. Using a 3D-CNN (three-dimensional convolutional neural network) as the backbone of the auto-encoder makes full use of both the spatial and the spectral information of hyperspectral images, achieving spatial-spectral fusion. Training the model end to end avoids complex feature engineering and data pre-processing, making the model more robust and stable. Result Compared with seven traditional single-feature and deep learning methods on the Indian Pines, Salinas, Pavia University, and Botswana datasets, the proposed method achieves the best results on all four, with overall classification accuracies of 0.948 7, 0.986 6, 0.986 2, and 0.964 9, respectively. The comparative results demonstrate the effectiveness of spatial-spectral fusion and unsupervised learning for hyperspectral remote sensing image classification. Conclusion The proposed model makes full use of the spectral and spatial features of hyperspectral images, performs unsupervised feature extraction, and achieves high classification accuracy without requiring large amounts of labeled data, making it an effective hyperspectral image classification method.
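A minimal PyTorch sketch of the architecture described above: an encoder of three 3D convolution blocks with batch normalization, a mirrored decoder, and a three-layer fully connected classifier. Channel counts, kernel sizes, strides, hidden sizes, and the input patch shape are illustrative assumptions rather than the configuration reported in the paper, and each convolution block is simplified to a single strided convolution.

import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Three 3D convolution blocks with batch normalization (illustrative sizes)."""
    def __init__(self, in_ch=1):
        super().__init__()
        chs = [in_ch, 8, 16, 32]
        layers = []
        for c_in, c_out in zip(chs[:-1], chs[1:]):
            layers += [
                nn.Conv3d(c_in, c_out, kernel_size=3, stride=2, padding=1),
                nn.BatchNorm3d(c_out),
                nn.ReLU(inplace=True),
            ]
        self.net = nn.Sequential(*layers)

    def forward(self, x):            # x: (batch, 1, bands, height, width)
        return self.net(x)           # feature maps shared by the decoder and the classifier

class Decoder(nn.Module):
    """Mirror of the encoder: reconstructs the input patch from the feature maps."""
    def __init__(self, out_ch=1):
        super().__init__()
        chs = [32, 16, 8, out_ch]
        layers = []
        for c_in, c_out in zip(chs[:-1], chs[1:]):
            layers += [
                nn.ConvTranspose3d(c_in, c_out, kernel_size=3, stride=2,
                                   padding=1, output_padding=1),
                nn.BatchNorm3d(c_out),
                nn.ReLU(inplace=True),
            ]
        # Drop the final BatchNorm/ReLU so the reconstruction output is unconstrained.
        self.net = nn.Sequential(*layers[:-2])

    def forward(self, z):
        # Assumes the band/height/width of the input patch are divisible by 8,
        # so the reconstruction matches the input size exactly.
        return self.net(z)

class Classifier(nn.Module):
    """Three fully connected layers with ReLU, applied to flattened encoder features."""
    def __init__(self, feat_dim, n_classes):
        super().__init__()
        self.net = nn.Sequential(
            nn.Flatten(),
            nn.Linear(feat_dim, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 128), nn.ReLU(inplace=True),
            nn.Linear(128, n_classes),
        )

    def forward(self, z):
        return self.net(z)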

Keywords:remote sensing image classification  spatial-spectral feature fusion  3D-CNN  auto-encoder  convolutional neural network (CNN)  deep learning
Received:2021-03-16
Revised:2021-05-24

Hyperspectral image classification model based on 3D convolutional auto-encoder
Shi Yanxin, He Jinrong, Li Zhaokui, Zeng Zhigao. Hyperspectral image classification model based on 3D convolutional auto-encoder[J]. Journal of Image and Graphics, 2021, 26(8): 2021-2036.
Authors:Shi Yanxin  He Jinrong  Li Zhaokui  Zeng Zhigao
Affiliation:College of Mathematics and Computer Science, Yan'an University, Yan'an 716000, China; School of Computer Science, Shenyang Aerospace University, Shenyang 110136, China; College of Computer and Communication, Hunan University of Technology, Zhuzhou 412000, China
Abstract:Objective Hyperspectral image classification is a fundamental problem in remote sensing and has long been a research hotspot. Hyperspectral images contain rich spectral and spatial information, and the classification accuracy of remote sensing images can be improved by using spectral and spatial features jointly. Early traditional models, such as support vector machines and decision trees, could not fully exploit both kinds of information. With the development of deep learning, an increasing number of researchers use convolutional neural networks to extract features from hyperspectral images. However, a two-dimensional convolutional neural network (2D-CNN) can only extract the spatial features of hyperspectral images and cannot fully use the band information of remote sensing data, whereas a 3D-CNN can efficiently extract spectral and spatial features at the same time. Recurrent neural networks struggle with hyperspectral image classification because of the difficulty of finding the optimal sequence length and their tendency to over-fit. At present, research focuses on supervised deep learning models, which need a substantial amount of labeled data to be trained effectively. However, labeled data are difficult and costly to obtain in practice, so a model should also generalize well to unseen data. To address the problems that existing models cannot fully use spatial and spectral information and require large amounts of labeled data for training, an unsupervised spatial-spectral fusion classification method for hyperspectral images is proposed, and an unsupervised hyperspectral image classification model based on a 3D convolutional auto-encoder is established. Method The 3D convolutional auto-encoder (3D-CAE) proposed in this work is composed of an encoder, a decoder, and a classifier. The hyperspectral image is fed into the encoder after data pre-processing for unsupervised feature extraction, producing a set of feature maps. The encoder is a 3D convolutional neural network of three convolution blocks, each of which is made up of two convolution layers and two global max-pooling layers. Batch normalization is added to the convolution blocks to prevent over-fitting. The decoder mirrors the encoder and reconstructs the extracted feature maps into the original data; the mean squared error is used as the loss function to measure the reconstruction error, and the parameters are optimized with the Adam algorithm. The classifier consists of three fully connected layers with ReLU activations and classifies the features extracted by the encoder. Using a 3D-CNN as the backbone of the auto-encoder makes full use of the spatial and spectral information of hyperspectral images and achieves spatial-spectral fusion. The model is trained end to end, eliminating the need for complex feature engineering and data pre-processing and making it more robust and stable. Result Compared with seven traditional single-feature and deep learning methods on the Indian Pines, Salinas, Pavia University, and Botswana datasets, the proposed method achieves the best results on all four. The overall classification accuracies are 0.948 7, 0.986 6, 0.986 2, and 0.964 9, the average classification accuracies are 0.936 0, 0.992 4, 0.982 9, and 0.965 9, and the Kappa values are 0.941 5, 0.985 1, 0.981 7, and 0.962 0, respectively.
Comparative experimental results show that spatial-spectral fusion and unsupervised learning are effective for hyperspectral remote sensing image classification. Because 3D-CAE is composed of an auto-encoder and a classifier, an ablation experiment is also conducted: with the same auto-encoder, four classifiers with different structures are used for classification, and the stable experimental results confirm the validity of the auto-encoder. Training sets of five different proportions (5%, 8%, 10%, 15%, and 20%) are used to verify the generalization ability of 3D-CAE. The losses of the auto-encoder and the classifier on the four datasets remain stable and low, with no oscillation, indicating the good generalization of 3D-CAE. Finally, the parameters of each deep learning model are analyzed and discussed; 3D-CAE has fewer parameters and the best classification performance, which demonstrates its high efficiency. Conclusion The 3D-CAE model proposed in this work fully uses the spectral and spatial features of hyperspectral images. It performs unsupervised feature extraction without substantial pre-processing and achieves high classification accuracy without a large amount of labeled data. Thus, the model is an effective method for hyperspectral image classification.
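To make the two-stage training described above concrete, the following sketch first trains the auto-encoder without labels using the mean squared reconstruction error and Adam, then trains the classifier on the encoder's features with cross-entropy. The data loaders, the choice to keep the encoder fixed in the second stage, and all hyperparameters are assumptions for illustration, not the paper's exact schedule.

import torch
import torch.nn as nn

def train_3d_cae(encoder, decoder, classifier, unlabeled_loader, labeled_loader,
                 epochs_ae=50, epochs_cls=50, lr=1e-3, device="cpu"):
    encoder, decoder, classifier = encoder.to(device), decoder.to(device), classifier.to(device)

    # Stage 1: unsupervised feature learning by reconstruction (MSE loss, Adam optimizer).
    mse = nn.MSELoss()
    opt_ae = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=lr)
    encoder.train(); decoder.train()
    for _ in range(epochs_ae):
        for patches in unlabeled_loader:          # (batch, 1, bands, h, w), no labels needed
            patches = patches.to(device)
            recon = decoder(encoder(patches))
            loss = mse(recon, patches)
            opt_ae.zero_grad()
            loss.backward()
            opt_ae.step()

    # Stage 2: train the three-layer classifier on the encoder's features.
    # Freezing the encoder here is an assumption made for this sketch.
    ce = nn.CrossEntropyLoss()
    opt_cls = torch.optim.Adam(classifier.parameters(), lr=lr)
    encoder.eval(); classifier.train()
    for _ in range(epochs_cls):
        for patches, labels in labeled_loader:
            patches, labels = patches.to(device), labels.to(device)
            with torch.no_grad():
                feats = encoder(patches)
            loss = ce(classifier(feats), labels)
            opt_cls.zero_grad()
            loss.backward()
            opt_cls.step()
    return encoder, classifier

With the three modules from the earlier sketch, a call such as train_3d_cae(Encoder(), Decoder(), Classifier(feat_dim, n_classes), unlabeled_loader, labeled_loader) would run both stages; feat_dim depends on the chosen patch size and is a placeholder here.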
Keywords:remote sensing image classification  spatial-spectral feature fusion  3D-CNN  auto-encoder  convolutional neural network (CNN)  deep learning