期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

1.

Multi-view convolutional vision transformer for 3D object recognition

《Journal of Visual Communication and Image Representation》2023

With the rapid development of three-dimensional (3D) vision technology and the increasing application of 3D objects, there is an urgent need for 3D object recognition in the fields of computer vision, virtual reality, and artificial intelligence robots. The view-based method projects 3D objects into two-dimensional (2D) images from different viewpoints and applies convolutional neural networks (CNN) to model the projected views. Although these methods have achieved excellent recognition performance, there is not sufficient information interaction between the features of different views in these methods. Inspired by the recent success achieved by vision transformer (ViT) in image recognition, we propose a hybrid network by taking advantage of CNN to extract multi-scale local information of each view, and of transformer to capture the relevance of multi-scale information between different views. To verify the effectiveness of our multi-view convolutional vision transformer (MVCVT), we conduct experiments on two public benchmarks, ModelNet40 and ModelNet10, and compare with those of some state-of-the-art methods. The final results show that MVCVT has competitive performance in 3D object recognition. 相似文献

2.

Multi-scale convolutional attention network for lightweight image super-resolution

《Journal of Visual Communication and Image Representation》2023

Convolutional neural network (CNN) based methods have recently achieved extraordinary performance in single image super-resolution (SISR) tasks. However, most existing CNN-based approaches increase the model’s depth by stacking massive kernel convolutions, bringing expensive computational costs and limiting their application in mobile devices with limited resources. Furthermore, large kernel convolutions are rarely used in lightweight super-resolution designs. To alleviate the above problems, we propose a multi-scale convolutional attention network (MCAN), a lightweight and efficient network for SISR. Specifically, a multi-scale convolutional attention (MCA) is designed to aggregate the spatial information of different large receptive fields. Since the contextual information of the image has a strong local correlation, we design a local feature enhancement unit (LFEU) to further enhance the local feature extraction. Extensive experimental results illustrate that our proposed MCAN can achieve better performance with lower model complexity compared with other state-of-the-art lightweight methods. 相似文献

3.

一种基于修正激活函数的CNN车载毫米波雷达目标检测方法

下载免费PDF全文

王晨王明江陈嵩《信号处理》2023,39(1):116-127

为了提高车载毫米波雷达在复杂城市道路环境中目标检测的抗杂波与干扰能力,本文利用卷积神经网络（CNN）特征参数提取和目标分类特性,提出了一种改进的基于CNN的车载毫米波雷达目标检测方法。该方法首先将毫米波雷达回波信号距离-多普勒二维数据运用滑窗进行分割,并采用CNN网络模型处理分割后的二维矩阵,训练二维CNN网络模型及其参数,使其具有提取回波特征并基于特征参数模型进行目标分类的能力,从而实现目标检测功能。通过对卷积神经网络模型结构进行优化,增加批量归一化层,优化Dropout层使得低权重特征失活,自适应地删减部分神经元节点修正该层非线性激活函数,进一步降低了CNN模型目标检测的虚警概率。实验结果表明,在相同虚警概率条件下,CNN网络检测方法目标发现概率优于传统的单元平均恒虚警检测方法,并且在低信噪比的条件下仍然能够保持较高的发现概率;在同等发现概率水平下,修正后CNN网络检测方法的虚警概率较修正前可提高约1个数量级。相似文献

4.

Recaptured screen image identification based on vision transformer

《Journal of Visual Communication and Image Representation》2023

Due to the copyright issues often involved in the recapture of LCD screen content, recaptured screen image identification has received lots of concerns in image source forensics. This paper analyzes the characteristics of convolutional neural network (CNN) and vision transformer (ViT) in extracting features and proposes a cascaded network structure that combines local-feature and global-feature extraction modules to detect the recaptured screen image from original images with or without demoiréing operation. We first extract the local features of the input images with five convolutional layers and feed the local features into the ViT to enhance the local perception capability of the ViT module, and further extract the global features of the input images. Through thorough experiments, our method achieves a detection accuracy rate of 0.9691 in our generated dataset and 0.9940 in the existing mixture dataset, both showing the best performance among the compared methods. 相似文献

5.

A lightweight convolutional neural network for large-scale Chinese image caption 总被引：1，自引：0，他引：1

赵德新杨瑞雪郭淑涛《光电子快报》2021,17(6):361-366

相似文献

6.

Research on EEG emotion recognition based on CNN+BiLSTM+self-attention model

LI Xueqing LI Penghai FANG Zhendong CHENG Longlong WANG Zhiyong WANG Weijie 《光电子快报》2023,19(8):506-512

<正>To address the problems of insufficient dimensionality of electroencephalogram(EEG) feature extraction, the tendency to ignore the importance of different sequential data segments, and the poor generalization ability of the model in EEG based emotion recognition, the model of convolutional neural network and bi-directional long short-term memory and self-attention(CNN+Bi LSTM+self-attention) is proposed. This model uses convolutional neural network(CNN) to extract more distinctive featu... 相似文献

7.

基于生成对抗网络的高光谱图像分类

齐永锋吕雪超裴晓旭王静《光电子．激光》2021,32(12):1285-1292

为了解决简单卷积神经网络(convolutional neural network, CNN)不能有效提取与充分利用高光谱图像特征信息的问题,提出了一种基于残差网络的多层特征匹配生成对抗网络模型。提出的模型引入残差网络以挖掘高光谱图像的深层特征,生成可分性更高的高光谱图像,并通过一个特征融合层进行特征融合,充分利用网络的各层特征。提出的算法在Indian Pines、Pavia University和Salinas数据集上的分类精度分别达到了97.6%,99.3%,99.1%,与径向基函数支持向量机(radial basis function-support vector machine, RBF-SVM)、堆叠自动编码器(stacked autoencoder, SAE)、深度置信网络(deep belief network, DBN)、PPF-CNN (CNN based on pixel-pair feature)、CNN和三维卷积网络 (three-dimensional convolutional neural network, 3D-CNN)方法相比较,其分类精度具有明显的提高。实验结果表明,提出的方法是一种有效的高光谱图像分类方法。相似文献

8.

基于特征排列和空间激活的显著物体检测方法

祝世平谢文韬赵丛杨李庆海《电子与信息学报》2022,44(3):1093-1101

显著物体检测目前在计算机视觉领域中非常重要,如何处理不同尺度的特征信息成为能否获得优秀预测结果的关键.该文有两个主要贡献,一是提出一种用于显著目标检测的特征排列方法,基于自编码结构的卷积神经网络模型,利用尺度表征的概念将特征图进行分组和重排列,以获得一个更加泛化的显著目标检测模型和更加准确的显著目标预测结果;二是在输出... 相似文献

9.

3维卷积递归神经网络的高光谱图像分类方法

下载免费PDF全文

关世豪杨桄李豪付严宇《激光技术》2020,44(4):485-491

为了针对高光谱图像中空间信息与光谱信息的不同特性进行特征提取，提出一种3维卷积递归神经网络(3-D-CRNN)的高光谱图像分类方法。首先采用3维卷积神经网络提取目标像元的局部空间特征信息，然后利用双向循环神经网络对融合了局部空间信息的光谱数据进行训练，提取空谱联合特征，最后使用Softmax损失函数训练分类器实现分类。3-D-CRNN模型无需对高光谱图像进行复杂的预处理和后处理，可以实现端到端的训练，并且能够充分提取空间与光谱数据中的语义信息。结果表明，与其它基于深度学习的分类方法相比，本文中的方法在Pavia University与Indian Pines数据集上分别取得了99.94%和98.81%的总体分类精度，有效地提高了高光谱图像的分类精度与分类效果。该方法对高光谱图像的特征提取具有一定的启发意义。相似文献

10.

基于CRNN混合神经网络的多语种识别

王瑶龙华邵玉斌杜庆治王延凯《光电子．激光》2022,33(6):620-628

在语种识别过程中,为提取语音信号中的空间特征以及时序特征,从而达到提高多语种识别准确率的目的,提出了一种利用卷积循环神经网络(convolutional recurrent neural network,CRNN)混合神经网络的多语种识别模型。该模型首先提取语音信号的声学特征;然后将特征输入到卷积神经网络(convolutional neural network,CNN) 提取低维度的空间特征;再通过空间金字塔池化层(spatial pyramid pooling layer,SPP layer) 对空间特征进行规整,得到固定长度的一维特征;最后将其输入到循环神经网络(recurrenrt neural network,CNN) 来判别语种信息。为验证模型的鲁棒性,实验分别在3个数据集上进行,结果表明:相比于传统的CNN和RNN,CRNN混合神经网络对不同数据集的语种识别准确率均有提高,其中在8语种数据集中时长为5 s的语音上最为明显,分别提高了 5.3% 和6.1%。相似文献

11.

改进的卷积神经网络实现端到端的水下目标自动识别

下载免费PDF全文

王小宇李凡曹琳李军张驰彭圆丛丰裕《信号处理》2020,36(6):958-965

由于水声信号的高度复杂性，基于特征工程的传统水下目标识别方法表现欠佳。基于深度学习模型的水下目标识别方法可有效减少由于特征提取过程带来的水声信号信息损失，进而提高水下目标识别效果。本文提出一种适用于水下目标识别场景的卷积神经网络结构，即在卷积模块化设计中引入卷积核为1的卷积层，更大程度地保留水声信号局部特征，且降低模型的复杂程度；同时，以全局平均池化层替代全连接层的方式构造基于特征图对应的特征向量主导分类结果的网络结构，使结果更具可解释性，且减少训练参数降低过拟合风险。实验结果表明该方法得到的水下目标识别准确率（91.7%）要优于基于传统卷积神经网络（69.8%）和基于高阶统计量特征的传统方法识别表现（85%）。这说明本文提出的模型能更好保留水声信号的时域结构，进而提高分类识别效果。相似文献

12.

基于多尺度池化和范数注意力机制的遥感图像检索

葛芸马琳叶发茂储珺《电子与信息学报》2022,44(2):543-551

遥感图像内容丰富,一般的深度模型提取遥感图像特征时容易受复杂背景干扰,对关键特征的提取效果不佳,并且难以表达图像的空间信息,该文提出一种基于多尺度池化和范数注意力机制的深度卷积神经网络,在通道层面与空间层面自适应地给显著特征加权.首先,在多尺度池化通道注意力模块中,结合空间金字塔池化的思想,对每个通道上的特征图进行不同... 相似文献

13.

基于字符级联合网络特征融合的中文文本情感分析

王丽亚刘昌辉蔡敦波赵彤洲王梦《微电子学与计算机》2020,(1):80-86

针对传统卷积神经网络(CNN)同层神经元之间信息不能互传,无法充分利用同一层次上的特征信息,以及无法提取长距离上下文相关特征的问题.该文针对中文文本,提出字符级联合网络特征融合的模型进行情感分析,在字符级的基础上采用BiGRU和CNN-BiGRU并行的联合网络提取特征,利用CNN的强学习能力提取深层次特征,再利用双向门限循环神经网络(BiGRU)进行深度学习,加强模型对特征的学习能力.另一方面,利用BiGRU提取上下文相关的特征,丰富特征信息.最后在单方面上引入注意力机制进行特征权重分配,降低噪声干扰.在数据集上进行多组对比实验,该方法取得92.36%的F1值,结果表明本文提出的模型能有效的提高文本分类的准确率. 相似文献

14.

Learning spatial hierarchies of high-level features in deep neural network

《Journal of Visual Communication and Image Representation》2020

This paper addresses a new approach to learn perceptual grouping of the extracted features of the convolutional neural network (CNN) to represent the structure contained in the image. In CNN, the spatial hierarchies between the high-level features are not taken into account. To do so, the perceptual grouping of features is utilized. To consider the intra-relationship between feature maps, modified Guided Co-occurrence Block (mGCoB) is proposed. This block preserves the joint co-occurrence of two features in the spatial domain and it prevents the co-adaptation. Also, to preserve the interrelationship in each feature map, the principle of common region grouping is utilized which states that the features which are located in the same feature map tend to be grouped together. To consider it, an MFC block is proposed. To evaluate the proposed approach, it is applied to some known semantic segmentation and image classification datasets that achieve superior performance. 相似文献

15.

Face recognition system based on CNN and LBP features for classifier optimization and fusion

吴雨林江铭炎《中国邮电高校学报(英文版)》2018,25(1):37-47

Face recognition has been a hot-topic in the field of pattern recognition where feature extraction and classification play an important role. However, convolutional neural network (CNN) and local binary pattern (LBP) can only extract single features of facial images, and fail to select the optimal classifier. To deal with the problem of classifier parameter optimization, two structures based on the support vector machine (SVM) optimized by artificial bee colony (ABC) algorithm are proposed to classify CNN and LBP features separately. In order to solve the single feature problem, a fusion system based on CNN and LBP features is proposed. The facial features can be better represented by extracting and fusing the global and local information of face images. We achieve the goal by fusing the outputs of feature classifiers. Explicit experimental results on Olivetti Research Laboratory (ORL) and face recognition technology (FERET) databases show the superiority of proposed approaches. 相似文献

16.

Stereoscopic image quality assessment considering visual mechanism and multi-loss constraints

《Journal of Visual Communication and Image Representation》2021

In this paper, a convolutional neural network (CNN) with multi-loss constraints is designed for stereoscopic image quality assessment (SIQA). A stereoscopic image not only contains monocular information, but also provides binocular information which is as identically crucial as the former. So we take the image patches of left-view images, right-view images and the difference images as the inputs of the network to utilize monocular information and binocular information. Moreover, we propose a method to obtain proxy label of each image patch. It preserves the quality difference between different regions and views. In addition, the multiple loss functions with adaptive loss weights are introduced in the network, which consider both local features and global features and constrain the feature learning from multiple perspectives. And the adaptive loss weights also make the multi-loss CNN more flexible. The experimental results on four public SIQA databases show that the proposed method is superior to other existing SIQA methods with state-of-the-art performance. 相似文献

17.

基于CNN-WF的高灵敏紫外成像仪中的图像配准与融合

侯思祖刘雅婷陈天威《半导体光电》2021,42(4):596-602

针对现有紫外成像仪中紫外光与可见光图像配准实时性差,精度不高等问题,提出一种基于卷积神经网络(Convolutional Neural Networks,CNN)与小波融合(Wavelet Fusion,WF)的紫外光与可见光图像配准融合方法,并将其应用于高灵敏紫外成像仪中.首先,结合刚体变换和卷积神经网络对采集到的图像数据进行参数模型预训练,通过自主挖掘图像特征寻找到最优空间变换参数,实现紫外光图像与可见光图像的精确配准;其次,利用二维小波分解与重构算法实现紫外光与可见光图像的融合.实验结果表明,所提方法的紫外光图像与可见光图像配准速度快,叠加精度高,且具有良好的稳定性. 相似文献

18.

基于轻量级光谱-空间注意力交互网络的高光谱地物分类研究

周予程二丽张娅莉刘宇红《光电子．激光》2023,34(4):397-404

通过引入基于卷积神经网络(convolutional neural network, CNN)的分类算法,高光谱图像(hyperspectral image, HSI)分类任务的精度取得显著的提升,但目前主流CNN算法往往较为复杂且参数量大,从而导致网络难以训练以及容易产生过拟合问题。为在保证网络分类性能的前提下实现轻量化,本文提出一个轻量级架构的基于光谱-空间注意力交互机制的CNN网络用于HSI分类。为实现HSI的光谱-空间特征提取,构建了一个轻量化的双路径骨干网络用于两种特征的提取和融合。其次,为提高特征的表征能力,设计了两个注意力模块分别用于光谱和空间特征的权重再调整。同时,为加强双路径特征之间的关联以实现特征的更好融合,注意力交互机制被引入到网络中以进一步提升网络性能。在3个真实HSI数据集上的分类结果表明,本文所提网络可达到99.5%的分类准确度,并相比于其他网络至少减少50%的参数量。相似文献

19.

Global and local information aggregation network for edge-aware salient object detection

《Journal of Visual Communication and Image Representation》2021

Aggregation of local and global contextual information by exploiting multi-level features in a fully convolutional network is a challenge for the pixel-wise salient object detection task. Most existing methods still suffer from inaccurate salient regions and blurry boundaries. In this paper, we propose a novel edge-aware global and local information aggregation network (GLNet) to fully exploit the integration of side-output local features and global contextual information and utilization of contour information of salient objects. The global guidance module (GGM) is proposed to learn discriminative multi-level information with the direct guidance of global semantic knowledge for more accurate saliency prediction. Specifically, the GGM consists of two key components, where the global feature discrimination module exploits the inter-channel relationship of global semantic features to boost representation power, and the local feature discrimination module enables different side-output local features to selectively learn informative locations by fusing with global attentive features. Besides, we propose an edge-aware aggregation module (EAM) to employ the correlation between salient edge information and salient object information for generating estimated saliency maps with explicit boundaries. We evaluate our proposed GLNet on six widely-used saliency detection benchmark datasets by comparing with 17 state-of-the-art methods. Experimental results show the effectiveness and superiority of our proposed method on all the six benchmark datasets. 相似文献

20.

一种在MR图像中进行脑胶质瘤检测和病灶分割的方法

陈皓李广刘洋强永乾《电子与信息学报》2021,43(4):992-1002

针对磁共振图像(MRI)进行脑胶质瘤检测及病灶分割对临床治疗方案的选择和手术实施过程的引导都有着重要的价值。为了提高脑胶质瘤的检测效率和分割准确率,该文提出了一种两阶段计算方法。首先,设计了一个轻量级的卷积神经网络,并通过该网络完成MR图像中肿瘤的快速检测及大致定位;接着,通过集成学习过程对肿瘤周围水肿、肿瘤非增强区、肿瘤增强区和正常脑组织等4种不同区域进行分类与彼此边界的精细分割。为提高分割的准确率,在MR图像中提取了416维影像组学特征并与128维通过卷积神经网络提取的高阶特征进行组合和特征约简,将特征约简后产生的298维特征向量用于分类学习。为对算法的性能进行验证,在BraTS2017数据集上进行了实验,实验结果显示该文提出的方法能够快速检测并定位肿瘤,同时相比其它方法,整体分割精度也有明显提升。相似文献