Expression Recognition Based on Convolution Residual Network of Attention Pyramid
Cite this article: CHEN Jiamin, XU Yang. Expression Recognition Based on Convolution Residual Network of Attention Pyramid[J]. Computer Engineering and Applications, 2022, 58(22): 123-131.
Authors: CHEN Jiamin  XU Yang
Affiliation: 1. College of Big Data and Information Engineering, Guizhou University, Guiyang 550025, China 2. Guiyang Aluminum-Magnesium Design and Research Institute Co., Ltd., Guiyang 550009, China
Abstract: Facial expressions are among the most authentic and intuitive ways in which people express inner emotion, and different expressions are separated only by subtle inter-class differences. Extracting features with strong representational power is therefore the key problem in facial expression recognition. To extract higher-level semantic features, this paper proposes an attention pyramid convolutional residual network model (APRNET50) built on the residual network (ResNet), which integrates a pyramid convolution module, channel attention, and spatial attention. Pyramid convolution first extracts detailed feature information from the image; the extracted features are then weighted along the channel and spatial dimensions, and salient regions are located according to the weights; finally, a fully connected layer serves as the classifier for the expressions. The network is trained end to end, which makes it better suited to fine-grained facial expression classification. Experiments show recognition accuracies of 73.001% on FER2013 and 94.949% on CK+, which are 2.091 and 0.279 percentage points higher, respectively, than existing methods, a relatively competitive result.
Keywords: residual network  pyramid convolution  attention mechanism  facial expression recognition  feature extraction
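
The abstract describes a three-stage pipeline: multi-scale (pyramid) convolution, channel and spatial attention weighting, and a fully connected classifier. The following is a minimal pure-Python sketch of that data flow, not the authors' APRNET50. It rests on several assumptions: fixed mean filters stand in for learned convolutions, the attention gates are parameter-free sigmoids rather than learned modules, there is no residual backbone, and every function name, kernel size, and classifier weight is hypothetical. Seven output classes are used to match the seven expression categories of FER2013.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def box_filter(image, k):
    """k x k mean filter with zero padding (stand-in for a learned k x k conv)."""
    h, w = len(image), len(image[0])
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += image[y][x]
            out[i][j] = total / (k * k)
    return out

def pyramid_conv(image, kernel_sizes=(3, 5, 7)):
    """Apply parallel filters at several kernel sizes; stack results as channels."""
    return [box_filter(image, k) for k in kernel_sizes]

def channel_attention(features):
    """Scale each channel by sigmoid(global average of that channel)."""
    out = []
    for ch in features:
        mean = sum(map(sum, ch)) / (len(ch) * len(ch[0]))
        gate = sigmoid(mean)
        out.append([[v * gate for v in row] for row in ch])
    return out

def spatial_attention(features):
    """Scale each location by sigmoid(mean over channels at that location)."""
    c, h, w = len(features), len(features[0]), len(features[0][0])
    mask = [[sigmoid(sum(features[k][i][j] for k in range(c)) / c)
             for j in range(w)] for i in range(h)]
    scaled = [[[features[k][i][j] * mask[i][j] for j in range(w)]
               for i in range(h)] for k in range(c)]
    return scaled, mask

def classify(features, weights, biases):
    """Global-average-pool each channel, then a linear layer and softmax."""
    pooled = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in features]
    logits = [sum(w * p for w, p in zip(row, pooled)) + b
              for row, b in zip(weights, biases)]
    top = max(logits)
    exps = [math.exp(z - top) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Toy 8 x 8 "image" with one bright 2 x 2 region in the middle.
image = [[0.0] * 8 for _ in range(8)]
for i in (3, 4):
    for j in (3, 4):
        image[i][j] = 1.0

feats = pyramid_conv(image)               # 3 channels, one per kernel size
feats = channel_attention(feats)          # reweight channels
feats, mask = spatial_attention(feats)    # reweight locations

# Hypothetical, untrained classifier parameters for 7 classes.
weights = [[0.1 * (c + 1) * (k + 1) for k in range(3)] for c in range(7)]
probs = classify(feats, weights, [0.0] * 7)
```

In this toy input the spatial mask is larger over the bright region than in the corners, which is the sense in which attention weights "locate" salient regions. A real implementation would learn the convolution kernels and attention parameters by end-to-end training, as the abstract states.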