Multimodal Emotion Recognition in Multi-Cultural Conditions
Citation: CHEN Shi-Zhe, WANG Shuai, JIN Qin. Multimodal Emotion Recognition in Multi-Cultural Conditions[J]. Journal of Software (软件学报), 2018, 29(4): 1060-1070.
Authors: CHEN Shi-Zhe (陈师哲)  WANG Shuai (王帅)  JIN Qin (金琴)
Affiliation: School of Information, Renmin University of China, Beijing 100872, China (all three authors)
Funding: National Key Research and Development Program of China, project "Computational Principles of Naturalness in Human-Computer Interaction" (Grant No. 2016YFB1001200)
Abstract: Automatic emotion recognition is a challenging task with a wide range of applications. This paper addresses multimodal emotion recognition in multi-cultural conditions. We extract different emotion features from the audio and visual modalities (speech acoustics and facial expressions), including traditional hand-crafted features and features learned automatically by deep neural networks, and compare their emotion recognition performance. We also use multimodal fusion to combine the modalities and compare unimodal features against fused multimodal features. Experiments are conducted on the CHEAVD Chinese multimodal emotion dataset and the AFEW English multimodal emotion dataset. Cross-culture emotion recognition experiments demonstrate that the culture factor has an important influence on emotion recognition. We then propose three training strategies to improve emotion recognition performance in multi-cultural conditions: selecting the corresponding emotion model for each culture, jointly training on multi-cultural datasets, and jointly training on multi-cultural datasets while embedding their features into a common emotion space. The embedding strategy separates the cultural influence from the original features, yields more discriminative emotion features, and achieves the best performance for both acoustic and multimodal emotion recognition.
Keywords: emotion recognition  multi-cultural condition  acoustic emotion feature  facial expression feature  multimodal fusion  deep convolutional neural networks
Received: 2017-04-30
Revised: 2017-06-26
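
The abstract names feature fusion and several training strategies without giving implementation details. The following is only a minimal sketch of two of those ideas, not the authors' method: feature-level fusion by concatenation, a separate model per culture, and a single model trained on pooled data. The nearest-centroid classifier, the function names, and the culture keys "zh" and "en" are all assumptions made for illustration; the paper's actual features, models, and its common-emotion-space embedding are not reproduced here.

```python
def fuse_features(acoustic, visual):
    """Feature-level fusion: concatenate the two modality vectors of each sample."""
    if len(acoustic) != len(visual):
        raise ValueError("both modalities must describe the same samples")
    return [list(a) + list(v) for a, v in zip(acoustic, visual)]


class NearestCentroid:
    """Toy stand-in classifier (an assumption): predicts the closest class mean."""

    def fit(self, X, y):
        groups = {}
        for row, label in zip(X, y):
            groups.setdefault(label, []).append(row)
        # One centroid per label: the column-wise mean of that label's rows.
        self.centroids_ = {
            label: [sum(col) / len(col) for col in zip(*rows)]
            for label, rows in groups.items()
        }
        return self

    def predict(self, X):
        def dist2(a, b):
            return sum((p - q) ** 2 for p, q in zip(a, b))

        return [
            min(self.centroids_, key=lambda c: dist2(row, self.centroids_[c]))
            for row in X
        ]


def train_per_culture(data):
    """One model per culture; `data` maps a culture key to (features, labels)."""
    return {culture: NearestCentroid().fit(X, y) for culture, (X, y) in data.items()}


def train_joint(data):
    """One model trained on the pooled samples of all cultures."""
    X, y = [], []
    for features, labels in data.values():
        X.extend(features)
        y.extend(labels)
    return NearestCentroid().fit(X, y)
```

With per-culture models, the caller picks `models[culture]` at prediction time, so the culture of each test sample has to be known in advance. The pooled model needs no such label but mixes culture-specific variation into every class centroid.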