Similar Documents
20 similar documents found.
1.
In recent years, computer-based emotion recognition from multimodal data has become an important research direction in natural human-computer interaction and artificial intelligence. Work on emotion recognition from the visual modality usually focuses on facial features and rarely considers action features, or multimodal features that fuse them. Although action and emotion are closely linked, extracting effective action information from the visual modality for emotion recognition is difficult. Taking the relationship between action and emotion as its starting point, this work introduces visual-modality action data into the classic MELD multimodal emotion recognition dataset, extracts body-action features with an ST-GCN model, and uses these features for unimodal emotion recognition with an LSTM model. Adding body-action features to the text and audio features of the MELD dataset improves the accuracy of LSTM-based multimodal fusion, and combining text features with body-action features improves the text-only accuracy of a contextual memory model. Experiments show that although body-action features alone cannot match traditional text and audio features for unimodal emotion recognition, they play an important role in multimodal recognition. The unimodal and multimodal experiments confirm that human actions carry emotional information, and that multimodal emotion recognition using body-action features has considerable potential.

2.
Pattern Analysis and Applications - Sign Language is the linguistic system adopted by the Deaf to communicate. The lack of fully-fledged Automatic Sign Language Recognition (ASLR) technologies contributes to...

3.
Facial action unit detection aims to automatically detect action units of interest in a given face image or video. After more than two decades of research, and especially with the recent growth of facial action unit databases and the rise of deep learning, the field has advanced rapidly. This survey first explains the basic concepts of facial action units, introduces the commonly used detection databases, and summarizes traditional detection pipelines, including preprocessing, feature extraction, and classifier learning. It then systematically reviews several key research directions, including region learning, action-unit relation learning, and weakly supervised learning. Finally, it discusses the limitations of current research and potential future directions.

4.
The Facial Action Coding System defines a set of facial action units (AUs) from the perspective of facial anatomy to precisely describe changes in facial expression. Each AU describes the appearance change produced by a group of facial muscle movements, and combinations of AUs can express any facial expression. AU detection is a multi-label classification problem whose challenges include insufficient labeled data, head-pose interference, individual differences, and class imbalance across AUs. To summarize recent progress in AU detection, this paper surveys representative methods since 2016, grouped by input modality into static-image, dynamic-video, and other-modality AU detection, and discusses the weakly supervised methods introduced in each setting to reduce data dependence. For static images, it further covers methods based on local feature learning, AU relation modeling, multi-task learning, and weak supervision. For dynamic video, it focuses on temporal features and self-supervised AU feature learning. Finally, the paper compares the strengths and weaknesses of the representative methods, and on that basis summarizes the challenges and future trends of facial AU detection.

5.
This paper proposes an integrated system for unconstrained face recognition in complex scenes. The scale and orientation tolerant system comprises a face detector followed by a recognizer. Given a color input image of a person, the face detector encloses the face from the complex scene within a circular boundary, and locates the position of the nose. A radial grid mapping centered on the nose is then performed to extract a feature vector within the boundary. The feature vector is input to a radial basis function neural network classifier for face identification. The proposed face detector achieved an average detection rate of 95.8% while the face recognizer achieved an average recognition rate of 97.5% on a database of 21 persons with variations in scale, orientation, natural illumination and background. The two modules were combined to form an automatic face recognition system that was evaluated in the context of a security system using a video database of 21 users and 10 intruders, acquired in an unconstrained environment. A recognition rate of 93.5% with 0% false acceptance rate was achieved.
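The radial grid mapping described above can be illustrated with a short sketch. The ring/spoke counts, radius, and nose position below are illustrative assumptions, not the paper's parameters:

```python
import math

def radial_grid(center, n_rings, n_spokes, radius):
    """Sample positions on a radial grid centered on the nose location;
    features would then be extracted at each sampled point."""
    cx, cy = center
    pts = []
    for r in range(1, n_rings + 1):
        rad = radius * r / n_rings          # radius of this ring
        for s in range(n_spokes):
            ang = 2 * math.pi * s / n_spokes  # angle of this spoke
            pts.append((cx + rad * math.cos(ang), cy + rad * math.sin(ang)))
    return pts

grid = radial_grid((60.0, 80.0), n_rings=4, n_spokes=8, radius=40.0)
print(len(grid))  # 32 sample positions
```

Because the grid is anchored at the detected nose and scaled by the face radius, the resulting feature vector is tolerant to scale and in-plane orientation changes, as the abstract claims.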

6.
7.
张晨  钱涛  姬东鸿 《计算机应用》2018,38(9):2464-2468
Emotion cause extraction, as a deeper form of text emotion understanding, has become a new focus in emotion analysis. Existing research usually treats cause extraction and emotion recognition as two independent tasks, which easily propagates errors between them. Considering that emotion recognition and cause extraction interact, and that emoticons in microblog text often express the text's emotion, a joint model of emotion-cause extraction and emoticon-based emotion recognition built on a bidirectional long short-term memory conditional random field (Bi-LSTM-CRF) is proposed. The model formalizes emotion-cause extraction and emotion recognition as a single sequence-labeling problem, fully exploits the interaction between causes and emotions, and performs both tasks simultaneously. Experimental results show an F-score of 82.70% on cause extraction and 74.74% on emotion recognition, improvements of 5.82 and 17.12 percentage points respectively over a pipeline model, indicating that the joint model effectively reduces the error propagation of sequential processing while improving the F-scores of both tasks.
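The "unified sequence labeling" idea can be sketched in a few lines: cause spans and emotion categories are folded into one tag set, so a single tagger (here, the Bi-LSTM-CRF) predicts both at once. The tag names and toy sentence below are illustrative assumptions, not the paper's annotation scheme:

```python
# Sketch: joint emotion-cause extraction as one sequence-labeling task.
# Cause tokens receive BIO tags with the emotion category attached.

def joint_tags(tokens, cause_span, emotion):
    """One unified tag per token; (start, end) is an inclusive span."""
    tags = []
    for i, _ in enumerate(tokens):
        if cause_span and i == cause_span[0]:
            tags.append(f"B-cause-{emotion}")
        elif cause_span and cause_span[0] < i <= cause_span[1]:
            tags.append(f"I-cause-{emotion}")
        else:
            tags.append("O")
    return tags

tokens = ["failing", "the", "exam", "made", "me", "sad"]
print(joint_tags(tokens, cause_span=(0, 2), emotion="sadness"))
# ['B-cause-sadness', 'I-cause-sadness', 'I-cause-sadness', 'O', 'O', 'O']
```

Because every tag carries both pieces of information, errors no longer flow one way from an upstream emotion classifier into a downstream cause extractor, which is the pipeline weakness the abstract identifies.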

8.
We have employed two pattern recognition methods commonly used for face recognition in order to analyse digital mammograms. The methods are based on two classification schemes, AdaBoost and support vector machines (SVM). A number of tests were carried out to evaluate the accuracy of the two algorithms under different circumstances. Results for the AdaBoost classifier are promising, especially for classifying mass-type lesions. In the best case the algorithm achieved an accuracy of 76% for all lesion types and 90% for masses only. The SVM-based algorithm did not perform as well; to achieve higher accuracy with it, image features better suited to digital mammograms than the current ones would need to be chosen.
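The AdaBoost scheme used above can be sketched with decision stumps as weak learners. This is a generic, stdlib-only illustration on toy one-dimensional data, not the paper's mammogram features or its exact configuration:

```python
import math

def stump_predict(x, thresh, polarity):
    """Weak learner: threshold on a single feature value."""
    return polarity if x > thresh else -polarity

def train_adaboost(xs, ys, rounds=5):
    n = len(xs)
    w = [1.0 / n] * n                      # uniform sample weights
    ensemble = []
    for _ in range(rounds):
        best = None                        # (weighted error, thresh, polarity)
        for thresh in xs:
            for polarity in (1, -1):
                err = sum(wi for xi, yi, wi in zip(xs, ys, w)
                          if stump_predict(xi, thresh, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, thresh, polarity)
        err, thresh, polarity = best
        err = max(err, 1e-10)              # avoid log(0) on perfect stumps
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, thresh, polarity))
        # reweight: boost the misclassified samples
        w = [wi * math.exp(-alpha * yi * stump_predict(xi, thresh, polarity))
             for xi, yi, wi in zip(xs, ys, w)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble

def predict(ensemble, x):
    s = sum(a * stump_predict(x, t, p) for a, t, p in ensemble)
    return 1 if s > 0 else -1

xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [-1, -1, -1, 1, 1, 1]                 # e.g. benign vs. mass-type
model = train_adaboost(xs, ys)
print([predict(model, x) for x in xs])
```

In a real mammography setting each sample would be a high-dimensional feature vector and the stump would threshold one feature at a time; the weighting logic is unchanged.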

9.
Intelligent Transport Systems (ITS) have progressed tremendously. At the core of ITS are traffic sign detection and recognition, which serve the safety and comfort needs of drivers. This paper provides a critical review of the three major steps of an Automatic Traffic Sign Detection and Recognition (ATSDR) system, i.e., segmentation, detection, and recognition, in the context of vision-based driver assistance systems. In addition, it covers different experimental setups of the image acquisition system. Finally, possible future research challenges are discussed to make ATSDR more efficient, which in turn opens a wide range of opportunities for researchers to carry out detailed analyses of ATSDR and incorporate future aspects into their research.

10.
A key assumption of the traditional machine learning approach is that the test data are drawn from the same distribution as the training data. However, this assumption does not hold in many real-world scenarios. For example, in facial expression recognition, the appearance of an expression may vary significantly across people. As a result, previous work has shown that learning from adequate person-specific data can improve expression recognition performance over learning from generic data. However, person-specific data is typically very sparse in real-world applications due to the difficulties of data collection and labeling, and learning from sparse data may suffer from serious over-fitting. In this paper, we propose to learn a person-specific model through transfer learning. By transferring informative knowledge from other people, it allows us to learn an accurate model for a new subject with only a small amount of person-specific data. We conduct extensive experiments to compare different person-specific models for facial expression and action unit (AU) recognition, and show that transfer learning significantly improves recognition performance with a small amount of training data.
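The core intuition, regularizing a sparse subject-specific estimate toward a generic model learned from other people, can be sketched with a MAP-style interpolation. This is a generic illustration of the small-data transfer idea, not the paper's algorithm; the relevance factor is an assumed hyperparameter:

```python
# Sketch: blend a generic (source) model with sparse person-specific data.
# With few samples the generic prior dominates; with many, the person
# data dominates, so over-fitting to tiny datasets is avoided.

def adapt_mean(generic_mean, person_samples, relevance=4.0):
    n = len(person_samples)
    if n == 0:
        return generic_mean                 # no subject data: use the prior
    person_mean = sum(person_samples) / n
    w = n / (n + relevance)                 # data weight grows with n
    return w * person_mean + (1 - w) * generic_mean

print(adapt_mean(0.0, [1.0]))               # 1 sample: stays near the prior
print(adapt_mean(0.0, [1.0] * 100))         # many samples: near person mean
```

The same interpolation pattern applies per-parameter to richer models (e.g. classifier weights), which is how a new subject's model can be accurate from only a handful of labeled frames.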

11.
This paper presents an integrated system for emotion detection. In this research effort, we have taken into account the fact that emotions are most widely expressed through the eyes and mouth. The proposed system uses color images and consists of three modules. The first module performs skin detection, using Markov random field models for image segmentation and skin detection; a set of colored images containing human faces was used as the training set. The second module is responsible for eye and mouth detection and extraction, using the HLV color space of the specified eye and mouth regions. The third module detects the emotions expressed in the eyes and mouth, using edge detection and gradient measurements over the eye and mouth regions. The paper provides results from applying the system, along with proposals for further research.

12.
Filling quality inspection is an important stage of the pharmaceutical manufacturing process and a reliable guarantee of drug quality. To meet the need for visual inspection of visible foreign objects in large-volume parenteral (IV) solutions, an automatic multi-camera inspection and recognition system is developed. A fast and highly reliable preprocessing method for the pharmaceutical images is first studied, effectively eliminating the interference caused by mechanical vibration and tracking. An improved fuzzy cellular neural network image segmentation method targeting tiny foreign objects in the liquid is then investigated; the formation mechanisms of foreign objects, particles, and bubbles in the liquid are analyzed, and their shape, edge contour, and motion characteristics are combined to obtain the type features of each kind of foreign object and their dynamic changes across image sequences. Finally, using the target features of the image sequences, an AdaBoost classification algorithm based on support vector machines is applied for foreign-object recognition. Results show that the proposed method achieves a high detection and recognition rate and is of practical significance for the development of engineering equipment.

13.
Automatic perception of human affective behaviour from facial expressions, together with recognition of intentions and social goals from dialogue context, would greatly enhance natural human-robot interaction. This research concentrates on intelligent neural network based facial emotion recognition and Latent Semantic Analysis based topic detection for a humanoid robot. The work first incorporates the Facial Action Coding System, which describes physical cues and anatomical knowledge of facial behaviour, for the detection of neutral and the six basic emotions from real-time posed facial expressions. Feedforward neural networks (NN) are used to implement upper and lower facial Action Unit (AU) analysers, recognising six upper and 11 lower facial actions including Inner and Outer Brow Raiser, Lid Tightener, Lip Corner Puller, Upper Lip Raiser, Nose Wrinkler, Mouth Stretch, etc. A neural network based facial emotion recogniser then accepts the derived 17 Action Units as inputs to decode neutral and the six basic emotions from facial expressions. Moreover, in order to let the robot make appropriate responses to the detected affective facial behaviours, Latent Semantic Analysis is used to focus on the underlying semantic structure of the data and go beyond linguistic restrictions to identify topics embedded in the users' conversations. The overall development is integrated with a modern humanoid robot platform under its Linux C++ SDKs. The work presented here shows great potential for developing personalised intelligent agents/robots with emotional and social intelligence.

14.
Effective human and automatic processing of speech requires recovery of more than just the words. It also involves recovering phenomena such as sentence boundaries, filler words, and disfluencies, referred to as structural metadata. We describe a metadata detection system that combines information from different types of textual knowledge sources with information from a prosodic classifier. We investigate maximum entropy and conditional random field models, as well as the predominant hidden Markov model (HMM) approach, and find that discriminative models generally outperform generative models. We report system performance on both broadcast news and conversational telephone speech tasks, illustrating significant performance differences across tasks and as a function of recognizer performance. The results represent the state of the art, as assessed in the NIST RT-04F evaluation.
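The HMM baseline the abstract refers to can be sketched as a Viterbi decoder over per-word boundary tags ("B": a sentence boundary follows this word, "N": no boundary), with prosodic cues as observations. All probabilities below are made-up illustrative numbers, not trained values:

```python
import math

# Tiny HMM for sentence-boundary detection from a single prosodic cue.
states = ["N", "B"]
log = math.log
start = {"N": log(0.9), "B": log(0.1)}
trans = {"N": {"N": log(0.8), "B": log(0.2)},
         "B": {"N": log(0.9), "B": log(0.1)}}
# P(observed cue | state): a pause strongly suggests a boundary.
emit = {"N": {"pause": log(0.1), "nopause": log(0.9)},
        "B": {"pause": log(0.9), "nopause": log(0.1)}}

def viterbi(obs):
    """Most likely tag sequence for a sequence of prosodic observations."""
    v = [{s: start[s] + emit[s][obs[0]] for s in states}]
    back = []
    for o in obs[1:]:
        col, ptr = {}, {}
        for s in states:
            best_prev = max(states, key=lambda p: v[-1][p] + trans[p][s])
            col[s] = v[-1][best_prev] + trans[best_prev][s] + emit[s][o]
            ptr[s] = best_prev
        v.append(col)
        back.append(ptr)
    path = [max(states, key=lambda s: v[-1][s])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return list(reversed(path))

print(viterbi(["nopause", "nopause", "pause"]))  # ['N', 'N', 'B']
```

A discriminative model such as a CRF replaces the generative emission terms with arbitrary overlapping features of the words and prosody, which is one reason the paper finds discriminative models outperform this generative baseline.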

15.
A system that could automatically analyze the facial actions in real time has applications in a wide range of different fields. However, developing such a system is always challenging due to the richness, ambiguity, and the dynamic nature of facial actions. Although a number of research groups attempt to recognize facial action units (AUs) by either improving facial feature extraction techniques, or the AU classification techniques, these methods often recognize AUs or certain AU combinations individually and statically, ignoring the semantic relationships among AUs and the dynamics of AUs. Hence, these approaches cannot always recognize AUs reliably, robustly, and consistently. In this paper, we propose a novel approach that systematically accounts for the relationships among AUs and their temporal evolutions for AU recognition. Specifically, we use a dynamic Bayesian network (DBN) to model the relationships among different AUs. The DBN provides a coherent and unified hierarchical probabilistic framework to represent probabilistic relationships among various AUs and to account for the temporal changes in facial action development. Within our system, robust computer vision techniques are used to obtain AU measurements. And such AU measurements are then applied as evidence to the DBN for inferring various AUs. The experiments show that the integration of AU relationships and AU dynamics with AU measurements yields significant improvement of AU recognition, especially for spontaneous facial expressions and under more realistic environment including illumination variation, face pose variation, and occlusion.
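The benefit of modeling AU relationships can be shown with a tiny static Bayesian-network inference over two AUs that often co-occur in smiles (AU6 "cheek raiser" and AU12 "lip corner puller"). All numbers are illustrative assumptions, and this sketch is static; the paper's model is a full dynamic Bayesian network over many AUs and over time:

```python
from itertools import product

p_au12 = {True: 0.3, False: 0.7}                      # prior on AU12
p_au6_given_au12 = {True: {True: 0.8, False: 0.2},    # AU6 likely with AU12
                    False: {True: 0.1, False: 0.9}}
# P(detector fires | AU active/inactive): noisy vision measurements
like_m6 = {True: 0.7, False: 0.2}
like_m12 = {True: 0.75, False: 0.15}

def posterior_au6(m6_fired, m12_fired):
    """P(AU6 | both measurements), marginalizing AU12 by enumeration."""
    joint = {}
    for au12, au6 in product([True, False], repeat=2):
        p = p_au12[au12] * p_au6_given_au12[au12][au6]
        p *= like_m6[au6] if m6_fired else (1 - like_m6[au6])
        p *= like_m12[au12] if m12_fired else (1 - like_m12[au12])
        joint[(au12, au6)] = p
    z = sum(joint.values())
    return (joint[(True, True)] + joint[(False, True)]) / z

# With identical AU6 evidence, a firing AU12 detector raises belief in AU6:
print(posterior_au6(m6_fired=True, m12_fired=True))
print(posterior_au6(m6_fired=True, m12_fired=False))
```

This is exactly the kind of cross-AU evidence propagation that static, per-AU classifiers cannot exploit.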

16.
The existing modular-unit test system for distribution automation equipment has a large error when measuring three-phase balanced voltage amplitude at the rated frequency, so an automatic modular-unit test system for distribution automation equipment is designed. In the hardware design, the overall structure is laid out, the microcontroller model is selected according to the system requirements, the interfaces and pins are configured to enable communication between the microcontroller and the host computer, and in the audible-visual alarm the internal circuit is refined, with the logic level of the circuit determining the alarm...

17.
Automatic music emotion analysis has wide applications in music retrieval and music recommendation. Three music emotion models are compared and analyzed, music emotion classification methods are introduced, and the shortcomings of existing research are pointed out. Music segmentation and summarization are the basis of efficient music browsing and recommendation; after analyzing segmentation and summarization methods, the limitations of fixed-length segmentation are identified. Music recommendation via music similarity and emotion visualization is then described, with an overview of similarity measures and visualization methods. Finally, future research directions for automatic music emotion analysis are outlined.

18.
To address the real-time and uncertain nature of speech signals, a method based on evidence-credibility information entropy and dynamic prior weights is proposed to improve the basic probability assignment of traditional Dempster-Shafer (D-S) evidence theory. Since different emotional features recognize different emotional states with different accuracy, the speech emotion features are divided into classes. Using the recognition results of each feature class, the improved D-S evidence theory performs decision-level data fusion, realizing speech emotion recognition based on multiple classes of emotion features and achieving finer-grained recognition. A numerical example verifies the fast convergence and noise resistance of the improved algorithm, and comparative experiments demonstrate the effectiveness and stability of the classified-feature approach.
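The decision-level fusion step can be sketched with the classical Dempster combination rule over emotion labels; this is the unmodified rule, not the paper's entropy-weighted improvement, and the mass values below are illustrative assumptions:

```python
# Minimal Dempster combination of two mass functions. Focal elements are
# frozensets of emotion labels; mass on the whole frame encodes ignorance.

def dempster_combine(m1, m2):
    combined, conflict = {}, 0.0
    for a, pa in m1.items():
        for b, pb in m2.items():
            inter = a & b
            if inter:
                combined[inter] = combined.get(inter, 0.0) + pa * pb
            else:
                conflict += pa * pb           # mass lost to disagreement
    if conflict >= 1.0:
        raise ValueError("total conflict: sources cannot be combined")
    # renormalize by 1 - K (Dempster's rule)
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

angry, happy = frozenset({"angry"}), frozenset({"happy"})
theta = angry | happy                         # the whole frame
m_prosody = {angry: 0.6, happy: 0.1, theta: 0.3}   # one feature class
m_spectral = {angry: 0.5, happy: 0.2, theta: 0.3}  # another feature class
fused = dempster_combine(m_prosody, m_spectral)
print(fused[angry] > m_prosody[angry])        # fusion sharpens agreement
```

When both feature classes lean toward the same emotion, the fused belief in it exceeds either source's alone, which is why classifying features and fusing their verdicts can outperform a single monolithic classifier.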

19.
Advances in Speech Emotion Recognition
This paper first introduces the components of a speech emotion recognition system, then reviews the state of research on emotion features and recognition algorithms: the main speech emotion features are analyzed, and representative speech emotion recognition algorithms and hybrid models are described, analyzed, and compared. Finally, possible development trends of speech emotion recognition technology are pointed out.

20.
To effectively improve performance on spoken emotion recognition, nonlinear dimensionality reduction is needed for speech data lying on a nonlinear manifold embedded in a high-dimensional acoustic space. In this paper, a new supervised manifold learning algorithm for nonlinear dimensionality reduction, called the modified supervised locally linear embedding algorithm (MSLLE), is proposed for spoken emotion recognition. MSLLE aims at enlarging the interclass distance while shrinking the intraclass distance in an effort to promote the discriminating power and generalization ability of low-dimensional embedded data representations. To compare the performance of MSLLE, not only three unsupervised dimensionality reduction methods, i.e., principal component analysis (PCA), locally linear embedding (LLE) and isometric mapping (Isomap), but also five supervised dimensionality reduction methods, i.e., linear discriminant analysis (LDA), supervised locally linear embedding (SLLE), local Fisher discriminant analysis (LFDA), neighborhood component analysis (NCA) and maximally collapsing metric learning (MCML), are used to perform dimensionality reduction on spoken emotion recognition tasks. Experimental results on two emotional speech databases, i.e. the spontaneous Chinese database and the acted Berlin database, confirm the validity and promising performance of the proposed method.
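The supervised twist that distinguishes SLLE-style methods from plain LLE can be sketched as a preprocessing step: inflate pairwise distances between samples of different classes before choosing neighbors, so the low-dimensional neighborhoods become class-pure. The scaling constant `alpha` and the toy points are illustrative assumptions, not the MSLLE formulation:

```python
import math

def supervised_distances(points, labels, alpha=1.0):
    """Euclidean distance matrix with inter-class pairs pushed apart
    by alpha times the maximum pairwise distance."""
    n = len(points)
    d = [[0.0] * n for _ in range(n)]
    dmax = 0.0
    for i in range(n):
        for j in range(n):
            d[i][j] = math.dist(points[i], points[j])
            dmax = max(dmax, d[i][j])
    for i in range(n):
        for j in range(n):
            if labels[i] != labels[j]:
                d[i][j] += alpha * dmax       # enlarge interclass distance
    return d

pts = [(0.0, 0.0), (0.3, 0.0), (1.0, 0.0)]
labs = ["angry", "happy", "angry"]
d = supervised_distances(pts, labs)
# point 0's nearest neighbor is now its same-class point 2, not point 1
print(min(range(1, 3), key=lambda j: d[0][j]))
```

LLE's neighbor search would then run on this adjusted matrix, which is how class information steers the embedding toward discriminative low-dimensional representations.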
