1.

Real-time shape tracking of facial landmarks

Kim Hyungjoon Kim HyeonWoo Hwang Eenjun 《Multimedia Tools and Applications》2020,79(23-24):15945-15963

Detection of facial landmarks and accurate tracking of their shape are essential in real-time applications such as virtual makeup, where users can see the makeup’s effect by moving their face in diverse directions. Typical face tracking techniques detect facial landmarks and track them using a point tracker such as the Kanade-Lucas-Tomasi (KLT) point tracker. Typically, 5 or 64 points are used for tracking a face. Even though these points are enough to track the approximate locations of facial landmarks, they are not sufficient to track the exact shape of facial landmarks. In this paper, we propose a method that can track the exact shape of facial landmarks in real-time by combining a deep learning technique and a point tracker. We detect facial landmarks accurately using SegNet, which performs semantic segmentation based on deep learning. Edge points of detected landmarks are tracked using the KLT point tracker. In spite of its popularity, the KLT point tracker suffers from the point loss problem. We solve this problem by executing SegNet periodically to recalculate the shape of facial landmarks. That is, by combining the two techniques, we can avoid the computational overhead of SegNet and the point loss problem of the KLT point tracker, which leads to accurate real-time shape tracking. We performed several experiments to evaluate the performance of our method and report some of the results herein.
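
The combination described in this abstract, a slow but accurate detector that periodically re-seeds a fast point tracker, can be sketched as a simple scheduling loop. This is a minimal illustration and not the authors' implementation: `detect_shape`, `track_points`, and `redetect_every` are hypothetical stand-ins for SegNet inference, a KLT update step, and the re-detection period, none of which are specified in the abstract.

```python
from typing import Callable, Iterable, List, Optional, Tuple

Point = Tuple[float, float]


def track_with_periodic_redetection(
    frames: Iterable[object],
    detect_shape: Callable[[object], List[Point]],
    track_points: Callable[[object, object, List[Point]], List[Point]],
    redetect_every: int = 10,
) -> List[List[Point]]:
    """Return one list of contour points per frame.

    detect_shape   -- hypothetical slow, accurate detector (e.g. a segmentation net)
    track_points   -- hypothetical fast tracker: (prev_frame, frame, points) -> points
    redetect_every -- assumed re-detection period, in frames
    """
    results: List[List[Point]] = []
    prev_frame: Optional[object] = None
    points: List[Point] = []
    for index, frame in enumerate(frames):
        # Run the detector on schedule, or whenever the tracker has lost every point.
        if index % redetect_every == 0 or not points:
            points = detect_shape(frame)
        else:
            points = track_points(prev_frame, frame, points)
        results.append(points)
        prev_frame = frame
    return results
```

A smaller `redetect_every` recovers from lost points sooner at the cost of more detector calls; the right trade-off depends on the detector's latency and is left open here.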

2.

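Research on the Scoring Mechanism of an English Speaking Self-Study System Based on Speech Recognition Technology

宋芳芳 宋晓丽 马青玉 《数字社区&智能家居》2009,5(3):1726-1728

In foreign language learning, computer-assisted pronunciation learning systems give learners effective pronunciation guidance. Drawing on the principles of speech recognition, this paper studies the key technologies behind the scoring mechanism of an English speaking self-study system, identifies problems with speech scoring mechanisms in light of current developments, and compares the strengths and weaknesses of several speech scoring techniques from both subjective and objective perspectives. It proposes dividing scoring into three parts, namely acoustic scoring, prosodic scoring, and perceptual scoring, so as to integrate subjective and objective evaluation, and finally proposes a scoring mechanism based on HMM and neural network techniques to help improve learners' ability to study spoken English on their own.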

3.

An efficient deep learning technique for facial emotion recognition

Khattak Asad Asghar Muhammad Zubair Ali Mushtaq Batool Ulfat 《Multimedia Tools and Applications》2022,81(2):1649-1683

Emotion recognition from facial images is considered a challenging task because of the varying nature of facial expressions. Prior studies that classify emotions from facial images with deep learning models suffer from performance degradation caused by poor selection of layers in the convolutional neural network model. To address this issue, we propose an efficient deep learning technique using a convolutional neural network model for classifying emotions from facial images and for detecting age and gender from facial expressions. Experimental results show that the proposed model outperformed baseline works, achieving an accuracy of 95.65% for emotion recognition, 98.5% for age recognition, and 99.14% for gender recognition.

4.

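Automatic Estimation of Visual Speech Parameters

王志明 蔡莲红 艾海舟 《计算机研究与发展》2005,42(7):1185-1190

Visual speech parameter estimation plays an important role in visual speech research. Twenty-four parameters directly related to articulation are selected from the facial animation parameters (FAPs) defined in MPEG-4 to describe visual speech. Statistical learning is combined with rule-based methods, and the lip contour and facial feature points are tracked using the probability distribution of facial color together with prior knowledge of shape and edges, yielding fairly accurate tracking. After high-frequency noise in the reference-point tracks is filtered out, the four most prominent reference points on the face are used to estimate the main facial motion pose, which removes the influence of global motion. Finally, accurate visual speech parameters are computed from the motion of these facial feature points, and the method has been put to practical use.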

5.

Building highly realistic facial modeling and animation: a survey

Nikolaos Ersotelos Feng Dong 《The Visual computer》2008,24(1):13-30

This paper provides a comprehensive survey of techniques for human facial modeling and animation. The survey is carried out from two perspectives: facial modeling, which concerns how to produce 3D face models, and facial animation, which concerns how to synthesize dynamic facial expressions. To generate an individual face model, we can either individualize a generic model or combine face models from an existing face collection. With respect to facial animation, we further categorize the techniques into simulation-based, performance-driven, and shape-blend-based approaches. The strengths and weaknesses of the techniques within each category are discussed, along with their applications. In addition, a brief historical review of how the techniques evolved is provided. Limitations and future trends are discussed. Conclusions are drawn at the end of the paper.

6.

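Research on an Intelligent Tutoring System Based on Emotion Recognition (total citations: 1; self-citations: 0; citations by others: 1)

吴彦文 刘伟 张昆明 《计算机工程与设计》2008,29(9):2350-2352

To address the lack of an affective dimension in traditional intelligent tutoring systems (ITS), an ITS model based on emotion recognition technology is proposed. The model adds an emotion recognition module to the traditional tutoring system, built using facial expression recognition and text recognition techniques. It can capture and recognize students' learning emotions and apply corresponding emotional encouragement strategies according to those emotions, thereby achieving affective teaching.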

7.

An experimental evaluation of comprehensibility aspects of knowledge structures derived through induction techniques: A case study of industrial fault diagnosis

《Behaviour & Information Technology》2012,31(2):117-135

Machine induction has been used extensively to develop knowledge bases for decision support systems and predictive systems. The extent to which developers and domain experts can comprehend these knowledge structures and gain useful insights into the basis of decision making has become a challenging research issue. This article examines the knowledge structures generated by the C4.5 induction technique in a fault diagnostic task and proposes using a model of human learning to guide the process of making the results of machine induction comprehensible. The model of learning is used to generate hierarchical representations of diagnostic knowledge by adjusting the level of abstraction and varying the goal structures between 'shallow' and 'deep' ones. Comprehensibility is assessed globally in an experimental comparison in which subjects are required to acquire the knowledge structures and transfer them to new tasks. This method of addressing comprehensibility appears promising, especially for machine induction techniques that are rather inflexible with regard to the number and kinds of interventions allowed to system developers.

8.

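A Facial Beauty Classification Method Based on Geometric Features and C4.5

毛慧芸 金连文 杜明辉 《模式识别与人工智能》2010,23(6):809-814

Facial beauty is explored from a machine learning perspective. A method is proposed for extracting 17-dimensional features related to the degree of beauty of Chinese women, and a C4.5 classification tree is then trained and tested on facial images with different beauty ratings. Experiments on 510 facial images of Chinese women show that the proposed facial beauty evaluation method is simple and feasible. For the two-class problem of beautiful versus not beautiful, the average classification accuracy reaches 94.1%, while for classification into four beauty levels the accuracy reaches 71.6%. The study shows that intelligent perception of facial beauty using suitable features and C4.5 machine learning is feasible.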
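
C4.5 chooses splits by gain ratio, that is, information gain divided by the split information. The sketch below computes that criterion for a single threshold on one continuous feature, as an illustration of how a geometric feature might be evaluated. The feature values, labels, and threshold are hypothetical, and the additional heuristics of a full C4.5 implementation (candidate-threshold search, pruning, missing-value handling) are deliberately left out.

```python
import math
from collections import Counter
from typing import Hashable, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())


def gain_ratio(values: Sequence[float], labels: Sequence[Hashable], threshold: float) -> float:
    """Gain ratio of the binary split `value <= threshold` on one continuous feature."""
    left = [lab for v, lab in zip(values, labels) if v <= threshold]
    right = [lab for v, lab in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    w_left, w_right = len(left) / n, len(right) / n
    gain = entropy(labels) - (w_left * entropy(left) + w_right * entropy(right))
    split_info = -(w_left * math.log2(w_left) + w_right * math.log2(w_right))
    return gain / split_info
```

Dividing by the split information penalizes splits that carve the data into very uneven parts, which is the motivation for using gain ratio rather than raw information gain.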

9.

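Research Progress and Prospects of Speech Dereverberation Technology

张雄伟 李轶南 郑昌艳 曹铁勇 孙蒙 闵刚 《数据采集与处理》2017,32(6):1069-1081

Voice interaction technology is becoming increasingly common in practical voice-driven applications. However, when the sound source is far from the microphone, reverberation in real environments leaves the performance of voice interaction far from satisfactory. Researchers have studied the reverberation problem extensively for decades and have proposed many practical methods. In particular, deep learning, which has recently emerged and largely reshaped speech processing, has also achieved many remarkable results in single-channel dereverberation. Nevertheless, work that systematically summarizes and analyzes the connections between deep-learning-based dereverberation methods and classical algorithms is still scarce. This paper therefore systematically reviews and summarizes the development of single-channel speech dereverberation and discusses open problems that call for further research.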

10.

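A Texture-Feature-Oriented Method for Realistic 3D Facial Animation (total citations: 2; self-citations: 0; citations by others: 2)

姜大龙 高文 王兆其 陈益强 《计算机学报》2004,27(6):750-757

Texture change is an important component of facial expression. Traditional facial animation methods usually apply only a simple stretching transformation to the texture image and do not account for changes in fine facial texture features such as wrinkles and dimples. This paper proposes a realistic 3D facial animation method oriented toward changes in texture features. It introduces the concept of the Partial Expression Ratio Image (PERI) and a method for obtaining it, and on that basis presents an MPEG-4-oriented PERI parameterization and a multi-directional PERI method for 3D facial animation. The former achieves a parametric representation of subtle expression features in facial animation by integrating with the MPEG-4 Facial Animation Parameters (FAPs); the latter uses multi-directional PERI texture feature adjustment so that the 3D face model shows good subtle expression features from different viewing angles. The proposed method overcomes the shortcoming of traditional facial animation, which considers only deformation control of the facial surface and ignores texture change, and achieves realistic 3D facial animation with subtle expression features driven by texture change. Experiments show that the method effectively captures the details of texture change and improves the realism of facial animation.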

11.

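An English Pronunciation Tutoring System Based on Visual Speech

许芹 《广东电脑与电讯》2006,(10):50-54

Pronunciation is a major difficulty for beginners learning English. In a non-English-speaking environment such as China, many primary school pupils lack guidance from professional teachers after class and are prone to developing English pronunciation problems. This paper designs and develops EP Tutor, an English pronunciation tutoring system based on visual speech, which simulates the facial animation of a cartoon tutor to coach students in English pronunciation one-on-one in a lively and friendly way. The paper focuses on the system's design philosophy, its architecture, the detailed design of some key functions, and the implementation of key technologies.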

12.

Facial expression recognition using tracked facial actions: Classifier performance analysis

Fadi Dornaika Abdelmalik Moujahid Bogdan Raducanu 《Engineering Applications of Artificial Intelligence》2013,26(1):467-477

In this paper, we address the analysis and recognition of facial expressions in continuous videos. More precisely, we study the performance of classifiers that exploit head-pose-independent temporal facial action parameters. These are provided by an appearance-based 3D face tracker that simultaneously provides the 3D head pose and facial actions. The use of such a tracker makes the recognition pose- and texture-independent. Two different schemes are studied. The first scheme adopts a dynamic time warping technique for recognizing expressions, where training data are given by temporal signatures associated with different universal facial expressions. The second scheme models temporal signatures associated with facial actions as fixed-length feature vectors (observations) and uses machine learning algorithms to recognize the displayed expression. Experiments quantified the performance of the different schemes. These were carried out on CMU video sequences and home-made video sequences. The results show that applying dimension reduction techniques to the extracted time series can improve classification performance. Moreover, these experiments show that the best recognition rate can be above 90%.
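
The first scheme in this abstract relies on dynamic time warping (DTW), which aligns two sequences of different lengths before comparing them. The sketch below shows a textbook DTW distance and a nearest-template classifier built on it. It assumes one scalar value per frame and an absolute-difference cost; the paper's signatures are facial action parameters whose exact form and cost function are not given in the abstract, and `classify_by_dtw` and its `templates` layout are hypothetical names introduced for illustration.

```python
from typing import Dict, List, Sequence


def dtw_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Classic O(len(a) * len(b)) dynamic time warping distance."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best alignment cost of a[:i] against b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(a[i - 1] - b[j - 1])
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def classify_by_dtw(query: Sequence[float], templates: Dict[str, List[Sequence[float]]]) -> str:
    """Return the label of the training signature closest to `query` under DTW."""
    best_label = ""
    best_distance = float("inf")
    for label, signatures in templates.items():
        for signature in signatures:
            distance = dtw_distance(query, signature)
            if distance < best_distance:
                best_label, best_distance = label, distance
    return best_label
```

Because DTW tolerates local stretching and compression along the time axis, the same expression performed at different speeds can still match its template.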

13.

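Research Progress on Cross-Corpus Speech Emotion Recognition

张石清 刘瑞欣 赵小明 《计算机系统应用》2022,31(11):31-48

Speech emotion recognition plays an extremely important role in human-computer interaction and has attracted much attention in recent years. At present, most speech emotion recognition methods are trained and tested on a single emotion database. In practical applications, however, the training and test sets may come from different emotion databases. Because the distributions of these databases differ greatly, most speech emotion recognition methods achieve unsatisfactory cross-corpus recognition performance. For this reason, many researchers have recently begun to focus on cross-corpus speech emotion recognition. This paper systematically reviews the current state and progress of cross-corpus speech emotion recognition methods in recent years, with particular emphasis on analyzing and summarizing the application of newly developed deep learning techniques. It first introduces the emotion databases commonly used in speech emotion recognition. Then, in the context of deep learning, it summarizes and compares progress in existing cross-corpus methods based on hand-crafted features and deep features from the perspectives of supervised, unsupervised, and semi-supervised learning. Finally, it discusses the challenges and opportunities currently facing the field and offers an outlook.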

14.

A novel 2D and 3D multimodal approach for in-the-wild facial expression recognition

《Image and vision computing》2019

This study proposes a novel deep learning approach for fusing 2D and 3D modalities in in-the-wild facial expression recognition (FER). Unlike other studies, we exploit 3D facial information in in-the-wild FER. In-the-wild 3D FER datasets are not widely available; therefore, 3D facial data are constructed from available 2D datasets, thanks to recent advances in 3D face reconstruction. The 3D facial geometry features are then extracted by a deep learning technique to exploit mid-level details, which provide meaningful expression cues for recognition. In addition, to demonstrate the potential of 3D data for FER, 2D projected images of the 3D faces are taken as additional input. These features are then fused with 2D features obtained from the original input, and the fused features are classified by support vector machines (SVMs). The results show that the proposed approach achieves state-of-the-art recognition performance on the Real-World Affective Faces (RAF), Static Facial Expressions in the Wild (SFEW 2.0), and AffectNet datasets. The approach is also applied to a 3D FER dataset, BU-3DFE, to compare the effectiveness of reconstructed and available 3D face data for FER. This is the first time such a deep learning combination of 3D and 2D facial modalities has been presented in the context of in-the-wild FER.

15.

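A Survey of Model-Feature-Driven Algorithms for Dynamic Facial Expression Synthesis

陈松 袁训明 《计算机与现代化》2019,(7):47

Dynamic expression synthesis methods for face models are analyzed and described by category according to their intrinsic characteristics. Although a considerable body of literature already exists in this field, dynamic facial expression synthesis remains a very active research topic. The survey gives a categorized overview of synthesis algorithms on the 2D image plane and on the 3D facial surface according to the type of output. For facial expression synthesis in the 2D image plane, the main algorithms are: synthesis driven by active expression contour models, synthesis based on Laplacian transfer computation, synthesis using the expression ratio image framework, synthesis driven by offsets of the main facial feature points, synthesis based on a general expression mapping function, and recent deep-learning-based expression synthesis. For 3D facial expression synthesis, the main approaches are: synthesis based on physical muscle models, synthesis based on deformation, synthesis based on linear regression of 3D shape, synthesis based on facial motion graphs, and recent deep-learning-based 3D facial expression synthesis. For each category, the methodology and its main strengths and weaknesses are discussed. This work is expected to help future researchers better position their research directions and identify technical breakthroughs.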

16.

Facial Expression Synthesis Using Manifold Learning and Belief Propagation

Li Huang Congyong Su 《Soft Computing - A Fusion of Foundations, Methodologies and Applications》2006,10(12):1193-1200

Given a person’s neutral face, we can predict his or her unseen expression using machine learning techniques for image processing. Unlike prior expression cloning or image analogy approaches, we try to hallucinate the person’s plausible facial expression with the help of a large facial expression database. In the first step, regularization-network-based nonlinear manifold learning is used to obtain a smooth estimate of the unseen facial expression, which is better than the reconstruction results of PCA. In the second step, a Markov network is adopted to learn the relationship between low-level local facial features in the residual neutral and expressional face image patches of the training set; belief propagation is then employed to infer the expressional residual face image for that person. By integrating the two approaches, we obtain the final results. The experimental results show that the hallucinated facial expression is not only expressive but also close to the ground truth.

17.

Facial attractiveness: beauty and the machine

Eisenthal Y Dror G Ruppin E 《Neural computation》2006,18(1):119-142

This work presents a novel study of the notion of facial attractiveness in a machine learning context. To this end, we collected human beauty ratings for data sets of facial images and used various techniques for learning the attractiveness of a face. The trained predictor achieves a significant correlation of 0.65 with the average human ratings. The results clearly show that facial beauty is a universal concept that a machine can learn. Analysis of the accuracy of the beauty prediction machine as a function of the size of the training data indicates that a machine producing human-like attractiveness rating could be obtained given a moderately larger data set.
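
The evaluation reported in this abstract compares a predictor's scores with ratings averaged over human raters. The sketch below shows one way such a figure could be computed, assuming a Pearson correlation coefficient; the abstract does not say which correlation measure was used, and the function names and data layout here are hypothetical.

```python
import math
from typing import List, Sequence


def average_ratings(ratings_by_rater: Sequence[Sequence[float]]) -> List[float]:
    """Average each image's rating over all raters (one inner sequence per rater)."""
    n_raters = len(ratings_by_rater)
    return [sum(column) / n_raters for column in zip(*ratings_by_rater)]


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```

Given per-rater scores and a list of predicted scores, `pearson(predicted, average_ratings(ratings_by_rater))` yields a value in the range -1 to 1, where values near 1 indicate close agreement with the human consensus.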

18.

Advances in computational facial attractiveness methods

Shu Liu Yang-Yu Fan Ashok Samal Zhe Guo 《Multimedia Tools and Applications》2016,75(23):16633-16663

The attractiveness of a face plays an important role in many social endeavors. It influences careers such as digital entertainment, modeling, and acting, as well as a person’s career prospects, financial status, and personal relationships. Computational approaches to exploring the nature and components of facial attractiveness have been proposed and have become an emerging topic in facial analysis research. Integrating techniques from image processing, computer vision, and machine learning, this subarea aims to develop computational methods to quantify and investigate the attractiveness of a face. This paper summarizes the most recent advances in four related aspects of facial attractiveness: (a) facial attractiveness prediction, (b) facial attractiveness enhancement, (c) lateral facial attractiveness, and (d) 3D facial attractiveness. The motivations, innovative techniques, and significant results are summarized and discussed. The open problems in these areas and directions for future work are also briefly stated.

19.

Expression modeling—a boundary element approach

K.C. Hui H.C. Leung 《Computers & Graphics》2006,30(6):981-993

Popular techniques for modeling facial expression usually rely on the shape blending of a series of pre-defined facial models, the use of feature parameters, or the use of an anatomy-based facial model. This requires extensive user interaction to construct the pre-defined facial models, the deformation functions, or the anatomy-based facial model. In addition, existing anatomy-based facial modeling techniques target human facial models and may not be directly usable for non-human-like character models. This paper presents an intuitive technique for designing facial expressions using a physics-based deformation approach. The technique does not require specifying the deformation function associated with facial feature parameters, and does not require a detailed anatomical model of the head. By adjusting the contraction or relaxation of a set of facial muscles, different facial expressions can be obtained. Facial muscles and skin are assumed to be linearly elastic. The boundary element method (BEM) is adopted for evaluating deformation of the facial skin. This avoids the use of volumetric elements, as in the finite element method (FEM), or the setting up of complex mass–spring models. Given a polygon mesh of a facial model, a closed volume of the facial mesh is obtained by offsetting the polygon mesh according to a user-defined depth map. Each facial muscle is approximated with a series of muscle polygons on the mesh surface. Deformation of the facial mesh is attained by stretching or compressing the muscle polygons. By pre-computing the inverse of the stiffness matrix, interactive editing of facial expression can be achieved.

20.

Investigating an application of speech‐to‐text recognition: a study on visual attention and learning behaviour

Y‐M. Huang C‐J. Liu R. Shadiev M‐H. Shen W‐Y. Hwang 《Journal of Computer Assisted Learning》2015,31(6):529-545

One major drawback of previous research on speech‐to‐text recognition (STR) is that most findings showing the effectiveness of STR for learning were based on subjective evidence. Very few studies have used eye‐tracking techniques to investigate students' visual attention on STR‐generated text. Furthermore, little attention has been paid to how learner differences, such as learning ability, learning style preferences and gender, affect the use of STR texts. Therefore, this study carried out one experiment. Firstly, participants' visual attention on STR‐generated text was investigated using an eye‐tracking technique. Secondly, the effect of STR‐generated texts on participants' learning achievement was tested. Thirdly, this study compared visual attention and learning behaviour across participants with different characteristics, such as learning ability, learning style preferences and gender, when using STR‐generated texts. Finally, this study explored students' perceptions of the usefulness of STR‐generated texts for learning. This paper discusses results, research findings and implications along with conclusions and several suggestions for future development and research.