18 similar documents found; search time: 625 ms
1.
2.
3.
Research on Text-to-Mouth-Shape Media Conversion Technology  Cited by: 1 (self-citations: 0, others: 1)
Based on the articulation characteristics of Mandarin Chinese, this paper proposes a method that generates the frontal mouth-shape images corresponding to each syllable from basic-phoneme mouth-shape parameters and pre-built mouth-shape models, thereby accomplishing text-to-mouth-shape media conversion. We first build a motion model for deformable objects, then use it to construct a dynamic lip articulation model, and finally apply texture mapping to synthesize frontal articulating-mouth images from the basic-phoneme mouth-shape parameters and the dynamic articulation model.
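A rough sketch of the per-syllable step this abstract describes: each basic phoneme carries mouth-shape parameters, and a syllable's frontal mouth shape is synthesized from them. The phoneme table, the two parameters (lip width and opening), and the blend rule below are all invented for illustration, not taken from the paper.

```python
# Hypothetical per-phoneme mouth-shape parameters: (lip_width, lip_opening),
# both normalized to [0, 1]. Values are illustrative only.
VISEME_PARAMS = {
    "b": (0.30, 0.05),
    "a": (0.55, 0.80),
    "i": (0.90, 0.15),
    "u": (0.25, 0.35),
}

def syllable_mouth_shape(initial, final, alpha=0.35):
    """Blend the initial's and final's parameters into one syllable
    mouth shape; alpha weights the initial consonant."""
    wi, hi = VISEME_PARAMS[initial]
    wf, hf = VISEME_PARAMS[final]
    return (alpha * wi + (1 - alpha) * wf,
            alpha * hi + (1 - alpha) * hf)

def text_to_shapes(syllables):
    """Map a list of (initial, final) syllables to mouth-shape keyframes."""
    return [syllable_mouth_shape(i, f) for i, f in syllables]

shapes = text_to_shapes([("b", "a"), ("b", "i")])
```

In the paper these parameters drive a dynamic lip model and texture mapping; here they simply become keyframe values.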
4.
5.
6.
7.
Wind noise is adopted as the basic model to simulate the dynamic motion of ocean waves. The wave height is governed by functions of the wave's in-plane motion directions: the motion is expressed as two functions, one per planar direction, which together control the wave height. The cross-sectional wave height at a given instant is expressed as a wind-noise function, and a master function unifies the influence of the two directional functions on the wave height. A genetic algorithm with human-in-the-loop evaluation optimizes the simulation quality, and time slices are formed as animation frames to produce a user-satisfying dynamic 3D ocean-wave effect.
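A minimal sketch of the height-field structure this abstract describes: a "wind noise" profile per planar direction, unified by a master function. The noise model (a sum of incommensurate sines), the travel speeds, and the 0.7/0.3 weights are assumptions made for the sketch.

```python
import math
import random

def wind_noise(t, seed=0, n_octaves=4):
    """Band-limited pseudo wind noise: a seeded sum of sines with
    random frequencies and phases. A stand-in for the paper's
    wind-noise function (assumption)."""
    rng = random.Random(seed)
    return sum(math.sin(2 * math.pi * rng.uniform(0.1, 2.0) * t
                        + rng.uniform(0, 2 * math.pi)) / (k + 1)
               for k in range(n_octaves))

def wave_height(x, y, t):
    """Master function: unify two directional noise functions into
    one height value at point (x, y) and time t."""
    fx = wind_noise(x - 0.7 * t, seed=1)   # wave component travelling in +x
    fy = wind_noise(y - 0.3 * t, seed=2)   # slower component in +y
    return 0.7 * fx + 0.3 * fy             # weighted unification

# One animation frame (time slice) sampled on a small grid
frame = [[wave_height(x * 0.5, y * 0.5, t=0.0) for x in range(4)]
         for y in range(4)]
```

Evaluating `wave_height` over a grid per time step yields the animation frames; the paper's interactive genetic algorithm would then tune such parameters against user judgments.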
8.
刘洋 《电视字幕·特技与动画》2007,13(10):39-44
In 3D character facial animation, making the animation look real and natural requires that the character's mouth shapes (lip shapes) while speaking be synchronized with the spoken words, i.e., lip-synced with the audio. Few software tools currently support mouth-shape matching; the better-known ones are Mimic, the lip-sync tool for Poser, and Voice-O-Matic, a lip-sync plug-in for 3ds Max. These tools are complicated to configure, however, and target mainly English; their support for Mandarin Chinese is poor or absent.
9.
To synthesize realistic video sequences, this paper proposes a visual speech synthesis method based on Chinese video triphones. Drawing on the articulation rules of Mandarin and the correspondence between phonemes and visemes, the concept of the video triphone is introduced. On this basis, hidden Markov model (HMM) training and synthesis models are built; training uses joint audio-visual features augmented with dynamic features. During synthesis, video-triphone HMMs are concatenated into a sentence-level HMM, feature parameters are extracted from it, and the visual speech is synthesized. Subjective and objective evaluations show that the synthesized video is highly realistic and well rated.
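The concatenation step can be sketched as follows. Each video triphone is represented here by a short left-to-right sequence of state means (one visual-feature value per state); the `TRIPHONE_MODELS` dictionary and its values are hypothetical stand-ins for trained HMMs, and fixed state durations replace the paper's statistical duration modelling.

```python
# Hypothetical "trained" triphone models: each maps to its ordered
# state means for a single visual parameter (e.g. mouth opening).
TRIPHONE_MODELS = {
    "sil-b+a": [0.1, 0.3],
    "b-a+sil": [0.8, 0.5, 0.2],
}

def sentence_trajectory(triphones, frames_per_state=2):
    """Concatenate triphone state sequences into a sentence-level
    visual-parameter trajectory. Each state is held for a fixed
    number of frames; real HMM synthesis would instead draw durations
    and smooth the trajectory with the dynamic features."""
    traj = []
    for tp in triphones:
        for mean in TRIPHONE_MODELS[tp]:
            traj.extend([mean] * frames_per_state)
    return traj

traj = sentence_trajectory(["sil-b+a", "b-a+sil"])
```

The resulting trajectory is what would drive frame-by-frame rendering of the visual speech.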
10.
李舟 《电视字幕·特技与动画》2009,15(11)
Mouth-shape (lip-sync) animation is a basic element of character animation, and the common approach of keying every vowel and stressed syllable costs a great deal of time and effort. Drawing on the author's own production experience, this article summarizes a "one drawing held for three frames, two when speech is fast" technique that greatly improves productivity.
11.
This paper describes a new and efficient method for facial expression generation on cloned synthetic head models. The system uses abstract facial muscles called action units (AUs) based on both anatomical muscles and the facial action coding system. The facial expression generation method has real-time performance, is less computationally expensive than physically based models, and has greater anatomical correspondence than rational free-form deformation or spline-based techniques. Automatic cloning of a real human head is done by adapting a generic facial and head mesh to Cyberware laser scanned data. The conformation of the generic head to the individual data and the fitting of texture onto it are based on a fully automatic feature extraction procedure. Individual facial animation parameters are also automatically estimated during the conformation process. The entire animation system is hierarchical; emotions and visemes (the visual mouth shapes that occur during speech) are defined in terms of the AUs, and higher-level gestures are defined in terms of AUs, emotions, and visemes as well as the temporal relationships between them. The main emphasis of the paper is on the abstract muscle model, along with limited discussion on the automatic cloning process and higher-level animation control aspects.
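The hierarchical control idea — emotions and visemes defined as combinations of AUs — can be sketched as below. The AU names, weights, and the simple linear blend are illustrative assumptions, not the paper's actual muscle model.

```python
# Hypothetical definitions of higher-level units in terms of AU weights.
EMOTIONS = {"smile": {"AU12": 1.0, "AU6": 0.6}}
VISEMES = {"aa": {"AU26": 0.9, "AU25": 0.7}}

def blend(emotion, viseme, w_emotion=0.5):
    """Combine one emotion and one viseme into a single AU activation
    map for a frame (a naive linear blend; the real system would also
    schedule these over time)."""
    aus = {}
    for name, val in EMOTIONS[emotion].items():
        aus[name] = aus.get(name, 0.0) + w_emotion * val
    for name, val in VISEMES[viseme].items():
        aus[name] = aus.get(name, 0.0) + (1 - w_emotion) * val
    return aus

frame_aus = blend("smile", "aa")
```

The resulting AU activations would then deform the mesh via the abstract muscle model.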
12.
Speech-driven facial animation combines techniques from different disciplines such as image analysis, computer graphics, and speech analysis. Active shape models (ASM) used in image analysis are excellent tools for characterizing lip contour shapes and approximating their motion in image sequences. By controlling the coefficients for an ASM, such a model can also be used for animation. We design a mapping of the articulatory parameters used in phonetics into ASM coefficients that control nonrigid lip motion. The mapping is designed to minimize the approximation error when articulatory parameters measured on training lip contours are taken as input to synthesize the training lip movements. Since articulatory parameters can also be estimated from speech, the proposed technique can form an important component of a speech-driven facial animation system.
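An error-minimizing linear map from articulatory parameters to ASM coefficients can be fit by least squares, as sketched below. The training data is synthetic and the map dimensions (2 articulatory parameters, 3 ASM coefficients) are chosen arbitrarily for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A_true = np.array([[1.5, -0.2],
                   [0.3, 0.9],
                   [0.0, 0.4]])        # hypothetical ground-truth map

X = rng.normal(size=(50, 2))           # articulatory parameters per frame
Y = X @ A_true.T                       # corresponding ASM coefficients

# Solve min ||X W - Y||^2 for W: the mapping from articulatory
# parameters to ASM coefficients that minimizes approximation error
# on the training lip movements.
W, *_ = np.linalg.lstsq(X, Y, rcond=None)

Y_hat = X @ W                          # re-synthesized training coefficients
err = np.abs(Y_hat - Y).max()
```

With noise-free synthetic data the fit is exact; on measured contours the residual `err` quantifies the approximation error the paper minimizes.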
13.
Lifelike talking faces for interactive services  Cited by: 1 (self-citations: 0, others: 1)
Cosatto E., Ostermann J., Graf H.P., Schroeter J. 《Proceedings of the IEEE》 2003, 91(9): 1406-1429
Lifelike talking faces for interactive services are an exciting new modality for man-machine interactions. Recent developments in speech synthesis and computer animation enable the real-time synthesis of faces that look and behave like real people, opening opportunities to make interactions with computers more like face-to-face conversations. This paper focuses on the technologies for creating lifelike talking heads, illustrating the two main approaches: model-based animations and sample-based animations. The traditional model-based approach uses three-dimensional wire-frame models, which can be animated from high-level parameters such as muscle actions, lip postures, and facial expressions. The sample-based approach, on the other hand, concatenates segments of recorded videos, instead of trying to model the dynamics of the animations in detail. Recent advances in image analysis enable the creation of large databases of mouth and eye images, suited for sample-based animations. The sample-based approach tends to generate more natural-looking animations at the expense of a larger size and less flexibility than the model-based animations. Besides lip articulation, a talking head must show appropriate head movements in order to appear natural. We illustrate how such "visual prosody" is analyzed and added to the animations. Finally, we present four applications where the use of face animation in interactive services results in engaging user interfaces and an increased level of trust between user and machine. Using an RTP-based protocol, face animation can be driven with only 800 bits/s in addition to the rate for transmitting audio.
14.
Fast, efficient automatic lip-shape synthesis driven by speech, and good synchronization between speech and lip motion, are the key issues in speech-driven facial animation. A formant-analysis-based method for speech-driven facial animation is proposed. The speech signal is windowed and split into frames and DFT-transformed; the first and second formants of each short-time spectrum are then analyzed, the analysis results are mapped to a control sequence, and the sequence is post-processed to remove outliers. Dynamic basic mouth shapes are defined on a 3D face model, and the control sequence is fed into the model at timed intervals to drive the facial animation. Experimental results show that the method is simple and fast, effectively synchronizes speech and lip shape, and produces smooth, natural animation; it can be widely used for dubbing all kinds of virtual characters and shortens their production cycle.
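The analysis front end (framing, windowing, DFT, spectral-peak picking) can be sketched as below. As a toy stand-in for real formant tracking, the two strongest DFT magnitude peaks per frame are taken as "formants", and the input is a synthetic two-tone signal rather than speech; production formant analysis would use LPC or cepstral smoothing.

```python
import numpy as np

FS = 8000
t = np.arange(FS) / FS
# Synthetic "speech": energy concentrated near 700 Hz and 1200 Hz,
# standing in for the first and second formants.
x = np.sin(2 * np.pi * 700 * t) + 0.8 * np.sin(2 * np.pi * 1200 * t)

def frame_peaks(signal, frame_len=512, hop=256):
    """Per frame: window, DFT, and return the frequencies of the two
    largest magnitude peaks, sorted ascending (toy formant pair)."""
    freqs = np.fft.rfftfreq(frame_len, d=1 / FS)
    win = np.hamming(frame_len)
    out = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        spec = np.abs(np.fft.rfft(signal[start:start + frame_len] * win))
        top2 = np.argsort(spec)[-2:]           # indices of 2 largest bins
        out.append(tuple(sorted(freqs[top2])))
    return out

controls = frame_peaks(x)   # one (f1, f2) pair per frame
```

In the paper's pipeline this control sequence would be de-spiked and then mapped onto the model's basic mouth shapes at fixed time steps.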
15.
《Signal Processing Magazine, IEEE》2001,18(3):17-25
We describe the components of the system used for real-time facial communication using a cloned head. We begin by describing the automatic face cloning using two orthogonal photographs of a person. The steps in this process are face model matching and texture generation. After an introduction to the MPEG-4 parameters that we are using, we proceed with the explanation of the facial feature tracking using a video camera. The technique requires an initialization step and is further divided into mouth and eye tracking. These steps are explained in detail. We then explain the speech processing techniques used for real-time phoneme extraction and the subsequent speech animation module. We conclude with the results and comments on the integration of the modules towards a complete system.
16.
Facial animation is widely used in the game industry, teleconferencing, agents, avatars, and many other areas, and has attracted much research in recent years; animating organs such as the mouth and eyes has remained a major difficulty. This paper presents a method that blends sample images of the mouth and eyes into a face image and generates facial animation from a single neutral face photograph. The method generates splines from feature points and interpolates the splines in polar coordinates to realize the spatial mapping; backward mapping and interpolation are then used to resample the image and obtain the blended result. Experimental results show that the blended images look fairly natural, that the method can animate the mouth, eyeballs, and similar organs, and that it meets the real-time requirements of facial animation generation.
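The backward-mapping resampling step can be sketched as below: for each destination pixel we ask where it comes from in the source image and bilinearly interpolate there. A simple half-pixel translation stands in for the paper's spline-based polar mapping, which is an assumption of this sketch.

```python
import numpy as np

def backward_warp(src, mapping, out_shape):
    """Resample src via backward mapping: mapping(y, x) returns the
    fractional source coordinates for destination pixel (y, x);
    the value is bilinearly interpolated from the 4 neighbours."""
    h, w = out_shape
    out = np.zeros(out_shape)
    for y in range(h):
        for x in range(w):
            sy, sx = mapping(y, x)
            y0, x0 = int(np.floor(sy)), int(np.floor(sx))
            if 0 <= y0 < src.shape[0] - 1 and 0 <= x0 < src.shape[1] - 1:
                dy, dx = sy - y0, sx - x0
                out[y, x] = ((1 - dy) * (1 - dx) * src[y0, x0]
                             + (1 - dy) * dx * src[y0, x0 + 1]
                             + dy * (1 - dx) * src[y0 + 1, x0]
                             + dy * dx * src[y0 + 1, x0 + 1])
    return out

src = np.arange(25, dtype=float).reshape(5, 5)   # toy "image"
shifted = backward_warp(src, lambda y, x: (y + 0.5, x + 0.5), (5, 5))
```

Because the loop runs over destination pixels, every output pixel gets exactly one value — the property that makes backward mapping preferable to forward mapping for image resampling.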
17.
Realistic speech animation based on observed 3D face dynamics  Cited by: 1 (self-citations: 0, others: 1)
Muller P., Kalberer G.A., Proesmans M., Van Gool L. 《IEE Proceedings - Vision, Image and Signal Processing》 2005, 152(4): 491-500
An efficient system for realistic speech animation is proposed. The system supports all steps of the animation pipeline, from the capture or design of 3D head models up to the synthesis and editing of the performance. This pipeline is fully 3D, which yields high flexibility in the use of the animated character. Real detailed 3D face dynamics, observed at video frame rate for thousands of points on the face of speaking actors, underpin the realism of the facial deformations. These are given a compact and intuitive representation via independent component analysis (ICA). Performances amount to trajectories through this 'viseme space'. When asked to animate a face, the system replicates the 'visemes' that it has learned, and adds the necessary coarticulation effects. Realism has been improved through comparisons with motion-captured ground truth. Faces for which no 3D dynamics have been observed can be animated nonetheless. Their visemes are adapted automatically to their physiognomy by localising the face in a 'face space'.
18.
Subjective evaluation experiments were used to analyze the effect of frequency-response range on speech intelligibility. Comparing the measured average intelligibility across different frequency-response ranges shows that signal content above 4 kHz still has a considerable effect on intelligibility, with the 4-6 kHz band being the most influential. As for the lower cutoff frequency, it can at least be concluded that as long as it is not below 300 Hz, it has little effect on speech intelligibility. The experimental results are analyzed in terms of the spectral characteristics of Mandarin initials (consonants) and finals (vowels).
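The band-limiting used to create such listening-test conditions can be sketched with an ideal FFT-mask band-pass filter. The test signal below is synthetic (one low-band tone plus one 5 kHz tone standing in for high-band consonant energy); real experiments would filter recorded speech.

```python
import numpy as np

FS = 16000
t = np.arange(FS) / FS
# Synthetic test signal: low-band component plus a 5 kHz component
# representing energy in the influential 4-6 kHz band.
x = np.sin(2 * np.pi * 500 * t) + 0.5 * np.sin(2 * np.pi * 5000 * t)

def band_limit(signal, lo_hz, hi_hz, fs=FS):
    """Ideal band-pass: zero every DFT bin outside [lo_hz, hi_hz]."""
    spec = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1 / fs)
    spec[(freqs < lo_hz) | (freqs > hi_hz)] = 0
    return np.fft.irfft(spec, n=len(signal))

full = band_limit(x, 300, 8000)    # wideband listening condition
cut4k = band_limit(x, 300, 4000)   # condition with the high band removed
lost = np.sqrt(np.mean((full - cut4k) ** 2))  # RMS carried above 4 kHz
```

Here `lost` quantifies how much signal the 4 kHz cutoff discards; the paper's finding is that listeners notice precisely this removed 4-6 kHz content.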