Similar Literature
20 similar records found (search time: 78 ms)
1.
A Multichannel Game User Interface System Based on Video and Speech   (total citations: 2; self-citations: 1; by others: 2)
We designed and implemented a multichannel game user interface system based on video and speech, aimed at enhancing the interactivity of computer games and the player's sense of immersion. The system creates and effectively integrates two interaction channels, video and speech, comprising three modules: face model reconstruction, head pose estimation, and Mandarin speech recognition. It can rapidly produce a personalized face model for a game character, and allows players to control game characters and game progress in real time using head poses and voice commands. Testing and deployment show that the system is suitable for ordinary players and real game environments.

2.
3D face models are widely used in video telephony, video conferencing, film and television production, computer games, face recognition, and other fields. Existing 3D face modeling methods generally require multiple images and a neutral expression. This paper proposes a method for reconstructing a 3D face with an arbitrary expression from frontal and profile views. Facial features are first extracted from the 2D images; then, starting from a statistical 3D face model, a specific 3D face is obtained through scaling, translation, and rotation, together with global and local matching. Texture mapping with the facial texture from the 2D images yields a complete 3D face. Reconstruction experiments on a large number of real 2D face images confirm the effectiveness and robustness of the method.

3.
Many computer games treat the user in the "1st person" and bind the camera to his or her view. More sophistication in a game can be achieved by enabling the camera to leave the user's viewpoint. This, however, requires new methods for automatic, dynamic camera control. In this paper we present methods and tools for such camera control. We emphasize guiding camera control by constraints; however, optimal constraint satisfaction tends to make the camera jump around too much. Thus, we pay particular attention to a trade-off between constraint satisfaction and frame coherence. We present a new algorithm for dynamic consideration of the visibility of objects which are deemed to be important in a given game context.
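The trade-off this abstract describes can be sketched as a weighted cost: one term scores how well a candidate camera position satisfies the constraints, another penalizes jumping away from the previous frame's camera. A minimal illustration, not the paper's algorithm; the function names and weight values are hypothetical:

```python
def camera_cost(candidate, target, prev_cam, w_constraint=1.0, w_coherence=0.5):
    """Score a candidate camera position; lower is better.

    Blends constraint satisfaction (distance from the ideal viewpoint
    `target`) against frame coherence (distance from the previous
    frame's camera `prev_cam`), so the camera tracks its goal without
    jumping between frames.  Positions are (x, y, z) tuples; the
    weights are illustrative assumptions.
    """
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5

    return w_constraint * dist(candidate, target) + w_coherence * dist(candidate, prev_cam)


def choose_camera(candidates, target, prev_cam):
    """Pick the candidate position with the lowest blended cost."""
    return min(candidates, key=lambda c: camera_cost(c, target, prev_cam))
```

Raising `w_coherence` relative to `w_constraint` trades constraint satisfaction for a smoother camera path.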

4.
In this paper, we demonstrate how a new interactive 3D desktop metaphor, based on two-handed 3D direct manipulation registered with head-tracked stereo viewing, can be applied to the task of constructing animated characters. In our configuration, a six degree-of-freedom head-tracker and CrystalEyes shutter glasses are used to produce stereo images that dynamically follow the user's head motion. 3D virtual objects can be made to appear at a fixed location in physical space which the user may view from different angles by moving his or her head. To construct 3D animated characters, the user interacts with the simulated environment using both hands simultaneously: the left hand, controlling a Spaceball, is used for 3D navigation and object movement, while the right hand, holding a 3D mouse, is used to manipulate, through a virtual tool metaphor, the objects appearing in front of the screen. In this way, both incremental and absolute interactive input techniques are provided by the system. Hand-eye coordination is made possible by registering virtual space exactly to physical space, allowing a variety of complex 3D tasks necessary for constructing 3D animated characters to be performed more easily and more rapidly than is possible using traditional interactive techniques. The system has been tested using both Polhemus Fastrak and Logitech ultrasonic input devices for tracking the head and the 3D mouse.

5.
Head pose estimation is a key task for visual surveillance, HCI, and face recognition applications. In this paper, a new approach is proposed for estimating 3D head pose from a monocular image. The approach assumes the full perspective projection camera model. Our approach employs general prior knowledge of face structure and the corresponding geometrical constraints provided by the location of a certain vanishing point to determine the pose of human faces. To achieve this, the eye-lines, formed from the far and near eye corners, and the mouth-line, formed from the mouth corners, are assumed parallel in 3D space. The vanishing point of these parallel lines, found at the intersection of the eye-line and mouth-line in the image, can then be used to infer the 3D orientation and location of the human face. In order to deal with the variance of the facial model parameters, e.g. the ratio between the eye-line and the mouth-line, an EM framework is applied to update the parameters. We first compute the 3D pose using some initially learnt parameters (such as ratio and length) and then adapt the parameters statistically for individual persons and their facial expressions by minimizing the residual errors between the projection of the model feature points and the actual features in the image. In doing so, we assume every facial feature point can be associated with each of the feature points in the 3D model with some a posteriori probability. The expectation step of the EM algorithm provides an iterative framework for computing the a posteriori probabilities using Gaussian mixtures defined over the parameters. A robustness analysis of the algorithm on synthetic data and some real images with known ground truth is included.
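The geometric core of this approach — intersecting the eye-line and mouth-line to find their vanishing point — can be sketched with homogeneous coordinates, where both the line through two points and the intersection of two lines are cross products. A minimal illustration, not the authors' code; the feature points are assumed already detected:

```python
def cross(a, b):
    """Cross product of two 3-vectors (homogeneous points or lines)."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def vanishing_point(eye_a, eye_b, mouth_a, mouth_b):
    """Vanishing point of the eye-line and mouth-line in the image.

    In homogeneous coordinates, the line through two points and the
    intersection of two lines are both cross products.  Returns None
    when the two image lines are parallel (no finite vanishing point,
    e.g. a perfectly frontal face).
    """
    to_h = lambda p: (p[0], p[1], 1.0)
    eye_line = cross(to_h(eye_a), to_h(eye_b))
    mouth_line = cross(to_h(mouth_a), to_h(mouth_b))
    x, y, w = cross(eye_line, mouth_line)
    if abs(w) < 1e-9:
        return None
    return (x / w, y / w)
```

The paper then maps this image-plane vanishing point, together with the learnt model ratios, to a 3D orientation; that step is omitted here.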

6.
Active Appearance Model (AAM) is an algorithm for fitting a generative model of object shape and appearance to an input image. AAM allows accurate, real-time tracking of human faces in 2D and can be extended to track faces in 3D by constraining its fitting with a linear 3D morphable model. Unfortunately, this AAM-based 3D tracking does not provide adequate accuracy and robustness, as we show in this paper. We introduce a new constraint into AAM fitting that uses depth data from a commodity RGBD camera (Kinect). This addition significantly reduces 3D tracking errors. We also describe how to initialize the 3D morphable face model used in our tracking algorithm by computing the user's face shape parameters from a batch of tracked frames. The described face tracking algorithm is used in Microsoft's Kinect system.

7.
An Interactive Rapid Modeling System Based on Image Sequences   (total citations: 1; self-citations: 1; by others: 0)
We present an interactive 3D modeling system based on image sequences. Given an uncalibrated image or video sequence as input, the system automatically recovers the camera parameters; the user then only needs to sketch the object's structure on a few frames, and the system automatically resolves the correspondences of these user interactions across frames, quickly reconstructing a realistic 3D model of the scene. The system supports reconstruction of points and line segments, lines and planes, and curves and surfaces, meeting the demand for fast, high-precision reconstruction of complex real-world scenes. Modeling experiments on several sequences of real photographs show that the system is efficient and practical, and satisfies real-world modeling needs.

8.
Digitally sculpting 3D human faces is a very challenging task. It typically requires either 1) highly-skilled artists using complex software packages for high-quality results, or 2) highly-constrained simple interfaces for consumer-level avatar creation, such as in game engines. We propose a novel interactive method for the creation of digital faces that is simple and intuitive to use, even for novice users, while consistently producing plausible 3D face geometry, and allowing editing freedom beyond traditional video game avatar creation. At the core of our system lies a specialized anatomical local face model (ALM), which is constructed from a dataset of several hundred 3D face scans. User edits are propagated to constraints for an optimization of our data-driven ALM model, ensuring the resulting face remains plausible even for simple edits like clicking and dragging surface points. We show how several natural interaction methods can be implemented in our framework, including direct control of the surface, indirect control of semantic features like age, ethnicity, gender, and BMI, as well as indirect control through manipulating the underlying bony structures. The result is a simple new method for creating digital human faces, for artists and novice users alike. Our method is attractive for low-budget VFX and animation productions, and our anatomical modeling paradigm can complement traditional game engine avatar design packages.

9.
Reanimating Faces in Images and Video   (total citations: 8; self-citations: 0; by others: 8)

10.
Eye contact and gaze awareness play a significant role in conveying emotions and intentions during face-to-face conversation. Humans can perceive each other's gaze quite naturally and accurately. However, gaze awareness/perception is ambiguous during video teleconferencing performed on computer-based devices (such as laptops, tablets, and smartphones). The reasons for this ambiguity are (i) the camera position relative to the screen and (ii) the 2D rendition of a 3D human face, i.e., the 2D screen is unable to deliver an accurate gaze during video teleconferencing. To solve this problem, researchers have proposed different hardware setups with complex software algorithms. The most recent solutions for accurate gaze perception employ 3D interfaces, such as 3D screens and 3D face-masks. However, today's commonly used video teleconferencing devices are smart devices with 2D screens. Therefore, there is a need to improve gaze awareness/perception on these smart devices. In this work, we revisit the question of how to improve a remote user's gaze awareness among his/her collaborators. Our hypothesis is that an accurate gaze perception can be achieved by the 3D embodiment of a remote user's head gestures during video teleconferencing. We have prototyped an embodied telepresence system (ETS) for the 3D embodiment of a remote user's head. Our ETS is based on a 3-DOF neck robot with a mounted smart device (tablet PC). The electromechanical platform in combination with a smart device is a novel setup for studying gaze awareness/perception on 2D screen-based smart devices during video teleconferencing. Two important gaze-related issues are considered in this work: (i) the 'Mona Lisa gaze effect' – the gaze appears directed at the observer regardless of his or her position in the room, and (ii) 'gaze awareness/faithfulness' – the ability to perceive an accurate spatial relationship between the observing person and the object being looked at. Our results confirm that the 3D embodiment of a remote user's head not only mitigates the Mona Lisa gaze effect but also supports three levels of gaze faithfulness, hence accurately projecting the human gaze in distant space.

11.
Graphical Models, 2001, 63(5): 333-368
This paper proposes a camera-based real-time system for building a three-dimensional (3D) human head model. The proposed system is first trained in a semi-automatic way to locate the user's facial area and is then used to build a 3D model based on the front and profile views of the user's face. This is achieved by directing the user to position his or her face and profile in a highlighted area, which is used to train a neural network to distinguish the background from the face. With a blink from the user, the system is then capable of accurately locating a set of characteristic feature points on the front and profile views of the face, which are used for the adaptation of a generic 3D face model. This adaptation procedure is initialized with a rigid transformation of the model aiming to minimize the distances of the 3D model feature nodes from the calculated 3D coordinates of the 2D feature points. Then, a nonrigid transformation ensures that the feature nodes are displaced optimally close to their exact calculated positions, dragging their neighbors in a way that deforms the facial model in a natural-looking manner. A male hair model is created using a 3D ellipsoid, which is truncated and merged with the adapted face model. A cylindrical texture map is finally built from the two image views, covering the whole area of the head by exploiting the inherent symmetry of the face. The final result is a complete, textured model of a specific person's head.
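The rigid initialization step — finding the rotation and translation that minimize the distances of the model's feature nodes from the calculated 3D feature coordinates — is a classic absolute-orientation problem. A sketch using the Kabsch algorithm, a standard solution for this step (the paper does not name its solver, so this is an assumption):

```python
import numpy as np

def rigid_align(src, dst):
    """Rotation R and translation t minimizing ||(R @ src_i + t) - dst_i||.

    Kabsch algorithm: center both point sets, take the SVD of the
    cross-covariance, and correct the sign so R is a proper rotation
    (no reflection).  `src` and `dst` are N x 3 arrays of
    corresponding points.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    H = (src - mu_s).T @ (dst - mu_d)          # cross-covariance
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))     # reflection guard
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = mu_d - R @ mu_s
    return R, t
```

The nonrigid refinement that follows in the paper then displaces individual feature nodes from this rigidly aligned starting pose.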

12.
3D human face model reconstruction is essential to the generation of facial animations, which are widely used in the field of virtual reality (VR). The main issues in image-based 3D facial model reconstruction by vision technologies are twofold: one is to select and match the corresponding facial features from two images with minimal interaction, and the other is to generate a realistic-looking human face model. In this paper, a new algorithm for realistic-looking face reconstruction based on stereo vision is presented. First, a pattern is printed and attached to a planar surface for camera calibration; corner generation and corner matching between the two images are performed by integrating a modified pyramidal Lucas-Kanade (PLK) algorithm with a local adjustment algorithm, and the 3D coordinates of the corners are then obtained by 3D reconstruction. An individual face model is generated by deforming a generic 3D model and interpolating the features. Finally, a realistic-looking human face model is obtained.
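Once corners are matched across the calibrated stereo pair, each corner's 3D coordinates follow from linear triangulation. A sketch of the standard DLT triangulation step — an illustration of the general technique, not the paper's implementation:

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one matched corner.

    P1, P2 are the 3x4 projection matrices of the two calibrated
    views; x1, x2 are the pixel coordinates of the same corner in
    each image.  The homogeneous 3D point is the null vector of the
    4x4 design matrix, recovered from its SVD.
    """
    P1 = np.asarray(P1, dtype=float)
    P2 = np.asarray(P2, dtype=float)
    A = np.array([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]                 # right singular vector of the smallest singular value
    return X[:3] / X[3]        # dehomogenize
```

Applied to every matched corner, this yields the cloud of 3D points used to deform the generic face model.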

13.
Generating 3D models of objects from video sequences is an important problem in many multimedia applications ranging from teleconferencing to virtual reality. In this paper, we present a method of estimating a 3D face model from a monocular image sequence, using a few standard results from the affine camera geometry literature in computer vision together with spline fitting based on a modified nonparametric regression technique. We use bicubic spline functions to model the depth map, given a set of observed depth maps computed from frame pairs in a video sequence. The minimal number of splines is chosen on the basis of the Schwarz criterion. We extend the spline fitting algorithm to hierarchical splines. Note that neither the camera calibration parameters nor prior knowledge of the object shape is required by the algorithm. The system has been successfully demonstrated to extract the 3D face structure of humans as well as other objects, starting from their image sequences.
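The model-selection step — choosing the minimal number of basis functions via the Schwarz criterion (BIC) — can be illustrated on a simple 1D fit; here polynomial order stands in for the paper's bicubic spline count, so the basis is an assumption made purely for illustration:

```python
import numpy as np

def schwarz_criterion(y, y_hat, n_params):
    """Schwarz criterion (BIC) for a least-squares fit:
    n * log(RSS / n) + n_params * log(n).  The small epsilon guards
    against log(0) on a perfect fit."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    rss = float(np.sum((y - np.asarray(y_hat, dtype=float)) ** 2))
    return n * np.log(rss / n + 1e-12) + n_params * np.log(n)


def select_order(x, y, max_order=8):
    """Fit polynomials of increasing order and keep the one that
    minimizes the Schwarz criterion, balancing fit quality against
    model size."""
    best_order, best_bic = None, None
    for k in range(1, max_order + 1):
        coeffs = np.polyfit(x, y, k)
        bic = schwarz_criterion(y, np.polyval(coeffs, x), k + 1)
        if best_bic is None or bic < best_bic:
            best_order, best_bic = k, bic
    return best_order
```

The same criterion, applied to the number of splines rather than a polynomial order, picks the smallest model that explains the observed depth maps.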

14.
Visual fidelity and interactivity are the main goals in Computer Graphics research, but recently audio has also assumed an important role. Binaural rendering can provide extremely pleasing and realistic three-dimensional sound, but to achieve the best results it is necessary either to measure or to estimate the individual Head-Related Transfer Function (HRTF). This function is strictly related to the peculiar features of the listener's ears and face. Recent sound-scattering simulation techniques can calculate the HRTF starting from an accurate 3D model of a human head. Hence, the use of binaural rendering on a large scale (e.g. video games, entertainment) depends on the possibility of producing a sufficiently accurate 3D model of a human head from the smallest possible input. In this paper we present a completely automatic system which produces a 3D model of a head starting from simple input data (five photos and some key points indicated by the user). The geometry is generated by extracting information from the images and accordingly deforming a 3D dummy to reproduce the user's head features. The system proves to be fast, automatic, robust and reliable: geometric validation and preliminary assessments show that it can be accurate enough for HRTF calculation.

15.
In this paper, we present a system for person re-identification in TV series. In the context of video retrieval, person re-identification refers to the task where a user clicks on a person in a video frame and the system then finds other occurrences of the same person in the same or different videos. The main characteristic of this scenario is that no previously collected training data is available, so no person-specific models can be trained in advance. Additionally, the query data is limited to the image that the user clicks on. These conditions pose a great challenge to the re-identification system, which has to find the same person in other shots despite large variations in the person's appearance. In this study, facial appearance is used as the re-identification cue, since, in contrast to surveillance-oriented re-identification studies, the person can have different clothing in different shots. In order to increase the amount of available face data, the proposed system employs a face tracker that can track faces up to full profile views. This makes it possible to use a profile face image as the query image and also to retrieve images with non-frontal poses. It also provides temporal association of the face images in the video, so that instead of using single images for query or target, whole tracks can be used. A fast and robust face recognition algorithm is used to find matching faces. If the match result is highly confident, our system adds the matching face track to the query set. Finally, if the user is not satisfied with the number of returned results, the system can present a small number of candidate face images and let the user confirm the ones that belong to the queried person. These features help to increase the variation in the query set, making it possible to retrieve results with different poses, illumination conditions, etc. The system is extensively evaluated on two episodes of the TV series Coupling, showing very promising results.

16.
We present a novel performance-driven approach to animating cartoon faces starting from pure 2D drawings. A 3D approximate facial model, automatically built from front and side view master frames of the character drawings, is introduced to enable the animated cartoon faces to be viewed from angles different from those in the input video. The expressive mappings are built by an artificial neural network (ANN) trained on examples of the real face in the video and the cartoon facial drawings in the facial expression graph for a specific character. The learned mapping model enables the resulting facial animation to properly achieve the desired expressiveness, instead of merely reproducing the facial actions in the input video sequence. Furthermore, the lit sphere, capturing the lighting in the painted artwork of faces, is utilized to color the cartoon faces in terms of the 3D approximate facial model, reinforcing the hand-drawn appearance of the resulting facial animation. We conducted a series of comparative experiments to test the effectiveness of our method by recreating the facial expressions in a commercial animation. The comparison results clearly demonstrate the superiority of our method, not only in generating high-quality cartoon-style facial expressions, but also in speeding up the animation production of cartoon faces. Copyright © 2011 John Wiley & Sons, Ltd.

17.
An interactive system is described for creating and animating deformable 3D characters. By using a hybrid layered model of kinematic and physics-based components together with an immersive 3D direct manipulation interface, it is possible to quickly construct characters that deform naturally when animated and whose behavior can be controlled interactively using intuitive parameters. In this layered construction technique, called the elastic surface layer model, a simulated elastically deformable skin surface is wrapped around a kinematic articulated figure. Unlike previous layered models, the skin is free to slide along the underlying surface layers, constrained by geometric constraints which push the surface out and spring forces which pull the surface in to the underlying layers. By tuning the parameters of the physics-based model, a variety of surface shapes and behaviors can be obtained, such as more realistic-looking skin deformation at the joints, skin sliding over muscles, and dynamic effects such as squash-and-stretch and follow-through. Since the elastic model derives all of its input forces from the underlying articulated figure, the animator may specify all of the physical properties of the character once, during the initial character design process, after which a complete animation sequence can be created using a traditional skeleton animation technique. Character construction and animation are done using a 3D user interface based on two-handed manipulation registered with head-tracked stereo viewing. In our configuration, a six degree-of-freedom head-tracker and CrystalEyes shutter glasses are used to display stereo images on a workstation monitor that dynamically follow the user's head motion. 3D virtual objects can be made to appear at a fixed location in physical space which the user may view from different angles by moving his or her head. To construct 3D animated characters, the user interacts with the simulated environment using both hands simultaneously: the left hand, controlling a Spaceball, is used for 3D navigation and object movement, while the right hand, holding a 3D mouse, is used to manipulate, through a virtual tool metaphor, the objects appearing in front of the screen. Hand-eye coordination is made possible by registering virtual space to physical space, allowing a variety of complex 3D tasks necessary for constructing 3D animated characters to be performed more easily and more rapidly than is possible using traditional interactive techniques.

18.
This paper presents the design of a robust face detector that can detect arbitrarily tilted human faces in color images. The detector locates face regions by identifying mouth corners and eyes. The novel techniques included in the proposed detector are: (1) a method for compensating the colors of the input images, (2) a method for deskewing tilted faces, (3) a method for locating mouth corners, and (4) a discriminant function for positioning eyes. According to the performance evaluation on three test databases containing 1791 faces in 1580 images, the proposed method achieves a precision rate of 94.62% and a recall rate of 92.24% on average, at a detection speed of 1.6 faces per second. The proposed detector also slightly outperforms a detector from CMU.
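The reported precision and recall follow the standard definitions over true positives (correct detections), false positives (spurious detections), and false negatives (missed faces). A minimal helper, generic and not tied to the paper's counts:

```python
def precision_recall(true_pos, false_pos, false_neg):
    """Precision = TP / (TP + FP): fraction of detections that are
    real faces.  Recall = TP / (TP + FN): fraction of real faces
    that were detected."""
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return precision, recall
```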

19.
Real, 2001, 7(2): 173-182
Three-dimensional human head modeling is useful in video-conferencing and other virtual reality applications. However, manual construction of 3D models using CAD tools is often expensive and time-consuming. Here we present a robust and efficient method for the construction of a 3D human head model from perspective images taken from different viewing angles. In our system, a generic head model is first used; three images of the head are then required to adjust the deformable contours on the generic model to bring it closer to the target head. Our contributions are as follows. Our system uses perspective images, which are more realistic than the orthographic projection approximations used in earlier works. Also, for shaping and positioning facial features, we present a method for estimating the camera focal length and the 3D coordinates of facial landmarks when the camera transformation is known. We also provide an alternative for the 3D coordinate estimation using epipolar geometry when the extrinsic parameters are absent. Our experiments demonstrate that our approach produces good and realistic results.

20.
A Video-Based 3D-Reconstruction of Soccer Games   (total citations: 1; self-citations: 0; by others: 1)
In this paper we present SoccerMan, a reconstruction system designed to generate animated, virtual 3D views from two synchronous video sequences of a short part of a given soccer game. After the reconstruction process, which also requires some manual interaction, the virtual 3D scene can be examined and 'replayed' from any viewpoint. Players are modeled as so-called animated texture objects, i.e. 2D player shapes are extracted from video and texture-mapped onto rectangles in 3D space. Animated texture objects have proven very appropriate as a 3D representation of soccer players in motion, as the visual nature of the original human motion is preserved. The trajectories of the players and the ball in 3D space are reconstructed accurately. In order to create a 3D reconstruction of a given soccer scene, the following steps have to be executed: 1) The camera parameters of all frames of both sequences are computed (camera calibration). 2) The playground texture is extracted from the video sequences. 3) The trajectories of the ball and the players' heads are computed after manually specifying their image positions in a few key frames. 4) Player textures are extracted automatically from video. 5) The shapes of colliding or occluding players are separated automatically. 6) For visualization, player shapes are texture-mapped onto appropriately placed rectangles in virtual space. SoccerMan is a novel experimental sports-analysis system with fairly ambitious objectives. Its design decisions, in particular to start from two synchronous video sequences and to model players by texture objects, have already proven promising.


Copyright © Beijing Qinyun Technology Development Co., Ltd.    京ICP备09084417号-23
