Similar Documents
 20 similar documents found
1.
Considering that existing gaze-point estimation methods restrict the user's head movement and pose potential risks to the eyes, we propose a gaze-point estimation method based on the facial normal and binocular vision. First, we calibrate the stereo cameras to determine their extrinsic and intrinsic parameters. Second, the face is quickly detected with the Viola–Jones framework, the centers of the two irises are located using integro-differential operators, and the two nostrils and the mouth are detected from saturation differences, giving the 2D coordinates of all five points. Third, the 3D coordinates of these five points are obtained by stereo matching and 3D reconstruction. A least-squares plane-fitting algorithm then yields an approximate facial plane, from which the normal through the midpoint of the two pupils is computed. Finally, the point of gaze is obtained as the intersection of the facial normal with the computer screen. Experimental results confirm the accuracy and robustness of the proposed method.
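The last two steps above (least-squares plane fitting and intersecting the facial normal with the screen) can be sketched as follows. This is a minimal illustration, assuming the screen lies in the plane z = 0 of the camera coordinate system; function names are ours, not the paper's.

```python
import numpy as np

def fit_plane(points):
    # Least-squares plane through 3D points: returns (centroid, unit normal).
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    # The right singular vector with the smallest singular value is the normal.
    _, _, vt = np.linalg.svd(pts - centroid)
    return centroid, vt[-1]

def gaze_point_on_screen(origin, normal, screen_z=0.0):
    # Intersect the ray origin + t * normal with the screen plane z = screen_z.
    t = (screen_z - origin[2]) / normal[2]
    return origin + t * normal
```

In the paper's setting, `origin` would be the midpoint of the two pupils and `normal` the fitted facial-plane normal.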

2.
Under a single-camera, single-light-source setup, and addressing the complex calibration procedures of existing gaze estimation methods, we propose a new single-point-calibration gaze estimation method. The method builds, in advance, a statistical gaze estimation model for multiple points on the screen, and then estimates the user's on-screen gaze point by interpolation. The main contributions are: 1) a statistics-based single-point-calibration gaze estimation model that reduces the complexity of the calibration procedure; 2) incremental learning to further update the model, improving its adaptability to different users and to head movement. Experiments show that, with simple equipment and head movement allowed, the method achieves high accuracy with only single-point calibration.

3.
As a new modality for information acquisition and human–computer interaction, gaze tracking has become a popular research direction in computer vision, with gaze estimation as its core technique. To address the complex calibration and restricted head movement of existing gaze estimation methods, an improved gaze estimation method based on the 2D pupil–corneal reflection technique is proposed. Under a single-camera, single-light-source setup, single-point-calibration gaze estimation is achieved by building a pupil–corneal reflection model and compensating for individual-difference errors and head-movement errors. Experimental results show that, within a certain range, head movement does not noticeably degrade the accuracy of the estimated gaze.
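In 2D pupil–corneal reflection (PCCR) methods like the one above, the pupil–glint vector is commonly mapped to screen coordinates with a second-order polynomial fitted from calibration samples. A minimal sketch of that mapping, with function names of our own (the paper's exact model and compensation terms are not given in the abstract):

```python
import numpy as np

def poly_features(v):
    # Second-order polynomial terms of the PCCR vector (x, y).
    x, y = v
    return np.array([1.0, x, y, x * y, x * x, y * y])

def fit_mapping(pccr_vectors, screen_points):
    # Least-squares fit of screen coordinates as a 2nd-order polynomial
    # of the PCCR vectors collected during calibration.
    A = np.array([poly_features(v) for v in pccr_vectors])
    coeffs, *_ = np.linalg.lstsq(A, np.asarray(screen_points, float), rcond=None)
    return coeffs  # shape (6, 2)

def estimate_gaze(coeffs, pccr_vector):
    # Map a new PCCR vector to an on-screen gaze point.
    return poly_features(pccr_vector) @ coeffs
```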

4.
Gaze tracking is an important research topic in multimodal human–computer interaction, and the pupil–corneal reflection technique is one of the most widely used gaze-tracking approaches. Its main purpose is to extract the pupil–corneal reflection vector from eye images as the visual input required by the gaze-direction model. By setting up infrared light sources to extract the pupil–corneal reflection vector, a gaze-tracking system based on this technique is built, providing a feasible low-cost solution for gaze-tracking research oriented toward human–computer interaction.

5.
Spoofing attacks on biometric systems are one of the major impediments to their use for secure unattended applications. This paper explores features for face liveness detection based on tracking the gaze of the user. In the proposed approach, a visual stimulus is placed on the display screen, at apparently random locations, which the user is required to follow while their gaze is measured. This visual stimulus appears in such a way that it repeatedly directs the gaze of the user to specific positions on the screen. Features extracted from sets of collinear and colocated points are used to estimate the liveness of the user. Data are collected from genuine users tracking the stimulus with natural head/eye movements and from impostors holding a photograph, looking through a 2D mask, or replaying the video of a genuine user. The choice of stimulus and features is based on the assumption that natural head/eye coordination for directing gaze results in greater accuracy and thus can be used to effectively differentiate between genuine and spoofing attempts. Tests are performed to assess the effectiveness of the system with these features in isolation as well as in combination using score fusion techniques. The results indicate the effectiveness of the proposed gaze-based features in detecting such presentation attacks.
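Score-level fusion of several liveness features can be done, for example, with a weighted sum after per-feature normalisation. This is a generic sketch of that idea, not the paper's exact fusion rule:

```python
import numpy as np

def fuse_scores(score_lists, weights=None):
    # Weighted-sum fusion after z-score normalisation of each feature's scores.
    S = np.asarray(score_lists, float)            # shape (n_features, n_samples)
    mu = S.mean(axis=1, keepdims=True)
    sd = S.std(axis=1, keepdims=True)
    Z = (S - mu) / np.where(sd == 0, 1, sd)       # guard against zero variance
    w = np.ones(len(S)) if weights is None else np.asarray(weights, float)
    return w @ Z / w.sum()                        # one fused score per sample
```

Samples whose fused score exceeds a threshold tuned on validation data would be accepted as live.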

6.
Head gaze, or the orientation of the head, is a very important attentional cue in face-to-face conversation. Some subtleties of gaze can be lost in common teleconferencing systems, because a single perspective warps spatial characteristics. A recent random hole display is a potentially interesting display for group conversation, as it allows multiple stereo viewers in arbitrary locations, without the restriction that conventional autostereoscopic displays place on viewing positions. We represented a remote person as an avatar on a random hole display and evaluated the system by measuring the ability of multiple observers, at different horizontal and vertical viewing angles, to accurately and simultaneously judge which targets the avatar is gazing at. We compared three perspective conditions: a conventional 2D view, a monoscopic perspective-correct view, and a stereoscopic perspective-correct view; in the latter two conditions, the random hole display shows three and six views simultaneously. Although the random hole display does not provide a high-quality view, because it has to distribute display pixels among multiple viewers, the different views are easily distinguished. Results suggest the combined presence of perspective-correct and stereoscopic cues significantly improved the effectiveness with which observers were able to assess the avatar's head gaze direction. This motivates the need for stereo in future multiview displays.

7.
HoloTabletop is a low-cost holographic-like tabletop interactive system. The system analyzes the user's head position and gaze location in real time and computes the corresponding anamorphic illusion image, which is displayed on a horizontally placed 2D monitor yet offers stereo vision to the user. The user can view and interact with 3D virtual objects without wearing any special glasses or devices. Experimental results and user studies verify that HoloTabletop offers convincing stereo vision without causing visual fatigue. The system is well suited to interactive applications such as 3D board games and stereo map browsing.

8.
When estimating human gaze directions from captured eye appearances, most existing methods assume a fixed head pose because head motion changes eye appearance greatly and makes the estimation inaccurate. To handle this difficult problem, in this paper, we propose a novel method that performs accurate gaze estimation without restricting the user's head motion. The key idea is to decompose the original free-head motion problem into subproblems, including an initial fixed head pose problem and subsequent compensations to correct the initial estimation biases. For the initial estimation, automatic image rectification and joint alignment with gaze estimation are introduced. Then compensations are done by either learning-based regression or geometric-based calculation. The merit of using such a compensation strategy is that the training requirement to allow head motion is not significantly increased; only capturing a 5-s video clip is required. Experiments are conducted, and the results show that our method achieves an average accuracy of around 3° by using only a single camera.

9.
Model-based face analysis is a general paradigm with applications that include face recognition, expression recognition, lip-reading, head pose estimation, and gaze estimation. A face model is first constructed from a collection of training data, either 2D images or 3D range scans. The face model is then fit to the input image(s), and the model parameters are used in the application at hand. Most existing face models can be classified as either 2D (e.g. Active Appearance Models) or 3D (e.g. Morphable Models). In this paper we compare 2D and 3D face models along three axes: (1) representational power, (2) construction, and (3) real-time fitting. For each axis in turn, we outline the differences that result from using a 2D or a 3D face model.

10.
Three-dimensional (3D) face reconstruction can be tackled by either measurement-based or model-based means. The former requires special hardware, such as structured-light setups. This paper addresses 3D face reconstruction by measurement-based means, specifically a special kind of structured light called space–time speckle projection. Under such a setup, we propose a novel and efficient spatial–temporal stereo scheme for fast and accurate 3D face recovery. To improve overall computational efficiency, the scheme employs a series of optimization strategies, including face-cropping-based stereo matching, a coarse-to-fine matching strategy applied to face areas, and a spatial–temporal integral image (STII) for accelerating the matching-cost computation. The proposed scheme reconstructs a 3D face in hundreds of milliseconds on a normal PC, and its performance is validated both qualitatively and quantitatively.
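A spatial–temporal integral image generalises the classic summed-area table to an image stack, so that the sum over any space–time box costs a constant number of look-ups regardless of box size. A minimal sketch of the idea (the paper's exact STII formulation may differ):

```python
import numpy as np

def integral_volume(vol):
    # 3D (temporal x rows x cols) summed-volume table, padded with zeros
    # on the low side of each axis so box sums need no boundary checks.
    iv = np.zeros(tuple(s + 1 for s in vol.shape), dtype=np.int64)
    iv[1:, 1:, 1:] = vol.cumsum(0).cumsum(1).cumsum(2)
    return iv

def box_sum_3d(iv, t0, r0, c0, t1, r1, c1):
    # Sum over vol[t0:t1, r0:r1, c0:c1] via eight look-ups
    # (3D inclusion-exclusion).
    return (iv[t1, r1, c1] - iv[t0, r1, c1] - iv[t1, r0, c1] - iv[t1, r1, c0]
            + iv[t0, r0, c1] + iv[t0, r1, c0] + iv[t1, r0, c0] - iv[t0, r0, c0])
```

In matching-cost computation, each window sum then takes O(1) time after a single O(n) table build.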

11.
In this paper we present a stereovision-based, model-free 3D head pose (orientation and position) estimation system suitable for human–machine interface applications. The system obtains a ‘face plane’ from the 3D reconstructed face data, which is then used for head pose estimation. The key novelty is the use of the face plane together with the eye locations on the reconstructed face data to obtain a robust head pose estimate. This approach leads to a model- and initialization-free head pose estimation system, making it suitable for natural human–machine interfaces. To quantitatively assess the accuracy of the system for such applications, several evaluation experiments were conducted using a commercial motion capture system. The results indicate that the system can be used in human–computer and human–robot applications.

12.
In this work we elaborate on a novel image-based system for creating video-realistic eye animations to arbitrary spoken output. These animations are useful to give a face to multimedia applications such as virtual operators in dialog systems. Our eye animation system consists of two parts: eye control unit and rendering engine, which synthesizes eye animations by combining 3D and image-based models. The designed eye control unit is based on eye movement physiology and the statistical analysis of recorded human subjects. As already analyzed in previous publications, eye movements vary while listening and talking. We focus on the latter and are the first to design a new model which fully automatically couples eye blinks and movements with phonetic and prosodic information extracted from spoken language. We extended the already known simple gaze model by refining mutual gaze to better model human eye movements. Furthermore, we improved the eye movement models by considering head tilts, torsion, and eyelid movements. Mainly due to our integrated blink and gaze model and to the control of eye movements based on spoken language, subjective tests indicate that participants are not able to distinguish between real eye motions and our animations, which has not been achieved before.

13.
Objective: Binocular vision is a good solution to the object distance estimation problem. Existing binocular distance estimation methods suffer either from low accuracy or from cumbersome data preparation, so an algorithm that balances accuracy and convenience of data preparation is needed. Method: We propose a network based on the R-CNN (region convolutional neural network) architecture that performs object detection and object distance estimation simultaneously. After the binocular image pair is fed into the network, a backbone extracts features, a binocular region-proposal network produces bounding boxes of the same object in the left and right images simultaneously, and the local features inside each pair of object boxes are fed to an object disparity estimation branch to estimate the object's distance. To obtain paired boxes for the same object, the binocular region-proposal network replaces the original proposal network, and a binocular bounding-box branch regresses both boxes jointly. To improve disparity accuracy, we borrow the structure of binocular disparity-map estimation networks and propose a disparity estimation branch based on group-wise correlation and 3D convolution. Results: In validation experiments on the KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) dataset, the average relative error of our method is about 3.2%, far below that of disparity-map-based methods (11.3%) and close to that of 3D-object-detection-based methods (about 3.9%). The proposed disparity-branch improvement clearly raises accuracy, reducing the average relative error from 5.1% to 3.2%. Similar experiments on a separately collected and annotated pedestrian surveillance dataset give an average relative error of about 4.6%, showing that the method applies effectively to surveillance scenes. Conclusion: The proposed binocular distance estimation network combines the strengths of object detection and binocular disparity estimation and achieves high accuracy. It can be used with vehicle-mounted cameras and in surveillance scenes, and is promising for other scenarios equipped with binocular cameras.
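The geometric core of distance-from-disparity in a rectified stereo rig is Z = f·B/d; the network's disparity branch effectively predicts d per object. A sketch with illustrative numbers (the focal length and baseline below are made up, not KITTI's actual calibration):

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    # Rectified-stereo relation: distance Z = focal_length * baseline / disparity.
    return focal_px * baseline_m / disparity_px

def relative_error(estimate, truth):
    # Relative-error metric of the kind used to compare the methods above.
    return abs(estimate - truth) / truth
```

Note the sensitivity this implies: at long range a sub-pixel disparity error translates into a large distance error, which is why sub-pixel disparity refinement (here via group-wise correlation and 3D convolution) matters.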

14.
Eye contact and gaze awareness play a significant role in conveying emotions and intentions during face-to-face conversation. Humans can perceive each other's gaze quite naturally and accurately. However, gaze awareness/perception is ambiguous during video teleconferencing performed by computer-based devices (such as laptops, tablets, and smart-phones). The reasons for this ambiguity are (i) the camera position relative to the screen and (ii) the 2D rendition of the 3D human face, i.e., the 2D screen is unable to deliver an accurate gaze during video teleconferencing. To solve this problem, researchers have proposed different hardware setups with complex software algorithms. The most recent solutions for accurate gaze perception employ 3D interfaces, such as 3D screens and 3D face-masks. However, commonly used video teleconferencing devices today are smart devices with 2D screens, so there is a need to improve gaze awareness/perception on them. In this work, we revisit the question of how to improve a remote user's gaze awareness among his/her collaborators. Our hypothesis is that ‘an accurate gaze perception can be achieved by the 3D embodiment of a remote user's head gesture during video teleconferencing’. We have prototyped an embodied telepresence system (ETS) for the 3D embodiment of a remote user's head. Our ETS is based on a 3-DOF neck robot with a mounted smart device (tablet PC). The electromechanical platform in combination with a smart device is a novel setup for studying gaze awareness/perception on 2D screen-based smart devices during video teleconferencing. Two important gaze-related issues are considered in this work: (i) the ‘Mona Lisa gaze effect’ – the gaze appears directed at the observer regardless of his position in the room, and (ii) ‘gaze awareness/faithfulness’ – the ability to perceive an accurate spatial relationship between the observing person and the object the actor is looking at.
Our results confirm that the 3D embodiment of a remote user's head not only mitigates the Mona Lisa gaze effect but also supports three levels of gaze faithfulness, hence accurately projecting the human gaze in distant space.

15.
Gaze shifts require the coordinated movement of both the eyes and the head, in animals and in humanoid robots alike. To achieve this, the brain and the robot control system need to perform complex non-linear sensory-motor transformations between many degrees of freedom and resolve the redundancy in such a system. In this article we propose a hierarchical neural network model for performing 3-D coordinated gaze shifts. The network is based on the PC/BC-DIM (Predictive Coding/Biased Competition with Divisive Input Modulation) basis function model. The proposed model consists of independent eye and head control circuits with mutual interactions for the appropriate adjustment of coordination behaviour. Based on the initial eye and head positions, the network resolves the redundancies involved in 3-D gaze shifts and produces accurate gaze control without any kinematic analysis or imposed constraints. Furthermore, the behaviour of the proposed model is consistent with the coordinated eye and head movements observed in primates.

16.
To address head localization in posture-based natural human–computer interaction interfaces, a fast algorithm for locating the 3D position of a human face is proposed. First, a face detector cascading a skin-color classifier with a Haar-like feature classifier quickly locates the face region in the left and right input images. Then, exploiting the good local convergence and speed of affine-model matching, the face regions of the two images are aligned. Finally, the 3D coordinates of the face are recovered via stereo vision. Experimental results show that the proposed method is fast and achieves fairly high localization accuracy.

17.
Objective: When estimating the 3D point of gaze by intersecting the gaze lines of the two eyes, manual measurement of the 3D coordinates of the eyeball optical centers introduces large errors, and the estimated 3D gaze point deviates considerably along the depth direction. We therefore improve the 3D gaze estimation model with an eyeball-optical-center calibration step and a distance-correction step. Method: First, the PCCR (pupil center cornea reflection) vectors of the left and right eyes are obtained by image-processing algorithms, and a second-order polynomial mapping yields each eye's 2D on-screen gaze point. Second, the 3D coordinates of the eyeball optical centers are obtained by a calibration procedure, avoiding the errors introduced by manual measurement. The gaze directions of the two eyes are then derived from the on-screen points, and their intersection gives a preliminary 3D gaze point. Finally, to suppress the strong jitter of the result along the depth direction, the 3D gaze point is corrected by depth-direction data filtering and a Z-plane interception correction. Results: Tests in two spaces of different sizes show that within a working distance of 30–50 cm the angular deviation is 0.7° and the distance deviation 17.8 mm, and within 50–130 cm the angular deviation is 1.0° and the distance deviation 117.4 mm. Compared with other 3D gaze estimation methods under the same test conditions, both the angular and distance deviations are markedly reduced. Conclusion: The proposed optical-center calibration conveniently and accurately obtains the 3D coordinates of the eyeball optical centers, avoids manual-measurement errors, and markedly reduces the angular deviation. The depth-direction data filtering and Z-plane interception correction effectively suppress jitter in the results and markedly reduce the distance deviation.
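Because two 3D gaze rays rarely intersect exactly, the "intersection" is usually taken as the midpoint of the shortest segment between them. A sketch of that computation under this common reading (the abstract does not specify the exact rule, and the paper additionally filters the result along the depth axis; names are ours):

```python
import numpy as np

def gaze_intersection(o1, d1, o2, d2):
    # Midpoint of the shortest segment between two (generally skew) gaze rays,
    # each given by an eyeball optical center o and a direction d.
    d1 = d1 / np.linalg.norm(d1)
    d2 = d2 / np.linalg.norm(d2)
    w = o1 - o2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b            # zero only for parallel rays
    t1 = (b * e - c * d) / denom
    t2 = (a * e - b * d) / denom
    return 0.5 * ((o1 + t1 * d1) + (o2 + t2 * d2))
```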

18.
Human eye-head co-ordination in natural exploration
During natural behavior humans continuously adjust their gaze by moving head and eyes, yielding rich dynamics of the retinal input. Sensory coding models, however, typically assume visual input that is smooth, or a sequence of static images interleaved by volitional gaze shifts. Are these assumptions valid during free exploration in natural environments? We used an innovative technique to simultaneously record gaze and head movements in humans who freely explored various environments (forest, train station, apartment). Most movements occur along the cardinal axes, and the predominance of vertical or horizontal movements depends on the environment. Eye and head movements co-occur more frequently than their individual statistics predict under an independence assumption. The majority of co-occurring movements point in opposite directions, consistent with a gaze-stabilizing role of eye movements. Nevertheless, a substantial fraction of eye movements point in the same direction as co-occurring head movements. Even under the most conservative assumptions, saccadic eye movements alone cannot account for these synergistic movements. Hence nonsaccadic eye movements that interact synergistically with head movements to adjust gaze cannot be neglected in natural visual input. Natural retinal input is continuously dynamic and cannot be faithfully modeled as a mere sequence of static frames with interleaved large saccades.
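The co-occurrence claim above compares the observed joint rate of eye and head movement against the rate expected if the two were statistically independent. A toy sketch of that comparison on binary per-frame movement labels (an illustration of the statistic, not the paper's analysis pipeline):

```python
import numpy as np

def cooccurrence_ratio(eye_moving, head_moving):
    # Observed joint rate of simultaneous eye+head movement, divided by the
    # rate expected under independence, p(eye) * p(head). Ratios > 1 indicate
    # that the movements co-occur more often than independence predicts.
    e = np.asarray(eye_moving, bool)
    h = np.asarray(head_moving, bool)
    observed = (e & h).mean()
    expected = e.mean() * h.mean()
    return observed / expected
```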

19.
Accurate head poses are useful for many face-related tasks such as face recognition, gaze estimation, and emotion analysis. Most existing methods estimate head poses that are included in the training data (i.e., previously seen head poses). To predict head poses that are not seen in the training data, some regression-based methods have been proposed. However, they focus on estimating continuous head pose angles and thus do not systematically evaluate the performance on predicting unseen head poses. In this paper, we use a dense multivariate label distribution (MLD) to represent the pose angle of a face image. By incorporating both seen and unseen pose angles into the MLD, the head pose predictor can estimate unseen head poses with an accuracy comparable to that of estimating seen head poses. On the Pointing’04 database, the mean absolute errors for yaw and pitch are 4.01° and 2.13°, respectively. In addition, experiments on the CAS-PEAL and CMU Multi-PIE databases show that the proposed dense MLD-based head pose estimation method achieves state-of-the-art performance compared to existing methods.
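A dense multivariate label distribution can be built, for instance, by placing a discretised 2D Gaussian over a grid of (yaw, pitch) bins, so that unseen poses between training angles still receive probability mass. The bin spacing and σ below are illustrative choices, not the paper's:

```python
import numpy as np

def pose_label_distribution(yaw, pitch, bins=np.arange(-90, 91, 15), sigma=15.0):
    # Dense 2D Gaussian label distribution over (yaw, pitch) angle bins,
    # normalised to sum to 1; rows index yaw bins, columns index pitch bins.
    gy = np.exp(-0.5 * ((bins - yaw) / sigma) ** 2)
    gp = np.exp(-0.5 * ((bins - pitch) / sigma) ** 2)
    dist = np.outer(gy, gp)
    return dist / dist.sum()
```

Training a predictor against such soft targets (e.g. with a KL-divergence loss) lets it interpolate to angles absent from the training set.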

20.
Objective: Gaze tracking is an assistive technology for human–computer interaction. To address the high misjudgment rate and long runtime of traditional iris localization methods, a gaze tracking method based on geometric features of the human eye is proposed, improving the accuracy of gaze tracking in 2D environments. Method: A face detection algorithm first locates the face, facial landmark detection locates the eye-corner points, and the eye positions are computed from those corners. Because direct iris-center localization is time-consuming, an iris template is first built from iris images and used to detect the iris region; a fine iris-center localization algorithm then locates the iris center within it. Finally, the eye corners, iris centers, and related points are extracted, and the angle and distance information they contain is combined into an eye-movement feature vector. A neural network classifier establishes the mapping to fixation points, realizing gaze tracking. Image preprocessing enhances the images, after which the relative iris center is extracted, and the extracted feature points form relatively stable geometric features representing eye movement. Results: Under ordinary experimental lighting with a fixed head pose, the recognition rate reaches up to 98.9%, with an average of 95.74%. When the head pose varies within a restricted region, the recognition rate remains high, averaging above 90%. Experimental analysis shows the method is robust within the restricted region of head movement. Conclusion: Combining template matching with fine iris-center localization quickly locates the iris center, and a neural network maps the gaze to fixation regions on the screen; experiments show the method achieves high accuracy.
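Coarse iris localisation by template matching scores each candidate window against an iris template, typically with normalised cross-correlation. A brute-force sketch of that step (a real system would use an optimised routine such as OpenCV's `matchTemplate`; names are ours):

```python
import numpy as np

def match_template(image, template):
    # Exhaustive normalised cross-correlation over all window positions;
    # returns the top-left corner of the best match and its score in [-1, 1].
    th, tw = template.shape
    t = template - template.mean()
    best, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            patch = patch - patch.mean()
            denom = np.sqrt((patch ** 2).sum() * (t ** 2).sum())
            score = (patch * t).sum() / denom if denom > 0 else 0.0
            if score > best:
                best, best_pos = score, (r, c)
    return best_pos, best
```

The fine localisation stage would then refine the iris center inside the matched region.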


