Similar Literature
20 similar documents found
1.
The coordinated movement of the eyes, the head and the arm is an important ability in both animals and humanoid robots. To achieve this, the brain and the robot control system need to be able to perform complex non-linear sensory-motor transformations, in both the forward and inverse directions, between many degrees of freedom. In this article, we apply an omnidirectional basis function neural network to this task. The proposed network can perform 3-D coordinated gaze shifts and 3-D arm-reaching movements to a visual target. In particular, it can perform direct sensory-motor transformations to shift gaze and execute arm-reaching movements, and inverse sensory-motor transformations to shift gaze toward the hand.
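As a rough illustration of the underlying idea (not the paper's architecture), the sketch below approximates a non-linear sensory-motor mapping with a Gaussian basis-function network: fixed radial basis units over the 3-D target space feed a linear readout fitted by least squares. The centres, widths and training data are all invented for the example.

```python
import numpy as np

def gaussian_basis(x, centers, width):
    """Activations of fixed Gaussian basis units for one input vector x."""
    d2 = np.sum((centers - x) ** 2, axis=1)
    return np.exp(-d2 / (2.0 * width ** 2))

# Illustrative forward map: 3-D visual target position -> 3 joint angles.
rng = np.random.default_rng(0)
targets = rng.uniform(-1, 1, size=(500, 3))           # visual target positions
angles = np.tanh(targets @ rng.normal(size=(3, 3)))   # surrogate ground truth

centers = rng.uniform(-1, 1, size=(50, 3))            # basis centres (assumed)
Phi = np.stack([gaussian_basis(t, centers, 0.5) for t in targets])
W, *_ = np.linalg.lstsq(Phi, angles, rcond=None)      # linear readout weights

test_target = np.array([0.2, -0.1, 0.4])
print(gaussian_basis(test_target, centers, 0.5) @ W)  # predicted joint angles
```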

2.
In this paper we address the problem of executing fast gaze shifts toward a visual target with a robotic platform. The platform is an anthropomorphic head with seven degrees of freedom (DOFs), designed to mimic the physical dimensions (i.e. geometry and masses), the performance (i.e. angles and velocities) and the functional abilities (i.e. neck movements and eye vergence) of the human head. In our approach, the "gold performance" for the robotic head is the accurate eye-head coordination observed during head-free gaze saccades in humans. To this end, we implemented and tested on the robotic head a well-characterized, biologically inspired model of gaze control, and we investigated the effectiveness of the bioinspired paradigm in achieving appropriate control of the multi-DOF robotic head. Moreover, to verify whether the proposed model can reproduce the typical patterns of actual human movements, we performed a quantitative investigation of the relation between movement amplitude, duration and peak velocity, comparing the actual robot performance with existing data on the human main sequence, which is known to provide a general method for quantifying the dynamics of oculomotor control. The obtained results confirmed (1) the ability of the proposed bioinspired control to achieve and maintain a stable fixation of the target, which was always well positioned within the fovea, and (2) the ability to reproduce the typical human main-sequence diagrams, which had never before been successfully implemented on a fully anthropomorphic head. Although fundamentally aimed at the experimental investigation of the underlying neurophysiological models, the present study is also intended to provide possible solutions for the development of human-like eye movements in humanoid robots.
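For readers unfamiliar with the main sequence: peak saccade velocity saturates with amplitude, and a common way to quantify this is to fit V(A) = Vmax(1 - exp(-A/C)). A sketch of such a fit follows; the measurements are synthetic placeholders, not the paper's robot data.

```python
import numpy as np
from scipy.optimize import curve_fit

# Main-sequence relation commonly fitted to saccades: peak velocity
# saturates with amplitude, V(A) = Vmax * (1 - exp(-A / C)).
def peak_velocity(A, Vmax, C):
    return Vmax * (1.0 - np.exp(-A / C))

# Illustrative measurements (amplitude in deg, peak velocity in deg/s).
amp = np.array([2.0, 5.0, 10.0, 20.0, 30.0, 40.0])
vel = np.array([90.0, 200.0, 330.0, 480.0, 540.0, 570.0])

(Vmax, C), _ = curve_fit(peak_velocity, amp, vel, p0=(600.0, 10.0))
print(f"fitted Vmax = {Vmax:.0f} deg/s, C = {C:.1f} deg")
```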

3.
A novel approach to 3-D gaze tracking using stereo cameras
A novel approach to three-dimensional (3-D) gaze tracking using 3-D computer vision techniques is proposed in this paper. The method employs multiple cameras and multiple point light sources to estimate the optical axis of the user's eye without using any user-dependent parameters, thereby eliminating the inconvenient system-calibration process, which may itself introduce calibration errors. A real-time 3-D gaze tracking system has been developed that provides 30 gaze measurements per second. Moreover, a simple and accurate method is proposed to calibrate the gaze tracking system: before using the system, each user only has to stare at a target point for a few (2-3) seconds so that the constant angle between the 3-D line of sight and the optical axis can be estimated. Test results from six subjects showed that the gaze tracking system is very promising, achieving an average estimation error of under 1 degree.
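A minimal sketch of the one-point calibration idea, under the simplifying assumption that the constant offset is measured as the angle between the reconstructed optical axis and the line of sight to a known target; all geometry below is invented for illustration.

```python
import numpy as np

def unit(v):
    return v / np.linalg.norm(v)

# One-point calibration as described: while the user stares at a known target,
# the constant angle between the measured optical axis and the true line of
# sight is estimated. A real system would also need the orientation of the
# offset, not just its magnitude; this sketch estimates the angle only.
eye_center = np.array([0.0, 0.0, 0.0])               # from stereo reconstruction
calib_target = np.array([0.10, 0.05, 0.60])          # known 3-D target point
optical_axis = unit(np.array([0.12, 0.09, 0.60]))    # measured while staring

line_of_sight = unit(calib_target - eye_center)
kappa = np.degrees(np.arccos(np.clip(optical_axis @ line_of_sight, -1, 1)))
print(f"estimated optical-to-visual axis angle: {kappa:.2f} deg")
```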

4.
In this paper, a real-time system to create a talking head from a video sequence without any user intervention is presented. In the proposed system, a probabilistic approach, to decide whether or not extracted facial features are appropriate for creating a three-dimensional (3-D) face model, is presented. Automatically extracted two-dimensional facial features from a video sequence are fed into the proposed probabilistic framework before a corresponding 3-D face model is built to avoid generating an unnatural or nonrealistic 3-D face model. To extract face shape, we also present a face shape extractor based on an ellipse model controlled by three anchor points, which is accurate and computationally cheap. To create a 3-D face model, a least-square approach is presented to find a coefficient vector that is necessary to adapt a generic 3-D model into the extracted facial features. Experimental results show that the proposed system can efficiently build a 3-D face model from a video sequence without any user intervention for various Internet applications including virtual conference and a virtual story teller that do not require much head movements or high-quality facial animation.  相似文献   
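The least-squares adaptation step admits a compact sketch: if the generic model deforms linearly along a few basis directions, the coefficient vector minimizing the feature residual has a closed-form solution. Shapes, basis and features below are synthetic assumptions, not the paper's model.

```python
import numpy as np

# Adapting a generic model: with deformation basis vectors stacked in a
# matrix, the coefficient vector c minimizing
#   || (mean_shape + basis @ c) - observed_features ||
# is the ordinary least-squares solution below.
rng = np.random.default_rng(1)
n_points, n_basis = 15, 4                     # feature points and basis size
mean_shape = rng.normal(size=2 * n_points)    # flattened generic shape (assumed)
basis = rng.normal(size=(2 * n_points, n_basis))
observed = mean_shape + basis @ np.array([0.5, -1.0, 0.2, 0.8])  # synthetic

c, *_ = np.linalg.lstsq(basis, observed - mean_shape, rcond=None)
print(c)   # recovers the deformation coefficients used above
```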

5.
Stabilizing the visual system is a crucial issue for any sighted mobile creature, whether natural or artificial. The more immune the gaze of an animal or a robot is to various kinds of disturbances (e.g., those created by body or head movements when walking or flying), the less troublesome it will be for the visual system to carry out its many information-processing tasks. The gaze control system that we describe in this paper takes a lesson from the vestibulo-ocular reflex (VOR), which is known to contribute to stabilizing the human gaze and keeping the retinal image steady. The gaze control system owes its originality and its high performance to the combination of two sensory modalities, as follows:
• a visual sensor, called the Optical Sensor for the Control of Autonomous Robots (OSCAR), which delivers a retinal angular position signal; a new miniature (10 g), piezo-based version of this sensor is presented here;

• an inertial sensor which delivers an angular head velocity signal.

We built a miniature (30 g), one-degree-of-freedom oculomotor mechanism equipped with a micro rate gyro and the new version of the OSCAR visual sensor. The gaze controller combines a feedback control system based on the retinal position error measurement with a feedforward control system based on the angular head-velocity measurement. The feedforward system triggers a high-speed "vestibulo-ocular reflex" that efficiently and rapidly compensates for any rotational disturbance of the head. We show that a fast rotational step perturbation (3° in 40 ms) applied to the head is almost completely (90%) rejected within a very short time (70 ms). Sinusoidal head perturbations are also rapidly compensated for, keeping the gaze stabilized on its target (an edge) within an angular range ten times smaller than the perturbing head rotations, which were applied here at frequencies of up to 6 Hz and amplitudes of up to 6°. This standard of performance in rejecting head rotational disturbances is comparable to that afforded by the human vestibulo-oculomotor system.
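A minimal sketch of the two-channel structure described above: a visual feedback loop on retinal position error plus a VOR-like feedforward loop that counter-rotates the eye at the measured head velocity. The gain, timing and step profile are illustrative assumptions, not the published controller.

```python
# Toy simulation: gaze = head + eye (angles in degrees).
dt, T = 0.001, 0.3                  # 1 kHz control loop, 300 ms horizon
Kp = 30.0                           # visual feedback gain (assumed)
target, head, eye = 0.0, 0.0, 0.0

for k in range(int(T / dt)):
    t = k * dt
    head_vel = 75.0 if 0.05 <= t < 0.09 else 0.0   # 3 deg step in 40 ms
    head += head_vel * dt
    retinal_error = target - (head + eye)          # what the visual sensor sees
    eye_vel = Kp * retinal_error - head_vel        # feedback + feedforward VOR
    eye += eye_vel * dt

# The feedforward term cancels the head rotation almost exactly, and the
# visual feedback mops up the small residual transient.
print(f"final gaze error: {abs(target - (head + eye)):.4f} deg")
```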


6.
In this paper, a human–machine interface for disabled people with spinal cord injuries is proposed. The designed interface is an assistive system that uses head movements and blinking for mouse control: by moving the head, the user moves the mouse pointer to the required coordinates, and then blinks to send commands. The head-mouse control is based on image processing, including facial recognition, in particular recognition of the eyes, mouth, and nose. The recognition system is based on a convolutional neural network (CNN) that uses the low-quality images captured by a computer's camera. The CNN comprises convolutional layers, a pooling layer, and a fully connected network, and it transforms head movements into the actual coordinates of the mouse. The designed system thus allows people with disabilities to control the mouse pointer with head movements and the mouse buttons with blinks. Experimental results demonstrate that the system is robust and accurate, and that it lets users control the mouse cursor and buttons freely without wearing any equipment.
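A rough sketch of the kind of small CNN described: convolution and pooling layers followed by a fully connected head that regresses the two pointer coordinates from a low-resolution camera frame. All layer sizes and the input resolution are assumptions; the paper's exact architecture is not specified here.

```python
import torch
import torch.nn as nn

class HeadMouseNet(nn.Module):
    """Toy CNN: grayscale webcam crop -> (x, y) pointer coordinates."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.head = nn.Sequential(
            nn.Flatten(),
            nn.Linear(32 * 16 * 16, 128), nn.ReLU(),
            nn.Linear(128, 2),            # regressed (x, y) coordinates
        )

    def forward(self, x):
        return self.head(self.features(x))

frame = torch.randn(1, 1, 64, 64)         # 64x64 grayscale crop (assumed)
print(HeadMouseNet()(frame).shape)        # torch.Size([1, 2])
```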

7.
A gaze-detection method based on a single-camera system with active infrared illumination is proposed. In the eye-feature detection stage, a projection method locates the face; prior knowledge of facial symmetry and the distribution of facial features determines the candidate pupil regions; finally, the eye features are precisely segmented. In the gaze-direction modeling stage, a non-linear polynomial first maps the planar gaze parameters to the gaze point with the head held stationary; a generalized regression neural network then compensates for the gaze deviations caused by different head positions, extending the non-linear mapping function to arbitrary head positions. Experimental results, and an application in an interactive graphical interface system, verify the effectiveness of the method.
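To make the head-stationary mapping stage concrete, here is a minimal sketch, assuming the planar gaze parameter is a 2-D pupil-glint-style vector and the map to the screen point is a second-order polynomial fitted by least squares on calibration samples. The GRNN head-position compensation stage is omitted, and all data are synthetic.

```python
import numpy as np

def poly_features(uv):
    """Second-order polynomial features of a (n, 2) array of gaze parameters."""
    u, v = uv[:, 0], uv[:, 1]
    return np.stack([np.ones_like(u), u, v, u * v, u**2, v**2], axis=1)

rng = np.random.default_rng(2)
uv = rng.uniform(-1, 1, size=(9, 2))       # 9 calibration samples (assumed)
screen = uv @ np.array([[400.0, 0.0],      # synthetic screen coordinates
                        [0.0, 300.0]]) + 0.1 * rng.normal(size=(9, 2))

coeffs, *_ = np.linalg.lstsq(poly_features(uv), screen, rcond=None)
print(poly_features(np.array([[0.3, -0.2]])) @ coeffs)   # predicted gaze point
```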

8.
It has long been known that motion control of both the head and the eyes is a strategic aspect of human visual perception. Despite continuous shifts and often sudden changes of gaze direction, this capability guarantees stable and effective 3D perception, even in the case of ambiguities and relative motion between the observer and external objects. The article considers a robotic vision system with an anthropomorphic kinematic structure and proposes a formal characterization of the perceptual advantages of actively controlling the cameras. It is shown that movement of the cameras, besides providing computational and behavioral advantages, is essential for gathering optimal measurements. In particular, the control objective of the eye-head system is related to a quality-measure function describing the sensitivity of the transformation from world to camera coordinates. The theoretical results are accompanied by simulations and by experimental data from a real setup, which give clear evidence of some relevant aspects of biological vision, such as fixation, vergence, and eye-head compensation.

9.
This paper proposes a new gaze-detection method based on the 3-D eye position and the gaze vector of the human eyeball. Seven developments beyond previous work are presented. First, a three-camera system, i.e., one wide-view camera and two narrow-view cameras, is proposed; the narrow-view cameras use auto zooming, focusing, panning, and tilting (based on the detected 3-D eye-feature position), which allows natural head and eye movement by users. Second, previous gaze-detection research using one or multiple illuminators did not consider the specular reflection (SR) problems the illuminators cause for users who wear glasses; to solve this, a method based on dual illuminators is proposed. Third, the proposed method does not require user-dependent calibration, so all procedures for detecting gaze position operate automatically, without human intervention. Fourth, the intrinsic characteristics of the human eye, such as the disparity between the pupillary and visual axes, are taken into account in order to obtain accurate gaze positions. Fifth, the coordinates of the left and right narrow-view cameras, the wide-view camera, and the monitor are unified, which simplifies the complex 3-D conversion calculations and allows the 3-D feature position and the gaze position on the monitor to be computed. Sixth, to improve eye-detection performance with the wide-view camera, an adaptive-selection method is used, involving an IR-LED on/off scheme, an AdaBoost classifier, and a principal component analysis method based on the number of SR elements. Finally, the proposed method uses an eigenvector matrix (instead of simply averaging the six gaze vectors) to obtain a more accurate final gaze vector that compensates for noise. Experimental results show that the root-mean-square error of gaze detection was about 0.627 cm on a 19-in monitor, and that the processing time for obtaining the gaze position on the monitor was 32 ms (on a Pentium IV 1.8-GHz PC), making real-time detection of the user's gaze position possible.
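The abstract does not spell out the eigenvector construction; one standard variant of eigenvector-based direction fusion, sketched below on synthetic data, takes the dominant eigenvector of the sum of outer products of the noisy gaze vectors, which is more robust to noise and sign flips than a plain average. This is an assumed stand-in, not necessarily the paper's exact method.

```python
import numpy as np

rng = np.random.default_rng(4)
true_gaze = np.array([0.1, -0.2, 1.0])
true_gaze /= np.linalg.norm(true_gaze)
vectors = true_gaze + 0.05 * rng.normal(size=(6, 3))   # six noisy estimates
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

M = vectors.T @ vectors                   # sum of outer products, 3x3
eigvals, eigvecs = np.linalg.eigh(M)      # ascending eigenvalues
fused = eigvecs[:, -1]                    # dominant eigenvector
fused *= np.sign(fused @ vectors.mean(axis=0))   # resolve sign ambiguity
print(fused, true_gaze)                   # fused direction ~ true direction
```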

10.
The iCat is a user-interface robot with the ability to express a range of emotions through its facial features. This article summarizes our research into whether the application of gaze behaviour can increase the believability and likability of the iCat for its human partners. Gaze behaviour serves several functions during social interaction, such as mediating conversation flow, communicating emotional information and avoiding distraction by restricting visual input, and several types of eye and head movements are necessary for realizing these functions. We designed and evaluated a gaze-behaviour system for the iCat robot that implements realistic models of the major types of eye and head movements found in living beings: vergence, the vestibulo-ocular reflex, smooth-pursuit movements and gaze shifts. We discuss how these models are integrated into the software environment of the iCat and can be used to create complex interaction scenarios. We report on several user tests and draw conclusions for future evaluation scenarios.

11.
Gaze estimation reflects a person's focus of attention and plays an important role in understanding subjective states such as emotion and interest. However, the monocular eye images currently used for gaze estimation are easily distorted by changes in head pose, degrading estimation accuracy. A novel classification-based gaze-estimation method is proposed. Using a 3-D face model and the intrinsic parameters of a monocular camera, a head-pose coordinate system is constructed from the 3-D coordinates of the eye and mouth centers; the camera and head-pose coordinate systems are then combined, and a normalized coordinate system is established that rectifies the camera coordinate system. The normalized grayscale eye images are restored and enlarged, an appearance-based convolutional neural network classification model estimates the gaze direction, and a golden-section search then refines the result to further reduce the error. Experimental results on the MPIIGaze dataset show that, compared with published algorithms of the same type, the method reduces the mean angular error by about 7.4%.
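The final refinement step uses golden-section search; below is a generic implementation over a 1-D unimodal objective, with a stand-in quadratic in place of the paper's angular-error function.

```python
import math

def golden_section_minimize(f, a, b, tol=1e-5):
    """Minimize a unimodal function f on [a, b] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2          # ~0.618
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while abs(b - a) > tol:
        if f(c) < f(d):                       # minimum lies in [a, d]
            b, d = d, c
            c = b - inv_phi * (b - a)
        else:                                 # minimum lies in [c, b]
            a, c = c, d
            d = a + inv_phi * (b - a)
    return (a + b) / 2

# Stand-in objective: true minimum at 0.37.
print(golden_section_minimize(lambda x: (x - 0.37) ** 2, 0.0, 1.0))
```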

12.
This paper is about gaze control in active vision. The problem tackled is, given a camera imaging a particular 3-D point, to place that point at the center of the image by rotating the camera about its own optical center. To separate this procedure from structure estimation, we use a normalized camera coordinate system, which leads to a formulation defined on the unit sphere. In designing the algorithm, we avoid the disadvantages of local coordinates and approximation: the algorithm is derived from the intrinsic geometric properties of the underlying space, without any parameterization or approximation. The proposed algorithm is simple and closed-form, which makes it suitable for real-time application.
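The paper's own derivation is not reproduced here, but a closed-form rotation that brings a viewing direction onto the optical axis can be written directly with the Rodrigues formula. The sketch below assumes the camera looks along +z in normalized coordinates.

```python
import numpy as np

def rotation_between(a, b):
    """Closed-form (Rodrigues) rotation matrix taking unit vector a onto b."""
    v = np.cross(a, b)                     # rotation axis (unnormalized)
    c = float(a @ b)                       # cosine of the rotation angle
    s2 = float(v @ v)                      # squared sine of the angle
    if s2 < 1e-12:                         # already aligned
        return np.eye(3)
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    return np.eye(3) + vx + vx @ vx * ((1.0 - c) / s2)

# Direction of the 3-D point in normalized camera coordinates (assumed),
# rotated onto the optical axis +z so it projects to the image center.
p = np.array([0.3, -0.2, 1.0])
p /= np.linalg.norm(p)
R = rotation_between(p, np.array([0.0, 0.0, 1.0]))
print(R @ p)                               # ~ [0, 0, 1]: point is centered
```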

13.
Powerful data reduction and selection processes, such as the selective attention mechanisms and space-variant sensing found in humans, can provide great advantages for developing effective real-time robot vision systems. The use of such processes should be closely coupled with motor capabilities, in order to interact actively with the environment. In this paper, an anthropomorphic vision-system architecture integrating retina-like sensing, hierarchical structures and selective attention mechanisms is proposed. The direction of gaze is shifted based on both the sensory and the semantic characteristics of the visual input, so that a task-dependent attentive behavior is produced. The sensory features currently included in the system are related to optical-flow invariants, providing the system with motion-detection capabilities. A neural network architecture for visual recognition is also included, which produces semantically driven gaze shifts.

14.
Advanced Robotics, 2013, 27(10): 1057-1072
It is an easy task for the human visual system to gaze continuously at an object moving in three-dimensional (3-D) space. While tracking the object, human vision seems able to comprehend its 3-D shape through binocular vision. We conjecture that, in the human visual system, the function of comprehending the 3-D shape is essential for robust tracking of a moving object. In order to examine this conjecture, we constructed an experimental binocular vision system for motion tracking, composed of a pair of active pan-tilt cameras and a robot arm. The cameras simulate the two eyes of a human, while the robot arm simulates the motion of the human body below the neck. The two active cameras are controlled so as to fix their gaze at a particular point on the object surface. The shape of the object surface around that point is reconstructed in real time from the two camera images, based on the differences in image brightness. If the two cameras successfully gaze at a single point on the object surface, the local object shape can be reconstructed in real time; at the same time, the reconstructed shape is used to keep the fixation point on the object surface, which enables robust tracking of the object. Thus these two processes, 3-D shape reconstruction and maintenance of the fixation point, must be mutually connected and form one closed loop. We demonstrate the effectiveness of this framework for visual tracking through several experiments.
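The binocular fixation point can be made concrete with standard midpoint triangulation: the two cameras' gaze rays rarely intersect exactly, so one takes the 3-D point closest to both rays. Camera positions and ray directions below are illustrative, not the paper's setup.

```python
import numpy as np

def midpoint_triangulate(o1, d1, o2, d2):
    """Point closest to the two rays o + t*d (d need not be unit length)."""
    # Solve for ray parameters (t1, t2) minimizing |(o1+t1*d1) - (o2+t2*d2)|.
    A = np.array([[d1 @ d1, -(d1 @ d2)],
                  [d1 @ d2, -(d2 @ d2)]])
    b = np.array([(o2 - o1) @ d1, (o2 - o1) @ d2])
    t1, t2 = np.linalg.solve(A, b)
    return ((o1 + t1 * d1) + (o2 + t2 * d2)) / 2

left = np.array([-0.03, 0.0, 0.0])        # 6 cm baseline, like two eyes
right = np.array([0.03, 0.0, 0.0])
fixation = midpoint_triangulate(left, np.array([0.08, 0.0, 1.0]),
                                right, np.array([-0.04, 0.0, 1.0]))
print(fixation)                            # ~ [0.01, 0.0, 0.5]
```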

15.
In this paper, a new kind of human-computer interface allowing three-dimensional (3-D) visualization of multimedia objects and eye-controlled interaction is proposed. To explore the advantages and limitations of the concept, a prototype system has been set up. The testbed includes a visual operating system for integrating novel forms of interaction with a 3-D graphical user interface, autostereoscopic (free-viewing) 3-D displays closely adapted to the mechanisms of binocular vision, and solutions for nonintrusive eye-controlled interaction (video-based head and gaze tracking). The paper reviews the system's key components and outlines various applications implemented for user testing. Preliminary results show that most users are impressed by a 3-D graphical user interface and by the possibility of communicating with a computer simply by looking at the object of interest. On the other hand, the results emphasize the need for a more intelligent interface agent to avoid misinterpreting the user's eye-controlled input and to reset undesired activities.

16.
Autonomous virtual characters (AVCs) are becoming more prevalent, both for real-time interaction and as digital actors in film and TV production. AVCs require believable virtual-human animations accompanied by natural attention generation, so the software that controls an AVC needs to model when and how to interact with the objects and other characters in the virtual environment. This paper models automatic attention behaviour using a saliency model that generates plausible targets for combined gaze and head motions. The model was compared with the default behaviour of the Second Life (SL) system in an object-observation scenario, and with real actors' behaviour in a conversation scenario. Results from a study run within the SL system demonstrate a promising attention model that is not only believable and realistic but also adaptable to varying tasks, without any prior knowledge of the virtual scene.

17.
This article describes real-time gaze control using position-based visual servoing. The main control objective of the system is to make the gaze point track the target so that the target's image feature is located at each image center. The overall system consists of two parts: the vision system and the control system. The vision system extracts a predefined color feature from the images; an adaptive look-up-table method is proposed to obtain the 3-D position of the feature within the video frame rate under varying illumination. An uncalibrated camera raises the problem that the reconstructed 3-D positions are not correct. To solve this calibration problem in the position-based approach, we constructed an end-point closed-loop system using an active head-eye system. In the proposed control system, the reconstructed position error is used together with a Jacobian matrix of the kinematic relation. The system's stability is locally guaranteed, as in image-based visual servoing, and the gaze position was shown to converge to the feature position. The proposed approach was successfully applied to tracking a moving target in simulations and real experiments, and the processing satisfies the real-time requirement. This work was presented in part at the Sixth International Symposium on Artificial Life and Robotics, Tokyo, January 15-17, 2001.
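A skeleton of the servo law implied above: the reconstructed position error is mapped through the pseudo-inverse of a kinematic Jacobian into joint velocities, q_dot = -lambda * pinv(J) @ e, which makes the error decay locally. The 2-DOF Jacobian, gain and linearized error model are illustrative assumptions.

```python
import numpy as np

J = np.array([[1.2, 0.1],          # how pan/tilt rates move the gaze point
              [0.0, 0.9]])         # (assumed constant near the target)
lam, dt = 2.0, 0.02                # proportional gain, 20 ms control period
q_star = np.array([0.15, -0.10])   # joint angles that center the target
q = np.zeros(2)                    # current pan/tilt angles

for _ in range(100):
    e = J @ (q - q_star)                          # linearized position error
    q = q + dt * (-lam * np.linalg.pinv(J) @ e)   # position-based servo step

print(np.linalg.norm(J @ (q - q_star)))           # residual error, near zero
```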

18.
19.
This paper introduces a cost-efficient immersive teleconference system. To enhance immersive interaction during one-to-many telecommunication, the paper concentrates on the design of a teleconference system composed of a set of robotic devices and web-based control/monitoring interfaces for human-robot-avatar interaction. To this end, we first propose a server-client network model for human-machine interaction systems based on the latest HTML5 technology. As the hardware for teleconferencing, the simplest possible robot is designed specifically so that remote users can experience augmented reality naturally on live scenes. The modularized software systems are then explained in terms of accessibility and functionality. The paper also describes how a human head motion is captured, synchronized with a robot in the real world, and rendered through a 3-D avatar in the augmented world for human-robot-avatar interaction. Finally, the proposed system is evaluated through a questionnaire survey following a series of user-experience tests.

20.
Human eye-head co-ordination in natural exploration
During natural behavior, humans continuously adjust their gaze by moving the head and eyes, yielding rich dynamics of the retinal input. Sensory coding models, however, typically assume that visual input is smooth, or a sequence of static images interleaved with volitional gaze shifts. Are these assumptions valid during free exploration of natural environments? We used an innovative technique to simultaneously record gaze and head movements in humans who freely explored various environments (forest, train station, apartment). Most movements occur along the cardinal axes, and the predominance of vertical or horizontal movements depends on the environment. Eye and head movements co-occur more frequently than their individual statistics predict under an independence assumption. The majority of co-occurring movements point in opposite directions, consistent with a gaze-stabilizing role of eye movements. Nevertheless, a substantial fraction of eye movements point in the same direction as co-occurring head movements. Even under the most conservative assumptions, saccadic eye movements alone cannot account for these synergistic movements. Hence nonsaccadic eye movements that interact synergistically with head movements to adjust gaze cannot be neglected in natural visual input. Natural retinal input is continuously dynamic and cannot be faithfully modeled as a mere sequence of static frames with interleaved large saccades.
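The independence check reported above has a simple form: compare the observed co-occurrence rate of eye and head movement events against the product of their marginal rates. The sketch below runs it on synthetic binary event streams; the coupling probabilities are invented.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 10_000                                   # analysis windows
head = rng.random(n) < 0.20                  # head movement in window
eye = np.where(head,
               rng.random(n) < 0.60,         # eye movement, coupled to head
               rng.random(n) < 0.10)         # eye movement, head still

p_head, p_eye = head.mean(), eye.mean()
observed = (head & eye).mean()               # actual co-occurrence rate
expected = p_head * p_eye                    # rate predicted by independence
print(f"observed co-occurrence {observed:.3f} vs independent {expected:.3f}")
```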
