Two-handed gesture recognition and fusion with speech to command a robot

Authors: B. Burger, I. Ferrané, F. Lerasle, G. Infantes

Affiliations:
1. CNRS, LAAS, 7 avenue du Colonel Roche, 31077 Toulouse Cedex, France
2. IRIT, Université de Toulouse, 118 route de Narbonne, 31062 Toulouse Cedex, France
3. Université de Toulouse, UPS, INSA, INP, ISAE; UT1, UTM; LAAS, 31077 Toulouse Cedex, France
4. Onera, 2 avenue Edouard Belin, 31055 Toulouse Cedex 4, France

Abstract: Assistance is currently a pivotal research area in robotics, with huge societal potential. Since assistant robots interact directly with people, natural and easy-to-use interfaces are of fundamental importance. This paper describes a flexible multimodal interface, based on speech and gesture modalities, for controlling our mobile robot Jido. The vision system uses a stereo head mounted on a pan-tilt unit and a bank of collaborative particle filters dedicated to the extremities of the upper human body, in order to track and recognize pointing and symbolic gestures, both one-handed and two-handed. This framework constitutes our first contribution: it is shown to properly handle the natural artifacts that arise when 3D gestures are performed with either hand or with both (self-occlusion, hands leaving the camera field of view, hand deformation). A speech recognition and understanding system based on the Julius engine is also developed and embedded on the robot to process deictic and anaphoric utterances. The second contribution is a probabilistic, multi-hypothesis interpreter framework that fuses the results of the speech and gesture components. This interpreter is shown to improve the classification rate of multimodal commands compared with either modality alone. Finally, we report on successful live experiments in human-centered settings. Results are given for an interactive manipulation task in which users issue local motion commands to Jido and perform safe object exchanges.
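
The abstract's second contribution is an interpreter that fuses scored hypotheses from the speech and gesture components. As a purely illustrative aside, the Python sketch below shows one generic way such late fusion can be done; it is a hypothetical toy, not the paper's actual interpreter, and it assumes each modality outputs an N-best list of (command, score) pairs whose normalized scores are multiplied per candidate command.

# Minimal late-fusion sketch (hypothetical; not the paper's interpreter).
# Assumes each modality yields an N-best list of (command, score) pairs;
# scores are normalized into probabilities and multiplied per command.

def normalize(hyps):
    """Turn raw scores into a probability distribution over commands."""
    total = sum(score for _, score in hyps)
    return {cmd: score / total for cmd, score in hyps}

def fuse(speech_nbest, gesture_nbest, smoothing=1e-6):
    """Combine speech and gesture hypotheses into a ranked list."""
    p_speech = normalize(speech_nbest)
    p_gesture = normalize(gesture_nbest)
    commands = set(p_speech) | set(p_gesture)
    fused = {
        cmd: p_speech.get(cmd, smoothing) * p_gesture.get(cmd, smoothing)
        for cmd in commands
    }
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

if __name__ == "__main__":
    # Hypothetical N-best lists: "give me that bottle" plus a pointing gesture.
    speech = [("give(bottle)", 0.6), ("give(cup)", 0.3), ("stop", 0.1)]
    gesture = [("give(bottle)", 0.7), ("give(cup)", 0.3)]
    for cmd, p in fuse(speech, gesture):
        print(f"{cmd}: {p:.4f}")

Ranking by the fused score lets agreement between the two modalities outweigh either modality's best guess alone, which is the intuition behind the improved classification rates the abstract reports.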