Found 20 similar documents (search time: 15 ms)
1.
《Advanced Robotics》2013,27(6):629-653
We have developed a human tracking system for use by robots that integrates sound and face localization. Conventional systems usually require many microphones and/or prior information to localize several sound sources, and they are incapable of coping with various types of background noise. In our system, cross-power spectrum phase analysis of sound signals obtained with only two microphones is used to localize sound sources without prior information such as impulse response data. An expectation-maximization (EM) algorithm helps the system cope with several moving sound sources. The problem of distinguishing whether sounds come from the front or back is also solved with only two microphones by rotating the robot's head. A method that uses facial skin colors classified by another EM algorithm enables the system to detect faces in various poses. By detecting a human face, the system can compensate for errors in sound localization for a speaker and identify noise signals entering from undesired directions. A probability-based method integrates the auditory and visual information to produce a reliable tracking path in real time. Experiments using a robot showed that our system can localize two sounds at the same time and track a communication partner while dealing with various types of background noise.
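The cross-power spectrum phase step described in this abstract can be sketched as follows. This is illustrative code, not the authors' implementation; the function and parameter names are invented. The technique is widely known as PHAT-weighted generalized cross-correlation: whitening the cross-spectrum leaves only phase, which sharpens the correlation peak at the true inter-microphone delay.

```python
import numpy as np

def gcc_phat(x, y, fs, max_tau=None):
    """Time delay between two microphone signals via cross-power
    spectrum phase (PHAT-weighted generalized cross-correlation)."""
    n = len(x) + len(y)
    X, Y = np.fft.rfft(x, n=n), np.fft.rfft(y, n=n)
    R = X * np.conj(Y)
    R /= np.abs(R) + 1e-12                 # PHAT weighting: keep only the phase
    cc = np.fft.irfft(R, n=n)
    max_shift = n // 2
    if max_tau is not None:
        max_shift = min(int(fs * max_tau), max_shift)
    # Reorder so lag 0 sits in the middle of the correlation window
    cc = np.concatenate((cc[-max_shift:], cc[:max_shift + 1]))
    shift = int(np.argmax(np.abs(cc))) - max_shift
    return shift / fs                      # positive: x lags y

```

The estimated delay, together with the microphone spacing, yields the source bearing up to the front-back ambiguity that the abstract resolves by rotating the robot's head.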
2.
Device-free localization (DFL) is the method of using distributed wireless sensors to localize a target that carries no device. Existing DFL methods leverage the variation of narrowband received signal strength (NRSS), which is vulnerable to multipath fading and thus suffers considerable performance degradation in indoor environments. Moreover, the inefficient sensor deployment of traditional DFL involves considerable human effort, which is not suitable for emergency scenarios. In this paper, we utilize sensors transmitting ultra-wideband (UWB) signals to solve both problems. We propose two RSS variation estimation methods based on the channel impulse response (CIR) measurements provided by UWB sensors, which turn out to be more robust to multipath than NRSS due to the fine multipath resolution of UWB signals. We also employ a higher operating frequency to enhance the shadowing loss for mitigating the multipath effect. Additionally, satisfactory sensor self-localization is achieved under the cooperative localization framework owing to the accurate ranging capability of UWB sensors. We conducted experiments in three different environments to explore the feasibility of our method. The results show that the proposed method achieves much better localization performance and requires less human effort than narrowband DFL.
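One simple way to derive a wideband RSS figure from a measured CIR, and to express the shadowing-induced variation that DFL exploits, can be sketched as below. This is a minimal sketch under the assumption that RSS is taken as total tap energy; the paper's two estimation methods are more elaborate, and the names here are invented.

```python
import numpy as np

def rss_from_cir(cir):
    """Wideband received signal strength (dB) from a UWB channel
    impulse response: total energy across resolvable multipath taps."""
    energy = np.sum(np.abs(cir) ** 2)
    return 10.0 * np.log10(energy + 1e-15)

def rss_variation(cir_ref, cir_now):
    """RSS change (dB) relative to a target-free reference measurement;
    a large drop suggests the link is shadowed by a person."""
    return rss_from_cir(cir_now) - rss_from_cir(cir_ref)
```

For example, halving every tap amplitude quarters the energy, giving a roughly -6 dB variation on the link.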
3.
4.
Sound sources are localized using time-delay techniques based on a microphone array and a cross-correlation algorithm. The cross-correlation algorithm localizes wideband signals (swept-sine signals) accurately but performs poorly on narrowband signals (organ signals). The factors affecting sound source localization accuracy are analyzed and the localization system is improved accordingly. Experiments verify the factors affecting the localization of organ signals, localization of organ signals is achieved, and a real-time sound source localization system is developed on an NI CompactRIO platform.
5.
《Robotics and Autonomous Systems》2007,55(3):216-228
Mobile robots in real-life settings would benefit from being able to localize and track sound sources. Such a capability can help localize a person or an interesting event in the environment, and also provides enhanced input for other capabilities such as speech recognition. To give a robot this capability, the challenge is not only to localize simultaneous sound sources, but to track them over time. In this paper we propose a robust sound source localization and tracking method using an array of eight microphones. The method is based on a frequency-domain implementation of a steered beamformer along with a particle filter-based tracking algorithm. Results show that a mobile robot can localize and track multiple moving sources of different types in real time over a range of 7 m. These new capabilities allow a mobile robot to interact through more natural means with people in real-life settings.
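The frequency-domain steered beamformer at the core of this abstract can be sketched as a steered-response power scan: for each candidate azimuth, phase-align the microphone spectra for that direction and measure the output energy. This is an illustrative far-field sketch for a planar array, not the paper's eight-microphone implementation; the function name and grid resolution are invented.

```python
import numpy as np

def srp_map(frames, mic_pos, fs, c=343.0, n_az=72):
    """Steered-response power of a frequency-domain delay-and-sum
    beamformer scanned over azimuth (far-field, planar array)."""
    n_mics, n_fft = frames.shape
    spec = np.fft.rfft(frames, axis=1)
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    azimuths = np.linspace(0.0, 2 * np.pi, n_az, endpoint=False)
    power = np.empty(n_az)
    for k, az in enumerate(azimuths):
        u = np.array([np.cos(az), np.sin(az)])   # candidate direction
        adv = mic_pos @ u / c                    # per-mic arrival advance (s)
        # Undo the direction-dependent phase, then sum coherently
        align = np.exp(-2j * np.pi * freqs[None, :] * adv[:, None])
        power[k] = np.sum(np.abs(np.sum(spec * align, axis=0)) ** 2)
    return azimuths, power
```

The azimuth maximizing the power map is the beamformer's direction estimate; in the paper these per-frame estimates feed the particle filter tracker.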
6.
《Advanced Robotics》2013,27(1-2):135-152
Sound source localization is an important function in robot audition. Most existing works perform sound source localization using static microphone arrays. This work proposes a framework that simultaneously localizes the mobile robot and multiple sound sources using a microphone array on the robot. First, an eigenstructure-based generalized cross-correlation method for estimating time delays between microphones in multi-source environments is described. Using the estimated time delays, a method to compute the far-field source directions as well as the speed of sound is proposed. In addition, the correctness of the sound speed estimate is used to eliminate spurious sources, which greatly enhances the robustness of sound source detection. The arrival angles of the detected sound sources serve as observations in a bearing-only simultaneous localization and mapping procedure. Because the source signals are not persistent and the signal content is not identified, data association is unknown, and it is solved using the FastSLAM algorithm. The experimental results demonstrate the effectiveness of the proposed method.
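The joint direction-plus-speed-of-sound estimation mentioned above can be sketched as a linear least-squares problem: in the far field, each pairwise delay satisfies tau_ij = (p_j - p_i) . s, where s = u/c is the slowness vector. Solving for s gives both the direction u and the speed c, and a recovered c far from ~343 m/s can flag a spurious source, as the abstract describes. The code is an illustrative sketch under these assumptions, not the paper's method.

```python
import numpy as np

def direction_and_speed(mic_pos, pairs, tdoas):
    """Far-field DOA and speed-of-sound estimate from pairwise time
    delays: solve A s = tau for the slowness vector s = u / c."""
    A = np.array([mic_pos[j] - mic_pos[i] for i, j in pairs])
    tau = np.asarray(tdoas, dtype=float)
    s, *_ = np.linalg.lstsq(A, tau, rcond=None)
    c = 1.0 / np.linalg.norm(s)      # speed of sound (m/s)
    u = s * c                        # unit direction toward the source
    return u, c
```

With more pairs than unknowns the system is overdetermined, so noisy delays are averaged out by the least-squares fit.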
7.
The purpose of this study was to determine how well humans localize sound sources in the horizontal plane while wearing protective headgear with and without hearing protection. In a source identification task, a stimulus was presented from 1 of 20 loudspeakers arrayed in a semicircular arc, and participants stated which loudspeaker emitted the sound. Each participant was tested in 8 conditions involving various combinations of wearing a Kevlar army helmet and two types of earplugs. Testing was conducted at each of 2 orientations (frontal and lateral). In the frontal orientation, overall error was slightly greater in all protected conditions than in the bare-head control condition. In the lateral orientation, overall error score in the protected conditions was substantially and significantly greater than in the bare-head control conditions. Most errors in the lateral orientation were accounted for by front-back confusions, indicating that the protective devices disrupted high-frequency spectral cues that are the basis for discriminating front from back sound sources. The results have practical implications for the use of protective headgear and earplugs in industrial or military environments where localization of critical sounds is important.
8.
Hiroshi G. Okuno Kazuhiro Nakadai Tino Lourens Hiroaki Kitano 《Applied Intelligence》2004,20(3):253-266
Mobile robots capable of auditory perception usually adopt the stop-perceive-act principle to avoid the motor noise they generate while moving. Although this principle reduces the complexity of auditory processing for mobile robots, it restricts their auditory capabilities. In this paper, sound and visual tracking are investigated to compensate for each other's drawbacks and to attain robust object tracking: visual tracking may be difficult in case of occlusion, while sound tracking may be ambiguous in localization due to the nature of auditory processing. For this purpose, we present an active audition system for a humanoid robot. Such an audition system requires localization of sound sources and identification of the meanings of sounds in the auditory scene. The active audition reported in this paper focuses on improved sound source tracking by integrating audition, vision, and motor control. Given multiple sound sources in the auditory scene, the humanoid SIG actively moves its head to improve localization by aligning its microphones orthogonally to the sound source and by capturing possible sound sources by vision. The system adaptively cancels motor noise using motor control signals. The experimental results demonstrate the effectiveness of sound and visual tracking.
9.
《Advanced Robotics》2013,27(1-2):145-164
The paper describes a two-dimensional (2-D) sound source mapping system for a mobile robot. The robot localizes the directions of sound sources while moving and estimates their positions by triangulation from directional localization results gathered over a short time period. Three key components are described: (i) directional localization and separation of sound sources of different pressures by combining the Delay and Sum Beam Forming (DSBF) and Frequency Band Selection (FBS) algorithms; (ii) the design of the microphone array, guided by beam-forming simulation, to increase the resolution of the localization procedure and its robustness to ambient noise; and (iii) sound position estimation using the RAndom SAmple Consensus (RANSAC) algorithm. With these, 2-D mapping of multiple sound sources is achieved from time-limited data with high accuracy. Applying FBS as a binary filter after DSBF improves robustness for multiple sound source localization under robot movement. In addition, a moving sound source separation method is shown that uses segments of the DSBF-enhanced signal derived from the localization process.
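The RANSAC-based position estimation step (iii) can be sketched as robust triangulation of bearing-only observations: repeatedly intersect two sampled bearing rays and keep the hypothesis supported by the most rays. This is an illustrative 2-D sketch, not the paper's implementation; the function name, thresholds, and iteration count are invented.

```python
import numpy as np

def ransac_triangulate(origins, bearings, n_iter=200, tol=0.2, rng=None):
    """RANSAC fit of a 2-D source position from bearing observations:
    intersect two sampled rays, then count rays passing within `tol`
    (perpendicular distance) of the candidate point."""
    rng = rng or np.random.default_rng(0)
    dirs = np.stack([np.cos(bearings), np.sin(bearings)], axis=1)
    best_pt, best_inliers = None, -1
    for _ in range(n_iter):
        i, j = rng.choice(len(origins), size=2, replace=False)
        # Intersect rays o_i + t*d_i and o_j + u*d_j
        A = np.column_stack([dirs[i], -dirs[j]])
        if abs(np.linalg.det(A)) < 1e-9:
            continue                            # near-parallel pair
        t, _ = np.linalg.solve(A, origins[j] - origins[i])
        pt = origins[i] + t * dirs[i]
        # Perpendicular distance from pt to every bearing line (2-D cross)
        rel = pt - origins
        dist = np.abs(rel[:, 0] * dirs[:, 1] - rel[:, 1] * dirs[:, 0])
        inliers = int(np.sum(dist < tol))
        if inliers > best_inliers:
            best_pt, best_inliers = pt, inliers
    return best_pt, best_inliers
```

Outlier bearings, such as spurious detections during robot movement, simply fail the inlier test and do not corrupt the fit.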
10.
Nagata Y. Iwasaki S. Hariyama T. Fujioka T. Obara T. Wakatake T. Abe M. 《IEEE transactions on audio, speech, and language processing》2009,17(1):52-65
This paper addresses the problem of direction-of-arrival (DOA) estimation, in both azimuth and elevation, from binaural sound processed with a head-related transfer function (HRTF). Previously, we proposed a weighted Wiener gain (WWG) method for two-dimensional DOA estimation with two directional microphones. However, for signals processed with HRTFs, peaks in the spatial spectra of WWG indicating true sources can mingle with spurious peaks. To resolve this, we propose applying incremental source attenuation (ISA) in combination with WWG. ISA reduces the spectral components originating from specified sound sources and thereby improves the localization accuracy of the next targeted source in the proposed incremental estimation procedure. We conduct computer simulations using directional microphones and four HRTF sets corresponding to four individuals. The proposed method is compared with two DOA estimation methods equivalent to generalized cross-correlation functions and with two high-resolution methods, multiple signal classification (MUSIC) and the minimum variance method. For comparison purposes, we introduce binary coherence detection (BCD) into the high-resolution methods to emphasize spectral components valid for localization in multiple-source conditions. Evaluation results demonstrate that, although MUSIC with BCD yields performance comparable to that of WWG when a single speech source is present, WWG with ISA surpasses the other methods when two or three speech sources are present.
11.
12.
The sound pressure of a higher-order sound field is captured with a spherical microphone array; the field is decomposed with spherical harmonics to build a signal model, and the MUSIC algorithm is applied to extract the source directions. Because the resolution of MUSIC degrades severely when the sources are highly coherent, especially when they are close together, a lobe-partitioning method based on spatial smoothing is proposed to improve the localization performance. Simulations with a 72-element spherical array show that the proposed method can effectively localize multiple sources in the field simultaneously and is fairly robust against noise.
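The MUSIC step used above can be sketched in its generic array form: project candidate steering vectors onto the noise subspace of the spatial covariance matrix and peak-pick the resulting pseudo-spectrum. For brevity this sketch uses a uniform linear array rather than the paper's spherical-harmonic formulation and omits the spatial-smoothing stage; all names are illustrative.

```python
import numpy as np

def music_spectrum(R, steering, n_sources):
    """Narrowband MUSIC pseudo-spectrum: 1 / ||E_n^H a(theta)||^2,
    where E_n spans the noise subspace of the covariance matrix R."""
    w, v = np.linalg.eigh(R)                   # eigenvalues ascending
    En = v[:, : R.shape[0] - n_sources]        # noise-subspace eigenvectors
    proj = En.conj().T @ steering              # (n_noise, n_candidates)
    denom = np.sum(np.abs(proj) ** 2, axis=0)
    return 1.0 / (denom + 1e-12)
```

Directions whose steering vectors are (nearly) orthogonal to the noise subspace produce sharp peaks; coherent sources break this orthogonality, which is why the paper adds spatial smoothing.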
13.
Developing intelligent systems that detect abnormal events in the real world and in real time is challenging due to difficult environmental conditions, hardware limitations, and algorithmic restrictions, so detection performance often degrades in dynamically changing environments. In next-generation factories, however, an anomaly detection system based on acoustic signals is especially needed to quickly detect and respond to abnormal events during industrial processes, given the increased cost of complex equipment and facilities. In this study we propose a real-time Acoustic Anomaly Detection (AAD) system using sequence-to-sequence Autoencoder (AE) models in industrial environments. The proposed processing pipeline uses audio features extracted from a streaming audio signal captured by a single-channel microphone. The reconstruction error generated by the AE model measures the degree of abnormality of a sound event. The performance of a Convolutional Long Short-Term Memory AE (Conv-LSTMAE) is evaluated and compared with a sequential Convolutional AE (CAE) using sounds captured from various industrial manufacturing processes. Experiments with the real-time AAD system show that the Conv-LSTMAE-based AAD achieves better detection performance than the CAE-based AAD under different signal-to-noise ratio conditions for sound events such as explosion, fire, and glass breaking.
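The reconstruction-error scoring principle behind this abstract can be sketched compactly. As a stand-in for the autoencoder, the sketch below uses PCA reconstruction (fit on normal-condition feature frames only); a frame whose reconstruction error far exceeds the range seen in training is flagged as anomalous. The class, threshold rule, and parameter values are all illustrative assumptions, not the paper's models.

```python
import numpy as np

class ReconstructionAnomalyDetector:
    """Anomaly scoring by reconstruction error, with PCA as a simple
    stand-in for an autoencoder trained on normal audio features."""
    def __init__(self, n_components=4):
        self.n_components = n_components

    def fit(self, X):
        self.mean_ = X.mean(axis=0)
        _, _, vt = np.linalg.svd(X - self.mean_, full_matrices=False)
        self.comp_ = vt[: self.n_components]          # learned "codebook"
        # Threshold well above the errors seen on normal training data
        self.threshold_ = 5.0 * np.quantile(self.score(X), 0.99)
        return self

    def score(self, X):
        Z = (X - self.mean_) @ self.comp_.T           # encode
        recon = Z @ self.comp_ + self.mean_           # decode
        return np.mean((X - recon) ** 2, axis=1)      # per-frame error

    def is_anomaly(self, X):
        return self.score(X) > self.threshold_
```

In the streaming setting of the paper, `score` would be applied to each incoming feature frame, with `threshold_` calibrated on normal factory sounds.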
14.
D. Blake Barber Joshua D. Redding Timothy W. McLain Randal W. Beard Clark N. Taylor 《Journal of Intelligent and Robotic Systems》2006,47(4):361-382
This paper presents a method for determining the GPS location of a ground-based object when imaged from a fixed-wing miniature air vehicle (MAV). Using the pixel location of the target in an image, measurements of MAV position and attitude, and camera pose angles, the target is localized in world coordinates. The main contribution of this paper is to present four techniques for reducing the localization error. In particular, we discuss RLS filtering, bias estimation, flight path selection, and wind estimation. The localization method has been implemented and flight tested on BYU’s MAV testbed and experimental results are presented demonstrating the localization of a target to within 3 m of its known GPS location.
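The core geometric step, localizing a target from its pixel location plus camera pose, can be sketched as a ray-plane intersection under a flat-earth assumption. This is an illustrative pinhole-camera sketch, not the paper's method (which additionally applies RLS filtering, bias estimation, and the other error-reduction techniques); all names are invented.

```python
import numpy as np

def localize_target(cam_pos, R_cam_to_world, pixel, f, cx, cy):
    """Geolocate a ground target seen at `pixel`: cast the pixel ray
    into the world frame (z up) and intersect it with the flat-earth
    plane z = 0. Pinhole model, camera z-axis pointing forward."""
    cam_pos = np.asarray(cam_pos, dtype=float)
    # Ray through the pixel in the camera frame (focal length in pixels)
    ray_cam = np.array([pixel[0] - cx, pixel[1] - cy, f], dtype=float)
    ray_world = np.asarray(R_cam_to_world) @ ray_cam
    if ray_world[2] >= 0:
        raise ValueError("pixel ray does not point toward the ground")
    t = -cam_pos[2] / ray_world[2]        # scale to reach z = 0
    return cam_pos + t * ray_world
```

Errors in attitude and pose angles scale with altitude here, which is why the paper's bias-estimation and flight-path-selection techniques matter in practice.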
15.
In a tele-operated robot environment, reproducing auditory scenes and conveying the 3D spatial information of sound sources are essential for giving operators a more realistic sense of tele-presence. In this paper, we propose a tele-presence robot system that enables reproduction and manipulation of auditory scenes. The system operates on the basis of 3D information about where the targeted humans are speaking, matched with the operator's head orientation. We employ multiple microphone arrays and human tracking technologies to localize and separate voices around the robot. At the operator site, the separated sound sources are rendered using head-related transfer functions (HRTFs) according to their spatial positions and the operator's head orientation, which is tracked in real time. Two-party and three-party interaction experiments indicated that the proposed system gives significantly higher accuracy in perceiving the direction of sounds and higher subjective scores for sense of presence and listenability, compared with a baseline system using stereo binaural sounds obtained from two microphones mounted on the humanoid robot's ears.
16.
To address sound source localization in noisy and reverberant environments, a microphone-array localization method based on particle filtering is adopted. Within the particle filter framework, the speech signals arriving at the microphones serve as observations, and the likelihood function is constructed from the output energy of a beamformer steered over the microphone array. Experimental results show that the method improves the robustness of the localization system to noise and reverberation, achieving high localization accuracy even at low signal-to-noise ratios with strong reverberation.
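The particle filter loop described above can be sketched generically: propagate position hypotheses, reweight each by the beamformer output energy steered at that hypothesis, and resample when the weights degenerate. This is an illustrative bootstrap-filter sketch, not the paper's system; the motion model, resampling rule, and names are assumptions, and `likelihood_fn` stands in for the steered-beamformer energy.

```python
import numpy as np

def pf_update(particles, weights, likelihood_fn, motion_std=0.05, rng=None):
    """One predict/update/resample cycle of a bootstrap particle filter
    over 2-D source positions; the likelihood is the beamformer output
    energy steered at each particle's hypothesized position."""
    rng = rng or np.random.default_rng()
    # Predict: random-walk motion model
    particles = particles + motion_std * rng.standard_normal(particles.shape)
    # Update: weight by steered-beamformer energy at each hypothesis
    weights = weights * np.array([likelihood_fn(p) for p in particles])
    weights = weights / weights.sum()
    # Resample when the effective sample size collapses
    if 1.0 / np.sum(weights ** 2) < 0.5 * len(weights):
        idx = rng.choice(len(particles), size=len(particles), p=weights)
        particles = particles[idx]
        weights = np.full(len(weights), 1.0 / len(weights))
    return particles, weights
```

The weighted mean of the particle cloud gives the source position estimate at each frame.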
17.
In this work, we describe an autonomous mobile robotic system for finding, investigating, and modeling ambient noise sources in the environment. The system has been fully implemented in two different environments, using two different robotic platforms and a variety of sound source types. Making use of a two-step approach to autonomous exploration of the auditory scene, the robot first quickly moves through the environment to find and roughly localize unknown sound sources using the auditory evidence grid algorithm. Then, using the knowledge gained from the initial exploration, the robot investigates each source in more depth, improving upon the initial localization accuracy, identifying volume and directivity, and, finally, building a classification vector useful for detecting the sound source in the future.
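An evidence-grid update of the kind the auditory evidence grid algorithm builds on can be sketched in log-odds form: cells consistent with a measured bearing cone are reinforced, and the rest are slightly decayed, so cells where bearings from several robot poses intersect accumulate the most evidence. This is a generic occupancy-grid-style sketch under invented parameters, not the published algorithm.

```python
import numpy as np

def update_evidence_grid(grid, robot_xy, bearing, cell=0.25,
                         beam_halfwidth=0.15, hit=0.4, miss=-0.05):
    """Log-odds evidence update: cells whose direction from the robot
    falls inside the measured bearing cone are reinforced, all other
    cells are slightly decayed."""
    h, w = grid.shape
    ys, xs = np.mgrid[0:h, 0:w]
    # Cell centers relative to the robot, in world coordinates
    cx = (xs + 0.5) * cell - robot_xy[0]
    cy = (ys + 0.5) * cell - robot_xy[1]
    ang = np.arctan2(cy, cx)
    diff = np.angle(np.exp(1j * (ang - bearing)))   # wrapped angle difference
    inside = np.abs(diff) < beam_halfwidth
    grid[inside] += hit
    grid[~inside] += miss
    return grid
```

After bearings from a few distinct robot positions, the grid maximum marks the rough source location that the second, in-depth investigation phase then refines.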
18.
19.
Jean-Marc Jot 《Multimedia Systems》1999,7(1):55-69
This paper gives an overview of the principles and methods for synthesizing complex 3D sound scenes by processing multiple individual source signals. Signal-processing techniques for directional sound encoding and rendering over loudspeakers or headphones are reviewed, as well as algorithms and interface models for synthesizing and dynamically controlling room reverberation and distance effects. A real-time modular spatial-sound-processing software system, called Spat, is presented. It allows reproducing and controlling the localization of sound sources in three dimensions and the reverberation of sounds in an existing or virtual space. A particular aim of the Spatialisateur project is to provide direct and computationally efficient control over perceptually relevant parameters describing the interaction of each sound source with the virtual space, irrespective of the chosen reproduction format over loudspeakers or headphones. The advantages of this approach are illustrated in practical contexts, including professional audio, computer music, multimodal immersive simulation systems, and architectural acoustics.
20.
Donatas Trapenskas Örjan Johansson 《International Journal of Industrial Ergonomics》2001,27(6):405-410
One of the problems associated with listening to binaurally recorded sound events is localization confusion. The main objective of this investigation was to find out whether a short training session prior to listening to binaural recordings through headphones would facilitate correct spatial perception of the sound field. Focus was on the localization of sound stimuli in the median plane. Sound signals were recorded with an artificial head in three different conditions: anechoic, highly reverberant, and moderately reverberant. Fourteen subjects participated in the listening tests. All subjects were required to localize all virtual sound stimuli under two different conditions. In the first condition, a short training session, binaurally recorded in the same environment as the subsequent sound stimuli, preceded the test, and only stimuli recorded in that environment were presented. The second condition had no training session, and sound stimuli recorded in different environments were presented. Results showed that a short training session prior to listening to binaurally recorded sounds through headphones was useful, as it facilitated localization performance. The largest effect was a reduction in the number of sounds perceived inside the head, most pronounced for sound stimuli recorded in the anechoic environment.