1.
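A general method is proposed for analyzing timbre in spatial sound with a binaural auditory model, and Ambisonics is analyzed as an example. Ambisonics is a spatial sound system based on physical sound field reconstruction; the error in the final reconstructed sound field and the resulting timbre change are caused jointly by spatial aliasing errors in microphone recording and in reproduction. A revised Moore binaural loudness model is used to calculate the binaural loudness level spectra of the Ambisonics-reconstructed sound field, which are compared with those of the target sound field to evaluate the timbre change quantitatively. The results show that, with ideal recorded signals, the upper frequency limit and the size of the region for reproduction without timbre change both increase with the Ambisonics order. With microphone array recording, as long as the upper frequency limit of the array exceeds that of Ambisonics reproduction, the influence of the array's spatial aliasing error on the final reconstructed sound field and its perceived timbre is negligible below the upper frequency limit of reproduction. On this basis, an optimized design method for Ambisonics systems that jointly considers recording and reproduction performance is proposed. A psychoacoustic experiment yields results consistent with the binaural auditory model, which also validates the model analysis.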
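The dependence of the upper frequency limit and region size on the Ambisonics order can be illustrated with the widely used rule of thumb kr ≤ N (wavenumber times region radius not exceeding the order). The sketch below is only that rule of thumb, not the binaural loudness analysis of the abstract; the function names, the speed of sound of 343 m/s, and the 0.0875 m head radius in the usage lines are assumptions made for illustration.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value


def upper_frequency_hz(order, radius_m, c=SPEED_OF_SOUND):
    """Approximate upper frequency for accurate reconstruction inside a
    region of the given radius, from the rule of thumb k * r <= N."""
    if order < 0 or radius_m <= 0:
        raise ValueError("order must be >= 0 and radius_m must be > 0")
    return order * c / (2.0 * math.pi * radius_m)


def region_radius_m(order, freq_hz, c=SPEED_OF_SOUND):
    """Approximate radius of the accurately reconstructed region at a
    given frequency, from the same rule of thumb."""
    if order < 0 or freq_hz <= 0:
        raise ValueError("order must be >= 0 and freq_hz must be > 0")
    return order * c / (2.0 * math.pi * freq_hz)


if __name__ == "__main__":
    head_radius = 0.0875  # m, hypothetical head-sized region
    for n in (1, 2, 3, 5):
        print(n, round(upper_frequency_hz(n, head_radius), 1))
```

Under this approximation both quantities grow linearly with the order, which matches the qualitative trend reported in the abstract; the perceptual limits found with the loudness model need not coincide with these values.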
2.
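One goal of binaural reproduction is to produce the perception of virtual sources at different directions and distances in headphone reproduction. This paper studies a simplified signal processing method for reproducing the direction and distance information of free-field virtual sources with dynamic binaural Ambisonics. The method has two steps. The first, based on the spherical harmonic decomposition of the target sound field, synthesizes the signals that reconstruct the target sound field order by order in near-field Ambisonics reproduction with loudspeakers. The second uses virtual loudspeaker reproduction: dynamic head-related transfer function (HRTF) filtering converts the Ambisonics loudspeaker signals into binaural signals, which are reproduced over headphones. The influence of the order of dynamic binaural Ambisonics on localization is further studied to provide a basis for simplifying the signal processing. Analysis of the binaural pressures produced in reproduction shows that 5th-order dynamic binaural Ambisonics is sufficient to provide the important information for auditory direction localization and distance perception. Psychoacoustic experiments also show that, combined with the loudness factor related to source distance, 5th-order dynamic binaural Ambisonics reproduction can produce the auditory perception of free-field virtual sources at different directions and at different near-field distances below 1.0 m. The method requires only far-field, non-individualized HRTF processing for 48 uniformly distributed spatial directions at a fixed distance, which simplifies the signal processing.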
3.
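In virtual sound reproduction with two loudspeakers, different spatial auditory percepts are produced by accurately reconstructing the binaural pressures. The localization performance of the reproduction should be determined jointly by the cost and the stability of binaural pressure control. Previous studies mainly analyzed the stability of binaural pressure control and used it as the basis for loudspeaker arrangement and signal processing. This paper shows that stability analysis alone is insufficient to fully assess the localization performance of loudspeaker-based virtual sound reproduction. The cost of binaural pressure control is further analyzed using the average power of the responses of the virtual sound signal processing filters. The results show that narrowing the span angle of a left-right symmetric loudspeaker arrangement, or using an asymmetric arrangement, markedly increases the cost of binaural pressure control when producing lateral target virtual sources. Virtual source (virtual sound image) localization experiments show that an increased control cost causes localization defects. In practice, to produce lateral virtual sources effectively, arrangements with an overly narrow span angle (such as the stereo dipole) and asymmetric loudspeaker arrangements should be avoided.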
4.
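To establish a quantitative representation of the timbre features of underwater noise for target recognition, this paper relates the scores on four essential timbre dimensions, obtained from subjective evaluation experiments, to the central auditory response to the sounds, obtains a partial least squares regression model of timbre, and analyzes each dimension physically on the basis of the regression coefficients. To verify the method, a large number of timbre descriptors are extracted as independent variables for comparison; the results show that the central auditory model has some advantage in predictive ability. It is also found that the first three essential timbre dimensions can be described by the proportion of high-frequency energy, spectral flatness, and temporal continuity, respectively, whereas the fourth dimension cannot be related to any acoustic feature.
Keywords:
essential timbre
central auditory model
partial least squares regression
timbre descriptors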
5.
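To improve the binaural reproduction of 5.1-channel surround sound, a dynamic binaural reproduction method based on a low-cost head-tracking module is proposed. The head-tracking module uses a microcontroller to acquire the output data of a magnetic sensor and an acceleration sensor, calculates the horizontal orientation of the listener's head, and sends it to a computer over a USB interface for dynamic binaural signal synthesis. Psychoacoustic experiments show that the proposed method can eliminate front-back confusion and in-head localization of virtual sources and improve virtual source localization in binaural reproduction of 5.1-channel surround sound.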
6.
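With the development of VR headset technology, an ordinary smartphone can now serve as a platform for virtual reality and dynamic audio and video reproduction. This paper proposes a smartphone-based technique for dynamic binaural reproduction of multichannel surround sound and an efficient implementation of its signal processing. The phone's acceleration sensor, electronic compass, and gyroscope form a head tracker that detects the orientation of the listener's head in real time, and the phone's signal processing chip performs the dynamic binaural synthesis. Minimum-phase approximation of the head-related impulse responses and principal component decomposition are used to simplify the binaural synthesis and improve processing efficiency. The system structure and the software and hardware design are given, together with an example implementing dynamic binaural reproduction of 22.2-channel spatial surround sound. Objective measurements and psychoacoustic experiments validate the proposed method.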
7.
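Channel signal correlation and auditory spatial impression in surround sound reproduction (cited by: 1; self-citations: 0; citations by others: 1)
Psychoacoustic experiments were used to study the relationship between auditory spatial impression and the correlation of the channel signals of four loudspeakers in 5.1-channel surround sound reproduction: front left, front right, left surround, and right surround. The results show that, for reproduction with the front left and right loudspeakers or with the left and right surround loudspeakers, the width of the frontal or rear sound image can be changed to some extent by controlling the correlation coefficient of the channel signals. The quantitative relationship between image width and the correlation coefficient differs for signals in different frequency ranges. For reproduction with a pair of lateral loudspeakers, however, the image width essentially cannot be changed by controlling the correlation coefficient, and the image is very narrow. For simultaneous reproduction with both the front and surround pairs, appropriately controlling the correlation coefficients of the channel signals of each loudspeaker pair gives a fairly strong sense of envelopment for pink noise and for octave-band signals with center frequencies no higher than 1 kHz, but no envelopment can be obtained for octave-band signals centered at 2 kHz and 4 kHz. Further theoretical calculations and experimental measurements show that there is no unique correspondence between reproduced image width and the interaural cross-correlation coefficient (IACC), which may be related to how IACC is calculated. The calculation method and applicability of IACC require further experimental verification. The results will help practical surround sound program production and evaluation.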
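Because the abstract notes that the result may depend on how IACC is calculated, it helps to see one common definition spelled out: the maximum absolute value of the normalized cross-correlation between the two ear signals over lags within ±1 ms. The sketch below implements only that definition in plain Python; the function name `iacc`, the ±1 ms default, and the full-signal integration window are assumptions, and the paper's own calculation may differ (for example in time window or frequency band).

```python
import math


def iacc(left, right, fs, max_lag_ms=1.0):
    """Interaural cross-correlation coefficient: the maximum absolute value
    of the normalized cross-correlation over lags within +/- max_lag_ms."""
    if len(left) != len(right) or not left:
        raise ValueError("left and right must be non-empty and equal length")
    n = len(left)
    max_lag = int(round(fs * max_lag_ms / 1000.0))
    # Normalization by the geometric mean of the two signal energies.
    norm = math.sqrt(sum(x * x for x in left) * sum(y * y for y in right))
    if norm == 0.0:
        return 0.0
    best = 0.0
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        best = max(best, abs(acc) / norm)
    return best


if __name__ == "__main__":
    fs = 8000
    tone = [math.sin(0.1 * i) for i in range(400)]
    print(iacc(tone, tone, fs))  # identical ear signals
```

Identical ear signals give a value of 1, and less similar signals give smaller values. Choices such as the lag range, the integration window, and any band filtering change the number, which is one way the calculation method can affect the relationship with perceived image width.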
8.
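By Chen Ke'an (陈克安). Beijing: Science Press, 2014, 348 pages, price: 128 yuan. Acoustics is the science of sound; it studies the generation, propagation, reception, and effects of sound. Sound originally referred to vibrations propagating in air that can be perceived by human hearing. Human understanding of sound began with language. By observing acoustic phenomena and studying their regularities, people recognized the wave nature of sound very early, created acoustic devices (mainly musical instruments), developed test methods, and obtained many important results. But until Galileo introduced the concepts of frequency and period in the early 17th century, there was no study of the nature of vibration and waves, and of course no concept of the speed of sound. The reason was that neither the physical quantities describing the characteristics of specific acoustic (physical) phenomena nor the methods for measuring them had been found.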
9.
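Chinese Journal of Acoustics (《声学学报:英文版》), 2015, (3)
To improve speech intelligibility for hearing-impaired listeners in complex scenes, this paper proposes a binaural sound source localization algorithm for hearing aids that mimics human hearing. The algorithm first draws on the frequency-analysis property of the cochlea and on auditory masking to decompose the sound signal into multiple channels, and extracts the signals in the frequency bands to which the ear is most sensitive to estimate the interaural time difference (ITD). It then extracts valid ITD information on the basis of the Haas effect, and finally uses a head-related model to convert the ITD into source direction. To improve localization under reverberation and multiple interfering sources, a multichannel weighted combination strategy is also proposed. Simulations and field tests show that the algorithm is robust to interference and localizes accurately. Moreover, in intelligibility tests with 7 subjects, the speech enhancement algorithm combined with the localization algorithm achieves a 3 to 5 dB improvement over existing hearing aid enhancement algorithms.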
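The core step of turning an ITD into a direction can be sketched in a few lines. The example below is a minimal illustration, not the algorithm of the paper: it estimates the ITD as the lag of the cross-correlation peak of the full-band signals (no cochlear filterbank, masking, Haas-effect selection, or channel weighting), and it converts the ITD to azimuth with a simple sine-law model. The function names, the 0.175 m ear spacing, and the 343 m/s speed of sound are assumptions.

```python
import math


def estimate_itd(left, right, fs, max_itd_s=1.0e-3):
    """Estimate the ITD in seconds as the lag that maximizes the
    cross-correlation. Positive values mean the right-ear signal lags
    the left-ear signal."""
    if len(left) != len(right) or not left:
        raise ValueError("left and right must be non-empty and equal length")
    n = len(left)
    max_lag = int(round(max_itd_s * fs))
    best_lag, best_val = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        acc = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                acc += left[i] * right[j]
        if acc > best_val:
            best_lag, best_val = lag, acc
    return best_lag / fs


def itd_to_azimuth_deg(itd_s, ear_distance_m=0.175, c=343.0):
    """Convert an ITD to azimuth with the sine law ITD = (d / c) * sin(theta).
    This simple model is an assumption, not the paper's head-related model.
    With the sign convention above, positive angles point toward the left ear."""
    x = max(-1.0, min(1.0, itd_s * c / ear_distance_m))
    return math.degrees(math.asin(x))


if __name__ == "__main__":
    fs = 8000
    left = [0.0] * 64
    right = [0.0] * 64
    left[10] = 1.0   # click reaches the left ear first
    right[13] = 1.0  # and the right ear 3 samples later
    itd = estimate_itd(left, right, fs)
    print(itd, itd_to_azimuth_deg(itd))
```

A per-band version would apply the same estimate to each filterbank channel and then combine the channel estimates, which is where a weighting strategy like the one described in the abstract would enter.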
10.
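Since stereo was introduced in the 1930s, people have tirelessly pursued realistic auditory experiences. Binaural audio processing, based on the perceptual characteristics of human hearing, uses computers and digital signal processing to simulate at the listener's eardrums the same sound pressures as in the real scene, aiming to give an "immersive" experience. It has long been an important research topic in audio signal processing and has attracted more attention in the past two years with the rapid growth of applications such as virtual reality. This paper gives a fairly systematic introduction to the key elements of binaural audio processing, namely binaural recording, binaural synthesis, headphone reproduction, loudspeaker reproduction, and head tracking, as well as typical application scenarios, and closes with a summary and outlook.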
11.
Chinese Journal of Acoustics (《声学学报:英文版》), 2015, (4)
A scheme for analyzing timbre in spatial sound with a binaural auditory model is proposed, and Ambisonics is taken as an example for analysis. Ambisonics is a spatial sound system based on physical sound field reconstruction. The errors and timbre colorations in the final reconstructed sound field depend on the spatial aliasing errors in both the recording and reproducing stages of Ambisonics. The binaural loudness level spectra in Ambisonics reconstruction are calculated using Moore's revised loudness model and then compared with the result for a real sound source, so as to evaluate the timbre coloration in Ambisonics quantitatively. The results indicate that, in the case of ideal independent signals, the high-frequency limit and the radius of the region without perceived timbre coloration increase with the order of Ambisonics. On the other hand, in the case of recording by a microphone array, once the high-frequency limit of the microphone array exceeds that of sound field reconstruction, array recording has little influence on the binaural loudness level spectra, and thus on timbre in the final reconstruction, up to the high-frequency limit of reproduction. Based on the binaural auditory model analysis, a scheme for optimizing the design of Ambisonics recording and reproduction is also suggested. The subjective experiment yields results consistent with those of the binaural model, which verifies the effectiveness of the model analysis.
12.
Major criteria for a successful binaural reproduction are not only a suitable localization performance, but also the authenticity and plausibility of the presented scene. It is therefore interesting to examine whether the binaural reproduction can be perceptually distinguished from a real source. The aim of the presented investigation is to compare the quality of the binaural reproduction via headphones with two different microphone setups (miniature microphone in Open-Dome and ear plug) for individual head-related-transfer-function (HRTF) and headphone-transfer-function (HpTF) measurements. Listening tests with a total of 80 subjects were carried out focusing on plausibility and authenticity. In the examination of plausibility, detection rates showed that subjects were not able to match the reproduced pink noise to its reproduction system (real source vs. binaural reproduction via headphones). The authenticity of the static binaural reproduction was highly dependent on the stimulus. Pink noise could often be distinguished due to coloration in higher frequencies and small differences in location. No difference between the microphone setups was found in either of the listening tests.
13.
The spectral resolution of the binaural system was measured using a tone-detection task in a binaural analog of the notched-noise technique. Three listeners performed 2-interval, 2-alternative, forced choice tasks with a 500-ms out-of-phase signal within 500 ms of broadband masking noise consisting of an "outer" band of either interaurally uncorrelated or anticorrelated noise, and an "inner" band of interaurally correlated noise. Three signal frequencies were tested (250, 500, and 750 Hz), and the asymmetry of the filter was measured by keeping the signal at a constant frequency and moving the correlated noise band relative to the signal. Thresholds were taken for bandwidths of correlated noise ranging from 0 to 400 Hz. The equivalent rectangular bandwidth of the binaural filter was found to increase with signal frequency, and estimates tended to be larger than monaural bandwidths measured for the same listeners using equivalent techniques.
14.
There are many approaches to achieving high-performance speech enhancement. Modeling the human auditory system is a good approach, since human beings can focus on target speech under concurrent speech conditions. One example of a binaural model is the time domain binaural model. However, this model has a high calculation cost because the algorithm is based on auto-correlation, which is computationally intensive. Another example is the frequency domain binaural model proposed by Nakashima et al. [Nakashima H, Chisaki Y, Usagawa T, Ebata M. Frequency domain binaural model based on interaural phase and level differences. Acoust Sci Technol 2003;24(4):172-8]. Since the frequency domain binaural model uses the fast Fourier transform, its calculation cost is much lower than that of the time domain binaural model. Therefore, it is not difficult to perform real-time processing using recent hardware such as digital signal processors and even laptop personal computers. However, the quality of the segregated sound obtained using the frequency domain binaural model depends on system parameters such as frequency resolution and frame shift length for overlap-adding in the time domain. This paper introduces the construction of a prototype of a hearing assistant system based on the frequency domain binaural model. The detailed implementation techniques and parameter tuning are described. The proposed system runs in real time after parameter tuning. The directional attenuation levels, that is, the directivity patterns of the proposed system, are measured. Finally, it is shown that the prototype can extract sounds coming from specific directions in real time.
15.
J W Hall, R S Tyler, M A Fernandes. 《The Journal of the Acoustical Society of America》1983,73(3):894-898
Several studies using bandlimited masking noise have indicated that N0S0 frequency resolution is better than that for N0Sπ. The present study examined N0S0 and N0Sπ frequency resolution with two different masking methods: bandlimited noise and notched noise. Noise spectrum levels of 10, 30, and 50 dB/Hz were used. Thresholds were determined for a 500-Hz signal, using a three-alternative forced-choice adaptive procedure, as a function of masker bandwidth and notchwidth. For N0S0 presentation, 3-dB down points were comparable for the notched-noise and bandlimiting methods. For N0Sπ presentation, 3-dB down points were generally greater for the bandlimiting method than the notched-noise method. Furthermore, for N0Sπ presentation, the 3-dB down estimate increased as noise level increased for the bandlimiting method, but stayed constant for the notched-noise method. It is suggested that the two masking methods measured different aspects of binaural processing.
16.
The fidelity of reproducing free-field sounds using a virtual auditory display was investigated in two experiments. In the first experiment, listeners directly compared stimuli from an actual loudspeaker in the free field with those from small headphones placed in front of the ears. Headphone stimuli were filtered using head-related transfer functions (HRTFs), recorded while listeners were wearing the headphones, in order to reproduce the pressure signatures of the free-field sounds at the eardrum. Discriminability was investigated for six sound-source positions using broadband noise as a stimulus. The results show that the acoustic percepts of real and virtual sounds were identical. In the second experiment, discrimination between virtual sounds generated with measured and interpolated HRTFs was investigated. Interpolation was performed using HRTFs measured for loudspeaker positions with different spatial resolutions. Broadband noise bursts with flat and scrambled spectra were used as stimuli. The results indicate that, for a spatial resolution of about 6 degrees, the interpolation does not introduce audible cues. For resolutions of 20 degrees or more, the interpolation introduces audible cues related to timbre and position. For intermediate resolutions (10 degrees - 15 degrees) the data suggest that only timbre cues were used.
17.
In a 3D auditory display, sounds are presented over headphones in such a way that they seem to originate from virtual sources in a space around the listener. This paper describes a study on the possible merits of such a display for bandlimited speech with respect to intelligibility and talker recognition against a background of competing voices. Different conditions were investigated: speech material (words/sentences), presentation mode (monaural/binaural/3D), number of competing talkers (1-4), and virtual position of the talkers (in 45-degree steps around the front horizontal plane). Average results for 12 listeners show an increase of speech intelligibility for 3D presentation for two or more competing talkers compared to conventional binaural presentation. The ability to recognize a talker is slightly better and the time required for recognition is significantly shorter for 3D presentation in the presence of two or three competing talkers. Although absolute localization of a talker is rather poor, spatial separation appears to have a significant effect on communication. For speech intelligibility, talker recognition, and localization alike, no difference is found between the use of an individualized 3D auditory display and a general display.