20 similar documents found (search time: 15 ms)
A significant improvement in speech intelligibility in background noise was shown in a group of industrial subjects conditioned to working in noise, compared with a control group of university staff. With noise-induced hearing loss, speech intelligibility in noise deteriorated progressively once losses had occurred at the 2 kHz pure-tone audiometric frequency.

Noise abatement in office environments often focuses on the reduction of background speech intelligibility and noise level, as attainable with frequency-specific insulation. However, only limited empirical evidence exists regarding the effects of reducing speech intelligibility on cognitive performance and subjectively perceived disturbance. Three experiments tested the impact of low background speech (35 dB(A)) of both good and poor intelligibility, in comparison to silence and highly intelligible speech not lowered in level (55 dB(A)). The disturbance impact of the latter speech condition on verbal short-term memory (n = 20) and mental arithmetic (n = 24) was significantly reduced during soft and poorly intelligible speech, but not during soft and highly intelligible speech. No effect of background speech on verbal-logical reasoning performance (n = 28) was found. Subjective disturbance ratings, however, were consistent over all three experiments with, for example, soft and poorly intelligible speech rated as the least disturbing speech condition but still disturbing in comparison to silence. It is concluded, therefore, that a combination of objective performance tests and subjective ratings is desirable for the comprehensive evaluation of acoustic office environments and their alterations.

This paper describes speech intelligibility enhancement for Hidden Markov Model (HMM) generated synthetic speech in noise. We present a method for modifying the Mel cepstral coefficients generated by statistical parametric models that have been trained on plain speech. We update these coefficients such that the glimpse proportion, an objective measure of the intelligibility of speech in noise, increases while the speech energy is kept fixed. An acoustic analysis reveals that the modified speech is boosted in the region 1–4 kHz, particularly for vowels, nasals and approximants. Results from listening tests employing speech-shaped noise show that the modified speech is as intelligible as a synthetic voice trained on plain speech whose duration, Mel cepstral coefficients and excitation signal parameters have been adapted to Lombard speech from the same speaker. Our proposed method does not require these additional recordings of Lombard speech. In the presence of a competing talker, both modification and adaptation of spectral coefficients give more modest gains.

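Speech enhancement algorithm for multi-noise environments based on a small microphone array   Total citations: 1, self-citations: 0, citations by others: 1
To address the sharp drop in performance of devices such as hearing aids in non-stationary environments or environments where several noise types coexist, a coherence-filtering generalized sidelobe canceller (CF-GSC) speech enhancement algorithm based on a small microphone array is proposed. The algorithm exploits the characteristics of the signals captured by the array and distinguishes two classes of noise: approximately white noise such as ocean waves and fans, which is weakly correlated across array elements, and point-source signals and other competing noise, which are strongly correlated. Coherence filtering removes weakly correlated noise well, and the conventional generalized sidelobe canceller (GSC) structure removes strongly correlated noise well; the two are combined with voice activity detection (VAD) for joint processing during noise segments. Simulations show that, in environments with multiple noise types, the algorithm improves enhancement by about 2 dB over both the improved inter-channel coherence function filtering algorithm and the conventional GSC algorithm, while also achieving good speech intelligibility.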

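MMSE speech enhancement algorithm based on the auditory masking effect   Total citations: 2, self-citations: 2, citations by others: 0
To address the large speech distortion that the MMSE speech enhancement algorithm produces at low signal-to-noise ratios, an MMSE speech enhancement algorithm that incorporates the masking effect of human hearing is proposed. The algorithm uses the masking threshold to adjust the gain in the MMSE algorithm, so that the enhanced speech contains less residual noise and less speech distortion. SNR analysis of the speech before and after enhancement in computer simulations, together with subjective listening, shows that the improved MMSE algorithm not only raises the SNR of the speech signal but also reduces speech distortion and improves intelligibility.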

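Existing hearing-aid speech enhancement algorithms leave a large amount of residual background noise in non-stationary noise environments and also introduce "musical noise", so the intelligibility and SNR of the enhanced speech are unsatisfactory. To address these problems, a binary-masking speech enhancement algorithm based on noise estimation is proposed. The algorithm draws on auditory perception theory, combining the characteristics of human hearing with the working mechanism of the cochlea. The minima-controlled recursive averaging (MCRA) algorithm is used to obtain the estimated noise and a preliminary enhanced speech signal. The estimated noise and the preliminary enhanced speech are each filtered by a gammatone filterbank that simulates a cochlear model, yielding their time-frequency representations. The auditory masking property of the human ear is then used to compute a binary mask for the noisy speech in the time-frequency domain, and the binary mask is applied to obtain the enhanced speech. Experimental results show that the algorithm largely removes the "musical noise" introduced by spectral subtraction and that, compared with MCRA-based spectral subtraction, the Speech Intelligibility Index (SII), Perceptual Evaluation of Speech Quality (PESQ) score and signal-to-noise ratio (SNR) of the enhanced speech are all improved.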
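The masking step in the abstract above can be illustrated with a minimal sketch. The abstract does not give the masking rule, so the local-SNR threshold used here (and the names `binary_mask`, `apply_mask` and `threshold_db`) are assumptions for illustration, not the authors' criterion. Inputs are taken to be per-unit energies of the preliminary enhanced speech and of the estimated noise after filterbank analysis.

```python
import math


def binary_mask(speech_energy, noise_energy, threshold_db=0.0):
    """Return a 0/1 mask per time-frequency unit (frames x channels).

    A unit is kept (1) when its local SNR in dB exceeds threshold_db.
    The threshold rule is an assumption; the abstract does not specify it.
    """
    eps = 1e-12  # guards against division by zero and log of zero
    mask = []
    for s_frame, n_frame in zip(speech_energy, noise_energy):
        row = []
        for s, n in zip(s_frame, n_frame):
            local_snr_db = 10.0 * math.log10((s + eps) / (n + eps))
            row.append(1 if local_snr_db > threshold_db else 0)
        mask.append(row)
    return mask


def apply_mask(noisy_tf, mask):
    """Zero out the time-frequency units that the mask marks as noise."""
    return [[x * m for x, m in zip(x_frame, m_frame)]
            for x_frame, m_frame in zip(noisy_tf, mask)]
```

A real system would resynthesize a waveform from the masked filterbank outputs; that step is omitted here.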

Post-filtering can be used in mobile communications to improve the quality and intelligibility of speech. Energy reallocation with a high-pass type filter has been shown to work effectively in improving the intelligibility of speech in difficult noise conditions. This paper introduces a post-filtering algorithm that adapts to the background noise level as well as to the fundamental frequency of the speaker and models the spectral effects observed in natural Lombard speech. The introduced method and another post-filtering technique were compared to unprocessed telephone speech in subjective listening tests in terms of intelligibility and quality. The results indicate that the proposed method outperforms the reference method in difficult noise conditions.

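To address the poor noise reduction, excessive musical noise and low speech intelligibility of basic spectral subtraction at low signal-to-noise ratios, an improved spectral subtraction algorithm is proposed. The algorithm first computes the cepstral distance of the speech signal to detect noise segments and speech segments, and replaces the statistical noise mean used in basic spectral subtraction with a dynamically computed noise value. The subtraction factor is set dynamically from the ratio of the cepstral distances of the current frame and the noise frame, which overcomes the fixed subtraction factor of the conventional algorithm. Three methods are also applied to suppress musical noise. Simulations show that, at low SNR, the improved spectral subtraction algorithm reduces noise effectively and raises SNR and intelligibility, achieving the goal of speech enhancement.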
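As a rough illustration of the subtraction step, the following sketch applies power spectral subtraction to one frame with an over-subtraction factor and a spectral floor. It is a generic textbook form, not the paper's method: the abstract does not say how the factor is derived from the cepstral-distance ratio, so `alpha` is simply passed in by the caller, and the names `spectral_subtract`, `alpha` and `beta` are assumptions.

```python
def spectral_subtract(noisy_power, noise_power, alpha=2.0, beta=0.01):
    """Subtract a scaled noise estimate from one frame of power spectrum.

    alpha: over-subtraction factor (could be varied frame by frame).
    beta:  spectral floor, as a fraction of the noise estimate, which
           limits the isolated peaks heard as musical noise.
    """
    enhanced = []
    for y, d in zip(noisy_power, noise_power):
        diff = y - alpha * d
        floor = beta * d
        # Keep the subtracted value only when it stays above the floor.
        enhanced.append(diff if diff > floor else floor)
    return enhanced
```

Calling this once per frame with a different `alpha` each time is one way a dynamically set subtraction factor could be plugged in.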

This paper studies the synthesis of speech over a wide vocal effort continuum and its perception in the presence of noise. Three types of speech are recorded and studied along the continuum: breathy, normal, and Lombard speech. Corresponding synthetic voices are created by training and adapting the statistical parametric speech synthesis system GlottHMM. Natural and synthetic speech along the continuum is assessed in listening tests that evaluate the intelligibility, quality, and suitability of speech in three different realistic multichannel noise conditions: silence, moderate street noise, and extreme street noise. The evaluation results show that the synthesized voices with varying vocal effort are rated similarly to their natural counterparts both in terms of intelligibility and suitability.

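To address the reduced recognition accuracy and degraded communication quality that speech systems suffer under strong external noise, a speech enhancement method based on adaptive noise estimation is proposed. Endpoint detection divides the signal into speech and non-speech segments, and the noise magnitude spectrum is estimated adaptively for each case. The assumptions in spectral subtraction that do not hold in general are examined, and the underlying formula is revised accordingly. Experimental results show that, compared with conventional spectral subtraction, the method suppresses musical noise better, maintains high clarity and intelligibility, and improves speech recognition accuracy and communication quality in strong-noise environments.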

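Wiener filtering is one of the algorithms commonly used to improve speech understanding for hearing-impaired listeners in noisy environments. To address the large bias in noise spectrum estimation of the conventional Wiener filter, a hearing-aid noise reduction algorithm based on an improved multi-channel Wiener filter is proposed. Taking into account the characteristics of human hearing and of loudness compensation in hearing aids, the algorithm first decomposes the speech signal into multiple sub-band signals with a Gammatone filterbank. A Wiener filter based on a priori SNR estimation then enhances the speech in each sub-band. Finally, the sub-band signals are recombined to obtain the enhanced speech. In addition, to improve the noise spectrum estimate of the Wiener filter, a voice activity detection algorithm based on envelope estimation is proposed and used to improve Wiener filter performance. Experimental results show that, compared with conventional Wiener filtering, the method suppresses residual noise more effectively and improves speech intelligibility, and is of practical value.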

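Speech intelligibility is an important aspect of evaluating speech masking systems for open-plan offices, and the objective measure commonly used at present is the STI. A signal obtained by reversing a speech signal in time within frames of a fixed length is called time-reversed speech, and it has been used as an effective masking signal. Although the STI correlates well with subjectively perceived intelligibility for speech masked by stationary noise, research has found that it is not suitable for estimating the intelligibility of speech masked by time-reversed speech. This paper analyzes the STI, PESQ and mNCM objective measures and reports experiments showing that PESQ and mNCM still estimate intelligibility well for speech masked by reversed speech. Based on the objective results, the paper further compares how the parameters of the reversed-speech masking algorithm (reversal frame length and signal-to-noise ratio) affect intelligibility. Increasing the reversal frame length and lowering the SNR both lead to lower speech intelligibility.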
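The construction of time-reversed speech described above can be sketched as follows. This is a minimal illustration that assumes non-overlapping frames and a final partial frame that is reversed on its own; the abstract does not state how frame boundaries are handled, and the name `reverse_frames` is hypothetical.

```python
def reverse_frames(samples, frame_len):
    """Reverse a signal in time within consecutive frames of frame_len samples."""
    if frame_len <= 0:
        raise ValueError("frame_len must be positive")
    out = []
    for start in range(0, len(samples), frame_len):
        # Each frame keeps its position; only the samples inside it are flipped.
        out.extend(reversed(samples[start:start + frame_len]))
    return out
```

For example, `reverse_frames([1, 2, 3, 4, 5, 6, 7], 3)` gives `[3, 2, 1, 6, 5, 4, 7]`. With a sample rate `fs`, a frame duration of `t` seconds corresponds to `frame_len = int(fs * t)`.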

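Noise degrades the quality of speech signals, so speech enhancement is needed. Speech enhancement is a frontier area of speech signal processing, and its main goal is to extract the clean original speech signal from noisy speech. This paper introduces the principles of speech enhancement methods and simulates conventional spectral subtraction and an improved spectral subtraction method. The improved method adjusts parameters of the noisy signal and then performs spectral subtraction in the frequency domain. The experiments show that the improved method enhances speech markedly better than the conventional one. The SNRs of the two methods were also computed, and the improved spectral subtraction method yields a much higher SNR than conventional spectral subtraction.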
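The abstract compares methods by signal-to-noise ratio without giving the formula. One common definition, assumed here, treats the difference between the clean reference and the processed signal as the noise: SNR = 10·log10(Σ s² / Σ (s − ŝ)²). The function name `snr_db` is hypothetical.

```python
import math


def snr_db(clean, estimate):
    """SNR in dB of an estimate against a time-aligned clean reference."""
    signal_energy = sum(c * c for c in clean)
    error_energy = sum((c - e) ** 2 for c, e in zip(clean, estimate))
    if error_energy == 0.0:
        return math.inf   # estimate equals the reference exactly
    if signal_energy == 0.0:
        return -math.inf  # silent reference, all energy is error
    return 10.0 * math.log10(signal_energy / error_energy)
```

Computing this for the noisy input and for each enhanced output gives the SNR improvement of each method, provided all signals are aligned sample by sample.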

What makes speech produced in the presence of noise (Lombard speech) more intelligible than conversational speech produced in quiet conditions? This study investigates the hypothesis that speakers modify their speech in the presence of noise in such a way that acoustic contrasts between their speech and the background noise are enhanced, which would improve speech audibility. Ten French speakers were recorded while playing an interactive game, first in a quiet condition, then in two types of noisy conditions with different spectral characteristics: a broadband noise (BB) and a cocktail-party noise (CKTL), both played over loudspeakers at 86 dB SPL. Similarly to Lu and Cooke (2009b), our results suggest no systematic "active" adaptation of the whole speech spectrum or vocal intensity to the spectral characteristics of the ambient noise. Regardless of the type of noise, the gender or the type of speech segment, the primary strategy was to speak louder in noise, with a greater adaptation in BB noise and an emphasis on vowels rather than any type of consonants. Active strategies were evidenced, but were subtle and of second order to the primary strategy of speaking louder: for each gender, fundamental frequency (f0) and first formant frequency (F1) were modified in cocktail-party noise in a way that optimized the release in energetic masking induced by this type of noise. Furthermore, speakers showed two additional modifications as compared to shouted speech, which therefore cannot be interpreted in terms of vocal effort only: they enhanced the modulation of their speech in f0 and vocal intensity and they boosted their speech spectrum specifically around 3 kHz, in the region of maximum ear sensitivity associated with the actor's or singer's formant.

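For non-stationary noise environments and low signal-to-noise ratios, a non-stationary noise estimation method based on the characteristics of speech in the low-frequency region is proposed. A time-varying weight is constructed to estimate the noise in real time, and, taking the auditory masking effect into account, the estimated noise is used to set the enhancement coefficient adaptively. Simulations show that the method suppresses background noise well, raises the SNR and reduces speech distortion.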

The Callsign Acquisition Test (CAT) is a speech intelligibility test developed by the US Army Research Laboratory. The test has been used to evaluate speech transmission through various communication systems but has not yet been sufficiently standardised and validated. The aim of this study was to compare CAT and Modified Rhyme Test (MRT) performance in the presence of white noise across a range of signal-to-noise ratios (SNRs). A group of 16 normal-hearing listeners participated in the study. The speech items were presented at 65 dB(A) in the background of white noise at SNRs of -18, -15, -12, -9 and -6 dB. The results showed a strong positive association (75.14%) between the two tests, but significant differences between the CAT and MRT absolute scores in the range of investigated SNRs. Based on the data, a function to predict CAT scores based on existing MRT scores and vice versa was formulated. STATEMENT OF RELEVANCE: This work compares performance data of a common speech intelligibility test (MRT) with a new test (CAT) in the presence of white noise. The results here can be used as a part of the standardisation procedures and provide insights into the predictive capabilities of the CAT to quantify speech intelligibility communication in high-noise military environments.

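The conventional speech enhancement generative adversarial network (SEGAN) uses the time-domain speech waveform as its mapping target. At low signal-to-noise ratios the time-domain waveform is buried in noise, so the enhancement performance of SEGAN drops sharply and speech distortion becomes severe. To address this problem, a multi-stage time-frequency-domain generative adversarial speech enhancement algorithm (multi-stage-time-frequency SEGAN, MS-TFSEGAN) is proposed. MS-TFSEGAN adopts a model structure with a multi-stage generator and dual time-domain and frequency-domain discriminators, refining the mapping result progressively while capturing both time-domain and frequency-domain information. In addition, to further improve the model's ability to learn frequency-domain detail, MS-TFSEGAN introduces a frequency-domain L1 loss into the generator loss function. Experiments show that, at low SNR, MS-TFSEGAN improves speech quality and intelligibility over SEGAN by about 13.32% and 8.97% respectively, and achieves a 7.3% relative improvement in CER when used as a front end for speech recognition.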

A novel dual-microphone speech enhancement technique is proposed in the present paper. The technique utilizes the coherence between the target and noise signals as a criterion for noise reduction and can be generally applied to arrays with closely-spaced microphones, where noise captured by the sensors is highly correlated. The proposed algorithm is simple to implement and requires no estimation of noise statistics. In addition, it offers the capability of coping with multiple interfering sources that might be located at different azimuths. The proposed algorithm was evaluated with normal hearing listeners using intelligibility listening tests and compared against a well-established beamforming algorithm. Results indicated large gains in speech intelligibility relative to the baseline (front microphone) algorithm in both single and multiple-noise source scenarios. The proposed algorithm was found to yield substantially higher intelligibility than that obtained by the beamforming algorithm, particularly when multiple noise sources or competing talker(s) were present. Objective quality evaluation of the proposed algorithm also indicated significant quality improvement over that obtained by the beamforming algorithm. The intelligibility and quality benefits observed with the proposed coherence-based algorithm make it a viable candidate for hearing aid and cochlear implant devices.

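GE Wanying, ZHANG Tianqi. Journal of Computer Applications (《计算机应用》), 2019, 39(10): 3065-3070
Single-channel speech enhancement algorithms obtain enhanced speech by estimating and suppressing the noise component of noisy speech. However, noise estimation algorithms tend to over-estimate, so that some estimated noise energy values are larger than the actual values. Although these over-estimates can be removed by compensation, the error this introduces also lowers the overall quality of the enhanced speech. To address this problem, a time-frequency mask estimation and optimization algorithm based on computational auditory scene analysis (CASA) is proposed. First, the decision-directed (DD) algorithm is used to estimate the a priori signal-to-noise ratio (SNR) and compute an initial mask. Second, the cross-correlation (ICC) coefficient between the noise and the noisy speech within each Gammatone band is used to compute the noise presence probability, which is combined with the energy spectrum of the noisy speech to obtain a new noise estimate with less over-estimation than the original one. Then, an optimization algorithm iterates on the initial mask to reduce the errors caused by noise over-estimation and to increase its target speech content; iteration stops once the stopping condition is met, yielding a new mask. Finally, the new mask is used to synthesize the enhanced speech. Experimental results show that, under different background noises, the new mask gives the enhanced speech higher speech quality (PESQ) and speech intelligibility (STOI) scores than before optimization, improving both how the speech sounds and how intelligible it is.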
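The first step of the abstract above, a decision-directed a priori SNR estimate followed by an initial mask, can be sketched in its widely used textbook form. The smoothing constant `a`, the use of the Wiener gain ξ/(1+ξ) as the initial mask, and the name `decision_directed_gain` are assumptions; the abstract does not state which mask function or constants the authors use.

```python
def decision_directed_gain(noisy_power, noise_power, prev_clean_power, a=0.98):
    """Per-bin gain for one frame from a decision-directed a priori SNR.

    noisy_power:      |Y|^2 of the current frame
    noise_power:      noise power estimate for the current frame
    prev_clean_power: |S_hat|^2 estimated for the previous frame
    a:                smoothing constant between 0 and 1 (assumed value)
    """
    eps = 1e-12
    gains = []
    for y, d, s_prev in zip(noisy_power, noise_power, prev_clean_power):
        d = max(d, eps)                # avoid division by zero
        gamma = y / d                  # a posteriori SNR
        xi = a * s_prev / d + (1.0 - a) * max(gamma - 1.0, 0.0)  # a priori SNR
        gains.append(xi / (1.0 + xi))  # Wiener-type gain used as a soft mask
    return gains
```

Processing frame by frame, the clean power fed to the next call would be `gain ** 2 * noisy_power` for each bin. The later noise re-estimation and iterative mask optimization described in the abstract are not covered by this sketch.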

The study investigated whether properties of speech produced in noise (Lombard speech) were more distributed (thus potentially more distinct) and/or more consistent than those from speech produced in quiet. This was examined for auditory tokens by measuring vowel space dispersion and by determining the consistency of formant production across repeated instances. Vowel space was not expanded for speech produced in noise; there was a tendency for formants to be produced more consistently in noise (with less variation in formant frequency across repeated instances) but this was not a secure effect. The distinctiveness and consistency of Lombard visual speech were also examined using motion capture data. Relative distinctiveness was determined by comparing the amount of mouth and jaw motion for speech produced in noise and quiet; relative consistency by comparing the size of correlations for motion produced across repeated instances in the noise or in quiet conditions. Mouth and jaw motion was larger for speech in noise; however, there was no greater association between the movement measures for repeated instances of speech in noise compared to in quiet. We also examined whether the correlation between auditory and motion properties was greater for speech produced in noise than in quiet. It was found that the association between speech RMS energy and jaw motion was greater for speech in noise. The results show that although Lombard speech affects both auditory and visible articulatory properties in ways likely to enhance speech perception, it does not increase production consistency.
