Similar literature
20 similar articles found (search time: 151 ms)
1.
The effect of additive noise on speech cepstral coefficients is investigated. As the level of additive noise increases, the cepstral vector shrinks in norm and converges to the cepstral vector of the noise. This nonlinear behaviour can be approximated by a simple linear expression. Based on this representation, a model adaptation method is developed using deviation vectors. For every model state mean, a deviation vector is calculated from the extracted noise spectrum and a pre-defined noise-to-signal ratio. During pattern matching, an optimal scaling factor for the deviation vector is determined frame by frame, and the scaled deviation vector is added to the state mean of the speech models so that the clean speech models are adapted to the noisy environment. Experimental results show that the proposed method is effective for white noise and coloured noise. It also outperforms the weighted projection measure method in the experiments.
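The frame-by-frame scaling step in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the scale is chosen to minimise the squared Euclidean distance between the observed frame and the adapted mean, which has a closed-form least-squares solution. The function name `adapt_state_mean` and the clamping of the scale to non-negative values are hypothetical choices made for the example.

```python
import numpy as np

def adapt_state_mean(frame, state_mean, deviation):
    """Shift a clean-speech state mean along a deviation vector.

    Assumption: the scale alpha minimises
    ||frame - (state_mean + alpha * deviation)||^2,
    which gives alpha = <frame - mean, dev> / <dev, dev>.
    """
    denom = float(np.dot(deviation, deviation))
    if denom == 0.0:
        # No deviation direction available: leave the mean unchanged.
        return state_mean.copy(), 0.0
    alpha = float(np.dot(frame - state_mean, deviation)) / denom
    # Assumption: a negative scale (moving away from the noise) is not allowed.
    alpha = max(alpha, 0.0)
    return state_mean + alpha * deviation, alpha
```

In a recogniser the adapted mean would replace the clean state mean when the distance (or likelihood) for that frame is evaluated; how the deviation vector itself is derived from the noise spectrum is not specified in the abstract and is left out here.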

2.
We propose a novel feature processing technique that provides a cepstral liftering effect in the log-spectral domain. Cepstral liftering aims to equalize the variance of cepstral coefficients for distance-based speech recognizers and, as a result, provides robustness to additive noise and speaker variability. However, in the popular hidden Markov model based framework, cepstral liftering has no effect on recognition performance. We derive a filtering method in the log-spectral domain that corresponds to cepstral liftering. The proposed method performs high-pass filtering based on the decorrelation of filter-bank energies. We show that in noisy speech recognition, the proposed method reduces the error rate by 52.7% relative to the conventional feature.

3.
A new cepstrum normalisation method is proposed to compensate for distortion caused by additive noise. Conventional methods only compensate for the deviation of the cepstral mean and/or variance. However, deviations of higher-order moments also exist in noisy speech signals. The proposed method normalises the cepstrum up to its third-order moment, bringing the probability density functions of clean and noisy cepstra closer together than is possible with conventional methods. Speaker-independent isolated-word recognition experiments show that the proposed method performs better than conventional methods, especially in heavy noise environments.

4.
The wavelet transform has been found to be an effective tool for the time-frequency analysis of non-stationary and quasi-stationary signals, and in recent years it has been used for feature extraction in speech recognition applications. In this paper, a sub-band feature extraction technique based on an admissible wavelet transform is proposed, and the features are modified to make them robust to additive white Gaussian noise. The performance of this system is compared with that of conventional mel frequency cepstral coefficients (MFCC) under various signal-to-noise ratios. Under noisy conditions, recognition performance with the eight sub-band features is found to be superior to that with MFCC features.

5.
In this paper, we propose a robust distant-talking speech recognition method that combines a cepstral-domain denoising autoencoder (DAE) with a temporal structure normalization (TSN) filter. Because a DAE has a deep structure and nonlinear processing steps, it is flexible enough to model a highly nonlinear mapping between input and output space. We train a DAE to map reverberant and noisy speech features to the underlying clean speech features in the cepstral domain. After applying the DAE in the cepstral domain to suppress reverberation, we apply a post-processing step based on the TSN filter, which reduces noise and reverberation effects by normalizing the modulation spectra to reference spectra of clean speech. The proposed method was evaluated using speech in simulated and real reverberant environments. By combining a cepstral-domain DAE and TSN, the average word error rate (WER) was reduced from 25.2% for the baseline system to 21.2% in simulated environments and from 47.5% to 41.3% in real environments.

6.
A segment-based speech recognition scheme is proposed. The basic idea is to model explicitly the correlation among successive frames of speech signals by using features that represent contours of spectral parameters. The speech signal of an utterance is regarded as a template formed by directly concatenating a sequence of acoustic segments. Each constituent acoustic segment is of variable length and is represented by a fixed-dimensional feature vector formed from the coefficients of discrete orthonormal polynomial expansions that approximate its spectral parameter contours. In training, an automatic algorithm generates several segment-based reference templates for each syllable class. In testing, a frame-based dynamic programming procedure calculates the matching score between the test utterance and each reference template. The performance of the proposed scheme was examined by simulations on multi-speaker speech recognition for 408 highly confusable isolated Mandarin base-syllables. A recognition rate of 81.1% was achieved using 5-segment, 8-reference template models with cepstral and delta-cepstral coefficients as the recognition features. This is 4.5% higher than that of a well-modelled 12-state, 5-mixture CHMM method using cepstral, delta-cepstral, and delta-delta-cepstral coefficients.

7.
Lee  L.-M. Wang  H.-C. 《Electronics letters》1995,31(8):616-617
The state parameters of the hidden Markov model are represented by the autocorrelation coefficients of a context window, which can be adaptively transformed to cepstral and delta-cepstral coefficients according to the environmental noise. Experimental results show that this representation can significantly improve the speech recognition rate in noisy environments.

8.
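Speech endpoint detection method based on signal recurrence rate analysis   (total citations: 1; self-citations: 0; citations by others: 1)
For speech endpoint detection in low-SNR, non-stationary noise environments, a method is proposed that exploits the difference between the source-system dynamics of speech and noise: it analyses changes in the recurrence rate of the signal and sets dual thresholds to decide the speech endpoints. Compared with the traditional energy-based method and the cepstral distance measure method, it achieves higher accuracy. The approach offers a new avenue for research on speech feature extraction and recognition.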

9.
Endpoint detection of noisy speech based on cepstral features   (total citations: 44; self-citations: 0; citations by others: 44)
胡光锐  韦晓东 《电子学报》2000,28(10):95-97
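One cause of recognition errors in speech recognition systems is inaccurate endpoint detection. At high signal-to-noise ratios (SNR), locating the endpoints of speech correctly is not difficult. However, most practical speech recognition systems must operate at low SNR, where conventional endpoint detection methods, such as energy-based detection, do not work effectively in noise. This paper uses cepstral features to detect speech endpoints and proposes two algorithms for endpoint detection of noisy speech. The first uses the cepstral distance instead of short-time energy as the decision threshold; the second improves hidden Markov model (HMM) based speech detection so that it adapts to changes in the noise. Experimental results show that the methods achieve highly accurate endpoint detection for noisy speech.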
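The first algorithm, which thresholds a cepstral distance instead of short-time energy, can be sketched roughly as below. This is an illustrative sketch under several assumptions, not the paper's algorithm: the noise cepstrum is estimated from the first few frames (assumed to contain no speech), the distance uses a commonly quoted form with the factor 10/ln 10 ≈ 4.3429, and the function names, frame sizes, and the threshold rule (`threshold_scale` times the largest distance seen in the noise-only frames) are all hypothetical.

```python
import numpy as np

def real_cepstrum(frame, n_coeffs=12):
    """Low-order real cepstrum c_0..c_n of one Hann-windowed frame."""
    spectrum = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
    log_spec = np.log(spectrum + 1e-10)       # small offset avoids log(0)
    return np.fft.irfft(log_spec)[: n_coeffs + 1]

def cepstral_distance(c, c_ref):
    """Cepstral distance in dB between two truncated cepstra."""
    diff = c - c_ref
    return 4.3429 * np.sqrt(diff[0] ** 2 + 2.0 * np.sum(diff[1:] ** 2))

def detect_speech_frames(signal, frame_len=256, hop=128,
                         noise_frames=10, threshold_scale=1.5):
    """Flag frames whose cepstrum is far from the estimated noise cepstrum."""
    n_frames = 1 + (len(signal) - frame_len) // hop
    ceps = np.array([real_cepstrum(signal[i * hop: i * hop + frame_len])
                     for i in range(n_frames)])
    # Assumption: the leading frames are noise only.
    noise_cep = ceps[:noise_frames].mean(axis=0)
    dists = np.array([cepstral_distance(c, noise_cep) for c in ceps])
    threshold = threshold_scale * dists[:noise_frames].max()
    return dists > threshold, dists
```

Endpoints would then be read off as the first and last flagged frames, usually after some smoothing to bridge short gaps; that post-processing is omitted here.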

10.
张殿飞  杨震  胡海峰 《信号处理》2016,32(9):1065-1071
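To address the poor reconstruction performance of compressed sensing for noisy speech at low signal-to-noise ratios (SNR), this paper proposes an adaptive fast reconstruction algorithm. The algorithm combines a row echelon measurement matrix with a new fast reconstruction algorithm and adaptively selects the best reconstruction parameters according to the SNR of the noisy speech signal, so that the reconstruction SNR improves while the speech is reconstructed. The algorithm is simple and fast to implement and does not require the signal sparsity to be computed in advance. Experimental results show that its reconstruction performance is better than that of the basis pursuit algorithm, the adaptive conjugate gradient projection algorithm, and the fast reconstruction algorithm; its reconstruction speed is slightly slower than the fast reconstruction algorithm but faster than basis pursuit and adaptive conjugate gradient projection.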

11.
Speech enhancement based on a data-driven dictionary and sparse representation   (total citations: 1; self-citations: 0; citations by others: 1)
孙林慧  杨震 《信号处理》2011,27(12):1793-1800
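This paper proposes an adaptive speech enhancement method based on a data-driven dictionary and overcomplete sparse representation. In the training stage, an overcomplete dictionary is trained on clean speech with the K-singular value decomposition (K-SVD) algorithm. In the testing stage, the optimal threshold is selected adaptively according to the noise variance of the noisy speech, and the orthogonal matching pursuit algorithm is used to sparsely decompose the noisy speech signal over the overcomplete dictionary. Finally, the speech signal is reconstructed from the sparse coefficients, achieving speech enhancement. Unlike traditional speech enhancement methods, which reduce or cancel the noise, this method selects suitable atoms from the dictionary to represent the clean signal, thereby separating the clean signal from the noisy one. The reconstructed speech was evaluated subjectively and objectively in white-noise and coloured-noise environments. Simulation results show that the method effectively removes additive noise and improves speech quality.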
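The sparse decomposition step can be illustrated with a textbook orthogonal matching pursuit loop. This is a generic sketch, not the paper's code: it assumes the dictionary columns (atoms) have unit norm, and it stops once the residual norm falls to a tolerance `residual_tol`, which stands in for the noise-variance-dependent threshold mentioned in the abstract (the paper's actual threshold rule is not given). The function name `omp` and its parameters are hypothetical.

```python
import numpy as np

def omp(dictionary, signal, residual_tol, max_atoms=None):
    """Orthogonal matching pursuit over a dictionary with unit-norm columns.

    Atoms are added greedily until the residual norm drops to
    residual_tol or max_atoms atoms have been selected.
    """
    n_atoms = dictionary.shape[1]
    if max_atoms is None:
        max_atoms = min(dictionary.shape)
    coeffs = np.zeros(n_atoms)
    support = []
    solution = np.zeros(0)
    residual = np.asarray(signal, dtype=float).copy()
    while len(support) < max_atoms and np.linalg.norm(residual) > residual_tol:
        # Pick the atom most correlated with the current residual.
        correlations = np.abs(dictionary.T @ residual)
        if support:
            correlations[support] = 0.0   # never reselect an atom
        support.append(int(np.argmax(correlations)))
        # Re-fit all selected atoms jointly (the "orthogonal" step).
        solution, *_ = np.linalg.lstsq(dictionary[:, support], signal, rcond=None)
        residual = signal - dictionary[:, support] @ solution
    if support:
        coeffs[support] = solution
    return coeffs
```

The enhanced frame is then `dictionary @ coeffs`. A larger `residual_tol` leaves more of the signal unexplained, which is what discards noise when the tolerance is matched to the noise level.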

12.
陈楠  鲍长春 《电子学报》2019,47(1):227-233
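Drawing on the principle of binaural cue coding, this paper proposes a single-channel speech enhancement method built on a prior codebook of binaural cues for speech and noise. First, the binaural cues of speech and noise are treated as prior knowledge and trained offline into a prior codebook. Then, online, the clean cue parameters are estimated with a weighted codebook mapping (WCBM) algorithm. Finally, the noisy speech is enhanced using the binaural cue coding principle. In addition, a deep neural network, namely stacked auto-encoders (SAE), is used in place of the WCBM algorithm to estimate the clean cue parameters, giving a deep-neural-network-based binaural cue speech enhancement algorithm that further improves enhancement performance. Objective test results show that the proposed methods outperform the reference algorithms.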

13.
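Single-channel speech enhancement algorithm based on DCT and Wiener filtering   (total citations: 5; self-citations: 0; citations by others: 5)
To address speech enhancement against complex noise backgrounds, a new single-channel speech enhancement algorithm is proposed based on the discrete cosine transform (DCT) and Wiener filtering. The algorithm does not depend on any speech signal model and requires no prior assumptions about the statistical properties of the noise. It exploits the correlation between speech signal components at consecutive time instants in the DCT domain, combined with the minimum mean square error algorithm, to obtain an optimal estimate of the clean speech components, which remedies the shortcoming of typical algorithms that estimate the speech components from a single frame of noisy speech. Simulation results under several noise backgrounds show that the algorithm delivers good speech enhancement in both subjective and objective tests.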

14.
王骞  何培宇  徐自励 《信号处理》2020,36(6):902-910
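Existing deep neural network speech enhancement methods have limited ability to denoise noisy speech and yield only modest gains in speech quality. To address this, a deep neural network speech enhancement method based on singular spectrum analysis is proposed. Singular spectrum analysis is first applied to the noisy speech as preprocessing, giving a preliminary separation into speech signal and noise. The separated speech signal and noise are then used to train the deep neural network model, producing a better-performing network. Finally, when reconstructing clean speech, the log-power spectrum estimated by the neural network and the log-power spectrum of the noisy speech are both used, combined with weighting coefficients, so that the method adapts to different SNRs, removes background noise effectively, and reduces speech distortion. Simulation experiments verify the effectiveness and robustness of the method.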

15.
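Noise-adaptive multi-stream composite sub-band speech recognition method   (total citations: 3; self-citations: 0; citations by others: 3)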
张军  韦岗 《电子与信息学报》2006,28(7):1183-1187
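First, to address the limited range of features to which the marginalisation technique of existing missing-data speech recognition can be applied, a reliability estimation method for cepstral feature components is proposed, extending marginalisation to commonly used cepstral speech recognition systems. Then, exploiting the complementary performance of marginalisation recognisers based on full-band and sub-band cepstral features in different noises, a noise-adaptive multi-stream composite sub-band speech recognition method is proposed. Experimental results show that the proposed method adaptively selects whichever of the full-band and sub-band data streams is less affected by noise and relies mainly on it for recognition, effectively improving the robustness of the recognition system in varying noise environments.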

16.
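Based on the characteristics of vehicular ad hoc networks, an anonymous secure authentication mechanism based on elliptic-curve zero-knowledge proofs is proposed. A mutual anonymous authentication algorithm removes the need for the sender and receiver to exchange signature certificates, preventing node identity privacy from leaking over insecure channels. A message aggregation algorithm based on message authentication codes lets roadside units assist in batch authentication of messages, which speeds up message authentication and avoids the loss of large numbers of messages that cannot be authenticated in time under high traffic density. Analysis and simulation experiments show that the mechanism provides privacy protection and traceability for vehicle nodes and ensures message integrity. Compared with existing anonymous secure authentication algorithms for vehicular networks, the mechanism has lower message delay and message loss rate, and lower communication overhead.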

17.
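Traditional linear prediction analysis of speech signals degrades in noisy environments. To address this, a new noise-robust linear prediction algorithm based on super-Gaussian excitation is proposed. The algorithm models the linear prediction excitation of the speech signal with the Student's t distribution, which has super-Gaussian characteristics, and explicitly accounts for the effect of environmental noise, yielding a probabilistic graphical model for linear prediction analysis of speech. On this basis, variational Bayesian inference is used to obtain the approximate posterior distribution of the model parameters, giving an optimal estimate of the linear prediction coefficients of noisy speech. Experimental results show that the algorithm effectively improves the robustness of linear prediction analysis of speech in noisy environments.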

18.
简志华  杨震 《信号处理》2007,23(3):383-387
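This paper proposes an improved cepstral-domain feature compensation algorithm, GMCSM. In view of the time-varying nature of speech signals, GMCSM models the variance of the speech signal with a Generalized Auto-Regressive Conditional Heteroscedasticity (GARCH) model. Experimental data show that, compared with the conventional cepstral subtraction methods CSM and MEMCSM, GMCSM compensates more effectively for the distortion of cepstral feature parameters caused by additive noise and reduces the recognition error rate; its advantage is most pronounced at low SNR.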

19.
The authors propose a degradation model that represents the spectral changes of speech signals caused by the Lombard effect and noise contamination in noisy environments. According to this model, spectral magnitude normalisation and cepstral coefficient transforms are used to restore the cepstrum of clean speech from noisy Lombard speech.

20.
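Building on the basic theory of the classical speech spectral subtraction algorithm, this work addresses its drawback that, when recovering the time-domain signal, the phase of the noisy speech is substituted for the phase of the clean speech, which degrades the denoising result. Based on the geometric phase relationship among the noisy speech power spectrum, the noise spectrum, and the clean speech power spectrum, an improved algorithm is proposed that recovers the signal directly with the clean speech phase. In simulation experiments comparing time-domain waveforms and signal-to-noise ratios, the proposed algorithm shows a measurable improvement over the classical spectral subtraction algorithm.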
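For context, the classical baseline that this abstract improves on can be sketched for a single frame as below. Note that this is the classical power spectral subtraction with the noisy phase reused, that is, the step the paper criticises; the paper's geometric estimate of the clean phase is not described in the abstract and is not reproduced here. The function name, the spectral floor parameter `floor`, and the assumption that a noise power estimate `noise_power` is already available are all hypothetical.

```python
import numpy as np

def spectral_subtract_frame(noisy_frame, noise_power, floor=0.01):
    """Classical power spectral subtraction on one frame.

    noise_power: estimated noise power per rfft bin (assumed given).
    floor: fraction of the noisy power kept as a lower bound, which
           avoids negative power after subtraction.
    """
    spectrum = np.fft.rfft(noisy_frame)
    noisy_power = np.abs(spectrum) ** 2
    # Subtract the noise power estimate, never going below the floor.
    clean_power = np.maximum(noisy_power - noise_power, floor * noisy_power)
    # Classical choice: reuse the noisy phase for resynthesis.
    phase = np.angle(spectrum)
    clean_spectrum = np.sqrt(clean_power) * np.exp(1j * phase)
    return np.fft.irfft(clean_spectrum, n=len(noisy_frame))
```

A full enhancer would window overlapping frames, estimate `noise_power` from speech-free segments, and overlap-add the outputs.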
