1.
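Speech is produced by the joint action of several articulators, so articulator movements and the speech signal are inherently linked. This paper studies the factors involved in predicting viseme parameters with neural networks: the choice of speech parameters, the time-domain span of the input speech, and the optimization of the network structure. Experiments show that linear prediction parameters plus short-time energy outperform other speech parameters, that forward coarticulation has a greater influence than backward coarticulation, and that feedback improves the performance of a feedforward neural network. Given that the experiments used arbitrary continuous speech, the resulting mean squared error of about 0.0114 is quite attractive.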
2.
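This paper presents an improved algorithm for Mel-frequency cepstral coefficient (MFCC) features. The algorithm extracts the residual phase from the speech signal by linear prediction, combines it with conventional MFCCs, and applies the combined features in a speech recognition system. The improved algorithm achieves a higher recognition rate than the conventional MFCC algorithm.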
3.
4.
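This paper proposes a linear prediction analysis method. The spectral envelope is estimated from frequency samples on a normalized frequency axis and is specified on the mel frequency scale. Sampled autocorrelation estimates are then extracted with the IDFT, and spectral envelope cepstral coefficients (SEC) are finally derived from the sampled autocorrelation. HMM (hidden Markov model) recognition experiments show that, compared with other algorithms, SEC markedly improves recognition performance at low signal-to-noise ratios.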
5.
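Research on speech recognition systems and the extraction of their feature parameters  Total citations: 2, self-citations: 0, citations by others: 2
In a speech recognition system, the choice of feature parameters has a decisive effect on recognition performance. This paper studies several important speech feature parameters, including linear prediction cepstral coefficients, Mel cepstral coefficients, and wavelet-analysis-based parameters, analyzes and compares them, and closes with an outlook on future speech recognition research.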
6.
7.
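Extracting speech information accurately and effectively in noisy environments is a key difficulty in speech recognition, and applying it in embedded systems has research value. After a comparative analysis of two traditional feature extraction methods, linear prediction cepstral coefficients and Mel-frequency cepstral coefficients, a new method is proposed: speech features are extracted by combining Mel-frequency cepstral coefficients with first-order difference Mel-frequency cepstral coefficients (MFCC + ΔMFCC), endpoint detection uses the double-threshold method, and an HMM performs model matching. A hardware and software system built around the ARMSX2410 was also designed. Compared with traditional methods, the approach improves system robustness, recognition accuracy, and system efficiency, and is suited to speech recognition in noisy environments.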
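As a rough illustration of the MFCC + ΔMFCC feature described in the entry above, the sketch below appends first-order difference coefficients to each frame. It assumes the commonly used regression form of the delta over a window of ±`width` frames with edge frames repeated; the abstract does not say which difference formula or window the authors used, and the function names `delta` and `append_delta` are hypothetical.

```python
def delta(frames, width=2):
    """First-order regression deltas over +/- `width` frames.

    `frames` is a list of equal-length coefficient lists (one per frame).
    Assumption: edge frames are repeated to pad the window.
    """
    num_frames = len(frames)
    denom = 2 * sum(n * n for n in range(1, width + 1))
    out = []
    for t in range(num_frames):
        row = []
        for k in range(len(frames[t])):
            acc = 0.0
            for n in range(1, width + 1):
                ahead = frames[min(t + n, num_frames - 1)][k]
                behind = frames[max(t - n, 0)][k]
                acc += n * (ahead - behind)
            row.append(acc / denom)
        out.append(row)
    return out


def append_delta(frames, width=2):
    """Concatenate each static frame with its delta frame."""
    return [f + d for f, d in zip(frames, delta(frames, width))]
```

For a coefficient that grows by one per frame, the interior deltas come out as 1.0, which is a quick sanity check that the regression weights are normalized correctly.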
8.
9.
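Linear prediction cepstral coefficients (LPCC) capture the characteristics of the human vocal tract well, while Mel cepstral coefficients (MFCC) model the auditory response of the human ear well. To address the uneven recognition accuracy of MFCC across frequency bands and the inability of LPCC to model the human auditory system accurately, MFCC and IMFCC parameters are used as the features for different frequency bands of speech and combined with linear prediction parameters (LPCC), balancing the filter distribution so that the whole frequency range is covered. The Mel cepstral and linear prediction parameters are thus combined as the feature set for speech recognition. Experimental results show that the improved algorithm gains in both efficiency and recognition rate, to varying degrees.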
10.
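This paper applies linear discriminant analysis in a Chinese continuous speech recognition system. Several frames of raw features are concatenated before feature selection, so the inter-frame correlation between features is exploited effectively and the recognition rate improves. Experimental results show that the system error rate drops by 16.90%.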
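The multi-frame concatenation step mentioned in the entry above can be sketched as follows. This only shows the stacking of neighbouring frames into one long vector; the linear discriminant analysis projection that would follow is not implemented here. The `context` size, the edge handling (repeating the first and last frame), and the name `stack_frames` are assumptions for illustration, not details from the paper.

```python
def stack_frames(frames, context=1):
    """Concatenate each frame with `context` frames on either side.

    Returns one vector of length (2 * context + 1) * dim per input frame.
    Assumption: indices past either end are clamped to the edge frame.
    """
    n = len(frames)
    stacked = []
    for t in range(n):
        vec = []
        for offset in range(-context, context + 1):
            idx = min(max(t + offset, 0), n - 1)
            vec.extend(frames[idx])
        stacked.append(vec)
    return stacked
```

With `context=1` and two-dimensional frames, each stacked vector has six entries; a dimensionality-reducing transform such as LDA would then be estimated on these stacked vectors.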
11.
Ekman L.A., Kleijn W.B., Murthi M.N. IEEE Transactions on Audio, Speech, and Language Processing, 2008, 16(1): 65-73
All-pole spectral envelope estimates based on linear prediction (LP) for speech signals often exhibit unnaturally sharp peaks, especially for high-pitch speakers. In this paper, regularization is used to penalize rapid changes in the spectral envelope, which improves the spectral envelope estimate. Based on extensive experimental evidence, we conclude that regularized linear prediction outperforms bandwidth-expanded linear prediction. The regularization approach gives lower spectral distortion on average, and fewer outliers, while maintaining a very low computational complexity.
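For context on the baseline named in the entry above, bandwidth expansion is usually described as scaling the k-th predictor coefficient by γ^k, which replaces A(z) with A(z/γ) and pulls every pole toward the origin by the factor γ. The sketch below shows only that baseline, not the paper's regularized estimator; the default `gamma=0.99` and the name `bandwidth_expand` are illustrative assumptions.

```python
def bandwidth_expand(lpc, gamma=0.99):
    """Scale predictor coefficients a_k by gamma**k.

    `lpc` is [1, a_1, ..., a_p] for A(z) = 1 + sum_k a_k z^-k.
    Assumption: 0 < gamma <= 1; smaller values broaden spectral peaks more.
    """
    return [a * gamma ** k for k, a in enumerate(lpc)]
```

For the first-order predictor `[1.0, -0.9]`, whose pole sits at 0.9, calling `bandwidth_expand` with `gamma=0.5` moves the pole to 0.45.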
12.
Joe Frankel, Simon King. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(1): 246-256
The majority of automatic speech recognition systems rely on hidden Markov models, in which Gaussian mixtures model the output distributions associated with sub-phone states. This approach, whilst successful, models consecutive feature vectors (augmented to include derivative information) as statistically independent. Furthermore, spatial correlations present in speech parameters are frequently ignored through the use of diagonal covariance matrices. This paper continues the work of Digalakis and others who proposed instead a first-order linear state-space model which has the capacity to model underlying dynamics, and furthermore give a model of spatial correlations. This paper examines the assumptions made in applying such a model and shows that the addition of a hidden dynamic state leads to increases in accuracy over otherwise equivalent static models. We also propose a time-asynchronous decoding strategy suited to recognition with segment models. We describe implementation of decoding for linear dynamic models and present TIMIT phone recognition results.
13.
14.
15.
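This paper describes the design of software that plots three-dimensional speech spectrograms on a general-purpose PC. Based on linear prediction analysis and using two output modes, the software draws the extracted spectral envelope on a display and on a printer. Its plotting capability and quality are comparable to those of a plotter, and it is convenient and flexible.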
16.
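Based on a speaker recognition system using a vector quantization model, this paper studies several speaker features: linear prediction coefficients (LPC) and features derived from them, including linear prediction cepstral coefficients (LPCC), reflection coefficients (REFL), log area ratio coefficients (LAR), arcsine coefficients (ARCSIN), and line spectral frequencies (LSF), as well as formants. Simulation experiments compare the classification error of these features under different parameter settings and summarize rules for choosing parameters when linear prediction analysis is applied to speaker feature extraction.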
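Several of the features listed in the entry above are computed from the same linear prediction analysis. The sketch below, which is not taken from the paper, uses the Levinson-Durbin recursion to obtain predictor and reflection coefficients from an autocorrelation sequence and then derives LPCC, LAR, and arcsine coefficients from them. It assumes the sign convention A(z) = 1 + Σ a_i z^-i; sign conventions for reflection coefficients and log area ratios vary between texts, and all function names are hypothetical.

```python
import math


def levinson_durbin(r, order):
    """Solve the LP normal equations from autocorrelation r[0..order].

    Returns (a, k, err): a = [1, a_1, ..., a_p] for A(z) = 1 + sum a_i z^-i,
    k = reflection coefficients, err = final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        k.append(ki)
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + ki * prev[i - j]
        a[i] = ki
        err *= 1.0 - ki * ki
    return a, k, err


def lpc_to_cepstrum(a, n_ceps):
    """Cepstrum c_1..c_n of 1/A(z) via the standard LPC-to-cepstrum recursion."""
    p = len(a) - 1
    c = [0.0] * (n_ceps + 1)
    for n in range(1, n_ceps + 1):
        an = a[n] if n <= p else 0.0
        acc = sum(m * c[m] * a[n - m] for m in range(1, n) if n - m <= p)
        c[n] = -an - acc / n
    return c[1:]


def reflection_to_lar(k):
    """Log area ratios; assumes |k_i| < 1 and the (1 - k) / (1 + k) convention."""
    return [math.log((1.0 - ki) / (1.0 + ki)) for ki in k]


def reflection_to_arcsin(k):
    """Arcsine coefficients of the reflection coefficients."""
    return [math.asin(ki) for ki in k]
```

As a check, the autocorrelation `[1.0, 0.5, 0.25]` of a first-order autoregressive process with coefficient 0.5 gives `a_1 = -0.5`, a second coefficient of zero, and cepstral values `0.5**n / n`.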
17.
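Starting from an introduction to hidden Markov models and the Bayes decision rule, this paper presents a basic algorithm in speech recognition, the linear-lexicon dynamic programming search algorithm, implements a spoken digit recognition system, and describes the implemented system in some detail.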
18.
Mesot B., Barber D. IEEE Transactions on Audio, Speech, and Language Processing, 2007, 15(6): 1850-1858
Real-world applications such as hands-free dialling in cars may have to deal with potentially very noisy environments. Existing state-of-the-art solutions to this problem use feature-based HMMs, with a preprocessing stage to clean the noisy signal. However, the effect that raw signal noise has on the induced HMM features is poorly understood, and limits the performance of the HMM system. An alternative to feature-based HMMs is to model the raw signal, which has the potential advantage that including an explicit noise model is straightforward. Here we jointly model the dynamics of both the raw speech signal and the noise, using a switching linear dynamical system (SLDS). The new model was tested on isolated digit utterances corrupted by Gaussian noise. Contrary to the autoregressive HMM and its derivatives, which provide a model of uncorrupted raw speech, the SLDS is comparatively noise robust and also significantly outperforms a state-of-the-art feature-based HMM. The computational complexity of the SLDS scales exponentially with the length of the time series. To counter this we use expectation correction which provides a stable and accurate linear-time approximation for this important class of models, aiding their further application in acoustic modeling.
19.
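Research on robust speech recognition technology  Total citations: 4, self-citations: 0, citations by others: 4
After briefly describing the background from which robust speech recognition arose, this article focuses on the main techniques, the current state of research in China and abroad, and the future directions of robust speech recognition. It first outlines the interference sources that degrade speech quality and affect the robustness of speech recognition systems. It then introduces the technical approaches and current state of noise-robust feature extraction, acoustic preprocessing, microphone arrays, and auditory processing based on the human ear. Finally, it discusses future directions for robust speech recognition technology.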