Similar Literature
1.
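Speaker Recognition Based on Weighted Mel Cepstral Coefficients   Total citations: 2, self-citations: 0, citations by others: 2
The primary problem in speaker recognition is extracting, from the speech signal, feature parameters that are effective, stable, and reliable, and that uniquely represent a speaker's individual characteristics. This work applies perceptual weighting to Mel cepstral analysis: a weighting function is obtained by interpolating the signal-to-mask ratios computed from a psychoacoustic model, and applying that function in Mel cepstral analysis yields weighted Mel cepstral coefficients (WMCEP), which serve as the features for speaker recognition. Experimental results show that WMCEP approximates the speaker's spectral envelope better than MFCC and Mel cepstral coefficients (MCEP) and is more robust in noisy environments, so its recognition performance is better than that of MFCC and MCEP.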

2.
An Improved Mel Filter Suitable for Speaker Recognition   Total citations: 1, self-citations: 0, citations by others: 1
项要杰  杨俊安  李晋徽  陆俊 《计算机工程》2013,(11):214-217,222
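Mel-frequency cepstral coefficients (MFCC) emphasize the low-frequency information of the speech signal, describe its spectral distribution incompletely, and cannot effectively distinguish speaker-specific information. To address this, the paper analyzes how much speaker-specific information each frequency band of the speech signal carries and, combining the different behavior of the Mel filter and the inverse Mel filter in the high and low frequency bands, proposes an improved Mel filter suited to speaker recognition. Experimental results show that the new features extracted with the improved Mel filter achieve better recognition than conventional Mel cepstral coefficients and inverse Mel cepstral coefficients (IMFCC), with essentially no increase in the training and recognition time of the speaker recognition system.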
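
The contrast between the Mel and inverse Mel filters mentioned above can be sketched as follows. This is a minimal illustration, not the improved filter proposed in the paper: it assumes the common `2595 * log10(1 + f/700)` form of the Mel scale, and it assumes the inverse Mel layout is obtained by mirroring the Mel center frequencies within the band. The function names are hypothetical.

```python
import math

def hz_to_mel(f_hz):
    """Convert Hz to mel using the common 2595*log10(1 + f/700) form (an assumption)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_centers(n_filters, f_low, f_high):
    """Filter center frequencies (Hz) equally spaced on the mel scale."""
    lo, hi = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]

def inverse_mel_centers(n_filters, f_low, f_high):
    """Mirror the mel centers within [f_low, f_high] (assumed construction),
    so the filters are densest at high frequencies instead of low ones."""
    return sorted(f_low + f_high - c for c in mel_centers(n_filters, f_low, f_high))

# Mel centers crowd the low band; the mirrored centers crowd the high band.
print([round(c) for c in mel_centers(8, 0.0, 4000.0)])
print([round(c) for c in inverse_mel_centers(8, 0.0, 4000.0)])
```

A filter bank that mixes the two layouts would take its low-band centers from one list and its high-band centers from the other; how the paper actually combines them is not stated in the abstract.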

3.
卜奎昊 《福建电脑》2010,26(5):99-100
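The support vector machine is an important learning method in statistical learning theory, designed specifically for small samples; N-dimensional Mel cepstral coefficients characterize speakers well. This paper uses a support vector machine and Mel cepstral features to build a text-independent speaker recognition system that is not affected by the speaker's emotional state. Experiments show that the system adapts well to speaker recognition.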

4.
宋乐  白静 《计算机工程与设计》2014,(5):1772-1775,1781
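To extract optimal feature parameters that distinguish the individual characteristics of different speakers, a composite parameter that improves on Mel-frequency cepstrum coefficients (MFCC) is adopted: the feature vector is formed by adding a normalized short-time energy parameter and first-order differences. For the resulting high-dimensional feature parameters, a feature selection method based on a correlation-distance Fisher criterion is proposed and used to weight the extracted parameters and reduce their dimensionality. Comparative experiments show that the algorithm improves the recognition rate and is a feasible, effective feature extraction algorithm.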
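
The idea of weighting and pruning feature dimensions by a Fisher-type criterion can be sketched as below. This is an illustration under stated assumptions: it uses the plain per-dimension Fisher ratio (between-speaker variance over within-speaker variance), not the correlation-distance variant the paper proposes, whose definition the abstract does not give. All names are hypothetical.

```python
def fisher_ratios(features_by_speaker, eps=1e-12):
    """Per-dimension Fisher ratio: between-speaker variance / within-speaker variance.

    features_by_speaker maps a speaker label to a list of feature vectors.
    eps guards against division by zero (an implementation choice).
    """
    speakers = list(features_by_speaker)
    dim = len(features_by_speaker[speakers[0]][0])
    ratios = []
    for d in range(dim):
        means, within = [], 0.0
        for s in speakers:
            vals = [v[d] for v in features_by_speaker[s]]
            mu = sum(vals) / len(vals)
            means.append(mu)
            within += sum((x - mu) ** 2 for x in vals) / len(vals)
        within /= len(speakers)
        grand = sum(means) / len(means)
        between = sum((m - grand) ** 2 for m in means) / len(means)
        ratios.append(between / (within + eps))
    return ratios

def select_and_weight(vector, ratios, keep):
    """Keep the `keep` most discriminative dimensions and scale each by its
    normalized ratio (one possible reading of 'weighted dimensionality reduction')."""
    top = sorted(range(len(ratios)), key=lambda d: ratios[d], reverse=True)[:keep]
    total = sum(ratios[d] for d in top)
    return [vector[d] * ratios[d] / total for d in sorted(top)]

data = {
    "spk_a": [[0.9, 0.0], [1.1, 2.0]],
    "spk_b": [[4.9, 0.5], [5.1, 1.5]],
}
r = fisher_ratios(data)
print(r)                                   # dimension 0 separates the speakers far better
print(select_and_weight([2.0, 3.0], r, keep=1))
```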

5.
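Speaker identification and verification are an active research topic in signal processing, but the literature indicates that recognition performance is not very high and that relatively long speech is required for both training and recognition, so a gap remains before practical application. This work analyzes how the choice of parameters affects recognition results, uses the linear prediction cepstrum and pitch parameters together as recognition parameters with vector quantization, improves the weighting function of the linear prediction cepstral distance, and presents a text-independent speaker recognition system. Experimental results and analysis are given: recognition accuracy exceeds 99% under low noise and still exceeds 98% under high noise.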

6.
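Feature extraction is the most critical step in a speaker recognition system. Put simply, it extracts the speech features that represent a speaker's individuality, and it directly affects the accuracy of the recognition system. People can usually perceive a speaker's individual traits from voice quality, pitch, loudness, and similar information. This article adopts Mel-frequency cepstral-domain parameters because the Mel frequency scale is closer to the auditory characteristics of the human ear. These parameters not only offer high spectral resolution in the low-frequency band but are also highly robust to noise. Taking the vocal tract model and the auditory model as examples, the article compares the distributions of LPC and MFCC parameters and concludes that MFCC is not restricted by the all-pole model, adapts better to the environment, and can reduce the influence of differences between speakers, so its performance is better than that of LPC parameters.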

7.
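A Speaker Recognition System Based on MFCC and Weighted Vector Quantization   Total citations: 14, self-citations: 4, citations by others: 14
The speaker recognition system described in this article uses Mel-frequency cepstral coefficients (MFCC), which reflect human perception of speech, as feature parameters. Because the individual dimensions of the feature vector differ in how well they discriminate between speakers, vector quantization is performed with weighting. Good results are obtained, and both the computation and the storage required for training and recognition are fairly low.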
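
Weighted vector quantization as described above can be sketched with a per-dimension weighted squared distance. This is a minimal sketch, assuming each speaker already has a trained codebook and that the weights are given; the abstract does not say how the weights are derived, and the names are hypothetical.

```python
def weighted_distance(x, codeword, weights):
    """Squared Euclidean distance with one weight per feature dimension."""
    return sum(w * (a - b) ** 2 for a, b, w in zip(x, codeword, weights))

def average_distortion(frames, codebook, weights):
    """Mean distance from each frame to its nearest codeword."""
    nearest = (min(weighted_distance(f, c, weights) for c in codebook) for f in frames)
    return sum(nearest) / len(frames)

def identify(frames, codebooks, weights):
    """Return the speaker whose codebook gives the smallest average distortion."""
    return min(codebooks, key=lambda s: average_distortion(frames, codebooks[s], weights))

# Toy codebooks with two codewords each (illustrative values only).
codebooks = {
    "spk_a": [[0.0, 0.0], [1.0, 1.0]],
    "spk_b": [[5.0, 5.0], [6.0, 6.0]],
}
test_frames = [[0.1, 0.2], [0.9, 1.1]]
print(identify(test_frames, codebooks, weights=[1.0, 0.5]))
```

Setting all weights to 1 recovers ordinary vector quantization, which makes the effect of the weighting easy to compare.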

8.
尹许梅  何选森 《计算机工程》2011,37(11):192-194
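To improve the robustness of speech features in low signal-to-noise-ratio environments, an improved Mel-frequency cepstral coefficient (MFCC) feature extraction method is proposed. Building on conventional MFCC extraction, it introduces the Bark wavelet transform, which better matches the human auditory system: speech is preprocessed before the fast Fourier transform, and the Bark wavelet transform replaces the discrete cosine transform in the MFCC procedure. In the preprocessing stage, an improved Lanczos window function suppresses side lobes to improve robustness. Experiments show that, in noisy environments, the improved method achieves a higher speaker recognition rate than the conventional MFCC method.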

9.
杜晓青  于凤芹 《计算机工程》2013,(11):197-199,204
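The fusion of Mel-frequency cepstral coefficients (MFCC) with linear prediction cepstral coefficients (LPCC) reflects only the static characteristics of speech, and LPCC describes local low-frequency features poorly. To address this, the paper fuses Hilbert-Huang transform (HHT) cepstral coefficients with relative spectral perceptual linear prediction cepstral coefficients (RASTA-PLPCC), yielding a speaker recognition algorithm that reflects both the speech production mechanism and the perceptual characteristics of the human ear. HHT cepstral coefficients embody the production mechanism, reflect the dynamic characteristics of speech, and describe local low-frequency features of the signal better, which remedies the weakness of LPCC. PLPCC embodies the perceptual characteristics of the human ear and recognizes better than MFCC. Three fusion algorithms are used to fuse the two, and the fused features are used with a Gaussian mixture model for speaker recognition. Simulation results show that the recognition rate of the proposed fusion is 8.0% higher than that of the existing MFCC and LPCC fusion.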

10.
A Speaker Verification System Based on Gaussian Mixture Models   Total citations: 5, self-citations: 1, citations by others: 4
杨澄宇  赵文  杨鉴 《计算机应用》2001,21(4):7-8,11
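Because the low and relatively high frequency bands of the human speech spectrum carry more speaker-specific information, this paper proposes an improved LPC cepstrum algorithm for text-independent speaker recognition. The algorithm weights each frequency band of the speech spectrum to emphasize speaker-specific information, making speakers easier to distinguish.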

11.
12.
This paper presents the feature analysis and design of compensators for speaker recognition under stressed speech conditions. Any condition that causes a speaker to vary his or her speech production from the normal or neutral condition is called a stressed speech condition. Stressed speech is induced by emotion, high workload, sleep deprivation, frustration and environmental noise. In a stressed condition, the characteristics of the speech signal differ from those of the normal or neutral condition. Because of these changes, the performance of a speaker recognition system may degrade under stressed speech conditions. Firstly, six speech features that are widely used for speaker recognition (mel-frequency cepstral coefficients (MFCC), linear prediction (LP) coefficients, linear prediction cepstral coefficients (LPCC), reflection coefficients (RC), arc-sin reflection coefficients (ARC) and log-area ratios (LAR)) are analyzed to evaluate their characteristics under stressed conditions. Secondly, a Vector Quantization (VQ) classifier and a Gaussian Mixture Model (GMM) are used to evaluate speaker recognition results with the different speech features. This analysis helps select the best feature set for speaker recognition under stressed conditions. Finally, four novel VQ-based compensation techniques are proposed and evaluated for improving speaker recognition under stressed conditions. The compensation techniques are speaker and stressed information based compensation (SSIC), compensation by removal of stressed vectors (CRSV), cepstral mean normalization (CMN) and combination of MFCC and sinusoidal amplitude (CMSA) features. Speech data from the SUSAS database corresponding to four different conditions, Angry, Lombard, Question and Neutral, are used for the analysis of speaker recognition under stressed conditions.

13.
Speaker recognition faces many practical difficulties, among which signal inconsistency due to environmental and acquisition channel factors is the most challenging. The noise imposed on the voice signal varies greatly, and an a priori noise model is usually unavailable. In this article, we propose a robust speaker recognition method that employs a novel adaptive wavelet shrinkage method for noise suppression. In our method, wavelet subband coefficient thresholds are computed automatically and are proportional to the noise contamination. In applying wavelet shrinkage for noise removal, a dual-threshold strategy is developed to suppress noise, preserve signal coefficients and minimize the introduction of artifacts. Recognition is achieved using a modification of the Mel-frequency cepstral coefficients of overlapped voice signal segments. The efficacy of our method is evaluated with voice signals from two publicly available speech signal databases and is compared with state-of-the-art methods. It is demonstrated that our proposed method exhibits great robustness in various noise conditions. The improvement is especially significant when noise dominates the underlying speech.
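
Wavelet shrinkage with a noise-proportional threshold can be illustrated as follows. This sketch swaps in the classical universal threshold with a median-based noise estimate and plain soft thresholding; it is not the adaptive dual-threshold strategy of the article, which the abstract does not specify. It assumes the wavelet detail coefficients of one subband have already been computed, and the function names are hypothetical.

```python
import math
import statistics

def universal_threshold(detail_coeffs):
    """Threshold proportional to the estimated noise level of one subband.

    sigma is estimated as median(|d|) / 0.6745, then scaled by sqrt(2 ln N).
    """
    sigma = statistics.median(abs(d) for d in detail_coeffs) / 0.6745
    return sigma * math.sqrt(2.0 * math.log(len(detail_coeffs)))

def soft_threshold(coeffs, t):
    """Shrink every coefficient toward zero by t; zero out those below t."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]

# Illustrative subband: mostly small noise-like values plus two large coefficients.
subband = [0.1, -0.2, 0.15, 4.0, -0.05, 0.12, -3.5, 0.08]
t = universal_threshold(subband)
print(t)
print(soft_threshold(subband, t))
```

Because the threshold scales with the estimated noise level, noisier subbands are shrunk more aggressively, which is the property the abstract attributes to its automatically computed thresholds.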

14.
邓蕾  高勇 《计算机系统应用》2017,26(12):227-232
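To address the sharp drop in speaker recognition performance in noisy environments, a robust feature extraction method for speaker recognition is proposed. Warped filter banks (WFBS) are used to model the auditory characteristics of the human ear, and cube-root compression, relative spectral filtering (RASTA), and cepstral mean and variance normalization (CMVN) are incorporated into the extraction of robust features. In simulations with a Gaussian mixture model (GMM), the experimental results show that the extracted feature parameters outperform both MFCC and CFCC feature parameters in robustness and recognition performance.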

15.
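To address the sharp drop in speaker recognition performance under interference from multiple sound sources, a front-end processing method for extracting the target speech is proposed. Relying on the approximate sparsity of independent speech signals in the time-frequency domain, the method extracts the target speech by nonlinear time-frequency masking based on the direction information of the target speech. A Gaussian mixture model (GMM) speaker recognition system based on Mel cepstral coefficients (MFCC) is built. Simulation experiments show that the method extracts the target speech effectively and improves the robustness of the speaker recognition system. Under the multi-source interference conditions simulated in the paper, the recognition rate of the system improves by about 25% on average.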

16.
Speaker verification techniques neglect the short-time variation in the feature space even though it contains speaker-related attributes. We propose a simple method to capture and characterize this spectral variation through the eigenstructure of the sample covariance matrix. This covariance is computed using a sliding window over spectral features. The newly formulated feature vectors representing local spectral variations are used with classical and state-of-the-art speaker recognition systems. Results on multiple speaker recognition evaluation corpora reveal that eigenvectors weighted with their normalized singular values are useful in representing local covariance information. We have also shown that local variability features can be extracted using mel frequency cepstral coefficients (MFCCs) as well as using three recently developed features: frequency domain linear prediction (FDLP), mean Hilbert envelope coefficients (MHECs) and power-normalized cepstral coefficients (PNCCs). Since information conveyed in the proposed feature is complementary to the standard short-term features, we apply different fusion techniques. We observe considerable relative improvements in speaker verification accuracy in combined mode on text-independent (NIST SRE) and text-dependent (RSR2015) speech corpora. We have obtained up to 12.28% relative improvement in speaker recognition accuracy on text-independent corpora. In experiments on text-dependent corpora, we have achieved up to 40% relative reduction in EER. To sum up, combining local covariance information with the traditional cepstral features holds promise as an additional speaker cue in both text-independent and text-dependent recognition.

17.
This study analyzes the effect of degradation on human and automatic speaker verification (SV) tasks. The perceptual test is conducted with subjects who have knowledge of speaker verification. An automatic SV system is developed using Mel-frequency cepstral coefficients (MFCC) and a Gaussian mixture model (GMM). The human and automatic speaker verification performances are compared for clean training and different degraded test conditions. Speech signals are reconstructed in clean and degraded conditions by highlighting different speaker-specific information and compared through a perceptual test. The perceptual cues that the human subjects used as speaker-specific information are investigated and their importance in degraded conditions is highlighted. The difference in the nature of human and automatic SV tasks is investigated in terms of falsely accepted and falsely rejected speech pairs. A discussion on human versus automatic speaker verification is carried out, and the possibility of improving automatic speaker verification under degraded conditions is suggested.

18.
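Speaker recognition systems currently achieve recognition rates above 90% in ideal environments, but the rate drops rapidly in real communication environments. This paper studies robust speaker recognition under channel mismatch. A speaker recognition system based on a Gaussian mixture model (GMM) is built first; then, after testing and analyzing real communication channels, two improvements are proposed. The first builds a general channel model from measured data and filters clean speech through it before using that speech to train the speaker models. The second compares the characteristics of the measured channel, an ideal low-pass channel, and the Mel cepstral coefficients (MFCC) of speech, and on that basis discards the first and second feature dimensions. Experimental results show that, after this processing, the recognition rate in the communication environment rises by about 20%, and it is 9% to 12% higher than with conventional cepstral mean subtraction (CMS).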
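
The two channel-compensation ideas named above, cepstral mean subtraction and discarding the leading feature dimensions, can be sketched as below. This is an illustration only: whether the paper's "first and second dimensions" start at c0 or c1 is not stated in the abstract, so the slice offset is an assumption, and the function names are hypothetical.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-dimension mean over the utterance from every frame.

    A fixed linear channel adds a constant offset to each cepstral
    coefficient, so removing the utterance mean cancels that offset.
    """
    n, dim = len(frames), len(frames[0])
    mean = [sum(f[d] for f in frames) / n for d in range(dim)]
    return [[f[d] - mean[d] for d in range(dim)] for f in frames]

def drop_leading(frames, n_drop=2):
    """Discard the first n_drop coefficients of every frame (assumed indexing)."""
    return [f[n_drop:] for f in frames]

clean = [[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]]
channel_offset = [10.0, -5.0, 0.5]          # hypothetical constant channel effect
distorted = [[c + o for c, o in zip(f, channel_offset)] for f in clean]

# Both versions normalize to the same frames, so the channel offset is removed.
print(cepstral_mean_subtraction(clean))
print(cepstral_mean_subtraction(distorted))
print(drop_leading(distorted))
```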

19.
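To improve the speaker recognition rate in noise, and because the individual cepstral coefficient dimensions differ in discriminative power, the dimensions of the GMM (Gaussian mixture model) are weighted directly during recognition, giving a direct cepstral weighting GMM; a new method for measuring the discriminative power of each feature dimension under noise is also studied. The method is combined with MMSE (minimum mean square error) enhancement, and experiments with white noise and subway noise yield the optimal weighting window functions for the baseline system and the MMSE-enhanced system under different noise conditions. The results show that the direct cepstral weighting GMM significantly improves recognition accuracy.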
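
One way to read "weighting the dimensions of the GMM directly" is to scale each dimension's contribution to the log-density of a diagonal-covariance component. The sketch below follows that reading; it is an assumption about the method, not the paper's formulation, and the weights (the paper's "weighting window") are taken as given. The function names are hypothetical.

```python
import math

def weighted_component_logpdf(x, mean, var, dim_weights):
    """Log-density of a diagonal Gaussian with each dimension's term scaled by a weight."""
    total = 0.0
    for xd, m, v, w in zip(x, mean, var, dim_weights):
        total += w * (-0.5 * math.log(2.0 * math.pi * v) - 0.5 * (xd - m) ** 2 / v)
    return total

def weighted_gmm_loglik(x, mix_weights, means, variances, dim_weights):
    """Log-likelihood of one frame under the dimension-weighted GMM (log-sum-exp form)."""
    logs = [
        math.log(c) + weighted_component_logpdf(x, m, v, dim_weights)
        for c, m, v in zip(mix_weights, means, variances)
    ]
    peak = max(logs)
    return peak + math.log(sum(math.exp(l - peak) for l in logs))

# Two-component toy model; the second dimension is treated as unreliable in noise.
mix = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
frame = [0.2, 5.0]
print(weighted_gmm_loglik(frame, mix, means, variances, [1.0, 1.0]))
print(weighted_gmm_loglik(frame, mix, means, variances, [1.0, 0.2]))
```

With all dimension weights equal to 1 the function reduces to the ordinary diagonal-covariance GMM log-likelihood, which gives a baseline for comparison.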

20.
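This work studies a text-independent speaker recognition system based on Mel cepstral feature parameters and a Gaussian mixture model. To raise the recognition rate in noisy environments, two ways of improving the system's noise robustness are examined: using speech recognition to turn the text-independent system into a text-dependent speaker recognition method, and performing speaker recognition on selected frames that are more robust. The effect of these methods on recognition performance is analyzed, and experiments confirm that they do improve the recognition rate in noisy environments.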
