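Effective-Band Multi-Resolution Feature Extraction and Speaker Age Recognition
Citation: Du Xianna, Yu Yibiao. Effective-band multi-resolution feature extraction and speaker age recognition [J]. 信号处理 (Journal of Signal Processing), 2016, 32(9): 1101-1107.
Authors: Du Xianna, Yu Yibiao
Affiliation: School of Electronic and Information Engineering, Soochow University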
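Abstract: For text-independent, speaker-independent age recognition, this paper proposes a statistical recognition method based on multi-resolution features of the effective frequency bands. The input speech is decomposed into effective frequency bands by the wavelet packet transform; the wavelet packet coefficients of these bands are then concatenated into a single sequence, from which Mel-frequency cepstral coefficients are computed, giving the effective-band multi-resolution feature WPMFC (Wavelet Packet Mel-Frequency Cepstrum). Speakers are divided by age into four groups (children, youths, middle-aged adults, and the elderly), and the speech of each age group is further split by gender for training, giving eight Gaussian mixture models. Test speech is classified by the maximum likelihood criterion. Experiments compare the proposed method with traditional short-time spectral statistical analysis methods; the results show that the proposed method achieves comparatively good recognition performance, with an in-set average recognition rate of 65.17%. The results also indicate that variation in speakers' pronunciation characteristics affects recognition performance more than variation in the spoken text does.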
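
The decomposition-and-concatenation step described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' configuration: the Haar filter pair, the three decomposition levels, the helper names (`haar_split`, `wavelet_packet`, `concat_effective_bands`), and the choice of which leaf bands count as "effective". The abstract does not state any of these. The subsequent Mel-frequency cepstrum computation on the concatenated sequence is omitted.

```python
import math

def haar_split(x):
    """One analysis step: split x into low-pass and high-pass halves (Haar filters)."""
    s = 1.0 / math.sqrt(2.0)
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x) - 1, 2)]
    low = [s * (a + b) for a, b in pairs]
    high = [s * (a - b) for a, b in pairs]
    return low, high

def wavelet_packet(x, levels):
    """Full wavelet packet tree: return the 2**levels leaf bands.

    Unlike the plain wavelet transform, both the low-pass and the high-pass
    branch are split again at every level. Bands come out in natural
    (filter-bank) order, which is not the same as ascending frequency order.
    """
    bands = [list(x)]
    for _ in range(levels):
        next_bands = []
        for band in bands:
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands

def concat_effective_bands(bands, keep):
    """Concatenate the coefficients of the selected leaf bands into one sequence."""
    out = []
    for idx in keep:
        out.extend(bands[idx])
    return out

# Toy frame: 64 samples of a sinusoid (stand-in for a windowed speech frame).
frame = [math.sin(2 * math.pi * 3 * n / 64) for n in range(64)]
bands = wavelet_packet(frame, levels=3)

# Hypothetical selection of "effective" bands; the paper's actual choice is not given here.
intermediate = concat_effective_bands(bands, keep=[0, 1, 2, 3])
```

In this sketch `intermediate` plays the role of the single sequence on which the cepstral coefficients would then be computed. Because the Haar filters are orthonormal, the leaf bands together preserve the energy of the input frame, which is a convenient sanity check.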

Keywords: speaker age recognition; effective frequency band; multi-resolution feature; wavelet packet transform
Received: 2016-02-19

Multi-resolution feature extraction of effective frequency bands for age recognition
Affiliation:School of Electronic and Information Engineering, Soochow University
Abstract: For speaker- and text-independent age recognition, a new multi-resolution feature extraction algorithm is proposed. The input speech is decomposed by the wavelet packet transform, and the wavelet packet coefficients of each effective frequency band are concatenated to form an intermediate signal, from which Mel-frequency cepstrum coefficients are calculated; the resulting feature is called the Wavelet Packet Mel-Frequency Cepstrum Coefficient (WPMFC). Speakers are divided into four age groups (children, youths, middle-aged adults, and the elderly), and a total of eight Gaussian mixture models are trained, one for each combination of age group and gender. The recognition decision for test speech is based on the maximum likelihood criterion. Experimental results show that age recognition based on the proposed feature extraction algorithm performs well compared with traditional short-time spectral statistical analysis methods; the average recognition rate of outset speaker age reached 65.17%. Moreover, compared with changes in the speech content, changes in the speakers' pronunciation characteristics have more influence on recognition performance.
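
The maximum likelihood decision over the eight age-and-gender models can be sketched as follows. This is an illustrative sketch only: the diagonal covariances, the two-component mixtures, the two-dimensional toy features, all parameter values, and the function names (`log_gauss_diag`, `gmm_log_likelihood`, `classify`) are assumptions, not details taken from the paper. In practice the models would be trained on WPMFC features, for example with the EM algorithm.

```python
import math

def log_gauss_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(frames, gmm):
    """Total log-likelihood of a sequence of feature frames under one GMM.

    gmm is a list of (weight, mean, var) components; frames are treated
    as independent, so their log-likelihoods add.
    """
    total = 0.0
    for x in frames:
        terms = [math.log(w) + log_gauss_diag(x, m, v) for w, m, v in gmm]
        peak = max(terms)  # log-sum-exp for numerical stability
        total += peak + math.log(sum(math.exp(t - peak) for t in terms))
    return total

def classify(frames, models):
    """Return the label whose GMM gives the highest log-likelihood."""
    return max(models, key=lambda label: gmm_log_likelihood(frames, models[label]))

# Toy models with made-up parameters, one per (age group, gender) pair.
ages = ["child", "youth", "middle-aged", "elderly"]
genders = ["female", "male"]
models = {}
for i, age in enumerate(ages):
    for j, gender in enumerate(genders):
        center = [float(i), float(j)]
        models[(age, gender)] = [
            (0.5, center, [0.1, 0.1]),
            (0.5, [c + 0.2 for c in center], [0.1, 0.1]),
        ]

# Made-up test frames lying close to the ("middle-aged", "male") model.
test_frames = [[2.1, 1.0], [2.0, 1.1], [2.2, 0.9]]
decision = classify(test_frames, models)
```

Choosing the model with the highest likelihood, without any prior weighting, corresponds to assuming that all eight classes are equally probable beforehand. The decision is a joint age-and-gender label, and the age group is read off from it.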