
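Short-utterance speaker recognition algorithm based on multi-feature i-vectors
Cite this article: SUN Nian, ZHANG Yi, LIN Haibo, HUANG Chao. Short utterance speaker recognition algorithm based on multi-featured i-vector[J]. Journal of Computer Applications, 2018, 38(10): 2839-2843.
Authors: SUN Nian  ZHANG Yi  LIN Haibo  HUANG Chao
Affiliation: 1. School of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; 2. School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Funding: Chongqing Basic Science and Frontier Technology Research Special Key Project (cstc2015jcyjBX0066).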
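Abstract: When the test utterance is long enough, the information content and discriminability of a single feature suffice for the speaker recognition task. When the test utterance is very short, however, the speech signal lacks sufficient speaker information and recognition performance drops sharply. To address this shortage of speaker information under short-utterance conditions, a short-utterance speaker recognition algorithm based on multi-feature i-vectors is proposed. The algorithm first extracts different acoustic feature vectors and combines them into one high-dimensional feature vector. It then applies Principal Component Analysis (PCA) to remove the correlation within the high-dimensional vector so that the features are orthogonalized. Finally, it uses Linear Discriminant Analysis (LDA) to select the most discriminative features while reducing the dimensionality to some extent, which yields better speaker recognition performance. In experiments on the TIMIT corpus with short utterances of equal duration (2 s), the proposed algorithm achieves relative reductions in Equal Error Rate (EER) of 72.16%, 69.47% and 73.62% compared with i-vector systems that use a single feature: Mel-Frequency Cepstral Coefficients (MFCC), Linear Prediction Cepstral Coefficients (LPCC) and Perceptual Log Area Ratio (PLAR) coefficients, respectively. With short utterances of different durations, the proposed algorithm reduces both EER and the Detection Cost Function (DCF) by roughly 50% relative to single-feature i-vector systems. The results of these two experiments indicate that the proposed algorithm extracts speaker-specific information effectively in a short-utterance speaker recognition system and improves recognition performance.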

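Keywords: speaker recognition  i-vector  short utterance  multi-feature  Principal Component Analysis (PCA)  Linear Discriminant Analysis (LDA)
Received: 2018-03-23
Revised: 2018-05-30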
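
The feature-combination steps summarized in the abstract (concatenate several feature streams, decorrelate with PCA, then reduce with LDA) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the data are synthetic, the stream dimensions (13, 12, 12), the number of retained components, and the helper names `pca_fit`, `lda_fit` and `fake_stream` are all assumptions made for illustration, and i-vector extraction is omitted entirely.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return (mean, projection); (X - mean) @ projection has uncorrelated columns."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest-variance axes
    return mean, eigvecs[:, order]

def lda_fit(X, y, n_components):
    """Return a projection favouring between-class over within-class scatter."""
    overall_mean = X.mean(axis=0)
    dim = X.shape[1]
    Sw = np.zeros((dim, dim))  # within-class scatter
    Sb = np.zeros((dim, dim))  # between-class scatter
    for label in np.unique(y):
        Xc = X[y == label]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)
        diff = (mc - overall_mean)[:, None]
        Sb += len(Xc) * (diff @ diff.T)
    # Generalized eigenproblem Sb v = lambda Sw v, solved via pinv(Sw) @ Sb
    eigvals, eigvecs = np.linalg.eig(np.linalg.pinv(Sw) @ Sb)
    order = np.argsort(eigvals.real)[::-1][:n_components]
    return eigvecs[:, order].real

# Synthetic stand-ins for three feature streams (hypothetical, not real MFCC/LPCC/PLAR)
rng = np.random.default_rng(0)
n_speakers, n_per_speaker = 4, 50
labels = np.repeat(np.arange(n_speakers), n_per_speaker)

def fake_stream(dim):
    """Speaker-dependent centers plus noise, one row per sample."""
    centers = rng.normal(size=(n_speakers, dim))
    return centers[labels] + 0.5 * rng.normal(size=(len(labels), dim))

mfcc_like, lpcc_like, plar_like = fake_stream(13), fake_stream(12), fake_stream(12)

# Step 1: combine the streams into one high-dimensional vector per sample
combined = np.hstack([mfcc_like, lpcc_like, plar_like])

# Step 2: PCA removes the correlation between the combined dimensions
mean, W_pca = pca_fit(combined, n_components=20)
decorrelated = (combined - mean) @ W_pca

# Step 3: LDA keeps at most (number of speakers - 1) discriminative directions
W_lda = lda_fit(decorrelated, labels, n_components=n_speakers - 1)
reduced = decorrelated @ W_lda

print(combined.shape, decorrelated.shape, reduced.shape)
```

LDA can yield at most one fewer direction than there are classes, which is why the sketch keeps `n_speakers - 1` components; how many PCA components to retain is a tuning choice that the abstract does not specify.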

Short utterance speaker recognition algorithm based on multi-featured i-vector
SUN Nian, ZHANG Yi, LIN Haibo, HUANG Chao. Short utterance speaker recognition algorithm based on multi-featured i-vector[J]. Journal of Computer Applications, 2018, 38(10): 2839-2843.
Authors:SUN Nian  ZHANG Yi  LIN Haibo  HUANG Chao
Affiliation: 1. School of Advanced Manufacturing Engineering, Chongqing University of Posts and Telecommunications, Chongqing 400065, China; 2. School of Automation, Chongqing University of Posts and Telecommunications, Chongqing 400065, China
Abstract: When the test speech is long enough, the information and discriminability of a single feature are sufficient to complete the speaker recognition task. When the test speech is very short, however, speaker recognition performance degrades significantly because the signal carries too little speaker information. To address the problem of insufficient speaker information under short-speech conditions, a short-utterance speaker recognition algorithm based on multi-featured i-vectors was proposed. Firstly, different acoustic feature vectors were extracted and combined into a high-dimensional feature vector. Then Principal Component Analysis (PCA) was used to remove the correlation within the high-dimensional feature vector, so that the features were orthogonalized. Finally, Linear Discriminant Analysis (LDA) was used to select the most discriminative features, which also reduced the dimensionality to some extent and gave better speaker recognition performance. On the TIMIT corpus under the same short-speech (2 s) condition, the Equal Error Rate (EER) of the multi-featured system was 72.16%, 69.47% and 73.62% lower in relative terms than that of i-vector systems based on a single feature: Mel-Frequency Cepstral Coefficient (MFCC), Linear Prediction Cepstral Coefficient (LPCC) and Perceptual Log Area Ratio (PLAR), respectively. For short speech of different lengths, the proposed algorithm reduced both EER and Detection Cost Function (DCF) by roughly 50% compared with the single-featured i-vector systems. These results indicate that the multi-featured system extracts the speaker's characteristic information effectively in short-utterance speaker recognition and improves recognition performance.
Keywords:speaker recognition  i-vector  short utterance  multi-feature  Principal Component Analysis (PCA)  Linear Discriminant Analysis (LDA)  
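
The results above are stated as relative reductions in EER. As a rough illustration of how such figures can be computed, the sketch below estimates EER by sweeping a decision threshold over trial scores and then computes a relative reduction. The function names and all scores and error rates in it are toy values chosen for the example, not data from the paper, and production toolkits typically interpolate the error curves rather than pick the nearest threshold.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER: the error rate where false acceptance equals false rejection."""
    best_gap, eer = float("inf"), None
    for threshold in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection rate: target trials scoring below the threshold
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        # False acceptance rate: non-target trials scoring at or above the threshold
        far = sum(s >= threshold for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, eer = abs(far - frr), (far + frr) / 2
    return eer

def relative_reduction(baseline, proposed):
    """Relative decrease of `proposed` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - proposed) / baseline

# Toy scores (hypothetical): higher means "more likely the same speaker"
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.3, 0.2, 0.1]
eer = equal_error_rate(targets, nontargets)
print(eer)  # 0.25

# Toy error rates (hypothetical): a drop from 10.0% to 2.5% EER
print(relative_reduction(10.0, 2.5))  # 75.0
```

A relative reduction of about 70% therefore means the proposed system's EER is roughly 30% of the baseline's, not that the EER fell by 70 percentage points.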