
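基于码本映射和GMM的语音带宽扩展 (Speech Bandwidth Extension Based on Codebook Mapping and GMM)
Cite this article:WANG Ying-xue, YU Ying-ying, ZHAO Sheng-hui, KUANG Jing-ming. 基于码本映射和GMM的语音带宽扩展[J]. 北京理工大学学报, 2017, 37(9): 970-974.
Authors:WANG Ying-xue  YU Ying-ying  ZHAO Sheng-hui  KUANG Jing-ming
Affiliation:School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China (all four authors)
Abstract:When the conventional Gaussian mixture model (GMM) is used for speech bandwidth extension, the estimated feature parameters are over-smoothed. The main cause is inaccurate covariance estimation, which loses the fine detail of the extended high-frequency features. This paper therefore proposes a speech bandwidth extension algorithm that combines codebook mapping (CM) with the GMM. High- and low-frequency feature parameters are extracted and a GMM is trained, and a codebook of offset vectors is then trained on the basis of the GMM parameters. In the extension stage, the offset-vector codebook maps each low-frequency offset vector to a high-frequency offset vector, which is added to the GMM estimate to give the estimated high-frequency feature parameters. The quality of speech extended with this method is assessed by subjective and objective evaluation. The experimental results show that, compared with the conventional GMM-based speech bandwidth extension method, the high-frequency speech synthesized by CM-GMM is closer to the original high-frequency speech, and the high-frequency over-smoothing is clearly eliminated.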

Keywords:speech bandwidth extension  Gaussian mixture model  codebook mapping
Received:2015/12/13
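
The two-part estimate described in the abstract can be written out as follows. The first line is the usual conditional-mean (MMSE) estimator for a joint GMM over low-frequency features x and high-frequency features y with M components. The second and third lines are only one plausible reading of the CM-GMM combination: the symbols \Delta_x, c_x^{(k)}, c_y^{(k)}, K, and in particular the definition of the low-frequency offset, are assumptions made here for illustration and are not taken from the paper.

```latex
\begin{align*}
\hat{y}_{\mathrm{GMM}}(x) &= \sum_{m=1}^{M} p(m \mid x)\,
  \Bigl[\mu_y^{(m)} + \Sigma_{yx}^{(m)} \bigl(\Sigma_{xx}^{(m)}\bigr)^{-1}
  \bigl(x - \mu_x^{(m)}\bigr)\Bigr] \\
\Delta_x &= x - \sum_{m=1}^{M} p(m \mid x)\,\mu_x^{(m)}
  \qquad \text{(assumed definition of the LF offset)} \\
\hat{y}_{\mathrm{CM\text{-}GMM}}(x) &= \hat{y}_{\mathrm{GMM}}(x) + c_y^{(k^*)},
  \qquad k^* = \arg\min_{1 \le k \le K} \bigl\lVert \Delta_x - c_x^{(k)} \bigr\rVert^2
\end{align*}
```

Here (c_x^{(k)}, c_y^{(k)}) denotes the k-th pair of low- and high-frequency codewords in the offset-vector codebook.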

Speech Bandwidth Extension Based on Codebook Mapping and GMM
WANG Ying-xue, YU Ying-ying, ZHAO Sheng-hui and KUANG Jing-ming. Speech Bandwidth Extension Based on Codebook Mapping and GMM[J]. Journal of Beijing Institute of Technology (Natural Science Edition), 2017, 37(9): 970-974.
Authors:WANG Ying-xue  YU Ying-ying  ZHAO Sheng-hui and KUANG Jing-ming
Affiliation:School of Information and Electronics, Beijing Institute of Technology, Beijing 100081, China
Abstract:Speech bandwidth extension (BWE) based on the conventional Gaussian mixture model (GMM) often suffers from over-smoothing of the estimated feature parameters, mainly because inaccurate covariance estimates cause fine detail in the extended high-frequency features to be lost. This paper therefore proposes a speech bandwidth extension method based on codebook mapping (CM) and the GMM. First, low-frequency (LF) and high-frequency (HF) features are extracted and a GMM is trained. An offset-vector codebook is then designed from the trained GMM parameters. In the reconstruction phase, LF offset vectors are mapped to HF offset vectors using the trained offset-vector codebook, and the final HF feature parameters are obtained by adding the HF offset vectors to the GMM estimate. Subjective and objective evaluations show that, compared with the conventional GMM-based BWE method, CM-GMM significantly reduces over-smoothing and improves the quality of the synthesized speech.
Keywords:speech bandwidth extension  Gaussian mixture model (GMM)  codebook mapping
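
The pipeline summarized in the abstract (GMM estimate plus a codebook-mapped offset) might be sketched as below. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not define the offset vectors or the codebook training procedure, so the sketch assumes that the LF offset is the deviation of the LF features from the posterior-weighted LF mean, that the HF offset is the residual of the GMM estimate, and that the codebook is trained with plain k-means on the joint offsets. All function names are hypothetical, and the GMM parameters are taken as already trained.

```python
import numpy as np


def gmm_posteriors(x, weights, means_x, covs_xx):
    """Posterior p(m | x) of each component given LF features x of shape (N, dx)."""
    n, d = x.shape
    log_p = np.empty((n, len(weights)))
    for m, (w, mu, cov) in enumerate(zip(weights, means_x, covs_xx)):
        diff = x - mu
        _, logdet = np.linalg.slogdet(cov)
        maha = np.einsum("ni,ij,nj->n", diff, np.linalg.inv(cov), diff)
        log_p[:, m] = np.log(w) - 0.5 * (maha + logdet + d * np.log(2 * np.pi))
    log_p -= log_p.max(axis=1, keepdims=True)  # stabilize before exponentiating
    p = np.exp(log_p)
    return p / p.sum(axis=1, keepdims=True)


def gmm_estimate(x, weights, means_x, means_y, covs_xx, covs_yx):
    """Conventional GMM (conditional-mean) estimate of HF features from LF features."""
    post = gmm_posteriors(x, weights, means_x, covs_xx)
    y_hat = np.zeros((x.shape[0], means_y.shape[1]))
    for m in range(len(weights)):
        gain = covs_yx[m] @ np.linalg.inv(covs_xx[m])  # shape (dy, dx)
        y_hat += post[:, [m]] * (means_y[m] + (x - means_x[m]) @ gain.T)
    return y_hat, post


def train_offset_codebook(dx, dy, size, iters=20, seed=0):
    """ASSUMPTION: k-means on joint [LF offset, HF offset] training vectors."""
    rng = np.random.default_rng(seed)
    joint = np.hstack([dx, dy])
    code = joint[rng.choice(len(joint), size, replace=False)]
    for _ in range(iters):
        dist = ((joint[:, None, :] - code[None]) ** 2).sum(-1)
        idx = dist.argmin(1)
        for k in range(size):
            if np.any(idx == k):  # leave empty clusters unchanged
                code[k] = joint[idx == k].mean(0)
    return code[:, :dx.shape[1]], code[:, dx.shape[1]:]


def cm_gmm_estimate(x, gmm, code_x, code_y):
    """GMM estimate plus the HF offset paired with the nearest LF offset codeword."""
    y_hat, post = gmm_estimate(x, *gmm)
    dx = x - post @ gmm[1]  # ASSUMED LF offset: x minus posterior-weighted LF mean
    idx = ((dx[:, None, :] - code_x[None]) ** 2).sum(-1).argmin(1)
    return y_hat + code_y[idx]
```

Here `gmm` is the tuple `(weights, means_x, means_y, covs_xx, covs_yx)` with array shapes `(M,)`, `(M, dx)`, `(M, dy)`, `(M, dx, dx)` and `(M, dy, dx)`. In training, the HF offsets passed to `train_offset_codebook` would be `y - y_hat` on parallel LF/HF data; at run time only `cm_gmm_estimate` is needed.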
This article is indexed in CNKI, 万方数据 (Wanfang Data), and other databases.