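Feature Joint Optimization of Deep Belief Network for Speech Enhancement
Cite this article: WANG Yan, JIA Hairong, JI Huifang, WANG Weimei. Feature Joint Optimization of Deep Belief Network for Speech Enhancement[J]. Computer Engineering and Applications (计算机工程与应用), 2019, 55(9): 38-42.
Authors: WANG Yan (王雁), JIA Hairong (贾海蓉), JI Huifang (吉慧芳), WANG Weimei (王卫梅)
Affiliation (all four authors): College of Information and Computer, Taiyuan University of Technology, Yuci, Shanxi 030600, China
Funding: National Natural Science Foundation of China; Natural Science Foundation of Shanxi Province
Abstract: Deep belief network (DBN) models generalize poorly, which degrades speech enhancement performance. To address this, a regression DBN speech enhancement algorithm with joint feature optimization is proposed. The algorithm makes no assumptions about the speech or the noise. It extracts two features from the speech signal: the LMPS (Log-Mel frequency Power Spectrum) and the MFCC (Mel-Frequency Cepstral Coefficients). The LMPS is used to reconstruct the enhanced speech directly, which preserves perceptual quality, while the MFCC serves as an auxiliary secondary feature. Both features are fed jointly into the DBN architecture to optimize the network parameters. This joint optimization adds an MFCC constraint to the direct prediction of the LMPS, which improves the model's generalization in estimating the LMPS and yields a more accurate reconstruction of the enhanced speech. Simulation results show that, across different signal-to-noise ratio conditions, joint optimization of LMPS and MFCC gives higher PESQ and SNR than single-feature optimization with the LPS (Log Power Spectrum) or the LMPS, improving speech quality and intelligibility.

Keywords: deep belief network; speech enhancement; joint optimization; regression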
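
The two features named in the abstract are closely related: MFCCs are conventionally obtained by applying a discrete cosine transform to log-mel energies. The following is a minimal NumPy sketch of that relationship for a single frame. The sample rate, FFT size, filter count, and number of coefficients are illustrative assumptions, not values from the paper, and the helper names (`mel_filterbank`, `dct_matrix`, `lmps_and_mfcc`) are hypothetical.

```python
import numpy as np

def mel_filterbank(n_mels, n_fft, sr):
    """Triangular mel filters, shape (n_mels, n_fft // 2 + 1)."""
    hz_to_mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    mel_to_hz = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    # n_mels + 2 edge points, equally spaced on the mel scale from 0 Hz to Nyquist
    hz_points = mel_to_hz(np.linspace(0.0, hz_to_mel(sr / 2.0), n_mels + 2))
    fft_freqs = np.linspace(0.0, sr / 2.0, n_fft // 2 + 1)
    fb = np.zeros((n_mels, fft_freqs.size))
    for i in range(n_mels):
        lo, mid, hi = hz_points[i], hz_points[i + 1], hz_points[i + 2]
        rising = (fft_freqs - lo) / (mid - lo)
        falling = (hi - fft_freqs) / (hi - mid)
        fb[i] = np.maximum(0.0, np.minimum(rising, falling))
    return fb

def dct_matrix(n_out, n_in):
    """Unnormalized DCT-II basis, shape (n_out, n_in)."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    return np.cos(np.pi * k * (2 * n + 1) / (2.0 * n_in))

def lmps_and_mfcc(frame, sr=16000, n_fft=512, n_mels=40, n_mfcc=13):
    """Return (log-mel power spectrum, MFCC) for one time-domain frame."""
    windowed = frame * np.hanning(frame.size)
    power = np.abs(np.fft.rfft(windowed, n=n_fft)) ** 2   # power spectrum
    mel_power = mel_filterbank(n_mels, n_fft, sr) @ power  # mel-band energies
    lmps = np.log(mel_power + 1e-10)                       # small floor avoids log(0)
    mfcc = dct_matrix(n_mfcc, n_mels) @ lmps               # DCT of the log-mel energies
    return lmps, mfcc

# Usage: one 25 ms frame of a 440 Hz tone at 16 kHz (assumed framing)
t = np.arange(400) / 16000.0
lmps, mfcc = lmps_and_mfcc(np.sin(2 * np.pi * 440.0 * t))
```

Because the MFCC vector is a fixed linear transform of the LMPS vector in this sketch, an error term on the MFCCs acts as an additional, differently weighted view of the same spectral estimate.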

Feature Joint Optimization of Deep Belief Network for Speech Enhancement
WANG Yan, JIA Hairong, JI Huifang, WANG Weimei. Feature Joint Optimization of Deep Belief Network for Speech Enhancement[J]. Computer Engineering and Applications, 2019, 55(9): 38-42.
Authors: WANG Yan, JIA Hairong, JI Huifang, WANG Weimei
Affiliation: College of Information and Computer, Taiyuan University of Technology, Yuci, Shanxi 030600, China
Abstract: The poor generalization ability of the Deep Belief Network (DBN) leads to poor speech enhancement performance. To address this problem, a regression DBN speech enhancement algorithm based on joint feature optimization is proposed. It requires no prior assumptions about the speech or the noise. The Log-Mel frequency Power Spectrum (LMPS) of the speech is extracted and used directly to reconstruct the enhanced speech signal, which preserves its perceptual quality, and the Mel-Frequency Cepstral Coefficients (MFCC) are extracted as an auxiliary feature. All parameters of the original DBN architecture are optimized by feeding the combined features into the DBN system. This joint optimization scheme imposes MFCC constraints that are not available in the direct prediction of the LMPS, which improves the model's generalization in estimating the LMPS and allows the enhanced speech to be reconstructed more accurately. Simulation results under different SNR environments show that, compared with single-feature optimization using the Log Power Spectrum (LPS) or the LMPS, joint optimization of LMPS and MFCC gives the enhanced speech higher PESQ and SNR, improving speech quality and intelligibility.
Keywords: Deep Belief Network (DBN); speech enhancement; joint optimization; regression
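
The abstract does not state the training objective. One common way to impose an auxiliary-feature constraint on a regression network is a weighted sum of two mean squared errors, and the sketch below illustrates that idea only. The function name `joint_mse_loss` and the weight `alpha` are assumptions introduced here for illustration, not the paper's formulation.

```python
import numpy as np

def joint_mse_loss(pred_lmps, pred_mfcc, clean_lmps, clean_mfcc, alpha=0.5):
    """Weighted sum of LMPS and MFCC mean squared errors.

    alpha is a hypothetical weight on the auxiliary MFCC term;
    alpha = 0 reduces to single-feature LMPS regression.
    """
    lmps_term = np.mean((pred_lmps - clean_lmps) ** 2)  # primary target
    mfcc_term = np.mean((pred_mfcc - clean_mfcc) ** 2)  # auxiliary constraint
    return lmps_term + alpha * mfcc_term

# Usage with random stand-in features (40 LMPS bins, 13 MFCCs, both assumed sizes)
rng = np.random.default_rng(0)
clean_lmps, clean_mfcc = rng.normal(size=40), rng.normal(size=13)
noisy_lmps = clean_lmps + 0.1 * rng.normal(size=40)
noisy_mfcc = clean_mfcc + 0.1 * rng.normal(size=13)
loss = joint_mse_loss(noisy_lmps, noisy_mfcc, clean_lmps, clean_mfcc)
```

At inference time only the LMPS output would be used to reconstruct the waveform, which matches the abstract's description of the MFCC as an auxiliary feature.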
This article is indexed in VIP (维普), Wanfang Data (万方数据), and other databases.