
Speech Enhancement Algorithm Using a Deep Belief Network with Joint Feature Optimization (特征联合优化深度信念网络的语音增强算法)

Cite as:	WANG Yan, JIA Hairong, JI Huifang, WANG Weimei. 特征联合优化深度信念网络的语音增强算法 [J]. 计算机工程与应用 (Computer Engineering and Applications), 2019, 55(9): 38-42.

Authors:	WANG Yan (王雁), JIA Hairong (贾海蓉), JI Huifang (吉慧芳), WANG Weimei (王卫梅)

Affiliation:	College of Information and Computer, Taiyuan University of Technology, Yuci, Shanxi 030600 (all four authors)

Funding:	National Natural Science Foundation of China; Natural Science Foundation of Shanxi Province

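Abstract:	To address the weak generalization ability of deep belief network (DBN) models, which leads to poor speech enhancement performance, a regression DBN speech enhancement algorithm with joint feature optimization is proposed. The algorithm makes no assumptions about the speech or the noise. It extracts two features from the speech signal: the Log-Mel frequency Power Spectrum (LMPS) and the Mel-Frequency Cepstral Coefficients (MFCC). The LMPS is used to reconstruct the enhanced speech directly, which preserves perceptual quality, while the MFCC serves as an auxiliary secondary feature. Both features are fed jointly into the DBN framework to optimize the network parameters. This joint optimization adds an MFCC constraint to the direct prediction of the LMPS, which improves the model's generalization in estimating the LMPS and yields a more accurate reconstruction of the enhanced speech. Simulation results show that, under different signal-to-noise ratio conditions, joint optimization of LMPS and MFCC achieves higher PESQ and SNR than single-feature optimization with the Log Power Spectrum (LPS) or the LMPS alone, improving speech quality and intelligibility.
Keywords:	deep belief network; speech enhancement; joint optimization; regression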
Feature Joint Optimization of Deep Belief Network for Speech Enhancement

WANG Yan, JIA Hairong, JI Huifang, WANG Weimei. Feature Joint Optimization of Deep Belief Network for Speech Enhancement [J]. Computer Engineering and Applications, 2019, 55(9): 38-42.

Authors:	WANG Yan, JIA Hairong, JI Huifang, WANG Weimei

Affiliation:	College of Information and Computer, Taiyuan University of Technology, Yuci, Shanxi 030600, China

Abstract:	To address the poor generalization ability of the Deep Belief Network (DBN), which leads to poor speech enhancement performance, a regression DBN speech enhancement algorithm based on joint feature optimization is proposed. No assumptions about the speech or the noise are required in advance. The Log-Mel frequency Power Spectrum (LMPS) of the speech is extracted and used directly to reconstruct the enhanced speech signal, which ensures perceptual quality, and the Mel-Frequency Cepstral Coefficients (MFCC) are extracted as an auxiliary feature. All parameters of the original DBN architecture are optimized by feeding the combined features into the DBN system. This joint optimization scheme imposes MFCC constraints that are not available in the direct prediction of the LMPS, improves the generalization ability of the model in estimating the LMPS, and reconstructs the enhanced speech more accurately. Simulation results under different SNR conditions show that, compared with single-feature optimization using the Log Power Spectrum (LPS) or the LMPS, joint optimization of LMPS and MFCC gives the enhanced speech higher PESQ and SNR, improving speech quality and intelligibility.

Keywords:	Deep Belief Network (DBN); speech enhancement; joint optimization; regression
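
Illustrative sketch (not taken from the paper): the abstract describes adding an MFCC constraint to the direct prediction of the LMPS, but does not give the objective function. One plausible reading, shown below under stated assumptions, is a weighted sum of two mean-squared errors, with the MFCC targets derived from the clean LMPS by a DCT-II (the standard relation between log-Mel energies and MFCC). The function names, the weight `alpha`, and the sizes (40 Mel bands, 13 coefficients) are hypothetical choices for illustration only; the DBN itself is omitted and its outputs are simulated with noise.

```python
import numpy as np


def dct_ii_matrix(n_out: int, n_in: int) -> np.ndarray:
    """Return the first n_out rows of the orthonormal DCT-II basis of size n_in."""
    k = np.arange(n_out)[:, None]
    n = np.arange(n_in)[None, :]
    basis = np.cos(np.pi * k * (2 * n + 1) / (2 * n_in))
    basis *= np.sqrt(2.0 / n_in)
    basis[0] *= np.sqrt(0.5)  # scale the DC row so all rows have unit norm
    return basis


def lmps_to_mfcc(lmps: np.ndarray, n_mfcc: int = 13) -> np.ndarray:
    """Map log-Mel power spectra (frames, n_mels) to MFCC (frames, n_mfcc)."""
    return lmps @ dct_ii_matrix(n_mfcc, lmps.shape[-1]).T


def joint_loss(pred_lmps, pred_mfcc, clean_lmps, clean_mfcc, alpha=0.2):
    """Assumed joint objective: LMPS error plus an alpha-weighted MFCC error."""
    lmps_err = np.mean((pred_lmps - clean_lmps) ** 2)
    mfcc_err = np.mean((pred_mfcc - clean_mfcc) ** 2)
    return lmps_err + alpha * mfcc_err


# Simulated data standing in for clean targets and network outputs.
rng = np.random.default_rng(0)
clean_lmps = rng.normal(size=(5, 40))            # 5 frames, 40 Mel bands
clean_mfcc = lmps_to_mfcc(clean_lmps)            # auxiliary targets
pred_lmps = clean_lmps + 0.1 * rng.normal(size=clean_lmps.shape)
pred_mfcc = clean_mfcc + 0.1 * rng.normal(size=clean_mfcc.shape)

loss = joint_loss(pred_lmps, pred_mfcc, clean_lmps, clean_mfcc)
print(f"joint loss: {loss:.4f}")
```

With `alpha = 0` this reduces to single-feature LMPS regression, the baseline the abstract compares against; a positive `alpha` penalizes outputs whose cepstral envelope drifts from the clean speech, which is one way the auxiliary feature could act as a constraint.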
This article is indexed in databases including VIP (维普) and Wanfang Data (万方数据).