Sparse Representation with Optimized Learned Dictionary for Robust Voice Activity Detection

Authors:	Datao You, Jiqing Han, Guibin Zheng, Tieran Zheng, Jie Li

Affiliation:	1. School of Computer Science and Technology, Harbin Institute of Technology, 92 West Dazhi Street, Nan Gang District, Harbin, 150001, China; 2. Henan University, Minglun Street, Kaifeng, 475001, China

Abstract:	Traditionally, most voice activity detection (VAD) methods are based on speech features such as the spectrum, temporal energy, and periodicity. The robustness of these features plays a critical role in VAD performance. However, because these features are always generated directly from the observed signal, their robustness degrades significantly in non-stationary noise environments, especially at low signal-to-noise ratio (SNR). This paper proposes a robust feature for VAD based on sparse representation with an optimized learned dictionary. First, a speech dictionary and a noise dictionary are learned from a speech corpus and a noise corpus, respectively. Then an optimization algorithm is designed to reduce the mutual coherence between the two learned dictionaries. The proposed feature is then generated from the sparse representation over the optimized dictionary, and a VAD method is derived from this feature. The proposed method is evaluated over seven types of noise and four SNR levels. Experimental results show that the optimized dictionary is important for the robustness of the proposed method, and that the method performs well under non-stationary noise, especially at low SNR.
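The abstract does not give the optimization algorithm or the exact definition of the feature, so the following is only a minimal sketch of the two quantities it names, under stated assumptions. It computes the mutual coherence between a speech dictionary and a noise dictionary, here taken to be the largest absolute inner product between unit-norm atoms of the two, and a hypothetical frame-level feature: the share of sparse-coefficient energy that falls on speech atoms when a frame is coded over the concatenated dictionary. The function names, the use of orthogonal matching pursuit as the sparse coder, and the energy-ratio feature are all illustrative assumptions, not the authors' method.

```python
import numpy as np

def normalize_columns(D):
    """Scale each atom (column) to unit L2 norm. Assumes no all-zero atoms."""
    return D / np.linalg.norm(D, axis=0, keepdims=True)

def mutual_coherence(D_speech, D_noise):
    """Largest absolute inner product between a speech atom and a noise atom."""
    G = normalize_columns(D_speech).T @ normalize_columns(D_noise)
    return float(np.max(np.abs(G)))

def omp(D, x, n_nonzero):
    """Orthogonal matching pursuit: greedy sparse code of x over unit-norm D."""
    residual = x.astype(float).copy()
    support = []
    coef = np.zeros(D.shape[1])
    sol = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        k = int(np.argmax(np.abs(D.T @ residual)))
        if k in support:
            break  # no new atom improves the fit
        support.append(k)
        # Re-fit all selected atoms jointly by least squares.
        sol, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ sol
    if support:
        coef[support] = sol
    return coef

def speech_energy_ratio(x, D_speech, D_noise, n_nonzero=3):
    """Hypothetical VAD feature: fraction of coefficient energy on speech atoms."""
    D = np.hstack([normalize_columns(D_speech), normalize_columns(D_noise)])
    a = omp(D, x, n_nonzero)
    n_speech = D_speech.shape[1]
    total = float(np.sum(a ** 2))
    return float(np.sum(a[:n_speech] ** 2)) / total if total > 0 else 0.0
```

A frame could then be labeled as speech when `speech_energy_ratio` exceeds a threshold chosen on held-out data. The lower the mutual coherence between the two dictionaries, the less a noise frame can be explained by speech atoms (and vice versa), which is the intuition behind reducing it before computing the feature.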

Keywords:
This article is indexed in SpringerLink and other databases.
