An unsupervised approach for co-channel speech separation using Hilbert–Huang transform and Fuzzy C-Means clustering

Authors:	M. K. Prasanna Kumar, R. Kumaraswamy

Affiliation:	1. BMS College of Engineering, Bangalore, India; 2. Siddaganga Institute of Technology, Tumkur, India

Abstract:	In this paper we discuss an unsupervised approach to co-channel speech separation, where two speakers talk simultaneously over the same channel. We propose a two-stage separation process. The first stage is based on empirical mode decomposition (EMD) and the Hilbert transform, together known as the Hilbert–Huang transform. EMD decomposes the mixed signal into oscillatory functions known as intrinsic mode functions. The Hilbert transform is applied to find the instantaneous amplitudes, and Fuzzy C-Means clustering is applied to group the speakers at this initial stage. In the second stage, the speaker groups are transformed into the time–frequency domain using the short-time Fourier transform (STFT). Time–frequency ratios are computed as the quotient of the STFT matrix of the mixed speech signal and the STFT matrices of the stage 1 recovered speech signals. The histogram of these ratios is used to estimate the ideal binary mask for each speaker. The masks are then applied to the speech mixture to estimate the underlying speakers. The masks are estimated from the speech mixture and help impute the values that are missing after the stage 1 grouping of speakers. The results show significant improvement in objective measures over other existing single-channel speech separation methods.
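The two standard building blocks named in the first stage can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: EMD itself is omitted, and the function names, parameters, and the choice of feature passed to the clustering step are assumptions made only for illustration. The envelope is computed with the usual FFT-based analytic signal, and the clustering follows the textbook Fuzzy C-Means update rules.

```python
import numpy as np


def instantaneous_amplitude(x):
    """Envelope |x + j*H{x}| from the FFT-based analytic signal."""
    x = np.asarray(x, dtype=float)
    n = x.size
    spectrum = np.fft.fft(x)
    # Keep DC (and Nyquist for even n), double the positive
    # frequencies, and zero the negative frequencies.
    gain = np.zeros(n)
    gain[0] = 1.0
    if n % 2 == 0:
        gain[n // 2] = 1.0
        gain[1:n // 2] = 2.0
    else:
        gain[1:(n + 1) // 2] = 2.0
    return np.abs(np.fft.ifft(spectrum * gain))


def fuzzy_c_means(points, n_clusters=2, fuzziness=2.0, n_iter=100, seed=0):
    """Textbook Fuzzy C-Means; returns (centers, memberships)."""
    points = np.asarray(points, dtype=float)
    if points.ndim == 1:
        points = points[:, None]  # treat a 1-D input as scalar features
    rng = np.random.default_rng(seed)
    # Random initial memberships, each row normalised to sum to 1.
    u = rng.random((points.shape[0], n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are membership-weighted means of the points.
        w = u ** fuzziness
        centers = (w.T @ points) / w.sum(axis=0)[:, None]
        # Memberships are proportional to dist^(-2 / (fuzziness - 1)).
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        inv = dist ** (-2.0 / (fuzziness - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)
    return centers, u
```

In a pipeline of the kind the abstract describes, `instantaneous_amplitude` would be applied to each intrinsic mode function, and some amplitude-derived feature would be passed to `fuzzy_c_means` with `n_clusters=2`, one cluster per speaker. Which feature the authors actually cluster is not stated in the abstract.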

Keywords:
This article is indexed in SpringerLink and other databases.
