Emotional speech feature normalization and recognition based on speaker-sensitive feature clustering |
| |
Authors: | Chengwei Huang, Baolin Song, Li Zhao |
| |
Affiliation: | 1. School of Information Science and Engineering, Southeast University, Nanjing, China |
| |
Abstract: | In this paper we propose a feature normalization method for speaker-independent speech emotion recognition. The performance of a speech emotion classifier depends largely on its training data, and a large number of unknown speakers poses a major challenge. To address this problem, we first extract and analyse 481 basic acoustic features. Second, we use principal component analysis and linear discriminant analysis jointly to construct a speaker-sensitive feature space. Third, we use fuzzy k-means clustering to assign the emotional utterances to pseudo-speaker groups in that speaker-sensitive feature space. Finally, we normalize the original basic acoustic features of each utterance based on its group information. To verify the normalization algorithm, we use a Gaussian mixture model based classifier in the recognition tests. The experimental results show that the normalization algorithm is effective both on our locally collected database and on the eNTERFACE’05 Audio-Visual Emotion Database. The emotional features obtained with our method are robust to speaker changes, and an improved recognition rate is observed. |
| |
Keywords: | |
This article has been indexed by SpringerLink and other databases. |
|
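The normalization steps outlined in the abstract (project into a speaker-sensitive space, cluster utterances into pseudo-speaker groups with fuzzy k-means, then normalize each utterance using its group) can be illustrated with a minimal sketch. This is not the authors' implementation, and it rests on several assumptions: PCA alone stands in for the joint PCA and LDA projection (LDA would need speaker labels from the training set), the fuzzifier `m = 2` is a conventional default, and the normalization is assumed to be a per-group z-score using the highest-membership group, which the abstract does not specify. The function names `pca_project`, `fuzzy_kmeans`, and `normalize_by_group` are hypothetical.

```python
import numpy as np


def pca_project(X, n_components):
    """Project rows of X onto the top principal components.

    Assumption: PCA alone is a stand-in for the paper's joint PCA + LDA space.
    """
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def fuzzy_kmeans(Z, n_groups, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Fuzzy k-means (fuzzy c-means). Returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    # Random initial memberships; each row sums to 1.
    U = rng.random((Z.shape[0], n_groups))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        W = U ** m
        # Centers are membership-weighted means of the points.
        centers = (W.T @ Z) / W.sum(axis=0)[:, None]
        # Euclidean distance from every point to every center.
        d = np.linalg.norm(Z[:, None, :] - centers[None, :, :], axis=2)
        d = np.maximum(d, 1e-12)  # avoid division by zero
        # Membership update: u_ik proportional to d_ik^(-2 / (m - 1)).
        inv = d ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def normalize_by_group(X, U):
    """Z-score each utterance with the statistics of its pseudo-speaker group.

    Assumption: each utterance is hard-assigned to its highest-membership group.
    """
    groups = U.argmax(axis=1)
    X_norm = np.empty_like(X, dtype=float)
    for g in np.unique(groups):
        idx = groups == g
        mu = X[idx].mean(axis=0)
        sigma = X[idx].std(axis=0)
        sigma[sigma == 0] = 1.0  # leave constant features unscaled
        X_norm[idx] = (X[idx] - mu) / sigma
    return X_norm, groups


# Synthetic example: two "speakers" whose features differ by a constant offset.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 8)), rng.normal(5.0, 1.0, (50, 8))])
Z = pca_project(X, n_components=2)
centers, U = fuzzy_kmeans(Z, n_groups=2)
X_norm, groups = normalize_by_group(X, U)
```

In this toy setup the speaker offset dominates the variance, so the clustering can recover it without speaker labels and the per-group z-score removes it. The normalized features `X_norm` would then be passed to a classifier such as the Gaussian mixture model mentioned in the abstract.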