Towards robustness to speech rate in Mandarin all-syllable recognition |
| |
Authors: | Chen YiNing, Zhu Xuan, Liu Jia, Liu RunSheng |
| |
Affiliation: | (1) Department of Electronic Engineering, Tsinghua University, 100084 Beijing, P.R. China |
| |
Abstract: | In Mandarin all-syllable recognition, many insertion errors occur due to the influence of non-consonant syllables. Introducing a duration model into the recognition process is a direct way to reduce these errors, but it usually does not work as well as expected because duration is sensitive to speech rate. To address this problem, this paper proposes a novel context-dependent duration distribution normalized by speech rate and applies it to a speech recognition system built on an improved Hidden Markov Model (HMM) framework. To realize the algorithm, the authors employ a new method to estimate the speech rate of a sentence, then compute the duration probability combined with the speech rate, and finally apply this duration information in the post-processing stage. The duration model is thus incorporated efficiently, with little change to the recognition process or its resource demands. Experimental results show that syllable error rates decrease significantly on two different speech corpora; insertion errors in particular are reduced by about sixty to eighty percent. |
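The abstract names three steps (estimate the sentence's speech rate, score durations normalized by that rate, apply the score in post-processing) without giving formulas. The following Python sketch illustrates the general idea only and is not the authors' method. It assumes that speech rate is the mean syllable duration of a hypothesis, that normalized durations follow a per-syllable Gaussian, and that post-processing means rescoring candidate hypotheses. The names `DURATION_STATS`, `speech_rate`, `duration_log_prob`, and `rescore`, and all numeric values, are hypothetical.

```python
import math

# Hypothetical (mean, std) of rate-normalized duration per syllable.
# Illustrative values only; not taken from the paper.
DURATION_STATS = {
    "ma": (1.0, 0.25),
    "a": (0.8, 0.30),
    "shi": (1.1, 0.25),
}


def speech_rate(durations):
    """Assumed estimator: mean syllable duration (in frames) of the sentence."""
    return sum(durations) / len(durations)


def duration_log_prob(syllables, durations):
    """Sum of Gaussian log-densities of rate-normalized syllable durations."""
    rate = speech_rate(durations)
    total = 0.0
    for syl, dur in zip(syllables, durations):
        mean, std = DURATION_STATS[syl]
        z = (dur / rate - mean) / std
        total += -0.5 * z * z - math.log(std * math.sqrt(2.0 * math.pi))
    return total


def rescore(hypotheses, weight=1.0):
    """Return the hypothesis maximizing acoustic score + weight * duration score."""
    return max(
        hypotheses,
        key=lambda h: h["acoustic"]
        + weight * duration_log_prob(h["syllables"], h["durations"]),
    )
```

With two candidate hypotheses where one contains a very short inserted syllable, the duration term can outweigh a slightly better acoustic score, which is the mechanism by which a duration model would suppress insertions. Because each duration is divided by the sentence-level rate, the same statistics apply to fast and slow speech under these assumptions.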
| |
Keywords: | speech recognition, speech rate, duration distribution |
This article is indexed in CNKI, VIP, Wanfang Data, SpringerLink, and other databases. |
| Published in: Journal of Computer Science and Technology (《计算机科学技术学报》) |