Application of Group Delay Spectrum Parameters in Mandarin Digit Speech Recognition |
| |
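Cite this article: | Zhou Feng, Yu Yibiao. Application of group delay spectrum parameters in Mandarin digit speech recognition[J]. Journal of Signal Processing (信号处理), 2017, 33(9): 1215-1220. DOI: 10.16798/j.issn.1003-0530.2017.09.008 |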
| |
Authors: | Zhou Feng (周峰), Yu Yibiao (俞一彪) |
| |
Affiliation: | School of Electronic and Information Engineering, Soochow University |
| |
Funding: | National Natural Science Foundation of China (61271360); Natural Science Foundation of Jiangsu Province (BK20131196) |
| |
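Abstract: | The high confusability among spoken Mandarin digits directly degrades the performance of Mandarin digit speech recognition, and traditional recognition methods struggle to distinguish easily confused utterances. This paper proposes a multi-parameter, multi-level recognition strategy: Mel spectral parameters are first used with an HMM for primary digit recognition, and easily confused digit pairs are then reclassified with an SVM using a new group delay spectrum parameter, RRCGD-CC (Reflected Roots Chirp Group Delay-Cepstral Coefficients). Experimental results show that, with the multi-parameter, multi-level method, the recognition rate for the digits "2" and "8" improves by 8% and the overall recognition rate of the digit recognition system improves by 2.3%. These results indicate that the proposed method improves the recognition performance of Mandarin digit speech recognition systems and that RRCGD-CC is effective for recognizing easily confused digits.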
|
Keywords: | digit recognition; group delay; multi-level recognition |
Received: | 2017-03-28 |
Application of group delay spectrum parameters in Mandarin digit speech recognition |
| |
Affiliation: | School of Electronic and Information Engineering, Soochow University |
| |
Abstract: | The high confusability among Mandarin digits directly degrades the performance of Mandarin digit speech recognition, and traditional methods struggle to distinguish easily confused digits. This paper presents a multi-parameter, multi-level recognition strategy. The digits are first recognized with an HMM using Mel spectral parameters; easily confused digit pairs are then reclassified with an SVM using RRCGD-CC (Reflected Roots Chirp Group Delay-Cepstral Coefficients), a new parameter based on the group delay spectrum. Experimental results show that the recognition rate for "2" and "8" improves by 8% and the overall recognition rate of the system improves by 2.3%. These results indicate that RRCGD-CC is effective for easily confused digits. |
| |
Keywords: | digit recognition; group delay; multi-level recognition |
|
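The two-level strategy described in the abstract can be sketched roughly as follows. This is a minimal illustration of the control flow only, not the authors' implementation: `two_stage_recognize`, `first_stage`, and `pair_classifiers` are hypothetical names, the first-stage callable stands in for an HMM scored on Mel spectral features, and each pair classifier stands in for an SVM scored on RRCGD-CC features. Feature extraction and model training are not shown.

```python
from typing import Callable, Dict, FrozenSet, Sequence

Features = Sequence[float]
Classifier = Callable[[Features], str]


def two_stage_recognize(
    mel_features: Features,
    group_delay_features: Features,
    first_stage: Classifier,
    pair_classifiers: Dict[FrozenSet[str], Classifier],
) -> str:
    """Return a digit label using a first-pass model plus pairwise re-checks."""
    # Stage 1: the general recognizer (an HMM in the abstract) picks a digit.
    label = first_stage(mel_features)

    # Stage 2: if that digit belongs to a confusable pair, let the
    # pair-specific binary classifier (an SVM in the abstract) decide.
    for pair, classifier in pair_classifiers.items():
        if label in pair:
            return classifier(group_delay_features)

    # Digits outside every confusable pair keep the first-stage label.
    return label
```

In this sketch, only utterances whose first-stage label falls in a registered pair, such as `frozenset({"2", "8"})`, reach the second classifier, so the extra feature set is needed only for those cases.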