Ameliorated language modelling for lecture speech recognition of Indian English

Authors:	Disha Kaur Phull, G. Bharadwaja Kumar

Affiliation:	1. VIT University, Chennai, India

Abstract:	Research on the automatic transcription of lectures is growing, because lectures carry a large amount of information and knowledge that could benefit educational systems and institutes. In large-vocabulary speech recognition, the language model plays a paramount role in reducing the very large search space. However, language models are brittle when moving from one domain to another or from read speech to spontaneous speech, and lecture speech shares some of the characteristics of spontaneous speech. Building a language model for this task is therefore challenging. This paper presents an approach for adapting the language model so that it stays close to the topic spoken about in the lecture. Language models built with the proposed approach are evaluated against existing language models such as CMU Sphinx, Gigaword and HUB-4. The results show that the models built with the proposed approach outperform the existing language models in terms of word error rate, perplexity and out-of-vocabulary rate. The presented two-phase approach reduces the word error rate by approximately 14% on average and halves the perplexity on average.
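The abstract compares language models on three metrics: word error rate, perplexity and out-of-vocabulary rate. As a reading aid, the following is a minimal Python sketch of the standard textbook definitions of these metrics. It is not the authors' evaluation code; the function names and the toy inputs in the usage note are assumptions made for illustration only.

```python
import math


def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words.

    Assumes a non-empty, whitespace-tokenised reference transcript.
    """
    ref, hyp = reference.split(), hypothesis.split()
    # prev[j] = edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


def oov_rate(tokens, vocabulary):
    """Fraction of test tokens that are missing from the vocabulary."""
    return sum(1 for t in tokens if t not in vocabulary) / len(tokens)


def perplexity(word_probs):
    """exp of the average negative log-probability the model gives each word."""
    return math.exp(-sum(math.log(p) for p in word_probs) / len(word_probs))
```

For example, with made-up inputs, `word_error_rate("the cat sat on the mat", "the cat sit on mat")` counts one substitution and one deletion against six reference words. A model that assigns probability 0.25 to every word of a test text has a perplexity of 4, so halving perplexity roughly means the model is choosing among half as many equally likely words at each step.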

This article is indexed in SpringerLink and other databases.
