Ameliorated language modelling for lecture speech recognition of Indian English
Authors:Disha Kaur Phull  G Bharadwaja Kumar
Affiliation:1.VIT University,Chennai,India
Abstract:Research on the automatic transcription of lectures is growing, since lectures contain a wealth of information and knowledge that could benefit educational systems and institutes. In large-vocabulary speech recognition, the language model plays a paramount role in reducing the enormous search space. However, language models are brittle when moving from one domain to another or from read speech to spontaneous speech, and lecture speech shares some of the characteristics of spontaneous speech. Building a language model for this task is therefore challenging. This paper presents an approach that adapts the language model so that it stays close to the topic spoken in the lecture. Language models built with the proposed approach are evaluated against existing language models such as CMU Sphinx, Gigaword and HUB-4. The results show that the language models built with the proposed approach outperform the existing ones in terms of word error rate, perplexity and out-of-vocabulary rate. The analysis shows that the presented two-phase approach yields an average decrease in word error rate of approximately 14% and halves the perplexity on average.
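For readers unfamiliar with the three evaluation measures named in the abstract, the following is a minimal sketch of how word error rate, out-of-vocabulary rate and perplexity are commonly computed. It is not the authors' evaluation code. The function names, the whitespace tokenization and the choice of base-10 log probabilities are assumptions made for illustration only.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Assumes whitespace tokenization and a non-empty reference.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if r == h else 1)
            deletion = prev[j] + 1       # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def oov_rate(tokens, vocabulary):
    """Fraction of test tokens that are absent from the vocabulary."""
    if not tokens:
        raise ValueError("tokens must be non-empty")
    return sum(1 for t in tokens if t not in vocabulary) / len(tokens)


def perplexity(log10_probs):
    """Perplexity from per-word base-10 log probabilities.

    Computed as 10 ** (-(1/N) * sum(log10 p(w_i))).
    """
    if not log10_probs:
        raise ValueError("log10_probs must be non-empty")
    return 10 ** (-sum(log10_probs) / len(log10_probs))
```

Under these definitions, a lower value is better for all three measures, which is the sense in which the abstract reports improvement.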
This article is indexed in SpringerLink and other databases.