

Reviewing automatic language identification
Authors: Muthusamy, Y.K.; Barnard, E.; Cole, R.A.
Affiliation: Syst. & Inf. Sci. Lab., Texas Instrum. Inc., Dallas, TX
Abstract: The Oregon Graduate Institute Multi-language Telephone Speech Corpus (OGI-TS) was designed specifically for language identification research. It currently consists of spontaneous and fixed-vocabulary utterances in 11 languages: English, Farsi, French, German, Hindi, Japanese, Korean, Mandarin, Spanish, Tamil, and Vietnamese. These utterances were produced by 90 native speakers in each language over real telephone lines. Language identification is related to speaker-independent speech recognition and speaker identification in several interesting ways. It is therefore not surprising that many of the recent developments in language identification can be related to developments in those two fields. We review some of the more important recent approaches to language identification against the background of successes in speaker and speech recognition. In particular, we demonstrate how approaches to language identification based on acoustic modeling and on language modeling are similar to algorithms used in speaker-independent continuous speech recognition. Thereafter, prosodic and duration-based information sources are studied. We then review an approach to language identification that draws heavily on speaker identification. Finally, the performance of some representative algorithms is reported.