3 results found (search time: 13 ms)
1.
Is adaptation of English NLP applications the right way to go multilingual? Should one prefer "language-independent" systems with a view to applying them to a large number of different languages? Experience from the processing of Portuguese in several different areas (part-of-speech tagging, corpus tools, lexical decomposition, machine translation, etc.) suggests that neither of these offers a satisfactory solution. This paper argues for a thorough study of the way individual languages work in order to develop applications suited for the language in question, i.e., "language-dependent" systems.
2.
The Albanian diaspora is one of the largest in the world relative to the population it originates from. Its degree of assimilation varies with time, place of settlement, and other social factors. The family once played the major role in preserving identity, but today it is no longer sufficient for conveying that identity to younger generations, who spend substantial time online and need adequate web resources. This article analyzes the main Albanian-language websites created by and for the Albanian diaspora, along with some initiatives of national and private institutions that present the intangible and tangible heritage to the wider diaspora.
3.
Statistical approaches in speech technology, whether used for statistical language models, trees, hidden Markov models or neural networks, are the driving force behind the creation of language resources (LR), e.g., text corpora, pronunciation and morphology lexicons, and speech databases. This paper presents a system architecture for the rapid construction of morphological and phonetic lexicons, two of the most important written language resources for the development of ASR (automatic speech recognition) and TTS (text-to-speech) systems. The presented architecture is modular and is particularly suitable for developing written language resources for inflectional languages. An implementation for the Slovenian language is presented. The integrated graphical user interface focuses on the morphological and phonetic aspects of language and allows experts to work effectively during analysis. In multilingual TTS systems, many extensive external written language resources are used, especially in the text processing part. It is therefore very important that the representation of these resources is time- and space-efficient. It is also very important that language resources for new languages can be easily incorporated into the system without modifying the common algorithms developed for multiple languages. In this regard, the use of large external language resources (e.g., morphology and phonetic lexicons) represents a significant problem because of the space required and the slow look-up time. This paper presents a method, with results, for compiling large lexicons into corresponding finite-state transducers (FSTs), using as examples the German phonetic and morphology lexicons (CISLEX) and the Slovenian phonetic (SIflex) and morphology (SImlex) lexicons.
The German lexicons consisted of about 300,000 words, SIflex of about 60,000 words, and SImlex of about 600,000 words (of which 40,000 words were used for the finite-state transducer representation). Representing large lexicons as finite-state transducers is mainly motivated by considerations of space and time efficiency. A great reduction in size and optimal access time were achieved for all lexicons. The starting size was 12.53 MB for the German phonetic lexicon and 18.49 MB for the German morphology lexicon. The starting size was 1.8 MB for the Slovenian phonetic lexicon and 1.4 MB for the Slovenian morphology lexicon. The final size of the corresponding FSTs was 2.78 MB for the German phonetic lexicon, 6.33 MB for the German morphology lexicon, 253 KB for SIflex and 662 KB for SImlex. The achieved look-up time is optimal, since it depends only on the length of the input word and not on the size of the lexicon. With such representations, integrating lexicons for new languages into the multilingual TTS system is easy and does not require any changes to the algorithms that use the lexicons.
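The claim that look-up time depends only on the length of the input word can be illustrated with a minimal sketch. It is not the compilation method of the paper, which the abstract does not describe. It assumes a plain prefix-sharing trie whose final states carry an output string, which is a deterministic transducer without the suffix sharing that minimization would add. The class `LexiconTransducer`, the helper `build`, and the entries in `TOY_ENTRIES` are hypothetical and are not taken from CISLEX, SIflex or SImlex.

```python
class LexiconTransducer:
    """Toy deterministic transducer: a trie whose final states carry an output."""

    def __init__(self):
        # transitions[state] maps an input symbol to the next state id.
        self.transitions = [{}]
        # final_output maps a final state id to the output string of the entry.
        self.final_output = {}

    def add(self, word, output):
        """Insert one lexicon entry, reusing states for shared prefixes."""
        state = 0
        for symbol in word:
            nxt = self.transitions[state].get(symbol)
            if nxt is None:
                nxt = len(self.transitions)
                self.transitions.append({})
                self.transitions[state][symbol] = nxt
            state = nxt
        self.final_output[state] = output

    def lookup(self, word):
        """Return (output, steps); steps never exceeds len(word)."""
        state = 0
        steps = 0
        for symbol in word:
            nxt = self.transitions[state].get(symbol)
            if nxt is None:
                return None, steps  # no such path: word is not in the lexicon
            state = nxt
            steps += 1
        return self.final_output.get(state), steps


def build(entries):
    """Build a transducer from (word, output) pairs."""
    fst = LexiconTransducer()
    for word, output in entries:
        fst.add(word, output)
    return fst


# Invented toy entries (word -> space-separated symbols), for illustration only.
TOY_ENTRIES = [("hisa", "h i s a"), ("hise", "h i s e"), ("miza", "m i z a")]

small = build(TOY_ENTRIES)
# Same entries plus 10,000 synthetic filler words to enlarge the lexicon.
large = build(TOY_ENTRIES + [(f"w{i}", str(i)) for i in range(10000)])

print(small.lookup("hisa"))
print(large.lookup("hisa"))
```

Both look-ups follow exactly one transition per input character, regardless of how many entries the lexicon holds. The size reductions reported above would additionally rely on sharing common suffixes, which this sketch leaves out.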