期刊界 All Journals 搜尽天下杂志传播学术成果专业期刊搜索期刊信息化学术搜索

全文获取类型

收费全文	3篇
免费	0篇

学科分类

工业技术

3篇

出版年

2016年	1篇
2003年	1篇
1999年	1篇

排序方式： 共有3条查询结果，搜索用时 13 毫秒

Toward Language-dependent Applications

Diana Santos 《Machine Translation》1999,14(2):83-112

Is adaptation of English NLP applications the right way to gomultilingual? Should one prefer ``language-independent'' systems with aview to applying them to a large number of different languages? Experience from the processing of Portuguese in several differentareas (part-of-speech tagging, corpus tools, lexical decomposition,machine translation, etc.) suggests that neither of these offers a satisfactory solution. This paper argues for a thorough study of the way individual languageswork in order to develop applications suited for the language inquestion, i.e., ``language-dependent'' systems. 相似文献

Transmitting Albanian Cultural Identity in the Age of the Internet

Bendis Pustina 《New Review of Information Networking》2016,21(1):24-36

ABSTRACT

The Albanian diaspora is one of the largest in the world, compared to the population it originates from. Its degree of assimilation varies depending on time, settlement, and other social factors. While the family played the major role in preserving the identity, nowadays it is not sufficient to convey identity to the younger generations that spend substantial time online and need adequate web resources. This article aims to analyze the main websites created by and for the Albanian diaspora in Albanian, and some initiatives of national and private institutions, which present the intangible and tangible heritage to the wider diaspora. 相似文献

Efficient Development of Lexical Language Resources and their Representation

Matej Rojc Zdravko Kačič 《International Journal of Speech Technology》2003,6(3):259-275

Statistical approaches in speech technology, whether used for statistical language models, trees, hidden Markov models or neural networks, represent the driving forces for the creation of language resources (LR), e.g., text corpora, pronunciation and morphology lexicons, and speech databases. This paper presents a system architecture for the rapid construction of morphologic and phonetic lexicons, two of the most important written language resources for the development of ASR (automatic speech recognition) and TTS (text-to-speech) systems. The presented architecture is modular and is particularly suitable for the development of written language resources for inflectional languages. In this paper an implementation is presented for the Slovenian language. The integrated graphic user interface focuses on the morphological and phonetic aspects of language and allows experts to produce good performances during analysis. In multilingual TTS systems, many extensive external written language resources are used, especially in the text processing part. It is very important, therefore, that representation of these resources is time and space efficient. It is also very important that language resources for new languages can be easily incorporated into the system, without modifying the common algorithms developed for multiple languages. In this regard the use of large external language resources (e.g., morphology and phonetic lexicons) represent an important problem because of the required space and slow look-up time. This paper presents a method and its results for compiling large lexicons, using examples for compiling German phonetic and morphology lexicons (CISLEX), and Slovenian phonetic (SIflex) and morphology (SImlex) lexicons, into corresponding finite-state transducers (FSTs). The German lexicons consisted of about 300,000 words, SIflex consisted of about 60,000 and SImlex of about 600,000 words (where 40,000 words were used for representation using finite-state transducers). Representation of large lexicons using finite-state transducers is mainly motivated by considerations of space and time efficiency. A great reduction in size and optimal access time was achieved for all lexicons. The starting size for the German phonetic lexicon was 12.53 MB and 18.49 MB for the morphology lexicon. The starting size for the Slovenian phonetic lexicon was 1.8 MB and 1.4 MB for the morphology lexicon. The final size of the corresponding FSTs was 2.78 MB for the German phonetic lexicon, 6.33 MB for the German morphology lexicon, 253 KB for SIflex and 662 KB for the SImlex lexicon. The achieved look-up time is optimal, since it only depends on the length of the input word and not on the size of the lexicon. Integration of lexicons for new languages into the multilingual TTS system is easy when using such representations and does not require any changes in the algorithms used for such lexicons. 相似文献