Similar literature (20 results found)
1.
Connected-digit speech recognition is important in many applications, such as automated banking systems, catalogue dialing and automatic data entry. This paper presents an optimum speaker-independent connected-digit recognizer for the Malayalam language. The system employs Perceptual Linear Predictive (PLP) cepstral coefficients for speech parameterization and continuous-density Hidden Markov Models (HMMs) in the recognition process. The Viterbi algorithm is used for decoding. The training database contains utterances from 21 speakers aged 20 to 40, recorded in a normal office environment, where each speaker was asked to read 20 sets of continuous digits. The system obtained an accuracy of 99.5% on unseen data.
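
The decoding step named in this abstract can be illustrated with a minimal Viterbi sketch. It is not the authors' recognizer: the abstract describes continuous-density HMMs over PLP features, whereas this sketch assumes a discrete-observation HMM for brevity. The two states, the symbols `x` and `y`, and all probabilities are hypothetical toy values.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for a discrete-observation HMM."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]
    for obs in observations[1:]:
        scores, pointers = {}, {}
        for s in states:
            # Pick the predecessor that maximises the path score into s
            prev = max(states, key=lambda p: best[-1][p] + log_trans[p][s])
            scores[s] = best[-1][prev] + log_trans[prev][s] + log_emit[s][obs]
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return best[-1][last], path


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {kk: math.log(v) for kk, v in row.items()} for k, row in table.items()}


# Hypothetical toy model (not taken from the paper)
STATES = ("A", "B")
LOG_START = {"A": math.log(0.6), "B": math.log(0.4)}
LOG_TRANS = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
LOG_EMIT = to_log({"A": {"x": 0.9, "y": 0.1}, "B": {"x": 0.2, "y": 0.8}})

log_prob, path = viterbi(["x", "x", "y", "y"], STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(path)  # ['A', 'A', 'B', 'B']
```

Working in log-probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds scores instead of multiplying probabilities.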

2.
In this paper, Krawtchouk moment-based shape features at lower orders are proposed for an Indian sign language (ISL) recognition system; these features give local information about the shape from a specific region of interest. The shape recognition capability of Krawtchouk moment-based local features is verified on two databases: the standard Jochen Triesch database and a set of 26 ISL alphabets collected from 72 different subjects, with variations in position, scale and rotation. Feature selection is performed to minimise redundancy. The effect of order and feature dimensionality for different classifiers is studied. Results show that Krawtchouk moment-based local features exhibit user, scale, rotation and translation invariance. Moreover, they have shape identification capability.

3.
N USHA RANI  P N GIRIJA 《Sadhana》2012,37(6):747-761
Speech is one of the most important channels of communication among people, and speech recognition occupies a prominent place in communication between humans and machines. Several factors affect the accuracy of a speech recognition system. Despite much effort to increase accuracy, current speech recognition systems still generate erroneous output. Telugu is one of the most widely spoken south Indian languages. In the proposed Telugu speech recognition system, errors obtained from the decoder are analysed to improve performance. The static pronunciation dictionary plays a key role in recognition accuracy, so the dictionary used in the decoder is modified. This modification reduces the number of confusion pairs, which improves the performance of the system. Language model scores also vary with this modification. The hit rate increases considerably, and the number of false alarms changes as the pronunciation dictionary is modified. Variations are observed in different error measures, such as F-measure, error rate and Word Error Rate (WER), when the proposed method is applied.
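
For readers unfamiliar with the Word Error Rate mentioned in this abstract, the following is a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions and insertions) divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration; the paper's own scoring tool is not described in the abstract.

```python
def word_error_rate(reference, hypothesis):
    """WER = (substitutions + deletions + insertions) / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j]: edit distance between the reference words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion of a reference word
                            curr[j - 1] + 1,      # insertion of a hypothesis word
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1] / len(ref)


print(word_error_rate("a b c d", "a x c"))  # 0.5 (one substitution, one deletion)
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.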

4.
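张永锋  田勇  张阳 《声学技术》 (Technical Acoustics) 2015,34(1):51-53
Noise-robust continuous speech recognition is an important research area in Chinese continuous speech recognition. The method segments continuous speech by measuring the spectral stability between consecutive speech frames, then transforms each segment, regardless of its duration, into a fixed-size, time-independent spectral-space feature; recognition is performed by comparing these features against a template library. The new spectral-space feature is independent of utterance duration and also shows good noise robustness. In a speaker-dependent continuous speech recognition test system, the method achieved good recognition results.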

5.
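董滨  赵庆卫  颜永红 《声学技术》 (Technical Acoustics) 2006,25(5):473-477
A fast confidence estimation algorithm for telephone speech recognition systems is proposed. The algorithm estimates confidence synchronously on the state graph during the recognizer's frame-synchronous beam search, using the same acoustic models that the recognizer uses for decoding. It outperforms the conventional two-pass decoding approach to confidence estimation, has lower computational complexity and runs fast, which resolves the conflict between model discriminability and computation speed in confidence estimation.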

6.
7.
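This work studies the use of the DTW (Dynamic Time Warping) algorithm from speech recognition for voiceprint verification. By introducing the concept of a "sample domain", a sample domain of maximum similarity to the sample points is built from the finite set of given samples, and the similarity of the test sample is then computed. The algorithm improves the efficiency of speaker discrimination (distinguishing between different speakers). Experiments with a limited number of speakers show an impostor rejection rate of 98.75% (400 trials) and a recognition rate of 81-93% (80 trials).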
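
As background for the DTW algorithm named above, here is a minimal sketch of the classic dynamic-programming formulation. It does not implement the paper's "sample domain" idea, which the abstract does not specify in enough detail. The function name `dtw_distance`, the scalar sequences and the absolute-difference local cost are assumptions for illustration; a real system would compare frame-level feature vectors.

```python
import math


def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Accumulated cost of the best monotonic alignment between sequences a and b."""
    n, m = len(a), len(b)
    # cost[i][j]: minimal accumulated cost of aligning a[:i] with b[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = dist(a[i - 1], b[j - 1])
            cost[i][j] = local + min(cost[i - 1][j],      # a advances, b repeats
                                     cost[i][j - 1],      # b advances, a repeats
                                     cost[i - 1][j - 1])  # both advance
    return cost[n][m]


print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))  # 0.0 (same shape, different timing)
print(dtw_distance([1, 2, 3], [2, 3, 4]))     # 2.0
```

The first call shows why DTW suits speech: the second sequence is a time-stretched copy of the first, and the warping absorbs the difference in duration.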

8.
9.
Lam  V.S.W. 《Software, IET》2008,2(5):391-403
Despite the wide adoption of unified modelling language activity diagrams (UML ADs) for software development, research focusing on the equivalence notions of UML ADs is scarce. To address this area of concern, the author presents a sound theoretical foundation for UML ADs. Using these formal definitions of UML ADs, the author propounds a method which classifies various types of equivalences of UML ADs in a systematic way. The proposed classification, which is the core result of this work, provides a framework that enables the study of the properties and inter-relationships of the equivalences.

10.
We investigate the incorporation of frequency-masking curves in the feature extraction module of automatic speech recognition systems to improve noise robustness. Frequency-masking curves are mathematically derived based on an auditory model, in which a basilar membrane is modeled as a cascade system of damped simple harmonic oscillators. Based on the analysis of the motion under speech signals, we derive the relationship between the amplitudes of neighboring oscillators and convert it into frequency-masking curves, which are used in the computation of spectral-masking thresholds to modify the speech spectrum. Evaluated on the Aurora 2.0 noisy-digit speech database, the proposed methodology achieves a significant improvement in noise robustness.

11.
The focus of this paper is to automatically segment and label continuous speech signals into syllable-like units for Indian languages. In this approach, the continuous speech signal is first automatically segmented into syllable-like units using a group delay based algorithm. Similar syllable segments are then grouped together using an unsupervised and incremental training (UIT) technique. Isolated-style HMMs are generated for each of the clusters during training. During testing, the speech signal is segmented into syllable-like units, which are then tested against the HMMs obtained during training. This results in a syllable recognition performance of 42.6% and 39.94% for Tamil and Telugu respectively. A new feature extraction technique that uses features extracted from multiple frame sizes and frame rates during both training and testing is explored for the syllable recognition task. This results in a recognition performance of 48.7% and 45.36% for Tamil and Telugu respectively. The performance of segmentation followed by labelling is superior to that of a flat-start syllable recogniser (27.8% and 28.8% for Tamil and Telugu respectively).

12.
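Research on applying support vector machines to speech emotion recognition (cited 3 times in total: 0 self-citations, 3 by others)
To effectively recognize the type of emotional information contained in speech signals, a new method that applies support vector machines (SVMs) to speech emotion recognition is proposed. The SVM maps the extracted prosodic emotion features into a high-dimensional space and constructs an optimal separating hyperplane to recognize four main emotion types in Mandarin Chinese: anger, happiness, sadness and surprise. Computer simulation results show that the SVM achieves better recognition performance than several existing speech emotion recognition methods.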

13.
In order for an Intelligent Decision Support System (IDSS) to interact properly with a user, it must know what the user is doing. Accident Sequence Modelling (ASM) provides a possible frame of reference for monitoring operator activities, but it cannot be used directly: (1) operators may deviate from the scenario described in ASM, (2) the actual situation may develop differently from the scenario, (3) operators are normally involved in several activities at the same time, and (4) modelling of operator activities must focus on the level of individual actions, while the ASM only addresses the global view. The reference provided by the ASM scenario must therefore be supplemented by a more direct modelling of what the operator does. This requires recognition of the operator's current plans, i.e. his goals and the strategies he employs to reach them. The paper describes an ongoing programme to develop an expert system that does this, within the ESPRIT project Graphical Dialogue Environment.

14.
15.
16.
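Script identification based on wavelet texture analysis extracts global features of the document image, and the algorithm is simple and fast. We improved this algorithm by using a variable threshold in place of the distance formula, so that the decision can be tuned to the user's preference, and by adding a rejection option. In experiments on 1420 images, the accuracy was 76.40%.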

17.
Pitman shorthand language (PSL) is a recording medium practised in all organizations where English is the transaction medium. It has the practical advantage of a high recording speed, 120–200 words per minute or more, because of which it is universally acknowledged. This recording medium continues to exist in spite of considerable developments in speech processing systems, which are not yet universally established. To exploit the vast transcribing potential of PSL, a new area of research on the automation of PSL processing is conceived. It has three major steps: shape recognition of PSL strokes, their validation, and English text production from these strokes. The paper describes a knowledge-based approach for the recognition of PSL strokes. Information about the location and direction of the starting and final points of strokes forms the knowledge base for stroke recognition. The work comprises preprocessing, determination of starting and final points, acquisition of quadrant knowledge, graph-based traversal and, finally, a rule-based inference process for generating the phonetic equivalents of English language characters for the strokes. The proposed work is thoroughly tested on a large number of handwritten strokes.

18.
The complex and inconsistent nature of the English language presents problems for patent searchers researching the prior art. This is true for native speakers as well as for those who use it as a second language. These problems include confusion in translations; “Patentese”, the jargon used by patent attorneys; terminology, which can take time to be adopted; “faux amis”, words which you think you know as they look identical to foreign words; the oddities of English spelling; multiple meanings for the same words; words that have opposite meanings; synonyms; Americanisms as different spellings and different words; words that are both nouns and verbs; compound nouns, which are often spelt as two words; spelling mistakes; and syntax. Conclusions suggest using broad classes together with keywords; looking for synonyms; allowing for two words in compound nouns; using adjacency operators; combining sets of results; and using citation searching as an additional search, especially if little is found, or the invention is difficult to describe. A thesaurus of recommended words and spellings would be useful if adopted by those preparing abstracts.

19.
Static checking is key for the security of software components. As a component model, this paper considers a Java class enriched with annotations from the Java modelling language (JML). It defines a formal execution semantics for repetitive method invocations from this annotated class, called the class in isolation semantics. Afterwards, a pattern of liveness properties is defined, together with its formal semantics, providing a foundation for both static and runtime checking. This pattern is then inscribed in a complete language of temporal properties, called Java temporal pattern language, extending JML. The authors particularly address the verification of liveness properties by automatically translating the temporal properties into JML annotations for this class. This automatic translation is implemented in a tool called JML annotation generator. Correctness of the generated annotations ensures that the temporal property is established for the executions of the class in isolation.

20.
Synthesis of continuous and unlimited speech is a matter of theoretical as well as technological interest. Independent efforts are needed for synthesis in Indian languages, which are substantially different from English and other European languages. The paper discusses basic synthesis issues such as text-to-phoneme and phoneme-to-speech conversion and the incorporation of prosody. The three commonly adopted methodologies of concatenation, formant and articulatory synthesis are compared. The TIFR phoneme-to-speech synthesizer, which utilizes a standard formant synthesizer as a speech production model, is described, and the methodology for evolving and organizing formant-based rules to drive the synthesizer is emphasized. The results of some perception tests are reported and a few potential applications are suggested. The direction of future work for enhancing the quality and expanding the scope of the synthesizer is indicated.
