1.

Matlab Simulation of an HMM-Based Speech Recognition System

沈泉波《电声技术》2012,36(10):56-57,70

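The hidden Markov model (HMM) has become the mainstream technique in speech recognition. This paper first introduces the principles and structure of speech recognition technology, then presents the three basic problems of HMMs and their solutions, and finally uses the Matlab simulation tool to design an isolated-word speech recognition system that recognizes the digits 0 to 9.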
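The first of the three basic HMM problems (evaluation) is what drives isolated-word recognition: each word gets its own model, and the recognizer picks the model that assigns the utterance the highest likelihood. The following is a minimal Python sketch of that idea for discrete-observation HMMs. It is not the paper's Matlab code; the names `forward_likelihood`, `recognize`, and `toy_models`, and all the numbers, are illustrative assumptions.

```python
from typing import Dict, List, Sequence, Tuple

# (pi, A, B): initial probabilities, transition matrix, emission matrix.
HMM = Tuple[List[float], List[List[float]], List[List[float]]]


def forward_likelihood(model: HMM, obs: Sequence[int]) -> float:
    """Return P(obs | model) using the forward algorithm (no scaling)."""
    pi, A, B = model
    n = len(pi)
    # alpha[i] = P(o_1..o_t, state at time t = i)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * A[i][j] for i in range(n)) * B[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def recognize(models: Dict[str, HMM], obs: Sequence[int]) -> str:
    """Pick the word whose HMM gives the observation the highest likelihood."""
    return max(models, key=lambda w: forward_likelihood(models[w], obs))


# Two toy 2-state left-to-right models over a 2-symbol codebook.
# All numbers are made up for illustration.
toy_models: Dict[str, HMM] = {
    "zero": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.9, 0.1], [0.2, 0.8]]),
    "one": ([1.0, 0.0], [[0.5, 0.5], [0.0, 1.0]], [[0.1, 0.9], [0.8, 0.2]]),
}
```

For long utterances a practical system would work with scaled or log probabilities to avoid underflow; the sketch omits that for brevity.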

2.

From Linear Prediction HMM to a New Hybrid Model for Speech Recognition (total citations: 1, self-citations: 0, citations by others: 1)

欧智坚王作英《电子学报》2002,30(9):1313-1316

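Unlike the traditional HMM, the linear prediction HMM (LPHMM) does not assume that state outputs are independent and identically distributed, yet its recognition performance in practice is poor. By analyzing the respective strengths and weaknesses of the two kinds of HMM, this paper proposes a new hybrid model for speech recognition. The model describes the static characteristics of speech (based on the traditional HMM) and its dynamic characteristics (based on the LPHMM) separately and then combines them, characterizing real speech phenomena more accurately while requiring only small changes to the system implementation and little extra computation. Experiments on Mandarin large-vocabulary, speaker-independent continuous speech recognition show that the hybrid model performs significantly better than both the LPHMM and the traditional HMM. On the theoretical side, the paper also gives a set of closed-form parameter re-estimation formulas for the LPHMM.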

3.

A Hybrid Speech Recognition Model Based on HMM and PNN

李战明苏敏赵正天李二超《电声技术》2007,31(12):44-46,50

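A hybrid model for speech recognition is proposed, based on the hidden Markov model (HMM) and an improved probabilistic neural network (PNN). The model first uses the HMM to generate the best speech state sequence, then time-normalizes that sequence, and finally classifies it with the PNN. Algorithms for HMM parameter training and for time normalization are given. Experimental results show that this model achieves better recognition than the HMM alone.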
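Generating the best state sequence from an HMM is conventionally done with the Viterbi algorithm. The abstract does not say which decoder the authors used, so the following Python sketch is an assumption: a plain Viterbi decoder for a discrete-observation HMM, with the hypothetical function name `viterbi`.

```python
from typing import List, Sequence, Tuple


def viterbi(
    pi: Sequence[float],
    A: Sequence[Sequence[float]],
    B: Sequence[Sequence[float]],
    obs: Sequence[int],
) -> Tuple[List[int], float]:
    """Return the most probable state path and its joint probability."""
    n = len(pi)
    # delta[j] = probability of the best path that ends in state j
    delta = [pi[i] * B[i][obs[0]] for i in range(n)]
    backptr: List[List[int]] = []
    for o in obs[1:]:
        prev = delta
        step_ptr: List[int] = []
        delta = []
        for j in range(n):
            # Best predecessor of state j at this time step.
            best_i = max(range(n), key=lambda i: prev[i] * A[i][j])
            step_ptr.append(best_i)
            delta.append(prev[best_i] * A[best_i][j] * B[j][o])
        backptr.append(step_ptr)
    # Trace the best path backwards from the best final state.
    last = max(range(n), key=lambda i: delta[i])
    path = [last]
    for step_ptr in reversed(backptr):
        path.append(step_ptr[path[-1]])
    path.reverse()
    return path, delta[last]
```

The decoded path has one state per frame, so its length varies with the utterance; that is why a time-normalization step is needed before a fixed-input classifier such as the PNN.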

4.

A Tone Recognition Method for Mandarin Continuous Speech Based on Continuous-Density HMM

赵力邹采荣吴镇扬《信号处理》2000,16(1):20-23

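This paper presents a tone recognition method for Mandarin continuous speech based on continuous-density HMMs and proposes a feature extraction and recognition scheme suited to the task. Based on an analysis of the tonal characteristics of Mandarin continuous speech, 8 syllable-unit continuous-density HMMs were chosen as the base models for the tone recognition experiments. The results show an average tone recognition rate of 95.1% on 1070 sentences of continuous speech from 10 speakers.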

5.

A Speech Recognition Method Based on a Hybrid HMM and LVQ Network Model

吴金南宫宁生《微电子学与计算机》2009,26(3)

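A speech recognition method based on the hidden Markov model (HMM) and the learning vector quantization (LVQ) neural network is proposed. The method first uses the HMM to generate the best speech state sequence, then applies function approximation to time-normalize that sequence, and finally classifies it with the LVQ network. Theoretical analysis and experimental results show that the hybrid model's recognition rate is clearly higher than that of the HMM alone.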

6.

Building the Acoustic Model in an HMM-Based Speech Recognition System

胡石章毅陈芳陈心怡《通讯世界》2017,(8):233-234

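Speech recognition is a technology that has developed rapidly in recent years. Having computers recognize human speech, and even converse with people, is a long-cherished goal of everyone working in pattern recognition. This paper introduces the basic theory of the hidden Markov model as applied in pattern recognition systems and builds a speech recognition system on that theory. It discusses in detail how the acoustic model in this system is built, and closes with the advantages of this HMM-based speech recognition system and the outlook for improving it.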

7.

An HMM Speech Recognition Model Based on Duration Distribution (total citations: 23, self-citations: 0, citations by others: 23)

王作英肖熙《电子学报》2004,32(1):46-49

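To address the shortcomings of the homogeneous HMM speech recognition model in using duration information, this paper formally defines a left-to-right inhomogeneous hidden Markov model suited to describing speech signals, proves that its state transition probability representation and its state duration representation are equivalent, and on that basis proposes the duration distribution based HMM (DDBHMM). Speaker-independent continuous speech experiments show that a DDBHMM using only state duration information performs clearly better than the classical HMM (the error rate falls by 17.8%). This demonstrates the good performance of the DDBHMM and opens up room for describing and exploiting important speech characteristics such as duration, speaking rate, temporal discontinuity, and the correlation of speech features.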
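The link between transition probabilities and state durations can be illustrated with the standard hazard-rate relationship: if the probability of staying in a state depends on how long the model has already been there, the stay probabilities determine a duration distribution, and vice versa. The Python sketch below shows that relationship only; it is an assumption about the general idea, not the paper's formal definition or proof, and the function names are hypothetical.

```python
from typing import List, Sequence


def duration_from_stay(stay: Sequence[float]) -> List[float]:
    """stay[d-1] is the probability of remaining in the state after
    d frames in it. Returns P(duration = d) for d = 1..len(stay)."""
    probs = []
    survive = 1.0  # probability of having stayed for all earlier frames
    for s in stay:
        probs.append(survive * (1.0 - s))
        survive *= s
    return probs


def stay_from_duration(probs: Sequence[float]) -> List[float]:
    """Inverse mapping: stay probability after d frames is
    P(duration > d) / P(duration >= d)."""
    stay = []
    tail = sum(probs)  # P(duration >= 1)
    for p in probs:
        stay.append((tail - p) / tail if tail > 0 else 0.0)
        tail -= p
    return stay
```

With a constant stay probability, as in a homogeneous HMM, `duration_from_stay` yields a geometric distribution, whose most likely duration is always a single frame. That is a poor fit for real speech segments and is one way to see why an explicit duration distribution can help.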

8.

Application of an SVM-Based Clustering Algorithm to Time-Series Signal Recognition

汪永涛《微电子学与计算机》2012,29(3):182-184

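This paper studies the recognition of one-dimensional time-series signals. To address the low coding accuracy of hidden Markov models (HMMs) based on Gaussian mixture models, it proposes building a hybrid support vector machine from multiple support vector machines. The hybrid machine supplies the HMM with more accurate observation codes and a more accurate emission matrix, and can effectively improve HMM accuracy in speech signal recognition or character recognition. The method can be applied to speech recognition, character recognition, bioinformatics processing, and other fields.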

9.

A Speech Recognition Method Based on a Hybrid PCANN/HMM Structure (total citations: 1, self-citations: 0, citations by others: 1)

赵力邹采荣吴镇扬《信号处理》2001,17(5):473-476

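This paper proposes a speech recognition method based on a hybrid PCANN/HMM structure. It uses a feature vector formed from several successive frames as the input to the speech recognition HMM, which effectively introduces inter-frame correlation information into the HMM. To improve the behavior of the HMM's output probability density function under multi-frame feature input, a principal component analysis neural network (PCANN) that compresses the speech parameters is added at the front end of the HMM. Multi-speaker Mandarin continuous speech recognition experiments confirm the effectiveness of the proposed method.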

10.

Application of HMM in Speech Recognition Systems (total citations: 1, self-citations: 0, citations by others: 1)

苗苗马海武《现代电子技术》2006,29(16):64-66

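This paper introduces the application status and development of speech recognition technology and compares three kinds of speech recognition system, based respectively on dynamic time warping, hidden Markov models, and artificial neural networks, with a focus on the application of the hidden Markov model (HMM) in speech recognition systems. For the HMM-based system, a DHMM-based recognition system was implemented on the UniSpeech chip, and a CHMM-based recognition system was then implemented on the same platform.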

11.

An Improved Hidden Markov Model for Speech Recognition (total citations: 7, self-citations: 1, citations by others: 6)

战普明王作英《电子学报》1994,22(1):9-15

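Because the hidden Markov model widely used in speech recognition is a first-order Markov model, it cannot adequately describe the temporal dependence of speech signals. In theory the HMM can be extended to a higher-order Markov model, but the required computation and storage grow exponentially, which makes this hard to apply. This paper therefore proposes a new model that combines the HMM with a multidimensional Gaussian density function capable of describing the temporal dependence of speech signals, and argues theoretically that the new model is sound. Recognition experiments on all 409 toneless Mandarin monosyllables show that the recognition rate of the new model is markedly…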

12.

Speech recognition algorithm based on neural network and hidden Markov model

Zhao Jianhui Gao Hongbo Liu Yuchao Cheng Bo 《中国邮电高校学报(英文版)》2018,25(4):28-37

This study proposes a parallel speech recognition algorithm built on a hybrid of the hidden Markov model (HMM) and the artificial neural network (ANN). First, the algorithm uses HMMs for time-series modeling of speech signals and calculates the output probability score of the speech against each HMM. Second, with the probability scores as input to the neural network, the algorithm obtains information for classification and recognition and makes a decision based on the hybrid model. Finally, Matlab is used to train and test on sample data. Simulation results show that, by combining the strong time-series modeling ability of the HMM with the classification ability of the neural network, the proposed algorithm has stronger noise immunity than the traditional HMM. Moreover, the hybrid model compensates for the individual flaws of the HMM and the neural network and greatly improves the speed and performance of speech recognition.

13.

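A Parametric Model for Introducing Inter-Frame Correlation Information into HMMs for Speech Recognition (total citations: 4, self-citations: 1, citations by others: 3)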

杨浩荣王作英陆大《电子学报》1998,26(10):50-54,8

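Although the hidden Markov model (HMM) is currently the most popular speech recognition model, it generally adopts the assumption that state outputs are independent, and so has the inherent defect of being unable to describe temporal correlation in speech. The new model proposed here builds separate parametric models of the static and dynamic characteristics of the sequence of state output feature vectors and then combines them, thereby introducing inter-frame correlation information into the duration distribution based HMM (DDBHMM). An HMM with inter-frame correlation information introduced in this way can describe real speech phenomena more accurately. After presenting the framework of the new model, the paper…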

14.

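A Robust Speech Recognition Method Based on Discriminative Learning of Environment Features (total citations: 3, self-citations: 0, citations by others: 3)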

韩纪庆高文《电子学报》2001,29(2):196-198

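A robust speech recognition method based on discriminative learning of environment features is proposed. It first learns the environment features iteratively using a simple classifier and gradient descent, then uses them to estimate clean speech features from the observed noisy speech features, and finally passes the estimated clean features to the back-end HMM classifier. In speaker-independent small-vocabulary experiments, the system error rate fell by 33.3% compared with the baseline HMM system.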

15.

The Hidden Markov Model of co-articulation and its application to the continuous speech recognition

Lee Tranzai Zheng Fang Wu Wenhu Chen Daowen 《电子科学学刊(英文版)》2000,17(3):242-247

Co-articulation is one of the main reasons that speech recognition is difficult. However, traditional hidden Markov models (HMMs) cannot model co-articulation because they rely on the first-order assumption. To model co-articulation, this paper proposes an HMM more refined than the traditional first-order HMM, building on the authors' previous work (1997, 1998), and gives a method for using this HMM in continuous speech recognition by means of multilayer perceptrons (MLPs), i.e., the hybrid HMM/MLP method with a triple MLP structure. Experimental results show that the new hybrid HMM/MLP method reduces the error rate compared with the authors' previous work.

16.

Neural networks for statistical recognition of continuous speech (total citations: 4, self-citations: 0, citations by others: 4)

Morgan N. Bourlard H.A. 《Proceedings of the IEEE. Institute of Electrical and Electronics Engineers》1995,83(5):742-772

In recent years there has been a significant body of work, both theoretical and experimental, that has established the viability of artificial neural networks (ANN's) as a useful technology for speech recognition. It has been shown that neural networks can be used to augment speech recognizers whose underlying structure is essentially that of hidden Markov models (HMM's). In particular, we have demonstrated that fairly simple layered structures, which we lately have termed big dumb neural networks (BDNN's), can be discriminatively trained to estimate emission probabilities for an HMM. Recently, simple speech recognition systems (using context-independent phone models) based on this approach have been shown in controlled tests to be both effective in terms of accuracy (i.e., comparable to or better than equivalent state-of-the-art systems) and efficient in terms of CPU and memory run-time requirements. Research is continuing on extending these results to somewhat more complex systems. In this paper, we first give a brief overview of automatic speech recognition (ASR) and statistical pattern recognition in general. We also include a very brief review of HMM's, and then describe the use of ANN's as statistical estimators. We then review the basic principles of our hybrid HMM/ANN approach and describe some experiments. We discuss some current research topics, including new theoretical developments in training ANN's to maximize the posterior probabilities of the correct models for speech utterances. We also discuss some issues of system resources required for training and recognition. Finally, we conclude with some perspectives about fundamental limitations in the current technology and some speculations about where we can go from here.
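One common way to let a discriminatively trained network supply HMM emission scores is to convert its state posteriors P(q | x) into scaled likelihoods by dividing by the state priors P(q), since by Bayes' rule P(x | q) is proportional to P(q | x) / P(q). The abstract does not spell this step out, so the following Python sketch is an assumption about the mechanism, with hypothetical function names and a plain softmax output layer.

```python
import math
from typing import List, Sequence


def log_softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable log of the softmax of the network outputs."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def scaled_log_likelihoods(
    logits: Sequence[float], state_priors: Sequence[float]
) -> List[float]:
    """log P(q | x) - log P(q) for each HMM state q.

    The result equals log P(x | q) up to a constant that is the same for
    every state, so it can stand in for the emission score in decoding.
    """
    log_post = log_softmax(logits)
    return [lp - math.log(p) for lp, p in zip(log_post, state_priors)]
```

The state priors would typically be estimated from state frequencies in the training alignment; here they are simply passed in.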

17.

Addable Stress Speech Recognition with Multiplexing HMM: Training and Non-training Decision

Pakapong Amornkul Kosin Chamnongthai Punnarumol Temdee 《Wireless Personal Communications》2014,76(3):503-521

In stress speech recognition, a recognition model capable of processing multi-stress speech needs to be designed with both accuracy and addability in mind. This paper proposes addable stress speech recognition with a multiplexing hidden Markov model (HMM). To handle multi-stress speech, we propose a multiplexing topology that combines multiple stress speech models. Since each stress affects speech in a different way, a speech recognition model trained specifically to recognize words affected by that stress helps improve recognition rates. However, since each stress speech model gives its own independent recognized word, an effective decision module is needed to choose the correct word. In each stress speech model, MFCC is applied to the input speech. The result is fed into an HMM that is segmented into N parts. Each segment provides its own tentative recognized word, which in turn is an input to the proposed non-training decision module. Based on these tentative recognized words from the segments of all stress speech models, the final recognized word is decided using a coarse-to-fine concept performed by a majority vote, a segment-weighted difference square score, and a next best score, respectively. Besides neutral speech, the proposed method was verified using three stresses: angry, loud, and Lombard. The results showed that the proposed method achieved a 94.7% recognition rate, compared with 94.2% for the training-based decision method.
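The coarsest stage of the decision described above, the majority vote over tentative words, can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function name `majority_vote` is hypothetical, and returning `None` on a tie is only a placeholder for deferring to the finer scoring stages, whose details the abstract does not give.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(candidates: Sequence[str]) -> Optional[str]:
    """Return the word proposed most often, or None if there is no
    unique winner (an empty input or a tie for first place)."""
    counts = Counter(candidates).most_common()
    if not counts:
        return None
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: a finer-grained stage would have to decide
    return counts[0][0]
```

For example, tentative words `["up", "up", "down"]` give `"up"`, while `["up", "down"]` give `None` and would be passed on to the next stage.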