Similar Documents
Found 20 similar documents (search time: 796 ms)
1.
Hidden Markov models (HMMs) with bounded state durations (HMM/BSD) are proposed to explicitly model the state durations of HMMs and to account more accurately for the temporal structure of speech signals in a simple, direct, but effective way. A series of experiments was conducted for speaker-dependent applications using 408 highly confusable first-tone Mandarin syllables as the example vocabulary. In the discrete case, the recognition rate of HMM/BSD (78.5%) is 9.0%, 6.3%, and 1.9% higher than those of conventional HMMs and of HMMs with Poisson and gamma distributed state durations, respectively. In the continuous case (partitioned Gaussian mixture modeling), the recognition rates of HMM/BSD (88.3% with 1 mixture, 88.8% with 3 mixtures, and 89.4% with 5 mixtures) are 6.3%, 5.0%, and 5.5% higher than those of conventional HMMs; they are also 5.9% (1 mixture) and 3.9% (3 mixtures) higher than HMMs with Poisson distributed state durations, and 3.1% (1 mixture) and 1.8% (3 mixtures) higher than HMMs with gamma distributed state durations.
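Duration bounds of this kind can be enforced at decoding time with a segmental Viterbi search. The sketch below is illustrative, not the paper's algorithm: the interface (`viterbi_bounded_duration`, `d_min`, `d_max`) is hypothetical, and it simply restricts every state visit to a bounded number of frames.

```python
import numpy as np

def viterbi_bounded_duration(obs, log_pi, log_A, log_B, d_min, d_max):
    """Segmental Viterbi where every visit to state s must last between
    d_min[s] and d_max[s] frames (a sketch of the bounded-duration idea;
    names and interface are illustrative, not from the paper)."""
    T, S = len(obs), len(log_pi)
    em = log_B[:, obs]                                  # (S, T) per-frame emission log-probs
    cum = np.hstack([np.zeros((S, 1)), np.cumsum(em, axis=1)])
    score = np.full((T, S), -np.inf)                    # best log-prob of a segment ending at t in s
    back = {}
    for t in range(T):
        for s in range(S):
            for d in range(d_min[s], min(d_max[s], t + 1) + 1):
                start = t - d + 1
                seg = cum[s, t + 1] - cum[s, start]     # emission score of frames start..t
                if start == 0:
                    cand, prev = log_pi[s] + seg, None
                else:
                    prevs = score[start - 1] + log_A[:, s]
                    ps = int(np.argmax(prevs))
                    cand, prev = prevs[ps] + seg, (start - 1, ps)
                if cand > score[t, s]:
                    score[t, s], back[(t, s)] = cand, (prev, d)
    # backtrack from the best final segment
    s, t, path = int(np.argmax(score[T - 1])), T - 1, []
    while True:
        prev, d = back[(t, s)]
        path = [s] * d + path
        if prev is None:
            return path, float(score[T - 1].max())
        t, s = prev
```

On a toy two-state model whose duration bounds forbid one-frame visits, the decoder is forced into the intended two-plus-two segmentation.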

2.
We present a discriminative training algorithm that uses support vector machines (SVMs) to improve the classification of discrete and continuous output probability hidden Markov models (HMMs). The algorithm uses a set of maximum-likelihood (ML) trained HMMs as a baseline system and an SVM training scheme to rescore the results of the baseline HMMs. It turns out that the rescoring model can be represented as an unnormalized HMM. We describe two algorithms for training the unnormalized HMMs in both the discrete and continuous cases. One of the algorithms yields a single set of unnormalized HMMs that can be used in the standard recognition procedure (the Viterbi recognizer) as if they were plain HMMs. We use a toy problem and an isolated noisy digit recognition task to compare the new method to standard ML training. Our experiments show that SVM rescoring of hidden Markov models typically reduces the error rate significantly compared to standard ML training.

3.
In applying hidden Markov modeling to the recognition of speech signals, matching the energy contour of the signal to the energy contour of the model for that signal is normally achieved by appropriate normalization of each vector of the signal prior to both training and recognition. This approach, however, is not applicable when only noisy signals are available for recognition. A unified approach is developed for gain adaptation in the recognition of clean and noisy signals. In this approach, hidden Markov models (HMMs) for gain-normalized clean signals are designed using maximum-likelihood (ML) estimates of the gain contours of the clean training sequences. The models are combined with ML estimates of the gain contours of the clean test signals, obtained from the given clean or noisy signals, in performing recognition using the maximum a posteriori decision rule. The gain-adapted training and recognition algorithms are developed for HMMs with Gaussian subsources using the expectation-maximization (EM) approach.

4.
A method of integrating Gibbs distributions (GDs) into hidden Markov models (HMMs) is presented. The probabilities of the hidden state sequences of HMMs are modeled by GDs in place of the transition probabilities. GDs offer a general way of modeling the neighbor interactions of Markov random fields, of which the Markov chains in HMMs are a special case. An algorithm for estimating the model parameters is developed based on Baum reestimation, and an algorithm for computing the probability terms is developed using a lattice structure. The GD models were used for speech recognition experiments on the TI speaker-independent isolated-digit database. The observation sequences of the speech signals were modeled by mixture Gaussian autoregressive densities. The energy functions of the GDs were constructed using very few parameters and proved adequate for hidden-layer modeling. The experiments showed that the GD models performed at least as well as the HMM models.

5.
This paper reports an upper bound for the Kullback–Leibler divergence (KLD) for a general family of transient hidden Markov models (HMMs). An upper-bound KLD (UBKLD) expression for Gaussian mixture models (GMMs) is presented and generalized to the case of HMMs. Moreover, the formulation is extended to HMMs with nonemitting states, where, under some general assumptions, the UBKLD is proved to be well defined for a general family of transient models. In particular, the UBKLD has a computationally efficient closed form for HMMs with left-to-right topology and a final nonemitting state, which we refer to as left-to-right transient HMMs. Finally, the usefulness of the closed-form expression is experimentally evaluated for automatic speech recognition (ASR) applications, where left-to-right transient HMMs are used to model basic acoustic-phonetic units. Results show that the UBKLD is an accurate discrimination indicator for comparing acoustic HMMs used for ASR.
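For mixtures with equal component counts, a matched-pair upper bound on the KLD between two GMMs follows from the convexity of the KL divergence, using the closed-form KL between individual Gaussians. The univariate sketch below illustrates only this general idea, not the paper's UBKLD derivation; the function names and pairing-by-index are assumptions.

```python
import numpy as np

def gauss_kl(m1, s1, m2, s2):
    # Closed-form KL( N(m1, s1^2) || N(m2, s2^2) ) for univariate Gaussians
    return np.log(s2 / s1) + (s1 ** 2 + (m1 - m2) ** 2) / (2 * s2 ** 2) - 0.5

def gmm_kl_upper_bound(w_f, mu_f, sd_f, w_g, mu_g, sd_g):
    # Matched-pair bound for equal-size mixtures (log-sum inequality):
    #   KL(f || g) <= sum_a w_f[a] * ( log(w_f[a]/w_g[a]) + KL(f_a || g_a) )
    return sum(wf * (np.log(wf / wg) + gauss_kl(mf, sf, mg, sg))
               for wf, mf, sf, wg, mg, sg
               in zip(w_f, mu_f, sd_f, w_g, mu_g, sd_g))
```

The bound is exactly zero when the two mixtures coincide component-by-component, and strictly positive otherwise, which is what makes it usable as a discrimination indicator.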

6.
For the acoustic models of embedded speech recognition systems, hidden Markov models (HMMs) are usually quantized and the original full space distributions are represented by combinations of a few quantized distribution prototypes. We propose a maximum likelihood objective function to train the quantized distribution prototypes. The experimental results show that the new training algorithm and the link structure adaptation scheme for the quantized HMMs reduce the word recognition error rate by 20.0%.

7.
1 Introduction Many real observed data are characterized by multiple coupled causes or factors. For instance, face images may be generated by combining eyebrows, eyes, nose and mouth. Similarly, speech signals may result from an interaction of motions of factors such as the jaw, tongue, velum, lip and mouth. Recently Zemel and Hinton proposed a factorial learning architecture [1~2] to deal with factorial data. The goal of factorial learning is to discover the multiple underlying causes or factors from the observed data and find a representation that will bo…

8.
Based on global optimisation, a new genetic algorithm for training hidden Markov models (HMMs) is proposed. Speech recognition results are presented and compared with those of the classic HMM training algorithm.

9.
Lin Li. 《电子器件》 (Electronic Devices), 2020, 43(2): 466-470
A rub-impact fault detection method for mine rotating machinery based on acoustic emission source feature recognition is proposed. To enable real-time remote monitoring of mine rotating machinery with distributed, networked management, a vibration monitoring system based on an ARM embedded platform was designed. To address the drawback that Gaussian mixture models require large amounts of training data, an acoustic emission recognition method based on a fuzzy vector quantization mixture model is proposed. The method combines the advantages of fuzzy set theory, vector quantization, and Gaussian mixture models: by replacing the output probability function of the conventional Gaussian mixture model with a fuzzy vector quantization error measure, it reduces the amount of training data required for modeling and improves model accuracy and recognition speed. Experimental observation of the host computer's output verified the real-time performance and accuracy of the monitoring data, meeting the requirements for real-time condition monitoring and fault diagnosis of rotating machinery.

10.
Although the continuous hidden Markov model (CHMM) technique seems to be the most flexible and complete tool for speech modelling, it is not always used to implement speech recognition systems because of several problems related to training and computational complexity. Thus, other simpler types of HMMs, such as discrete (DHMM) or semicontinuous (SCHMM) models, are commonly utilised with very acceptable results; moreover, the superiority of continuous models over these types of HMMs is not clear. The authors' group previously introduced the multiple vector quantisation (MVQ) technique, the main feature of which is the use of one separate VQ codebook for each recognition unit. The MVQ technique applied to DHMM models generates a new HMM modelling (basic MVQ models) that allows the input-sequence information wasted by discrete models in the VQ process to be incorporated into the recognition dynamics. The authors propose a new variant of HMM models that arises from the idea of applying MVQ to SCHMM models. These are SCMVQ-HMM (semicontinuous multiple vector quantisation HMM) models, which use one VQ codebook per recognition unit and several quantisation candidates for each input vector. It is shown that SCMVQ modelling is formally the closest to CHMM, although it requires even less computation than SCHMMs. After studying several implementation issues of the MVQ technique, such as which type of probability density function should be used, the authors show the superiority of SCMVQ models over other types of HMM models such as DHMMs, SCHMMs or the basic MVQs.

11.
It is shown that reducing the set that contains 24 parameters (i.e. the base set) to one that contains only 5, 6, 7, 8 or 10 parameters using a genetic algorithm increases the recognition rate by ~4.5%. A further 0.8% increase in the recognition rate is obtained when a mixture of Gaussian densities is used instead of a single Gaussian density to represent a speaker.

12.
Linear regression for Hidden Markov Model (HMM) parameters is widely used for the adaptive training of time series pattern analysis especially for speech processing. The regression parameters are usually shared among sets of Gaussians in HMMs where the Gaussian clusters are represented by a tree. This paper realizes a fully Bayesian treatment of linear regression for HMMs considering this regression tree structure by using variational techniques. This paper analytically derives the variational lower bound of the marginalized log-likelihood of the linear regression. By using the variational lower bound as an objective function, we can algorithmically optimize the tree structure and hyper-parameters of the linear regression rather than heuristically tweaking them as tuning parameters. Experiments on large vocabulary continuous speech recognition confirm the generalizability of the proposed approach, especially when the amount of adaptation data is limited.

13.
A novel framework of an online unsupervised learning algorithm is presented to flexibly adapt existing speaker-independent hidden Markov models (HMMs) to nonstationary environments induced by varying speakers, transmission channels, ambient noises, etc. The quasi-Bayes (QB) estimate is applied to incrementally obtain the word sequence and adaptation parameters for adjusting the HMMs when a block of unlabelled data is enrolled. The underlying statistics of a nonstationary environment can be successively traced according to the newest enrolment data. To improve the QB estimate, adaptive initial hyperparameters are employed in the beginning session of online learning. These hyperparameters are estimated from a cluster of training speakers closest to the test environment. Additionally, a selection process is developed to select reliable parameters from a list of candidates for unsupervised learning, and a set of reliability assessment criteria is explored for this selection. In a series of speaker adaptation experiments, the effectiveness of the proposed method is confirmed, and it is found that using the adaptive initial hyperparameters in online learning and the multiple assessments in parameter selection can improve the recognition performance.

14.
Hidden Markov modeling of flat fading channels
Hidden Markov models (HMMs) are a powerful tool for modeling stochastic random processes. They are general enough to model a large variety of processes with high accuracy, and they are relatively simple, allowing us to compute analytically many important parameters of the process that are very difficult to calculate for other models (such as complex Gaussian processes). Another advantage of using HMMs is the existence of powerful algorithms for fitting them to experimental data and for approximating other processes. In this paper, we demonstrate that communication channel fading can be accurately modeled by HMMs, and we find closed-form solutions for the probability distribution of fade duration and the number of level crossings.
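As a minimal illustration of HMM channel modeling, a two-state Gilbert-Elliott chain already yields a closed-form (geometric) fade-duration distribution: if the fade state has self-transition probability p, the mean fade duration is 1/(1-p) steps. The simulation below checks this empirically; the transition probabilities are assumed values for illustration, not figures from the paper.

```python
import numpy as np

# Two-state Gilbert-Elliott chain: state 0 = good, state 1 = fade.
# Transition probabilities are assumed for illustration.
p_stay_fade = 0.9
P = np.array([[0.95, 0.05],
              [1.0 - p_stay_fade, p_stay_fade]])

rng = np.random.default_rng(0)
u = rng.random(200_000)
states, state = [], 0
for ui in u:
    state = int(ui < P[state, 1])   # two-state chain: move to "fade" w.p. P[state, 1]
    states.append(state)

# Collect fade-run lengths (consecutive visits to state 1).
durations, run = [], 0
for s in states:
    if s == 1:
        run += 1
    elif run:
        durations.append(run)
        run = 0

mean_fade = float(np.mean(durations))
expected = 1.0 / (1.0 - p_stay_fade)   # geometric dwell time: mean 1/(1-p) steps
```

With p = 0.9 the analytic mean fade duration is 10 steps, and the empirical mean over 200,000 simulated steps lands close to it.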

15.
Wavelet-based statistical signal processing techniques such as denoising and detection typically model the wavelet coefficients as independent or jointly Gaussian. These models are unrealistic for many real-world signals. We develop a new framework for statistical signal processing based on wavelet-domain hidden Markov models (HMMs) that concisely models the statistical dependencies and non-Gaussian statistics encountered in real-world signals. Wavelet-domain HMMs are designed with the intrinsic properties of the wavelet transform in mind and provide powerful, yet tractable, probabilistic signal models. Efficient expectation-maximization algorithms are developed for fitting the HMMs to observational signal data. The new framework is suitable for a wide range of applications, including signal estimation, detection, classification, prediction, and even synthesis. To demonstrate the utility of wavelet-domain HMMs, we develop novel algorithms for signal denoising, classification, and detection.
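The per-coefficient building block of such models is a two-state, zero-mean Gaussian mixture (a "small"/"large" hidden state per wavelet coefficient) fitted by EM. The sketch below shows only this independent-mixture ingredient, without the tree-structured state dependencies of full wavelet-domain HMMs; the function name and initialization are illustrative assumptions.

```python
import numpy as np

def fit_two_state_gsm(w, iters=50):
    """EM for a two-component zero-mean Gaussian mixture over coefficients w:
    hidden state 0 = 'small' coefficient, 1 = 'large'. (Illustrative building
    block only; wavelet-domain HMMs also link states across the wavelet tree.)"""
    w = np.asarray(w, dtype=float)
    p = np.array([0.5, 0.5])
    var = np.array([0.1 * np.var(w), 2.0 * np.var(w)])  # crude ordered init
    for _ in range(iters):
        # E-step: posterior responsibility of each hidden state per coefficient
        lik = p / np.sqrt(2 * np.pi * var) * np.exp(-w[:, None] ** 2 / (2 * var))
        r = lik / lik.sum(axis=1, keepdims=True)
        # M-step: update state priors and (zero-mean) variances
        p = r.mean(axis=0)
        var = (r * w[:, None] ** 2).sum(axis=0) / r.sum(axis=0)
    return p, var
```

On synthetic coefficients drawn from a known two-variance mixture, the EM fit recovers both the state priors and the small/large variances.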

16.
This paper presents the design of a speech recognition IC using hidden Markov models (HMMs) with continuous observation densities. Results of offline and live recognition tests are also given. Our design employs a table look-up method to simplify the computation and hence the architecture of the circuit. Currently each state of the HMMs is represented by a double-mixture Gaussian distribution. With minor modifications, the proposed architecture can be extended to implement a recognizer in which models with higher order multi-mixture Gaussian distribution are used for more precise acoustic modeling. The test chip is fabricated with a 0.35 μm CMOS technology. The maximum operating frequency is 62.5 MHz at 3.3 V. For a 50-word vocabulary, the estimated recognition time is about 0.16 s. Using noise-corrupted utterances, the recognition accuracy is 93.8% for isolated English digits. Such a performance is comparable to the software implementation with the same algorithm. Live recognition test was also run for a vocabulary of 11 Chinese words. The accuracy is 91.8% for five male and five female speakers.

17.
We develop a hidden Markov mixture model based on a Dirichlet process (DP) prior, for representation of the statistics of sequential data for which a single hidden Markov model (HMM) may not be sufficient. The DP prior has an intrinsic clustering property that encourages parameter sharing, and this naturally reveals the proper number of mixture components. The evaluation of posterior distributions for all model parameters is achieved in two ways: 1) via a rigorous Markov chain Monte Carlo method; and 2) approximately and efficiently via a variational Bayes formulation. Using DP HMM mixture models in a Bayesian setting, we propose a novel scheme for music analysis, highlighting the effectiveness of the DP HMM mixture model. Music is treated as a time-series data sequence and each music piece is represented as a mixture of HMMs. We approximate the similarity of two music pieces by computing the distance between the associated HMM mixtures. Experimental results are presented for synthesized sequential data and from classical music clips. Music similarities computed using DP HMM mixture modeling are compared to those computed from Gaussian mixture modeling, for which the mixture modeling is also performed using DP. The results show that the performance of DP HMM mixture modeling exceeds that of the DP Gaussian mixture modeling.

18.
This paper presents a complete seismic-event classification and monitoring system that has been developed based on the seismicity observed during three summer Antarctic surveys at the Deception Island Volcano, Antarctica. The system is based on state-of-the-art hidden Markov modeling (HMM) techniques successfully applied to other scenarios. A database containing a representative set of different seismic events, including volcano-tectonic earthquakes, long period (LP) events, volcanic tremor, and hybrid events recorded during the 1994-1995 and 1995-1996 seismic surveys, was collected for training and testing. Simple left-to-right HMMs and multivariate Gaussian probability density functions with a diagonal covariance matrix were used. The feature vector consists of the log-energies of a filter bank of 16 triangular weighting functions uniformly spaced between 0 and 20 Hz, plus the first- and second-order derivatives. The system is suitable for real-time operation, and its accuracy for this task is about 90%. When the system was tested with a different data set, consisting mainly of LP events registered during several seismic swarms in the 2001-2002 field survey, more than 95% of the recognized events were correctly marked by the recognition system.
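A feature front end of this kind can be sketched directly from the description: 16 triangular weighting functions uniformly spaced between 0 and 20 Hz, applied to a frame's power spectrum, followed by log-energies. The sampling rate and FFT size below are assumed values, not taken from the paper, and the derivative (delta) features are omitted.

```python
import numpy as np

def triangular_filterbank(n_filters=16, f_max=20.0, n_fft=256, fs=100.0):
    # 16 triangular weighting functions uniformly spaced on 0..20 Hz
    # (fs and n_fft are assumed values, not from the paper)
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    edges = np.linspace(0.0, f_max, n_filters + 2)
    fb = np.zeros((n_filters, freqs.size))
    for i in range(n_filters):
        lo, mid, hi = edges[i], edges[i + 1], edges[i + 2]
        rising = (freqs - lo) / (mid - lo)
        falling = (hi - freqs) / (hi - mid)
        fb[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return fb

def log_energies(frame, fb):
    # Log filter-bank energies of one frame (delta features omitted)
    spec = np.abs(np.fft.rfft(frame, n=2 * (fb.shape[1] - 1))) ** 2
    return np.log(fb @ spec + 1e-10)
```

Feeding the filter bank a pure 5 Hz tone concentrates the energy in the filters whose triangles cover 5 Hz, as expected.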

19.
Mao Xiaoquan, Hu Guangrui, Tang Bin. 《电子学报》 (Acta Electronica Sinica), 2002, 30(1): 148-150
As a tool for describing speech signals, hidden Markov models (HMMs) can be divided, according to the form of the output probability distribution, into continuous HMMs (CHMMs) and discrete HMMs (DHMMs). Although the classical Baum-Welch training algorithm converges rapidly, such hill-climbing algorithms reach only a local optimum, which limits the system's recognition rate. For CHMMs, a segmental K-means procedure can provide reliable initial points that ensure fast and accurate convergence. For DHMMs, however, this approach yields little benefit, and the result is still a local optimum. Since the most important feature of evolutionary computation is global search, it can obtain a globally optimal or near-optimal solution. This paper applies evolutionary computation to DHMM training and proposes a hybrid algorithm combining the traditional algorithm with evolutionary computation. Experimental results show that the method achieves both global search and fast convergence, and the resulting models are superior to those obtained by the traditional method and by simple evolutionary computation.
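A minimal sketch of the global-search ingredient: score a candidate discrete HMM with the scaled forward algorithm and keep row-stochastic mutations that improve the likelihood. This (1+1)-style loop is a stand-in for a full hybrid of Baum-Welch and evolutionary computation; all names and settings are assumptions.

```python
import numpy as np

def forward_loglik(obs, pi, A, B):
    # log P(obs | model) via the scaled forward algorithm for a discrete HMM
    alpha = pi * B[:, obs[0]]
    ll = np.log(alpha.sum())
    alpha = alpha / alpha.sum()
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]
        ll += np.log(alpha.sum())
        alpha = alpha / alpha.sum()
    return ll

def mutate(M, rng, scale=0.1):
    # random perturbation that keeps the matrix row-stochastic
    M = np.abs(M + rng.normal(0.0, scale, M.shape)) + 1e-12
    return M / M.sum(axis=1, keepdims=True)

rng = np.random.default_rng(1)
obs = rng.integers(0, 3, 200)                 # toy observation sequence over 3 symbols
S, V = 2, 3
pi = np.full(S, 1.0 / S)
A = np.full((S, S), 1.0 / S)
B = np.full((S, V), 1.0 / V)
init = best = forward_loglik(obs, pi, A, B)
for _ in range(300):                          # keep only improving mutations
    A2, B2 = mutate(A, rng), mutate(B, rng)
    ll = forward_loglik(obs, pi, A2, B2)
    if ll > best:
        best, A, B = ll, A2, B2
```

In a real hybrid one would alternate such global moves with Baum-Welch reestimation steps; here the loop only demonstrates that the likelihood never decreases and the parameters stay valid stochastic matrices.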

20.
The authors present a new type of hidden Markov model (HMM) for vowel-to-consonant (VC) and consonant-to-vowel (CV) transitions based on the locus theory of speech perception. The parameters of the model can be trained automatically using the Baum-Welch algorithm, and the training procedure does not require that instances of all possible CV and VC pairs be present. When incorporated into an isolated-word recognizer with a 75,000-word vocabulary, it leads to a modest improvement in recognition rates. The authors give recognition results for the state-interpolation HMM and compare them to those obtained by standard context-independent HMMs and generalized triphone models.
