Similar literature
1.
2.
In applying hidden Markov modeling for recognition of speech signals, the matching of the energy contour of the signal to the energy contour of the model for that signal is normally achieved by appropriate normalization of each vector of the signal prior to both training and recognition. This approach, however, is not applicable when only noisy signals are available for recognition. A unified approach is developed for gain adaptation in recognition of clean and noisy signals. In this approach, hidden Markov models (HMMs) for gain-normalized clean signals are designed using maximum-likelihood (ML) estimates of the gain contours of the clean training sequences. The models are combined with ML estimates of the gain contours of the clean test signals, obtained from the given clean or noisy signals, in performing recognition using the maximum a posteriori decision rule. The gain-adapted training and recognition algorithms are developed for HMMs with Gaussian subsources using the expectation-maximization (EM) approach.

3.
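Existing research shows that model-based reconstruction of compressively sampled signals achieves better reconstruction quality. This paper proposes a reconstruction method for compressively sampled images that incorporates a wavelet-domain Markov tree model. The Markov tree model closely matches the persistence across scales of an image's wavelet coefficients. This statistical property can assist atom selection in the orthogonal matching pursuit algorithm, so that atoms with large-magnitude coefficients are selected more accurately. In the proposed algorithm, the atom added at each iteration is chosen from a set of candidate atoms that match the residual signal well; among the candidates, the atom that maximizes the model's state likelihood function is selected. Experimental results show that the new algorithm selects atoms with large coefficients more accurately and that the reconstructed image quality is better than that of other conventional methods.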
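
The selection step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `select_atom`, the candidate count, and the `state_log_likelihood` list (standing in for whatever score the Markov tree model assigns to each atom) are all assumptions, since the abstract does not say how the likelihood is computed.

```python
def select_atom(correlations, state_log_likelihood, n_candidates=3, excluded=()):
    """Pick one atom index for the current pursuit iteration.

    correlations: correlation of each atom with the current residual.
    state_log_likelihood: hypothetical per-atom score from the tree model.
    excluded: indices of atoms already selected in earlier iterations.
    """
    available = [i for i in range(len(correlations)) if i not in excluded]
    # Step 1: keep the atoms that match the residual best (largest |correlation|).
    candidates = sorted(available, key=lambda i: abs(correlations[i]), reverse=True)
    candidates = candidates[:n_candidates]
    # Step 2: among those candidates, take the one the model finds most likely.
    return max(candidates, key=lambda i: state_log_likelihood[i])
```

With `n_candidates=1` the function reduces to the atom choice of plain orthogonal matching pursuit, which makes the effect of the model term easy to compare.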

4.
A new deformed shape recognition method that relies on hidden Markov models to evaluate the sequentiality of the relevant points of the shape is proposed. These points are extracted from its adaptively calculated curvature function to give stability against noise transformations and deformations. The proposed method is very fast. Comparative tests for different shapes have been successful.

5.
Hidden Markov models (HMMs) with bounded state durations (HMM/BSD) are proposed to explicitly model the state durations of HMMs and more accurately consider the temporal structures existing in speech signals in a simple, direct, but effective way. A series of experiments has been conducted for speaker-dependent applications using 408 highly confusing first-tone Mandarin syllables as the example vocabulary. It was found that in the discrete case the recognition rate of HMM/BSD (78.5%) is 9.0%, 6.3%, and 1.9% higher than those of the conventional HMMs and the HMMs with Poisson- and gamma-distributed state durations, respectively. In the continuous case (partitioned Gaussian mixture modeling), the recognition rates of HMM/BSD (88.3% with 1 mixture, 88.8% with 3 mixtures, and 89.4% with 5 mixtures) are 6.3%, 5.0%, and 5.5% higher than those of the conventional HMMs, and 5.9% (with 1 mixture), 3.9% (with 3 mixtures) and 3.1% (with 1 mixture), 1.8% (with 3 mixtures) higher than those of the HMMs with Poisson- and gamma-distributed state durations, respectively.

6.
The paper presents a hybrid of a hidden Markov model and a Markov chain model for speech recognition. In this hybrid, the hidden Markov model is concerned with the time-varying property of spectral features, while the Markov chain accounts for the interdependence of spectral features. The log-likelihood scores of the two models, with respect to a given utterance, are combined by a postprocessor to yield a combined log-likelihood score for word classification. Experiments on speaker-independent and multispeaker isolated English alphabet recognition show that the hybrid outperformed both the hidden Markov model and the Markov chain model in terms of recognition
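
As a rough illustration of the score combination mentioned above, the sketch below assumes the postprocessor is a simple weighted sum of the two log-likelihoods. The abstract does not state the actual combination rule, so the linear form, the `weight` parameter, and the name `classify_word` are assumptions.

```python
def classify_word(hmm_scores, chain_scores, weight=0.5):
    """Return the word with the largest combined log-likelihood.

    hmm_scores / chain_scores: dicts mapping each word to the log-likelihood
    of the utterance under that word's HMM / Markov chain model.
    weight: assumed interpolation weight given to the HMM score.
    """
    combined = {
        word: weight * hmm_scores[word] + (1.0 - weight) * chain_scores[word]
        for word in hmm_scores
    }
    best_word = max(combined, key=combined.get)
    return best_word, combined
```

Setting `weight=1.0` or `weight=0.0` recovers the decision of either model on its own, which is a convenient baseline when tuning the weight on held-out data.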

7.
Brookes D.M., Leung M.H. Electronics Letters, 1998, 34(19): 1827-1829
A novel procedure is presented for noise compensation in hidden Markov model speech recognisers. The procedure uses two microphone signals and, unlike previous approaches, does not require the noise spectrum to be stationary even in the short term. Results are presented showing that the performance of the compensated system equals or exceeds that obtained using matched training.

8.
Motion trajectories provide rich spatiotemporal information about an object's activity. This paper presents novel classification algorithms for recognizing object activity using object motion trajectory. In the proposed classification system, trajectories are segmented at points of change in curvature, and the subtrajectories are represented by their principal component analysis (PCA) coefficients. We first present a framework to robustly estimate the multivariate probability density function based on PCA coefficients of the subtrajectories using Gaussian mixture models (GMMs). We show that GMM-based modeling alone cannot capture the temporal relations and ordering between underlying entities. To address this issue, we use hidden Markov models (HMMs) with a data-driven design in terms of number of states and topology (e.g., left-right versus ergodic). Experiments using a database of over 5700 complex trajectories (obtained from UCI-KDD data archives and Columbia University Multimedia Group) subdivided into 85 different classes demonstrate the superiority of our proposed HMM-based scheme using PCA coefficients of subtrajectories in comparison with other techniques in the literature.

9.
10.
The authors evaluate continuous density hidden Markov models (CDHMM), dynamic time warping (DTW) and distortion-based vector quantisation (VQ) for speaker recognition, emphasising the performance of each model structure across incremental amounts of training data. Text-independent (TI) experiments are performed with VQ and CDHMMs, and text-dependent (TD) experiments are performed with DTW, VQ and CDHMMs. For TI speaker recognition, VQ performs better than an equivalent CDHMM with one training version, but is outperformed by CDHMM when trained with ten training versions. For TD experiments, DTW outperforms VQ and CDHMMs for sparse amounts of training data, but with more data the performance of each model is indistinguishable. The performance of the TD procedures is consistently superior to TI, which is attributed to subdividing the speaker recognition problem into smaller speaker-word problems. It is also shown that there is a large variation in performance across the different digits, and it is concluded that digit zero is the best digit for speaker discrimination.
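
For readers unfamiliar with distortion-based VQ, the sketch below shows the usual idea: each speaker is represented by a codebook, and a test utterance is assigned to the speaker whose codebook gives the lowest average quantisation distortion. The squared Euclidean distance, the function names, and the data layout are assumptions for illustration; the abstract does not describe the authors' features or distortion measure.

```python
def average_distortion(frames, codebook):
    """Mean squared distance from each feature frame to its nearest codeword."""
    total = 0.0
    for frame in frames:
        total += min(
            sum((a - b) ** 2 for a, b in zip(frame, codeword))
            for codeword in codebook
        )
    return total / len(frames)


def identify_speaker(frames, codebooks):
    """Return the speaker whose codebook quantises the frames with least distortion.

    codebooks: dict mapping a speaker label to a list of codeword vectors.
    """
    return min(codebooks, key=lambda name: average_distortion(frames, codebooks[name]))
```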

11.
This paper represents an ongoing investigation of dexterous and natural control of upper extremity prostheses using the myoelectric signal. The scheme described within uses a hidden Markov model (HMM) to process four channels of myoelectric signal, with the task of discriminating six classes of limb movement. The HMM-based approach is shown to be capable of higher classification accuracy than previous methods based upon multilayer perceptrons. The method does not require segmentation of the myoelectric signal data, allowing a continuous stream of class decisions to be delivered to a prosthetic device. Because the classifier learns the muscle activation patterns for each desired class for each individual, a natural control actuation results. The continuous decision stream allows complex sequences of manipulation involving multiple joints to be performed without interruption. The computational complexity of the HMM in its operational mode is low, making it suitable for a real-time implementation. The low computational overhead associated with training the HMM also enables the possibility of adaptive classifier training while in use.

12.
The techniques used to develop an acoustic-phonetic hidden Markov model, the problems associated with representing the whole acoustic-phonetic structure, the characteristics of the model, and how it performs as a phonetic decoder for recognition of fluent speech are discussed. The continuous variable duration model was trained using 450 sentences of fluent speech, each of which was spoken by a single speaker, and segmented and labeled using a fixed number of phonemes, each of which has a direct correspondence to the states of the matrix. The inherent variability of each phoneme is modeled as the observable random process of the Markov chain, while the phonotactic model of the unobservable phonetic sequence is represented by the state transition matrix of the hidden Markov model. The model assumes that the observed spectral data were generated by a Gaussian source. However, an analysis of the data shows that the spectra for most of the phonemes are not normally distributed and that an alternative representation would be beneficial.

13.
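Synchronous speech recognition systems are developing toward continuous human-computer interaction. Conventional systems are susceptible to sudden noise, which degrades recognition, so a continuous synchronous speech recognition system based on the hidden Markov model is proposed. The overall hardware structure of the system is designed in line with the principles of speech recognition. A low-pass filter built on the OPA604, a JFET-input high-fidelity operational amplifier, ensures the validity of the signal processing results. The processed signal is stored by an OMAP5912ZZG chip, the audio is buffered using vector graphics, and the relevant speech recognition sequences are ported through an Ethernet interface, thereby achieving continuous synchronous speech recognition. Comparative experiments show that the peak recognition performance of the system is 48% higher than that of the conventional system, which advances research on speech recognition technology.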

14.
A new deformed shape recognition method based on hidden Markov models (HMMs), which is highly resistant to transformations and non-rigid deformations, is presented. Since shape features are not referred to an absolute point, the method is also resistant to severe shape distortions. The method has been successfully tested using different databases.

15.
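To use speech recognition technology to operate wirelessly controlled equipment, a device is designed that uses voice commands to wirelessly switch equipment (such as incandescent lamps) on and off. The device uses the LD3320 as the speech data acquisition and processing chip, an STC12C5A60S2 microcontroller as the controller of the speech sampling and comparison module, and an STC15F104E microcontroller as the receiving and control microcontroller; the HC-12 wireless communication module transmits and receives the data signals. The results show that the device performs well in both speech recognition and wireless transmission, with a recognition rate of about 97%, and that it achieves voice-controlled switching of the lamp.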

16.
李楠 (Li Nan), 姬光荣 (Ji Guangrong). 《现代电子技术》 (Modern Electronics Technique), 2012, 35(8): 54-56, 60
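To study in more detail the application of hidden Markov models in image recognition, this paper takes fingerprint recognition as an example and surveys several fingerprint image recognition algorithms based on hidden Markov models, including the one-dimensional hidden Markov model, the pseudo two-dimensional hidden Markov model, the two-dimensional model, and the group of one-dimensional models. The advantages and disadvantages of these four hidden Markov models in image recognition are summarized in terms of time complexity, recognition accuracy, and other aspects, leading to conclusions about which recognition model suits which kind of image to be recognized.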

17.
Kim H.R., Lee H.S. Electronics Letters, 1991, 27(18): 1633-1635
A modified corrective training method using state segment information in the hidden Markov model is presented. The proposed algorithm is shown to result in a higher recognition rate than the conventional corrective training method and requires less computation.

18.
The authors demonstrate the effectiveness of phonemic hidden Markov models with Gaussian mixture output densities (mixture HMMs) for speaker-dependent large-vocabulary word recognition. Speech recognition experiments show that for almost any reasonable amount of training data, recognizers using mixture HMMs consistently outperform those employing unimodal Gaussian HMMs. With a sufficiently large training set (e.g. more than 2500 words), use of HMMs with 25-component mixture distributions typically reduces recognition errors by about 40%. It is also found that the mixture HMMs outperform a set of unimodal generalized triphone models having the same number of parameters. Previous attempts to employ mixture HMMs for speech recognition proved discouraging because of the high complexity and computational cost in implementing the Baum-Welch training algorithm. It is shown how mixture HMMs can be implemented very simply in unimodal transition-based frameworks by allowing multiple transitions from one state to another.
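
To make the difference between a unimodal Gaussian output density and a Gaussian mixture output density concrete, here is a small sketch for scalar observations. It is only an illustration of the general definition, evaluated with the log-sum-exp trick for numerical stability; the function names are hypothetical and the scalar, fixed-variance setup is an assumption, not the authors' multivariate system.

```python
import math


def log_gaussian(x, mean, var):
    """Log density of a scalar Gaussian with the given mean and variance."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_mixture(x, weights, means, variances):
    """Log density of a Gaussian mixture, computed with log-sum-exp."""
    terms = [
        math.log(w) + log_gaussian(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))
```

A single-component mixture with weight 1.0 gives the same value as `log_gaussian`, while a two-component mixture can assign high likelihood to observations near either mode, which a single Gaussian cannot do.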

19.
A novel framework of an online unsupervised learning algorithm is presented to flexibly adapt the existing speaker-independent hidden Markov models (HMMs) to nonstationary environments induced by varying speakers, transmission channels, ambient noises, etc. The quasi-Bayes (QB) estimate is applied to incrementally obtain word sequence and adaptation parameters for adjusting HMMs when a block of unlabelled data is enrolled. The underlying statistics of a nonstationary environment can be successively traced according to the newest enrolment data. To improve the QB estimate, the adaptive initial hyperparameters are employed in the beginning session of online learning. These hyperparameters are estimated from a cluster of training speakers closest to the test environment. Additionally, a selection process is developed to select reliable parameters from a list of candidates for unsupervised learning. A set of reliability assessment criteria is explored for selection. In a series of speaker adaptation experiments, the effectiveness of the proposed method is confirmed and it is found that using the adaptive initial hyperparameters in online learning and the multiple assessments in parameter selection can improve the recognition performance.

20.
It is demonstrated how the hidden Markov model (HMM) frequency tracker can be extended by the addition of amplitude and phase information. The HMM tracker as originally formulated uses a gate of spectral bins from fast Fourier transform (FFT) processing, and associates each cell with a state of the hidden Markov chain. A measurement sequence based on the output of a simple threshold detector forms the input to the HMM tracker. Two extensions to the original tracker are proposed. The first, the HMM/A tracker, incorporates the FFT amplitudes in the cells of the measurement sequence. The second, the HMM/AP tracker, does not use a measurement sequence, but uses instead the FFT amplitude and phase values in all cells within the gate. A comparison of the results obtained in using the three HMM-based trackers with simulated data reveals that the extended trackers outperform the original. An analysis of the effect of parameter mismatch for the three trackers is presented. Their use as detectors is also discussed.
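
The idea of treating each FFT bin in the gate as a hidden state can be sketched with a small Viterbi decoder. This is a simplified illustration under stated assumptions, not any of the three trackers from the abstract: the per-frame, per-bin log emission scores are taken as given, the frequency may move by at most one bin per frame, the prior over bins is uniform, and the transition mass lost at the gate edges is ignored. The name `viterbi_track` and the `stay_prob` parameter are hypothetical.

```python
import math


def viterbi_track(log_emission, stay_prob=0.8):
    """Most likely sequence of bin indices given per-frame log emission scores.

    log_emission[t][j]: assumed log-likelihood of frame t if the tone is in bin j.
    stay_prob: assumed probability that the tone stays in the same bin.
    """
    n_bins = len(log_emission[0])
    stay = math.log(stay_prob)
    move = math.log((1.0 - stay_prob) / 2.0)  # one bin up or one bin down
    score = list(log_emission[0])  # uniform prior, constant term dropped
    back = []
    for frame in log_emission[1:]:
        new_score, pointers = [], []
        for j in range(n_bins):
            # Only the same bin and its immediate neighbours can precede bin j.
            candidates = [
                (score[i] + (stay if i == j else move), i)
                for i in (j - 1, j, j + 1)
                if 0 <= i < n_bins
            ]
            best, arg = max(candidates)
            new_score.append(best + frame[j])
            pointers.append(arg)
        score = new_score
        back.append(pointers)
    # Trace the best path backwards from the best final bin.
    state = max(range(n_bins), key=lambda j: score[j])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    return path[::-1]
```

In this sketch the emission scores could come from thresholded detections or from FFT amplitudes; which one is used is the kind of design choice that distinguishes the tracker variants described in the abstract.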
