
1.

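徐向华, 朱杰, 郭强. 《信号处理》, 2004, 20(5): 497-500

Considering the monosyllabic structure of Mandarin speech and the coarticulation that occurs between syllables, this paper proposes a hierarchical clustering method for triphone models. Compared with the traditional decision-tree-based state clustering algorithm, the method clusters rare triphone models so as to make fuller use of the training data and reduce the influence of rare triphones on state clustering, thereby improving the robustness of the acoustic model. Experimental results show that a large-vocabulary continuous speech recognizer using this hierarchical clustering method not only greatly reduces the number of models and parameters but also improves recognition accuracy somewhat; the error rate is reduced by 4.93% relative to a traditional decision-tree state clustering system.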

2.

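Research on an HMM-based continuous small-vocabulary speech recognition system

高建. 《现代电子技术》, 2011, 34(11): 205-207

To improve recognition efficiency and reduce dependence on the environment, the paper analyzes and improves both the recognition algorithm and the hardware. An ARM S3C2410 microprocessor serves as the main control module and a UDA1314TS audio processing chip as the speech recognition module; HMM acoustic models and the Viterbi algorithm are used for pattern training and recognition, yielding a continuous, small-vocabulary speech recognition system. Experiments show that the system achieves a high recognition rate and a degree of robustness, with recognition rates of 95.6% in the laboratory and 92.3% outdoors.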
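The abstract above names the Viterbi algorithm but gives no details. As a rough illustration only, the following is a minimal sketch of Viterbi decoding for a discrete-observation HMM, working in the log domain. The function name `viterbi`, the state labels, the symbols and all probabilities are hypothetical and are not taken from the paper.

```python
import math

def _log(p):
    # Log of a probability, with log(0) mapped to -infinity.
    return math.log(p) if p > 0 else float("-inf")

def viterbi(obs, states, start, trans, emit):
    """Return (best_log_prob, best_state_path) for a discrete-observation HMM."""
    # delta[s]: best log-probability of any state path ending in s at the current frame.
    delta = {s: _log(start[s]) + _log(emit[s][obs[0]]) for s in states}
    backpointers = []
    for o in obs[1:]:
        new_delta, pointers = {}, {}
        for s in states:
            # Best predecessor of state s for this frame.
            prev = max(states, key=lambda p: delta[p] + _log(trans[p][s]))
            new_delta[s] = delta[prev] + _log(trans[prev][s]) + _log(emit[s][o])
            pointers[s] = prev
        delta = new_delta
        backpointers.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: delta[s])
    path = [last]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return delta[last], path

# Toy two-state model with made-up probabilities (assumption, for illustration).
states = ["s1", "s2"]
start = {"s1": 0.6, "s2": 0.4}
trans = {"s1": {"s1": 0.7, "s2": 0.3}, "s2": {"s1": 0.4, "s2": 0.6}}
emit = {"s1": {"a": 0.5, "b": 0.4, "c": 0.1},
        "s2": {"a": 0.1, "b": 0.3, "c": 0.6}}

best_log_prob, best_path = viterbi(["a", "b", "c"], states, start, trans, emit)
print(best_path, math.exp(best_log_prob))
```

A real recognizer would use continuous emission densities over acoustic feature vectors rather than a symbol table; the dynamic-programming recursion stays the same.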

3.

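High-performance Mandarin digit-string speech recognition (Total citations: 9; self-citations: 0; citations by others: 9)

李虎生, 刘加, 刘润生. 《电子学报》, 2001, 29(5): 595-599

This paper presents a high-performance speaker-independent continuous speech recognition system for Mandarin digit strings. Its acoustic model is based on Mel cepstral coefficients and continuous HMMs; recognition uses a multi-candidate frame-synchronous search algorithm, and training uses the MCE algorithm to improve the system's discriminative ability. Experiments show recognition rates of 94.8% (variable-length digit strings) and 96.8% (fixed-length digit strings). To make the system more practical, the paper also studies a speaker adaptation algorithm based on the MAP algorithm and a rejection algorithm based on confidence measures. After adaptation, the error rate falls by more than 40% in relative terms; when 5% of the correct utterances are rejected, the recognition rate rises to 96.9% (variable-length digit strings) and 98.7% (fixed-length digit strings).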

4.

Isolated Mandarin syllable recognition using segmental features

Chang S., Chen S.-H. 《IEE Proceedings - Vision, Image and Signal Processing》, 1995, 142(1): 59-64

A segment-based speech recognition scheme is proposed. The basic idea is to model explicitly the correlation among successive frames of speech signals by using features representing contours of spectral parameters. The speech signal of an utterance is regarded as a template formed by directly concatenating a sequence of acoustic segments. Each constituent acoustic segment is of variable length in nature and represented by a fixed dimensional feature vector formed by coefficients of discrete orthonormal polynomial expansions for approximating its spectral parameter contours. In the training, an automatic algorithm is proposed to generate several segment-based reference templates for each syllable class. In the testing, a frame-based dynamic programming procedure is employed to calculate the matching score of comparing the test utterance with each reference template. Performance of the proposed scheme was examined by simulations on multi-speaker speech recognition for 408 highly confusing isolated Mandarin base-syllables. A recognition rate of 81.1% was achieved for the case using 5-segment, 8-reference template models with cepstral and delta-cepstral coefficients as the recognition features. It is 4.5% higher than that of a well-modelled 12-state, 5-mixture CHMM method using cepstral, delta cepstral, and delta-delta cepstral coefficients.
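To illustrate how a variable-length segment can be mapped to a fixed-dimensional vector by a discrete orthonormal polynomial expansion, here is a minimal sketch for a single parameter contour. The basis is built by Gram-Schmidt orthonormalization of sampled monomials; the function names, the choice of sampling points in [0, 1] and the default order are assumptions, and the paper's actual basis and normalization may differ.

```python
import math

def orthonormal_basis(n_frames, order):
    """Orthonormalize 1, t, t^2, ... sampled at n_frames points in [0, 1].

    Requires n_frames > order and n_frames >= 2.
    """
    ts = [i / (n_frames - 1) for i in range(n_frames)]
    basis = []
    for k in range(order + 1):
        vec = [t ** k for t in ts]
        # Remove components along the basis vectors found so far (Gram-Schmidt).
        for b in basis:
            proj = sum(v * bi for v, bi in zip(vec, b))
            vec = [v - proj * bi for v, bi in zip(vec, b)]
        norm = math.sqrt(sum(v * v for v in vec))
        basis.append([v / norm for v in vec])
    return basis

def segment_features(contour, order=2):
    """Project a contour of any length onto order + 1 orthonormal polynomials."""
    basis = orthonormal_basis(len(contour), order)
    return [sum(x * b for x, b in zip(contour, row)) for row in basis]

def reconstruct(coeffs, n_frames):
    """Rebuild an n_frames-long approximation of the contour from its coefficients."""
    basis = orthonormal_basis(n_frames, len(coeffs) - 1)
    return [sum(c * row[i] for c, row in zip(coeffs, basis)) for i in range(n_frames)]

# Two segments of different lengths both yield 3 coefficients.
short_segment = [1.0 + 2.0 * i / 6 for i in range(7)]
long_segment = [0.5 * i for i in range(12)]
print(len(segment_features(short_segment)), len(segment_features(long_segment)))
```

With an orthonormal basis the coefficients are plain inner products, so no linear system has to be solved per segment. Note that coefficients computed this way scale with segment length; a practical system would probably normalize them.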

5.

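A study of context-dependent recognition units (triphones) in Mandarin continuous speech recognition (Total citations: 1; self-citations: 0; citations by others: 1)

赵庆卫, 王作英, 陆大. 《电子学报》, 1999, 27(6): 79-82, 117

This paper studies in detail how to build context-dependent recognition units effectively in Mandarin speech recognition, in order to address coarticulation in continuous speech.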

6.

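An adaptation algorithm for Mandarin digit speech recognition (Total citations: 4; self-citations: 0; citations by others: 4)

李虎生, 杨明杰. 《电路与系统学报》, 1999, 4(2): 1-6

Speaker adaptation is one of the effective ways to improve the performance of speaker-independent speech recognition. This paper applies the MAP algorithm to Mandarin digit speech recognition and discusses several methods for speeding up adaptation, as well as the effect of adaptation on speakers other than the one adapted to. Experiments show that the MAP algorithm effectively reduces the Mandarin digit recognition error rate for the adapted speaker while having very little effect on performance for non-adapted speakers.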
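The abstract does not spell out the MAP update it uses. As a hedged illustration, the sketch below shows the textbook MAP estimate of a Gaussian mean under a conjugate prior, which interpolates between the speaker-independent mean and the adaptation data. The function name `map_adapt_mean` and the prior weight `tau` are assumptions, not names from the paper.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP estimate of a Gaussian mean vector.

    prior_mean: speaker-independent mean (list of floats).
    frames: adaptation feature vectors assigned to this Gaussian.
    tau: prior weight; a larger tau trusts the prior mean more (assumed value).
    """
    n = len(frames)
    sums = [sum(f[d] for f in frames) for d in range(len(prior_mean))]
    # Weighted average of the prior mean and the adaptation-data sum.
    return [(tau * m + s) / (tau + n) for m, s in zip(prior_mean, sums)]

# With no data the prior is returned unchanged; with more data the
# estimate moves toward the sample mean of the new speaker.
print(map_adapt_mean([0.0, 2.0], []))
print(map_adapt_mean([0.0, 2.0], [[1.0, 2.0]] * 10))
```

Because Gaussians that receive no adaptation frames keep their prior means, an update of this form changes only the parts of the model that the new speaker's data actually reach.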

7.

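A speech recognition system based on an improved efficient dynamic time warping algorithm

王新胜, 巩捷甫, 喻明艳. 《太赫兹科学与电子信息学报》, 2015, 13(6): 942-946

Dynamic time warping (DTW) is a nonlinear alignment algorithm that combines time warping with distance-measure computation, and it has important applications in template matching for speech recognition. This paper proposes an improved efficient DTW algorithm that speeds up the search for the warping path. Speech recognition systems based on the hidden Markov algorithm, the efficient DTW algorithm and the improved efficient DTW algorithm were implemented in Matlab, and simulation experiments were carried out. The results show that training with the improved efficient DTW algorithm is much faster than with the hidden Markov or efficient DTW algorithms, with only a small drop in recognition rate: for small-vocabulary non-continuous speech recognition, the recognition rate is 97.56% for the efficient DTW algorithm, 97.14% for the hidden Markov algorithm and 96.43% for the improved efficient DTW algorithm.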
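For readers unfamiliar with DTW-based template matching, the following is a minimal sketch of the basic, unoptimized DTW recursion and a nearest-template classifier. It is not the efficient or improved variant described in the abstract, whose details are not given here. The function names, the scalar features and the template labels are hypothetical.

```python
def dtw_distance(a, b, dist=lambda x, y: abs(x - y)):
    """Accumulated cost of the best monotonic alignment between sequences a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j]: best accumulated cost aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            # Allowed steps: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]

def recognize(utterance, templates):
    """Return the label of the template with the smallest DTW distance."""
    return min(templates, key=lambda label: dtw_distance(utterance, templates[label]))

# Toy one-dimensional "feature" sequences (made up for illustration).
templates = {"yi": [1, 3, 5, 3, 1], "er": [5, 3, 1, 3, 5]}
print(recognize([1, 1, 3, 5, 5, 3, 1], templates))
```

The full table costs O(nm) time per comparison; faster variants typically restrict the search to a band around the diagonal or prune unpromising paths.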

8.

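A study of front-end robustness in continuous speech recognition

胡丹, 曾庆宁, 龙超, 黄桂敏. 《电视技术》, 2015, 39(24): 43-46

To address the low recognition rate in large-vocabulary continuous speech recognition, the paper proposes cascading speech enhancement at the front end of the recognition system. In the enhancement stage, spectral subtraction and the log minimum mean-square error algorithm (logmmse) are each combined with the minima-controlled recursive averaging algorithm (imcra) used for noise estimation. The recognition system extracts features with Mel-frequency cepstral coefficients (MFCC) and uses hidden Markov models (HMM) for training and recognition. Experimental results show that the proposed method raises the word recognition rate by up to 38.9% and the sentence accuracy by up to 21.8%. The method is feasible and effective for large-vocabulary continuous speech recognition.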

9.

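An embedded isolated-word recognition system based on continuous recognition

冷冰涛, 梁维谦, 董保帅, 原道德. 《电声技术》, 2011, 35(11): 42-45

In an isolated-word recognition system based on a linear network, recognition time is proportional to vocabulary size, and recognition performance is severely limited by vocabulary size. Based on the characteristics of Mandarin isolated words, a large-vocabulary isolated-word recognition system based on continuous recognition is proposed. Focusing on speed and memory consumption, the main concerns of embedded recognition, the system first performs continuous recognition using multi-level search and fixed-point strategies, then applies phonetic-to-character conversion to the continuous recognition results, thereby applying continuous speech recognition to large-vocabulary isolated...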

10.

Integration of phonetic and prosodic information for robust utterance verification

Wu C.-H., Chen Y.-J., Yan G.-L. 《IEE Proceedings - Vision, Image and Signal Processing》, 2000, 147(1): 55-61

Mandarin speech is known for its tonal characteristic, and prosodic information plays an important role in Mandarin speech recognition. Driven by this property, phonetic and prosodic information are integrated and used for Mandarin telephone speech keyword spotting. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 132 subsyllable models, two general acoustic filler models and one background/silence model are separately trained and used as the basic recognition units. For utterance verification, 12 anti-subsyllable models, 175 context-dependent prosodic models and five anti-prosodic models are constructed. A keyword verification function combining phonetic-phase and prosodic-phase verification is investigated. Using a test set of 3088 conversational speech utterances from 33 speakers (20 males and 13 females) and a vocabulary of 2583 faculty names, at 8.5% false rejection, the proposed verification method results in an 18.3% false alarm rate. Furthermore, this method is able to correctly reject 90.9% of non-keywords. Comparison with a baseline system without prosodic-phase verification shows that prosodic information can benefit the verification performance.

11.

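Improving speech feature extraction performance in Mandarin digit speech recognition (Total citations: 3; self-citations: 0; citations by others: 3)

顾良, 刘润生. 《电路与系统学报》, 1997, 2(4): 1-6

Mandarin digit speech recognition suffers from three kinds of speech confusion related to the performance of speech feature extraction.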

12.

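Some scientific problems facing Mandarin speech recognition research (Total citations: 12; self-citations: 0; citations by others: 12)

杜利民, 侯自强. 《电子学报》, 1995, 23(10): 110-116, 61

This paper outlines the scientific problems that must be solved for automatic Mandarin speech recognition to move from laboratory technology to practical commercial technology. It lists the structural characteristics and rules of Mandarin speech coding and emphasizes that (1) language models at the level of the initials and finals of Mandarin syllables are very helpful for speech recognition and also provide useful knowledge about written and spoken language; and (2) using distinctive guiding features and descriptive uniform features helps speed up the recognition search, reduce mismatch and improve the fine-grained classification of allophones. The paper also focuses on the important problems of, and approaches to, improving recognition robustness at the acoustic processing stage of the speech signal. It further proposes training and adaptation methods that move gradually from annotation-based learning to prompted guessing, for use in Mandarin large-vocabulary continuous speech recognition.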

13.

Segmental probability distribution model approach for isolated Mandarin syllable recognition

Shen J.-L. 《IEE Proceedings - Vision, Image and Signal Processing》, 1998, 145(6): 384-390

A segmental probability distribution model (SPDM) approach is proposed for fast and accurate recognition of isolated Mandarin syllables. Instead of the conventional frame-based approach such as the hidden Markov model (HMM), the model matching process in the proposed SPDM is evaluated segment-by-segment based on information-theoretic distance measurements. The training and recognition procedures for the SPDM are developed first. Several distance measurement criteria, including the Chernoff distance, Bhattacharyya distance, Patrick-Fisher (1969) distance, divergence and a Bayesian-like distance, are used, and formulations and comparative results are discussed. Experimental results show that, compared to the widely used sub-unit based continuous density HMM, the proposed method leads to an improvement of 15.27% in the error rate, with a 12-fold increase in recognition speed and less than three quarters of the mixture requirements.
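As one concrete instance of the distance measures listed in the abstract, the sketch below computes the Bhattacharyya distance between two univariate Gaussians using the standard closed form. How the paper applies such distances to segments is not stated here, so the function name and the one-dimensional setting are assumptions made for illustration.

```python
import math

def bhattacharyya_gaussian(mu1, var1, mu2, var2):
    """Bhattacharyya distance between N(mu1, var1) and N(mu2, var2); variances must be > 0."""
    avg_var = 0.5 * (var1 + var2)
    # Term driven by the separation of the means.
    mean_term = 0.125 * (mu1 - mu2) ** 2 / avg_var
    # Term driven by the mismatch of the variances (zero when var1 == var2).
    var_term = 0.5 * math.log(avg_var / math.sqrt(var1 * var2))
    return mean_term + var_term

print(bhattacharyya_gaussian(0.0, 1.0, 0.0, 1.0))
print(bhattacharyya_gaussian(0.0, 1.0, 2.0, 1.0))
```

The distance is symmetric in its two arguments and is zero only when the two distributions coincide, which makes it usable as a model-matching score.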

14.

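Mandarin isolated-digit speech recognition with a multilayer feedforward perceptron

钟林, 刘加, 刘润生. 《电路与系统学报》, 2000, 5(2): 82-86

From a template-matching perspective, this paper studies the application of the multilayer feedforward perceptron (MLP) to Mandarin isolated-digit speech recognition. It proposes a new training method for the case of a limited number of training samples and examines speech solidification (语音固化), feature extraction, and learning algorithms and strategies. Recognition rates of 95.7% and 93.0% (without rejection) were achieved for speaker-dependent and speaker-independent Mandarin isolated-digit speech recognition, respectively.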

15.

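Implementation of a CHMM-based speech recognition simulation system

李浩亮, 靳双燕, 贾伟伟. 《电声技术》, 2013, (12): 75-78

The paper presents a speaker-independent isolated-word speech recognition simulation system based on hidden Markov models (HMM) with continuous M-component Gaussian mixture densities. By studying how the number of model states, the training time and the choice of feature parameters affect the recognition rate, it finds that with 4 HMM states, 20 training iterations and a 48-dimensional combined LPCC and MFCC feature vector, the system achieves a 90% recognition rate on Mandarin isolated words.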
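To make the phrase "continuous M-component Gaussian mixture density" concrete, here is a minimal sketch of the log-likelihood that one HMM state assigns to a feature vector under a mixture of diagonal-covariance Gaussians. The diagonal-covariance assumption, the function name and the toy parameters are not from the paper.

```python
import math

def gmm_log_likelihood(x, weights, means, variances):
    """Log density of vector x under a diagonal-covariance Gaussian mixture."""
    log_terms = []
    for w, mu, var in zip(weights, means, variances):
        # Log density of one diagonal Gaussian component, summed over dimensions.
        log_gauss = sum(
            -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            for xi, m, v in zip(x, mu, var)
        )
        log_terms.append(math.log(w) + log_gauss)
    # Log-sum-exp over components for numerical stability.
    peak = max(log_terms)
    return peak + math.log(sum(math.exp(t - peak) for t in log_terms))

# Toy 2-component mixture in 2 dimensions (made-up parameters).
weights = [0.5, 0.5]
means = [[0.0, 0.0], [3.0, 3.0]]
variances = [[1.0, 1.0], [1.0, 1.0]]
print(gmm_log_likelihood([0.1, -0.2], weights, means, variances))
```

In an HMM recognizer this value plays the role of the emission score of a state for one frame, and the number of components M trades modeling power against the amount of training data needed.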

16.

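Uyghur continuous phoneme recognition based on a telephone speech corpus

米日古力·阿布都热素, 艾克白尔·帕塔尔, 艾斯卡尔·艾木都拉. 《通信技术》, 2012, 45(7): 54-56

Combining the phonetic characteristics and semantic information of Uyghur, and building on a large telephone speech corpus, this work aims to establish a Uyghur continuous phoneme recognition platform and implements a Uyghur continuous phoneme recognition algorithm using the Hidden Markov Model Toolkit (HTK). First, a fairly large telephone speech corpus was recorded and annotated according to specific technical specifications. With the phoneme chosen as the basic unit, an HMM (Hidden Markov Model) acoustic model was trained for each phoneme; input speech was then recognized, and recognition results were obtained with different numbers of Gaussian mixtures in the acoustic models. The recognition rates of 32 phonemes were tabulated and analyzed, laying a foundation for further improving the recognition rate.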

17.

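A VC++ implementation of a speech recognition algorithm

乔兵, 吴庆林, 阴玉梅. 《光机电信息》, 2011, 28(4): 50-55

As speech recognition algorithms continue to develop, their recognition rates keep rising and are gradually reaching a level suitable for practical use. This paper implements a speech recognition algorithm in VC++ and tests its recognition capability. The results show that the implemented algorithm has a high success rate, above 95% for short words and above 90% for long words, with a recognition delay under 50 ms; its efficiency is high enough to meet application needs.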

18.

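A noise-robust front-end algorithm for Mandarin speech recognition and its performance analysis

林建臻, 孙甲松, 王作英. 《电声技术》, 2004, (3): 45-48, 52

The paper discusses the noise-robust front-end feature extraction algorithm for distributed speech recognition systems proposed by the European Telecommunications Standards Institute (ETSI), which combines several noise-robustness techniques. Taking the characteristics of Mandarin speech into account, the algorithm is implemented within an overall Mandarin speech recognition framework, and experiments and analysis are carried out. Recognition results in typical noise environments show that robustness is considerably improved relative to the baseline MFCC feature extraction algorithm.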

19.

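A study of a polynomial-fitting speech trajectory model in Mandarin continuous speech recognition

欧智坚, 王作英. 《电子学报》, 2003, 31(4): 608-611

Although it is currently the most popular model for speech recognition, the HMM assumes that state outputs are independent and identically distributed and therefore ignores the dynamic characteristics of speech trajectories. Building on a more flexible statistical framework for describing speech, the generalized DDBHMM, this paper proposes a concrete polynomial-fitting speech trajectory model together with new training and recognition algorithms, which better capture the characteristics of real speech. The paper also gives an effective pruning algorithm, yielding a practical model. Experiments on Mandarin large-vocabulary speaker-independent continuous speech recognition show that the pruned polynomial-fitting speech trajectory model clearly improves the performance of the recognition system at a modest computational cost.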

20.

Isolated-utterance speech recognition using hidden Markov models with bounded state durations

Hung-Yan Gu, Chiu-Yu Tseng, Lin-Shan Lee. 《IEEE Transactions on Signal Processing》, 1991, 39(8): 1743-1752

Hidden Markov models (HMMs) with bounded state durations (HMM/BSD) are proposed to explicitly model the state durations of HMMs and more accurately consider the temporal structures existing in speech signals in a simple, direct, but effective way. A series of experiments have been conducted for speaker dependent applications using 408 highly confusing first-tone Mandarin syllables as the example vocabulary. It was found that in the discrete case the recognition rate of HMM/BSD (78.5%) is 9.0%, 6.3%, and 1.9% higher than the conventional HMMs and HMMs with Poisson and gamma distribution state durations, respectively. In the continuous case (partitioned Gaussian mixture modeling), the recognition rates of HMM/BSD (88.3% with 1 mixture, 88.8% with 3 mixtures, and 89.4% with 5 mixtures) are 6.3%, 5.0%, and 5.5% higher than those of the conventional HMMs, and 5.9% (with 1 mixture), 3.9% (with 3 mixtures) and 3.1% (with 1 mixture), 1.8% (with 3 mixtures) higher than HMMs with Poisson and gamma distributed state durations, respectively.