
Adaptive audio-visual information fusion for noise-robust speech recognition

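Cite this article:	LIANG Bing, CHEN De-yun, CHENG Hui. Adaptive audio-visual information fusion for noise-robust speech recognition[J]. Control Theory & Applications (控制理论与应用), 2011, 28(10): 1461-1466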

Authors:	LIANG Bing, CHEN De-yun, CHENG Hui

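Affiliations:	1. School of Innovation Experiment, Dalian University of Technology, Dalian, Liaoning 116024; 2. College of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080; 3. College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001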

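Funding:	National Natural Science Foundation of China (60572153); Heilongjiang Postdoctoral Foundation (LBH–Z09102); Youth Science Research Foundation of Harbin University of Science and Technology (2009YF015); Fundamental Research Funds for the Central Universities (DUT11RC(3)54).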

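Abstract:	To improve the accuracy and robustness of speech recognition in noisy environments, a noise-robust speech recognition method based on adaptive audio-visual information fusion is proposed. The acoustic and visual information carry varying weights during recognition, and these weights adapt dynamically to the signal-to-noise ratio (SNR) of the environment input. Based on the SNR and the fed-back recognition performance, a learning automaton computes the optimal weight of the visual information. Based on the feature vectors of the audio-visual information, hidden Markov models perform pattern matching, and the decisions of the visual and acoustic hidden Markov models are combined according to the optimal weight to obtain the final recognition result. Experimental results show that, at all tested noise levels, audio-visual fusion with adaptive weights gives better speech recognition performance than fusion with fixed weights.
Keywords:	audio-visual information fusion; speech recognition; adaptive weights; learning automata; hidden Markov model
Received:	2010-05-02
Revised:	2010-11-15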
Adaptive fusion of acoustic and visual information in noise-robust speech recognition

LIANG Bing, CHEN De-yun, CHENG Hui. Adaptive fusion of acoustic and visual information in noise-robust speech recognition[J]. Control Theory & Applications, 2011, 28(10): 1461-1466

Authors:	LIANG Bing, CHEN De-yun, CHENG Hui

Affiliation:	School of Innovation Experiment, Dalian University of Technology; College of Computer Science and Technology, Harbin University of Science and Technology; College of Computer Science and Technology, Harbin Engineering University

Abstract:	We propose an adaptive fusion of acoustic and visual information to improve the accuracy and robustness of speech recognition in noisy environments. The acoustic and visual information enter the recognition process with varying weights, which adapt dynamically to the signal-to-noise ratio (SNR) of the environment input. Based on the SNR and the recognition-performance feedback, a learning automaton computes the optimal weight for the visual information. Hidden Markov models match the patterns of the acoustic and visual feature vectors, and the decisions of the acoustic and visual hidden Markov models are combined with the optimal weights to obtain the final recognition result. Experiments under various noise levels show that speech recognition based on adaptive weights outperforms speech recognition based on fixed weights.

Keywords:	audio-visual information fusion; speech recognition; adaptive weights; learning automata (LA); hidden Markov model
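
The abstract says a learning automaton computes the visual weight from the SNR and the recognition feedback, without naming the update scheme. As an illustration only, the sketch below uses the standard linear reward-inaction rule over a discrete set of candidate weights. The class name `WeightAutomaton`, the candidate grid, and the learning rate are assumptions and not details from the paper.

```python
import random


class WeightAutomaton:
    """Linear reward-inaction automaton over candidate visual weights.

    Assumption (not from the paper): each action is one candidate
    weight, and a reward means the recognizer's output was correct.
    """

    def __init__(self, candidate_weights, learning_rate=0.1, seed=None):
        self.weights = list(candidate_weights)
        self.rate = learning_rate
        # Start from a uniform action-probability vector.
        self.probs = [1.0 / len(self.weights)] * len(self.weights)
        self._rng = random.Random(seed)

    def choose(self):
        """Sample an action index according to the current probabilities."""
        indices = range(len(self.weights))
        return self._rng.choices(indices, weights=self.probs, k=1)[0]

    def update(self, action, rewarded):
        """Reinforce `action` on reward; leave probabilities alone otherwise."""
        if not rewarded:
            return
        for i in range(len(self.probs)):
            if i == action:
                self.probs[i] += self.rate * (1.0 - self.probs[i])
            else:
                self.probs[i] *= 1.0 - self.rate

    def best_weight(self):
        """Return the candidate weight with the highest probability."""
        best = max(range(len(self.probs)), key=self.probs.__getitem__)
        return self.weights[best]


automaton = WeightAutomaton([0.0, 0.5, 1.0], learning_rate=0.5, seed=0)
automaton.update(action=1, rewarded=True)   # weight 0.5 gave a correct result
automaton.update(action=0, rewarded=False)  # penalty: no change
```

One way to make the weight depend on the SNR, again as an assumption, is to keep one such automaton per SNR band and consult the automaton that matches the measured SNR.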
This article is indexed in CNKI, Wanfang Data, and other databases.