
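Adaptive fusion of acoustic and visual information in noise-robust speech recognition
Cite this article: LIANG Bing, CHEN De-yun, CHENG Hui. Adaptive fusion of acoustic and visual information in noise-robust speech recognition[J]. Control Theory & Applications, 2011, 28(10): 1461-1466
Authors: LIANG Bing  CHEN De-yun  CHENG Hui
Affiliations: 1. School of Innovation Experiment, Dalian University of Technology, Dalian, Liaoning 116024
2. College of Computer Science and Technology, Harbin University of Science and Technology, Harbin, Heilongjiang 150080
3. College of Computer Science and Technology, Harbin Engineering University, Harbin, Heilongjiang 150001
Funding: National Natural Science Foundation of China (60572153); Heilongjiang Postdoctoral Fund (LBH–Z09102); Youth Science Research Fund of Harbin University of Science and Technology (2009YF015); Fundamental Research Funds for the Central Universities (DUT11RC(3)54).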
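Abstract: To improve the accuracy and robustness of speech recognition in noisy environments, a noise-robust speech recognition method based on adaptive audio-visual information fusion is proposed. During recognition, the acoustic and visual information carry variable weights that adapt dynamically to the signal-to-noise ratio (SNR) of the environmental input. A learning automaton computes the optimal weight of the visual information from the SNR and the fed-back recognition performance. Hidden Markov models perform pattern matching on the audio-visual feature vectors, and the decisions of the visual and acoustic hidden Markov models are combined according to the optimal weight to obtain the final recognition result. Experimental results show that, at all tested noise levels, audio-visual fusion with adaptive weights achieves better recognition performance than fusion with fixed weights.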

Keywords: audio-visual information fusion  speech recognition  adaptive weights  learning automaton  hidden Markov model
Received: 2010-05-02
Revised: 2010-11-15

Adaptive fusion of acoustic and visual information in noise-robust speech recognition
LIANG Bing, CHEN De-yun and CHENG Hui. Adaptive fusion of acoustic and visual information in noise-robust speech recognition[J]. Control Theory & Applications, 2011, 28(10): 1461-1466
Authors: LIANG Bing  CHEN De-yun  CHENG Hui
Affiliation: School of Innovation Experiment, Dalian University of Technology; College of Computer Science and Technology, Harbin University of Science and Technology; College of Computer Science and Technology, Harbin Engineering University
Abstract: We propose an adaptive fusion of acoustic and visual information to improve the accuracy and robustness of speech recognition in noisy environments. The acoustic and visual information enter the recognition process with variable weights, which are adapted dynamically to the signal-to-noise ratio (SNR) of the environmental input. Based on the SNR and the recognition-performance feedback, a learning automaton computes the optimal weight for the visual information. Hidden Markov models match the patterns of the acoustic and visual feature vectors, and the decisions of the acoustic and visual hidden Markov models are combined with the optimal weight to give the final recognition result. Experiments under various noise levels show that speech recognition based on adaptive weights outperforms speech recognition based on fixed weights.
Keywords: audio-visual information fusion   speech recognition   adaptive weights   learning automaton (LA)   hidden Markov model
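
The fusion scheme outlined in the abstract can be illustrated with a small sketch. The abstract does not give the score form, the type of learning automaton, or how the SNR enters the update, so everything below is an assumption made for illustration: per-word log-likelihoods stand in for the outputs of the acoustic and visual hidden Markov models, the decisions are combined as (1 - w) * acoustic + w * visual, and a linear reward-inaction automaton picks the visual weight w from a fixed grid for a single SNR condition. The names fuse_scores, recognize and WeightAutomaton are hypothetical, and the scores are toy numbers, not data from the paper.

```python
import random


def fuse_scores(audio_loglik, visual_loglik, visual_weight):
    """Weighted late fusion (assumed form): (1 - w) * audio + w * visual."""
    return {
        word: (1.0 - visual_weight) * audio_loglik[word]
        + visual_weight * visual_loglik[word]
        for word in audio_loglik
    }


def recognize(audio_loglik, visual_loglik, visual_weight):
    """Return the word with the highest fused score."""
    fused = fuse_scores(audio_loglik, visual_loglik, visual_weight)
    return max(fused, key=fused.get)


class WeightAutomaton:
    """Linear reward-inaction automaton over a grid of candidate weights.

    This automaton type is an assumption; the abstract only says that a
    learning automaton is driven by the SNR and the recognition feedback.
    """

    def __init__(self, candidates, learning_rate=0.1, seed=0):
        self.candidates = list(candidates)
        n = len(self.candidates)
        self.probs = [1.0 / n] * n  # start from a uniform action distribution
        self.learning_rate = learning_rate
        self.rng = random.Random(seed)

    def choose(self):
        """Sample the index of a candidate weight from the current distribution."""
        indices = range(len(self.candidates))
        return self.rng.choices(indices, weights=self.probs)[0]

    def reward(self, index):
        """Shift probability mass towards the rewarded action; the sum stays 1."""
        a = self.learning_rate
        self.probs = [
            p + a * (1.0 - p) if i == index else (1.0 - a) * p
            for i, p in enumerate(self.probs)
        ]

    def best_weight(self):
        """Return the candidate weight that currently has the highest probability."""
        best = max(range(len(self.probs)), key=self.probs.__getitem__)
        return self.candidates[best]


# Toy noisy-audio case: the acoustic model prefers the wrong word "no",
# while the visual model clearly prefers the spoken word "yes".
audio = {"yes": -12.0, "no": -10.0}
visual = {"yes": -4.0, "no": -9.0}
truth = "yes"

automaton = WeightAutomaton([0.0, 0.25, 0.5, 0.75, 1.0])
for _ in range(200):
    k = automaton.choose()
    if recognize(audio, visual, automaton.candidates[k]) == truth:
        automaton.reward(k)  # a correct result rewards the chosen weight
    # reward-inaction: a wrong result leaves the probabilities unchanged
```

In this toy setting only visual weights above 2/7 recover the correct word, so the automaton shifts its probability mass towards the larger candidates. An adaptive system in the spirit of the abstract would keep a separate weight estimate for each SNR condition; how the authors do this is not stated here.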
This article is indexed in CNKI, Wanfang Data and other databases.