Mongolian Speech Recognition Based on TDNN-FSMN (基于TDNN-FSMN的蒙古语语音识别技术研究)

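Cite this article:	WANG Yonghe, BAO Feilong, GAO Guanglai. Mongolian Speech Recognition Based on TDNN-FSMN [J]. Journal of Chinese Information Processing (中文信息学报), 2018, 32(9): 28-34.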

Authors:	WANG Yonghe (王勇和), BAO Feilong (飞龙), GAO Guanglai (高光来)

Affiliation:	College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia 010021, China

Funding:	National Natural Science Foundation of China (61563040, 61773224); Natural Science Foundation of Inner Mongolia (2016ZD06)

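Abstract:	To improve Mongolian speech recognition performance, this paper first applies a time delay neural network (TDNN) fused with a feed-forward sequential memory network (FSMN) to the Mongolian speech recognition task, modeling long sequences of speech frames to fully exploit context-dependent information. It then studies how the lengths of the history and future information in the FSMN "memory" block affect the model. Finally, it analyzes how the number of hidden layers and the number of nodes per hidden layer in the fused network affect acoustic model performance. Experimental results show that the TDNN fused with FSMN outperforms the deep neural network (DNN), TDNN, and FSMN models, reducing the word error rate by 22.2% compared with the baseline DNN model.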
Keywords:	Mongolian speech recognition; time delay neural network; feed-forward sequential memory network
Mongolian Speech Recognition Based on TDNN-FSMN

WANG Yonghe, BAO Feilong, GAO Guanglai. Mongolian Speech Recognition Based on TDNN-FSMN [J]. Journal of Chinese Information Processing, 2018, 32(9): 28-34.

Authors:	WANG Yonghe, BAO Feilong, GAO Guanglai

Affiliation:	College of Computer Science, Inner Mongolia University, Hohhot, Inner Mongolia 010021, China

Abstract:	To improve Mongolian speech recognition, a Time Delay Neural Network (TDNN) and a Feed-forward Sequential Memory Network (FSMN) are combined to model long sequences of speech frames. We also investigate how the amount of preceding and subsequent frame information in the FSMN memory block affects the model, and we compare the performance of the TDNN-FSMN with different numbers of hidden layers and hidden nodes. The results show that the fusion of TDNN and FSMN outperforms DNN, TDNN, and FSMN, reducing the word error rate (WER) by 22.2% compared with the DNN baseline.
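
The memory block mentioned in the abstract can be illustrated with a small sketch. It follows the commonly described vectorized FSMN formulation, in which each frame's memory output is an element-wise weighted sum of the current frame, a fixed number of past frames, and a fixed number of future frames. The function name, the tap vectors, and the zero-padding at utterance boundaries are assumptions made for illustration; the abstract does not give the paper's actual configuration.

```python
from typing import List, Sequence

Vector = List[float]


def fsmn_memory_block(
    hidden: Sequence[Vector],
    lookback: Sequence[Vector],
    lookahead: Sequence[Vector],
) -> List[Vector]:
    """Illustrative vectorized-FSMN-style memory block (hypothetical helper).

    hidden:    T frames, each a vector of size D (hidden-layer outputs).
    lookback:  N1 + 1 tap vectors weighting frames t, t-1, ..., t-N1.
    lookahead: N2 tap vectors weighting frames t+1, ..., t+N2.
    Frames outside the utterance are treated as zeros (an assumption).
    """
    num_frames = len(hidden)
    dim = len(hidden[0]) if num_frames else 0
    memory: List[Vector] = []
    for t in range(num_frames):
        out = [0.0] * dim
        # History: element-wise weighted sum of the current and past frames.
        for i, taps in enumerate(lookback):
            if t - i >= 0:
                for d in range(dim):
                    out[d] += taps[d] * hidden[t - i][d]
        # Future: element-wise weighted sum of upcoming frames.
        for j, taps in enumerate(lookahead, start=1):
            if t + j < num_frames:
                for d in range(dim):
                    out[d] += taps[d] * hidden[t + j][d]
        memory.append(out)
    return memory


if __name__ == "__main__":
    frames = [[1.0], [2.0], [3.0]]
    # One past frame (plus the current one) and one future frame.
    print(fsmn_memory_block(frames, [[1.0], [0.5]], [[0.25]]))
    # prints [[1.5], [3.25], [4.0]]
```

Lengthening `lookback` or `lookahead` widens the history or future context seen at each frame, which is the trade-off the abstract says the authors studied.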

Keywords:	Mongolian speech recognition; Time Delay Neural Network; Feed-forward Sequential Memory Network
