Lip Motion and Voice Consistency Recognition Based on Audio-Visual Matching of Vowel Pronunciation Events and Position Delay Analysis


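Cite this article:	ZHU Zheng-yu, LIAO Li-ping, YANG Chun-ling, WANG Yong, CAI Jun, QIU Hua-yu. Lip Motion and Voice Consistency Recognition Based on Audio-Visual Matching of Vowel Pronunciation Events and Position Delay Analysis[J]. Acta Electronica Sinica, 2021, 49(1): 140-148. DOI: 10.12263/DZXB.20190238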

Authors:	ZHU Zheng-yu, LIAO Li-ping, YANG Chun-ling, WANG Yong, CAI Jun, QIU Hua-yu

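Affiliations:	School of Cyber Security, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, China; School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510641, China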

Funding:	National Natural Science Foundation of China; Young Innovative Talents of Guangdong Provincial Universities

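Abstract:	Traditional consistency judgment methods mainly analyze a whole sentence (or segment) without screening the content under analysis, which leads to an overly large dictionary, high computational complexity, and results that are easily affected by weakly correlated segments such as silence. Taking vowels (finals), whose lip-shape changes are pronounced, as representative pronunciation events, and combining them with statistics on the distribution range of the initial audio-lip delay, this paper proposes a consistency judgment method based on vowel pronunciation event matching and position delay analysis. First, the proposed combined audio-visual vowel segmentation method is applied to the dictionary learning data...
Keywords:	consistency analysis; initial/final segmentation; dictionary learning
Received:	2019-03-03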
Lip Motion and Voice Consistency Recognition Based on Audio-Visual Matching of Vowel Pronunciation Events and Position Delay Analysis

ZHU Zheng-yu, LIAO Li-ping, YANG Chun-ling, WANG Yong, CAI Jun, QIU Hua-yu. Lip Motion and Voice Consistency Recognition Based on Audio-Visual Matching of Vowel Pronunciation Events and Position Delay Analysis[J]. Acta Electronica Sinica, 2021, 49(1): 140-148. DOI: 10.12263/DZXB.20190238

Authors:	ZHU Zheng-yu, LIAO Li-ping, YANG Chun-ling, WANG Yong, CAI Jun, QIU Hua-yu

Affiliation:	1. School of Electronics and Information, Guangdong Polytechnic Normal University, Guangzhou, Guangdong 510665, China; 2. School of Electronic and Information Engineering, South China University of Technology, Guangzhou, Guangdong 510641, China

Abstract:	Mainstream methods for judging lip motion and voice consistency analyze a whole sentence (or segment) without screening its content. This leads to a large dictionary and high computational complexity, and makes the result vulnerable to weakly correlated segments such as silence. Taking vowels, whose lip-shape changes are significant, as representative pronunciation events, and combining them with statistics on the distribution range of the initial audio-visual delay, a consistency judgment method based on audio-visual matching of vowel pronunciation events and position delay analysis is proposed. First, the dictionary learning data are selected by the proposed audio-visual vowel segmentation method. The resulting vowel dictionary is then used to analyze how well the vowel events match, and the time delay distribution at each vowel position is statistically scored. The consistency judgment is made by a scoring mechanism that combines the lip matching score of the vowel pronunciation events with the position delay analysis score. Experimental results show that the proposed method outperforms the compared algorithms in recognition performance and requires less computation than the traditional dictionary method.

Keywords:	coherence analysis; initial/final segmentation; dictionary learning
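
The abstract describes a decision that combines a vowel-event lip matching score with a position delay analysis score. The abstract does not give the scoring formulas, so the following is only a minimal sketch of one possible fusion, not the authors' method. It assumes a weighted average of the two scores followed by a threshold. The function names, the delay range, the weight, and the threshold are all hypothetical and chosen purely for illustration.

```python
from statistics import mean


def delay_score(delays_ms, low_ms, high_ms):
    """Fraction of vowel events whose audio-lip delay lies in [low_ms, high_ms].

    Assumption: the admissible delay range comes from prior statistics;
    the range passed in below is a placeholder, not a value from the paper.
    """
    if not delays_ms:
        return 0.0
    inside = sum(1 for d in delays_ms if low_ms <= d <= high_ms)
    return inside / len(delays_ms)


def consistency_decision(match_scores, delays_ms, low_ms, high_ms,
                         weight=0.5, threshold=0.6):
    """Fuse the mean per-vowel matching score with the delay score.

    Assumption: a simple weighted average and a fixed threshold stand in
    for the scoring mechanism mentioned in the abstract.
    Returns (fused_score, is_consistent).
    """
    if not match_scores:
        return 0.0, False
    matching = mean(match_scores)
    delay = delay_score(delays_ms, low_ms, high_ms)
    fused = weight * matching + (1.0 - weight) * delay
    return fused, fused >= threshold


# Hypothetical input: three vowel events with matching scores in [0, 1]
# and measured audio-lip delays in milliseconds.
score, consistent = consistency_decision(
    match_scores=[0.9, 0.8, 0.7],
    delays_ms=[10.0, 50.0, 200.0],
    low_ms=0.0,
    high_ms=100.0,
)
print(score, consistent)
```

Restricting both scores to vowel events, rather than the whole utterance, mirrors the motivation stated in the abstract: silent or weakly correlated segments never enter the fused score.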
This article is indexed in databases including Wanfang Data.