
Visual gesture recognition technology based on long short term memory and deep neural network
Authors:HE Jian  LIAO Jun-jie  ZHANG Cheng  WEI Xin  BAI Jia-hao  WANG Wei-dong
Affiliation:(1. Beijing Engineering Research Center for IoT Software and Systems, Beijing 100124, China; 2. Faculty of Information, Beijing University of Technology, Beijing 100124, China)
Funding:National Natural Science Foundation of China (61602016); Beijing Municipal Science and Technology Project (D171100004017003)
Abstract:Vision-based dynamic gesture recognition is easily affected by lighting, background, and changes in gesture shape. To address this, the paper analyzes the spatial context features of human gestures and first establishes a dynamic gesture model based on the human skeleton and the contour features of body parts. A deep neural network built from the convolutional pose machine (CPM) and the single shot multibox detector (SSD) extracts the skeleton and body-part contour features. Next, a long short-term memory (LSTM) network extracts the temporal features of the skeleton, the left and right hands, and the head contour in dynamic gestures, and these features are used to classify the gesture. On this basis, the paper designs a dynamic gesture recognizer based on spatial context and temporal feature fusion (GRSCTFF), and trains and evaluates it on a video sample library of traffic police command gestures. The experiments show that GRSCTFF recognizes dynamic traffic police command gestures quickly and accurately, with an accuracy of 94.12%, and is robust to changes in lighting, background, and gesture shape.

Keywords:gesture recognition  spatial context  long short-term memory  feature extraction
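
The pipeline outlined in the abstract (per-frame spatial features, fused and passed through an LSTM, then classified) can be illustrated with a minimal pure-Python sketch. This is not the authors' GRSCTFF implementation: the feature sizes, the class count, the random weights, and the names `fuse_frame` and `TinyLSTMClassifier` are all hypothetical, and random vectors stand in for the CPM skeleton and SSD contour features, which the abstract does not specify in detail.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    # Plain matrix-vector product on nested lists.
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def fuse_frame(skeleton, left_hand, right_hand, head):
    # Hypothetical fusion step: concatenate the per-frame spatial features
    # (skeleton keypoints plus hand and head contour descriptors).
    return list(skeleton) + list(left_hand) + list(right_hand) + list(head)


class TinyLSTMClassifier:
    """Untrained single-layer LSTM followed by a softmax layer (illustration only)."""

    GATES = ("i", "f", "o", "g")  # input, forget, output, candidate

    def __init__(self, input_dim, hidden_dim, num_classes, seed=0):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        # One weight matrix per gate, applied to the concatenation [x_t ; h_{t-1}].
        self.W = {k: rand_matrix(hidden_dim, input_dim + hidden_dim, rng)
                  for k in self.GATES}
        self.b = {k: [0.0] * hidden_dim for k in self.GATES}
        self.W_out = rand_matrix(num_classes, hidden_dim, rng)

    def step(self, x, h, c):
        z = x + h  # list concatenation: [x_t ; h_{t-1}]
        pre = {k: [a + b for a, b in zip(matvec(self.W[k], z), self.b[k])]
               for k in self.GATES}
        i = [sigmoid(v) for v in pre["i"]]
        f = [sigmoid(v) for v in pre["f"]]
        o = [sigmoid(v) for v in pre["o"]]
        g = [math.tanh(v) for v in pre["g"]]
        c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
        h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
        return h_new, c_new

    def predict_proba(self, frames):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in frames:  # consume the gesture frame by frame
            h, c = self.step(x, h, c)
        logits = matvec(self.W_out, h)
        top = max(logits)
        exps = [math.exp(v - top) for v in logits]  # numerically stable softmax
        total = sum(exps)
        return [e / total for e in exps]


# Placeholder sizes; the abstract does not state feature dimensions or class count.
SKELETON_DIM, PART_DIM, NUM_CLASSES, NUM_FRAMES = 8, 4, 8, 5

rng = random.Random(1)
frames = [
    fuse_frame(
        [rng.random() for _ in range(SKELETON_DIM)],  # stand-in for CPM keypoints
        [rng.random() for _ in range(PART_DIM)],      # stand-in for left-hand contour
        [rng.random() for _ in range(PART_DIM)],      # stand-in for right-hand contour
        [rng.random() for _ in range(PART_DIM)],      # stand-in for head contour
    )
    for _ in range(NUM_FRAMES)
]

model = TinyLSTMClassifier(input_dim=SKELETON_DIM + 3 * PART_DIM,
                           hidden_dim=6, num_classes=NUM_CLASSES)
probs = model.predict_proba(frames)
print(probs)
```

Because the weights are random and untrained, the printed probabilities carry no meaning; the sketch only shows how fused per-frame features flow through the recurrent step into a class distribution. A real system would learn the weights from labeled gesture videos and would more likely use a deep learning framework than hand-written list arithmetic.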
This paper is indexed in CNKI and other databases.