Spatio-temporal two-stream human action recognition model based on video deep learning
Cite this article: YANG Tianming, CHEN Zhi, YUE Wenjing. Spatio-temporal two-stream human action recognition model based on video deep learning[J]. Journal of Computer Applications, 2018, 38(3): 895-899.
Authors: YANG Tianming  CHEN Zhi  YUE Wenjing
Affiliation: 1. College of Computer, Nanjing University of Posts and Telecommunications, Nanjing Jiangsu 210023, China; 2. College of Communication and Information Technology, Nanjing University of Posts and Telecommunications, Nanjing Jiangsu 210003, China
Funding: National Natural Science Foundation of China (61501253); Natural Science Foundation of Jiangsu Province (BK20151506); 11th batch of the Jiangsu Province "Six Talent Peaks" High-Level Talent Selection and Training Program (XXRJ-009); Jiangsu Province Key Research and Development Program (Social Development) (BE2016778); Scientific Research Project of Nanjing University of Posts and Telecommunications (NY217054).
Received: 2017-07-14
Revised: 2017-09-07

Abstract: Deep learning has achieved good results in human action recognition, but existing approaches still need to make fuller use of the appearance and motion information of the people in a video. To recognize human actions from both the spatial and the temporal information in video, a spatio-temporal two-stream video human action recognition model is proposed. The model first uses two convolutional neural networks to extract the spatial and temporal features of video action clips separately, then fuses the two networks and extracts mid-level spatio-temporal features, and finally feeds these mid-level features into a 3D convolutional neural network to recognize the human actions in the video. Video human action recognition experiments were carried out on the UCF101 and HMDB51 datasets. The experimental results show that the proposed 3D convolutional neural network model based on the spatio-temporal two-stream can effectively recognize human actions in video.

Keywords: human action recognition  spatio-temporal model  deep learning  Convolutional Neural Network (CNN)  video mining
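
The pipeline described in the abstract (two 2D CNN streams, fusion, then a 3D CNN) can be illustrated with a shape-level sketch in NumPy. This is not the authors' implementation: the abstract gives no layer sizes, fusion rule, or input format, so everything concrete here is an assumption made for illustration. In particular, the use of per-frame optical flow as the temporal input, fusion by channel concatenation, the single convolution per stage, the global average pooling, and all names (`conv2d_relu`, `conv3d_relu`, `two_stream_3d`, `params`) are hypothetical. The weights are random and untrained, so the output only demonstrates how the tensors flow.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_relu(x, w):
    """Valid 2D cross-correlation + ReLU. x: (C_in, H, W); w: (C_out, C_in, k, k)."""
    win = sliding_window_view(x, w.shape[2:], axis=(1, 2))   # (C_in, H', W', k, k)
    return np.maximum(np.einsum("chwij,ocij->ohw", win, w), 0.0)

def conv3d_relu(x, w):
    """Valid 3D cross-correlation + ReLU. x: (C_in, T, H, W); w: (C_out, C_in, kt, k, k)."""
    win = sliding_window_view(x, w.shape[2:], axis=(1, 2, 3))  # (C_in, T', H', W', kt, k, k)
    return np.maximum(np.einsum("cthwdij,ocdij->othw", win, w), 0.0)

def two_stream_3d(rgb, flow, params):
    """rgb: (T, 3, H, W) frames; flow: (T, 2, H, W) flow fields (assumed input format)."""
    # Spatial stream: a 2D convolution applied to each RGB frame (appearance).
    spatial = np.stack([conv2d_relu(f, params["spatial"]) for f in rgb], axis=1)
    # Temporal stream: a 2D convolution applied to each flow field (motion).
    temporal = np.stack([conv2d_relu(f, params["temporal"]) for f in flow], axis=1)
    # Fusion (assumed): concatenate the two feature volumes along the channel axis.
    fused = np.concatenate([spatial, temporal], axis=0)       # (2C, T, H', W')
    # A 3D convolution mixes the fused features across time and space.
    feat = conv3d_relu(fused, params["conv3d"])
    pooled = feat.mean(axis=(1, 2, 3))                        # global average pooling
    logits = params["fc"] @ pooled
    e = np.exp(logits - logits.max())                         # numerically stable softmax
    return e / e.sum()

rng = np.random.default_rng(0)
params = {
    "spatial": rng.normal(0, 0.1, (8, 3, 3, 3)),
    "temporal": rng.normal(0, 0.1, (8, 2, 3, 3)),
    "conv3d": rng.normal(0, 0.1, (16, 16, 3, 3, 3)),
    "fc": rng.normal(0, 0.1, (101, 16)),                      # 101 classes, as in UCF101
}
rgb = rng.random((8, 3, 16, 16))            # 8 toy frames of 16x16 RGB
flow = rng.normal(0, 1, (8, 2, 16, 16))     # 8 toy 2-channel flow fields
probs = two_stream_3d(rgb, flow, params)
print(probs.shape)
```

A trained model would stack several such layers per stage and learn the weights on labeled clips; the sketch keeps one layer per stage so that the order of operations stays visible.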