Similar Documents
1.

Athlete detection and action recognition in sports video is a very challenging task due to dynamic and cluttered backgrounds. Several attempts at automatic analysis focusing on athletes have been made for many sports, but taekwondo video analysis remains an unstudied field. In light of this, a novel framework for automatic technique analysis in broadcast taekwondo video is proposed in this paper. For an input video, in the first stage, athlete tracking and body segmentation are performed with a modified Structure Preserving Object Tracker. In the second stage, the de-noised frames from the video sequence that fully contain the analyzed athlete's body are used to train the deep learning network PCANet, which predicts the athlete's action in each frame. Since one technique is composed of many consecutive actions and each action corresponds to a video frame, it makes sense to analyze techniques over video sequences. In the last stage, a linear SVM is trained on the predicted action frames to obtain a technique classifier. To evaluate the performance of the proposed framework, extensive experiments on a real broadcast taekwondo video dataset are provided. The results show that the proposed method achieves state-of-the-art performance for complex technique analysis in taekwondo video.
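
As a concreteness aid, here is a minimal sketch of the last stage under stated assumptions: per-frame action labels predicted by PCANet are pooled into a normalized histogram per clip, and a linear SVM maps that histogram to a technique label. The histogram pooling and the vocabulary size `N_ACTIONS` are illustrative assumptions, not details taken from the paper.

```python
# Sketch of the final stage: per-frame action labels -> clip histogram -> SVM.
import numpy as np
from sklearn.svm import LinearSVC

N_ACTIONS = 10  # assumed size of the per-frame action vocabulary

def technique_descriptor(frame_actions):
    """Pool a clip's per-frame action labels into a normalized histogram."""
    hist = np.bincount(frame_actions, minlength=N_ACTIONS).astype(float)
    return hist / max(hist.sum(), 1.0)

def train_technique_classifier(clips, labels):
    """clips: list of int arrays of per-frame actions; labels: technique ids."""
    X = np.stack([technique_descriptor(c) for c in clips])
    return LinearSVC(C=1.0).fit(X, labels)
```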


2.
Action recognition over large numbers of categories of unconstrained videos taken from the web is a very challenging problem compared to datasets like KTH (6 actions), IXMAS (13 actions), and Weizmann (10 actions). Challenges like camera motion, differing viewpoints, large interclass variations, cluttered backgrounds, occlusions, bad illumination, and the poor quality of web videos cause the majority of state-of-the-art action recognition approaches to fail. An increased number of categories and the inclusion of easily confused actions add to the challenge. In this paper, we propose using scene context information obtained from moving and stationary pixels in the key frames, in conjunction with motion features, to solve the action recognition problem on a large (50-action) dataset of web videos. We perform a combination of early and late fusion on multiple features to handle the very large number of categories. We demonstrate that scene context is a very important feature for action recognition on very large datasets. The proposed method does not require any kind of video stabilization, person detection, or tracking and pruning of features. Our approach performs well on a large number of action categories; it has been tested on the UCF50 dataset with 50 action categories, an extension of the 11-category UCF YouTube Action (UCF11) dataset. We also tested our approach on the KTH and HMDB51 datasets for comparison.
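
The early/late fusion scheme can be illustrated with a small sketch: early fusion concatenates feature vectors before training a single classifier, while late fusion averages the class-probability outputs of per-modality classifiers. The SVM back end and the equal fusion weights are assumptions for illustration, not the paper's exact configuration.

```python
# Sketch of combining early and late fusion over scene-context and motion cues.
import numpy as np
from sklearn.svm import SVC

def early_fusion(scene_feats, motion_feats):
    """Early fusion: concatenate feature vectors before classification."""
    return np.concatenate([scene_feats, motion_feats], axis=1)

def late_fusion(prob_a, prob_b, w=0.5):
    """Late fusion: weighted average of two classifiers' class probabilities."""
    return w * prob_a + (1.0 - w) * prob_b

def predict(scene_tr, motion_tr, y_tr, scene_te, motion_te):
    # One classifier on fused features, one per modality, then combine scores.
    fused = SVC(probability=True).fit(early_fusion(scene_tr, motion_tr), y_tr)
    scene_clf = SVC(probability=True).fit(scene_tr, y_tr)
    motion_clf = SVC(probability=True).fit(motion_tr, y_tr)
    p_early = fused.predict_proba(early_fusion(scene_te, motion_te))
    p_late = late_fusion(scene_clf.predict_proba(scene_te),
                         motion_clf.predict_proba(motion_te))
    return np.argmax(late_fusion(p_early, p_late), axis=1)
```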

3.
For manipulation-action recognition in dynamic, cluttered scenes, an action recognition framework based on hand-gesture feature fusion is proposed. The framework comprises an RGB video feature extraction module, a gesture feature extraction module, and an action classification module. The RGB module uses an I3D network to extract temporal and spatial features from the RGB video; the gesture module uses a Mask R-CNN network to extract the operator's hand-gesture features; and the classification module fuses these features and feeds them to a classifier. On the EPIC-Kitchens dataset, the proposed method recognizes grasp gestures with an accuracy of 89.63% and composite actions with an accuracy of 74.67%.
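
A minimal sketch of the classification module follows, assuming pooled I3D clip features and Mask R-CNN hand features are available as fixed-length vectors; the layer sizes and class count are illustrative assumptions, not the paper's exact configuration.

```python
# Sketch of the fusion-and-classify step: concatenate I3D and hand features.
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    def __init__(self, rgb_dim=1024, hand_dim=256, n_actions=97):
        super().__init__()
        self.head = nn.Sequential(
            nn.Linear(rgb_dim + hand_dim, 512),
            nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, rgb_feat, hand_feat):
        # rgb_feat: (B, rgb_dim) pooled I3D features
        # hand_feat: (B, hand_dim) pooled Mask R-CNN hand features
        return self.head(torch.cat([rgb_feat, hand_feat], dim=1))
```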

4.
Human action recognition, defined as understanding basic human actions from video streams, has a long history in computer vision and pattern recognition because of its many applications. We propose a novel human action recognition methodology that extracts human skeletal features and separates them into several body parts, such as the face, torso, and limbs, to efficiently visualize and analyze the motion of each part. Our proposed system consists of two steps: (i) automatic skeletal feature extraction and splitting, by measuring the similarity between neighboring pixels in the space of diffusion tensor fields, and (ii) human action recognition using a multiple-kernel support vector machine. Experimental results on a test database show that our proposed method recognizes actions efficiently and effectively using few parameters.
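
The multiple-kernel SVM step can be sketched with precomputed kernels: one RBF kernel per body-part feature set, combined as a weighted sum and passed to an SVM. The per-part split, kernel widths, and equal weights here are assumptions for illustration.

```python
# Sketch of a multiple-kernel SVM: weighted sum of per-body-part RBF kernels.
import numpy as np
from sklearn.svm import SVC
from sklearn.metrics.pairwise import rbf_kernel

def combined_kernel(parts_a, parts_b, gammas, weights):
    """Weighted sum of per-part RBF kernels; parts_* are lists of arrays."""
    return sum(w * rbf_kernel(Xa, Xb, gamma=g)
               for Xa, Xb, g, w in zip(parts_a, parts_b, gammas, weights))

def fit_mkl_svm(parts_train, y, gammas=(0.1, 0.1, 0.1),
                weights=(1 / 3, 1 / 3, 1 / 3)):
    """parts_train: e.g. [face_feats, torso_feats, limb_feats], each (N, d_i)."""
    K = combined_kernel(parts_train, parts_train, gammas, weights)
    return SVC(kernel="precomputed").fit(K, y)
```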

5.
This paper describes an approach to human action recognition based on a probabilistic optimization model of body parts using hidden Markov models (HMMs). Our method is able to distinguish between similar actions by only considering the body parts having major contribution to the actions, for example, legs for walking, jogging and running; arms for boxing, waving and clapping. We apply HMMs to model the stochastic movement of the body parts for action recognition. The HMM construction uses an ensemble of body-part detectors, followed by grouping of part detections, to perform human identification. Three example-based body-part detectors are trained to detect three components of the human body: the head, legs and arms. These detectors cope with viewpoint changes and self-occlusions through the use of ten sub-classifiers that detect body parts over a specific range of viewpoints. Each sub-classifier is a support vector machine trained on features selected for their discriminative power for each particular part/viewpoint combination. Grouping of these detections is performed using a simple geometric constraint model that yields a viewpoint-invariant human detector. We test our approach on three publicly available action datasets: the KTH dataset, Weizmann dataset and HumanEva dataset. Our results illustrate that with a simple and compact representation we can achieve robust recognition of human actions comparable to the most complex, state-of-the-art methods.
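
A minimal sketch of the HMM modeling follows, assuming the `hmmlearn` package and per-frame body-part feature vectors: one Gaussian HMM is trained per action, and a test sequence is assigned to the model with the highest log-likelihood. The feature extraction from part detections is elided.

```python
# Sketch of HMM-based action classification over body-part motion features.
import numpy as np
from hmmlearn.hmm import GaussianHMM

def train_action_hmms(sequences_by_action, n_states=5):
    """sequences_by_action: {action: [T_i x D arrays of part trajectories]}."""
    models = {}
    for action, seqs in sequences_by_action.items():
        X = np.concatenate(seqs)
        lengths = [len(s) for s in seqs]
        models[action] = GaussianHMM(n_components=n_states).fit(X, lengths)
    return models

def classify(models, seq):
    # Assign the sequence to the action model with highest log-likelihood.
    return max(models, key=lambda a: models[a].score(seq))
```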

6.
Action recognition and pose estimation are two closely related topics in understanding human body movements; information from one task can be leveraged to assist the other, yet the two are often treated separately. We present here a framework for coupled action recognition and pose estimation by formulating pose estimation as an optimization over a set of action-specific manifolds. The framework allows for integration of a 2D appearance-based action recognition system as a prior for 3D pose estimation and for refinement of the action labels using relational pose features based on the extracted 3D poses. Our experiments show that our pose estimation system is able to estimate body poses with high degrees of freedom using very few particles and can achieve state-of-the-art results on the HumanEva-II benchmark. We also thoroughly investigate the impact of pose estimation and action recognition accuracy on each other on the challenging TUM kitchen dataset. We demonstrate not only the feasibility of using extracted 3D poses for action recognition, but also improved performance in comparison to action recognition using low-level appearance features.

7.
Recognition of human actions is a very important task in many applications, such as human-computer interaction, content-based video retrieval and indexing, intelligent video surveillance, gesture recognition, and robot learning and control. An efficient action recognition system is presented that uses the Difference Intensity Distance Group Pattern (DIDGP) feature with a Support Vector Machine (SVM) classifier. Initially, a region of interest (ROI) representing the motion information is extracted from the difference frame. The extracted ROI is divided into two blocks, B1 and B2, and the proposed DIDGP feature is applied to the maximum-intensity block to discriminate each action in the video sequences. The feature vectors obtained from DIDGP are classified by an SVM with polynomial and RBF kernels. The method has been evaluated on the KTH action dataset, which consists of actions such as walking, running, jogging, hand waving, clapping, and boxing, and achieves an overall accuracy of 94.67% with the RBF kernel.
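
The ROI extraction from the difference frame can be sketched as follows; the binarization threshold and the top/bottom block split are illustrative assumptions, and the DIDGP descriptor itself is not reproduced here.

```python
# Sketch: extract a motion ROI from a frame difference, keep the brighter block.
import cv2
import numpy as np

def motion_roi(prev_gray, cur_gray, thresh=25):
    diff = cv2.absdiff(cur_gray, prev_gray)          # motion information
    _, mask = cv2.threshold(diff, thresh, 255, cv2.THRESH_BINARY)
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return None                                  # no motion detected
    roi = diff[ys.min():ys.max() + 1, xs.min():xs.max() + 1]
    # Split the ROI into two blocks B1, B2 (split orientation is an
    # assumption) and keep the block with maximum mean intensity.
    mid = roi.shape[0] // 2
    b1, b2 = roi[:mid], roi[mid:]
    return b1 if b1.mean() >= b2.mean() else b2
```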

8.
Objective: Skeleton-based action recognition has become a research hotspot because of its robustness to illumination changes, dynamic viewpoints, and complex backgrounds. When skeleton/joint data are used to recognize similar human actions, the small differences in joint features between actions and the lack of other image semantics easily cause recognition confusion. To address this, a saliency image feature enhancement based center-connected graph convolutional network (SIFE-CGCN) model is proposed. Method: First, a center-connected skeleton topology is designed that links every joint to the skeleton center, capturing subtle differences in joint motion between similar actions. Second, a Gaussian mixture background model compares each frame against a continuously updated background model to segment the dynamic image region and remove background interference, yielding a saliency image; feature maps are extracted with a pre-trained VGG-Net (Visual Geometry Group network) and matched for action-semantic classification. Finally, a fusion algorithm uses the classification results to reinforce and correct the recognition results of the center-connected GCN, improving the recognition of similar actions. In addition, a skeleton-based action-similarity measure is proposed and a similar-action dataset is built. Results: ...
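
The center-connected topology can be sketched as an adjacency construction: the natural bone edges plus an edge from every joint to a designated skeleton-center joint, followed by the symmetric normalization commonly used in graph convolutions. The joint count and center index are assumptions.

```python
# Sketch of a center-connected skeleton adjacency for graph convolution.
import numpy as np

def center_connected_adjacency(bones, n_joints=25, center=1):
    """bones: list of (i, j) natural skeleton edges; returns normalized A."""
    A = np.eye(n_joints)                      # self-loops
    for i, j in bones:
        A[i, j] = A[j, i] = 1.0               # natural bone connections
    for j in range(n_joints):
        A[center, j] = A[j, center] = 1.0     # connect every joint to center
    d = A.sum(axis=1)
    return A / np.sqrt(np.outer(d, d))        # symmetric normalization
```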

9.
Action recognition is an important research topic in the video-understanding area of computer vision. Accurately extracting and recognizing human action features from video can supply important information to fields such as healthcare and security, making it a highly promising direction. From a data-driven perspective, this paper comprehensively surveys the development of action recognition and systematically reviews representative methods and models. Action recognition data fall into RGB, depth, skeleton, and fused modalities. The paper first introduces the main pipeline of action recognition and the public datasets for each modality. It then reviews, by modality, traditional hand-crafted-feature and deep-learning methods for RGB, depth, and skeleton data, as well as multimodal fusion methods combining RGB with depth and with other modalities. Hand-crafted approaches include spatio-temporal-volume and spatio-temporal-interest-point methods (RGB modality), motion-change and appearance-based methods (depth modality), and skeleton-feature methods (skeleton modality); deep-learning approaches mainly involve convolutional networks, graph convolutional networks, and hybrid networks, with emphasis on their improvements, characteristics, and model innovations. Comparative analyses of the recognition techniques are conducted across datasets grouped by modality. Comparing within and across categories yields the strengths, weaknesses, and suitable scenarios of each modality, the differences between hand-crafted and deep-learning methods, and the advantages of multimodal fusion. Finally, the paper summarizes the current problems and challenges of action recognition and, from the perspective of data modality, suggests feasible future research directions and priorities.

10.
Dangerous driver behavior increases the rate of traffic accidents. Most existing studies of driver behavior identify abnormal behaviors such as fatigued driving or phone use through methods like facial recognition. Such methods only classify driver behavior objectively and ignore the driver's subjective state while driving. Eye trackers are an effective tool for recording and analyzing drivers' eye-movement data, giving a clear picture of drivers' intentions and summarizing their visual-cognition patterns. Because there is currently no driver eye-movement...

11.
12.
In recent years, multimodal emotion recognition using computer techniques has become an important research direction in natural human-computer interaction and artificial intelligence. Work that uses the visual modality usually focuses on facial features, rarely considering action features or multimodal features fused with them. Although action is closely tied to emotion, extracting effective action information from the visual modality for emotion recognition is difficult. Taking the action-emotion relationship as the starting point, this work introduces visual action data into the classic MELD multimodal emotion recognition dataset, extracts body-action features with an ST-GCN network model, and uses those features for unimodal emotion recognition based on an LSTM network model. It further adds body-action features to MELD's text and audio features, improving the accuracy of LSTM-based multimodal fusion emotion recognition, and combines text and body-action features to raise the text-only accuracy of a context-memory model. Experiments show that although body-action features used alone cannot surpass traditional text and audio features for unimodal recognition, they play an important role in multimodal recognition. The unimodal and multimodal experiments verify that human actions carry emotional information and that body-action features hold significant potential for multimodal emotion recognition.
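
To make the unimodal step concrete, here is a sketch of an LSTM classifier over per-utterance ST-GCN action features; the pooling to fixed-length vectors, the dimensions, and the use of MELD's seven emotion classes as the output size are all assumptions.

```python
# Sketch: LSTM over pooled ST-GCN body-action features for emotion labels.
import torch
import torch.nn as nn

class ActionEmotionLSTM(nn.Module):
    def __init__(self, feat_dim=256, hidden=128, n_emotions=7):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
        self.fc = nn.Linear(hidden, n_emotions)

    def forward(self, x):
        # x: (batch, utterances, feat_dim) pooled ST-GCN action features
        out, _ = self.lstm(x)
        return self.fc(out)   # per-utterance emotion logits
```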

13.
Objective: To improve the accuracy of action recognition in video, an action recognition algorithm based on action segmentation and manifold metric learning is proposed. Method: First, the actions in the video are segmented using a method based on analyzing the degree of limb extension, making the target of recognition concrete. Then normalized global temporal and spatial features, optical-flow features, and per-frame local curl and divergence features are extracted from each action clip, and a 7×7 covariance-matrix descriptor is constructed to fuse the extracted features. Finally, manifold metric learning is used to find a better distance metric in a supervised manner and improve classification. Results: Segmentation statistics on the public Weizmann video set show that the proposed segmentation method performs well as pre-processing before recognition. Comparing recognition on Weizmann before and after manifold metric learning shows a 2.8% improvement. Average recognition rates on the Weizmann and KTH datasets are 95.6% and 92.3% respectively, better than existing methods in comparison. Conclusion: Repeated experiments show that the segmentation works well in preprocessing; the covariance descriptor fuses multiple features effectively; and adding optical-flow, curl, and divergence information supplies finer detail about the motion direction of each body part, strengthening the descriptor. Combined with manifold metric learning, recognition accuracy improves markedly.
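
The covariance-descriptor construction can be sketched directly: stack seven per-sample feature channels and take their 7×7 covariance; since covariance matrices lie on the SPD manifold, a log-Euclidean distance is shown as one plausible metric. The specific seven channels and the supervised metric-learning step are not reproduced.

```python
# Sketch of a 7x7 covariance-matrix descriptor and an SPD-manifold distance.
import numpy as np
from scipy.linalg import logm

def covariance_descriptor(features):
    """features: (N, 7) array, one row of 7 feature channels per sample.
    Returns the 7x7 covariance matrix used as the action descriptor."""
    centered = features - features.mean(axis=0)
    return centered.T @ centered / max(len(features) - 1, 1)

def log_euclidean_distance(C1, C2, eps=1e-6):
    """Log-Euclidean distance, a common metric for SPD matrices."""
    I = np.eye(C1.shape[0]) * eps   # regularize near-singular covariances
    return np.linalg.norm(logm(C1 + I) - logm(C2 + I), ord="fro")
```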

14.
Fitness-action recognition is the core of intelligent fitness systems. To improve the accuracy and speed of fitness-action recognition and to reduce the influence of whole-body displacement on the recognition result, a fitness-action recognition method based on skeleton-feature encoding is proposed. The method consists of three steps: first, a simplified human skeleton model is built and the coordinates of its joints are extracted with human pose estimation; second, a body-center projection method extracts the action feature region, eliminating the effect of whole-body displacement on recognition; finally, the feature-region encoding is used as a feature vector and fed to a multi-class classifier, with the feature-vector length optimized to balance recognition rate and speed. Experiments show a recognition rate of 97.24% on a self-built fitness dataset of 28 actions, confirming that the method recognizes all kinds of fitness actions effectively; on the public KTH and Weizmann datasets, the recognition rates are 91.67% and 90% respectively, better than comparable methods.
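
The center-projection idea of removing whole-body displacement can be sketched by re-expressing each joint relative to a body-center joint in every frame; the joint layout and choice of center joint are assumptions for illustration.

```python
# Sketch: translation-invariant skeleton encoding via body-center reference.
import numpy as np

def center_normalize(joints, center_idx=0):
    """joints: (T, J, 2) pose-estimation coordinates over T frames.
    Returns joints expressed relative to the center joint in each frame."""
    center = joints[:, center_idx:center_idx + 1, :]   # (T, 1, 2)
    return joints - center

def encode(joints, center_idx=0):
    """Flatten the centered skeleton sequence into one feature vector."""
    return center_normalize(joints, center_idx).reshape(-1)
```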

15.
This paper addresses the problem of silhouette-based human activity recognition. Most previous work on silhouette-based recognition focuses on a single view and ignores the issue of view invariance. In this paper, a framework is presented for view-invariant human activity recognition that uses both contour-based pose features from silhouettes and rotation-invariant uniform local binary patterns (LBP). The framework is composed of three consecutive modules: (1) people are detected and located by background subtraction; (2) scale-invariant contour-based pose features from silhouettes are combined with rotation-invariant uniform LBP features; and (3) activities are classified by a multiclass support vector machine (SVM). The rotation-invariant nature of uniform LBP provides view-invariant recognition of multi-view human activities. We have tested our approach successfully in indoor and outdoor environments on four multi-view datasets: our own viewpoint dataset, the VideoWeb multi-view dataset [28], the i3DPost multi-view dataset [29], and the WVU multi-view human action recognition dataset [30]. The experimental results show that the proposed method of multi-view human activity recognition is robust, flexible and efficient.
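
The texture cue can be sketched with scikit-image, whose "uniform" LBP method is the rotation-invariant uniform variant that underpins the view invariance here; the neighborhood parameters and histogram normalization are assumptions.

```python
# Sketch: rotation-invariant uniform LBP histogram from a silhouette region.
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_histogram(gray_region, P=8, R=1):
    # method="uniform" in scikit-image is the rotation-invariant uniform LBP,
    # which gives invariance to in-plane rotation of the texture pattern.
    lbp = local_binary_pattern(gray_region, P, R, method="uniform")
    hist, _ = np.histogram(lbp, bins=P + 2, range=(0, P + 2))
    return hist / max(hist.sum(), 1)
```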

16.
17.
Classroom teaching scenes suffer from heavy occlusion and large numbers of students, current video action recognition algorithms are not suited to such scenes, and there is as yet no public dataset of student classroom behavior. To address these problems, a classroom teaching video library and a student classroom-behavior library are built, and a real-time multi-student classroom-behavior recognition algorithm based on a deep spatio-temporal residual convolutional neural network is proposed. First, real-time object detection and tracking produce a live image stream for each student; then, the deep spatio-temporal resi...

18.
We present a method for the recognition of complex actions. Our method combines automatic learning of simple actions and manual definition of complex actions in a single grammar. Contrary to the general trend in complex action recognition, which divides recognition into two stages, our method recognizes simple and complex actions in a unified way. This is achieved by encoding simple-action HMMs within the stochastic grammar that models complex actions. This unified approach lets the higher activity layers influence the recognition of simple actions more effectively, which leads to a substantial improvement in the classification of complex actions. We consider the recognition of complex actions based on person transits between areas in the scene. As input, our method receives crossings of tracks along a set of zones, derived using unsupervised learning of the movement patterns of the objects in the scene. We evaluate our method on a large dataset showing normal, suspicious and threat behaviour in a parking lot. Experiments show an improvement of ~30% in the recognition of both high-level scenarios and their constituent simple actions with respect to a two-stage approach. Experiments with synthetic noise simulating the most common tracking failures show that our method suffers only a limited decrease in performance when moderate amounts of noise are added.

19.
Rehabilitation exercise is an important treatment for stroke patients. To improve the accuracy and real-time performance of rehabilitation-action recognition and better assist patients in long-term rehabilitation training at home, a human rehabilitation-action recognition algorithm, Pose-AMGRU, is proposed that combines pose estimation with gated recurrent unit (GRU) networks. The OpenPose pose-estimation method extracts skeleton joints from video frames; after pose-data preprocessing, key action features expressing limb movement are obtained, and an attention mechanism is used to build a GRU network that fuses three levels of temporal features for rehabilitation-action classification. Experimental results show recognition accuracies of 98.14% on KTH and 100% on a rehabilitation-action dataset, with a running speed of 14.23 frames/s on a GTX1060 GPU, giving high recognition accuracy and real-time performance.
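
In the spirit of Pose-AMGRU, here is a sketch of a GRU with additive attention pooling over per-frame pose features; the single GRU layer, the attention form, and the dimensions are assumptions rather than the paper's exact three-level architecture.

```python
# Sketch: attention-weighted GRU over per-frame pose features.
import torch
import torch.nn as nn

class AttentionGRU(nn.Module):
    def __init__(self, pose_dim=36, hidden=64, n_classes=10):
        super().__init__()
        self.gru = nn.GRU(pose_dim, hidden, batch_first=True)
        self.att = nn.Linear(hidden, 1)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):
        # x: (batch, frames, pose_dim) preprocessed OpenPose joint features
        h, _ = self.gru(x)                        # (B, T, H)
        w = torch.softmax(self.att(h), dim=1)     # attention weights over time
        clip = (w * h).sum(dim=1)                 # weighted temporal pooling
        return self.fc(clip)                      # action logits per clip
```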

20.
This paper proposes a manifold-learning-based action recognition framework for recognizing human actions in depth image sequences. Human joint positions are estimated from the depth information captured by a Kinect device, and relative joint-position differences are used as the human body representation. In the training stage, Laplacian eigenmaps (LE) manifold learning reduces the dimensionality of the high-dimensional training set, yielding motion models in a low-dimensional latent space. In the recognition stage, a nearest-neighbor interpolation method maps test sequences into the low-dimensional manifold space for matching. During matching, a modified Hausdorff distance measures the agreement and similarity between the test sequence and the training motion set in the low-dimensional space. Experiments on data captured with a Kinect device achieve good results, and tests on the MSR Action3D database show that, given sufficient training samples, the method outperforms previous approaches. The results indicate that the proposed method is well suited to human action recognition from depth image sequences.
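
The recognition stage can be sketched with scikit-learn's SpectralEmbedding (a Laplacian eigenmaps implementation) and a symmetric Hausdorff distance; for simplicity this sketch embeds training and test frames jointly instead of the paper's nearest-neighbor out-of-sample mapping, which is a stated simplification.

```python
# Sketch: Laplacian eigenmaps embedding + Hausdorff-distance matching.
import numpy as np
from sklearn.manifold import SpectralEmbedding
from scipy.spatial.distance import directed_hausdorff

def hausdorff(A, B):
    """Symmetric Hausdorff distance between two point sets."""
    return max(directed_hausdorff(A, B)[0], directed_hausdorff(B, A)[0])

def classify(train_seqs, train_labels, test_seq, n_components=3):
    # Embed all frames (train + test) into the low-dimensional manifold.
    lengths = [len(s) for s in train_seqs]
    X = np.concatenate(train_seqs + [test_seq])
    Y = SpectralEmbedding(n_components=n_components).fit_transform(X)
    splits = np.cumsum(lengths)
    train_emb = np.split(Y[: splits[-1]], splits[:-1])
    test_emb = Y[splits[-1]:]
    # Match the test trajectory against each training motion.
    dists = [hausdorff(test_emb, t) for t in train_emb]
    return train_labels[int(np.argmin(dists))]
```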
