
Group activity recognition based on partitioned attention mechanism and interactive position relationship

Citation: Bo LIU, Linbo QING, Zhengyong WANG, Mei LIU, Xue JIANG. Group activity recognition based on partitioned attention mechanism and interactive position relationship[J]. Journal of Computer Applications, 2022, 42(7): 2052-2057.
Authors: Bo LIU  Linbo QING  Zhengyong WANG  Mei LIU  Xue JIANG
Affiliation: College of Electronic and Information Engineering, Sichuan University, Chengdu, Sichuan 610065, China
Funding: National Natural Science Foundation of China (61871278)
Abstract: Group activity recognition in complex scenes is a challenging task involving the interactions and relative spatial positions of a group of people in a scene. Current methods for such scenes either lack fine-grained design or fail to make full use of the interactive features among individuals. To address these problems, a network framework based on a partitioned attention mechanism and interactive position relationships was proposed, which further considers the semantic features of individual limbs and explores the relationship between the similarity of interaction features and the consistency of behavior among individuals. First, raw video sequences and optical-flow image sequences were used as the network input, and a partitioned attention module was introduced to refine the limb motion features of individuals. Then, spatial position and interactive distance were taken as the interaction features of individuals. Finally, the individual motion features and spatial position relationship features were fused as the node features of an undirected graph of the group scene, and a Graph Convolutional Network (GCN) was used to further capture activity interactions in the global scene, thereby recognizing the group activity. Experimental results show that the framework achieves recognition accuracies of 92.8% and 97.7% on two group activity recognition datasets, CAD (Collective Activity Dataset) and CAE (Collective Activity Extended Dataset); on CAD, these results are 1.8 and 5.6 percentage points higher than those of Actor Relation Graph (ARG) and Confidence Energy Recurrent Network (CERN) respectively. Combined with the ablation results, this verifies the high recognition accuracy of the proposed algorithm.

Keywords: group activity recognition; attention mechanism; interactive relationship; video understanding; Graph Convolutional Network (GCN)
Received: 2021-06-03
Revised: 2021-09-11

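The graph-based stage described in the abstract — fusing per-person motion and position features into the nodes of an undirected scene graph and propagating them with a GCN — can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the concatenation fusion, the distance threshold for edges, the feature dimensions, and the single symmetrically normalized GCN layer (Kipf–Welling style) are all illustrative assumptions.

```python
import numpy as np

# Hedged sketch of the abstract's graph stage (assumed shapes and names).
rng = np.random.default_rng(0)
N, D_MOTION, D_OUT = 5, 8, 4                 # people, motion-feature dim, output dim

motion = rng.normal(size=(N, D_MOTION))      # per-person motion features (attention branch)
position = rng.uniform(0.0, 1.0, size=(N, 2))  # per-person (x, y) scene positions

# Node features: fuse motion with spatial position (concatenation is an assumption;
# the abstract only states that the two kinds of features are fused).
X = np.concatenate([motion, position], axis=1)          # shape (N, D_MOTION + 2)

# Undirected edges where the interactive distance falls below a threshold;
# the diagonal (distance 0) yields self-loops automatically.
dist = np.linalg.norm(position[:, None] - position[None, :], axis=-1)
A = (dist < 0.6).astype(float)                          # symmetric adjacency, shape (N, N)

# One GCN propagation step: ReLU(D^-1/2 A D^-1/2 X W).
d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
A_hat = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]   # symmetric normalization
W = rng.normal(size=(X.shape[1], D_OUT))                # learnable weight (random here)
H = np.maximum(A_hat @ X @ W, 0.0)                      # node embeddings, shape (N, D_OUT)

# Scene-level activity representation: mean-pool the node embeddings,
# which a classifier head would then map to a group-activity label.
scene = H.mean(axis=0)
print(scene.shape)
```

Mean pooling over nodes is one common way to obtain a scene-level vector from per-person embeddings; the paper itself may aggregate differently.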
