Augmented two-stream network for robust action recognition adaptive to various action videos |
| |
Affiliation: | 1. Department of Electrical Engineering, Indian Institute of Technology Patna, Bihta 801 106, Patna, India; 2. Department of Physics, Indian Institute of Technology Patna, Bihta 801 106, Patna, India |
| |
Abstract: | In video-based action recognition, training a two-stream network on videos with different frame numbers can cause data skew. Moreover, extracting key frames from a video is crucial for improving the training and recognition efficiency of action recognition systems. However, previous works suffer from information loss and optical-flow interference when handling videos with different frame numbers. In this paper, an augmented two-stream network (ATSNet) is proposed to achieve robust action recognition. A frame-number-unified strategy is first incorporated into the temporal stream network to unify the frame numbers of videos. Subsequently, grayscale statistics of the optical-flow images are extracted to filter out invalid optical-flow images and to produce dynamic fusion weights for the two branch networks, allowing the model to adapt to different action videos. Experiments conducted on the UCF101 dataset demonstrate that ATSNet outperforms previously reported methods, improving recognition accuracy by 1.13%. |
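The abstract describes filtering invalid optical-flow images by their grayscale statistics and deriving dynamic fusion weights for the spatial and temporal branches. A minimal sketch of that idea is given below; the paper does not specify its statistics or weighting formula, so the standard-deviation test, the threshold value, and the `fusion_weights` rule here are illustrative assumptions, not the authors' method.

```python
# Hypothetical sketch: a near-uniform grayscale optical-flow image (very low
# pixel standard deviation) is treated as carrying no motion information and
# is filtered out; the temporal branch's fusion weight is then scaled by the
# fraction of valid flow images in the clip. All names and thresholds are
# illustrative, not taken from the ATSNet paper.

def flow_statistics(gray_flow):
    """Mean and standard deviation of a grayscale optical-flow image
    given as a list of rows of pixel intensities."""
    pixels = [p for row in gray_flow for p in row]
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    return mean, var ** 0.5

def is_valid_flow(gray_flow, std_threshold=2.0):
    """A flow image with almost no intensity variation is considered invalid."""
    _, std = flow_statistics(gray_flow)
    return std >= std_threshold

def fusion_weights(gray_flows, std_threshold=2.0, base_temporal=0.5):
    """Scale the temporal branch's weight by the fraction of valid flow
    images; the remainder goes to the spatial branch."""
    if not gray_flows:
        return 1.0, 0.0
    valid = sum(1 for f in gray_flows if is_valid_flow(f, std_threshold))
    w_temporal = base_temporal * (valid / len(gray_flows))
    return 1.0 - w_temporal, w_temporal  # (spatial, temporal)
```

For example, a clip containing one static (all pixels 128) and one high-contrast flow image would keep only the latter, halving the temporal branch's base weight.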
| |
Keywords: | Two-stream network; Action recognition; Data skew |
This article is indexed in ScienceDirect and other databases.
|