Funding: National Natural Science Foundation of China (62071303, 61871269); Guangdong Basic and Applied Basic Research Foundation (2019A1515011861); Shenzhen Science and Technology Program (JCYJ20190808151615540).

Received: 2021-03-02

Pedestrian Sequence Attribute Recognition Method with Multi-feature Fusion Combined with Temporal Attention Mechanism
HUANG Chen,PEI Jihong,ZHAO Yang. Pedestrian Sequence Attribute Recognition Method with Multi-feature Fusion Combined with Temporal Attention Mechanism[J]. Signal Processing(China), 2022, 38(1): 64-73. DOI: 10.16798/j.issn.1003-0530.2022.01.008
Authors: HUANG Chen, PEI Jihong, ZHAO Yang
Affiliation: School of Electronic and Information Engineering, Shenzhen University, Shenzhen, Guangdong 518060, China
Abstract: The majority of pedestrian attribute recognition methods are based on a single image. The information contained in a single image is limited, whereas an image sequence contains rich useful information and temporal features, so exploiting sequence information is an important way to improve the performance of pedestrian attribute recognition. This paper proposes a multi-feature fusion pedestrian sequence attribute recognition network based on a temporal attention mechanism. In addition to the common spatial-temporal quadratic average pooling and spatial-temporal mean-max pooling feature aggregations, the network adds a spatial-temporal attention-factor-weighted feature aggregation branch to further extract sequence features. By fusing the sequence features of these three branches, the network obtains richer information. In the attention-weighted branch, a full-channel spatial-temporal attention factor generation sub-network based on 3D convolution is designed to better capture the spatial-temporal features of a sequence. On top of the cross-entropy loss, the Tversky loss, which constrains the numbers of false positives (FP) and false negatives (FN), is added to form the overall loss function, giving the network a better trade-off between precision and recall. Experimental results show that the proposed method outperforms single-image methods and other common feature fusion and temporal modeling methods on every performance metric.
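The three-branch temporal aggregation and the Tversky term described in the abstract can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the feature shapes, the concatenation-based fusion, and the names `seq_feats`, `attn_scores`, `alpha`, and `beta` are all assumptions made for the example, and the per-channel softmax stands in for the paper's 3D-convolutional attention factor generation sub-network.

```python
import numpy as np

def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def aggregate(seq_feats, attn_scores):
    """Fuse three temporal aggregations of per-frame features.

    seq_feats:   (T, C) per-frame feature vectors (illustrative shape)
    attn_scores: (T, C) per-frame, per-channel attention factors; in the
                 paper these come from a 3D-conv sub-network
    """
    # Branch 1: temporal average pooling (stand-in for the paper's
    # spatial-temporal quadratic average pooling)
    avg = seq_feats.mean(axis=0)
    # Branch 2: temporal mean-max pooling
    mean_max = 0.5 * (seq_feats.mean(axis=0) + seq_feats.max(axis=0))
    # Branch 3: attention-factor-weighted aggregation over time
    w = softmax(attn_scores, axis=0)       # weights sum to 1 per channel
    attended = (w * seq_feats).sum(axis=0)
    # Fuse the three branch outputs (concatenation is an assumption)
    return np.concatenate([avg, mean_max, attended])

def tversky_loss(p, y, alpha=0.5, beta=0.5, eps=1e-6):
    """Tversky loss on predicted probabilities p and binary labels y.

    alpha weights false positives and beta weights false negatives,
    so the objective can trade precision against recall.
    """
    tp = (p * y).sum()
    fp = (p * (1 - y)).sum()
    fn = ((1 - p) * y).sum()
    return 1.0 - tp / (tp + alpha * fp + beta * fn + eps)
```

In training, the Tversky term would be added to the cross-entropy loss, e.g. `loss = bce + lam * tversky_loss(p, y)` with `lam` an assumed weighting hyperparameter.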
Keywords: pedestrian attribute recognition; temporal attention mechanism; feature fusion; temporal modeling
This article is indexed in databases including 维普 (CQVIP).