Multi-label Image Classification Algorithm Based on Multi-scale Attention and Graph Model

Citation:	ZHU Xudong, XIONG Yun. Multi-label Image Classification Algorithm Based on Multi-scale Attention and Graph Model[J]. Computer Engineering, 2022, 48(4): 173-178+190.

Authors:	ZHU Xudong, XIONG Yun

Affiliation:	1. School of Computer Science, Fudan University, Shanghai 200433, China; 2. Shanghai Key Laboratory of Data Science, Shanghai 200433, China

Funding:	National Natural Science Foundation of China (U1636207)

Received:	2021-03-10
Revised:	2021-04-30

Abstract:	As an important research direction in computer vision, multi-label image classification is widely used in image recognition, detection, and other applications. Existing multi-label image classification methods cannot effectively use label correlation information or the correspondence between label semantics and image features, which results in poor classification ability. A new algorithm for multi-label image classification is proposed. It builds a graph model from label co-occurrence information and label prior knowledge, uses multi-scale attention to learn the targets in image features, and uses label-guided attention to fuse label semantic features with image feature information, thereby integrating label correlation and label semantic information into model learning. On this basis, a dynamic graph model is constructed with the graph attention mechanism, and the label information graph model is dynamically updated during learning to fully integrate the image and label information. Experimental results on multi-label image classification tasks show that, compared with the best existing algorithm, Multi-Label Graph Convolutional Network (MLGCN), the mean Average Precision (mAP) of the proposed algorithm on the Visual Object Classes-2007 (VOC-2007) and Common Objects in Context-2012 (COCO-2012) datasets is higher by 0.6 and 1.2 percentage points, respectively, a clear performance improvement.

Keywords:	multi-label; label semantics; image features; attention mechanism; dynamic graph; multi-scale
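
The abstract states that the graph model is built from label co-occurrence information, but it does not say how. The sketch below shows one common construction in graph-based multi-label classifiers, offered only as an illustration and not as the authors' method: estimate the conditional probability of label j given label i from the training annotations, and keep a directed edge when that probability reaches a threshold. The function name `cooccurrence_graph`, the `threshold` value of 0.4, and the toy label set are all assumptions introduced for this example.

```python
from typing import List


def cooccurrence_graph(label_sets: List[List[int]], num_labels: int,
                       threshold: float = 0.4) -> List[List[int]]:
    """Build a directed label graph from per-image label annotations.

    Edge i -> j is kept when the estimated P(label j | label i) reaches
    `threshold` (an illustrative value, not taken from the paper).
    """
    # occurrences[i]: number of images annotated with label i
    occurrences = [0] * num_labels
    # together[i][j]: number of images annotated with both i and j
    together = [[0] * num_labels for _ in range(num_labels)]
    for labels in label_sets:
        present = sorted(set(labels))  # ignore repeated labels in one image
        for i in present:
            occurrences[i] += 1
            for j in present:
                if i != j:
                    together[i][j] += 1

    adjacency = [[0] * num_labels for _ in range(num_labels)]
    for i in range(num_labels):
        for j in range(num_labels):
            if i != j and occurrences[i] > 0:
                conditional = together[i][j] / occurrences[i]
                adjacency[i][j] = 1 if conditional >= threshold else 0
    return adjacency


# Toy annotations with hypothetical labels: 0 = person, 1 = bicycle, 2 = dog
toy_labels = [[0, 1], [0, 1], [0, 2], [0], [1, 0]]
graph = cooccurrence_graph(toy_labels, num_labels=3)
for row in graph:
    print(row)
```

The resulting matrix is asymmetric by design: in the toy data every image with a dog also contains a person, so the edge dog -> person is kept, while the reverse edge is dropped because only one in five person images contains a dog. A model such as the one described in the abstract would then refine a graph like this during training rather than keep it fixed.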
