Knowledge Distillation Based on Topology-Consistent Adversarial Mutual Learning |
| |
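Cite this article: | Lai Xuan, Qu Yanyun, Xie Yuan, Pei Yulong. Topology-guided adversarial deep mutual learning for knowledge distillation[J]. Acta Automatica Sinica, 2023, 49(1): 102-110. |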
| |
Authors: | Lai Xuan, Qu Yanyun, Xie Yuan, Pei Yulong |
| |
Affiliation: | 1. School of Informatics, Xiamen University, Xiamen 361005 |
| |
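Funding: | Supported by the National Natural Science Foundation of China (61876161, 61772524, 61671397, U1065252, 61772440) and the Shanghai Special Program for Artificial Intelligence Science and Technology Support (21511100700) |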
| |
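Abstract: | Knowledge distillation methods based on mutual learning attend only to the distribution discrepancy between the teacher network and the student network without considering other constraints, and they rely on result-oriented supervision while lacking process-oriented supervision. To address these shortcomings, this paper proposes a topology-consistency-guided adversarial mutual learning method for knowledge distillation (Topology-guided adversarial deep mutual learning, TADML). The method trains the teacher and student networks simultaneously, with the networks guiding each other's learning. It uses not only the difference between the class distributions output by the networks but also a purpose-designed measure of the topological difference between their intermediate features. Training is adversarial, which further improves the discriminability of both the teacher and the student network. Experiments on the classification datasets CIFAR10, CIFAR100, and Tiny-ImageNet and on the person re-identification dataset Market1501 demonstrate the effectiveness of TADML, which achieves the best results among comparable model compression methods.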
|
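The two supervision signals named in the abstract (a class-distribution difference and a topological difference between intermediate features) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the KL-divergence term follows the usual deep mutual learning formulation, the topology term is approximated as the mean squared difference between scale-normalised pairwise-distance matrices, and the names `topology_gap`, `loss_for_a`, `alpha`, and `beta` are hypothetical. The adversarial component is omitted.

```python
import math

def softmax(logits):
    """Numerically stable softmax over one sample's logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two discrete class distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def normalised_distances(features, eps=1e-12):
    """Pairwise Euclidean distances within a batch, divided by their mean."""
    n = len(features)
    d = [[math.dist(features[i], features[j]) for j in range(n)] for i in range(n)]
    mean = sum(map(sum, d)) / max(n * (n - 1), 1)
    return [[x / (mean + eps) for x in row] for row in d]

def topology_gap(feats_a, feats_b):
    """Assumed topology term: mean squared difference of the two distance matrices.

    Works even when the two networks use different feature dimensions,
    because both matrices are batch_size x batch_size.
    """
    da, db = normalised_distances(feats_a), normalised_distances(feats_b)
    n = len(da)
    return sum((da[i][j] - db[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

def loss_for_a(logits_a, logits_b, labels, feats_a, feats_b, alpha=1.0, beta=1.0):
    """Loss for network A: cross-entropy + alpha * KL(p_b || p_a) + beta * topology gap."""
    ce = kl = 0.0
    for za, zb, y in zip(logits_a, logits_b, labels):
        pa, pb = softmax(za), softmax(zb)
        ce -= math.log(pa[y] + 1e-12)
        kl += kl_divergence(pb, pa)
    n = len(labels)
    return ce / n + alpha * kl / n + beta * topology_gap(feats_a, feats_b)
```

In mutual learning the symmetric loss for network B would be obtained by swapping the roles of the two networks, so each one is supervised by the other at every step.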
Keywords: | mutual learning; generative adversarial network; feature optimization; knowledge distillation |
Received: | 2020-08-18 |
Topology-guided Adversarial Deep Mutual Learning for Knowledge Distillation |
| |
Affiliation: | 1. School of Informatics, Xiamen University, Xiamen 361005; 2. Department of Computer Science & Technology, East China Normal University, Shanghai 200064 |
| |
Abstract: | Existing knowledge distillation methods based on deep mutual learning have two limitations: they supervise knowledge transfer only through the discrepancy between the teacher network and the student network, neglecting other constraints, and they use only result-driven supervision, neglecting process-driven supervision. This paper proposes a topology-guided adversarial deep mutual learning network (TADML). The method trains multiple classification sub-networks for the same task simultaneously, and each sub-network learns from the others. In addition, it uses an adversarial network to adaptively measure the differences between pairs of sub-networks and optimizes the features without changing the model structure. Experimental results on three classification datasets (CIFAR10, CIFAR100, and Tiny-ImageNet) and one person re-identification dataset (Market1501) show that the method achieves the best results among comparable model compression methods. |
| |