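Online multi-label image classification with active learning
Citation:Xu Meixiang, Sun Fuming, Li Haojie. Online multi-label image classification with active learning[J]. Journal of Image and Graphics (中国图象图形学报), 2015, 20(2): 237-244.
Authors:Xu Meixiang  Sun Fuming  Li Haojie
Affiliation:School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou 121001; School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou 121001; School of Software, Dalian University of Technology, Dalian 116300
Funding:National Natural Science Foundation of China (61272214, 61173104)
Abstract:Objective In the multi-label supervised learning framework, building a classifier with strong generalization performance requires a large number of labeled training samples, yet in practice labeled samples are scarce and very expensive to obtain. To address the shortage of labeled samples and the low efficiency of classifier re-learning in multi-label image classification, an online multi-label image classification algorithm combined with active learning is proposed. Method Based on min-max theory, a selection strategy that queries the most representative and most informative samples is used to actively choose the samples to be labeled, and the multi-label image classifier is updated online based on the KKT (Karush-Kuhn-Tucker) conditions. Result The algorithm is evaluated on four public datasets using four multi-label classification evaluation metrics. The experimental results show that the proposed sample selection method has a clear advantage over both random sampling and margin-based sampling; when the classifier reaches the same or similar classification accuracy, the proposed strategy needs to query noticeably fewer samples than random sampling or margin-based sampling. Conclusion On the one hand, the algorithm reduces the manual annotation cost of obtaining labeled samples; on the other hand, it avoids the low learning efficiency caused by retraining a conventional classifier on all the data, so the classifier can be updated in real time as new data arrive.

Keywords:multi-label classification  active learning  online learning  min-max theory
Received:2014/7/25
Revised:2014/9/25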

Online multi-label image classification with active learning
Xu Meixiang, Sun Fuming and Li Haojie. Online multi-label image classification with active learning[J]. Journal of Image and Graphics, 2015, 20(2): 237-244.
Authors:Xu Meixiang  Sun Fuming and Li Haojie
Affiliation:School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou 121001, China;School of Electronics and Information Engineering, Liaoning University of Technology, Jinzhou 121001, China;School of Software, Dalian University of Technology, Dalian 116300, China
Abstract:Objective Supervised machine learning methods have been applied to multi-label image classification problems with considerable success. Despite their different learning mechanisms, the performance of these methods relies heavily on the number of labeled training examples. However, acquiring labeled examples requires significant effort from annotators, which hinders the application of supervised learning methods to large-scale problems. In this paper, we extend an active querying method driven by informativeness and representativeness from single-label learning to multi-label image classification. Because the training set grows during active learning, the classifier needs to be updated accordingly. Conventionally, the classifier must be retrained on all training samples, so the training burden increases significantly. A highly effective online learning algorithm is therefore explored to improve learning efficiency. To deal with massive multi-label classification problems, a novel algorithm named active learning with informativeness and representativeness for online multi-label learning (AIR-OML) is presented, which addresses both the sample selection strategy and the classifier update. Method The sample selection strategy in active learning is based on the min-max theory, querying the most informative and the most representative examples to retrain the multi-label learner. Kullback-Leibler divergence and Karush-Kuhn-Tucker conditions are utilized to update the multi-label classifier online in real time. Combining active learning with online learning, we propose the AIR-OML algorithm. Result Experiments are conducted on four real-world datasets with four evaluation criteria to evaluate the presented algorithm. Experimental results demonstrate that the proposed sample selection strategy has a significant advantage over the other two existing sample selection strategies, and that the classifier can achieve the best performance with fewer new samples by querying the most informative and representative examples. Conclusion The AIR-OML algorithm can reduce the cost of human annotation and update the classifier in a timely manner when newly labeled examples arrive.
Keywords:multi-label classification  active learning  online learning  min-max theory
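
The abstract compares the proposed strategy against two baselines: random sampling and margin-based sampling. The following is a minimal Python sketch of those two baselines only, not of the AIR-OML min-max strategy, whose formulation the abstract does not give. It assumes each unlabeled image has one real-valued decision score per label, and that a sample's margin is the smallest absolute score across labels; the function names, the aggregation rule, and the score values are all hypothetical illustrations.

```python
import random


def margin_uncertainty(scores):
    """Smallest absolute decision value over all labels (assumed aggregation).

    A small value means at least one label's classifier is unsure about the sample.
    """
    return min(abs(s) for s in scores)


def select_by_margin(pool_scores):
    """Index of the pool sample lying closest to any label's decision boundary."""
    return min(range(len(pool_scores)),
               key=lambda i: margin_uncertainty(pool_scores[i]))


def select_at_random(pool_scores, rng):
    """Index of a uniformly chosen pool sample (random-sampling baseline)."""
    return rng.randrange(len(pool_scores))


# Hypothetical decision values for 4 unlabeled images and 3 labels.
pool_scores = [
    [2.1, -1.8, 0.9],
    [0.1, -2.5, 1.7],
    [-1.2, 1.4, -0.8],
    [1.5, -0.6, 2.2],
]

margin_pick = select_by_margin(pool_scores)
random_pick = select_at_random(pool_scores, random.Random(0))
print("margin-based query:", margin_pick)
print("random query:", random_pick)
```

In this toy pool the margin-based baseline queries the second image (index 1), whose first label score of 0.1 is the closest to zero. In an active learning loop, the queried image would be sent to an annotator and then used to update the classifier.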
This article is indexed in CNKI and other databases.