
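Semi-supervised multi-label learning algorithm based on Tri-training
Cite this article: LIU Yanglei, LIANG Jiye, GAO Jiawei, YANG Jing. Semi-supervised multi-label learning algorithm based on Tri-training[J]. CAAI Transactions on Intelligent Systems, 2013, 8(5): 439-445.
Authors: LIU Yanglei    LIANG Jiye    GAO Jiawei    YANG Jing
Affiliations: 1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Abstract: Traditional multi-label learning is supervised and requires training samples with complete class labels. However, when the data set is large and the number of classes is high, obtaining a training set with complete class labels is very difficult. Therefore, within the framework of semi-supervised co-training, a semi-supervised multi-label learning algorithm based on Tri-training (SMLT) is proposed. In the learning stage, SMLT introduces a virtual class label and then, for each pair of class labels, trains the corresponding classifier with the Tri-training co-training algorithm. In the prediction stage, a new sample is passed to the classifiers obtained above; the number of votes each class label receives turns the multi-label learning problem into a label ranking problem, and the number of votes for the virtual class label serves as the threshold that partitions the ranked labels. Comparative experiments on four commonly used multi-label data sets from UCI show that SMLT outperforms the other compared algorithms on most of the four evaluation metrics, which verifies the effectiveness of the algorithm.

Keywords: multi-label learning  semi-supervised learning  Tri-training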
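
The learning stage relies on Tri-training, so a small sketch of one Tri-training round may help make the idea concrete. It is not the authors' implementation: the nearest-centroid base learner, the 1-D features, and every function name below are assumptions chosen for illustration, and the error-rate conditions that the original Tri-training algorithm checks before accepting pseudo-labels are omitted. The sketch only shows the core mechanism, in which each of three classifiers receives the unlabeled samples on which the other two agree. In SMLT, something of this kind would presumably be run once per pair of labels.

```python
import random

def fit_centroids(data):
    """Stand-in base learner (assumption): per-class mean of 1-D features."""
    sums = {}
    for x, y in data:
        total, count = sums.get(y, (0.0, 0))
        sums[y] = (total + x, count + 1)
    return {y: total / count for y, (total, count) in sums.items()}

def predict(model, x):
    """Return the class whose centroid is closest to x."""
    return min(model, key=lambda y: abs(model[y] - x))

def bootstrap(labeled, rng):
    """Sample len(labeled) examples with replacement."""
    return [rng.choice(labeled) for _ in labeled]

def tri_training_round(train_sets, unlabeled):
    """One simplified round: classifier i gains the unlabeled points
    on which the other two classifiers agree (error checks omitted)."""
    models = [fit_centroids(s) for s in train_sets]
    new_sets = []
    for i in range(3):
        j, k = [m for m in range(3) if m != i]
        agreed = [(x, predict(models[j], x)) for x in unlabeled
                  if predict(models[j], x) == predict(models[k], x)]
        new_sets.append(list(train_sets[i]) + agreed)
    return new_sets

def majority_vote(models, x):
    """Final prediction: the class most of the three classifiers choose."""
    preds = [predict(m, x) for m in models]
    return max(set(preds), key=preds.count)

# Usage on toy data: three bootstrap samples, one round, then a vote.
rng = random.Random(0)
labeled = [(0.0, "a"), (1.0, "a"), (9.0, "b"), (10.0, "b")]
train_sets = [bootstrap(labeled, rng) for _ in range(3)]
train_sets = tri_training_round(train_sets, unlabeled=[3.0, 8.0])
models = [fit_centroids(s) for s in train_sets]
print(majority_vote(models, 2.0))
```

Passing the three training sets in explicitly, rather than sampling inside the function, keeps the round itself deterministic and easy to inspect.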

Semi-supervised multi-label learning algorithm based on Tri-training
LIU Yanglei, LIANG Jiye, GAO Jiawei, YANG Jing. Semi-supervised multi-label learning algorithm based on Tri-training[J]. CAAI Transactions on Intelligent Systems, 2013, 8(5): 439-445.
Authors: LIU Yanglei    LIANG Jiye    GAO Jiawei    YANG Jing
Affiliation: 1. School of Computer and Information Technology, Shanxi University, Taiyuan 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China
Abstract: Traditional multi-label learning is supervised and requires complete category labels. However, when the data set is large and there are many label categories, it is difficult to obtain training samples with complete labels. Therefore, a semi-supervised multi-label learning algorithm based on Tri-training (SMLT) is proposed. In the learning stage, SMLT first introduces a virtual label; then, for each pair of labels, the Tri-training algorithm is used to train the corresponding classifier. In the prediction stage, a new sample is fed to the classifiers obtained above. According to the votes for each label, the multi-label learning problem is transformed into a label ranking problem, and the votes for the virtual label are taken as the threshold for splitting the label ranking. Comparative experiments on four commonly used UCI multi-label data sets show that SMLT performs better than the other compared algorithms on most of the four evaluation metrics, which verifies the effectiveness of the proposed algorithm.
Keywords: multi-label learning  semi-supervised learning  Tri-training
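
The prediction stage described in the abstract (pairwise voting, ranking by votes, and a cut at the virtual label's vote count) can be sketched as follows. This is a minimal illustration under stated assumptions, not the published SMLT code: the names `VIRTUAL`, `rank_and_split`, and `toy_pairwise` are hypothetical, the pairwise classifiers are toy functions standing in for classifiers trained with Tri-training, and the choice to keep only labels with strictly more votes than the virtual label is an assumption, since the abstract does not say how ties are handled.

```python
from itertools import combinations

VIRTUAL = "<virtual>"  # hypothetical name for the virtual label

def rank_and_split(x, labels, pairwise):
    """Collect one vote per label pair, rank the real labels by votes,
    and keep those that beat the virtual label (strictly: an assumption)."""
    candidates = list(labels) + [VIRTUAL]
    votes = dict.fromkeys(candidates, 0)
    for a, b in combinations(candidates, 2):
        winner = pairwise[(a, b)](x)  # each pairwise classifier returns a or b
        votes[winner] += 1
    ranking = sorted(labels, key=lambda c: votes[c], reverse=True)
    relevant = [c for c in ranking if votes[c] > votes[VIRTUAL]]
    return ranking, relevant, votes

def toy_pairwise(labels, score):
    """Toy stand-ins for trained pairwise classifiers (assumption):
    the label with the higher score(x, label) wins the pair."""
    candidates = list(labels) + [VIRTUAL]
    return {(a, b): (lambda x, a=a, b=b: a if score(x, a) >= score(x, b) else b)
            for a, b in combinations(candidates, 2)}

# Usage: a sample represented directly by made-up label affinities.
labels = ["sports", "politics", "music"]
pairwise = toy_pairwise(labels, score=lambda x, label: x[label])
sample = {"sports": 0.9, "politics": 0.2, "music": 0.1, VIRTUAL: 0.5}
ranking, relevant, votes = rank_and_split(sample, labels, pairwise)
print(ranking, relevant, votes)
```

Because the virtual label takes part in the same pairwise contests as the real labels, the threshold adapts to each sample instead of being a fixed global constant.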