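Graph-based semi-supervised learning algorithm using label-based metric learning
Cite this article: LYU Yali, MIAO Junzhong, HU Weixin. Graph-based semi-supervised learning algorithm using label-based metric learning[J]. Journal of Computer Applications, 2020, 40(12): 3430-3436.
Authors: LYU Yali  MIAO Junzhong  HU Weixin
Affiliations: 1. School of Information, Shanxi University of Finance and Economics, Taiyuan 030006; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education (Shanxi University), Taiyuan 030006
Funding: Natural Science Foundation of Shanxi Province; Shanxi Province Research Project for Returned Overseas Scholars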
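Abstract: When measuring the similarity between samples, most graph-based semi-supervised learning methods use neither the label information that is already known nor the label information obtained during label propagation. In addition, their similarity measures are relatively fixed and cannot effectively capture the similarity between samples whose distributions are complex and varied. To address these problems, a graph-based semi-supervised learning algorithm using label-based metric learning is proposed. First, a similarity measure between samples is specified and used to construct a similarity matrix. Then, labels are propagated over the similarity matrix, and k low-entropy samples are selected as newly determined label information. Finally, all label information is used to update the similarity measure, and the process is iterated until all labels have been learned. The proposed algorithm not only uses label information to improve the similarity measure between samples, but also exploits intermediate results to reduce the amount of labeled data that semi-supervised learning requires. Experimental results on six real datasets show that the algorithm achieves higher classification accuracy than three traditional graph-based semi-supervised learning algorithms in more than 95% of the cases.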

Keywords: machine learning  graph-based semi-supervised learning  metric learning  label propagation  similarity matrix
Received: 2020-06-12
Revised: 2020-08-20

Semi-supervised learning algorithm of graph based on label metric learning
LYU Yali, MIAO Junzhong, HU Weixin. Semi-supervised learning algorithm of graph based on label metric learning[J]. Journal of Computer Applications, 2020, 40(12): 3430-3436.
Authors: LYU Yali  MIAO Junzhong  HU Weixin
Affiliation: 1. School of Information, Shanxi University of Finance and Economics, Taiyuan Shanxi 030006, China; 2. Key Laboratory of Computational Intelligence and Chinese Information Processing, Ministry of Education (Shanxi University), Taiyuan Shanxi 030006, China
Abstract: Most graph-based semi-supervised learning methods measure the similarity between samples without using label information, whether known in advance or obtained during label propagation. Their similarity measures are also relatively fixed, so they cannot effectively capture the similarity between data samples with complex and varied distribution structures. To solve these problems, a semi-supervised learning algorithm of graph based on label metric learning is proposed. First, a similarity measure between samples is specified and used to construct the similarity matrix. Second, labels are propagated based on the similarity matrix, and k samples with low entropy are selected as newly obtained label information. Finally, the similarity measure is updated using all label information, and the process is repeated until all label information has been learned. The proposed algorithm not only uses label information to improve the similarity measure between samples, but also makes full use of intermediate results to reduce the demand for labeled data in semi-supervised learning. Experimental results on six real datasets show that, compared with three traditional graph-based semi-supervised learning algorithms, the proposed algorithm achieves higher classification accuracy in more than 95% of the cases.
Keywords:machine learning  graph-based semi-supervised learning  metric learning  label propagation  similarity matrix  
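
The loop described in the abstract (build a similarity matrix, propagate labels, promote the k lowest-entropy samples to labels, update the metric, repeat) can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the similarity function, the propagation scheme, or the metric update rule. The sketch assumes a Gaussian kernel over a diagonal (per-feature) metric, clamped iterative propagation, and a hypothetical update that weights each feature by the inverse of its within-class variance. All function names and parameters (`fit`, `update_metric`, `sigma`, `eps`) are invented for this example.

```python
import math

def similarity_matrix(X, w, sigma=1.0):
    """Gaussian similarity under a diagonal metric w (assumed form)."""
    n = len(X)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                d2 = sum(wk * (a - b) ** 2 for wk, a, b in zip(w, X[i], X[j]))
                W[i][j] = math.exp(-d2 / (2 * sigma ** 2))
    return W

def propagate(W, labels, n_classes, iters=50):
    """Iterative label propagation; labeled rows stay clamped to one-hot."""
    n = len(W)
    F = [[1.0 / n_classes] * n_classes for _ in range(n)]
    for i, y in labels.items():
        F[i] = [1.0 if c == y else 0.0 for c in range(n_classes)]
    for _ in range(iters):
        G = []
        for i in range(n):
            if i in labels:
                G.append(F[i])
                continue
            row = [sum(W[i][j] * F[j][c] for j in range(n))
                   for c in range(n_classes)]
            s = sum(row)
            # Keep the previous distribution if all similarities vanish.
            G.append([v / s for v in row] if s > 0 else F[i])
        F = G
    return F

def entropy(p):
    """Shannon entropy of a class distribution."""
    return -sum(v * math.log(v) for v in p if v > 0)

def update_metric(X, labels, eps=1e-3):
    """Hypothetical update: weight features by inverse within-class variance."""
    d = len(X[0])
    by_class = {}
    for i, y in labels.items():
        by_class.setdefault(y, []).append(X[i])
    ss = [0.0] * d
    for rows in by_class.values():
        for k in range(d):
            m = sum(r[k] for r in rows) / len(rows)
            ss[k] += sum((r[k] - m) ** 2 for r in rows)
    inv = [1.0 / (v / len(labels) + eps) for v in ss]
    scale = d / sum(inv)  # normalize so the weights average to 1
    return [v * scale for v in inv]

def fit(X, labels, n_classes, k=2, sigma=1.0):
    """Repeat propagate / select / update until every sample is labeled."""
    labels = dict(labels)
    w = [1.0] * len(X[0])
    while len(labels) < len(X):
        W = similarity_matrix(X, w, sigma)
        F = propagate(W, labels, n_classes)
        unlabeled = sorted((i for i in range(len(X)) if i not in labels),
                           key=lambda i: entropy(F[i]))
        for i in unlabeled[:k]:  # the k most confident (lowest-entropy) samples
            labels[i] = max(range(n_classes), key=lambda c: F[i][c])
        w = update_metric(X, labels)
    return labels, w

# Toy data: feature 0 separates two clusters, feature 1 is noise.
X = [[0.0, 0.3], [0.2, 0.9], [0.1, 0.1], [0.3, 0.6],
     [3.0, 0.2], [3.2, 0.8], [2.9, 0.5], [3.1, 0.0]]
pred, w = fit(X, {0: 0, 4: 1}, n_classes=2, k=2)
print(pred, w)
```

Because each round adds at least one new label, the loop terminates after at most as many rounds as there are unlabeled samples. A learned full (Mahalanobis) metric could replace the diagonal weights, at the cost of a more involved update step.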