
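Research on Short Text Classification Using Graph Structure for Semi-supervised Learning
Cite this article: Zhang Qian (张倩), Liu Huailiang (刘怀亮). Research on Short Text Classification Using Graph Structure for Semi-supervised Learning[J]. 图书情报工作 (Library and Information Service), 2013, 57(21): 126-132.
Authors: Zhang Qian (张倩), Liu Huailiang (刘怀亮)
Affiliation: School of Economics and Management, Xidian University
Abstract: To address the loss of text structure information that occurs when a short text classifier is built on the vector space model, and the annotation bottleneck posed by large numbers of samples, this paper proposes a semi-supervised classification method based on graph structure. The method preserves the structural and semantic relationships of short texts while making full use of unlabeled samples, which improves classifier performance. Following the idea of semi-supervised learning, a large set of unlabeled samples is combined with a small number of labeled samples for graph-based self-training, and the training set is expanded through repeated iteration to build the final short text classifier. Comparative experiments show that the method achieves comparatively good classification results.

Keywords: semi-supervised learning; short text; graph structure; self-training
Received: 2013-06-17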

Research on Short Text Classification Based on Semi-supervised Learning by Graph Structure
Zhang Qian, Liu Huailiang. Research on Short Text Classification Based on Semi-supervised Learning by Graph Structure[J]. Library and Information Service, 2013, 57(21): 126-132.
Authors:Zhang Qian  Liu Huailiang
Affiliation:School of Economics and Management, Xidian University, Xi'an 710071, China
Abstract: To address the loss of text structure and semantic information in the vector space model and the annotation bottleneck that arises with large numbers of unlabeled samples, this paper introduces a short text classification method based on semi-supervised learning. The method preserves the relationships between samples and makes full use of the unlabeled samples to improve classifier performance. It is a self-training algorithm that learns from a large number of unlabeled samples together with the labeled ones on the basis of graph structure, so that the training set is enlarged step by step and then used to build the final text classifier. A comparison experiment shows that the method achieves comparatively good classification results.
Keywords:semi-supervised learning  short text  graph structure  self-training  
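
The self-training loop described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify how the graph is built, how similarity is measured, or which classifier is used, so everything below is an assumption made for illustration: each text becomes a graph of tokens with edges between adjacent tokens, similarity is the average Jaccard overlap of nodes and edges, prediction is nearest neighbor, and the names `text_to_graph`, `graph_similarity`, `predict`, `self_train`, and the `threshold` parameter are hypothetical.

```python
from typing import FrozenSet, List, Tuple

# A graph is a (nodes, edges) pair; this representation is an assumption.
Graph = Tuple[FrozenSet[str], FrozenSet[Tuple[str, str]]]


def text_to_graph(text: str) -> Graph:
    """Nodes are tokens; directed edges link adjacent tokens."""
    tokens = text.lower().split()
    return frozenset(tokens), frozenset(zip(tokens, tokens[1:]))


def jaccard(a: frozenset, b: frozenset) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def graph_similarity(g1: Graph, g2: Graph) -> float:
    """Average of node overlap and edge overlap (assumed measure)."""
    return 0.5 * jaccard(g1[0], g2[0]) + 0.5 * jaccard(g1[1], g2[1])


def predict(graph: Graph, labeled: List[Tuple[Graph, str]]) -> Tuple[str, float]:
    """Label of the most similar labeled graph, with similarity as confidence."""
    best_label, best_score = "", -1.0
    for other, label in labeled:
        score = graph_similarity(graph, other)
        if score > best_score:
            best_label, best_score = label, score
    return best_label, best_score


def self_train(labeled_texts, unlabeled_texts, threshold=0.3, max_rounds=10):
    """Iteratively move confidently predicted samples into the training set."""
    labeled = [(text_to_graph(t), y) for t, y in labeled_texts]
    pool = [text_to_graph(t) for t in unlabeled_texts]
    for _ in range(max_rounds):
        confident, remaining = [], []
        for g in pool:
            label, score = predict(g, labeled)
            if score >= threshold:
                confident.append((g, label))
            else:
                remaining.append(g)
        if not confident:  # nothing new to learn from; stop early
            break
        labeled.extend(confident)  # expand the training set after each round
        pool = remaining
    return labeled


seed = [("cheap flight tickets", "travel"), ("python code error", "tech")]
unlabeled = ["cheap flight deals", "python code bug", "flight deals today"]
model = self_train(seed, unlabeled)
label, confidence = predict(text_to_graph("flight deals today"), model)
print(label, len(model))
```

In this toy run, "flight deals today" is too dissimilar to either seed text to be labeled in the first round, and is only absorbed in the second round after "cheap flight deals" has joined the training set. That is the kind of step-by-step expansion the abstract describes, although the paper's actual graph model and confidence criterion may differ.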
This article is indexed in CNKI and other databases.