Building text classifiers using positive, unlabeled and ‘outdated’ examples
Authors: Jiayu Han, Wanli Zuo, Lu Liu, Yuanbo Xu, Tao Peng
Abstract: Learning from positive and unlabeled examples (PU learning) is a partially supervised classification approach that is frequently used in Web and text retrieval systems. The merit of PU learning is that it can achieve good performance with less manual work. Motivated by transfer learning, this paper presents a novel method that transfers ‘outdated data’ into the process of PU learning. We first propose a way to measure the strength of features and use it to select strong features and weak features. Then, we extract reliable negative examples and candidate negative examples using the strong and the weak features (Transfer‐1DNF). Finally, we construct a classifier called weighted voting iterative support vector machine (WV‐ISVM), which is made up of several subclassifiers obtained by applying SVM iteratively; each subclassifier is assigned a weight in each iteration. We conduct experiments on two datasets, 20 Newsgroups and Reuters‐21578, and compare our method with three baseline algorithms: positive example‐based learning, weighted voting classifier and SVM. The results show that our proposed method Transfer‐1DNF can extract more reliable negative examples with lower error rates, and that our classifier outperforms the baseline algorithms. Copyright © 2016 John Wiley & Sons, Ltd.
Keywords: text classification; strong features; weak features; Transfer‐1DNF; WV‐ISVM
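
Note: The abstract does not say how Transfer‐1DNF measures feature strength or how the ‘outdated’ data enters the extraction step. For orientation only, the sketch below shows the classic 1-DNF heuristic used in positive example‐based learning, which the method's name suggests it extends. This is an assumption, not the authors' algorithm. The function name `one_dnf_reliable_negatives` and the set-of-tokens document representation are hypothetical choices made for illustration.

```python
from collections import Counter


def one_dnf_reliable_negatives(positive_docs, unlabeled_docs):
    """Classic 1-DNF heuristic (NOT the paper's Transfer-1DNF).

    Each document is an iterable of tokens. A token is a "positive
    feature" if its document frequency, as a fraction of the set size,
    is higher in the positive set than in the unlabeled set. An
    unlabeled document containing no positive feature is taken as a
    reliable negative example.
    """
    if not positive_docs or not unlabeled_docs:
        raise ValueError("both document sets must be non-empty")

    n_pos, n_unl = len(positive_docs), len(unlabeled_docs)
    # Document frequency: count each token at most once per document.
    pos_df = Counter(tok for doc in positive_docs for tok in set(doc))
    unl_df = Counter(tok for doc in unlabeled_docs for tok in set(doc))

    positive_features = {
        tok for tok, count in pos_df.items()
        if count / n_pos > unl_df.get(tok, 0) / n_unl
    }
    # Indices of unlabeled documents that share no positive feature.
    reliable_negative_ids = [
        i for i, doc in enumerate(unlabeled_docs)
        if not positive_features & set(doc)
    ]
    return positive_features, reliable_negative_ids
```

The reliable negatives returned here would normally seed an iterative SVM stage together with the positive set. The weighting of subclassifiers described in the abstract is not reproduced, because the abstract gives no formula for it.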