Higher-Order Smoothing: A Novel Semantic Smoothing Method for Text Classification

Abstract: Latent semantic indexing (LSI) is known to take advantage of implicit higher-order (or latent) structure in the association of terms and documents; the higher-order relations in LSI capture "latent semantics". These findings inspired Higher-Order Naive Bayes (HONB), a previously introduced Bayesian classification framework that makes explicit use of these higher-order relations. In this paper, we present a novel semantic smoothing method for the Naive Bayes algorithm named Higher-Order Smoothing (HOS). HOS builds on a graph-based data representation similar to that of HONB, which allows the semantics in higher-order paths to be exploited. HOS takes the concept one step further by also exploiting the relationships between instances of different classes. As a result, we move beyond not only instance boundaries but also class boundaries to exploit the latent information in higher-order paths. This approach improves parameter estimation when labeled data are insufficient. Extensive experiments on several benchmark datasets demonstrate the value of HOS.
Keywords: latent semantic indexing; higher-order paths; text classification; smoothing method; Bayes algorithm; Naive Bayes
This article is indexed in the VIP database, among others.
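
The notion of a higher-order path mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' HOS or HONB implementation; it only shows, under the common definition of higher-order co-occurrence, how two terms that never appear in the same document can still be linked through a shared intermediate term. The function names, the toy corpus `docs`, and its document ids are all hypothetical.

```python
from itertools import combinations


def first_order_pairs(docs):
    """Return term pairs that co-occur in at least one document."""
    pairs = set()
    for terms in docs.values():
        for a, b in combinations(sorted(set(terms)), 2):
            pairs.add((a, b))
    return pairs


def second_order_pairs(docs):
    """Return term pairs (a, c) joined by a path a - d1 - b - d2 - c.

    d1 and d2 are distinct documents, b is a term they share,
    and a, b, c are distinct terms.
    """
    doc_sets = {doc_id: set(terms) for doc_id, terms in docs.items()}
    pairs = set()
    for d1, d2 in combinations(doc_sets, 2):
        shared = doc_sets[d1] & doc_sets[d2]
        for b in shared:
            for a in doc_sets[d1] - {b}:
                for c in doc_sets[d2] - {b}:
                    if a != c:
                        pairs.add(tuple(sorted((a, c))))
    return pairs


# Hypothetical toy corpus: document id -> terms.
docs = {
    "d1": ["bayes", "smoothing"],
    "d2": ["smoothing", "classifier"],
    "d3": ["kernel"],
}

first = first_order_pairs(docs)
second = second_order_pairs(docs)
```

Here "bayes" and "classifier" never share a document, yet they are connected through "smoothing" by a second-order path. If d1 and d2 carried different class labels, this path would cross a class boundary, which is the kind of relation the abstract says HOS exploits; how HOS turns such paths into smoothed parameter estimates is not specified here.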