A nonparametric term weighting method for information retrieval based on measuring the divergence from independence
Authors: İlker Kocabaş, Bekir Taner Dinçer, Bahar Karaoğlan
Affiliations: 1. International Computer Institute, Ege University, Bornova, Izmir, Turkey
2. Department of Statistics, Muğla University, Muğla, Turkey
3. Department of Computer Engineering, Muğla University, Muğla, Turkey
Abstract: In this article, we introduce an out-of-the-box automatic term weighting method for information retrieval. The method measures the degree to which terms diverge from independence of documents in terms of their frequency of occurrence. Divergence from independence has a well-established underlying statistical theory. It provides a plain, mathematically tractable, and nonparametric way of weighting terms and, moreover, requires no term frequency normalization. Besides its sound theoretical background, experiments on TREC test collections show that its performance is generally comparable to that of state-of-the-art term weighting methods. It is a simple but powerful baseline alternative to the state-of-the-art methods in both its theoretical and practical aspects.
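The abstract does not state a formula, so the following is only a minimal sketch of how a divergence-from-independence weight might be computed, not the authors' exact method. It assumes that the expected frequency of a term in a document under independence is the term's collection frequency scaled by the document's share of all tokens, and that divergence is measured by a standardized residual damped with a logarithm. The function name `dfi_weight` and all parameter names are hypothetical.

```python
import math


def dfi_weight(tf, doc_len, term_total, collection_len):
    """Illustrative divergence-from-independence weight (an assumption,
    not the paper's stated formula).

    tf             -- occurrences of the term in the document
    doc_len        -- number of tokens in the document
    term_total     -- occurrences of the term in the whole collection
    collection_len -- number of tokens in the whole collection
    """
    # Frequency expected if the term and the document were independent.
    expected = term_total * doc_len / collection_len

    # A term that occurs no more often than chance predicts gets no weight.
    if tf <= expected:
        return 0.0

    # Standardized residual, log-damped so that large counts do not dominate.
    return math.log2(1.0 + (tf - expected) / math.sqrt(expected))


# Hypothetical counts: the term occurs 10 times in a 1000-token collection,
# so a 100-token document is expected to contain it once.
print(dfi_weight(tf=4, doc_len=100, term_total=10, collection_len=1000))
print(dfi_weight(tf=1, doc_len=100, term_total=10, collection_len=1000))
```

In this sketch the document length enters only through the expected frequency, which is one way to read the claim that no separate term frequency normalization is needed, and the function has no tunable parameters.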
This article is indexed in SpringerLink and other databases.