This paper summarizes the work done at the State University of New York at Buffalo (UB) in the GeoCLEF 2006 track. The approach presented uses pure IR techniques (indexing of sing...
Miguel E. Ruiz, June M. Abbas, David Mark, Stuart ...
Spoken Document Retrieval (SDR) is a promising technology for enhancing the utility of spoken materials. After the spoken documents have been transcribed by using a Large Vocabula...
This paper presents a new way of thinking for IR metric optimization. It is argued that the optimal ranking problem should be factorized into two distinct yet interrelated stages:...
The aim of query-based sampling is to obtain a sufficient, representative sample of an underlying (text) collection. Current measures for assessing sample quality are too coarse gr...
Probabilistic Latent Semantic Indexing is a novel approach to automated document indexing which is based on a statistical latent class model for factor analysis of count data. Fit...