: This paper reports on the Joaquim Nabuco Project, a pioneering work in Latin America on document digitalization, enhancement, compression, indexing, retrieval and network transmi...
Abstract. For the 2008 participation at GeoCLEF, we focused on improving the extraction of geographic signatures from documents and optimising their use for GIR. The results show t...
The difficulty with information retrieval for OCR documents lies in the fact that OCR documents comprise of a significant amount of erroneous words and unfortunately most informat...
Encouraged by a significant improvement over LSI (latent semantic indexing) approach in textual information retrieval of the DLSI (differential latent semantic indexing) approach ...
The normal practice of selecting relevant documents for training routing queries is to either use all relevants or the 'best n' of them after a (retrieval) ranking opera...