This paper presents PDF-TREX, a heuristic approach for table recognition and extraction from PDF documents. The heuristic starts from an initial set of basic content elements an...
This paper suggests an alternative solution for the task of spoken document retrieval (SDR). The proposed system runs retrieval on multi-level transcriptions (word and phone) prod...
Shan Jin, Hemant Misra, Thomas Sikora, Joemon M. J...
We argue that taking into account semantic relations between words in the text can improve information retrieval performance. We implemented the process of information retrieva...
The INEX query languages allow the extraction of fragments from selected documents. This power is not much used in INEX queries. The paper suggests reasons why, and considers which...
In this paper, we address the problem of database selection for XML document collections, that is, given a set of collections and a user query, how to rank the collections based o...