In this paper, we propose a machine learning approach to title extraction from general documents. By general documents, we mean documents that can belong to any one of a number of...
Yunhua Hu, Hang Li, Yunbo Cao, Dmitriy Meyerzon, Q...
The ability to find tables and extract information from them is a necessary component of data mining, question answering, and other information retrieval tasks. Documents often c...
David Pinto, Andrew McCallum, Xing Wei, W. Bruce C...
Researchers spend a large amount of their time searching through an ever-increasing number of scientific articles. Although users of scientific search engines prefer the ranking o...
Developed using the principles of the Model-View-Controller architectural pattern, FolksEngine is a parametric search engine for folksonomies that allows us to test arbitrary sear...
Nicola Raffaele Di Matteo, Silvio Peroni, Fabio Ta...
The classical (ad hoc) document retrieval problem has been traditionally approached through ranking according to heuristically developed functions (such as tf-idf or BM25) or gene...