In state-of-the-art image retrieval systems, an image is represented by a bag of visual words obtained by quantizing high-dimensional local image descriptors, and scalable schem...
Zhong Wu (Tsinghua University), Qifa Ke (Microsoft...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents. Titles of HTML documents should be correctly defined in the title fields; however, i...
WAIF is a new framework that gives Internet users easy access to relevant news items. WAIF supports new kinds of browsers, personalized filters, recommendation systems...
Dag Johansen, Robbert van Renesse, Fred B. Schneid...
Search engines present fixed-length passages from documents ranked by relevance against the query. In this paper, we present and compare novel, language-model-based methods for extr...
Traditionally, Information Extraction (IE) has focused on satisfying precise, narrow, pre-specified requests from small homogeneous corpora (e.g., extract the location and time o...
Michele Banko, Michael J. Cafarella, Stephen Soder...