Temporal expressions, such as "between 1992 and 2000", are frequent across many kinds of documents. Text retrieval, though, treats them as common terms, thus ignoring their inherent...
Irem Arikan, Srikanta J. Bedathur, Klaus Berberich
XML documents are frequently used in applications such as business transactions and medical records involving sensitive information. Typically, parts of documents should be visibl...
Naizhen Qi, Michiharu Kudo, Jussi Myllymaki, Hamid...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Comparing retrieval approaches requires test collections, which consist of documents, queries and relevance assessments. Obtaining consistent and exhaustive relevance assessments ...
Search systems have for some time provided users with the ability to request documents similar to a given document. Interfaces provide this feature via a link or button for each d...