The proliferation of knowledge-sharing communities like Wikipedia and the advances in automated information extraction from Web pages enable the construction of large knowledge ba...
Semistructured data is not strictly typed like relational or object-oriented data and may be irregular or incomplete. It often arises in practice, e.g., when heterogeneous data so...
Serge Abiteboul, Jason McHugh, Michael Rys, Vasili...
XML has become the most useful standard of data interchange in the web and e-business world and there is a large amount of information stored in this format. Nonetheless, a large ...
Temporal expressions, such as between 1992 and 2000, are frequent across many kinds of documents. Text retrieval, though, treats them as common terms, thus ignoring their inherent...
Irem Arikan, Srikanta J. Bedathur, Klaus Berberich
We investigate using topic prediction data, as a summary of document content, to compute measures of search result quality. Unlike existing quality measures such as query clarity t...