The XQuery language is the standard query language for XML. The results of XQuery queries are typically XML documents in other formats. We introduce the term output schemas of an X...
We present the results of experiments using terms from citations for scientific literature search. To index a given document, we use terms used by citing documents to describe that...
Supporting entity extraction from large document collections is important for enabling a variety of important data analysis tasks. In this paper, we introduce the "ad-hoc"...
Abstract. Topic models are a discrete analogue to principal component analysis and independent component analysis that model topic at the word level within a document. They have ma...
To evaluate the performance of information retrieval and extraction algorithms, we need test collections. A test collection consists of a set of documents, a clearly form...