Versioned textual collections are collections that retain multiple versions of a document as it evolves over time. Important large-scale examples are Wikipedia and the web collect...
Knowledge Discovery in Databases (KDD), also known as data mining, focuses on the computerized exploration of large amounts of data and on the discovery of interesting patterns wi...
Typically, searching for information in a document collection amounts to refining a query and then scanning a large number of documents to determine their relevance. Active Summar...
The goal of text categorization is to classify documents into a certain number of pre-defined categories. The previous works in this area have used a large number of labeled train...
CT This paper explores several methods for visualizing the thematic content of large document collections. As opposed to traditional query-driven document retrieval, these methods ...
Nancy Miller, Elizabeth G. Hetzler, Grant Nakamura...