In this paper, we introduce a visualization method that couples a trend chart with word clouds to illustrate the temporal evolution of content in a set of documents. Specifically, we us...
Document retrieval and web search engines index large quantities of text. The static costs associated with storing the index can be traded against dynamic costs associated with us...
Many of the documents in large text collections are duplicates and versions of each other. In recent research, we developed new methods for finding such duplicates; however, as the...
In this paper, we discuss how to present the results of searching XML documents for elements of any type that are relevant to some information need (relevance-oriented search). As the resu...
Text classification categorizes Web documents in large collections into predefined classes based on their contents. Unfortunately, the classification process can be time-consumi...