A distributed memory parallel version of the group average Hierarchical Agglomerative Clustering algorithm is proposed to enable scaling the document clustering problem to large c...
Rebecca Cathey, Eric C. Jensen, Steven M. Beitzel,...
This paper presents a Guided tour system for the WWW, called Ariadne, which implements the ideas of trails and guided tours, originating from the hypertext field. Ariadne appears ...
We develop a method for predicting query performance by computing the relative entropy between a query language model and the corresponding collection language model. The resultin...
Current statistical parsers tend to perform well only on their training domain and nearby genres. While strong performance on a few related domains is sufficient for many situatio...
Wikipedia provides a wealth of knowledge, where the first sentence, infobox (and relevant sentences), and even the entire document of a wiki article could be considered as diverse...