Abstract. Modern document collections often contain groups of documents with overlapping or shared content. However, most information retrieval systems process each document separa...
Andrei Z. Broder, Nadav Eiron, Marcus Fontoura, Mi...
In this paper, we present an online citation entry clustering system based on three-tier clustering. The objective is to further process search results returned by bibliography dat...
A huge amount of information is present on the World Wide Web, and a large amount is added to it frequently. A query-specific summary of multiple documents is very helpful to t...
In this paper, we present the multilingual Sense Folder Corpus. After the analysis of different corpora, we describe the requirements that have to be satisfied for evaluating sema...
Next-generation Government Information Systems will integrate large numbers of heterogeneous data sources located on distributed networks such as the Internet. We present Net Travele...