Nowadays, structured data such as sales and business forms are stored in data warehouses for decision makers to use. Further, unstructured data such as emails, html texts, images,...
Abstract. Modern document collections often contain groups of documents with overlapping or shared content. However, most information retrieval systems process each document separa...
Andrei Z. Broder, Nadav Eiron, Marcus Fontoura, Mi...
In this paper, we present an online citation entry clustering system based on three-tier clustering. The objective is to further process search results returned by bibliography dat...
Huge amount of information is present in the World Wide Web and a large amount is being added to it frequently. A query-specific summary of multiple documents is very helpful to t...
In this paper, we present the multilingual Sense Folder Corpus. After the analysis of different corpora, we describe the requirements that have to be satisfied for evaluating sema...