In this article, we describe the XML storage system used in the WebContent project. We begin by advocating the use of an XML database in order to store WebContent documents, and w...
This paper presents a novel formulation and approach to the minimal document set retrieval problem. Minimal Document Set Retrieval (MDSR) is a promising information retrieval task...
For more than a decade, researches on OLAP and multidimensional databases have generated methodologies, tools and resource management systems for the analysis of numeric data. With...
This paper presents a new adaptive approach for document image binarization. The proposed method is mainly based on the combination of several stateof-the-art binarization methodo...
Basilios Gatos, Ioannis Pratikakis, Stavros J. Per...
Given a terabyte click log, can we build an efficient and effective click model? It is commonly believed that web search click logs are a gold mine for search business, because th...
Anitha Kannan, Chao Liu 0001, Christos Faloutsos, ...