In this paper, we present our participation in the ImageCLEF 2005 ad-hoc track. First, we describe a preliminary pool of cross-language experiments with the ImageCLEF 2004 testbed...
For the task of near-duplicated document detection, both traditional fingerprinting techniques used in database community and bag-of-word comparison approaches used in information...
Statistical summaries in relational databases mainly focus on the distribution of data values and have been found useful for various applications, such as query evaluation and data...
On the Web, there is a pervasive use of XML to give lightweight semantics to textual collections. Such documentcentric XML collections require a query language that can gracefully...
Jaap Kamps, Maarten Marx, Maarten de Rijke, Bö...
As the rapid growth of PDF document in digital libraries, recognizing the document structure and detecting specific document components are useful for document storage, classifica...