We present a framework to analyze color documents of complex layout. In addition, no assumption is made on the layout. Our framework combines in a content-driven bottom-up approac...
This paper investigates methods to automatically infer structural information from large XML documents. Using XML as a reference format, we approach the schema generation problem ...
In this paper, we study query evaluation on Active XML documents (AXML for short), a new generation of XML documents that has recently gained popularity. AXML documents are XML do...
In order to evaluate the performance of information retrieval and extraction algorithms, we need test collections. A test collection consists of a set of documents, a clearly form...
Recently, high resolution digital cameras have made the digitization process more flexible and convenient than traditional scanning technology. Therefore, document image analysis ...