Structured link vector model (SLVM) is a recently proposed document representation that takes into account both structural and semantic information for measuring XML document simi...
In order to overcome poor readability of text and recognizability of image features in low resolution thumbnails, a novel image representation of compound document images - a Smar...
Kathrin Berkner, Edward L. Schwartz, Christophe Ma...
Most Web and legacy paper-based documents are available in human comprehensible text form, not readily accessible to or understood by computer programs. Here, we investigate an ...
There is a significant need for a realistic dataset on which to evaluate layout analysis methods and examine their performance in detail. This paper presents a new dataset (and...
Apostolos Antonacopoulos, David Bridson, Christos ...
We present the sTeX system, a semantic extension of LaTeX, that allows for producing high-quality PDF documents for (proof)reading and printing, as well as semantic XML/OMDoc docu...
Andrea Kohlhase, Michael Kohlhase, Christoph Lange...