Sciweavers

ICDAR
2009
IEEE
13 years 2 months ago
OCD: An Optimized and Canonical Document Format
Revealing and being able to manipulate the structured content of PDF documents is a difficult task, requiring pre-processing and reverse engineering techniques. In this paper, we ...
Jean-Luc Bloechle, Denis Lalanne, Rolf Ingold
ICMLA
2008
13 years 6 months ago
Text, Image and Vector Graphics Based Appraisal of Contemporary Documents
We have designed a framework for content based appraisal of documents. Our motivation is to provide computer assisted support for answering several appraisal criteria according to...
Sang-Chul Lee, William McFadden, Peter Bajcsy
DIAL
2006
IEEE
13 years 8 months ago
Refinement of digitized documents through recognition of mathematical formulae
We are developing a recognition system, named 'Infty', for scientific documents including those with mathematical formulae. In this paper, we propose a new system that can re...
Toshihiro Kanahori, Masakazu Suzuki
DIAL
2004
IEEE
13 years 8 months ago
Xed: A New Tool for eXtracting Hidden Structures from Electronic Documents
PDF became a very common format for exchanging printable documents. Further, it can be easily generated from the major documents formats, which make a huge number of PDF documents...
Karim Hadjar, Maurizio Rigamonti, Denis Lalanne, R...
DOCENG
2004
ACM
13 years 10 months ago
Creating structured PDF files using XML templates
This paper describes a tool for recombining the logical structure from an XML document with the typeset appearance of the corresponding PDF document. The tool uses the XML represe...
Matthew R. B. Hardy, David F. Brailsford, Peter L....
DOCENG
2009
ACM
13 years 11 months ago
Object-level document analysis of PDF files
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
Tamir Hassan
MKM
2009
Springer
13 years 11 months ago
A Linear Grammar Approach to Mathematical Formula Recognition from PDF
Many approaches have been proposed over the years for the recognition of mathematical formulae from scanned documents. More recently a need has arisen to recognise formulae from PD...
Josef B. Baker, Alan P. Sexton, Volker Sorge