Search Sciweavers | Sciweavers

DAS
2006
Springer

Document Analysis

A System for Converting PDF Documents into Structured XML Format

14 years 1 months ago

Download www.xrce.xerox.com

We present in this paper a system for converting PDF legacy documents into structured XML format. This conversion system first extracts the different streams contained in PDF files...

Hervé Déjean, Jean-Luc Meunier

DAS
2006
Springer

Document Analysis

XCDF: A Canonical and Structured Document Format

13 years 11 months ago

Download www.bloechle.ch

Accessing the structured content of PDF documents is a difficult task, requiring pre-processing and reverse engineering techniques. In this paper, we first present different methods...

Jean-Luc Bloechle, Maurizio Rigamonti, Karim Hadja...

ICDAR
2009
IEEE

Document Analysis

OCD: An Optimized and Canonical Document Format

13 years 7 months ago

Download www.cvc.uab.es

Revealing and being able to manipulate the structured content of PDF documents is a difficult task, requiring pre-processing and reverse engineering techniques. In this paper, we ...

Jean-Luc Bloechle, Denis Lalanne, Rolf Ingold

ICDAR
2003
IEEE

Document Analysis

Document Transformation System from Papers to XML Data Based on Pivot XML Document Method

14 years 2 months ago

Download www.cse.salford.ac.uk

This paper proposes a new method for document transformation using OCR to generate various XML documents from printed documents. The proposed method adopts a hierarchical transfor...

Yasuto Ishitani

DOCENG
2009
ACM

Document Analysis

Object-level document analysis of PDF files

14 years 3 months ago

Download www.dbai.tuwien.ac.at

The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...

Tamir Hassan
