Search Sciweavers | Sciweavers

1261 search results - page 50 / 253

» Extracting Text from PostScript

LREC
2010

Automatic Acquisition of Parallel Corpora from Websites with Dynamic Content

Download cs.haifa.ac.il

Parallel corpora are indispensable resources for a variety of multilingual natural language processing tasks. This paper presents a technique for fully automatic construction of c...

Yulia Tsvetkov, Shuly Wintner

CIKM
2008
Springer

CoreEx: Content Extraction from Online News Articles

Download ilpubs.stanford.edu

We developed and tested a heuristic technique for extracting the main article from news site Web pages. We construct the DOM tree of the page and score every node based on the amo...

Jyotika Prasad, Andreas Paepcke

ICDAR
2003
IEEE

Proper Names Extraction from Fax Images Combining Textual and Image Features

Download www.cse.salford.ac.uk

In the frame of a Unified Messaging System, a crucial task of the system is to provide the user with key information on every message received, like keywords reflecting the object...

Laurence Likforman-Sulem, Pascal Vaillant, Franç...

DAS
2006
Springer

Segmentation-Driven Recognition Applied to Numerical Field Extraction from Handwritten Incoming Mail Documents

Download clement.chatelain.free.fr

In this paper, we present a method for the automatic extraction of numerical fields (zip codes, phone numbers, etc.) from incoming mail documents. The approach is based o...

Clément Chatelain, Laurent Heutte, Thierry ...

ERCIMDL
2010
Springer

SciPlore Xtract: Extracting Titles from Scientific PDF Documents by Analyzing Style Information (Font Size)

Download www.sciplore.org

Extracting titles from a PDF's full text is an important task in information retrieval to identify PDFs. Existing approaches apply complicated and expensive (in terms of calculating...

Jöran Beel, Bela Gipp, Ammar Shaker, Nick Fri...
