This paper presents a generic architecture for handwritten document analysis. It covers all analysis steps from the content description of the document (layout analysis, handwrit...
We describe an HTML web page segmentation algorithm, which is applied to segment online medical journal articles (regular HTML and PDF-Converted-HTML files). The web page content ...
The web has greatly improved access to scientific literature. However, scientific articles on the web are largely disorganized, with research articles being spread across archive site...
This paper describes our experiments in Geographical Information Retrieval (GIR) in the context of our participation in the GeoCLEF 2006 Monolingual English task. The TALPGeoIR sy...
Large-scale digitisation has led to a number of new possibilities with regard to adaptive and learning-based methods in the field of Document Image Analysis and OCR. For ground t...
C. Clausner, Stefan Pletschacher, Apostolos Antona...