This paper investigates the applicability of a distributed clustering technique called RACHET [1] to organizing large sets of distributed text data. Although the authors of RACHET c...
One of the major challenges in camera document analysis is dealing with page curl and perspective distortions. In spite of the prevalence of dewarping techniques, no standard ...
Nikolaos Stamatopoulos, Basilios Gatos, Ioannis Pr...
We present a system that classifies pixels in a document image according to marking type such as machine print, handwriting, and noise. A segmenter module first splits an input ...
In searching a repository of business documents, a task of interest is using a query signature image to retrieve other matching signatures from a database. The ...
Sargur N. Srihari, Shravya Shetty, Siyuan Chen, Ha...
The structured link vector model (SLVM) is a recently proposed document representation that takes into account both structural and semantic information for measuring XML document simi...