Decoding noisy document images is commonly needed in applications such as enterprise content management. Available OCR solutions are still not satisfactory especially on noisy ima...
This paper describes a system for efficient indexing and retrieval of words in collections of document images. The proposed method is based on two main principles: unsupervised pr...
In this paper we present a novel model based approach to detect severely broken parallel lines in noisy textual documents. It is important to detect and remove these lines so the ...
Most methods for document image retrieval rely solely on text information to find similar documents. This paper describes a way to use layout information for document image retrie...
Joost van Beusekom, Daniel Keysers, Faisal Shafait...
Current data warehouse and OLAP technologies can be applied to analyze the structured data that companies store in their databases. The circumstances that describe the context ass...