In this paper, we present a novel approach to search and retrieve from document image collections, without explicit recognition. Existing recognition-free approaches such as wor...
Web pages contain clutter (such as ads, unnecessary images and extraneous links) around the body of an article, which distracts a user from the actual content. Extraction of "use...
The representation of information collections needs to be optimized for human cognition. While documents often include rich visual components, collections, including personal coll...
Document image segmentation algorithms primarily aim at separating text and graphics in the presence of complex layouts. However, for many non-Latin scripts, segmentation becomes a ch...
This paper describes Gallery, our experimental system under development for supporting the management of personal digital knowledge repositories, which enables its users...