In this paper, we will present a comprehensive voting approach, taking entire layouts obtained from commercial OCR devices as input. Such a layout comprises segments of three kind...
Reading frequently involves not just looking at words on a page, but also underlining, highlighting and commenting, either on the text or in a separate notebook. This combination ...
Bill N. Schilit, Gene Golovchinsky, Morgan N. Pric...
Abstract. CARARE is a best practice network funded by the European Commission’s ICT Policy Support Programme. The network brings together heritage agencies, organisations, archae...
This paper describes our experiments on the two tasks of the TREC 2007 Enterprise track. In the data preprocessing stage, we stripped non-letter characters from documents and query....
A hierarchical algorithm is presented for determining the similarity and equivalence of document images. Features extracted from the CCITT fax-compressed representations of two im...