A better understanding of a document's logical components is crucial to many applications, e.g., document classification or data integration. As the development of digital libraries...
In this paper, we present a novel approach for fast search of handwritten Arabic word-parts within large lexicons. The algorithm runs through three steps to achieve the required ...
Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...
Abstract—The grapheme codebook is a high-performing technique for offline writer identification. This paper considers whether the de facto standards for initial grapheme extrac...
Libraries in South Asia hold huge collections of valuable printed documents in Urdu, and it is of interest to digitize these collections to make them more accessible. The unavail...