
DIAL
2004
IEEE

Retrieving Imaged Documents in Digital Libraries Based on Word Image Coding

A great number of documents are scanned and archived as digital images in digital libraries to make them available and accessible on the Internet. Information retrieval from these imaged documents has become a growing and challenging problem. For this purpose, this paper proposes a word image coding technique and describes a web-based system for efficiently retrieving imaged documents from digital libraries. Image preprocessing is first carried out off-line to extract word objects from the imaged documents stored in the digital library. Each word object is then represented by a string of feature codes. As a result, each document image is represented by a series of feature code strings, one per word, which are stored in a feature code file. Upon receiving a user's request, the server converts the query word into a feature code string using the same conversion mechanism used to produce the feature codes for the underlying imaged documents. Searching is then p...
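The pipeline outlined in the abstract (encode each word off-line, encode the query with the same mechanism, then match code strings) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not specify the feature codes, so the sketch substitutes a coarse character-shape code (ascender, descender, x-height), and it uses plain text words as stand-ins for word objects extracted from images. The names `encode_word`, `build_index`, and `search` are hypothetical and are not taken from the paper.

```python
from collections import defaultdict

# ASSUMPTION: a toy shape alphabet, not the paper's feature codes.
ASCENDERS = set("bdfhklt")
DESCENDERS = set("gjpqy")


def encode_word(word: str) -> str:
    """Map a word to a coarse shape-code string (illustrative stand-in)."""
    codes = []
    for ch in word:
        if ch.isupper() or ch.isdigit() or ch in ASCENDERS:
            codes.append("A")  # tall glyph
        elif ch in DESCENDERS:
            codes.append("D")  # glyph that drops below the baseline
        elif ch.isalpha():
            codes.append("x")  # x-height glyph
        # punctuation and other characters are skipped
    return "".join(codes)


def build_index(documents: dict[str, list[str]]) -> dict[str, set[str]]:
    """Off-line step: store, per code string, the documents containing it."""
    index = defaultdict(set)
    for doc_id, words in documents.items():
        for word in words:
            index[encode_word(word)].add(doc_id)
    return dict(index)


def search(index: dict[str, set[str]], query: str) -> list[str]:
    """On-line step: encode the query the same way and look up exact matches."""
    return sorted(index.get(encode_word(query), set()))


# Hypothetical two-document collection.
documents = {
    "doc1": ["digital", "library"],
    "doc2": ["word", "image"],
}
index = build_index(documents)
print(search(index, "library"))
print(search(index, "word"))
```

Because both sides use the same encoder, a query can only match words whose code strings are identical. A code this coarse also produces collisions (for example, "cord" and "word" share the same string under this toy scheme), so a practical system would need richer codes or a verification step after matching.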
Yue Lu, Li Zhang, Chew Lim Tan
Added 20 Aug 2010
Updated 20 Aug 2010
Type Conference
Year 2004
Where DIAL