Group 4 Compressed Document Matching

Numerous approaches to detecting duplicate documents, including textual, structural and featural ones, have been investigated. Because document images are usually stored and transmitted in compressed form, it is advantageous to perform document matching directly on the compressed data. An algorithm is presented for matching CCITT Group 4 compressed document images using a feature set that can be computed directly from the Group 4 compression scheme. Multiple descriptors based on the local arrangement of feature points are constructed for efficient indexing into the database. We describe the procedures for feature extraction and descriptor generation, and discuss the performance of the algorithm on the UW database.
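The indexing idea in the abstract (descriptors built from the local arrangement of feature points, used as keys into a database) can be illustrated with a minimal sketch. It assumes the feature points have already been extracted from the Group 4 stream as (x, y) coordinates. The descriptor used here, quantized offsets to the k nearest neighbours, is an illustrative stand-in and not necessarily the descriptor defined in the paper. The names `local_descriptors` and `DescriptorIndex` and the parameters `k` and `cell` are hypothetical.

```python
from collections import Counter, defaultdict
from math import hypot


def local_descriptors(points, k=3, cell=8):
    """Build one hashable descriptor per feature point.

    Assumption (not from the paper): a descriptor is the sorted tuple of
    offsets to the k nearest neighbours, quantized to a grid of `cell`
    pixels. Offsets are relative, so descriptors survive translation.
    """
    descriptors = []
    for i, (x, y) in enumerate(points):
        others = [p for j, p in enumerate(points) if j != i]
        # Sort by distance; the point itself breaks ties deterministically.
        others.sort(key=lambda p: (hypot(p[0] - x, p[1] - y), p))
        nearest = others[:k]
        if len(nearest) < k:
            continue  # too few neighbours to describe this point
        descriptors.append(
            tuple(sorted(((px - x) // cell, (py - y) // cell)
                         for px, py in nearest))
        )
    return descriptors


class DescriptorIndex:
    """Inverted index from descriptor to the documents containing it."""

    def __init__(self):
        self._table = defaultdict(set)

    def add(self, doc_id, points):
        for descriptor in local_descriptors(points):
            self._table[descriptor].add(doc_id)

    def query(self, points):
        """Return (doc_id, votes) pairs, best match first."""
        votes = Counter()
        for descriptor in local_descriptors(points):
            for doc_id in self._table.get(descriptor, ()):
                votes[doc_id] += 1
        return votes.most_common()


index = DescriptorIndex()
doc_a = [(0, 0), (10, 0), (0, 20), (30, 30), (50, 10)]
doc_b = [(0, 0), (40, 0), (0, 40), (80, 80), (5, 60)]
index.add("a", doc_a)
index.add("b", doc_b)

# A translated copy of document "a" should still vote for "a".
shifted_a = [(x + 100, y + 50) for x, y in doc_a]
ranking = index.query(shifted_a)
```

Because each point contributes its own descriptor, a query can still collect votes for the right document when some feature points are missing or spurious, which is one common motivation for using multiple local descriptors rather than a single global signature.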
Dar-Shyang Lee, Jonathan J. Hull
Added 05 Aug 2010
Updated 05 Aug 2010
Type Conference
Year 1998
Where DAS