A hierarchical algorithm is presented for determining the similarity and equivalence of document images. Features extracted from the CCITT fax-compressed representations of two images are compared to determine whether the images are visually similar and whether they are equivalent. Pass codes in the compressed data are used as features. A fixed grid is imposed on the image, and a feature vector is derived from the number of pass codes in each grid cell. The feature vectors are compared to locate a group of documents that are visually similar to the input image. The equivalence of two documents is then determined by applying the Hausdorff distance to the two-dimensional arrangement of pass codes in small patches of each image.
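The two stages described above can be illustrated with a minimal sketch. It assumes the pass-code positions have already been recovered from the compressed stream as (x, y) pixel coordinates; the function names, the grid dimensions, and the use of Euclidean distance to compare feature vectors are illustrative assumptions, not details taken from the paper.

```python
import math

def grid_feature_vector(pass_codes, width, height, rows, cols):
    """Count pass codes per cell of a fixed rows x cols grid (row-major order).

    pass_codes: iterable of (x, y) positions, assumed already decoded.
    """
    counts = [0] * (rows * cols)
    for x, y in pass_codes:
        # Map the pixel position to a cell, clamping points on the far edge.
        col = min(x * cols // width, cols - 1)
        row = min(y * rows // height, rows - 1)
        counts[row * cols + col] += 1
    return counts

def feature_distance(u, v):
    """Euclidean distance between two feature vectors (assumed metric)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def directed_hausdorff(a, b):
    """Largest distance from any point in a to its nearest point in b."""
    return max(
        min(math.hypot(p[0] - q[0], p[1] - q[1]) for q in b)
        for p in a
    )

def hausdorff(a, b):
    """Symmetric Hausdorff distance between two non-empty point sets."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))
```

In this sketch, a small `feature_distance` would place a document in the group of visually similar candidates, and a small `hausdorff` value between corresponding patches would then suggest equivalence. The thresholds for both decisions are not specified here and would need to be chosen empirically.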
						
							
					 															
Jonathan J. Hull, John F. Cullen