PARA
2000
Springer

Parallel and Distributed Document Overlap Detection on the Web

The proliferation of digital libraries and the availability of electronic documents on the Internet have created new challenges for computer science researchers and professionals. Documents are easily copied and redistributed, or used to create plagiarised assignments and conference papers. This paper presents a new two-stage approach for identifying overlapping documents. The first stage identifies a set of candidate documents, which are then compared in the second stage using a matching engine. The matching engine's algorithm is based on suffix trees and modifies the known matching statistics algorithm. Parallel and distributed approaches are discussed for both stages, and performance results are presented.
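To make the idea of matching statistics concrete, the following is a minimal Python sketch. It is not the paper's algorithm: it uses a naive quadratic scan in place of a suffix tree, and the function names `matching_statistics` and `overlap_ratio`, as well as the `min_len` threshold, are assumptions introduced only for illustration.

```python
def matching_statistics(text: str, reference: str) -> list[int]:
    """For each position i, return the length of the longest prefix of
    text[i:] that occurs somewhere in reference.

    Naive version for illustration; a suffix tree built over `reference`
    would avoid the repeated substring searches done here.
    """
    ms = []
    for i in range(len(text)):
        length = 0
        # Extend the match while the longer candidate still occurs in reference.
        while i + length < len(text) and text[i:i + length + 1] in reference:
            length += 1
        ms.append(length)
    return ms


def overlap_ratio(text: str, reference: str, min_len: int = 4) -> float:
    """Fraction of characters in text covered by matches of at least
    min_len characters (min_len is a hypothetical noise threshold)."""
    ms = matching_statistics(text, reference)
    covered = [False] * len(text)
    for i, length in enumerate(ms):
        if length >= min_len:
            for j in range(i, i + length):
                covered[j] = True
    return sum(covered) / len(text) if text else 0.0
```

In a two-stage setup like the one the abstract describes, a function such as `overlap_ratio` would only be applied to the candidate documents selected in the first stage, so the more expensive comparison runs on a small subset of the collection.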
Added 25 Aug 2010
Updated 25 Aug 2010
Type Conference
Year 2000
Where PARA
Authors Krisztián Monostori, Arkady B. Zaslavsky, Heinz W. Schmidt