In this paper we propose a new approach to improve electronic editions of human science corpus, providing an efficient estimation of manuscripts pages structure. In any handwriti...
Detecting and representing changes to data is important for active databases, data warehousing, view maintenance, and version and configuration management. Most previous work in c...
Sudarshan S. Chawathe, Anand Rajaraman, Hector Gar...
The web hasgreatly improved accessto scientific literature. However, scientific articles on the web are largely disorganized, with research articles being spreadacrossarchive site...
Numerous approaches, including textual, structural and featural, to detecting duplicate documents have been investigated. Considering document images are usually stored and transm...
Text clustering methods can be used to structure large sets of text or hypertext documents. The well-known methods of text clustering, however, do not really address the special p...