More and more documents on the World Wide Web are based on templates. On a technical level, this gives those documents very similar source code and DOM tree structures. G...
Document similarity search (i.e. query by example) aims to retrieve a ranked list of documents similar to a query document in a text corpus or on the Web. Most existing approaches...
The primary objective of document annotation in whatever form, manual or electronic, is to allow those who may not have control over the original document to provide a personal view on inf...
Form document analysis is one of the most essential tasks in document analysis and recognition. One of the most fundamental and crucial tasks is the extraction of the reference li...
Effective and efficient management and manipulation of XML documents require stable decisions at the time a document enters the XML DBMS to provide for storage structure...