The key problem with overlapping structures, or concurrent markup hierarchies, in XML encodings of documents is that markup in one hierarchy is not necessarily well-formed with respect to the...
This paper presents PDF-TREX, a heuristic approach for table recognition and extraction from PDF documents. The heuristic starts from an initial set of basic content elements an...
Business Process Re-engineering (BPR) is an area that requires a lot of technical documents, and an important feature of a well-written document is a coherent narrative. Even thou...
This paper presents a methodology for summarization from multiple documents which are about a specific topic. It is based on the specification and identification of the cross-document...
Stergos D. Afantenos, Irene Doura, Eleni Kapellou,...
This work explores the application of clustering methods for grouping structurally similar XML documents. Modeling the XML documents as rooted ordered labeled trees, we apply clust...
Theodore Dalamagas, Tao Cheng, Klaas-Jan Winkel, T...