Word searching and indexing in historical document collections is a challenging problem because characters in these documents are often touching or broken due to degradation/agei...
Abstract: XML documents are widely used as a generic container for textual content. As they grow in size, XML databases have emerged to efficiently store and q...
Abstract. The labeling problem of dynamic XML documents has received increasing research attention. When XML documents are subject to insertions and deletions of nodes, it is impor...
This paper presents an automatic orientation detection and categorization technique that is capable of detecting the orientation of multilingual documents with arbitrary skew and ...
Like HTML, many XML documents reside on native file systems. Since XML data is irregular and verbose, disk space and network bandwidth are wasted. To overcome the ve...