Retrieval from Hindi document image collections is a challenging task. This is partly due to the complexity of the script, which has more than 800 unique ligatures. In addition,...
Raman Jain, Volkmar Frinken, C. V. Jawahar, Raghav...
Document clustering is a very hard task in Automatic Text Processing since it requires extracting regular patterns from a document collection without a priori knowledge on the cat...
The Health Level 7 Clinical Document Architecture (CDA) is an XML-based document markup standard that specifies the hierarchical structure and semantics of “clinical documents” ...
This paper describes a system for efficient indexing and retrieval of words in collections of document images. The proposed method is based on two main principles: unsupervised pr...
In the last few years, interest in native XML databases has surfaced. Along with other authors, we argue that such databases need their own provisions for concurrency control since tra...