This paper describes the design and implementation of an embedded-type XML storage and retrieval system built on top of relational databases. The proposed system stores ea...
As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. The goal of this work i...
We present Storm, a storage system that unifies the desktop and the public network, making Web links between desktop documents more practical. Storm assigns each document a perm...
Benja Fallenstein, Tuomas J. Lukka, Hermanni Hyyti...
The Health Level 7 Clinical Document Architecture (CDA) is an XML-based document markup standard that specifies the hierarchical structure and semantics of “clinical documents” ...
Word fragments or n-grams have been widely used to perform different Natural Language Processing tasks such as information retrieval [1] [2], document categorization [3], automatic...