We describe the design of a censorship-resistant system that employs a unique document storage mechanism. Newly published documents are dependent on the blocks of previously publi...
Lattice-based approaches have been widely used in spoken document retrieval to handle speech recognition uncertainty and errors. Position Specific Posterior Lattices (PSPL) an...
In this paper, we address the problem of database selection for XML document collections, that is, given a set of collections and a user query, how to rank the collections based o...
We present a general approach for the hierarchical segmentation and labeling of document layout structures. This approach models document layout as a grammar and performs a glo...
PixED (from Pixel to Electronic Document) aims to convert document images into structured electronic documents that can be read by a machine for information retrieval. The...