A wealth of information is available only in web pages, patents, publications, etc. Extracting information from such sources is challenging, both due to the typically complex langu...
The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
A new method for augmenting paper documents with electronic information is described that does not modify the format of the paper document in any way. Applicable to both commercia...
Jonathan J. Hull, Berna Erol, Jamey Graham, Qifa K...
Consider a rooted directed acyclic graph G = (V, E) with root r, representing a collection V of web pages connected via a set E of hyperlinks. Each node v is associated with the pr...
This paper addresses the problem of the graceful degradation of user interfaces where an initial interface is transferred to a smaller platform. It presents a technique for pagina...
Murielle Florins, Francisco Montero Simarro, Jean ...