The PDF format is commonly used for the exchange of documents on the Web and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
The aggregated structure of documents plays a key role in full-text, multimedia, and network Information Retrieval (IR). Considering aggregation provides new querying facilities a...
Assistance in retrieving of documents on the World Wide Web is provided either by search engines, through keyword based queries, or by catalogues, which organise documents into hi...
We describe our participation in the 2009 CLEF-IP task, which was targeted at priorart search for topic patent documents. Our system retrieved patent documents based on a standard...
It is well known that anchor text plays a critical role in a variety of search tasks performed over hypertextual domains, including enterprise search, wiki search, and web search....
Donald Metzler, Jasmine Novak, Hang Cui, Srihari R...