The PDF format is commonly used for the exchange of documents on the Web, and there is a growing need to understand and extract or repurpose data held in PDF documents. Many system...
The AncestorRank algorithm calculates an authority score by using just one characteristic of the web graph—the number of ancestors per node. For scalability, we estimate the num...
Bioinformatic data sources available on the web are multiple and heterogeneous. The lack of documentation and the difficulty of interaction with these data banks require us...
Data items are often associated with a location in which they are present or collected, and their relevance or influence decays with their distance. Aggregate values over such data...
Many document-based applications, including popular Web browsers, email viewers, and word processors, have a ‘Find on this Page’ feature that allows a user to find every occur...
Kevyn Collins-Thompson, Charles Schweizer, Susan T...