In this study, we describe our system at the Intellectual Property track of the 2009 CrossLanguage Evaluation Forum campaign (CLEF-IP). The CLEF-IP track addressed prior art searc...
We present Opal, a light-weight framework for interactively locating missing web pages (http status code 404). Opal is an example of “in vivo” preservation: harnessing the col...
We consider the problem of efficiently sampling Web search engine query results. In turn, using a small random sample instead of the full set of results leads to efficient approxi...
Aris Anagnostopoulos, Andrei Z. Broder, David Carm...
Phrase structure trees have a hierarchical structure. In many subjects, most notably in taxonomy such tree structures have been studied using ultrametrics. Here syntactical hierar...
Automatic extraction of semantic information from text and links in Web pages is key to improving the quality of search results. However, the assessment of automatic semantic meas...
Ana Gabriela Maguitman, Filippo Menczer, Heather R...