A frozen 18.5 million page snapshot of part of the Web has been created to enable and encourage meaningful and reproducible evaluation of Web search systems and techniques. This c...
David Hawking, Nick Craswell, Paul B. Thistlewaite...
In this paper we present NewsFlash, an adaptive search system that assists a searcher to efficiently search a library of stored TV news reports. The system automatically summarises...
Today, search engine is the most commonly used tool for Web information retrieval, however, its current status is still far from satisfaction. This paper focuses on clustering Web...
Given a user keyword query, current Web search engines return a list of individual Web pages ranked by their "goodness" with respect to the query. Thus, the basic unit fo...
Ramakrishna Varadarajan, Vagelis Hristidis, Tao Li
This paper1 presents an empirical approach to mining parallel corpora. Conventional approaches use a readily available collection of comparable, nonparallel corpora to extract par...