In this paper, we exploit a novel ranking mechanism that processes query samples with noisy labels, motivated by the practical application of web image search re-ranking where the...
Hidden Web databases maintain a collection of specialised documents, which are dynamically generated in response to users' queries. The majority of these documents are genera...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
Results of queries by personal names often contain documents related to several people because of the namesake problem. In order to differentiate documents related to different...
Entity matching is an important and difficult step for integrating web data. To reduce the typically high execution time for matching, we investigate how we can perform entity matc...
Toralf Kirsten, Lars Kolb, Michael Hartung, Anika ...
We consider the problem of sampling URLs uniformly at random from the Web. A tool for sampling URLs uniformly can be used to estimate various properties of Web pages, such as the ...
Monika Rauch Henzinger, Allan Heydon, Michael Mitz...