Hash tables are one of the most fundamental data structures in computer science, in both theory and practice. They are especially useful in external memory, where their query perf...
A primary challenge to large-scale data integration is creating semantic equivalences between elements from different data sources that correspond to the same real-world entity or...
Shawn R. Jeffery, Michael J. Franklin, Alon Y. Hal...
We present a case study about the application of the inductive database approach to the analysis of Web logs. We consider rich XML Web logs, called conceptual logs, that are gen...
Rosa Meo, Pier Luca Lanzi, Maristella Matera, Robe...
In this paper, we study search bot traffic from search engine query logs at a large scale. Although bots that generate search traffic aggressively can be easily detected, a large ...
There has been a large amount of research on efficient document retrieval in both IR and web search areas. One important technique to improve retrieval efficiency is early termina...