As online document collections continue to expand, both on the Web and in proprietary environments, the need for duplicate detection becomes more critical. Few users wish to retri...
A standard load balancing model considers placing Ò balls into Ò bins by choosing possible locations for each ball independently and uniformly at random and sequentially placing...
Michael Mitzenmacher, Balaji Prabhakar, Devavrat S...
Soon, much of the data exchanged over the Internet will be encoded in XML, allowing for sophisticated filtering and content-based routing. We have built a filtering engine called ...
Yanlei Diao, Peter M. Fischer, Michael J. Franklin...
Revisiting the “small-world” experiments of the ’60s, Kleinberg observed that individuals are very effective at constructing short chains of acquaintances between any two p...
Following the exponential growth of social media, there now exist huge repositories of videos online. Among the huge volumes of videos, there exist large numbers of near-duplicate...