We consider the problem of dust: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Rapid increase in the number of pages on web sites, and widespread use of search engine optimization techniques, lead to web sites becoming difficult to navigate. Traditional site ...
The multimedia information retrieval community has dedicated extensive research effort to the problem of content-based image retrieval (CBIR). However, these systems find their ma...
Tom Yeh, Brandyn White, Jose San Pedro, Boris Katz...
Background: High-throughput molecular biology provides new data at an incredible rate, so that the increase in the size of biological databanks is enormous and very rapid. This sc...
We present a gateway between the WWW and the Gnutella peer-topeer network that permits searchers on one side to be able to search and retrieve files on the other side of the gatew...