We consider the problem of dust: Different URLs with Similar Text. Such duplicate URLs are prevalent in web sites, as web server software often uses aliases and redirections, and...
Recently the world of the web has become more social and more real-time. Facebook and Twitter are perhaps the exemplars of a new generation of social, real-time web services and w...
Today’s search engines are increasingly required to broaden their capabilities beyond free-text search. More complex features, such as supporting range constraints over numeric ...
Marcus Fontoura, Ronny Lempel, Runping Qi, Jason Y...
Keyword search is a user-friendly mechanism for retrieving XML data in web and scientific applications. An intuitively compelling but vaguely defined goal is to identify matches t...
It is important yet hard to identify navigational queries in Web search due to a lack of sufficient information in Web queries, which are typically very short. In this paper we st...