This paper shares our experience in designing a web crawler that can download billions of pages using a single-server implementation and models its performance. We show that with ...
In this paper we investigate the task of Entity Ranking on the Web. Searchers looking for entities are arguably better served by presenting a ranked list of entities directly, rat...
Rianne Kaptein, Pavel Serdyukov, Arjen P. de Vries...
GiveALink.org is a social bookmarking site where users may donate and view their personal bookmark files online securely. The bookmarks are analyzed to build a new generation of i...
Benjamin Markines, Lubomira Stoilova, Filippo Menc...
In this paper we extend the classical portal (with static portlets) design with HTML DOM Web clipping on the client browser using dynamic JavaScript portlets: the portal server su...
Abstract. The PageRank algorithm is used today within web information retrieval to provide a content-neutral ranking metric over web pages. It employs power method iterations to so...