Current-day crawlers retrieve content only from the publicly indexable Web, i.e., the set of Web pages reachable purely by following hypertext links, ignoring search forms and pag...
Web spamming techniques aim to achieve undeserved rankings in search results. Research has been widely conducted on identifying such spam and neutralizing its influence. However,...
In recent years, Content Delivery Networks (CDNs) and Peer-to-Peer (P2P) networks have emerged as two effective paradigms for delivering multimedia content over the Inte...
This article presents the most distinguishing features of the Argentinian web as found in a private sample of almost 10 million web pages from 150,000 sites collected in the early...
Gabriel Tolosa, Fernando Bordignon, Ricardo A. Bae...
In this paper we propose a new wiki concept — light constraints — designed to encode community best practices and domain-specific requirements, and to assist in their applica...