Tables are a universal idiom to present relational data. Billions of tables on Web pages express entity references, attributes and relationships. This representation of relational...
Malicious web pages that host drive-by-download exploits have become a popular means for compromising hosts on the Internet and, subsequently, for creating large-scale botnets. In...
Davide Canali, Marco Cova, Giovanni Vigna, Christo...
Recent research in domain-independent information extraction holds the promise of an automatically-constructed structured database derived from the Web. A query system based on th...
Motivated by our work with political scientists who need to manually analyze large Web archives of news sites, we present SpotSigs, a new algorithm for extracting and matching sig...
Martin Theobald, Jonathan Siddharth, Andreas Paepc...
Search engines use content and link information to crawl, index, retrieve, and rank Web pages. The correlations between similarity measures based on these cues and on semantic ass...