Large collections of documents containing various types of multimedia are made available on the WWW. Unfortunately, due to the unstructured nature of Internet environments, it is ha...
In most web sites, web-based applications (such as web portals, e-marketplaces, search engines), and in the file systems of personal computers, a wide variety of schemas (such as t...
Paolo Bouquet, Luciano Serafini, Stefano Zanobini,...
Abstract. Since current search engines employ link-based ranking algorithms as an important tool to decide a ranking of sites, Web spammers are making a significant effort to man...
On the Web of Data, entities are often interconnected in a way similar to web documents. Previous works have shown how PageRank can be adapted to achieve entity ranking. In this pa...
As person names are non-unique, the same name on different Web pages might or might not refer to the same real-world person. This entity identification problem is one of the m...