Web spam is a widely-recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
In this paper we describe a distributed architecture consisting of a combination of scripting tools that interact with each other in order to help to find and query decentralized ...
Uldis Bojars, Alexandre Passant, Frederick Giasson...
Institutions and companies that are based in countries where the main language is not English typically publish Web sites that offer the same information at least in the local lan...
Filippo Ricca, Paolo Tonella, Emanuele Pianta, Chr...
We present a novel approach to automatic information extraction from Deep Web Life Science databases using wrapper induction. Traditional wrapper induction techniques focus on lear...
An important obstacle to the success of the Semantic Web is that the establishment of the semantic relationship is labor-intensive. This paper proposes an automatic semantic relat...