In this paper we introduce a programming language for Web document processing called WebL. WebL is a high-level, object-oriented scripting language that incorporates two novel fea...
Web spam is a widely recognized threat to the quality and security of the Web. Web spam pages pollute search engine indexes, burden Web crawlers and Web mining services, and expos...
In this paper we describe the semantic partitioner algorithm, which uses the structural and presentation regularities of Web pages to automatically transform them into hierarchi...
In recent years, language resources acquired from the Web have been released, and these data have improved the performance of applications in several NLP tasks. Although the language resource...
Abstract: Engineering personalized ubiquitous Web applications requires developing adaptable Web content as well as automatically adjusting it to varying client devices and dynamic...