As the result of interactions between visitors and a web site, an http log file contains very rich knowledge about users on-site behaviors, which, if fully exploited, can better c...
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
The structure of the web is increasingly being used to improve organization, search, and analysis of information on the web. For example, Google uses the text in citing documents ...
Eric J. Glover, Kostas Tsioutsiouliklis, Steve Law...
The field of automatic genre classification has primarily focused on extracting textual features from documents. The goal of this research is to investigate whether visual feature...
Most of the current WWW is made up of dynamic pages. The development of dynamic pages is a difficult and costly endeavour, out-of-reach for most users, experts, and content produce...