We propose a novel approach that identifies web page templates and extracts the unstructured data. Extracting only the body of the page and eliminating the template increases the ...
A large number of web sites publish pages containing structured information about recognizable concepts, but these data are only partially used by current applications. Although s...
Paolo Papotti, Valter Crescenzi, Paolo Merialdo, M...
This paper proposes a framework for easily integrating and controlling information visualization (infoVis) components within web pages to create powerful interactive "live"...
We introduce a stricter Web community definition to overcome the boundary ambiguity of a Web community as defined by Flake, Lawrence and Giles [2], and consider the problem of finding co...
Existing commercial Web browsers provide various utilities and functions, e.g., Web bookmarks and a browsing history list. Since the bookmark and history functions only the title ...