We present GoGetIt!, a tool for generating structure-driven crawlers that requires a minimum effort from the users. The tool takes as input a sample page and an entry point to a W...
Altigran Soares da Silva, Edleno Silva de Moura, J...
We demonstrate a system to automatically grab data from data intensive web sites. The system first infers a model that describes at the intensional level the web site as a collec...
Valter Crescenzi, Giansalvatore Mecca, Paolo Meria...
Extracting data from Web pages using wrappers is a fundamental problem arising in a large variety of applications of vast practical interests. There are two main issues relevant t...
In this paper, we present the results of our work that seek to negotiate the gap between low-level features and high-level concepts in the domain of web document retrieval. This wo...
The proliferation of information on the world wide web has made the personalization of this information space a necessity. One possible approach to web personalization is to mine ...