Web spidering is a widely used approach to obtaining information for search engines. As the size of the Web grows, it becomes a natural choice to parallelize the spider’s crawling proc...
This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and explo...
Most Web-based methods for lexicon augmentation consist of capturing global semantic features of the targeted domain in order to collect relevant documents from the Web. We s...
The World-Wide Web is developing very fast. Currently, finding useful information on the Web is a time-consuming process. In this paper, we present WebMate, an agent that helps user...
This paper introduces the MREF framework for representing and correlating information at a higher semantic level than is possible with Web-based information systems today. The rol...