Search engines largely rely on Web robots to collect information from the Web. Due to the unregulated open-access nature of the Web, robot activities are extremely diverse. Such c...
In this paper we present a proposal for the implementation and evaluation of a novel method for automatically using data-rich text for filling form-based input interfaces. Our sol...
Guilherme A. Toda, Eli Cortez, Altigran Soares da ...
ating Structures is a system abstraction that helps to model large-scale distributed systems, whose performance mostly depends on how well the data and messages traffic is organiz...
Information extraction from HTML pages has been conventionally treated as plain text documents extended with HTML tags. However, the growing maturity and correct usage of HTML/XHT...
While scalable data mining methods are expected to cope with massive Web data, coping with evolving trends in noisy data in a continuous fashion, and without any unnecessary stopp...