This paper describes a programming-by-demonstration system, called Internet Scrapbook, which allows users with little programming skill to automate repetitive browsing tasks. With...
One of the main limitations when accessing the web is the lack of explicit structure, whose presence may help in understanding data semantics. Schema for web data can be constructe...
The Web is a very large social network. It is important and interesting to understand the “ecology” of the Web: the general relations of Web pages to their environment. The un...
This paper is concerned with automatic extraction of titles from the bodies of HTML documents (web pages). Titles of HTML documents should be correctly defined in the title fields...
Online auction Web sites are fast changing, highly dynamic, and complex as they involve tremendous sellers and potential buyers, as well as a huge amount of items listed for biddi...