Due to their capability for expressing semantics and relationships among data objects, semi-structured documents have become a common way of representing domain knowledge. Compari...
Henry Tan, Tharam S. Dillon, Fedja Hadzic, Elizabe...
In order for agents to act on behalf of users, they will have to retrieve and integrate vast amounts of textual data on the World Wide Web. However, much of the useful data on the...
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
The World Wide Web is emerging not only as an infrastructure for data, but also for a broader variety of resources that are increasingly being made available as Web services. Rele...
Abhijit A. Patil, Swapna A. Oundhakar, Amit P. She...
Accurate topical categorization of user queries allows for increased effectiveness, efficiency, and revenue potential in general-purpose web search systems. Such categorization be...
Steven M. Beitzel, Eric C. Jensen, Ophir Frieder, ...