In this paper we present ObjectRunner, a system for extracting, integrating, and querying structured data from the Web. Our system harvests real-world items from template-based HTM...
Semistructured data is not strictly typed like relational or object-oriented data and may be irregular or incomplete. It often arises in practice, e.g., when heterogeneous data so...
Serge Abiteboul, Jason McHugh, Michael Rys, Vasili...
In this paper, we present a document model which integrates the logical structure and hypertext link structure of hyperdocuments in order to manage structured documents with hyper...
Yong Kyu Lee, Seong-Joon Yoo, Kyoungro Yoon, P. Br...
High-quality structured data from structured Web sources is invaluable for many applications. Hidden Web databases are not directly crawlable by Web search engines and are on...
Address geocoding, the process of finding the map location for a structured postal address, is a relatively well-studied problem. In this paper we consider the more general proble...
Tanuja Joshi, Joseph Joy, Tobias Kellner, Udayan K...