Sciweavers

PVLDB
2008
141views more  PVLDB 2008»
13 years 4 months ago
WebTables: exploring the power of tables on the web
The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
DEXA
2005
Springer
109views Database» more  DEXA 2005»
13 years 10 months ago
An XML Approach to Semantically Extract Data from HTML Tables
Abstract. Data intensive information is often published on the internet in the format of HTML tables. Extracting some of the information that is of users’ interest from the inter...
Jixue Liu, Zhuoyun Ao, Ho-Hyun Park, Yongfeng Chen