The World-Wide Web consists of a huge number of unstructured documents, but it also contains structured data in the form of HTML tables. We extracted 14.1 billion HTML tables from...
Michael J. Cafarella, Alon Y. Halevy, Daisy Zhe Wa...
One of the main hurdles towards a wide endorsement of ontologies is the high cost of constructing them. Reuse of existing ontologies offers a much cheaper alternative than buildin...
The World Wide Web (WWW) has provided us with a plethora of information. However, given its unstructured format, this information is useful mainly to humans and cannot be effectiv...
In recent years, language resources acquired from the Web are released, and these data improve the performance of applications in several NLP tasks. Although the language resource...
GiveALink.org is a social bookmarking site where users may donate and view their personal bookmark files online securely. The bookmarks are analyzed to build a new generation of i...
Benjamin Markines, Lubomira Stoilova, Filippo Menc...