Sciweavers

» Extracting unstructured data from template generated web doc...
VLDB
2011
ACM
14 years 4 months ago
Harvesting relational tables from lists on the web
A large number of web pages contain data structured in the form of “lists”. Many such lists can be further split into multi-column tables, which can then be used in more seman...
Hazem Elmeleegy, Jayant Madhavan, Alon Y. Halevy
WWW
2006
ACM
15 years 10 months ago
Interactive wrapper generation with minimal user effort
While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...
Utku Irmak, Torsten Suel
WWW
2004
ACM
15 years 10 months ago
An efficient and systematic method to generate XSLT stylesheets for different wireless pervasive devices
Directly updating a WML document for the wireless Web is a tedious and cumbersome process because its content is composed of both data and presentation. Thus, XML is used to hand...
Thomas Kwok, Thao Nguyen, Linh Lam, Kakan Roy
WWW
2010
ACM
15 years 4 months ago
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
LWA
2008
14 years 11 months ago
Labeling Clusters - Tagging Resources
In order to support the navigation in huge document collections efficiently, tagged hierarchical structures can be used. Often, multiple tags are used to describe resources. For u...
Korinna Bade, Andreas Nürnberger