Sciweavers

43 search results - page 4 / 9
» Automatically Maintaining Wrappers for Web Sources
ESWS
2007
Springer
13 years 12 months ago
Empowering Software Maintainers with Semantic Web Technologies
Abstract. Software maintainers routinely have to deal with a multitude of artifacts, like source code or documents, which often end up disconnected, due to their different represen...
René Witte, Yonggang Zhang, Juergen Rilling
WWW
2006
ACM
14 years 6 months ago
Interactive wrapper generation with minimal user effort
While much of the data on the web is unstructured in nature, there is also a significant amount of embedded structured data, such as product information on e-commerce sites or sto...
Utku Irmak, Torsten Suel
SIGMOD
2009
ACM
14 years 17 days ago
Robust web extraction: an approach based on a probabilistic tree-edit model
On script-generated web sites, many documents share common HTML tree structure, allowing wrappers to effectively extract information of interest. Of course, the scripts and thus ...
Nilesh N. Dalvi, Philip Bohannon, Fei Sha
AAAI
2007
13 years 8 months ago
Template-Independent News Extraction Based on Visual Consistency
A wrapper is a traditional method for extracting useful information from Web pages. Most previous work relies on the similarity between HTML tag trees and induced template-dependent wrap...
Shuyi Zheng, Ruihua Song, Ji-Rong Wen
ICWE
2009
Springer
14 years 10 days ago
A Layout-Independent Web News Article Contents Extraction Method Based on Relevance Analysis
Abstract. The traditional Web news article contents extraction methods are time-costly and need much maintenance because they analyze the layout of news pages to generate the wrapp...
Hao Han, Takehiro Tokuda