Sciweavers

On the Automatic Extraction of Data from the Hidden Web
DL
2000
Springer
15 years 4 months ago
Snowball: extracting relations from large plain-text collections
Text documents often contain valuable structured data that is hidden in regular English sentences. This data is best exploited if available as a relational table that we could use...
Eugene Agichtein, Luis Gravano
ICDE
2007
IEEE
16 years 1 months ago
Annotating Structured Data of the Deep Web
An increasing number of databases have become Web accessible through HTML form-based search interfaces. The data units returned from the underlying database are usually encoded in...
Yiyao Lu, Hai He, Hongkun Zhao, Weiyi Meng, Clemen...
COLING
2008
15 years 1 months ago
Homotopy-Based Semi-Supervised Hidden Markov Models for Sequence Labeling
This paper explores the use of the homotopy method for training a semi-supervised Hidden Markov Model (HMM) used for sequence labeling. We provide a novel polynomial-time algorith...
Gholamreza Haffari, Anoop Sarkar
LPNMR
2001
Springer
15 years 4 months ago
Declarative Information Extraction, Web Crawling, and Recursive Wrapping with Lixto
Lixto is a system and method for the visual and interactive generation of wrappers for Web pages under the supervision of a human developer, for automatically extracting informatio...
Robert Baumgartner, Sergio Flesca, Georg Gottlob
DILS
2009
Springer
15 years 6 months ago
Site-Wide Wrapper Induction for Life Science Deep Web Databases
We present a novel approach to automatic information extraction from Deep Web Life Science databases using wrapper induction. Traditional wrapper induction techniques focus on lear...
Saqib Mir, Steffen Staab, Isabel Rojas