Sciweavers

452 search results - page 1 / 91
DEBU
2000
13 years 4 months ago
Accurately and Reliably Extracting Data from the Web: A Machine Learning Approach
A critical problem in developing information agents for the Web is accessing data that is formatted for human use. We have developed a set of tools for extracting data from web si...
Craig A. Knoblock, Kristina Lerman, Steven Minton,...
SYNASC
2006
IEEE
13 years 11 months ago
HTML Pattern Generator--Automatic Data Extraction from Web Pages
Existing methods of information extraction from HTML documents include manual approach, supervised learning and automatic techniques. The manual method has high precision and reca...
Mirel Cosulschi, Adrian Giurca, Bogdan Udrescu, Ni...
WSDM
2012
ACM
12 years 12 days ago
WebSets: extracting sets of entities from the web using unsupervised information extraction
We describe an open-domain information extraction method for extracting concept-instance pairs from an HTML corpus. Most earlier approaches to this problem rely on combining cluste...
Bhavana Bharat Dalvi, William W. Cohen, Jamie Call...
BMCBI
2008
13 years 5 months ago
Gene prediction in metagenomic fragments: A large scale machine learning approach
Background: Metagenomics is an approach to the characterization of microbial genomes via the direct isolation of genomic sequences from the environment without prior cultivation. ...
Katharina J. Hoff, Maike Tech, Thomas Lingner, Rol...
WWW
2005
ACM
14 years 5 months ago
Web data extraction based on partial tree alignment
This paper studies the problem of extracting data from a Web page that contains several structured data records. The objective is to segment these data records, extract data items...
Yanhong Zhai, Bing Liu