Sciweavers

WISE
2007
Springer
13 years 11 months ago
Using Clustering and Edit Distance Techniques for Automatic Web Data Extraction
Manuel Álvarez, Alberto Pan, Juan Raposo, F...
WWW
2004
ACM
14 years 5 months ago
Automatic web news extraction using tree edit distance
The Web poses itself as the largest data repository ever available in the history of humankind. Major efforts have been made in order to provide efficient access to relevant infor...
Davi de Castro Reis, Paulo Braz Golgher, Altigran ...
WISE
2005
Springer
13 years 10 months ago
NET - A System for Extracting Web Data from Flat and Nested Data Records
This paper studies automatic extraction of structured data from Web pages. Each of such pages may contain several groups of structured data records. Existing automatic methods stil...
Bing Liu, Yanhong Zhai
NLPRS
2001
Springer
13 years 9 months ago
Automatically Harvesting Katakana-English Term Pairs from Search Engine Query Logs
This paper describes a method of extracting katakana words and phrases, along with their English counterparts from non-aligned monolingual web search engine query logs. The method...
Eric Brill, Gary Kacmarcik, Chris Brockett
ICDE
2008
IEEE
14 years 6 months ago
Automatically Extracting Form Labels
We describe a machine-learning-based approach for extracting attribute labels from Web form interfaces. Having these labels is a requirement for several techniques that attempt to ...
Hoa Nguyen, Eun Yong Kang, Juliana Freire