Sciweavers

62 search results for: Using web page layout for extraction of sender names
SIGMOD
2003
ACM
13 years 10 months ago
Extracting Structured Data from Web Pages
Many web sites contain large sets of pages generated using a common template or layout. For example, Amazon lays out the author, title, comments, etc. in the same way in all its b...
Arvind Arasu, Hector Garcia-Molina
CICLING
2006
Springer
13 years 9 months ago
Extracting Key Phrases to Disambiguate Personal Names on the Web
When you search for information regarding a particular person on the web, a search engine returns many pages. Some of these pages may be for people with the same name. Ho...
Danushka Bollegala, Yutaka Matsuo, Mitsuru Ishizuk...
EWMF
2003
Springer
13 years 10 months ago
Mining Web Sites Using Wrapper Induction, Named Entities, and Post-processing
This paper presents a novel method for extracting information from collections of Web pages across different sites. Our method uses a standard wrapper induction algorithm and explo...
Georgios Sigletos, Georgios Paliouras, Constantine...
LREC
2008
13 years 7 months ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany
WIDM
2004
ACM
13 years 11 months ago
Stylistic and lexical co-training for web block classification
Many applications which use web data extract information from a limited number of regions on a web page. As such, web page division into blocks and the subsequent block classifica...
Chee How Lee, Min-Yen Kan, Sandra Lai