Sciweavers

1139 search results - page 3 / 228
» Automatic extraction of informative blocks from webpages
KDD
2002
ACM
15 years 10 months ago
Discovering informative content blocks from Web documents
In this paper, we propose a new approach to discover informative contents from a set of tabular documents (or Web pages) of a Web site. Our system, InfoDiscoverer, first partition...
Shian-Hua Lin, Jan-Ming Ho
KDD
1997
ACM
15 years 1 months ago
Learning to Extract Text-Based Information from the World Wide Web
There is a wealth of information to be mined from narrative text on the World Wide Web. Unfortunately, standard natural language processing (NLP) extraction techniques expect full, grammat...
Stephen Soderland
LREC
2008
14 years 11 months ago
Automatic Extraction of Textual Elements from News Web Pages
In this paper we present an algorithm for automatic extraction of textual elements, namely titles and full text, associated with news stories in news web pages. We propose a super...
Hossam Ibrahim, Kareem Darwish, Abdel-Rahim Madany
CIKM
2008
Springer
14 years 11 months ago
Closing the loop in webpage understanding
The two most important tasks in information extraction from the Web are webpage structure understanding and natural language sentences processing. However, little work has been don...
Chunyu Yang, Yong Cao, Zaiqing Nie, Jie Zhou, Ji-R...
WWW
2010
ACM
15 years 4 months ago
The paths more taken: matching DOM trees to search logs for accurate webpage clustering
An unsupervised clustering of the webpages on a website is a primary requirement for most wrapper induction and automated data extraction methods. Since page content can vary dras...
Deepayan Chakrabarti, Rupesh R. Mehta