Sciweavers

1139 search results - page 1 / 228
SAC
2005
ACM
13 years 10 months ago
Automatic extraction of informative blocks from webpages
Search engines crawl and index webpages depending upon their informative content. However, webpages — especially dynamically generated ones — contain items that cannot be clas...
Sandip Debnath, Prasenjit Mitra, C. Lee Giles
DEXAW
2010
IEEE
13 years 6 months ago
A New Information Filtering Method for WebPages
The Internet is a huge source of information. Search engines have indexed much of this information and are able to extract the relevant webpages that are related to a given query. Howe...
Sergio Lopez, Josep Silva
ISMIS
2005
Springer
13 years 10 months ago
Identifying Content Blocks from Web Documents
Intelligent information processing systems, such as digital libraries or search engines, index web-pages according to their informative content. However, web-pages contain several n...
Sandip Debnath, Prasenjit Mitra, C. Lee Giles
ICEIS
2009
IEEE
13 years 11 months ago
Semi-supervised Information Extraction from Variable-length Web-page Lists
We propose two methods for constructing automated programs for extraction of information from a class of web pages that are very common and of high practical significance - varia...
Daniel Nikovski, Alan Esenther, Akihiro Baba
WSDM
2010
ACM
13 years 11 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...