Sciweavers

» Semantic Structure Content for Dynamic Web Pages
WWW
2008
ACM
16 years 4 months ago
iRobot: an intelligent crawler for web forums
In this paper we study the Web forum crawling problem, a fundamental step in many Web applications such as search engines and Web data mining. As a typical user-crea...
Rui Cai, Jiang-Ming Yang, Wei Lai, Yida Wang, Lei ...
WWW
2010
ACM
15 years 3 months ago
Exploiting content redundancy for web information extraction
We propose a novel extraction approach that exploits content redundancy on the web to extract structured data from template-based web sites. We start by populating a seed database...
Pankaj Gulhane, Rajeev Rastogi, Srinivasan H. Seng...
EDBTW
2010
Springer
15 years 2 months ago
Using visual pages analysis for optimizing web archiving
Due to the growing importance of the World Wide Web, archiving it has become crucial for preserving a useful source of information. To keep a web archive up to date, crawlers ha...
Myriam Ben Saad, Stéphane Gançarski
DASFAA
2003
IEEE
15 years 9 months ago
Freshness-driven Adaptive Caching for Dynamic Content
With the wide availability of content delivery networks, many e-commerce Web applications utilize edge cache servers to cache and deliver dynamic content at locations much closer...
Wen-Syan Li, Oliver Po, Wang-Pin Hsiung, K. Sel...
AIIA
2001
Springer
15 years 8 months ago
Evaluation Methods for Focused Crawling
The exponential growth of documents available on the World Wide Web makes it increasingly difficult to discover relevant information on a specific topic. In this context, growing ...
Andrea Passerini, Paolo Frasconi, Giovanni Soda