Sciweavers

326 search results - page 25 / 66
» Optimal crawling strategies for web search engines
IEEECIT
2007
IEEE
15 years 3 months ago
Worrisome Rich-Get-Richer? Not the True Story!
Search engines have become efficient assistants for people to access information on the Web. Some researchers argue that the prevalence of search engines is setting a tough journ...
Mingda Wu, Qiancheng Jiang, Yan Zhang
SIGMOD
2000
ACM
15 years 1 months ago
Finding Replicated Web Collections
Many web documents (such as JAVA FAQs) are being replicated on the Internet. Often entire document collections (such as hyperlinked Linux manuals) are being replicated many times....
Junghoo Cho, Narayanan Shivakumar, Hector Garcia-M...
WSDM
2009
ACM
15 years 4 months ago
The web changes everything: understanding the dynamics of web content
The Web is a dynamic, ever changing collection of information. This paper explores changes in Web content by analyzing a crawl of 55,000 Web pages, selected to represent different...
Eytan Adar, Jaime Teevan, Susan T. Dumais, Jonatha...
WWW
2010
ACM
15 years 4 months ago
Not so creepy crawler: easy crawler generation with standard XML queries
Web crawlers are increasingly used for focused tasks such as the extraction of data from Wikipedia or the analysis of social networks like last.fm. In these cases, pages are far m...
Franziska von dem Bussche, Klara A. Weiand, Benedi...
CHI
2009
ACM
15 years 10 months ago
Resonance on the web: web dynamics and revisitation patterns
The Web is a dynamic, ever-changing collection of information accessed in a dynamic way. This paper explores the relationship between Web page content change (obtained from an hou...
Eytan Adar, Jaime Teevan, Susan T. Dumais