Sciweavers

23 search results - page 4 / 5
» Estimating the Rate of Web Page Updates
WWW
2007
ACM
14 years 5 months ago
The discoverability of the web
Previous studies have highlighted the high arrival rate of new content on the web. We study the extent to which this new content can be efficiently discovered by a crawler. Our st...
Anirban Dasgupta, Arpita Ghosh, Ravi Kumar, Christ...
WWW
2003
ACM
14 years 5 months ago
Efficient URL caching for world wide web crawling
Crawling the web is deceptively simple: the basic algorithm is (a) Fetch a page (b) Parse it to extract all linked URLs (c) For all the URLs not seen before, repeat (a)–(c). Howev...
Andrei Z. Broder, Marc Najork, Janet L. Wiener
ICC
2007
IEEE
13 years 11 months ago
Single and Multiple Parameters Sensitivity Study of Location Management Area Partitioning for GSM Networks
To obtain optimal location area (LA) partitioning in cellular radio networks is important since it maximizes the usable bandwidth to support services. However, we feel that th...
Yong Huat Chew, Boon Sain Yeo, Daniel Chien Ming K...
NIPS
2007
13 years 6 months ago
Supervised Topic Models
We introduce supervised latent Dirichlet allocation (sLDA), a statistical model of labelled documents. The model accommodates a variety of response types. We derive a maximum-like...
David M. Blei, Jon D. McAuliffe
CIKM
2005
Springer
13 years 10 months ago
Focused crawling for both topical relevance and quality of medical information
Subject-specific search facilities on health sites are usually built using manual inclusion and exclusion rules. These can be expensive to maintain and often provide incomplete c...
Thanh Tin Tang, David Hawking, Nick Craswell, Kath...
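
The basic crawl loop quoted in the abstract of "Efficient URL caching for world wide web crawling" above (fetch a page, parse out its links, repeat for URLs not seen before) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the names `crawl`, `LinkExtractor`, `fake_fetch`, and `PAGES`, the `max_pages` limit, and the in-memory "web" under `example.test` are all hypothetical, chosen so the example runs without network access. The "seen" check here is a plain in-memory set, not the caching scheme the paper studies.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects the href values of <a> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seed, fetch, max_pages=100):
    """Breadth-first crawl from `seed`; `fetch(url)` returns HTML text."""
    seen = {seed}              # URLs already discovered
    frontier = deque([seed])   # URLs waiting to be fetched
    order = []                 # URLs in the order they were fetched
    while frontier and len(order) < max_pages:
        url = frontier.popleft()
        html = fetch(url)                  # (a) fetch a page
        order.append(url)
        parser = LinkExtractor()
        parser.feed(html)                  # (b) parse it to extract linked URLs
        for href in parser.links:
            link = urljoin(url, href)      # resolve relative links
            if link not in seen:           # (c) only URLs not seen before
                seen.add(link)
                frontier.append(link)
    return order


# Toy in-memory "web" so the sketch runs offline (assumed data).
PAGES = {
    "http://example.test/": '<a href="/a">A</a> <a href="/b">B</a>',
    "http://example.test/a": '<a href="/b">B</a> <a href="/">home</a>',
    "http://example.test/b": '<a href="/a">A</a>',
}


def fake_fetch(url):
    """Stand-in for an HTTP client; unknown URLs yield an empty page."""
    return PAGES.get(url, "")
```

Calling `crawl("http://example.test/", fake_fetch)` visits each of the three toy pages once, even though they link to each other in a cycle. Passing `fetch` as a parameter keeps the loop independent of any particular HTTP library.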