Sciweavers

WWW
2006
ACM
15 years 9 months ago
Effective web-scale crawling through website analysis
The web crawler space is often delimited into two general areas: full-web crawling and focused crawling. We present netSifter, a crawler system which integrates features from thes...
Iván González, Adam Marcus 0002, Daniel N. M...
WWW
2006
ACM
15 years 9 months ago
Detecting online commercial intention (OCI)
Understanding goals and preferences behind a user's online activities can greatly help information providers, such as search engines and E-Commerce web sites, to personalize c...
Honghua (Kathy) Dai, Lingzhi Zhao, Zaiqing Nie, Ji...
WWW
2006
ACM
15 years 9 months ago
BuzzRank ... and the trend is your friend
Ranking methods like PageRank assess the importance of Web pages based on the current state of the rapidly evolving Web graph. The dynamics of the resulting importance scores, how...
Klaus Berberich, Srikanta J. Bedathur, Michalis Va...
WWW
2006
ACM
15 years 9 months ago
Beyond PageRank: machine learning for static ranking
Since the publication of Brin and Page's paper on PageRank, many in the Web community have depended on PageRank for the static (query-independent) ordering of Web pages. We s...
Matthew Richardson, Amit Prakash, Eric Brill
WWW
2006
ACM
15 years 9 months ago
Compressing and searching XML data via two zips
XML is fast becoming the standard format to store, exchange and publish over the web, and is getting embedded in applications. Two challenges in handling XML are its size (the XML...
Paolo Ferragina, Fabrizio Luccio, Giovanni Manzini...
Internet Technology