Search Sciweavers | Sciweavers


NSDI
2010

194 views, Computer Networks

The Architecture and Implementation of an Extensible Web Crawler

15 years 4 months ago

Many Web services operate their own Web crawlers to discover data of interest, despite the fact that large-scale, timely crawling is complex, operationally intensive, and expensive...

Jonathan M. Hsieh, Steven D. Gribble, Henry M. Lev...



WWW
2008
ACM

163 views, Internet Technology

As we may perceive: finding the boundaries of compound documents on the web

16 years 4 months ago

Download www2008.org

This paper considers the problem of identifying on the Web compound documents (cDocs), groups of web pages that in aggregate constitute semantically coherent information entities...

Pavel Dmitriev



WWW
2006
ACM

147 views, Internet Technology

Topical TrustRank: using topicality to combat web spam

16 years 4 months ago

Download www.cse.lehigh.edu

Web spam is behavior that attempts to deceive search engine ranking algorithms. TrustRank is a recent algorithm that can combat web spam. However, TrustRank is vulnerable in the s...

Baoning Wu, Vinay Goel, Brian D. Davison



CICLING
2009
Springer

335 views, Natural Language Processing

Language Identification on the Web: Extending the Dictionary Method

15 years 7 months ago

Download www.fi.muni.cz

Automated language identification of written text is a well-established research domain that has received considerable attention in the past. By now, efficient and effecti...

Radim Rehurek, Milan Kolkus



WWW
2006
ACM

77 views, Internet Technology

Examining the content and privacy of web browsing incidental information

16 years 4 months ago

Download www2006.org

This research examines the privacy comfort levels of participants if others can view traces of their web browsing activity. During a week-long field study, participants used an el...

Kirstie Hawkey, Kori M. Inkpen
