Search Sciweavers | Sciweavers

31 search results - page 4 / 7

» Detecting near-duplicates for web crawling

WWW
2008
ACM

150 views, Internet Technology

Incremental web page template detection

16 years 1 month ago

Download www2008.org

Most template detection methods process web pages in batches, so a newly crawled page cannot be processed until enough pages have been collected. This results in large storage c...

Yu Wang, Binxing Fang, Xueqi Cheng, Li Guo, Hongbo...

DIMVA
2010

208 views, Computer Networks

Why Johnny Can't Pentest: An Analysis of Black-Box Web Vulnerability Scanners

15 years 1 month ago

Download www.cs.ucsb.edu

Black-box web vulnerability scanners are a class of tools that can be used to identify security issues in web applications. These tools are often marketed as "point-and-click ...

Adam Doupé, Marco Cova, Giovanni Vigna

IADIS
2003

91 views, Internet Technology

SPLAT: A System for Self-Plagiarism Detection

15 years 1 month ago

Download splat.cs.arizona.edu

This paper presents a system for self-plagiarism detection, SPLAT. The system uses a WebL web spider that crawls through the web sites of the top fifty Computer Science department...

Christian S. Collberg, Stephen G. Kobourov, Joshua...

ICAIL
2007
ACM

147 views, Artificial Intelligence

Essential deduplication functions for transactional databases in law firms

15 years 4 months ago

Download www.conradweb.org

As massive document repositories and knowledge management systems continue to expand, in proprietary environments as well as on the Web, the need for duplicate detection becomes i...

Jack G. Conrad, Edward L. Raymond

DEXA
2010
Springer

226 views, Database

Vi-DIFF: Understanding Web Pages Changes

14 years 11 months ago

Download www-poleia.lip6.fr

Nowadays, many applications are interested in detecting and discovering changes on the web to help users understand page updates and, more generally, web dynamics. Web archiv...

Zeynep Pehlivan, Myriam Ben Saad, Stéphane ...
