Search Sciweavers | Sciweavers

31 search results - page 3 / 7

» Detecting near-duplicates for web crawling

DEXA
2006
Springer

Cleaning Web Pages for Effective Web Content Mining

13 years 7 months ago

Download sol.cs.uwindsor.ca

Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...

Jing Li, Christie I. Ezeife


ICDE
2009
IEEE

Top-k Set Similarity Joins

14 years 7 months ago

Download www.cse.unsw.edu.au

Similarity join is a useful primitive operation underlying many applications, such as near duplicate Web page detection, data integration, and pattern recognition. Tradi...

Chuan Xiao, Wei Wang 0011, Xuemin Lin, Haichuan Sh...


JCB
2007

Clustered Sequence Representation for Fast Homology Search

13 years 5 months ago

Download web.udl.es

We present a novel approach to managing redundancy in sequence databanks such as GenBank. We store clusters of near-identical sequences as a representative union-sequence and a se...

Michael Cameron, Yaniv Bernstein, Hugh E. Williams


SIGIR
2008
ACM

Exploring traversal strategy for web forum crawling

13 years 5 months ago

Download research.microsoft.com

In this paper, we study the problem of Web forum crawling. Web forum has now become an important data source of many Web applications; while forum crawling is still a challenging ...

Yida Wang, Jiang-Ming Yang, Wei Lai, Rui Cai, Lei ...


CCS
2011
ACM

Automated black-box detection of side-channel vulnerabilities in web applications

12 years 5 months ago

Download www.cs.virginia.edu

Web applications divide their state between the client and the server. The frequent and highly dynamic client-server communication that is characteristic of modern web application...

Peter Chapman, David Evans
