Search Sciweavers | Sciweavers

1161 search results - page 47 / 233

» Using web structure for classifying and describing web pages

WSE
2003
IEEE

Resolution of Static Clones in Dynamic Web Pages

15 years 10 months ago

Download post.queensu.ca

Cloning is extremely likely to occur in web sites, much more so than in other software. While some clones exist for valid reasons, or are too small to eliminate, cloning percentag...

Nikita Synytskyy, James R. Cordy, Thomas R. Dean

AUSAI
2003
Springer

Semi-Automatic Construction of Metadata from a Series of Web Documents

15 years 10 months ago

Download qir.kyushu-u.ac.jp

Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...

Sachio Hirokawa, Eisuke Itoh, Tetsuhiro Miyahara

AIRWEB
2006
Springer

Web Spam Detection with Anti-Trust Rank

15 years 9 months ago

Download airweb.cse.lehigh.edu

Spam pages on the web use various techniques to artificially achieve high rankings in search engine results. Human experts can do a good job of identifying spam pages and pages wh...

Vijay Krishnan, Rashmi Raj

WWW
2007
ACM

Detecting near-duplicates for web crawling

16 years 6 months ago

Download infolab.stanford.edu

Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...

Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma

SIGIR
2004
ACM

15 years 11 months ago

Query-related data extraction of hidden web documents

Download dis.shef.ac.uk

Most of the information on the Web is stored in document databases and is not indexed by general-purpose search engines (e.g., Google and Yahoo). Such information is dyna...

Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...
