Sciweavers

1161 search results - page 47 / 233
» Using web structure for classifying and describing web pages
Sort
View
WSE
2003
IEEE
15 years 2 months ago
Resolution of Static Clones in Dynamic Web Pages
Cloning is extremely likely to occur in web sites, much more so than in other software. While some clones exist for valid reasons, or are too small to eliminate, cloning percentag...
Nikita Synytskyy, James R. Cordy, Thomas R. Dean
AUSAI
2003
Springer
15 years 2 months ago
Semi-Automatic Construction of Metadata from a Series of Web Documents
Metadata plays an important role in discovering, collecting, extracting and aggregating Web data. This paper proposes a method of constructing metadata for a specific topic. The m...
Sachio Hirokawa, Eisuke Itoh, Tetsuhiro Miyahara
87
Voted
AIRWEB
2006
Springer
15 years 1 months ago
Web Spam Detection with Anti-Trust Rank
Spam pages on the web use various techniques to artificially achieve high rankings in search engine results. Human experts can do a good job of identifying spam pages and pages wh...
Vijay Krishnan, Rashmi Raj
67
Voted
WWW
2007
ACM
15 years 10 months ago
Detecting near-duplicates for web crawling
Near-duplicate web documents are abundant. Two such documents differ from each other in a very small portion that displays advertisements, for example. Such differences are irrele...
Gurmeet Singh Manku, Arvind Jain, Anish Das Sarma
SIGIR
2004
ACM
15 years 3 months ago
Query-related data extraction of hidden web documents
The larger amount of information on the Web is stored in document databases and is not indexed by general-purpose search engines (i.e., Google and Yahoo). Such information is dyna...
Yih-Ling Hedley, Muhammad Younas, Anne E. James, M...