Sciweavers

COMPSAC
2002
IEEE
13 years 9 months ago
An Approach to Identify Duplicated Web Pages
A relevant consequence of the unceasing expansion of the Web and e-commerce is the growth of the demand of new Web sites and Web applications. The software industry is facing the ...
Giuseppe A. Di Lucca, Massimiliano Di Penta, Anna ...
APWEB
2004
Springer
13 years 8 months ago
A Query-Dependent Duplicate Detection Approach for Large Scale Search Engines
Duplication of Web pages greatly hurts the perceived relevance of a search engine. Existing methods for detecting duplicated Web pages can be classified into two categories, i.e. o...
Shaozhi Ye, Ruihua Song, Ji-Rong Wen, Wei-Ying Ma
WEBI
2009
Springer
13 years 11 months ago
Revealing Hidden Community Structures and Identifying Bridges in Complex Networks: An Application to Analyzing Contents of Web P
The emergence of scale free and small world properties in real world complex networks has stimulated lots of activity in the field of network analysis. An example of such a netwo...
Faraz Zaidi, Arnaud Sallaberry, Guy Melanço...
DEXA
2006
Springer
13 years 6 months ago
Cleaning Web Pages for Effective Web Content Mining
Classifying and mining noise-free web pages will improve on accuracy of search results as well as search speed, and may benefit webpage organization applications (e.g., keyword-bas...
Jing Li, Christie I. Ezeife
HICSS
2009
IEEE
13 years 11 months ago
An N-Gram Based Approach to Automatically Identifying Web Page Genre
The research reported in this paper is the first phase of a larger project on the automatic classification of web pages by their genres, using ngram representations of the web pag...
Jane E. Mason, Michael A. Shepherd, Jack Duffy