Sciweavers

WEBDB
2009
Springer
124 views, Database
13 years 11 months ago
Bridging the Terminology Gap in Web Archive Search
Web archives play an important role in preserving our cultural heritage for future generations. When searching them, a serious problem arises from the fact that terminology evolve...
Klaus Berberich, Srikanta J. Bedathur, Mauro Sozio...
WSDM
2010
ACM
204 views, Data Mining
13 years 11 months ago
Learning URL patterns for webpage de-duplication
The presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...