Sciweavers

176 search results - page 32 / 36
» Visual structure-based web page clustering and retrieval
SIGIR
2008
ACM
14 years 9 months ago
Comments-oriented document summarization: understanding documents with readers' feedback
Comments left by readers on Web documents contain valuable information that can be utilized in different information retrieval tasks including document search, visualization, and ...
Meishan Hu, Aixin Sun, Ee-Peng Lim
WWW
2008
ACM
15 years 10 months ago
Efficient similarity joins for near duplicate detection
With the increasing amount of data and the need to integrate data from multiple data sources, a challenging issue is to find near duplicate records efficiently. In this paper, we ...
Chuan Xiao, Wei Wang 0011, Xuemin Lin, Jeffrey Xu ...
SIGMOD
2001
ACM
15 years 9 months ago
XML Document Versioning
Managing multiple versions of XML documents represents an important problem, because of many applications ranging from traditional ones, such as software configuration control, to...
Shu-Yao Chien, Vassilis J. Tsotras, Carlo Zaniolo
WSDM
2010
ACM
15 years 4 months ago
Learning URL patterns for webpage de-duplication
Presence of duplicate documents in the World Wide Web adversely affects crawling, indexing and relevance, which are the core building blocks of web search. In this paper, we pres...
Hema Swetha Koppula, Krishna P. Leela, Amit Agarwa...
MM
2010
ACM
14 years 7 months ago
Discriminative codeword selection for image representation
Bag of features (BoF) representation has attracted an increasing amount of attention in large scale image processing systems. BoF representation treats images as loose collections...
Lijun Zhang 0005, Chun Chen, Jiajun Bu, Zhengguang...