Sciweavers

56 search results - page 1 / 12
» Adaptive near-duplicate detection via similarity learning
P2P
2010
IEEE
202 views, Communications
13 years 2 months ago
Optimizing Near Duplicate Detection for P2P Networks
In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To thi...
Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersd...
SIGIR
2010
ACM
13 years 8 months ago
Adaptive near-duplicate detection via similarity learning
In this paper, we present a novel near-duplicate document detection method that can easily be tuned for a particular domain. Our method represents each document as a real-valued s...
Hannaneh Hajishirzi, Wen-tau Yih, Aleksander Kolcz
KDD
2004
ACM
195 views, Data Mining
14 years 5 months ago
Improved robustness of signature-based near-replica detection via lexicon randomization
Detection of near duplicate documents is an important problem in many data mining and information filtering applications. When faced with massive quantities of data, traditional d...
Aleksander Kolcz, Abdur Chowdhury, Joshua Alspecto...
ICML
2009
ACM
14 years 5 months ago
Domain adaptation from multiple sources via auxiliary classifiers
We propose a multiple source domain adaptation method, referred to as Domain Adaptation Machine (DAM), to learn a robust decision function (referred to as target classifier) for l...
Lixin Duan, Ivor W. Tsang, Dong Xu, Tat-Seng Chua
LREC
2008
125 views, Education
13 years 6 months ago
Adaptation of Relation Extraction Rules to New Domains
This paper presents various strategies for improving the extraction performance of less prominent relations with the help of the rules learned for similar relations, for which lar...
Feiyu Xu, Hans Uszkoreit, Hong Li, Niko Felger