Sciweavers

367 search results - page 1 / 74
» Duplicate detection in probabilistic data
ICDE 2010, IEEE (Database)
Duplicate detection in probabilistic data
Abstract: Collected data often contains uncertainties. Probabilistic databases have been proposed to manage uncertain data. To combine data from multiple autonomous probabilistic...
Fabian Panse, Maurice van Keulen, Ander de Keijzer...
ICDE 2010, IEEE (Database)
ProbClean: A probabilistic duplicate detection system
One of the most prominent data quality problems is the existence of duplicate records. Current data cleaning systems usually produce one clean instance (repair) of the input da...
George Beskales, Mohamed A. Soliman, Ihab F. Ilyas...
P2P 2010, IEEE (Communications)
Optimizing Near Duplicate Detection for P2P Networks
In this paper, we propose a probabilistic algorithm for detecting near duplicate text, audio, and video resources efficiently and effectively in large-scale P2P systems. To thi...
Odysseas Papapetrou, Sukriti Ramesh, Stefan Siersd...
KDD 2005, ACM (Data Mining)
A hit-miss model for duplicate detection in the WHO drug safety database
The WHO Collaborating Centre for International Drug Monitoring in Uppsala, Sweden, maintains and analyses the world's largest database of reports on suspected adverse drug re...
Andrew Bate, G. Niklas Norén, Roland Orre
ESWS 2010, Springer
Efficient Semantic-Aware Detection of Near Duplicate Resources
Abstract. Efficiently detecting near duplicate resources is an important task when integrating information from various sources and applications. Once detected, near duplicate reso...
Ekaterini Ioannou, Odysseas Papapetrou, Dimitrios ...