Sciweavers

» Graph Mining based on a Data Partitioning Approach
KDD
2003
ACM
Adaptive duplicate detection using learnable string similarity measures
The problem of identifying approximately duplicate records in databases is an essential step for data cleaning and data integration processes. Most existing approaches have relied...
Mikhail Bilenko, Raymond J. Mooney
SDM
2007
SIAM
Approximating Representations for Large Numerical Databases
The paper introduces a notion of support for real-valued functions. It is shown how to approximate supports of a large class of functions based on supports of so-called polynomial ...
Szymon Jaroszewicz, Marcin Korzen
CAISE
2007
Springer
Declarative XML Data Cleaning with XClean
Data cleaning is the process of correcting anomalies in a data source that may, for instance, be due to typographical errors or duplicate representations of an entity. It is a cruc...
Melanie Weis, Ioana Manolescu
ANOR
2010
Alternating local search based VNS for linear classification
We consider the linear classification method consisting of separating two sets of points in d-space by a hyperplane. We wish to determine the hyperplane which minimises the sum of...
Frank Plastria, Steven De Bruyne, Emilio Carrizosa
EDBT
2010
ACM
Techniques for efficiently querying scientific workflow provenance graphs
A key advantage of scientific workflow systems over traditional scripting approaches is their ability to automatically record data and process dependencies introduced during workf...
Manish Kumar Anand, Shawn Bowers, Bertram Ludä...