Sciweavers

1031 search results - page 144 / 207
» Managing the operator ordering problem in parallel databases
IQIS
2005
ACM
15 years 6 months ago
Exploiting relationships for object consolidation
Researchers in the data mining area frequently have to spend a significant portion of their time on preprocessing the data in order to apply their algorithms to real-world datasets...
Zhaoqi Chen, Dmitri V. Kalashnikov, Sharad Mehrotr...
PVLDB
2008
99 views
15 years 2 days ago
Industry-scale duplicate detection
Duplicate detection is the process of identifying multiple representations of the same real-world object in a data source. Duplicate detection is a problem of critical importance in...
Melanie Weis, Felix Naumann, Ulrich Jehle, Jens Lu...
BTW
2005
Springer
90 views, Database
15 years 2 months ago
The Importance of Being Earnest about Definitions
Ideas from terminology management, the science of terms and definitions, can be used to improve the quality of software and data models, as well as to facilitate the achievement ...
Susan Thomas
KDD
2009
ACM
202 views, Data Mining
16 years 1 months ago
Correlated itemset mining in ROC space: a constraint programming approach
Correlated or discriminative pattern mining is concerned with finding the highest scoring patterns w.r.t. a correlation measure (such as information gain). By reinterpreting corre...
Siegfried Nijssen, Tias Guns, Luc De Raedt
WIDM
2005
ACM
15 years 6 months ago
Query translation scheme for heterogeneous XML data sources
In order to formulate a meaningful XML query, a user must have some knowledge of the schema of the XML documents to be queried. The query will succeed only if the schema of the ac...
Cindy X. Chen, George A. Mihaila, Sriram Padmanabh...