Sciweavers

312 search results - page 51 / 63
» A General Divide and Conquer Approach for Process Mining
DMKD
2004
ACM
15 years 4 months ago
Iterative record linkage for cleaning and integration
Record linkage, the problem of determining when two records refer to the same entity, has applications for both data cleaning (deduplication) and for integrating data from multipl...
Indrajit Bhattacharya, Lise Getoor
DOLAP
2004
ACM
15 years 4 months ago
Developing a characterization of business intelligence workloads for sizing new database systems
Computer system sizing involves estimating the amount of hardware resources needed to support a new workload not yet deployed in a production environment. In order to determine th...
Ted J. Wasserman, Patrick Martin, David B. Skillic...
KDD
2008
ACM
15 years 11 months ago
Identifying biologically relevant genes via multiple heterogeneous data sources
Selection of genes that are differentially expressed and critical to a particular biological process has been a major challenge in post-array analysis. Recent development in bioin...
Zheng Zhao, Jiangxin Wang, Huan Liu, Jieping Ye, Y...
KDD
2007
ACM
15 years 11 months ago
Exploiting duality in summarization with deterministic guarantees
Summarization is an important task in data mining. A major challenge over the past years has been the efficient construction of fixed-space synopses that provide a deterministic q...
Panagiotis Karras, Dimitris Sacharidis, Nikos Mamo...
WSDM
2009
ACM
15 years 6 months ago
Top-k aggregation using intersections of ranked inputs
There has been considerable past work on efficiently computing top k objects by aggregating information from multiple ranked lists of these objects. An important instance of this...
Ravi Kumar, Kunal Punera, Torsten Suel, Sergei Vas...