Sciweavers

4085 search results - page 220 / 817
» Benchmarking Data Mining Algorithms
BMCBI
2005
15 years 3 months ago
CLU: A new algorithm for EST clustering
Background: The continuous flow of EST data remains one of the richest sources for discoveries in modern biology. The first step in EST data mining is usually associated with EST ...
Andrey A. Ptitsyn, Winston Hide
KDD
2006
ACM
16 years 4 months ago
Workload-aware anonymization
Protecting data privacy is an important problem in microdata distribution. Anonymization algorithms typically aim to protect individual privacy, with minimal impact on the quality...
Kristen LeFevre, David J. DeWitt, Raghu Ramakrishn...
KDD
2001
ACM
16 years 4 months ago
The distributed boosting algorithm
In this paper, we propose a general framework for distributed boosting intended for efficiently integrating specialized classifiers learned over very large and distributed homogeneo...
Aleksandar Lazarevic, Zoran Obradovic
KDD
2004
ACM
16 years 4 months ago
Mining reference tables for automatic text segmentation
Automatically segmenting unstructured text strings into structured records is necessary for importing the information contained in legacy sources and text collections into a data ...
Eugene Agichtein, Venkatesh Ganti
HIS
2008
15 years 5 months ago
Genetic-Based Synthetic Data Sets for the Analysis of Classifiers Behavior
In this paper, we highlight the use of synthetic data sets to analyze learners' behavior under bounded complexity. We propose a method to generate synthetic data sets with a specif...
Núria Macià, Albert Orriols-Puig, Es...