Sciweavers

1403 search results - page 149 / 281
» Set cover algorithms for very large datasets
KDD
2005
ACM
166 views, Data Mining
16 years 6 months ago
A general model for clustering binary data
Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
Tao Li
BIOINFORMATICS
2006
120 views
15 years 6 months ago
Comparison of Affymetrix GeneChip expression measures
Motivation: In the Affymetrix GeneChip system, preprocessing occurs before one obtains expression level measurements. Because the number of competing preprocessing methods was lar...
Rafael A. Irizarry, Zhijin Wu, Harris A. Jaffee
ACIIDS
2010
IEEE
170 views, Database
15 years 4 months ago
On the Effectiveness of Gene Selection for Microarray Classification Methods
Microarray data usually contain a high level of noisy gene data, including incorrect, noisy, and irrelevant genes. Before Microarray data classification takes pla...
Zhongwei Zhang, Jiuyong Li, Hong Hu, Hong Zhou
ADMA
2010
Springer
271 views, Data Mining
15 years 1 months ago
Exploiting Concept Clumping for Efficient Incremental E-Mail Categorization
We introduce a novel approach to incremental e-mail categorization based on identifying and exploiting "clumps" of messages that are classified similarly. Clumping reflec...
Alfred Krzywicki, Wayne Wobcke
RECOMB
2006
Springer
16 years 6 months ago
Efficient Enumeration of Phylogenetically Informative Substrings
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore co...
Stanislav Angelov, Boulos Harb, Sampath Kannan, Sa...