Clustering is the problem of identifying the distribution of patterns and intrinsic correlations in large data sets by partitioning the data points into similarity classes. This p...
Motivation: In the Affymetrix GeneChip system, preprocessing occurs before one obtains expression level measurements. Because the number of competing preprocessing methods was lar...
Microarray data usually contains a high level of noisy gene data, the noisy gene data include incorrect, noise and irrelevant genes. Before Microarray data classification takes pla...
We introduce a novel approach to incremental e-mail categorization based on identifying and exploiting "clumps" of messages that are classified similarly. Clumping reflec...
We study the problem of enumerating substrings that are common amongst genomes that share evolutionary descent. For example, one might want to enumerate all identical (therefore co...
Stanislav Angelov, Boulos Harb, Sampath Kannan, Sa...