ISBRA
2007
Springer

Clustering Algorithms Optimizer: A Framework for Large Datasets

Clustering algorithms are employed in many bioinformatics tasks, including categorization of protein sequences and analysis of gene-expression data. Although these algorithms are routinely applied, many of them suffer from the following limitations: (i) reliance on predetermined parameter settings, such as a priori knowledge of the number of clusters; (ii) nondeterministic procedures that yield inconsistent outcomes. Thus, a framework that addresses these shortcomings is desirable. We provide a data-driven framework that includes two interrelated steps. The first is SVD-based dimension reduction and the second is automated tuning of the algorithm's parameter(s). The dimension reduction step is efficiently adjusted for very large datasets. The optimal parameter setting is identified according to an internal evaluation criterion, the Bayesian Information Criterion (BIC). This framework can incorporate most clustering algorithms and improve their performance. In...
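
The two steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: k-means with a deterministic farthest-point initialization stands in for whichever clustering algorithm the framework wraps, the tuned parameter is the number of clusters, the BIC is one common spherical-Gaussian variant (lower is better), and a plain full SVD is used, so the adjustment for very large datasets is not reproduced. All function names (`svd_reduce`, `kmeans`, `bic`, `choose_k`) are hypothetical.

```python
import numpy as np

def svd_reduce(X, r):
    """Step 1: project the centered data onto its top-r singular directions."""
    Xc = X - X.mean(axis=0)
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :r] * s[:r]

def kmeans(X, k, n_iter=100):
    """Lloyd's k-means; farthest-point seeding keeps the result deterministic."""
    centers = [X[0]]
    for _ in range(1, k):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers, dtype=float)
    for _ in range(n_iter):
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centers[j] for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

def bic(X, labels, centers):
    """Assumed BIC variant: hard assignments, spherical Gaussians, shared variance."""
    n, d = X.shape
    k = len(centers)
    rss = np.sum((X - centers[labels]) ** 2)
    var = max(rss / (n * d), 1e-12)
    counts = np.bincount(labels, minlength=k)
    nz = counts[counts > 0]
    log_lik = (np.sum(nz * np.log(nz / n))
               - 0.5 * n * d * np.log(2 * np.pi * var)
               - 0.5 * rss / var)
    n_params = k * d + (k - 1) + 1  # centers, mixing weights, shared variance
    return -2.0 * log_lik + n_params * np.log(n)

def choose_k(X, r, k_values):
    """Step 2: pick the cluster count with the lowest BIC in the reduced space."""
    Z = svd_reduce(X, r)
    scores = {k: bic(Z, *kmeans(Z, k)) for k in k_values}
    return min(scores, key=scores.get), scores

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    means = np.zeros((3, 10))
    means[1, 0] = means[2, 1] = 10.0  # three well-separated synthetic groups
    data = np.vstack([m + rng.normal(size=(100, 10)) for m in means])
    best_k, all_scores = choose_k(data, r=2, k_values=range(1, 7))
    print("selected k:", best_k)
```

Because both the seeding and the selection rule are deterministic, repeated runs on the same data return the same parameter choice, which is the kind of consistency the abstract asks for. Swapping in another clustering algorithm only requires replacing `kmeans` with a function that returns labels and cluster centers.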

Roy Varshavsky, David Horn, Michal Linial

Algorithms | Bioinformatics | Dimension Reduction | ISBRA 2007 | Quantum Clustering Algorithms

Post Info

Added	08 Jun 2010
Updated	08 Jun 2010
Type	Conference
Year	2007
Where	ISBRA
Authors	Roy Varshavsky, David Horn, Michal Linial