KDD
2004
ACM

Assessment of discretization techniques for relevant pattern discovery from gene expression data

In gene expression data analysis, several researchers have recently emphasized the promise of pattern discovery techniques, such as association rule mining or formal concept extraction, applied to boolean matrices that encode gene properties. To get the most from these approaches, a necessary step is the encoding of gene properties (e.g., over-expression), which requires discretizing the raw gene expression data. This preprocessing step has a crucial impact on both the quantity and the relevance of the extracted patterns. In this paper, we study the impact of discretization parameters through a sound comparison between dendrograms, i.e., trees generated by a hierarchical clustering algorithm, computed from the raw expression data and from the various derived boolean matrices. Thanks to a new similarity measure and practical validation over several gene expression data sets, we propose a method that supports the choice of a discretization technique and its par...
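The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not define their similarity measure, so the sketch substitutes a plain Pearson correlation between cophenetic (merge-height) distances of the two dendrograms. The discretization rule (a value counts as over-expressed when it reaches a chosen fraction of the gene's maximum), the average-linkage clustering, the toy matrix, and all function names are assumptions made for illustration only.

```python
from itertools import combinations
from math import dist, sqrt


def discretize(matrix, fraction=0.5):
    """Hypothetical encoding: 1 when a value reaches `fraction` of its row maximum."""
    return [[1 if v >= fraction * max(row) else 0 for v in row] for row in matrix]


def cophenetic(rows):
    """Average-linkage agglomerative clustering of the rows (Euclidean distance).

    Returns a dict mapping each pair of row indices (i, j), i < j, to the
    height at which the two rows are first merged in the dendrogram.
    """
    clusters = [[i] for i in range(len(rows))]
    heights = {}

    def linkage(a, b):
        # Mean pairwise distance between the members of two clusters.
        return sum(dist(rows[i], rows[j]) for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        height = linkage(a, b)
        for i in a:
            for j in b:
                heights[(min(i, j), max(i, j))] = height
        clusters = [c for c in clusters if c is not a and c is not b] + [a + b]
    return heights


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either input is constant."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


def dendrogram_agreement(raw, fraction):
    """Stand-in similarity between the raw-data and boolean-data dendrograms."""
    raw_heights = cophenetic(raw)
    bool_heights = cophenetic(discretize(raw, fraction))
    pairs = sorted(raw_heights)
    return pearson([raw_heights[p] for p in pairs], [bool_heights[p] for p in pairs])


# Toy expression matrix (made-up values): rows are genes, columns are conditions.
raw = [
    [9.0, 8.5, 1.0, 1.2],
    [8.8, 9.1, 0.9, 1.1],
    [1.0, 1.3, 7.9, 8.2],
    [1.2, 0.8, 8.4, 8.0],
]

# Score a few candidate thresholds; a higher score means the boolean matrix
# preserves more of the cluster structure found in the raw data.
for fraction in (0.3, 0.5, 0.95):
    print(fraction, round(dendrogram_agreement(raw, fraction), 3))
```

Under these assumptions, a discretization parameter would be preferred when the dendrogram built from its boolean matrix stays closest to the one built from the raw data.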
Added 30 Nov 2009
Updated 30 Nov 2009
Type Conference
Year 2004
Where KDD
Authors Ruggero G. Pensa, Claire Leschi, Jérémy Besson, Jean-François Boulicaut