Sciweavers

TCBB
2010

Feature Selection for Gene Expression Using Model-Based Entropy

Gene expression data usually contain a large number of genes but only a small number of samples. Feature selection for gene expression data aims to find a set of genes that best discriminates between biological samples of different types. Traditional machine-learning-based gene selection using empirical mutual information suffers from data sparseness because of the small number of samples. To overcome this sparseness, we propose a model-based approach that estimates the entropy of the class variables on the model instead of on the data themselves. Here, we use multivariate normal distributions to fit the data, because multivariate normal distributions have maximum entropy among all real-valued distributions with specified mean and standard deviation, and are widely used to approximate various distributions. Given that the data follow a multivariate normal distribution, since the conditional distribution of class variables given the selected features is a normal distribution, i...
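The idea of scoring gene subsets by entropy under a fitted normal model can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single class variable coded as a number and treated as jointly Gaussian with the gene expression values, uses the standard Gaussian fact that the conditional variance of y given the selected genes S is var(y) - c^T Sigma_SS^{-1} c, with entropy 0.5 * log(2 * pi * e * variance), and picks genes by a simple greedy forward search. The function names, the `ridge` regularizer, and the toy data are all hypothetical.

```python
import math

def covariance(a, b):
    """Sample covariance of two equal-length sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / (n - 1)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def conditional_entropy(genes, y, selected, ridge=1e-6):
    """Entropy (in nats) of y given the selected genes under a joint Gaussian fit.

    `ridge` is an assumed regularizer that keeps the covariance matrix
    invertible when there are few samples.
    """
    cond_var = covariance(y, y)
    if selected:
        cols = [genes[j] for j in selected]
        S = [[covariance(a, b) + (ridge if i == k else 0.0)
              for k, b in enumerate(cols)] for i, a in enumerate(cols)]
        c = [covariance(a, y) for a in cols]
        w = solve(S, c)  # w = Sigma_SS^{-1} c
        cond_var -= sum(ci * wi for ci, wi in zip(c, w))
    cond_var = max(cond_var, 1e-12)  # guard against round-off below zero
    return 0.5 * math.log(2 * math.pi * math.e * cond_var)

def greedy_select(genes, y, k):
    """Greedily add the gene that most reduces the conditional entropy of y."""
    selected = []
    for _ in range(k):
        remaining = [j for j in range(len(genes)) if j not in selected]
        best = min(remaining,
                   key=lambda j: conditional_entropy(genes, y, selected + [j]))
        selected.append(best)
    return selected

# Toy data (made up): each inner list is one gene measured across six samples.
y = [0, 0, 0, 1, 1, 1]
genes = [
    [1.0, 1.2, 0.9, 3.1, 2.9, 3.0],  # tracks the class label
    [0.5, 0.1, 0.4, 0.2, 0.5, 0.3],  # unrelated to the class label
    [2.0, 2.1, 1.9, 2.0, 2.1, 1.9],  # unrelated to the class label
]
first_pick = greedy_select(genes, y, 1)
```

Because the entropy is computed from fitted covariances rather than from empirical counts, the score stays well defined even with very few samples, which is the sparseness problem the abstract describes. With several class variables, the scalar variance would be replaced by the log-determinant of the conditional covariance matrix.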
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010
Where TCBB
Authors Shenghuo Zhu, Dingding Wang, Kai Yu, Tao Li, Yihong Gong