We present a novel approach to dealing with overfitting in black-box models. It is based on the leverages of the samples, i.e. on the influence that each observation has on the pa...
Background: One of the most commonly performed tasks when analysing high throughput gene expression data is to use clustering methods to classify the data into groups. There are a...
T. Ian Simpson, J. Douglas Armstrong, Andrew P. Ja...
The importance of text mining stems from the availability of huge volumes of text databases holding a wealth of valuable information that needs to be mined. Text categorization is...
It is possible to broadly characterize two approaches to probabilistic modeling in terms of generative and discriminative methods. Provided with sufficient training data the discr...
Background: Accurate assignment of genes to pathways is essential in order to understand the functional role of genes and to map the existing pathways in a given genome. Existing ...