One major problem of existing methods to mine data streams is that it makes ad hoc choices to combine most recent data with some amount of old data to search the new hypothesis. T...
Abstract - We discuss an ensemble-of-classifiers based algorithm for the missing feature problem. The proposed approach is inspired in part by the random subspace method, and in pa...
Background: Advances in sequencing and genotyping technologies are leading to the widespread availability of multi-species variation data, dense genotype data and large-scale rese...
Daniel Rios, William M. McLaren, Yuan Chen, Ewan B...
Using visualization techniques to explore and understand high-dimensional data is an efficient way to combine human intelligence with the immense brute force computation power ava...
In this paper, we investigate stability-based methods for cluster model selection, in particular to select the number K of clusters. The scenario under consideration is that clust...