We investigate an inherent limitation of top-down decision tree induction in which the continuous partitioning of the instance space progressively lessens the statistical support o...
In today’s applications data is produced at unprecedented rates. While the capacity to collect and store new data rapidly grows, the ability to analyze these data volumes increa...
Analysis of postgenomic biological data (such as microarray and SNP data) is a subtle art and science, and the statistical methods most commonly utilized sometimes prove inadequat...
We consider feature extraction (dimensionality reduction) for compositional data, where the data vectors are constrained to be positive and constant-sum. In real-world problems, t...
The development of accurate models and efficient algorithms for the analysis of multivariate categorical data are important and longstanding problems in machine learning and compu...
Mohammad Emtiyaz Khan, Shakir Mohamed, Benjamin M....