Document clustering is useful in many information retrieval tasks: document browsing, organization and viewing of retrieval results, generation of Yahoo-like hierarchies of docume...
Background: When analyzing microarray gene expression data, missing values are often encountered. Most multivariate statistical methods proposed for microarray data analysis canno...
In this paper we derive an independent-component analysis (ICA) method for analyzing two or more data sets simultaneously. Our model extracts independent components common to all ...
Ana S. Lukic, Miles N. Wernick, Lars Kai Hansen, J...
We present a probabilistic model for robust principal component analysis (PCA) in which the observation noise is modelled by Student-t distributions that are independent ...
In this paper, we systematically assess the value of using web-scale N-gram data in state-of-the-art supervised NLP classifiers. We compare classifiers that include or exclude fea...