A key challenge in supporting data-driven scientific applications is the storage and management of input and output data in a distributed environment. In this paper, we describe a...
Stephen Langella, Shannon Hastings, Scott Oster, T...
In this paper, block diagonal linear discriminant analysis (BDLDA) is improved and applied to gene expression data. BDLDA is a classification tool with embedded feature selection...
Lingyan Sheng, Roger Pique-Regi, Shahab Asgharzade...
Fluid flow in porous media is a dynamic process that is traditionally modeled using partial differential equations (PDEs). In this approach, physical properties related to fluid fl...
We consider the problem of efficiently storing n-gram counts for large n over very large corpora. In such cases, the efficient storage of sufficient statistics can ha...
Data stream clustering has emerged as a challenging and interesting problem over the past few years. Due to the evolving nature and the one-pass restriction imposed by the data strea...