This paper presents Clusterfile, a parallel file system that provides parallel file access on a cluster of computers. Existing parallel file systems offer little control over matc...
We discuss the High Performance Fortran data parallel programming language as an aid to software engineering and as a tool for exploiting High Performance Computing syste...
In recent years, active learning methods based on experimental design have achieved state-of-the-art performance in text classification applications. Although these methods can exploit ...
We propose a distributed parallel support vector machine (DPSVM) training mechanism in a configurable network environment for distributed data mining. The basic idea is to exchange...
We consider the problem of numerical stability and model density growth when training a sparse linear model from massive data. We focus on scalable algorithms that optimize certain...