In this paper we present a scheme for reducing the amount of synchronization overhead needed between components, after HW/SW partitioning, to preserve the original control flo...
In 2002, Japan announced the Earth Simulator—a supercomputer based on low-volume vector processors and a custom network—and reported that computational scientists had used it ...
Speaker clustering is the task of grouping a set of speech utterances into speaker-specific classes. The basic techniques for solving this task are similar to those used for spea...
This paper presents a delay-optimal FPGA clustering algorithm targeting low power. We assume that the configurable logic blocks of the FPGA can be programmed using either a high s...
We present and analyze two new communication libraries, cudaMPI and glMPI, that provide an MPI-like message passing interface to communicate data stored on the graphics cards of...