Data parallel compilers have long aimed to equal the performance of carefully hand-optimized parallel codes. For tightly-coupled applications based on line sweeps, this goal has b...
We consider the filter decomposition problem in supporting coarse-grained pipelined parallelism. This form of parallelism is suitable for data-driven applications in scenarios wh...
In this paper we discuss object detection when only a small number of training examples are given. Specifically, we show how to incorporate a simple prior on the distribution of n...
Simulation of large-scale networks requires enormous amounts of memory and processing time. One way of speeding up these simulations is to distribute the model over a number of co...
: This is a position paper introducing blob computing: A Blob is a generic primitive used to structure a uniform computing substrate into an easier-to-program parallel virtual mach...