This paper describes a compiler for stream programs that efficiently schedules computational kernels and stream memory operations, and allocates on-chip storage. Our compiler uses...
Abstract: Existing multicore systems already provide deep levels of thread parallelism. Hybrid programming models and composability of parallel libraries are very active areas of r...
Costin Iancu, Steven Hofmeyr, Filip Blagojevic, Yi...
The emergence of grid and a new class of data-driven applications is making a new form of parallelism desirable, which we refer to as coarse-grained pipelined parallelism. This pa...
Abstract—We present LeWI: a novel load balancing algorithm, that can balance applications with very different patterns of imbalance. Our algorithm can balance fine grain imbalan...
Content-based Image Retrieval (CBIR) is a computer vision application that aims at automatically retrieving images based on their visual content. Linear Discriminat Analysis and i...