Complex shaders must be partitioned into multiple passes to execute on GPUs with limited hardware resources. Automatic partitioning gives rise to an NP-hard scheduling problem tha...
—The emergence of multi-core computers has led to explosive development of parallel applications and hence the need of efficient schedulers for parallel jobs. Adaptive online sc...
Asynchronous or "event-driven" programming is a popular technique to efficiently and flexibly manage concurrent interactions. In these programs, the programmer can post ...
This paper presents a technique to map automatically a complete digital signal processing (DSP) application onto a parallel machine with distributed memory. Unlike other applicati...
The most intuitive memory consistency model for shared-memory multi-threaded programming is sequential consistency (SC). However, current concurrent programming languages support ...
Daniel Marino, Abhayendra Singh, Todd D. Millstein...