We present a technique for analyzing the number of cache misses incurred by multithreaded cache oblivious algorithms on an idealized parallel machine in which each processor has a...
In this paper, we use the tensor product notation as the framework of a programming methodology for designing block recursive algorithms. We first express a computational problem ...
A global barrier synchronizes all processors in a parallel system. This paper investigates algorithms that allow disjoint subsets of processors to synchronize independently and in...
Anja Feldmann, Thomas R. Gross, David R. O'Hallaro...
This paper presents Jade, a language which allows a programmer to easily express dynamic coarse-grain parallelism. Starting with a sequential program, a programmer augments those ...
The main purpose of this paper is to describe available (HPC)based implementations of remotely sensed hyperspectral image processing algorithms on multi-computer clusters, heterog...