Tuning parallel code can be a time-consuming and difficult task. We present our approach to automate the performance analysis of OpenMP applications that is based on the notion of ...
Both inherently sequential code and limitations of analysis techniques prevent full parallelization of many applications by parallelizing compilers. Amdahl's Law tells us tha...
Run-time parallelization is often the only way to execute the code in parallel when data dependence information is incomplete at compile time. This situation is common in many imp...
Message-based process architectures are widely regarded as an effective method for structuring parallel protocol processing on shared memory multi-processor platforms. A message-b...
This paper introduces Strings, a high performance distributed shared memory system designed for clusters of symmetrical multiprocessors (SMPs). The distinguishing feature of this ...