Efficient performance tuning of parallel programs is often hard. We present a performance prediction and visualization tool called VPPB. Based on a monitored uni-processor executi...
With the availability of chip multiprocessor (CMP) and simultaneous multithreading (SMT) machines, extracting thread level parallelism from a sequential program has become crucial...
This paper describes the design of a compiler which can translate out-of-core programs written in a data parallel language like HPF. Such a compiler is required for compiling larg...
Rajeev Thakur, Rajesh Bordawekar, Alok N. Choudhar...
Dataflow computation models enable simpler and more efficient management of the memory hierarchy - a key barrier to the performance of many parallel programs. This paper describes...
Parallel programming models based on a mixture of task and data parallelism have shown to be successful in addressing the increasing communication overhead of distributed memory p...