Network processors today consist of multiple parallel processors (microengines) with support for multiple threads to exploit the packet-level parallelism inherent in network workload...
This paper describes a source-to-source compilation tool for optimizing MPI-based parallel applications. This tool is able to automatically apply a “prepushing” transformation...
With the help of FPGA technology, the border between hardware and software has vanished. It is now possible to develop complex designs and fine-grained parallel applicat...
Adaptive applications have computational workloads and communication patterns which change unpredictably at runtime, requiring dynamic load balancing to achieve scalable performan...
Hongzhang Shan, Jaswinder Pal Singh, Leonid Oliker...
Efficient performance tuning of parallel programs is often hard. We present a performance prediction and visualization tool called VPPB. Based on a monitored uni-processor executi...