Industry is moving towards multi-core designs as we have hit the memory and power walls. Multi-core designs are very effective at exploiting thread-level parallelism (TLP) but do not...
Abstract. We show how computations such as those involved in American or European-style option price valuations with the explicit finite difference method can be performed in par...
Abstract. Multi-core processors with shared last-level caches are vulnerable to performance inefficiencies and fairness issues when the cache is not carefully managed between the m...
Abstract. Manually tuning MPI runtime parameters is a practice commonly employed to optimise MPI application performance on a specific architecture. However, the best setting for ...
Simone Pellegrini, Jie Wang, Thomas Fahringer, Han...
Previous research has shown that Staged Execution (SE), i.e., dividing a program into segments and executing each segment at the core that has the data and/or functionality to bes...