— Applications raising in many scientific fields exhibit both data and task parallelism that have to be exploited efficiently. A classic approach is to structure those applica...
—The current trend in clusters architecture leads toward a massive use of multicore chips. This hardware evolution raises bottleneck issues at the network interface level. The us...
Abstract—The Special PRiority and Urgent Computing Environment (SPRUCE) provides on-demand access to highperformance computing resources for time-critical applications. While SPR...
—The industry is now in agreement that the future of architecture design lies in multiple cores. As a consequence, all computer systems today, from embedded devices to petascale ...
—Distributed file systems that use multiple servers to store data in parallel are becoming commonplace. Much work has already gone into such systems to maximize data throughput....
Nawab Ali, Ananth Devulapalli, Dennis Dalessandro,...
—This paper describes the application of various search techniques to the problem of automatic empirical code optimization. The search process is a critical aspect of auto-tuning...
—User-level communication allows an application process to access the network interface directly. Bypassing the kernel requires that a user process accesses the network interface...
—Several performance analysis tools support hybrid applications. Most originated as MPI profiling or tracing tools and OpenMP capabilities were added to extend the performance a...
—Parallel performance monitoring extends parallel measurement systems with infrastructure and interfaces for online performance data access, communication, and analysis. At the s...
Aroon Nataraj, Allen D. Malony, Allen Morris, Dori...