Sciweavers

155 search results - page 8 / 31
» On the Automatic Parallelization of the Perfect Benchmarks
HPCA
2001
IEEE
16 years 2 months ago
Reducing DRAM Latencies with an Integrated Memory Hierarchy Design
In this paper, we address the severe performance gap caused by high processor clock rates and slow DRAM accesses. We show that even with an aggressive, next-generation memory syst...
Wei-Fen Lin, Steven K. Reinhardt, Doug Burger
ICS
2005
Tsinghua U.
15 years 7 months ago
Towards automatic translation of OpenMP to MPI
We present compiler techniques for translating OpenMP shared-memory parallel applications into MPI message-passing programs for execution on distributed memory systems. This transl...
Ayon Basumallik, Rudolf Eigenmann
DATE
2010
IEEE
15 years 7 months ago
Recursion-driven parallel code generation for multi-core platforms
We present Huckleberry, a tool for automatically generating parallel implementations for multi-core platforms from sequential recursive divide-and-conquer programs. The recursiv...
Rebecca L. Collins, Bharadwaj Vellore, Luca P. Car...
MIDDLEWARE
2010
Springer
15 years 7 days ago
Automatically Generating Symbolic Prefetches for Distributed Transactional Memories
Developing efficient distributed applications while managing complexity can be challenging. Managing network latency is a key challenge for distributed applications. We ...
Alokika Dash, Brian Demsky
ICS
1993
Tsinghua U.
15 years 6 months ago
The EM-4 Under Implicit Parallelism
The EM-4 is a supercomputer that offers very fast interprocessor communication and support for multithreading. In this paper we demonstrate that the EM-4, together with an auto...
Lubomir Bic, Mayez A. Al-Mouhamed