Sciweavers

» NAS Parallel Benchmark Results
IPPS
2005
IEEE
15 years 3 months ago
Exploring the Energy-Time Tradeoff in MPI Programs on a Power-Scalable Cluster
Recently, energy has become an important issue in high-performance computing. For example, supercomputers that have energy in mind, such as BlueGene/L, have been built; the idea is...
Vincent W. Freeh, Feng Pan, Nandini Kappiah, David...
ICCS
2004
Springer
15 years 2 months ago
Improving Geographical Locality of Data for Shared Memory Implementations of PDE Solvers
On cc-NUMA multi-processors, the non-uniformity of main memory latencies motivates the need for co-location of threads and data. We call this special form of data locality, geogra...
Henrik Löf, Markus Nordén, Sverker Hol...
MASCOTS
2010
14 years 11 months ago
Efficient Discovery of Loop Nests in Execution Traces
Execution and communication traces are central to performance modeling and analysis. Since the traces can be very long, meaningful compression and extraction of representative beha...
Qiang Xu, Jaspal Subhlok, Nathaniel Hammen
EUROPAR
2001
Springer
15 years 1 months ago
Data-Parallel Compiler Support for Multipartitioning
Multipartitioning is a skewed-cyclic block distribution that yields better parallel efficiency and scalability for line-sweep computations than traditional block partitionings. Th...
Daniel G. Chavarría-Miranda, John M. Mellor...
CLUSTER
2006
IEEE
15 years 1 months ago
A Performance Instrumentation Framework to Characterize Computation-Communication Overlap in Message-Passing Systems
Effective overlap of computation and communication is a well understood technique for latency hiding and can yield significant performance gains for applications on high-end compu...
Aniruddha G. Shet, P. Sadayappan, David E. Bernhol...