Sciweavers

209 search results - page 37 / 42
» Optimizing Parallel Applications for Wide-Area Clusters
ICS
2007
Tsinghua U.
15 years 3 months ago
Automatic nonblocking communication for partitioned global address space programs
Overlapping communication with computation is an important optimization on current cluster architectures; its importance is likely to increase as the doubling of processing power ...
Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine...
PVM
2005
Springer
15 years 3 months ago
Implementing Byte-Range Locks Using MPI One-Sided Communication
We present an algorithm for implementing byte-range locks using MPI passive-target one-sided communication. This algorithm is useful in any scenario in which multiple processes of ...
Rajeev Thakur, Robert B. Ross, Robert Latham
HPDC
2012
IEEE
12 years 12 months ago
VNET/P: bridging the cloud and high performance computing through fast overlay networking
Overlay networking with a layer 2 abstraction provides a powerful model for virtualized wide-area distributed computing resources, including for high performance computing (HPC) on collec...
Lei Xia, Zheng Cui, John R. Lange, Yuan Tang, Pete...
PLDI
2011
ACM
14 years 12 days ago
Automatic compilation of MATLAB programs for synergistic execution on heterogeneous processors
MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typ...
Ashwin Prasad, Jayvant Anantpur, R. Govindarajan
IEEEPACT
2008
IEEE
15 years 4 months ago
A tuning framework for software-managed memory hierarchies
Achieving good performance on a modern machine with a multi-level memory hierarchy, and in particular on a machine with software-managed memories, requires precise tuning of progr...
Manman Ren, Ji Young Park, Mike Houston, Alex Aike...