Sciweavers

307 search results - page 44 / 62
» Automatic Parallelization Techniques for the EM-4
CLUSTER
2003
IEEE
15 years 5 months ago
Improving the Performance of MPI Derived Datatypes by Optimizing Memory-Access Cost
The MPI Standard supports derived datatypes, which allow users to describe noncontiguous memory layout and communicate noncontiguous data with a single communication function. Thi...
Surendra Byna, William D. Gropp, Xian-He Sun, Raje...
IPPS
2009
IEEE
15 years 6 months ago
Annotation-based empirical performance tuning using Orio
In many scientific applications, significant time is spent tuning codes for a particular high-performance architecture. Tuning approaches range from the relatively nonintrusive (...
Albert Hartono, Boyana Norris, Ponnuswamy Sadayapp...
IPPS
2008
IEEE
15 years 6 months ago
Overcoming scaling challenges in biomolecular simulations across multiple platforms
NAMD is a portable parallel application for biomolecular simulations. NAMD pioneered the use of hybrid spatial and force decomposition, a technique now used by most scalable pr...
Abhinav Bhatele, Sameer Kumar, Chao Mei, James C. ...
PPOPP
2006
ACM
15 years 5 months ago
Exploiting distributed version concurrency in a transactional memory cluster
We investigate a transactional memory runtime system providing scaling and strong consistency for generic C++ and SQL applications on commodity clusters. We introduce a novel page...
Kaloian Manassiev, Madalin Mihailescu, Cristiana A...
EUROPAR
2008
Springer
15 years 1 month ago
Efficiently Building the Gated Single Assignment Form in Codes with Pointers in Modern Optimizing Compilers
Understanding program behavior is at the foundation of program optimization. Techniques for automatic recognition of program constructs characterize the behavior of code ...
Manuel Arenaz, Pedro Amoedo, Juan Touriño