Sciweavers

25 search results - page 4 / 5
» Performance portable GPU code generation for matrix multipli...
TJS
2002
13 years 5 months ago
HPCVIEW: A Tool for Top-down Analysis of Node Performance
Although it is increasingly difficult for large scientific programs to attain a significant fraction of peak performance on systems based on microprocessors with substantial instr...
John M. Mellor-Crummey, Robert J. Fowler, Gabriel ...
AOSD
2007
ACM
13 years 9 months ago
Generating parallel applications for distributed memory systems using aspects, components, and patterns
Developing and debugging parallel programs, particularly for distributed memory architectures, is still a difficult task. The most popular approach to developing parallel programs f...
Purushotham V. Bangalore
ICS
2005
Tsinghua U.
13 years 11 months ago
Think globally, search locally
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
Kamen Yotov, Keshav Pingali, Paul Stodghill
CORR
2011
Springer
12 years 9 months ago
Programming Massively Parallel Architectures using MARTE: a Case Study
Nowadays, several industrial applications are being ported to parallel architectures. These applications take advantage of the potential parallelism provided by multiple core pr...
Antonio Wendell De Oliveira Rodrigues, Fréd...
BMCBI
2006
13 years 5 months ago
Structure alignment based on coding of local geometric measures
Background: A structure alignment method based on a local geometric property is presented and its performance is tested in pairwise and multiple structure alignments. In this appr...
Peter L. Chang, Andrew W. Rinne, T. Gregory Dewey