Sciweavers

567 search results - page 72 / 114
» Program Optimization and Parallelization Using Idioms
Sort
View
PLDI
1995
ACM
15 years 1 months ago
Unifying Data and Control Transformations for Distributed Shared Memory Machines
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Michal Cierniak, Wei Li
CF
2010
ACM
15 years 2 months ago
Enabling a highly-scalable global address space model for petascale computing
Over the past decade, the trajectory to the petascale has been built on increased complexity and scale of the underlying parallel architectures. Meanwhile, software developers hav...
Vinod Tipparaju, Edoardo Aprà, Weikuan Yu, ...
PODC
2010
ACM
15 years 1 months ago
Transactional predication: high-performance concurrent sets and maps for STM
Concurrent collection classes are widely used in multi-threaded programming, but they provide atomicity only for a fixed set of operations. Software transactional memory (STM) pr...
Nathan Grasso Bronson, Jared Casper, Hassan Chafi,...
IPPS
2006
IEEE
15 years 3 months ago
Bio-sequence database scanning on a GPU
Protein sequences with unknown functionality are often compared to a set of known sequences to detect functional similarities. Efficient dynamic programming algorithms exist for t...
Weiguo Liu, Bertil Schmidt, Gerrit Voss, Adrian Sc...
IPPS
2009
IEEE
15 years 4 months ago
Application profiling on Cell-based clusters
In this paper, we present a methodology for profiling parallel applications executing on the IBM PowerXCell 8i (commonly referred to as the “Cell” processor). Specifically, we...
Hikmet Dursun, Kevin J. Barker, Darren J. Kerbyson...