Sciweavers

1141 search results - page 101 / 229
» Compiler-Directed Performance Model Construction for Paralle...
IFL
2001
Springer
146 views, Formal Methods
15 years 5 months ago
Optimizations on Array Skeletons in a Shared Memory Environment
Map- and fold-like skeletons are suitable abstractions to guide parallel program execution in functional array processing. However, when it comes to achieving high performance, i...
Clemens Grelck
CCGRID
2010
IEEE
15 years 1 month ago
Designing Accelerator-Based Distributed Systems for High Performance
Multi-core processors with accelerators are becoming commodity components for high-performance computing at scale. While accelerator-based processors have been studied in...
M. Mustafa Rafique, Ali Raza Butt, Dimitrios S. Ni...
HPCA
2006
IEEE
15 years 6 months ago
Speculative synchronization and thread management for fine granularity threads
Performance of multithreaded programs is heavily influenced by the latencies of the thread management and synchronization operations. Improving these latencies becomes especially...
Alex Gontmakher, Avi Mendelson, Assaf Schuster, Gr...
SC
2003
ACM
15 years 6 months ago
Compiler Support for Exploiting Coarse-Grained Pipelined Parallelism
The emergence of grid and a new class of data-driven applications is making a new form of parallelism desirable, which we refer to as coarse-grained pipelined parallelism. This pa...
Wei Du, Renato Ferreira, Gagan Agrawal
ASPLOS
2009
ACM
16 years 1 months ago
Kendo: efficient deterministic multithreading in software
Although chip-multiprocessors have become the industry standard, developing parallel applications that target them remains a daunting task. Non-determinism, inherent in threaded a...
Marek Olszewski, Jason Ansel, Saman P. Amarasinghe