Sciweavers

20 search results - page 3 / 4
PC
2006
14 years 11 months ago
Message-passing code generation for non-rectangular tiling transformations
Tiling is a well-known loop transformation used to reduce communication overhead in distributed memory machines. Although a lot of theoretical research has been done concerning th...
Georgios I. Goumas, Nikolaos Drosinos, Maria Athan...
EUROPAR
2010
Springer
14 years 11 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E
Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
ANCS
2009
ACM
14 years 9 months ago
Range Tries for scalable address lookup
In this paper we introduce the Range Trie, a new multiway tree data structure for address lookup. Each Range Trie node maps to an address range [Na, Nb) and performs multiple comp...
Ioannis Sourdis, Georgios Stefanakis, Ruben de Sme...
DAC
2000
ACM
16 years 17 days ago
Code compression for low power embedded system design
erse approaches at all levels of abstraction starting from the physical level up to the system level. Experience shows that a high-level method may have a larger impact since the de...
Haris Lekatsas, Jörg Henkel, Wayne Wolf
PPOPP
1997
ACM
15 years 3 months ago
Effective Fine-Grain Synchronization for Automatically Parallelized Programs Using Optimistic Synchronization Primitives
As shared-memory multiprocessors become the dominant commodity source of computation, parallelizing compilers must support mainstream computations that manipulate irregular, point...
Martin C. Rinard