Sciweavers

2155 search results - page 339 / 431
» The EM-X Parallel Computer: Architecture and Basic Performan...
Sort
View
157
Voted
CF
2006
ACM
15 years 8 months ago
Landing openMP on cyclops-64: an efficient mapping of openMP to a many-core system-on-a-chip
This paper presents our experience mapping OpenMP parallel programming model to the IBM Cyclops-64 (C64) architecture. The C64 employs a many-core-on-a-chip design that integrates...
Juan del Cuvillo, Weirong Zhu, Guang R. Gao
ASPLOS
2010
ACM
15 years 11 months ago
MacroSS: macro-SIMDization of streaming applications
SIMD (Single Instruction, Multiple Data) engines are an essential part of the processors in various computing markets, from servers to the embedded domain. Although SIMD-enabled a...
Amir Hormati, Yoonseo Choi, Mark Woh, Manjunath Ku...
ICS
2005
Tsinghua U.
15 years 10 months ago
The implications of working set analysis on supercomputing memory hierarchy design
Supercomputer architects strive to maximize the performance of scientific applications. Unfortunately, the large, unwieldy nature of most scientific applications has lead to the...
Richard C. Murphy, Arun Rodrigues, Peter M. Kogge,...
HPCA
2005
IEEE
16 years 5 months ago
Tapping ZettaRAMTM for Low-Power Memory Systems
ZettaRAMTM is a new memory technology under development by ZettaCoreTM as a potential replacement for conventional DRAM. The key innovation is replacing the conventional capacitor...
Ravi K. Venkatesan, Ahmed S. Al-Zawawi, Eric Roten...
HPCA
2003
IEEE
16 years 5 months ago
A Methodology for Designing Efficient On-Chip Interconnects on Well-Behaved Communication Patterns
As the level of chip integration continues to advance at a fast pace, the desire for efficient interconnects-whether on-chip or off-chip--is rapidly increasing. Traditional interc...
Wai Hong Ho, Timothy Mark Pinkston