Sciweavers

11 search results - page 2 / 3
» Hiding Communication Latency with Non-SPMD, Graph-Based Exec...
Sort
View
FPL
2006
Springer
242views Hardware» more  FPL 2006»
13 years 8 months ago
TMD-MPI: An MPI Implementation for Multiple Processors Across Multiple FPGAs
With current FPGAs, designers can now instantiate several embedded processors, memory units, and a wide variety of IP blocks to build a single-chip, high-performance multiprocesso...
Manuel Saldaña, Paul Chow
IEEEPACT
2005
IEEE
13 years 10 months ago
HUNTing the Overlap
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
Costin Iancu, Parry Husbands, Paul Hargrove
ISCA
2003
IEEE
136views Hardware» more  ISCA 2003»
13 years 10 months ago
Transient-Fault Recovery for Chip Multiprocessors
To address the increasing susceptibility of commodity chip multiprocessors (CMPs) to transient faults, we propose Chiplevel Redundantly Threaded multiprocessor with Recovery (CRTR...
Mohamed A. Gomaa, Chad Scarbrough, Irith Pomeranz,...
HPCA
2011
IEEE
12 years 8 months ago
MOPED: Orchestrating interprocess message data on CMPs
Future CMPs will combine many simple cores with deep cache hierarchies. With more cores, cache resources per core are fewer, and must be shared carefully to avoid poor utilization...
Junli Gu, Steven S. Lumetta, Rakesh Kumar, Yihe Su...
MICRO
1997
IEEE
116views Hardware» more  MICRO 1997»
13 years 9 months ago
Tuning Compiler Optimizations for Simultaneous Multithreading
Compiler optimizations are often driven by specific assumptions about the underlying architecture and implementation of the target machine. For example, when targeting shared-mem...
Jack L. Lo, Susan J. Eggers, Henry M. Levy, Sujay ...