Sciweavers

230 search results - page 30 / 46
» Software-Extended Coherent Shared Memory: Performance and Co...
Sort
View
ICS
2009
Tsinghua U.
15 years 4 months ago
Performance modeling and automatic ghost zone optimization for iterative stencil loops on GPUs
Iterative stencil loops (ISLs) are used in many applications and tiling is a well-known technique to localize their computation. When ISLs are tiled across a parallel architecture...
Jiayuan Meng, Kevin Skadron
ICS
1998
Tsinghua U.
15 years 2 months ago
OPTNET: A Cost-effective Optical Network for Multiprocessors
In this paper we propose the OPTNET, a novel optical network and associated coherence protocol for scalable multiprocessors. The network divides its channels into broadcast and po...
Enrique V. Carrera, Ricardo Bianchini
ISCA
1994
IEEE
104views Hardware» more  ISCA 1994»
15 years 1 months ago
Exploring the Design Space for a Shared-Cache Multiprocessor
In the near future, semiconductor technology will allow the integration of multiple processors on a chip or multichipmodule (MCM). In this paper we investigate the architecture an...
Basem A. Nayfeh, Kunle Olukotun
IPPS
2005
IEEE
15 years 3 months ago
Stream PRAM
Parallel random access memory, or PRAM, is a now venerable model of parallel computation that that still retains its usefulness for the design and analysis of parallel algorithms....
Darrell R. Ulm, Michael Scherger
ISCA
2012
IEEE
262views Hardware» more  ISCA 2012»
13 years 6 days ago
Boosting mobile GPU performance with a decoupled access/execute fragment processor
Smartphones represent one of the fastest growing markets, providing significant hardware/software improvements every few months. However, supporting these capabilities reduces the...
Jose-Maria Arnau, Joan-Manuel Parcerisa, Polychron...