Sciweavers

53 search results - page 8 / 11
» Smarter Memory: Improving Bandwidth for Streamed References
CLUSTER
2006
IEEE
15 years 1 month ago
Improving Communication Performance on InfiniBand by Using Efficient Data Placement Strategies
Despite the use of high-speed network interconnection systems like InfiniBand, the communication overhead for parallel applications is still high. In this paper we show how such costs...
Robert Rex, Frank Mietke, Wolfgang Rehm, Christoph...
HPCA
2009
IEEE
15 years 10 months ago
Techniques for bandwidth-efficient prefetching of linked data structures in hybrid prefetching systems
Linked data structure (LDS) accesses are critical to the performance of many large scale applications. Techniques have been proposed to prefetch such accesses. Unfortunately, many...
Eiman Ebrahimi, Onur Mutlu, Yale N. Patt
EUROPAR
2010
Springer
14 years 9 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E.
Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
IISWC
2009
IEEE
15 years 4 months ago
Understanding PARSEC performance on contemporary CMPs
PARSEC is a reference application suite used in industry and academia to assess new Chip Multiprocessor (CMP) designs. No investigation to date has profiled PARSEC on real hardwa...
Major Bhadauria, Vincent M. Weaver, Sally A. McKee
ISCA
1998
IEEE
15 years 1 month ago
Exploiting Spatial Locality in Data Caches Using Spatial Footprints
Modern cache designs exploit spatial locality by fetching large blocks of data called cache lines on a cache miss. Subsequent references to words within the same cache line result...
Sanjeev Kumar, Christopher B. Wilkerson