Sciweavers

50 search results - page 8 / 10
» Using SCTP to hide latency in MPI programs
Sort
View
PARA
2004
Springer
15 years 3 months ago
Cache Optimizations for Iterative Numerical Codes Aware of Hardware Prefetching
Cache optimizations typically include code transformations to increase the locality of memory accesses. An orthogonal approach is to enable for latency hiding by introducing prefet...
Josef Weidendorfer, Carsten Trinitis
ISCA
2000
IEEE
111views Hardware» more  ISCA 2000»
15 years 2 months ago
Understanding the backward slices of performance degrading instructions
For many applications, branch mispredictions and cache misses limit a processor’s performance to a level well below its peak instruction throughput. A small fraction of static i...
Craig B. Zilles, Gurindar S. Sohi
ICPP
2003
IEEE
15 years 2 months ago
Enabling Partial Cache Line Prefetching Through Data Compression
Hardware prefetching is a simple and effective technique for hiding cache miss latency and thus improving the overall performance. However, it comes with addition of prefetch buff...
Youtao Zhang, Rajiv Gupta
DSN
2002
IEEE
15 years 2 months ago
Ditto Processor
Concentration of design effort for current single-chip Commercial-Off-The-Shelf (COTS) microprocessors has been directed towards performance. Reliability has not been the primary ...
Shih-Chang Lai, Shih-Lien Lu, Jih-Kwon Peir
PC
2010
190views Management» more  PC 2010»
14 years 8 months ago
High-performance cone beam reconstruction using CUDA compatible GPUs
Compute unified device architecture (CUDA) is a software development platform that allows us to run C-like programs on the nVIDIA graphics processing unit (GPU). This paper prese...
Yusuke Okitsu, Fumihiko Ino, Kenichi Hagihara