Sciweavers

» Parallel Programming with Transactional Memory
ICPP
2003
IEEE
15 years 8 months ago
Enabling Partial Cache Line Prefetching Through Data Compression
Hardware prefetching is a simple and effective technique for hiding cache miss latency and thus improving the overall performance. However, it comes with addition of prefetch buff...
Youtao Zhang, Rajiv Gupta
ICS
2009
Tsinghua U.
15 years 7 months ago
Single-particle 3d reconstruction from cryo-electron microscopy images on GPU
Single-particle 3D reconstruction from cryo-electron microscopy (cryo-EM) images is a kernel application of biological molecules analysis, as the computational requirement of whic...
Guangming Tan, Ziyu Guo, Mingyu Chen, Dan Meng
ISCA
1997
IEEE
15 years 6 months ago
DataScalar Architectures
DataScalar architectures improve memory system performance by running computation redundantly across multiple processors, which are each tightly coupled with an associated memory....
Doug Burger, Stefanos Kaxiras, James R. Goodman
CISIS
2010
IEEE
15 years 9 months ago
Threaded Dynamic Memory Management in Many-Core Processors
Current trends in desktop processor design have been toward many-core solutions with increased parallelism. As the number of supported threads grows in these processors, it may ...
Edward C. Herrmann, Philip A. Wilsey
HPCA
2009
IEEE
16 years 3 months ago
Design and implementation of software-managed caches for multicores with local memory
Heterogeneous multicores, such as Cell BE processors and GPGPUs, typically do not have caches for their accelerator cores because coherence traffic, cache misses, and latencies fr...
Sangmin Seo, Jaejin Lee, Zehra Sura