Sciweavers

Parallel computing with CUDA
IEEEPACT
2009
IEEE
15 years 4 months ago
Using Aggressor Thread Information to Improve Shared Cache Management for CMPs
Shared cache allocation policies play an important role in determining CMP performance. The simplest policy, LRU, allocates cache implicitly as a consequence of its replacement ...
Wanli Liu, Donald Yeung
IEEEPACT
2009
IEEE
15 years 4 months ago
Polyhedral-Model Guided Loop-Nest Auto-Vectorization
Optimizing compilers apply numerous interdependent optimizations, leading to the notoriously difficult phase-ordering problem: that of deciding which transformations...
Konrad Trifunovic, Dorit Nuzman, Albert Cohen, Aya...
IEEEPACT
2009
IEEE
15 years 4 months ago
StealthTest: Low Overhead Online Software Testing Using Transactional Memory
Software testing is hard. The emergence of multicore architectures and the proliferation of bug-prone multithreaded software make testing even harder. To this end, researchers h...
Jayaram Bobba, Weiwei Xiong, Luke Yen, Mark D. Hil...
IEEEPACT
2009
IEEE
15 years 4 months ago
CPROB: Checkpoint Processing with Opportunistic Minimal Recovery
CPR (Checkpoint Processing and Recovery) is a physical register management scheme that supports a larger instruction window and higher average IPC than conventional ROB-style re...
Andrew D. Hilton, Neeraj Eswaran, Amir Roth
IEEEPACT
2009
IEEE
15 years 4 months ago
Soft-OLP: Improving Hardware Cache Performance through Software-Controlled Object-Level Partitioning
Performance degradation of memory-intensive programs caused by the LRU policy’s inability to handle weak-locality data accesses in the last level cache is increasingly serious ...
Qingda Lu, Jiang Lin, Xiaoning Ding, Zhao Zhang, X...