Sciweavers

39 search results - page 7 / 8
» Compiler Generated Multithreading to Alleviate Memory Latenc...
ICS
1999
Tsinghua U.
13 years 10 months ago
Eliminating synchronization bottlenecks in object-based programs using adaptive replication
This paper presents a technique, adaptive replication, for automatically eliminating synchronization bottlenecks in multithreaded programs that perform atomic operations on object...
Martin C. Rinard, Pedro C. Diniz
ISCA
2003
IEEE
13 years 11 months ago
Guided Region Prefetching: A Cooperative Hardware/Software Approach
Despite large caches, main-memory access latencies still cause significant performance losses in many applications. Numerous hardware and software prefetching schemes tolerate th...
Zhenlin Wang, Doug Burger, Steven K. Reinhardt, Ka...
CASES
2007
ACM
13 years 10 months ago
An integrated ARM and multi-core DSP simulator
In this paper we describe the design and implementation of a flexible, and extensible, just-in-time ARM simulator designed to run co-operatively with a multi-core DSP simulator on...
Sharad Singhai, MingYung Ko, Sanjay Jinturkar, May...
IWOMP
2007
Springer
14 years 4 days ago
Supporting OpenMP on Cell
The Cell processor is a heterogeneous multi-core processor with one Power Processing Engine (PPE) core and eight Synergistic Processing Engine (SPE) cores. Each SPE has a directly...
Kevin O'Brien, Kathryn M. O'Brien, Zehra Sura, Ton...
ISLPED
2005
ACM
13 years 11 months ago
Power-aware code scheduling for clusters of active disks
In this paper, we take the idea of application-level processing on disks one level further and focus on an architecture, called Cluster of Active Disks (CAD), where the storag...
Seung Woo Son, Guangyu Chen, Mahmut T. Kandemir