Sciweavers

534 search results - page 105 / 107
» On Scheduling Parallel Tasks at Twilight
Sort
View
EUROPAR
2010
Springer
13 years 6 months ago
Optimized On-Chip-Pipelined Mergesort on the Cell/B.E
Abstract. Limited bandwidth to off-chip main memory is a performance bottleneck in chip multiprocessors for streaming computations, such as Cell/B.E., and this will become even mor...
Rikard Hultén, Christoph W. Kessler, Jö...
PDCAT
2009
Springer
14 years 20 days ago
CheCUDA: A Checkpoint/Restart Tool for CUDA Applications
Abstract—In this paper, a tool named CheCUDA is designed to checkpoint CUDA applications that use GPUs as accelerators. As existing checkpoint/restart implementations do not supp...
Hiroyuki Takizawa, Katsuto Sato, Kazuhiko Komatsu,...
ISCA
2006
IEEE
125views Hardware» more  ISCA 2006»
14 years 5 days ago
Architectural Semantics for Practical Transactional Memory
Transactional Memory (TM) simplifies parallel programming by allowing for parallel execution of atomic tasks. Thus far, TM systems have focused on implementing transactional stat...
Austen McDonald, JaeWoong Chung, Brian D. Carlstro...
PPOPP
2003
ACM
13 years 11 months ago
Improving server software support for simultaneous multithreaded processors
Simultaneous multithreading (SMT) represents a fundamental shift in processor capability. SMT's ability to execute multiple threads simultaneously within a single CPU offers ...
Luke McDowell, Susan J. Eggers, Steven D. Gribble
DAC
2008
ACM
13 years 8 months ago
Application mapping for chip multiprocessors
The problem attacked in this paper is one of automatically mapping an application onto a Network-on-Chip (NoC) based chip multiprocessor (CMP) architecture in a locality-aware fas...
Guangyu Chen, Feihui Li, Seung Woo Son, Mahmut T. ...