Sciweavers

4171 search results - page 779 / 835
» Distributed Data Structure Design for Scientific Computation
Sort
View
PDCAT
2009
Springer
15 years 7 months ago
CheCUDA: A Checkpoint/Restart Tool for CUDA Applications
Abstract—In this paper, a tool named CheCUDA is designed to checkpoint CUDA applications that use GPUs as accelerators. As existing checkpoint/restart implementations do not supp...
Hiroyuki Takizawa, Katsuto Sato, Kazuhiko Komatsu,...
HOTI
2008
IEEE
15 years 7 months ago
Network Processing on an SPE Core in Cell Broadband Engine
Cell Broadband EngineTM is a multi-core system on a chip and is composed of a general-purpose Power Processing Element (PPE) and eight Synergistic Processing Elements (SPEs). Its ...
Yuji Kawamura, Takeshi Yamazaki, Hiroshi Kyusojin,...
120
Voted
IEEEPACT
2002
IEEE
15 years 5 months ago
Transparent Threads: Resource Sharing in SMT Processors for High Single-Thread Performance
Simultaneous Multithreading (SMT) processors achieve high processor throughput at the expense of single-thread performance. This paper investigates resource allocation policies fo...
Gautham K. Dorai, Donald Yeung
ICS
2009
Tsinghua U.
15 years 5 months ago
Exploring pattern-aware routing in generalized fat tree networks
New static source routing algorithms for High Performance Computing (HPC) are presented in this work. The target parallel architectures are based on the commonly used fattree netw...
Germán Rodríguez, Ramón Beivi...
96
Voted
IPPS
2000
IEEE
15 years 5 months ago
Three Dimensional VLSI-Scale Interconnects
As processor speeds rapidly approach the Giga-Hertz regime, the disparity between process time and memory access time plays an increasing role in the overall limitation of processo...
Dennis W. Prather