Sciweavers

Execution Architectures and Compilation
ICS
2004
Tsinghua U.
15 years 9 months ago
Evaluating support for global address space languages on the Cray X1
The Cray X1 was recently introduced as the first in a new line of parallel systems to combine high-bandwidth vector processing with an MPP system architecture. Alongside capabili...
Christian Bell, Wei-Yu Chen, Dan Bonachea, Katheri...
MICRO
2005
IEEE
15 years 9 months ago
Automatic Thread Extraction with Decoupled Software Pipelining
Until recently, a steadily rising clock rate and other uniprocessor microarchitectural improvements could be relied upon to consistently deliver increasing performance for a wide ...
Guilherme Ottoni, Ram Rangan, Adam Stoler, David I...
LCPC
2004
Springer
15 years 9 months ago
Empirical Performance-Model Driven Data Layout Optimization
Empirical optimizers like ATLAS have been very effective in optimizing computational kernels in libraries. The best choice of parameters such as tile size and degree of l...
Qingda Lu, Xiaoyang Gao, Sriram Krishnamoorthy, Ge...
DATE
2006
IEEE
15 years 10 months ago
System-level scheduling on instruction cell based reconfigurable systems
This paper presents a new operation chaining reconfigurable scheduling algorithm (CRS) based on list scheduling that maximizes instruction level parallelism available in distribut...
Ying Yi, Ioannis Nousias, Mark Milward, Sami Khawa...
FCCM
2006
IEEE
15 years 10 months ago
FPGAs, GPUs and the PS2 - A Single Programming Methodology
Field-programmable gate arrays (FPGAs), graphics processing units (GPUs) and Sony’s PlayStation 2 vector units offer scope for hardware acceleration of applications. Implementin...
Lee W. Howes, Paul Price, Oskar Mencer, Olav Beckm...