Sciweavers

60 search results - page 12 / 12
» Reducing memory requirements of stream programs by graph tra...
ASPLOS
2011
ACM
12 years 9 months ago
On-the-fly elimination of dynamic irregularities for GPU computing
The power-efficient massively parallel Graphics Processing Units (GPUs) have become increasingly influential for scientific computing over the past few years. However, their ef...
Eddy Z. Zhang, Yunlian Jiang, Ziyu Guo, Kai Tian, ...
IPPS
2006
IEEE
13 years 11 months ago
A simulator for parallel applications with dynamically varying compute node allocation
Dynamically allocating computing nodes to parallel applications is a promising technique for improving the utilization of cluster resources. We introduce the concept of dynamic ef...
Basile Schaeli, B. Gerlach, Roger D. Hersch
IISWC
2008
IEEE
14 years 5 days ago
Accelerating multi-core processor design space evaluation using automatic multi-threaded workload synthesis
The design and evaluation of microprocessor architectures is a difficult and time-consuming task. Although small, handcoded microbenchmarks can be used to accelerate performance e...
Clay Hughes, Tao Li
ISCA
2005
IEEE
13 years 11 months ago
Rescue: A Microarchitecture for Testability and Defect Tolerance
Scaling feature size improves processor performance but increases each device’s susceptibility to defects (i.e., hard errors). As a result, fabrication technology must improve s...
Ethan Schuchman, T. N. Vijaykumar
CF
2006
ACM
13 years 7 months ago
Intermediately executed code is the key to find refactorings that improve temporal data locality
The growing speed gap between memory and processor makes an efficient use of the cache ever more important to reach high performance. One of the most important ways to improve cac...
Kristof Beyls, Erik H. D'Hollander