Sciweavers

914 search results - page 43 / 183
EUROPAR
2007
Springer
15 years 6 months ago
MCSTL: The Multi-core Standard Template Library
Future gain in computing performance will not stem from increased clock rates, but from even more cores in a processor. Since automatic parallelization is still limited to easily...
Johannes Singler, Peter Sanders, Felix Putze
ISPAN
1997
IEEE
15 years 4 months ago
CASS: an efficient task management system for distributed memory architectures
The thesis of this research is that the task of exposing the parallelism in a given application should be left to the algorithm designer, who has intimate knowledge of the applica...
Jing-Chiou Liou, Michael A. Palis
PPOPP
2012
ACM
13 years 8 months ago
Chestnut: a GPU programming language for non-experts
Graphics processing units (GPUs) are powerful devices capable of rapid parallel computation. GPU programming, however, can be quite difficult, limiting its use to experienced prog...
Andrew Stromme, Ryan Carlson, Tia Newhall
CLUSTER
2003
IEEE
15 years 5 months ago
A Performance Comparison of Linux and a Lightweight Kernel
In this paper, we compare running the Linux operating system on the compute nodes of ASCI Red hardware to running a specialized, highly-optimized lightweight kernel (LWK) operatin...
Ron Brightwell, Rolf Riesen, Keith D. Underwood, T...
ICPPW
2006
IEEE
15 years 6 months ago
Multiple Flows of Control in Migratable Parallel Programs
Many important parallel applications require multiple flows of control to run on a single processor. In this paper, we present a study of four flow-of-control mechanisms: proces...
Gengbin Zheng, Laxmikant V. Kalé, Orion Sky...