Sciweavers

ICPP
2000
IEEE
15 years 7 months ago
Issues in Designing and Implementing a Scalable Virtual Interface Architecture
The Virtual Interface Architecture brings the benefits of low-latency user-level networking to a cluster environment. With an increasing number of communication channels created ...
Shailabh Nagar, Anand Sivasubramaniam, Jorge Rodri...
HPCA
2006
IEEE
15 years 9 months ago
Speculative synchronization and thread management for fine granularity threads
Performance of multithreaded programs is heavily influenced by the latencies of the thread management and synchronization operations. Improving these latencies becomes especially...
Alex Gontmakher, Avi Mendelson, Assaf Schuster, Gr...
SC
2003
ACM
15 years 8 months ago
Compiler Support for Exploiting Coarse-Grained Pipelined Parallelism
The emergence of grid and a new class of data-driven applications is making a new form of parallelism desirable, which we refer to as coarse-grained pipelined parallelism. This pa...
Wei Du, Renato Ferreira, Gagan Agrawal
OSDI
2008
ACM
16 years 3 months ago
Gadara: Dynamic Deadlock Avoidance for Multithreaded Programs
Deadlock is an increasingly pressing concern as the multicore revolution forces parallel programming upon the average programmer. Existing approaches to deadlock impose onerous bu...
Manjunath Kudlur, Scott A. Mahlke, Stéphane...
PDP
2009
IEEE
15 years 9 months ago
A Parallel Implementation of the 2D Wavelet Transform Using CUDA
There is a multicore platform that is currently attracting enormous attention due to its tremendous potential in terms of sustained performance: the NVIDIA Tesla boards. The...
Joaquín Franco, Gregorio Bernabé, Ju...