Sciweavers

18 search results - page 4 / 4
» Hardware Speedups in Long Integer Multiplication
Sort
View
IPPS
2010
IEEE
14 years 7 months ago
Dynamic load balancing on single- and multi-GPU systems
The computational power provided by many-core graphics processing units (GPUs) has been exploited in many applications. The programming techniques currently employed on these GPUs...
Long Chen, Oreste Villa, Sriram Krishnamoorthy, Gu...
70
Voted
CASES
2001
ACM
15 years 1 months ago
A system-on-a-chip lock cache with task preemption support
Intertask/interprocess synchronization overheads may be significant in a multiprocessor-shared memory System-on-a-Chip implementation. These overheads are observed in terms of loc...
Bilge Saglam Akgul, Jaehwan Lee, Vincent John Moon...
110
Voted
PLDI
2004
ACM
15 years 2 months ago
Min-cut program decomposition for thread-level speculation
With billion-transistor chips on the horizon, single-chip multiprocessors (CMPs) are likely to become commodity components. Speculative CMPs use hardware to enforce dependence, al...
Troy A. Johnson, Rudolf Eigenmann, T. N. Vijaykuma...