Sciweavers

347 search results - page 55 / 70
» Caching processor general registers
EGH
2004
Springer
15 years 1 month ago
Understanding the efficiency of GPU algorithms for matrix-matrix multiplication
Utilizing graphics hardware for general purpose numerical computations has become a topic of considerable interest. The implementation of streaming algorithms, typified by highly ...
Kayvon Fatahalian, Jeremy Sugerman, Pat Hanrahan
SIGCOMM
1997
ACM
15 years 1 month ago
Small Forwarding Tables for Fast Routing Lookups
For some time, the networking community has assumed that it is impossible to do IP routing lookups in software fast enough to support gigabit speeds. IP routing lookups must find th...
Mikael Degermark, Andrej Brodnik, Svante Carlsson,...
ESA
2004
Springer
15 years 3 months ago
Super Scalar Sample Sort
Sample sort, a generalization of quicksort that partitions the input into many pieces, is known as the best practical comparison-based sorting algorithm for distributed-memory para...
Peter Sanders, Sebastian Winkel
ISLPED
2010
ACM
14 years 9 months ago
Dynamic workload characterization for power efficient scheduling on CMP systems
Runtime characteristics of individual threads (such as IPC, cache usage, etc.) are a critical factor in making efficient scheduling decisions in modern chip-multiprocessor systems...
Gaurav Dhiman, Vasileios Kontorinis, Dean M. Tulls...
TCAD
2002
14 years 9 months ago
Platune: a tuning framework for system-on-a-chip platforms
System-on-a-chip (SOC) platform manufacturers are increasingly adding configurable features that provide power and performance flexibility in order to increase a platform's ap...
Tony Givargis, Frank Vahid