Sciweavers

» Parallel computing with CUDA
ICS
2007
Tsinghua U.
15 years 4 months ago
Performance driven data cache prefetching in a dynamic software optimization system
Software or hardware data cache prefetching is an efficient way to hide cache miss latency. However, the effectiveness of the issued prefetches has to be monitored in order to maximi...
Jean Christophe Beyler, Philippe Clauss
ICS
2007
Tsinghua U.
15 years 4 months ago
Cooperative cache partitioning for chip multiprocessors
This paper presents Cooperative Cache Partitioning (CCP) to allocate cache resources among threads concurrently running on CMPs. Unlike cache partitioning schemes that use a singl...
Jichuan Chang, Gurindar S. Sohi
ICS
2007
Tsinghua U.
15 years 4 months ago
Representation-transparent matrix algorithms with scalable performance
Positive results from new object-oriented tools for scientific programming are reported. Using template classes, abstractions of matrix representations are available that subsume...
Peter Gottschling, David S. Wise, Michael D. Adams
ICS
2007
Tsinghua U.
15 years 4 months ago
An L2-miss-driven early register deallocation for SMT processors
The register file is one of the most critical datapath components limiting the number of threads that can be supported on a Simultaneous Multithreading (SMT) processor. To allow t...
Joseph J. Sharkey, Dmitry V. Ponomarev
MIDDLEWARE
2007
Springer
15 years 4 months ago
A piggybacking approach to reduce overhead in sensor network gossiping
Many wireless sensor network protocols are employing gossip-based message dissemination, where nodes probabilistically forward messages, to reduce message overhead. We are concerne...
Ercan Ucan, Nathanael Thompson, Indranil Gupta