Sciweavers

347 search results - page 28 / 70
» Caching processor general registers
AAECC
2007
Springer
14 years 9 months ago
When cache blocking of sparse matrix vector multiply works and why
Abstract. We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrix-vector multiply (SpM×V) operation, y ← y +...
Rajesh Nishtala, Richard W. Vuduc, James Demmel, K...
CF
2006
ACM
15 years 1 months ago
An efficient cache design for scalable glueless shared-memory multiprocessors
Traditionally, cache coherence in large-scale shared-memory multiprocessors has been ensured by means of a distributed directory structure stored in main memory. In this way, the ...
Alberto Ros, Manuel E. Acacio, José M. Garc...
ICS
2007
Tsinghua U.
15 years 3 months ago
Performance driven data cache prefetching in a dynamic software optimization system
Software or hardware data cache prefetching is an efficient way to hide cache miss latency. However, the effectiveness of the issued prefetches has to be monitored in order to maximi...
Jean Christophe Beyler, Philippe Clauss
MICRO
1995
IEEE
15 years 1 months ago
Improving CISC instruction decoding performance using a fill unit
Current superscalar processors, both RISC and CISC, require substantial instruction fetch and decode bandwidth to keep multiple functional units utilized. While CISC instructions ...
Mark Smotherman, Manoj Franklin
COMCOM
2004
14 years 9 months ago
A distributed middleware infrastructure for personalized services
In this paper, we present an overview of the extensible Retrieval, Annotation and Caching Engine (eRACE), a modular and distributed intermediary infrastructure that collects informati...
Marios D. Dikaiakos, Demetrios Zeinalipour-Yazti