On machines with high-performance processors, the memory system continues to be a performance bottleneck. Compilers insert prefetch operations and reorder data accesses to improve...
Nathaniel McIntosh, Sandya Mannarswamy, Robert Hun...
Prefetching in shared-memory multiprocessor systems is an increasingly difficult problem. As system designs grow to incorporate larger numbers of faster processors, memory latency...
Cycles per Instruction (CPI) stacks break down processor execution time into a baseline CPI plus a number of miss event CPI components. CPI breakdowns can be very helpful in gaini...
In this paper, we propose a fully automatic dynamic scratchpad memory (SPM) management technique for instructions. Our technique loads required code segments into the SPM on deman...
Bernhard Egger, Chihun Kim, Choonki Jang, Yoonsung...
In some hypermedia system applications, like interactive digital TV applications, authoring and presentation of documents may have to be done concomitantly. This is the case of li...