Data prefetching has long been used to hide memory access latencies. However, data prefetching in multi-threaded applications running on chip multi...
Dhruva Chakrabarti, Mahmut T. Kandemir, Mustafa Ka...
Bugs in concurrent programs are extremely difficult to find and fix during testing. In this paper, we propose Kivati, which can efficiently detect and prevent atomicity violat...
As we reach the limits of single-core computing, we are promised more and more cores in our systems. Modern architectures include many performance counters per core, but few or no...
Paul E. West, Yuval Peress, Gary S. Tyson, Sally A...
Previous object code compression schemes have employed static and semiadaptive compression algorithms to reduce the size of instruction memory in embedded systems. The suggestion ...
The POEMS project is creating an environment for end-to-end performance modeling of complex parallel and distributed systems, spanning the domains of application software, runti...