To improve the performance of a single application on Chip Multiprocessors (CMPs), the application must be split into threads which execute concurrently on multiple cores. In mult...
M. Aater Suleman, Onur Mutlu, Moinuddin K. Qureshi...
While deterministic replay of parallel programs is a powerful technique, current proposals have shortcomings. Specifically, software-based replay systems have high overheads on mu...
Pablo Montesinos, Matthew Hicks, Samuel T. King, J...
Transactional memory (TM) eliminates many problems associated with lock-based synchronization. Over recent years, much progress has been made in software and hardware implementati...
Tatiana Shpeisman, Ali-Reza Adl-Tabatabai, Robert ...
Mining discrete patterns in binary data is important for subsampling, compression, and clustering. We consider rankone binary matrix approximations that identify the dominant patt...
Computer architects utilize simulation tools to evaluate the merits of a new design feature. The time needed to adequately evaluate the tradeoffs associated with adding any new fe...
Kaushal Sanghai, Ting Su, Jennifer G. Dy, David R....