While deterministic replay of parallel programs is a powerful technique, current proposals have shortcomings. Specifically, software-based replay systems have high overheads on mu...
Pablo Montesinos, Matthew Hicks, Samuel T. King, J...
This paper describes the design and implementation of virtual memory management within the CMU Mach Operating System and the experiences gained by the Mach kernel group in porting...
Richard F. Rashid, Avadis Tevanian, Michael Young,...
In programming high performance applications, shared address-space platforms are preferable for fine-grained computation, while distributed address-space platforms are more suita...
On Chip Multiprocessors (CMP), it is common that multiple cores share certain levels of cache. The sharing increases the contention in cache and memory-to-chip bandwidth, further h...
Yunlian Jiang, Eddy Z. Zhang, Kai Tian, Xipeng She...
The memory consistency model supported by a multiprocessor architecture determines the amount of buffering and pipelining that may be used to hide or reduce the latency of memory ...
Kourosh Gharachorloo, Anoop Gupta, John L. Henness...