Formal reasoning about concurrent programs is usually done with the assumption that the underlying memory model is sequentially consistent, i.e. the execution outcome is equivalen...
Recent research advocates memory streaming techniques to alleviate the performance bottleneck caused by the high latencies of off-chip memory accesses. Temporal memory streaming r...
Stephen Somogyi, Thomas F. Wenisch, Anastasia Aila...
Modern architectures implement relaxed memory models which may reorder memory operations or execute them non-atomically. Special instructions called memory fences are provided, al...
In order to extract high levels of performance from modern parallel architectures, the effective management of deep memory hierarchies is very important. While architectural advan...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...
Providing deterministic execution significantly simplifies the debugging, testing, replication, and deployment of multithreaded programs. Recent work has developed deterministic...
Joseph Devietti, Jacob Nelson, Tom Bergan, Luis Ce...