Graphics processing units (GPUs) are increasingly being used for general-purpose computing. We present implementations of large integer modular exponentiation, the core of public-ke...
Sequential consistency (SC) is the simplest programming interface for shared-memory systems but imposes program order among all memory operations, possibly precluding high perform...
Cache memories are commonly implemented through multiple memory banks to improve bandwidth and latency. The early knowledge of the data cache bank that an instruction will access ...
Stefan Bieschewski, Joan-Manuel Parcerisa, Antonio...
Vision-based user interfaces enable natural interaction modalities such as gestures. Such interfaces require computationally intensive video processing at low latency. We demonstr...
Ming-yu Chen, Lily B. Mummert, Padmanabhan Pillai,...
From multiprocessor scale-up to cache sizes to the number of reorder-buffer entries, microarchitects wish to reap the benefits of more computing resources while staying within po...
Peter G. Sassone, Jeff Rupley, Edward Brekelbaum, ...