Historically, processor accesses to memory-mapped device registers have been marked uncachable to insure their visibility to the device. The ubiquity of snooping cache coherence, ...
Shubhendu S. Mukherjee, Babak Falsafi, Mark D. Hil...
We describe an algorithm for support vector machines (SVM) that can be parallelized efficiently and scales to very large problems with hundreds of thousands of training vectors. I...
In this paper, we propose a low-power approach to the design of embedded very long instruction word (VLIW) processor architectures based on the forwarding (or bypassing) hardware, ...
—Several recently proposed techniques including CPR (Checkpoint Processing and Recovery) and NoSQ (No Store Queue) rely on reference counting to manage physical registers. Howeve...
In this paper we propose and evaluate the Adaptive++ technique, a novel runtime-only data prefetching strategy for software-based distributed shared-memory systems (software DSMs)...
Ricardo Bianchini, Raquel Pinto, Claudio Luis de A...