Wide Single Instruction, Multiple Thread (SIMT) architectures often require a static allocation of thread groups that are executed in lockstep throughout the entire application ker...
The popularity of the Internet and the emergence of broadband access networks are fueling the development of communications processors -- devices that integrate processing, network...
Charles D. Cranor, R. Gopalakrishnan, Peter Z. Onu...
We present a new low-level interfacing scheme for connecting custom accelerators to processors that tolerates latencies that usually occur when accessing hardware accele...
Jaroslav Sykora, Leos Kafka, Martin Danek, Lukas K...
Heterogeneous processors that mix big high performance cores with small low power cores promise excellent single-threaded performance coupled with high multi-threaded through...
Dheeraj Reddy, David A. Koufaty, Paul Brett, Scott...