Modern network processors support high levels of parallelism in packet processing by supporting multiple threads that execute on a micro-engine. Threads switch context upon encoun...
R. Collins, Fernando Alegre, Xiaotong Zhuang, Sant...
Modern out-of-order processors tolerate long latency memory operations by supporting a large number of inflight instructions. This is particularly useful in numerical applications...
The size and speed of first-level caches and SRAMs of embedded processors continue to increase in response to demands for higher performance. In power-sensitive devices like PDAs a...
Abstract--The 2-D Discrete Wavelet Transform (DWT) consumes up to 68% of the JPEG2000 encoding time. In this paper, we develop efficient implementations of this important kernel on...
Asadollah Shahbahrami, Ben H. H. Juurlink, Stamati...
Despite their superior performance for multimedia applications, vector processors have three limitations that hinder their widespread acceptance. First, the complexity and size of...