Load latency remains a significant bottleneck in dynamically scheduled pipelined processors. Load speculation techniques have been proposed to reduce this latency. Dependence Pred...
We developed a new modular synthesis approach for the design of low-power core-based data-intensive application-specific systems on silicon. The power optimization is conducted in th...
Given an n-degree polynomial f(x) over an arbitrary ring, the shift of f(x) by c is the operation which computes the coefficients of the polynomial f(x + c). In this paper we conside...
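To make the shift operation in the entry above concrete, here is a minimal sketch that computes the coefficients of f(x + c) by repeated Horner-style synthetic division. This is the classical quadratic-time scheme, chosen only for illustration; it is not claimed to be the algorithm the paper studies. The function name `shift_polynomial` and the ascending-order coefficient list are assumptions made for this example.

```python
def shift_polynomial(coeffs, c):
    """Return the coefficients of f(x + c).

    coeffs holds the coefficients of f(x) in ascending order
    (coeffs[k] multiplies x**k). Only addition and multiplication
    are used, so any ring elements supporting + and * will work.
    Hypothetical helper for illustration; uses O(n^2) ring operations.
    """
    result = list(coeffs)
    n = len(result)
    # Each outer pass performs one synthetic division by (x - c);
    # pass i fixes the final value of result[i].
    for i in range(n - 1):
        for j in range(n - 2, i - 1, -1):
            result[j] += c * result[j + 1]
    return result


# f(x) = x^3 - 2x + 5 shifted by 2 gives x^3 + 6x^2 + 10x + 9
print(shift_polynomial([5, -2, 0, 1], 2))
```

For example, shifting f(x) = x^2 by c = 1 yields the coefficients of (x + 1)^2 = 1 + 2x + x^2.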
We study the scalability of 2-D discrete wavelet transform algorithms on fine-grained parallel architectures. The principal operation in the 2-D DWT is the filtering operation us...
Jamshed N. Patel, Ashfaq A. Khokhar, Leah H. Jamie...
A concurrent partitioner for partitioning unstructured finite element meshes on distributed memory architectures is developed. The partitioner uses an element-based partitioning st...