Sequential consistency is the archetypal correctness condition for the memory protocols of shared-memory multiprocessors. Typically, such protocols are parameterized by the number ...
Jesse D. Bingham, Anne Condon, Alan J. Hu, Shaz Qa...
Abstract. This paper presents a study of performance optimization of dense matrix multiplication on IBM Cyclops-64(C64) chip architecture. Although much has been published on how t...
Ziang Hu, Juan del Cuvillo, Weirong Zhu, Guang R. ...
In this paper, we present the power estimation methodologies for the development of a low-power security processor that contains significant amount of logic and memory. For the lo...
This paper presents a middleware capable of out-of-order execution of kernels and data transfers for efficient stream processing in the compute unified device architecture (CUDA). ...
—Counting Bloom Filters (CBFs) are widely used in networking device algorithms. They implement fast set representations to support membership queries with limited error, and supp...