Floating-point summation is one of the most important operations in scientific/numerical computing applications and also a basic subroutine (SUM) in BLAS (Basic Linear Algebra Sub...
Fine-grained network measurement requires routers and switches to update large arrays of counters at very high link speed (e.g. 40 Gbps). A naive algorithm needs an infeasible amo...
Yi Lu, Andrea Montanari, Balaji Prabhakar, Sarang ...
The growing processor/memory performance gap causes the performance of many codes to be limited by memory accesses. If known to exist in an application, strided memory accesses fo...
Tushar Mohan, Bronis R. de Supinski, Sally A. McKe...
Code compression is a field where compression ratios between compiler-generated code and subsequent compressed code are highly dependent on decisions made at compile time. Most op...
—One of the key factors underlying the popularity of Low-density parity-check (LDPC) code is its iterative decoding algorithm that is amenable to efficient hardware implementati...
Ming Gu, Kiran Misra, Hayder Radha, Shantanu Chakr...