This paper explores the correlation of instruction counts and cache misses to runtime performance for a large family of divide and conquer algorithms to compute the Walsh–Hadama...
Grid computing aims to connect large numbers of geographically and organizationally distributed resources to increase computational power, resource utilization, and resource acces...
—The performance bottleneck for many scientific applications is the cost of memory access inside linear algebra kernels. Tuning such kernels for memory efficiency is a complex ...
Protein sequences with unknown functionality are often compared to a set of known sequences to detect functional similarities. Efficient dynamic programming algorithms exist for t...
Weiguo Liu, Bertil Schmidt, Gerrit Voss, Adrian Sc...
We propose a novel design estimation method for adaptive streaming applications to be implemented on a partially reconfigurable FPGA. Based on experimental results we enable accu...
Ingo Sander, Jun Zhu, Axel Jantsch, Andreas Herrho...