We consider technology mapping from factored form binary leaf-DAG to lookup tables LUTs, such as those found in eld programmable gate arrays. Polynomial time algorithms exist f...
We motivate and define the XSIL language as a flexible, hierarchical, extensible transport language for scientific data objects. The entire object may be represented in the file, o...
Kent Blackburn, Albert Lazzarini, Thomas A. Prince...
: Software cache coherence schemes are very desirable in the design of scalable multiprocessors and massively parallel processors. The authors propose a software cache coherence sc...
Improving memory performance at software level is more effective in reducing the rapidly expanding gap between processor and memory performance. Loop transformations (e.g. loop un...
Surendra Byna, Xian-He Sun, William Gropp, Rajeev ...
With the increasing processing power, the latency of the memory hierarchy becomes the stumbling block of many modern computer architectures. In order to speed-up the calculations, ...