Design of efficient System-on-Chips (SoCs) require thorough application analysis to identify various compute intensive parts. These compute intensive parts can be mapped to hardwa...
Amarjeet Singh 0002, Amit Chhabra, Anup Gangwar, B...
In this paper, we show the importance and feasibility of much faster multi-view stereo reconstruction algorithms relying almost exclusively on graphics hardware. Reconstruction al...
Patrick Labatut, Renaud Keriven, Jean-Philippe Pon...
Instruction and data address traces are widely used by computer designers for quantitative evaluations of new architectures and workload characterization, as well as by software de...
Milena Milenkovic, Aleksandar Milenkovic, Martin B...
We present a case study parallelizing streaming aggregation on three different parallel hardware architectures. Aggregation is a performance-critical operation for data summarizat...
Scott Schneider, Henrique Andrade, Bugra Gedik, Ku...
The increasing numbers of cores, shared caches and memory nodes within machines introduces a complex hardware topology. High-performance computing applications now have to carefull...