Sciweavers

FPGA
2016
ACM
FPRESSO: Enabling Express Transistor-Level Exploration of FPGA Architectures
In theory, tools like VTR—a retargetable toolchain mapping circuits onto easily-described hypothetical FPGA architectures—could play a key role in the development of wildly in...
Grace Zgheib, Manana Lortkipanidze, Muhsen Owaida,...
FPGA
2016
ACM
CASK: Open-Source Custom Architectures for Sparse Kernels
Sparse matrix vector multiplication (SpMV) is an important kernel in many scientific applications. To improve the performance and applicability of FPGA based SpMV, we propose an ...
Paul Grigoras, Pavel Burovskiy, Wayne Luk
FPGA
2016
ACM
A Case for Work-stealing on FPGAs with OpenCL Atomics
We provide a case study of work-stealing, a popular method for run-time load balancing, on FPGAs. Following the Cederman–Tsigas implementation for GPUs, we synchronize workitems...
Nadesh Ramanathan, John Wickerson, Felix Winterste...
FPGA
2016
ACM
Resolve: Generation of High-Performance Sorting Architectures from High-Level Synthesis
Field Programmable Gate Array (FPGA) implementations of sorting algorithms have proven to be efficient, but existing implementations lack portability and maintainability because t...
Janarbek Matai, Dustin Richmond, Dajung Lee, Zac B...
FPGA
2016
ACM
GPU-Accelerated High-Level Synthesis for Bitwidth Optimization of FPGA Datapaths
Bitwidth optimization of FPGA datapaths can save hardware resources by choosing the fewest number of bits required for each datapath variable to achieve a desired quality of resul...
Nachiket Kapre, Deheng Ye
FPGA
2014
ACM
Fast and effective placement and routing directed high-level synthesis for FPGAs
Achievable frequency (fmax) is a widely used input constraint for designs targeting Field-Programmable Gate Arrays (FPGA), because of its impact on design latency and throughput. ...
Hongbin Zheng, Swathi T. Gurumani, Kyle Rupnow, De...
FPGA
2012
ACM
Accelerator compiler for the VENICE vector processor
Zhiduo Liu, Aaron Severance, Satnam Singh, Guy G. ...
FPGA
2012
ACM
Reducing the cost of floating-point mantissa alignment and normalization in FPGAs
In floating-point datapaths synthesized on FPGAs, the shifters that perform mantissa alignment and normalization consume a disproportionate number of LUTs. Shifters are implemente...
Yehdhih Ould Mohammed Moctar, Nithin George, Hadi ...
FPGA
2012
ACM
Optimizing SDRAM bandwidth for custom FPGA loop accelerators
Memory bandwidth is critical to achieving high performance in many FPGA applications. The bandwidth of SDRAM memories is, however, highly dependent upon the order in which address...
Samuel Bayliss, George A. Constantinides
FS
2011
Gamma expansion of the Heston stochastic volatility model
We derive an explicit representation of the transitions of the Heston stochastic volatility model and use it for fast and accurate simulation of the model. Of particular i...
Paul Glasserman, Kyoung-Kuk Kim