Search Sciweavers | Sciweavers

7 search results - page 1 / 2


FCCM
2005
IEEE

106 views · VLSI

High-Performance FPGA-Based General Reduction Methods

15 years 8 months ago

Download halcyon.usc.edu

FPGA-based floating-point kernels must exploit algorithmic parallelism and use deeply pipelined cores to gain a performance advantage over general-purpose processors. Inability t...

Gerald R. Morris, Ling Zhuo, Viktor K. Prasanna


CCGRID
2007
IEEE

98 views · Distributed And Parallel Com...

High-Performance MPI Broadcast Algorithm for Grid Environments Utilizing Multi-lane NICs

15 years 9 months ago

Download matsu-www.is.titech.ac.jp

The performance of MPI collective operations, such as broadcast and reduction, is heavily affected by network topologies, especially in grid environments. Many techniques to cons...

Tatsuhiro Chiba, Toshio Endo, Satoshi Matsuoka


TPDS
2010

174 views

Parallel Two-Sided Matrix Reduction to Band Bidiagonal Form on Multicore Architectures

15 years 1 months ago

Download www.netlib.org

The objective of this paper is to extend, in the context of multicore architectures, the concepts of tile algorithms [Buttari et al., 2007] for Cholesky, LU, QR factorizations to t...

Hatem Ltaief, Jakub Kurzak, Jack Dongarra


SBCCI
2005
ACM

111 views · VLSI

Total leakage power optimization with improved mixed gates

15 years 8 months ago

Download www.cpdee.ufmg.br

Gate oxide tunneling current Igate and sub-threshold current Isub dominate the leakage of designs. The latter depends on threshold voltage Vth while Igate varies with the thickness ...

Frank Sill, Frank Grassert, Dirk Timmermann


DAC
2010
ACM

178 views · Computer Architecture

Non-uniform clock mesh optimization with linear programming buffer insertion

15 years 1 months ago

Download vlsida.soe.ucsc.edu

Clock meshes are extremely effective at filtering clock skew from environmental and process variations. For this reason, clock meshes are used in most high performance designs. Ho...

Matthew R. Guthaus, Gustavo Wilke, Ricardo Reis
