Search Sciweavers | Sciweavers

19 search results - page 4 / 4

» Automatic Tuning Matrix Multiplication Performance on Graphi...

ICCS 2009, Springer

191 views | Applied Computing

A Note on Auto-tuning GEMM for GPUs

13 years 12 months ago

Download www.netlib.org

The development of high performance dense linear algebra (DLA) critically depends on highly optimized BLAS, and especially on the matrix multiplication routine (GEMM). This is espe...

Yinan Li, Jack Dongarra, Stanimire Tomov

FCCM 2007, IEEE

107 views | VLSI

Optimizing Logarithmic Arithmetic on FPGAs

13 years 11 months ago

Download comparch.doc.ic.ac.uk

This paper proposes optimizations of the methods and parameters used in both mathematical approximation and hardware design for logarithmic number system (LNS) arithmetic. First, ...

Haohuan Fu, Oskar Mencer, Wayne Luk

SASP 2009, IEEE

291 views | Hardware

A parameterisable and scalable Smith-Waterman algorithm implementation on CUDA-compatible GPUs

14 years 1 day ago

Download www.see.ed.ac.uk

This paper describes a multi-threaded parallel design and implementation of the Smith-Waterman (SM) algorithm on compute unified device architecture (CUDA)-compatible graphic pr...

Cheng Ling, Khaled Benkrid, Tsuyoshi Hamada

FCCM 2008, IEEE

212 views | VLSI

Map-reduce as a Programming Model for Custom Computing Machines

13 years 11 months ago

Download mesl.ucsd.edu

The map-reduce model requires users to express their problem in terms of a map function that processes single records in a stream, and a reduce function that merges all mapped out...

Jackson H. C. Yeung, C. C. Tsang, Kuen Hung Tsoi, ...
