Sciweavers

25 search results - page 3 / 5
» Performance portable GPU code generation for matrix multipli...
MACOM
2010
13 years 2 days ago
On the Performance of Single LDGM Codes for Iterative Data Fusion over the Multiple Access Channel
One of the applications of wireless sensor networks currently undergoing active research focuses on the scenario where the information generated by a data source S is simultaneousl...
Javier Del Ser, Javier Garcia-Frias, Pedro M. Cres...
CC
2008
Springer
123 views, System Software
13 years 7 months ago
Automatic Transformation of Bit-Level C Code to Support Multiple Equivalent Data Layouts
Portable low-level C programs must often support multiple equivalent in-memory layouts of data, due to the byte or bit order of the compiler, architecture, or external data formats...
Marius Nita, Dan Grossman
ACMMSP
2006
ACM
260 views, Hardware
13 years 11 months ago
Seven at one stroke: results from a cache-oblivious paradigm for scalable matrix algorithms
A blossoming paradigm for block-recursive matrix algorithms is presented that, at once, attains excellent performance measured by time, TLB misses, L1 misses, L2 m...
Michael D. Adams, David S. Wise
SAIG
2000
Springer
13 years 8 months ago
Code Generators for Automatic Tuning of Numerical Kernels: Experiences with FFTW
Achieving peak performance in important numerical kernels such as dense matrix multiply or sparse-matrix vector multiplication usually requires extensive, machine-dependent tuning ...
Rich Vuduc, James Demmel
APVIS
2008
13 years 6 months ago
Dynamic Shader Generation for Flexible Multi-Volume Visualization
Volume rendering of multiple intersecting volumetric objects is a difficult visualization task, especially if different rendering styles need to be applied to the components, in o...
Friedemann Rößler, Ralf P. Botchen, Tho...