Sciweavers

1249 search results - page 224 / 250
» Software Architecture for Large-Scale, Distributed, Data-Int...
IPPS
2002
IEEE
15 years 6 months ago
A SIMD Vectorizing Compiler for Digital Signal Processing Algorithms
Short vector SIMD instructions on recent microprocessors, such as SSE on Pentium III and 4, speed up code but are a major challenge to software developers. We present a compiler t...
Franz Franchetti, Markus Püschel
ICA3PP
2010
Springer
15 years 6 months ago
Accelerating Euler Equations Numerical Solver on Graphics Processing Units
Finite volume numerical methods have been widely studied, implemented and parallelized on multiprocessor systems or on clusters. Modern graphics processing units (GPU) pr...
Pierre Kestener, Frédéric Chât...
ICS
1989
Tsinghua U.
15 years 5 months ago
Control flow optimization for supercomputer scalar processing
Control intensive scalar programs pose a very different challenge to highly pipelined supercomputers than vectorizable numeric applications. Function call/return and branch instru...
Pohua P. Chang, Wen-mei W. Hwu
ICS
2007
Tsinghua U.
15 years 8 months ago
Adaptive Strassen's matrix multiplication
Strassen’s matrix multiplication (MM) has benefits with respect to any (highly tuned) implementations of MM because Strassen’s reduces the total number of operations. Strasse...
Paolo D'Alberto, Alexandru Nicolau
CODES
2009
IEEE
15 years 5 months ago
TotalProf: a fast and accurate retargetable source code profiler
Profilers play an important role in software/hardware design, optimization, and verification. Various approaches have been proposed to implement profilers. The most widespread app...
Lei Gao, Jia Huang, Jianjiang Ceng, Rainer Leupers...