Sciweavers

421 search results - page 67 / 85
» An Intelligent Parallel Loop Scheduling for Parallelizing Co...
CEC
2010
IEEE
14 years 10 months ago
Evolving a CUDA kernel from an nVidia template
Rather than attempting to evolve a complete program from scratch we demonstrate genetic interface programming (GIP) by automatically generating a parallel CUDA kernel with identica...
William B. Langdon, Mark Harman
ICCS
2009
Springer
15 years 4 months ago
Generating Empirically Optimized Composed Matrix Kernels from MATLAB Prototypes
The development of optimized codes is time-consuming and requires extensive architecture, compiler, and language expertise; therefore, computational scientists are often forced to ...
Boyana Norris, Albert Hartono, Elizabeth R. Jessup...
ICS
2005
Tsinghua U.
15 years 3 months ago
Think globally, search locally
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
Kamen Yotov, Keshav Pingali, Paul Stodghill
HIPC
2005
Springer
15 years 3 months ago
Cooperative Instruction Scheduling with Linear Scan Register Allocation
Linear scan register allocation is an attractive register allocation algorithm because of its simplicity and fast running time. However, it is generally felt that linear ...
Khaing Khaing Kyi Win, Weng-Fai Wong
APCSAC
2000
IEEE
15 years 2 months ago
Micro-Threading: A New Approach to Future RISC
This paper briefly reviews the current research into RISC microprocessor architecture, which now seems to be so complex as to make the acronym somewhat of an oxymoron. In response...
Chris R. Jesshope, Bing Luo