Modern microprocessors can achieve high performance on linear algebra kernels but this currently requires extensive machine-speci c hand tuning. We have developed a methodology wh...
Jeff Bilmes, Krste Asanovic, Chee-Whye Chin, James...
In this paper we present the venture of porting two different algorithms for solving the two-dimensional advection PDE on the CBE platform, an in-place and an outof-place one, and ...
The NVIDIA® OptiX™ ray tracing engine is a programmable system designed for NVIDIA GPUs and other highly parallel architectures. The OptiX engine builds on the key observation ...
Steven G. Parker, James Bigler, Andreas Dietrich, ...
This paper presents ObjRecombGA, a genetic algorithm framework for recombining related programs at the object file level. A genetic algorithm guides the selection of object file...
This paper proposes a rapid and accurate evaluation scheme for cycle counts of a pipelined processor using evaluation reuse technique. Since exploration of an optimal processor is...