Shared memory multiprocessors play an increasingly important role in enterprise and scientific computing facilities. Remote misses limit the performance of shared memory applicat...
Silicon technology will continue to provide an exponential increase in the availability of raw transistors. Effectively translating this resource into application performance, how...
Steven Swanson, Ken Michelson, Andrew Schwerin, Ma...
The NVIDIA® OptiX™ ray tracing engine is a programmable system designed for NVIDIA GPUs and other highly parallel architectures. The OptiX engine builds on the key observation ...
Steven G. Parker, James Bigler, Andreas Dietrich, ...
Abstract. In this paper we present the new data structure Colored Sector Search Tree (CSST ) for solving the Nearest-Foreign-Neighbor Query Problem (NFNQP ): Given a set S of n col...
Thomas Graf, V. Kamakoti, N. S. Janaki Latha, C. P...
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...