The current trend is for processors to deliver dramatic improvements in parallel performance while only modestly improving serial performance. Parallel performance is harvested th...
Sanjeev Kumar, Daehyun Kim, Mikhail Smelyanskiy, Y...
Embedded computing architectures can be designed to meet a variety of application specific requirements. However, optimized hardware can require compiler support to realize the po...
This paper presents novel tree-based search algorithms that exploit the SIMD instructions found in virtually all modern processors. The algorithms are a natural extension of binar...
Benjamin Schlegel, Rainer Gemulla, Wolfgang Lehner
We present new algorithms on commodity graphics processors to perform fast computation of several common database operations. Specifically, we consider operations such as conjunct...
Naga K. Govindaraju, Brandon Lloyd, Wei Wang 0010,...
In this paper we describe the design and implementation of CellSort − a high performance distributed sort algorithm for the Cell processor. We design CellSort as a distributed b...