All Netflix Prize algorithms proposed so far are prohibitively costly for large-scale production systems. In this paper, we describe an efficient dataflow implementation of a coll...
Srivatsava Daruru, Nena M. Marin, Matt Walker, Joy...
There has recently been much interest in stream processing, both in industry (e.g., Cell, NVIDIA G80, ATI R580) and academia (e.g., Stanford Merrimac, MIT RAW), with stream progra...
Jayanth Gummaraju, Mattan Erez, Joel Coburn, Mende...
An asynchronous work-stealing implementation of dynamic load balance is implemented using Unified Parallel C (UPC) and evaluated using the Unbalanced Tree Search (UTS) benchmark ...
Abstract. The paradigm shift in processor design from monolithic processors to multicore has renewed interest in programming models that facilitate parallelism. While multicores ar...
Shan Shan Huang, Amir Hormati, David F. Bacon, Rod...
Linear programming (LP) based methods are attractive for solving the placement problem because of their ability to model Half-Perimeter Wirelength (HPWL) and timing. However, it h...