Commercial server applications remain memory bound on modern multiprocessor systems because of their large data footprints, frequent sharing, complex non-strided access patterns, ...
Thomas F. Wenisch, Michael Ferdman, Anastasia Aila...
Traditionally, cache coherence in large-scale shared-memory multiprocessors has been ensured by means of a distributed directory structure stored in main memory. In this way, the ...
— Leveraging the power of scratchpad memories (SPMs) available in most embedded systems today is crucial to extract maximum performance from application programs. While regular a...
Taylan Yemliha, Shekhar Srikantaiah, Mahmut T. Kan...
This paper proposes a simple and efficient implementation method for a hierarchical coarse grain task parallel processing scheme on a SMP machine. OSCAR multigrain parallelizing c...
Intuitively, aggressive work sharing among concurrent queries in a database system should always improve performance by eliminating redundant computation or data accesses. We show...
Ryan Johnson, Nikos Hardavellas, Ippokratis Pandis...