The recent improvements in workstation and interconnection network performance have popularized the clusters of off-the-shelf workstations. However, the usefulness of these cluste...
Data-Pipelining is a widely used model to represent streaming applications. Incremental decomposition and optimization of a data-pipelining application onto a multi-processor plat...
Bo-Cheng Charles Lai, Patrick Schaumont, Wei Qin, ...
3D stacked wafer integration has the potential to improve multiprocessor system-on-chip (MPSoC) integration density, performance, and power efficiency. However, the power density...
We present Lazy Binary Splitting (LBS), a user-level scheduler of nested parallelism for shared-memory multiprocessors that builds on existing Eager Binary Splitting work-stealing...
Alexandros Tzannes, George C. Caragea, Rajeev Baru...
With the ever-increasing transistor variability in CMOS technology, it is essential to integrate variation-aware performance analysis into the task allocation and scheduling proce...