Parallel architectures are the way of the future, but are notoriously difficult to program. In addition to the low-level constructs they often present (e.g., locks, DMA, and non-...
—This paper presents reuse-aware modulo scheduling to maximizing stream reuse and improving concurrency for stream-level loops running on stream processors. The novelty lies in t...
The Cell processor is a typical example of a heterogeneous multiprocessor-on-chip architecture that uses several levels of parallelism to deliver high performance. Closing the gap ...
In modern superscalar processors, the complex instruction scheduler could form the critical path of the pipeline stages and limit the clock cycle time. In addition, complex schedu...
Computational phylogeny is a challenging application even for the most powerful supercomputers. It is also an ideal candidate for benchmarking emerging multiprocessor architecture...
Filip Blagojevic, Alexandros Stamatakis, Christos ...