Parallel workstations, each comprising 10-100 processors, promise cost-effective general-purpose multiprocessing. This paper explores the coupling of such small- to medium-scale s...
This paper presents a basic framework for applying static task scheduling techniques to arbitrarily-structured task systems whose targeted execution environment is comprised of fi...
Orca is a portable, object-based distributed shared memory system. This paper studies and evaluates the design choices made in the Orca system and compares Orca with other DSMs. T...
Henri E. Bal, Raoul Bhoedjang, Rutger F. H. Hofman...
The trend towards integrating multiple cores on the same die has accentuated the need for larger on-chip caches. Such large caches are constructed as a multitude of smaller cache ...
Reetuparna Das, Asit K. Mishra, Chrysostomos Nicop...
Tiling is a widely used loop transformation for exposing/exploiting parallelism and data locality. Effective use of tiling requires selection and tuning of the tile sizes. This is...