In this paper, we develop a new language construct to address one of the pitfalls of parallel programming: precise handling of events across parallel components. The construct, te...
William Thies, Michal Karczmarek, Janis Sermulins,...
We are interested in this paper to study scheduling problems in systems where many users compete to perform their respective jobs on shared parallel resources. Each user has speci...
Abstract. We present new performance models and a new, more compact data structure for cache blocking when applied to the sparse matrixvector multiply (SpM×V) operation, y ← y +...
Rajesh Nishtala, Richard W. Vuduc, James Demmel, K...
This work presents the first extensive study of singlenode performance optimization, tuning, and analysis of the fast multipole method (FMM) on modern multicore systems. We consid...
Aparna Chandramowlishwaran, Samuel Williams, Leoni...
With the availability of large datasets in a variety of scientific and commercial domains, data mining has emerged as an important area within the last decade. Data mining techni...