This paper presents our experience mapping OpenMP parallel programming model to the IBM Cyclops-64 (C64) architecture. The C64 employs a many-core-on-a-chip design that integrates...
In chip multiprocessors (CMPs), data accesslatency dependson the memory hierarchy organization, the on-chip interconnect (NoC), and the running workload. Reducing data access late...
The queue data structure is fundamental and ubiquitous. Lockfree versions of the queue are well known. However, an important open question is whether practical wait-free queues ex...
MapReduce is a programming model and an associated implementation for processing and generating large data sets. Users specify a map function that processes a key/value pair to ge...
Abstract. Recent trend in high-performance computing focuses on networks of workstations (NOWs) as a way ofrealizing cost-effective parallel machines. This has been due to the avai...