Smaller feature sizes, reduced voltage levels, higher transistor counts, and reduced noise margins make future generations of microprocessors increasingly prone to transient hardw...
Despite large caches, main-memory access latencies still cause significant performance losses in many applications. Numerous hardware and software prefetching schemes tolerate th...
Zhenlin Wang, Doug Burger, Steven K. Reinhardt, Ka...
Shared-memory multiprocessors are frequently used in a timesharing style with multiple parallel applications executing at the same time. In such an environment, where the machine ...
Partitioning and global scheduling are two approaches for scheduling real-time tasks on multiprocessors. Though partitioning is sub-optimal, it has traditionally been preferred; t...
Anand Srinivasan, Philip Holman, James H. Anderson...
We investigate the efficient iterative solution of large-scale sparse linear systems on shared-memory multiprocessors. Our parallel approach is based on a multilevel ILU precondit...