Fill-reducing sparse matrix orderings have been a topic of active research for many years. Although most such algorithms are developed and analyzed within a graph-theoretical frame...
A thread executing on a simultaneous multithreading (SMT) processor that experiences a long-latency load will eventually stall while holding execution resources. Existing long-lat...
—Multiple processors or multi-core CPUs are now in common, and the number of processes running concurrently is increasing in a cluster. Each process issues contiguous I/O request...
Parallel software for solving the quadratic program arising in training support vector machines for classification problems is introduced. The software implements an iterative dec...
A key challenge in achieving high performance on software DSM systems is overcoming their relatively large communication latencies. In this paper, we consider two techniques which...