After many years of prefetching research, most commercially available systems support only two types of prefetching: software-directed prefetching and hardware-based prefetchers u...
Tiling is a crucial loop transformation for generating high performance code on modern architectures. Efficient generation of multilevel tiled code is essential for maximizing da...
Albert Hartono, Muthu Manikandan Baskaran, C&eacut...
In a cluster of multiple processors or cpu-cores, many processes may run on each compute node. Each process tends to issue contiguous I/O requests for snapshot, checkpointing or s...
Given a large volume of Web documents, we consider problem of finding the shortest keyword sequences for each of the documents such that a keyword sequence can be rendered to a g...
Achieving good performance on a modern machine with a multi-level memory hierarchy, and in particular on a machine with software-managed memories, requires precise tuning of progr...
Manman Ren, Ji Young Park, Mike Houston, Alex Aike...