Parallel programming continues to be difficult, despite substantial and ongoing research aimed at making it tractable. Especially dismaying is the gulf between theory and the pract...
Instance based locality optimization 6 is a semi automatic program restructuring method that reduces the number of cache misses. The method imitates the human approach of consideri...
Two-level predictors deliver highly accurate conditional branch prediction, indirect branch target prediction and value prediction. Accurate prediction enables speculative executio...
There is growing interest in run-time detection as parallel and distributed systems grow larger and more complex. This work targets run-time analysis of complex, interactive scien...
A structured approach to parallel programming allows to construct applications by composing skeletons, i.e., recurring patterns of task- and data-parallelism. First academic and co...