To execute a shared memory program efficiently, we have to manage memory consistency with low overheads, and have to utilize communication bandwidth of the platform as much as pos...
Thread escape analysis conservatively determines which objects may be accessed in more than one thread. Thread escape analysis is useful for a variety of purposes – finding rac...
Previous studies have shown that array regrouping and structure splitting significantly improve data locality. The most effective technique relies on profiling every access to eve...
Interprocedural data flow analyses of executable programs suffer from the conservative assumptions that need to be made because no precise control flow graph is available and beca...
Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory...