Data intensive applications on clusters often require requests quickly be sent to the node managing the desired data. In many applications, one must look through a sorted tree str...
In a parallel system with multiple CPUs, one of the key problems is to assign loop iterations to processors. This problem, known as the loop scheduling problem, has been studied i...
Mahmut T. Kandemir, Taylan Yemliha, Seung Woo Son,...
We present a unified approach to locality optimization that employs both data and control transformations. Data transformations include changing the array layout in memory. Contr...
Current network protocols must comply with rigid interfaces and rules of behavior to fit into well defined, vertical protocol stacks. It is difficult for network designers to offe...
To support dynamic address translation in today's microprocessors, the first-level cache is accessed in parallel with a translation lookaside buffer (TLB). However, this curre...