In this work, we investigate how the compiler technique of message strip mining performs in practice on contemporary high-performance networks. Message strip mining attempts to redu...
The use of a cluster for distributed performance analysis of parallel trace data is discussed. We propose an analysis architecture that uses multiple cluster nodes as a server to ...
This paper presents a design study of the datapath for a very long instruction word (VLIW) video signal processor (VSP). VLIW architectures provide high parallelism and excellen...
Andrew Wolfe, Jason Fritts, Santanu Dutta, Edil S....
A new acceleration technology for Java embedded virtual machines is presented in this paper. Based on the selective dynamic compilation technique, this technology addresses the J2...
In this paper, we propose a design technique to pipeline cache memories for high-bandwidth applications. With the scaling of technology, cache access latencies are multiple clock cy...