Divide and conquer algorithms are a good match for modern parallel machines: they tend to have large amounts of inherent parallelism and they work well with caches and deep memory...
We have built an eight-node SMP cluster called COMPaS (Cluster Of Multi-Processor Systems), each node of which is a quad-processor Pentium Pro PC. We have designed and implemented a...
This paper describes the design and implementation of a multicast transport protocol called RMTP. RMTP provides sequenced, lossless delivery of bulk data from one sender to a grou...
We present the design of a dynamic compilation system for C. Directed by a few declarative user annotations specifying where and on what dynamic compilation is to take place, a bi...
Brian Grant, Markus Mock, Matthai Philipose, Craig...
This paper is concerned with performance debugging of multitier applications, such as those commonly found in servers and dynamic-content web sites. Existing tools and techniques for pr...