Sciweavers

4213 search results - page 812 / 843
» The Tau Parallel Performance System
ICS
2003
Tsinghua U.
15 years 9 months ago
Estimating cache misses and locality using stack distances
Cache behavior modeling is an important part of modern optimizing compilers. In this paper we present a method to estimate the number of cache misses, at compile time, using a mac...
Calin Cascaval, David A. Padua
SC
1992
ACM
15 years 7 months ago
Willow: A Scalable Shared Memory Multiprocessor
We are currently developing Willow, a shared-memory multiprocessor whose design provides system capacity and performance capable of supporting over a thousand commercial microproc...
John K. Bennett, Sandhya Dwarkadas, Jay A. Greenwo...
ASPLOS
1989
ACM
15 years 7 months ago
Architecture and Compiler Tradeoffs for a Long Instruction Word Microprocessor
A very long instruction word (VLIW) processor exploits parallelism by controlling multiple operations in a single instruction word. This paper describes the architecture and compil...
Robert Cohn, Thomas R. Gross, Monica S. Lam, P. S....
CPHYSICS
2006
15 years 3 months ago
MinFinder: Locating all the local minima of a function
A new stochastic clustering algorithm is introduced that aims to locate all the local minima of a multidimensional continuous and differentiable function inside a bounded domain. ...
Ioannis G. Tsoulos, Isaac E. Lagaris
IJHPCA
2010
15 years 2 months ago
Fine-Grained Multithreading Support for Hybrid Threaded MPI Programming
As high-end computing systems continue to grow in scale, recent advances in multi- and many-core architectures have pushed such growth toward denser architectures, that is, mor...
Pavan Balaji, Darius Buntinas, David Goodell, Will...