Sciweavers

1461 search results - page 75 / 293
» Comparing the Optimal Performance of Parallel Architectures
IEEEPACT
2002
IEEE
15 years 6 months ago
Effective Compilation Support for Variable Instruction Set Architecture
Traditional compilers perform their code generation tasks based on a fixed, pre-determined instruction set. This paper describes the implementation of a compiler that determines ...
Jack Liu, Timothy Kong, Fred C. Chow
IPPS
2008
IEEE
15 years 8 months ago
Reducing the run-time of MCMC programs by multithreading on SMP architectures
The increasing availability of multi-core and multiprocessor architectures provides new opportunities for improving the performance of many computer simulations. Markov Chain Mont...
Jonathan M. R. Byrd, Stephen A. Jarvis, A. H. Bhal...
IPPS
1998
IEEE
15 years 6 months ago
Nearly Optimal Algorithms for Broadcast on d-Dimensional All-Port and Wormhole-Routed Torus
In this paper, we present nearly optimal algorithms for broadcast on a d-dimensional n × n × ... × n torus that supports all-port communication and wormhole routing. Let Tn denote the numb...
Jyh-Jong Tsay, Wen-Tsong Wang
IPPS
1998
IEEE
15 years 6 months ago
Partitioned Schedules for Clustered VLIW Architectures
This paper presents results on a new approach to partitioning a modulo-scheduled loop for distributed execution on parallel clusters of functional units organized as a VLIW machin...
Marcio Merino Fernandes, Josep Llosa, Nigel P. Top...
EUROPAR
2010
Springer
15 years 3 months ago
Thread Owned Block Cache: Managing Latency in Many-Core Architecture
Shared last level cache is crucial to performance. However, multithread program model incurs serious contention in shared cache. In this paper, to reduce average cache ac...
Fenglong Song, Zhiyong Liu, Dongrui Fan, Hao Zhang...