Sciweavers

2784 search results - page 181 / 557
» Instruction Level Parallelism
Sort
View
IEEEPACT
2002
IEEE
15 years 9 months ago
Exploiting Pseudo-Schedules to Guide Data Dependence Graph Partitioning
This paper presents a new modulo scheduling algorithm for clustered microarchitectures. The main feature of the proposed scheme is that the assignment of instructions to clusters ...
Alex Aletà, Josep M. Codina, F. Jesú...
CSIE
2009
IEEE
15 years 9 months ago
K-Means on Commodity GPUs with CUDA
K-means algorithm is one of the most famous unsupervised clustering algorithms. Many theoretical improvements for the performance of original algorithms have been put forward, whi...
Hong-tao Bai, Li-li He, Dan-tong Ouyang, Zhan-shan...
ICPP
1999
IEEE
15 years 8 months ago
Improving Performance of Load-Store Sequences for Transaction Processing Workloads on Multiprocessors
On-line transaction processing exhibits poor memory behavior in high-end multiprocessor servers because of complex sharing patterns and substantial interaction between the databas...
Jim Nilsson, Fredrik Dahlgren
MICRO
1996
IEEE
96views Hardware» more  MICRO 1996»
15 years 8 months ago
Exceeding the Dataflow Limit via Value Prediction
For decades, the serialization constraints imposed by true data dependences have been regarded as an absolute limit--the dataflow limit--on the parallel execution of serial progra...
Mikko H. Lipasti, John Paul Shen
ICS
1989
Tsinghua U.
15 years 8 months ago
Control flow optimization for supercomputer scalar processing
Control intensive scalar programs pose a very different challenge to highly pipelined supercomputers than vectorizable numeric applications. Function call/return and branch instru...
Pohua P. Chang, Wen-mei W. Hwu