Sciweavers

421 search results - page 39 / 85
» An Intelligent Parallel Loop Scheduling for Parallelizing Co...
Sort
View
CORR
2010
Springer
153views Education» more  CORR 2010»
14 years 9 months ago
Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures
The algorithms in the current sequential numerical linear algebra libraries (e.g. LAPACK) do not parallelize well on multicore architectures. A new family of algorithms, the tile a...
Emmanuel Agullo, Henricus Bouwmeester, Jack Dongar...
IPPS
2007
IEEE
15 years 3 months ago
Optimizing Inter-Nest Data Locality Using Loop Splitting and Reordering
With the increasing gap between processor speed and memory latency, the performance of data-dominated programs are becoming more reliant on fast data access, which can be improved...
Sofiane Naci
TPDS
2010
144views more  TPDS 2010»
14 years 8 months ago
Performance Evaluation of Dynamic Speculative Multithreading with the Cascadia Architecture
—Thread-level parallelism (TLP) has been extensively studied in order to overcome the limitations of exploiting instruction-level parallelism (ILP) on high-performance superscala...
David A. Zier, Ben Lee
ICCD
2001
IEEE
98views Hardware» more  ICCD 2001»
15 years 6 months ago
Design Alternatives for Parallel Saturating Multioperand Adders
Parallel saturating multioperand adders significantly improve the performance of GSM speech coders by giving compilers and assembly language programmers the ability to paralleliz...
Pablo I. Balzola, Michael J. Schulte, Jie Ruan, C....
ISCA
2000
IEEE
134views Hardware» more  ISCA 2000»
15 years 2 months ago
Architectural support for scalable speculative parallelization in shared-memory multiprocessors
Speculative parallelization aggressively executes in parallel codes that cannot be fully parallelized by the compiler. Past proposals of hardware schemes have mostly focused on si...
Marcelo H. Cintra, José F. Martínez,...