Sciweavers

» T: integrated building blocks for parallel computing
IPPS
1999
IEEE
15 years 3 months ago
Run-Time Selection of Block Size in Pipelined Parallel Programs
Parallelizing compiler technology has improved in recent years. One area in which compilers have made progress is in handling DOACROSS loops, where cross-processor data dependencie...
David K. Lowenthal, Michael James
IEEEPACT
2005
IEEE
15 years 4 months ago
Dual-Core Execution: Building a Highly Scalable Single-Thread Instruction Window
Current integration trends embrace the prosperity of single-chip multi-core processors. Although multi-core processors deliver significantly improved system throughput, single-thr...
Huiyang Zhou
PDP
2009
IEEE
15 years 5 months ago
Task-Parallel versus Data-Parallel Library-Based Programming in Multicore Systems
Multicore machines are becoming common. There are many languages, language extensions and libraries devoted to improving the programmability and performance of these machines. In ...
Diego Andrade, Basilio B. Fraguela, James C. Brodm...
IPPS
2009
IEEE
15 years 5 months ago
Exploring the effect of block shapes on the performance of sparse kernels
In this paper we explore the impact of the block shape on blocked and vectorized versions of the Sparse Matrix-Vector Multiplication (SpMV) kernel and build upon previous work by ...
Vasileios Karakasis, Georgios I. Goumas, Nectarios...
TC
2008
14 years 10 months ago
Low-Complexity Bit-Parallel Square Root Computation over GF(2^{m}) for All Trinomials
In this contribution we introduce a low-complexity bit-parallel algorithm for computing square roots over binary extension fields. Our proposed method can be applied for any type ...
Francisco Rodríguez-Henríquez, Guill...