Sciweavers

587 search results - page 37 / 118
» Improving the Java memory model using CRF
PPPJ
2009
ACM
15 years 8 months ago
Automatic parallelization for graphics processing units
Accelerated graphics cards, or Graphics Processing Units (GPUs), have become ubiquitous in recent years. On the right kinds of problems, GPUs greatly surpass CPUs in terms of raw ...
Alan Leung, Ondrej Lhoták, Ghulam Lashari
MST
2002
15 years 1 months ago
Bulk Synchronous Parallel Algorithms for the External Memory Model
Abstract. Blockwise access to data is a central theme in the design of efficient external memory (EM) algorithms. A second important issue, when more than one disk is present, is f...
Frank K. H. A. Dehne, Wolfgang Dittrich, David A. ...
DAC
2010
ACM
15 years 2 months ago
Instruction cache locking using temporal reuse profile
The performance of most embedded systems is critically dependent on the average memory access latency. Improving the cache hit rate can have significant positive impact on the per...
Yun Liang, Tulika Mitra
HPCA
2005
IEEE
16 years 2 months ago
Using Virtual Load/Store Queues (VLSQs) to Reduce the Negative Effects of Reordered Memory Instructions
The use of large instruction windows coupled with aggressive out-of-order and prefetching capabilities has provided significant improvements in processor performance. In this paper...
Aamer Jaleel, Bruce L. Jacob
CLUSTER
2002
IEEE
15 years 6 months ago
Mixed Mode Matrix Multiplication
In modern clustering environments where the memory hierarchy has many layers (distributed memory, shared memory layer, cache, ...), an important question is how to fully u...
Meng-Shiou Wu, Srinivas Aluru, Ricky A. Kendall