Sciweavers

87 search results - page 5 / 18
» Improving the Memory Bandwidth Utilization Using Loop Transf...
ICCAD
2009
IEEE
14 years 9 months ago
Automatic memory partitioning and scheduling for throughput and power optimization
Hardware acceleration is crucial in modern embedded system design to meet the explosive demands on performance and cost. Selected computation kernels for acceleration are usually ...
Jason Cong, Wei Jiang, Bin Liu, Yi Zou
ICCS
2005
Springer
15 years 5 months ago
Performance and Scalability Analysis of Cray X1 Vectorization and Multistreaming Optimization
Cray X1 Fortran and C/C++ compilers provide a number of loop transformations, notably vectorization and multistreaming, in order to exploit the multistreaming processor (MSP) hard...
Sadaf R. Alam, Jeffrey S. Vetter
CODES
2000
IEEE
15 years 4 months ago
Co-design of interleaved memory systems
Memory interleaving is a cost-efficient approach to increase bandwidth. Improving data access locality and reducing memory access conflicts are two important aspects to achieve hi...
Hua Lin, Wayne Wolf
HPDC
2008
IEEE
15 years 6 months ago
XenLoop: a transparent high performance inter-VM network loopback
Advances in virtualization technology have focused mainly on strengthening the isolation barrier between virtual machines (VMs) that are co-resident within a single physical machi...
Jian Wang, Kwame-Lante Wright, Kartik Gopalan
IEEEPACT
1999
IEEE
15 years 3 months ago
On Reducing False Sharing while Improving Locality on Shared Memory Multiprocessors
The performance of applications on large shared-memory multiprocessors with coherent caches depends on the interaction between the granularity of data sharing, the size of the coh...
Mahmut T. Kandemir, Alok N. Choudhary, J. Ramanuja...