Sciweavers

599 search results - page 2 / 120
» Applying Data Copy to Improve Memory Performance of General ...
Sort
View
CF
2006
ACM
13 years 6 months ago
Intermediately executed code is the key to find refactorings that improve temporal data locality
The growing speed gap between memory and processor makes an efficient use of the cache ever more important to reach high performance. One of the most important ways to improve cac...
Kristof Beyls, Erik H. D'Hollander
PLDI
2010
ACM
13 years 10 months ago
Z-rays: divide arrays and conquer speed and flexibility
Arrays are the ubiquitous organization for indexed data. Throughout programming language evolution, implementations have laid out arrays contiguously in memory. This layout is pro...
Jennifer B. Sartor, Stephen M. Blackburn, Daniel F...
RTCSA
2005
IEEE
13 years 10 months ago
LyraNET: A Zero-Copy TCP/IP Protocol Stack for Embedded Operating Systems
Embedded systems are usually resource limited in terms of processing power, memory, and power consumption, thus embedded TCP/IP should be designed to make the best use of limited ...
Yun-Chen Li, Mei-Ling Chiang
CLUSTER
2007
IEEE
13 years 8 months ago
Efficient asynchronous memory copy operations on multi-core systems and I/OAT
Bulk memory copies incur large overheads such as CPU stalling (i.e., no overlap of computation with memory copy operation), small register-size data movement, cache pollution, etc...
Karthikeyan Vaidyanathan, Lei Chai, Wei Huang, Dha...
TJS
2008
113views more  TJS 2008»
13 years 4 months ago
Improving the parallelism of iterative methods by aggressive loop fusion
Abstract. Traditionally, loop nests are fused only when the data dependences in the loop nests are not violated. This paper presents a new loop fusion algorithm that is capable of ...
Jingling Xue, Minyi Guo, Daming Wei