Sciweavers

2681 search results - page 372 / 537
» Performance results of running parallel applications on the ...
Sort
View
ICPP
2003
IEEE
15 years 3 months ago
Scheduling Algorithms with Bus Bandwidth Considerations for SMPs
The bus that connects processors to memory is known to be a major architectural bottleneck in SMPs. However, both software and scheduling policies for these systems generally focu...
Christos D. Antonopoulos, Dimitrios S. Nikolopoulo...
ISCA
2002
IEEE
104views Hardware» more  ISCA 2002»
15 years 3 months ago
Using a User-Level Memory Thread for Correlation Prefetching
This paper introduces the idea of using a User-Level Memory Thread (ULMT) for correlation prefetching. In this approach, a user thread runs on a general-purpose processor in main ...
Yan Solihin, Josep Torrellas, Jaejin Lee
CASES
2009
ACM
15 years 4 months ago
CGRA express: accelerating execution using dynamic operation fusion
Coarse-grained reconfigurable architectures (CGRAs) present an appealing hardware platform by providing programmability with the potential for high computation throughput, scalab...
Yongjun Park, Hyunchul Park, Scott A. Mahlke
ICPP
2007
IEEE
15 years 4 months ago
Architectural Challenges in Memory-Intensive, Real-Time Image Forming
The real-time image forming in future, high-end synthetic aperture radar systems is an example of an application that puts new demands on computer architectures. The initial quest...
Anders Ahlander, H. Hellsten, K. Lind, J. Lindgren...
EUROPAR
2005
Springer
15 years 3 months ago
Tolerating Message Latency Through the Early Release of Blocked Receives
Large message latencies often lead to poor performance of parallel applications. In this paper, we investigate a latency-tolerating technique that immediately releases all blocking...
Jian Ke, Martin Burtscher, William Evan Speight