Sciweavers

35 search results - page 7 / 7
» Loop Scheduling and Partitions for Hiding Memory Latencies
Sort
View
IWMM
2009
Springer
152views Hardware» more  IWMM 2009»
13 years 11 months ago
A new approach to parallelising tracing algorithms
Tracing algorithms visit reachable nodes in a graph and are central to activities such as garbage collection, marshalling etc. Traditional sequential algorithms use a worklist, re...
Cosmin E. Oancea, Alan Mycroft, Stephen M. Watt
RTAS
2005
IEEE
13 years 10 months ago
Bounding Worst-Case Data Cache Behavior by Analytically Deriving Cache Reference Patterns
While caches have become invaluable for higher-end architectures due to their ability to hide, in part, the gap between processor speed and memory access times, caches (and partic...
Harini Ramaprasad, Frank Mueller
CASES
2006
ACM
13 years 8 months ago
High-performance packet classification algorithm for many-core and multithreaded network processor
Packet classification is crucial for the Internet to provide more value-added services and guaranteed quality of service. Besides hardware-based solutions, many software-based cla...
Duo Liu, Bei Hua, Xianghui Hu, Xinan Tang
ASPLOS
2008
ACM
13 years 6 months ago
Communication optimizations for global multi-threaded instruction scheduling
The recent shift in the industry towards chip multiprocessor (CMP) designs has brought the need for multi-threaded applications to mainstream computing. As observed in several lim...
Guilherme Ottoni, David I. August
IEEEPACT
2005
IEEE
13 years 10 months ago
HUNTing the Overlap
Hiding communication latency is an important optimization for parallel programs. Programmers or compilers achieve this by using non-blocking communication primitives and overlappi...
Costin Iancu, Parry Husbands, Paul Hargrove