Tracing algorithms visit reachable nodes in a graph and are central to activities such as garbage collection, marshalling etc. Traditional sequential algorithms use a worklist, re...
Exploiting compile time knowledge to improve memory bandwidth can produce noticeable improvements at run-time [13, 1]. Allocating the data structure [13] to separate memories when...
The Sony–Toshiba–IBM Cell Broadband Engine (Cell/B.E.) is a heterogeneous multicore architecture that consists of a traditional microprocessor (PPE) with eight SIMD co-process...
David A. Bader, Virat Agarwal, Kamesh Madduri, Seu...
Search-based graph queries, such as finding short paths and isomorphic subgraphs, are dominated by memory latency. If input graphs can be partitioned appropriately, large cluster...
Jonathan W. Berry, Bruce Hendrickson, Simon Kahan,...
Modern deep packet inspection systems use regular expressions to define various patterns of interest in network data streams. Deterministic Finite Automata (DFA) are commonly used...