For memory constrained embedded systems code size is at least as important as performance. One way of increasing code density is to exploit compact instruction formats, e.g. ARM T...
Tiling is a widely used loop transformation for exposing/exploiting parallelism and data locality. Effective use of tiling requires selection and tuning of the tile sizes. This is...
—Current trends in desktop processor design have been toward many-core solutions with increased parallelism. As the number of supported threads grows in these processors, it may ...
—The ongoing move to chip multiprocessors (CMPs) permits greater sharing of last-level cache by processor cores but this sharing aggravates the cache contention problem, potentia...
Chi Xu, Xi Chen, Robert P. Dick, Zhuoqing Morley M...
Abstract—We generate and provide miniature synthetic benchmark clones for modern workloads to solve two pre-silicon design challenges, namely: 1) huge simulation time (weeks to m...