The increasing availability of high-performance computing systems with thousands, tens of thousands, and even hundreds of thousands of computational nodes is driving the demand fo...
Step caches are caches in which data entered into a cache array is kept valid only until the end of the ongoing step of execution. Together with an advanced pipelined multithreaded arc...
We present Vector LLVA, a virtual instruction set architecture (VISA) that exposes extensive static information about vector parallelism while avoiding the use of hardware-specifi...
In 2002, Japan announced the Earth Simulator—a supercomputer based on low-volume vector processors and a custom network—and reported that computational scientists had used it ...
Competition and cooperation can boost the performance of a combinatorial search process. Both can be implemented with a portfolio of algorithms which run in parallel, give...