In the near future, semiconductor technology will allow the integration of multiple processors on a chip or multichipmodule (MCM). In this paper we investigate the architecture an...
With increasing demands for high performance by embedded systems, especially by digital signal processing applications, embedded processors must increase available instruction lev...
We present a fast and scalable matrix multiplication algorithm on distributed memory concurrent computers, whose performance is independent of data distribution on processors, and...
While technology trends have ushered in the age of chip multiprocessors (CMP) and enabled designers to place an increasing number of cores on chip, a fundamental question is what ...
Divya Gulati, Changkyu Kim, Simha Sethumadhavan, S...
Low-latency and high-throughput processing are key requirements of data stream management systems (DSMSs). Hence, multi-core processors that provide high aggregate processing capa...