The recent multi/many-core CPUs or GPUs have provided an ideal parallel computing platform to accelerate the time-consuming analysis of radio-frequency/millimeter-wave (RF/MM) int...
Xuexin Liu, Hao Yu, Jacob Relles, Sheldon X.-D. Ta...
In this work we discuss a class of defect correction methods which is easily adapted to create parallel time integrators for multi-core architectures and is ideally suited for deve...
Andrew J. Christlieb, Colin B. Macdonald, Benjamin...
Modern chip multiprocessors (CMPs) are designed to exploit both instruction-level parallelism (ILP) within processors and thread-level parallelism (TLP) within and across processo...
Changkyu Kim, Simha Sethumadhavan, M. S. Govindan,...
Model order reduction is an efficient technique to reduce the system complexity while producing a good approximation of the input-output behavior. However, the efficiency of reduc...
Boyuan Yan, Lingfei Zhou, Sheldon X.-D. Tan, Jie C...
Distributing large data to many nodes, known as a broadcast or a multicast, is an important operation in parallel and distributed computing. Most previous broadcast algorithms e...