In this work, we propose a new FPGA design flow that combines the CUDA programming model from Nvidia with the state of the art high-level synthesis tool AutoPilot from AutoESL, to...
The increasing demand for low power and high performance multimedia embedded systems has motivated the need for effective solutions to satisfy application bandwidth and latency req...
Luis Angel D. Bathen, Yongjin Ahn, Nikil D. Dutt, ...
Web search engines are facing formidable performance challenges due to data sizes and query loads. The major engines have to process tens of thousands of queries per second over t...
QR decomposition is a computationally intensive linear algebra operation that factors a matrix A into the product of a unitary matrix Q and upper triangular matrix R. Adaptive sys...
Efficient bandwidth allocation and low delays remain important goals, expecially in high-speed networks. Existing end-to-end congestion control schemes (such as TCP+AQM/RED) have s...
A. Durresi, P. Kandikuppa, M. Sridharan, S. Chella...