In the Collect problem for an asynchronous shared-memory system, the processors must learn all values of a collection of shared registers while minimizing the tot...
Bogdan S. Chlebus, Dariusz R. Kowalski, Alexander ...
The disparity between microprocessor clock frequencies and memory latency is a primary reason why many demanding applications run well below peak achievable performance. Software c...
Joseph Gebis, Leonid Oliker, John Shalf, Samuel Wi...
Immediate notification of urgent but rare events and delivery of time-sensitive actuation commands appear in many practical wireless sensor and actuator network applications. M...
The designer of a system on a chip (SoC) that connects IP cores through a network on chip (NoC) needs methods to support application performance evaluation. Two key aspects these ...
Leonel Tedesco, Aline Mello, Diego Garibotti, Ney ...
Cache hierarchies have traditionally been designed for use by a single application, thread, or core. As multi-threaded (MT) and multi-core (CMP) platform architectures emerge and...