Parallel programs that use critical sections and are executed on a shared-memory multiprocessor with a writeinvalidate protocol result in invalidation actions that could be elimin...
A key step in program optimization is the determination of optimal values for code optimization parameters such as cache tile sizes and loop unrolling factors. One approach, which...
This work proposes a new architecture and execution model called 2D-VLIW. This architecture adopts an execution model based on large pieces of computation running over a matrix of...
Optimizing the performance of a multi-core microprocessor within a power budget has recently received a lot of attention. However, most existing solutions are centralized and cann...
In this paper we present a concurrent implementation of genetic algorithms designed for shared memory architectures intended to take advantage of multi-core processor platforms. O...