Sciweavers

2609 search results - page 458 / 522
» Optimizing for parallelism and data locality
ISSS
2002
IEEE
15 years 5 months ago
Multiprocessor Mapping of Process Networks: A JPEG Decoding Case Study
We present a system-level design and programming method for embedded multiprocessor systems. The aim of the method is to improve the design time and design quality by providing a ...
Erwin A. de Kock
EUROPAR
2009
Springer
15 years 5 months ago
Capturing and Visualizing Event Flow Graphs of MPI Applications
A high-level understanding of how an application executes and which performance characteristics it exhibits is essential in many areas of high performance computing, such as applic...
Karl Fürlinger, David Skinner
ICCD
1999
IEEE
15 years 4 months ago
TriMedia CPU64 Architecture
We present a new VLIW core as a successor to the TriMedia TM1000. The processor is targeted for embedded use in media-processing devices like DTVs and set-top boxes. Intended as a...
Jos T. J. van Eijndhoven, Kees A. Vissers, Evert-J...
SIGGRAPH
1994
ACM
15 years 4 months ago
IRIS Performer: a high performance multiprocessing toolkit for real-time 3D graphics
This paper describes the design and implementation of IRIS Performer, a toolkit for visual simulation, virtual reality, and other real-time 3D graphics applications. The principal...
John Rohlf, James Helman
DSD
2010
IEEE
15 years 22 days ago
Design of Trace-Based Split Array Caches for Embedded Applications
Since many embedded systems execute a predefined set of programs, tuning system components to application programs and data is the approach chosen by many design techniques to o...
Alice M. Tokarnia, Marina Tachibana