Sciweavers

36 search results - page 3 / 8
» Performance Portable Optimizations for Loops Containing Comm...
CASES
2009
ACM
14 years 19 days ago
CGRA express: accelerating execution using dynamic operation fusion
Coarse-grained reconfigurable architectures (CGRAs) present an appealing hardware platform by providing programmability with the potential for high computation throughput, scalab...
Yongjun Park, Hyunchul Park, Scott A. Mahlke
WIESS
2000
13 years 7 months ago
Stub-Code Performance Is Becoming Important
As IPC mechanisms become faster, stub-code efficiency becomes a performance issue for local client/server RPCs and inter-component communication. Inefficient and unnecessary compl...
Andreas Haeberlen, Jochen Liedtke, Yoonho Park, La...
MICRO
2005
IEEE
13 years 11 months ago
Exploiting Vector Parallelism in Software Pipelined Loops
An emerging trend in processor design is the addition of short vector instructions to general-purpose and embedded ISAs. Frequently, these extensions are employed using traditiona...
Samuel Larsen, Rodric M. Rabbah, Saman P. Amarasin...
ISSS
1998
IEEE
13 years 10 months ago
Synchronization Detection for Multi-Process Hierarchical Synthesis
Complex system specifications are often hierarchically composed of several subsystems. Each subsystem contains one or more processes. In order to provide optimization across diffe...
Oliver Bringmann, Wolfgang Rosenstiel, Dirk Reicha...
IOPADS
1997
13 years 7 months ago
Remote I/O: Fast Access to Distant Storage
As high-speed networks make it easier to use distributed resources, it becomes increasingly common that applications and their data are not colocated. Users have traditionally add...
Ian T. Foster, David Kohr, Rakesh Krishnaiyer, Jac...