Sciweavers

5553 search results - page 429 / 1111
» Parallel Implementation of Sch
Sort
View
176
Voted
IPPS
2006
IEEE
16 years 5 days ago
An extensible global address space framework with decoupled task and data abstractions
ions Sriram Krishnamoorthy½ Umit Catalyurek¾ Jarek Nieplocha¿ Atanas Rountev½ P. Sadayappan½ ½ Dept. of Computer Science and Engineering, ¾ Dept. of Biomedical Informatics T...
Sriram Krishnamoorthy, Ümit V. Çataly&...
218
Voted
PPOPP
2003
ACM
15 years 11 months ago
Optimizing data aggregation for cluster-based internet services
Large-scale cluster-based Internet services often host partitioned datasets to provide incremental scalability. The aggregation of results produced from multiple partitions is a f...
Lingkun Chu, Hong Tang, Tao Yang, Kai Shen
164
Voted
VLSISP
2008
173views more  VLSISP 2008»
15 years 6 months ago
Fast Bit Gather, Bit Scatter and Bit Permutation Instructions for Commodity Microprocessors
Advanced bit manipulation operations are not efficiently supported by commodity word-oriented microprocessors. Programming tricks are typically devised to shorten the long sequence...
Yedidya Hilewitz, Ruby B. Lee
216
Voted
PLDI
2011
ACM
14 years 9 months ago
Automatic compilation of MATLAB programs for synergistic execution on heterogeneous processors
MATLAB is an array language, initially popular for rapid prototyping, but is now being increasingly used to develop production code for numerical and scientific applications. Typ...
Ashwin Prasad, Jayvant Anantpur, R. Govindarajan
IPPS
2008
IEEE
16 years 17 days ago
High performance MPEG-2 software decoder on the cell broadband engine
The Sony-Toshiba-IBM Cell Broadband Engine is a heterogeneous multicore architecture that consists of a traditional microprocessor (PPE) with eight SIMD coprocessing units (SPEs) ...
David A. Bader, Sulabh Patel