Sciweavers

86 search results - page 17 / 18
» Custom Data Layout for Memory Parallelism
CCGRID
2011
IEEE
12 years 8 months ago
Small Discrete Fourier Transforms on GPUs
Efficient implementations of the Discrete Fourier Transform (DFT) for GPUs provide good performance with large data sizes, but are not competitive with CPU code for small data ...
S. Mitra, A. Srinivasan
ICS
2007
Tsinghua U.
13 years 11 months ago
Adaptive Strassen's matrix multiplication
Strassen’s matrix multiplication (MM) has an advantage over any (highly tuned) implementation of MM because Strassen’s algorithm reduces the total number of operations. Strasse...
Paolo D'Alberto, Alexandru Nicolau
SIGGRAPH
1993
ACM
13 years 9 months ago
Leo: a system for cost effective 3D shaded graphics
A physically compact, low cost, high performance 3D graphics accelerator is presented. It supports shaded rendering of triangles and antialiased lines into a double-buffered 24-bi...
Michael F. Deering, Scott R. Nelson
ICPP
1998
IEEE
13 years 9 months ago
Performance Implications of Architectural and Software Techniques on I/O-Intensive Applications
Many large-scale applications have significant I/O requirements as well as computational and memory requirements. Unfortunately, the limited number of I/O nodes provided by the conte...
Meenakshi A. Kandaswamy, Mahmut T. Kandemir, Alok ...
ISPASS
2006
IEEE
13 years 11 months ago
Modeling TCAM power for next generation network devices
Applications in computer networks often require high-throughput access to large data structures for lookup and classification. Many advanced algorithms exist to speed these searc...
Banit Agrawal, Timothy Sherwood