Sciweavers

IFE
2010
13 years 1 month ago
A middleware for efficient stream processing in CUDA
This paper presents a middleware capable of out-of-order execution of kernels and data transfers for efficient stream processing in the compute unified device architecture (CUDA). ...
Shinta Nakagawa, Fumihiko Ino, Kenichi Hagihara
CCGRID
2010
IEEE
13 years 5 months ago
High Performance Data Transfer in Grid Environment Using GridFTP over InfiniBand
GridFTP, designed by using the Globus XIO framework, is one of the most popular methods for performing data transfers in the Grid environment. But the performance of GridFTP in WA...
Hari Subramoni, Ping Lai, Rajkumar Kettimuthu, Dha...
MOBISYS
2010
ACM
13 years 6 months ago
Energy-delay tradeoffs in smartphone applications
Many applications are enabled by the ability to capture videos on a smartphone and to have these videos uploaded to an Internet-connected server. This capability requires the trans...
Moo-Ryong Ra, Jeongyeup Paek, Abhishek Sharma, Ram...
ISLPED
1995
ACM
13 years 8 months ago
Simultaneous scheduling and binding for power minimization during microarchitecture synthesis
Sub-micron technologies and the increasing size and complexity of integrated components have aggravated the effect of long interconnects and buses, compared to that of ga...
Aurobindo Dasgupta, Ramesh Karri
DATE
2004
IEEE
13 years 8 months ago
Data Reuse Analysis Technique for Software-Controlled Memory Hierarchies
In multimedia and other streaming applications a significant portion of energy is spent on data transfers. Exploiting data reuse opportunities in the application, we can reduce th...
Ilya Issenin, Erik Brockmeyer, Miguel Miranda, Nik...
DAC
1996
ACM
13 years 8 months ago
Electromigration Reliability Enhancement via Bus Activity Distribution
Electromigration-induced degradation in integrated circuits has been accelerated by continuous scaling of device dimensions. We present a methodology for synthesizing high-reliab...
Aurobindo Dasgupta, Ramesh Karri
PPOPP
1997
ACM
13 years 8 months ago
Performance Implications of Communication Mechanisms in All-Software Global Address Space Systems
Global addressing of shared data simplifies parallel programming and complements message passing models commonly found in distributed memory machines. A number of programming sys...
Beng-Hong Lim, Chi-Chao Chang, Grzegorz Czajkowski...
HPDC
2003
IEEE
13 years 9 months ago
Optimizing GridFTP through Dynamic Right-Sizing
In this paper, we describe the integration of dynamic right-sizing — an automatic and scalable buffer management technique for enhancing TCP performance — into GridFTP, a sub...
Sunil Thulasidasan, Wu-chun Feng, Mark K. Gardner
GRID
2004
Springer
13 years 10 months ago
Flexible Control of Data Transfers between Parallel Programs
Allowing loose coupling between complex e-Science applications has many advantages, such as being able to easily incorporate new applications and to flexibly specify how the appl...
Joe Shang-Chieh Wu, Alan Sussman
ICS
2007
Tsinghua U.
13 years 10 months ago
Automatic nonblocking communication for partitioned global address space programs
Overlapping communication with computation is an important optimization on current cluster architectures; its importance is likely to increase as the doubling of processing power ...
Wei-Yu Chen, Dan Bonachea, Costin Iancu, Katherine...