Predicting how well applications may run on modern systems is becoming increasingly challenging. It is no longer sufficient to look at number of floating point operations and commu...
Large–scale parallel applications performing global synchronization may spend a significant amount of execution time waiting for the completion of a barrier operation. Conseque...
This paper presents an analytical model to predict the performance of general-purpose applications on a GPU architecture. The model is designed to provide performance information ...
Sara S. Baghsorkhi, Matthieu Delahaye, Sanjay J. P...
With the emergence of large wired and wireless sensor networks, many real-time applications need to operate on continuous unbounded data streams. At the same time, many of these s...
Yuan Wei, Vibha Prasad, Sang Hyuk Son, John A. Sta...
Machine-learning algorithms are employed in a wide variety of applications to extract useful information from data sets, and many are known to suffer from superlinear increases in ...
Karthik Nagarajan, Brian Holland, Alan D. George, ...