Abstract-- We investigate the problem of clustering on distributed data streams. In particular, we consider the k-median clustering on stream data arriving at distributed sites whi...
Present databases, whether on centralized or parallel DBMSs, do not deal well with scalability. We present an architecture for Wintel multicomputers termed AMOS-SDDS, coupling a h...
Yakham Ndiaye, Aly Wane Diene, Witold Litwin, Tore...
Statistical analysis of massive data is becoming indispensable to science, commerce, and society today. Such analysis requires efficient, flexible storage support and special optim...
There is a rapidly growing set of applications, referred to as data driven applications, in which analysis of large amounts of data drives the next steps taken by the scientist, e...
We describe a methodology that enables the real-time diagnosis of performance problems in complex high-performance distributed systems. The methodology includes tools for generati...
Brian Tierney, William E. Johnston, Brian Crowley,...