ALPS is a per-application user-level proportional-share scheduler that operates with low overhead and without any special kernel support. ALPS is useful to a range of applications...
Abstract. A new approach for acquiring knowledge of parallel applications regarding resource usage and for searching similarity on workload traces is presented. The main goal is to...
Abstract. Association rule mining is a data mining technique that reveals interesting relationships in a database. Existing approaches employ different parameters to search for int...
To improve data availability and resilience MapReduce frameworks use file systems that replicate data uniformly. However, analysis of job logs from a large production cluster show...
A fundamental problem in data management is to draw a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streamin...
Graham Cormode, S. Muthukrishnan, Ke Yi, Qin Zhang