Sciweavers

PVLDB
2010

DataGarage: Warehousing Massive Performance Data on Commodity Servers

Contemporary datacenters house tens of thousands of servers. The servers are closely monitored for operating conditions and utilization by collecting their performance data (e.g., CPU utilization). In this paper, we show that existing database and file-system solutions are not suitable for warehousing performance data collected from a large number of servers because of the scale and the complexity of performance data. We describe the design and implementation of DataGarage, a performance data warehousing system that we have developed at Microsoft. DataGarage is a hybrid solution that combines the benefits of DBMSs, file-systems, and MapReduce systems to address the unique challenges of warehousing performance data. We describe how DataGarage allows efficient storage and analysis of years of historical performance data collected from many tens of thousands of servers, all on commodity servers. We also report DataGarage’s performance with a real dataset and a 32-node, 256-core shared-nothing...
Charles Loboz, Slawek Smyl, Suman Nath
Added 30 Jan 2011
Updated 30 Jan 2011
Type Journal
Year 2010
Where PVLDB