Automated dimensionality reduction of data warehouses

A data warehouse is designed to consolidate and maintain all attributes that are relevant to the analysis processes. Given the rapid growth in the size of modern operational systems, it is neither practical nor necessary to load and maintain every operational attribute in the data warehouse. This paper presents a novel methodology for automated selection of the most relevant independent attributes in a data warehouse. The method is based on the information-theoretic approach to knowledge discovery in databases. Attributes are selected by a stepwise forward procedure aimed at minimizing the uncertainty in the values of key performance indicators (KPIs). Each selected attribute is assigned a score expressing its degree of relevance. The method requires no prior expertise in the domain of the data, and it applies equally to nominal and ordinal attributes. An attribute will be included in a data warehouse schema if it is found relevant to at l...
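To make the stepwise forward procedure concrete, here is a minimal Python sketch of greedy attribute selection that reduces the conditional entropy of a single KPI. It is an illustration, not the authors' algorithm: the function names, the row-as-dictionary data layout, the use of the entropy reduction as the relevance score, and the `min_gain` stopping threshold are all assumptions made for this example.

```python
from collections import Counter
from math import log2


def conditional_entropy(rows, attrs, target):
    """Estimate H(target | attrs) in bits from frequency counts.

    With an empty attrs list this reduces to the plain entropy H(target).
    """
    n = len(rows)
    joint = Counter((tuple(r[a] for a in attrs), r[target]) for r in rows)
    cond = Counter(tuple(r[a] for a in attrs) for r in rows)
    return -sum(c / n * log2(c / cond[key]) for (key, _), c in joint.items())


def forward_select(rows, candidates, target, min_gain=0.01):
    """Greedy forward selection (hypothetical helper).

    At each step, add the attribute that most reduces the remaining
    uncertainty in the target. Stop when the best reduction falls below
    min_gain, an assumed threshold chosen only for this sketch.
    Returns the selected attributes and a score (entropy reduction in bits)
    for each of them.
    """
    selected, scores = [], {}
    remaining = list(candidates)
    current = conditional_entropy(rows, selected, target)
    while remaining:
        best = min(
            remaining,
            key=lambda a: conditional_entropy(rows, selected + [a], target),
        )
        h = conditional_entropy(rows, selected + [best], target)
        gain = current - h
        if gain < min_gain:
            break
        selected.append(best)
        scores[best] = gain
        remaining.remove(best)
        current = h
    return selected, scores
```

Calling `forward_select(rows, ["region", "noise"], "kpi")` on a list of dictionaries returns the attributes in the order they were chosen, along with the uncertainty each one removed. A frequency-based entropy estimate like this one tends to overstate the value of attributes with many distinct values on small samples, so a practical version would need a more careful stopping rule than a fixed threshold.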
Mark Last, Oded Maimon
Added 01 Nov 2010
Updated 01 Nov 2010
Type Conference
Year 2000
Where DMDW