Data Cube - Multidimensional Data Modeling
Summary
This document introduces data cubes and multidimensional data modeling. It explains how a data cube represents data along several dimensions, uses a rainfall data example to illustrate 2-D and 3-D views of the same data, and outlines the elements of a data cube and the operations defined on it.
Full Transcript
Data Cube
MULTIDIMENSIONAL DATA MODELLING

Concept of data cube
A multidimensional data model views data in the form of a cube. A data cube is characterized by two things:
Dimension: the perspectives or entities with respect to which an organization wants to keep records.
Fact: the actual values in the record.

Example: rainfall data of the Meteorological Department, with the dimensions
Time (Year, Season, Month, Week, Day, etc.)
Location (Country, Region, State, etc.)

2-D view of rainfall data
In this 2-D representation, the rainfall for the “North-East” region is shown with respect to different months over a period of years.

3-D view of rainfall data
Suppose we want to represent the data according to time (Year, Month) as well as the regions of a country, say East, West, North, North-East, etc.
Figure: A 2-D view of 3-D rainfall data
Figure: 3-D view of the rainfall data

Elements of a data cube
A data cube is a multi-dimensional data structure.
A data cube is characterized by its dimensions (e.g., Year, Month, Region).
Each dimension is associated with corresponding attributes (e.g., the attributes of Region are East, West, North-East, etc.).
All dimensions connect in order to create a certain fact.
A fact has a corresponding measure in the data cube (e.g., rainfall is measured in cm).

Operations on data cube
Rollup – decreases dimensionality by aggregating data along a certain dimension, using functions such as sum, average, or standard deviation.
Drill-down – increases dimensionality by splitting the data further (e.g., a month can be further split into weeks and days).
Slicing – decreases dimensionality by choosing a single value from a particular dimension (e.g., slicing the rainfall for a particular region).
Dicing – picks a subset of values from each dimension (e.g., taking the rainfall information for the months of May to July in the year 2007).
Pivoting – rotates the data cube.
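
The operations above can be sketched in a few lines of Python. This is only a minimal illustration, not a reference implementation: the cube is modelled as a dictionary that maps (year, month, region) coordinates to a rainfall measure in cm. The rainfall values are made up, and the names cube, DIMS, rollup, slice_cube, dice and pivot are hypothetical helpers introduced here. Drill-down is not shown because it needs finer-grained facts (e.g., weekly rainfall) than this toy cube stores.

```python
from collections import defaultdict

# Dimension names, in the order they appear in each cell key.
DIMS = ("year", "month", "region")

# Hypothetical rainfall facts in cm; the numbers are invented for illustration.
cube = {
    (2007, "May", "East"): 12.0,
    (2007, "May", "North-East"): 30.0,
    (2007, "Jun", "East"): 20.0,
    (2007, "Jun", "North-East"): 45.0,
    (2008, "May", "East"): 10.0,
    (2008, "May", "North-East"): 28.0,
    (2008, "Jun", "East"): 22.0,
    (2008, "Jun", "North-East"): 50.0,
}


def rollup(cube, drop):
    """Sum the measure over the dimension `drop` and remove it from the keys."""
    i = DIMS.index(drop)
    out = defaultdict(float)
    for key, value in cube.items():
        out[key[:i] + key[i + 1:]] += value
    return dict(out)


def slice_cube(cube, dim, value):
    """Fix dimension `dim` to a single value and remove it from the keys."""
    i = DIMS.index(dim)
    return {k[:i] + k[i + 1:]: v for k, v in cube.items() if k[i] == value}


def dice(cube, **allowed):
    """Keep cells whose coordinates lie in the allowed subset of each named dimension."""
    return {
        k: v
        for k, v in cube.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }


def pivot(cube, order):
    """Reorder the dimensions of every key, i.e. rotate the cube."""
    idx = [DIMS.index(d) for d in order]
    return {tuple(k[i] for i in idx): v for k, v in cube.items()}


# These helpers assume the full three-dimensional cube as input.
yearly = rollup(cube, "month")                     # total rainfall per (year, region)
north_east = slice_cube(cube, "region", "North-East")  # 2-D view: (year, month)
may_2007 = dice(cube, year={2007}, month={"May"})  # sub-cube, still 3-D keys
by_region = pivot(cube, ("region", "year", "month"))

print(yearly[(2007, "North-East")])  # 75.0
print(north_east[(2008, "Jun")])     # 50.0
print(sorted(may_2007))              # [(2007, 'May', 'East'), (2007, 'May', 'North-East')]
print(by_region[("East", 2007, "May")])  # 12.0
```

Slicing and rollup both return keys with one coordinate fewer, which mirrors the "decreases dimensionality" wording above, while dicing and pivoting keep all three coordinates.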