Podcast
Questions and Answers
What is the purpose of adding Time to the Date dimension in the warehouse?
What is the purpose of adding Time to the Date dimension in the warehouse?
- To analyze which promotions drive the most sales
- To support tracking promotions and their effects on sales
- To capture transaction information to the minute (correct)
- To determine the size of the Date_Dim for 10 years of data
What is the main reason for creating a Promotion dimension in the warehouse?
What is the main reason for creating a Promotion dimension in the warehouse?
- To analyze the effectiveness of promotions on sales (correct)
- To capture transaction information in the OLTP Sales Transaction table
- To determine the size of the Date_Dim for 10 years of data
- To support tracking Time of Day in the warehouse
What is an example of a causal dimension in the warehouse?
What is an example of a causal dimension in the warehouse?
- Date
- Transaction ID
- Promotion (correct)
- Time of Day
What is a challenge when analyzing the effectiveness of promotions?
What is a challenge when analyzing the effectiveness of promotions?
What is the grain of the revised schema in the warehouse?
What is the grain of the revised schema in the warehouse?
What is an attribute of the Promotion dimension?
What is an attribute of the Promotion dimension?
What is a potential issue with promotions in the warehouse?
What is a potential issue with promotions in the warehouse?
Why is it important to create a 'No Promotion' type in the Promotion dimension?
Why is it important to create a 'No Promotion' type in the Promotion dimension?
What is the primary goal of the theme 'To grow sales across all market segments and product lines'?
What is the primary goal of the theme 'To grow sales across all market segments and product lines'?
What is the business process being modeled in the Cardinal Merch case study?
What is the business process being modeled in the Cardinal Merch case study?
What does the 'grain' of the business process convey?
What does the 'grain' of the business process convey?
What is the reason for choosing the most granular or atomic information captured by the business process?
What is the reason for choosing the most granular or atomic information captured by the business process?
What is the primary key in the DateDim table?
What is the primary key in the DateDim table?
What is the purpose of declaring the grain of the business process?
What is the purpose of declaring the grain of the business process?
What is a critical success factor?
What is a critical success factor?
What is the foreign key in the SalesFact table that links to the ProductDim table?
What is the foreign key in the SalesFact table that links to the ProductDim table?
What is the purpose of the query in the context?
What is the purpose of the query in the context?
What is the primary key in the StoreDim table?
What is the primary key in the StoreDim table?
What type of data is stored in the SalesFact table?
What type of data is stored in the SalesFact table?
What is the final step in the dimensional modeling process?
What is the final step in the dimensional modeling process?
What is the relationship between the SalesFact table and the DateDim table?
What is the relationship between the SalesFact table and the DateDim table?
What is the purpose of the ProductDim table?
What is the purpose of the ProductDim table?
What is the approximate number of rows in the Product Dimension when accounting for different merchandising schemes across stores and historical products?
What is the approximate number of rows in the Product Dimension when accounting for different merchandising schemes across stores and historical products?
What is the natural key in the Product Dimension?
What is the natural key in the Product Dimension?
What is the hierarchy of the Product Dimension?
What is the hierarchy of the Product Dimension?
What is an example of a geographic hierarchy in the Store Dimension?
What is an example of a geographic hierarchy in the Store Dimension?
What is the formula for Gross Margin?
What is the formula for Gross Margin?
What should we choose to store as a fact, Gross Profit or Gross Margin?
What should we choose to store as a fact, Gross Profit or Gross Margin?
What is an example of an attribute in the Product Dimension?
What is an example of an attribute in the Product Dimension?
What is the purpose of the Store Dimension?
What is the purpose of the Store Dimension?
What is the purpose of a junk dimension?
What is the purpose of a junk dimension?
How many rows would a junk dimension have if it had 3 genders, 100 ages, 3 marital statuses, 3 return statuses, and 3 child under 15 statuses?
How many rows would a junk dimension have if it had 3 genders, 100 ages, 3 marital statuses, 3 return statuses, and 3 child under 15 statuses?
What is a consideration when creating a junk dimension?
What is a consideration when creating a junk dimension?
Why might you want to remove some unrealistic combinations from a junk dimension?
Why might you want to remove some unrealistic combinations from a junk dimension?
What is an example of an unrealistic combination that might be removed from a junk dimension?
What is an example of an unrealistic combination that might be removed from a junk dimension?
How many rows are typically appropriate for a junk dimension?
How many rows are typically appropriate for a junk dimension?
What is the benefit of using a junk dimension with a smaller number of rows?
What is the benefit of using a junk dimension with a smaller number of rows?
What is a characteristic of a junk dimension?
What is a characteristic of a junk dimension?
Study Notes
DIM_DEMOGRAPHICS
- Pre-loaded with every combination of responses
- Columns: DemoKey, Gender, Age, AgeGroup, MaritalStatus, ReturnStatus, Child15Status
- Contains 8,100 rows (calculated by multiplying the number of potential values for each attribute)
Junk Dimensions
- Calculated by taking the total number of potential values for each attribute
- Should be restricted to a total potential size of up to 20,000 rows
- May want to remove unrealistic combinations, but this adds complexity to the load process
Data Model
- The Query: selects sum of QuantitySold and sum of NetPrice from SalesFact, filtered by DivisionName, CategoryName, and CharDate
- Critical Success Factors: a central goal, e.g. to grow sales across all market segments and product lines
- Business Questions: specific questions tied to data, e.g. how many men's statement shirts were sold in the Philippines during NCAA semifinals 2016 and what was the total net sales?
Dimensional Modeling Process
- Consists of four main steps: select the business process to model, declare the grain of the business process, choose the dimensions that apply to each fact table row, and identify the facts
- Dimensional modeling is both science and art
Cardinal Merch Case Study
- Step 1: select the business process to model, e.g. POS retail sales
- Step 2: declare the grain of the business process, specifying the level of detail associated with fact table measurements
- Choose the most granular or atomic information captured by the business process
Date Dimension
- Example attributes: Date, CharDate, etc.
- Grain conveys the level of detail associated with fact table measurements
Product Dimension
- Example attributes: SKU Number, UPC, Product Description, Brand Description, etc.
- Product dimension will contain about 150K rows when accounting for different merchandising schemes across stores and historical products
- Product hierarchy: SKU=>Brand=>Category=>Department
Store Dimension
- Example attributes: Store Number, Store Name, Store Street Address, etc.
- Represents primary geographic dimension
- Store hierarchies include: Store=>State, Store=>District=>Region, etc.
Additivity
- If CMI wants to look at gross margin, it's calculated by gross profit/sales dollar amount
- Should choose to store gross profit or gross margin as a fact
Simple Warehouse Data Model
- Consists of DateDim, ProductDim, SalesFact, and StoreDim tables
- Each table has primary keys and foreign keys to establish relationships
Time of Day
- If we add Time to the Date dimension, the size of the Date_Dim for 10 years of data to the minute would be significant
- Alternatively, we can add it as a new dimension, changing the grain to Store, Date, SKU, Transaction ID, Time
Promotion Dimension
- Example attributes: Promotion Code, Promotion Name, Price Reduction Type, etc.
- We can create a new dimension to support tracking promotions, with a promotion type 'No Promotion' to account for non-promotion sales
- Revised schema includes the Promotion dimension, making the grain Store, Date, SKU, Transaction ID, Time, Promotion
Studying That Suits You
Use AI to generate personalized quizzes and flashcards to suit your learning preferences.
Description
Test your knowledge of demographics with this quiz, covering gender, age, marital status, and more!