Data Warehousing and Data Mart

PerfectSandDune avatar
PerfectSandDune
·
·
Download

Start Quiz

Study Flashcards

10 Questions

What is the primary benefit of using a hosted warehouse as an alternative to traditional data warehousing?

Reduced infrastructure costs

What is the main difference between the Star Schema and Relational data warehouse structures?

Dimensional modeling

What is the primary purpose of a data cube in dimensional modeling?

To enable fast querying and analysis

What is the term for the process of breaking down a data cube into smaller, more detailed parts?

Drill-down

What is the primary benefit of building a data warehouse incrementally?

Reduced project risk

What is the primary reason for managing user expectations during a data warehouse project?

To ensure user adoption

What is the term for the process of selecting a specific portion of a data cube for analysis?

Slicing

What is the primary benefit of having a business-supplier relationship during a data warehouse project?

Better communication and collaboration

What is the primary reason for only loading high-quality data into a data warehouse?

To ensure accurate business decisions

What is the primary benefit of being politically aware during a data warehouse project?

Reduced project risk

Study Notes

Data Warehousing

  • A data warehouse is a physical repository where relational data is specially organized to provide enterprise-wide, cleansed data in a standardized format.
  • Characteristics of a data warehouse:
    • Subject-oriented
    • Integrated
    • Time-variant (time series)
    • Nonvolatile
    • Summarized
    • Not normalized
    • Metadata
    • Web-based, relational/multi-dimensional
    • Client-server
    • Real-time and/or right-time (active)

Data Mart

  • A data mart is a departmental data warehouse that stores only relevant data.
  • Types of data marts:
    • Dependent data mart: a subset created directly from a data warehouse
    • Independent data mart: a small data warehouse designed for a strategic business unit or a department

Data Warehousing Definitions

  • Operational Data Stores (ODS): a type of database often used as an interim area for a data warehouse
  • Oper Marts: an operational data mart
  • Data Acquisition Software (back-end): software that extracts data from various sources
  • Two-tier architecture: combines the first two tiers of a three-tier architecture into one

Data Warehousing Architectures

  • Three-tier architecture:
    • Tier 1: Client workstation
    • Tier 2: Application server
    • Tier 3: Database server
  • Alternative architectures:
    • Independent Data Marts Architecture
    • Data Mart Bus Architecture with Linked Dimensional Datamarts
    • Hub and Spoke Architecture (Corporate Information Factory)
    • Centralized Data Warehouse Architecture
    • Federated Architecture
  • Factors affecting architecture selection:
    • Information interdependence between organizational units
    • Upper management's information needs
    • Urgency of need for a data warehouse
    • Nature of end-user tasks
    • Constraints on resources
    • Strategic view of the data warehouse prior to implementation
    • Compatibility with existing systems
    • Perceived ability of the in-house IT staff
    • Technical issues
    • Social/political factors

Data Integration and ETL Process

  • Data integration: comprises three major processes: data access, data federation, and change capture
  • ETL (Extraction, Transformation, and Load) process:
    • Extraction: extracting data from various sources
    • Transformation: transforming data into a standardized format
    • Load: loading data into a data warehouse

Benefits of Data Warehousing

  • Direct benefits:
    • Allows end users to perform extensive analysis
    • Allows a consolidated view of corporate data
    • Better and more timely information
    • Enhanced system performance
    • Simplification of data access
  • Indirect benefits:
    • Enhance business knowledge
    • Present competitive advantage
    • Enhance customer service and satisfaction
    • Facilitate decision making
    • Help in reforming business processes

Data Warehouse Development

  • Approaches:
    • Inmon Model: EDW approach (top-down)
    • Kimball Model: Data mart approach (bottom-up)
  • Data warehouse structure:
    • Star Schema (a.k.a. Dimensional Modeling)
    • Real-time data warehousing

Best Practices for Implementing DW

  • The project must fit with corporate strategy
  • There must be complete buy-in to the project
  • It is important to manage user expectations
  • The data warehouse must be built incrementally
  • Adaptability must be built in from the start
  • The project must be managed by both IT and business professionals
  • Only load data that have been cleansed/high quality
  • Do not overlook training requirements
  • Be politically aware

Learn about the characteristics of a data warehouse and data mart, including their definitions, features, and applications in data management.

Make Your Own Quizzes and Flashcards

Convert your notes into interactive study material.

Get started for free

More Quizzes Like This

Use Quizgecko on...
Browser
Browser