Chapter 1 (Additional Questions): Evolution of Data Architectures

EnrapturedElf

12 Questions

A dimensional model is described by a _____ schema

star

A star schema has a ____ table, which is the primary, or central, table for the schema

fact

A ________ provides the context surrounding the selected business process. In a sales scenario example, the list of dimensions could include product, customer, salesperson, and store

dimension

Data lakes require very large, scalable ______ systems, like the ones typically offered in cloud environments

storage

High amounts of ______ power are required to process the large amounts of data stored in the storage layer

compute

The shape of the data on disk defines the storage _______.

format

Modern, cloud-based storage systems maintain _______ (i.e., contextual information about the data)

metadata

Delta Lake will make sure that all data lake transactions using Spark, or any other processing engine, are committed for durability and exposed to other readers in an atomic fashion

True

Traditional data lakes support transactional, atomic updates to the data. Delta Lake fully supports all DML operations, including deletes and updates, and complex data merge or upsert scenarios

False

The Delta Lake transaction log records every change made to the data, in the order that these changes were made.

True

Delta Lake cannot work with batch and streaming sinks or sources

False

Delta Lake enforces a schema when writing or reading data from the lake. However, when explicitly enabled for a data entity, it allows for a safe schema evolution, enabling use cases where the data needs to evolve

True

Study Notes

Dimensional Modeling

  • A dimensional model is described by a star schema
  • The fact table is the primary, or central, table of the schema
  • A dimension provides the context surrounding the selected business process
  • Dimensions in a sales scenario example include product, customer, salesperson, and store
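
The star schema described above can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, and sample rows are assumptions, and the salesperson dimension is left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: the context around the business process
    -- (hypothetical names and columns)
    CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_store    (store_id    INTEGER PRIMARY KEY, city TEXT);

    -- Fact table: the central table, one row per sale, keyed to each dimension
    CREATE TABLE fact_sales (
        sale_id     INTEGER PRIMARY KEY,
        product_id  INTEGER REFERENCES dim_product(product_id),
        customer_id INTEGER REFERENCES dim_customer(customer_id),
        store_id    INTEGER REFERENCES dim_store(store_id),
        amount      REAL
    );

    INSERT INTO dim_product  VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO dim_store    VALUES (1, 'Oslo'), (2, 'Lima');
    INSERT INTO fact_sales   VALUES
        (1, 1, 1, 1, 10.0),
        (2, 2, 1, 2, 25.0),
        (3, 1, 2, 1, 10.0);
""")

# A typical star-schema query: aggregate the facts, grouped by a dimension attribute.
sales_by_product = conn.execute("""
    SELECT p.name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()

print(sales_by_product)
```

The fact table sits in the middle and each dimension hangs off it through a key, which is what gives the schema its star shape.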

Data Lakes

  • Data lakes require very large, scalable storage systems, like the ones typically offered in cloud environments
  • High amounts of processing power are required to process the large amounts of data stored in the storage layer

Data Storage

  • The shape of the data on disk defines the storage format
  • Modern, cloud-based storage systems maintain metadata (i.e., contextual information about the data)
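
As a rough illustration of the two points above, the sketch below writes the same records in two different shapes and builds a small metadata record describing them. The row-oriented versus column-oriented contrast, the helper names, and the sample data are all assumptions chosen for the example, not something the source specifies.

```python
import csv
import io
import json

# Hypothetical sample records
rows = [
    {"id": 1, "city": "Oslo", "amount": 10.0},
    {"id": 2, "city": "Lima", "amount": 25.0},
]


def to_row_format(records):
    """Row-oriented layout: each record is stored contiguously (CSV here)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def to_column_format(records):
    """Column-oriented layout: all values of one column are stored together."""
    columns = {name: [r[name] for r in records] for name in records[0]}
    return json.dumps(columns)


def describe(records, fmt):
    """Metadata: contextual information about the data, kept beside the data."""
    return {
        "format": fmt,
        "num_rows": len(records),
        "columns": {name: type(value).__name__ for name, value in records[0].items()},
    }


row_text = to_row_format(rows)
col_text = to_column_format(rows)
metadata = describe(rows, "csv")

print(row_text)
print(col_text)
print(metadata)
```

Both outputs hold the same information; only the shape differs, and the metadata lets a reader interpret the stored data without scanning all of it first.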

Delta Lake

  • Delta Lake ensures all data lake transactions using Spark, or any other processing engine, are committed for durability and exposed to other readers in an atomic fashion
  • Delta Lake fully supports all DML operations, including deletes and updates, and complex data merge or upsert scenarios
  • The Delta Lake transaction log records every change made to the data, in the order that these changes were made
  • Delta Lake enforces a schema when writing or reading data from the lake
  • Delta Lake allows for safe schema evolution when explicitly enabled for a data entity, enabling use cases where the data needs to evolve
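
The behaviors listed above can be mimicked with a toy, in-memory model. This is not Delta Lake's API or on-disk format: the `ToyTable` class, its method names, and the `evolve_schema` flag are hypothetical and exist only to illustrate an ordered transaction log, all-or-nothing commits, schema enforcement, and opt-in schema evolution.

```python
class SchemaError(ValueError):
    """Raised when a write does not match the table schema."""


class ToyTable:
    """A toy stand-in for a table backed by an ordered transaction log."""

    def __init__(self, schema):
        self.schema = dict(schema)  # column name -> Python type
        self.log = []               # committed transactions, in commit order

    def write(self, rows, evolve_schema=False):
        rows = [dict(r) for r in rows]
        schema = dict(self.schema)  # work on a copy until the commit succeeds
        for row in rows:
            for name, value in row.items():
                if name not in schema:
                    if not evolve_schema:
                        raise SchemaError(f"unknown column: {name}")
                    schema[name] = type(value)  # evolution: add the new column
                elif not isinstance(value, schema[name]):
                    raise SchemaError(f"bad type for column: {name}")
        # Commit: a single append makes the whole batch visible at once.
        self.schema = schema
        self.log.append({"version": len(self.log), "op": "append", "rows": rows})

    def delete(self, predicate):
        self.log.append({"version": len(self.log), "op": "delete", "predicate": predicate})

    def read(self):
        """Replay the log in order to rebuild the current state."""
        state = []
        for entry in self.log:
            if entry["op"] == "append":
                state.extend(entry["rows"])
            else:
                state = [r for r in state if not entry["predicate"](r)]
        return state


table = ToyTable({"id": int, "city": str})
table.write([{"id": 1, "city": "Oslo"}, {"id": 2, "city": "Lima"}])

try:
    # Rejected as a whole: the second row violates the schema, so nothing is committed.
    table.write([{"id": 3, "city": "Rome"}, {"id": "4", "city": "Bonn"}])
    rejected = False
except SchemaError:
    rejected = True

# A new column is accepted only when evolution is explicitly enabled.
table.write([{"id": 3, "city": "Rome", "country": "IT"}], evolve_schema=True)
table.delete(lambda row: row["id"] == 1)

current = table.read()
versions = [entry["version"] for entry in table.log]
print(rejected, versions, current)
```

Because the state is rebuilt by replaying the log in order, every change is recorded in sequence, and a failed write leaves no trace for other readers.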

Explore the challenges faced by organizations with monolithic architecture and the purpose of data warehouses in storing metadata and contextual information. Learn how these concepts are essential for efficient data management.
