12 Questions
A dimensional model is described by a _____ schema
star
A ___ table is the primary, or central, table for the schema
fact
A ________ provides the context surrounding the selected business process. In a sales scenario example, the list of dimensions could include product, customer, salesperson, and store
dimension
Data lakes require very large, scalable ______ systems, like the ones typically offered in cloud environments
storage
High amounts of ______ power are required to process the large amounts of data stored in the storage layer
compute
The shape of the data on disk defines the storage _______.
format
Modern, cloud-based storage systems maintain _______ (i.e., contextual information about the data)
metadata
Delta Lake will make sure that all data lake transactions using Spark, or any other processing engine, are committed for durability and exposed to other readers in an atomic fashion
True
Traditional data lakes support transactional, atomic updates to the data. Delta Lake fully supports all DML operations, including deletes and updates, and complex data merge or upsert scenarios
False
The Delta Lake transaction log records every change made to the data, in the order that these changes were made.
True
Delta Lake cannot work with batch and streaming sinks or sources
False
Delta Lake enforces a schema when writing or reading data from the lake. However, when explicitly enabled for a data entity, it allows for a safe schema evolution, enabling use cases where the data needs to evolve
True
Study Notes
Dimensional Modeling
- A dimensional model is described by a star schema
- A fact table is the primary, or central, table for the schema
- A dimension provides the context surrounding the selected business process
- Dimensions in a sales scenario example include product, customer, salesperson, and store
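The notes above can be made concrete with a small sketch. It uses Python's built-in sqlite3 module to build one central fact table surrounded by dimension tables, then joins the fact table to a dimension to answer a business question. The table names, column names, and sample rows (fact_sales, dim_product, and so on) are assumptions made up for illustration; they do not come from the quiz.

```python
import sqlite3


def build_star_schema(conn):
    """Create one fact table and three dimension tables, then load sample rows."""
    conn.executescript(
        """
        -- Dimension tables: the context around each sale.
        CREATE TABLE dim_product  (product_id  INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_customer (customer_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE dim_store    (store_id    INTEGER PRIMARY KEY, city TEXT);

        -- Fact table: the central table, holding measures plus keys to dimensions.
        CREATE TABLE fact_sales (
            sale_id     INTEGER PRIMARY KEY,
            product_id  INTEGER REFERENCES dim_product(product_id),
            customer_id INTEGER REFERENCES dim_customer(customer_id),
            store_id    INTEGER REFERENCES dim_store(store_id),
            amount      REAL
        );
        """
    )
    conn.executemany("INSERT INTO dim_product VALUES (?, ?)",
                     [(1, "Laptop"), (2, "Phone")])
    conn.executemany("INSERT INTO dim_customer VALUES (?, ?)",
                     [(1, "Ana"), (2, "Ben")])
    conn.executemany("INSERT INTO dim_store VALUES (?, ?)",
                     [(1, "Lisbon"), (2, "Oslo")])
    conn.executemany("INSERT INTO fact_sales VALUES (?, ?, ?, ?, ?)",
                     [(1, 1, 1, 1, 1000.0),
                      (2, 2, 1, 2, 500.0),
                      (3, 1, 2, 2, 1200.0)])


def sales_by_product(conn):
    """Total sales amount per product: join the fact table to one dimension."""
    return conn.execute(
        """
        SELECT p.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_id = f.product_id
        GROUP BY p.name
        ORDER BY p.name
        """
    ).fetchall()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    build_star_schema(connection)
    print(sales_by_product(connection))
```

Swapping dim_product for dim_store or dim_customer in the join gives the same totals sliced by a different dimension, which is the point of keeping the measures in one central table.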
Data Lakes
- Data lakes require very large, scalable storage systems, like the ones typically offered in cloud environments
- High amounts of processing power are required to process the large amounts of data stored in the storage layer
Data Storage
- The shape of the data on disk defines the storage format
- Modern, cloud-based storage systems maintain metadata (i.e., contextual information about the data)
Delta Lake
- Delta Lake ensures all data lake transactions using Spark, or any other processing engine, are committed for durability and exposed to other readers in an atomic fashion
- Delta Lake fully supports all DML operations, including deletes and updates, and complex data merge or upsert scenarios
- The Delta Lake transaction log records every change made to the data, in the order that these changes were made
- Delta Lake enforces a schema when writing or reading data from the lake
- Delta Lake allows for safe schema evolution when explicitly enabled for a data entity, enabling use cases where the data needs to evolve
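To illustrate the ideas in these notes, here is a toy, in-memory sketch of an ordered transaction log with schema enforcement and opt-in schema evolution. It is not Delta Lake's implementation or API: the ToyTable class, its methods, and the merge_schema flag are hypothetical names invented for this example, and real Delta tables persist their log and data files in storage rather than in a Python list.

```python
import json


class ToyTable:
    """Toy stand-in for a table whose state is defined by an ordered commit log."""

    def __init__(self, schema):
        self.schema = dict(schema)  # column name -> type name, e.g. {"id": "int"}
        self.log = []               # one JSON string per committed transaction

    def append(self, rows, merge_schema=False):
        """Commit all rows as a single log entry, or raise and commit nothing."""
        new_schema = dict(self.schema)
        for row in rows:
            for column, value in row.items():
                type_name = type(value).__name__
                if column not in new_schema:
                    if not merge_schema:
                        # Schema enforcement: unknown columns are rejected.
                        raise ValueError(f"unknown column: {column}")
                    # Schema evolution: allowed only when explicitly enabled.
                    new_schema[column] = type_name
                elif new_schema[column] != type_name:
                    raise ValueError(f"type mismatch for column: {column}")
        # Every row passed validation, so the whole batch becomes visible at once.
        self.schema = new_schema
        self.log.append(json.dumps({"version": len(self.log), "add": rows}))

    def snapshot(self, version=None):
        """Rebuild the rows by replaying the log in order, up to `version` inclusive."""
        entries = self.log if version is None else self.log[: version + 1]
        rows = []
        for entry in entries:
            rows.extend(json.loads(entry)["add"])
        return rows


if __name__ == "__main__":
    table = ToyTable({"id": "int", "name": "str"})
    table.append([{"id": 1, "name": "Ana"}])
    try:
        # The second row has an unexpected column, so neither row is committed.
        table.append([{"id": 2, "name": "Ben"},
                      {"id": 3, "name": "Cy", "email": "cy@example.com"}])
    except ValueError as error:
        print("rejected:", error)
    table.append([{"id": 2, "name": "Ben", "email": "ben@example.com"}],
                 merge_schema=True)
    print(len(table.log), table.schema)
```

Because a failed batch never reaches the log, readers that replay the log see either all of a transaction or none of it, and replaying only the first entries reproduces an earlier version of the table.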