Unity Catalog part 1/2

Questions and Answers

What is the main benefit of using the Unity Catalog in Databricks?

  • It provides unified governance for data, analytics, and AI. (correct)
  • It does not support ANSI SQL.
  • It requires hard migration of existing data.
  • It allows physical separation of storage and control.

What element is NOT part of the metastore in the Unity Catalog?

  • External location
  • Shared storage (correct)
  • Storage credentials
  • List of schemas

In the Unity Catalog architecture, what does the cloud storage component return?

  • Audit logs of data queries.
  • The user's credentials for access.
  • The data requested by the compute resource. (correct)
  • Short-lived tokens for access.

Which access mode does NOT support Unity Catalog?

  • No isolation shared (correct)

What function does data lineage serve in data governance?

  • To capture upstream sources and downstream consumers. (correct)

Which statement best describes the function of the audit log in the query life cycle?

  • It captures metadata and namespaces checked by Databricks. (correct)

What is a key characteristic of the Unity Catalog's security model?

  • It implements a unified access control system across different workspaces. (correct)

When using the Unity Catalog, what role does the principal play in the query life cycle?

  • It initiates the query to access data. (correct)

What does data access control in data governance ensure?

  • It defines who can access or manage specific data sets. (correct)

Which of the following best describes the functionality of the catalog in Unity Catalog?

  • It combines various schemas, tables, views, and functions into an organized structure. (correct)

Flashcards

Unity Catalog

A Databricks feature that provides unified governance for data, analytics, and AI across clouds, using ANSI SQL.

Data Governance

A set of policies and practices to control who accesses data, audit access, track data lineage, and discover data assets.

Data Access Control

The process of controlling who has access to specific data.

Data Access Audit

The process of recording and tracking all data access requests.

Data Lineage

The process of tracking the history of data from its sources to its consumers.

Data Discovery

The ability to search for and find authorized data assets.

Metastore

The top-level logical container in Unity Catalog, containing storage credentials and references to data assets.

Catalog

A container of schemas (databases) in the Unity Catalog, which in turn contain tables, views, and functions.

SQL Query Access

Accessing data in Unity Catalog using SQL statements specifying the catalog, schema, and object.

Unity Catalog Security Model

A unified access control model for Unity Catalog whose access control lists and security policies can be reused across workspaces.

Query Life Cycle (Unity Catalog)

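The sequence of steps in executing a query against Unity Catalog: a principal issues the query, permissions are checked, data is retrieved from cloud storage, and the result is returned.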

Cluster Access Mode

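A cluster setting that determines whether the cluster can use Unity Catalog: single-user and shared modes support it, while no-isolation shared mode does not.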

Study Notes

Unity Catalog Overview

  • Unity Catalog is a data governance tool for data, analytics, and AI.
  • It offers fine-grained governance across multiple cloud environments.
  • Supports open standards like ANSI SQL.
  • Unifies data and AI assets for central management and access.
  • Works with existing data, storage, and catalogs without migration.

Data Governance Features

  • Data access control: Controls who can access specific data.
  • Data access audit: Records all data access activity.
  • Data lineage: Tracks the origin and flow of data.
  • Data discovery: Enables searching for and finding authorized data assets.
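Of the four features above, data lineage is the easiest to picture as a data structure: a directed graph whose edges run from a source table to a table derived from it. The sketch below is a toy model only, not how Unity Catalog stores lineage; the class `LineageGraph` and the table names are hypothetical.

```python
from collections import defaultdict


class LineageGraph:
    """Toy lineage store: each edge runs from a source table to a derived table."""

    def __init__(self):
        self._downstream = defaultdict(set)  # source -> tables derived from it
        self._upstream = defaultdict(set)    # target -> tables it was derived from

    def record(self, source, target):
        """Record that `target` was produced from `source`."""
        self._downstream[source].add(target)
        self._upstream[target].add(source)

    @staticmethod
    def _walk(start, edges):
        """Collect every table reachable from `start` by following `edges`."""
        seen = set()
        stack = [start]
        while stack:
            node = stack.pop()
            for nxt in edges.get(node, ()):
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        return seen

    def upstream(self, table):
        """All direct and indirect sources of `table`."""
        return self._walk(table, self._upstream)

    def downstream(self, table):
        """All direct and indirect consumers of `table`."""
        return self._walk(table, self._downstream)


# Hypothetical three-table pipeline.
graph = LineageGraph()
graph.record("main.raw.orders", "main.clean.orders")
graph.record("main.clean.orders", "main.reports.daily_sales")
print(sorted(graph.upstream("main.reports.daily_sales")))
print(sorted(graph.downstream("main.raw.orders")))
```

Keeping both edge directions makes the two questions lineage answers (where did this data come from, and who consumes it) equally cheap to ask.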

Unity Catalog Architecture

  • Metastore: The top-level logical container in Unity Catalog.
    • Contains storage credentials.
    • Defines external locations.
    • Houses catalogs, whose schemas (databases) organize tables, views, and functions.
  • Catalog: A logical container for schemas, which in turn hold tables, views, and functions; it is the first level of the three-level namespace.
    • An object is addressed as catalog.schema.table.
  • Hive Metastore: A special catalog for legacy access to data.
  • Workspaces: Different workspaces can reuse access control lists and security policies.
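The three-level namespace described above can be illustrated with a small helper that splits a fully qualified name into its catalog, schema, and object parts. This is a minimal sketch: `ObjectName`, `parse_object_name`, `select_all`, and the example table names are hypothetical, and the parser assumes plain identifiers without backtick quoting or embedded dots.

```python
from typing import NamedTuple


class ObjectName(NamedTuple):
    catalog: str
    schema: str
    name: str


def parse_object_name(qualified: str) -> ObjectName:
    """Split 'catalog.schema.object' into its three levels."""
    parts = qualified.split(".")
    if len(parts) != 3 or not all(parts):
        raise ValueError(
            f"expected catalog.schema.object, got {qualified!r}"
        )
    return ObjectName(*parts)


def select_all(qualified: str) -> str:
    """Build a SELECT statement that names all three levels explicitly."""
    obj = parse_object_name(qualified)
    return f"SELECT * FROM {obj.catalog}.{obj.schema}.{obj.name}"


# Hypothetical table in a hypothetical catalog named 'main'.
print(parse_object_name("main.sales.orders"))
print(select_all("main.sales.orders"))
```

A two-part name such as `sales.orders` raises `ValueError` here, which mirrors the point of the notes: the catalog is an explicit first level rather than an implied default.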

Security Model

  • Query life cycle: A principal submits a query, Unity Catalog checks its permissions, the compute resource retrieves the data from cloud storage, and the result is returned to the user.
  • Principal checks: Databricks verifies the principal's authentication and permissions.
  • Cloud storage: Short-lived tokens and signed URLs secure the data retrieval.
  • Compute resources: Unity Catalog support depends on the cluster access mode.
    • Single-user mode: Unity Catalog is supported.
    • Shared mode: Unity Catalog is supported.
    • Shared (no isolation) mode: Unity Catalog is not supported.
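The life cycle and access-mode rules above can be walked through with a toy simulation. It is a sketch under stated assumptions, not Databricks code: `ToyCatalog`, the mode keys, the 900-second token lifetime, and the in-memory `storage` dictionary standing in for cloud storage are all hypothetical.

```python
import secrets
import time
from dataclasses import dataclass, field

# Access modes from the notes; True means Unity Catalog is supported.
# The key spellings are illustrative, not official identifiers.
ACCESS_MODE_SUPPORT = {
    "single_user": True,
    "shared": True,
    "no_isolation_shared": False,
}


@dataclass
class ToyCatalog:
    grants: dict    # (principal, table) -> set of privileges
    storage: dict   # table -> rows; stands in for cloud storage
    audit_log: list = field(default_factory=list)
    token_ttl_seconds: int = 900  # assumed lifetime, for illustration only

    def run_query(self, principal, table, access_mode):
        # 1. The compute resource must run in a supported access mode.
        if not ACCESS_MODE_SUPPORT.get(access_mode, False):
            raise RuntimeError(f"access mode {access_mode!r} is not supported")
        # 2. Check the principal's permissions and record the request.
        allowed = "SELECT" in self.grants.get((principal, table), set())
        self.audit_log.append(
            {"principal": principal, "table": table, "allowed": allowed}
        )
        if not allowed:
            raise PermissionError(f"{principal} may not read {table}")
        # 3. Issue a short-lived token for the storage request.
        token = {
            "value": secrets.token_hex(8),
            "expires_at": time.time() + self.token_ttl_seconds,
        }
        # 4. Storage returns the requested data to the compute resource.
        return self._read_storage(table, token)

    def _read_storage(self, table, token):
        if time.time() >= token["expires_at"]:
            raise PermissionError("token expired")
        return self.storage[table]


catalog = ToyCatalog(
    grants={("alice", "main.sales.orders"): {"SELECT"}},
    storage={"main.sales.orders": [{"id": 1}]},
)
print(catalog.run_query("alice", "main.sales.orders", "single_user"))
```

In this sketch both granted and denied requests are appended to `audit_log`, while a request from an unsupported access mode is rejected before any permission check takes place.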
