Databricks SQL Fundamentals Quiz
45 Questions

Questions and Answers

What is the purpose of the command 'CREATE SCHEMA accounting LOCATION 'dbfs:/accounting/data';' in Databricks SQL?

  • It creates a new database called accounting.
  • It defines the storage location for the database accounting. (correct)
  • It alters the existing schema named accounting.
  • It drops the existing schema named accounting.

Which option correctly completes the command to return the first item from a nested array column 'products' in the transactions table?

  • products.0
  • products[0] (correct)
  • products.1
  • products.first()

Which of the following locations is incorrect for storing data in the accounting database created in Databricks SQL?

  • dbfs:/accounting/data.db (correct)
  • dbfs:/user/hive/warehouse/accounting.db
  • dbfs:/accounting/data
  • dbfs:/accounting/data/accounting.db

In creating a visualization from the given query, which option does not determine the Y Axis configuration?

  • Settings -> User Settings -> Scaling (correct)

    Which visualization use case is most effectively executed using Databricks SQL compared to other visualization tools?

  • Complex aggregations across large datasets. (correct)

    What permissions must a new user have to become an owner of a SQL warehouse?

  • Allow Database Creation entitlement (correct)

    Which SQL command correctly updates 'Thomas' to 'Michel' in the Users table?

  • UPDATE Users SET LastName = 'Michel' WHERE LastName = 'Thomas' (correct)

    Which syntax is correct to delete all matching rows in a target table using a source table?

  • MERGE INTO target USING source ON target.key = source.key WHEN MATCHED THEN DELETE (correct)

    What characterizes discrete statistics?

  • It can take on only a finite number of values. (correct)

    Which of the following is NOT an expectation when managing data quality with Delta Live Tables?

  • A boolean statement that returns true based on conditions (correct)

    Which entitlement is crucial for a user to create new tables in a SQL warehouse?

  • Allow Table Create/Delete entitlement (correct)

    What is the correct term for updating existing data in a relational database?

  • UPDATE (correct)

    In SQL, which command is used to remove rows from a table based on a condition?

  • DELETE FROM (correct)

    What will be the result of the command: INSERT INTO stakeholders.suppliers TABLE stakeholders.new_suppliers?

  • The command fails because it is written incorrectly. (correct)

    Which SQL command correctly expands the 'products' nested array column in the 'transactions' table to create a new row for each unique item?

  • explode(products) (correct)

    If the command SELECT age, country FROM my_table WHERE age >= 75 AND country = 'canada'; is executed, what type of data will be returned?

  • Only records of individuals aged 75 or older from Canada. (correct)

    To deduplicate data from the 'bronze' table and write it to a new 'silver' table, which type of SQL query could be used?

  • SELECT DISTINCT * FROM bronze INTO silver; (correct)

    What is one potential issue with the command INSERT INTO stakeholders.suppliers TABLE stakeholders.new_suppliers?

  • The command relies on an incorrectly formed SQL statement. (correct)

    What will happen if the command SELECT age, country FROM my_table WHERE age >= 75 AND country = 'canada'; is executed?

  • Only entries of individuals aged 75 or older from Canada will be displayed. (correct)

    Which function is NOT suitable to use for separating unique items in a nested array column?

  • array(products) (correct)

    If a data analyst wishes to deduplicate the 'bronze' table and save it to 'silver', which command could they use?

  • INSERT INTO silver SELECT DISTINCT * FROM bronze; (correct)

    What effect does the given SQL command have on the suppliers table?

  • The suppliers table now contains both the data it had before the command was run and the data from the new_suppliers table, including any duplicate data. (correct)

    Where can an admin or data owner grant permissions to a group in the database?

  • Settings (correct)

    Which SQL command correctly fills in the blank to convert dollars_spent to hundreds of dollars?

  • TRANSFORM(dollars_spent, x -> x / 100) (correct)

    What is the primary function of the INSERT command in SQL?

  • To add new records to a table. (correct)

    What type of visualization is best suited for publication-grade presentations?

  • Organization-branded visualizations. (correct)

    What data can be included in the new_suppliers table before the INSERT command is executed?

  • Any combination of new and existing data. (correct)

    In the context of databases, what does a view represent?

  • A virtual table displaying the result of a query. (correct)

    What is a common mistake when interpreting the output of an SQL command that combines data from two tables?

  • Assuming duplicates are always removed. (correct)

    Which SQL query correctly calculates the average duration of appointments per doctor?

  • SELECT doctor_id, AVG(duration) as avg_duration FROM appointments GROUP BY doctor_id; (correct)

    What SQL statement correctly retrieves the top 10% of customers based on total spending?

  • SELECT * FROM customers WHERE total_spend > (SELECT PERCENTILE(total_spend, 0.90) FROM customers); (correct)

    Which command should a data engineer use to create a database if it doesn't already exist?

  • CREATE DATABASE IF NOT EXISTS customer360 LOCATION '/customer/customer360'; (correct)

    Which SQL function is used to unnest the items column in a JSON structure?

  • EXPLODE() (correct)

    What would be the result of the command: SELECT doctor_id, AVG(duration) as avg_duration FROM appointments GROUP BY doctor_id HAVING avg_duration > 0?

  • It filters out doctors with no appointments. (correct)

    Which SQL command is appropriate to create a database named customer360 in a specific location only if it doesn't exist?

  • CREATE DATABASE IF NOT EXISTS customer360 LOCATION '/customer/customer360'; (correct)

    In SQL, what function can be used to calculate average values but requires a counting mechanism in its context?

  • SUM()/COUNT() (correct)

    If a query uses ORDER BY total_spend DESC OFFSET 10%, what is its likely purpose?

  • To show customers from the 11th-highest spender onward. (correct)

    What is the result of using the RANK() function with PARTITION BY region and ORDER BY sales DESC in a query?

  • It assigns a unique rank to each product within each region based on sales. (correct)

    Which SQL function should a data analyst use to assign a relative rank to sales data within a region?

  • RANK() OVER (PARTITION BY region ORDER BY sales DESC) (correct)

    What is the difference in behavior between PERCENT_RANK() and RANK() functions?

  • PERCENT_RANK() gives a percentage value while RANK() provides a whole number. (correct)

    When would a data analyst choose to use higher-order functions in data analysis?

  • To apply custom logic at scale for data that is already unstructured. (correct)

    Which statement accurately describes the output of the two given SQL statements in Databricks SQL?

  • The first statement returns only customers who have made orders. (correct)

    What happens when a data analyst improperly uses GROUP BY with RANK() in SQL?

  • It may misinterpret the dataset, leading to incorrect analysis. (correct)

    What is a primary feature of the PERCENT_RANK() function in SQL?

  • It calculates the relative position of each row within the partition as a percentage. (correct)

    Which scenario is least likely to benefit from using higher-order functions?

  • Performing simple calculations on scalar fields. (correct)

    Study Notes

    Question 1

    • A data analyst runs a command to select age and country from a table where age is greater than or equal to 75 and country is 'Canada'.
    • The correct output table shows age and country for those meeting the criteria.

    Question 2

    • A data analyst runs a command to insert data from a new suppliers table into the stakeholders.suppliers table.
    • The suppliers table now contains data from both the tables, including duplicates.

    Question 3

    • A data engineer works with a nested array column 'products' in a 'transactions' table.
    • They want to expand the table so each unique item in 'products' has its own row.
    • The correct code to perform this task is explode(products).
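
    • A minimal sketch of that pattern, assuming a transaction_id column sits alongside the nested 'products' array:

      -- explode() emits one output row per element of the array column
      SELECT transaction_id, explode(products) AS product
      FROM transactions;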

    Question 4

    • A data analysis team works with a 'bronze' SQL table as the source for data.
    • A stakeholder notices duplicate data in the downstream data.
    • The correct query to deduplicate the data from the 'bronze' table and write it to a new table 'silver' is CREATE TABLE table_silver AS SELECT DISTINCT * FROM table_bronze;.
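
    • A hedged sketch of that CREATE TABLE AS SELECT approach, using the table names from the note above:

      -- Build the silver table from only the distinct rows of the bronze table
      CREATE TABLE table_silver AS
      SELECT DISTINCT * FROM table_bronze;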

    Question 5

    • A business analyst needs a data entity/object called 'sales_by_employee'.
    • The new entity should have columns 'sales_person' (employee name from the 'employees' table), and 'sales'.
    • The correct code to create or update the sales_by_employee entity is CREATE OR REPLACE VIEW sales_by_employee AS SELECT employees.employee_name AS sales_person, sales.sales FROM sales JOIN employees ON employees.employee_id = sales.employee_id.

    Question 6

    • A data analyst uses the 'sales_table' to get the percentage rank of products within a region by sales.
    • The required query is SELECT region, product, PERCENT_RANK() OVER (PARTITION BY region ORDER BY sales DESC) AS rank FROM sales_table GROUP BY region, product.

    Question 7

    • A data analyst should use higher-order functions when custom logic needs to be applied at scale to array data objects.
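
    • A minimal sketch of higher-order functions applied to an array column (the table and column names are assumptions):

      -- FILTER keeps only the array elements that satisfy the predicate;
      -- EXISTS returns true if any element satisfies it
      SELECT
        FILTER(dollars_spent, x -> x > 0)   AS positive_spend,
        EXISTS(dollars_spent, x -> x > 500) AS has_large_purchase
      FROM customer_spend;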

    Question 8

    • Statement 1 does a left semi join. It returns rows from the customers table that have at least one match in the orders table.
    • Statement 2 does a left anti join. It returns rows from the customers table that do not have a match in the orders table.
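
    • A hedged sketch of both join types, assuming the tables share a customer_id column:

      -- Left semi join: customers with at least one matching order
      SELECT * FROM customers
      LEFT SEMI JOIN orders ON customers.customer_id = orders.customer_id;

      -- Left anti join: customers with no matching order
      SELECT * FROM customers
      LEFT ANTI JOIN orders ON customers.customer_id = orders.customer_id;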

    Question 9

    • A data analyst uses a user-defined function.
    • The function takes spend and units as input and returns the spend divided by units.
    • Correct code block for customer_price is SELECT price(customer_spend, customer_units) AS customer_price FROM customer_summary.
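
    • A minimal sketch of how such a SQL UDF could be defined before it is called; the signature is an assumption based on the note:

      -- A SQL user-defined function that divides spend by units
      CREATE OR REPLACE FUNCTION price(spend DOUBLE, units DOUBLE)
        RETURNS DOUBLE
        RETURN spend / units;

      SELECT price(customer_spend, customer_units) AS customer_price
      FROM customer_summary;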

    Question 10

    • A data analyst counts customers in each region.
    • The mistake in the query is that it's missing a GROUP BY region clause.
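
    • A corrected sketch of that count-per-region query (column names are assumptions):

      -- Without GROUP BY region, the aggregate would collapse to a single row
      SELECT region, COUNT(*) AS customer_count
      FROM customers
      GROUP BY region;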

    Question 11

    • A data analyst performs a complex aggregation on a table with zero null values.
    • The query returns a result with group_1, group_2, and sum values.
    • The correct query is SELECT group_1, group_2, count(values) AS count FROM my_table GROUP BY group_1, group_2 WITH CUBE;.
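
    • Laid out over multiple lines, the note's query reads:

      -- WITH CUBE adds subtotal rows for every combination of the grouping columns
      SELECT group_1, group_2, count(values) AS count
      FROM my_table
      GROUP BY group_1, group_2 WITH CUBE;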

    Question 12

    • A data analyst modifies column data type in a Delta table.
    • The correct statement is ALTER TABLE table_name ALTER COLUMN column_name datatype.

    Question 13

    • A data analyst needs to find top 5 customers based on the total amount they spent in the last 30 days.
    • The correct query is SELECT customer_id, SUM(price) as total_spent FROM sales WHERE date BETWEEN DATEADD(day, -30, GETDATE()) AND GETDATE() GROUP BY customer_id ORDER BY total_spent DESC LIMIT 5.

    Question 14

    • A data analyst tries to drop a table.
    • The data files still exist while the metadata files were deleted because the table's data was larger than 10 GB.

    Question 15

    • A data analyst runs a command to describe accounts.customers.
    • The command results in an error because running SELECT * from accounts.customers returns an error.
    • The command removes the table from the metastore but the underlying data files are untouched.

    Question 16

    • Data explorer is used to view metadata, data, and permissions.

    Question 17

    • Data analyst removes table_name.
    • Use DROP TABLE database_name.table_name;.

    Question 18

    • Databricks SQL queries deliver better price/performance than other cloud data warehouses.
    • Delta Live Tables can exist in Databricks SQL.
    • Databricks SQL clusters use automatically configured scaling when creating SQL warehouses.

    Question 19

    • Databricks SQL warehouses allow users to run SQL commands on data objects.

    Question 20

    • The SQL function to summarize sales data by product category and month is GROUP BY.
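
    • A hedged sketch of that summary; the table and column names are assumptions:

      -- date_trunc collapses each sale date to the first day of its month
      SELECT category,
             date_trunc('MONTH', sale_date) AS sale_month,
             SUM(amount) AS total_sales
      FROM sales
      GROUP BY category, date_trunc('MONTH', sale_date);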

    Question 21

    • The correct query to find the average duration of appointments for each doctor is SELECT doctor_id, AVG(duration) as avg_duration FROM appointments GROUP BY doctor_id;

    Question 22

    • A data analyst needs a query to find the top 10% of customers based on total spending.
    • Correct query is SELECT * FROM customers ORDER BY total_spend DESC OFFSET 90 PERCENT;
    • This query orders all customer data by total spending in descending order and offsets past a percentage of the records; whether it returns the top 10% or skips 90% of records depends on how percentage offsets work on the specific system.

    Question 23

    • A data engineer creates a database.
    • The correct command is CREATE DATABASE IF NOT EXISTS customer360 LOCATION '/customer/customer360'.

    Question 24

    • A junior data engineer processes JSON data.
    • Correct code is SELECT cart_id, explode(items) AS item_id FROM raw_table.

    Question 25

    • Data analyst wants to horizontally combine two tables using a shared column.
    • Use INNER JOIN between the tables on the shared column.
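
    • A minimal sketch, assuming the shared column is called shared_id:

      -- Keep only the rows that have a match in both tables
      SELECT a.*, b.*
      FROM table_a a
      INNER JOIN table_b b
        ON a.shared_id = b.shared_id;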

    Question 26

    • Data analyst wants to show cities with a temperature and humidity range of 60-75.
    • The correct query is SELECT * FROM weather WHERE humidity BETWEEN 60 AND 75;

    Question 27

    • Data professionals use Databricks SQL as a secondary service when primarily using other services.
    • Example: A Business Intelligence analyst.

    Question 28

    • A data analyst needs to grant query access for another user on a table.
    • The correct command is GRANT SELECT ON TABLE sales TO [email protected].

    Question 29

    • Data analyst creates two visualizations of the same query.
    • They need to add both to the same dashboard.

    Question 30

    • The incorrect statement about adding visual appeal in the visualization editor is that the visualization scale cannot be changed.

    Question 31

    • If a parameter is added to a dashboard, it fetches data for all visualizations that use the parameter; it is not skipped when other visualizations also use the same parameter.

    Question 32

    • Materialized views are refreshed manually by the user, or according to the update schedule of the pipeline.

    Question 33

    • A dashboard parameter is selected for a query parameter associated with an area chart. The area chart will use the value selected in the dashboard parameter.

    Question 34

    • Databricks SQL should be used as a complementary tool for quick in-platform BI work.

    Question 35

    • The alert does not work because queries that use query parameters cannot be used with alerts.

    Question 36

    • The mean and median values of a variable can be different when the variable values contain extreme outliers.

    Question 37

    • Data augmentation is used to add a new dataset to existing gold-layer tables.

    Question 38

    • The incorrect statement about handling complex data types is that support for complex data types is yet to be introduced.

    Question 39

    • The gold layer in the medallion architecture is best suited for ad-hoc reporting, advanced analytics, and ML.

    Question 40

    • Data analysts can use Databricks SQL as a secondary service if they primarily use other services. Example: A business intelligence analyst.

    Question 41

    • Data blending is used when engineers need to perform additional processing on data in gold-layer tables.

    Question 42

    • You should reduce the SQL endpoint cluster size to manage cost and start-up time.

    Question 43

    • Delta Live Tables extend Delta Lake functionality, and are not a subset of it.

    Question 44

    • The stakeholders need to ensure the streaming data is appropriately incorporated into the workflow/pipeline, not just the source.

    Question 45

    • Data analysts most commonly use silver tables.

    Question 46

    • Data cleaning is the process that involves identifying or correcting errors in the data.

    Question 47

    • Higher-order functions are used when custom logic needs to be applied at scale to array data objects.

    Question 48

    • Delta sharing is a protocol for sharing data within and outside of Databricks.

    Question 49

    • The incorrect statement about Auto Loader is that it does not remove duplicates in the file.

    Question 50

    • Delta Lake benefits include ACID transactions and time travel capabilities.

    Question 51

    • Data manipulation tasks and data analysis with multiple sources should be done in Databricks SQL.

    Question 52

    • Delta Lake transaction logs are used for ACID transaction capabilities and to track changes.

    Question 53

    • Use Partner Connect's automated workflow to connect Fivetran to Databricks via a SQL warehouse.

    Question 54

    • To see if query results came from the cache:
    • Go to the Queries tab and click on Cache Status. The status will be green if the results come from the cache.
    • Go to the Query History tab and click on the text of the query. The slideout will show if the results came from the cache.

    Question 55

    • To change the schema owner:
    • Go to Data Explorer, then the schema, click the owner option.
    • Change the owner to the user to be granted ownership.

    Question 56

    • SQL warehouse types include Classic, Pro, and Serverless.

    Question 57

    • Using a screenshot to share a visualization is not a secure approach.

    Question 58

    • Descriptive statistics use summary statistics to describe and summarize data.

    Question 59

    • Databricks using ANSI SQL makes it easier to migrate existing queries.

    Question 60

    • Data analysts must consider organization-specific best practices, legal requirements regarding the area the data was collected, and other relevant considerations.

    Question 61

    • Data blending combines data from multiple data sources.

    Question 62

    • A Sankey diagram should be used for visualizing the workflow.

    Question 63

    • PII data should be encrypted.

    Question 64

    • Markdown-based text boxes are used for labeling sections in a dashboard.

    Question 65

    • You edit the owner field in the table page to change ownership.

    Question 66

    • The correct approach for ingesting data directly from cloud storage is to create an external table with the object storage path to be the LOCATION.
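
    • A hedged sketch of such an external table; the path, format, and schema are assumptions:

      -- The table's data stays at the given object storage path;
      -- dropping the table later removes only the metastore entry
      CREATE TABLE raw_sales (
        sale_id   BIGINT,
        amount    DOUBLE,
        sale_date DATE
      )
      USING CSV
      OPTIONS (header 'true')
      LOCATION 's3://my-bucket/raw/sales/';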

    Question 67

    • To configure SQL parameters for all warehouses, go to Admin Settings, select SQL Warehouse Settings, change the relevant setting under SQL Configuration Parameters from disabled to enabled, and click Save.

    Question 68

    • Unity Catalog is a Databricks proprietary tool

    Question 69

    • Delta sharing is an open protocol for data sharing.

    Question 70

    • The full form of ETL is Extract, Transform, Load.

    Question 71

    • The mean of 1, 2, 3, 4, 5 is 3, and the median is 3.

    Question 72

    • Descriptive statistics summarize and present data.

    Question 73

    • The process to separate latitude and longitude is called data blending.

    Question 74

    • Create a query, save an alert on it, and have the alert trigger when the value in the customer_engagement field is below 70.

    Question 75

    • The new user needs "Allow Table create/delete entitlement."

    Question 76

    • UPDATE Users SET LastName='Michel' WHERE LastName='Thomas'.

    Question 77

    • MERGE INTO target USING source ON target.key = source.key WHEN MATCHED THEN DELETE
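
    • The same statement laid out over multiple lines:

      -- Delete every target row whose key also appears in the source table
      MERGE INTO target
      USING source
        ON target.key = source.key
      WHEN MATCHED THEN DELETE;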

    Question 78

    • Discrete data can take on only a finite number of values.

    Question 79

    • A boolean statement that returns true or false is considered an expectation.

    Question 80

    • All of the above (query refresh schedules, dashboard refresh schedule, alerts).

    Question 81

    • To see if query results came from the cache: Go to Query History, check the cache status, or go to the cache section of the SQL Warehouse page.

    Question 82

    • Databricks SQL is advantageous when visualizations require custom formatting, large data transformations, or production-grade branding.

    Question 83

    • Databricks SQL warehouses enable data analysts to run SQL commands on data within Databricks SQL.

    Question 84

    • Data Explorer provides the ability to view metadata and data, and permissions in Databricks SQL.

    Question 85

    • Delta Lake provides ACID transactions, flexible schemas, and data isolation capabilities.

    Question 86

    • Data governance, Delta sharing, integration with third-party tools, and scalable cloud infrastructure are features of Databricks SQL to ensure data security.

    Question 87

    • Delta Lake provides ACID transactions with improved metadata scalability, unified batch and streaming data processing, and data isolation for multiple environments.

    Question 88

    • Partner Connect is the feature in the platform to integrate with third party tools.

    Question 89

    • To enable aggregation in Databricks SQL visualizations, the appropriate aggregation type should be specified directly within the visualization editor.

    Question 90

    • A bar chart is best to visualise order dates.

    Question 91

    • The compute feature is used to perform analysis on large datasets.

    Question 92

    • Delta Lake is a data storage layer providing high-performance querying capabilities for Databricks SQL.

    Question 93

    • The Query Editor is used to configure a refresh schedule for a query that is not attached to a dashboard or alert.

    Question 94

    • Alerts are used in Databricks SQL to automatically execute SQL queries upon specific criteria, or to trigger notifications.

    Question 95

    • Parameters can be added to a query to allow for code reusability and variable inputs.

    Question 96

    • Refresh schedules in Databricks SQL are used to automatically pull new data or to execute queries in cycles.

    Question 97

    • Delta Lake does not support the removal of a column operation.

    Question 98

    • Partner Connect is used to connect Databricks to third-party services via automated workflows.

    Question 99

    • All of the above – pie charts, bar graphs, and geographical maps

    Question 100

    • Databricks SQL using ANSI SQL dialect makes it easier to migrate existing SQL queries.

    Question 101

    • Use a refresh schedule with an interval of 10 minutes or less to ensure responsiveness. Setting the SQL Warehouse to always-on can ensure the responsiveness needed.

    Question 102

    • A refresh schedule on a query doesn't automatically use a SQL Warehouse.

    Question 103

    • The standard SQL dialect for Databricks SQL is ANSI SQL.

    Question 104

    • Databricks SQL can be used to perform common analytics tasks such as data segmentation, testing data quality, and automation of data workflows.

    Question 105

    • The controls for changing the y-axis scale in Databricks visualizations are located on the Visualization Editor page.

    Question 106

    • Data for the database is stored in the specified location dbfs:/accounting/data.

    Question 107

    • Arrays are referenced by index. Use products[0].
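
    • A minimal sketch of that zero-based indexing:

      -- Return the first element of the 'products' array for each row
      SELECT products[0] AS first_product
      FROM transactions;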

    Question 108

    • Databricks SQL is best for simple, exploratory visualizations.

    Question 109

    • The suppliers table now contains data from both the tables, including duplicates.

    Question 110

    • Change database, table, and view permissions in the Settings page of the SQL Warehouses.

    Question 111

    • The correct code is TRANSFORM(dollars_spent, value -> value / 100) AS hundreds_spent.
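
    • In context, a hedged sketch of the full query; the table name is an assumption:

      -- TRANSFORM applies the lambda to every element, rescaling dollars to hundreds
      SELECT TRANSFORM(dollars_spent, value -> value / 100) AS hundreds_spent
      FROM customer_spend;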

    Description

    Test your knowledge of Databricks SQL with this quiz covering schema creation, data visualization, and user permissions. Dive deep into SQL commands and data quality management related to accounting databases. Perfect for those looking to enhance their skills in data management and visualization using Databricks.
