Data Normalization Rules

SociableForeshadowing avatar
SociableForeshadowing
·
·
Download

Start Quiz

Study Flashcards

22 Questions

What is the primary goal of data normalization?

To minimize data redundancy and dependency

What is the difference between an INNER JOIN and a LEFT JOIN?

An INNER JOIN returns only the rows that have matching values in both tables, while a LEFT JOIN returns all the rows from the left table and the matching rows from the right table.

What is the purpose of the PRIMARY KEY constraint in a CREATE TABLE statement?

To create a unique identifier for each row

What is indexing in the context of query optimization?

Creating indexes on frequently used columns to speed up queries

What is the purpose of a scalar subquery?

To return a single value

What is denormalization in the context of query optimization?

Storing redundant data to reduce the number of joins

What is the purpose of the EXPLAIN tool in query optimization?

Analyzing the query plan to identify bottlenecks

What is the difference between a table subquery and a row subquery?

A table subquery returns a table, while a row subquery returns a single row

What is the purpose of the SELECT command in SQL?

The purpose of the SELECT command is to retrieve data from a database table.

What is the difference between single quotes and double quotes in SQL string literals?

There is no difference, both single quotes and double quotes can be used to enclose string literals in SQL.

What is the function of the WHERE clause in a SQL query?

The WHERE clause specifies conditions for which records to include in the result set.

What is the purpose of the IN operator in a SQL query?

The IN operator specifies a list of values to include in the result set.

What does the SUM aggregate function do in SQL?

The SUM function calculates the total value of a column.

What is the main difference between the AND and OR operators in a SQL WHERE clause?

The AND operator combines multiple conditions and only returns records that meet all conditions, while the OR operator returns records that meet any of the conditions.

What is the purpose of the CREATE command in SQL?

The CREATE command creates a new database table or other object.

What does the UPDATE command do in SQL?

The UPDATE command updates existing data in a database table.

What is the purpose of the MAX function in SQL, and provide an example of its usage?

The MAX function returns the maximum value of a column. Example: SELECT MAX(price) FROM orders;

What is the difference between an INNER JOIN and a LEFT JOIN, and provide an example of each?

An INNER JOIN returns records that have matching values in both tables, while a LEFT JOIN returns all records from the left table and the matched records from the right table. Example: SELECT * FROM customers INNER JOIN orders ON customers.id = orders.customer_id; and SELECT * FROM customers LEFT JOIN orders ON customers.id = orders.customer_id;

What is the purpose of the COUNT function in SQL, and provide an example of its usage?

The COUNT function returns the number of rows in a table. Example: SELECT COUNT(*) FROM customers;

What is a subquery in SQL, and provide an example of its usage?

A subquery is a query nested inside another query, used to return a set of records to use in the main query, perform calculations, or test for existence or non-existence of records.

How does the RIGHT JOIN differ from the LEFT JOIN in SQL, and provide an example of its usage?

A RIGHT JOIN returns all records from the right table, and the matched records from the left table. Example: SELECT * FROM customers RIGHT JOIN orders ON customers.id = orders.customer_id;

What is the purpose of using aggregate functions like AVG, MAX, and MIN in SQL, and provide an example of each?

Aggregate functions like AVG, MAX, and MIN are used to perform calculations on a set of values, such as calculating the average price, maximum price, or minimum price of orders. Example: SELECT AVG(price) FROM orders;, SELECT MAX(price) FROM orders;, and SELECT MIN(price) FROM orders;

Study Notes

Data Normalization

  • Goal: To minimize data redundancy and dependency
  • Normalization rules:
    1. First Normal Form (1NF): Each table cell must contain a single value
    2. Second Normal Form (2NF): Each non-prime attribute in a table must depend on the entire primary key
    3. Third Normal Form (3NF): If a table is in 2NF, and a non-prime attribute depends on another non-prime attribute, then it should be moved to a separate table
    4. Boyce-Codd Normal Form (BCNF): A table is in 3NF and there are no transitive dependencies
    5. Higher Normal Forms: 4NF, 5NF, and higher, which deal with more complex dependencies

Joining Tables

  • Types of joins:
    • INNER JOIN: Returns only the rows that have matching values in both tables
    • LEFT JOIN: Returns all the rows from the left table, and the matching rows from the right table
    • RIGHT JOIN: Returns all the rows from the right table, and the matching rows from the left table
    • FULL JOIN: Returns all the rows from both tables
    • CROSS JOIN: Returns the Cartesian product of both tables
  • Join syntax:
    • SELECT * FROM table1 INNER JOIN table2 ON table1.column = table2.column
    • SELECT * FROM table1 LEFT JOIN table2 ON table1.column = table2.column

Create Table Syntax

  • Basic syntax: CREATE TABLE table_name (column1 data_type, column2 data_type, ...)
  • Data types:
    • INT: Whole numbers
    • VARCHAR: Character strings with a maximum length
    • DATE: Dates in the format 'YYYY-MM-DD'
    • TIME: Times in the format 'HH:MM:SS'
  • Constraints:
    • PRIMARY KEY: Unique identifier for each row
    • UNIQUE: Ensures all values in a column are unique
    • NOT NULL: Ensures a column cannot be null
    • CHECK: Ensures a column meets a specific condition

Query Optimization

  • Techniques:
    • Indexing: Creating indexes on frequently used columns to speed up queries
    • Caching: Storing frequently accessed data in memory for faster retrieval
    • Optimizing SQL: Rewriting queries to minimize the number of rows and columns accessed
    • Denormalization: Storing redundant data to reduce the number of joins
  • Tools:
    • EXPLAIN: Analyzing the query plan to identify bottlenecks
    • Query profiling: Measuring the execution time and resource usage of queries

Subqueries

  • A query nested inside another query
  • Types:
    • Scalar subquery: Returns a single value
    • Row subquery: Returns a single row
    • Table subquery: Returns a table
  • Syntax:
    • SELECT * FROM table WHERE column = (SELECT value FROM table2)
    • SELECT * FROM table WHERE column IN (SELECT value FROM table2)
    • SELECT * FROM table WHERE EXISTS (SELECT * FROM table2 WHERE condition)

Data Normalization

  • Minimizes data redundancy and dependency
  • Normalization rules:
    • First Normal Form (1NF): Each table cell must contain a single value
    • Second Normal Form (2NF): Each non-prime attribute in a table must depend on the entire primary key
    • Third Normal Form (3NF): If a table is in 2NF, and a non-prime attribute depends on another non-prime attribute, then it should be moved to a separate table
    • Boyce-Codd Normal Form (BCNF): A table is in 3NF and there are no transitive dependencies
    • Higher Normal Forms: 4NF, 5NF, and higher, which deal with more complex dependencies

Joining Tables

  • Types of joins:
    • INNER JOIN: Returns only the rows that have matching values in both tables
    • LEFT JOIN: Returns all the rows from the left table, and the matching rows from the right table
    • RIGHT JOIN: Returns all the rows from the right table, and the matching rows from the left table
    • FULL JOIN: Returns all the rows from both tables
    • CROSS JOIN: Returns the Cartesian product of both tables
  • Join syntax:
    • SELECT * FROM table1 INNER JOIN table2 ON table1.column = table2.column
    • SELECT * FROM table1 LEFT JOIN table2 ON table1.column = table2.column

Creating Tables

  • Basic syntax: CREATE TABLE table_name (column1 data_type, column2 data_type,...)
  • Data types:
    • INT: Whole numbers
    • VARCHAR: Character strings with a maximum length
    • DATE: Dates in the format 'YYYY-MM-DD'
    • TIME: Times in the format 'HH:MM:SS'
  • Constraints:
    • PRIMARY KEY: Unique identifier for each row
    • UNIQUE: Ensures all values in a column are unique
    • NOT NULL: Ensures a column cannot be null
    • CHECK: Ensures a column meets a specific condition

Query Optimization

  • Techniques:
    • Indexing: Creating indexes on frequently used columns to speed up queries
    • Caching: Storing frequently accessed data in memory for faster retrieval
    • Optimizing SQL: Rewriting queries to minimize the number of rows and columns accessed
    • Denormalization: Storing redundant data to reduce the number of joins
  • Tools:
    • EXPLAIN: Analyzing the query plan to identify bottlenecks
    • Query profiling: Measuring the execution time and resource usage of queries

Subqueries

  • A query nested inside another query
  • Types:
    • Scalar subquery: Returns a single value
    • Row subquery: Returns a single row
    • Table subquery: Returns a table
  • Syntax:
    • SELECT * FROM table WHERE column = (SELECT value FROM table2)
    • SELECT * FROM table WHERE column IN (SELECT value FROM table2)
    • SELECT * FROM table WHERE EXISTS (SELECT * FROM table2 WHERE condition)

Basics of SQL

  • SQL is a standard language for managing relational databases, used to store, manipulate, and retrieve data stored in a database.
  • SQL is not case-sensitive, but it's common to write commands in uppercase and table/column names in lowercase.

Data Types in SQL

  • Integer: whole numbers, e.g. 1, 2, 3, etc.
  • String: character strings, e.g. 'hello', "hello", etc. Strings can be enclosed in single quotes or double quotes.
  • Date: dates, e.g. '2022-01-01', etc.
  • Boolean: true or false values.

Basic SQL Commands

  • SELECT: retrieves data from a database table, e.g. SELECT * FROM customers;
  • CREATE: creates a new database table or other object, e.g. CREATE TABLE customers (id INT, name VARCHAR(255), email VARCHAR(255));
  • INSERT: inserts new data into a database table, e.g. INSERT INTO customers (id, name, email) VALUES (1, 'John Doe', '[email protected]');
  • UPDATE: updates existing data in a database table, e.g. UPDATE customers SET name = 'Jane Doe' WHERE id = 1;
  • DELETE: deletes data from a database table, e.g. DELETE FROM customers WHERE id = 1;

SQL Query Clauses

  • WHERE: specifies conditions for which records to include in the result set, e.g. SELECT * FROM customers WHERE country='USA';
  • AND: combines multiple conditions in a WHERE clause, e.g. SELECT * FROM customers WHERE country='USA' AND city='New York';
  • OR: combines multiple conditions in a WHERE clause, e.g. SELECT * FROM customers WHERE country='USA' OR country='Canada';
  • IN: specifies a list of values to include in the result set, e.g. SELECT * FROM customers WHERE country IN ('USA', 'Canada', 'Mexico');

Aggregate Functions in SQL

  • SUM: calculates the total value of a column, e.g. SELECT SUM(price) FROM orders;
  • AVG: calculates the average value of a column, e.g. SELECT AVG(price) FROM orders;
  • MAX: returns the maximum value of a column, e.g. SELECT MAX(price) FROM orders;
  • MIN: returns the minimum value of a column, e.g. SELECT MIN(price) FROM orders;
  • COUNT: returns the number of rows in a table, e.g. SELECT COUNT(*) FROM customers;

SQL Joins

  • INNER JOIN: returns records that have matching values in both tables, e.g. SELECT * FROM customers INNER JOIN orders ON customers.id = orders.customer_id;
  • LEFT JOIN: returns all records from the left table, and the matched records from the right table, e.g. SELECT * FROM customers LEFT JOIN orders ON customers.id = orders.customer_id;
  • RIGHT JOIN: returns all records from the right table, and the matched records from the left table, e.g. SELECT * FROM customers RIGHT JOIN orders ON customers.id = orders.customer_id;

SQL Subqueries

  • A subquery is a query nested inside another query.
  • Subqueries can be used to:
    • Return a set of records to use in the main query.
    • Perform calculations and return a single value.
    • Test for existence or non-existence of records.

Learn about the rules of data normalization, including 1NF, 2NF, 3NF, and BCNF, to minimize data redundancy and dependency.

Make Your Own Quizzes and Flashcards

Convert your notes into interactive study material.

Get started for free

More Quizzes Like This

Data Modeling and Normalization
16 questions
Data Modeling and Normalization
16 questions
Modelo Relacional en Bases de Datos
39 questions
Use Quizgecko on...
Browser
Browser