D426 Data Management Foundations Study Guide PDF
Document Details
Uploaded by Deleted User
Tags
Summary
This document is a study guide for database management concepts. It outlines database roles, including database administrators, authorization, rules, query processors, query optimization, storage managers, and transaction managers. It also details SQL data types, arithmetic and comparison operators, and database design phases.
Full Transcript
**[D426 Data Management Foundations Study Guide]** **Database Roles:** - [Database administrator:] Secures database system against unauthorized users (Controls user access availability). - [Authorization:] Limits what each user can access - [Rules:] Ensures that the database system is...
**[D426 Data Management Foundations Study Guide]** **Database Roles:** - [Database administrator:] Secures database system against unauthorized users (Controls user access availability). - [Authorization:] Limits what each user can access - [Rules:] Ensures that the database system is consistent with structural and business rules - [Query Processor:] Reads and understands a query. Creates plans to determine the best way to get or change data and sends the results back to the application. - [Query optimization:] Finds the best way to run a query to retrieve or modify data as efficiently as possible. The Query Processor completes this. - [Storage Manager]: Takes the query processor's instructions and turns them into file commands to get or change data. Uses indexes to quickly locate data - [Transaction Manager:] Makes sure transactions are properly executed. Prevents conflicts between concurrent transactions and restores the database in the event of failure. **SQL Datatypes** - [INT]: Stores integer values (positive & negative). +-------------+-------------+-------------+-------------+-------------+ | Category | Example | Data type | Storage | Notes | +=============+=============+=============+=============+=============+ | Integer | 34 and | TINYINT | 1 byte | Signed: 128 | | | -739448 | | | to 127 | | | | | | | | | | | | Unsigned: 0 | | | | | | to 255 | +-------------+-------------+-------------+-------------+-------------+ | | | SMALLINT | 2 bytes | Signed: | | | | | | -32,768 to | | | | | | 32,767\ | | | | | | Unsigned: 0 | | | | | | to 65,535 | +-------------+-------------+-------------+-------------+-------------+ | | | MEDIUMINT | 3 bytes | Signed: | | | | | | -8,388,608 | | | | | | to | | | | | | 8,388,607 | | | | | | | | | | | | Unsigned: 0 | | | | | | to | | | | | | 16,777,215 | +-------------+-------------+-------------+-------------+-------------+ | | | INT | 4 bytes | Signed: | | | | | | -2,147,483, | | | | | | 648 | | | | | | to | | | | | | 2,147,483,6 | | | | | | 47 | | | | | | | | | | | | Unsigned: 0 | | | | | | to | | | | | | 4,294,967,2 | | | | | | 95 | +-------------+-------------+-------------+-------------+-------------+ | | | BIGINT | 8 bytes | Signed: | | | | | | -263 to 263 | | | | | | -1 | | | | | | | | | | | | Unsigned: 0 | | | | | | to 264 -1 | +-------------+-------------+-------------+-------------+-------------+ - [DECIMAL]: Stores fractional numeric values - [VARCHAR]: Stores textual values - [DATE]: Stores year, month, and day (format: 1999-12-30) **Arithmetic and Comparison Operators** [+]: Adds two numeric values. Example: 4 + 3 = 7. [- (unary)]: Reverses the sign of one numeric value. Example: -(-2) = 2. [- (binary)]: Subtracts one numeric value from another. Example: 11 - 5 = 6. [% (modulo)]: Divides one numeric value by another and returns the integer remainder. Example: 5 % 2 = 1. [=]: Compares two values for equality. Example: 1 = 2 is FALSE. [\>]: Compares two values with \>. Example: \'2019-08-13\' \> \'2021-08-13\' is FALSE. [!=:] Compares two values for inequality Example: 1 !=2 **Database Design Phases:** - [Analysis:] Outlines what the database needs without focusing on any specific system - [Entities:] Things or people (customers, products, orders) - [Relationships:] How entities are related (Customer places an order) - [Attributes:] Details about the entity (Customer name or order date) - [Logical Design:] Turns database requirements into a specific system format - [Entities] = tables [Relationships] = keys [attributes] = columns - [Physical Design:] Adds indexes to speed up data retrieval and table organizations to optimize performance. Has data independence so it does not change the actual results of queries only improves speed (Rule 7) **SQL Commands:** [CREATE TABLE:] Creates a table [INSERT:] Inserts rows into a table ![](media/image2.png) [SELECT:] Retrieves data from a table A black screen with white text Description automatically generated [UPDATE:] Modifies data in a table ![A black background with white text Description automatically generated](media/image4.png) [DELETE:] Deletes rows from a table A black background with white text Description automatically generated [ALTER TABLE:] adds, deletes, or modifies columns on an existing table ![](media/image6.png) [TRUNCATE:] Deletes all rows from a table. It does not log individual row deletions making it quicker [MERGE:] Selects data from one table (source) and inserts the data to another table (target). [RESTRICT:] rejects an insert, update, or delete that violates referential integrity [SET NULL:] sets invalid foreign keys to NULL [SET DEFAULT:] Sets invalid foreign keys to the foreign key default value [CASCADE:] Propagates primary key changes to foreign keys ![](media/image8.png) **[SQL Operators]** [BETWEEN:] Provides an alternative way to determine if a value is between two other values [LIKE:] Used in a WHERE clause to match text against a pattern in a column. You use % (multiple characters) or \_ for single characters. ![](media/image10.png) *Finds all employees name who start with J* [Alias:] Temporary name assigned to a column or table using the AS keyword. **SQL Functions:** +-----------------------+-----------------------+-----------------------+ | Function | Description | Example | +=======================+=======================+=======================+ | ABS(n) | Returns the absolute | SELECT ABS(-5) | | | value of n | | | | | Returns: 5 | +-----------------------+-----------------------+-----------------------+ | LOWER(s) | Returns the lowercase | SELECT LOWER | | | s | ('MYSQL') | | | | | | | | Returns: 'mysql' | +-----------------------+-----------------------+-----------------------+ | TRIM(s) | Returnst he string s | SELECT TRIM (' | | | w/o leading and | test'); | | | trailing spaces | | | | | Returns: 'test' | +-----------------------+-----------------------+-----------------------+ | HOUR(t) | Returns the hour, | SELECT | | | minute, or second | HOUR(\'22:11:45\'); | | MINUTE(t) | from time t | | | | | returns 22 | | SECOND(t) | | | | | | SELECT | | | | MINUTE(\'22:11:45\'); | | | | | | | | returns 11 | | | | | | | | SELECT | | | | SECOND(\'22:11:45\'); | | | | | | | | returns 45 | +-----------------------+-----------------------+-----------------------+ **Data Structure** - [Tuple:] Ordered collection of elements enclosed in parentheses (a,b,c) - [Table:] has a name, fixed columns, and rows that can change - [Column:] has a name and specific data type (numbers or text) - [Row:] holds values that match the columns. Each value fits the column's data type - [Cell:] a single column of a single row A table must have at least one column but any number of rows. Table without rows is called an empty table **SQL Elements** - [Literals:] Explicit string, numeric, or binary values ('String', 123) - [Keywords]: Words with special meaning (SELECT, FROM, WHERE) - [Identifiers]: Objects from the database like tables, columns, etc. (City, Name, Population) - [Primary Key]: Column or group of columns used to identify a row (typically the first column) - [Simple Primary Key]: Consists of a single column - [Composite primary Key:] Consists of multiple columns - [Artificial key:] Single column primary key used when there is no suitable single-column or composite primary key exists. These are integers that are automatically generated. - [Candidate Key:] Simple or composite column that is unique and minimal (table may have several) - [Auto-increment column:] Numeric column that automatically increases by one each time a new row is added - [Foreign Key:] A column or group of columns that refer to a primary key. The data type must be the same! - The database will reject insert, update, and delete statements that violate referential integrity - [Third normal form:] Every non-key column A depends on a key column B (B must be unique). - [Boyce Codd normal form:] Column A depends on B. So B must be unique Very similar to Third normal for, but stricter because it will remove non-key term. **SQL Sublanguages** - [Data Definition Language (DDL):] Defines and sets up the structure of the database (Creating tables) - [Data Query Language (DQL):] Retrieves data from the database (SELECT) - [Data Manipulation Language (DML):] Changes or manipulates the data in the database (INSERT, UPDATE, DELETE) - [Data Control Language (DCL):] Controls user permissions and access (GRANT, REVOKE) - [Data Transaction Language (DTL):] manages database transactions (ROLLLBACK, COMMIT) **Aggregate Functions** - HAVING clause filters group results when used with GROUP BY - Used in the SELECT clause to calculate values from multiple rows. If you use a WHERE clause then they only process rows that meet that condition. No WHERE means all rows will be processed - COUNT(): counts the number of rows in the set - [MIN():] finds the minimum value in the set - [MAX():] finds the maximum value in the set - [SUM():] sums all the values in the set - [AVG()]: computes the arithmetic mean of all values in the set **JOIN and Union in SQL** - JOIN is a SELECT statement that combines data from two tables known as left table and right table into a single result. The columns must have comparable data types - [INNER JOIN:] selects only matching left and right table rows - [FULL JOIN:] selects all left and right table rows - [LEFT JOIN:] selects all left table rows but only matching right table rows - [RIGHT JOIN:] [ ] selects all right table rows, but only matching left table rows - [OUTER JOIN:] selects unmatched rows, including left, right, and full joins - [UNION:] combines the two results into one table ![](media/image12.png) - [Equijoin:] Compares columns of two tables with the = - [Non-]equijoin: Compares columns with \< and \> [ ] - [Self-join:] Joins a table to itself - [Cross-join:] Combines two tables without comparing columns (uses CROSS JOIN, not ON) **Entity Relationship Model:** - Known as ER diagram. It is a high-level diagram that shows how data entities relate to each other. It focuses on data requirements and relationships without worrying about how the data will be stored or implemented in a database. - Types usually become tables, foreign keys, and columns - [Entity Type:] set of things (All employees in a company) - [Relationship Type:] Set of related things (Employee-Manges Department) - [Attribute type:] Set of values (Employee salaries) - Instance is an element of a set - [Entity instance]: Individual thing (Employee Sam Snead) - [Relationship instance:] Statement about entity instances ("Maria Rodriguez manages Sales") - [Attribute instance:] Individual value (Salary is \$35,000) **Steps:** - [Analysis Steps:] Creates an ER model to outline data needs and relationships without focusing on how it will be implemented - [Logical Design:] Transforms the ER model into tables, columns, and keys specific to a database system - [Physical Design:] Adds indexes and organizes tables on storage media to optimize performance **Table Structures:** - [Heap table:] No specific order for rows. Good for fast and bulk loading because rows are stored in the order they are inserted. - [Sorted table:] Ordered based on a specific column. This is fast - [Hash table:] Rows are placed into buckets using a hash function. This helps locate rows quickly - [Table cluster:] Multiple tables share the same storage area. This can improve access times for related data. **Other Definitions:** [Application Programming Interface (API):] Makes it easier to use SQL with other programming languages. Has tools and rules that lets different programs communicate with each other. [MySQL Command-Line Client]: Text-based tool included with MySQL Server. It shows error codes and shows descriptions of the SQL statement if there are any issues or if it cannot be executed. [Business Rules:] Rules based on business policy and specific to a particular database. [Subquery:] Is a query within another SQL query. This is also known as nested query or inner query [Materialized View:] Data is always stored. If a base table changes the corresponding view tables can also change. [Cardinality:] Refers to maxima and minima of relationships and attributes [Subtype entity:] subset of a supertype entity. Ex: Managers are a subset of employees, so Manager is a subtype entity of the Employee supertype entity. On ER diagrams, subtype entities are drawn within the supertype. Vehicle Supertype entity and "car" "truck" "motorcycle" subtype [Intangible entity:] Documented in the data model, but not tracked with data in the database [Normalization:] Eliminates redundancy by decomposing a table into two or more tables [Denormalization:] Intentionally introducing redundancy by merging tables [Hit ratio:] Measures percentage of table rows that a query retrieves. [Binary search:] Database repeatedly splits the index in two until it finds the entry containing the search value [Dense index:] Contains an entry for every table row [Sparse index:] contains an entry for every table block