Relational Data Model (Updated) PDF
Document Details
Uploaded by Deleted User
Tags
Related
Summary
This document provides information about the relational database model, including its logical structure, components, and relationship between tables. It details data redundancy handling and explains the purpose of indexing.
Full Transcript
Chapter 3 The Relational Database Model ©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise...
Chapter 3 The Relational Database Model ©2017 Cengage Learning®. May not be scanned, copied or duplicated, or posted to a publicly accessible website, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website or school-approved learning management system for classroom use. Learning Objectives Describe relational database model’s logical structure Identify relational model components Using relational database operators to manipulate tables Purpose of data dictionary and system catalog Describe data redundancy handling in relational model Explain purpose of indexing 2 Logical View of Data Relational database model enables logical representation of the data and its relationships Relational database consists of relations (tables) that consist of tuples (rows) and attributes (columns) 3 Characteristics of a Relational Table Table name: STUDENT Database name: Ch03_TinyCollege database 4 Entity Entity Type Entity Set A thing in the real world with Set of all entities of a particular entity A category of a particular entity independent existence type. Any particular row (a record) in a The name of a relation (table) in All rows of a relation (table) in RDBMS relation(table) is known as an entity. RDBMS is an entity type is entity set 5 Two dimensional (eight rows/tuples, twelve columns/attributes) Each row describes a single entity occurrence within the entity set Each column represents an attribute, and each column has a distinct name Values in a column match the attribute’s characteristics (e.g. numeric, character, date, logical data (e.g. true/false). Column’s range of permissible values = domain (e.g. STU_GPA [0,4] Order of rows and columns are not important Must have a primary key (e.g. STU_NUM) 6 Keys A key consists of one or more attributes that determine other attributes Used to: Database name: SQL_database Ensure that each row in a table is uniquely identifiable Establish relationships among tables to ensure the integrity of the data Primary key (PK): Attribute or combination of attributes that uniquely identifies any given row 7 Keys (cont.) Table name: STUDENT Database name: Ch03_TinyCollege database not be a good primary key because it is possible to find several students whose last name is Smith 8 Determination Is the basis for establishing the role of a key The state in which knowing the value of one attribute makes it possible to determine the value of another A–B=C Given any two values, the third value can be determined In the relational model, the determination is based on the relationships among the attributes 9 Determination (cont.) Table name: STUDENT Database name: Ch03_TinyCollege database STU_NUM in the STUDENT table means that you are able to look up (determine) that student’s last name, grade point average, phone number, and so on. The shorthand notation for “A determines B” is A → B. “STU_NUM determines STU_LNAME” is STU_NUM → STU_LNAME If A determines B, C, and D, you write A → B, C, D. “STU_NUM determines STU_FNAME, STU_INIT, STU_DOB” is STU_NUM → STU_FNAME, STU_INIT, STU_DOB In contrast, STU_NUM is not determined by STU_LNAME quite possible for several students to have the last name Smith. 10 Dependencies Functional dependence: Value of one or more attributes determines the value of one or more other attributes Determinant : Attribute whose value determines another Dependent: Attribute whose value is determined by the other attribute 11 Full Functional Dependence The dependent attributes are determined by the determinant attributes. 12 Full Functional Dependence (cont.) Table name: Student_info StudentID StudentName CampusAddress Major CourseID CourseTitle Instructor InstructorLocation CourseGrade What are determinants for CourseGrade? (?,?) CourseGrade (StudentID,CourseID) CourseGrade 13 Full Functional Dependence (cont.) ProjectNum ProjectName (ProjectNum, EmployeeNum) WorkingHours 14 Types of Keys Composite key Superkey Candidate key Primary key Foreign key Secondary key 15 Composite key A Key that is composed of more than one attribute Also known as a mutli-attribute key Each attribute that is part of a key is knows as a key attribute STU_NUM is a key composed of one key attribute (STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE) is a composite key composed of four key attributes. 16 Superkey A key that can uniquely identify any row in the table A superkey functionally determines every attribute in the row STU_NUM and the following composite keys are superkeys: (STU_NUM) (STU_NUM, STU_LNAME) (STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE) STU_NUM, with or without additional attributes, can be a superkey even when the additional attributes are redundant 17 Candidate Key A minimal superkey a superkey without unnecessary attributes Eligible option when we want to select the primary key Based on full functional dependency STU_NUM and (STU_LNAME, STU_FNAME, STU_INIT, STU_PHONE) are candidate keys (STU_NUM, STU_LNAME) is a superkey but it is not a candidate key, because 18 STU_LNAME could be removed and the key would still be a superkey. Relationships of Super Key, Candidate Key and Primary Key Example 19 Primary Key (PK) Providing unique identity to each row Entity integrity: Condition in which each row in the table has its own unique identity The existence of nulls in a table is often an indication of poor database design Null if used improperly, can create problems - represents: An unknown attribute value A known, but missing, attribute value A “not applicable” condition 20 PK example Relationships are implemented through common attributes, which is a form of controlled redundancy = makes the relational database work 1:M relationship between VENDOR and PRODUCT Figure 3.2 - An Example of a Simple Relational Database 21 PK example (cont.) In database terms, the multiple occurrences of the VEND_CODE values in the PRODUCT table are not redundant because they are required to make the relationship work the VENDOR-PRODUCT relationship = 1:M relationship between VENDOR and PRODUCT 22 Foreign Key (FK) The primary key of one table that has been placed into another table to create a common attribute An attribute or combination of attributes in one table whose values must either match the primary key in another table or be null Foreign keys ensure referential integrity: every reference to an entity instance by another entity instance is valid 23 Table name: Customer Table name: Order 24 https://databasetown.com/referential-integrity-in-dbms-rules-example/ FK example Does the PRODUCT table have referential integrity? Why? 25 Secondary Key An attribute or combination of attributes used strictly for data retrieval purposes. Example: Few customers will remember CUS_CODE Combination of CUS_LNAME and CUS_PHONE can be used as a composite secondary key Table name: Customer 26 Exercise - Keys Suppose we have a table to store the employees’ information. Which of the following is superkey? EmployeeID (EmployeeID, Name) (EmployeeID, Title) (EmployeeID, DepartmentID) (EmployeeID, Name, DepartmentID) 27 Exercise – Keys (cont…) Suppose we have a table to store the employees’ information. Which of the following is candidate key (minimal superkey)? EmployeeID (EmployeeID, Name) (EmployeeID, Title) (EmployeeID, DepartmentID) (EmployeeID, Name, DepartmentID) 28 Exercise – Keys (cont…) Suppose we have a table to store the employees’ information. _ID Which of the following is candidate key? EmployeeID EPF_ID 29 Summary of Relational Database Keys 30 Integrity Rules Let us revisit the two integrity rules which you have learned so far: Entity integrity – Primary Key Referential integrity – Foreign Key 31 Integrity Rules (cont…) A customer might not yet have an assigned sales representative (number), but it will be impossible to have an invalid sales representative (number). 32 Illustration of Integrity Rules Do the following table show entity and referential integrities? the (primary key) number 10013 contains a null entry in its AGENT_CODE foreign key because No null entries Mr. Paul F. Olowski Entries are does not yet have a unique sales representative assigned to him The remaining AGENT_CODE entries in the CUSTOMER table all match the AGENT_CODE entries in the AGENT table. No null entries Entries are unique 33 Ways to Handle Nulls Use flags to indicate the absence of some value To add Paul (CUS_CODE ‘10013’) without having agent yet. In AGENT, assign AGENT_CODE = ‘-99’ as flag Other integrity rules that can be enforced in relational model: NOT NULL constraint - Placed on a column to ensure that every row in the table has a value for that column UNIQUE constraint - Restriction placed on a column to ensure that no duplicate values exist for that column 34 Relational Database Operators The relational model is based on mathematical principles. The data manipulation can be denoted in mathematical terms. The mathematical operations are executed using powerful languages like SQL. 35 Relation vs Relvar Relation = the data contained in the tables Relvar = the variable (container) that holds a relation Example : Variable name “qty” holds integer data. The variable “qty” is a container (relvar) for holding integers. Relvar has two parts: Heading contains the names of the attributes and the body contains the relation 36 Relational Algebra Theoretical way of manipulating table contents using relational operators Relational operators have the property of closure Closure: Use of relational algebra operators on existing relations produces new relations 37 Relational Set Operators Select (Restrict) Unary operator that yields a horizontal subset of a table Project Unary operator that yields a vertical subset of a table Union Combines all rows from two tables, excluding duplicate rows Union-compatible: Tables share the same number of columns, and their corresponding columns share compatible domains Intersect Yields only the rows that appear in both tables Tables must be union-compatible to yield valid results 38 SELECT (or RESTRICT) Unary operator - only uses one table as input Output – horizontal subset (entire rows) of a table depending on specified criterion Represented by the lowercase Greek letter sigma (σ) Example: Select all of the rows in the CUSTOMER table that have the value ‘10010’ in the CUS_CODE attribute σcus_code = 10010 (customer) 39 SELECT (or RESTRICT) Example 40 PROJECT Unary operator - only uses one table as input Output – vertical subset (entire columns) of a table depending on the attributes requested Represented by the Greek letter pi(π) Example 1: To project the CUS_FNAME and CUS_LNAME attributes in the CUSTOMER table π cus_fname, cus_lname (customer) Example 2: To find the customer first and last name of the customer with customer code 10010 π cus_fname, cus_lname (σcus_code = 10010 (customer)) 41 PROJECT Example 42 UNION Input : Two tables that are union-compatible Union-compatible: Tables share the same number of columns, and their corresponding columns share compatible domains Output – combination of rows from two tables, excluding duplicates Represented by the symbol ∪ 43 UNION Example 1: Assume SUPPLIER and VENDOR are union-compatible, a UNION between them is as follow supplier ∪ vendor Example 2: If SUPPLIER and VENDOR are not union-compatible, to produce a list of supplier and vendor names π supplier_name (supplier) ∪ π vendor_name (vendor) 44 INTERSECT Input: Two tables that are union-compatible Output - Rows that appear in both tables Denoted by the symbol ∩ 45 INTERSECT Example 1: Assume SUPPLIER and VENDOR are union-compatible, an INTERSECT between them is as follows supplier ∩ vendor Example 2: If SUPPLIER and VENDOR are not union-compatible, to produce a list of supplier and vendor names that are the same in both tables π supplier_name (supplier) ∩ π vendor_name (vendor) 46 DIFFERENCE Input: Two tables that are union-compatible Output - All rows in one table that are not found in the other table Represented by the minus symbol – 47 DIFFERENCE Example 1: Assume SUPPLIER and VENDOR are union-compatible, the DIFFERENCE of SUPPLIER minus VENDOR is as follows supplier − vendor Example 2: If SUPPLIER and VENDOR are not union-compatible, to produce a list of any supplier names that do not appear as vendor name π supplier_name (supplier) − π vendor_name (vendor) 48 PRODUCT Input: Two tables Output - All possible pairs of rows from two tables Represented by the multiplication symbol, x 49 PRODUCT Example 1: The PRODUCT of CUSTOMER and AGENT tables is as follows customer x agent Example 2: If SUPPLIER and VENDOR are not union-compatible, to produce a list of any supplier names that do not appear as vendor name π supplier_name (supplier) − π vendor_name (vendor) 50 JOIN Input: Two or more tables Output – Combine information from two or more tables Represented by the bowtie symbol, ⋈ Inner join : only returns matched records from joined table Natural join Equijoin Theta join Outer join : unmatched values in other table would be left null Left outer join Right outer join 51 JOIN Two tables that will be used in JOIN illustrations 52 NATURAL JOIN Link tables by selecting rows with common values in their common attributes A natural join is the result of 3-stage process Derived from fundamental operators such as PRODUCT, SELECT and PROJECT 53 NATURAL JOIN Example customer ⋈ agent 54 NATURAL JOIN Example – Step 1 customer x agent 55 NATURAL JOIN EXAMPLE – Step 2 56 NATURAL JOIN EXAMPLE – Step 3 57 EQUIJOIN and THETA JOIN Equijoin: Link tables on the basis of an equality condition that compares specified columns of each table The equijoin takes its name from the equality comparison operator (=) used in the condition Do not eliminate duplicate columns The outcome will produce a table similar to Step 2 of the natural join Theta join: Represented by ⋈θ Link table using inequality comparison operator (, =) in the join condition Equijoin is a special case of theta join 58 OUTER JOIN The joins discussed so far are known as inner joins Inner joins returns matched records from tables that are being joined Outer join return matched pairs and unmatched values in the other table are left null Useful to determine what values are causing referential integrity problems Two types: Left outer join Right outer join 59 LEFT AND RIGHT OUTER JOIN Left outer join: Yields all of the rows in the first table (left table), including those that do not have a matching value in the second table Right outer join: Yields all of the rows in the second table (right table), including those that do not have matching values in the first table 60 LEFT OUTER JOIN Example Refers to non-existent agent (referential integrity problem) customer agent 61 RIGHT OUTER JOIN Example Refers to non-existent customer (referential integrity problem) customer agent 62 DIVIDE Input: Uses one 2-column table (dividend) and one single- column table (divisor) Output - A single column that contains all values from the second column of the dividend that are associated with every row in the divisor Represented by the division symbol ÷ 63 DIVISION Example Given two relations, R and S, the division of them is R÷S 64 The Data Dictionary & the System Catalog Data dictionary: Description of all tables in the database created by the user and designer Contain metadata Described as “the database designer’s database” Because it records the design decisions about tables and their structures 65 Data Dictionary Example 66 System Catalog System Catalog: System data dictionary that describes all objects within the database Also contains metadata Example : data about the table names, table creator and creation date, number of column in each table etc. 67 Homonyms and Synonyms Use data dictionary to detect homonyms and synonyms Avoid homonyms and synonyms to prevent confusion Homonym: Same word for different attributes Example: C_NAME is used to refer to customer name and consultant name Synonym: Use different names to describe same attributes Example: car and auto are used to refer to the same object 68 Relationships, Redundancies and Indexes 69 Relationships in Relational Database 70 1:M Relationship Norm for relational databases Examples: 71 1:M Relationship Implementation The primary key is placed on the “1” side and the foreign key is placed on the “many” side 72 1:1 Relationship One entity can be related to only one other entity and vice versa Rare in any relational database design Example: One professor chairs one department. 73 1:1 Relationship Implementation Need to select one side to place the foreign key Tip: Pick the side to minimize null values 74 M:N Relationship Not directly supported in relational environment Implemented by changing the M:N relationship into two 1:M relationships with a composite/bridge/associative entity Example: Bridge entity 75 M:N Relationship Implementation Bridge entity will have a composite primary key consisting of foreign keys from the two original tables 76 Data Redundancy Revisited Data redundancy leads to data anomalies that destroys the effectiveness of a database Proper use of foreign keys minimizes data redundancies To be controlled except the following circumstances Data redundancy must be increased to store crucial information such as historical accuracy of data 77 Example of Historical Accuracy LINE_PRICE is the same as PROD_PRICE but still exists in LINE table to preserve historical accuracy 78 Indexes Indexes are used in many places 79 Indexes (cont…) Indexes are also used in relational database environment Indexes for fast retrieval of information based on other columns Consists of an index key and a set of pointers Index key: Index’s reference point that leads to data location identified by the key For a unique index, each key has only one pointer value 80 Dr. Codd’s 12 Relational Database Rules In 1985, Dr. E. F. Codd published a list of 12 rules for relational database Some dominant database vendors do not fully support all 12 rules 81 Dr. Codd’s 12 Relational Database Rules 82 Dr. Codd’s 12 Relational Database Rules 83