DB Lecture 2 PDF
Document Details
Uploaded by BraveBanjo667
Dr. Rachsuda Setthawong
Tags
Summary
This document provides a lecture on relational database systems, including introductory information about the relational model, fundamental relational algebra operators, and database design phases. The keywords reflect the content of the document.
Full Transcript
Lecture 2: Introduction to Relational Model and Fundamental Relational Algebra Operators CSX3006 DATABASE SYSTEMS CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 1 Big Picture: Design Phases - 1 Application...
Lecture 2: Introduction to Relational Model and Fundamental Relational Algebra Operators CSX3006 DATABASE SYSTEMS CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 1 Big Picture: Design Phases - 1 Application Requirements Entity- Conceptual Conceptual Database Relationship Model Design Data Model Logical Database Relational Logical Model Design Model Relational DBMS Physical Database File Design Organization Physical and Access Model Path Database Implementation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 2 Big Picture: Design Phases - 2 Conceptual Model ◦ Captures the requirements and rules of an application in terms of semantics of required entities, relationships among the entities and consistency constraints. ◦ Entity-Relationship Model Logical Model Physical Model CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 3 Big Picture: Design Phases - 3 Conceptual Model Logical Model ◦ Specifying particular data structures to represent the entities, relationships and constraints defined in the conceptual model ◦ Facilitate manipulation and retrieval of data stored in the database ◦ Relational Database Model and Structured Query Language (SQL): ◦ Tables, keys and constraints ◦ Specification of structures and constraints → DDL ◦ Manipulation and Retrieval of data → DML ◦ Exact syntax and semantics vary from implementation to implementation (different DBMS products), but ◦ All are based on the SAME mathematical concept of ‘relation’ and ‘relational algebra’ Physical Model CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 4 Big Picture: Design Phases - 4 Conceptual Model Logical Model Physical Model ◦ Internal data storage structuring and organization of DBMS CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 5 Objectives and Outline Objectives ◦ Understand the structure of relational database ◦ Be able to express queries using relational algebra Relational Database Model ◦ Structure of Relational Database ◦ table, column, row ◦ relation, attribute, domain and tuple ◦ Data Consistency Constraints ◦ Unique Tuple Identification: Super key, Candidate Key, Primary Key ◦ Referential Integrity and Foreign Keys ◦ Fundamental Relational Algebra Operators ◦ select: , project: , union: , set difference: –, Cartesian product: x, rename: CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 6 Why Study the Relational Database Model? Most widely used model; Logical Model of most DBMS ◦ Oracle, MS SQL, IBM DB2, PostgreSQL, Microsoft SQL Server,... Very Simple yet Extremely Useful ◦ Single Data modelling concept: relation (a.k.a. table) Allows clean yet powerful manipulation language CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 7 Basic Structure of Relational Database - 1 A Relational Database: a collection of relations (tables), each with a unique name A table: 2 dimensional structure, consisting of ◦ Column (field, attribute) ◦ Each attribute has a set of permitted values → domain of the attributes ◦ In pure relational database, domain values must be atomic. ◦ Each value is indivisible simple value, e.g., age, first name, balance ◦ Row (record, instance, tuple) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 8 Example of a Relation (Table) Column headers Attribute names a row a tuple account table account relation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 9 Basic Structure of Relational Database - 2 A Relation: a set of records (tuples) ◦ ORDER of records does NOT matter. ◦ Each record has to be uniquely distinguishable from others in a relation. ◦ No multiple copies of the same tuple in a set CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 10 Domain of Attribute Attribute names Domain: A set of permitted values for an attribute ◦ E.g., Domain of the branch_name is the set of all branch names Logical Level Domain: A set of all branch names Physical Level Domain: char(30); character string consisting of 30 characters CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 11 Question: What are the Logical (Physical) Level domains for account_number and balance? Specification of Domain is based on required Business Logic! CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 12 Domain of Attribute - 1 employee relation customer relation Should customer_name and employee_name have same Logical Level domain (Physical Level domain)? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 13 Domain of Attribute - 2 employee relation Should employee_name and branch_name have same Logical Level domain (Physical Level domain)? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 14 A Null Value Is a member of any possible domain Can signify “unknown” or “not exist” or “not sure” In theory, good designs should try to avoid null values if possible In practise, null values are used to avoid ‘too complex’ design customer_name customer_street customer_city customer_phone Adams Spring Pittsfield 091234567 Curry North Rye null Green Walnut Stamford 076752345 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 15 Formal Definition of Relation Given sets D1, D2, …. Dn , representing Domains, ◦ a relation r is ◦ a subset of D1 x D2 x … x Dn (Cartesian Product of n Sets) ◦ a set of n-tuples (a1, a2, …, an) where each ai Di ◦ D1, D2, …. Dn : domains of attributes ◦ a1, a2, …, an : attribute values in each respective domains ◦ a particular instance of (a1, a2, …, an ): a tuple A relation is a set of such tuples that satisfies a certain ‘property’ defining connection among attribute values of a n-tuple. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 16 Example of Relation Customer If ◦ customer_name = {Jones, Smith, Curry, Lindsay, …} ◦ customer_street = {Main, North, Park, …} ◦ customer_city = {Harrison, Rye, Pittsfield, …} Then r = { (Jones, Main, Harrison), (Smith, North, Rye), (Curry, North, Rye), (Lindsay, Park, Pittsfield) } is a relation over customer_name × customer_street × customer_city CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 17 Relation Schema - 1 customer The customer_name customer_street customer_city schema of a relation Jones Main Harrison Smith North Rye Curry North Rye Lindsay Park Pittsfield Schema defines the structure of the relation; ◦ Definition of the relation + Integrity Constraints + Keys, etc. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 18 Relation Schema - 2 Given A1, A2, …, An are attributes a relation schema R = (A1, A2, …, An) Example: Customer_schema = (customer_name, customer_street, customer_city) r(R) denotes a relation r on the relation schema R Example: customer (Customer_schema) “customer is a relation conforming to Customer_schema” CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 19 Relation Instance customer At time t1 customer_name customer_street customer_city Jones Main Harrison An instance of the relation Smith North Rye Curry North Rye The current set of tuples in Lindsay Park Pittsfield a relation customer At time t2 customer_name customer_street customer_city Jones Main Harrison Smith North Rye Turner Putnam Stamford Another instance Lindsay Park Pittsfield of the relation Green Walnut Stamford Williams Nassau Princeton CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 20 Relations are Unordered Are they the same instance? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 21 Relational Database - 1 A relational database consists of one or (typically) more relations Information about an enterprise is broken up into parts, with each relation storing one part of the information E.g., account : stores information about accounts depositor : stores information about which customer owns which account customer : stores information about customers See also: Relational Model of the Banking Enterprise DB.pdf CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 22 Relational Database - 2 Why break up into multiple relations? Why not single relation having everything? ◦ Storing all information as a single relation such as bank(account_number, branch, balance, customer_name,... ,...,...,... ) results in ◦ repetition of information (Data Redundancy → Data Inconsistency) ◦ Inability to represent information; the need for null values ◦ Normalization theory deals with how to design good relational schemas CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 23 A simple Relational Database example Customer_schema = (customer_name, customer_street, account relation customer_city) Account_schema=(account_number, branch_name, balance) Depositor_schema=(customer_name, account_number) customer relation depositor relation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 24 Recall the Fact about Relation A relation is a set of tuples Each tuple in a relation must be uniquely distinguishable from others (→ different attribute values) ◦ No two tuples in a relation have exactly the same values for all attributes → A Key is needed. Consider bank customers have deposit balances. Can we use their name as a key? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 25 Different Types of Keys and their Relationships Superkey Candidate key Primary key CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 26 Superkey A superkey: a set of one or more attributes that allow us to identify uniquely a tuple in the relation. Let K R; (R is a relation schema; the set of attribute names for the relation) K is a superkey of R if values for K are uniquely identify each tuple of each possible relation r(R) Example: ◦ {customer_name, customer_street}, and ◦ {customer_name, customer_street, customer_city} are a superkeys of Customer, IF NO two customers living on the same street can possibly have the SAME name The set of all attributes is always a superkey (Trivial superkey). CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 27 Candidate Key K is a candidate key if K is a minimal superkey Example: candidate keys for Customer ◦ {mobile_phone} Note. This column is ◦ {customer_name, customer_street} added for demonstration ◦ Under the assumption of the customer_name customer_street customer_city mobile_phone business logic discussed in Adams Spring Pittsfield 7154628 Brooks Senator Brooklyn 2548621 the previous slide Curry North Rye 5841254 Glenn Sand Hill Woodside 2554478 Green Walnut Stamford 1236545 Hayes Main Harrison 9854123 Johnson Alma Palo Alto 9874100 Jones Main Harrison 9568323 Lindsay Park Pittsfield 9786541 Smith North Rye 6958569 Turner Putnam Stamford 2123021 Williams Nassau Princeton 9851459 Jones North Princeton 5114233 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 28 Primary Key Primary key (PK): a candidate key chosen as the principal means of identifying tuples within a relation ◦ Should choose an attribute whose value never, or very rarely, changes. ◦ E.g. email address is unique, but may change CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 29 What are Primary Keys in this Relational Database? - 1 account relation Customer_schema = (customer_name, customer_street, customer_city) Account_schema=(account_number, branch_name, balance) Depositor_schema=(customer_name, account_number) customer relation depositor relation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 30 What are Primary Keys in this Relational Database? - 2 account relation Customer_schema = (customer_name, customer_street, customer_city) Account_schema=(account_number, branch_name, balance) Depositor_schema=(customer_name, account_number) customer relation depositor relation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 31 Database Constraints vs Business Logic Schema of Database also represents “constraints” that the data in the database must followed. These “constraints” are means of specifying certain (but not all) Business Logic e.g.) Domain Specification for an Attribute Attribute Domain customer_age Integer Value from 1 to 199 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 32 Basic Constraints in Database Domain constraints Key constraints Referential Integrity constraint (Foreign key constraint) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 33 Key Specification (Constraint) Key Specification (esp. Primary Key and Candidate Keys) is a kind of constraint Choice of candidate keys and the primary key of relations must be made based on Business Logic and Application Requirements ◦ Depositor_schema = (customer_name, account_number) or Depositor_schema = (customer_name, account_number) ? depositor In real applications, we tend to use some “artificial sequencing relation mechanism” as the primary key of relations Need to pay also attention to other candidate keys that may exist in your relation schema since they represent business rules that must be enforced! CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 34 Foreign Key and Referential Integrity Foreign key (FK): an attribute in some relation schema that corresponds to the primary (or candidate) key of another relation. Referential Integrity ◦ A property of data that only values occurring in the Key attributes of the referenced relation may occur in the foreign key attribute of the referencing relation. PK Customer_schema = (customer_name, customer_street, customer_city) PK Account_schema=(account_number, branch_name, balance) Depositor_schema=(customer_name, account_number) FK FK Note: the attributes at the tail of arrows are FK. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 35 Question According to Referential Integrity constraint, can we add the following tuple to the Depositor relation? ◦ (“Alex”, A-103); to reflect the fact that Customer named “Alex” has opened account A-103 depositor relation customer relation account relation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 36 Representing Database Schema Branch_schema = (branch_name, branch_city, assets) Account_schema = (account_number, branch_name, balance) Depositor_schema = (customer_name, account_number) Customer_schema = (customer_name, customer_street, customer_city) Loan_schema = (loan_number, branch_name, amount) Borrower_schema = (customer_name, loan_number) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 37 Typical Steps in Defining Relational Database Schema 1. Define the relations, with a unique name for each relation 2. Define attributes for each relation and specify the domain for each attribute 3. Specify the key(s) for each relation ◦ Candidate keys and choose a primary key 4. Specify all appropriate foreign keys and integrity constraints CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 38 Another Simple Example - 1 Teacher = (Number, Name, Office, E-mail) Course = (Number, Name, Credit, Description) Class = ( Semester, Course_ID, Section, TimeDays, Room) Taught-by = (Teacher, Semester, Course, Section, Performance ) What do you think is the purpose of the ‘Class’ relation? What should be the primary key of the relation? WHY? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 39 Another Simple Example - 2 Teacher = (Number, Name, Office, E-mail) Course = (Number, Name, Credit, Description) Class = ( Semester, Course_ID, Section, TimeDays, Room) Taught-by = (Teacher, Semester, Course, Section, Performance) Can you figure out the purpose of ‘Taught-by’ relation? Can you see what should be the primary key for the relation? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 40 Another Simple Example - 3 Teacher = (Number, Name, Office, E-mail) Course = (Number, Name, Credit, Description) Class = ( Semester, Course_ID, Section, TimeDays, Room) Taught-by = (Teacher, Semester, Course, Section, Performance) What are the foreign keys of each relation? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 41 Another Simple Example - 4 Teacher = (Number, Name, Office, E-mail) Course = (Number, Name, Credit, Description) Class = ( Semester, Course_ID, Section, TimeDays, Room) Taught-by = (Teacher, Semester, Course, Section, Performance) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 42 Todo Task: Do Worksheet 2 (15 minutes) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 43 Query Languages Used to retrieve information and manipulate data stored in the relational database in convenient and efficient manner ◦ SQL (Structured Query Language) the standard way ◦ Based on formal mathematical foundations ◦ Relational Algebra ◦ Relational Calculus Relational database model supports query language! CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 44 Basics of Algebra Algebra: the study of mathematical symbols and the rules for manipulating these symbols in arithmetic expressions ◦ Operands: constants and variables ◦ Operators: +, –, *, / Expressions can be constructed by applying operators to operands and/or other expressions: ◦ E.g., (x + y) * 3 / ( ( x + z – k) * ( -y + z) ) ◦ Pipelining: output of one operation is used as an input to another operation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 45 Relational Algebra Operands: variables that stand for relation instances (a set of tuples) Operators: ◦ select: ◦ project: ◦ union: ◦ set difference: – ◦ Cartesian product: x ◦ rename: Examples: ◦ amount > 1200 (loan) ◦ customer_name (depositor) customer_name (borrower) ◦ customer_name (branch_name = “Perryridge” (borrower.loan_number = loan.loan_number (borrower x loan))) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 46 Additional Relational Algebra Operators Additional Operators (Add no power to the fundamental ones, but provides convenience of expressing complex expressions.) → Next Week ◦ set intersection: ◦ natural join: ◦ division: ◦ assignment: Extended Operators (Add more power) → Next Week ◦ Generalized Project to allow arithmetic functions in the projection list ◦ Aggregate Functions ◦ Outer Join Modification Operations → Next Week ◦ How to express deletion, insertion and updating of tuples CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 47 In this lecture, we will use relational algebra to demonstrate data retrieval from relational database. Later on, we will use SQL to demonstrate data retrieval from relational database. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 48 Select Operation - 1 Notation: p(r) is pronounced as ‘Sigma’ ◦ Unary operator ◦ p is called the selection predicate ◦ Produces a new relation with the subset of the tuples in r that match the predicate p ◦ Examples: ◦ branch_name=“Perryridge” (account) ◦ amount>1200 (loan) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 49 Select Operation - 2 Defined as: r : relation instances p(r) = {t | t r and p(t)} t : tuple where p is a formula consisting of terms connected by : (and), (or), (not) Each term is one of: op or where op is one of: =, , >, , 1300 (loan) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 50 Select Operation Example - 1 loan relation Q1: Select all loan amounts at the branch ‘Perryridge’ CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 51 Select Operation Example - 2 Q1: Select all loan amounts at the branch ‘Perryridge’ branch_name=“Perryridge” (loan) loan relation What’s the result of the above query? loan_number branch_name amount L-15 Perryridge 1500 L-16 Perryridge 1300 Remember the result of a relational algebra operator is a relation instance CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 52 Select Operation Example - 3 loan relation Q1: Select all loan amounts at the branch ‘Perryridge’ whose amount is greater than 1300. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 53 Select Operation Example - 4 loan relation Q1: Select all loan amounts at the branch ‘Perryridge’ whose amount is greater than 1300. branch_name=“Perryridge” amount > 1300 (loan) loan_number branch_name amount L-15 Perryridge 1500 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 54 Select Operation Example - 1 Relation r A B C D 1 7 5 7 12 3 23 10 A=B (r) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 55 Select Operation Example - 2 Relation r A B C D 1 7 5 7 12 3 23 10 A=B (r) A B C D 1 7 12 3 23 10 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 56 Select Operation Example - 3 Relation r A B C D 1 7 5 7 12 3 23 10 A=B ^ D > 5 (r) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 57 Select Operation Example - 4 Relation r A B C D 1 7 5 7 12 3 23 10 A=B ^ D > 5 (r) A B C D 1 7 23 10 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 58 Project Operation Notation: A1 , A 2 , , A k (r ) where A1, A2,…,Ak are attribute names, and r is a relation name. ◦ Unary Operator ◦ Produce the relation of k attributes obtained by erasing the attributes that are NOT listed ◦ Duplicate rows removed from result, since relations are sets ◦ Example: To eliminate the branch_name attribute of account loan_number, amount (loan) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 59 Project Operation Example - 1 Relation r: A B C 10 1 20 1 30 1 40 2 A,C (r) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 60 Project Operation Example - 2 Relation r: A B C 10 1 20 1 30 1 40 2 A,C (r) A C A C 1 1 1 1 1 2 A relation is a SET of tuples! 2 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 61 Union Operation - 1 Notation: r s ◦ Binary operator ◦ The usual set union operation ◦ Produce a new relation containing tuples from r and s eliminating duplicate tuples Defined as: r s = {t | t r or t s} CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 62 Union Operation - 2 For r s to be valid, 1. r, s must have the same arity (same number of attributes) 2. The attribute domains must be compatible Example: 2nd column of r deals with the same type of values as does the 2nd column of s CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 63 Union Operation - 3 Example: to find all customers with either a deposit account or a loan (or both) depositor relation borrower relation CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 64 Union Operation - 4 Example: to find all customers with either an account or a loan (or both) customer_name (depositor) customer_name (borrower) depositor relation customer_name Hayes Johnson Jones Lindsay borrower relation Smith Turner Adams Curry Jackson Williams CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG Union Operation Example - 1 Relations r, s: A B A B 1 2 2 3 1 s r r s: CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 66 Union Operation Example - 2 Relations r, s: A B A B 1 2 2 3 1 s r A B r s: 1 2 1 3 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 67 Set Difference Operation Notation r – s ◦ Binary Operator ◦ Produce a new relation consisting of tuples that are in r BUT NOT in s ◦ Defined as: r – s = {t | t r and t s} Set differences must be taken between compatible relations 1. r and s must have the same arity 2. attribute domains of r and s must be compatible Example: customer_name (depositor) – customer_name (borrower) What does the above query find? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 68 Set Difference Operation Example Relations r, s: A B A B 1 2 2 3 1 s r r – s: A B 1 1 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 69 Cartesian-Product Operation - 1 Notation r x s ◦ Binary Operator ◦ Produce a relation that combines information from any two relations. Defined as:r x s = { (t q) | t r and q s} set of all possible “ordered pairs” whose first component is a member of r, and the second component is a member of s Assume that attributes of r(R) and s(S) are disjoint. (R S = ) ◦ If attributes of r(R) and s(S) are NOT disjoint, then renaming must be used. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 70 Cartesian-Product Operation - 2 borrower = (customer_name, loan_number) loan = (loan_number, branch_name, amount) What’s the relation schema for borrower x loan ? A. (borrower.customer_name, borrower.loan_number, loan.loan_number, loan.branch_name, loan.amount) B. (customer_name, borrower.loan_number, loan.loan_number, branch_name, amount) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 71 Cartesian-Product Operation – Example Relations r, s: A B C D E 1 10 a 10 a 2 20 b r 10 b s r x s: A B C D E 1 10 a 1 10 a 1 20 b 1 10 b Assume we have 2 10 a n1 tuples in r1 and 2 10 a n2 tuples in r2. 2 20 b 2 10 b What’s the number of tuples in r3 = r1 x r2? CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 72 Cartesian-Product Operation – Example - 1 loan borrower How can we find the names of all customers who have a loan at the Perryridge branch? ◦ Hint: use all operations discussed so far CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 73 Cartesian-Product Operation – Example - 2 How can we find the names of all customers who have a loan at the Perryridge branch? loan borrower branch_name=“Perryridge” (borrower x loan) ? borrower.loan_number =loan.loan_number ( branch_name=“Perryridge” (borrower x loan) ) ? customer_name ( borrower.loan_number =loan.loan_number ( branch_name=“Perryridge” (borrower x loan) ) ) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 74 Rename Operation ◦ The results of relational algebra expressions do not have a name, we may need to refer to the same relation by more than one name Notations is pronounced as ‘Rho’ ◦ x (E) ◦ returns the expression E under the name X ◦ x ( A ,A 1 2 ,...,A n ) (E ) ◦ returns the result of expression E under the name X, and with the attributes renamed to A1 , A2 , …., An. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 75 Rename Operation Example - 1 Example: Find the names of all customers who live on the same street and in the same city as Hayes CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 76 Rename Operation Example - 2 Step 1. Find the street and city of Hayes customer_street, customer_city ( customer_name = “Hayes” (customer) ) customer-street customer-city Main St. Harrison CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 77 Rename Operation Example - 3 Step 2. Rename attributes: street and city (of Hayes) hayes_addr(hayes_street, hayes_city) customer_street, customer_city ( customer_name = “Hayes” (customer) ) hayes_addr hayes_street hayes_city Main St. Harrison CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 78 Rename Operation Example - 4 Step 3. Concatenate all customer info. with Hayes’ street and city. customer x ( hayes_addr(hayes_street, hayes_city) customer_street, customer_city ( customer_name = “Hayes” (customer))) All customers’ street and city Hayes’ street and city customer-id Customer-name Customer-street Customer-city Hayes-street Hayes-city 192-83-7465 Johnson Alma St. Palo Alto Main St. Harrison 019-28-3746 Smith North St. Rye Main St. Harrison 677-89-9011 Hayes Main St. Harrison Main St. Harrison 182-73-6091 Turner Putnam Ave. Stamford Main St. Harrison 321-12-3123 Jones Main St. Harrison Main St. Harrison 336-66-9999 Lindsay Park Ave. Pittsfield Main St. Harrison 019-28-3746 Smith North St. Rye Main St. Harrison CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 79 Rename Operation Example - 5 Step 4. Select only customer info. with the same street and city as Hayes customer_street = hayes_street customer_city = hayes_city customer x ( hayes_addr(hayes_street, hayes_city) customer_street, customer_city ( customer_name = “Hayes” (customer))) customer-id Customer-name Customer-street Customer-city hayes_street hayes_city 677-89-9011 Hayes Main St. Harrison Main St. Harrison 321-12-3123 Jones Main St. Harrison Main St. Harrison CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 80 Rename Operation Example - 6 Step 5. Filter out Hayes’ record customer.customer_street = hayes_addr.street customer.customer_city = hayes_addr.city customer.customer_name “Hayes” customer x ( hayes_addr(hayes_street, hayes_city) customer_street, customer_city ( customer_name = “Hayes” (customer))) customer-id Customer-name Customer-street Customer-city hayes_addr.street hayes_addr.city 321-12-3123 Jones Main St. Harrison Main St. Harrison CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 81 Rename Operation Example - 6 Step 6. Retrieve only customer’s name who live on the same street and in the same city as Hayes customer.customer_name ( customer.customer_street = hayes_addr.street customer.customer_city = hayes_addr.city customer.customer_name “Hayes” ( customer x hayes_addr(street, city) (customer_street, customer_city ( customer_name = “Hayes” (customer) ) ) ) ) Customer-name Jones CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 82 Formal Definition - 1 A basic expression in the relational algebra consists of either one of the following: ◦ A relation variable ◦ A constant relation ◦ Written by listing tuples within { } ◦ e.g.) { (A-101, Downtown, 500), (A-102, Mianus, 700) } ◦ a constant relation having two tuples, of length 3 ◦ Of course, every tuple in a relation must of the same size since they all have the same number of attributes CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 83 Formal Definition - 2 Let E1 and E2 be relational-algebra expressions; the following are all relational-algebra expressions: ◦ E1 E2 ◦ E1 – E2 ◦ E1 x E2 ◦ p (E1), P is a predicate on attributes in E1 ◦ s(E1), S is a list consisting of some of the attributes in E1 ◦ x (E1), x is the new name for the result of E1 CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 84 Banking Example Refer to the handout provided CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 85 Example Queries - 1 Find all loans of over $1200 Find the loan number of each and every loan whose amount is greater than $1200 Find the names of all customers who have a loan, an account, or both, from the bank CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 86 Example Queries - 2 Find all loans of over $1200 amount > 1200 (loan) Find the loan number of each and every loan whose amount is greater than $1200 loan_number (amount > 1200 (loan)) Find the names of all customers who have a loan, an account, or both, from the bank customer_name (borrower) customer_name (depositor) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 87 Example Queries - 3 Find the names of all customers who have a loan at the Perryridge branch. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 88 Example Queries - 4 Find the names of all customers who have a loan at the Perryridge branch. ◦ Query 1 customer_name (branch_name = “Perryridge” ( borrower.loan_number = loan.loan_number (borrower x loan) ) ) ◦ Query 2 customer_name(loan.loan_number = borrower.loan_number ( branch_name = “Perryridge” (borrower x loan) ) ) ◦ Query 3 customer_name(loan.loan_number = borrower.loan_number ( branch_name = “Perryridge” (loan) x borrower) ) ◦ Query 4 customer_name ( branch_name = “Perryridge” borrower.loan_number = loan.loan_number ( borrower x loan) ) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 89 Example Queries - 5 Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank. CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 90 Example Queries - 6 Find the names of all customers who have a loan at the Perryridge branch but do not have an account at any branch of the bank. customer_name ( branch_name = “Perryridge” (borrower.loan_number = loan.loan_number (borrower x loan))) – customer_name(depositor) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 91 Example Queries - 7 Find the largest account balance ◦ Strategy: 1. Find those balances that are not the largest ◦ Rename account relation as d so that we can compare each account balance with all others 2. Use set difference to find those account balances that were not found in the earlier step. ◦ The query is: balance(account) − account.balance (account.balance < d.balance (account x d (account))) CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 92 Classroom Exercise Find the names of all customers who have an account at Brighton branch Find all account numbers managed by any of branches in the city of Horseneck Find the names of customers who are living in “Ramkhamheng” street and have accounts in “Huamark” branch. Find all account numbers which have the same balance as A-222 Find the smallest account balance Find the details of accounts (the account number, the branch in which the account is held and the current balance) that have the same balance in the same branch as any of the accounts held by “Somchai” CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 93 Additional Exercises Study the chapter 2 of the text thoroughly, especially 2.1 and 2.2 Express the following queries using only “fundamental relational algebra operators” and show the resulting relation instances based on the banking example database schema and relation instances given in the handout: ◦ Find all account number whose balance is greater than 500 ◦ Find all account numbers managed by any of branches in the city of Horseneck. ◦ Find the names of all customers who have both a loan and an account ◦ Hint: A B = A – (A – B) ◦ Find all account numbers which have the same balance as A-222 ◦ Find the name of all customers who have an account with the bank, along with his/her account number and the balance of the account. ◦ Find the names of all customers who have an account at Brighton branch ◦ Find the smallest account balance ◦ Find the names of all customer who have accounts in any of branches whose assets is below 2000000 ◦ Find the names of customers who are living in “Ramkhamheng” street and have accounts in “Huamark” branch. ◦ Find all the customers who have at least one account and one loan in a same branch ◦ Find the account numbers and names of the account holders whose balance is equal to balance of any of accounts held by “Somchai” and held in the same branch as to having the Somchai’s account having the same balance CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 94 ToDo Task Complete Worksheet 2 (Assignment 2) – group work Due: as shown in MS Teams’ Assignment 2 Submission: submit 1 pdf file per group via MS Team CSX3006 LECTURE 2 BY ASST. PROF. DR. RACHSUDA SETTHAWONG 95