IM Lecture 8: Relational Algebra - PDF

Document Details

SignificantPoltergeist

Uploaded by SignificantPoltergeist

CS1F

Dr Graham McDonald

Tags

relational algebra database sql computer science

Summary

This document is a lecture on relational algebra, a formal language for interacting with databases. It covers topics such as sets, relations, cartesian product, and operations on relations, such as projection, selection, and joins.

Full Transcript

Relations and Relational Algebra CS1F IM Lecture 8 Dr Graham McDonald Where are we now……. This Week:  Set Theory (Tuesday)  Relations and Relational Algebra (today) Weeks 5 & 6:  Lectures: SQL and querying a dat...

Relations and Relational Algebra CS1F IM Lecture 8 Dr Graham McDonald Where are we now……. This Week:  Set Theory (Tuesday)  Relations and Relational Algebra (today) Weeks 5 & 6:  Lectures: SQL and querying a database  Labs: Populate and Query your database  Database Report (hand in 4:30pm Monday 4th November)  TOTAL: 24 marks Recap on Sets 3  What are sets? {1,3,5}  Set builder notation for making sets; comparing sets  Operators: making new sets from other sets   union   intersection  – difference  A complement   symmetric difference  And… the Cartesian Product… Cartesian Product 4 Let A and B be sets  The cartesian product of A and B (A x B) is  the set of all ordered pairs (i.e. tuples) where a ∈ A and b ∈ B Set A has elements of the domain {0,1} Set B has elements of the domain {a,b,c} A={0,1}, B={a,b,c} 1st element of each tuple has domain {0,1} 2nd element of each tuple has domain {a,b,c} A x B = {,,, , ,} How does this help with databases? Cartesian Products & Relations 5  Consider a relation between values from A and values from B, denoted as R(A,B)  R({0,1}, {a,b,c}) A ={0,1} B ={a,b,c} 0 a  All possible rows in R: 0 b  Same as {0,1} x {a,b,c} 0 c  If we had less rows in our 1 a relation R, it would be a 1 b subset of {0,1} x {a,b,c} 1 c Relations 6  Relationships between elements of sets are represented using a structure called a relation  “A relation R is a subset of the Cartesian product of the domains that define R”  i.e. of the domains of its attributes  Relations are the fundamental data structure used to store information in databases Relations and their properties 7 A and B are sets a binary relation R from A to B is a subset of the Cartesian product A x B  That is – a set R of ordered pairs where the first element of each ordered pair comes from A and the second element comes from B  Notation  a R b denotes ∈ R  a is related to b by R An example relation 8  Let A = {3,4,5} and B = {x,y}  Cartesian product A x B = {, , , , , }  A relation R between A and B is a subset of this cartesian product  E.g. R= {, , } is a relation from A to B  3 R x True  3 R y False Representing a relation 9 Forename = {Marilyn, Quintin, Simon, Henrik} Domains Surname = {McGee, Cutts, Gay} Names = {,, } Marilyn Cutts R Cutts McGee Gay Marilyn x Quintin McGee Quintin x Simon x Henrik Simon Gay Henrik Foreign Keys as Domains 10  Foreign Keys further restrict an attribute's domain  Consider the example from the previous lecture m n Lecturer Teaches Course initials code  Teaches(Lecturer : VARCHAR(10), Course: VARCHAR(5))  As stated, the domain of the Course attribute is all possible strings of 0..5 in length: "", "a", "b", … "zzzzz", …  But the many-to-many rule tells us that Teaches.Course will have a foreign key constraint with Course.Code  Therefore any Teaches.Course value must also occur in the primary key of Course.Code RELATIONAL ALGEBRA 11 Relational Algebra 12  Sets and relations help us understand how the underlying data elements are structured in a database  Relational Algebra helps us understand how the DBMS extracts the required data from the DB 12 Querying a Database 13 A query… …is performed when we wish to extract from the database a subset of the information that answers some question:  "What are the department names?"  "Tell me all the data held about male employees."  "What are the names of the employees in the R&D department?“ 13 Querying a Database 14  Extraction of results consists of ‘programs’ built out of  retrieving data as a subset of some relation  combining two relations together in a meaningful way 14 Querying Languages 15 Two ways of querying a database: based on  procedural (relational algebra, Pandas) set theory  sequence of operations  the output of each operation is the input to the next operation result ← F4 (F2 (F1(tableA), tableB), F3(tableC)) This is internally  declarative (SQL) implemented  describes the desired results (in terms of conditions) as RA  the DBMS works out the operations operations result ← CONDITIONS (tableA, tableB, tableC) RA is key to understanding SQL query processing! Querying a Relational Database - Procedural 16 Procedural approach - describe the sequence of operations  e.g., using relational algebra  a formal language which allows the description of queries as a sequence of operations  abstractly describes the query op1 op2 R1 R2 V1 R3 V2 result Preliminaries 17  A query is a sequence of operations applied to relation instances, and the result of a query is also a relation instance  Schema of relations are fixed (c.f. previous lectures)  The query will then execute over any valid relation instance…  … and return a relation  The schema of the result can also be determined Relational Algebra 18  Relational algebra is a set of operations which can be combined to provide the result of a query in the form of a relation  The algebra consists of:  a collection of operations which fall into two categories:  special relational operations  traditional set operations  a form of "relational assignment" statement so that partial results can be assigned a name and passed to subsequent operations Relational Assignment 19  A query is made up of a sequence of operations of the form:  Format:  newRelation := UnaryOperation parameter ( inputRelation )  Parameters can include column names or conditions Or (for a binary operator)  newRelation := inputRelation1 Operator parameter inputRelation2 Relation Operations 20  Principal relation operations are:  Select - pick rows from a relation by some condition  Project - pick columns by name  Join - connect two relations (usually by a Foreign Key) Set Operations 21  Set operations include:  union - make a relation containing all the rows of two relations;  intersection - pick the tuples which are common to two relations;  difference - pick the tuples which are in one relation but not another;  Cartesian Product - pair off each of the tuples in one relation with those in another - creating a double sized row for each pair. 21 Selection () 22 Extract the tuples (rows) of a relation (table) which satisfy some condition on the values of their rows and return these as a relation (table view) 22 Selection () 23 Syntax  Condition ( RelationName ) Locals :=  city = "Glasgow" (Employee)  where the condition can contain:  Literals, e.g. “Glasgow”  column names, e.g. city  comparison operators ( =, >, etc. )  boolean operators ( and, not, or ) 23 Selection Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh “Who are the employees that live in Glasgow?” Locals :=  city = "Glasgow" (Employee) Locals name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow 24 Selection Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh “Who are the employees that live in Glasgow?” Locals :=  city = "Glasgow" (Employee) Locals name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow 25 Projection () 26  Extracts some of the columns from a relation, given the names  E.g. GenderSalary :=  (gender, salary) (Employee)  no attribute may occur more than once  duplicates will be removed 26 Projection Employee name NI Dept gender age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh GenderSalary :=  (Gender, salary)(Employee) GenderSalary gender salary F 34 M 23 M 17 F 28 27 Projection () 28  Projection and selection can be combined  (house, street) ( city = "Glasgow" (Employee)) 28 Projection () 29  Projection and selection can be combined  (house, street, city) ( city = "Glasgow" (Employee))  does a selection 29 Projection () 30  Projection and selection can be combined  (house, street, city) ( city = "Glasgow" (Employee))  does a selection followed by a projection  For this query, the following RA is equivalent  city = "Glasgow" ( (house, street, city) (Employee))  A DBMS can re-organise RA expressions for faster retrieval 30 Union () 31  Produces a relation which combines two relations into a new relation containing all of the tuples from each (removing duplicates)  The two relations must be "union compatible" i.e. have the same number of attributes drawn from the same domain  E.g. GlasgowOrStirling :=  city = "Glasgow" (Employee)   city = "Stirling" (Employee) 31 Union Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh GlasgowOrStirling : =  city = "Glasgow" (Employee)   city = "Stirling" (Employee) GlasgowOrStirling name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling 32 Union Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh GlasgowOrStirling : =  city = "Glasgow" (Employee)   city = "Stirling" (Employee) GlasgowOrStirling name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling 33 Union Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh GlasgowOrStirling : =  city = "Glasgow" (Employee)   city = "Stirling" (Employee) GlasgowOrStirling name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling 34 Union Compatibility 35  E.g. S1S2 is valid, but S1R1 is not R1 S2 sid bid day sid sname rating age 22 101 101001 10 Myleene 6 23 99 103 111201 22 Tim 8 26 S1 sid sname rating age 99 Julia 100 20 11 Sue 7 26 88 Gavin 100 21 22 Tim 8 26 33 Bob 9 28 55 Kim 10 28 Intersection () 36 Similar to union but returns tuples that are in both relations E.g. FemalesInGlasgow :=  city = "Glasgow" (Employee)   sex = "F”(Employee) 36 Intersection Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh FemalesInGlasgow :=  city = "Glasgow”(Employee)   sex = "F” (Employee) FemalesInGlasgow name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow 37 Intersection Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh FemalesInGlasgow :=  city = "Glasgow”(Employee)   sex = "F” (Employee) FemalesInGlasgow name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow 38 Intersection Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh FemalesInGlasgow :=  city = "Glasgow”(Employee)   sex = "F” (Employee) FemalesInGlasgow name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow 39 Difference ( – ) 40  Similar to union but returns tuples that are in the first relation but not the second E.g. NonLocals := Employee  Locals  Intersection and difference both require union compatibility  Both use column names from the first relation More Examples of Union compatible operations Customers Employees FN LN FN LN Susan Yao John Smith Ramesh Shah Ricardo Browne Customers  Employees Johnny Kohler Susan Yao FN LN Barbara Jones Francis Johns Susan Yao Amy Ford Ramesh Shah Ramesh Shah Jimmy Wang Johnny Kohler Ernest Gilbert Barbara Jones Employees-Customers Amy Ford Customers Employees Jimmy Wang FN LN FN LN Ernest Gilbert John Smith John Smith Susan Yao Ricardo Browne Ricardo Browne Ramesh Shah Francis Johns Francis Johns 42 Operations on any two relations 43  Relations may have different schema  Conditions indicate how to join the relations around common attributes Cartesian Product of Relations ( X ) 44  Cartesian Product A x B of two relations A and B, which have attributes A1... Am and B1... Bn……  ….is the relation with m + n attributes containing a row for every pair of rows, one from A and one from B  Thus if A has a tuples and B has b tuples then the result has a x b tuples 44 Cartesian Product 45 Employee Dependant name NI dept ENI name sex PDG 123 5 123 BG F QC 456 5 123 DG M 456 FC F Degree: Employee X Dependant name NI dept ENI name sex 3+3 = 6 PDG 123 5 123 BG F PDG 123 5 123 DG M PDG 123 5 456 FC F QC 456 5 123 GC F QC 456 5 123 DG M QC 456 5 456 FC F Cardinality 2x3=6 45 Cartesian Product 46  Cartesian Product - fairly meaningless & rarely used on its own  What would be the corresponding ER diagram for the relations? 1 N NI# Employee has Dependant Name Name Dept  More meaningful is the subset of rows highlighted in green  this could be created with the following selection:   NI# = ENI# ( Employee x Dependant )  the selection essentially makes use of the foreign key  It tells us the Employees and their corresponding Dependants 46 A Join… 47  NI# = ENI# ( Employee x Dependant )  Cartesian Product followed by this kind of selection is called a join (⋈) because it joins together two relations 47 Join = Cartesian Product followed by Selection 48 Employee Dependant name NI dept ENI name sex PDG 123 5 123 BG F QC 456 5 123 DG M 456 FC F Deps =  NI# = ENI# ( Employee x Dependant ) Deps name NI dept ENI name sex PDG 123 5 123 BG F PDG 123 5 123 DG M QC 456 5 456 FC F 48 Join operators 49  Join operators takes two relations and makes a new one, based on the pairing of rows between those two relations  There is a wide variety of join operators  This particular form is called an equi-join: Relation1 ⋈ A=B Relation2   A = B (Relation1 x Relation2) 49 Join = Cartesian Product followed by Selection 50 Deps =  NI# = ENI# ( Employee x Dependant ) Deps name NI dept ENI name sex PDG 123 5 123 BG F PDG 123 5 123 DG M QC 456 5 456 FC F  Deps := Employee ⋈ NI# = ENI# Dependant 50 Natural Join (⋈) 51  In its simplest form, the join of relations A and B pairs off the tuples of A and B so that identically named attributes from the relations have the same value  We now have two columns holding the same value, so we eliminate the duplicated common attributes to form the natural join or inner join  Natural Join is simply written as Relation1 ⋈ Relation2 51 Example 52 R1 B1 bid colour sid bid day 101 red 22 101 101001 102 blue 99 103 111201 103 green sid R1.bid day colour R1 ⋈ B1 22 101 101001 red 99 103 111201 green Join Summary 53  Cartesian product A x B : every row in A with every row in B  Join ⋈: Cartesian product followed by a selection  Equi-join ⋈X=Y: Selection is on some attributes being equal  Natural join ⋈: Selection is on attributes with the same name being equal, and duplicate attributes in the resulting schema removed Summary of Relational Algebra Operators 54 Three basic types of Relational Algebra operators: 1. Applying to one relation projection ∏ (attributes), selection σ (conditions) 2. Applying to two relations of identical structure union ⋃, intersection ⋂, difference - (no conditions) 3. Applying to two relations of different structure (Cartesian) product × (no conditions) joins ⋈ (conditions) Example 55 Employee and Department are two tables in a relational database.  Employee (Name, NI-Number, Email, Phone-No, Works_In)  Department (DeptName, Code, BuildingName) The attribute Works_In is a foreign key in Employee relating to the primary key Code in Department. Express in relational algebra: (a) the names of all departments based in the Main building DeptsInMain :=  DeptName ( BuildingName = “Main” (Departments)) Example 56 Employee and Department are two tables in a relational database.  Employee (Name, NI-Number, Email, Phone-No, Works_In)  Department (DeptName, Code, BuildingName) Schema: The attribute Works_In is a foreign key in Employee relating to the AllEmployeeDepts(Name,NI-Number,Email,Phone- primary key Code in Department. No,Works_In,DeptName,Code,BuildingName) Schema: Express in relational algebra: AccountsEmployeesNames (Name) (b) the names of all the employees of the “Accounts” department. AllEmployeeDepts := Employee ⋈ Works_In = Code Department AccountsEmployees :=  DeptName = “Accounts” (AllEmployeeDepts ) AccountsEmployeesNames :=  Name ( AccountsEmployees ) Relational Algebra Queries – Worked Examples  Consider the following relations: Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh Department Dnum Dname manager 5 R&D 456 6 Production 111 7 Admin 345 Give the name and addresses of all employees who work for the R&D department 61 Relational Algebra Queries: Example 1 62  Give the name and addresses of all employees who work for the R&D department  ResDept :=  Dname = "R&D”(Department)  this should return just one tuple  ResDeptEmps := ResDept ⋈ Dnumber = Dept# Employee  this picks out the employees in that department.  Result :=  (Name, Address) ResDeptEmps  Notes:  the order could be changed – e.g. to make it faster.  the sub-steps could be merged into one:  (Name, Address) ( ( Dname = "R&D" (Department) ⋈ Dnumber = Dept# Employee  but this is not so readable 62 Example 1: Breakdown (1/3) 63 Give name and addresses of employees who work for the R&D dept Department Dnum Dname manager 5 R&D 456 6 Production 111 7 Admin 345 ResDept :=  Dname = "R&D" (Department) ResDept Dnum Dname manager 5 R&D 456 63 Example 1: Breakdown (2/3) 64 Employee name NI Dept sex age salary house street city Smith 123 7 F 52 34 5 High St Glasgow Jones 456 5 M 35 23 21 Lo St Glasgow McSomething 789 6 M 24 17 75 My St Stirling Black 541 5 F 44 28 77 Ur St Edinburgh ResDept Dnum Dname manager ResDeptEmps := 5 R&D 456 ResDept ⋈ Dnum = Dept Employee ResDeptEmps Dnum Dname Manager name NI sex age salary house street city 5 R&D 456 Jones 456 M 35 23 21 Lo St Glasgow 5 R&D 456 Black 541 F 44 28 77 Ur St Edinburgh 64 Example 1: Breakdown (3/3) 65 ResDeptEmps Dnum Dname Manager name NI Dept sex age salary house street city 5 R&D 456 Jones 456 5 M 35 23 21 Lo St Glasgow 5 R&D 456 Black 541 5 F 44 28 77 Ur St Edinburgh  (Name, House, Street, City) ResDeptEmps Result name house street city Jones 21 Lo St Glasgow Black 77 Ur St Edinburgh 65 Example 2 66  List the name, controlling department name, the department manager's name, address and date of birth for every project in Stafford  StaffordProjs :=  PLocation = "Stafford" (Project)  it is common to start by restricting to an area of interest  StaffordProjDepts:= StaffordProjs ⋈ Dnum=Department Department  this brings together the department information  StaffordProjDeptMngrs:= StaffordProjDepts ⋈ MgrNI#=NI# Employee  this brings in the manager information  Result := (Pname, Dname, Name, Address, DateOfBirth) StaffordProjDeptMngrs 66

Use Quizgecko on...
Browser
Browser