Document Details

YouthfulTsavorite7659

Uploaded by YouthfulTsavorite7659

Vietnam-France University

Nguyen Hoang Ha

Tags

database design functional dependencies normal forms database management

Summary

This presentation covers the fundamentals of databases, specifically focusing on functional dependencies and normal forms. Key topics include defining functional dependencies, identifying keys and superkeys, and applying normal forms like 1NF, 2NF, and 3NF to database design.

Full Transcript

FUNDAMENTALS OF DATABASES Functional Dependencies & Normal Forms NGUYEN Hoang Ha Email: [email protected] FUNCTIONAL DEPENDENCIES Objectives  Understand what are functional dependencies  Understand the logic behind the FDs concept  Understand th...

FUNDAMENTALS OF DATABASES Functional Dependencies & Normal Forms NGUYEN Hoang Ha Email: [email protected] FUNCTIONAL DEPENDENCIES Objectives  Understand what are functional dependencies  Understand the logic behind the FDs concept  Understand the rules about FDs  Understand what are keys, super-keys  Understand what are closure-sets and how to determine them 3 Functional Dependency Definition  A functional dependency (FD) is a constraint between two sets of attributes in a Relation from a database  Given a relation R, a set of attributes X in R is said to functionally determine another attribute Y, also in R, (written X → Y) if and only if each X value is associated with precisely one Y value  Say “X → Y holds in R.”  Convention: …, X,Y, Z represent sets of attributes; A, B, C,… represent single attributes.  Convention: no set formers in sets of attributes, just ABC, rather than {A,B,C }. 4 Functional Dependency Definition  Customarily we call X the determinant set and Y the dependent attribute  A functional dependency FD: is called trivial if Y is a subset of X  A Functional Dependency (FD) X → Y on a relation R is a statement of the form: “If 2 tuples of R agree on attribute X, then they must also agree on Y” 5 Splitting Right Sides of FD’s  X → A1A2…An holds for R exactly when each of X→A1, X→A2,…, X→An hold for R.  Example: A → BC is equivalent to A → B and A → C.  There is no splitting rule for left sides.  We’ll generally express FD’s with singleton right sides. 6 6 FD Example Drinkers(name, addr, beersLiked, manf, favBeer)  Reasonable FD’s to assert: 1. name → addr favBeer Note this FD is the same as name → addr and name → favBeer. 2. beersLiked → manf 7 FD Example  Easy to see that: the following FD is true (title, year) → (length, genre, studioName)  Exercise: How about this FD (TRUE or FALSE)? (title, year) → (starName) 8 Properties of functional dependencies Given that X, Y, and Z are sets of attributes in a relation R, one can derive several properties of functional dependencies. Among the most important are Armstrong’s axiom, which are used in database normalization:  Subset Property (Axiom of Reflexivity): If Y is a subset of X, then X → Y  Augmentation (Axiom of Augmentation): If X → Y, then XZ → YZ  Transitivity (Axiom of Transitivity): If X → Y and Y → Z, then X → Z From these rules, we can derive these secondary rules:  Union: If X → Y and X → Z, then X → YZ  Decomposition: If X → YZ, then X → Y and X → Z  Pseudo transitivity: If X → Y and YZ → W, then XZ → W  Accumulation: If X → YZ and Z → V, then X → YZV  Extension: If X → Y and W → Z, then WX → YZ 9 Keys  A set of one or more attributes {A1, A2,.., An} is called a key of the relation R if: 1.Those attributes functionally determine all other attributes of R. This means that: It is impossible for 2 tuples of R to agree on all of {A1, A2,.., An} 2. No proper subset of {A1, A2,.., An} functionally determines all other attributes of R This means that: A Key must be minimal 10 Example: Keys  Exercise 1: (title, year) forms a key?  Exercise 2: (title, year, starName) forms a key ? 11 Super-key  A set of attributes that contains a key is called a Super-key  A superkey is a set of attributes of a relation whose values can be used to uniquely identify a row 12 Example: Super-key  There are many Super-key in the above schema: (title, year, starName) (title, year, starName, length, studioName)  Exercise: Can (title, year, starName, length, studioName) form a key? 13 Closure sets  Suppose: X is a set of attributes, and S is a set of FDs.  The closure of X under S is the set of attributes Y such that can be determined from X using S  Closure of a set of attributes X or {A1,..., An} is denoted X+ or {A1, A2,..., An}+ respectively 14 Example: Closure Set  R (A, B, C, D, E, F)  S = {A → C, A → D, D → E, E → F}  Easy to see that: {A}+ = {A, C, D, E, F}  Naïve idea to find closure set  Basis: Y + = Y.  Induction: Look for an FD’s left side X that is a subset of the current Y +. If the FD is X -> A, add A to Y +. 15 Closure set Finding Algorithm (courtesy [DBSC] Fig. 7.9 [p. 281])  INPUT: A set of attributes A={A1,..,An} and a set of FD’s F OUTPUT: The closure result = A+  Step 1: Split all FDs in F such that: each FD has a single attribute on the right; Step 2: result = A; Step 3: while (changes to result) for each (FD X → Y of F) do if (X is the sub-set of result) result = result U Y; 16 Closure sets Algorithm explain:  Step 1: start with initial set of attributes X  Step 2: identify FD's A → B where A is a sub-set of X, but B ∉ X  Step 3: add B to X  Step 4: repeat until no more attributes can be added to the closure, or, in other words, when you reach a fixpoint 17 Exercise  R (A, B, C, D, E, F)  S = {AB → C, BC → AD, D → E, CF → B}  compute {AB}+  Answer: {AB}+ = {ABCDE} 18 Exercise  R (A, B, C, D)  S = {BC→D, D→A, A→B}  What are all the keys of R?  What are all the super-keys for R that are not keys? 19 Exercise  R (A, B, C, D)  S = {A→B, A→C, C→D}  What are all the keys of R?  What are all the super-keys for R that are not keys?  Answer:  {A}+ = {ABCD} → A is a super-key. A is minimal → A is a key  {C}+= {CD} → C is not a super-key  {B}+= {B} ➔ do not need to check sets whose elements existing on the right side of FDs.  All super sets of {A} is a super-key, e.g: {AB}, {AC}, {AD}…. 20 Exercise  R (A, B, C, D)  S = {A→B, B→C, C→D, D→A}  What are all the keys of R?  What are all the super-keys for R that are not keys?  Answer:  {A}+={ABCD}  {B}+={ABCD}  {C}+={ABCD}  {D}+={ABCD}  → Keys: {A}, {B}, {C}, {D}  → Super-keys: add any attribute to the keys. 21 Exercise  R (A, B, C, D)  S = {AD→B, AB→C, BC→D, CD→A}  What are all the keys of R?  What are all the super-keys for R that are not keys?  Answer:  {AD}+={ADBC}  {AB}+={ABCD}  {BC}+={ABCD}  {CD}+={ABCD}  → Keys: {AD}, {AB}, {BC}, {DD}  → Super-keys: add any attribute to the keys. 22 NORMAL FORMS Anomalies introduction  Careless selection of a relational database schema can lead to redundancy and anomalies  So in this session we shall tackle the problems of relational database designing  Problems such as redundancy that occur when we try to cram too much into a single relation are called anomalies 24 title year length genre studioName starName Star Wars 1977 124 SciFi Fox Carrie Fisher Star Wars 1977 124 SciFi Fox Mark Hamill Star Wars 1977 124 SciFi Fox Harrison Ford Gone With The Wind 1939 231 drama MGM Vivien Leigh Wayne's World 1992 95 comedy Paramount Dana Carvey Wayne's World 1992 95 comedy Paramount Mike Meyers  The principal kinds of anomalies that we encounter are:  Redundancy: information maybe repeated unnecessarily in several tuples (exp: the length and genre)  Update Anomalies: We may change information in one tuple but leave the same information unchanged in another (e.g.: if we found that Star Wars is 125 minutes long, we may change the length in the first tuple but not in the second and third tuples)  Deletion Anomalies: If a set of values becomes empty, we may lose other information as a side effect (e.g.: if we delete “Fox” from as the studioName of “Star Wars”, then we may lost the “Fox” from the list of studio) 25 Decomposition  The accepted way to eliminate anomalies is the decomposition of relations  Decomposition of a relation R involves splitting the attributes of R to make the schemas of 2 new relations  Definition: Given a relation R(A1,..,An), we say R is decomposed into S(B1,..,Bm) and T(C1,..,Ck) if: + {A1,..,An} = {B1,..,Bm} U {C1,..,Ck} + S = ∏B1,..Bm(R) + T = ∏C1,..,Ck(R) 26 Example: Decomposition title year length genre studioName starName Star Wars 1977 124 SciFi Fox Carrie Fisher Star Wars 1977 124 SciFi Fox Mark Hamill Star Wars 1977 124 SciFi Fox Harrison Ford Gone With The Wind 1939 231 drama MGM Vivien Leigh Wayne's World 1992 95 comedy Paramount Dana Carvey Wayne's World 1992 95 comedy Paramount Mike Meyers title year length genre studioName title year starName Star Wars 1977 124 SciFi Fox Star Wars 1977 Carrie Fisher Gone With The Wind 1939 231 drama MGM Star Wars 1977 Mark Hamill Wayne's World 1992 95 comedy Paramount Star Wars 1977 Harrison Ford Gone With The1939 Wind Vivien Leigh Wayne's World1992 Dana Carvey Wayne's World1992 Mike Meyers 27 Discuss title year length genre studioName title year starName Star Wars 1977 124 SciFi Fox Star Wars 1977 Carrie Fisher Gone With The Wind 1939 231 drama MGM Star Wars 1977 Mark Hamill Wayne's World 1992 95 comedy Paramount Star Wars 1977 Harrison Ford Gone With The1939 Wind Vivien Leigh Wayne's World1992 Dana Carvey Wayne's World1992 Mike Meyers  The redundancy is eliminated (the length of each film appears only once)  The risk of an update anomaly is gone (we only have to change the length of Star Wars in one tuple)  The risk of a deletion anomaly is gone (if we delete all the stars for Gone with the wind, that deletion makes the movie disappear from the right but still be found in the left) 28 Decomposition: The Good, Bad and Ugly  We observed that before we decomposing a relation schema will eliminate anomalies;That’s the “Good”  But, decomposition can also have some bad:  Maybe we can’t recovery the original information; OR  After reconstruction, the FDs maybe not hold 29 Loss of information after decomposition R R1 R2 R3 A B C A B C A B B C 1 2 2 3 1 2 3 1 2 3 4 2 2 5 1 2 5 4 2 5 4 2 3 4 2 5  Suppose we have R(A,B,C) but neither of the FD’s B->A nor B->C holds.  R is decomposed into R1 and R2 as above  When we try to re-construct R by Natural Join of R1 and R2, we have: R3 = R1 X R2 (but R3 R1 => We lost information) 30 1NF  First Normal Form  Attributes are single-valued, atomic  Tuples are unique  Case study: Garment management  Is Products (item, colors, price, tax) in 1NF? → Answer: No 31 1NF normalized 32 Example: 1NF converting  The Name attribute is not atomic, so it must be divided into First_Name and Last_Name 33 How to recognize the Design is not in 1-NF ? 34 2NF  Non-primary key attributes depend on all component of PK  PK is a single attribute → guaranteed  Our case:  {item} → {price} , {price} → {tax} so violate 2NF criteria 35 2NF normalized 36 3NF  No non-key attributes depends on others  Example  Products (item, price, tax)  {Item} → {price} , {price} → {tax} 37 Exercise 4 – Make it 2NF  {author_social_security_number} → {author_first_name, author_second_name}  {Book_ISBN_number} → {book_title}  {{author_social_security_number, Book_ISBN_number} → {royalty_percentage} 38 Answer 39 Make it 3NF 40 Exercise  R (A, B, C, D, E, F)  S = {AB → C, BC → AD, D → E, CF → B}  What is the Primary Key, if so whether R is in 3NF? 41 Exercise  R (A, B, C, D)  S = {BC→D, D→A, A→B}  What is the Primary Key, if so whether R is in 3NF? 42 Exercise  R (A, B, C, D)  S = {A→B, A→C, C→D}  What is the Primary Key, if so whether R is in 3NF? 43 Exercise  R (A, B, C, D)  S = {A→B, B→C, C→D, D→A}  What is the Primary Key, if so whether R is in 3NF? 44 Exercise  R (A, B, C, D)  S = {AD→B, AB→C, BC→D, CD→A}  What is the Primary Key, if so whether R is in 3NF? 45

Use Quizgecko on...
Browser
Browser