Database Normalization Steps Summary PDF
Document Details
![DecisiveGreatWallOfChina1467](https://assets.quizgecko.com/cdn-cgi/image/width=100,height=100,quality=75,format=webp/profile-images/K9h4G1j8j2PI1EWmRvpVijiU5DBfHLidCTSYEgZr.jpg)
Uploaded by DecisiveGreatWallOfChina1467
Tags
Summary
This document summarizes normalization steps in relational database design. It discusses concepts like super keys and functional dependencies, and explains different normal forms (1NF, 2NF, 3NF, BCNF, 4NF, and 5NF) with examples.
Full Transcript
" 607 Summary;All Normalization Steps *...
" 607 Summary;All Normalization Steps * * ==* Super Key (Or Superkey) * * * * * == * A super key is any combination of attributes that can uniquely identify a row in a table. ** ** ==** ** == * * Super keys include all candidate keys, primary keys, and any additional attributes that == ** ** ** ** ** ** make them unique. == * ==* Super keys may contain more attributes than necessary to ensure uniqueness. *== ** Example: In the Student table, ** ` ` ` {Roll Number, Name} and {Roll Number, Email} are super keys ` ` ` ** ** * because they uniquely identify each row, * * * * * * although Roll Number alone could also act as a key. ` ` * A super key can include attributes that aren’t individually unique , as long as the ** ** ==* * *** *** * *== combination of attributes ensures uniqueness for each row. ** ** See [[403 Keys in Relational Databases/Super Key]] [\ \ \ \ ]() * == Functional Dependency Notation: == * ` X → Y ` Indicates that every unique value of X * * ` ` * determines a corresponding unique value of Y. * * * ` ` This doesn't mean that Y must be unique across the entire table. *** ** * ` ` ~~ ~~ ==* It simply means that for any specific value of X , * * * ` ` == == the corresponding Y is the same every time X appears. ` ` * ** ** ` ` ** *** == See 501 Introduction to Functional Dependency/Definition of Functional Dependency [[ ]] " *Multivalued Functional Dependency Notation — (I.e. == == * "...→→...") ` X →→ Y ` Indicates that for every unique value of X , * * ` ` * there exists a set of values for Y ** ** * ` ` * that is independent of other attributes ( Z ).** ** ** ** * ` ` See 502 Types of Functional Dependencies/3. Multivalued Functional Dependency [[ ]] ==* Transitive Functional Dependency * == Given: ` X → Y and Y → Z ` * * ` ` Then: ` X → Z ` Indicates: ==... X can indirectly determine Z ` ` * * ` ` == == so Z is redundant information. ` ` ** ** == See 502 Types of Functional Dependencies/4. Transitive Functional Dependency [[ ]] * In other words, * a transitive dependency occurs when a non-prime attribute ** ** * * ** ** * depends on another non-prime attribute * *** * ** $ 1. Requirements of 1NF — (Atomic Values & Single Data == == Type & No Repeating Groups) * * ~~ ~~ To satisfy the requirements of 1NF: ** ** 1. Each column should contain only atomic values (no arrays or lists). * * ** ** * * ==** ** == *** *** ~~ ~~ 2. Each column should store values of a single data type. * * ** ** ==** **== 3. All entries in a column should be unique for each row , without repeating groups. *** *** ** ** ==** ** * ** ***== *** *** ~~** **~~ ~ Definition: A repeating group means ~ ~ ** ** ~ …storing multiple values in the same field *** * ** * * …or having multiple columns for the same kind of data. *** *** *** * ** * * To meet 1NF, you flatten the data by breaking it into separate rows. ** ** * * * * * * See 602 First Normal Form (1NF) [[ ]] 2. Requirements of 2NF — (No Partial Dependencies) == == * * ~~ ~~ To satisfy the requirements of 2NF: ** ** 1. The table must already be in 1NF. == ** **== 2. There should be no partial dependencies , ==*** *** **~~ ~~**== * meaning all non-key attributes must (functionally) depend on the entire primary key , * ==*** *** ** ** *** *** ** **== * not just part of it. * ~ Recap: Partial dependencies only apply to tables with composite primary keys ~ ==** ** * * ** **== (primary keys consisting of multiple columns). * * See 603 Second Normal Form (2NF) [[ ]] Example: ~ Given table: ~ StudentID, == CourseID , StudentName, Major == ~ Problem: ~ == There exists a partial dependency ** **== ==... since StudentName and Major don't depend on CourseID, ` ` ` ` *** ** * == ==...and also the composite primary key is {StudentID, CourseID}. * * ** ** ` ` == ~ Solution: ~ * Split into two tables to eliminate the partial dependencies: * * * * **~~ ~~*** ➡ # $ # # ~** Student Info Table ** ~ ` StudentID, StudentName, Major ` ~** Enrollment (Relationship) table ** ~ ` StudentID , ` ==** CourseID **== 3. Requirements of 3NF — (No Transitive == == * * ~~ Dependencies) ~~ To satisfy the requirements of 3NF: ** ** 1. The table must already be in 2NF. == * * ** **== 2. There should be no transitive dependencies , ==*** *** **~~ ~~**== * meaning non-key attributes shouldn't depend on other non-key attributes. * ** ** *** *** ~~ *** * **~~ See 502 Types of Functional Dependencies/4. Transitive Functional Dependency [[ ]] See 604 Third Normal Form (3NF)/Requirements of 3NF [[ ]] Example ~ Given table: ~ StudentID, StudentName, MajorID, MajorName, MajorDeptBuilding ~ Problem: ~ == There exists a transitive dependency ** **== ==...since StudentID → MajorID ` ` == ==* and MajorID → MajorName, MajorDeptBuilding * ` ` == ==...Therefore (indirectly) StudentID → MajorName, MajorDeptBuilding * * ` ` == == which is redundant. *** *** == ~ Solution: ~ * Split into two tables to eliminate the transitive dependency: * * * * * **~~ ~~** ~** Student Info table ** ~ ` StudentID , StudentName , MajorID ` ` ` ` ` ~** Major Info table ** ~ == MajorID , MajorName , MajorDeptBuilding == ` ` ` ` $ & # # 3.5. Requirements of Boyce-Codd Normal Form (BCNF) ==* * == (Enhanced 3NF) — (All "X" Superkeys) * * To satisfy the requirements of BCNF: * * ** ** 1. The table must already be in 3NF. ==* * * * ** **== 2. For every functional dependency X → Y , ==* ** *** ** ** ` ` == == ` X should be a superkey. ` * * ** ** == This means that X should uniquely identify every row in the table. ` ` * ** *** See [[403 Keys in Relational Databases/Super Key]] [\ \ \ \ ]() See 605 Boyce-Codd Normal Form (BCNF)/Requirements of BCNF [[ ]] Example ~ Given: ~ Table: StudentID, CourseID, Instructor , EnrollmentDate == == ` StudentID, CourseID → EnrollmentDate ` ==... StudentID, CourseID → Instructor ` ` == ==... CourseID → Instructor ` ` == * Satisfies 3NF because there are no transitive dependencies * ** ** * * *** *** **~~ ~~** * since Student_ID doesn't determine CourseID. * ` ` *** ** * ` ` ~ Problem: ~ == Given CourseID → Instructor ` ` == ==* yet, CourseID is not a superkey! * ` ` *** *** ** ** == ==* This therefore violates the BCNF rule. * ** ** == ~ Solution: ~ ~ Course Info table ~ ` CourseID , Instructor ` ` ` * Now, CourseID is a superkey in this table. * ` ` ** ** ~ Student Enrollment (Relationship) table * * * * ~ ` StudentID , ` ` CourseID , EnrollmentDate ` ` ` * Now, {StudentID, CourseID} are the superkeys in this table. * ` ` ** ** $ ➡ # ✅ 4. Requirements of 4NF — ( ≤ 1 Multivalued == == == == Dependency per table) 1. The table must already be in BCNF. ==* * ** **== 2. A table can have at most one multivalued dependency (per candidate key) ==* * *** *** ** ** == See /3. Multivalued Functional Dependency [[ ]] See 606 Higher Normal Forms (4NF, 5NF)/Fourth Normal Form (4NF) [[ ]] Example ~ Given: ~ Table: StudentID, Sport, Club == == ==... StudentID →→ Sport ` ` == ==... StudentID→→Club ` ` == ~ Problem: ~ == The example has multivalued dependencies since student can be independently ** ** ` ` *** ** associated with multiple sports and multiple clubs. * * * ` ` * * ` ` == == These independent relationships risks redundant data occurring. ** ** == ~ Solution: ~ ~ Student Sports Info table * * ~ ` StudentID, Sport ` ` StudentID →→ Sport ` ~ Student Clubs Info table * * ~ ` StudentID, Club ` ` StudentID →→ Club ` Each table now satisfies ** 4NF, ** * with exactly one multi-valued dependency each. ** *** ** ** * * 5. Requirements of 5NF — (No Join Dependencies) == == * * ~~ ~~ == The table must already be in 4NF. * * ** ** == == There should be no join dependencies, *** *** ~~ ~~ == == meaning the table should not be able to be decomposed further without losing *** *** ~~ ~~ * * information. == * In other words, a join dependency means that * ** ** a table can be split into two or more smaller tables *** ** * * and then recombined (joined) without losing any data or creating duplicates. ** ** ** ** ** *** ~~ ~~ * * **~~ ~~** (If the table can't be split this way, no join dependency exists, and so it is already in * * * * * * ** ** * * ** 5NF) ** ( Join dependency redundancy happens when a single table tries to store multiple == == * * * * * * * independent relationships.) * See 606 Higher Normal Forms (4NF, 5NF)/Fifth Normal Form (5NF) [[ ]] Example ~ Given: ~ Table: CourseID , Instructor , StudentID , InstructorApproved ` ` ` ` ` ` == == ~ Problem: ~ == There is a join dependency ** **== == because CourseID, Instructor →→ StudentID (multiple values possible) ` ` == == and CourseID, Instructor → InstructorApproved (only one value possible) ` ` == == So redundancy can occur when adding a new record == == because there can be multiple redundant instances of == == ` CourseID + Instructor + InstructorApproved. ` ** ** ` ` ** ** ` ` == ~ Solution: ~ ~ Course Instructor Info table ~ # # ` CourseID , ` ` Instructor , InstructorApproved ` ` ` ~ Course Student Info table ~ ` CourseID , ` ` StudentID , Instructor ` ` ` We've achieved 5NF since there is a table to represent each relationship individually. ** ** * * * * ** ** *** ***