Database Normalization Lecture 2 PDF
Document Details
Uploaded by TenderConstructivism
Dr. Mohamed Saied Amer
Tags
Summary
This lecture covers database normalization, explaining the concepts of functional dependencies, normal forms (1NF, 2NF, 3NF, and BCNF). It details different types of dependencies and includes examples illustrating the theory and practical application in database design.
Full Transcript
Lecture 2 Database Normalization Database 2 Dr. Mohamed Saied Amer Outline Functional Dependencies. What is Database Normalization? What is the Purpose of Normalization? What is 1NF 2NF and 3NF? The First Normal Form – 1NF The Second Normal Form – 2NF The Third Norma...
Lecture 2 Database Normalization Database 2 Dr. Mohamed Saied Amer Outline Functional Dependencies. What is Database Normalization? What is the Purpose of Normalization? What is 1NF 2NF and 3NF? The First Normal Form – 1NF The Second Normal Form – 2NF The Third Normal Form – 3NF What is BCNF (Boyce-Codd Normal Form) Functional Dependencies A functional dependency is a relationship between two sets of attributes in a database table. It describes how the value of one attribute determines the value of another attribute. Basically, a functional dependency is represented as X→Y where X and Y are sets of attributes in a table. Functional Dependencies Example Suppose we have a table called “Employees” with attributes such as employee ID, name, department, and salary. If we define a functional dependency as “department → salary,” it means that for any two employees who belong to the same department, their salaries will be the same. Another Example roll_no name city Here roll_no is only unique attribute. 1 Yash Delhi So the primary key for the given table will be roll_no. 2 Kartik Mumbai Other attributes such as name and city are dependent 3 Aditya Delhi on the roll_no. i.e. on the basis of roll_no we can get student’s name 4 Kartik Pune and its city. But we can not get roll_no of student based on it’s name or city as it will create ambiguity. So here we can say that name and city are functionally dependent on roll_no. Why Functional Dependency is Used? To maintain Data Integrity Easy to maintain Efficient Data Storage Improved Data Redundancy Types of Dependencies Full Dependency: occurs when all attributes of the primary key are required for the identifying value of a non-primary attribute. Example: roll_no sub_id marks If we want a mark of any student, we require 1 121 80 both roll_no and sub_id. 1 131 65 2 131 95 We cannot obtain marks based on one attribute from the primary key. Types of Dependencies Partial Dependency: occurs when primary key is formed using more than one attribute. This type of key also called as composite key. Example: Here primary key will be roll_no + sub_id because multiple roll_no can have the same sub_id. roll_no sub_id sub_name sub_mark Column of sub_name and the value of sub_name can be easily obtained 1 121 Science 80 by only sub_id which is part of the 1 131 Math 65 primary key. 2 131 Math 95 Types of Dependencies Transitive Dependency: occurs when the value of a non-primary attribute can be defined using another non-primary attribute Example: roll_no name city zip-code Here the primary key is roll_no but 1 abc pune 411044 we can identify the city using zip-code 2 ali Egypt 400001 where city and zip-code both are the 3 uvw pune 411044 primary key Primary Key A Primary key is used to ensure that data in the specific column is unique. A column cannot have NULL values. Example: STUD_NO, as well as STUD_PHONE both, are candidate keys for relation STUDENT but STUD_NO can be chosen as the primary key STUD_NO STUD_NAME STUD_PHONE STUD_STATE STUD_COUNT STUD_AGE 1 RAM 9865278251 Haryana India 20 2 RAM 9655470231 Punjab India 19 3 SUJIT 7514290359 Rajasthan India 18 Foreign Key A Foreign key is a column or group of columns in a relational database table that provides a link between data in two tables. Example: STUD_NO in STUDENT_COURSE is a foreign key to STUD_NO in STUDENT relation. STUD_NO STUD_NAME STUD_PHONE STUD_STATE STUD_COUNT STUD_AGE Register_id STUD_NO Course_name 1 RAM 9865278251 Haryana India 20 1 1 Math 2 RAM 9655470231 Punjab India 19 2 1 Science 3 SUJIT 7514290359 Rajasthan India 18 3 2 Math Composite Key A Composite key is made by the combination of two or more columns in a table that can be used to uniquely identify each row in the table Example: The composite key is the combination of two columns i.e. stud_no and stud_phone because all the rows of the table student can be uniquely identified by this composite key. STUD_NO STUD_NAME STUD_PHONE STUD_STATE STUD_COUNT STUD_AGE 1 RAM 889977 Haryana India 20 2 RAM 889977 Punjab India 19 3 SUJIT 7514290359 Rajasthan India 18 Super Key A Super Key is an attribute (or set of attributes) that is used to uniquely identifies all attributes in a relation. Example: Student{ID, First_name, Last_name, Age, Sex, Phone_no} Here we can see the two candidate keys ID and {First_name, Last_name, DOB, Phone_no}. So here, there are present more than one candidate keys, which can uniquely identify a tuple in a relation. Database Normalization Database normalization is a database design principle for organizing data in an organized and consistent way. It helps you avoid redundancy and maintain the integrity of the database. It helps you eliminate undesirable characteristics associated with insertion, deletion, and updating. Purpose of Normalization The main purpose of database normalization is Avoid complexities. Eliminate duplicates Organize data in a consistent way. Database administrators are able to achieve these relationships by using primary keys, foreign keys, and composite keys. 1NF, 2NF and 3NF 1NF, 2NF, and 3NF are the first three types of database normalization. They stand for first normal form, second normal form, and third normal form, respectively. There are also 4NF (fourth normal form) and 5NF (fifth normal form). All the types of database normalization are cumulative – meaning each one builds on top of those beneath it. So all the concepts in 1NF also carry over to 2NF, and so on. The First Normal Form – 1NF For a table (Relation) to be in the first normal form, it must meet the following criteria: A single cell must not hold more than one value (atomicity) There must be a primary key for identification The First Normal Form – 1NF Example The First Normal Form – 1NF Example Relation STUDENT in table 1 is not in 1NF because of multi-valued attribute STUD_PHONE. Its decomposition into 1NF has been shown in table 2. The Second Normal Form – 2NF It is based on the concept of fully functional dependency A table is said to be in 2NF if it meets the following criteria: it’s already in 1NF has no partial dependency. That is, all non-key attributes are fully dependent on a primary key. The Second Normal Form – 2NF example The Second Normal Form – 2NF example OrderID ProductID ProductName Category 101 1 Laptop Electronics 102 2 Smartphone Electronics 103 3 T-Shirt Apparel 2N F ProductID ProductName ProductID Category 1 Laptop 1 Electronics 2 Smartphone 2 Electronics 3 T-Shirt 3 Apparel The Third Normal Form – 3NF Eliminate transitive partial dependency. For a table to be in 3NF, it must: Be in 2NF Remove transitive partial dependency. The Third Normal Form – 3NF example The Third Normal Form – 3NF example EmployeeID Department SupervisorID SupervisorName 101 HR 102 Jone Doe 102 Management 103 IT 102 Jone Doe 3NF EmployeeID Department SupervisorID SupervisorID SupervisorName 101 HR 102 102 Jone Doe 102 Management 103 IT 102 102 Jone Doe General Normal Form Definitions The above definitions consider the primary key only The following more general definitions take into account relations with multiple candidate keys Any attribute involved in a candidate key is a prime attribute All other attributes are called attributes. Normal Forms Defined Informally 1st normal form All attributes depend on the key 2nd normal form All attributes depend on the whole key 3rd normal form All attributes depend on nothing but the key For reading BCNF (Boyce-Codd Normal Form) A relation schema R is in Boyce-Codd Normal Form (BCNF) if whenever an FD X → A holds in R, then X is a superkey of R Each normal form is strictly stronger than the previous one Every 2NF relation is in 1NF Every 3NF relation is in 2NF Every BCNF relation is in 3NF There exist relations that are in 3NF but not in BCNF Hence BCNF is considered a stronger form of 3NF The goal is to have each relation in BCNF (or 3NF) Thanks Any Questions