Unit_2 RDBMS.pptx

Unit II RDBMS BCACCA3101 Table of Content  Introduction to Distributed Database  Classification of DBMS  Introduction to RDBMS  Relational Model –Concepts  Relational operations (Insert, delete, update, select, project, rename, union, intersection, minus, Join, division)  Transactions and ER mapping Examples  Normalization of RDBMS (1NF, 2NF, 3NF and 4NF) and inference rules Introduction to Distributed Database  A distributed database is basically a database that is not limited to one system, it is spread over different sites, i.e, on multiple computers or over a network of computers. A distributed database system is located on various sites that don’t share physical components. This may be required when a particular database needs to be accessed by various users globally. It needs to be managed such that for the users it looks like one single database. 1. Homogeneous Database: In a homogeneous database, all different sites store database identically. The operating system, database management system, and the data structures used – all are the same at all sites. Hence, they’re easy to manage. 2. Heterogeneous Database: In a heterogeneous distributed database, different sites can use different schema and software that can lead to problems in query processing and transactions. Also, a particular site might be completely unaware of the other sites. Different computers may use a different operating system, different database application. They may even use different data models for the database. Hence, translations are required for different sites to communicate. Classification of DBMS  Based on Data Model  Based on Number of Users  Based on Purpose  Based on Number of Sites Classification Based on Data Models  Hierarchical Model is used for representing data in a tree-like structure, where each child node in the tree has a single parent and each parent can have multiple children.  Network Model is used for representing data in the form of nodes connected via links between them. Unlike hierarchical model, it allows each record to have multiple children and parent nodes to form a generalized graph structure.  Relational Model leverages the power of tables to represent data. Each table comprises rows and columns with relationships established through keys.  Entity-Relationship Model emphasizes entities, their attributes and the relationships between them. It employs entities to represent real-world objects and illustrates how they interact, providing a blue-print for database schema design. Classification Based on Number of Users  The database management system can also be classified on the basis of its user. So, a DBMS can either be used by a single user or it can be used by multiple users.  The database system is stored on a single computer and accessed by a single user at a time is referred to as a single-user system. If user A is using the database user B or C must wait until user A is through.  The database is stored in single system accessed by multiple users at a time is referred to as a multiple user system. Classification Based on Purpose  On the basis of the access path that is used to store the files, the database can be classified as general-purpose DBMS and special-purpose DBMS.  The special-purpose DBMS is the one that is designed for a specific application and it can not be used for another application without performing any major changes we refer to this as online transaction processing (OLTP). The OLTP system supports a large number of transactions concurrently without any delay.  The general-purpose DBMS is the one that is designed to meet the need of as many applications as possible. Classification Based on Number of Sites a. Centralized DBMS In the centralized DBMS, the entire database is stored in a single computer site and is maintained and modified from that location only and usually accessed using an internet connection such as LAN and MAN. The centralized database support multiple users. b. Distributed DBMS  In the distributed DBMS (DDBMS) the database and the DBMS software are distributed over many computer sites. These computer sites are connected via a computer network. The DDBMS is further classified as homogeneous DDBMS and heterogeneous DDBMS.  Homogeneous DDBMS: The homogeneous DDBMS has the same DBMS software at all the distributed sites.  Heterogeneous DDBMS: The heterogeneous DDBMS has different DBMS software for different sites. Introduction to RDBMS What is RDBMS (Relational Database Management System)?  RDBMS stands for Relational Database Management System.  All modern database management systems like SQL, MS SQL Server, IBM DB2, ORACLE, My-SQL, and Microsoft Access are based on RDBMS.  It is called Relational Database Management System (RDBMS) because it is based on the relational model introduced by E.F. Codd. How it works?  Data is represented in terms of tuples (rows) in RDBMS.  A relational database is the most commonly used database. It contains several tables, and each table has its primary key.  Due to a collection of an organized set of tables, data can be accessed easily in RDBMS. Following are the various terminologies of RDBMS: What is table/Relation?  Everything in a relational database is stored in the form of relations. The RDBMS database uses tables to store data.  A table is a collection of related data entries and contains rows and columns to store data. Each table represents some real-world objects such as person, place, or event about which information is collected. The organized collection of data into a relational table is known as the logical view of the database. Properties of a Relation: Each relation has a unique name by which it is identified in the database. Relation does not contain duplicate tuples. The tuples of a relation have no specific order. All attributes in a relation are atomic, i.e., each cell of a relation contains exactly one value.  A table is the simplest example of data stored in RDBMS.  Let's see the example of the student table. ID Name AGE COURSE 1 Ajeet 24 B.Tech 2 aryan 20 C.A 3 Mahesh 21 BCA 4 Ratan 22 MCA 5 Vimal 26 BSC What is a row or record? A row of a table is also called a record or tuple. It contains the specific information of each entry in the table. It is a horizontal entity in the table. For example, The above table contains 5 records. Properties of a row: No two tuples are identical to each other in all their entries. All tuples of the relation have the same format and the same number of entries. The order of the tuple is irrelevant. They are identified by their content, not by their position. Let's see one record/row in the table. ID Name AGE COURSE 1 Ajeet 24 B.Tech What is a column/attribute? A column is a vertical entity in the table which contains all information associated with a specific field in a table. For example, "name" is a column in the above table which contains all information about a student's name. Properties of an Attribute: Every attribute of a relation must have a name. Name Null values are permitted for the attributes. Ajeet Default values can be specified for an attribute Aryan automatically inserted if no other value is specified Mahesh for an attribute. Ratan Attributes that uniquely identify each tuple of a Vimal relation are the primary key. What is data item/Cells? The smallest unit of data in the table is the individual data item. It is stored at the intersection of tuples and attributes. Properties of data items: Data items are atomic. The data items for an attribute should be drawn from the same domain.  In the below example, the data item in the student table consists of Ajeet, 24 and Btech, etc. ID Name AGE COURSE 1 Ajeet 24 B.Tech Degree:  The total number of attributes that comprise a relation is known as the degree of the table.  For example, the student table has 4 attributes, and its degree is 4. ID Name AGE COURSE 1 Ajeet 24 B.Tech 2 aryan 20 C.A 3 Mahesh 21 BCA 4 Ratan 22 MCA 5 Vimal 26 BSC Cardinality:  The total number of tuples at any one time in a relation is known as the table's cardinality. The relation whose cardinality is 0 is called an empty table.  For example, the student table has 5 rows, and its cardinality is 5. ID Name AGE COURSE 1 Ajeet 24 B.Tech 2 aryan 20 C.A 3 Mahesh 21 BCA 4 Ratan 22 MCA 5 Vimal 26 BSC Domain:  The domain refers to the possible values each attribute can contain. It can be specified using standard data types such as integers, floating numbers, etc. For example, An attribute entitled Marital_Status may be limited to married or unmarried values. NULL Values  The NULL value of the table specifies that the field has been left blank during record creation. It is different from the value filled with zero or a field that contains space. Relational Model –Concepts  Relational Model (RM) represents the database as a collection of relations. A relation is nothing but a table of values. Every row in the table represents a collection of related data values. These rows in the table denote a real-world entity or relationship.  Atomicity keeps data accurate by ensuring that all changes to the data are either made completely or not made at all and that there are no partial changes to the data.  Consistency ensures that the state of the database remains consistent throughout the transaction.  Isolation ensures that changes made by one transaction are not visible to other transactions until the changes are committed.  Durability ensures that once changes are committed, they will remain even if there is a system failure. Relational Operations/Algebra  Relational algebra is a procedural query language. It gives a step by step process to obtain the result of the query. It uses operators to perform queries.  Types of Relational operation 1. Select Operation: The select operation selects tuples that satisfy a given predicate. It is denoted by sigma (σ) Notation: σ p(r) or σ(condition)(r) Where: σ is used for selection prediction r is used for relation p is used as a propositional logic formula which may use connectors like: AND OR and NOT. These relational can use as relational operators like =, ≠, ≥, , ≤.  For example: LOAN Relation BRANCH_NAME LOAN_NO AMOUNT Downtown L-17 1000 Redwood L-23 2000 Perryride L-15 1500 Downtown L-14 1500 Mianus L-13 500 Roundhill L-11 900 Perryride L-16 1300 Input: σ BRANCH_NAME="perryride" (LOAN) Output: BRANCH_NAME LOAN_NO AMOUNT Perryride L-15 1500 Perryride L-16 1300 2. Project Operation: This operation shows the list of those attributes that we wish to appear in the result. Rest of the attributes are eliminated from the table. It is denoted by ∏. Notation: ∏ A1, A2, An (r) Where A1, A2, A3 is used as an attribute name of relation r. Example: CUSTOMER RELATION NAME STREET CITY Jones Main Harrison Smith North Rye Hays Main Harrison Curry North Rye Johnson Alma Brooklyn Brooks Senator Brooklyn Input: ∏ NAME, CITY (CUSTOMER) Output: NAME CITY Jones Harrison Smith Rye Hays Harrison Curry Rye Johnson Brooklyn Brooks Brooklyn 3. Union Operation: Suppose there are two tuples R and S. The union operation contains all the tuples that are either in R or S or both in R & S. It eliminates the duplicate tuples. It is denoted by ∪. Notation: R ∪ S  A union operation must hold the following condition: R and S must have the attribute of the same number. Duplicate tuples are eliminated automatically. Example: DEPOSITOR RELATION CUSTOMER_NAME ACCOUNT_NO Johnson A-101 Smith A-121 Mayes A-321 Turner A-176 Johnson A-273 Jones A-472 Lindsay A-284  BORROW RELATION CUSTOMER_NAME LOAN_NO Jones L-17 Smith L-23 Hayes L-15 Jackson L-14 Curry L-93 Smith L-11 Williams L-17 Input: ∏ CUSTOMER_NAME (BORROW) ∪ ∏ CUSTOMER_NAME (DEPOSITOR) Output: CUSTOMER_NAME Johnson Smith Hayes Turner Jones Lindsay Jackson Curry Williams Mayes 4. Set Intersection/Intersection: Suppose there are two tuples R and S. The set intersection operation contains all tuples that are in both R & S. It is denoted by intersection ∩. Notation: R ∩ S Example: Using the above DEPOSITOR table and BORROW table Input: ∏ CUSTOMER_NAME (BORROW) ∩ ∏ CUSTOMER_NAME (DEPOSITOR) Output: CUSTOMER_NAME Smith Jones 5. Set Difference/Difference/Minus: Suppose there are two tuples R and S. The set intersection operation contains all tuples that are in R but not in S. It is denoted by intersection minus (-). Notation: R - S Example: Using the above DEPOSITOR table and BORROW table Input: ∏ CUSTOMER_NAME (BORROW) - ∏ CUSTOMER_NAME (DEPOSITOR) Output: CUSTOMER_NAME Jackson Hayes Willians Curry 6. Cartesian product/Cross Product The Cartesian product is used to combine each row in one table with each row in the other table. It is also known as a cross product. It is denoted by X. Notation: E X D  Example: EMPLOYEE EMP_ID EMP_NAME EMP_DEPT 1 Smith A 2 Harry C DEPARTMENT 3 John B DEPT_NO DEPT_NAME A Marketing B Sales C Legal Input: EMPLOYEE X DEPARTMENT Output: EMP_ID EMP_NAME EMP_DEPT DEPT_NO DEPT_NAME 1 Smith A A Marketing 1 Smith A B Sales 1 Smith A C Legal 2 Harry C A Marketing 2 Harry C B Sales 2 Harry C C Legal 3 John B A Marketing 3 John B B Sales 3 John B C Legal 7. Rename Operation:  The rename operation is used to rename the output relation. It is denoted by rho (ρ).  Example: We can use the rename operator to rename STUDENT relation to STUDENT1. Notation: ρ(STUDENT1, STUDENT) 8. Division operation  The division operator is used for queries which involve the 'all'.  R1 ÷ R2 = tuples of R1 associated with all tuples of R2.  Example: Retrieve the name of the subject that is taught in all courses. Name Course ÷ = Course Name System Btech Btech database Database Mtech Mtech Database Btech Algebra Btech 9. Join Operations:  A Join operation combines related tuples from different relations, if and only if a given join condition is satisfied.  It is denoted by ⋈.  Example: EMPLOYEE EMP_CODE EMP_NAME 101 Stephan 102 Jack SALARY 103 Harry EMP_CODE SALARY 101 50000 102 30000 103 25000 Input : (EMPLOYEE ⋈ SALARY) Output: EMP_CODE EMP_NAME SALARY 101 Stephan 50000 102 Jack 30000 103 Harry 25000 Types of Join operations: SQL Joins 1. Natural join a JOIN operation that creates an implicit join clause for you based on the common columns in the two tables being joined. 2. Full Outer Join or Full Join retrieves all matching and non-matching rows from both tables. If no matching values exist, NULL values will return. 3. Left Outer Join (Left Join) retrieves all records from the left or first table and the matching records from the second or right table. It returns NULL values if no matching values exist in the right table. 4. Right Join (Right Outer Join) retrieves all records from the right table and the matching records from the left table. It returns NULL values if no matching values exist in the left table. 5. EQUI JOIN creates a JOIN for equality or matching column(s) values of the relative tables. 1. Natural Join: A natural join is the set of tuples of all combinations in R and S that are equal on their common attribute names. It is denoted by ⋈. Example: Let's use the above EMPLOYEE table and SALARY table: Input: ∏EMP_NAME, SALARY (EMPLOYEE ⋈ SALARY) Output: EMP_NAME SALARY Stephan 50000 Jack 30000 Harry 25000 2. Outer Join:  The outer join operation is an extension of the join operation. It is used to deal with missing information. Example: EMPLOYEE EMP_NAME STREET CITY Ram Civil line Mumbai Shyam Park street Kolkata Ravi M.G. Street Delhi FACT_WORKERSHari Nehru nagar Hyderabad EMP_NAME BRANCH SALARY Ram Infosys 10000 Shyam Wipro 20000 Kuber HCL 30000 Hari TCS 50000 Input: (EMPLOYEE ⋈ FACT_WORKERS) Output: EMP_NAME STREET CITY BRANCH SALARY Ram Civil line Mumbai Infosys 10000 Shyam Park street Kolkata Wipro 20000 Hari Nehru nagar Hyderabad TCS 50000 An outer join is basically of three types: a. Left outer join b. Right outer join c. Full outer join a. Left outer join: Left outer join contains the set of tuples of all combinations in R and S that are equal on their common attribute names. In the left outer join, tuples in R have no matching tuples in S. It is denoted by ⟕. Example: Using the above EMPLOYEE table and FACT_WORKERS table Input: EMPLOYEE ⟕ FACT_WORKERS EMP_NAME STREET CITY BRANCH SALARY Ram Civil line Mumbai Infosys 10000 Shyam Park street Kolkata Wipro 20000 Hari Nehru street Hyderabad TCS 50000 Ravi M.G. Street Delhi NULL NULL b. Right outer join: Right outer join contains the set of tuples of all combinations in R and S that are equal on their common attribute names. In right outer join, tuples in S have no matching tuples in R. It is denoted by ⟖. Example: Using the above EMPLOYEE table and FACT_WORKERS Relation Input: EMPLOYEE ⟖ FACT_WORKERS Output: EMP_NAME BRANCH SALARY STREET CITY Ram Infosys 10000 Civil line Mumbai Shyam Wipro 20000 Park street Kolkata Hari TCS 50000 Nehru street Hyderabad Kuber HCL 30000 NULL NULL c. Full outer join: Full outer join is like a left or right join except that it contains all rows from both tables. In full outer join, tuples in R that have no matching tuples in S and tuples in S that have no matching tuples in R in their common attribute name. It is denoted by ⟗. Example: Using the above EMPLOYEE table and FACT_WORKERS table Input: EMPLOYEE ⟗ FACT_WORKERS Output: EMP_NAME STREET CITY BRANCH SALARY Ram Civil line Mumbai Infosys 10000 Shyam Park street Kolkata Wipro 20000 Hari Nehru street Hyderabad TCS 50000 Ravi M.G. Street Delhi NULL NULL Kuber NULL NULL HCL 30000 3. Inner/Equi join:  It is also known as an inner join. It is the most common join. It is based on matched data as per the equality condition. The equi join uses the comparison operator(=). Example: CUSTOMER RELATION CLASS_ID NAME 1 John 2 Harry PRODUCT 3 Jackson PRODUCT_ID CITY 1 Delhi 2 Mumbai 3 Noida Input: CUSTOMER ⋈ PRODUCT Output: CLASS_ID NAME PRODUCT_ID CITY 1 John 1 Delhi 2 Harry 2 Mumbai 3 Harry 3 Noida Transaction and ER Mapping  Transaction Management in DBMS A transaction is a set of logically related operations.  For example, you are transferring money from your bank account to your friend’s account, the set of operations would be like this: Simple Transaction Example 1. Read your account balance 2. Deduct the amount from your balance 3. Write the remaining balance to your account 4. Read your friend’s account balance 5. Add the amount to his account balance 6. Write the new updated balance to his account This whole set of operations can be called a transaction. Although I have shown you read, write and update operations in the above example but the transaction can have operations like read, write, insert, update, delete. In DBMS, we write the above 6 steps transaction like this: Lets say your account is A and your friend’s account is B, you are transferring 10000 from A to B, the steps of the transaction are: 1. R(A); 2. A = A - 10000; 3. W(A); 4. R(B); 5. B = B + 10000; 6. W(B); In the above transaction R refers to the Read operation and W refers to the write operation. Transaction failure in between the operations Now that we understand what is transaction, we should understand what are the problems associated with it. The main problem that can happen during a transaction is that the transaction can fail before finishing the all the operations in the set. This can happen due to power failure, system crash etc. This is a serious problem that can leave database in an inconsistent state. Assume that transaction fail after third operation (see the example above) then the amount would be deducted from your account but your friend will not receive it. To solve this problem, we have the following two operations Commit: If all the operations in a transaction are completed successfully then commit those changes to the database permanently. Rollback: If any of the operation fails then rollback all the changes done by previous operations. ER mapping  Following are the main components and its symbols in ER Diagrams:  Rectangles: This Entity Relationship Diagram symbol represents entity types  Ellipses : Symbol represent attributes  Diamonds: This symbol represents relationship types  Lines: It links attributes to entity types and entity types with other relationship types  Primary key: attributes are underlined  Double Ellipses: Represent multi-valued attributes  ER Model, when conceptualized into diagrams, gives a good overview of entity- relationship, which is easier to understand. ER diagrams can be mapped to relational schema, that is, it is possible to create relational schema using ER diagram. We cannot import all the ER constraints into relational model, but an approximate schema can be generated.  There are several processes and algorithms available to convert ER Diagrams into Relational Schema. Some of them are automated and some of them are manual. We may focus here on the mapping diagram contents to relational basics.  ER diagrams mainly comprise of − Entity and its attributes Relationship, which is association among entities. Mapping Entity  An entity is a real-world object with some attributes. Mapping Process (Algorithm) Create table for each entity. Entity's attributes should become fields of tables with their respective data types. Declare primary key. Mapping Relationship  A relationship is an association among entities. Mapping Process Create table for a relationship. Add the primary keys of all participating Entities as fields of table with their respective data types. If relationship has any attribute, add each attribute as field of table. Declare a primary key composing all the primary keys of participating entities. Declare all foreign key constraints. Mapping Weak Entity Sets  A weak entity set is one which does not have any primary key associated with it.  Mapping Process Create table for weak entity set. Add all its attributes to table as field. Add the primary key of identifying entity set. Declare all foreign key constraints. Mapping Hierarchical Entities ER specialization or generalization comes in the form of hierarchical entity sets. Mapping Process Create tables for all higher-level entities. Create tables for lower-level entities. Add primary keys of higher-level entities in the table of lower-level entities. In lower-level tables, add all other attributes of lower-level entities. Declare primary key of higher-level table and the primary key for lower-level table. Declare foreign key constraints. Normalization of RDBMS Need Of Normalization  A large database defined as a single relation may result in data duplication. This repetition of data may result in: Making relations very large. It isn't easy to maintain and update data as it would involve searching many records in relation. Wastage and poor utilization of disk space and resources. The likelihood of errors and inconsistencies increases. So to handle these problems, we should analyze and decompose the relations with redundant data into smaller, simpler and well structured relations that satisfy desirable properties. What is Normalization?  Normalization is the process of organizing the data in the database.  Normalization is used to minimize the redundancy from a relation or set of relations. It is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies.  Normalization divides the larger table into smaller and links them using relationships.  The normal form is used to reduce redundancy from the database table. Or we can say that Normalization is the process of decomposing the relations into relations with fewer attributes. Why do we need Normalization?  The main reason for normalizing the relations is removing these anomalies. Failure to eliminate anomalies leads to data redundancy and can cause data integrity and other problems as the database grows. Normalization consists of a series of guidelines that helps to guide you in creating a good database structure.  Data modification anomalies can be categorized into three types: Insertion Anomaly: Insertion Anomaly refers to when one cannot insert a new tuple into a relationship due to lack of data. Deletion Anomaly: The delete anomaly refers to the situation where the deletion of data results in the unintended loss of some other important data. Updatation Anomaly: The update anomaly is when an update of a single data value requires multiple rows of data to be updated. Types of Normal Forms Normalization works through a series of stages called Normal forms. The normal forms apply to individual relations. The relation is said to be in particular normal form if it satisfies constraints. First Normal Form (1NF)  A relation will be 1NF if it contains an atomic value.  It states that an attribute of a table cannot hold multiple values. It must hold only single-valued attribute.  First normal form disallows the multi-valued attribute, composite attribute, and their combinations. Relation EMPLOYEE is not in 1NF because of multi-valued The decomposition of the EMPLOYEE table into 1NF attribute EMP_PHONE. Partial Function Dependency  Partial functional dependency: This means that a non-key column (a column that is not part of the primary key) depends on only a part of the primary key.  For example, let's say we have Student_ID as the primary key and we want to record the Course_Name that each student is enrolled in. If the Course_Name depends on the Student_ID alone, that's a full functional dependency. However, if Course_Name depends only on Course_ID and not on Student_ID, that's a partial functional dependency.  Student_ID = 1, Course_ID = 101, Course_Name = Math, Student_ID = 1, Course_ID = 102, Course_Name = Science  Here, the Course_Name (Math or Science) depends only on the Course_ID (101 or 102), not on the Student_ID.  Second Normal Form (2NF) In the 2NF, relational must be in 1NF.  In the second normal form, all non-key attributes are fully functional dependent on the primary key  Example: Let's assume, a school can store the data of teachers and the subjects they teach. In a school, a teacher can teach more than one subject. non-prime attribute TEACHER_AGE is dependent on TEACHER_ID which is a proper subset of a candidate key. That's why it violates the rule for 2NF. Transitive Dependency  It's a situation where a column depends on another column through a third column.  In our example, let's say that the Project_Manager depends on the Project_ID. However, if the Employee_ID determines the Project_ID, and the Project_ID determines the Project_Manager, then the Employee_ID indirectly determines the Project_Manager. This is a transitive dependency because the Employee_ID influences the Project_Manager through the Project_ID.  Employee_ID influences Project_Manager indirectly through Project_ID. If you know the Employee_ID and Project_ID, you can figure out the Project_Manager. This indirect relationship can make managing and updating data more complex and less efficient. Third Normal Form (3NF)  A relation will be in 3NF if it is in 2NF and not contain any transitive partial dependency.  3NF is used to reduce the data duplication. It is also used to achieve the data integrity.  If there is no transitive dependency for non-prime attributes, then the relation must be in third normal form.  A relation is in third normal form if it holds atleast one of the following conditions for every non-trivial function dependency X → Y.  X is a super key.  Y is a prime attribute, i.e., each element of Y is part of some candidate key. Super key in the table above: {EMP_ID}, {EMP_ID, EMP_NAME}, {EMP_ID, EMP_NAME, EMP_ZIP}....so on Candidate key: {EMP_ID} Non-prime attributes: In the given table, all attributes except EMP_ID are non-prime. Here, EMP_STATE & EMP_CITY dependent on EMP_ZIP and EMP_ZIP dependent on EMP_ID. The non- prime attributes (EMP_STATE, EMP_CITY) transitively dependent on super key(EMP_ID). It violates the rule of third normal form. That's why we need to move the EMP_CITY and EMP_STATE to the new table, with EMP_ZIP as a Primary key. Boyce Codd normal form (BCNF)  the Boyce-Codd Normal Form (BCNF) helps to eliminate or minimize join dependencies by addressing functional dependencies within a table.  Join dependencies occur when a table can be decomposed into multiple tables, and their attributes can be reconstructed through joins.  BCNF aims to reduce these join dependencies by breaking down a table into smaller, more focused tables that have clearer relationships. 4 Normal Form  4NF specifically addresses multi-valued dependencies, which occur when an attribute depends on a combination of key attributes rather than just one key attribute.  In the Following Example, there's a multi-valued dependency present. Teacher_Name depends only on Teacher_ID, and Course_Name depends only on Course_ID. These dependencies result in redundant data, as the same teacher's name and course name are repeated for every course they teach.  To apply 4NF, we need to split this table into two separate tables: one for teacher information and another for course information. This eliminates the redundancy caused by multi-valued dependencies. Advantages of Normalization Normalization helps to minimize data redundancy. Greater overall database organization. Data consistency within the database. Much more flexible database design. Enforces the concept of relational integrity. Disadvantages of Normalization You cannot start building the database before knowing what the user needs. The performance degrades when normalizing the relations to higher normal forms, i.e., 4NF, 5NF. It is very time-consuming and difficult to normalize relations of a higher degree. Careless decomposition may lead to a bad database design, leading to serious problems. Inference rules  Inference rules in a database management system (DBMS) are a set of logical rules that allow you to deduce new information from existing facts or conditions stored in the database. These rules help in making logical deductions or drawing conclusions based on established relationships and dependencies within the data.  Inference rules are commonly used in database systems to perform various tasks, including query optimization, data integrity checks, and security enforcement. They help the DBMS make intelligent decisions and enforce constraints to maintain the accuracy and consistency of data.  Transitive Rule: If A is related to B, and B is related to C, then we can infer that A is related to C. This rule is often used for enforcing referential integrity. Example: If a student (A) is enrolled in a course (B), and that course (B) is related to a department (C), then we can infer that the student (A) is associated with the department (C).  Augmentation Rule: If A implies B, then adding more attributes to A also implies adding those attributes to B. Example: If an employee's Employee_ID (A) uniquely determines their Employee_Name (B), then adding more attributes like Department_ID to Employee_ID (A) would also imply adding Department_ID to Employee_Name (B).  Project and Join Rules: These rules help optimize queries by reducing the number of attributes involved and minimizing join operations. Example: If you're selecting only a few attributes (Project) from a table, the inference rule might suggest that certain attributes that are not selected can be safely ignored. Similarly, the Join rule helps optimize queries involving multiple tables by considering only the necessary join conditions.  Constraint Enforcement: Inference rules are used to enforce constraints like primary keys, foreign keys, and unique constraints. For instance, if you have a foreign key relationship between two tables, the inference rule ensures that a value cannot be inserted in the child table if the corresponding value doesn't exist in the parent table. These are just a few examples of inference rules in a DBMS. They help maintain data integrity, improve query performance, and ensure that the data remains consistent and accurate throughout its lifecycle. Thanky ou

Document Details

Tags

Related

Full Transcript