Document Details

ConcisePluto

Uploaded by ConcisePluto

Tags

relational database database design functional dependencies normalization

Summary

This document details relational database design concepts. It covers topics like what a relational database is and how data is organized into tables, columns, and relationships. Furthermore, the document describes database design principles such as data integrity and normalization. It also elaborates on types of relationships, the decomposition process, and functional dependencies used in database design.

Full Transcript

RELATIONAL DATABASE DESIGN WHAT IS RELATIONAL DATABASE? A relational database is a type of database that organizes data into tables (relations) with rows (records) and columns (fields). These tables are interconnected through relationships, allowing data to be accessed and...

RELATIONAL DATABASE DESIGN WHAT IS RELATIONAL DATABASE? A relational database is a type of database that organizes data into tables (relations) with rows (records) and columns (fields). These tables are interconnected through relationships, allowing data to be accessed and combined in various ways. WHAT IS DATABASE DESIGN? Database design is the process of creating an efficient and organized structure for storing and managing data in a database. It involves defining tables, columns, relationships, and constraints to ensure data integrity, minimize redundancy, and optimize performance. Proper database design is the foundation for building robust and scalable applications. DATA INTEGRITY refers to the accuracy, consistency, and reliability of data stored in a database. It ensures that the data follows specific rules and constraints, preventing errors and inconsistencies. There are three types of data integrity: Entity Integrity: Ensures each row in a table is uniquely identifiable by a primary key, and the primary key cannot have null values. Referential Integrity: Maintains relationships between tables by ensuring foreign key values in one table match the primary key values in another table. Domain Integrity: Enforces valid entries for a given column by restricting the data type, format, and range of values that can be stored. ATOMIC VALUES In database design, it's important to store atomic values, which means storing the smallest pieces of information that cannot be further divided. This principle helps maintain data integrity and avoid redundancy. RELATIONSHIPS Relationships are the cornerstone of relational databases, allowing you to connect and combine data from different tables. There are three main types of relationships: ONE-TO-ONE In a one-to-one relationship, each record in one table is associated with exactly one record in another table, and vice versa. Example: Product Sales Database ONE-TO-MANY In a one-to-many relationship, each record in one table (the "one" side) can be associated with multiple records in another table (the "many" side). Example: Class Roster Database MANY-TO-MANY In a many-to-many relationship, each record in one table can be associated with multiple records in another table, and vice versa. Example: Product Sales Database RELATIONAL DATABASE DESIGN PROCESS 1. Requirement Analysis Before designing a relational database, it is essential to understand the requirements of the system and the data it needs to manage. This involves interviewing stakeholders and reviewing documentation to identify entities, attributes, and relationships. 2. Conceptual Design The conceptual design is the high-level mapping of the system. This involves creating an Entity- Relationship Diagram (ERD), which visually represents the entities, attributes, and relationships. 3. Logical Design Logical design involves converting the conceptual model into a relational model, defining tables, columns, and relationships. This is where the primary and foreign keys are defined. 4. Normalization At this stage, the database schema is normalized to reduce data redundancy and ensure the structure is efficient and robust. 5. Physical Design The physical design deals with the actual implementation of the database, focusing on performance considerations such as indexing and data storage. 6. Implementation The database is then implemented using a Database Management System (DBMS), like MySQL, PostgreSQL, or Microsoft SQL Server, with tables and relationships created based on the design. Features of Good Relational Design Features of Good Relational Design A good relational database design is the foundation for organizing data, building databases that are easy to use, maintain, and extend while ensuring the data is stored logically, consistently, and efficiently while meeting business requirements and maintaining long-term scalability. Features of Good Relational Design ACID Compliance – These four properties ensures that the database can handle complex, multi-step operations while preserving integrity and reliability of a database. Atomicity – requires all of the tasks or operations to succeed, or the transaction will roll back and the database remains unchanged. Consistency – the state of the database must remain consistent throughout the transaction, from one valid state to another. This ensures they remain in a correct state before and after a transaction. If a transaction violates these rules, it will not be allowed to complete and the database is rolled back to its previous consistent state. Isolation – each transaction is separate and not dependent on others. Isolation ensures that the operations of a transaction is invisible until it is completed. This reduces the risk of confusion and “dirty reads”, where one transaction might read the data that is being updated. Durability – can recover data from failed transaction. Also ensures data changes are permanent and that committed transactions are never lost. Features of Good Relational Design Normalization - is the process of organizing data to reduce redundancy and avoid anomalies, ensuring that each piece of data is stored only once, which reduces the chances of data inconsistency. Data Integrity Constraints – refers to rules and conditions that ensure how accurate, consistent, and reliable the data in a database is. These constraints help enforce valid and correct data while maintaining relationships between different data. Primary Keys Foreign Keys Domain Integrity Unique Constraints Check Constraints Not Null Constraints Features of Good Relational Design Relation for every entity – Every entity (such as a person, product, or order) should be represented as a relation (table) in the relational database. Each attribute of an entity should be represented as a column in the table. This ensures the database reflects the real-world structure and relationships between entities, being efficient and clear in managing data. Appropriate Relationships – how entities are related to one another in the database. Established using foreign keys. One-to-One (1:1) - A record in one table is associated with at most one record in another table. One-to-Many (1) - A record in one table is associated with many records in another table. Many-to-Many (M) - Records in one table can be associated with many records in another table, and vice versa. Features of Good Relational Design Clear and Consistent Naming Conventions - Tables and columns should have clear, descriptive, specific, meaningful names. This makes the database easier to understand and maintain, reducing the chance of errors due to confusion about what each table, column, or object represents. Efficient Indexing – use of indexes can speed up data retrieval, improving query performance, especially for large datasets and complex queries. Efficient indexing helps optimize query execution time, ensuring that data retrieval operations (SELECT, JOIN, WHERE, etc.) are fast and scalable. Scalability and Flexibility – ensures that the database can efficiently handle increased loads of data, complex queries, adapt to changing requirements, and evolve over time, without performance degradation. Features of Good Relational Design Data Redundancy and Duplication Control – avoids unnecessary repetition or duplication of data within a database by organizing related data into separate tables. Managing redundancy and duplication ensures data integrity, efficiency, consistency, and optimal query performance. Data Security and Privacy - refers to how well the data in the database is controlled and protected from crashes, hacks, misuse, and accidental deletion ensuring compliance with data privacy regulations. Referential Integrity – refers to relationships between tables, ensuring that foreign key relationships in the database remain accurate, consistent and valid, and corresponds to existing primary key value in another table. Foreign Key Constraints: Foreign keys ensure that no invalid references are made Cascading Operations: ensuring that changes to parent records automatically update or delete related child records Features of Good Relational Design Handling Nulls and Defaults – avoids issues with incomplete and inaccurate data and improve overall robustness of your database. Is critical for maintaining data integrity, consistency, and clarity. Ensure that columns which are critical to the integrity of the database are NOT NULL. Optimizing for Query Performance – ensures that data is stored in a way that allows for efficient querying. Optimizing for query performance is not just about fast query execution but also about efficient data storage, reducing unnecessary complexity, and ensuring scalability. DECOMPOSITION USING FUNCTIONAL DEPENDENCIES WHAT IS FUNCTIONAL DEPENDENCY? A functional dependency specifies that the value of one attribute (or set of attributes) determines the value of another attribute in a database. Example: → If attribute A determines attribute B, it’s written as A B. Determinant (A): - An attribute or set of attributes that uniquely determines another attribute. Dependent (B): The attribute whose value is determined by the determinant. SIMPLE EXAMPLE OF FUNCTIONAL DEPENDENCIES Simple Example of Functional Dependencies In this example, we have a functional dependency between student_number and student_name: Functional Dependency: student_number → student_nameThis means that each student_number uniquely determines a student_name. In other words, if you know a student’s student_number, you can always find their corresponding student_name. Functional Dependency Notation Left-Hand Side (LHS) Right-Hand Side (RHS) WHAT IS DECOMPOSITION? Decomposition is the process of breaking a large table into smaller, simpler tables. This helps to reduce repeated data and makes the database more organized and easier to manage. Why Decompose? To achieve Normalization: A process to minimize redundancy and improve data consistency. To improve query efficiency and scalability. DECOMPOSITION USING FUNCTIONAL DEPENDENCY Decomposition using functional dependency is a key idea in database design that shows how attributes (columns) in a table are related. It helps break down complex tables into smaller, organized ones to keep data accurate, reduce duplication, and avoid errors. TYPES OF DECOMPOSITION Dependency-Preserving Lossless-Join Decomposition: Decomposition: dependency-preserving A decomposition is lossless if, decompositions retain all when the decomposed functional dependencies relations are joined, no post-decomposition. information is lost. STEPS IN DECOMPOSITION USING FUNCTIONAL DEPENDENCY: Identify Functional Dependencies: Determine how 1 attributes depend on each other. Check Normal Form Requirements: See if the table meets 2 certain forms (e.g., 1NF, 2NF, 3NF, BCNF). Decompose the Relation: Split the table into smaller ones 3 that meet normal form standards. 4 Verify Lossless-join and Dependency Preservation: Ensure no data is lost, and original dependencies are preserved. EXAMPLE: Table: Initial Employee_Department Table STEP 1: Identifying Functional Dependencies Key Functional Dependencies in Employee_Department: FD1: Department_id→Department_name, Manager_id FD2: Employee_name →Department_id EXAMPLE: STEP 2: Determine Normal Form Requirements To satisfy Boyce-Codd Normal Form (BCNF), each determinant (the left-hand side of each functional dependency) should be a candidate key. In this case, Department_ID is not a candidate key for the Employee_Department relation as a whole, because Employee_name can uniquely identify each row in the original relation. This means the dependency Department_ID → Department_Name, Manager_ID violates BCNF, so we need to decompose the table. STEP 3: DECOMPOSE THE RELATIONAL We’ll decompose Employee_Department into two relations based on the functional dependencies identified: 1. Department (Department_ID, Department_Name, Manager_ID): Department Table 2. Employee_Department_Assignment (Employee_name, Department_id): STEP 4: VERIFY LOSSLESS-JOIN AND DEPENDENCY PRESERVATION Part 1: Lossless-Join Property The Lossless-Join Property ensures that when we join the decomposed tables back together, we can reconstruct the original table without losing any information. Part 2: Dependency Preservation Dependency Preservation ensures that all functional dependencies from the original table are still enforceable in the decomposed tables. In our case, the original dependencies were: 1. Department_ID → Department_Name, Manager_ID 2. Employee_Name → Department_ID ANOTHER EXAMPLE OF DECOMPOSITION USING FUNCTIONAL DEPENDENCIES Original Table: Order_Details Step 1: Identify Functional Dependencies 1. Order_ID → Customer_Name, Customer_Address, Product_Name, Product_Category 2. Customer_Name → Customer_Address 3. Product_Name → Product_Category ANOTHER EXAMPLE OF DECOMPOSITION USING FUNCTIONAL DEPENDENCIES Original Table: Order_Details Step 2: Determine Normal Form Requirements Order_ID is a superkey, so Order_ID → Customer_Name, Customer_Address, Product_Name, Product_Category satisfies BCNF. Customer_Name → Customer_Address and Product_Name → Product_Category violate BCNF because Customer_Name and Product_Name are not superkeys (they don’t uniquely identify each row). ANOTHER EXAMPLE OF DECOMPOSITION USING FUNCTIONAL DEPENDENCIES Step 3: Decompose Based on Functional Dependencies We decompose the Order_Details table into smaller tables based on the violating dependencies Customer_Name → Customer_Address and Product_Name → Product_Category. New Tables After Decomposition: ANOTHER EXAMPLE OF DECOMPOSITION USING FUNCTIONAL DEPENDENCIES Step 4: Verify Lossless-Join and Dependency Preservation Lossless-Join We can join these tables back together using Customer_Name and Product_Name to reconstruct the original Order_Details table without losing any information, so this decomposition is lossless. Dependency Preservation Let’s check each functional dependency: 1. Order_ID → Customer_Name, Product_Name 2. Customer_Name → Customer_Address 3. Product_Name → Product_Category NORMAL FORMS Ensuring Database Efficiency and Reducing Redundancy www.Team5cutie.com NOT N BAD DATABASE DESIGN: ORMA LIZED Customer_ID Customer_Date_of_birth 1001 10-July-2003 dancy redun 1001 10-August-2013 multiple va lues StudentName Courses NORMALIZED TABLES SHOULD BE: Charlius BSIT, BSCE Easier to understand Roy BSIT, BSCPE Easier to enhance and extend Protected from anomalies: - Insertion anomalies - update anomalies - deletion anomalies WHAT IS NORMALIZATION? Definition Purpose Normalization in a Database Normalization optimizes a database's Management System (DBMS) is a structure by reducing redundancy, systematic process of organizing improving data integrity, enhancing data to reduce redundancy and performance, and simplifying dependency, ensuring data maintenance by organizing data into well- structured tables with defined integrity and efficiency. relationships. Insertion Anomaly Insertion Anomaly refers to when one cannot insert a new tuple into a relationship due to lack of data. ANOMALIES Deletion Anomaly The delete anomaly refers to the situation where the deletion of data results in the unintended loss of some other important data. Updation Anomaly The update anomaly is when an update of a single data value requires multiple rows of data to be updated. INSERTION ANOMALY EXAMPLE: StudentID StudentName CourseID CourseName 1 Jane 101 Math 2 Brian 102 History 1 Jane 103 Science NORMALIZED EXAMPLE: StudentID CourseID CourseID CourseName 101 Math 1 101 102 History 2 102 103 Science 1 103 104 Computer Science DELETION ANOMALY EXAMPLE: StudentID StudentName CourseID CourseName 1 Jane 101 Math 2 Brian 102 History 1 Jane 103 Science NORMALIZED EXAMPLE: StudentID CourseID CourseID CourseName 1 101 101 Math 2 102 102 History 1 103 103 Science UPDATION ANOMALY EXAMPLE: StudentID StudentName CourseID CourseName InstructorName 1 Jane 101 Math Int. Aphril Ganda 2 Brian 102 History Int. Faju 1 Jane 103 Science Int. Luther 2 Brian 101 Math Int. Aphril Ganda NORMALIZED EXAMPLE: StudentID CourseID CourseID CourseName InstructorName 1 101 101 Math Int. Aphril Ganda 2 102 102 History Int. Faju 1 103 103 Science Int. Luther 2 101 TYPES OF NORMAL FORMS 3NF A relation will be in 3NF if it is in 2NF 2NF and not contain any transitive partial 1NF In the 2NF, relational must be in 1NF. dependency. 3NF is used to reduce the data duplication. It is also used to is a property of a relational database In the second normal form, all non- achieve the data integrity. If there is table in which all columns contain only key attributes are fully functional no transitive dependency for non- atomic (indivisible) values, and each dependent on the primary key prime attributes, then the relation column holds a single type of data. In must be in third normal form. 1NF, there are no repeating groups or arrays within a column. 5NF also known as Project-Join Normal Form (PJNF), 4NF is a property of a relational database table BCNF where it is in Fourth Normal Form (4NF) and cannot be further decomposed without losing is a property of a relational database data. A table is in 5NF if it contains no join is a stricter version of Third Normal Form table that is in Boyce-Codd Normal dependencies and every join dependency is a (3NF). A table is in BCNF if it is in 3NF and, Form (BCNF) and has no multi-valued consequence of the candidate keys. for every functional dependency, the left- hand side (determinant) is a superkey. dependencies. FIRST NORMAL FORM (1NF) VIOLATES FIRST NORMAL FORM : StudentID StudentName Courses 1 Jude Math, Science 2 Astro Math, History 3 Zach History, Art NORMALIZED: StudentID StudentName Courses 1 Jude Math 1 Jude Science 2 Astro Math 2 Astro HIstory 3 Zach History 3 Zach Art SECOND NORMAL FORM (2NF) VIOLATES SECOND NORMAL FORM : TEACHER_ID SUBJECT TEACHER_AGE 13 Filipino 28 13 Math 28 46 Science 30 2 Computer 31 2 English 31 NORMALIZED: TEACHER DETAIL TABLE 1 TEACHER SUBJECT TABLE 1 TEACHER_ID SUBJECT TEACHER_ID TEACHER_AGE 13 Filipino 13 2 28 2 13 Math 46 2 Science 46 30 2 3 2 Computer 2 31 2 3 English THIRD NORMAL FORM (3NF) VIOLATES THIRD NORMAL FORM : STUDENT_ID STUDENTNAME COURSE_ID COURSE_NAME INSTUCTOR_NAME INSTRUCTOR_PHONE 1 JOHN 101 MATH Engr. Jose 0984-135-9901 1 JOHN 102 SCIENCE Dr. Alca 0927-333-4411 2 KATIE 101 MATH Engr. Jose 0984-135-9901 2 KATIE 103 HISTORY Prof. Sotto 0926-226-1731 NORMALIZED: 2 STUDENT TABLE 1 INSTRUCTOR TABLE STUDENT_ID STUDENTNAME 1 INSTRUCTOR_NAME INSTRUCTOR_PHONE 1 JOHN 2 Engr. Jose 0984-135-9901 2 KATIE 2 2 Dr. Alca 0927-333-4411 2 ENROLLMENT TABLE COURSE TABLE Prof. Sotto 0926-226-1731 1 1 1 STUDENT_ID COURSE_ID COURSE_ID COURSE_NAME INSTRUCTOR_NAME 101 1 101 102 101 MATH 2 Engr. Jose 2 2 1 102 102 SCIENCE Dr, Alca 2 103 2 2 2 101 103 HISTORY Prof. Sotto 2 103 BOYCE CODD NORMAL FORM (BCNF) VIOLATES BOYCE CODD NORMAL FORM : COURSE_ID INSTRUCTOR_NAME INSTRUCTOR_PHONE COURSE_LOCATION 101 Dr. Smith 0931-333-8971 ROOM A 102 Dr. Alca 0967-214-2127 ROOM B 101 Dr. Smith 0931-333-8971 ROOM A 103 Dr. Kate 0918-413-0871 ROOM C NORMALIZED: INSTRUCTOR TABLE COURSE TABLE 1 1 1 INSTRUCTOR_NAME INSTRUCTOR_PHONE COURSE_ID INSTRUCTOR_NAME COURSE_LOCATION Dr.Smith 0931-333-8971 101 Dr.Smith ROOM A 2 2 2 Dr. Alca 0967-214-2127 102 Dr. Alca ROOM B Dr. Kate 2 0918-413-0871 2 2 ROOM C 103 Dr. Kate 103 FOURTH NORMAL FORM (4NF) VIOLATES FOURTH NORMAL FORM : STUDENT_ID COURSE HOBBY 12 Computer Dancing 12 Math Singing 17 Chemistry Dancing 18 Biology Basketball 15 Physics Fencing NORMALIZED: STUDENT COURSE TABLE STUDENT HOBBY TABLE STUDENT_ID COURSE STUDENT_ID HOBBY 12 Computer 12 Dancing 12 Math 12 Singing 17 Chemistry 17 Dancing 18 Biology 18 Basketball 15 Physics 15 Fencing FIFTH NORMAL FORM (5NF) VIOLATES FIFTH NORMAL FORM : SUBJECT PROFESSOR SEMESTER Math Josh Semester 1 Math Mark Semester 1 Science Mark Semester 1 Science Justine Semester 2 English Ella Semester 1 NORMALIZED: SEMESTER-SUBJECT TABLE SUBJECT-PROFESSOR TABLE SEMESTER-PROFESSOR TABLE SEMESTER SUBJECT SEMESTER PROFESSOR SUBJECT PROFESSOR Semester 1 Josh Semester 1 Math Math Josh Semester 1 Mark Semester 1 English Math Mark Semester 1 Mark Semester 1 Science Science Mark Semester 2 Justine Semester 2 Science Science Justine Semester 1 Ella English Ella 15

Use Quizgecko on...
Browser
Browser