PrADM - Part 1.2.pdf
Document Details
Uploaded by ElatedSuprematism7205
Full Transcript
Lesson 3 & 4: Databases and Normalization | Entity Relationship Diagram and Relational Modeling DATABASES AND NORMALIZATION Databases In the context of data warehousing and management, databases are structured collections of data stored electronically. They allow for the efficient storage, re...
Lesson 3 & 4: Databases and Normalization | Entity Relationship Diagram and Relational Modeling DATABASES AND NORMALIZATION Databases In the context of data warehousing and management, databases are structured collections of data stored electronically. They allow for the efficient storage, retrieval, and manipulation of data. Databases can range from simple systems like Excel spreadsheets to complex systems like SQL databases, which are often used in enterprise settings. 2 DATABASES AND NORMALIZATION Structured Data Storage – Databases are designed to store data in a structured format, often using tables, columns, and rows. This structure makes it easier to query and analyze data. Scalability – Databases can handle large amounts of data, making them suitable for enterprise-level applications. Consistency and Integrity – Databases ensure that data is consistent and that integrity constraints are enforced, which means 3 the data is reliable and accurate. DATABASES AND NORMALIZATION Normalization Normalization is the process of organizing the data in a database to reduce redundancy and improve data integrity. The main goals of normalization are to: 4 DATABASES AND NORMALIZATION Eliminate Redundancy – By dividing large tables into smaller ones and defining relationships between them, normalization reduces the duplication of data. Ensure Data Integrity – Normalization helps maintain data accuracy by ensuring that data is stored in one place and referenced wherever needed. Optimize Queries – Normalized databases often perform better because they have less redundancy, which can speed 5 up query times. APPLICATION IN DATA WAREHOUSING In a data warehouse, normalization might be applied to ensure that each piece of data is stored once and referenced multiple times, rather than being duplicated across different tables. 6 RELATIONAL MODELING This is a methodology for designing a database that is structured to minimize redundancy and dependency. Relational modeling involves: 7 RELATIONAL MODELING Defining Tables: Identifying the tables needed to store data, along with the columns and data types for each table. Defining Relationships: Establishing relationships between tables, often using foreign keys, to ensure that data can be related across different tables. Normalization: Applying normalization rules to ensure the database is efficient and minimizes redundancy. 8 Entity Relationship Diagram (ERD) ERDs are visual representations of the data models and relationships within a database. They consist of entities (which represent tables), attributes (which represent columns within the tables), and relationships (which define how tables relate to one another). ERDs are crucial in the design phase of a database because they help in: 9 Entity Relationship Diagram (ERD) Visualizing Database Structure: ERDs provide a graphical view of how different data elements are interconnected. Communicating Design: ERDs are a communication tool that helps stakeholders understand the database design. Guiding Database Creation: They serve as blueprints for the creation of the actual database. 10 INTEGRATION Data Consistency and Quality: Normalization ensures that data is stored consistently and without redundancy. This is important in a data warehouse, where data from multiple sources is integrated. Efficient Data Storage: By normalizing data and using relational models, data warehouses can optimize storage space, which is crucial when dealing with large datasets 11 INTEGRATION Scalability: Proper use of relational modeling and ERDs allows data warehouses to scale efficiently, supporting more users and larger datasets. Ease of Reporting and Analytics: A well-structured relational model facilitates complex queries and analytics, which are essential for making informed business decisions. 12 PRACTICAL CONSIDERATIONS Denormalization in Data Warehousing - While normalization is a key principle in OLTP (Online Transaction Processing) databases, data warehouses sometimes use denormalization to improve query performance. Denormalization involves combining tables to reduce the number of joins needed, which can speed up read operations. This is because data warehouses are optimized for read-heavy operations rather than write- heavy operations. 13 PRACTICAL CONSIDERATIONS Design Complexity - Both ERDs and relational modeling require careful planning and understanding of the business requirements. Over-normalization can lead to complex queries and slower performance, whereas under- normalization can lead to data inconsistency. 14 Entity Relationship Diagram - Example An Entity Relationship Diagram (ERD) is a visual representation of entities, their attributes, and the relationships between them. In the context of a data warehouse or database design, an ERD helps to illustrate how different entities (or objects) in a system are connected to one another. 15 Entity Relationship Diagram - Example Example: ERD for a Simple Online Retail System Imagine we are designing an ERD for an online retail system that includes customers, orders, and products. This ERD will have three main entities: 1. Customer 2. Order 3. Product 16 Entity Relationship Diagram - Example These entities will be connected through relationships such as "places" (a customer places an order) and "contains" (an order contains products). Below is the graphical representation of this scenario: 17 Entity Relationship Diagram - Example 18 USAGE OF ERD IN PRACTICE Database Design - Use this ERD as a blueprint to create database tables and define relationships using SQL statements. Communication - Use the ERD to explain database structure to stakeholders, developers, and team members. Data Integrity - Ensure that the design adheres to normalization principles to reduce redundancy and improve data integrity. 19 “ Turn the rejections you receive into others' regrets. - JDG, CPA 20 21