CIS 1303 - Database Systems Lecture Notes PDF
Document Details
Tags
Summary
These lecture notes cover various aspects of database systems, including core database concepts, data models (hierarchical, network, relational, NoSQL), and the challenges of data management. The document also discusses the differences between data, information, and knowledge, and how to store data efficiently and securely.
Full Transcript
CIS1303- DATABASE SYSTEMS CHAPTER 1 WELCOME ABOARD CHAPTER 1 CLO1: Demonstrate an understanding of core database concepts, including both relational and non-relational data models. Version Author Effective Change Description...
CIS1303- DATABASE SYSTEMS CHAPTER 1 WELCOME ABOARD CHAPTER 1 CLO1: Demonstrate an understanding of core database concepts, including both relational and non-relational data models. Version Author Effective Change Description DRC Date No 1.0 SCTL ? Defined the first version 001 DOCUMENT 2.0 Nourchene July 2021 Restructured and added 002 REVISION 2.1 Benayed Achraf August examples and animation Reviewed the content 002 CONTROL Ghorbel 2021 (DRC) 3.0 Omar June 2023 Added definitions and 003 Abuzaghleh examples 4.0 Nourchene July 2024 Updated the slides to 004 Benayed implement 5Es model 4.1 Abraham August Added slides S12 004 X.Y George 2024 X: MAJOR CHANGE Y: MINOR CHANGE 2 LECTURE NOTES Contents of lectures are based on the textbook, recommended text, and supplementary material Please read Chapter 1 Please read Chapter 2 (section 2.5 Database Models) 3 Why do we need a Database and what is a Database? What is Data, Information, and Knowledge? How to collect and store data? LECTURE OUTLINE Database Models Key terms 4 LECTURE OBJECTIVES At the end of lecture, the student should be able to: Explain why databases are essential in modern businesses and organizations Describe the challenges faced when managing data without a database Define key terms related to databases such as data, information, and knowledge Differentiate between data, information, and knowledge, and understand their roles in data management Identify issues associated with traditional file processing systems such as data redundancy, inconsistency, and unsecure data Discuss the advantages of database systems in overcoming these challenges Discuss the hierarchical, network, object-oriented, relational, and NoSQL database models Compare and contrast these models based on their structure, use cases, pros, and cons 5 WHY DO WE NEED A DATABASE?... Data is all around us. For example, you might have All your Your emails Your photos Your videos contacts and are stored in are on can be their phone your email Instagram stored in numbers are app YouTube on your phone 6 WHY DO WE NEED A DATABASE? 7 EXPLORING DATA MANAGEMENT CHALLENGES… File Processing System has disadvantages like: Data Redundancy Data redundancy: It occurs when the same piece of data is stored in multiple places or multiple records Data inconsistency: It refers to a state in which different copies or instances of the same data contain conflicting or contradictory information Example: Imagine a customer's address stored in multiple files: one file for customer details, another for orders, and a third for shipping information. This is what we called data redundancy. Let's consider a scenario where an update is made to one instance of the address but not reflected in the others, it results in data inconsistence 8 EXPLORING DATA MANAGEMENT CHALLENGES File Processing system has disadvantages like: Difficulties in Data Manipulation: Performing complex operations, such as joining or aggregating data from multiple files, can be challenging in a file processing system Examples: Performing updates requires modifying multiple files, increasing the risk of data inconsistency and errors Searching for specific data can be slow, especially as the volume of data grows Generating a report on employee hours worked on projects, data from employees, projects, and assignment files must be combined 9 EXPLORING DATA MANAGEMENT CHALLENGES File Processing system has disadvantages like: Unsecure data: It refers to data that lacks proper security measures, making it vulnerable to unauthorized access, disclosure, alteration, or deletion Example: Each file may have different access control settings, or worse, no access control at all 10 WHAT IS A DATABASE? A database is a collection of organized data that can be easily accessed, modified, and protected Databases reduce redundancy by storing data in one place They ensure consistency with centralized updates Enhanced security through controlled access Example: Your email app uses a database to store all your emails securely and consistently 11 DISTRIBUTED DATABASES A distributed database is any database where records are held in more than one location It’s a web of many different databases connected through one centrally accessible network Records are held in more than one location Distributed databases offer benefits over centralized databases, in terms of storage capability, scalability and cost- effectiveness. Example: Netflix catalog or Amazon Item catalog is distributed over multiple locations 12 EVALUTING YOUR UNDERSTANDING What is data redundancy, and how do databases solve it? Name two advantages of using a database system. How do databases enhance data security? 13 WHAT IS DATA?... Ahmed 20 Data are raw, unprocessed facts and figures without any context 23-Aug-2001 Data can be numbers, letters, figures, sounds, or images Example: Your name, age, date of birth, the car you drive, your bank account balance, your photo can be considered as data 14 WHAT IS DATA? Ahmed Data refer to an elementary description of things, events, activities, and transactions that 20 are recorded, classified, and stored but are not organized to convey any meaning 23-Aug-2001 Data has become indispensable to every kind of modern business and government organization Data is possibly the most important corporate resource 15 WHAT IS INFORMATION?... Data are simply facts or figures — bits of information, but not information itself. When data are processed, interpreted, organized, structured or presented so as to make them meaningful or useful, they are called information. Information provides context for data. 16 WHAT IS INFORMATION? Information refers to processed or organized data that have meaning and value. For example: we can derive a meaningful conclusion from an organized collection of data. We cannot call it Information if we are not getting any result from our data. 17 WHAT IS KNOWLEDGE? Knowledge consists of information that results in an understanding, experience, accumulated learning, and expertise as they apply to a current business problem. Knowledge is the result of analyzed information for learning purposes. 18 DATA, INFORMATION & KNOWLEDGE Data Cycle Example: A survey collects facts → This is Data When data is processed → This is now Information When information is analyzed → This is now Knowledge DATA, INFORMATION & KNOWLEDGE-EXAMPLES ‘1.5’ is a data, information or knoweldge? 1.5 is data. Indeed, without a context, data can mean little. 1.5 is just a sequence of numbers without apparent importance. But if we view it in the context of ‘this is a person age or a student GPA’, we can easily recognize that the age of a person is 1 year and 6 months or the student GPA is 1.5 By adding context and value to the numbers, they now have more meaning, we transformed the data into information Saying the person age is 1 year and 6 months (Information), we know that the person is a toddler (knowledge) Saying the student GPA is 1.5 (Information), we know that the student is an at-risk one (knowledge) 20 APPLYING THE CONCEPTS Business Education Healthcare Sales transactions Test scores (data) Patient records (data) Grade reports (data) Sales reports (information) Diagnostic reports (information) Curriculum (information) Sales strategies adjustments Treatment plans (knowledge) (knowledge) (knowledge) 21 EVALUATING YOUR UNDERTSANDING What is the difference between data and information? How does information become knowledge? Provide an example of data, information, and knowledge in a business context 22 HOW TO COLLECT DATA? Internal Source of Data Data means LIFE to organizations Reside within the organization Internal data includes day-to-day business activities Without data employees cannot work and important to a company such as sales data, marketing, organizations cannot function customer information, HR, etc Quite easy to obtain and manage Data sources in a business environment can be categorized as internal and external sources External Source of Data Reside outside the organization Government data such as census and chamber of commerce Non-Government data such as news media and annual reports International organizations such as IMF (International Monetary Fund), United Nations data, etc 23 HOW TO STORE DATA? Storing data efficiently and securely is a fundamental aspect of data management Database Management System (DBMS) Description: Software that uses a structured format to store, manage, and retrieve data (End Users, (Storage, Retrieval and Modification) Examples: SQL databases like Developers, Database MySQL, PostgreSQL, Oracle and NoSQL databases like MongoDB Administrator) Usage: Suitable for applications requiring structured data storage, such as websites, applications, and large enterprises 24 USER TYPES Database Administrator End Users Developers/Programmers (DBA) The person who will be Computer professionals Responsible for the using the database, they are who design, create, and management and not computer professionals maintain programs for end maintenance of the but are trained database users database on a day-to-day operators basis ADVANTAGES OF DBMS… Encourage data sharing and Data concurrency is a means of allowing multiple users to access allow concurrent use of the same data at the same time database Reduce data redundancy By avoiding repeated data, where same data may exist multiple times By enforcing integrity constraints. Integrity constraint means data Ensure data accuracy should be accurate in the database and incorrect data entry should not be allowed 26 DATA REDUNDANCY… Repeated Data is present when a field has two or more identical values A data value is redundant if one value can be removed without the loss of information A data value is duplicated when loss of information occurs after removing the repeated data field 27 DATA REDUNDANCY- EXAMPLE 1 Student Name ID Address Nasser 123456 Dubai Salim 234567 Sharjah Ahmad 567890 Dubai Alya 232323 Ajman The Student table contains repeated data, because the value Dubai occurs twice in the Address field. Duplicated data: The table does not however, contain redundant data. This is because if we delete the value “Dubai” from the first row, the user will no longer be able to tell from the table where Nasser lives. 28 DATA REDUNDANCY-EXAMPLE 2 Student Name Course Code Course Name Samira CIS 1303 Database Systems Salwa CIS 1703 Advanced SQL Noora CIS 1303 Database Systems Salima CIS 1803 Software Engineering This table contains repeated data, because the value “Database Systems” occurs twice in the course name field Redundant data: The course name “Database Systems” is redundant because, even if the value “Database Systems” is removed from either row, we can still tell from the remaining course code “CIS 1303” the course name is indeed “Database Systems” 29 DATA REDUNDANCY ISSUES Data Redundancy leads to: Inaccurate data Inconsistent data 30 HOW TO ELIMINATE DATA REDUNDANCY? Data Redundancy Data Redundancy Eliminated By spliting the table into two separte tables Student Name Course Code Samira CIS 1303 Student Course Course Name Name Code Salwa CIS 1703 Samira CIS 1303 Database Systems Noora CIS 1303 Salwa CIS 1703 Advanced SQL Salima CIS 1803 Noora CIS 1303 Database Systems Course Code Course Name Salima CIS 1803 Software CIS 1303 Database Systems Engineering 31 CIS 1703 Advanced SQL CIS 1803 Software Engineering ADVANTAGES OF DBMS Allow data querying (Easier By providing a simple query language (e.g SQL), which allows users to retrieve data from the database and faster to find data) Backup means to retain multiple copies of data Provide tools to backup and A database must be backed-up on a regularly scheduled basis (daily/weekly/monthly) The backup set must be stored in a safe place, preferably away from the original set recover a database Recovery means restore backed up data in case of loss of existing current data due to any reasons such as database crash, server crash, deleted mistakenly etc Provide tools to control DBMS allows administrators to define and enforce access control mechanisms to regulate who can access the database and its data Data security and privacy DBMSs offer built-in encryption capabilities that allow data to be stored and transmitted in an encrypted form 32 DATABASE MODELS… Hierarchical Model Network Model (N – Model) Structure: Tree-like structure Structure: Graph structure with a single root and multiple allowing multiple parent nodes levels of hierarchy (more complex than Key Feature: Each parent can hierarchical) have multiple children, but each Key Feature: multiple parents child has only one parent Example: A network of Example: Company friends on a social media organizational chart platform Use Case: Organizational Use Case: Suitable for structures, file systems complex data relationships with Pros: Simple, efficient for many-to-many connections hierarchical data Pros: More flexible than Cons: Limited flexibility, hierarchical model, supports difficult to reorganize, many-to-many relationships redundancy issues Cons: Complexity in design and maintenance 33 DATABASE MODELS… Object Oriented Model (OO – Relational Model (R – Model) Model) Structure: Data is stored as Structure: Data is stored in objects, similar to object- tables (relations) with rows and oriented programming columns. Key Features: Objects, classes, Key Features: table, records, inheritance integrity constraints Example: An inventory system Example: A customer database where each item is an object with tables for customers, with attributes and methods orders, and products Use Case: Useful for Use Case: Most widely used applications with complex data model, suitable for a wide range and relationships that align with of applications object-oriented programming Pros: Flexibility, support for Pros: Seamless integration with complex queries using SQL, object-oriented programming, strong data integrity reusability of objects Cons: Performance can degrade Cons: Can be complex to with very large datasets and implement, less mature than complex queries relational databases 34 DATABASE MODELS… NoSQL Structure: Broad category with various types: document, key-value, wide-column, graph. Focus on scalability, flexibility, and performance Key Features: Schema-less or flexible schema Example: MongoDB (document), Redis (key-value) Use Case: Suitable for large-scale, high-performance applications with unstructured or semi-structured data. Used for real-time analytics, content management, and big data processing Pros: Scalable and flexible. High performance for specific use cases. No rigid schema Cons: Less mature than relational databases. Lack of strong data integrity in some cases 35 DATABASE MODELS Hierarchical Object Relational Network Model NoSQL Model Oriented Model Model Tree structure Graph structure Objects with Tables with rows Various models Parent-child Multiple parents attributes and and columns (document, key- methods value, graph) for unstructured and large-scale data. 36 ACTIVITY 1: DATA MIGRATION AND CONVERSION Scenario: A library is migrating from a hierarchical database to a relational database. The hierarchical database stores information about books, authors, and publishers in the following structure: Books: Title Publication Year Author (one or more) Publisher Task: Design a relational database schema to represent the library's data. 37 ACTIVITY 2: NOSQL VS. RELATIONAL DEBATE Scenario A rapidly growing e-commerce company is planning to develop a new personalized recommendation system. The company needs to store vast amounts of user data, product information, and purchase history. They are debating between using a NoSQL or a relational database for this application Task Divide the class into two groups: Group 1: NoSQL Advocates: Argue in favor of using a NoSQL database for the e-commerce recommendation system. Group 2: Relational Advocates: Argue in favor of using a relational database for the e-commerce recommendation system. 38 Data, Information, Database and DBMS Knowledge Data Redundancy, Data Backup and Integrity, Consistency, Recovery KEY TERMS Accuracy, and security Database Models 39 40