DBMS Course Outline PDF
Document Details
Uploaded by SecureCerium
Armando T. Saguin Jr., MSIT, DPA(CAR)
Tags
Summary
This document is a course outline for a database management systems course. It covers topics such as introduction to DBMS, different types of databases, back-end development, and course requirements. The summary also includes the course's title, author, and a brief description, intended for the reader to quickly understand the content.
Full Transcript
COURSE OUTLINE 1. Introduction to DBMS 2. Centralized vs Distributed vs Decentralized Database 3. Normalization and Relational Database 4. SQL vs NoSQL Database 6. Modelling DFD, ERD /UML 7. Database Replication 7.1 Bidirectional Replication 7.2 Multi-Source Replication 8. Stored Proce...
COURSE OUTLINE 1. Introduction to DBMS 2. Centralized vs Distributed vs Decentralized Database 3. Normalization and Relational Database 4. SQL vs NoSQL Database 6. Modelling DFD, ERD /UML 7. Database Replication 7.1 Bidirectional Replication 7.2 Multi-Source Replication 8. Stored Procedure/Function Using MySQL 9. Rest-API Development 9.1 MVC 9.2 Postman 9.3 Php and MySQL 10. Back-End Development Individual Final Course Requirements : 1. Reporting on the assigned topics 2. Prototype for the Proposed System Application - Architectural Design for the System - Data-Flow Diagram - ERD/UML (Database Design) - Back-End Design 3. Rest-API for the Proposed System Application (The proposed prototype must be related to the application you developed in Software Engineering.) Note: This course requirements are preparation for your Thesis Proposal. DATABASE MANAGEMENT SYSTEM Armando T. Saguin Jr., MSIT, DPA(CAR) Difference between DBMS and RDBMS What is DBMS? DBMS or Database Management System is the application which is used to store, manipulate and retrieve data in the database. DBMS handles all the tasks related to Data. In simple words, the DBMS created the database using the information Relational you give.Database Management System The RDBMS establishes the relationship between the tables in the database. It implements the Relational Model with data to work. Relational database stores the data in a hierarchy where the attributes are having the relationship with the entity in the other table. What is the difference between DBMS and RDBMS? DBMS stores the data as files but here in RDBMS you will store all the data in table format and can be related to each other. Why to Learn DBMS? Traditionally, data was organized in file formats. DBMS was a new concept then, and all the research was done to make it overcome the deficiencies in traditional style of data management. A modern DBMS has the following characteristics − Real-world entity Relation-based tables Isolation of data and application Less redundancy Consistency Query Language DBMS basically manages three things Data Database Engine Database Schema Database Engine Software that stores and retrieves data in a database. It may be a self-contained program or the part of a database management system (DBMS) that performs the storage and retrieval. Database Schema A database schema represents the logical configuration of all or part of a relational database. It can exist both as a visual representation and as a set of formulas known as integrity constraints that govern Characteristics of DBMS It uses a digital repository established on a server to store and manage the information. It can provide a clear and logical view of the process that manipulates data. DBMS contains automatic backup and recovery procedures. It contains ACID properties which maintain data in a healthy state in case of failure. It can reduce the complex relationship between data. It is used to support manipulation and processing of data. It is used to provide security of data. Advantages of DBMS Controls the redundancy in the database. Restricts unauthorized access. Provides a storage structure and multiple techniques for query processing. Provides backup and recovery. Provides multiple user interfaces. Enforces Constraints/Restriction on the data. Reduces application development Disadvantages of DBMS Cost of Hardware and Software: It requires a high speed of data processor and large memory size to run DBMS software. Size: It occupies a large space of disks and large memory to run them efficiently. Complexity: Database system creates additional complexity and requirements. Higher impact of failure: Failure is highly impacted the database because in most of the organization, all the data stored in a single database and if the What is a Database? Database: A collection of related data organized in a way that facilitates data searches. What are some examples of Databases? Types of Data Collected in a Typical Student Database Database Examples Web indexes Train timetables Library catalogues Airline bookings Medical records Credit card details Bank accounts Student records Stock control Customer histories Personnel systems Stock market prices Product catalogues Discussion boards Telephone directories and so on… Database Examples Facts about Database: Databases have evolved dramatically since their inception in the early 1960s. Some Navigational databases such as the Hierarchical database and the Network database were the original systems used to store and manipulate data. Although these early systems were actually inflexible In the early 1980s, Relational databases became very popular, which was followed by object- oriented databases later on. More recently, NoSQL databases came up as a response to the growth of the internet and the need for faster speed and processing of unstructured data. Today, we have cloud databases and self-driving The major components of the Database are: Hardware = This consists of a set of physical electronic devices such as I/O devices, storage devices and many more. It also provides an interface between computers and real-world systems. Software =This is the set of programs that are used to control and manage the overall Database. It also includes the DBMS software itself. The Operating System, the network software being used to share the data among the users, the application programs used to access data in the DBMS. Data = Database Management System collects, stores, processes, and accesses data. The Why should you study the database? Databases are useful Many computing applications deal with large amounts of information Database systems give a set of tools for storing, searching and managing this information Databases in CS Databases are a ‘core topic’ in computer science Basic concepts and skills with database systems are part of the skill set you will be assumed to have as a CS graduate What are databases used for? Improve business processes Keep track of customers Secure personal health information Store personal data Database Systems A database system consists of Data (the database) Software Hardware Users We focus mainly on the software Database systems allow users to Store Update Retrieve Organise Protect their data. Database Users End users Use the database system to achieve some goal Application developers Write software to allow end users to interface with the database system Database Administrator (DBA) Designs & manages the database system Database systems programmer Writes the database software itself Database Management Systems A database is a collection of information A database management system (DBMS) is the software than controls that information Examples: Oracle DB2 (IBM) MS SQL Server MS Access Ingres PostgreSQL MySQL SQLite MongoDB Data Structure for Database A data structure is a way of organizing the data in the computer memory in an efficient way. They are useful in most computer science fields like operating systems, graphics and artificial intelligence. Using the proper data structure will increase the performance of the software. Moreover, it allows the user to search and access data faster. Functionally, a database helps to access and manage data easily while a data structure helps to increase the efficiency related to time and space. The main difference Unstructured Data Structured Data Spreadsheet as Database Suppose we need to store the name and email of reviewers of a popular website. The simplest approach is to open a spreadsheet, maybe in Google Docs or Microsoft Excel, and enter a few names. 1. John, [email protected] 2. Alice, [email protected] Now that we have a spreadsheet, we decide to add a header identifying the columns and an id for each user. So, the worksheet looks like Spreadsheet as Database Next, we need to add reviews of these users. We don't want to clutter the spreadsheet, so we add a new worksheet. Spreadsheet as Database This simplified analogy serves perfectly to describe, conceptually, a database. Keep this analogy in mind as we go forward in the book: What the DBMS does Provides users with Data definition language (DDL) Data manipulation language (DML) Data control language (DCL) Often these are all the same language DBMS provides Persistence Concurrency Integrity Security Data independence Data Dictionary Describes the database itself Data Dictionary The dictionary or catalog stores information about the database itself This is data about data or ‘metadata’ Almost every aspect of the DBMS uses the dictionary The dictionary holds Descriptions of database objects (tables, users, rules, views, indexes,…) Information about who is using which data (locks) Schemas and mappings File Based System File based systems Data is stored in files Each file has a specific format Programs that use these files depend on knowledge about that format Problems: No standards Data duplication Data dependence No way to generate ad hoc queries No provision for security, recovery, concurrency, etc. Relational Systems Information is stored as tuples or records in relations or tables There is a sound mathematical theory of relations Most modern DBMS are based on the relational model The relational model covers 3 areas: Data structure Data integrity Data manipulation ANSI/SPARC Architecture ANSI - American National Standards Institute SPARC - Standards Planning and Requirements Committee 1975 - proposed a framework for DBs A three-level architecture Internal level: For systems designers Conceptual level: For database designers and administrators External level: For database users Internal Level Deals with physical storage of data Structure of records on disk - files, pages, blocks Indexes and ordering of records Used by database system programmers Internal Schema RECORD EMP LENGTH=44 HEADER: BYTE(5) OFFSET=0 NAME: BYTE(25) OFFSET=5 SALARY: FULLWORD OFFSET=30 DEPT: BYTE(10) OFFSET=34 Conceptual Level Deals with the organisation of the data as a whole Abstractions are used to remove unnecessary details of the internal level Used by DBAs and application programmers Conceptual Schema CREATE TABLE Employee ( Name VARCHAR(25), Salary REAL, Dept_Name VARCHAR(10)) External Level Provides a view of the database tailored to a user Parts of the data may be hidden Data is presented in a useful form Used by end users and application programmers External Schemas Payroll: String Name double Salary Personnel: char *Name char *Department Mappings Mappings translate information from one level to the next External/Conceptual Conceptual/Internal These mappings provide data independence Physical data independence Changes to internal level shouldn’t affect conceptual level Logical data independence Conceptual level changes shouldn’t affect external levels ANSI/SPARC Architecture User 1 User 2 User 3 External Schemas External External External/Conceptual Mappings View 1 View 2 Conceptual DBA Conceptual Schema View Conceptual/Internal Mapping Internal Schema Stored Data Database Processes 1. Entering Data 2. Querying Data 3. Creating Database Reports Entering Data Data Entry: process of getting information into a database possible methods of data entry: Data Entry Professional, Electronic Files, Historical Records, or Web Based (Forms) Querying Data Different database systems all have their own way of performing queries to extract data. However all perform similar functions, allowing the user to: Combine into one table the information from two or more related tables Select the fields to be shown in the ‘Answer’ table Specify criteria for searching on e.g. find the names and addresses of all club members whose subscriptions are due Save the query so that it can be executed whenever necessary ICT5 Querying Data Querying: how we get information from a database Structured Query Language (SQL): most common language used to interface with databases Example: SELECT DISTINCT STUDENT_ID, GRADE FROM GRADES WHERE GRADE = “95” ORDER BY STUDENT_ID; Querying Data continued Query By Example (QBE) enables you to fill out a grid, or template, in order to construct a description of the data you would like to retrieve. Creating Database Reports Report: A compilation of data from the database that is organized and produced in a printed format. Typically produced on paper, but also can be displayed on-screen. Example: Quarterly Sales Report A properly designed database system offers a solution to the problems of file processing. It will provides an overall framework that avoids data redundancy and supports a real- time, dynamic environment. Advantages Scalability Better support for client/server systems Economy of scale Flexible data sharing Enterprise-wide application – database administrator (DBA) Stronger standards Controlled redundancy Better security Increased programmer productivity Data independence Types of Databases There are various types of databases used for storing different varieties of data: #Centralized Database It is the type of database that stores data at a centralized database system. It comforts the users to access the stored data from different locations through several applications. These applications contain the authentication process to let users access data securely. An example of a Centralized database can be Central Library that carries a central database of each library in a college/university. Advantages of Centralized Database It has decreased the risk of data management, i.e., manipulation of data will not affect the core data. Data consistency is maintained as it manages data in a central repository. It provides better data quality, which enables organizations to establish data standards. It is less costly because fewer vendors are required to handle the data sets. Disadvantages of Centralized Database The size of the centralized database is large, which increases the response time for fetching the data. It is not easy to update such an extensive database system. If any server failure occurs, entire data will be lost, which could be a huge loss. #Distributed Database Unlike a centralized database system, in distributed systems, data is distributed among different database systems of an organization. These database systems are connected via communication links. Such links help the end-users to access the data easily. Examples of the Distributed database are Apache Cassandra, HBase, Ignite, etc. We can further divide a distributed database system into: Homogeneous DDB: Those database systems which execute on the same operating system and use the same application process and carry the same hardware devices. Heterogeneous DDB: Those database systems which execute on different operating systems under different application procedures, and carries different hardware devices. Advantages of Distributed Database Modular development is possible in a distributed database, i.e., the system can be expanded by including new computers and connecting them to the distributed system. One server failure will not affect the entire data set. #Relational Database This database is based on the relational data model, which stores data in the form of rows(tuple) and columns(attributes), and together forms a table(relation). A relational database uses SQL for storing, manipulating, as well as maintaining the data. E.F. Codd invented the database in 1970. Each table in the database carries a key that makes the data unique from others. Examples of Relational databases are MySQL, Microsoft SQL Server, Oracle, etc. Properties of Relational Database There are following four commonly known properties of a relational model known as ACID properties, where: A-means Atomicity C-means Consistency I-means Isolation D-means Durability #NoSQL Database Non-SQL/Not Only SQL is a type of database that is used for storing a wide range of data sets. It is not a relational database as it stores data not only in tabular form but in several different ways. NoSQL presented a wide variety of database technologies in response to the demands. We can further divide a NoSQL database into the following four types: Key-value storage Document-oriented Database Graph Databases Wide-column stores Advantages of NoSQL Database It enables good productivity in the application development as it is not required to store data in a structured format. It is a better option for managing and handling large data sets. It provides high scalability. Users can quickly access data from the database through key-value. #Cloud Database A type of database where data is stored in a virtual environment and executes over the cloud computing platform. It provides users with various cloud computing services (SaaS, PaaS, IaaS, etc.) for accessing the database. There are numerous cloud platforms, but the best options are: Amazon Web Services(AWS) Microsoft Azure Kamatera PhonixNAP ScienceSoft Google Cloud SQL, etc. #Object-oriented Databases The type of database that uses the object-based data model approach for storing data in the database system. The data is represented and stored as objects which are similar to the objects used in the object-oriented programming language. #Hierarchical Databases It is the type of database that stores data in the form of parent-children relationship nodes. Here, it organizes data in a tree-like structure. Data get stored in the form of records that are connected via links. Each child record in the tree will contain only one parent. On the other hand, each parent record can have multiple child records. #Hierarchical Databases It is the type of database that stores data in the form of parent-children relationship nodes. Here, it organizes data in a tree-like structure. Data get stored in the form of records that are connected via links. Each child record in the tree will contain only one parent. On the other hand, each parent record can have multiple child records. #Network Databases It is the database that typically follows the network data model. Here, the representation of data is in the form of nodes connected via links between them. Unlike the hierarchical database, it allows each record to have multiple children and parent nodes to form a generalized graph structure. #Personal Database Collecting and storing data on the user's system defines a Personal Database. This database is basically designed for a single user. Advantage of Personal Database It is simple and easy to handle. It occupies less storage space as it is small in size. #Operational Database The type of database which creates and updates the database in real-time. It is basically designed for executing and handling the daily data operations in several businesses. For example; an organization uses operational databases for managing per day transactions. #Enterprise Database Large organizations or enterprises use this database for managing a massive amount of data. It helps organizations to increase and improve their efficiency. Such a database allows simultaneous access to users. Advantages of Enterprise Database: Multi processes are supportable over the Enterprise database. It allows executing parallel queries Database Tradeoffs Because DBMSs are powerful, they require more expensive hardware, software, and data networks capable of supporting a multiuser environment More complex than a file processing system Procedures for security, backup, and recovery are more complicated and critical Comparison of PostgreSQL, MySQL, and SQLite The Top Relational Databases What is MySQL? The MySQL software delivers a very fast, multi-threaded, multi-user, and robust SQL (Structured Query Language) database server. MySQL Server is intended for mission-critical, heavy-load production systems as well as for embedding into mass-deployed software. What is PostgreSQL? PostgreSQL is an advanced object-relational database management system that supports an extended subset of the SQL standard, including transactions, foreign keys, subqueries, triggers, user-defined types and functions. What is SQLite? SQLite is an embedded SQL database engine. Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk file. PostgreSQL PostgreSQL is the open source relational database management system developed by professionals from all over the world. The platform it supports: Linux, Unix MAC OS X Windows Solaris FreeBSD NetBSD OpenBSD Advanced Interactive eXecutive by IBM PostgreSQL differs a lot from RDBMS as it entirely supports Relational system/Object Oriented Model which supports Reliable transactions(ACID). What is ACID? Atomicity Either an operation has to be completed entirely, or it should not happen. This is what we can call atomicity. For, e.g., if you are making a bank transaction and the power goes down, Here the entire operation won't be performed. Consistency In a transaction, the database has to go from one state to another state. If any error occurs, they database should not be changed and should be reverted to the previous stats as it was before the transaction. Isolation In the Isolation process, the transactions which are being executed sequentially. One Transaction doesn't know about another one here. Durability If a transaction is executed, it should be recorded in a non-volatile memory to keep track of it even when the failure occurs due to various reasons. Advantages and Disadvantages of PostgreSQL Advantages Open Source SQL RDBMS Good community Support Widely Supported Extendable Objective Based Cloud Supported Disadvantages Hard for Newbies Performance Popularity When you can use PostgreSQL? Use PostgreSQL Data integrity and Reliability Need of Custom Procedures Integrity concerns Complexity with ease Do Not Use PostgreSQL at this Situation Speed Easy Setup Replication Easy for Configuration Easy to Manage Maximum Database Size: Unlimited Maximum Row Size: 1.6 TB Maximum Table Size: 32 TB Maximum Field Size: 1 GB MySQL MySQL is the world’s most popular open source database, enabling the cost-effective delivery of reliable, high-performance and scalable Web- based and embedded database applications. It is an integrated transaction safe, ACID-compliant database with full commit, rollback, crash recovery, and row-level locking capabilities. MySQL delivers the ease of use, scalability, and high performance, as well as a full suit of database drivers and visual tools to help developers and DBAs build and manage their business-critical MySQL applications. MySQL is developed, distributed, and supported by Oracle, and the latest information about MySQL software can be found on the MySQL Web site MySQL The MySQL database provides the following features: High Performance and Scalability to meet the demands of exponentially growing data loads and users. Self-healing Replication Clusters to improve scalability, performance and availability. Online Schema Change to meet changing business requirements. Performance Schema for monitoring user- and application-level performance and resource consumption. SQL and NoSQL Access for performing complex queries and simple, fast Key Value operations. Platform Independence giving you the flexibility to develop and deploy on multiple operating systems. Advantages and Disadvantages of MySQL Advantages High Functionality Support Highly Secure Scalable and Powerful High Speed Disadvantages Limitations Reliability problems Slow development When you can use MySQL? Use MySQL Highly Secured Web application usages The Scalable and easy to manage Customized solutions Do Not Use MySQL at this Situation SQL Compliant MySQL is not entirely SQL compliant Problem in Concurrency Lack of certain features Max DB size: 256 terabytes (256^7 bytes) Max row size: 65,535 bytes SQLite SQLite is an embedded SQL database engine. Unlike most other SQL databases, SQLite does not have a separate server process. SQLite reads and writes directly to ordinary disk files. A complete SQL database with multiple tables, indices, triggers, and views, is contained in a single disk The SQLite code base is supported by an file. international team of developers who work on SQLite full-time. The developers continue to expand the capabilities of SQLite and enhance its reliability and performance while maintaining backwards compatibility with the published interface spec, SQL syntax, and database file format. The source code is absolutely free to anybody who wants it, but professional support is Advantages and Disadvantages of SQLite Advantages Easily Portable Standards Good for Development and Testing Simplicity Disadvantages No User Management in SQLite No Possibility to tinker When you can use SQLite? Use SQLite Embedded Applications Disk access replacement Do Not Use SQLite at this Situation Applications with Multi-User Applications which require frequent writing Max DB size: 140 terabytes (2^47 bytes) Max row size: 1 gigabyte ….THE END….