Database Management System (DBMS) - Introduction, Components, and Models PDF
Document Details
![PrettyKazoo9208](https://quizgecko.com/images/avatars/avatar-16.webp)
Uploaded by PrettyKazoo9208
Tags
Summary
This document provides an introduction to Database Management Systems (DBMS), software that interacts with applications to analyze data. It covers the history, properties, components, and provides examples of relational SQL and non-relational NoSQL database models.
Full Transcript
Introduction to Database Management System (DBMS) A Database Management System (DBMS) is software that interacts with users, applications, and the database itself to capture and analyze data. It enables users to store, modify, and retrieve data efficiently. Key Features of DBMS: Data Storage: Sto...
Introduction to Database Management System (DBMS) A Database Management System (DBMS) is software that interacts with users, applications, and the database itself to capture and analyze data. It enables users to store, modify, and retrieve data efficiently. Key Features of DBMS: Data Storage: Stores data systematically. Data Retrieval: Allows quick and efficient data access. Data Security: Ensures that unauthorized users cannot access the data. Data Integrity: Maintains accuracy and consistency of data. Concurrency Control: Manages simultaneous data access by multiple users. Backup and Recovery: Protects data against loss and ensures recovery in case of failure. Examples: Oracle, MySQL, PostgreSQL, Microsoft SQL Server, MongoDB. History of Databases 1. Early Systems: File-Based Systems (1960s): Before DBMS, data was stored in flat files. It lacked structure and made data retrieval cumbersome. Problems: Redundancy and inconsistency. Lack of data sharing and security. 2. Hierarchical and Network Models (1970s): Hierarchical Model: Data is organized in a tree-like structure (e.g., IMS by IBM). Network Model: Data is organized in a graph, allowing multiple relationships (e.g., CODASYL DBMS). 3. Relational Model (1980s): xData is stored in tables (relations) with rows and columns. Simplified data management and became the foundation for modern DBMS. 4. Object-Oriented and NoSQL Databases (1990s - Present): Object-Oriented DBMS: Incorporates object-oriented programming principles. NoSQL Databases: Designed for unstructured and semi-structured data (e.g., MongoDB, Cassandra). Properties of Databases 1. Self-Describing Nature: Databases include metadata that describes the structure and constraints of the data. 2. Data Abstraction: Provides three levels of abstraction: Physical Level: How data is stored physically. Logical Level: Structure of the data (e.g., tables). View Level: Customized views for different users. 3. Data Independence: Logical Independence: Changes in schema do not affect application programs. Physical Independence: Changes in physical storage do not affect logical schema. 4. Multi-User Environment: Allows multiple users to access and modify data concurrently. 5. Data Integrity and Security: Maintains accuracy and prevents unauthorized access. 6. Transactions and Concurrency Control: Ensures atomicity, consistency, isolation, and durability (ACID properties). Components of a DBMS 1. Hardware: Physical devices required to store and access the database Examples: Hard drives, servers, and network devices. 2. Software: The DBMS software that provides tools for managing the database Examples: SQL engine, query processor. 3. Data: The actual data stored in the database. Includes user data, metadata, and indexes. 4. Users: Database Administrators (DBAs): Manage and maintain the database. Developers: Write queries and applications to interact with the database. End Users: Access the data through applications. 5. Database Schema:The logical structure of the database, including tables, views, indexes, and constraints. 6. Query Language: Provides a way to interact with the database. Examples: SQL (Structured Query Language), NoSQL for non-relational databases. Sub-Components of a DBMS 1. Storage Manager Handles the storage, retrieval, and updating of data in the database. Components: File Manager: Manages the space on the storage device. Buffer Manager: Temporarily stores data in memory during transactions. Authorization and Integrity Manager: Ensures only authorized users access data and maintains integrity constraints. 2. Query Processor Translates and executes database queries. Components: DML Compiler: Translates data manipulation language (DML) commands into low-level instructions. Query Optimizer: Determines the most efficient way to execute a query. Query Evaluator: Executes the optimized query plan. 3. Transaction Manager Ensures database consistency and handles transactions. Components: Concurrency Control Manager: Manages simultaneous operations to avoid conflicts. Recovery Manager: Restores the database to a consistent state in case of failures. Sub-Components of a DBMS 4. Metadata Manager Maintains and manages metadata (data about the database schema, structure, etc.). Provides information to other components for query optimization and execution. 5. Database Engine The core of the DBMS that executes commands and interacts with the stored data. 6. Data Dictionary Stores metadata such as table definitions, schemas, indexes, and user privileges. 7. User Interfaces Provides an interface for users to interact with the DBMS. Examples: Command-line interface (CLI) Graphical user interface (GUI) APIs for developers Difference between Relational and Non-Relational DBMS Aspect Relational Databases (SQL) Non-Relational Databases (NoSQL) Structured data stored in tables with rows and Flexible, stores data in formats like documents, key- Data Structure columns. value pairs, graphs, or columns. Rigid schema: predefined structure that must be Schema-less or flexible schema, allowing dynamic Schema followed. changes. Handles structured, semi-structured, and unstructured Data Types Handles structured data. data. Scales vertically (adding more resources to a single Scales horizontally (adding more servers to distribute Scalability server). the data). Uses APIs, custom query languages, or query methods Query Language Uses SQL (Structured Query Language). specific to the database. Well-suited for managing relationships between data Limited support for complex relationships, though Relationships using primary and foreign keys. graph databases handle them well. Performance may decrease with large, unstructured Optimized for high-performance, distributed Performance datasets. environments with large datasets. Highly flexible, allowing changes to the data model Flexibility Less flexible due to rigid schema design. without downtime. Examples MySQL, PostgreSQL, Oracle, Microsoft SQL Server. MongoDB, Cassandra, Neo4j, Redis, DynamoDB. Characteristics of DBMS 1.Data Abstraction: Provides an abstraction layer to simplify the complexity of data and hide internal details. 2.Data Independence: Ensures changes in the schema at one level don't affect another level (logical and physical data independence). 3.Data Security and Integrity: Offers authentication and encryption mechanisms to safeguard data and maintain its accuracy. 4.Transaction Management: Supports ACID (Atomicity, Consistency, Isolation, Durability) properties to ensure reliable transactions. 5.Concurrency Control: Allows multiple users to access the database simultaneously without conflicts. 6.Data Redundancy Control: Minimizes duplicate data by using a centralized system. 7.Data Sharing: Facilitates shared access to data across multiple users and applications. 8.Backup and Recovery: Provides automated backup and recovery mechanisms for data protection. 9.Support for Multiple Views: Allows different users to have customized views of the database. 10.Query Processing: Includes a query language (e.g., SQL) for efficient data retrieval and manipulation. Objectives of DBMS Data Centralization: To manage data in a centralized manner, avoiding redundancy. Efficient Data Management: To allow easy storage, retrieval, and update of data. Data Integrity: To maintain consistency and correctness of data. Data Security: To protect data from unauthorized access. Support for Multi-user Environment: To allow multiple users to work on the database concurrently. Ease of Data Sharing: To enable different applications and users to access shared data. Scalability: To accommodate growth in data volume and user base efficiently. Objectives of DBMS Data Centralization: To manage data in a centralized manner, avoiding redundancy. Efficient Data Management: To allow easy storage, retrieval, and update of data. Data Integrity: To maintain consistency and correctness of data. Data Security: To protect data from unauthorized access. Support for Multi-user Environment: To allow multiple users to work on the database concurrently. Ease of Data Sharing: To enable different applications and users to access shared data. Scalability: To accommodate growth in data volume and user base efficiently. Applications of DBMS 1. Banking Systems Used to manage customer accounts, transactions, loans, and deposits. Ensures secure and reliable transaction processing. Example: Online banking systems and ATMs. 2. Airline and Railway Reservations Maintains flight and train schedules, ticket bookings, and passenger information. Facilitates real-time seat availability and booking updates. 3. Education and Academic Institutions Stores student records, course registrations, grades, and faculty details. Supports library management systems and e-learning platforms. 4. Healthcare Systems Manages patient records, appointments, prescriptions, and billing. Enables real-time data sharing among departments and healthcare providers. Example: Hospital management systems. 5. E-commerce and Retail Manages product inventories, orders, customer profiles, and transactions. Ensures smooth online shopping experiences with recommendations and payment processing. Example: Amazon, Flipkart Applications of DBMS 6. Telecommunications 1. Handles customer data, call records, billing information, and network usage. 2. Supports prepaid and postpaid account management. 7. Government and Public Sector 3. Manages citizen records, tax data, land registries, and census information. 4. Used in e-Governance platforms for efficient public service delivery. 8. Finance and Accounting 5. Tracks financial transactions, tax records, and payrolls. 6. Used in budgeting, auditing, and financial reporting systems. 9.Manufacturing and Supply Chain Management 7. Manages inventory, production schedules, and supply chain logistics. 8. Ensures efficient procurement and distribution of goods. 10. Media and Entertainment Organizes and retrieves large volumes of digital content such as videos, music, and images. Example: Streaming platforms like Netflix or Spotify. Merits of DBMS Reduced Data Redundancy: Centralized management eliminates duplication of data. Improved Data Security: Authentication and access controls protect sensitive data. Consistency and Integrity: Rules ensure consistent and valid data. Efficient Query Processing: SQL and other languages enable easy and quick data retrieval. Backup and Recovery: Automated features ensure data safety during failures. Multi-user Access: Supports simultaneous data access by multiple users. Data Independence: Facilitates application and data changes without affecting each other Demerits of DBMS High Initial Cost: Installation and setup of a DBMS can be expensive. Complexity: Requires trained professionals for operation and maintenance. Performance Overhead: Complex systems may have slower performance for certain tasks. Hardware Requirements: Needs robust hardware for efficient functioning. Potential Data Breaches: Centralized data is vulnerable to cyberattacks if security is weak. Maintenance Costs: Regular updates and backups add to operational costs. Dependency on Vendors: Organizations may become dependent on the DBMS vendor for support and upgrades. Entity Relationship Model 1. Entities: Shape: Rectangle Definition: Represents an object or concept within the system, such as a person, place, or event. Examples: Student, Teacher, Product. Types: Strong Entity: Independent existence. Weak Entity: Dependent on a strong entity. 2. Relationships : Shape: Diamond Definition: Associations between two or more entities, Illustrates how two entities share information in the database. Examples: Student Enrolls in Course, Employee Works for Department. Types: Unary (Self-relationship). Binary (Relationship between two entities). Ternary (Involving three entities). 3. Attributes: Shape: Oval Definition: Properties or characteristics of an entity or relationship. Key Attribute: An attribute that uniquely identifies an entity instance; often underlined. Composite Attribute: An attribute composed of multiple sub-attributes. Multivalued Attribute: An attribute that can have multiple values; depicted with a double oval. Derived Attribute: An attribute whose value can be calculated from other attributes; shown with a dashed oval. Example: A Student entity may have attributes like Name, Age, Roll Number. Entity Relationship Model Cardinalities: Definition: Specifies the number of entity instances associated with another entity. Types: One-to-One (1:1). One-to-Many (1:N). Many-to-Many (M:N). Unified Modeling Language (UML) 1. Class: A blueprint for creating objects. Example: A Car class may have attributes like color, model, year. 2. Attributes: Characteristics or properties of a class. Example: For the Car class: color, price, engine type. 3. Methods: Functions or operations defined in a class to perform actions. Example: Methods for Car class: startEngine(), applyBrakes(). 4. Visibility: Specifies the accessibility of attributes and methods. Types: Public (+): Accessible from everywhere. Private (-): Accessible only within the class. Protected (#): Accessible within the class and its subclasses. Unified Modeling Language (UML) 5. Associations: Relationships between classes. Example: Car and Driver could have an association where a Driver drives a Car. 6. Multiplicity: Specifies how many instances of one class are associated with another class. Example: A Professor may teach 0...n courses. 7. Specialization: Represents inheritance or the "is-a" relationship. Example: A Truck is a specialization of a Vehicle. 8. Interfaces and Implementation: Interface: A contract that defines a set of methods without implementation. Example: An interface Shape might declare methods like calculateArea(). Implementation: Concrete classes implement the interface and define the behavior. Responsible for lifecycle of another Aggregation relationship can exist independently Inheritanc e relnshp between classes 1. Multiplicity defines the number of instances of one class that are associated with a single instance of another class in a relationship. Example: A teacher can teach multiple students, but each student has only one teacher in a particular subject. A library has many books, but a book belongs to only one library. Diagram Representation: 1:1 (One-to-One) → A person has one passport. 1:M (One-to-Many) → A customer can place multiple orders. M:M (Many-to-Many) → A student enrolls in multiple courses, and a course has multiple students. Customer and Order: One customer can place multiple orders, but each order belongs to one customer (1:M). Order and Food Item: One order can contain multiple food items, and each food item can be part of multiple orders (M:M). Restaurant and Food Item: One restaurant offers multiple food items, but each food item belongs to one restaurant (1:M). 2. Specialization: Specialization is a process where a general class is divided into multiple specialized subclasses with additional attributes or behaviors. Example: A general class "Vehicle" can be specialized into "Car", "Bike", and "Truck". Each subclass inherits common properties (like speed, engine, fuel type) but also has its own unique properties. A doctor can specialize as a cardiologist, neurologist, or orthopedic surgeon. Diagram representation: Use an inheritance (arrow) to show the parent-child relationship. 3. Interfaces: An interface defines a set of methods that a class must implement but does not provide the actual implementation. Example : Think of an electric socket (interface) – different devices (fan, charger, TV) plug into it but implement their own functionality. In programming, an interface defines a contract that classes must follow. 4. Implementation: Implementation refers to writing the actual logic of a class or an interface in a program. Example: A blueprint of a house (interface) gives a plan, but the actual construction (implementation) happens when workers build it. A company policy (interface) provides rules, and employees implement them in daily work. 1. Concurrency Control Concurrency control ensures that multiple transactions can be executed simultaneously without leading to conflicts like dirty reads, lost updates, or uncommitted data issues. Purpose of Concurrency Control: Prevents data inconsistency when multiple users access the database. Ensures ACID properties (Atomicity, Consistency, Isolation, Durability). Avoids issues like deadlocks and race conditions. 2. Optimistic Concurrency Control (OCC) Concept: Assumes conflicts are rare and allows transactions to execute without locks. At the commit phase, the system validates whether the transaction conflicts with others. Steps: 1.Read Phase: Transaction reads data without locking it. 2.Validation Phase: Before committing, it checks if any other transaction has modified the same data. 3.Write Phase: If no conflicts, the transaction commits; otherwise, it rolls back. Example: Used in high-read, low-write environments like online booking systems. 3. Pessimistic Concurrency Control (PCC) Assumes conflicts are common and prevents them by locking resources before executing transactions. It avoids conflicts but may cause delays due to locking. Types of Locks: Shared Lock (S): Multiple transactions can read, but no one can write. Exclusive Lock (X): Only one transaction can read and write. Example: Used in banking systems where transaction conflicts are likely. 4. Two-Phase Locking (2PL) Two-Phase Locking (2PL) is a locking protocol that ensures serializability by dividing transactions into two phases: 1.Growing Phase: The transaction acquires locks but cannot release them. 2.Shrinking Phase: The transaction releases locks but cannot acquire new ones. Types of 2PL: Strict 2PL: Holds all locks until the transaction commits or aborts. Rigorous 2PL: Holds exclusive locks until the transaction ends, ensuring strict execution order. Example: Used in bank transfers, where partial updates must be prevented. Time Scheduler: Time scheduler is responsible for managing the execution of transactions while maintaining concurrency and ensuring consistency. It is an essential part of the transaction management system, which ensures that database operations follow the principles of ACID (Atomicity, Consistency, Isolation, Durability). Functions of a Time Scheduler in DBMS 1.Transaction Scheduling 1. Controls the order of transaction execution. 2. Ensures that concurrent transactions do not lead to inconsistencies. 2.Concurrency Control 1. Prevents conflicts between transactions that access shared data. 2. Implements protocols like Lock-based protocols, Timestamp-based protocols, and Optimistic concurrency control. 3.Deadlock Prevention & Detection 1. Avoids deadlocks by controlling the sequence of transactions. 2. Uses techniques like wait-die and wound-wait schemes. 4.Ensuring Serializability 1. Ensures that the final execution of transactions is equivalent to some serial execution. 2. Uses serial schedules, conflict serializability, and view serializability. 5.Timestamp Ordering 1. Assigns a unique timestamp to each transaction when it begins. 2. Ensures that transactions execute in a timestamp-based order to prevent conflicts. Types of Scheduling in DBMS 1.Serial Scheduling 1. Transactions execute one after another without overlapping. 2. Guarantees consistency but reduces performance. 2.Concurrent Scheduling 1. Allows multiple transactions to execute simultaneously. 2. Requires concurrency control to avoid conflicts. 3.Preemptive Scheduling 1. A running transaction can be interrupted and another transaction can be executed. 2. Helps in load balancing and preventing long-running transactions from blocking others. 4.Non-preemptive Scheduling 1. A transaction runs until completion without being interrupted. 2. Ensures atomicity but can lead to longer waiting times. Assume a bank account with an initial balance of ₹5000. Two transactions operate on it: 1.T1 (Timestamp = 5): Deposits ₹2000 into the account. 2.T2 (Timestamp = 8): Withdraws ₹1000 from the account. TransactionsT1: Deposit ₹2000 T1: Read(Balance) → ₹5000 T1: Balance = ₹5000 + ₹2000 = ₹7000 T1: Write(Balance) T1: Commit Time Event T1 Reads Balance 5 Transaction T2: Withdraw ₹1000 (₹5000) T2: Read(Balance) → ₹7000 T1 Updates Balance T2: Balance = ₹7000 - ₹1000 = ₹60006 (₹7000) T2: Write(Balance) T2: Commit T1 Writes Balance 7 (₹7000) 8 T1 Commits ✅ T2 Reads Balance 9 (₹7000) T2 Updates Balance 10 (₹6000) The Database System Environment The database system environment encompasses various components and processes that work together to manage and utilize databases efficiently. This environment includes the hardware, software, people, procedures, and data involved in the database system. Hardware: Servers: These are powerful computers that host the database management system (DBMS) and the databases themselves. They handle the processing of database queries and transactions. Storage Devices: These include hard drives, SSDs, and other storage media where the actual database data is stored. Network Infrastructure: This includes routers, switches, and other networking equipment that enable communication between clients and the database servers. Software: Database Management System (DBMS): The software that provides the interface for users and applications to interact with the database. It manages data storage, retrieval, security, and integrity. Examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server. Operating System: The underlying system software that manages hardware resources and provides services for the DBMS. Examples: Windows Server, Linux, UNIX. Application Software: Programs and applications that access and manipulate the database data. These can be custom-built applications or off-the-shelf software like ERP systems, CRM systems, etc. Utilities and Tools: These include backup and recovery tools, performance monitoring tools, and database design tools that assist in managing and maintaining the database system. People: Database Administrators (DBAs): Responsible for the installation, configuration, management, and maintenance of the database. They ensure the database’s performance, security, and availability. Database Designers: Design the database schema and structure, ensuring it meets the requirements and supports efficient data access. Developers: Write application code that interacts with the database, implementing business logic and user interfaces. End Users: Individuals who use applications to perform various tasks that involve querying and updating the database. They interact with the database indirectly through application interfaces. Procedures: Database Procedures: Standardized methods for designing, creating, using, and maintaining the database. This includes procedures for data entry, updates, backups, and recovery. Security Procedures: Policies and protocols to protect the database from unauthorized access, breaches, and other security threats. This includes user authentication, access controls, and encryption. Maintenance Procedures: Regular tasks performed to ensure the database runs efficiently and without errors. This includes indexing, performance tuning, and data integrity checks. Data: User Data: The actual data stored in the database, such as customer records, transaction details, and product information. Metadata: Data about data, which includes the database schema, data types, constraints, and relationships among tables. Indexes: Structures that improve the speed of data retrieval operations on a database table at the cost of additional writes and storage space. Logs: Records of all the transactions and changes made to the database, used for recovery and auditing purposes. Database Models A database model defines how data is structured, stored, and accessed within a database system. It determines how relationships between data elements are represented. Types of Database Models: 1.Hierarchical Model – Organizes data in a tree-like structure with parent-child relationships. 2.Network Model – Uses a graph structure to allow many-to-many relationships. 3.Relational Model – Stores data in tables with rows and columns, using keys for relationships. 4.Object-Oriented Model – Integrates object-oriented programming principles into databases, storing data as objects. Hierarchical Database Model A hierarchical database model organizes data in a tree-like structure, where each record (node) has a single parent but can have multiple children. This model follows a 1:N (one-to-many) relationship, meaning one parent can have multiple child records, but each child has only one parent. Structure: [CEO]