Unit-1.docx (1).docx.pdf
Document Details
Uploaded by TalentedLimeTree
Silver Oak University
Tags
Full Transcript
Course Name: Database Administration SEMESTER: 1 UNIT No 1 Unit Name : Introduction to Database Topics: ⮚ Overview of databases and their importance ⮚ Types of databases (relational, NoSQL, etc.) ⮚ Database System Architecture(1-tie...
Course Name: Database Administration SEMESTER: 1 UNIT No 1 Unit Name : Introduction to Database Topics: ⮚ Overview of databases and their importance ⮚ Types of databases (relational, NoSQL, etc.) ⮚ Database System Architecture(1-tier,2-tier,3-tier) ⮚ Roles and responsibilities of a database administrator (DBA) ⮚ Introduction to database management systems (DBMS) ⮚ Basic database terminology (tables, rows, columns, keys, etc.) ❖ Introduction to Database and its Importance Data Facts and figures that can be recorded or stored. E.g. Person Name, Age, Gender and Weight etc. Information When data is processed, organized, structured or presented in a given context so to make it useful, it is called information. Database A Database is a collection of interrelated (logically-related) data. OR A database is an organized collection of data, generally stored and accessed electronically from a computer system. E.g. Books Database in Library, Student Database in University etc. DBMS (Database Management System) A database management system is a collection of interrelated data and a set of programs to manipulate those data. DBMS = Database + Set of programs E.g. MS SQL Server, Oracle, My SQL, SQLite, MongoDB etc. Purpose: To efficiently store, retrieve, and manage data. Types of Databases 1. Relational Databases (RDBMS) Description: Use structured query language (SQL) for defining and manipulating data. They organize data into tables with rows and columns. Examples: MySQL, PostgreSQL, Oracle, Microsoft SQL Server. Use Cases: Transactional systems, business applications, where data integrity and complex queries are needed. 2. NoSQL Databases Description: Designed for unstructured or semi-structured data. They can be more flexible than RDBMS and handle a variety of data models. Types: ○ Document Stores: Store data as documents (e.g., JSON, BSON). Examples: MongoDB, CouchDB. Use Cases: Content management, real-time analytics. ○ Key-Value Stores: Store data as key-value pairs. Examples: Redis, DynamoDB. Use Cases: Caching, session management. ○ Column-Family Stores: Store data in columns rather than rows. Examples: Apache Cassandra, HBase. Use Cases: Big data applications, distributed systems. ○ Graph Databases: Store data as nodes and edges, suitable for interconnected data. Examples: Neo4j, Amazon Neptune. Use Cases: Social networks, recommendation engines. 3. Object-Oriented Databases Description: Store data in objects, similar to how programming languages handle data. They integrate with object-oriented programming languages. Examples: ObjectDB, db4o. Use Cases: Complex data relationships, applications where data and behavior are closely tied. 4. NewSQL Databases Description: Modern databases that provide the scalability of NoSQL systems while maintaining the SQL interface and relational database principles. Examples: Google Spanner, CockroachDB, VoltDB. Use Cases: High-performance applications needing ACID (Atomicity, Consistency, Isolation, Durability) properties. 5. Hierarchical Databases Description: Data is organized in a tree-like structure with parent-child relationships. Examples: IBM Information Management System (IMS). Use Cases: Organizational data, file systems. 7. Network Databases Description: Data is organized in a graph structure, allowing multiple parent-child relationships. Examples: Integrated Data Store (IDS), IDMS. Use Cases: Complex many-to-many relationships, organizational hierarchies.. 8. Object-Relational Databases Description: Combine features of relational databases with object-oriented programming capabilities. Examples: PostgreSQL (with object-relational features). Use Cases: Applications needing complex data models and advanced data types. Importance of Databases Data Organization and Management ○ Centralized data management, reducing redundancy and inconsistency. ○ Structured format allows for efficient data retrieval and manipulation. Data Integrity and Security ○ Enforcement of data integrity constraints ensures accuracy and reliability. ○ Security mechanisms protect data from unauthorized access and breaches. Scalability and Performance ○ Databases can handle large volumes of data and user requests efficiently. ○ Techniques like indexing and caching improve query performance. Backup and Recovery ○ Automated backup and recovery procedures protect against data loss. ○ Ensures business continuity and disaster recovery capabilities. Support for Data Analytics ○ Advanced querying and reporting tools facilitate data analysis and business intelligence. ○ Databases support integration with data analytics platforms and tools. Applications of DBMS In so many fields, we will use a database management system. Let’s see some of the applications where database management system uses − Railway Reservation System − The railway reservation system database plays a very important role by keeping record of ticket booking, train’s departure time and arrival status and also gives information regarding train late to people through the database. Library Management System − Now-a-days it’s become easy in the Library to track each book and maintain it because of the database. This happens because there are thousands of books in the library. It is very difficult to keep a record of all books in a copy or register. Now DBMS used to maintain all the information related to book issue dates, name of the book, author and availability of the book. Banking − Banking is one of the main applications of databases. We all know there will be a thousand transactions through banks daily and we are doing this without going to the bank. This is all possible just because of DBMS that manages all the bank transactions. Universities and colleges − Now-a-days examinations are done online. So, the universities and colleges are maintaining DBMS to store Student’s registrations details, results, courses and grade all the information in the database. For example, telecommunications. Without DBMS there is no telecommunication company. DBMS is most useful to these companies to store the call details and monthly postpaid bills. Credit card transactions − The purchase of items and transactions of credit cards are made possible only by DBMS. A credit card holder has to know the importance of their information that all are secured through DBMS. Social Media Sites − By filling the required details we are able to access social media platforms. Many users sign up daily on social websites such as Facebook, Pinterest and Instagram. All the information related to the users are stored and maintained with the help of DBMS. Finance − Now-a-days there are lots of things to do with finance like storing sales, holding information and finance statement management etc. these all can be done with database systems. Military − In military areas the DBMS is playing a vital role. Military keeps records of soldiers and it has so many files that should be kept secure and safe. DBMS provides a high security to military information. Online Shopping − Now-a-days we all do Online shopping without wasting the time by going shopping with the help of DBMS. The products are added and sold only with the help of DBMS like Purchase information, invoice bills and payment. Human Resource Management − the management keeps records of each employee’s salary, tax and work through DBMS. Manufacturing − Manufacturing companies make products and sell them on a daily basis. To keep records of all those details DBMS is used. Airline Reservation system − Just like the railway reservation system, airlines also need DBMS to keep records of flights arrival, departure and delay status. Types of Database System Architecture (1- tier, 2- tier, 3-tier) 1-Tier Architecture: Definition: Also known as the Single-Tier Architecture, where the entire application runs on a single machine. Characteristics: ○ Database management system (DBMS) and the user interface are on the same machine. ○ Suitable for small-scale applications with minimal concurrent users. ○ Limited scalability and performance, as all processing occurs on one system. Advantages: ○ Simple and easy to implement for small applications. ○ No network overhead, as everything is local. Disadvantages: ○ Lack of scalability for larger applications. ○ Difficult to maintain and update as the system grows. _________________________ | Application | | (User Interface) | |_________________________| | Application | | (Business Logic) | |_________________________| | Database | |_________________________| 2-Tier Architecture: Definition: Also known as Client-Server Architecture, it separates the user interface from the database and application logic. Components: ○ Client Tier: Front-end application running on the user’s machine (e.g., GUI). ○ Server Tier: Back-end database server handling data storage and retrieval. Characteristics: ○ Improved scalability compared to 1-tier architecture. ○ Allows multiple clients to connect to the same server concurrently. ○ Client handles presentation logic, while server manages data access and business logic. Advantages: ○ Improved scalability and performance compared to 1-tier. ○ Easier to maintain and update as logic is centralized on the server. Disadvantages: ○ Network dependency can lead to latency and performance issues. ○ Higher initial setup and maintenance costs due to server management. _________________________ _________________________ | Client | | Server | | (User Interface, GUI) | | (Application, Database) | |_________________________| |_________________________| Client Tier: Includes the user interface (UI) and presentation logic. Server Tier: Includes the application logic (business logic) and database management system. 3-Tier Architecture: Definition: Adds an additional layer between the client and server to separate presentation, application processing, and data management. Components: ○ Presentation Tier: Client interface for user interaction (e.g., web browser). ○ Application Tier (Middle Tier): Handles application logic and processing, often called business logic tier. ○ Data Tier: Backend database servers for data storage and retrieval. Characteristics: ○ Each tier can run on separate machines or clusters, enhancing scalability and reliability. ○ Enables easier distribution of workload and better resource management. ○ Promotes modular design, making the system more flexible and maintainable. Advantages: ○ Scalability and flexibility due to distributed architecture. ○ Improved performance by distributing load across different tiers. ○ Easier to maintain and update due to modular design. Disadvantages: ○ Increased complexity in design and implementation. ○ Higher initial setup and infrastructure costs due to multiple tiers. Client | | Application | | Database | | (User Interface, GUI) | | (Business Logic, App) | | (Data Storage, DBMS) | Comparison Summary: 1-Tier: Simple, suitable for small applications, but lacks scalability and maintenance flexibility. 2-Tier: Separates client and server, improving scalability and maintenance, but introduces network overhead. 3-Tier: Adds a middle layer for application logic, enhancing scalability, flexibility, and maintainability, but increases complexity and infrastructure costs. Most widely used architecture is 3-tier architecture. 3-tier architecture separates it tier from each other on the basis of users. 1) Database (Data) Tier At this tier, only the database resides. Database along with its query processing languages sits in layer-3 of 3-tier architecture. It also contains all relations and their constraints. 2) Application (Middle) Tier At this tier, the application server and program, which access the database, resides. For a user this application tier works as an abstracted view of the database. Users are unaware of any existence of databases beyond application. For database-tier, application tier is the user of it. Database tier is not aware of any other user beyond the application tier. This tier works as a mediator between the two. 3) User (Presentation) Tier An end user sits on this tier. From a user’s aspect, this tier is everything. He/she doesn't know about any existence or form of database beyond this layer. At this layer multiple views of the database can be provided by the application. All views which are generated by an application, reside in the application tier. Database System levels (External, Conceptual, Internal) Mapping The process of transforming requests and results between the three levels is called mapping. Types of Mapping Conceptual/Internal Mapping External/Conceptual Mapping Conceptual/Internal Mapping It relates conceptual schema with internal schema. It defines correspondence between the conceptual schema and the database stored in physical devices. It specifies how conceptual records and fields are presented at the internal level. If the structure of stored database is changed, then conceptual/internal mapping must be changed accordingly and conceptual schema can remain invariant. There could be one mapping between conceptual and internal levels. External/Conceptual Mapping It relates each external schema with conceptual schema. It defines correspondence between a particular external view and conceptual schema. If the structure of conceptual schema is changed, then external/conceptual mapping must be changed accordingly and external schema can remain invariant. There could be several mappings between external and conceptual levels. Database Users and DBA There are four different database users. Application programmers These users are computer professionals who write application programs using some tools. E.g. Software developers Sophisticated users These users interact with the system without writing a program. They form their request in a database query language. E.g. Analyst. Specialized users These users write specialized database applications that do not fit into the traditional data processing framework. E.g. Database Administrator. Naive users These users are unsophisticated users who have very less knowledge of database systems. These users interact with the system by using one of the application programs that have been written previously. Examples, e.g. Clerk in bank DBA The database administrator is a person in the organization who controls the design and the Use of the database. DBA provides necessary technical support for implementing a database. DBA is involved more in the design, development, testing and operational phases. DBA is a technical person having knowledge of database technology. A DBA does not need to be a business person, but any kind of knowledge about the functionality of an organization can be more beneficial. DBA is a technically focused person, but he/she should understand more about the business to administer the databases effectively. Roles and Responsibilities of a Database Administrator (DBA): 1. Database Installation and Configuration: ○ Role: DBAs are responsible for installing and configuring database management systems (DBMS) based on organizational needs and requirements. ○ Responsibilities: Ensure proper setup, configuration, and optimization of DBMS software to achieve optimal performance and security. 2. Database Design: ○ Role: DBAs participate in database design and schema definition. ○ Responsibilities: Design tables, relationships, indexes, and constraints to ensure efficient data storage, retrieval, and integrity. 3. Data Security and Authorization: ○ Role: DBAs manage and enforce data security policies and access controls. ○ Responsibilities: Implement security measures such as user authentication, authorization, encryption, and auditing to protect sensitive data from unauthorized access or breaches. 4. Backup and Recovery: ○ Role: DBAs ensure data integrity and availability by implementing backup and recovery strategies. ○ Responsibilities: Schedule regular backups, perform recovery procedures in case of data loss or corruption, and test backup and recovery processes to ensure reliability. 5. Performance Monitoring and Tuning: ○ Role: DBAs monitor database performance and identify areas for optimization. ○ Responsibilities: Use monitoring tools to analyze query performance, identify bottlenecks, tune database parameters, and optimize SQL queries and indexes to improve overall system performance. 6. Database Maintenance and Patch Management: ○ Role: DBAs perform ongoing maintenance and apply patches and upgrades to database software. ○ Responsibilities: Schedule and apply patches, updates, and version upgrades to ensure the database system is up-to-date with security fixes, enhancements, and new features. 7. Capacity Planning and Scalability: ○ Role: DBAs plan for future growth and scalability of the database environment. ○ Responsibilities: Monitor resource usage, predict storage and processing needs, and plan for hardware upgrades or scaling out database systems as required by business growth or changing demands. 8. Disaster Recovery Planning: ○ Role: DBAs plan and prepare for database system failures and disaster recovery scenarios. ○ Responsibilities: Develop and maintain disaster recovery plans, conduct regular tests and simulations, and ensure backup systems and procedures are in place to minimize downtime and data loss in the event of a disaster. 9. Documentation and Compliance: ○ Role: DBAs maintain documentation and ensure compliance with regulatory requirements. ○ Responsibilities: Document database configurations, procedures, and policies. Ensure compliance with data protection regulations, industry standards, and organizational policies related to data management and security. 10. Troubleshooting and Problem Resolution: ○ Role: DBAs troubleshoot database issues and resolve technical problems. ○ Responsibilities: Investigate and diagnose database errors, performance issues, and data discrepancies. Implement corrective actions and collaborate with developers and system administrators to resolve issues promptly. 11. User Training and Support: ○ Role: DBAs provide support and training to database users and stakeholders. ○ Responsibilities: Assist users with database-related queries, provide technical guidance and training on database tools and features, and ensure users understand and adhere to best practices for efficient and secure database usage. Key Skills and Qualities of a DBA: Technical proficiency in database management systems (DBMS) such as Oracle, MySQL, SQL Server, etc. Strong understanding of database design principles, data modeling, and normalization. Knowledge of SQL (Structured Query Language) and database querying techniques. Analytical and problem-solving skills for troubleshooting and optimizing database performance. Ability to work well under pressure, prioritize tasks, and manage multiple projects simultaneously. Communication skills to collaborate with stakeholders, explain technical concepts to non-technical users, and document procedures effectively. Introduction to types of Data Models A database model is a type of data model that defines the logical structure of a database. It determines how data can be stored, accessed and updated in a database management system The most popular example of a database model is the relational model, which uses a table-based format. Type of Database Models are: 1. Hierarchical Model 2. Network Model 3. Entity-relationship Model 4. Relational Model 5. Object-oriented database model 1) Hierarchical Model The hierarchical model organizes data into a tree-like structure, where each record has a single parent or root. The hierarchy starts from the Root data, and expands like a tree, adding child nodes to the parent nodes. In hierarchical model, data is organized into a tree-like structure with a one-to-many relationship between two different types of data, for example, one the department can have many professors and many students. 2) Network Model This is an extension of the hierarchical model, allowing many-to-many relationships in a tree-like structure that allows multiple parents. 3) Entity-relationship Model In this database model, relationships are created by dividing object of interest into an entity and its characteristics into attributes. 4) Relational Model In this model, data is organized in two-dimensional tables and the relationship is maintained by storing a common attribute. Table 1 SubjectI SubjectName D 1 DBMS Patel 2 DS Shah Table 2 Rno StudentName 1 Raj 21 2 Meet 22 5) Object-oriented database model This data model is another method of representing real world objects. It considers each object in the world as objects and isolates it from each other. It groups its related functionalities together and allows inheriting its functionality to other related sub-groups. Database Management Systems (DBMS) Definition: Software that uses a standard method to store and organize data. Functions: Data storage, retrieval, update, and administration. Components: ○ DBMS Engine: Manages access to data. ○ Database Schema: Defines the logical structure of the database. ○ Query Processor: Executes queries against the database. ○ Transaction Manager: Ensures ACID properties. Purpose of DBMS The purpose of database systems is to manage the following insecurities: ▪ data redundancy and inconsistency, ▪ difficulty in accessing data, ▪ data isolation, ▪ atomicity of updates, ▪ concurrent access, ▪ security problems, and ▪ Supports multiple views of data. Data Independence Data independence is the ability to modify a schema definition in one level without affecting a schema definition in the next higher level. Types of data independence Physical data independence Logical data independence Physical data independence Physical data independence allows changing in physical storage devices or organization of files without change in the conceptual view or external view. Modifications at the internal level are occasionally necessary to improve performance. Physical data independence separates conceptual level from the internal level.It is easy to achieve physical data independence. Logical data independence Logical data independence is the ability to modify the conceptual schema without requiring any change in application programs. Conceptual schema can be changed without affecting the existing external schema. Modifications at the logical level are necessary whenever the logical structure of the database is altered. Logical data independence separates the external level from the conceptual view. It is difficult to achieve logical data independence. Basic Database Terminology Here's a list of basic database terminology along with their definitions: 1. Database: ○ A structured collection of data organized for efficient access, storage, and retrieval. 2. Table: ○ A collection of related data organized in rows and columns. 3. Row (or Record): ○ A single data entry representing a complete set of related attributes for an entity. 4. Column (or Field): ○ A vertical group of cells in a table that contains data of a particular type or domain. 5. Key: In databases, keys are crucial for establishing relationships between tables, ensuring data integrity, and enabling efficient data retrieval. Here are the various types of keys used in database systems along with explanations: 1. Primary Key: ○ Definition: A primary key is a unique identifier for each record in a table. It must be unique within the table and cannot contain null values. ○ Purpose: Ensures each record in a table can be uniquely identified and serves as the main reference point for establishing relationships with other tables. 2. Foreign Key: ○ Definition: A foreign key is a field (or a set of fields) in one table that references the primary key in another table. It establishes a link between the two tables. ○ Purpose: Enforces referential integrity by ensuring that values in the foreign key column must exist as primary key values in the referenced table. 3. Unique Key: ○ Definition: A unique key constraint ensures that all values in a column (or a set of columns) are unique, similar to a primary key. However, it allows null values. ○ Purpose: Ensures the uniqueness of data in a column or a combination of columns where uniqueness is required but not necessarily as the primary identifier. 4. Composite Key: ○ Definition: A composite key is a key that consists of multiple columns to uniquely identify records within a table. Each column within the composite key may not be unique by itself, but the combination of columns must be unique. ○ Purpose: Useful when no single column can uniquely identify a record, but a combination of columns can. 5. Super Key: ○ Definition: A super key is a set of one or more columns that uniquely identifies each row in a table. It can include more columns than necessary to uniquely identify rows. ○ Purpose: Primarily a theoretical concept used to describe any set of columns that can uniquely identify rows in a table. 6. Candidate Key: ○ Definition: A candidate key is a minimal super key, meaning it is a set of columns that uniquely identifies rows in a table, and no subset of those columns can uniquely identify rows. ○ Purpose: Helps in the design phase of a database to identify potential primary keys and ensures data integrity by providing unique identification. 7. Alternate Key: ○ Definition: An alternate key is any candidate key that is not chosen to be the primary key. ○ Purpose: While not serving as the primary identifier, alternate keys are still unique and can be used for querying and establishing relationships in a database. Understanding these different types of keys is fundamental for designing well-structured databases that efficiently manage data and enforce data integrity constraints. Each type of key plays a critical role in ensuring accurate data storage, retrieval, and relationship management within a database system. 6. Index: ○ A data structure that improves the speed of data retrieval operations on a database table at the cost of additional space and decreased insert/update performance. 7. Query: ○ A request for data manipulation or retrieval from a database, typically written in a query language like SQL (Structured Query Language). 8. Normalization: ○ The process of organizing data in a database to reduce redundancy and improve data integrity. 9. Denormalization: ○ The process of intentionally adding redundancy to a database to improve read performance of queries at the cost of some additional storage and potential update anomalies. 10. Transaction: ○ A unit of work performed against a database, treated as a single logical operation that either succeeds completely or fails completely. 11. View: ○ A virtual table generated from the result of a query, which presents data from one or more tables in a structured format. 12. Schema: ○ The logical structure of a database that defines its tables, fields, relationships, constraints, and other characteristics. 13. Backup and Recovery: ○ Processes and procedures to protect data from loss and recover it in case of database failures, disasters, or corruption. 14. ACID Properties: ○ A set of four properties (Atomicity, Consistency, Isolation, Durability) that guarantee reliable transactions in a database system. Metadata Metadata is data about data. Data such as table name, column name, data type, authorized user and user access privileges for any table is called metadata for that table. customers..