ch3.pdf
Document Details
Uploaded by ProdigiousQuantum
null
JTO
Tags
Full Transcript
JTO Ph-II DNIT DBMS CHAPTER 3 : DATABASE MANAGEMENT SYSTEM 3.1 OBJECTIVE The objectives of this chapter is to understand DBMS Concept and Characteristics...
JTO Ph-II DNIT DBMS CHAPTER 3 : DATABASE MANAGEMENT SYSTEM 3.1 OBJECTIVE The objectives of this chapter is to understand DBMS Concept and Characteristics DBMS Architecture DBMS Data models, instance and schemas RDBMS Concept Difference between DBMS and RDBMS 3.2 DBMS OVERVIEW Database is a collection of related data and data is a collection of facts and figures that can be processed to produce information. Mostly data represents recordable facts. Data aids in producing information, which is based on facts. For example, if we have data about marks obtained by all students, we can then conclude about toppers and average marks. A database management system stores data in such a way that it becomes easier to retrieve, manipulate, and produce information. 3.3 CHARACTERISTICS A modern DBMS has the following characteristics − Real-world entity − A modern DBMS is more realistic and uses real-world entities to design its architecture. It uses the behavior and attributes too. For example, a school database may use students as an entity and their age as an attribute. Relation-based tables − DBMS allows entities and relations among them to form tables. A user can understand the architecture of a database just by looking at the table names. Isolation of data and application − A database system is entirely different than its data. A database is an active entity, whereas data is said to be passive, on which the database works and organizes. DBMS also stores metadata, which is data about data, to ease its own process. Less redundancy − DBMS follows the rules of normalization, which splits a relation when any of its attributes is having redundancy in values. Normalization is a mathematically rich and scientific process that reduces data redundancy. JTO PH-II IT Version Page 32 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS Consistency − Consistency is a state where every relation in a database remains consistent. There exist methods and techniques, which can detect attempt of leaving database in inconsistent state. A DBMS can provide greater consistency as compared to earlier forms of data storing applications like file-processing systems. Query Language − DBMS is equipped with query language, which makes it more efficient to retrieve and manipulate data. A user can apply as many and as different filtering options as required to retrieve a set of data. Traditionally it was not possible where file-processing system was used. ACID Properties–DBMS follows the concepts of Atomicity, Consistency, Isolation, and Durability (normally shortened as ACID). These concepts are applied on transactions, which manipulate data in a database. ACID properties help the database stay healthy in multi-transactional environments and in case of failure. Multiuser and Concurrent Access − DBMS supports multi-user environment and allows them to access and manipulate data in parallel. Though there are restrictions on transactions when users attempt to handle the same data item, but users are always unaware of them. Multiple views − DBMS offers multiple views for different users. A user who is in the Sales department will have a different view of database than a person working in the Production department. This feature enables the users to have a concentrate view of the database according to their requirements. Security − Features like multiple views offer security to some extent where users are unable to access data of other users and departments. DBMS offers methods to impose constraints while entering data into the database and retrieving the same at a later stage. DBMS offers many different levels of security features, which enables multiple users to have different views with different features. For example, a user in the Sales department cannot see the data that belongs to the Purchase department. Additionally, it can also be managed how much data of the Sales department should be displayed to the user. Since a DBMS is not saved on the disk as traditional file systems, it is very hard for miscreants to break the code. 3.4USERS A typical DBMS has users with different rights and permissions who use it for different purposes. Some users retrieve data and some back it up. The users of a DBMS can be broadly categorized as follows − JTO PH-II IT Version Page 33 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS Figure 9: DBMS USERS Administrators − Administrators maintain the DBMS and are responsible for administrating the database. They are responsible to look after its usage and by whom it should be used. They create access profiles for users and apply limitations to maintain isolation and force security. Administrators also look after DBMS resources like system license, required tools, and other software and hardware related maintenance. Designers − Designers are the group of people who actually work on the designing part of the database. They keep a close watch on what data should be kept and in what format. They identify and design the whole set of entities, relations, constraints, and views. End Users − End users are those who actually reap the benefits of having a DBMS. End users can range from simple viewers who pay attention to the logs or market rates to sophisticated users such as business analysts. 3.5DBMS ARCHITECTURE The DBMS design depends upon its architecture. The basic client/server architecture is used to deal with a large number of PCs, web servers, database servers and other components that are connected with networks. The client/server architecture consists of many PCs and a workstation which are connected via the network. DBMS architecture depends upon how users are connected to the database to get their request done. TYPES OF DBMS ARCHITECTURE Database architecture can be seen as a single tier or multi-tier. But logically, database architecture is of two types like: 2-tier architecture and 3-tier architecture. Figure 10: Types of DBMS Architecture JTO PH-II IT Version Page 34 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS 1-Tier Architecture In this architecture, the database is directly available to the user. It means the user can directly sit on the DBMS and uses it. Any changes done here will directly be done on the database itself. It doesn't provide a handy tool for end users. The 1-Tier architecture is used for development of the local application, where programmers can directly communicate with the database for the quick response. 2-Tier Architecture The 2-Tier architecture is same as basic client-server. In the two-tier architecture, applications on the client end can directly communicate with the database at the server side. For this interaction, API's like: ODBC, JDBC are used. The user interfaces and application programs are run on the client-side. The server side is responsible to provide the functionalities like: query processing and transaction management. To communicate with the DBMS, client-side application establishes a connection with the server side. Figure 11: 2 Tier Architecture 3-Tier Architecture The 3-Tier architecture contains another layer between the client and server. In this architecture, client can't directly communicate with the server. The application on the client-end interacts with an application server which further communicates with the database system. End user has no idea about the existence of the database beyond the application server. The database also has no idea about any other user beyond the application. The 3-Tier architecture is used in case of large web application. JTO PH-II IT Version Page 35 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS Figure 12: 3 Tier Architecture 3.6DBMS DATA MODELS Data models define how the logical structure of a database is modeled. Data Models are fundamental entities to introduce abstraction in a DBMS. Data models define how data is connected to each other and how they are processed and stored inside the system. Entity-Relationship Model Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships among them. While formulating real-world scenario into the database model, the ER Model creates entity set, relationship set, general attributes and constraints. ER Model is best used for the conceptual design of a database. ER Model is based on − Entities and their attributes. Relationships among entities. These concepts are explained below. JTO PH-II IT Version Page 36 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS Figure 13: ER Model Entity − An entity in an ER Model is a real-world entity having properties called attributes. Every attribute is defined by its set of values called domain. For example, in a school database, a student is considered as an entity. Student has various attributes like name, age, class, etc. Relationship − The logical association among entities is called relationship. Relationships are mapped with entities in various ways. Mapping cardinalities define the number of association between two entities. Mapping cardinalities − o one to one o one to many o many to one o many to many Relational Model The most popular data model in DBMS is the Relational Model. It is more scientific a model than others. This model is based on first-order predicate logic and defines a table as an n- ary relation. Figure 14: Relational Model The main highlights of this model are − Data is stored in tables called relations. Relations can be normalized. In normalized relations, values saved are atomic values. Each row in a relation contains a unique value. JTO PH-II IT Version Page 37 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS Each column in a relation contains values from a same domain. 3.7DATABASE SCHEMA A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied on the data. A database schema defines its entities and the relationship among them. It contains a descriptive detail of the database, which can be depicted by means of schema diagrams. It‟s the database designers who design the schema to help programmers understand the database and make it useful. A database schema can be divided broadly into two categories − Physical Database Schema − This schema pertains to the actual storage of data and its form of storage like files, indices, etc. It defines how the data will be stored in a secondary storage. Logical Database Schema − This schema defines all the logical constraints that need to be applied on the data stored. It defines tables, views, and integrity constraints. Figure 15: Database Schema 3.8DATABASE INSTANCE It is important that we distinguish these two terms individually. Database schema is the skeleton of database. It is designed when the database doesn't exist at all. Once the database is operational, it is very difficult to make any changes to it. A database schema does not contain any data or information. JTO PH-II IT Version Page 38 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS A database instance is a state of operational database with data at any given time. It contains a snapshot of the database. Database instances tend to change with time. A DBMS ensures that its every instance (state) is in a valid state, by diligently following all the validations, constraints, and conditions that the database designers have imposed. 3.9DATA INDEPENDENCE A database system normally contains a lot of data in addition to users’ data. For example, it stores data about data, known as metadata, to locate and retrieve data easily. It is rather difficult to modify or update a set of metadata once it is stored in the database. But as a DBMS expands, it needs to change over time to satisfy the requirements of the users. If the entire data is dependent, it would become a tedious and highly complex job. Figure 16: Data Independence Metadata itself follows a layered architecture, so that when we change data at one layer, it does not affect the data at another level. This data is independent but mapped to each other. 3.10 LOGICAL DATA INDEPENDENCE Logical data is data about database, that is, it stores information about how data is managed inside. For example, a table (relation) stored in the database and all its constraints, applied on that relation. Logical data independence is a kind of mechanism, which liberalizes itself from actual data stored on the disk. If we do some changes on table format, it should not change the data residing on the disk. 3.11 PHYSICAL DATA INDEPENDENCE All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is the power to change the physical data without impacting the schema or logical data. JTO PH-II IT Version Page 39 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS For example, in case we want to change or upgrade the storage system itself − suppose we want to replace hard-disks with SSD − it should not have any impact on the logical data or schemas. 3.12 RDBMS RDBMS stands for Relational Database Management Systems.. All modern database management systems like SQL, MS SQL Server, IBM DB2, ORACLE, My-SQL and Microsoft Access are based on RDBMS. It is called Relational Data Base Management System (RDBMS) because it is based on relational model introduced by E.F. Codd. Codd's 12 Rules Dr Edgar F. Codd, after his extensive research on the Relational Model of database systems, came up with twelve rules of his own, which according to him, a database must obey in order to be regarded as a true relational database. Rule 1: Information Rule The data stored in a database, may it be user data or metadata, must be a value of some table cell. Everything in a database must be stored in a table format. Rule 2: Guaranteed Access Rule Every single data element (value) is guaranteed to be accessible logically with a combination of table-name, primary-key (row value), and attribute-name (column value). No other means, such as pointers, can be used to access data. Rule 3: Systematic Treatment of NULL Values The NULL values in a database must be given a systematic and uniform treatment. This is a very important rule because a NULL can be interpreted as one the following − data is missing, data is not known, or data is not applicable. Rule 4: Active Online Catalog The structure description of the entire database must be stored in an online catalog, known as data dictionary, which can be accessed by authorized users. Users can use the same query language to access the catalog which they use to access the database itself. Rule 5: Comprehensive Data Sub-Language Rule A database can only be accessed using a language having linear syntax that supports data definition, data manipulation, and transaction management operations. This language can be used directly or by means of some application. If the database allows access to data without any help of this language, then it is considered as a violation. Rule 6: View Updating Rule JTO PH-II IT Version Page 40 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS All the views of a database, which can theoretically be updated, must also be updatable by the system. Rule 7: High-Level Insert, Update, and Delete Rule A database must support high-level insertion, updation, and deletion. This must not be limited to a single row, that is, it must also support union, intersection and minus operations to yield sets of data records. Rule 8: Physical Data Independence The data stored in a database must be independent of the applications that access the database. Any change in the physical structure of a database must not have any impact on how the data is being accessed by external applications. Rule 9: Logical Data Independence The logical data in a database must be independent of its user‟s view (application). Any change in logical data must not affect the applications using it. For example, if two tables are merged or one is split into two different tables, there should be no impact or change on the user application. This is one of the most difficult rule to apply. Rule 10: Integrity Independence A database must be independent of the application that uses it. All its integrity constraints can be independently modified without the need of any change in the application. This rule makes a database independent of the front-end application and its interface. Rule 11: Distribution Independence The end-user must not be able to see that the data is distributed over various locations. Users should always get the impression that the data is located at one site only. This rule has been regarded as the foundation of distributed database systems. Rule 12: Non-Subversion Rule If a system has an interface that provides access to low-level records, then the interface must not be able to subvert the system and bypass security and integrity constraints. How It Works Data is represented in terms of tuples (rows) in RDBMS. Relational database is most commonly used database. It contains number of tables and each table has its own primary key. Due to a collection of organized set of tables, data can be accessed easily in RDBMS. TABLE The RDBMS database uses tables to store data. A table is a collection of related data entries and contains rows and columns to store data.A table is the simplest example of data storage in RDBMS. JTO PH-II IT Version Page 41 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS Figure 17: Database Table FIELD Field is a smaller entity of the table which contains specific information about every record in the table. In the above example, the field in the student table consists of id, name, age, and course. ROW/RECORD A row of a table is also called record. It contains the specific information of each individual entry in the table. It is a horizontal entity in the table. COLUMN A column is a vertical entity in the table which contains all information associated with a specific field in a table. For example: "name" is a column in the above table which contains all information about student's name. NULL VALUES The NULL value of the table specifies that the field has been left blank during record creation. It is totally different from the value filled with zero or a field that contains space. Data Integrity There are the following categories of data integrity exist with each RDBMS: Entity integrity: It specifies that there should be no duplicate rows in a table. Domain integrity: It enforces valid entries for a given column by restricting the type, the format, or the range of values. Referential integrity: It specifies that rows cannot be deleted, which are used by other records. User-defined integrity: It enforces some specific business rules that are defined by users. These rules are different from entity, domain or referential integrity. JTO PH-II IT Version Page 42 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS 3.13 DBMS VS RDBMS Although DBMS and RDBMS both are used to store information in physical database but there are some remarkable differences between them. The main differences between DBMS and RDBMS are given below: No. DBMS RDBMS 1) DBMS applications store data as file. RDBMS applications store data in a tabular form. 2) In DBMS, data is generally stored in In RDBMS, the tables have an identifier called either a hierarchical form or a primary key and the data values are stored in the navigational form. form of tables. 3) Normalization is not present in Normalization is present in RDBMS. DBMS. 4) DBMS does not apply any RDBMS defines the integrity constraint for the security with regards to data purpose of ACID (Atomocity, Consistency, Isolation manipulation. and Durability) property. 5) DBMS uses file system to store data, In RDBMS, data values are stored in the form of so there will be no relation between tables, so a relationship between these data values the tables. will be stored in the form of a table as well. 6) DBMS has to provide some uniform RDBMS system supports a tabular structure of the methods to access the stored data and a relationship between them to access the information. stored information. 7) DBMS does not support distributed RDBMS supports distributed database. database. 8) DBMS is meant to be for small RDBMS is designed to handle large amount of data. organization and deal with small it supports multiple users. data. it supports single user. JTO PH-II IT Version Page 43 of 131 For Restricted Circulation JTO Ph-II DNIT DBMS 9) Examples of DBMS are file Example of RDBMS are mysql, postgre, sql systems, xml etc. server, oracle etc. Table 3. DBMS Vs RDBMS After observing the differences between DBMS and RDBMS, you can say that RDBMS is an extension of DBMS. There are many software products in the market today who are compatible for both DBMS and RDBMS. Means today a RDBMS application is DBMS application and vice-versa. 3.14 CONCLUSION DBMS is perhaps most useful for providing a centralized view of data that can be accessed by multiple users, from multiple locations in controlled manner. The goal of DBMS is to offer more convenience as well as more efficiency to access data from database with high security. A general purpose DBMS is a software system designed to allow the definition, creation, querying, update and administration of databases. JTO PH-II IT Version Page 44 of 131 For Restricted Circulation