🎧 New: AI-Generated Podcasts Turn your study notes into engaging audio conversations. Learn more

Loading...
Loading...
Loading...
Loading...
Loading...
Loading...
Loading...

Transcript

Unit 1 Introduction to DBMS P A 1.1 Fundamental concepts - data, information, database, DBMS, table, row, field...

Unit 1 Introduction to DBMS P A 1.1 Fundamental concepts - data, information, database, DBMS, table, row, field G 1.2 Need of Information - Business, Research, Governance E 1.3 Requirement of DBMS 5 1.4 Levels of Abstraction in DBMS 1.5 Database Users and Administrator Define the following terms. Data Data is raw, unorganized facts that need to be processed. Or Data: Facts, figures, statistics etc. having no particular meaning (e.g. 1, ABC, 19 etc). Record: Collection of related data items, e.g. in the above example the three data items had no meaning. But if we organize them in the following way, then they collectively represent meaningful information. Roll Name Age 1 ABC 19 Table or Relation: Collection of related records. Roll Name Age 1 ABC 19 2 DEF 22 3 XYZ 28 The columns of this relation are called Fields, Attributes or Domains. The rows are called Tuples or Records. Database: Collection of related relations. Consider the following collection of tables: T1 T2 R Na A R Addr me g ess e KOL T3 T4 We now have a collection of 4 tables. They can be called a “related collection” because we can clearly find out that there are some common attributes existing in a selected pair of tables. Because of these common attributes we may combine the data of two or more tables together to find out the complete details of a student. Questions like “Which hostel does the youngest student live in?” can be answered now, although Age and Hostel attributes are in different tables. A database in a DBMS could be viewed by lots of different people with different responsibilities. Figure 1.1: Empolyees are accessing Data through DBMS For example, within a company there are different departments, as well as customers, who each need to see different kinds of data. Each employee in the company will have different levels of access to the database with their own customized front-end application. In a database, data is organized strictly in row and column format. The rows are called Tuple or Record. The data items within one row may belong to different data types. On the other hand, the columns are often called Domain or Attribute or Fields. All the data items within a single attribute are of the same data type. Information:- When data is processed, organized, structured or presented in a given context so as to make it useful, it is called information. Example:- A student secures 450 marks. Here 450 is data marks of the student is the information. Database A Database is a collection of inter-related data. DBMS (Database Management System) A database management system is a collection of inter- related data and set of programs to manipulate those data. DBMS = Database + Set of programs A simple example of a university database. This database is maintaining information about students, courses, and grades in a university environment. The database is organized as five files: The STUDENT file stores the data of each student The COURSE file stores contain data on each course. The SECTION stores information about sections in a particular course. The GRADE file stores the grades which students receive in the various sections The TUTOR file contains information about each professor. Metadata Metadata is data about data.Data such as table name, column name, data type authorized user and user access for any table is called metadata for that table. Example :- Table metadata(The names, data types, constraints for columns for a table) Table named “customers” might have columns for “customer ID”(integer,primary key)”First name”(varchar) and “last name”(varchar). Data dictionary Data dictionary is information which contains metadata. It is usually a part of the system catalog. Data warehouse Data warehouse is information which stored data. It is design to facilitate reporting and analysis. Record A record is a collection of logically related fields. For examples, collection of fields (id, name, society & city) forms a record for customer. Field:- A field is a character or group of characters that have a specific meaning. It is also called a data item. It is represented in the database by a value. For Example:- customer id, name, society and city are all fields for customer Data. Explain disadvantages of file system (file processing systems) compare to Database management system. OR Explain disadvantages of conventional file-based System compared to Database management system. Data Redundancy It is possible that the same information may be duplicated in different files. This leads to data redundancy. Data redundancy results in memory wastage. For example, consider that some customers have both kinds of accounts - saving and current. In this case, data about customers - name, address, e-mail, contact number -Will be duplicated in both files, file for saving accounts and file for current accounts. This leads to requirement of higher storage space. In other words, same information will be stored in two different locations (files). And, it wastes memory. Data Inconsistency Due to data redundancy, it is possible that data may not be in consistent state. For example, consider that an address of some customer changes. And, that customer has both kinds of accounts. Now, it is possible that this changed address is updated in only one file, leaving address in other file as it is. As a result of this, same customer will have two different addresses in two different files, making data inconsistent. Difficulty in Accessing Data Accessing data is not convenient and efficient in file processing system. For example, suppose, there is a program to find information about all customers. But, what if there is a need to find out all customers from some particular city. In this case, there are two choices here: One, find out all customers using available program, and then extract the needed customers manually. Second, develop new program to get required information. Both options are not satisfactory. For each and every different kind of data access, separate programs are required. This is neither convenient nor efficient. Limited Data Sharing Data are scattered (distrubted) in various files. Different files may have different formats. And these files may be stored in different folders (directories) may be of different computers of different departments. So, due to this data isolation, it is difficult to share data among different applications. 6 Prepared:-Darshana v.Halatwala Integrity Problems Data integrity means that the data contained in the database is both correct and consistent. For this purpose, the data stored in database must satisfy certain types of constraints (rules). For example, a balance for any account must not be less than zero. Such constraints are enforced in the system by adding appropriate code in application programs. But, when new constraints are added, such as balance should not be less than Rs. 5000, application programs need to be changed. But, it is not an easy task to change programs whenever required. Atomicity Problems Any operation on database must be atomic. This means, operation completes either 100% or 0%. For example, a fund transfer from one account to another must happen in its entirely. But, computer systems are vulnerable to failure, such as system crash, virus attack. If a system failure occurs during the execution of fund transfer operation, it may possible that amount to be transferred, say, Rs. 500, is debited from one account, but is not credited to another account. This leaves database in inconsistent state. But, it is difficult to ensure atomicity in a file processing system. Concurrent Access Anomalies Multiple users are allowed to access data simultaneously (concurrently). This is for the sake of better performance and faster response. Consider an operation to debit (withdrawal) an account. The program reads the old balance, calculates the new balance, and writes new balance back to database. Suppose an account has a balance of Rs. 5000. Now, a concurrent withdrawal of Rs. 1000 and Rs. 2000 may leave the balance Rs. 4000 or Rs. 3000 depending upon their completion time rather than the correct value of Rs. 2000. Here, concurrent data access should be allowed under some supervision. But, due to lack of co-ordination among different application programs, this is not possible in file processing systems. Security Problems Database should be accessible to users in a limited way. Each user should be allowed to access data concerning his application only. For example, a customer can check balance only for his/her own account. He/She should not have access for information about other accounts. But, in file processing system, application programs are added in an ad hoc manner by different programmers. So, it is difficult to enforce such kind of security constraints. Explain advantages (benefits) of DBMS over file management system. OR Explain purpose of database system. Minimal Data Redundancy (Duplication) Due to centralized database, it is possible to avoid unnecessary duplication of information. This leads to reduce data redundancy. It prevents memory wastage. It also reduces extra processing time to get required data. Shared Data All authorized user and application program can share database easily. Data Consistency Data inconsistency occurs due to data redundancy. With reduced data redundancy such type of data inconsistency can be eliminated. This results in improved data consistency. Data Access DBMS utilizes a variety of techniques to retrieve data. Required data can be retrieved by providing appropriate query to the DBMS. Thus, data can be accessed in convenient and efficient manner. Data Integrity Data in database must be correct and consistent. So, data stored in database must satisfy certain types of constraints (rules). DBMS provides different ways to implement such type of constraints (rules). 7 Prepared:-Darshana v.Halatwala This improves data integrity in a database. Data Security Database should be accessible to user in a limited way. DBMS provides way to control the access to data for different user according to their requirement. It prevents unauthorized access to data. Thus, security can be improved. Concurrent Access Multiple users are allowed to access data simultaneously. Concurrent access to centralized data can be allowed under some supervision. This results in better performance of system and faster response. Guaranteed Atomicity Any operation on database must be atomic. This means, operation must be executed either 100% or 0%. This type of atomicity is guaranteed in DBMS. REQUIREMENTS FOR A DBMS The various softwares which handle the data in a database i.e. DBMS (like Oracle, FoxPro, SQL Server etc.) should meet the following requirements: - 1. Provide data definition facilities. Define Data Definition Language (DDL)(create,alter,drop) Provide user accessible catalog (Data Dictionary) 2. Provide facilities for storing, retrieving and updating data. Define Data Manipulation Language (DML) 3. Support Multiple View of Data End User or application should see only the need data and information required. 4. Provides facilities for specifying Integrity constraints. Primary Key Constraints Foreign Key Constraints More General Constraints 5. Provide facilities for controlling access to data. Prevent unauthorized access and update. 6. Allow simultaneous access and update by multiple users. Provide concurrency control mechanism. 7. Support Transactions. A sequence of operations to be performed as a whole. All operations are performed or none. 8. Provide facilities for database recovery. Bring database back to consistent state after a failure such as disk failure, faulty program etc. 9. Provide facilities for database maintenance. Maintenance operations: upload, reload, mass Insertion and deletion, validation etc. Data Dictionary in DBMS A data dictionary in Database Management System (DBMS) can be defined as a component that stores the collection of names, definitions, and attributes for data elements that are being used in a database. The Data Dictionary stores metadata, i.e., data about the database. These data elements are then used as part of a database, research project, or information system. Why Use a Data Dictionary? Data Dictionary is made up of two words, data which means the collected information through multiple sources, and dictionary meaning the place where all this information is made available. A data dictionary is a crucial(basic) part of a relational database as it provides additional information about the relationships between multiple tables in a database. The data dictionary in DBMS helps the user to arrange data in a neat and well-organized way, thus preventing data redundancy. Below is a data dictionary describing the table containing employee details. Attribute Name Data Type Size Description isRequired Employee ID Integer 10 A unique ID for yes each Employee 8 Prepared:-Darshana v.Halatwala Name Text 25 Name of the Yes Employee Date_of_birth Date Date of Birth of Yes the Employee Mobile no Integer 10 Contact Number yes of the Employee Some advantages of using a data dictionary are: 1. Data models in DBMS provide very little information about the database, so a data dictionary is very essential to have proper knowledge about entities, relationships, and attributes that are present in a data model. 2. The Data Dictionary provides consistency by reducing data redundancy in the collection and use of data across various members of a team. 3. The Data Dictionary provides structured analysis and design tools by enforcing the use of data standards. Data standards are the set of rules that govern the way data is collected, recorded, and represented. 4. Using a Data Dictionary helps to define naming conventions that are used in a model. Types of Data Dictionary in DBMS There are mainly two types of data dictionary in a database management system: 1. Integrated Data Dictionary 2. Stand Alone Data Dictionary 1. Integrated Data Dictionary Every relational database has an Integrated(combined)Data Dictionary contained within the DBMS. This integrated data dictionary acts as a system catalog that is accessed and updated by the relational database. In older databases, they did not include an integrated data dictionary, so in that case, the database administrator had to use Stand Alone Data Dictionary. In DBMS, an Integrated Data Dictionary can bind metadata to data. 2. Stand Alone Data Dictionary In DBMS, this type of data dictionary is very flexible as it allows the Database Administrator to define and manage all the confidential data. It doesn't matter whether the data is computerized or not. A stand-alone data dictionary allows database designers to interact with end-users regardless of the data dictionary format. There is no standard format for a data dictionary. Below given are some of the common elements: 1. Data Elements: The Data Dictionary stores the definition of all the data elements such as name, datatype, storage formats, and validation rules. 2. Tables: All information regarding the table, such as the user who created the table, the number of rows and columns, the date on which the table was created and accessed, etc. 3. Index: Indexes for defined database tables are stored in the data dictionary. DBMS stores the index name used by the attributes, location, and characteristics of the index, as well as the date of creation, in each index. 4. Programs: Programs defined to access the database, including reports, application and screen formats, SQL queries, etc., are also stored in the data dictionary. 5. Relationship between data elements: The Data Dictionary stores the type of relationship; for example, if it is compulsory or optional, the cardinality of the relationship and connectivity, etc. 6. Administrations and End-Users: The Data Dictionary stores all the information of the administration along with the end-users. What is data Abstraction in DBMS? Example:- Data abstraction is present in our daily lives. Let us take a small example. Say, someone, asks you to switch on the fans in a room. All you will need to do is simply walk to the switchboard and turn on the switch for the fan, that’s it! Do you need to know where the electricity is coming from, how the poles of the switch are connected, or exactly what the internal working of a fan is? The answer to all this is NO! That is what data abstraction is, all these background details are hidden from you inside the switchboard! Data Abstraction is a process of hiding unwanted or irrelevant details from the end user. It provides a different view and helps in achieving data independence which is used to enhance the security of data. The database systems consist of complicated data structures and relations. For users to access the data 9 Prepared:-Darshana v.Halatwala easily, these complications are kept hidden, and only the relevant part of the database is made accessible to the users through data abstraction. Levels of abstraction for DBMS Database systems include complex data-structures. In terms of retrieval of data, reduce complexity in terms of usability of users and in order to make the system efficient, developers use levels of abstraction that hide irrelevant details from the users. Levels of abstraction simplify database design. Mainly there are three levels of abstraction for DBMS, which are as follows − Physical or Internal Level Logical or Conceptual Level View or External Level These levels are shown in the diagram below – Physical or Internal Level It is the lowest level of abstraction for DBMS which defines how the data is actually stored, it defines data- structures to store data and access methods used by the database. Actually, it is decided by developers or database application programmers how to store the data in the database. So, overall, the entire database is described in this level that is physical or internal level. It is a very complex level to understand. 10 Prepared:-Darshana v.Halatwala For example, customer's information is stored in tables and data is stored in the form of blocks of storage such as bytes, gigabytes etc. Logical or Conceptual Level Logical level is the intermediate level or next higher level. It describes what data is stored in the database and what relationship exists among those data. It tries to describe the entire or whole data because it describes what tables to be created and what are the links among those tables that are created. It is less complex than the physical level. Logical level is used by developers or database administrators (DBA). So, overall, the logical level contains tables (fields and attributes) and relationships among table attributes. View or External Level It is the highest level. In view level, there are different levels of views and every view only defines a part of the entire data. It also simplifies interaction with the user and it provides many views or multiple views of the same database. View level can be used by all users (all levels' users). This level is the least complex and easy to understand. For example, a user can interact with a system using GUI(Graphical user interface) that is view level and can enter details at GUI or screen and the user does not know how data is stored and what data is stored, this detail is hidden from the user. Instances and Schemas Databases change over time as information is inserted and deleted. The collection of information stored in the database at a particular moment is called an instance of the database. The overall design of the database is called the database schema. Schemas are changed infrequently, if at all. The concept of database schemas and instances can be understood by analogy to a program written in a programming language. A database schema corresponds to the variable declarations (along with associated type definitions) in a program. Each variable has a particular value at a given instant. The values of the variables in a program at a point in time correspond to an instance of a database schema. Database systems have several schemas, partitioned according to the levels of abstraction. The physical schema describes the database design at the physical level, while the logical schema describes the database design at the logical level. A database may also have several schemas at the view level, sometimes called subschemas, which describe different views of the database. Of these, the logical schema is by far the most important, in terms of its effect on application programs, since programmers construct applications by using the logical schema. The physical schema is hidden beneath the logical schema, and can usually be changed easily without affecting application programs. Application programs are said to exhibit physical data independence if they do not depend on the physical schema, and thus need not be rewritten if the physical schema changes. 11 Prepared:-Darshana v.Halatwala Levels of Data Independence Based on the data abstraction, there are two levels of data independence in DBMS: Physical level data independence Logical level data independence Let’s discuss the properties of these two levels of data independence. 1. Physical Level Data Independence Physical Data Independence can be defined as the ability to change the physical level without affecting the logical or Conceptual level. Physical data independence gives us the freedom to modify the - Storage device, File structure, location of the database, etc. without changing the definition of conceptual or view level. Example: For example, if we take the database of the banking system and we want to scale up the database by changing the storage size and also want to change the file structure, we can do it without affecting any functionality of logical schema. Below changes can be done at the physical layer without affecting the conceptual layer - Changing the storage devices like SSD, hard disk and magnetic tapes, etc. Changing the access technique and modifying indexes. Changing the compression techniques or hashing algorithms. 2. Logical Level Data Independence Logical Data Independence is a property of a database that can be used to change the logic behind the logical level without affecting the other layers of the database. Logical data independence is usually required for changing the conceptual schema without having to change the external schema or application programs. It allows us to make changes in a conceptual structure like adding, modifying, or deleting an attribute in the database. Example: If there is a database of a banking system and we want to add the details of a new customer or we want to update or delete the data of a customer at the logical level data will be changed but it will not affect the Physical level or structure of the database. These changes can be done at a logical level without affecting the application program or external layer. Adding, deleting, or modifying the entity or relationship. Merging or breaking the record present in the database. Database Administrators and Database Users A primary goal of a database system is to retrieve information from and store new information in the database. People who work with a database can be categorized as database users or database administrators. 12 Prepared:-Darshana v.Halatwala Administator End User Designers DBMS Administrators:- Administrators maintain the DBMS and are responsible for administering the database. Administrators also look after DBMS resources like system license, required tools and other software and hardware tools. Designers:-Designers are the group of people who actually work on the designing part of the database. End users:-End users are the people whose jobs require access to a database for querying, updating and generating reports. Database Users and User Interfaces There are four different types of database-system users,differentiated by the way they expect to interact with the system. Different types of user interfaces have been designed for the different types of users. 1)Naive users 2) sophisticated users 3) Application programmers 4)Specialized users Naive users are unsophisticated(inexperienced) users who interact with the system by invoking one of the application programs that have been written previously. For example, a bank teller who needs to transfer $50 from account A to account B invokes a program called transfer. This program asks the teller for the amount of money to be transferred, the account from which the money is to be transferred, and the account to which the money is to be transferred. As another example, consider a user who wishes to find her account balance over the World Wide Web. Such a user may access a form, where she enters her account number. An application program at the Web server then retrieves the account balance, using the given account number, and passes this information back to the user. The typical user interface for naive users is a forms interface, where the user can fill in appropriate fields of the form. Naive users may also simply read reports generated from the database. Application programmers are computer professionals who write application programs. Application programmers can choose from many tools to develop user interfaces. Rapid application development (RAD) tools are tools that enable an application programmer to construct forms and reports without writing a program. There are also special types of programming languages that combine imperative control structures (for example, for loops, while loops and if-then-else statements) with statements of the data manipulation language. 13 Prepared:-Darshana v.Halatwala These languages, sometimes called fourth-generation languages, Often include special features to facilitate the generation of forms and the display of data on the screen. Most major commercial database systems include a fourth generation language. Sophisticated users interact with the system without writing programs. Instead, they form their requests in a database query language. They submit each such query to a query processor, whose function is to break down DML(Select,insert,update) statements into instructions that the storage manager understands. Analysts who submit queries to explore data in the database fall in this category. Online analytical processing (OLAP) tools simplify analysts’ tasks by letting them view summaries of data in different ways. For instance, an analyst can see total sales by region (for example, North, South, East, and West) , or by product, or by a combination of region and product (that is, total sales of each product in each region). The tools also permit the analyst to select specific regions,look at data in more detail (for example, sales by city within a region) or look at the data in less detail (for example, aggregate products together by category). Another class of tools for analysts is data mining tools, which help them find certain kinds of patterns in data. Specialized users are sophisticated users who write specialized database applications that do not fit into the traditional data-processing framework. Among these applications are computer-aided design systems, knowledge base and expert systems, systems that store data with complex data types (for example, graphics data and audio data), and environment-modeling systems. What are the applications of DBMS? The Database Management System (DBMS) is defined as a software system that allows the user to define, create and maintain the database and provide control access to the data. It is a collection of programs used for managing data and simultaneously it supports different types of users to create, manage, retrieve, update and store information. Applications of DBMS In so many fields, we will use a database management system. Let’s see some of the applications where database management system uses − Railway Reservation System − The railway reservation system database plays a very important role by keeping record of ticket booking, train’s departure time and arrival status and also gives information regarding train late to people through the database. Library Management System − Now-a-days it’s become easy in the Library to track each book and maintain it because of the database. This happens because there are thousands of books in the library. It is very difficult to keep a record of all books in a copy or register. Now DBMS used to maintain all the information related to book issue dates, name of the book, author and availability of the book. Banking − Banking is one of the main applications of databases. We all know there will be a thousand transactions through banks daily and we are doing this without going to the bank. This is all possible just because of DBMS that manages all the bank transactions. Universities and colleges − Now-a-days examinations are done online. So, the universities and colleges are maintaining DBMS to store Student’s registrations details, results, courses and grade all the information in the database. For example, telecommunications. Without DBMS there is no telecommunication company. DBMS is most useful to these companies to store the call details and monthly postpaid bills. Credit card transactions − The purchase of items and transactions of credit cards are made possible only by DBMS. A credit card holder has to know the importance of their information that all are secured through DBMS. Social Media Sites − By filling the required details we are able to access social media platforms. Many users sign up daily on social websites such as Facebook, Pinterest and Instagram. All the information related to the users are stored and maintained with the help of DBMS. Finance − Now-a-days there are lots of things to do with finance like storing sales, holding information and finance statement management etc. these all can be done with database systems. Military − In military areas the DBMS is playing a vital role. Military keeps records of soldiers and it has so many files that should be kept secure and safe. DBMS provides a high security to military information. Online Shopping − Now-a-days we all do Online shopping without wasting the time by going shopping with the help of DBMS. The products are added and sold only with the help of DBMS like Purchase information, invoice bills and payment. 14 Prepared:-Darshana v.Halatwala Human Resource Management − The management keeps records of each employee’s salary, tax and work through DBMS. Manufacturing − Manufacturing companies make products and sell them on a daily basis. To keep records of all those details DBMS is used. Airline Reservation system − Just like the railway reservation system, airlines also need DBMS to keep records of flights arrival, departure and delay status. So finally, we can clearly conclude that the DBMS is playing a very important role in each and every field. Need of Information - Business, Research, Governance A database management system (DBMS) is a crucial part of an information system because it helps businesses manage and create data. Here are some ways that DBMSs help businesses: Data accuracy DBMSs help ensure data is accurate and reliable, which is important for making data-driven decisions. Data access DBMSs help users access data quickly and easily, which can improve the speed of sales cycles. Data integration DBMSs help businesses integrate data, which can help them better analyze data and make decisions. Data consistency DBMSs help ensure data is consistent, which is important for presenting accurate information in reports. Data security DBMSs help provide increased security for data. 15 Prepared:-Darshana v.Halatwala

Tags

DBMS database management information systems
Use Quizgecko on...
Browser
Browser