Database Management System

DATABASE MANAGEMENT SYSTEM (CS2102-1) Dr. Santhosh S Asst. Professor III Dept. of IS&E, NMAMIT, Nitte Course Learning Objectives 1. Provide a strong foundation in database concepts, design, and application. 2. Understand the concepts of relational model and relational algebra in database design. 3. Learn structured query language (SQL) to an intermediate/advanced level and evaluate the result set. 4. Understand the use of normalization techniques for building effective database design. 5. Demonstrate the use of File organization and Indexing, Concurrency Control, and transactions in databases. Syllabus Let’s Start the Journey File Systems Versus a DBMS Scenario: A university has a large collection (say, 100 GB) of data on employees, departments, students, and so on. This data is accessed concurrently by several employees or students. The questions about the data must be answered quickly, changes made to the data by different users must be applied consistently, and access to certain parts of the data (e.g., salaries) must be restricted. How to deal with this? Prepared By: Abhishek S. Rao 7 Drawbacks of data storage in OS files Probably we do not have enough space in memory to store data. Might need to write some special programs to answer each query that users may wish to know. Protection of the data from inconsistent changes made by different users while accessing the data. Need to ensure whether the data is restored to a consistent state if the system crashes while changes are being made. Permission to access different subsets of the data. How to overcome these drawbacks? Prepared By: Abhishek S. Rao 8 Need of a Software A DBMS is a piece of software that is designed to make the former tasks easier. By storing data in a DBMS, rather than as a collection of operating system files, we can use the DBMS's features to manage the data robustly and efficiently. Prepared By: Abhishek S. Rao 9 Introduction to Database Systems Data: Known facts that can be recorded and that have implicit meaning. Database: Collection of related data. For example, consider the names, telephone numbers, and addresses of the people you know. Database Management System: DBMS is a collection of programs that enables users to create and maintain a database. Defining a database involves specifying the data types, structures, and constraints for the data to be stored in the database. Manipulating a database includes functions as querying the database to retrieve specific data, updating the database to reflect changes in the mini world, and generating reports from the data. Sharing a database allows multiple users and programs to access the database concurrently. Prepared By: Abhishek S. Rao 10 Characteristics of the Database Approach 1. Self-describing nature of a database system 2. Insulation between programs and data, and data abstraction In traditional file processing, the structure of data files is embedded in the application programs, so any changes to the structure of a file may require changing all programs that access that file. In DBMS the structure of data files is stored in the DBMS catalog separately from the access programs, this property is called program-data independence. The characteristic that allows program-data independence and program-operation independence is called data abstraction. A DBMS provides users with a conceptual representation of data that does not include many of the details of how the data is stored or how the operations are implemented. A data model is a type of data abstraction that is used to provide this conceptual representation. The data model uses logical concepts, such as objects, their properties, and their interrelationships, that may be easier for most users to understand than computer storage concepts. 3. Support of multiple views of the data A traditional file processing approach supports a single view of the data. A database typically has many users, each of whom may require a different perspective or view of the database. A view may be a subset of the database. 4. Sharing of data and multiuser transaction processing. The DBMS includes concurrency control software to ensure that several users trying to update the same data do so in a controlled manner so that the result of the updates is correct For example, when several reservation agents try to assign a seat on an airline flight, the DBMS should ensure that each seat can be accessed by only one agent at a time for assignment to a passenger. These types of applications are generally called online transaction processing (OLTP) applications. Actors on the Scene 1. Database Administrators The DBA is responsible for authorizing access to the database, coordinating and monitoring its use, and acquiring software and hardware resources as needed. The DBA is accountable for problems such as security breaches and poor system response time. 2. Database Designers Database designers are responsible for identifying the data to be stored in the database and for choosing appropriate structures to represent and store this data 3. End Users – Casual end users ; Example: middle- or high-level managers – Naive or parametric end users ; Ex: Bank tellers check account balances and post withdrawals and deposits – Sophisticated end users ; Ex: engineers, scientists, business analysts – Standalone users ; Ex: User who maintain personal databases by using ready-made program packages that provide easy-to-use menu-based or graphics-based interfaces. 4. System Analysts & Application Programmers – System analysts determine the requirements of end users, especially naive and parametric end users, and develop specifications for standard canned transactions that meet these requirements. – Application programmers implement these specifications as programs; then they test, debug, document, and maintain these canned transactions. Workers behind the Scene Workers behind the scene are those who are associated with the design, development, and operation of the DBMS software and system environment. 1. DBMS system designers and implementers: design and implement the DBMS modules and interfaces as a software package. 2. Tool developers: design and implement tools; they include packages for database design, performance monitoring, natural language or graphical interfaces, prototyping, simulation, and test data generation. 3. Operators and maintenance personnel: responsible for the actual running and maintenance of the hardware and software environment for the database system. Advantages of Using the DBMS Approach 1. Controlling Redundancy 2. Restricting Unauthorized Access 3. Providing Persistent Storage for Program Objects. 4. Providing Storage Structures and Search Techniques for Efficient Query Processing. 5. Providing Backup and Recovery 6. Providing Multiple User Interfaces 7. Representing Complex Relationships among Data 8. Enforcing Integrity Constraints 9. Permitting Inferencing and Actions Using Rules. 10. Additional Implications of Using the Database Approach Overview of Database Languages and Architectures Data Models, Schemas, and Instances A data model is a collection of concepts that can be used to describe the conceptual/logical structure of a database. Categories of Data Models High-level or conceptual data models Representational or implementation data models Physical data models Schemas, Instances, and Database State The description of a database is called the database schema The data in the database at a particular moment in time is called a database state or snapshot The DBMS stores the descriptions of the schema constructs and constraints—also called the meta-data Three-Schema Architecture and Data Independence 1. The internal level has an internal schema, which describes the physical storage structure of the database. 2. The conceptual level has a conceptual schema, which describes the structure of the whole database for a community of users. The conceptual schema hides the details of physical storage structures and concentrates on describing entities, data types, relationships, user operations, and constraints. 3. The external or view level includes a number of external schemas or user views. Each external schema describes the part of the database that a particular user group is interested in and hides the rest of the database from that user group. Data Independence The ability to modify a schema definition in one level without affecting a schema definition in a higher level is called data independence. 1. Logical data independence is the capacity to change the conceptual schema without having to change external schemas or application programs. Ex: Adding a new record. 2. Physical data independence is the capacity to change the internal schema without having to change the conceptual schema. Ex: Changes to the internal schema may be needed because some physical files were reorganized—for example, by creating additional access structures—to improve the performance of retrieval or update. Conceptual Data Modeling using Entities and Relationships High-Level Conceptual Data Models for Database Design 1. Requirement collection and analysis: The database designers interview prospective database users to understand and document their data requirements 2. In parallel with specifying the data requirements, it is useful to specify the known functional requirements of the application. These consists of user defined operations that will be applied to the database. In software design, it is common to use data flow diagrams, sequence diagrams, scenario to define functional requirements. 3. The conceptual schema is a concise description of the data requirements of the users and includes detailed description of the entity types, relationships, and constraints. 4. The next step in database design is the actual implementation of the database, using a commercial DBMS. This step is called logical design or data model mapping; its result is a database schema in the implementation data model of the DBMS. 5. In physical design phase, internal storage structures, indexes, access paths, and file organizations for the database files are specified. ENTITY TYPES, ENTITY SETS, ATTRIBUTES, AND KEYS Entities and Attributes  An entity may be an object with a physical existence (for example, a particular person, car, house, or employee) or it may be an object with a conceptual existence (for example, a company, a job, or a university course).  Each entity has attributes-the particular properties that describe it. (For example, an employee entity may be described by the employee's name, age, address, salary, and job).  Several types of attributes occur in the ER model: simple versus composite, single-valued versus Multi-valued, and stored versus derived. Several types of attributes occur in the ER model: 1. Simple versus composite – Attributes that are not divisible is called simple – Attributes that are sub-divisible is called composite 2. Single-valued versus Multi-valued – Most attribute have a single value for a particular entity. E.g. Age. 3. Stored versus derived attributes – If an attribute can be determined using the value of another attribute, they are called derived attributes. – The attribute that is used to derive the attribute is called a stored attribute. – The age attribute is hence called Derived attribute and is said to be derivable from the Birth_date attribute, which is called a Stored attribute. 4. NULL values. – There are cases where an attribute does not have an applicable value for an attribute. For these situations, the value null is created. – A person who does not have a mobile phone would have null stored at the value for the Mobile Phone Number attribute. 5. Complex attributes. – Notice that composite and multi-valued attributes can be nested arbitrarily. We can arbitrary nesting by grouping components of a composite attribute between parentheses( ) and separating the components with commas, and by displaying multi-valued attributes between braces { }. Such attributes are called complex attributes. Entity Types, Entity Sets, Keys, and Value Sets Entity Types and Entity Sets: A database usually contains groups of entities that are similar. For example, a company employing hundreds of employees may want to store similar information concerning each of the employees. These employee entities share the same attributes, but each entity has its own value(s) for each attribute. An entity type defines a collection (or set) of entities that have the same attributes. Each entity type in the database is described by its name and attributes Entity Type Name: Employee Company (Name, age, Salary) (Name, headquarters, President) E1. (John Smith, 55, C1. 80k) (Sunco Oil, Houston, John Smith) E2. (Fred Brown, 40, C2. 30k) (Fast Computer, Dallas, Bob King) E3.. (Kiran, 25, 20k)..... Entity Set The collection of all entities of a particular entity type in the database at any point in time is called an entity set; The entity set is usually referred to using the same name as the entity type. An entity type is represented in ER diagrams as a rectangular box enclosing the entity type name. Attribute names are enclosed in ovals and are attached to their entity type by straight lines. Composite attributes are attached to their component attributes by straight lines. Multi-valued attributes are displayed in double ovals. An entity type describes the schema or intension for a set of entities that share the same structure. Key Attributes of an Entity Type: An important constraint on the entities of an entity type is the key or uniqueness constraint on attributes. An entity type usually has an attribute whose values are distinct for each individual entity in the entity set. Such an attribute is called a key attribute, and its values can be used to identify each entity uniquely. Value Sets(Domains) of attribute: Each simple attribute of an entity type is associated with a value set, which specifies the set of values that may be assigned to that attribute for each individual entry. Value sets are not displayed in ER diagrams. Value sets are typically specified using the basic data types available in most programming languages, such as integers, string, boolean, float, enumerated type and so on. Initial Conceptual Design of the COMPANY database RELATIONSHIP TYPES, RELATIONSHIP SETS, ROLES, AND STRUCTURAL CONSTRAINTS Relationship Types, Relationship Sets and Instances: A relationship type R among n entity types E1,E2,.. En defines a set of associations among entities from these entity types. Relationship instance: Each relationship instance ri in R is an association of entities, where the association includes exactly one entity from each participating entity type. For example, consider a relationship type WORKS_FOR between the two entity types EMPLOYEE and DEPARTMENT. Each relationship instance in the relationship set WORKS_FOR associates one EMPLOYEE entity and one DEPARTMENT entity. In ER diagrams, relationship types are displayed as diamond- shaped boxes, which are connected by straight lines to the rectangular boxes representing the participating entity types. Relationship Degree, Role Names, and Recursive Relationships The degree of a relationship type is the number of participating entity types. Hence, the WORKS_FOR relationship is of degree two. A relationship type of degree two is called binary, and one of degree three is called ternary. An example of a ternary relationship is SUPPLY, shown in Figure below. ‘S’ supplies part ‘p’ to project ‘j’ Role Names and Recursive Relationship Each entity type that participates in a relationship type plays a particular role in the relationship. The role name signifies the role that a participating entity from the entity type plays in each relationship means. Eg: in the WORKS_FOR relationship type, EMPLOYEE plays the role of employee or worker and DEPARTMENT plays the role of department or employer. In some cases the same entity type participates more that once in a relationship type in different roles such types are called recursive relationships. For example, in the Company schema, each employee has a supervisor, we need to include the relationship “Supervises”, however a supervisor is also an employee, therefore the employee entity type participates twice in the relationship, once as an employee and once as a supervisor, therefore we can specify two roles, employee and supervisor Constraints on Relationship Types Relationship types usually have certain constraints that limit the possible combinations of entities that may participate in the corresponding relationship set. We can distinguish two main types of relationship constraints: cardinality ratio and participation. Cardinality Rations for Binary Relationships: The cardinality ratio for a binary relationship specifies the maximum number of relationship instances that an entity can participate in. Eg: WORKS_FOR binary relationship type DEPARTMENT:EMPLOYEE is of cardinality ratio 1:N. Meaning that each department can be related with any number of employee, but an employee can be related to only one department. Possible cardinality ratios for binary relationship types are 1:1, 1:N, N:1 and M:N 1:1 relationship, MANAGES. M:N relationship, WORKS_ON. Participation constraints and existence dependencies:  The participation constraint specifies whether the existence of an entity depends on its being related to another entity via the relationship type.  This constraint specifies the minimum number of relationship instances that each entity can participate in and is sometimes called the minimum cardinality constraint.  There are two types of cardinality constraint: total and partial. If a company policy states that every employee must work for a department, then an employee entity can exist only if it participates in at least one WORKS_FOR relationship instance. Thus, the participation of EMPLOYEE in WORKS_FOR is called total participation, meaning that every entity in the total set of employee entities must be related to a department entity via WORKS_FOR. Total participation is also called existence dependency. The participation of EMPLOYEE in the MANAGES relationship type is partial, meaning that some or part of the set of employee entities are related to some department entity via MANAGES, but not necessarily all. Attribute of Relationship Types Relationship types can also have attribute, similar to those of entity types. Eg: to record the number of hours per week that an employee WORKS_ON a particular project , we can include an attribute Hours for the WORKS_ON relationship type. Another example is to include the date on which a manager started managing a department via an attribute Start_date for the MANAGES relationship type. Notice that attributes of 1:1 or 1:N relationship types can be migrated to one of the participating entity type. For example, the start_date attribute for the MANAGES relationship can be an attribute of either EMPLOYEE OR DEPARTMENT, although conceptually it belongs to MANAGES. For a 1:N relationship type, a relationship attribute can be migrated only to the entity type on the N-Side of the relationship. WEAK ENTITY TYPES Entity types that do not have key attributes of their own are called weak entity types. Entities belonging to a weak entity type are identified by being related to specific entities from another entity type in combination with one of their attribute values. We call this other entity type the identifying or owner entity type, and we call the relationship type that relates a weak entity type to its owner the identifying relationship of the weak entity type. A weak entity type always has a total participation constraint (existence dependency) with respect to its identifying relationship, because a weak entity cannot be identified without an owner entity. However, not every existence dependency results in a weak entity type. For example, a DRIVER_LICENSE entity cannot exist unless it is related to a PERSON entity, even though it has its own key (License_Number) and hence is not a weak entity. A weak entity type normally has a partial key, which is the set of attributes that can uniquely identify weak entities that are related to the same owner entity. It is sometime called as Discriminator. If we assume that no two dependents of the same employee ever have the same first name, the attribute name of DEPENDENT is the partial key. The partial key attribute is underlined with a dashed or dotted line. In ER diagrams, both a weak entity type and its identifying relationship are distinguished by surrounding their boxes and diamonds with double lines. The partial key attribute is underlined with a dashed or dotted line. ER Diagrams, Naming Conventions, and Design Issues. Things to remember before you draw the ER diagram Entities Attributes Relationships Cardinality constraints or participation constraints Cardinality ratios Prepared By: Abhishek S. Rao 65 Case Study - 1: College Database Lets first identify the entities in the given problem  Student  Semester Section  Class  Subject  Marks Prepared By: Abhishek S. Rao 66 ER Diagram: College Database USN Sem Name Address SSID Sec Student Class SemSec n 1 m m Class Marks Secures Title n SubCode Sem Test-1 n Scored Subject Test-2 Marks in m n Credits Test-3 Final 67 Marks Case Study - 2: Library Database Lets first identify the entities in the given problem  Book  Authors  Publisher  Book Copies  Book Lending  Library Branch  Card Prepared By: Abhishek S. Rao 68 ER Diagram: Library Database Title Pub_Year Name m Written n Book By Authors 1 Book ID No. of Has Copies n Branch n m Name Branch Address Book Copies ID m Published Library In By Branch n Card_No Name n 1 Book Lending Publisher Card Phone n Date Out Due Address 69 Date Case Study - 3: Company Database Lets first identify the entities in the given problem  Employee  Department  Project  Dependent Prepared By: Abhishek S. Rao 70 ER Diagram: Company Database MInit Location DNO FName LName DName Works Name Address n For 1 No. of SSN Employees 1 1 Manages Department Employee Start 1 1 Date Supervisee Supervisor m Works Controls Dependents On Supervision of 1 n n Hours n Supervision n Project Dependent PName Name Relationship PNO Location Gender Give me 5 Questions which you predict to be an important one!!!

Database Management System

Document Details

Tags

Related

Summary

Full Transcript