RDBMS Module 1 PPTs Part 1 PDF
Document Details
Uploaded by ConvincingDieBrücke
Presidency University
Tags
Summary
This document provides a presentation on the introduction to database modeling and relational algebra. It covers definitions of data and information, database systems, and DBMS. The presentation material is suitable for an undergraduate-level course.
Full Transcript
CSA2003 – Relational Database Management Systems Module 1 Introduction to Database Modeling and Relational Algebra Presidency U...
CSA2003 – Relational Database Management Systems Module 1 Introduction to Database Modeling and Relational Algebra Presidency University, Bengaluru Contents Introduction to Database: Schema, Instance, 3- shema architecture, physical and logical data independence, Data isolation problem in traditional file system, advantages of database over traditional file systems. Entity Relationship (ER) Model, ER Model to Relational Model, Examples on ER model. Relational Algebra with selection, projection, rename, set operations, Cartesian product, joins (inner and outer joins), and division operator. Examples on Relational Algebra Operations. 2 Introduction to Database 3 Definitions Data: Raw facts, Unprocessed facts or Known facts that can be recorded and have an implicit meaning. Ex: 25, suresh, Bangalore Structured: numbers, text, dates Unstructured: images, video, documents Information: Processed Data Ex: The age of Suresh is 25 Definitions Database: Def 1: Database is an organized collection of logically related data Def 2: A database is a shared collection of logically related data that is stored to meet the requirements of different users of an organization Def 3: A database is a self-describing collection of integrated records Def 4: A database models a particular real world system in the computer in the form of data Ex: Online banking system, Library Management Slide 1-5 What is DBMS ? Mini-world: Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university. The DBMS provides users and programmers a systematic way to create, retrieve, update and manage database. or A software package/ system to facilitate the creation and maintenance of a computerized database. Database System: The DBMS software together with the data itself. Sometimes, the applications are also included. Example of a Database (with a Conceptual Data Model) Mini-world for the example: Part of a UNIVERSITY environment. Some mini-world entities: STUDENTs COURSEs SECTIONs (of COURSEs) (academic) DEPARTMENTs INSTRUCTORs Note: The above could be expressed in the ENTITY- RELATIONSHIP data model. Slide 1-7 Example of a Database (with a Conceptual Data Model) Some mini-world relationships: SECTIONs are of specific COURSEs STUDENTs take SECTIONs COURSEs have prerequisite COURSEs INSTRUCTORs teach SECTIONs COURSEs are offered by DEPARTMENTs STUDENTs major in DEPARTMENTs Note: The above could be expressed in the ENTITY- RELATIONSHIP data model. Slide 1-8 Main Characteristics of the Database Approach Self-describing nature of a database system: A DBMS catalog stores the description of the database. The description is called meta-data. This allows the DBMS software to work with different databases. Insulation between programs and data: Called program-data independence. Allows changing data storage structures and operations without having to change the DBMS access programs. Data Abstraction: A data model is used to hide storage details and present the users with a conceptual view of the database. 9 Main Characteristics of the Database Approach (contd..) Support of multiple views of the data: Each user may see a different view of the database, which describes only the data of interest to that user. Sharing of data and multiuser transaction processing : allowing a set of concurrent users to retrieve and to update the database. Concurrency control within the DBMS guarantees that each transaction is correctly executed or completely aborted. OLTP (Online Transaction Processing) is a major part of database applications. 10 Types of Databases and Database Applications Numeric and Textual Databases Multimedia Databases Geographic Information Systems (GIS) Data Warehouses Real-time and Active Databases A number of these databases and applications are described later in the book (see Chapters 24,28,29) 11 Database Users Users may be divided into those who actually use and control the content (called “Actors on the Scene”) and those who enable the database to be developed and the DBMS software to be designed and implemented (called “Workers Behind the Scene”). Database Users Actors on the scene Database administrators Database Designers End-users Categories of End-users Casual Naïve or Parametric Sophisticated Stand-alone Workers behind the Scene DBMS system designers and implementers Tool developers Operators and maintenance personnel Data Models Data Model: A set of concepts to describe the structure of a database, and certain constraints that the database should obey. Data Model Operations: Operations for specifying database retrievals and updates by referring to the concepts of the data model. Operations on the data model may include basic operations and user-defined operations. Data Models Data Model gives us an idea that how the final system will look like after its complete implementation. It defines the data elements and the relationships between the data elements. Data Models are used to show how data is stored, connected, accessed and updated in the database management system. Here, we use a set of symbols and text to represent the information so that members of the organization can communicate and understand it. Categories of data models Conceptual (high-level, semantic) data models: Provide concepts that are close to the way many users perceive data. (Also called entity-based or object-based data models.) Physical (low-level, internal) data models: Provide concepts that describe details of how data is stored in the computer. Implementation (representational) data models: Provide concepts that fall between the above two, balancing user views with some computer storage details. Some of the Data Models in DBMS are: Hierarchical Model Flat Data Model Network Model Semi-Structured Data Model Entity-Relationship Model Associative Data Model Relational Model Context Data Model Object-Oriented Data Model Object-Relational Data Model Relational Model The relational model represents the database as a collection of relations. A relation is nothing but a table of values. Every row in the table represents a collection of related data values. These rows in the table denote a real- world entity or relationship. 20 Advantages of Using the Database Approach Controlling redundancy in data storage and in development and maintenance efforts. Sharing of data among multiple users. Restricting unauthorized access to data. Providing persistent storage for program Objects. Providing Storage Structures for efficient Query Processing Providing backup and recovery services. Providing multiple interfaces to different classes of users. Representing complex relationships among data. Enforcing integrity constraints on the database. Drawing Inferences and Actions using rules 21 Additional Implications of Using the Database Approach Potential for enforcing standards: this is very crucial for the success of database applications in large organizations Standards refer to data item names, display formats, screens, report structures, meta- data (description of data) etc. Reduced application development time: incremental time to add each new application is reduced. Flexibility to change data structures: database structure may evolve as new requirements are defined. Availability of up-to-date information – very important for on-line transaction systems such as airline, hotel, car reservations. Economies of scale: by consolidating data and applications across departments wasteful overlap of resources and personnel can be avoided. 22 Schemas, Instances, and Database State Database Schema A database schema is the logical representation of a database, which shows how the data is stored logically in the entire database. Employee Schema Department Schema Dept_Location Schema Schemas, Instances, and Database State Instances The instance of the database is the values of these variables or attributes at any given time Employee Schema Instances Schemas, Instances, and Database State Database State The data in the database at a particular moment in time is called a database state or snapshot. Schemas versus Instances Database Schema: A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. It formulates all the constraints that are to be applied on the data. The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database. Schema Diagram: A diagrammatic display of (some aspects of) a database schema. Schema Construct: A component of the schema or an object within the schema, e.g., STUDENT, COURSE. Database Instance: The actual data stored in a database at a particular moment in time. Also called database state (or occurrence). 26 Database Schema Vs. Database State Database State: Refers to the content of a database at a moment in time. Initial Database State: Refers to the database when it is loaded Valid State: A state that satisfies the structure and constraints of the database. Distinction The database schema changes very infrequently. The database state changes every time the database is updated. Schema is also called intension, whereas state is called extension. 27 Three-Schema Architecture 28 Three-Schema Architecture Defines DBMS schemas at three levels: Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model. Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model. External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level. Type of Implementation Schema External View 1: Course info(cid:int,cname:string) Schema View 2: studeninfo(id:int. name:string) Conceptual Students(id: int, name: string, login: string, Shema age: integer) Courses(id: int, cname.string, credits:integer) Enrolled(id: int, grade:string) Physical Relations stored as unordered files. Schema Index on the first column of Students. Data Independence When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence. The higher-level schemas themselves are unchanged. Hence, the application programs need not be changed since they refer to the external schemas. Data Independence Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs. Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema. 32 Differences between Logical & Physical data independence Due to Physical independence, any of the below change will not affect the conceptual Due to Logical independence, any of the layer. below change will not affect the external layer. Using a new storage device like Hard Drive or Magnetic Tapes Add/Modify/Delete a new attribute, entity or relationship is possible Modifying the file organization technique in without a rewrite of existing application the Database programs Switching to different data structures. Merging two records into one Changing the access method. Breaking an existing record into two or Modifying indexes. more records Changes to compression techniques or hashing algorithms. Change of Location of Database from say C drive to D Drive 33 Types of DBMS languages: 34 Data Definition Language (DDL) It is used by the DBA and database designers to specify the conceptual schema of a database. In many DBMSs, the DDL is also used to define internal and external schemas (views). In some DBMSs, separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas. It is used for creating tables, schema, indexes, constraints etc. in database. CREATE ALTER DROP TRUNCATE RENAME DROP Comment Data Manipulation Language (DML) It is used to specify database retrievals and updates. DML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C or an Assembly Language. Alternatively, stand-alone DML commands can be applied directly (query language). SELECT INSERT UPDATE DELETE Data Control Language (DCL) DCL is used for granting and revoking user access on a database – GRANT REVOKE Transaction Control Language (TCL) The changes in the database that we made using DML commands are either performed or roll backed using TCL. COMMIT ROLLBACK Typical DBMS Component Modules 39 Chapter – 2 Data Modelling using Entities and Relationships 40 ER Diagram The ER or (Entity Relational Model) is a high-level conceptual data model diagram. Entity-Relation model is based on the notion of real-world entities and the relationship between them. Helps you to define terms related to entity relationship modeling Provide a preview of how all your tables should connect, what fields are going to be on each table Helps to describe entities, attributes, relationships ER diagrams are translatable into relational tables which allows you to build databases quickly ER diagrams can be used by database designers as a blueprint for implementing data in specific software applications The database designer gains a better understanding of the information to be contained in the database with the help of ERP diagram ERD is allowed you to communicate with the logical structure of the database to users 41 Components of the ER Diagram This model is based on three basic concepts: Entities Attributes Relationships 42 Example COMPANY Database We need to create a database schema design based on the following (simplified) Requirements of the COMPANY Database: The company is organized into DEPARTMENTs. Each department has a name, number and an employee who manages the department. We keep track of the start date of the department manager. A department may have several locations. Each department controls a number of PROJECTs. Each project has a unique name, unique number and is located at a single location. We store each EMPLOYEE’s social security number, address, salary, sex, and birthdate. Each employee works for one department but may work on several projects. We keep track of the number of hours per week that an employee currently works on each project. We also keep track of the direct supervisor of each employee. 43 Each employee may have a number of DEPENDENTs. For each dependent, we keep track of their name, sex, birthdate, and relationship to the employee. 44 ER Model Concepts Entities and Attributes Entities are specific objects or things in the mini-world that are represented in the database. For example the EMPLOYEE John Smith, the Research DEPARTMENT, the ProductX PROJECT Attributes are properties used to describe an entity. For example an EMPLOYEE entity may have a Name, SSN, Address, Sex, BirthDate A specific entity will have a value for each of its attributes. For example a specific employee entity may have Name='John Smith', SSN='123456789', Address ='731, Fondren, Houston, TX', Sex='M', BirthDate='09-JAN-55‘ Each attribute has a value set (or data type) associated with it – e.g. integer, string, subrange, enumerated type, … 45 Entity Types and Key Attributes Entities with the same basic attributes are grouped or typed into an entity type. For example, the EMPLOYEE entity type or the PROJECT entity type. An attribute of an entity type for which each entity must have a unique value is called a key attribute of the entity type. For example, SSN of EMPLOYEE. A key attribute may be composite. For example, VehicleTagNumber is a key of the CAR entity type with components (Number, State). An entity type may have more than one key. For example, the CAR entity type may have two keys: VehicleIdentificationNumber (popularly called VIN) and VehicleTagNumber (Number, State), also known as license_plate number. 46 Types of Attributes 47 Types of Attributes Examples Simple Each entity has a single atomic value for the attribute. For example, SSN or Sex. Composite The attribute may be composed of several components. For example, Address (Apt#, House#, Street, City, State, ZipCode, Country) or Name (FirstName, MiddleName, LastName). Composition may form a hierarchy where some components are themselves composite. Multi-valued An entity may have multiple values for that attribute. For example, Color of a CAR or PreviousDegrees of a STUDENT. Denoted as {Color} or {PreviousDegrees}. 48 Types of Attributes (2) In general, composite and multi-valued attributes may be nested arbitrarily to any number of levels although this is rare. For example, PreviousDegrees of a STUDENT is a composite multi-valued attribute denoted by {PreviousDegrees (College, Year, Degree, Field)}. 49 ENTITY SET corresponding to the ENTITY TYPE CAR 50 What are Keys in DBMS? KEYS in DBMS is an attribute or set of attributes which helps you to identify a row(tuple) in a relation(table). They allow you to find the relation between two tables. Keys help you uniquely identify a row in a table by a combination of one or more columns in that table. Key is also helpful for finding unique record or row from the table. 51 Types of Keys Super Key - A super key is a group of single or multiple keys which identifies rows in a table. Primary Key - is a column or group of columns in a table that uniquely identify every row in that table. Candidate Key - is a set of attributes that uniquely identify tuples in a table. Candidate Key is a super key with no repeated attributes. Alternate Key - is a column or group of columns in a table that uniquely identify every row in that table. 52 Types of Keys Foreign Key - is a column that creates a relationship between two tables. The purpose of Foreign keys is to maintain data integrity and allow navigation between two different instances of an entity. Compound Key - has two or more attributes that allow you to uniquely recognize a specific record. It is possible that each column may not be unique by itself within the database. Composite Key - An artificial key which aims to uniquely identify each record is called a surrogate key. These kind of key are unique because they are created when you don't have any natural primary key. 53 SUMMARY OF ER-DIAGRAM NOTATION FOR ER SCHEMAS 54 Initial Conceptual Design of the COMPANY Database Relationship Types, Relationship Sets, Roles, and Structural Constraints Relationship When an attribute of one entity type refers to another entity type Represent references as relationships not attributes. Relationship type R among n entity types E1, E2,..., En Defines a set of associations among entities from these entity types Relationship instances ri Each ri associates n individual entities (e1, e2,..., en) Each entity ej in ri is a member of entity set Ej Relationship Degree Degree of a relationship type Number of participating entity types Binary, ternary Relationships as attributes Think of a binary relationship type in terms of attributes Role Names and Recursive Relationships Role names and recursive relationships Role name signifies role that a participating entity plays in each relationship instance Recursive relationships Same entity type participates more than once in a relationship type in different roles Must specify role name Constraints on Binary Relationship Types Cardinality ratio for a binary relationship Specifies maximum number of relationship instances that entity can participate in Participation constraint Specifies whether existence of entity depends on its being related to another entity Types: total and partial Attributes of Relationship Types Attributes of 1:1 or 1:N relationship types can be migrated to one entity type For a 1:N relationship type Relationship attribute can be migrated only to entity type on N-side of relationship For M:N relationship types Some attributes may be determined by combination of participating entities Must be specified as relationship attributes Weak Entity Types Do not have key attributes of their own Identified by being related to specific entities from another entity type Identifying relationship Relates a weak entity type to its owner Always has a total participation constraint Entities are identified by the combination of: A partial key of the weak entity type The particular entity they are related to in the identifying entity type Example: Suppose that a DEPENDENT entity is identified by the dependent’s first name and birhtdate, and the specific EMPLOYEE that the dependent is related to. DEPENDENT is a weak entity type with EMPLOYEE as its identifying entity type via the identifying relationship type DEPENDENT_OF Refining the COMPANY database schema by introducing relationships By examining the requirements, six relationship types are identified All are binary relationships( degree 2) Listed below with their participating entity types: WORKS_FOR (between EMPLOYEE, DEPARTMENT) MANAGES (also between EMPLOYEE, DEPARTMENT) CONTROLS (between DEPARTMENT, PROJECT) WORKS_ON (between EMPLOYEE, PROJECT) SUPERVISION (between EMPLOYEE (as subordinate), EMPLOYEE (as supervisor)) DEPENDENTS_OF (between EMPLOYEE, DEPENDENT) 63 ER DIAGRAM 64 Weak Entity Type is: DEPENDENT Identifying Relationship is: DEPENDENTS_OF 65 Alternative (min, max) notation for relationship structural constraints: Specified on each participation of an entity type E in a relationship type R Specifies that each entity e in E participates in at least min and at most max relationship instances in R Default(no constraint): min=0, max=n Must have minmax, min0, max 1 Derived from the knowledge of mini-world constraints Examples: A department has exactly one manager and an employee can manage at most one department. Specify (0,1) for participation of EMPLOYEE in MANAGES Specify (1,1) for participation of DEPARTMENT in MANAGES An employee can work for exactly one department but a department can have any number of employees. Specify (1,1) for participation of EMPLOYEE in WORKS_FOR Specify (0,n) for participation of DEPARTMENT in WORKS_FOR 66 The (min,max) notation relationship constraints (0,1) (1,1) (1,1) (1,N) Chapter 3-67 COMPANY ER Schema Diagram using (min, max) notation 68 Relationships of Higher Degree Relationship types of degree 2 are called binary Relationship types of degree 3 are called ternary and of degree n are called n-ary In general, an n-ary relationship is not equivalent to n binary relationships 69 Components of ER Diagram You base an ER Diagram on three basic concepts: Entities Weak Entity Attributes Key Attribute Composite Attribute Multivalued Attribute Derived Attribute Relationships One-to-One Relationships One-to-Many Relationships Many-to-One Relationships Many-to-Many Relationships Components of ER Diagram Entities An entity can be either a living or non-living component. It showcases an entity as a rectangle in an ER diagram. For example, in a student study course, both the student and the course are entities. Components of ER Diagram Weak Entity An entity that makes reliance over another entity is called a weak entity You showcase the weak entity as a double rectangle in ER Diagram. In the example below, school is a strong entity because it has a primary key attribute - school number. Unlike school, the classroom is a weak entity because it does not have any primary key and the room number here acts only as a discriminator. Components of ER Diagram Identifying Relationships It links the strong and weak entity and is represented by a double diamond sign. Chapter 3-73 Components of ER Diagram Attribute An attribute exhibits the properties of an entity. You can illustrate an attribute with an oval shape in an ER diagram. Components of ER Diagram Key Attribute Key attribute uniquely identifies an entity from an entity set. It underlines the text of a key attribute. For example: For a student entity, the roll number can uniquely identify a student from a set of student Components of ER Diagram Composite Attribute An attribute that is composed of several other attributes is known as a composite attribute. An oval showcases the composite attribute, and the composite attribute oval is further connected with other ovals. Components of ER Diagram Multivalued Attribute Some attributes can possess over one value, those attributes are called multivalued attributes. The double oval shape is used to represent a multivalued attribute. Components of ER Diagram Derived Attribute An attribute that can be derived from other attributes of the entity is known as a derived attribute. In the ER diagram, the dashed oval represents the derived attribute. Components of ER Diagram Relationship The diamond shape showcases a relationship in the ER diagram. It depicts the relationship between two entities. In the example below, both the student and the course are entities, and study is the relationship between them. Components of ER Diagram Many-to-One Relationship When more than one element of an entity is related to a single element of another entity, then it is called a many-to-one relationship. For example, students have to opt for a single course, but a course can have many students Components of ER Diagram Many-to-Many Relationship When more than one element of an entity is associated with more than one element of another entity, this is called a many-to-many relationship. For example, you can assign an employee to many projects and a project can have many employees. 81 Components of ER Diagram Participation Constraints Total Participation − Each entity is involved in the relationship. Total participation is represented by double lines. Partial participation − Not all entities are involved in the relationship. Partial participation is represented by single lines. 83 84 Problem Draw an ER model of the Banking database application considering the following constraints − A bank has many entities. Each customer has multiple accounts. Multiple customers belong to a single branch. Single customer can borrow multiple loans. A branch has multiple employees. Solution Follow the steps given below to draw an ER model of the Banking database application − Step 1 − Identify the entity sets The entity set has multiple instances in a given business scenario. As per the given constraints the entity sets are as follows Customer Account Branch Loan Employee Step 2 − Identify the attributes for the given entities Customer − the relevant attributes are customerName, CustomerID, address. Account − The relevant attributes are AccountNo, balance. Branch − The relevant attributes are branchID, branchName, address. Loan − The relevant attributes are loanNo, paymentMode, dateOfLoan, and amount. Employee − The relevant attributes are empID, empName, dateOfJoin, experience, qualification. Step 3 − Identify the Key attributes CustomerID is the key attribute for a customer. AccountNo is the key attribute for Account entities. BranchID is the key attribute for branch entities. LoanNo is the key attribute for a loan entity. EmpID is the key attribute for an Employee entity. Step 4 − Identify the relationship between entity sets One customer is enrolled by multiple accounts and one account for multiple customers. Hence, the relationship is many to many. Step 4 − Identify the relationship between entity sets (Cont…) Many customers belong to one branch but one branch belongs to many customers. Hence, the relationship between customer and branch is many to one. Step 4 − Identify the relationship between entity sets (Cont…) One customer can borrow multiple loans in the same way multiple loans can borrow a single customer, hence the relationship between customer and loan is one to many. Step 4 − Identify the relationship between entity sets (Cont…) One branch has many employees and in the same way the number of employees works in a single branch. Step 5 − Complete ER diagram The complete ER diagram is as follows − 93 Problem ER diagram of Bank has the following description : Bank have Customer. Banks are identified by a name, code, address of main office. Banks have branches. Branches are identified by a branch_no., branch_name, address. Customers are identified by name, cust-id, phone number, address. Customer can have one or more accounts. Accounts are identified by account_no., acc_type, balance. Customer can avail loans. Loans are identified by loan_id, loan_type and amount. Account and loans are related to bank’s branch. Solution Entities and their Attributes are : Bank Entity : Attributes of Bank Entity are Bank Name, Code and Address. Code is Primary Key for Bank Entity. Customer Entity : Attributes of Customer Entity are Customer_id, Name, Phone Number and Address. Customer_id is Primary Key for Customer Entity. Branch Entity : Attributes of Branch Entity are Branch_id, Name and Address. Branch_id is Primary Key for Branch Entity. Account Entity : Attributes of Account Entity are Account_number, Account_Type and Balance. Account_number is Primary Key for Account Entity. Loan Entity : Attributes of Loan Entity are Loan_id, Loan_Type and Amount. Loan_id is Primary Key for Loan Entity. ER DIAGRAM FOR A BANK DATABASE Chapter 3-96 ER DIAGRAM FOR A BANK DATABASE 97 Data Modeling Tools A number of popular tools that cover conceptual modeling and mapping into relational schema design. Examples: ERWin, S- Designer (Enterprise Application Suite), ER- Studio, etc. POSITIVES: serves as documentation of application requirements, easy user interface - mostly graphics editor support Problems with Current Modeling Tools DIAGRAMMING Poor conceptual meaningful notation. To avoid the problem of layout algorithms and aesthetics of diagrams, they prefer boxes and lines and do nothing more than represent (primary-foreign key) relationships among resulting tables.(a few exceptions) METHODOLGY lack of built-in methodology support. poor tradeoff analysis or user-driven design preferences. poor design verification and suggestions for improvement. PROBLEM with ER notation The entity relationship model in its original form did not support the specialization/ generalization abstractions 101