GATE DBMS.pdf

DATABASE MANAGEMENT SYSTEM For COMPUTER SCIENCE DATABASE MANAGEMENT. SYSTEM SYLLABUS ER model. Relational model: relational algebra, tuple calculus, SQL. Integrity constraints, normal forms. File organization, indexing (e.g., B and B+ trees). Transactions and concurrency control. ANALYSIS OF GATE PAPERS Exam Year 1 Mark Ques. 2 Mark Ques. Total 2003 2 3 8 2004 2 5 12 2005 3 4 11 2006 1 4 9 2007 - 6 12 2008 1 5 11 2009 - 5 10 2010 2 2 6 2011 - 2 4 2012 2 5 12 2013 - 3 6 2014 Set-1 2 3 8 2014 Set-2 2 3 8 2014 Set-3 2 3 8 2015 Set-1 2 2 6 2015 Set-2 2 2 6 2015 Set-3 2 2 6 2016 Set-1 4 1 6 2016 Set-2 2 2 6 2017 Set-1 2 3 8 2017 Set-2 2 4 10 2018 3 2 7 © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission CONTENTS Topics Page No 1. THE ENTITY – RELATIONSHIP MODEL 1.1 Introduction 01 1.2 The E – R Model 01 1.3 Database Administrator (DBA) 01 1.4 The Entity Relationship Diagram 01 2. RELATIONAL MODEL 2.1 Levels of a Database System 04 2.2 The Attributes and tuples of a Relation Student 05 2.3 Keys 06 2.4 Relational Database Design 06 2.5 Network Data Modelling Concepts 08 3. TRANSACTIONS 3.1 Transactions 11 3.2 Serializable Schedule 12 3.3 Recoverability 13 3.4 Recoverable Schedules 13 3.5 Cascade less Schedules 13 3.6 Implementation of Isolation 14 3.7 Transaction Definition in SQL 14 3.8 Testing for Serializability 14 3.9 Tests for View Serializability 14 4. QUEL 4.1 Quel 15 4.2 Sequel 15 4.3 Assign to 15 4.4 SQL 15 4.5 Select Statement 16 4.6 Defining a null Value 16 4.7 Duplicate Rows 16 4.8 Character Strings and Dates 16 4.9 Comparison Operators 16 4.10 Logical Operators 17 4.11 Character Functions 18 © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 4.12 Working with Dates 20 4.13 Types of Joins 21 4.14 Group Functions 21 4.15 Sub Queries 22 4.16 Data Definition Language (DDL) 23 4.17 Views 26 4.18 Data Manipulation Language (DML) 26 4.19 Queries 30 4.20 One Possible Database State Corresponding to the Company Scheme 30 4.21 Terminology Used in a Relational Database 33 4.22 Table Sal Grade 33 5. FILE STRUCTURE 5.1 Records and Record Types 36 5.2 B-Trees and other Data Structures 40 5.3 DBMS 43 5.4 Existence Dependencies 46 6. GATE QUESTIONS 51 7. ASSIGNMENT QUESTIONS  © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 1 THE ENTITY – RELATIONSHIP MODEL 1.1 INTRODUCTION 4) Performance monitoring. 5) Strategy design for backup and An entity is any object, place or activity recovery. about which an enterprise keeps data. It is 6) Authorization checks and validation an object which can have instances or procedures. occurrences. An entity type is a set of 7) Decides the information content. objects which share common properties. The Database Design consists of three 1.4 THE ENTITY RELATIONSHIP DIAGRAM components: Conceptual Design on the basis of user requirements, Data Modeling 1.4.1 RELATIONSHIPS AND RELATIONSHIP (Entity – Relationship Diagrams and SETS Normalization), and Physical Design, and Implementation. The major step in A relationship expresses an association conceptual design is to identify entities and between entities. A relationship set is a set relationships, which reflect the data in a of relationships of the same type. A natural way. The aim of this step is to relationship may also have descriptive specify the conceptual structure of the data. attributes. For example, data (last data of This is known as data modeling. The Entity account access) could be an attribute of the – Relationship (E – R) model is used as an relationship set. information model to develop conceptual structure. 1.4.1.1 MAPPING CARDINALITIES 1.2 THE E – R MODEL It indicates the number of entities with which another entity can be associated via The E – R data model considers the real a relationship. The degree of relationship is world consisting of a set of basic objects called cardinality. and relationships among these objects. A a) One-to-one: An entity in A is associated number of attributes are associated with an with at most one entity in B, and an entity and the attributes describing it. The entity in B is associated with at most set of all entities or relationships of the one entity in A. same type is called the entity set or b) One-to-many: An entity in A is relationship set. associated with any number in B. An entity in B is associated with at least 1.3 DATABASE ADMINISTRATOR (DBA) one entity in A. c) Many-to-one (N:1): An entity in A is All controlling of a database system is done associated with at most one entity in B. by database administrator. An entity in B is associated with any number in A. 1.3.1 FUNCTIONS OF DBA 1.4.2 KEYS 1) Decides the storage structure and Differences between entities must be access strategy. expressed in terms of attributes known as 2) Creation of data dictionary for keys. These facilitate us to uniquely identify statistical analysis. each entity in a set. 3) Responding to changes in requirements. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 1.4.3 SUPER KEY 1.4.7 AGGREGATION It is a set of one or more attributes which The E – R model cannot express put together enable us to identify uniquely relationships among relationships. When an entity in the entity set. would we need such a thing? Consider a database with information 1.4.4 CANDIDATE KEY about employees who work on a particular A super key may contain extraneous project and using a number of machines for attributes, and we are often interested in doing that work. the smallest super key. A super key for We get the E-R diagram shown in Figure. which no subset is a super key is called a candidate key. 1.4.5 PRIMARY KEY It is a candidate key (there may be more than one) chosen by the database designer to identify entities in an entity set. The idea of strong and weak entity sets is related to the existence dependencies such as the member of a strong entity set is a dominant entity, and the member of a weak entity set is a subordinate entity. A weak entity set does not have a primary key, but we need a means of distinguishing among the entities. Relationship sets Working and Uses could be combined into a single set. However, 1.4.6 GENERALIZATION they shouldn’t be, as this would obscure the logical structure of this scheme. The Generalization hides differences and solution is to use aggregation. It is an emphasizes similarities. Distinction is made abstraction through which relationships through attribute inheritance. are treated as higher – level entities. For Attributes of higher – level entity are our example, we can treat the relationship inherited by lower - level entities. set Working and the entity sets Employee and Project as a higher –level entity set called Working. Following figure shows the E-R diagram with aggregation. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission Transforming an E-R diagram with aggregation into tabular form is easy. We can create a table for each entity and relationship set as before. The table for relationship set Uses contains a column for each attribute in the primary key of Software Tool and Working. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 2 RELATIONAL MODEL 2.1 LEVELS OF A DATABASE SYSTEM 2.1.1 DOMAINS, ATTRIBUTES, TUPLE & RELATIONS The relational model represents the database as a collection of relations. A domain D is a set of atomic values. By Informally, each relation resembles at automatic we mean that each value in the stable of values or, to some extent, a “flat” domain is indivisible as far as the relational file of records model is concerned. A common method of specifying a domain is to specify a data type from which the data values forming the domain are drawn. It is also useful to specify a for the domain, to help in interpreting its values. Some examples of domains  USA – phone – numbers: The set of 10- digit phone numbers valid in the United States.  Social –security – numbers: The set of valid 9 – digit social security numbers.  Employee – ages: Possible ages of employees of a company; each must be a value between 15 and 80 years old.  Academic – department – names: The set of academic department names, such as computer Science, Economics, and Physics, in a university. Academic –department – codes: The set When a relation is thought of as a table of of academic department codes, such as CS, values, each row in the table represents a ECON and PHYS, in a university. collection of related data values. In the The preceding is called logical definition of relational model, each row in the table domains. A data type or format is also represents a fact that typically corresponds specified for each domain. to real- world entity or relationship. The e. g. the data type for the domain USA- table name and column names are used to phone – numbers can be declared as a help Number, Class, Major –specify how to character string of the form (ddd)ddd – interpret the data values in each row, based dddd, where each d is a numeric (decimal) on the column each value is in. All values in digit and the first three digits form a valid a column are of the same data type. In the telephone area code. The data type for formal relational mode terminology, a row Employee – ages is an integer number is called a tuple, a column header is called between 15 and 80. For Academic – an attribute, and the table is called a department – names, the data type is the relation. The data type describing the types set of all character strings that represents of values that can appear in each column. valid department names. A domain is thus We now define these terms-domain, tuple, given a name, data type, and format. attribute, and relation- more precisely. Additional information for interpreting the © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission values of a domain can also be given; for Dom (Name) = Names; example, a numeric domain such as Person- Dom (SSN) = Social security numbers; weights should have the units of Dom(Home Phone) = measurement – pounds or kilograms. A Local_phone_numbers, relation scheme R, denoted by R(A1, Dom(Office Phone) = A2,…….An), is made of a relation name R and Local_phone_numberas, and a list of attributes A1, A2,……..An. Each Dom(GPA)= Grade_point_averages. attribute A1 is the name of a role played by A Relation (or relation state)2 r of the some domain D in the relation schema R. D relation schema R(A1, A2,…An), also is called the domain of Ai and is denoted by denoted by r(R), is a set of n_tuples r = (t1, dom(Ai) t2 _ tm. Each n_tuple t is an ordered list of n A relation schema is used to describe a values t = < v1, v2,…., vn>, where each value relation; R is a called the name of this vi, 1  i  n is an element of dom(Ai) or is a relation. The degree of a relation is the special null value. number of attributes n of its relation The ith value v, in tuple t, which schema. corresponds to the attribute Ai, is referred An example of a relation schema for a to as t[Ai]. The terms relation intension for relation of degree 7, which describes the schema R and relation extension for a university students, is given below relation state r(R) are also commonly used. STUDENT (Name, SSN, Homophone, Figure shows an example of a STUDENT Address, Office Phone, Age, GPA) relation, which corresponds to the STUDENT schema specified above. Each tuple in the relation represents a particular student entity. We display the relation as a table, where each tuple is shown as a row and each attribute corresponds to a column header indicating a role or interpretation of the values in that column. Null values represent attributes whose values are unknown or do not exist for some individual STUDENT tuples. The above definition of a relation can be restated as follows. A relation r(R) is a mathematical relation of degree n on the domains dom(A1), dom(A2),…., dom(An), which is a subset ofthe Cartesian product of the domains that define R: r (R)  (dom (A1). dom (A2)…..dom (An,)) The Cartesian product specifies all possible combinations of values from the underlying 2.2 THE ATTRIBUTES AND TUPLES OF A domains. Hence, if we denote the number RELATION STUDENT of values or cardinality of a domain D by 𝐃 , and assume that all domains are For this relation schema, STUDENT is the finite, the total number of tuples in the name of the relation, which has seven Cartesian product is: |dom (A1)|* | dom(A2) attributes. We can specify the following *…….* dom (A2) | previously defined domains for some of the Product of all these combinations, relation attributes of the STUDENT relation: state at a given time-the current Relation © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission state- reflects only the valid tuples that Given a body of data to be represented in a represent a particular state pf the real database, how do we decide on a suitable world. In general, as the state of the real logical structure for it. The required world changes, so does the relation, by relations and their attributes need to be being transformed into another relation specified. One approach is to design schemes state. However, the result of adding an that are in an appropriate normal form. attribute to represent new information that was not originally stored in the relation. 2.4.1 NORMAL FORMS (NF) It is possible for several attributes to have the same domain. The attributes indicate There are various types of NFs available. A different roles, or interpretations, for the relation is said to be in a particular normal domain e. g., in the STUDENT relation, the form only if it satisfies the constraint (that same domain Local-phone – number plays it contains atomic values only to be able to the role of Home phone, referring to the call it as in 1NF.) A number of normal forms “home phone of a student,” and the role of have been defined. Below is drawn a set- Office Phone, referring to the “office phone diagram of NFs. of the student,” 2.3 KEYS 2.3.1 PRIMARY KEYS The term primary key is used to denote the candidate key that is chosen by the database designer as principle means of identifying entities within an entity set. e. g. attribute PART-NUMBER of the PART relation can be called as a primary key. Each PART tuple/row contains a distinct PART/NUMBER value, and this value may be used to distinguish that tuple from all 1st Normal Form others in the relation. Every relation will not have a single attribute primary key, but Functional Dependency every relation ‘will’ have one another they must have unique identification of some Given a relation R, attribute A of R is kind. If the value of Primary key was null, it functionally dependent on attribute B of R, would be equivalent to saying that there if and only if, each A- value in R has exists some tuple that did not have any associated with it precisely one 3-value in unique identification. It was not R (at any instant of time). distinguishable from other touples. We Referring to our PARTS relation it can be enforce the NOT NULL and UNIQUE. seen that PARTS = (PART_NO, PART_NAME, PART_SIZE, QTY_ON_HAND), the attributes 2.4 RELATIONAL DATABASE DESIGN PART_NAME, PART_SIZE, QTY_ON_HAND of it are each functionally dependent on In general, the goal of database design is to attribute PART_No, because given a value generate a set of relation schemes that of PART_NO, there exists precisely one allow us to store information without corresponding value for each of redundancy, yet allow us to retrieve PART_NAME, PARTS_SIZE etc. Using information easily. symbols it is specified as, © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission PARTS. PART NO PARTS.PART NAME S=(SUPLR_NO, SUPL_NAME, SUPL_STATUS, PARTS.PART_NO PARTS.PART_SIZE CITY) PARTS.PART_NO PARTS.QTY_ON_HAND It is supposed to maintain information on or more succinctly suppliers detailing, PARTS.PART_NO PARTS. (PART_NAME, SUPLR_NO : Unique supplier code PART_SIZE, QTY_ON_HAND) SUPL_NAME : Name of supplier SUPL_STATUS : Numerical value The statement PART.PART_NO Indicating quality and other services PARTS.PART_NAME (for example) is read provided by supplier w.r.t. city in which it as 'attribute PARTS.PART_NAME is is located functionally dependent upon attribute CITY : City where the supplier is PARTS.PART_NO' or equivalently, 'attribute located PARTS.PART_NO functionally determine In this relation S, the attribute, CITY is attribute PARTS. PART_NAME'. functionally dependent on the {SUPL_NO, NOTE that there is no requirement in the SUPL_STATUS}; however, it is not fully definition of functional dependence that a functionally dependent on this composite given X-Value appear in only one tuple of R. attribute because, of course, it is also functionally dependent on SUPLR_NO 2.4.1.1 FIRST NORMAL FORM alone. We will use this concept of full functional dependence while we discuss A relation is said to be in the 1st normal the normal forms on next page. form if every attribute is functionally dependent on the key. 2.4.1.2 SECOND NORMAL FORM (2NF) First normal form (1NF) is now To be in 2NF, a table must be in first considered to be part of the formal normal form and no attributes of the table definition of a relation in the basic (flat) should be functionally dependent on only relational model; historically, it was one part of a concatenated primary key. In defined to disallow multi valued attribute, other words, a relation R is in second composite attributes, and their normal form (2NF) if and only if it is 1NF combinations. It states that the domain of and every non key attribute is fully an attribute must include only atomic dependent on the primary key. (simple, indivisible) values and that the e. g. let`s consider ORDER_ITEM relation, value of any attribute in a tuple must be a ORDER_ITEM (ORD_NO, ITEM_NAME, QTY, single value form the domain of that ITEM_PRICE) attribute. Hence, 1NF disallows having a set This relation lists the items contained of values, a tuple of values, or a within various orders, the quantity of each combination of both as an attribute value associated with the order and their unit for a single tuple. In other words, 1 NF prices. disallows “relations within relations” or The key is formed as a combination of relations as attribute of tuples.” The only (ORD_NO + ITEM_NAME), the table is in attribute values permitted by 1NF are 1NF. It is not in 2NF, because ITEM_PRICE single atomic (or indivisible) values. is not functionally dependent on the ORD_NO component of the key. It is FULL FUNCTIONAL DEPENDENCE dependent only on the ITEM_NAME component. QTY is dependent on both Attribute B is fully functionally dependent parts of the key. To achieve 2NF form of on attribute A, if and only if, it is this relation, we decompose the functionally e.g., Take a relation S, ORDER_ITEM relation in, © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission ORDERJTEM (ORD_NO, ITEM_NAME, QTY) As can be seen, it is possible to specify the ITEM (ITEM_NAME, ITEM_PRICE) grade of a librarian just by knowing As we can see in a new 2NF set of relations, LIBRARIAN_NO and irrespective of it is now possible to add a new item details whether the librarian has issued any book irrespective of whether an order is placed or not. on it or not. If unit price of an item, ITEM_PRICE changes for a particular item then, only ITEM relation needs to be undated, effectively making less number of price updates per occurrence of an ITEM_DETAIL. 2.4.1.3 THIRD NORMAL FORM (3 NF) TRANSITIVE DEPENDENCE If attribute A depends on B and B depends on C, due to which A depends on C, then A is said to be transitively dependent on C. Transitive dependency causes problems in updating. Third Normal Form (3NF) To be in 3NF, a table must be in 2NF and no attribute of the table should be transitively functionally dependent on the primary key. Let us consider a relation, BORROW, related to library management, BORROW= (BOOK_NO, PERSON_NO, DURATION, LIBRARIAN_NO, LIBRARIAN_GRADE) This relation lists books issued from various local libraries, for which duration the book was issued, the staff member of Table: Summary of Normal Forms Based the librarian who signed out the book and on Primary Keys and Corresponding grade of that librarian. To identify the Normalization issued books records/tuples, the key used is (BOOKJ_NO + PERSON_NO). Above 2.5 NETWORK DATA MODELING relation is in 2NF but not in 3NF. This is CONCEPTS because LIBRARIAN_GRADE is only transitively dependent on LIBRARIAN_NO There two basic data structures in the and not dependent on network model: records and sets. (BOOK_NO+PERSON_NO). It is functionally dependent on identity of Records, Record Types, and Data Items librarian. So, we decompose the relation as, Data is stored in records; each record BORROW =(BOOK_NO, PERSON_NO, consists of a group of related data values. DURATION, LIBRARIAN_NO) Records are classified into record types, LIBRARIAN=(LIBRARIAN_NO,IBRARIAN_GR where each record type describes the ADE) structure of a group of records that store © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission the same type of information. We give each ADDRESS record type a name, and we also give a MAJORMDEPT name and format (data type) for each data BIRTHDATE item (or attribute) in the record type figure Format 4 shows a record type STUDENT with data CHARACTER 30 items NAME, SSN, ADDRESS, MAJORDEPT CHARACTER 9 & BIRTHDATE. CHARACTER 40 We can declare a virtual data item (or CHARACTER 10 derived attribute) AGE for the record type CHARACTER shown in Figure 4 and write a procedure to calculate the value of AGE from the value of the actual data item BIRTHDATE in each record. A typical database application has numerous types–from a few to a few hundred. To represent relationships between records, the networks model provides the modeling construct called set type which we discuss next. 2.5.1 SET TYPES AND THEIR BASIC PROPERTIES A set type is a description of a 1: N relationship between two record types. Figure 5 shows how we represent a set type diagrammatically as an arrow. This type of diagrammatic representation is called a Bachman diagram. Each set type definition consists of three basic elements:-  A name for the set type.  An owner record type. The set type in Figure is called  A member record type. MAJOR_DEPT, DEPARTMENT is the owner record type, and STUDENT is the member record type. This represents the 1: N relationship between academic departments and students majoring in those Data item name departments. In the database itself, there will be many set occurrences (or set instances) corresponding to a set type. Each instance relates one record from the owner record type DEPARTMENT record in our example to the set of records from the member record type related to it the set of STUDENT records for students who major in that department. Hence, each set occurrence is composed of NAME One owner record from the owner SSN record type. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission A number of related member records ‘computer Science’ set, and that of (zero or more) from the member record ‘Kareem Rashad’ is the last member type. A record from the member record record. The set of the network model is type cannot exist in more than one set sometimes referred to as an owner- occurrence of a particular set type. This coupled set or co-set, to distinguish it maintains the constraint that a set type from a mathematical set represents a 1:N relationship. In our example a STUDENT record can be related to at most one major DEPARTMENT and hence is a member of at most one set occurrence of the MAJOR DEPT set type. A set occurrence can be identified either by the owner record or by any of the member records. Figure 6 shows four set occurrences (instances) of the MAJDR-DEPT set type. Notice that each instance must have one owner record but can have any number of member records (zero or more). Hence, we usually refer to a set instance by its owner record. The set instances in Figure 6 can be referred to as the `Computer Science`, `Mathematics`, `Physics`, and `Geology` sets. It is customary to use a different representation of a set instance (Figure 7) where the records of the set instance are shown linked together by pointers, which corresponds to a commonly used technique for implementing sets. In the network model, a set instance is not identical to the concept of a set in mathematics. There are two principal differences:-  The set instance has one distinguished element the owner record_ whereas in a mathematical set there is no such distinction among the elements of a set.  In the network model, the member records of a set instance are ordered, whereas order of elements is immaterial in a mathematical set. Hence, we can refer to the first, second, ith, and last member records in a set instance. Figure 7 shows an alternate “linked” representation of an instance of the set MAJOR_DEPT. In figure 7 the record of `Manual Riverva` is the first STUDENT(member) record in the © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 3 TRANSACTIONS 3.1 TRANSACTIONS: component of the database system called the concurrency-control component. Collections of operations that form a single logical unit of work are called transactions. 3.1.2 TRANSACTION STATE 3.1.1 TRANSACTION CONCEPT A transaction may not always complete its execution successfully such a transaction is A transaction is a unit of program termed aborted. Any changes that the execution that accesses and possibly aborted transaction made to the database updates various data items. To ensure must be undone. integrity of the data, we require that the Once a transaction has committed, we database system maintains the following cannot undo its effects by aborting it. The properties of the transactions: only way to undo the effects of a committed If the database is consistent before an transaction is to execute a compensating execution of the transaction, the database transaction. remains consistent after the execution of the transaction. Ensuring consistency for Active, the initial state; the transaction an individual transaction is the stays in this state while it is executing responsibility of the application programmer Partially committed, after the final who codes the transaction. statement has been executed Ensuring atomicity is the responsibility of Failed, after the discovery that normal the database system itself; specifically, it is execution can no longer proceed handled by a component called the Aborted, after the transaction has been transaction-management component. rolled back and the database has been The durability property guarantees that, restored to its state prior to the start of the once a transaction completes successfully, transaction all the updates that it carried out on the Committed, after successful completion database persist, even if there is a system failure after the transaction completes A transaction is said to have terminated if execution. has either committed or aborted, since the Ensuring durability is the responsibility of actual output may still be temporarily a component of the database system called residing in main memory. the recovery management component. If several transactions are executed 3.1.3 IMPLEMENTATION OF ATOMICITY concurrently, their operations may interleave AND DURABILITY in some undesirable way, resulting in an inconsistent state. The recovery-management component of a The isolation property of a transaction database system implements the support ensures that the concurrent execution of for atomicity and durability. We first transactions results in a system state that is consider a simple, but extremely inefficient, equivalent to a state that could have been scheme. This scheme assumes that only one obtained had these transactions executed transaction is active at a time, and is based one at a time in some order. Ensuring the on making copies of the database, called isolation property is the responsibility of a shadow copies. The scheme assumes that the database is simply a file on disk. A © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission pointer called db-pointer is maintained on schemes. Suppose that the two transactions disk; it points to the current copy of the are executed one at a time in the order T1 database. In the shadow-database scheme, followed by T2. a transaction that wants to update the The execution sequences just described are database first creates a complete Copy of called schedules. They represent the the database. All updates are done on the chronological order in which instructions new database copy, leaving the original are executed in the system. copy, called the shadow copy, untouched. These schedules are serial. Each serial If at any point the transaction has to be schedule consists of a sequence of aborted, the new copy is merely deleted. instructions from various transactions, The old copy of the database has not been where the instructions belonging to one affected. On UNIX systems, the flush single transaction appear together in that command is used of this purpose. schedule. Thus, for a set of n transactions, The transaction is said to have been there exist n! different valid serial schedules. committed at the point where the updated With multiple transactions, the CPU time is db-pointer is written to disk. We can abort shared among all the transactions. In the transaction by deleting simply the new general, it is not possible to predict exactly copy of the database. how many instructions of a transaction will be before the CPU switches to another 3.1.4 DISADVANTAGE transaction. Thus, the number of possible schedules for a set of n transactions is Since executing a single transaction much larger than n! requires copying the entire database. Furthermore, the implementation does not 3.2 SERILIZABLE SCHEDULE allow transactions to execute concurrently with one another. If a concurrent schedule is equivalent to any of the serial schedule it is said to be a 3.1.5 CONCURRENT EXECUTIONS serializable schedule and this property is known as serializability. Ensuring consistency in spite concurrent It is the job of database system to ensure execution of transactions requires extra that any schedule that gets executed will work; it is far easier to insist that leave the database in a consistent state. transactions run serially-that is, one at a Any schedule that executed has the same time, each starting only after the previous effect as a schedule that could have one has completed. There are two good occurred without any concurrent execution. reasons for allowing concurrency. There is The schedule should, in some sense, be an increase in the throughput of the system equivalent to a serial schedule. – that is, in the number of transactions that can be executed in a given amount of time. 3.2.1 SERIALIZABILITY The processor and disk utilization also increase. Concurrent execution reduces the The database system must control average response time: the average time concurrent execution of transactions, to for a transaction to be completed after it ensure that the database state remains has been submitted. The database system consistent. must control the interaction among the concurrent transactions to prevent them 3.2.2 CONFLICT SERIALIZABILITY from destroying the consistency of the database. It does so through a variety of If Ii and Ij refer to the same data item Q, mechanisms called concurrency-control then the order of the two steps may matter. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission Ii = read (Q), Ij = read (Q). Returning to our previous examples, we Since the result of only the later of the two note that schedule 1 is not view equivalent write instruction is preserved. to schedule 2, since, in schedule 1, the value However, the value obtained by the next of account A read by transaction T2 was read (Q) instruction of S is affected. If there produced by T1, whereas this case does not is no other write (Q) instruction after Ii and hold in schedule 2. Ij in S, then the order Ii and Ij directly affects A schedule S is view serializable if it is view the final value of Q in the database state equivalent to a serial schedule that results from schedule S. Every conflict – serializable schedule is We say that Ii and Ij conflict if they are view serializable, but there are view- operations by different transaction on the Inserializable schedules that are not same data item, and at least one of these conflict serializable. Since every pair of instructions is a write operation. consecutive instructions conflicts, blind Do not conflict, we can swap the order of Ii writes appear in any view – serializable and Ij to produce a new schedule S’. schedule that is not conflict serializable. Schedules 3 and 5 both produce the same final system state. 3.3 RECOVERABILITY No conflicting instructions as follows: If a schedule S can be transformed into a In a system that allows concurrent schedule S’ by a series of swaps of no execution, it is necessary also to ensure conflicting instructions, we say that S and S’ that any transaction Tj that is dependent on are conflict equivalent. Ti is also aborted. The concept of conflict equivalence leads to the concept of conflict serializability. We 3.4 RECOVERABLE SCHEDULES say that a schedule S is conflict serializable if it is conflict equivalent to a serial Most database system requires that all schedule. schedules be recoverable. A recoverable This schedule is not conflict serializable, schedule is one where, for each pair of since it is not equivalent to either the serial transactions Tj and Tjsuch that Tj reads a schedule or the serial schedule data items previously written by Ti, the. commit operation of Ti appears before the The write (B) instruction of T5 conflicts commit operation of Tj. with the read (B) instruction of T1. For the system to determine that schedule 3.5 CASCADE LESS SCHEDULES 8 produces the same outcome as the serial schedule , it must analyze the A single transaction failure leads to a series computation performed by T1 and T5, of transaction rollbacks, is called cascading rather than just the read and write rollback. Cascading rollback is undesirable, operations. since it leads to the undoing of a significant amount of work. It is desirable to restrict 3.2.3 VIEW SERIALIZABILITY the schedules to those where cascading rollbacks cannot occur. Such schedules are The schedules S and S’ are said to be view called cascade less schedules. Tj reads a equivalent if the following three conditions data item previously written by Ti, the are met: Conditions 1 and 2 ensure that commit operation of Ti appears before the each transaction reads the same values in read operation of Tj. It is easy to verify that both schedules and, therefore, performs the every cascade less schedule is also same computation. recoverable. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 3.6 IMPLEMENTATION OF ISOLATION Thus, to test for conflict serializability, we need to construct the precedence graph Specifically, schedules that are conflict or and to invoke a cycle – detection algorithm. view serializable and cascade less satisfy these requirements. The goal of 3.9 TESTS FOR VIEW SERIALIZABILITY concurrency–control schemes is to provide a high degree of concurrency, while Testing for view serializability is ensuring that all schedules that can be computationally an expensive problem. generated are conflict or view serializable, And at least one of these transactions and are cascade less. writes. The write (Q) instructions of T3 and T4 are 3.7 TRANSACTION DEFINITION IN SQL called useless writes. We need to develop a scheme for deciding whether an edge needs If a program terminates without either of to be inserted in a precedence graph. A these commands, the updates are either labeled precedence graph. We therefore committed or rolled back. require that any schedule produced by The definition of serialize used by the concurrent processing of a set of standard is that a schedule must have the transactions will have an effect equivalent same effect as would a serial schedule. to a schedule produced when these The levels of consistency specified by SQL - transactions are run serially in some order. 92 are as follows: A system that guarantees this property is Allows only committed records to be read, said ensures serializability. We can test a and further requires that, between two given schedule for conflict serializability by reads of a record by a transaction, no other constructing a precedence graph for the transaction is allowed to update the record. schedule; we can test for view Allows only committed records to be read, serializability in order 2n time by may have been updated by other constructing a labeled precedence graph committed transactions.` Allows even and searching over all possible distinct uncommitted records to be read. labeled graphs for a graph with a cycle. 3.8 TESTING FOR SERILAZABILITY When designing concurrency control schemes, we must show that schedules generated by the scheme are serializable. To do that, we must first understand how to determine, given a particular schedule S, whether the schedule is serializable. In this section, we shall present methods for determining conflict and view serializability. We will show that there exists a simple and efficient algorithm to determine conflict serializability. However, there is no efficient algorithm for determining view serializability. If the precedence graph for S has a cycle, then schedule S is not conflict serializable. If the graph contains no cycles, then the schedule S is conflict serializable. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 4 QUEL 4.1 QUEL 4.4 SQL QUEL is the query language in the system SQL allows you to communicate with the INGRES. QUEL is based on relational calculus. database and has the following advantages: The fundamental aspect of the languages  Efficient based on relational calculus is tuple variable  Easy to learn and use (which is a variable that "rangers over""  Functionally complete (SQL allows you some named relation). to define, retrieve, and manipulate data e.g., the query "Get supplier numbers for in the tables). suppliers in London" can be expressed in SQL is a English – like language and can be QUEL as: used by a range of users, including those RANGE of SX is S with little or no programming experience. RETRIEVE (SX. S#) where SX. CITY = It is a non – procedural language. It reduces 'LONDON'. the amount of time required for creating The tuple here is SX that ranges over and maintaining systems. SQL statements relation S. are not case sensitive. The statements can be written on one or more lines. Keywords cannot be abbreviated or split across lines 4.2 SEQUEL and clauses are usually placed on separate lines for readability & ease of editing. In SEQUEL, the expression following For executing a SQL statement places a WHERE can be any expression involving semicolon (;) at the end of the last clause. the attributes of the relation following Statement Description SELECT Retrieves data from the FROM, arithmetic comparisons and database. operations. Boolean connectives (AND.K INSERT Enters new rows, change OR NOT), set operations (UNION, UPDATE existing rows, and removes INTERSECT, MINUS), DELELE unwanted rows from tables set membership (X INS, where S is a set or in the database, respectively. Collectively equivalently S CONTAINS X) and the known as data negation of set membership (X NOT IN S manipulation Language OR S DOES NOT CONTAIN X). (DML) When SEQUEL applies a mapping it does CREATE Sets up, changes, and not eliminate duplicates unless told to do ALTER DROP removes data structures RENAME from tables. Collectively so by the keyword UNIQUE. In SEQUEL, TRUNCATE known as data definition assignment is indicated by preceding a Language (DDL). query by ASSIGN TO R COMMIT Manages the changes made ROLLBACK by DML statements. 4.3 ASSIGN TO R: SAVEPOINT Changes to the data can he grouped together into logical transactions. To assign the result to relation R. Any GRANT Gives or removes access relational algebra expression in SEQUEL REVOICE rights to both the Oracle can be evaluated by applying one operator database and the structures at a time and thereby computing each sub within it Collectively known expression, hence SEQUEL is complete. as data control language (DCL). © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission 4.5 SELECT STATEMENT DEPTNO DNAME LOC ------------ ------------- ------- 10 ACCOUNTING NEW YORK A select statement retrieves information 20 RESEARCH DALLAS from the database. Using a SELECT 30 SALES CHICAGO statement, you can do the following: 40 OPERATIONS BOSTON You can display all columns of data in a 4.5.1 SELECTION table by following the SELECT keyword with an asterisk (*). You can use the selection capability in SQL to choose the rows in a table that you want 4.5.4 SELECTING SPECIFIC COLUMNS, returned by a query. You can use various ALL ROWS criteria to selectively restrict the rows that you see. You can use the SELECT statement to display specific columns of the table by 4.5.2 PROJECTION specifying the column names, separated by commas. You can use the projection capability in SQL to choose the columns in a table that you 4.6 DEFINING A NULL VALUE want returned by your query. You can choose as few or as many columns of the If a row lacks the data value for a particular table as you require. column, that value is said to be null, or to contain null. A null value is a value that is 4.5.3 JOIN unavailable, unassigned, unknown, or inapplicable. A null value is not the same as You can use the join capability in SQL to zero or a space. Zero is a number, and a bring together data that is stored in space is a character. different tables by creating a link between them. In its simplest form, a SELECT 4.7 DUPLICATE ROWS statement must include the following:  A SELECT clause, which specifies the The default display of queries is all rows, column to be displayed including duplicate rows  A FROM clause, which specifies the To eliminate duplicate rows in the result, table containing the columns listed in include DISTINCT keyword in the SELECT the SELECT clause. clause immediately after the SELECT SELECT [DISTINCT] {*, column keyword. [alias],………..} FROM table 4.8 CHARACTERS STRINGS AND DATES In the Syntax:  Character strings and date values are SELECT : is a list of one or more columns enclosed in single quotation marks. FROM identifies which tables  Character values are case sensitive and DISTINCT suppresses duplicates date values are format sensitive. * selects all columns  Number values are not enclosed within Column selects the named column quotation marks. Alias gives selected columns different headings 4.9 COMPARISON OPERATORS FROM table specifies the table containing the columns Comparison operators are used in Select All Columns, All Rows FROM dept; conditions that compare one expression to © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission another. They are used in the WHERE WHERE dname LIKE ‘%\_%’ ESCAPE ‘\’; clause in the following format: The ESCAPE option identifies the backslash (\) as the escape character. Operator Meaning = Equal to 4.9.4 USING THE IS NULL OPERATOR > Greater than >= Greater than equal to The IS NULL operator tests for values that < Less than are null. A null value means the value is table, including duplicate rows and rows SELECT MIN (Sal) containing null values in any of the FROM emp columns. WHERE deptno = 20); COUNT (*) returns the number of rows satisfies the condition in the WHERE clause 4.15.1.2Multiple – Row Sub queries SELECT COUNT (*) FROM emp Sub queries that return more than one row WHERE deptno = 30; are called multiple – row sub queries. You COUNT (expr) returns the number of non use a multiple – row operator, instead of null rows in the column identified by expr. single – row operator, with a multiple – SELECT COUNT (comm.) row sub query. The multiple – row FROM emp operator expects one or more values. WHERE deptno = 30; Operator Meaning 4.15 SUB QUERIES IN Equal to any member in the list ANY Compare value to each value The inner query or the sub query returns a returned by the sub query All Compare value to every value value that is used by the outer query or the returned by the sub query main query. Example 5.2 4.15.1 TYPES OF SUBQUERIES Find the employees who earn the same salary as the minimum salary for (i) Single – row Subquereis: departments. Queries that return only one row form SELECT ename, Sal, deptno the inner SELECT statement. FROM emp © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission WHERE Sal IN WHERE expr1 = (SELECT MIN (Sal) alias.expr2) FROM emp GROUP BY deptno); Example 5.3 The inner query is executed first, SELECT ename, sal, deptno FROM emp 4.15.1.3 Multiple – Column Sub queries WHERE deptno = e.deptno); If you want to compare two or more 4.16 Data Definition Language (DDL) columns, you must write a compound WHERE clause using logical operators. 4.16.1 DATABASE OBJECTS Multiple –column sub queries enable you to combine duplicate WHERE conditions into A database can contain multiple data a single WHERE clause. structures. Each structure should be Syntax outlined in the database design so that it SELECT column, column, can be created during the build stage of FROM table database development. WHERE (column, column...) IN (SELECT column, column, 4.16.2 THE CREATE TABLE STATEMENT FROM table WHERE condition); Create tables to store data by executing the SELECT empno, mgr, deptno SQL, ‘CREATE TABLE’ statement. This FROM emp statement is one of the data definition WHERE (mgr,deptno) IN language (DDL) statements. (SELECT mgr, deptno Syntax FROM emp CREATETABLE [schema.] table WHERE ename =’ MARTIN’) (column data type [DEFAULT AND ename ‘MARTIN’; expr][,…..]); 4.15.1.4 CORRELATED SUBQURIES In the syntax Schema Is the same as the owner’s name The Server performs a correlated sub Table Is the name of the table query when the sub query references a DEFAULT expr Specifies a default value if a column from a table referred to in the value is omitted in the INSERT Statement parent statement. A correlated sub query is Column Is the name of the column evaluated once for each row processed by Data type Is the column’s datatype and the parent statement. The parent statement length can be a SELECT, UPDATE or DELETE statement. The Server performs a 4.16.3 DATATYPES correlated sub query when the sub query references a column from table in the Data type Description parent query. VARCHAR2 Variable-length character data (A (size) maximum size must be specified. Syntax Default and minimum size is 1. SELECT column1, column2, ……. Maximum size is 4000) FROM table1 alias CHAR (size) Fixed - length character data of WHERE column1, operator length size bytes (Default and (SELECT column1, minimum size Is 1. Maximum size is 2000) column2……… NUMBER(p, Number having precision p and FROM table2 s) scales.(The precision is the total © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission number of decimal digits and the scale is the number of digits to the 4.16.5.1 Add a Column right of the decimal point. The precision can range from 1 to 38 You can add columns to a table by using the and the scale can range from 84 to ALTER TABLE statement with the ADD 127.) clause. DATE Date and time values between January 1, 4712 B.C. and December ALTER TABLE 31, 9999 A.D. ADD (column data type [DEFAULT LONG Variable-length character data up expr] to 2 gigabytes. [, column data type]…..); CLOB Single-byte character data up to 4 gigabytes. RAW (size) Paw binary dam of length of size 4.16.5.2 Modify a Column (Size must be specified Maximum size 2000). You can modify existing columns in a table LONG RAW Raw binary data of variable length by using the ALTER TABLE statement with up to 2 gigabytes. the MODIFY clause. Column modification BLOB Binary dam up to 4 gigabytes. can include changes to a column’s data BBILE Binary data stored in an external file; up 4 gigabytes. type, size and default value. 4.16.4 CREATING A TABLE BY USING ALTER TABLE table SUBQUERY MODIFY (column datatype [DEFAULT expr] A second method to create a table is to [, column datatype]….); apply the as sub query clause to both create ALTER TABLE dept30 the table and insert rows returned from the MODIFY (enameVARCHAR2(15) ); sub query. Table altered CREATE TABLE table [(Column, column….)] 4.16.5.3 Dropping a Column As sub query; CREATE TABLE dept30 You can drop a column from a table by AS using the ALTER TABLE statement with the SELECT empno, ename, DROP COLUMN clause. sal*12ANNSAL, ALTER TABLE dep30 hiredate DROP COLUMN job; FROM emp Table altered. WHERE deptno=30; Guidelines 4.16.5 THE ALTER TABLE STATEMENT  The column may or may not contain After you create your tables, you may need data to change the table structures because you  Only one column can be dropped at a omitted a column or your column definition time. need to be changed. You can do this by  The table must have at least one column using the ALTER TABLE statement. remaining in it after it is altered You can use the ALTER TABLE statement  Once a column is dropped, it cannot be to: recovered  Add a new column.  Modify an existing column. 4.16.5.4 DROPPING A TABLE  Define a default value for the new column. © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission The DROP TABLE statement removes the The integrity constraints can be defined definition of a table. When you drop a table, using the CONSTRAINT clause. The the database looses all the data in the table CONSTRAINT clause can appear in CREATE and all the indexes associated with it. TABLE and ALTER TABLE commands. This Syntax clause allows the user to define UNIQUE, DROP TABLE table; PRIMARY KEY, NOT NULL, FOREIGN KEY where table is the name of the table and CHECK integrity constraints. The DROP TABLE dept30; CREATE TABLE command is used when the Table dropped constraints are to be defined at the time of table creation. The ALTER TABLE 4.16.5.5 TRUNCATING TABLE command is used when the table already exists and constraints are to be defined. Another DDL statement is the TRUNCATE CREATE TABLE [schema.] TABLE statement, which is used to remove table all rows from a table and to release the (columndatatype [DEFAULT expr] storage space used by that table. When [column_constraint],…….. using the TRUNCATE TABLE statement, [table_constraint] [,….] ); you cannot rollback row removal. 4.16.6.3 Disabling Constraints Syntax TRUNCATE TABLE; You can disable a constraint without dropping it or re-creating it by using the 4.16.6 INCLUDIG CONSTRAINTS ALTER TABLE statement with the DISABLE clause. An integrity constraint can be thought of as Syntax a way to define a business rule for a column ALTER TABLE table or a table. Integrity constraints are defined DISABLE CONSTRAINT constraint with a table and are stored as part of the [CASCADE]; table’s definition in the data dictionary. ALTER TABLE emp DISABLE CONSTRIANT 4.16.6.1 Data Integrity Constraints emp_empno_pk CASCADE; Constraint Description 4.17 VIEWS NOT NULL Specification that this column may not contain a A view is a logical table based on a table null value based on a table or another view. A view UNIQUE Specifies a column or contains no data of its own but is like a combination of column window through which data from tables whose value must be unique for all rows in table can be viewed or changed. The tables on PRIMARY KEY Uniquely identifies each row which a view is based are called base of the table tables. The view is stored as a SELECT FOREIGGN KEY Establishes and enforces a statement in the data dictionary. foreign key relationship between the column and a 4.17.1 Advantages of Views column of the referenced table.  Views restrict access to the data CHECK Specifies a condition that because the view can display selective must be true. columns from the table.  Views allow users to make simple 4.16.6.2 Defining Constraints queries to retrieve the results from © Copyright Reserved by Gateflix.in No part of this material should be copied or reproduced without permission complicated queries. For example, FROM emp a, (SELECT deptno, max(sal) views allow users to query information maxsal from multiple tables without knowing FROM emp how to write a joint statement. GROUP BY deptno) b  Views provide data independence for WHERE a.deptno = b.deptno adhoc users and application programs. AND a.sal is greater than should match that of columns in the table. If !> is not greater than the number of data values is less, then < is less than specify the column names into which data !< is not less than is being entered as illustrated. >= is greater than or equal to INSERT INTO TABLE-NAME(column1, SELECT * 2. A column or attribute containing the 2 FROM DPT; employee number, which is also the

Document Details

Tags

Related

Full Transcript