6 Lecture notes (unit wise-may be handwritten)_BCA_DBMS.docx.pdf
Document Details
Uploaded by WonEuropium
Tags
Full Transcript
Course Code: 4040233101 Course Name: Database Management System SEMESTER:...
Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 UNIT No 1 Unit Name : Introduction to Database Topics: ⮚ Introduction to Database, DBMS, Applications of DBMS ⮚ Purpose of DBMS, Data Independence ⮚ Types of Database System Architecture (1- tier,2- tier, 3-tier) ⮚ Database System levels(External, Conceptual, Internal), Mappings ⮚ Database, Users and DBA. ⮚ Introduction to types of Data Models. ⮚ Hierarchical data model, Network Data Model, ⮚ Relational Data Model, E-R Data Model ❖ Introduction to Database Data Facts that can be recorded or stored. E.g. Person Name, Age, Gender and Weight etc. Information When data is processed, organized, structured or presented in a given context so as to make it useful, it is called information. Database A Database is a collection of inter-related (logically-related) data. E.g. Books Database in Library, Student Database in University etc. DBMS (Database Management System) A database management system is a collection of inter-related data and set of programs to manipulate those data. DBMS = Database + Set of programs E.g. MS SQL Server, Oracle, My SQL, SQLite, MongoDB etc. Metadata Metadata is data about data. Data such as table name, column name, data type, authorized user and user access privileges for any table is called metadata for that table. Page | 1 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Data dictionary Data dictionary is an information repository which contains metadata. It is usually a part of the system catalog. Data warehouse Data warehouse is an information repository which stored data. It is design to facilitate reporting and analysis. Field A field is a character or group of characters that have a specific meaning. It is also called a data item. It is represented in the database by a value. For Example customer id, name, society and city are all fields for customer Data. Record A record is a collection of logically related fields. For examples, collection of fields (id, name, address & city) forms a record for customer. Applications of DBMS In so many fields, we will use a database management system. Let’s see some of the applications where database management system uses − Railway Reservation System − The railway reservation system database plays a very important role by keeping record of ticket booking, train’s departure time and arrival status and also gives information regarding train late to people through the database. Library Management System − Now-a-days it’s become easy in the Library to track each book and maintain it because of the database. This happens because there are thousands of books in the library. It is very difficult to keep a record of all books in a copy or register. Now DBMS used to maintain all the information related to book issue dates, name of the book, author and availability of the book. Banking − Banking is one of the main applications of databases. We all know there will be a thousand transactions through banks daily and we are doing this without going to Page | 2 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 the bank. This is all possible just because of DBMS that manages all the bank transactions. Universities and colleges − Now-a-days examinations are done online. So, the universities and colleges are maintaining DBMS to store Student’s registrations details, results, courses and grade all the information in the database. For example, telecommunications. Without DBMS there is no telecommunication company. DBMS is most useful to these companies to store the call details and monthly postpaid bills. Credit card transactions − The purchase of items and transactions of credit cards are made possible only by DBMS. A credit card holder has to know the importance of their information that all are secured through DBMS. Social Media Sites − By filling the required details we are able to access social media platforms. Many users sign up daily on social websites such as Facebook, Pinterest and Instagram. All the information related to the users are stored and maintained with the help of DBMS. Finance − Now-a-days there are lots of things to do with finance like storing sales, holding information and finance statement management etc. these all can be done with database systems. Military − In military areas the DBMS is playing a vital role. Military keeps records of soldiers and it has so many files that should be kept secure and safe. DBMS provides a high security to military information. Online Shopping − Now-a-days we all do Online shopping without wasting the time by going shopping with the help of DBMS. The products are added and sold only with the help of DBMS like Purchase information, invoice bills and payment. Human Resource Management − the management keeps records of each employee’s salary, tax and work through DBMS. Manufacturing − Manufacturing companies make products and sell them on a daily basis. To keep records of all those details DBMS is used. Airline Reservation system − Just like the railway reservation system, airlines also need DBMS to keep records of flights arrival, departure and delay status. Purpose of DBMS The purpose of database systems is to manage the following insecurities: ▪ data redundancy and inconsistency, Page | 3 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 ▪ difficulty in accessing data, ▪ data isolation, ▪ atomicity of updates, ▪ concurrent access, ▪ security problems, and ▪ Supports multiple views of data. Data Independence Data independency is the ability to modify a schema definition in one level without affecting a schema definition in the next higher level. Types of data independence Physical data independence Logical data independence Physical data independence Physical data independence allows changing in physical storage devices or organization of file without change in the conceptual view or external view. Modifications at the internal level are occasionally necessary to improve performance. Physical data independence separates conceptual level from the internal level. It is easy to achieve physical data independence. Logical data independence Logical data independence is the ability to modify the conceptual schema without requiring any change in application programs. Conceptual schema can be changed without affecting the existing external schema. Modifications at the logical level are necessary whenever the logical structure of the database is altered. Logical data independence separates external level from the conceptual view. It is difficult to achieve logical data independence. Page | 4 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Types of Data base System Architecture (1- tier, 2- tier, 3-tier) Most widely used architecture is 3-tier architecture. user application client network application server database system 3-tier architecture separates it tier from each other on basis of users. client Page | 5 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 server 1) Database (Data) Tier At this tier, only database resides. Database along with its query processing languages sits in layer-3 of 3-tier architecture. It also contains all relations and their constraints. 2) Application (Middle) Tier At this tier, the application server and program, which access database, resides. For a user this application tier works as abstracted view of database. Users are unaware of any existence of database beyond application. For database-tier, application tier is the user of it. Database tier is not aware of any other user beyond application tier. This tier works as mediator between the two. 3) User (Presentation) Tier An end user sits on this tier. From a user’s aspect, this tier is everything. He/she doesn't know about any existence or form of database beyond this layer. At this layer multiple views of database can be provided by the application. All views which are generated by an application, resides in application tier. Database System levels (External, Conceptual, Internal) Mapping The process of transforming requests and results between the three levels is called mapping. Types of Mapping Conceptual/Internal Mapping External/Conceptual Mapping Page | 6 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Conceptual/Internal Mapping It relates conceptual schema with internal schema. It defines correspondence between the conceptual schema and the database stored in physical devices. It specifies how conceptual records and fields are presented at the internal level. If the structure of stored database is changed, then conceptual/internal mapping must be changed accordingly and conceptual schema can remain invariant. There could be one mapping between conceptual and internal levels. External/Conceptual Mapping It relates each external schema with conceptual schema. It defines correspondence between a particular external view and conceptual schema. If the structure of conceptual schema is changed, then external/conceptual mapping must be changed accordingly and external schema can remain invariant. There could be several mappings between external and conceptual levels. Database Users and DBA There are four different database users. Application programmers These users are computer professionals who write application programs using some tools. E.g. Software developers Sophisticated users These users interact with system without writing program. They form their request in a database query language. E.g. Analyst. Specialized users These users write specialized database applications that do not fit into the traditional data processing framework. E.g. Database Administrator. Naive users These users are unsophisticated users who have very less knowledge of database system. These users interact with the system by using one of the application programs that Page | 7 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 have been written previously. Examples, e.g. Clerk in bank DBA The database administrator is a person in the organization who controls the design and the Use of the database. DBA provides necessary technical support for implementing a database. DBA is involved more in the design, development, testing and operational phases. DBA is a technical person having knowledge of database technology. DBA does not need to be a business person, but any kind of knowledge about a functionality of an organization can be more beneficiary. DBA is a technically focused person, but, he/she should understand more about the business to administer the databases effectively. ❖ Introduction to types of Data Models A database model is a type of data model that defines the logical structure of a database. It determines how data can be stored, accessed and updated in a database management system The most popular example of a database model is the relational model, which uses a table-based format. Type of Database Models are: 1. Hierarchical Model 2. Network Model 3. Entity-relationship Model 4. Relational Model 5. Object-oriented database model 1) Hierarchical Model The hierarchical model organizes data into a tree-like structure, where each record has a single parent or root. The hierarchy starts from the Root data, and expands like a tree, adding child nodes to the parent nodes. In hierarchical model, data is organized into tree-like structure with one-to-many relationship between two different types of data, for example, one department can have many professors and many students. Page | 8 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 2) Network Model This is an extension of the hierarchical model, allowing many-to-many relationships in a tree-like structure that allows multiple parents. 3) Entity-relationship Model In this database model, relationships are created by dividing object of interest into entity and its characteristics into attributes. Page | 9 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 4) Relational Model In this model, data is organized in two-dimensional tables and the relationship is maintained by storing a common attribute. SubjectID SubjectName Teacher 1 DBMS Patel 2 DS Shah Rno StudentName Age 1 Raj 21 2 Meet 22 Rno SubjectID Marks Page | 10 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 1 1 98 1 2 95 2 1 95 2 2 90 5) Object-oriented database model This data model is another method of representing real world objects. It considers each object in the world as objects and isolates it from each other. It groups its related functionalities together and allows inheriting its functionality to other related sub-groups. Page | 11 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Unit 2: Relational Data Model Topics: Structure of relational databases, Domains Relations, Relational algebra-fundamental operators and syntax, relational algebra queries Entity Relationship Diagram Components of E-R Diagram (Entities, Attributes, Relationship) Mapping cardinalities Keys Extended E-R Features: Specialization, Generalization and Aggregation. Structure of relational databases Page | 12 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Relations: Relation is sometimes used to refer to a table in a relational database but is more commonly used to describe the relationships that can be created between those tables in a relational database. ❖ Relational algebra-fundamental operators and syntax Relational algebra is a procedural query language, which takes instances of relations as input and yields instances of relations as output. It uses operators to perform queries. An operator can be either unary or binary. They accept relations as their input and yield relations as their output. Relational algebra is performed recursively on a relation and intermediate results are also considered relations. The fundamental operations of relational algebra are as follows − Select Project Union Set different Cartesian product Rename We will discuss all these operations in the following sections. Page | 13 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Select Operation (σ) It selects tuples that satisfy the given predicate from a relation. Notation − σp(r) Where σ stands for selection predicate and r stands for relation. p is prepositional logic formula which may use connectors like and, or, and not. These terms may use relational operators like − =, ≠, ≥, , ≤. For example − σsubject = "database"(Books) Output − Selects tuples from books where subject is 'database'. σsubject = "database" and price = "450"(Books) Output − Selects tuples from books where subject is 'database' and 'price' is 450. σsubject = "database" and price = "450" or year > "2010"(Books) Output − Selects tuples from books where subject is 'database' and 'price' is 450 or those books published after 2010. Project Operation (∏) It projects column(s) that satisfy a given predicate. Notation − ∏A1, A2, An (r) Where A1, A2, An are attribute names of relation r. Duplicate rows are automatically eliminated, as relation is a set. For example − ∏subject, author (Books) Selects and projects columns named as subject and author from the relation Books. Union Operation (∪) It performs binary union between two given relations and is defined as − r ∪ s = { t | t ∈ r or t ∈ s} Notation − r U s Where r and s are either database relations or relation result set (temporary relation). For a union operation to be valid, the following conditions must hold − r, and s must have the same number of attributes. Attribute domains must be compatible. Page | 14 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Duplicate tuples are automatically eliminated. ∏ author (Books) ∪ ∏ author (Articles) Output − Projects the names of the authors who have either written a book or an article or both. Set Difference (−) The result of set difference operation is tuples, which are present in one relation but are not in the second relation. Notation − r − s Finds all the tuples that are present in r but not in s. ∏ author (Books) − ∏ author (Articles) Output − Provides the name of authors who have written books but not articles. Cartesian Product (Χ) Combines information of two different relations into one. Notation − r Χ s Where r and s are relations and their output will be defined as − r Χ s = { q t | q ∈ r and t ∈ s} σauthor = 'xyz'(Books Χ Articles) Output − Yields a relation, which shows all the books and articles written by xyz. Rename Operation (ρ) The results of relational algebra are also relations but without any name. The rename operation allows us to rename the output relation. 'Rename' operation is denoted with small Greek letter rho ρ. Notation − ρ x (E) Where the result of expression E is saved with name of x. Additional operations are − Set intersection Assignment Natural join Relational Calculus Page | 15 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 In contrast to Relational Algebra, Relational Calculus is a non-procedural query language, that is, it tells what to do but never explains how to do it. Relational calculus exists in two forms − Tuple Relational Calculus (TRC) Filtering variable ranges over tuples Notation − {T | Condition} Returns all tuples T that satisfies a condition. For example − { T.name | Author(T) AND T.article = 'database' } Output − Returns tuples with 'name' from Author who has written article on 'database'. TRC can be quantified. We can use Existential (∃) and Universal Quantifiers (∀). For example − { R| ∃T ∈ Authors(T.article='database' AND R.name=T.name)} Output − The above query will yield the same result as the previous one. ❖ Entity Relationship Diagram The E-R model cannot express relationships among relationships. When would we need such a thing at that time aggregation is used? Consider a database with information about employees who work on a particular project and use a number of machines doing that work. Page | 16 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Fig. A Fig. B Relationship sets work and uses could be combined into a single set. We can combine them by using aggregation. Aggregation is an abstraction through which relationships are treated as higher-level entities. For our example, we treat the relationship set work and the entity sets employee and project as a higher-level entity set called work. Transforming an E-R diagram with aggregation into tabular form is easy. We create a table for each entity and relationship set as before. The table for relationship set uses contains a column for each attribute in the primary key of machinery and work. Step 1: Entities and Simple Attributes: An entity type within ER diagram is turned into a table. You may preferably keep the same name for the entity or give it a sensible name but avoid DBMS reserved words as well as avoid the use of special characters. Each attribute turns into a column (attribute) in the table. The key attribute of the entity is the primary key of the table which is usually underlined. It can be composite if required but can never be null. It is highly recommended that every table should start with its primary key attribute conventionally named as TablenameID. Consider the following simple ER diagram: The initial relational schema is expressed in the following format writing the table names with the attributes list inside a parentheses as shown below ✔ Persons( personid, name, address, email ) Person personid name address Email Persons and Phones are Tables and personid, name, address and email are Columns (Attributes). personid is the primary key for the table : Person Step 2: Multi-Valued Attributes A multi-valued attribute is usually represented with a double-line oval. Page | 17 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 If you have a multi-valued attribute, take that multi-valued attribute and turn it into a new entity or table of its own. Then make a 1:N relationship between the new entity and the existing one. In simple words. ⮚ 1. Create a table for that multi-valued attribute. ⮚ 2. Add the primary (id) column of the parent entity as a foreign key within the new table as shown below: ✔ First table is Persons ( personid, name, address, email ) ✔ Second table is Phones ( phoneid , personid, phone ) personid within the table Phones is a foreign key referring to the personid of Persons Phone phoneid personid phone Step 3: 1:1 Relationship Let us consider the case where the Person has one wife. You can place the primary key of the wife table wifeid in the table Persons which we call in this Page | 18 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 case foreign key as shown below. ✔ Persons( personid, name, address, email , wifeid ) ✔ Wife ( wifeid , name ) Or vice versa to put the personid as a foreign key within the wife table as shown below: ✔ Persons( personid, name, address, email ) ✔ Wife ( wifeid , name , personid) For cases when the Person is not married i.e. has no wifeID, the attribute can set to NULL Persons personid name address email wifeid Wife wifeid name OR Persons personid name address email Wife wifeid name personid Step 4: 1:N Relationships Page | 19 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 For instance, the Person can have a House from zero to many, but a House can have only one Person. In such relationship place the primary key attribute of table having 1 mapping in to the table having many cardinality as a foreign key. To represent such relationship the personid as the Parent table must be placed within the Child table as a foreign key. It should convert to : ✔ Persons( personid, name, address, email ) ✔ House ( houseid, name , address, personid) Persons personid name address email House houseid name address personid Step 5: N:N Relationships For instance, The Person can live or work in many countries. Also, a country can have many people. To express this relationship within a relational schema we use a separate table as shown below: It should convert into : ✔ Persons( personid, name, address, email ) ✔ Countries ( countryid, name) ✔ HasRelat ( hasrelatid, personid , countryid) Page | 20 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Persons personid name address email Countries countryid name HasRelat hasrelatid personid countryid Page | 21 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Page | 22 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 ❖ E-R model Entity-relationship (ER) model/diagram is a graphical representation of entities and their relationships to each other with their attributes. ❖ Mapping cardinalities In database management, cardinality plays an important role. Here cardinality represents the number of times an entity of an entity set participates in a relationship set. Or we can say that the cardinality of a relationship is the number of tuples (rows) in a relationship. Types of cardinality in between tables are: 1. one-to-one 2. one-to-many 3. many-to-one 4. many-to-many 1) One-to-One Cardinalities: One to one cardinality is represented by a 1:1 symbol. In this, there is at most one relationship from one entity to another entity. There are a lot of examples of one-to-one cardinality in real life databases. Page | 23 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 For example, one student can have only one student id, and one student id can belong to only one student. So, the relationship mapping between student and student id will be one to one cardinality mapping. 2) One-to-Many Cardinalities: In One-to-many cardinality mapping, from set 1, there can be a maximum single set that can make relationships with a single or more than one entity of set 2. Or we can also describe it as from set 2, more than one entity can make a relationship with only one entity of set 1. One to one cardinality is the subset of One-to-many Cardinality. It can be represented by 1: M. For Example, in a hospital, there can be various compounders, so the relationship between the hospital and compounders can be mapped through One-to-many Cardinality. Page | 24 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 3) Many-to-One Cardinality: In many to one cardinality mapping, from set 1, there can be multiple sets that can make relationships with a single entity of set 2. Or we can also describe it as from set 2, and one entity can make a relationship with more than one entity of set 1. One to one Cardinality is the subset of many to one Cardinality. It can be represented by M:1. For example, there are multiple patients in a hospital who are served by a single doctor, so the relationship between patients and doctors can be represented by many to one Cardinality. 4) Many-to-Many Cardinalities: In many, many cardinalities mapping, there can be one or more than one entity that can associate with one or more than one entity of set 2. In the same way from the end of set 2, one or more than one entity can make a relation with one or more than one entity of set 1. It is represented by M: N or N: M. ❖ Keys Keys play an important role in the relational database. It is used to uniquely identify any record or row of data from the table. It is also used to establish and identify relationships between tables. Page | 25 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 For example, ID is used as a key in the Student table because it is unique for each student. In the PERSON table, passport_number, license_number, SSN are keys since they are unique for each person. Types of keys: 1. Primary key o It is the first key used to identify one and only one instance of an entity uniquely. An entity can contain multiple keys, as we saw in the PERSON table. The key which is most suitable from those lists becomes a primary key. o In the EMPLOYEE table, ID can be the primary key since it is unique for each employee. In the EMPLOYEE table, we can even select License_Number and Passport_Number as primary keys since they are also unique. o For each entity, the primary key selection is based on requirements and developers. 2. Candidate key o A candidate key is an attribute or set of attributes that can uniquely identify a tuple. o Except for the primary key, the remaining attributes are considered a candidate key. The candidate keys are as strong as the primary key. For example: In the EMPLOYEE table, id is best suited for the primary key. The rest of the attributes, like SSN, Passport_Number, License_Number, etc., are considered a candidate key. Page | 26 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 3. Super Key Super key is an attribute set that can uniquely identify a tuple. A super key is a superset of a candidate key. For example: In the above EMPLOYEE table, for(EMPLOEE_ID, EMPLOYEE_NAME), the name of two employees can be the same, but their EMPLYEE_ID can't be the same. Hence, this combination can also be a key. 4. Foreign key o Foreign keys are the column of the table used to point to the primary key of another table. Page | 27 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 o Every employee works in a specific department in a company, and employee and department are two different entities. So we can't store the department's information in the employee table. That's why we link these two tables through the primary key of one table. o We add the primary key of the DEPARTMENT table, Department_Id, as a new attribute in the EMPLOYEE table. o In the EMPLOYEE table, Department_Id is the foreign key, and both the tables are related. 5. Alternate key There may be one or more attributes or a combination of attributes that uniquely identify each tuple in a relation. These attributes or combinations of the attributes are called the candidate keys. One key is chosen as the primary key from these candidate keys, and the remaining candidate key, if it exists, is termed the alternate key. In other words, the total number of the alternate keys is the total number of candidate keys minus the primary key. The alternate key may or may not exist. If there is only one candidate key in a relation, it does not have an alternate key. For example, employee relation has two attributes, Employee_Id and PAN_No, that act as candidate keys. In this relation, Employee_Id is chosen as the primary key, so the other candidate key, PAN_No, acts as the Alternate key. Page | 28 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 6. Composite key Whenever a primary key consists of more than one attribute, it is known as a composite key. This key is also known as Concatenated Key. For example, in employee relations, we assume that an employee may be assigned multiple roles, and an employee may work on multiple projects simultaneously. So the primary key will be composed of all three attributes, namely Emp_ID, Emp_role, and Proj_ID in combination. So these attributes act as a composite key since the primary key comprises more than one attribute. ❖ Extended E-R Features: Specialization, Generalization and Aggregation ❖ Generalization o Generalization is like a bottom-up approach in which two or more entities of lower level combine to form a higher level entity if they have some attributes in common. o In generalization, an entity of a higher level can also combine with the entities of the lower level to form a further higher level entity. o Generalization is more like subclass and super class system, but the only difference is the approach. Generalization uses the bottom-up approach. o In generalization, entities are combined to form a more generalized entity, i.e., subclasses are combined to make a super class. For example, Faculty and Student entities can be generalized and create a higher level entity Person. Page | 29 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 ❖ Specialization o Specialization is a top-down approach, and it is opposite to Generalization. In specialization, one higher level entity can be broken down into two lower level entities. o Specialization is used to identify the subset of an entity set that shares some distinguishing characteristics. o Normally, the superclass is defined first, the subclass and its related attributes are defined next, and relationship set are then added. For example: In an Employee management system, EMPLOYEE entity can be specialized as TESTER or DEVELOPER based on what role they play in the company. ❖ Aggregation In aggregation, the relation between two entities is treated as a single entity. In aggregation, relationship with its corresponding entities is aggregated into a higher level entity. For example: Center entity offers the Course entity act as a single entity in the relationship which is in a relationship with another entity visitor. In the real world, if a visitor visits a coaching center then he will never enquiry about the Course only or just about the Center instead he will ask the enquiry about both. Page | 30 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Page | 31 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 UNIT 3 Relational Database Design Anomalies, Functional Dependency, Normalization: 1st NF, 2nd NF, 3rd NF, Anomalies What is Anomaly? Anomaly means inconsistency in the pattern from the normal form. In Database Management System (DBMS), anomaly means the inconsistency occurred in the relational table during the operations performed on the relational table. There can be various reasons for anomalies to occur in the database. For example, if there is a lot of redundant data present in our database then DBMS anomalies can occur. If a table is constructed in a very poor manner then there is a chance of database anomaly. Due to database anomalies, the integrity of the database suffers. The other reason for the database anomalies is that all the data is stored in a single table. So, to remove the anomalies of the database, normalization is the process which is done where the splitting of the table and joining of the table (different types of join) occurs. Worker_id Worker_name Worker_dept Worker_address 65 Ramesh ECT001 Jaipur 65 Ramesh ECT002 Jaipur 73 Amit ECT002 Delhi 76 Vikas ECT501 Pune 76 Vikas ECT502 Pune 79 Rajesh ECT669 Mumbai In the above table, we have four columns which describe the details about the workers like their name, address, department and their id. There can be three types of an anomaly in the database: 1) Updation / Update Anomaly When we update some rows in the table, and if it leads to the inconsistency of the table then this anomaly occurs. This type of anomaly is known as an updation anomaly. In the above table, if we want to update the address of Ramesh then we will have to update all the rows where Ramesh is present. If during the update we miss any single row, then there will be two addresses of Ramesh, which will lead to inconsistent and wrong databases. Page | 32 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 2) Insertion Anomaly If there is a new row inserted in the table and it creates the inconsistency in the table then it is called the insertion anomaly. For example, if in the above table, we create a new row of a worker, and if it is not allocated to any department then we cannot insert it in the table so, it will create an insertion anomaly. 3) Deletion Anomaly If we delete some rows from the table and if any other information or data which is required is also deleted from the database, this is called the deletion anomaly in the database. For example, in the above table, if we want to delete the department number ECT669 then the details of Rajesh will also be deleted since Rajesh's details are dependent on the row of ECT669. So, there will be deletion anomalies in the table. Functional Dependency The functional dependency is a relationship that exists between two attributes. It typically exists between the primary key and non-key attribute within a table. X → Y The left side of FD is known as a determinant, the right side of the production is known as a dependent. For example: Assume we have an employee table with attributes: Emp_Id, Emp_Name, Emp_Address. Here Emp_Id attribute can uniquely identify the Emp_Name attribute of employee table because if we know the Emp_Id, we can tell that employee name associated with it. Functional dependency can be written as: Emp_Id → Emp_Name We can say that Emp_Name is functionally dependent on Emp_Id. Types of Functional dependency Page | 33 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Trivial and Nontrivial Dependency Functional dependency is trivial if and only if the right-hand side is a subset of the left-hand side. The following dependency is called trivial: {S#, P#} → S# Functional dependency is nontrivial if and only if the right-hand side is a not subset of the left-hand side. The following dependency is called nontrivial: {S#, P#} → QTY Some FDs might imply others. For example: {S#, P#} → {City, QTY} Implies both of following: {S#, P#}→ City {S#, P#}→ QTY Closure of a Set of Dependencies 1. Reflexivity: If B is a subset of A, then A → B (Trivial) 2. Augmentation: If A → B , then AC → BC 3. Transitivity: If A → B and B → C, then A → C 4. Self-Determination: A → A 5. Decomposition: If A → BC, then A → B and A → C 6. Union: If A → B and A → C then A → BC 7. Composition: If A → B and C → D, then AC → BD Page | 34 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 NORMALIZATION Normalization is a process for evaluating and correcting database structure to minimize data redundancies. The database should be normal means database should be well organized so whatever data we require we can get easily. If we can perform our regular work very easily on given database then we can say that given database is normalized. Advantages of Normalization o The main reason for normalizing the relations is removing anomalies. o Normalization is used to minimize the redundancy from a relation or set of relations. o It is also used to eliminate undesirable characteristics like Insertion, Update, and Deletion Anomalies. o Normalization divides the larger table into smaller and links them using relationships. o The normal form is used to reduce redundancy from the database table. o Improve data consistency within the database. o Much more flexible database design. o Enforces the concept of relational integrity First Normal Form (1NF) o A given relation is said to be in 1NF if and only if every tuple contains exactly one (atomic) value for each attribute. o It states that an attribute of a table cannot hold multiple values. o It must hold only single-valued attribute. o First normal form disallows the multi-valued attribute, composite attribute, and their combinations. Page | 35 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Consider following example: Relation: Supplier (S# is Supplier ID, P# is Part ID) [RelationSCP in Un-normalized form] [Relation SCP in 1st Normal form] [1NF contains functional dependency (FD)]. In relation Supplier status and city are not fully functionally dependent on primary key {S#,P#} status is functionally dependent on city so, CITY → STATUS. status, city are not mutually independent. QTY is dependent on S# and P# so, {S#, P#} → {QTY} Multivalued dependency is also in 1NF. S#→→{status,city} Problems in 1NF are as follow: Page | 36 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Insert:- We cannot insert the fact that a particular supplier is located in a particular city until that supplies at least one part. For Example without knowing supplier S5 supplies which part we cannot insert that S5 is located in Rome. Delete:- If we delete tuple for particular supplier, we delete not only the shipment connection that supplier to a particular part but also information that a particular supplier is located in particular city. For example if we delete the tuple for supplier S3 supplying part P1 , then we also delete that S3 is located in Paris. Update:- In 1NF if we want to change the city of supplier S1 to Rome then we have to change all cities assigned to supplier S1 because of redundancy. So, it might be possible that one S1 contain the city London and other S1 contain the city Rome. These problems lead to 2NF. Second Normal Form (2NF) A relation is in 2NF if it is in 1NF and every non-key attributes dependent on primary key. 2NF is based on primary key concept. We can divide table SCP of 1NF in 2NF as follow. [Functional dependency in 2NF] Page | 37 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 We can say that given structure is in 2NF. Here, The S# in SUPPLIER relation is primary key. FD:s#→status s#→city city→status The S#, P# in SP relation is primary key. FD:{s#,p#}→Qty Three operations for the structure as follow: Insert:- We can insert the information that S5 is located in Rome even though S5 does not currently supply any parts. Delete:- We can delete the shipment connection S3 and P1 by deleting appropriate tuple from SP without losing information that S# is located in Paris. Here problem in table SUPPLIER is that, if we delete any tuple from SUPPLER for particular city then we loss the information for status of that city. Update:- In SUPPLIER relation for a given supplier city appears at once, so we can easily change the city from London to Rome for given supplier S1. But the status for a given city appears more than once in relation SUPPLIER. If we want to change the status of city then also face the problem of redundancy again. These problems lead to 3NF. Page | 38 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Page | 39 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 Third Normal Form (3NF) A relation is in 3NF if and only if it is in 2NF and every non key attribute is non transitively dependent on the primary key. We can divide or split the structure of 2NF into 3NF as follow. [Functional dependency in 3NF] Relation SC and CS both are in the 3NF. In those relations S# and CITY is primary key respectively. In this 3NF we can solve all problems of insert, delete and update as follow: Insert:- We can insert that the status of the Rome is 50 even though we Have no any supplier located in Rome. Delete:- We can also delete the particular city by deleting appropriate tuple from SC without losing the status information of that city. Update:- In 3NF CITY is the primary key of CS so we can change the Status of any city without facing the problem of redundancy. Page | 40 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 UNIT 4 Data Manipulation Commands Topics: ⮚ Adding Table Rows Saving Table Rows Listing Table ⮚ Rows Updating Table Rows Restoring Table Rows ⮚ Deleting Table Rows ⮚ Select Queries: With Conditional Restrictions Arithmetic Operators ← Logical Operators: AND, OR & NOT Special Operators: Between IS NULL Like IN Advanced Data Definition Commands: ⮚ Changing a Column's Data Types ⮚ Changing a Column's Data ⮚ Characteristics ⮚ Adding a Column ⮚ Dropping a Column ⮚ Column Advanced ⮚ Data Updates ⮚ Adding Primary and Foreign Key Deleting a Table from the Database ⮚ Aggregate Function SQL View What is SQL? SQL stands for Structured Query Language SQL lets you access and manipulate databases SQL is an ANSI (American National Standards Institute) standard Structured Query Language (SQL) is the language used to manipulate relational databases. SQL is tied very closely with the relational model. In the relational model, data is stored in structures called relations or tables. Each table has one or more attributes or columns that describe the table. In relational databases, the table is the fundamental building block of a database application. Tables are used to store data on Employees, Equipment, Materials, Warehouses, Purchase Orders, Customer Orders, etc. Columns in the Employee table, for example, might be Last Name, First Name, Salary, Hire Date, Social Security Number, etc. What Can SQL do? execute queries against a database retrieve data from a database insert records in a database update records in a database delete records from a database Page | 41 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 create new databases create new tables in a database create stored procedures in a database create views in a database set permissions on tables, procedures, and views DDL, DML, DCL, TCL DDL - Data Definition Language: statements used to define the database structure or schema. CREATE - to create objects in the database ALTER - alters the structure of the database DROP - delete objects from the database TRUNCATE - remove all records from a table, including all spaces allocated for the records are removed COMMENT - add comments to the data dictionary RENAME - rename an object DML - Data Manipulation Language: statements used for managing data within schema objects. SELECT - retrieve data from the a database INSERT - insert data into a table UPDATE - updates existing data within a table DELETE - deletes all records from a table, the space for the records remains DCL - Data Control Language. GRANT - gives user's access privileges to database REVOKE - withdraw access privileges given with the GRANT command TCL - Transaction Control: statements used to manage the changes made by DML statements. It allows statements to be grouped together into logical transactions. COMMIT - save work done Page | 42 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 SAVEPOINT - identify a point in a transaction to which you can later roll back ROLLBACK - restore database to original since the last COMMIT SET TRANSACTION - Change transaction options like isolation level and what rollback segment to use DML are not auto-commit. i.e. you can roll-back the operations, but DDL are auto-commit Basic Statements sequence of oracle for writing SQL. SELECT (COLUMN LIST) FROM (TABLE NAME) [WHERE (CONDITION )] [GROUP BY (COLUMN NAME)] [HAVING (CONDTION)] [ORDER BY (EXPRESSION)] Select The SELECT statement allows you to retrieve records from one or more tables in your database. This operation is also called projection. Syntax SELECT columns FROM tables WHERE predicates; Examples Select Queries Page | 43 Course Code: 4040233101 Course Name: Database Management System SEMESTER: 1 All columns select * from dept; Selected columns select LOC, DEPTNO from DEPT; Selected columns with Alias select Sname,Sno as Rollno,Maths as Vigaan from Student Add Selected columns and also perform Mathematical operations on column Select Sno,Sname,Eng+Maths as Total , Fees , Fees*0.10 as , Fees + Fees*0.10 after increment from Student Where Conditions Selected columns with > condition SELECT name, city, state FROM suppliers WHERE supplier_id > 1000 All columns with > and < condition SELECT * FROM emp where sal>2000 and sal