802 Chp 1 PDF - Database Concepts
Document Details
Uploaded by StaunchUranium6400
Tags
Summary
This document provides an introduction to database concepts and database management systems. It describes data and information, how databases are used in everyday life, and the advantages of using a database approach over traditional file processing.
Full Transcript
Unit - 1: Database Concepts 1.1 Introduction The key to organizational success is effective decision making which requires timely, relevant and accurate information. Hence information plays a critical role in today's competitive environment. Database Management Software (DBMS) si...
Unit - 1: Database Concepts 1.1 Introduction The key to organizational success is effective decision making which requires timely, relevant and accurate information. Hence information plays a critical role in today's competitive environment. Database Management Software (DBMS) simplifies the task of managing the data and extracting useful information out of it. In this chapter, we shall learn about the basic concepts of databases and also learn how to use DBMS for some applications. 1.2 Basic Concepts and Definitions Data is a collection of raw facts which have not been processed to reveal useful information. Information is produced by processing data as shown in Figure 1.2(a). Data Processing Information Figure 1.2(a): Data Processing For example, given the data of the test marks of all the students in a class (data), the average, maximum and minimum marks in the class can be used as indicators of the performance of the class (information). In other words, we can say that we have extracted the information about average, maximum and minimum marks for given student data in Figure 1.2(b). Marks obtained by students Compute: Average marks, Performance in an examination Maximum marks, Minimum marks of the class Data Processing Information Figure 1.2(b): Example of Data Processing Databases are being used extensively in our day-to-day life. Be it business, engineering, medicine, education, library, to name a few. For example, consider the name, class, roll number, marks in every subject of every student in a school. To record this information about every student in a school, the school might have maintained a register, or stored it on a hard drive using a computer system and software such as a spreadsheet or DBMS package. Such collection of related data that has been recorded, organized and made available for searching is called a Database. A database has the following properties: 1) A database is a representation of some aspect of the real world also called miniworld. Whenever there are changes in this miniworld they are also reflected in the database. 2) It is designed, built and populated with data for specific purpose. 3) It can be of any size and complexity. 4) It can be maintained manually or it may be computerized. 1 Database Management Applications 1.3 Need for a Database In traditional file processing, data is stored in the form of files. A number of application programs are written by programmers to insert, delete, modify and retrieve data from these files. New application programs will be added to the system as the need arises. For example, consider the Sales and Payroll departments of a company. One user will maintain information about all the salespersons in the Sales department in some file say File1 and another user will maintain details about the payroll of the salesperson in a separate file say File2 in the Payroll Department as shown in Figure 1.3(a). Sales Payroll Department Department Application Application Program Program File 1 File 2 Figure 1.3(a): Traditional File Processing System Although both the departments need information about the salesperson but they will store information about the salesperson in different files and will use different application programs to access those files. This would result in: 1. Data Redundancy: Same information is stored in more than one file. This would result in wastage of space. 2. Data Inconsistency: If a file is updated then all the files containing similar information must be updated else it would result in inconsistency of data. 3. Lack of Data Integration: As data files are independent, accessing information out of multiple files becomes very difficult. Database approach overcomes these problems and also adds a lot of advantages as discussed later. In database approach, a single repository of data is maintained which is accessed by different users as per their needs. 1.4 Database Management System (DBMS) A database management system is a collection of programs that enables users to create, maintain and use a database. It enables creation of a repository of data that is defined once and then accessed by different users as per their requirements. Thus there is a single repository of data which is accessed by all the application programs as shown below Figure 1.4(a). 2 Application Programs DBMS Software Stored Database & Its Definition Figure 1.4(a): DBMS Environment The various operations that need to be performed on a database are as follows: 1. Defining the Database: It involves specifying the data type of data that will be stored in the database and also any constraints on that data. 2. Populating the Database: It involves storing the data on some storage medium that is controlled by DBMS. 3. Manipulating the Database: It involves modifying the database, retrieving data or querying the database, generating reports from the database etc. 4. Sharing the Database: Allow multiple users to access the database at the same time. 5. Protecting the Database: It enables protection of the database from software/ hardware failures and unauthorized access. 6. Maintaining the Database: It is easy to adapt to the changing requirements. Some examples of DBMS are – MySQL, Oracle, DB2, IMS, IDS etc. 1.4.1 Characteristics of Database Management Systems The main characteristics of a DBMS are as follows: 1. Self-describing Nature of a Database System: DBMS contains not only the database but also the description of the data that it stores. This description of data is called meta- data. Meta-data is stored in a database catalogue or data dictionary. It contains the structure of the data and also the constraints that are imposed on the data. 2. Insulation Between Programs and Data: Since the definition of data is stored separately in a DBMS, any change in the structure of data would be done in the catalogue and hence programs which access this data need not be modified. This property is called Program-Data Independence. 3. Sharing of Data: A multiuser environment allows multiple users to access the database simultaneously. Thus a DBMS must include concurrency control software to allow simultaneous access of data in the database without any inconsistency problems. 3 Database Management Applications 1.4.2 Types of Users of DBMS DBMS is used by many types of users depending on their requirements and interaction with the DBMS. There are mainly four types of users: 1. End Users: Users who use the database for querying, modifying and generating reports as per their needs. They are not concerned about the working and designing of the database. They simply use the DBMS to get their task done. 2. Database Administrator (DBA): As the name implies, the DBA administers the database and the DBMS. The DBA is responsible for authoring access, monitoring its use, providing technical support, acquiring software and hardware resources. 3. Application Programmers: Application programmes write application programs to interact with the database. These programs are written in high level languages and SQL to interact with the database. 4. System Analyst: System analyst determines the requirements of the end users and then develops specifications to meet these requirements. A system analyst plays a major role in the database design and all the technical, economic and feasibility aspects. 1.4.3 Advantages of using DBMS Approach The need of DBMS itself explains the advantages of using a DBMS. Following are the advantages of using a DBMS: 1. Reduction in Redundancy: Data in a DBMS is more concise because of the central repository of data. All the data is stored at one place. There is no repetition of the same data. This also reduces the cost of storing data on hard disks or other memory devices. 2. Improved Consistency: The chances of data inconsistencies in a database are also reduced as there is a single copy of data that is accessed or updated by all the users. 3. Improved Availability: Same information is made available to different users. This helps sharing of information by various users of the database. 4. Improved Security: Though there is improvement in the availability of information to users, it may also be required to restrict the access to confidential information. By making use of passwords and controlling users' database access rights, the DBA can provide security to the database. 5. User Friendly: Using a DBMS, it becomes very easy to access, modify and delete data. It reduces the dependency of users on computer specialists to perform various data related operations in a DBMS because of its user friendly interface. 1.4.4 Limitations of using DBMS Approach The two main disadvantages of using a DBMS: 1. High Cost: The cost of implementing a DBMS system is very high. It is also a very time- consuming process which involves analyzing user requirements, designing the database specifications, writing application programs and then also providing training. 2. Security and Recovery Overheads: Unauthorized access to a database can lead to 4 threat to the individual or organization depending on the data stored. Also the data must be regularly backed up to prevent its loss due to fire, earthquakes, etc. Hence the DBMS approach is usually not preferred when the database is small, well defined, less frequently changed and used by few users. 1.5 Relational Database Various types of databases have been developed. One of them was relational database developed by E.F Codd at IBM in 1970. It is used to organize collection of data as a collection of relations where each relation corresponds to a table of values. Each row in the table corresponds to a unique instance of data and each column name is used to interpret the meaning of that data in each row. For example, consider EMPLOYEE table in Figure 1.5(a). Each row in this table represents facts about a particular employee. The column names – Name, Employee_ID, Gender, Salary and Date_of_Birth specify how to interpret the data in each row. EMPLOYEE Name Employee_ID Gender Salary Date_of_Birth Neha Mehta 1121 Female 20000 04-03-1990 Paras Bansal 2134 Male 25000 19-10-1993 Himani Verma 3145 Female 20000 23-11-1992 Figure 1.5(a): Employee Table In relational model, F A row is called a Tuple. F A column is called an Attribute. F A table is called as a Relation. F The data type of values in each column is called the Domain. F The number of attributes in a relation is called the Degree of a relation. F The number of rows in a relation is called the Cardinality of a relation. F Relation Schema R is denoted by R (A1, A2, A3,…, An) where R is the relation name and A1, A2, A3,….An is the list of attributes. F Relation State is the set of tuples in the relation at a point in time. A relation state r of relation schema R (A1, A2,..., An), denoted r(R) is a set of n-tuples r = {t1, t2,...., tm}, where each n-tuple is an ordered list of values t = , where vi is in domain of Ai or is NULL. Here n is the degree of the relation and m is the cardinality of the relation. 5 Database Management Applications Hence in Figure 1.5(a), F EMPLOYEE table is a relation. F There are three tuples in EMPLOYEE relation. F Name, Employee_ID, Gender, Salary, Date_of_Birth are attributes. F The domain is a set of atomic (or indivisible) values. The domain of a database attribute is the set of all the possible values that attribute may contain. In order to specify a domain, we specify the data type of that attribute. Following are the domain of attributes of the EMPLOYEE relation: (a) Name – Set of character strings representing names of persons. (b) Employee_ID–Set of 4-digit numbers (c) Gender – male or female (d) Salary – Number (e) Date_of_Birth – Should have a valid date, month and year. The birth year of the employee must be greater than 1985. Also the format should be dd-mm-yyyy. F The degree of the EMPLOYEE relation is 5 as there are five attributes in this relation. F The cardinality of the EMPLOYEE relation is 3 as there are three tuples in this relation. F Relation Schema – EMPLOYEE (Name, Employee_ID, Gender, Salary, Date_of_Birth) F Relation State –{, , } Some More Characteristics of Relations: 1. Ordering of tuples is not important in a Relation. Thus, the following relation in Figure 1.5(b) is same as the relation in Figure 1.5(a). EMPLOYEE Name Employee_ID Gender Salary Date_of_Birth Paras Bansal 2134 Male 25000 19-10-1993 Neha Mehta 1121 Female 20000 04-03-1990 Himani Verma 3145 Female 20000 23-11-1992 Figure 1.5(b): Employee Table (reordered tuples) 2. The ordering of attributes is also unimportant. 3. No two tuples of relation should be identical i.e. given any pair of two tuples, value in at least one column must be different. 6 4. The value in each tuple is an atomic value (indivisible). 5. If the value of an attribute in a tuple is not known or not applicable or not available, a special value called null is used to represent them. For example consider the following cases: F Unknown value: A person has a date of birth but it is not known at the time of data entry. F Unavailable value: A person has a home phone but does not want it to be listed. F Not applicable: College degree attribute would be NULL for a person who has no college degrees. In all the above cases NULL value would be used. A Relational DBMS (RDBMS) is a DBMS which is based on the relational model as discussed above. This is one of the most commonly used databases. Examples of RDBMS are Oracle, MySQL, IBM DB2. 1.5.1 Relational Model Constraints Constraints, are restrictions on the values, stored in a database based on the requirements. For example, in the relation EMPLOYEE, the Employee_ID must be a 4-digit number, the Date_of_Birth must be such that the birth year > 1985. We describe below various types of constraints in Relational model: 1. Domain Constraint: It specifies that the value of every attribute in each tuple must be from the domain of that attribute. For example, the Employee_ID must be a 4-digit number. Hence a value such as “12321” or “A234” violates the domain constraint as the former is not 4-digit long and the latter contains an alphabet. 2. Key Constraint: Before we can explain this constraint, we need to describe the terms superkey, key, candidate key and primary key. (i) Superkey is a set of attributes in a relation, for which no two tuples in a relation state have the same combination of values. Every relation must have at least one superkey which is the combination of all attributes in a relation. Thus for the EMPLOYEE relation, following are some of the superkeys: (a) {Name, Employee_ID, Gender, Salary, Date_of_birth} - default superkey consisting of all attributes. (b) {Name, Employee_ID, Date_of_Birth} (c) {Employee_ID, Gender, Salary} (d) {Name, Employee_ID, Gender} (e) {Employee_ID} 7 Database Management Applications However {Gender, Salary} is not a superkey because both these attributes have identical values for employees Neha and Himani. (ii) Key is the minimal superkey, which means it is that superkey of a relation from which if any attribute is removed then it no longer remains a superkey. For example the superkey {Name, Employee_ID, Gender}is not a key as we can remove Name and Gender from this combination and then what is left {Employee_ID} is still a Superkey. Now {Employee_ID} is a key as it is a superkey as well as no more removals are possible. A relation may have more than one key. Consider the relation PERSON with the following schema: PERSON (Aadhar_number, PAN, Voter_ID_cardno, Name, Date_of_birth, Address). This relation has three keys namely : {Aadhar_number}, {PAN}, {Voter_ID_no} as every individual in India has a unique Aadhar card number, PAN as well as Voter ID card number. (iii) Candidate key: A key as described above is called candidate key of the relation. For example, the PERSON relation has three candidate keys as discussed above. (iv) Primary Key: One of the candidate keys may be designated as Primary key. Primary key is used to identify tuples in a relation. If a relation has many candidate keys it is preferable to choose that one as primary key which has least number of attributes. Primary key are usually underlined in the schema of the relation. For example in the relation schema: PERSON (Aadhar_number, PAN, Voter_ID_cardno, Name, Date_of_birth, Address), Aadhar_number is the primary key. The relation between superkey, key, candidate key and primary key can be explained with the help of Figure 1.5(c). Primary Key Candidate Keys Keys Superkeys Figure 1.5(c): Superkey, Key, Candidate key and Primary key 3. Null Value Constraint: Sometimes it is required that certain attributes cannot have null values. For example, if every EMPLOYEE must have a valid name then the Name attribute is constrained to be NOT NULL. 4. Entity Integrity Constraint: This constraint specifies that primary key of a relation cannot have null value. The reason behind this constraint is that we know primary key contains no duplicates. However if we allow null values for a primary key then there can be multiple tuples for which primary key is having null values. This would imply that we are allowing duplicate values (NULL) for a primary key which itself violates the definition of primary key. 8 5. Referential Integrity Constraint: This constraint is specified between two relations. Before defining this constraint let us study the concept of foreign keys. Foreign key in a relation R1 is the set of attributes in R1 that refer to primary key in another relation R2 if the domain of foreign key attributes is same as that of primary key attributes and the value of foreign key either occurs as a value of primary key in some tuple of R2 or is NULL. R1 is called the referencing relation and R2 is called referenced relation, and a referential integrity constraint holds from R1 to R2. The main purpose of this constraint is to check that data entered in one relation is consistent with the data entered in another relation. For example, consider two relation schemas: Department (Dept_Name, Dept_ID, No_of_Teachers) Teacher (Teacher_Name, Teacher_ID, Dept_ID, Subject) Following are the primary keys (Underlined above): F Dept_ID is the primary key of Department relation. F Teacher_ID is the primary key of Teacher relation. Now you may notice that Dept_ID- the primary key of relation in Department, is also present in relation Teacher. The reason is that every teacher belongs to a particular department. Now that means Dept_ID of Teacher relation must have a value that exists in Dept_ID attribute of Department relation or it can be NULL in case a teacher has not yet been assigned to a department. We say that Dept_ID of Teacher relation is a foreign key that references primary key of Department relation (Dept_ID). It is important to emphasize it is not necessary to have same name for foreign key as of the corresponding referenced primary key. The above two schemas can also be written as follows: Department (Dept_Name, Dept_ID, No_of_Teachers) Teacher (Teacher_Name, Teacher_ID, Dept_No, Subject) Where Dept_No is the foreign key that references Dept_ID of Department relation. A foreign key may also refer to the same relation. For example suppose we have to create a database of all residents in a colony along with their best neighbors. Consider the following relation: Residents (Name, RID, Block_no, House_no, Floor, Neighbor_RID) The Primary key of this relation is RID (Resident ID). In order to store information about neighbor we have created a foreign key Neighbor_RID that references RID of Residents. Note that the referencing and referenced relation are same in this case. 1.6 Structured Query Language (SQL) SQL is a language that is used to manage data stored in a RDBMS. It comprises of a Data Definition Language (DDL) and a Data Manipulation Language (DML) where DDL is a 9 Database Management Applications language which is used to define structure and constraints of data and DML is used to insert, modify and delete data in a database. SQL commands are used to perform all the operations. In order to study SQL commands, a database system needs to be installed on the Computer. There are various softwares available. We will study the MySQL server. SQL uses the terms table, row and column for the relational model terms relation, tuple and attribute. For studying SQL we will use MySQL Community Server 5.6.20 which is freely downloadable. The most recent versions can be found on the website: http://dev.mysql.com/downloads/ Following are the steps to install and configure MySQL Community Server 5.6.20 for studying SQL commands. 1. Open the URL: http://dev.mysql.com/downloads/mysql/#downloads 2. Download the MySQL Community Server 5.6.20 available on the above webpage. You can select the platform (Windows/Linux) as shown below. MySQL Community Server 5.6.20 Select Platform: Microsoft Windows Recommended Download: All MySQL Products. For All Windows Platforms. In One Package Starting with MySQL 5.6 the MySQL Installer package replaces the server-only MSI packages. Windows (x86, 64-bit), MySQL Installer MSI Other Downloads: Windows (x86, 32-bit), MSI Installer 5.6.20 44.8M (mysql-5.6.20-win32.msi) MD5: 59abb64af27634abd0f65a60204b18ab Signature Windows (x86, 64-bit), MSI Installer 5.6.20 47.8M (mysql-5.6.20-win64.msi) MD5: 503dc2840c6732ae3e5dc80a3022f1a7 Signature Windows (x86, 32-bit), ZIP Archieve 5.6.20 337.6M (mysql-5.6.20-win32.zip) MD5: 28528dd2cecdd79d33deca2e1b7058e Signature Windows (x86, 64-bit), ZIP Archieve 5.6.20 342.9M (mysql-5.6.20-win64.zip) MD5: 08028e89f892534114550c75f57f3453 Signature 3. Once you have downloaded the file mysql-installer-community-5.6.20.0.msi, double click on the downloaded file and then click on the “Run” button. 10 4. MySQL Installer will start installing. Click on the “Install MySQL Products” option. MySQL Installer Welcome The MySQL installer guides you through the installation and configuration of your MySQL products. Run it from the Start Menu to perform maintenance tasks later. Select one of the actions below: Install MySQL Products Guide you through the installation and configuration of your MySQL products. About MySQL Lean more about MySQL products and better understand how you can benefit the most. Resources Get more information on how to install MySQL and configure it to run efficiently on you machine. Copyright © 2014, Oracle and/or its affiliates. All rights reserved, Oracle is a registered trademark of Oracle Corporation and/or its affiliated. Other names may be trademarks of their respective owners. 11 Database Management Applications 5. Check the option “I accept the license terms” and then Click on “Next” button. MySQL Installer License Agreement MySQL. Installer To install MySQL, you must accept the Oracle Software License Terms. GNU GENERAL PUBLIC LICENSE Vesrion2, June 1991 License Information Copyright (c) 1989, 1991 Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA Find latest products Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed. Setup Type Preamble ========= Check Requirements The licenses for most software are designed to take away your freedom to share and change it. But contrast, the GNU General Public License is Installation intended to guarantee your freedom to share and change free] software--to make sure the software is free for all its users. This Configuration General Public License applied to most of the Free Software Foundation’s software and to any other program whose authors commit to Complete using it. (Some other Free Software Foundation software is covered by the GNU Library General Public Licence instead.) You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for this service if you wish), that you receive source code or can get it if you want it, that you can change software or use pieces of it in new\ free programs; and that you know you can do these things. I accept the license terms < Back Next > Cancel 6. Then next click on the “Execute” button. MySQL Installer Find latest products MySQL. Installer Before continuing, the Installer will determine if updates are available for the products you are about to install or upgrade. License Information Connect to the Internet Find latest products Fetch product update information Setup Type Check Requirements Installation Configuration Complete Skip the check for updated (not recommended) < Back Execute Cancel 12 7. On successful execution, click on the “Next” button. MySQL. Installer Find latest products MySQL. Installer Before continuing, the Installer will determine if updates are available for the products you are about to install or upgrade. License Information Find latest products Connect to the Internet Fetch product update information Setup Type The operation is complete. Please click 'Next >' to continue. Check Requirements Installation Configuration Complete Skip the check for updates (not recommended) < Back Next > Cancel 8. Select the “Server only” option. Then click on “Next” button. MySQL Installer Choosing a Setup Type MySQL. Installer Please select the Setup Type that suits your use case. Developer Default Setup Type Description Installs all products needed for Installs only the MySQL Server. This type should License Information MySQL development purposes. be used where you want to deploy a MySQL Server, but will not be developing MySQL Find latest products applications. Server only Installs only the MySQL Server Setup Type product. Check Requirements Client only Installation Installs only the MySQL Client products, without a server. Configuration Full Installation Path: Complete Installs all included MySQL products and features. C:\Program Files\MySQL\ Custom Data Path: Manually select the products that C:\ProgramData\MySQL\MySQL Server 5.6\ should be installed on the system. < Back Next > Cancel 13 Database Management Applications 9. Installer will check for the requirements. If any requirements are required, you have to download them first before installing MySQL server. If all the requirements are met, then the following message will be displayed. Click “Next” to continue. MySQL Installer Check Requirements MySQL. Installer There are no additional requirements to be installed. Please click Next to continue with the product installation. License Information Find latest products Setup Type Check Requirements Installation Configuration Complete < Back Next > Cancel 10. Click on “Execute” to install MySQL Server 5.6.20. MySQL Installer Installation Progress MySQL. Installer The following products will be installed or updated. Product Status Progress Notes License Information MySQL Server 5.6.20 To be installed Find latest products Setup Type Check Requirements Installation Configuration Complete Click [Execute] to install or update the following packages < Back Execute > Cancel 14 11. On successful installation, Click on “Next” MySQL Installer Installation Progress MySQL. Installer The following products will be installed or updated. Product Status Progress Notes License Information MySQL Server 5.6.20 Install success Find latest products Setup Type Check Requirements Installation Configuration Complete Show Details > < Back Execute > Cancel 12. Click on “Next” to start initial configuration. MySQL Installer Configuration Overview MySQL. Installer The following products will now be configured. Product Action to be performed Progress License Information MySQL Server 5.6.20 Initial Configuration. Find latest products Setup Type Check Requirements Installation Configuration Complete Show Details > < Back Execute > Cancel 15 Database Management Applications 13. Select the following configurations and then Click “Next”. MySQL Installer MySQL Server Configuration 1/3 MySQL. Installer Server Configuration Type Choose the correct server configuration type for this MySQL Server installation. This setting will define how much system resources are assigned License Information to the MySQL Server instance. Find latest products Config Type: Development Machine Setup Type Enable TCP/IP Networking Check Requirements Enable this to allow TCP/IP networking. Only localhost connections through named pipes are allowed when this option is skipped. Installation Port Number: 3306 Configuration Open Firewall port for network access Complete Advanced Configuration Select the checkbox below to get additional configuration page where you can set advanced options for this server instance. Show Advanced Options < Back Next > Cancel 14. Type the MySQL root password (minimum 4 characters long) and then click “Next” MySQL Installer MySQL Server Configuration 2/3 MySQL. Installer Root Account Password Enter the password for the root account. Please remember to store this password in a secure place. License Information MySQL Root Password: Find latest products Repeat Password: Setup Type Password Strength: Weak Check Requirements Installation MySQL User Accounts Configuration Crate MySQL user accounts for your users and applications. Complete Assign a role to the user that consists of a set of privileges. MySQL Username Host User Role Add User Edit User Delete User < Back Next > Cancel 16 15. Click on “Next” on the following window. MySQL Installer MySQL Server Configuration 3/3 MySQL. Installer Windows Service Details Please specify a Windows Service name to be used for this MySQL License Information Server instance. A unique name is required for each instance. Find latest products Windows Service Name: MySQL56 Setup Type Start the MySQL Server at System Startup Check Requirements Run Windows Service as... Installation The MySQL Server needs to run under a given user account. Based on the security requirements of your system you need to pick one of the options below. Configuration Standard System Account Complete Recommended for most scenarios. Custom User An existing user account can be selected for advanced scenarios. < Back Next > Cancel 16. Installer will configure the server. On successful configuration, click on “Next”. MySQL Installer Configuration Overview MySQL. Installer The following products will now be configured. Product Action to be performed Progress License Information MySQL Server 5.6.20 Configuration Complete. Find latest products Setup Type Check Requirements Installation Configuration Complete Show Details > < Back Execute > Cancel 17 Database Management Applications 17. Then Click on “Finish”. The installation and configuration of MySQL Server 5.6.20 is now complete. You can now start using the server for creating and modifying databases. MySQL Installer Installation Complete MySQL. Installer The installation procedure has been completed. License Information Find latest products Setup Type Check Requirements Installation Configuration Complete Back Finish Cancel We give below SQL commands used to define and modify a database: 1. Create Table Command: This command is used to create a new table or relation. The syntax for this command is : CREATE TABLE ( [constraint] , [constraint], [constraint] ); where []=optional The keyword CREATE TABLE is followed by the name of the table that you want to create. Then within parenthesis, you write the column definition that consists of column name followed by its data types and optional constraints. There can be as many columns as you require. Each column definition is separated with a comma (,). All SQL statements should end with a semicolon (;). Table 1 shows the data types commonly used. 18 Data type Meaning Example CHAR (n) Fixed length character string. CHAR(5):“Ashok” 'n' is the number of characters. “Vijay” VARCHAR(n) Variable length character string. VARCHAR(15): 'n' is the maximum number of “Vijay Kumar” characters in the string. “Ashok Sen” DATE Date in the form of YYYY-MM-DD DATE: '2014-03-20' INTEGER 23 Integer number 56789 Fixed point number m represents the DECIMAL(5,2) : 999.99 DECIMAL (m, d) number of significant digits that are -567.78 stored for values and d represents DECIMAL (5) : 23456 the number of digits that can be 99999 stored following the decimal point. If d is zero or not specified then the value does not contains any decimal part. Table 1: Commonly used data types Suppose we wish to create a database of all the teachers working in a school. This database should include the Teacher relation (schema given below): Teacher (Teacher_ID, First_Name, Last_Name, Gender, Date_of_Birth, Salary, Dept_No) The above schema stores the information about all the teachers working in the school such as their unique ID, first and last name, gender, salary, date of birth and the department to which the teacher belongs. To create the above relations in SQL, following CREATE TABLE command is used: CREATE TABLE Teacher ( Teacher_ID INTEGER, First_Name VARCHAR(20), Last_Name VARCHAR(20), Gender CHAR(1), Salary DECIMAL(10,2), Date_of_Birth DATE, Dept_No INTEGER ); 19 Database Management Applications To create the above table in MySQL, click on MySQL 5.6 Command Line Client in the Start Menu. Following command line window will open: Enter the password as you have entered during the installation of MySQL community server 5.6.20. After entering the password, you can see the MySQL monitor: You can now type the SQL commands on the mysql> prompt. Now the next step is to create a database using the CREATE DATABASE command. CREATE DATABASE School; Once the database is created, you can check it in the list of databases that currently exist on the server by using the SHOW DATABASES command as shown below: In case you want to remove a database, this can be done by using the DROP DATABASE command. 20 Next step is to tell the server which database we will use for further statements. This can be done by using the USE command. Now you can create the Teacher table which would be associated with the School database. To verify you can use SHOW TABLES command which displays all the tables created in the current database. Database Constraints: DBMS can enforce several constraints for smooth operations on databases. These constraints can be specified while creating the table as shown below: 21 Database Management Applications a) NOT NULL: An attribute value may not be permitted to be NULL. For example, the First name of the Teacher cannot be NULL. Hence NOT NULL constraint can be specified in this case. CREATE TABLE TEACHER ( Teacher_ID INTEGER, First_NameVARCHAR(20) NOT NULL, Last_NameVARCHAR(20), Gender CHAR(1), Salary DECIMAL(10,2), Date_of_Birth DATE, Dept_No INTEGER ); b) DEFAULT : If a user has not entered a value for an attribute, then default value specified while creating the table is used. For example, if a teacher's salary has not been entered, then by default the database should store 40000 assuming that the minimum salary given to every teacher is ` 40000. This is illustrated as follows: CREATE TABLE TEACHER ( Teacher_ID INTEGER, First_Name VARCHAR(20) NOT NULL, Last_Name VARCHAR(20), Gender CHAR(1), Salary DECIMAL(10,2) DEFAULT 40000, Date_of_Birth DATE, Dept_No INTEGER ); In MySQL, if you want to look at the structure and description of the tables created, DESC command can be used. The description of the table Teacher created above is as follows: 22 c) CHECK: In order to restrict the values of an attribute within a range, CHECK constraint may be used. For example Dept_No of any teacher must not exceed 110. This can be specified as follows: CREATE TABLE TEACHER ( Teacher_ID INTEGER, First_Name VARCHAR(20) NOT NULL, Last_Name VARCHAR(20), Gender CHAR(1), Salary DECIMAL(10,2) DEFAULT 40000, Date_of_Birth DATE, Dept_No INTEGER CHECK (Dept_No 50000; Result: 33 Database Management Applications First_Name Last_Name Dept_Name Tarannum Malik Puneet NULL Vinita Ghosh Vansh NULL Logical comparisons equal to (=), less than (), less than or equal to (=), not equal to() can be used in the WHERE clause. (c) Query: To display Teacher_ID,First_Name,Last_Name and Dept_No of teachers who belongs to department number 4 or 7. SELECT Teacher_ID,First_Name,Last_Name, Dept_No FROM Teacher WHERE Dept_No = 4 OR Dept_No = 7; Result: Teacher_ID First_Name Last_Name Dept_No 103 Surbhi Bansal 4 104 Megha Khanna 4 109 Neha Singh 7 Thus Boolean operations AND,OR can also be used in the WHERE clause. (d) Query: To retrieve names of all the teachers and the names and numbers of their respective departments. Note that the above query requires two tables – Teacher and Department. Consider the following query: SELECT First_Name, Last_Name, Dept_ID, Dept_Name FROM Teacher, Department; This query will result in a set in which the number of rows will be the number of rows in Teacher table (14) multiplied with number of rows in Department table (9), i.e. 126 rows. Since we have not specified any WHERE clause, this query will combine each row in Teacher table with each row of Department table resulting in Cartesian product of two tables. This is called as Cross Join. Now if we have to combine each teacher with his/her respective department, there should be a connecting column between the two tables which can be used to join them. Thus Dept_No (Teacher table) and Dept_ID (Department table) can be used to join the two tables. This type of join will result in joining rows from Teacher and Department table 34 based on the equality between Dept_No and Dept_ID columns. The query is as follows: SELECT First_Name, Last_Name, Dept_ID, Dept_Name FROM Teacher, Department WHERE Dept_ID=Dept_No; Here Dept_ID=Dept_No is the join condition. This would produce the following result: First_Name Last_Name Dept_ID Dept_Name Shanaya Batra 1 Chemistry Namit Gupta 1 Chemistry Tarun Mehta 2 Computer Science Vansh NULL 2 Computer Science Alice Walton 3 English Puneet NULL 3 English Surbhi Bansal 4 Hindi Megha Khanna 4 Hindi Tarannum Malik 5 Physics Divya Chaudhary 6 Commerce Neha Singh 7 Biology Saurbh Pant 8 Mathematics Sumita Arora 9 Economics Vinita Ghosh 9 Economics Now suppose we have to retrieve the similar details for the teacher in Chemistry department, the query would be: SELECT First_Name, Last_Name, Dept_ID, Dept_Name FROM Teacher, Department WHERE Dept_ID=Dept_No AND Dept_name=”Chemistry”; First_Name Last_Name Dept_ID Dept_Name Shanaya Batra 1 Chemistry Namit Gupta 1 Chemistry (e) Suppose the teacher and department table both had same names for the department number, say Dept_ID as shown below: CREATE TABLE Department 35 Database Management Applications ( Dept_ID INTEGER PRIMARY KEY, Dept_Name VARCHAR (30) NOT NULL ); CREATE TABLE Teacher ( Teacher_ID INTEGER, First_Name VARCHAR(20) NOT NULL, Last_Name VARCHAR(20), Gender CHAR(1), Salary DECIMAL(10,2) DEFAULT 40000, Date_of_Birth DATE, Dept_ID INTEGER, CONSTRAINT TEACHER_PK PRIMARY KEY (Teacher_ID), CONSTRAINT TEACHER_FK FOREIGN KEY (Dept_ID) REFERENCES Department (Dept_ID) ); In such case, when the join condition is specified, there will be an ambiguity about which Dept_ID we are talking about. To resolve this problem, we have to prefix the name of the attribute with the relation name followed by a period as shown in the query below: Query: To retrieve names of all the teachers who belong to Hindi department. SELECT First_Name, Last_Name FROM Teacher, Department WHERE Department. Dept_ID=Teacher. Dept_ID AND Dept_Name="Hindi"; First_Name Last_Name Surbhi Bansal Megha Khanna Another method is to create aliases. Aliases are used to resolve ambiguity of the relations. They are created by using the keyword 'AS'. For example the above query can also be written as: SELECT First_Name, Last_Name FROM Teacher AS T, Department AS D WHERE D.Dept_ID = T. Dept_ID AND Dept_Name="Hindi"; Here T is an alias for Teacher table and D is an alias for Department table. We can also create alias to rename a column name as shown below: 36 SELECT First_Name AS Fname, Last_Name AS Lname FROM Teacher AS T, Department AS D WHERE D.Dept_ID = T. Dept_ID AND Dept_Name="Hindi"; Result: Fname Lname Surbhi Bansal Megha Khanna Note that the column names as displayed have been changed. (f) In SQL, duplicate tuples can appear more than once in a table and in the result of a query. However if the requirement is to list distinct values of an attribute then this can be done by using the keyword – 'DISTINCT'. For example, following query will list all the Department numbers corresponding to departments having male teachers. SELECT Dept_No FROM Teacher; WHERE GENDER ='M'; The result of the above query is: Dept_No 2 3 1 8 2 In the above result, 2 is appearing twice which is not required as the query is to find only the department numbers. Hence we can use DISTINCT keyword in the SELECT clause so that there is no repetition in the result. SELECT DISTINCT Dept_No FROM Teacher; WHERE GENDER ='M'; The result of the above query is: Dept_No 1 2 3 8 37 Database Management Applications (g) Sometimes it is required to match part of the string. This is called as string pattern matching. We can use 'LIKE' keyword along with two more reserved characters - % (percent) and _ (underscore) for specifying different number of characters. % replaces zero or more number of random characters and _ replaces a single character. Some examples: F Query: To retrieve names of all the teachers starting from letter 'S'. SELECT First_Name FROM Teacher WHERE First_Name LIKE "S%"; Result: First_Name Shanaya Surbhi Saurabh Sumita F Query: To retrieve names of all the teachers having 6 characters in the first name and starting with 'S'. SELECT First_Name FROM Teacher WHERE First_Name LIKE "S_ _ _ _ _"; Result: First_Name Surbhi Sumita F Query: To retrieve names of all the teachers having at least 6 characters in the first name. SELECT First_Name FROM Teacher WHERE First_Name LIKE "_ _ _ _ _ _%"; Result: First_Name Shanaya Surbhi Tarannum 38 Puneet Saurabh Sumita Vinita (h) Suppose it required to sort the result of a query based on some attributes. This can be achieved by using the clause – ORDER BY followed by the attributes which needs to be sorted. (By default the order is ascending) For ascending order the keyword ASC and for descending order the keyword DESC is used. By default the order is ascending. F Query: To list the names of teachers in alphabetical order. SELECT First_Name, Last_Name FROM Teacher ORDER BY First_Name, Last_Name; Result: First_Name Last_Name Alice Walton Divya Chaudhary Megha Khanna Namit Gupta Neha Singh Puneet NULL Saurabh Pant Shanaya Batra Sumita Arora Surbhi Bansal Tarannum Malik Tarun Mehta Vansh NULL Vinita Ghosh F Query: To list the names of all the Departments in the descending order of their names. SELECT Dept_Name FROM Department ORDER BY Dept_Name DESC; Result: 39 Database Management Applications Dept_Name Physics Mathematics Hindi English Economics Computer Science Commerce Chemistry Biology F Query: To retrieve the names and department numbers of all the teachers ordered by the Department number and within each department ordered by the names of the teachers in descending order. SELECT First_Name, Last_Name, Dept_No FROM Teacher ORDER BY Dept_No ASC, First_Name DESC, Last_Name DESC; First_Name Last_Name Dept_No Shanaya Batra 1 Namit Gupta 1 Tarun Mehta 2 Vansh NULL 2 Alice Walton 3 Puneet NULL 3 Surbhi Bansal 4 Megha Khanna 4 Tarannum Malik 5 Divya Chaudhary 6 Neha Singh 7 Saurabh Pant 8 Sumita Arora 9 Vinita Ghosh 9 (i) To test whether a value is unavailable or unknown or not applicable in a column (i.e, NULL values), SQL allows us to test this condition using keywords IS NULL and IS NOT 40 NULL. Query: To retrieve all the details of those employees whose last name is not specified. SELECT * FROM Teacher WHERE Last_Name IS NULL; Result: Teacher_ID First_Name Last_Name Gender Salary Date_of_Birth Dept_No 107 Puneet NULL M 52500.00 1976-09-25 3 104 Vansh NULL M 53500.00 1982-05-04 2 (j) We can have another query in the WHERE clause of SQL query if the condition is based on the result of another query as shown below: Query: To retrieve the names of all the departments having female teachers. SELECT DISTINCT Dept_Name FROM Department WHERE Dept_ID IN (Select Dept_No FROM Teacher WHERE Gender = 'F'); Result: Dept_Name Chemistry English Hindi Physics Commerce Biology Economics The above query is a nested query (query within another query). The outer one is called outer query and the inner one is called as inner query. The inner query will return those department ID's which have female teachers and the outer query will retrieve the names of those departments respectively. We could have also written the above query by using JOIN condition as shown below: SELECT DISTINCT Dept_Name 41 Database Management Applications FROM Department , Teacher WHERE Dept_ID = Dept_No AND Gender='F'; (k) Sometimes it is required to apply certain mathematical functions on group of values in a database. Such functions are called Aggregate Functions. For example retrieving the total number of teachers in all the Departments. Following are the commonly used built-in aggregate functions: F COUNT- It counts the numbers of tuples in the result of the query. F SUM – It finds the sum of all the values for a selected attribute which has numeric data type. F MAX –It finds the maximum value out of all the values for a selected attribute which has numeric data type. F MIN - It finds the minimum value out of all the values for a selected attribute which has numeric data type. F AVG – It finds the average value of all the values for a selected attribute which has numeric data type. Examples: F Query: To find total salary of all the teachers. SELECT SUM(Salary) AS Total_Salary FROM Teacher; Result: Total_Salary 203250.00 The above query finds the sum of all the values in the Salary column and the column is renamed via an alias as Total_Salary which is optional as it only enhances readability. F Query: To find the maximum and minimum salary. SELECT MAX(Salary) AS Max_Salary, MIN(Salary) AS Min_Salary FROM Teacher; Result: Max_Salary Min_Salary 54000.00 34000.00 42 F Query: To count the number of teachers earning more than Rs 40000. SELECT COUNT(Salary) FROM Teacher WHERE Salary > 40000; Result: COUNT (Salary) 9 The above query can also be written as follows: SELECT COUNT(*) FROM Teacher WHERE Salary >40000; The difference is in the use of asterisk (*). Asterisk symbol is used to count the number of rows in the result of the query. F Query: To retrieve the number of teachers in “Computer Science” Department. SELECT COUNT(*) AS No_of_Computer_Science_Teachers FROM Department, Teacher WHERE Dept_Name = "Computer Science"AND Dept_No=Dept_ID; Result: No_of_Computer_Science_Teachers 2 (l) We can also use arithmetic operators in the SELECT clause. For example, if we want to display Teacher name, current salary and a 10% increase in the salary for those teachers who belongs to Department number 4, the SELECT statement can be written as shown below: SELECT First_Name, Last_Name, Salary, Salary*1.1 AS New_Salary FROM Teacher WHERE Dept_No = 4; Result: First_Name Last_Name Salary New_Salary Surbhi Bansal 34000.00 37400.00 Megha Khana 38000.00 41800.00 43 Database Management Applications (m) Grouping based on an attribute can be done in SQL. For such grouping, GROUP BY clause is added in the SQL query. For example, we have to find the number of teachers teaching in each Department. Thus we have to group the result based on the Departments and for each Department we have to count number of teachers who teach in that Department. This query is written by using GROUP BY clause and aggregate function as shown below: SELECT Dept_No, COUNT(*) AS No_of_Teachers FROM Teacher GROUP BY Dept_No; Result: Dept_No No_of_Teachers 1 2 2 2 3 2 4 2 5 1 6 1 7 1 8 1 9 2 The above result can be enhanced if we display the name of Departments also as shown below: SELECT Dept_No, Dept_Name, COUNT(*) AS No_of_Teachers FROM Teacher, Department WHERE Dept_ID = Dept_No GROUP BY Dept_No; Result: Dept_No Dept_Name No_of_Teachers 1 Chemistry 2 2 Computer Science 2 3 English 2 4 Hindi 2 5 Physics 1 44 6 Commerce 1 7 Biology 1 8 Mathematics 1 9 Economics 2 Also it is important to note that the attribute used for grouping (Dept _No in the above query) must also be present in the SELECT clause. It is also possible to apply some condition on the group. This condition will not come under the WHERE clause, but in a new clause HAVING. For example, we have to find those departments which have more than one teacher. SELECT Dept_No, Dept_Name, COUNT(*) AS No_of_Teachers FROM Teacher, Department WHERE Dept_No=Dept_ID GROUP BY Dept_No HAVING COUNT(*) > 1; Result: Dept_No Dept_Name No_of_Teachers 1 Chemistry 2 2 Computer Science 2 3 English 2 4 Hindi 2 9 Economics 2 (n) A Select command can also result in an empty set. For example, retrieve the name of Teacher with ID=115. Since there is no such teacher in the Teacher table, following query results in an empty set. SELECT * FROM Teacher WHERE Teacher_ID = 115; Result: Empty set 45 Database Management Applications Exercise: Q1. Consider the following Employee table: Table Name: Employee Employee_ID Employee_Name Job_Title Salary Bonus Age Manager_ID 1201 Divya President 50000 NULL 29 NULL 1205 Amyra Manager 30000 2500 26 1201 1211 Rahul Analyst 20000 1500 23 1205 1213 Manish Salesman 15000 NULL 22 1205 1216 Megha Analyst 22000 1300 25 1201 1217 Mohit Salesman 16000 NULL 22 1205 The primary key of this table is Employee_ID and Manager_ID is a foreign key that references Employee_ID. Write SQL commands for the following: (a) Create the above table. (b) Insert values as shown above. (c) Delete the Employee having Employee_ID 1217. (d) Update the salary of “Amyra” to 40000. (e) Alter the table Employee so that NULL values are not allowed for Age column. (f) Write a query to display names and salaries of those employees whose salary are greater than 20000. (g) Write a query to display details of employees who are not getting any bonus. (h) Write a query to display the names of employees whose name contains “a” as the last alphabet. (i) Write a query to display the name and Job title of those employees whose Manager_ID is 1201. (j) Write a query to display the name and Job title of those employees whose Manager is “Amyra”. (k) Write a query to display the name and Job title of those employees aged between 26 years and 30 years (both inclusive) Q2. A Railway company uses machines to sell tickets. The machine details and daily sales information are recorded in two tables: 46 Table Name: Machine Field Data Type Machine_ID CHAR (3) Station CHAR (30) Table Name: Sales Field Data Type Machine_ID CHAR (3) Date DATE Tickets_Sold INTEGER Income DECIMAL (8, 2) The primary key of the table Machine is Machine_ID. Records in the table Sales are uniquely identified by the fields Machine_ID and Date. (a) Create the tables Machine and Sales. (b) Write a query to find the number of ticket machines in each station. (c) Write a query to find the total ticket income of the station “New Delhi” of each day. (d) Write a query to find the total number of tickets sold by the machine (Machine_ID = 122) till date. 47 Database Management Applications