W10 2024 CMT221- ch07- INTRO TO SQL PART 2 (ELEARN) PDF
Document Details
Uploaded by SophisticatedBigBen1448
Universiti Sains Malaysia
Dr Azleena Mohd Kassim
Tags
Summary
Chapter 7 of a Database Organization and Design document titled Introduction to Structured Query Language (SQL). It includes learning objectives and introduces introductory SQL concepts.
Full Transcript
CHAPTER 7 INTRODUCTION TO STRUCTURED QUERY LANGUAGE (SQL) DR AZLEENA MOHD KASSIM CMT221 DATABASE ORGANISATION AND DESIGN LEARNING OBJECTIVES o After completing this chapter, you will be able to: o Retrieve specified colum...
CHAPTER 7 INTRODUCTION TO STRUCTURED QUERY LANGUAGE (SQL) DR AZLEENA MOHD KASSIM CMT221 DATABASE ORGANISATION AND DESIGN LEARNING OBJECTIVES o After completing this chapter, you will be able to: o Retrieve specified columns of data from a database o Join multiple tables in a single SQL query o Restrict data retrievals to rows that match complex criteria o Aggregate data across groups of rows o Create subqueries to preprocess data for inclusion in other queries o Identify and use a variety of SQL Functions for string, numeric, and date manipulation o Explain the key principles in crafting a SELECT Query INTRODUCTION TO SQL o Table 7.1 © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. ORDER BY CLAUSE OPTIONS © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. ORDER BY CLAUSE OPTIONS oORDER BY clause is especially useful when the listing order is important o Syntax: SELECT columnlist FROM tablelist [ORDER BY columnlist [ASC|DESC] ]; o Cascading order sequence o 1. ORDER BY last name o 2. Within matching last names, ORDER BY first name o 3. Within matching first and last names, ORDER BY middle initial © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. ORDER BY Clause Options © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. ORDER BY CLAUSE OPTIONS List product description, vendor code, in date and price entered before 21-01-2012 and with price less than or equal to $50.00. Order the list by vendor code and price in descending order. © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. WHERE CLAUSE OPTIONS WHERE CLAUSE OPTIONS oSelecting rows with conditional restrictions o WHERE clause is used to add conditional restrictions to the SELECT statement that limit the rows returned by the query o Syntax: SELECT columnlist FROM tablelist [WHERE conditionlist ] [ORDER BY columnlist [ASC | DESC] ]; oUsing comparison operators on character attributes o May be used to place restrictions on character-based attributes oUsing comparison operators on dates o Date procedures are often more software-specific than other SQL procedures © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. WHERE CLAUSE OPTIONS oLogical operators: AND, OR, and NOT o SQL allows you to include multiple conditions in a query through the use of these logical operators o Boolean algebra is dedicated to the use of logical operators oOld-style joins o Generally not recommended o Make complex queries more difficult to maintain o Susceptible to undetected errors oSpecial operators o BETWEEN o IN o LIKE o IS NULL o NOT The LIKE Special Operator THE LIKE SPECIAL OPERATOR o The LIKE Special Operator is used in conjunction with wildcards to find patterns within string attributes. o use the per cent sign (%) and underscore (_) wildcard characters to make matches when the entire string is not known. o % means any and all following or preceding characters are eligible Example attribute name in employee table: Johnson, Jones, Jennifer, July, Jambu SELECT * FROM employees WHERE name LIKE 'J%'; o 'J%' includes Johnson, Jones, Jennifer, July, and Jambu. SELECT * FROM employees WHERE name LIKE ‘Jo%'; o 'Jo%' includes Johnson and Jones. SELECT * FROM employees WHERE name LIKE ‘_ _n%'; o _ _n%' includes Jones and Jennifer because they have ‘n' as the third character AGGREGATE PROCESSING AGGREGATE PROCESSING oTakes a collection of rows and reduces it to a single row o SQL provides useful aggregate functions that count, find minimum and maximum values, calculate averages, etc. oAggregate functions o Count o MIN and MAX o SUM and AVG oGrouping data o GROUP BY clause syntax: SELECT columnlist FROM tablelist [WHERE conditionlist ] [GROUP BY columnlist ] [ORDER BY columnlist [ASC | DESC] ]; AGGREGATE PROCESSING oHAVING clause o Operates very much like the WHERE clause in the SELECT statement o HAVING clause is applied to the output of a GROUP BY operation o Syntax: SELECT columnlist FROM tablelist [WHERE conditionlist ] [GROUP BY columnlist ] [HAVING conditionlist ] [ORDER BY columnlist [ASC | DESC] ]; © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. AGGREGATE PROCESSING Table 7.7 Some Basic SQL Aggregate Functions Function Output COUNT The number of rows containing non-null values MIN The minimum attribute value encountered in a given column MAX The maximum attribute value encountered in a given column SUM The sum of all values for a given column AVG The arithmetic mean (average) for a specified column AGGREGATE PROCESSING CRAFTING SELECT QUERIES oBuild one clause at a time o FROM o WHERE o GROUP BY o HAVING o SELECT o ORDER BY CRAFTING SELECT QUERIES (GROUP BY + HAVING) CRAFTING SELECT QUERIES (GROUP BY + HAVING) SUBQUERIES SUBQUERIES oKey characteristics o A subquery is a query (SELECT statement) inside another query o A subquery is normally expressed inside parentheses o The first query in the SQL statement is known as the outer query o The query inside the SQL statement is known as the inner query o The inner query is executed first o The output of an inner query is used as the input for the outer query o The entire SQL statement is sometimes referred to as a nested query oSubquery can return one or more values o One single value (one column and one row) o A list of values (one column and multiple rows) o A virtual table (multicolumn, multirow set of values) © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SUBQUERIES ▪WHERE subqueries ▪IN subqueries ▪HAVING subqueries ▪Multirow subqueries ▪FROM subqueries ▪Attribute subqueries ▪Correlated subqueries © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SUBQUERIES oWHERE subqueries o Most common type of subquery uses an inner SELECT subquery on the right side of a WHERE comparison expression oIN subqueries o IN operator: used to compare a single attribute to a list of values o IN subquery: values are not known beforehand, but can be derived using a query oHAVING subqueries o HAVING clause: used to restrict the output of a GROUP BY query by applying conditional criteria to the grouped rows oMultirow subquery operators: ALL and ANY o ALL operator compares a single value with a list of values returned by the first subquery using a comparison operator other than equals o ANY operator compares a single value to a list of values and select only the rows greater than or less than any value in the list © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SUBQUERIES o FROM subqueries o FROM clause specifies the table(s) from which the data will be drawn oAttribute list subqueries o Inline subquery: subquery expression o Example: can be used to list the difference between each product’s price and the average product price oCorrelated subquery o Executes once for each row in the outer query o Inner query is related to the outer query; the inner query references a column of the outer subquery o Can also be used with the EXISTS special operator o Can be used whenever there is a requirement to execute a command based on the result of another query o Can be used with uncorrelated subqueries, but it is almost always used with correlated subqueries © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SUBQUERIES A subquery (also called a nested query) is a query embedded within another SQL query. Outer Query SELECT column1, column2,... FROM table WHERE condition ( SELECT column-list FROM table Inner Query WHERE subquery_condition ); Enclosed in Parentheses: Subqueries are always enclosed in parentheses to separate them from the outer query Inner query will be executed first Output from inner query – used as the input for outer query SUBQUERIES -EXAMPLE SELECT a.studentid, a.name, b.total_marks FROM student a, marks b WHERE a.studentid = b.studentid AND b.total_marks > (SELECT total_marks FROM marks WHERE studentid = 'V002'); WHERE a.studentid = b.studentid AND b.total_marks > 80 SUBQUERIES -EXAMPLE SELECT a.studentid, a.name, b.total_marks FROM student a, marks b WHERE a.studentid = b.studentid AND b.total_marks > (SELECT total_marks FROM marks WHERE studentid = 'V002'); For better readability, the query can be rewritten using explicit INNER JOIN SELECT a.studentid, a.name, b.total_marks FROM student a INNER JOIN marks b ON a.studentid = b.studentid WHERE b.total_marks > (SELECT total_marks FROM marks WHERE studentid = 'V002'); oUKEA HOME Supplies – Extended Database UKEA HOME SUPPLIES - OVERVIEW Table: PRODUCT Table: LINE Table: VENDOR UKEA HOME SUPPLIES - OVERVIEW Table: CUSTOMER Table: INVOICE SUBQUERIES Take notes in CLASS for the answer SUBQUERIES When a inner subquery returns a NULL , the outer query also returns NULL SELECT P_CODE FROM PRODUCT a Single Value (1 Column, 1 Row) WHERE P_PRICE >= ( Use Case: SELECT AVG (P_PRICE) The subquery returns one scalar value (one column and one row) that the outer query uses. FROM PRODUCT ); a List of Values (1 Column, Multiple Rows) SELECT V_CODE FROM VENDOR Use Case: WHERE V_CODE IN ( The subquery returns multiple rows of values SELECT V_CODE FROM PRODUCT ); from a single column, often used with IN, ANY, or ALL. a Virtual Table (Multiple Columns and SELECT MAX(C_PRODUCT) Multiple Rows) FROM ( SELECT V_CODE, COUNT(P_CODE) AS C_PRODUCT Use Case: FROM PRODUCT GROUP BY V_CODE The subquery behaves like a temporary table (or ) AS P_COUNT; inline view) and can be used in the FROM clause. SUBQUERIES WHERE Subqueries → filter rows based on a condition that involves a subquery Find all products with a price (P_PRICE) greater than or equal to the average product price. SELECT P_CODE, P_PRICE FROM PRODUCT WHERE P_PRICE >= ( SELECT AVG(P_PRICE) FROM PRODUCT ); Calculates the average price Filters and retrieves only those (AVG(P_PRICE)) of all products in the products where P_PRICE is greater than PRODUCT table → AVG = 56.421 or equal to 56.421 SUBQUERIES WHERE Subqueries List all customers who ordered the product "Claw hammer". SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME FROM CUSTOMER JOIN INVOICE USING (CUS_CODE) JOIN LINE USING (INV_NUMBER) JOIN PRODUCT USING (P_CODE) WHERE P_CODE = ( SELECT P_CODE FROM PRODUCT LINE table WHERE P_DESCRIPT = 'Claw hammer’ ); CUS_CODE CUS_LNAME CUS_FNAME 10011 Dunne Leona 10014 Orlando Myron SUBQUERIES IN Subqueries IN subqueries are used to filter rows based on a list of values obtained from a subquery Lists all customers who have purchased a “hammer” or any kind of saw or saw blade. SELECT DISTINCT CUS_CODE, CUS_LNAME, CUS_FNAME FROM CUSTOMER JOIN INVOICE USING (CUS_CODE) compare the P_CODE not to one JOIN LINE USING (INV_NUMBER) product code (single value) but to a list JOIN PRODUCT USING (P_CODE) of product code values WHERE P_CODE IN ( SELECT P_CODE FROM PRODUCT to compare a single attribute to a list of WHERE P_DESCRIPT LIKE '%hammer%’ values, you use the IN operator OR P_DESCRIPT LIKE '%saw%’ ); SUBQUERIES HAVING Subqueries HAVING clause is used to restrict the output of a GROUP BY query by applying a conditional criteria to the grouped rows List all products with the total quantity sold greater than the average of quantity sold. We need to list all products where the total quantity sold (sum of quantities for each product) is greater than the average of quantity sold (average quantity sold across all products). Products (P_CODE): We need to group the data by products. Total quantity sold: For each product, we need to calculate the total quantity sold. Average quantity sold: We need to calculate the average of the total quantities sold across all products. Filtering based on the average: We need to filter products whose total quantity sold is greater than the average. SUBQUERIES HAVING Subqueries List all products with the total quantity sold greater than the average of quantity sold. Start with the Basic Query SELECT P_CODE, SUM(LINE_UNITS) FROM LINE GROUP BY P_CODE total quantity sold for each product (P_CODE). SUBQUERIES HAVING Subqueries List all products with the total quantity sold greater than the average of quantity sold. Calculate the Average Quantity Sold SELECT AVG(LINE_UNITS) FROM LINE; single value representing the average quantity sold across all products → AVG (LINE_UNITS) = 2.56 SUBQUERIES HAVING Subqueries List all products with the total quantity sold greater than the average of quantity sold. Combine the Total Quantity Sold and Average Quantity Sold SELECT P_CODE, SUM(LINE_UNITS) FROM LINE GROUP BY P_CODE HAVING SUM(LINE_UNITS) > ( SELECT AVG(LINE_UNITS) FROM LINE ); SUM (LINE_UNITS) > 2.56 SUBQUERIES Multirow Subquery Operators ALL ALL → Return true if all the subquery values meet the condition Suppose you want to list all products with a product cost that is greater than all individual product costs for products provided by vendors from Florida (FL). Find all products. Compare the cost of each product (calculated as P_QOH * P_PRICE, which represents the product's total cost based on quantity on hand and price). List products whose cost is greater than all products provided by vendors from Florida (FL). SUBQUERIES Multirow Subquery Operators ALL ALL → Return true if all the subquery values meet the condition Suppose you want to list all products with a product cost that is greater than all individual product costs for products provided by vendors from Florida (FL). List All Products with Their Cost Identify Products from Vendors in Florida SELECT P_CODE, P_QOH * P_PRICE SELECT V_CODE FROM PRODUCT; FROM VENDOR WHERE V_STATE = 'FL'; Compare Each Product's Cost with Products from Vendors in Florida SELECT P_QOH * P_PRICE FROM PRODUCT WHERE V_CODE IN ( SELECT V_CODE FROM VENDOR WHERE V_STATE = 'FL’ ); SUBQUERIES Multirow Subquery Operators ALL → Return true if all the subquery values meet the condition ALL Suppose you want to list all products with a product cost that is greater than all individual product costs for products provided by vendors from Florida (FL). Use ALL to Filter Products with Greater Cost SELECT P_QOH * P_PRICE FROM PRODUCT SELECT P_CODE, P_QOH * P_PRICE FROM PRODUCT WHERE V_CODE IN ( SELECT V_CODE WHERE P_QOH * P_PRICE > ALL ( FROM VENDOR WHERE V_STATE = 'FL’ ); SELECT P_QOH * P_PRICE FROM PRODUCT WHERE V_CODE IN ( SELECT V_CODE FROM VENDOR WHERE V_STATE = 'FL’ ) ); SUBQUERIES Multirow Subquery Operators ALL SELECT P_CODE, P_QOH * P_PRICE FROM PRODUCT WHERE P_QOH * P_PRICE > ALL ( SUBQUERY A SELECT P_QOH * P_PRICE FROM PRODUCT WHERE V_CODE IN ( SELECT V_CODE SUBQUERY B FROM VENDOR WHERE V_STATE = 'FL’ ) ); > 2159.1 SUBQUERIES Multirow Subquery Operators ALL If the same Question but change to < ALL “Less than ALL” is equivalent to all values “less than < 467.4 lowest in the list” SUBQUERIES Multirow Subquery Operators Return true if any of the subquery values meet the condition ANY Select only rows where a value is greater than or less than any value in a list List products with cost more than any individual products provided by vendors from Florida SUBQUERIES Attribute List Subqueries An Attribute List Subquery is a type of subquery that returns a list of attributes (columns) rather than a single value. SELECT statement uses the attribute list to indicate what columns to project in the resulting set SUBQUERIES Correlated Subqueries Executes once for each row in the outer query Inner query is related to the outer query; the inner query references a column of the outer subquery List all product sales in which the units sold value is greater than the average units sold value for that product (as opposed to the average for all products). STEPS: 1. Compute the average-units-sold value for a product. 2. Compare the average computed in Step 1 to the units sold in each sale row and then select only the rows in which the number of units sold is greater. SUBQUERIES Correlated Subqueries SUBQUERIES oCorrelated Subqueries SQL Functions SQL FUNCTIONS oSQL functions are very useful tools o Many types oDate and time functions o All date functions take one parameter of a date or character data type and return a value © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SQL FUNCTIONS oNumeric functions o Can be grouped in many different ways, such as algebraic, trigonometric, and logarithmic © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SQL FUNCTIONS oString functions o Among the most-used functions in programming © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SQL FUNCTIONS oString functions SQL FUNCTIONS oConversion functions o Allow you to take a value of a given data type and convert it to the equivalent value in another data type RELATIONAL SET OPERATORS © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. RELATIONAL SET OPERATORS oUNION o Combines rows from two or more queries without including duplicate rows o Syntax: query UNION query oUNION ALL o Used to produce a relation that retains the duplicate rows o Used to unite more than just two queries © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. RELATIONAL SET OPERATORS oINTERSECT o Can be used to combine rows from two queries, returning only the rows that appear in both sets o Syntax: query INTERSECT query © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. RELATIONAL SET OPERATORS oEXCEPT (MINUS) o Combines rows from two queries and returns only the rows that appear in the first set but not in the second o Syntax: query EXCEPT query or query MINUS query © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. RELATIONAL SET OPERATORS UNION NO DUPLICATE Query: SELECT student_id, interest_name, 'Music' AS category FROM music_interests UNION SELECT student_id, interest_name, 'Sport' AS category You are given two tables, music_interests and FROM sports_interests; sports_interests, each containing a list of student interests. The music_interests table contains the music-related interests, and the sports_interests table RESULTS: contains the sports-related interests. Neither table has an interest_type column. student_id interest_name category 1 K-pop Music 2 EDM Music Write a SQL query that combines both tables using 3 Rock Music UNION. In the output, add a column called category to indicate whether the interest comes from the 4 Classical Music Music music_interests table or the sports_interests table. 101 Football Sport The final result should contain all student interests 102 Badminton Sport from both categories, with no duplicates. 3 Rock Climbing Sport 104 Basketball Sport RELATIONAL SET OPERATORS UNION ALL WITH DUPLICATE RELATIONAL SET OPERATORS INTERSECT List the idol_name and group_name, where the idol_id is in the result of the I N T E R S E C T w i t h t h e c o l l a b o r a t i o n s t a b l e. Yo u n e e d t o f i n d i d o l s p a r t i c i p a t i n g i n dance and vocal collaborations. SELECT idol_name, group_name FROM idols WHERE idol_id IN ( SELECT idol_id FROM collaborations WHERE collab_type = 'Dance’ INTERSECT SELECT idol_id FROM collaborations WHERE collab_type = 'Vocal' ); Final Result: idol_name group_name Jisoo Blackpink Jennie Blackpink Jungkook BTS List the idol_id of idols who are part of Blackpink group and have participated in Dance collaborations (from the collaborations table). SELECT idol_id, idol_name FROM idols WHERE group_name = 'Blackpink' INTERSECT SELECT idol_id, idol_name FROM collaborations JOIN idols ON collaborations.idol_id = idols.idol_id WHERE collab_type = 'Dance'; Final Result: idol_id idol_name 1 Jisoo 2 Jennie 3 Lisa RELATIONAL SET OPERATORS MINUS ACTIVITY Q2: What Will happen to the result if the query is modified this way? RELATIONAL SET OPERATORS Q2: What Will happen to the result if the query is modified this way? CRAFTING SELECT QUERIES oKnow your data o The importance of understanding the data model that you are working in cannot be overstated o Real-world databases are messy; most database systems remain in service in an organization for decades oKnow the problem o Understand the question you are attempting to answer o Information reporting requests will come from a range of sources; may be one-time events or ongoing operations within an application SUMMARY oSQL commands can be divided into two overall categories: data definition language (DDL) commands and data manipulation language (DML) commands oThe ANSI standard data types are supported by all RDBMS vendors in different ways o The basic data types are NUMBER, NUMERIC, INTEGER, CHAR, VARCHAR, and DATE oThe SELECT statement is the main data retrieval command in SQL oThe column list represents one or more column names separated by commas oOperations that join tables can be classified as inner joins and outer joins oA natural join returns all rows with matching values in the matching columns and eliminates duplicate columns oJoins may use keywords such as USING and ON © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. SUMMARY oThe ORDER BY clause is used to sort the output of a SELECT statement oThe WHERE clause can be used with the SELECT, UPDATE, and DELETE statements to restrict the rows affected by the DDL command oAggregate functions (COUNT, MIN, MAX, and AVG) are special functions that perform arithmetic computations over a set of rows oSubqueries and correlated queries are used when it is necessary to process data based on other processed data oMost subqueries are executed in a serial fashion oSQL functions are used to extract or transform data oSQL provides relational set operators to combine the output of two queries to generate a new relation oCrafting effective and efficient SQL queries requires a great deal of skill © 2019 Cengage. May not be copied, scanned, or duplicated, in whole or in part, except for use as permitted in a license distributed with a certain product or service or otherwise on a password-protected website for classroom use. END