Data_management3.pdf
Document Details
Uploaded by CapableAmethyst
Tags
Full Transcript
3.1 SPECIAL OPERATORS AND CLAUSES IN operator The IN operator is used in a WHERE clause to determine if a value matches one of several values. BETWEEN operator The BETWEEN operator provides an alternative way to determine if a value is between two other values. The operator is written value BETWEEN...
3.1 SPECIAL OPERATORS AND CLAUSES IN operator The IN operator is used in a WHERE clause to determine if a value matches one of several values. BETWEEN operator The BETWEEN operator provides an alternative way to determine if a value is between two other values. The operator is written value BETWEEN minValue AND maxValue and is equivalent to value >= minValue AND value <= maxValue. LIKE operator The LIKE operator, when used in a WHERE clause, matches text against a pattern using the two wildcard characters % and _. % matches any number of characters. Ex: LIKE 'L%t' matches "Lt", "Lot", "Lift", and "Lol cat". ● _ matches exactly one character. Ex: LIKE 'L_t' matches "Lot" and "Lit" but not "Lt" and "Loot". The LIKE operator performs case-insensitive pattern matching by default or case-sensitive pattern matching if followed by the BINARY keyword. Ex: LIKE BINARY 'L%t' matches 'Left' but not 'left'. ● To search for the wildcard characters % or _, a backslash (\) must precede % or _. Ex: LIKE 'a\%' matches "a%". DISTINCT clause The DISTINCT clause is used with a SELECT statement to return only unique or 'distinct' values. ORDER BY clause A SELECT statement selects rows from a table with no guarantee the data will come back in a certain order. The ORDER BY clause orders selected rows by one or more columns in ascending (alphabetic or increasing) order. The DESC keyword with the ORDER BY clause orders rows in descending order. 3.2 SIMPLE FUNCTIONS Numeric functions A function operates on an expression enclosed in parentheses, called an argument, and returns a value. Usually, the argument is a simple expression, such as a column name or fixed value. Some functions have several arguments, separated by commas, and a few have no arguments at all. Each function operates on, and evaluates to, specific data types. Function Description Example SELECT ABS(-5); ABS(n) Returns the absolute value of n returns 5 SELECT LOG(10); Returns the natural logarithm of n LOG(n) returns 2.302585092994046 SELECT POW(2, 3); POW(x, y) Returns x to the power of y returns 8 SELECT RAND(); Returns a random number between 0 (inclusive) and 1 (exclusive) RAND() returns 0.11831825703225868 SELECT ROUND(16.25, 1); ROUND(n, d) Returns n rounded to d decimal places returns 16.3 SELECT SQRT(25); SQRT(n) Returns the square root of n returns 5 String functions String functions manipulate string values. SQL string functions are similar to string functions in programming languages like Java and Python. Function CONCAT(s1, s2, ...) Description Returns the string that results from concatenating the string arguments Example SELECT CONCAT('Dis', 'en', 'gage'); returns 'Disengage' SELECT LOWER('MySQL'); LOWER(s) Returns the lowercase s returns 'mysql' REPLACE(s, from, to) Returns the string s with all occurrences of from replaced with to SELECT REPLACE('This and that', 'and', 'or'); returns 'This or that' SUBSTRING(s, pos, len) Returns the substring from s that starts at position pos and has length len SELECT SUBSTRING('Boomerang', 1, 4); returns 'Boom' Returns the string s without leading and trailing spaces TRIM(s) SELECT TRIM(' '); test returns 'test' SELECT UPPER('mysql'); UPPER(s) Returns the uppercase s returns 'MYSQL' Date and time functions Date and time functions operate on DATE, TIME, and DATETIME data types. Function Description Example SELECT CURDATE(); CURDATE() CURTIME() NOW() Returns the current date, time, or date and time in 'YYYY-MM-DD', 'HH:MM:SS', or 'YYYY-MM-DD HH:MM:SS' format returns '2019-01-25' SELECT CURTIME(); returns '21:05:44' SELECT NOW(); returns '2019-01-25 21:05:44' SELECT DATE('2013-03-25 22:11:45'); DATE(expr) TIME(expr) Extracts the date or time from a date or datetime expression expr returns '2013-03-25' SELECT TIME('2013-03-25 22:11:45'); returns '22:11:45' SELECT DAY('2016-10-25'); returns 25 DAY(d) MONTH(d) YEAR(d) Returns the day, month, or year from date d SELECT MONTH('2016-10-25'); returns 10 SELECT YEAR('2016-10-25'); returns 2016 SELECT HOUR('22:11:45'); returns 22 HOUR(t) MINUTE(t) SECOND(t) SELECT MINUTE('22:11:45'); Returns the hour, minute, or second from time t returns 11 SELECT SECOND('22:11:45'); returns 45 SELECT DATEDIFF('2013-03-10', '2013-03-04'); DATEDIFF(expr1, expr2) TIMEDIFF(expr1, expr2) Returns expr1 - expr2 in number of days or time values, given expr1 and expr2 are date, time, or datetime values returns 6 SELECT TIMEDIFF('10:00:00', '09:45:30'); returns 00:14:30 3.3 AGGREGATE FUNCTIONS Aggregate functions An aggregate function processes values from a set of rows and returns a summary value. Common aggregate functions are: ● COUNT() counts the number of rows in the set. ● MIN() finds the minimum value in the set. ● MAX() finds the maximum value in the set. ● SUM() sums all the values in the set. ● AVG() computes the arithmetic mean of all the values in the set. Aggregate functions appear in a SELECT clause and process all rows that satisfy the WHERE clause condition. If a SELECT statement has no WHERE clause, the aggregate function processes all rows. GROUP BY clause Aggregate functions are commonly used with the GROUP BY clause. The GROUP BY clause consists of the GROUP BY keyword and one or more columns. Each simple or composite value of the column(s) becomes a group. The query computes the aggregate function separately, and returns one row, for each group. HAVING clause The HAVING clause is used with the GROUP BY clause to filter group results. The optional HAVING clause follows the GROUP BY clause and precedes the optional ORDER BY clause. 3.4 JOIN QUERIES Joins In relational databases, reports are commonly generated from data in multiple tables. Multi-table reports are written with join statements. A join is a SELECT statement that combines data from two tables, known as the left table and right table, into a single result. The tables are combined by comparing columns from the left and right tables, usually with the = operator. The columns must have comparable data types. Usually, a join compares a foreign key of one table to the primary key of another. However, a join can compare any columns with comparable data types. Prefixes and aliases Occasionally, join tables contain columns with the same name. When duplicate column names appear in a query, the names must be distinguished with a prefix. The prefix is the table name followed by a period. Use of a prefix makes column names more complex. To simplify queries or result tables, a column name can be replaced with an alias. The alias follows the column name, separated by an optional AS keyword. Inner and full joins A join clause determines how a join query handles unmatched rows. Two common join clauses are: ● INNER JOIN selects only matching left and right table rows. ● FULL JOIN selects all left and right table rows, regardless of match. In a FULL JOIN result table, unmatched left table rows appear with NULL values in right table columns, and vice versa. The join clause appears between a FROM clause and an ON clause: ● The FROM clause specifies the left table. ● The INNER JOIN or FULL JOIN clause specifies the right table. ● The ON clause specifies the join columns. An optional WHERE clause follows the ON clause. Left and right joins In some cases, the database user wants to see unmatched rows from either the left or right table, but not both. To enable these cases, relational databases support left and right joins: ● LEFT JOIN selects all left table rows, but only matching right table rows. ● RIGHT JOIN selects all right table rows, but only matching left table rows. An outer join is any join that selects unmatched rows, including left, right, and full joins. The UNION keyword combines the two results into one table. 3.5 EQUIJOINS, SELF-JOINS, AND CROSS-JOINS Equijoins An equijoin compares columns of two tables with the = operator. Most joins are equijoins. A non-equijoin compares columns with an operator other than =, such as < and >. Self-joins A self-join joins a table to itself. A self-join can compare any columns of a table, as long as the columns have comparable data types. If a foreign key and the referenced primary key are in the same table, a self-join commonly compares those key columns. In a self-join, aliases are necessary to distinguish left and right tables. Cross-joins A cross-join combines two tables without comparing columns. A cross-join uses a CROSS JOIN clause without an ON clause. As a result, all possible combinations of rows from both tables appear in the result. 3.6 SUBQUERIES Subqueries A subquery, sometimes called a nested query or inner query, is a query within another SQL query. The subquery is typically used in a SELECT statement's WHERE clause to return data to the outer query and restrict the selected results. The subquery is placed inside parentheses (). Correlated subqueries A subquery is correlated when the subquery's WHERE clause references a column from the outer query. In a correlated subquery, the rows selected depend on what row is currently being examined by the outer query. An alias can also help differentiate the columns. An alias is a temporary name assigned to a column or table. The AS keyword follows a column or table name to create an alias. EXISTS operator Correlated subqueries commonly use the EXISTS operator, which returns TRUE if a subquery selects at least one row and FALSE if no rows are selected. The NOT EXISTS operator returns TRUE if a subquery selects no rows and FALSE if at least one row is selected. Flattening subqueries Many subqueries can be rewritten as a join. Most databases optimize a subquery and outer query separately, whereas joins are optimized in one pass. So joins are usually faster and preferred when performance is a concern. Replacing a subquery with an equivalent join is called flattening a query. 3.7 COMPLEX QUERY EXAMPLE 3.8 VIEW TABLES Creating views Table design is optimized for a variety of reasons, such as minimal storage, fast query execution, and support for relational and business rules. Occasionally, the design is not ideal for database users and programmers. View tables solve this problem. Views restructure table columns and data types without changes to the underlying database design. A view table is a table name associated with a SELECT statement, called the view query. The CREATE VIEW statement creates a view table and specifies the view name, query, and, optionally, column names. If column names are not specified, column names are the same as in the view query result table. Querying views A table specified in the view query's FROM clause is called a base table. Unlike base table data, view table data is not normally stored. Instead, when a view table appears in an SQL statement, the view query is merged with the SQL query. The database executes the merged query against base tables. In some databases, view data can be stored. A materialized view is a view for which data is stored at all times. Whenever a base table changes, the corresponding view tables can also change, so materialized views must be refreshed. Advantages of views View tables have several advantages: ● ● Protect sensitive data. A table may contain sensitive data. A view can exclude sensitive columns but include all other columns. Authorizing users and programmers access to the view but not the underlying table protects the sensitive data. Save complex queries. Complex SELECT statements can be saved as a view. Database users can reference the view without writing the SELECT statement. Save optimized queries. Often, the same result table can be generated with equivalent SELECT statements. Although the results of equivalent statements are the same, performance may vary. To ensure fast execution, the optimal statement can be saved as a view and distributed to database users. For the above reasons, views are supported in all relational databases and are frequently created by database administrators. Database users need not be aware of the difference between view and base tables. ● Inserting, updating, and deleting views View tables are commonly used in SELECT statements. Using views in INSERT, UPDATE, and DELETE statements is problematic: Primary keys. If a base table primary key does not appear in a view, an insert to the view generates a NULL primary key value. Since primary keys may not be NULL, the insert is not allowed. ● Aggregate values. A view query may contain aggregate functions such as AVG() or SUM(). One aggregate value corresponds to many base table values. An update or insert to the view may create a new aggregate value, which must be converted to many base table values. The conversion is undefined, so the insert or update is not allowed. ● Join views. In a join view, foreign keys of one base table may match primary keys of another. A delete from a view might delete foreign key rows only, or primary key rows only, or both the primary and foreign key rows. The effect of the join view delete is undefined and therefore not allowed. WITH CHECK OPTION clause Databases that allow view updates face one particularly bothersome behavior. A view insert or update may create a row that does not satisfy the view query WHERE clause. In this case, the inserted or updated row does not appear in the view table. From the perspective of the database user, the insert or update appears to fail even though the base tables have changed. ● To prevent inserts or updates that appear to fail, databases that support view updates have an optional WITH CHECK OPTION clause. When WITH CHECK OPTION is specified, the database rejects inserts and updates that do not satisfy the view query WHERE clause. Instead, the database generates an error message that explains the violation. 3.9 RELATIONAL ALGEBRA Relational algebra has nine operations. Each operation is denoted with a special symbol, often a letter of the Greek alphabet. Operation symbols can be combined with tables in expressions, just as + - × / can be combined with numbers in arithmetic expressions. Each relational algebra expression is equivalent to an SQL query and defines a single result table. Operation Symbol Greek letter Derivation Select � sigma corresponds to Latin letter s, for Select Project Π Pi corresponds to Latin letter P, for Project Product × multiplication symbol Join ⋈ multiplication symbol with vertical bars Union ∪ set theory Intersect ∩ set theory Difference − set theory Rename � rho corresponds to Latin letter r, for Rename Aggregate � gamma corresponds to Latin letter g, for group