Databases Notes PDF

NOTATKI ZE SLAJDÓW database system= external storage + DBMS + query language DBMS - software that manages access to databases // not only manages the access but mainly manages the database system overall Query Language = DDL + DML + DCL DDL (Data Definition Language) – defines data structures to be manipulated on by DML DML (Data Manipulation Language) – performs operations (fetching and changing) on data defined by DDL DCL (Data Control Language)– controls access to data (e.g. security, concurrency, …) Isolating the database system architecture levels (logical, internal, external) allows to achieve independence of a program from data: Logical data independence - resistance of external schemas (views) to changes made in the conceptual schemas (like ERD) Physical data independence - resistance of a conceptual schema to changes made in the internal (physical) schema (like disk files). Architecture of a database system: External (user) level - A set of external views (external schemas) through which users see the contents of the database. The view is an abstracted fragment of the database, defined in the terms of the query language. Logical (conceptual, model) level - Contains a conceptual model (conceptual scheme) of the database. This model covers the entire contents of the database (relations, objects, …) as seen by the database administrator or database owner. Internal (physical) level - The physical representation of the database in the form of disk files and algorithms to access them. It is a low-level representation (level of memory units, e.g. bytes or disk pages). When creating a new table in the database you need to specify: name of the table (unique in the database) column names (unique in the table) data type of each column. Database taxonomy- grouped by number of nodes or data model. Number of nodes: centralized distributed federated Data model: relational object-oriented, object-relational semi-structural NoSQL DBMS functions: data access data integrity concurrency control (resolving conflicts during simultaneous access to the same data) data security recovery of data (performing restart, restoring data from backups) data distribution (if one database is physically placed on different computer systems) parallel data processing Database Dictionary - a special database containing information about data (metadata) stored in other (utility) databases. It contains: descriptions of the structure of database objects (e.g. table and views schemas,...) user descriptions (identifiers, permissions,...) definitions of data formats used in the database statistical data on data values in the database, used for query optimization Access to it: An ordinary user has only read-only permissions, restricted to certain parts of Data Dictionary DBA can have wider access to the Data Dictionary, but its modification can perform only through system procedures DBMS has full access to the Data Dictionary (e.g. it modifies DD when creating a new object in the database) DBA (DataBase Administrator) - A person that controls the access to the database and its functioning. DBA Tasks: giving users and groups of users access to the database (permissions to create database objects, fetching and modify them, etc.) monitoring the performance of the database defining the strategy of archiving and recovering data after failures launching new data access strategies DBA Tools: programs to load a database from external files data archiving programs programs for restoring data from the archive programs for collecting statistics and analyzing them procedures for defining users and user groups programs for operating on a Data Dictionary programs for analyzing the database log From ERD to relational database schema I. Representing entity sets - each entity set is represented as a relation with the primary key which is the same as the primary key of the entity set. II. Representing relationships between entity sets one-to-one relationship - Represented by a foreign key inserted to any of the two relations. one-to-many relationship - Represented by a foreign key inserted to the relation on the n side of the relationship. many-to-many relationship - Represented by a separate relation with the primary key composed of primary keys of the related entity sets. SQL interactive - Instructions are input interactively; results are immediately presented to the user in a format determined by DBMS (SQL does not have any I/O or formatting instructions). SQL embedded - Instructions are included in the program written in another language; results can be passed to variables of the program for further processing. Aggregate functions: (są też przykładowe w selectach below) COUNT() - number of rows returned by the query AVG() - arithmetic average of values in a numerical column SUM() - sum of values in a numerical column MIN() - minimal value in a character, numerical or date/ datetime column MAX() - maximal value in a character, numerical or date / datetime column other statistical functions (optionally) DBMS-defined functions for: date / datetime values, strings, text fields, large binary objects, etc. Selfjoin - joining a table with itself. It enables us to correlate data that are stored in different rows of the same table. To perform a selfjoin you must use table aliases (correlation names). In a subquery, the SELECT statement can be nested in the WHERE or HAVING clause. A nested SELECT statement, called an internal query, provides values for the WHERE or HAVING condition. The SELECT statement containing the subquery is an external query. Unioning Tables: Summing tables treated as sets of rows (the UNION operator of relational algebra). The tables must be compatible (the same number of columns with matching datatypes). Notes: 1. Duplicate rows are eliminated from the result table unless ALL is specified. 2. If the names of the columns of the unioned tables are different, the result table assumes the names of the columns from the first table. 3. If there is an ORDER BY clause, the numbers must be specified instead of column names Subtracting tables treated as sets of rows (the MINUS operator of relational algebra). The tables must be compatible (the same number of columns with matching datatypes). Notes: 1. Duplicate rows are eliminated from the result table unless ALL is specified. 2. If the names of the columns of the subtracted tables are different, the result table assumes the names of the columns from the first table. 3. If there is an ORDER BY clause, the numbers must be specified instead of column names Intersecting tables treated as sets of rows (the INTERSECT operator of relational algebra). The tables must be compatible (the same number of columns with matching datatypes). What is the index? It is a table that contains the key values in one column and in the other column disk addresses (record numbers) of rows in the indexed table that contain these values. What is its purpose? 1. It allows you to increase the efficiency of data access from the table (SELECT statement). 2. Organizes the rows of the table in a specific order. 3. Prevents duplicate values in the rows of the table. When to create an index? 1. When the table is often processed sequentially according to the values of a certain column (ascendingly or descendingly). 2. When it is necessary to ensure the uniqueness of the column(s) values. 3. When access to the table is often carried out according to values of a column or columns (associative access). What is the cost of indices? 1. Storage cost (keeping additional data). 2. Time cost (the need to update the index while updating the table by INSERT, UPDATE, DELETE statements. View (virtual table) A combination of rows and columns selected from one or several tables. After creating a view, you can perform SELECT statements on it as on a regular table. In the database (in the data dictionary) only the definition of the view is stored. The view is materialized when it is needed. The benefits of using views: The same data can be seen differently by different users. They allow you to focus on relevant data and ignore others. They provide an additional level of independence from data: If something changes in the database schema, it is often enough to change the view definition without changing the program. They introduce data protection by hiding data: Users see only views, i.e. only those data to which they have the right of access. Differences between views and tables: Normally, views do not contain data; they contain only information on how to collect data if they are needed (i.e. from which tables, from which columns and how). Views can not be indexed (but you can index tables included in the view definition). In general, you cannot update data through a view. SELECTY podstawowe ZAPYTANIE DZIAŁANIE SELECT“text” wypisuje text SELECT2+2 wypisuje 4 SELECT* FROMtable wyświetla całą tabelę SELECTcol1,col2 FROMtable wyświetla kolumnę col1 i col2 w określonej kolejności SELECTDISTINCTcol FROMtable wyświetla kolumnę col bez powtarzania wielokrotnych wystąpień wartości SELECTcol1,col2 FROMtable WHERE wyświetla podane kolumny z podanej tabeli z wierszy, col3=’cośtam’ ANDcol4=123 które spełniają warunek (w col3 mają wpisane costam oraz 123 w col4); (analogicznie tak jak AND może być też O R warunek może też być nierównością (np. year>=1998) albo nawet O R NOT) SELECTcol1,col2 FROMtable WHERE -||-; warunek dodatkowy: w col4 musi znajdować się col3=’cośtam’ ANDcol4 IN(‘bułka’,’makaron’) bułka lub makaron (to skrócona wersja zapisu “col4=’bułka’ OR col4=’makaron’) (...) ANDcol4 NOT IN(‘oliwki’,’rodzynki’) w col4 musi być coś innego niż oliwki lub rodzynki SELECTcol1,col2 FROMtable (...) gdzie treści w kolumnie col1 zaczynają się literą A WHEREcol1 LIKE‘A%’ (“%” to pozostała część treści i jest nieistotna) (może być też NOT LIKE) (...) LIKE‘A___D’ to znaczy, że szukamy wyrazu, który zaczyna się na A, a na piątym miejscu ma D (między nimi jest odpowiednia ilość powtórzeń podkreślnika) SELECT* FROMtable operator BETWEEN sprawdza, czy podany numer WHEREnumber BETWEENnr1 ANDnr2 należy do podanego przedziału SELECTcol1, col2 FROMtable WHERE wyświetla wspólnie wyniki dla 2 różnych zapytań. col3=cośtam Muszą być koniecznie 2 te same kolumny, UNIONSELECTcol1, col2 FROMtable niekoniecznie tablice i warunki. (warunki mogą być WHEREcol4=cośtam innego nierównościami) UNION ALL zezwala na duplikaty (bo sam UNION zwraca tylko distinct wartości) SELECTcostam FROMtable wyświetla costam z tabeli uporządkowane malejąco ORDER BYyear DESC, name względem roku, a następnie normalnie alfabetycznie względem imienia SELECTcol1 AS“costam”, col2 ASkolumna wyświetla kolumnę col1 pod nową nazwą costam, a FROMtable col2 pod nazwą kolumna GROUP BYcol wyświetla dane pogrupowane względem zawartości kolumny col SELECT* FROMcostam1 łączy dwie tabele (costam1 i costam2) na podstawie ID, JOINcostam2 ONcostam1.id = costam2.id które jest wspólne w obu tabelach alternatywnie:JOINcostam2 USING(id) WHEREnumber = (SELECT MIN(number) SELECT MIN - szuka najmniejszego numeru FROM table) SELECTAVG(col1) FROMtable AVG - wyświetla średnią wartości z podanej kolumny SELECT SUM(col1) FROMtable SUM - zwraca sumę wartości z podanej kolumny SELECTCOUNT(col1) count(*) zwraca ilość wierszy w całej tabeli, lub count(*) count([nazwa_kolumny]) zwraca ilość nie-NULL wierszy w danej kolumnie SELECT COUNT(DISTINCT col1) zlicza ilość wierszy w kolumnie col1 bez powtórzeń SELECT*FROMtable tu ogólnie chodzi o to, że on sobie sortuje wszystko w ORDERBY taki sposób, że najpierw rok, potem nazwa, ale jeśli w CASE danym wierszu w kolumnie “col” znajduje się x lub y, to WHENcol IN('x',’y')THEN1 wtedy te wiersze są wyświetlane na samym końcu (bo ELSE0 do tych wierszy przypisaliśmy cyferkę 1, a że 1>0 oraz END,year, name chcemy sortować wszystko rosnąco, to tak to jest) HAVING MAX (number) < 2000 “pod warunkiem, że największy z tabeli number jest mniejszy od 2000” - HAVING to takie WHERE, ale użyte z aggregate functions (bo WHERE nie można z nimi używać) albo po GROUP BY LIKE“J$?%%” ESCAPE“?” denotes the string that begins with the letter “J” followed by one dollar sign followed by one percent sign followed by any string of 0 or more characters.

Databases Notes PDF

Document Details

Tags

Related

Summary

Full Transcript