Introduction to Databases PDF
Document Details
Tags
Summary
This document is a lecture on the fundamentals of database systems. It covers various aspects including the historical context, definitions, models, languages, and functions of database management systems (DBMS).
Full Transcript
Introduction to Databases File-based Definition Program defines and manages it’s own data Limitations of File-based Separation and isolation Duplication Program & data dependence Fixed queries Proliferation of application programs History of D...
Introduction to Databases File-based Definition Program defines and manages it’s own data Limitations of File-based Separation and isolation Duplication Program & data dependence Fixed queries Proliferation of application programs History of Database Systems First generation – Hierarchical model Information Management System (IMS) – Network model Conference on Data System Languages (CODASYL) Data Base Task Group (DBTG) – Limitation Complex program for simple query Minimum data independence No theoretical foundation Second generation – Relational model E. R. Codd DB2, Oracle – Limitation Limited data modeling Third generation – Object-relational DBMS – Object-oriented DBMS Database Definition – A collection of self-describing and integrated data files System catalog – Meta data – Data dictionary – Overhead data Data abstraction Database Management System Facility Data definition language (DDL) Data manipulation language (DML) Structured query language (SQL) Security system Integrity system Concurrency control system Backup & recovery system View mechanism DBMS Environment Hardware – Client-server architecture Software – dbms, os, network, application Data – Schema, subschema, table, attribute People – Data administrator & database administrator – Database designer: logical & physical – Application programmer – End-user: naive & sophisticated Procedure – Start, stop, log on, log off, back up, recovery Advantages of DBMS Control redundancy Consistency Integrity Security Concurrency control Backup & recovery Data standard More information Data sharing & conflict control Productivity & accessibility Economy of scale Maintenance Limitations of DBMS Complexity Size Cost – Software – Hardware – Conversion Performance Vulnerability Database Environment Three-Level ANSI-SPARC Architecture External level – User's view – Data gathering Conceptual level – Organization view (entity, attribute, & relationship) – Constraints and security – Entity-relationship diagram – Normalization Internal level – Physical presentation – Storage, index, compression, & encryption – File organization Database Schemas Database schema (intension) & database instance (extension) – External schemas (subschema) – Conceptual schema (database schema) – Internal schema Mapping External/conceptual mapping Conceptual/internal mapping Example Data Independence Logical data independence Physical data independence Database Languages Data Definition Language (DDL) – Structured Query Language (SQL) Data Manipulation Language (DML) – Procedural DML – Non-procedural DML Structured Query Language (SQL) Query-by-Example (QBE) Fourth-Generation Language – Form, report, graphics, & application generators Data Models Definition – Integrated concept for describing data, relationships and constraints Types – Object-based data models – Record-based data models – Physical data models (internal structure, ordering, & paths) Object-Based Data Models Entity-relationship – Entity, attribute, relationship – Usage: documentation Object-oriented – Object, class, subclass, inheritance, state (attributes), behavior (methods or actions), encapsulation, message, polymorphism – Usage: building software Record-Based Data Models Relational data model Network model Hierarchical Model Relational Data Model Terminology – Relations – Attributes – Tuples Record relationship – One-to-many relationship Usage – Ad hoc reporting Network Model CODASYL DBTG Terminology – Data item & group item – Record type: owner, member – Set type: optional, mandatory, permanent Record relationship – Many-to-many relationship – Link or pointer between set owner and set member Usage – Large volume transaction processing Hierarchical Model IBM Information Management Systems Terminology – Tree, general tree, & subtree – Nodes, root node, parent node, and child node – Segment type ( root, parent, child) Record relationship – One-to-many relationship – Link or pointer between parent node and child node – No many-to-many relationship Usage – Large volume transaction processing Functions of a DBMS Data storage, retrieval and update A user-accessible catalog Transaction support Concurrency control Recovery Authorization Integrity Data independence Data communication Other Utilities Components of a DBMS Query processor Database manager File manager DML preprocessor DDL compiler Catalog manager Components of Database Manager Authorization control Command processor Integrity checker Query optimizer Transaction manager Scheduler Recovery manager Buffer manager Teleprocessing - I Two-tier client-server architecture – Client: user interface, business and data processing logic – Database server: data validation and database access – Advantage Accessing to distributed database Increasing performance and consistency Lowering server and communication cost Supporting open systems architecture Teleprocessing - II Three-tier client-server architecture – Client: user interface – Application server: business and processing logic – Database server: data validation and database access – Advantage Reducing client cost Software distribution Maintenance cost Balancing load Teleprocessing - III Transaction processing (TP) monitor – Transaction manager between client and server – Advantage Transaction routing Distributed transaction for load balancing The Relational Model Agenda Concept Property Key Integrity Constraint Base Relation & View Concept Relation Attribute Domain Tuple Degree Cardinality Relational Database Property Unique relation name Unique attribute No duplication tuples No repeating group (single value for an attribute) Same domain for an attribute Insignificance of the tuples or attributes order Key Superkey Candidate key (Alternative key) Primary key Foreign key Integrity Constraint Null Integrity constraint – Entity – Reference – Domain (edit or field) – Enterprise (business rule) Base Relation & View Base relation – A named relation of an entity in conceptual schema – Tuples stored in physical database View – Virtual relation – Dynamic generated for end user – Not stored in the physical database Purpose of views – Security – User friendly report – Performance Points To Remember Concept Property Key Integrity Constraint Base Relation & View Relational Algebra Relational Languages Procedural language – Relational algebra Non-procedural language – Relational Calculus Other Languages – Transform-oriented languages – Graphical languages – Fourth-generation languages – Fifth-generation languages Relational Algebra Intra Relation Selection - horizontal partition – SELECT relation WHERE condition [GIVING relation] – condition(relation) Projection - vertical partition – PROJECT relation OVER (attribute, attribute,...)[GIVING relation] – attributes(relation) Union-Compatible Union - tuples in either of 2 union compatible relations – relation1 UNION relation2 [GIVING relation] – relation1 relation2 Difference - tuples in one union compatible relation, but not other —relation1 MINUS relation2 [GIVING relation] —relation1 - relation2 Union-Compatible Intersection - tuples in both of 2 union compatible relations – relation1 INTERSECT relation2 [GIVING relation] – relation1 relation2 Any 2 Relations Product - Cartesian product of any two relations – relation1 TIMES relation2 [GIVING relation] – relation1 relation2 Join - common domain – relation1 JOIN relation2 WHERE condition [GIVING relation] – relation1 condition relation2 Division - same pattern – relation1 divided by relation2 [GIVING relation] – relation1 / relation2 Types of Join Equi - based on equality with 2 identical attribute columns Theta - a join based on a relational operator (=,,=,~=) Natural - equijoin with no duplicate column Outer - includes tuples without matches (left, right, full) Types of Outer Join Left outer join - including every tuple of the left (first) relation in the solution Right outer join - including every tuple of the right (second) relation in the solution Full outer join - including every tuple of the left (first) relation and right (second) relation in the solution Types of Join Semi-join: only the tuple of the left (first) relation with match – relation1 SEMIJOIN relation2 WHERE condition [GIVING relation] – relation1 lxcondition relation2 Semi-theta-join Semi-natural join