Lecture2 database.pdf

Document Details

Uploaded by Deleted User

Tags

database systems relational model data models computer science

Full Transcript

ICDT1202Y DATABASE SYSTEMS Lecture 2 Relational Data Model 1 Learning Outcomes Differentiate between the various data models. Explain the properties of a relation. Define relational keys. Explain relational constraints. Differentiate be...

ICDT1202Y DATABASE SYSTEMS Lecture 2 Relational Data Model 1 Learning Outcomes Differentiate between the various data models. Explain the properties of a relation. Define relational keys. Explain relational constraints. Differentiate between database instances and schemas. Describe a two tier and three tier architecture. 2 Data Models A data model is a collection of high level description concepts for describing data. It hides many low-level storage details. It is a set of concepts to describe the structure of a database, and certain constraints that the database should obey. A schema is a description of a particular collection of data, using the given data model. 3 Categories of data models (1) Conceptual (high-level, semantic) data models: – Provides concepts that are close to the way many users perceive data. Uses concepts such as entities, attributes and relationships: – An entity represents a real-world object or concept (Example: student or lecturer) – An attribute represents some property of interest that further describes an entity. – A relationship among two entities represents an association between two entities (Example: teaches relationship) 4 Categories of data models (2) Physical (low-level, internal) data models: – Provide concepts that describe details of how data is stored in the computer. Implementation (representational) data models: – Provide concepts that fall between the above two, balancing user views with some computer storage details. – These include the widely-used relational data model and network and hierarchical data models. 5 Entity Relationship Model (1) E-R model of real world. – Entities (objects). Example: customers, accounts, bank branch – Relationships between entities. Example: Account A-101 is held by customer Johnson. – Relationship set depositor associates customers with accounts. Widely used for database design. – Database design in E-R model usually converted to design in the relational model which is used for storage and processing. 6 Entity Relationship Model (2) Example of schema in the entity- relationship model. 7 Relational Model (1) A simple data model: the Relational Data Model – data stored in relations (tables). Schema: RelationName( field1 : type1 ,..., fieldn : typen ). A declarative query language: SQL Programmer specify what answers a query should return, but not how the query is executed. DBMS picks the best execution strategy. Provides physical data independence (applications need not need to worry about how data is physically structured and stored). 8 Relational Model (2) The relational database model achieves structural independence. Any type of association be it one-to-one, one- to-many, many-to-many can be easily implemented with the relational model. The relational database model has a very powerful and flexible query capability. 9 Relational Model – Example 1 A relation of students: Students(sid : string, name : string, age : integer, gpa : real) An instance of the students relation can be represented as follows: 10 Relational Model – Example 2 A sample relational database: 11 Example Relations 12 Example of Attribute domains 13 Definitions Informal Terms Formal Terms Table Relation Column Attribute/Domain Row Tuple Values in a column Domain Table Definition Schema of a Relation Populated Table Extension 14 Relational Model Terminology A relation is a table with columns and rows. – Only applies to logical structure of the database, not the physical structure. An attribute is a named column of a relation. Domain is the set of allowable values for one or more attributes. Tuple is a row of a relation. Degree is the number of attributes in a relation. Cardinality is the number of tuples in a relation. Relational Database is a collection of normalized relations with distinct relation names. 15 Properties of Relations Relation name is distinct from all other relation names in relational schema. Each cell of relation contains exactly one atomic (single) value. Each attribute has a distinct name. Values of an attribute are all from the same domain. Each tuple is distinct; there are no duplicate tuples. Order of attributes has no significance. Order of tuples has no significance, theoretically. 16 Relational Keys Superkey – An attribute, or set of attributes, that uniquely identifies a tuple within a relation. Candidate Key – Superkey (K) such that no proper subset is a superkey within the relation. – An attribute or a minimal set of attributes that uniquely identifies a tuple within a relation. Primary Key – Candidate key selected to identify tuples uniquely within relation. Alternate Keys – Candidate keys that are not selected to be primary key. Foreign Key – Attribute, or set of attributes, within one relation that matches candidate key of some (possibly same) relation. 17 Referential Integrity A constraint involving two relations (the previous constraints involve a single relation). Used to specify a relationship among tuples in two relations: the referencing relation and the referenced relation. Tuples in the referencing relation R1 have attributes FK (called foreign key attributes) that reference the primary key attributes PK of the referenced relation R2. A tuple t1 in R1 is said to reference a tuple t2 in R2 if t1[FK] = t2[PK]. A referential integrity constraint can be displayed in a relational database schema as a directed arc from R1.FK to R2.PK. 18 Referential Integrity constraints The value in the foreign key column (or columns) FK of the referencing relation R1 can be either: (1) a value of an existing primary key value of the corresponding primary key PK in the referenced relation R2,, or.. (2) a null. In case (2), the FK in R1 should not be a part of its own primary key. 19 Semantic Integrity constraints Based on application semantics and cannot be expressed by the model per se. E.g., “the max. no. of hours per employee for all projects he or she works on is 56 hrs per week” A constraint specification language may have to be used to express these. 20 Example Schema 21 Relationship Integrity constraints 22 Update Operations on Relations (1) Update Operations: – INSERT a tuple. – DELETE a tuple. – MODIFY a tuple. Integrity constraints should not be violated by the update operations. Several update operations may have to be grouped together. Updates may propagate to cause other updates automatically. This may be necessary to maintain integrity constraints. 23 Update Operations on Relations (2) In case of integrity violation, several actions can be taken: – Cancel the operation that causes the violation (REJECT option). – Perform the operation but inform the user of the violation. – Trigger additional updates so the violation is corrected (CASCADE option, SET NULL option) – Execute a user-specified error-correction routine. 24 Database Schema (1) Database Schema: The description of a database. Includes descriptions of the database structure and the constraints that should hold on the database. – Can be represented using a diagram and that is known as a schema diagram. Schema Diagram: A diagrammatic display of (some aspects of) a database schema: – Including, names of record types and data items and some types of constraints 25 Database Schema (2) Schema Construct: A component of the schema or an object within the schema. – Example: STUDENT, COURSE, etc An example schema diagram: Student Name StudentID Course Module ModuleName ModuleID CreditHours Department Grade_Report studentID CourseID Grade 26 Database Instance/State (1) Database Instance/State: The actual data stored in a database (the content of a database) at a particular moment in time. For a new database, we define its schema first. At that moment there is no data in the database and it is said to be in an empty state. Initial Database State: Refers to the database when it is loaded with initial data. Valid State: A state that satisfies the structure and constraints of the database. 27 Database Instance/State (2) The database schema changes very infrequently. The database state changes every time the database is updated. Schema is also called intension, whereas state is called extension. 28 Three-Schema Architecture (1) 29 Three-Schema Architecture (2) Many views, single conceptual (logical) schema and physical schema: – Views (External schema) describe how users see the data. – Conceptual schema defines logical structure. – Physical schema describes files and indexes used. Known as the Three-Schema Architecture 30 Three-Schema Architecture (3) Defines DBMS schemas at three levels: – Internal schema at the internal level to describe physical storage structures and access paths. Typically uses a physical data model. – Conceptual schema at the conceptual level to describe the structure and constraints for the whole database for a community of users. Uses a conceptual or an implementation data model. – External schemas at the external level to describe the various user views. Usually uses the same data model as the conceptual level. 31 Three-Schema Architecture (4) 32 Conceptual Schema The conceptual schema describes all the relations stored in the database. Creating a good conceptual schema is not a simple task. It is called conceptual schema design. It involves: – Determining the different relations(entities) needed – The number of fields for each relation – The type of each field – The relationship between relations – Constraints –... 33 Internal Schema The internal schema specifies how the relations are actually stored in secondary storage devices. It also specifies indexes used to speed up access to the relations. Decisions about the internal schema depend on: – Understanding how the data is going to be accessed. – The facilities provided by the DBMS. 34 External Schema The external schema is a refinement of the conceptual schema. It allows customized and authorized access to individual users or groups of users. Every database has one physical and one conceptual schema, but many external schemas. Each view is tailored to a particular group of users. 35 Data Independence Applications insulated from how data is structured and stored. Logical data independence: protection from changes in the logical structure of data – The capacity to change the conceptual schema without having to change the external schemas and their application programs. Physical data independence: protection from changes in the physical structure of data – The capacity to change the internal schema without having to change the conceptual schema. When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed. The higher-level schemas themselves are unchanged. 36 DBMS Languages High Level or Non-procedural Languages: – Example: SQL are set-oriented and specify what data to retrieve than how to retrieve. Also called declarative languages. – Low Level or Procedural Languages: record- at-a-time; they specify how to retrieve data and include constructs such as looping. 37 Overall system architecture A database system is partitioned into modules that deal with each of the responsibilities of the system and is broadly divided into storage manager and query processor components 38 Storage Manager (1) A program module that provides the interface between the low-level data stored in the db and the app. Programs and queries submitted to the system. It is responsible for the interaction with the file manager. It translates the various DML statements into low- level file commands. It is responsible for the storing, retrieving, and updating data in the db. 39 Storage Manager (2) It implements several data structures as part of the physical system implementation. – Data files which store the data itself. – Data dictionary which stores the metadata about the structure of the db. – Indices which provide fast access to data items that hold particular values. 40 Query Processor DDL interpreter that interprets DDL statements and records into the definitions in the data dictionary. DML compiler that translates DML statements in a query language into an evaluation plan consisting of low-level instructions that query evaluation engine understands. Query evaluation engine which executes low- level instructions generated by the DML compiler. 41 Two tier Client architecture (1) Client (tier 1) manages user interface and runs applications. Server (tier 2) holds database and DBMS. Advantages include: – wider access to existing databases – increased performance – possible reduction in hardware costs – reduction in communication costs – increased consistency 42 Two tier Client architecture (2) 43 Three tier Client architecture (1) Client side presented two problems preventing true scalability: – Considerable resources required on client’s computer to run effectively. – Significant client side administration overhead. By 1995, three layers proposed, each potentially running on a different platform. Advantages: – Client requiring less expensive hardware. – Application maintenance centralized. – Easier to modify or replace one tier without affecting others. – Separating business logic from database functions makes it easier to implement load balancing. – Maps quite naturally to Web environment. 44 Three tier Client architecture (2) 45

Use Quizgecko on...
Browser
Browser