CPE3 Data Model, Data Schema & Data Independence PDF
Document Details
Uploaded by Deleted User
Tags
Summary
This document is a set of lecture notes on data modeling for a database systems course. It covers data models, such as relational models and entity-relationship models, and discusses data independence, schemas, and instances. It details the advantages and disadvantages of specific data models and aims to help students understand the concepts.
Full Transcript
Module 2: Data Model, Data Schema & Data Independence Contents Lesson 1: Data Models Lesson 2: Entity-Relationship Models Lesson 3: Relational Models Lesson 4: Data Schema Lesson 5: Data Independence Data Models Data models define how the logical structure of a database is modeled or show how...
Module 2: Data Model, Data Schema & Data Independence Contents Lesson 1: Data Models Lesson 2: Entity-Relationship Models Lesson 3: Relational Models Lesson 4: Data Schema Lesson 5: Data Independence Data Models Data models define how the logical structure of a database is modeled or show how data is represented in the conceptual level of the database architecture. Data models define how data is connected to each other and how they are processed and stored inside the system. Data Models Entity-Relationship Models Entity-Relationship (ER) Model is based on the notion of real-world entities and relationships among them. While formulating real-world scenario into the database model, the ER Model creates entity set, relationship set, general attributes and constraints. ER Model is best used for the conceptual design of a database. ER Model is based on: Entities and their attributes. Relationships among entities. Entity-Relationship Models Example: Relational Models The most popular data model in DBMS is the Relational Model. It is more a scientific model than others. This model is based on first-order predicate logic and defines a table as an n-ary relation. The main highlights of this model are: Data is stored in tables called relations. Relations can be normalized (Normalization – values saved are atomic values.) Each row in a relation contains a unique value. Each column in a relation contains values from a same domain. Advantages of Data Models Data Models help us in representing data accurately. It helps us in finding the missing data and also in minimizing Data Redundancy. Data Model provides data security in a better way. The data model should be detailed enough to be used for building the physical database. The information in the data model can be used for defining the relationship between tables, primary and foreign keys, and stored procedures. Disadvantages of Data Models In the case of a vast database, sometimes it becomes difficult to understand the data model. You must have the proper knowledge of SQL to use physical models. Even smaller change made in structure require modification in the entire application. There is no set data manipulation language in DBMS. To develop Data model one should know physical data stored characteristics. Conclusions Data modeling is the process of developing data model for the data to be stored in a Database. Data Models ensure consistency in naming conventions, default values, semantics, security while ensuring quality of the data. Data Model structure helps to define the relational tables, primary and foreign keys and stored procedures. There are three types of conceptual, logical, and physical. The main aim of conceptual model is to establish the entities, their attributes, and their relationships. Conclusions Logical data model defines the structure of the data elements and set the relationships between them. A Physical Data Model describes the database specific implementation of the data model. The main goal of a designing data model is to make certain that data objects offered by the functional team are represented accurately. The biggest drawback is that even smaller change made in structure require modification in the entire application. Data Schema A database schema is the skeleton structure that represents the logical view of the entire database. It defines how the data is organized and how the relations among them are associated. A database schema defines its entities and the relationship among them. It contains a descriptive detail of the database, which can be depicted by means of schema diagrams. The database designers design the schema to help programmers understand the database and make it useful. A database schema can be divided broadly into three categories: Physical Database Schema − this schema pertains to the actual storage of data and its form of storage like files, indices, etc. In this level, it is expressed how data is stored in blocks of storages. Logical Database Schema – this schema defines all the logical constraints that need to be applied on the data stored. It defines tables, views, and integrity constraints. In this level, the programmers, as well as the database administrator (DBA), work. View Database Schema – can be defined as the design of the database at the view level, which generally describes end-user interaction with database systems. Data Instance A database instance is a state of operational database with data at any given time. It contains a snapshot of the database. Database instances tend to change with time. A DBMS ensures that its every instance (state) is in a valid state, by diligently following all the validations, constraints, and conditions that the database designers have imposed. Example: Let's say a table teacher in our database whose name is School, suppose the table has 50 records, so the instance of the database has 50 records for now and tomorrow we are going to add another fifty records, so tomorrow the instance have total of 100 records. Data Independence A database system normally contains a lot of data in addition to users’ data. For example, it stores data about data, known as metadata, to locate and retrieve data easily. Metadata itself follows a layered architecture, so that when we change data at one layer, it does not affect the data at another level. This data is independent but mapped to each other. Logical Data Independence Logical data is data about database, that is, it stores information about how data is managed inside. For example, a table (relation) stored in the database and all its constraints, applied on that relation. Logical data independence is a kind of mechanism, which liberalizes itself from actual data stored on the disk. If we do some changes on table format, it should not change the data residing on the disk. Physical Data Independence All the schemas are logical, and the actual data is stored in bit format on the disk. Physical data independence is the power to change the physical data without impacting the schema or logical data. For example, in case we want to change or upgrade the storage system itself − suppose we want to replace hard-disks with SSD − it should not have any impact on the logical data or schemas. Data Independence Examples of Physical Data Independence: Using a new storage device like Hard Drive or Magnetic Tapes Modifying the file organization technique in the Database Switching to different data structures. Changing the access method. Modifying indexes. Changes to compression techniques or hashing algorithms. Change of Location of Database from say C drive to D Drive. Examples of changes under Logical Data Independence: Add/Modify/Delete a new attribute, entity or relationship is possible without a rewrite of existing application programs. Merging two records into one. Breaking an existing record into two or more records. Thanks! Any questions?