Chapter 2 Data Models PDF
Document Details
Uploaded by IndulgentByzantineArt
Tags
Summary
This document introduces data models, explaining their importance in managing complex real-world data environments. It details the fundamental components of database design, such as entities, attributes, relationships, and constraints, illustrating them with examples. The key concepts of data modeling, business rules, and their relation to data model components are also discussed.
Full Transcript
DATA MODELS Learning Outcomes Data modeling and why data models are important The basic data-modeling building blocks What business rules are and how they influence database design How the major data models evolved How data models can be classified by their level of abstraction Introduction ▪...
DATA MODELS Learning Outcomes Data modeling and why data models are important The basic data-modeling building blocks What business rules are and how they influence database design How the major data models evolved How data models can be classified by their level of abstraction Introduction ▪ Designers, programmers, and end users see data in different ways ▪ Different views of same data lead to designs that do not reflect organization’s operation ▪ Data modeling reduces complexities of database design ▪ Various degrees of data abstraction help reconcile varying views of same data Data Modeling and Data Models ▪ Data modeling: Iterative and progressive process of creating a specific data model for a determined problem domain ▪ Data models: Simple representations (usually graphical) of complex real-world data structures ◦ Define how data is connected to each other and how they are processed and stored inside the system. ▪ Model - Abstraction of a real-world object or event Importance of Data Models Facilitate interaction among the designer, the Are a communication tool applications programmer, and the end user Give an overall view of the database Consolidate views from various perspective Organize data for various users Are an abstraction for the creation of good database Data Model Basic Building Blocks 1. Entity: Unique and distinct object used to collect and store data A person, place, thing, or event about which data will be collected and store 2. Attribute: Characteristic of an entity 3. Relationship: Entity Attribute Describes an association among entities ✓One-to-many (1:M) ✓Many-to-many (M:N or M:M) ✓One-to-one (1:1) Relationship Constraint 4. Constraint: A restriction placed on the data Example Entity CUSTOMER, RECEIPT, MUSICAL CONCERT, ROOM Attribute CUSTOMER NAME, CUSTOMER PHONE NUMBER, CUSTOMER ADDRESS Relationship An agent can serve many customers A customer may be served by one agent Constraint An employee’s salary must have values that are between 6,000 and 350,000 A student’s GPA must be between 0.00 and 4.00 Each class must have one and only one teacher Business Rules Descriptions of policies, procedures, or principles within a specific organization Must be: Description of operations to create/enforce actions within In writing an organization’s environment Kept up to date Easy to understand Widely disseminated Describe characteristics of data as viewed by the company Example Business rules ▪ A customer may generate many invoices ▪ An invoice is generated by only one customer ▪ A training session cannot be scheduled for fewer than 10 employees or for more than 30 employees Sources of Business Rules Company Department Policy makers managers managers Direct Written interviews with documentation end users Reasons for Identifying and Documenting Business Rules ▪ Help standardize company’s view of data ▪ Communications tool between users and designers ▪ Allow designer to: ◦ Understand the nature, role, scope of data, and business processes ◦ Develop appropriate relationship participation rules and constraints ◦ Create an accurate data model Translating Business Rules into Data Model Components ▪ Nouns translate into entities ▪ Verbs translate into relationships among entities ▪ Relationships are bidirectional ▪ Questions to identify the relationship type ◦ How many instances of B are related to one instance of A? ◦ How many instances of A are related to one instance of B? In how many classes can one student enroll? How many students can enroll in one class? Naming Conventions ▪ Naming occurs during translation of business rules to data model components ▪ Names should make the object unique and distinguishable from other objects ▪ Names should also be descriptive of objects in the environment and be familiar to users ▪ Proper naming: ▪ Facilitates communication between parties ▪ Promotes self-documentation Entity names Attribute names Good practice: use prefix of entity name or abbreviation in which the attribute occurs. E.g. attribute CREDIT_LIMIT of the CUSTOMER’S entity is named as CUS_CREDIT_LIMIT Exercise Given the following business rules: a. Each sales representative writes many invoices b. Each invoice is written by one sales representative c. Each sales representative is assigned to one department d. Each department has many sales representatives e. Each customer can generate many invoices f. Each invoice is generated by one customer Find entities, relationship (name and classification) and possible attributes for each entity. The Evolution of Major Data Models GENERATION TIME DATA MODEL EXAMPLES First 1960s-1970s File System VMS/VSAM Second 1970s Hierarchical and Network IMS, ADABAS, IDS-II Third Mid- 1970s Relational DB2, Oracle, MSSQL Server, MySQL Fourth Mid-1980s Object-oriented Versant, Objectivity/DB, Object/relational (O/R) DB2 UDB, Oracle 11g Fifth Mid-1990s XML dbXML, Tamino, DB2 UDB, Oracle 11g, Hybrid DBMS MS SQL Server Emerging Models: Late 2000s to present Key-value store SimpleDB (Amazon) NoSQL Column store BigTable (Google) Cassandra (Apache) Hierarchical and Network Models HIERARCHICAL MODELS NETWORK MODELS ▪ Manage large amounts of data for ▪ Represent complex data complex manufacturing projects relationships more effectively ▪ Represented by an upside-down ▪ Improve database performance and tree which contains segments impose a database standard ◦ Segments: Equivalent of a file ▪ Depicts both one-to-many (1:M) and system’s record type many-to-many (M:N) relationships ▪ Depicts a set of one-to-many (1:M) relationships 16 Hierarchical Model ADVANTAGES DISADVANTAGES ▪ Promotes data sharing ▪ Requires knowledge of physical data storage characteristics ▪ Parent/child relationship promotes conceptual simplicity and data ▪ Navigational system requires integrity knowledge of hierarchical path ▪ Database security is provided and ▪ Changes in structure require enforced by DBMS changes in all application programs ▪ Efficient with 1:M relationships ▪ Implementation limitations ▪ No data definition ▪ Lack of standards 17 A Hierarchical Structure Network Model ADVANTAGES DISADVANTAGES ▪ Conceptual simplicity ▪ System complexity limits efficiency ▪ Handles more relationship types ▪ Navigational system yields complex implementation, application ▪ Data access is flexible development, and management ▪ Data owner/member relationship ▪ Structural changes require changes promotes data integrity in all application programs ▪ Conformance to standards ▪ Includes data definition language (DDL) and data manipulation language (DML) A Network Data Model Standard Database Concepts Schema Conceptual organization of the entire database as viewed by the database administrator Subschema Portion of the database seen by the application programs that produce the desired information from the data within the database Standard Database Concepts Data manipulation language (DML) Environment in which data can be managed and is used to work with the data in the database Schema data definition language (DDL) Enables the database administrator to define the schema components The Relational Model ▪ Developed by E.F. Codd (IBM) in 1970 ▪ Produced an automatic transmission database that replaced standard transmission databases ▪ Table (relations) – Matrix consisting of row/column intersections – Each row in a relation is called a tuple ▪ Relational models were considered impractical in 1970 ▪ Model was conceptually simple at expense of computer overhead Relational Database Management System (RDBMS) ▪ Performs basic functions provided by the hierarchical and network DBMS systems ▪ Makes the relational data model easier to understand and implement ▪ Hides the complexities of the relational model from the user ▪ Relational diagram - Representation of entities, attributes, and relationships ▪ Relational table stores collection of related entities Relational Tables A Relational Diagram Cengage Learning © 2015 Relational Model ADVANTAGES DISADVANTAGES ▪ Structural independence is ▪ Requires substantial hardware and promoted using independent system software overhead tables ▪ Conceptual simplicity gives ▪ Tabular view improves conceptual untrained people the tools to use a simplicity good system poorly ▪ Ad hoc query capability is based ▪ May promote information problems on SQL ▪ Isolates the end user from physical-level details ▪ Improves implementation and management simplicity SQL-Based Relational Database Application SQL-based relational database application involves three parts: ▪ End-user interface Allows end user to interact with the data ▪ Collection of tables stored in the database Each table is independent from another Rows in different tables are related based on common values in common attributes ▪ SQL engine Executes all queries The Entity Relationship Model ▪ Widely accepted standard for data modeling ▪ Introduced by Chen in 1976 ▪ Graphical representation of entities and their relationships in a database structure ▪ Entity relationship diagram (ERD) ◦ Uses graphic representations to model database components ▪ Entity instance or entity occurrence ◦ Rows in the relational table ▪ Entity set is collection of like entities ▪ Connectivity: Term used to label the relationship types Entity Relationship Model ADVANTAGES DISADVANTAGES ▪ Visual modeling yields ▪ Limited constraint conceptual simplicity representation ▪ Visual representation makes it an ▪ Limited relationship effective communication tool representation ▪ Is integrated with the dominant ▪ No data manipulation language relational model ▪ Loss of information content occurs when attributes are removed from entities to avoid crowded displays The ER Model Notations The Object-Oriented Data Model (OODM) or Semantic Data Model ▪ Data and relationships are contained in a single structure known as an object ▪ OODM (object-oriented data model) is the basis for OODBMS ― Semantic data model ▪ An object: ― Contains operations ― Are self-contained: a basic building-block for autonomous structures ― Is an abstraction of a real-world entity The Object-Oriented Data Model (OODM) ▪ Attributes describe the properties of an object ▪ Objects that share similar characteristics are grouped in classes ▪ Classes are organized in a class hierarchy ▪ Inheritance: object inherits methods and attributes of parent class ▪ UML based on OO concepts that describe diagrams and symbols ― Used to graphically model a system Object-Oriented Model ADVANTAGES DISADVANTAGES ▪ Semantic content is added ▪ Slow development of standards caused vendors to supply their ▪ Visual representation includes own enhancements semantic content Compromised widely accepted ▪ Inheritance promotes data standard integrity ▪ Complex navigational system ▪ Learning curve is steep ▪ High system overhead slows transactions A Comparison of OO, UML, and ER Models Object/Relational and XML ▪ Extended relational data model (ERDM) ◦ Supports OO features and complex data representation ◦ Object/Relational Database Management System (O/R DBMS) ◦ Based on ERDM, focuses on better data management ▪ Extensible Markup Language (XML) ◦ Manages unstructured data for efficient and effective exchange of all data types Big Data Aims to: ◦ Find new and better ways to manage large amounts of web and sensor-generated data ◦ Provide high performance and scalability at a reasonable cost Characteristics ◦ Volume – amount of data ◦ Velocity – speed in data growth rapidly and speed to process data quickly ◦ Variety – data comes in multiple different data formats Big Data Challenges Volume does not allow the usage of conventional structures Expensive OLAP tools proved inconsistent dealing with unstructured data Big Data New Technologies Hadoop Hadoop Distributed File System (HDFS) MapReduce NoSQL NoSQL Databases ▪ Amazon (search product), Facebook, Youtube, Google Map use NoSQL database ▪ Not based on the relational model ▪ Support distributed database architectures ▪ Provide high scalability, high availability, and fault tolerance ▪ Support large amounts of sparse data ▪ Geared toward performance rather than transaction consistency ▪ Store data in key-value stores NoSQL ADVANTAGES DISADVANTAGES ▪ High scalability, availability, and ▪ Complex programming is fault tolerance are provided required ▪ Uses low-cost commodity ▪ There is no relationship support hardware ▪ There is no transaction integrity ▪ Supports Big Data support ▪ Key-value model improves storage efficiency A Simple Key-value Representation The Evolution of Data Models Data Model Basic Terminology Comparison Degrees of Data Abstraction ▪ Database designer starts with abstracted view, then adds details ▪ ANSI Standards Planning and Requirements Committee (SPARC) ― Defined a framework for data modeling based on degrees of data abstraction (1970s): ✓ External ✓ Conceptual ✓ Internal Data Abstraction: https://youtu.be/LOgfAWwair4 45 Data Abstraction Levels The External Model ▪ End users’ view of the data environment ▪ ER diagrams are used to represent the external views ▪ External schema: Specific representation of an external view ₋ Entities ₋ Relationships ₋ Processes ₋ Constraints External Models for Tiny College The Conceptual Model ▪ Represents a global view of the entire database by the entire organization ▪ All external views integrated into single global view: conceptual schema ▪ Conceptual schema: Basis for the identification and high-level description of the main data objects ▪ ER model most widely used ▪ ERD graphically represents the conceptual schema ▪ Has a macro-level view of data environment ▪ Is software and hardware independent – Does not depend on the DBMS software used to implement the model – Does not depend on the hardware used in the implementation of the model – Changes in hardware or software do not affect database design at the conceptual level Conceptual Model for Tiny College The Internal Model ▪ Representing database as seen by the DBMS ― mapping conceptual model to the DBMS ▪ Internal schema: Specific representation of an internal model ▪ Uses the database constructs supported by the chosen database ▪ Is software dependent and hardware independent ― Change in DBMS software requires internal model be changed ▪ Logical independence: Changing internal model without affecting the conceptual model Internal Model for Tiny College The Physical Model ▪ Operates at lowest level of abstraction ― Describes the way data are saved on storage media such as disks or tapes ▪ Requires the definition of physical storage and data access methods ▪ Relational model aimed at logical level ― Does not require physical-level details ▪ Physical independence: Changes in physical model do not affect internal model Levels of Data Abstraction Cengage Learning © 2015 REVIEW QUESTIONS 1. Why data models is important? 2. What are database basic building blocks? 3. What is business rules? 4. What are the sources of business rules? 5. How data models evolve? 6. What are data abstraction? 7. What are the degree of data abstraction? Summary A data model is an abstraction of a complex real-world data environment Basic data modeling components: – Entities – Attributes – Relationships – Constraints Business rules identify and define basic modeling components Summary Hierarchical model – Set of one-to-many (1:M) relationships between a parent and its children's segments Network data model – Uses sets to represent 1:M relationships between record types Relational model – Current database implementation standard – ER model is a tool for data modeling Complements relational model Summary Object-oriented data model: object is basic modeling structure Relational model adopted object-oriented extensions: extended relational data model (ERDM) OO data models depicted using UML Data-modeling requirements are a function of different data views and abstraction levels – Three abstraction levels: external, conceptual, internal