Data Models: Database Design Concepts and Evolution

Summary

This document provides an overview of data models and database design, covering topics such as hierarchical, network, relational, and object-oriented models. It examines the basic building blocks and evolution of data modeling approaches, with a focus on different model types and their characteristics. Key concepts include entities, attributes, and relationships in the context of database structures.

Full Transcript

IT1924 Data Models

Database design focuses on how the database structure will be used to store and manage end-user data. Data modeling, the first step in designing a database, refers to the process of creating a specific data model for a determined problem domain. A data model is a relatively simple representation, usually graphical, of more complex real-world data structures. In general terms, a model is an abstraction of a more complex real-world object or event (Coronel and Morris, 2017, p. 36).

Importance of Database Models (Coronel and Morris, 2017)

Data models can facilitate interaction among the designer, the applications programmer, and the end user. A well-developed data model can even foster improved understanding of the organization for which the database design is developed. The importance of data modeling cannot be overstated. Data constitutes the most basic information used by a system. Applications are created to manage data and to help transform data into information, but data is viewed in different ways by different people.

Data Model Basic Building Blocks

The basic building blocks of a data model are the following:
- Entity – A person, place, thing, or event about which data will be collected and stored.
- Attribute – A characteristic of an entity.
- Relationship – An association among entities. There are three (3) types of relationships:
  o One-to-one (1:1) relationship
  o One-to-many (1:M) relationship
  o Many-to-many (M:M) relationship

Evolution of Data Models (Coronel and Morris, 2017)

The quest for better data management has led to several models that attempt to resolve the previous model's critical shortcomings and to provide solutions to ever-evolving data management needs. These models represent schools of thought as to what a database is, what it should do, the types of structures that it should use, and the technology that would be used to implement these structures.
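Before turning to the individual models, the building blocks listed earlier (entity, attribute, relationship) can be made concrete with a short sketch. This is only an illustration under stated assumptions, not part of the handout: the entities CUSTOMER, INVOICE, STUDENT, and COURSE, their attributes, and the helper functions are all hypothetical. It shows a one-to-many (1:M) relationship as a list held by the "one" side and a many-to-many (M:M) relationship as a set of pairs.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class Invoice:
    """Entity (hypothetical): an event about which data is stored."""
    number: int    # attribute
    total: float   # attribute


@dataclass
class Customer:
    """Entity (hypothetical): a person about which data is stored."""
    name: str                                              # attribute
    invoices: List[Invoice] = field(default_factory=list)  # "many" side of 1:M


# M:M relationship between hypothetical STUDENT and COURSE entities.
# Each (student, course) pair records one association.
Enrollment = Tuple[str, str]


def courses_of(student: str, enrollments: Set[Enrollment]) -> List[str]:
    """Return the courses one student takes, sorted by title."""
    return sorted(c for s, c in enrollments if s == student)


def students_in(course: str, enrollments: Set[Enrollment]) -> List[str]:
    """Return the students enrolled in one course, sorted by name."""
    return sorted(s for s, c in enrollments if c == course)


# One customer related to many invoices (1:M).
ana = Customer(name="Ana")
ana.invoices.append(Invoice(number=1001, total=250.0))
ana.invoices.append(Invoice(number=1002, total=75.5))

# Many students related to many courses (M:M).
enrollments: Set[Enrollment] = {
    ("Ben", "Databases"),
    ("Ben", "Networks"),
    ("Cara", "Databases"),
}

print(len(ana.invoices))
print(courses_of("Ben", enrollments))
print(students_in("Databases", enrollments))
```

A one-to-one (1:1) relationship would follow the same pattern with a single reference field instead of a list. Keeping the M:M association as separate pairs mirrors the way such relationships are usually broken into their own structure rather than stored inside either entity.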
Hierarchical Model
- It was developed in the 1960s to manage large amounts of data for complex manufacturing projects.
- The model's basic logical structure is represented by an upside-down tree that contains levels, or segments.
- A segment is the equivalent of a file system's record type.

Network Model
- It was created to represent complex data relationships more effectively than the hierarchical model, to improve database performance, and to impose a database standard.
- Although the network database model is generally not used today, the standard database concepts that emerged with the network model are still used by modern data models:
  o Schema – The conceptual organization of the entire database as viewed by the database administrator.
  o Subschema – The portion of the database "seen" by the application programs that actually produce the desired information from the data in the database.
  o Data Manipulation Language (DML) – It defines the environment in which data can be managed.
  o Data Definition Language (DDL) – It allows the database administrator to define the schema components.

Relational Model
- It was introduced in 1970 by E. F. Codd of IBM.
- The relational model represented a major breakthrough for both users and designers.
- Its foundation is a mathematical concept known as a relation.

Entity Relationship Model
- It was introduced in 1976 by Peter Chen.
- The graphical representation of entities and their relationships in a database structure quickly became popular because it complemented the relational data model concepts.
- The relational data model and the Entity Relationship Model (ERM) are combined to provide the foundation for tightly structured database design.

02 Handout 1 *Property of STI

Object-Oriented Model
- Increasingly complex real-world problems demonstrated a need for a data model that more closely represented the real world.
In the Object-Oriented Data Model (OODM), both data and its relationships are contained in a single structure known as an object. In turn, the OODM is the basis for the Object-Oriented Database Management System (OODBMS).
- The OODM is said to be a semantic data model because it indicates meaning.
- The Object-Oriented Data Model is based on the following components:
  o An object is an abstraction of a real-world entity.
  o Attributes describe the properties of an object.
  o Objects that share similar characteristics are grouped in classes. A class is a collection of similar objects with shared structure (attributes) and behavior (methods).
  o Classes are organized in a class hierarchy. The class hierarchy resembles an upside-down tree in which each class has only one parent.
  o Inheritance is the ability of an object within the class hierarchy to inherit the attributes and methods of the classes above it.
  o Object-oriented data models are typically depicted using Unified Modeling Language (UML) class diagrams. UML is a language based on object-oriented concepts that describes a set of diagrams and symbols you can use to graphically model a system.

Extensible Markup Language (XML) – A metalanguage used to represent and manipulate data elements. Unlike other markup languages, XML permits the manipulation of a document's data elements.

Emerging Data Models: Big Data and NoSQL

Big Data
- It refers to a movement to find new and better ways to manage large amounts of web and sensor-generated data and derive business insight from it, while simultaneously providing high performance and scalability at a reasonable cost.
- The term seems to have been first used in a computing framework by John Mashey, a Silicon Graphics scientist, in the 1990s. However, it seems to be Douglas Laney, a data analyst from the Gartner Group, who first described the basic characteristics of Big Data databases:
  o Volume – It refers to the amounts of data being stored.
  o Velocity – It refers not only to the speed with which data grows but also to the need to process this data quickly in order to generate information and insight.
  o Variety – It refers to the fact that the data being collected comes in multiple different data formats.

NoSQL
- It is a large-scale distributed database system that stores structured and unstructured data in efficient ways.
- Searching for products on Amazon, sending messages on Facebook, watching videos on YouTube, and searching for directions in Google Maps are examples of activities that rely on a NoSQL database.
- The following are the general characteristics of NoSQL databases:
  o They are not based on the relational model and SQL, hence the name NoSQL.
  o They support distributed database architectures.
  o They provide high scalability, high availability, and fault tolerance.
  o They support very large amounts of sparse data.
  o They are geared toward performance rather than transaction consistency.
- NoSQL supports distributed database architecture – One of the big advantages of NoSQL databases is that they generally use a distributed architecture spread across multiple database nodes.
- NoSQL supports very large amounts of sparse data – NoSQL databases can handle very high volumes of data. In particular, they are suited for sparse data, that is, for cases in which the number of attributes is very large but the number of actual data instances is low.
- NoSQL provides high scalability, high availability, and fault tolerance – True to their web origins, NoSQL databases are designed to support web operations, such as the ability to add capacity in the form of nodes to the distributed database when demand is high, and to do it transparently and without downtime.
- Most NoSQL databases are geared toward performance rather than transaction consistency – One of the biggest problems of very large distributed databases is enforcing data consistency.
Distributed databases automatically make copies of data elements at multiple nodes to ensure high availability and fault tolerance.

Degrees of Data Abstraction

In the early 1970s, the American National Standards Institute (ANSI) Standards Planning and Requirements Committee (SPARC) defined a framework for data modeling based on degrees of data abstraction. The resulting ANSI/SPARC architecture defines three (3) levels of data abstraction: external, conceptual, and internal (Coronel and Morris, 2017, p. 60).

External Model
- It is the end user's view of the data environment.
- End users are the people who use the application programs to manipulate the data and generate information.
- ER diagrams will be used to represent the external views. A specific representation of an external view is known as an external schema.

Conceptual Model
- It represents a global view of the entire database by the entire organization.
- Also known as a conceptual schema, it is the basis for the identification and high-level description of the main data objects.

Internal Model
- It is the representation of the database as "seen" by the DBMS.
- It requires the designer to match the conceptual model's characteristics and constraints to those of the selected implementation model.
- An internal schema depicts a specific representation of an internal model, using the database constructs supported by the chosen database.

Physical Model
- It operates at the lowest level of abstraction, describing the way data is saved on storage media such as magnetic, solid state, or optical media.
- It requires the definition of both the physical storage devices and the (physical) access methods required to reach the data within those storage devices, making it both software and hardware dependent (Coronel and Morris, 2017, p. 63).

REFERENCES:
Coronel, C. and Morris, S. (2017). Database systems: design, implementation, and management, 12th edition. USA: Cengage Learning.
Elmasri, R. and Navathe, S. (2016). Fundamentals of database systems, 7th edition. USA: Pearson Higher Education.
Kroenke, D. and Auer, D. (2016). Database processing: fundamentals, design, and implementation. England: Pearson Education Limited.