Data Modeling and Partitioning

What is the recommended partitioning strategy when facts contain many optional dimensional keys?

Partitioning on a date key

What is the primary purpose of creating views in a data model?

To control access to certain data elements

What is the correct sequence of steps in reverse engineering a database?

PDM, LDM, CDM

What is the purpose of a Data Model Scorecard?

To evaluate data model quality

Why is contextual organization important in data modeling?

To group entities by subject area or function

What is the benefit of using views in data modeling?

To standardize common objects or queries

When reverse engineering, what does the Logical Data Model (LDM) document?

The business solution that the existing system meets

Why is continuous improvement important in data modeling?

To ensure model correctness, completeness, and consistency

What is the primary purpose of a metadata repository?

To offer an easily accessible way to view and navigate the contents of the repository

What type of data model patterns represent the building blocks that span the business and data modeler worlds?

Assembly patterns

What is the purpose of integration patterns in data modeling?

To provide a framework for linking assembly patterns in common ways

What is an industry data model?

A data model pre-built for an entire industry, such as healthcare or telecom

Where can industry data models be obtained from?

Through vendors or industry groups

Why may an organization need to customize a purchased industry data model?

Because it will have been developed from multiple other organizations’ needs

What determines the level of customization required for an industry data model?

How close the model is to an organization’s needs, and how detailed the most important parts are

What are elementary data model patterns used for?

To resolve many-to-many relationships, and to construct self-referencing hierarchies

What is the primary purpose of a data model?

To make data easier to consume

What does a data model help to explain?

The boundaries for data context and implementation

What is a key benefit of data modeling in terms of knowledge retention?

It preserves corporate memory

What is the role of a data model in understanding an organization or business area?

To understand data structures

What is data modeling most frequently performed in the context of?

System development lifecycle (SDLC)

What is the primary goal of data modeling?

To understand the data structure

What is the analogy used to describe the role of a data modeler?

A mapmaker learning and documenting a geographic landscape

What is the importance of understanding the vocabulary that supports data modeling?

Data modeling is about the process of definition, so the vocabulary that supports the practice must be understood

What is the main benefit of conformed dimensions?

They can be shared across dimensional models

What is the purpose of standardizing definitions of terms in conformed facts?

To ensure consistent terminology and values across individual marts

What is the Unified Modeling Language (UML) primarily used for?

Modeling software

What is a key feature of a UML Class Model?

It has an Operations or Methods section

What is the equivalent of Operations in ER diagrams?

Stored Procedures

What is represented by 'Stdntno' in the UML Class Model in Figure 41?

Student number

What is the data type of 'Strtdt' in the UML Class Model in Figure 41?

Date

What is the name of the operation that represents the expected graduation date in the UML Class Model in Figure 41?

ExpctGraddt

What type of entity is Student in the example shown in Figure 38?

Independent entity

What is the characteristic of an identifying relationship?

Primary key is migrated as a primary foreign key attribute

What is the purpose of a domain in data modeling?

To standardize the characteristics of the attributes

What are values that fall outside an attribute's assigned domain called?

Invalid values

What is the result of migrating the primary key of the parent as a non-primary foreign key attribute to the child?

Non-identifying relationship

What is the name of the entity that relies on other entities in the example shown in Figure 38?

Registration

What is a set of possible values that an attribute can be assigned?

Domain

What is the purpose of assigning a domain to an attribute?

To standardize the characteristics of the attributes

Study Notes

Data Modeling

  • Data modeling is a process that requires quality control, and continuous improvement practices should be employed.
  • Techniques such as time-to-value, support costs, and data model quality validators can be used to evaluate the model for correctness, completeness, and consistency.

Partitioning for Performance

  • Partitioning on a date key is recommended, especially when facts contain many optional dimensional keys (sparse).
  • When partitioning on a date key is not possible, a study is required based on profiled results and workload analysis to propose and refine the subsequent partitioning model.
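
The idea of partitioning on a date key can be sketched in a few lines. This is a minimal illustration, not a database feature: the fact rows, the column names (`date_key`, `customer_key`, `promo_key`, `amount`), and the monthly granularity are all assumptions made for the example. It shows why the date key is a convenient choice when other dimensional keys are sparse: every fact row has a date, so every row lands in exactly one partition.

```python
from collections import defaultdict
from datetime import date


def partition_key(date_key: date) -> str:
    """Return a monthly partition label such as '2024-01' (granularity is an assumption)."""
    return f"{date_key.year:04d}-{date_key.month:02d}"


def partition_facts(facts):
    """Group fact rows into partitions using only the date key."""
    partitions = defaultdict(list)
    for row in facts:
        partitions[partition_key(row["date_key"])].append(row)
    return dict(partitions)


# Hypothetical sparse facts: optional dimensional keys are often None,
# but the date key is always populated.
facts = [
    {"date_key": date(2024, 1, 15), "customer_key": 7, "promo_key": None, "amount": 40.0},
    {"date_key": date(2024, 1, 28), "customer_key": None, "promo_key": 3, "amount": 15.5},
    {"date_key": date(2024, 2, 2), "customer_key": 9, "promo_key": None, "amount": 22.0},
]

partitions = partition_facts(facts)
for label in sorted(partitions):
    print(label, len(partitions[label]))
```

A query restricted to one month would then only need to read the matching partition. In a real DBMS the partitioning is declared in the physical design rather than coded by hand.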

Creating Views

  • Views can be used to control access to certain data elements, or to embed common join conditions or filters to standardize common objects or queries.
  • Views themselves should be requirements-driven, and in many cases, they will need to be developed via a process that mirrors the development of the LDM and PDM.
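
Both uses of views can be shown with a small sketch using Python's built-in `sqlite3` module. The tables, columns, and the view name `open_order_summary` are hypothetical. The view leaves out a sensitive column (`tax_id`) and embeds a common join and filter, so consumers query one standardized object. SQLite has no user permissions, so the access-control side is only illustrated by what the view exposes; in a DBMS with grants, access would typically be given on the view instead of the base tables.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    customer_id INTEGER PRIMARY KEY,
    name        TEXT,
    tax_id      TEXT              -- sensitive, should not be widely visible
);
CREATE TABLE orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER REFERENCES customer(customer_id),
    status      TEXT,
    amount      REAL
);
INSERT INTO customer VALUES (1, 'Ada', 'SECRET-1'), (2, 'Grace', 'SECRET-2');
INSERT INTO orders VALUES (10, 1, 'open', 50.0), (11, 1, 'closed', 20.0), (12, 2, 'open', 75.0);

-- The view hides tax_id and standardizes the join and the 'open' filter.
CREATE VIEW open_order_summary AS
SELECT c.customer_id, c.name, o.order_id, o.amount
FROM customer AS c
JOIN orders AS o ON o.customer_id = c.customer_id
WHERE o.status = 'open';
""")

cursor = conn.execute("SELECT * FROM open_order_summary ORDER BY order_id")
view_columns = [column[0] for column in cursor.description]
view_rows = cursor.fetchall()
print(view_columns)
print(view_rows)
```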

Reverse Engineering

  • Reverse engineering is the process of documenting an existing database.
  • The PDM is completed first to understand the technical design of an existing system, followed by an LDM to document the business solution that the existing system meets, and then the CDM to document the scope and key terminology within the existing system.
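
The first step, recovering the physical design, can be sketched by reading a database's own catalog. The example below uses SQLite through Python's `sqlite3` module; the helper name `describe_schema` and the sample tables are assumptions for illustration. It only documents tables, columns, declared types, and primary keys. Deriving the LDM and CDM from that output remains an analysis task that code alone does not perform.

```python
import sqlite3


def describe_schema(conn):
    """Return {table: [(column, declared_type, is_primary_key), ...]} for a SQLite database."""
    tables = conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()
    schema = {}
    for (table,) in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        columns = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = [(col[1], col[2], bool(col[5])) for col in columns]
    return schema


# A stand-in for the "existing database" being documented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_no INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE registration (
    student_no  INTEGER REFERENCES student(student_no),
    course_code TEXT,
    PRIMARY KEY (student_no, course_code)
);
""")

schema = describe_schema(conn)
for table, columns in schema.items():
    print(table, columns)
```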

Data Model Patterns

  • Data model patterns are reusable modeling structures that can be applied to a wide class of situations.
  • There are elementary, assembly, and integration data model patterns.
  • Elementary patterns are the ‘nuts and bolts’ of data modeling, and include ways to resolve many-to-many relationships, and to construct self-referencing hierarchies.
  • Assembly patterns represent the building blocks that span the business and data modeler worlds.
  • Integration patterns provide the framework for linking the assembly patterns in common ways.
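
The two elementary patterns named above can be sketched with SQLite through Python's `sqlite3` module. All table and column names are hypothetical. The `enrollment` table resolves a many-to-many relationship between students and courses into two one-to-many relationships, and `org_unit` builds a self-referencing hierarchy by pointing each row at its parent row in the same table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Pattern 1: many-to-many resolved by an associative table.
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE course  (course_id  INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL REFERENCES student(student_id),
    course_id  INTEGER NOT NULL REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);

-- Pattern 2: self-referencing hierarchy (parent is a row of the same table).
CREATE TABLE org_unit (
    unit_id        INTEGER PRIMARY KEY,
    name           TEXT NOT NULL,
    parent_unit_id INTEGER REFERENCES org_unit(unit_id)
);

INSERT INTO student VALUES (1, 'Ada'), (2, 'Grace');
INSERT INTO course VALUES (100, 'Databases'), (101, 'Statistics');
INSERT INTO enrollment VALUES (1, 100), (1, 101), (2, 100);
INSERT INTO org_unit VALUES
    (1, 'Company', NULL), (2, 'Sales', 1), (3, 'Engineering', 1), (4, 'Platform', 3);
""")

# Navigate the many-to-many relationship through the associative table.
enrollments = conn.execute("""
    SELECT s.name, c.title
    FROM enrollment AS e
    JOIN student AS s ON s.student_id = e.student_id
    JOIN course  AS c ON c.course_id  = e.course_id
    ORDER BY s.name, c.title
""").fetchall()

# Walk the hierarchy from the root with a recursive common table expression.
hierarchy = conn.execute("""
    WITH RECURSIVE descendants(unit_id, name, depth) AS (
        SELECT unit_id, name, 0 FROM org_unit WHERE parent_unit_id IS NULL
        UNION ALL
        SELECT o.unit_id, o.name, d.depth + 1
        FROM org_unit AS o
        JOIN descendants AS d ON o.parent_unit_id = d.unit_id
    )
    SELECT name, depth FROM descendants ORDER BY depth, name
""").fetchall()

print(enrollments)
print(hierarchy)
```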

Industry Data Models

  • Industry data models are data models pre-built for an entire industry, such as healthcare, telecom, insurance, banking, or manufacturing.
  • These models are often both broad in scope and very detailed, and can be purchased through vendors or obtained through industry groups.
  • Any purchased data model will need to be customized to fit an organization, as it will have been developed from multiple other organizations’ needs.

Data Modeling and Data Models

  • Data modeling is most frequently performed in the context of systems development and maintenance efforts, known as the system development lifecycle (SDLC).
  • Data modeling is about the process of definition, and it is important to understand the vocabulary that supports the practice.

Entities and Attributes

  • Dependent entities have at least one identifying relationship, in which the primary key of the parent (the entity on the "one" side of the relationship) is migrated as a foreign key into the child's primary key.
  • In non-identifying relationships, the primary key of the parent is migrated as a non-primary foreign key attribute to the child.
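
The difference between the two relationship types shows up directly in table definitions. The sketch below uses SQLite through Python's `sqlite3` module. The `advisor` table, the column names, and the helper `key_roles` are assumptions for illustration; only the Student and Registration entity names come from the quiz above. A migrated foreign key that is part of the child's primary key marks an identifying relationship, and one that sits outside the primary key marks a non-identifying relationship.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE advisor (advisor_id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Student is independent: its primary key contains no migrated key.
-- advisor_id is a non-identifying foreign key (outside the primary key).
CREATE TABLE student (
    student_no INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    advisor_id INTEGER REFERENCES advisor(advisor_id)
);

-- Registration is dependent: student_no is migrated into its primary key.
CREATE TABLE registration (
    student_no    INTEGER NOT NULL REFERENCES student(student_no),
    course_code   TEXT NOT NULL,
    registered_on TEXT NOT NULL,
    PRIMARY KEY (student_no, course_code)
);
""")


def key_roles(conn, table):
    """Classify each foreign key column of a table as identifying or non-identifying."""
    # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
    pk_columns = {row[1] for row in conn.execute(f'PRAGMA table_info("{table}")') if row[5]}
    # PRAGMA foreign_key_list rows: (id, seq, table, from, to, on_update, on_delete, match)
    fk_columns = {row[3] for row in conn.execute(f'PRAGMA foreign_key_list("{table}")')}
    return {
        column: "identifying" if column in pk_columns else "non-identifying"
        for column in fk_columns
    }


print(key_roles(conn, "registration"))
print(key_roles(conn, "student"))
```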

Domain

  • In data modeling, a domain is the complete set of possible values that an attribute can be assigned.
  • A domain may be articulated in different ways, and provides a means of standardizing the characteristics of the attributes.
  • All values inside the domain are valid values, and those outside the domain are referred to as invalid values.
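
A domain can be sketched as a named rule that decides whether a value is allowed. The `Domain` class, the two example domains, and the helper `invalid_values` below are hypothetical. They show two of the ways a domain may be articulated: as an explicit list of values and as a range rule.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Domain:
    """A named set of possible values, expressed as a membership test."""
    name: str
    is_valid: Callable[[object], bool]


# Domain articulated as a list of allowed values.
order_status = Domain("OrderStatus", lambda v: v in {"open", "shipped", "closed"})

# Domain articulated as a rule (a numeric range).
percentage = Domain(
    "Percentage",
    lambda v: isinstance(v, (int, float)) and 0 <= v <= 100,
)


def invalid_values(domain, values):
    """Return the values that fall outside the domain."""
    return [v for v in values if not domain.is_valid(v)]


print(invalid_values(order_status, ["open", "lost", "closed"]))
print(invalid_values(percentage, [0, 55.5, 101, -1]))
```

Assigning the same domain to several attributes is what standardizes their characteristics: every attribute that uses `order_status` accepts exactly the same values.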

Conformed Dimensions and Facts

  • Conformed dimensions are built with the entire organization in mind. Because they contain consistent terminology and values, they can be shared across dimensional models.
  • Conformed facts use standardized definitions of terms across individual marts. This matters because different business users may otherwise use the same term in different ways.
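
Sharing a dimension across dimensional models can be sketched with SQLite through Python's `sqlite3` module. The schema is hypothetical: one product dimension (`dim_product`) serves two fact tables, so results from both can be lined up on the same `category` values. This is an illustration of the idea under those assumptions, not a prescribed warehouse design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One conformed dimension, defined once for the whole organization.
CREATE TABLE dim_product (
    product_key  INTEGER PRIMARY KEY,
    product_name TEXT,
    category     TEXT
);
-- Two fact tables (two marts) that both reference the same dimension.
CREATE TABLE fact_sales (
    product_key INTEGER REFERENCES dim_product(product_key),
    units_sold  INTEGER
);
CREATE TABLE fact_returns (
    product_key    INTEGER REFERENCES dim_product(product_key),
    units_returned INTEGER
);
INSERT INTO dim_product VALUES
    (1, 'Widget', 'Hardware'), (2, 'Gadget', 'Hardware'), (3, 'Manual', 'Books');
INSERT INTO fact_sales VALUES (1, 10), (2, 5), (3, 8), (1, 4);
INSERT INTO fact_returns VALUES (1, 2), (3, 1);
""")

# Aggregate each fact table separately, then combine on the shared category.
comparison = conn.execute("""
    WITH sold AS (
        SELECT p.category, SUM(f.units_sold) AS units_sold
        FROM fact_sales AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
    ),
    returned AS (
        SELECT p.category, SUM(f.units_returned) AS units_returned
        FROM fact_returns AS f
        JOIN dim_product AS p ON p.product_key = f.product_key
        GROUP BY p.category
    )
    SELECT sold.category, sold.units_sold, COALESCE(returned.units_returned, 0)
    FROM sold
    LEFT JOIN returned ON returned.category = sold.category
    ORDER BY sold.category
""").fetchall()

print(comparison)
```

Because both marts use the same category values, the two aggregates can be compared directly without any mapping between differing product definitions.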

Object-Oriented (UML)

  • The Unified Modeling Language (UML) is a graphical language for modeling software.
  • The UML class model specifies classes (entity types) and their relationship types.
  • UML has a variety of notations, of which one (the class model) concerns databases.

Learn about data modeling strategies, including partitioning for performance and creating views to control access to data elements.
