Questions and Answers
Which statement best describes the process of normalization in database design?
- A method of combining all data into a single table to simplify data access
- A technique for encrypting sensitive data within a database
- A process of increasing data redundancy to improve query performance
- A process for evaluating and correcting table structures to minimize data redundancy and anomalies (correct)
Denormalization consistently results in a higher normal form.
False (B)
What is the primary goal of normalization in database design?
to eliminate data anomalies
A table is in 2NF if it is in 1NF and has no ______ dependencies.
Match the following normal forms with their descriptions:
In which normal form does every determinant have to be a candidate key?
Normalization should not be part of the database design process.
What is the term for an attribute that cannot be further subdivided into meaningful components?
The level of detail represented by the values stored in a table's row is known as ______.
Match the following concepts with their descriptions:
Which of the following is a characteristic of unnormalized data?
Higher levels of normalization are always more desirable for every database design.
What is the term for the process of increasing speed in a database, typically by producing a lower normal form?
According to good DB design, the optimal relational DB is in ______ form.
Match the situations to when normalization would be used
Which of the following best describes a 'repeating group' in the context of normalization?
Fifth Normal Form (5NF) can have further lossless decompositions
What term describes the situation in which one key determines multiple values of two other attributes that are independent of each other?
A surrogate key is a(n) ______ PK with the purpose of simplifying the assignment of PKs to tables.
Match the following Normal Form Rules:
Flashcards
Normalization
Process for evaluating and correcting table structures to minimize data redundancies and anomalies, assigning attributes to entities.
Data Redundancy Reduction
A state where attributes are assigned to entities so that data redundancies are reduced or eliminated.
Denormalization
A design approach that considers end-users' demand for fast performance by producing a lower normal form.
Main Goal of Normalization
To eliminate data anomalies by eliminating unnecessary or unwanted data redundancies.
Unnormalized Data
Raw data in its original state, typically containing multivalued data elements.
Table Characteristics
Each table represents a single subject, each row/column intersection contains a single value, and no data item is unnecessarily stored in more than one table.
1NF (First Normal Form)
Table format with no repeating groups and with the primary key identified.
2NF (Second Normal Form)
A table in 1NF with no partial dependencies.
3NF (Third Normal Form)
A table in 2NF with no transitive dependencies.
Partial Dependency
A functional dependency in which the determinant is only part of the primary key.
Transitive Dependency
A dependency in which an attribute depends on another attribute that is not part of the primary key.
Repeating Group
A group of multiple entries of the same type that exist for a single key attribute occurrence.
Update Anomalies
Errors that arise when a value stored redundantly in many rows must be modified.
Insertion Anomalies
Errors that arise when adding a new row forces the entry of otherwise unneeded or unavailable data.
Deletion Anomalies
Errors that arise when deleting a row unintentionally removes data that is still needed.
Composite Primary Key
A primary key made up of more than one attribute.
Determinant
Any attribute whose value determines the values of other attributes within a row.
Surrogate Key
An artificial primary key used to simplify the assignment of primary keys to tables.
Atomic Attribute
An attribute that cannot be further subdivided into meaningful components.
Boyce-Codd Normal Form (BCNF)
A special case of 3NF in which every determinant is a candidate key.
Study Notes
Normalization Overview
- Normalization evaluates and corrects table structures to minimize data redundancies and anomalies.
- It assigns attributes to entities, reducing or eliminating data redundancies.
- The process is based on determination and functional dependency.
- These notes cover three stages of normal forms, with 3NF being the best of the three, although the highest normal form is not always the most desirable design.
Normal Forms Stages:
- 1NF (First Normal Form): Considered the worst.
- 2NF (Second Normal Form).
- 3NF (Third Normal Form): Considered the best.
- Higher normal forms require more relational join operations to produce a specified output, which consumes more system resources.
- A successful design balances design elegance against end users' demand for fast performance.
- Denormalization produces a lower normal form, usually to increase speed, but it can reintroduce data anomalies and redundancy (e.g., moving from 3NF back to 2NF).
Need for Normalization
- Normalization is used in conjunction with entity-relationship modeling.
- Normalization is applied in two main situations:
- After design completion, database designers use normalization to analyze relationships among attributes in each entity to determine if a structure can be improved.
- Database designers use normalization to modify existing data structures to improve the database design and create an appropriate structure.
- The analysis is the same whether designing a new structure or evaluating an existing one.
- It is rare to design a completely new database using just normalization.
Goals and Steps of Normalization
- The main goal is to eliminate data anomalies by eliminating unnecessary or unwanted data redundancies.
- Functional dependency is used to identify which attribute (or set of attributes) determines the values of other attributes.
- The steps include:
- Identifying business rules.
- Identifying and defining business and data constraints.
- Defining functional dependencies.
- Identifying entities, attributes, and relationships.
- Eliminating multivalued attributes.
Unnormalized Data and Table Characteristics
- Unnormalized data is raw data in its original state.
- It typically contains multivalued data elements.
- Table characteristics:
- Each relation/table represents a single subject.
- Each row/column intersection contains a single value, not a group of values.
- No data item is unnecessarily stored in more than one table.
- Ensures data is updated in one place.
- Every nonprime attribute depends on the primary key (PK): the whole key and nothing but the key.
- This ensures that each row is uniquely identifiable by its PK.
- Prevents insertion, update, or deletion anomalies.
Normal Forms Explained
- 1NF: Table format with no repeating groups, and primary keys are identified.
- 2NF: 1NF plus no partial dependencies.
- 3NF: 2NF plus no transitive dependencies.
- The normal forms are presented here from the perspective of a single candidate key, assuming each table has one candidate key that serves as the primary key.
- A data modeler's objective is ensuring tables are in at least 3NF.
Dependencies: Partial and Transitive
- Normalization starts by identifying dependencies and progressively breaking the table into a new set of relations based on those dependencies.
- Partial dependencies: functional dependencies in which the determinant is only part of the (composite) primary key.
- Partial dependencies are straightforward and easy to identify.
- Transitive dependencies: an attribute depends on another attribute that is not part of the primary key (X determines Z by way of Y).
- Transitive dependencies are more difficult to identify and typically occur among nonprime attributes.
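As an illustration (the table and attribute names below are hypothetical assumptions, not taken from these notes), here is a sketch of how both dependency types can coexist in one table keyed by (proj_num, emp_num):

```python
# Hypothetical 1NF rows keyed by the composite PK (proj_num, emp_num).
# Functional dependencies in this design:
#   proj_num            -> proj_name              (partial: determinant is part of the PK)
#   emp_num             -> emp_name, job_class    (partial)
#   job_class           -> chg_hour               (transitive: nonprime -> nonprime)
#   (proj_num, emp_num) -> hours                  (full dependency on the whole PK)
rows = [
    # proj_num, emp_num, proj_name, emp_name,  job_class,  chg_hour, hours
    (10, 1, "Alpha", "A. Lee",  "Designer", 105.0, 19.4),
    (10, 2, "Alpha", "B. Diaz", "Engineer",  84.5, 23.8),
    (20, 2, "Beta",  "B. Diaz", "Engineer",  84.5, 20.7),  # redundancy: Diaz's data repeats
]
```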
First Normal Form (1NF)
- A repeating group is a set of multiple entries of the same type that exist for a single key attribute occurrence.
- Repeating groups must be eliminated by ensuring that each row defines a single entity instance.
- Each row-column intersection has only a single value.
- Identifying the normal forms lets you know where you are in the normalization process.
- A three-step removal procedure:
- Eliminate repeating groups by changing the table's focus from project to assignment and creating separate rows.
- Identify the primary key.
- Identify all dependencies.
- A dependency diagram shows all data dependencies and provides a bird's-eye view of the relationships among a table's attributes.
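A minimal Python sketch (hypothetical data, not the notes' own example) of the 1NF step: the repeating group is flattened so that each row describes a single entity instance and every row/column intersection holds one value.

```python
# Unnormalized: one entry per project, with a repeating group of assignments.
unnormalized = [
    {"proj_num": 10, "proj_name": "Alpha",
     "assignments": [(1, 19.4), (2, 23.8)]},   # repeating group: (emp_num, hours)
    {"proj_num": 20, "proj_name": "Beta",
     "assignments": [(2, 20.7)]},
]

# 1NF: flatten the repeating group; the composite key (proj_num, emp_num)
# now identifies each row, and no cell holds a group of values.
first_nf = [
    (proj["proj_num"], emp_num, proj["proj_name"], hours)
    for proj in unnormalized
    for emp_num, hours in proj["assignments"]
]

for row in first_nf:
    print(row)   # e.g. (10, 1, 'Alpha', 19.4)
```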
Anomalies
- Update anomalies: modifying a value that is stored redundantly requires changes in many rows.
- Insertion anomalies: adding a new row forces the entry of otherwise unneeded or unrelated data.
- Deletion anomalies: deleting a row unintentionally removes data that is still needed.
Conversion to Second Normal Form (2NF)
- A table is in 2NF if it is in 1NF and has no partial dependencies, meaning no attribute depends on only a portion of the primary key.
- This conversion is needed only when the 1NF table has a composite primary key.
- If the 1NF table has a single-attribute primary key, it is automatically in 2NF.
- Steps:
- Make a new table for each component of the composite primary key that acts as a determinant; that component becomes the new table's primary key (see the sketch below).
- Reassign the corresponding dependent attributes to the new tables; in the original table, the key components remain and also serve as foreign keys.
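A sketch of the 2NF step using SQLite from Python (the table and column names are illustrative assumptions): the partial dependencies proj_num → proj_name and emp_num → emp_name are moved to new tables keyed by the individual PK components, and the original table keeps only the composite key plus the attribute that depends on the whole key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 1NF table with a composite PK and partial dependencies:
--   proj_num -> proj_name   and   emp_num -> emp_name
CREATE TABLE assignment_1nf (
    proj_num  INTEGER,
    emp_num   INTEGER,
    proj_name TEXT,
    emp_name  TEXT,
    hours     REAL,
    PRIMARY KEY (proj_num, emp_num)
);

-- 2NF: each PK component that acts as a determinant keys its own table ...
CREATE TABLE project  (proj_num INTEGER PRIMARY KEY, proj_name TEXT);
CREATE TABLE employee (emp_num  INTEGER PRIMARY KEY, emp_name  TEXT);

-- ... and the original table keeps the key components (now also foreign
-- keys) plus the attribute that depends on the entire composite key.
CREATE TABLE assignment (
    proj_num INTEGER REFERENCES project (proj_num),
    emp_num  INTEGER REFERENCES employee (emp_num),
    hours    REAL,
    PRIMARY KEY (proj_num, emp_num)
);
""")
```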
Conversion to Third Normal Form (3NF)
- In the 2NF step, the dependent attributes of each partial dependency were removed from the original table and placed in a new table with their determinant, and foreign keys were created.
- A partial dependency can exist only when a table's primary key has several attributes; a table with a single-attribute primary key is automatically in 2NF.
- A table is in 3NF when it is in 2NF and has no transitive dependencies.
- Steps:
- Make a new table to eliminate transitive dependencies.
- A determinant is any attribute whose value within a row determines other values.
- For every transitive dependency, write a copy of its determinant in a new table.
- Reassign the corresponding dependent attributes: place them in the new tables with their determinants and remove them from the original tables.
- Draw a new dependency diagram to show all defined tables.
- Name the table to reflect its contents and functions.
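A companion sketch of the 3NF step, again with illustrative SQLite names: the transitive dependency job_class → chg_hour is removed by giving the determinant its own table and leaving job_class behind as a foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 2NF table with a transitive dependency among nonprime attributes:
--   emp_num -> job_class   and   job_class -> chg_hour
CREATE TABLE employee_2nf (
    emp_num   INTEGER PRIMARY KEY,
    emp_name  TEXT,
    job_class TEXT,
    chg_hour  REAL
);

-- 3NF: the determinant of the transitive dependency keys a new table ...
CREATE TABLE job (
    job_class TEXT PRIMARY KEY,
    chg_hour  REAL
);

-- ... and the dependent attribute is removed from the original table,
-- with job_class remaining as a foreign key.
CREATE TABLE employee (
    emp_num   INTEGER PRIMARY KEY,
    emp_name  TEXT,
    job_class TEXT REFERENCES job (job_class)
);
""")
```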
Improving Design
- Beyond normalization, the design can be improved to enhance the database's ability to provide information and to improve its operational characteristics.
- Normalization itself doesn't guarantee good designs; it helps eliminate data redundancies.
- Steps:
- Minimize data-entry errors.
- Evaluate naming conventions.
- Refine attribute atomicity.
Errors and Keys
- Surrogate keys are artificial primary keys used to simplify the assignment of primary keys to tables.
- These keys are usually numeric, generated automatically by DBMS, are free of semantic content, are unique, and are hidden from the end user.
- When a design allows descriptive entries in determinant attributes, data-entry errors can easily lead to referential integrity violations; adding an attribute that serves purely as a unique identifier (a surrogate key) avoids this.
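A small SQLite sketch (illustrative names, not from the notes) of a surrogate key: the DBMS generates the numeric key, the value itself carries no semantic content, and the descriptive candidate key is still kept unique.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE job (
        job_id    INTEGER PRIMARY KEY,   -- surrogate key, assigned by the DBMS
        job_class TEXT UNIQUE NOT NULL,  -- descriptive candidate key, kept unique
        chg_hour  REAL
    )
""")
conn.execute("INSERT INTO job (job_class, chg_hour) VALUES (?, ?)",
             ("Database Designer", 105.0))
print(conn.execute("SELECT job_id, job_class FROM job").fetchall())
# [(1, 'Database Designer')] -- the numeric value itself means nothing
```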
Atomic Attributes
- Atomic attributes cannot be further subdivided to produce meaningful components (e.g., a last name).
- Atomicity means an attribute is treated as a single, indivisible unit that holds a single value.
- Improving atomicity increases querying flexibility.
- Designers prefer single-valued attributes based on business rules and processing requirements.
- Identify new attributes and relationships.
- Refine primary keys as required for data granularity, and ensure a suitable primary key exists (evaluating the need for a surrogate key).
Further Refinement of Normalization
- Storing derived attributes can make it easier to write application software that produces the desired results.
- Evaluate whether and how derived attributes should be implemented.
- Granularity is the level of detail represented by the values stored in a table's row.
- Data stored at the lowest level of granularity is atomic data (e.g., values recorded daily rather than summarized weekly).
- Designers also ensure candidate key uniqueness.
- Designers must identify granularity levels and evaluate the need for a surrogate key.
Boyce-Codd Normal Form (BCNF)
- A special case of 3NF in which every determinant is a candidate key.
- When a table contains only one candidate key, 3NF and BCNF are equivalent.
- BCNF can be violated only when a table contains more than one candidate key.
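A hedged sketch of a BCNF violation (a commonly used illustration, not an example from these notes): the first table is keyed by (stu_id, staff_id), but class_code → staff_id makes class_code a determinant that is not a candidate key, so the table is in 3NF yet not in BCNF.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- 3NF but not BCNF: class_code determines staff_id, yet class_code is not
-- a candidate key of the table.
CREATE TABLE enrollment_3nf (
    stu_id     INTEGER,
    staff_id   INTEGER,
    class_code TEXT,
    grade      TEXT,
    PRIMARY KEY (stu_id, staff_id)
);

-- BCNF decomposition: every determinant now serves as a candidate key.
CREATE TABLE class (
    class_code TEXT PRIMARY KEY,
    staff_id   INTEGER
);
CREATE TABLE enrollment (
    stu_id     INTEGER,
    class_code TEXT REFERENCES class (class_code),
    grade      TEXT,
    PRIMARY KEY (stu_id, class_code)
);
""")
```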
Additional Normal Forms
- Fourth Normal Form (4NF): When in 3NF with no multivalued dependencies.
- A multivalued dependency occurs when one key determines multiple values of two other attributes and those attributes are independent of each other; storing both facts in a single table is likely to produce rows containing nulls.
- Solution: Create a new table for components of multi-valued dependency.
- All attributes must depend on the primary key and be independent of each other.
- No row may contain two or more multivalued facts about an entity.
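An illustrative sketch of the 4NF fix (hypothetical employee/volunteer schema): the two independent multivalued facts are separated so that each new table records exactly one multivalued fact about the employee.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Not in 4NF: two independent multivalued facts about an employee
-- (volunteer organizations and service assignments) share one table.
CREATE TABLE emp_volunteer (
    emp_num  INTEGER,
    org_code TEXT,
    svc_code TEXT,
    PRIMARY KEY (emp_num, org_code, svc_code)
);

-- 4NF: one table per multivalued dependency, each depending only on emp_num.
CREATE TABLE emp_organization (
    emp_num  INTEGER,
    org_code TEXT,
    PRIMARY KEY (emp_num, org_code)
);
CREATE TABLE emp_service (
    emp_num  INTEGER,
    svc_code TEXT,
    PRIMARY KEY (emp_num, svc_code)
);
""")
```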
Fifth Normal Form (5NF)
- A table is in 5NF when it is in 4NF and cannot be further decomposed losslessly.
- Also called Project-Join Normal Form (PJ/NF): the table cannot be decomposed any further without losing data or creating incorrect information.
- Lossless decomposition: the decomposed tables recreate the original table when joined back together.
- The result contains no missing data and no new erroneous data.
- The relationship between decomposed tables is represented by a relational diagram.
- While higher normal forms can provide value, that value is limited by the additional processing required to work with the more heavily decomposed data.
Normalization and DB Design
- Normalization should be part of the design process.
- The ERD provides the big-picture (macro) view of an organization's data requirements and operations.
- Building the ERD is itself an iterative process of identifying entities, attributes, and relationships.
- Normalization provides the micro view, focusing on the individual entities within the ERD.
- The normalization process is likewise iterative and incremental.
Denormalization and Normalization Purity
- The optimal relational database design is in 3NF, but denormalization is sometimes used to increase processing speed.
- The database design considers processing and reporting requirements, and processing speed.
- If sufficient storage space is available, designers have a few choices:
- If performance is not an issue, implement only the viable (normalized) processing options.
- If a report requires it, build a temporary denormalized table from the permanent normalized tables; the temporary table disappears after the report is generated (see the sketch at the end of this section).
- Storing data in a permanently denormalized table is not recommended and should be considered only when performance demands it.
- Normalization purity is difficult to sustain in a modern database environment.
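A sketch of the temporary-denormalization option mentioned above (illustrative SQLite schema, not the notes' own example): a temporary reporting table is built by joining the permanent normalized tables, and it disappears when the session ends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Permanent, normalized tables.
CREATE TABLE project  (proj_num INTEGER PRIMARY KEY, proj_name TEXT);
CREATE TABLE employee (emp_num  INTEGER PRIMARY KEY, emp_name  TEXT);
CREATE TABLE assignment (
    proj_num INTEGER REFERENCES project (proj_num),
    emp_num  INTEGER REFERENCES employee (emp_num),
    hours    REAL,
    PRIMARY KEY (proj_num, emp_num)
);

-- Temporary denormalized table built only for reporting; it is dropped
-- automatically when the connection is closed.
CREATE TEMP TABLE assignment_report AS
SELECT a.proj_num, p.proj_name, a.emp_num, e.emp_name, a.hours
FROM assignment a
JOIN project  p ON p.proj_num = a.proj_num
JOIN employee e ON e.emp_num  = a.emp_num;
""")
```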
Warehouse Data Defects
- Data warehouses routinely use 2NF.
- Unnormalized data defects include:
- Data updates are less efficient because programs must deal with larger tables.
- Indexing is more cumbersome and not practical.
- Unnormalized tables yield no simple strategies for creating virtual tables (called views).