Podcast
Questions and Answers
Which of these anomalies are associated with database design?
Which of these anomalies are associated with database design?
- Insertion
- Deletion
- Modification
- All of the above (correct)
Insertion anomalies occur when certain information cannot be stored unless other, unrelated information is also stored.
Insertion anomalies occur when certain information cannot be stored unless other, unrelated information is also stored.
True (A)
What type of anomaly occurs when deleting a record results in the loss of other related but necessary information?
What type of anomaly occurs when deleting a record results in the loss of other related but necessary information?
deletion anomaly
A database suffers from a(n) __________ anomaly when the same attribute has different values across multiple tuples.
A database suffers from a(n) __________ anomaly when the same attribute has different values across multiple tuples.
Match each database anomaly type with its corresponding description:
Match each database anomaly type with its corresponding description:
What design objective do the informal guidelines for relation schema primarily aim to achieve?
What design objective do the informal guidelines for relation schema primarily aim to achieve?
Combining attributes from multiple entity types into a single relation always simplifies the interpretation of the database.
Combining attributes from multiple entity types into a single relation always simplifies the interpretation of the database.
What kind of problem do NULL values in tuples introduce when performing COUNT or SUM aggregate operations?
What kind of problem do NULL values in tuples introduce when performing COUNT or SUM aggregate operations?
A relation schema should be designed to avoid generating __________ tuples, which are extra rows appearing after a join that do not accurately represent the data.
A relation schema should be designed to avoid generating __________ tuples, which are extra rows appearing after a join that do not accurately represent the data.
Match the informal guidelines for relation schema design with their respective aims:
Match the informal guidelines for relation schema design with their respective aims:
What does a functional dependency X → Y mean?
What does a functional dependency X → Y mean?
If X→Y, it automatically implies that Y→X.
If X→Y, it automatically implies that Y→X.
In the context of functional dependencies, what is the term for the set of inference rules?
In the context of functional dependencies, what is the term for the set of inference rules?
According to the reflexivity rule in Armstrong's axioms, if B is a subset of A, then A→__________.
According to the reflexivity rule in Armstrong's axioms, if B is a subset of A, then A→__________.
Match the Armstrong's Axioms rules with their descriptions:
Match the Armstrong's Axioms rules with their descriptions:
What is the term used to describe the set of all functional dependencies that can be inferred from a given set of functional dependencies, F?
What is the term used to describe the set of all functional dependencies that can be inferred from a given set of functional dependencies, F?
Functional dependency equivalence means that two sets of functional dependencies can infer the exact same dependencies.
Functional dependency equivalence means that two sets of functional dependencies can infer the exact same dependencies.
What term describes a set of functional dependencies that covers another set if every dependency in the second set can be inferred from the first?
What term describes a set of functional dependencies that covers another set if every dependency in the second set can be inferred from the first?
A minimal __________ of functional dependencies is a simplified, non-redundant set that logically implies all dependencies in the original set.
A minimal __________ of functional dependencies is a simplified, non-redundant set that logically implies all dependencies in the original set.
Match the concept with its definition:
Match the concept with its definition:
What is the primary goal of normalization in database design?
What is the primary goal of normalization in database design?
Normalization strictly focuses on optimizing query performance, and not on data integrity.
Normalization strictly focuses on optimizing query performance, and not on data integrity.
Attributes participating in the key are termed what?
Attributes participating in the key are termed what?
The domain of an attribute must only include single valued attributes in __________ normal form.
The domain of an attribute must only include single valued attributes in __________ normal form.
Match the normal form name with the related attribute to it:
Match the normal form name with the related attribute to it:
What is the characteristic of a table in First Normal Form (1NF)?
What is the characteristic of a table in First Normal Form (1NF)?
A table that is in 1NF is automatically in 2NF.
A table that is in 1NF is automatically in 2NF.
What condition related to non-prime attributes must be met for a table to be in Second Normal Form (2NF)?
What condition related to non-prime attributes must be met for a table to be in Second Normal Form (2NF)?
A table satisfies the Third Normal Form (3NF) if it is in 2NF and has no __________ dependency.
A table satisfies the Third Normal Form (3NF) if it is in 2NF and has no __________ dependency.
Match each normal form with its characteristic criterion:
Match each normal form with its characteristic criterion:
What is the key requirement for a relation schema to be in Boyce-Codd Normal Form (BCNF)?
What is the key requirement for a relation schema to be in Boyce-Codd Normal Form (BCNF)?
Any relation that is in BCNF is also automatically in 3NF.
Any relation that is in BCNF is also automatically in 3NF.
BCNF is violated when a non-trivial functional dependency exists where what condition isn't met?
BCNF is violated when a non-trivial functional dependency exists where what condition isn't met?
BCNF is also referred to as __________ Normal Form.
BCNF is also referred to as __________ Normal Form.
Match the dependencies with their definitions:
Match the dependencies with their definitions:
What principle does a lossless-join decomposition uphold?
What principle does a lossless-join decomposition uphold?
A dependency-preserving decomposition ensures that each functional dependency in the original relation can be directly enforced in one of the resulting relations.
A dependency-preserving decomposition ensures that each functional dependency in the original relation can be directly enforced in one of the resulting relations.
What is the goal of relational decomposition w.r.t normal forms?
What is the goal of relational decomposition w.r.t normal forms?
The attribute __________ condition requires that each attribute in the original relation must appear in at least one relation schema in the decomposition.
The attribute __________ condition requires that each attribute in the original relation must appear in at least one relation schema in the decomposition.
Match database decomposition with the correct descriptions:
Match database decomposition with the correct descriptions:
Flashcards
Insertion Anomalies
Insertion Anomalies
Anomalies that occur when inserting data into a database table.
Deletion Anomalies
Deletion Anomalies
Anomalies that occur when deleting data from a database table results in unintended data loss.
Modification Anomalies
Modification Anomalies
Anomalies that occur when updating data in a database table leads to inconsistencies.
Informal Design Guidelines
Informal Design Guidelines
Signup and view all the flashcards
Clear Semantics
Clear Semantics
Signup and view all the flashcards
Data Redundancy
Data Redundancy
Signup and view all the flashcards
Schema Design Guideline
Schema Design Guideline
Signup and view all the flashcards
NULL Values
NULL Values
Signup and view all the flashcards
Spurious Tuples
Spurious Tuples
Signup and view all the flashcards
Functional Dependency (FD)
Functional Dependency (FD)
Signup and view all the flashcards
Armstrong's Axioms
Armstrong's Axioms
Signup and view all the flashcards
Reflexivity Rule
Reflexivity Rule
Signup and view all the flashcards
Augmentation Rule
Augmentation Rule
Signup and view all the flashcards
Transitivity Rule
Transitivity Rule
Signup and view all the flashcards
Closure of Functional Dependencies
Closure of Functional Dependencies
Signup and view all the flashcards
Cover of FDs
Cover of FDs
Signup and view all the flashcards
Equivalence of FDs
Equivalence of FDs
Signup and view all the flashcards
Minimal Cover
Minimal Cover
Signup and view all the flashcards
Normalization
Normalization
Signup and view all the flashcards
First Normal Form (1NF)
First Normal Form (1NF)
Signup and view all the flashcards
Second Normal Form (2NF)
Second Normal Form (2NF)
Signup and view all the flashcards
Third Normal Form (3NF)
Third Normal Form (3NF)
Signup and view all the flashcards
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF)
Signup and view all the flashcards
Superkey
Superkey
Signup and view all the flashcards
Candidate Key
Candidate Key
Signup and view all the flashcards
Primary Key
Primary Key
Signup and view all the flashcards
Prime Attribute
Prime Attribute
Signup and view all the flashcards
Nonprime Attribute
Nonprime Attribute
Signup and view all the flashcards
Decomposition
Decomposition
Signup and view all the flashcards
Attribute Preservation
Attribute Preservation
Signup and view all the flashcards
Dependency Preservation
Dependency Preservation
Signup and view all the flashcards
Lossless Join
Lossless Join
Signup and view all the flashcards
Study Notes
Module 4: Normalization
- Module covers database design anomalies, normalization concepts, functional dependencies, Armstrong's Axioms, normal forms (1NF, 2NF, 3NF, BCNF), and lossless join and dependency preserving decomposition.
Database Design Anomalies
- Anomalies in database design include insertion, deletion, and modification issues.
Insertion Anomalies
- Insertion anomalies occur when certain information can't be stored unless other, related information is also stored.
- In an EMP_DEPT relation with attributes Ename, Ssn, Bdate, Address, Dnumber, Dname, and Dmgr_ssn, adding a new employee requires assigning them to a department or using NULLs.
- Adding a new department with no employees requires using NULLs for the employee Ssn, which should be the primary key.
- In an EMP_PROJ relation with attributes Emp#, Proj#, Ename, Pname, and No_hours, a project can't be inserted without assigning an employee, and vice versa.
Deletion Anomalies
- Deletion anomalies occur when deleting information results in the loss of other, related information.
- In an EMP_DEPT relation, deleting the last employee record from a department results in losing information about that department.
- Deleting Borg, James record, leads to losing data about Head Quarters dept.
Modification Anomalies
- Modification anomalies occur when inconsistencies arise due to the need to update multiple copies of repeated data.
- For example, in the EMP_DEPT relation, changing the manager of department 5 requires updating multiple tuples.
- If updates aren't applied consistently across all tuples, the database becomes inconsistent, showing different values for the same department.
Informal Design Guidelines for Relation Schema
- There are four informal guidelines to determine the quality of relation schema design.
- Semantics: ensure the meaning of attributes is clear.
- Redundancy Reduction: design schemas to avoid insertion, deletion, and update anomalies.
- Null value reduction: to limit NULL values in tuples
- Avoidance of Spurious Tuples: disallow the creation of false tuples from joins.
Clear Semantics
- Attributes within a relation schema should have a clear, real-world meaning and proper interpretation.
- Semantics refers to the meaning resulting from interpreting attribute values within a tuple.
Guideline 1
- Design relation schemas to ensure ease of explanation of meaning.
- Avoid combining attributes from multiple entity and relationship types into a single relation.
- Relations should represent a single entity or relationship type for straightforward interpretation.
- Combining multiple entities and relationships can result in semantic ambiguities.
Guideline 1 Violation
- Combining Employee and Department attributes into a single table loses meaning because tuples do not clearly refer to individual relations and their attributes.
Redundancy and Update Anomalies
- Data redundancy exists when the same data is stored in multiple places within a database.
- Redundancy leads to wasted storage space and update anomalies (insertion, deletion, and modification).
Guideline 2
- Design database schemas so that insertion, deletion, and update anomalies do not occur.
Null Values
- Null values indicate attributes that are not applicable, values that are unknown, or values that exist but are unavailable.
- Null can waste storage at the storage level.
- Problems happen with understanding of attributes and with specifying JOIN operations at the logical level.
- NULL values can lead to unpredictable results with COUNT or SUM aggregate functions.
Guideline 3
- Avoid placing attributes that may frequently be NULL in a base relation.
- Design relations to minimize NULL values.
- Attributes with frequent NULLs can be placed in separate relations with the primary key.
Spurious Tuples
- Consider EMP_LOCS(EName, PLocation) and EMP_PROJ1(SSN, PNumber, Hours, PName, PLocation) versus EMP_PROJ(SSN, PNumber, Hours, EName, PName, PLocation).
- The former, when joined, does not give any information about EMP_PROJ
- Natural joins can produce many rows that are not in the initial EMP_PROJ relation
- In a natural join the extra rows are called spurious tuples
- Another design guideline is that to design relation schemas so that they can be joined with equality conditions on attributes that are either primary keys or foreign keys
- This will stop spurious tuples from being generated.
Natural Join & Spurious Tuples
- Decomposing EMP_PROJ into EMP_LOCS and EMP_PROJ1 is undesirable because when they are joined using NATURAL JOIN, we do not get correct information from the original attributes.
- Plocation relates EMP_LOCS and EMP_PROJ1 but is neither a primary nor foreign key in either.
Guideline 4
- Design relation schemas to ensure joins with equality conditions on appropriate attributes (primary key, foreign key) prevent spurious tuples.
- Avoid relations containing matching attributes that are not primary/foreign keys, as joins may produce spurious tuples.
Summary of Design Guidelines
- Design guidelines address anomalies causing redundant work during insertion and modification causing data loss during deletion.
- Design guidelines address waste of storage space from NULLs and difficulty of selections, aggregations, and joins caused by NULL values.
- The guidelines address the generation of invalid and spurious data during joins on base relations with mismatched attributes.
Functional Dependencies
- Functional Dependency (FD) is a constraint between two attribute sets in a database.
- Given schema R with attributes A1, A2, ..., An, functional dependency is denoted by X → Y.
- In any two tuples t1 and t2, if t1[X] = t2[X], then it must also be that t1[Y] = t2[Y].
- X→Y means Y values depend on X values.
- Y is functionally dependent on X.
Key Points on Functional Dependencies
- X→Y indicates Y's value is determined by X's value, but it doesn't imply Y→X.
Functional Dependency Examples
- In EMP_PROJ with attributes SSN, PNO, ENAME, PNAME, PLOC, HOURS,
- SSN (Social Security Number) determines ENAME (Employee Name). -SSN→ENAME
- PNO (Project Number) determines PNAME (Project Name) and PLOCATION.
- {SSN, PNUMBER} determine HOURS. -{SSN, PNUMBER}→HOURS
- {X,Y}→Z can be written as XYZ
Determining Validity
- A→B
- a1 = b1
- a2 = b3 This is determined to be invalid as only unique matching can be used
- B→A
- b1 = a1
- b3= a2
- b2 = a1 Determined to be valid.
Armstrong's Axioms
- Armstrong's axioms provide inference rules for functional dependencies, introduced by William W Armstrong.
- Armstrong test the logical implication of functional dependencies.
- If F is functional dependencies closure of F, denoted as F+ represents all dependencies logically implied by F
- Armstrong's Axioms are rules which, used serially, create closure of functional dependancies
Axiom Rules
- IR1: Reflexivity Rule: If A includes B, then A holds B (A→B).
- IR2: Augmentation Rule: If A→B holds and Y is an attribute set, then AY→BY also holds, indicating adding attributes to dependencies does not change basic dependencies.
- IR3: Transitivity Rule: If A→B and B→C holds, then A→C also holds.
Secondary Rules
- IR4: Union is where A→B, A→C holds, then A→BC holds.
- IR5: Decomposition is where A→BC holds then A→B, A→C hold.
- IR6: composition: If A→B and X→Y holds, then AX→BY holds.
- IR7: Pseudo Transitivity If A→B holds and BC→D holds, then AC→D holds.
Why Armstrong Axioms are Sound and Complete
- Soundness means that, given functional dependencies F on relation schema R, any dependency inferred from F using Armstrong's rules holds in every relation state r of R that satisfies dependencies in F.
- Completeness means using primary rules of Armstrong axioms repeatedly to infer dependencies to the end and the complete set of all possible dependencies can be inferred from F.
Closure Set
- Represent the set of FDs specified on R.
- Closure of F (F+) is the set of all FDs, F, as well as all dependencies that can be inferred/deduced from F.
- If department and department phone number and department social security depends on the respective info, then department depends on department phone number
- It is then stated this inferred number need not be stated as well
Closure Algorithm
- Closure is algorithm to see FDs
- A set F of FDs on a relation schema R, and a set of attributes X, which is a subset of R.
- The sets are used to determine further attributes and so on.
Examples
- Questions given with certain set R and other sets, students required to find the sets under the closure algorithm.
Covers & Equivalencies
- A set of functional dependencies F is said to cover another set of functional dependencies E if every FD in E is also in F+;
- A way used to find relation dependencies.
- Two sets of functional dependencies E and F are equivalent if E+ = F+.
Minimal Cover
- Minimal cover happens when a set of function dependencies and every dependency in E is in the closure of F
Key Facts regarding Minimal Cover
- Every dependency has a single attribute RHS with proper subsets of X & equivalent sets in F and dependencies that are equivalent to F
Key Steps for Minimal Set Algorithm
- Set it to E
- Replace each with a new liner FD
- Ensure each FD is liner
- Perform the if statement to determine new FD
- Perform it again to finish the FD
Example
- An example has an FD set and the question shows the steps to find the correct set.
- Step 1 is All above dependencies are in canonical form and only have one attribute
- Step 2 is to determine what attribute if needing to be changed & has redundancy
- Final is to find the reduced transitive to remove the redundancy
Normalization
- A methodology to ensure that the the attributes into smaller relations are broken-up into more simplistic terms
Main Keys with Normalization
- Normalization consists of mainly keys consisting of FDs, primary etc
- Helps minimise modifications and insertions
- Ensures 1NF, 2NF 3NF and Boyce Codd Normal form +
- Also applies to 4-5NF dependancies and mutli-valued elements.
Keys
- It is subset-of R with the property that no two tuples t1 and t2 in any legal relation state r of R will have equivalent closures
Normal Attributes
- Is a set of attributes that are candidate keys but called secondary.
- A Prime attribute must be a member of some candidate key.
Studying That Suits You
Use AI to generate personalized quizzes and flashcards to suit your learning preferences.