Slide 4 - Normalization theory

Choose a study mode

Play Quiz
Study Flashcards
Spaced Repetition
Chat to Lesson

Podcast

Play an AI-generated podcast conversation about this lesson

Questions and Answers

Which of the following aspects is primarily addressed during the 'Analyzing real-world applications' stage when designing relational databases?

  • Mapping the domain expert needs to a model. (correct)
  • Optimizing relational schemas via normalization.
  • Developing database application software.
  • Transferring the design document to a relational model.

What is the primary input for the 'Transferring the design document to the relational model' step in designing relational database applications?

  • Process workflows for the database application.
  • A high-level design document containing ER diagrams. (correct)
  • Screen designs reflecting update screens and query screens.
  • Optimized relational schemas.

Which of the following best describes the role of normalization theory in the context of database design?

  • It serves as a mathematical basis for optimizing relational schemas. (correct)
  • It defines the processes for capturing real-world application requirements.
  • It facilitates the translation of ER diagrams into relational schemas.
  • It provides a foundation for developing database application software.

What is the primary goal of eliminating redundancy in database design?

<p>To optimize relational schemas and reduce data maintenance problems. (A)</p> Signup and view all the answers

Which type of data anomaly is most likely to occur when deleting a record from a poorly normalized database table?

<p>Deletion anomaly. (B)</p> Signup and view all the answers

In a database table, if a professor retires and their record is deleted, also resulting in a loss of the only record of a specific salary for a professor's rank, what type of anomaly is this?

<p>Deletion anomaly. (B)</p> Signup and view all the answers

A database schema contains an 'Employee' table with attributes including 'employeeID', 'employeeName', and 'departmentName'. If every time a 'departmentName' is updated, multiple rows in the 'Employee' table must be modified to ensure consistency, which type of anomaly is likely?

<p>Update anomaly. (D)</p> Signup and view all the answers

In a database, an update anomaly is least likely to directly cause which of the following problems?

<p>Increased storage space. (C)</p> Signup and view all the answers

Which of the following scenarios exemplifies an insertion anomaly?

<p>Being unable to add a new department to the database because there are no employees in that department yet. (D)</p> Signup and view all the answers

In the context of database design, what is the purpose of functional dependencies?

<p>To identify relationships between attributes within a relation. (C)</p> Signup and view all the answers

Given a relation R with attributes A and B, what does the functional dependency A → B signify?

<p>The value of attribute A determines a unique value for attribute B. (B)</p> Signup and view all the answers

Which of the following real-world roles is primarily responsible for determining functional dependencies in database design?

<p>Domain expert. (C)</p> Signup and view all the answers

What is the significance of determining the closure of a set of functional dependencies?

<p>To derive all possible dependencies that can be inferred from the given set. (A)</p> Signup and view all the answers

Given a set of functional dependencies F = {A → B, B → C}, which of the following dependencies can be derived using the transitivity rule?

<p>A → C (B)</p> Signup and view all the answers

Considering the functional dependency A → BC, which of the following inferences using the decomposition rule is correct?

<p>A → B and A → C (B)</p> Signup and view all the answers

Which of the following is true about a trivial functional dependency?

<p>It involves a dependency where the right-hand side attributes are a subset of the left-hand side attributes. (D)</p> Signup and view all the answers

What is the purpose of finding a minimal cover for a set of functional dependencies?

<p>To simplify the set of dependencies without changing its closure. (C)</p> Signup and view all the answers

You are given relation R(A, B, C, D, E) with FDs A → B, B → C, and C → D. Which of the following dependencies indicates that attribute E is extraneous in the dependency A → BCDE?

<p>A → E can be derived from the other FDs. (B)</p> Signup and view all the answers

In database normalization, what does it mean for an attribute to be 'atomic'?

<p>The attribute cannot be further subdivided. (B)</p> Signup and view all the answers

A database table contains a 'Contact' attribute that stores multiple phone numbers separated by commas. Which normal form does this violate?

<p>First Normal Form (1NF). (B)</p> Signup and view all the answers

What condition must be met for a relation to be in Boyce-Codd Normal Form (BCNF)?

<p>Every determinant must be a candidate key. (A)</p> Signup and view all the answers

Which of the following is a key difference between BCNF and 3NF?

<p>BCNF is stricter and eliminates more redundancy than 3NF. (A)</p> Signup and view all the answers

Which of the following conditions will exclude a table from being in BCNF?

<p>A non-key attribute determines part of the candidate key. (C)</p> Signup and view all the answers

Which of the below statements best describes the third normal form?

<p>A relation is in 3NF if for every non-trivial functional dependency X → A, X is a superkey, or A is a prime attribute. (A)</p> Signup and view all the answers

What is a key characteristic of a relation that satisfies Third Normal Form (3NF)?

<p>It allows some limited redundancy to maintain dependency preservation. (C)</p> Signup and view all the answers

Why might a database designer choose 3NF over BCNF?

<p>To guarantee dependency preservation, even with some redundancy. (A)</p> Signup and view all the answers

What is the primary goal of decomposing a relational schema?

<p>To eliminate redundancy and satisfy normal forms. (C)</p> Signup and view all the answers

When is a decomposition considered a 'lossy' decomposition?

<p>When the original relation cannot be fully reconstructed from the decomposed relations. (D)</p> Signup and view all the answers

A database table 'StudentCourse' with attributes 'studentID', 'courseID', and 'grade' is decomposed into two tables: 'Student' (studentID) and 'Course' (courseID, grade). What is the most likely problem created by this decomposition?

<p>It is a lossy decomposition because the relationship between student and course is lost. (C)</p> Signup and view all the answers

Under what condition is a lossless-join decomposition guaranteed?

<p>If the common attributes form a superkey for at least one of the relations. (D)</p> Signup and view all the answers

What does it mean for a decomposition to be 'dependency preserving'?

<p>All functional dependencies in the original relation can be inferred from the dependencies in the decomposed relations. (D)</p> Signup and view all the answers

Which of the following is most likely to decrease when a database schema is decomposed without dependency preservation?

<p>Query execution time. (D)</p> Signup and view all the answers

Given a relation R with functional dependencies F, and a lossless-join decomposition of R into R1 and R2 with corresponding dependencies F1 and F2, what condition signifies that the decomposition is dependency-preserving?

<p>(F1 ∪ F2)+ = F+ (C)</p> Signup and view all the answers

Why is dependency preservation considered important in database design?

<p>It enables easier enforcement of integrity constraints without complex joins. (A)</p> Signup and view all the answers

What is denormalization in the context of database design, and why is it sometimes used?

<p>The process of combining relations to improve query performance. (A)</p> Signup and view all the answers

Which of the following is most likely to improve when a database is intentionally denormalized?

<p>Query performance. (A)</p> Signup and view all the answers

What is a key trade-off in the decision to denormalize a database?

<p>Balancing improved query performance with increased data redundancy. (D)</p> Signup and view all the answers

In databases, which of the following best describes a 'Normal Form'?

<p>A high-quality schema. (D)</p> Signup and view all the answers

During the 'Eliminating Redundancy' stage of relational database design, what serves as the mathematical foundation?

<p>Normalization Theory (A)</p> Signup and view all the answers

What input is required during the 'Eliminating Redundancy' step in relational database design?

<p>Relational schemas (C)</p> Signup and view all the answers

What is the primary role of domain experts in the initial phase of designing relational database applications?

<p>Describing the application's needs and functionalities (D)</p> Signup and view all the answers

In the context of relational databases, what is a potential consequence of data redundancy?

<p>Data maintenance problems and anomalies (B)</p> Signup and view all the answers

How does redundancy directly contribute to the occurrence of an update anomaly in a database?

<p>By requiring multiple rows to be modified for a single update (A)</p> Signup and view all the answers

Consider a database table 'Employees' with columns 'employeeID', 'projectName', and 'projectLocation'. If deleting an employee also deletes the only record of a specific project location, what type of anomaly is this?

<p>Deletion anomaly (B)</p> Signup and view all the answers

A database table 'BookAuthors' includes 'bookID', 'bookTitle', and 'authorName'. If updating an author's name requires updating multiple rows across different books, which type of anomaly is likely?

<p>Update anomaly (D)</p> Signup and view all the answers

In a 'Customers' table, which of the following issues is least likely to be a direct consequence of an update anomaly?

<p>Increased storage space (D)</p> Signup and view all the answers

Which of the following scenarios primarily demonstrates an insertion anomaly?

<p>Inability to add a new course without assigning students. (B)</p> Signup and view all the answers

What principle underlies the concept of functional dependencies in database design?

<p>Identifying attributes that uniquely determine other attributes (D)</p> Signup and view all the answers

Given a relation R with attributes X and Y, what does the notation X → Y express?

<p>X functionally determines Y (A)</p> Signup and view all the answers

In database design, which stakeholder typically possesses the knowledge required to define functional dependencies accurately?

<p>Domain Expert (A)</p> Signup and view all the answers

What is the primary application of the closure of a set of functional dependencies?

<p>Identifying all possible dependencies within a relation (B)</p> Signup and view all the answers

Given F = {A → B, B → C, C → D}, what dependency is derived by applying the transitivity rule?

<p>A → D (A)</p> Signup and view all the answers

Given A → BC, which inference correctly applies the decomposition rule?

<p>A → B (B)</p> Signup and view all the answers

In the context of functional dependencies, what defines a trivial functional dependency?

<p>A dependency where the right-hand side is a subset of the left-hand side (D)</p> Signup and view all the answers

Why is it important to find a minimal cover for a set of functional dependencies?

<p>To simplify the set of dependencies without changing their closure (B)</p> Signup and view all the answers

Given relation R(A, B, C, D, E) with dependencies A → B, B → C, C → D, which scenario suggests that attribute E is extraneous in A → BCDE?

<p>A → E can be inferred from other dependencies (D)</p> Signup and view all the answers

In the context of database normalization, what does it mean for an attribute to be 'atomic'?

<p>The attribute cannot be further subdivided (A)</p> Signup and view all the answers

A database table contains a 'Skills' attribute storing multiple skills separated by semicolons (e.g., 'Programming; Design; Testing'). Which normal form does this violate?

<p>First Normal Form (A)</p> Signup and view all the answers

Which condition must be met for a relation to satisfy Boyce-Codd Normal Form (BCNF)?

<p>Every determinant is a candidate key (D)</p> Signup and view all the answers

What scenario illustrates the violation of BCNF but adherence to 3NF in a database table?

<p>A non-key attribute determines part of a candidate key. (A)</p> Signup and view all the answers

Which of the following defines a state where a table is excluded from being in BCNF?

<p>A non-key attribute determines part of the primary key. (B)</p> Signup and view all the answers

What condition is specific to Third Normal Form (3NF)?

<p>All non-key attributes are fully functionally dependent on the primary key and no non-key attribute is transitively dependent on the primary key. (B)</p> Signup and view all the answers

What situation does Third Normal Form (3NF) specifically aim to avoid?

<p>Transitive dependencies on non-key attributes. (B)</p> Signup and view all the answers

Under what circumstances might one opt for 3NF over BCNF in database design?

<p>When dependency preservation is more important than eliminating subtle redundancy. (A)</p> Signup and view all the answers

What is a significant reason for decomposing a relational schema?

<p>To reduce data redundancy and anomalies. (D)</p> Signup and view all the answers

When does a decomposition become 'lossy'?

<p>When information is lost and cannot be reconstructed from the decomposed tables. (D)</p> Signup and view all the answers

If a table 'StudentMajor' with 'studentID' and 'majorName' is split into 'Student' (studentID) and 'Major' (majorName), and the relation between students and majors is lost, what issue arises?

<p>A potential lossy decomposition. (C)</p> Signup and view all the answers

Under which condition is a lossless-join decomposition assured?

<p>If the common attributes form a superkey for at least one of the tables. (D)</p> Signup and view all the answers

What is most likely to be affected negatively when a database schema is decomposed without ensuring dependency preservation?

<p>Data integrity enforcement. (A)</p> Signup and view all the answers

Given relation R with functional dependencies F, and a lossless-join decomposition of R into R1 and R2 leading to F1 and F2, what condition signifies that the decomposition also preserves dependencies?

<p>(F1 ∪ F2)+ = F+. (C)</p> Signup and view all the answers

What is denormalization in database design, and why would a designer use it?

<p>The process of combining tables or adding redundant data; to improve query performance. (D)</p> Signup and view all the answers

Which aspect of database performance is most likely to see improvement due to denormalization?

<p>Query execution speed. (C)</p> Signup and view all the answers

What key tradeoff is involved when deciding to denormalize a database?

<p>Query speed versus data redundancy. (C)</p> Signup and view all the answers

According to database normalization principles, what is the primary benefit of adhering to a higher normal form?

<p>Reduced data anomaly risks. (C)</p> Signup and view all the answers

What is the primary trade-off when choosing between a higher normal form like BCNF and a lower normal form like 3NF during database design?

<p>Balancing data redundancy against the complexity of maintaining functional dependencies. (A)</p> Signup and view all the answers

In the context of relational database design, what is a key characteristic of a lossless-join decomposition?

<p>It ensures that no data is lost when the original relation is reconstructed by joining the decomposed relations. (A)</p> Signup and view all the answers

Considering a relation that has been decomposed into multiple relations, what condition signifies that the decomposition is dependency-preserving?

<p>The closure of the union of the functional dependencies of the decomposed relations is equal to the closure of the functional dependencies of the original relation. (D)</p> Signup and view all the answers

Which of the following scenarios would most likely lead a database designer to consider denormalization?

<p>To improve the performance of frequent and complex queries that require joining multiple tables. (D)</p> Signup and view all the answers

Within the context of database normalization, what does the term 'atomic' refer to regarding attributes?

<p>The attribute's values cannot be further subdivided. (B)</p> Signup and view all the answers

Flashcards

Anomalies of Redundancy

Problems that arise from data redundancy, affecting data maintenance.

Insertion Anomaly

Adding data introduces inconsistencies if not done uniformly.

Delete Anomaly

Removing data unintentionally deletes related information.

Update Anomaly

Changing data requires multiple updates, risking inconsistencies.

Signup and view all the flashcards

Functional Dependency (A→B)

A uniquely determines B, value of B = F(value of A).

Signup and view all the flashcards

Functional Dependency

Every input value of A, has a unique value of B.

Signup and view all the flashcards

Reflexivity Rule

If R ⊆ S, then S → R is a Trivial Functional Dependency

Signup and view all the flashcards

Augmentation Rule

If A → B, then AC → BC. Adding same attributes to both sides.

Signup and view all the flashcards

Transitivity Rule

If A → B and B → C, then A → C. Dependency passes through.

Signup and view all the flashcards

Decomposition Rule

If A→BC then A → B and A → C. You can split dependencies BC.

Signup and view all the flashcards

Union Rule

If A→B and A→C then A→BC. Combine common dependencies.

Signup and view all the flashcards

Closure of F (F+)

The set of all functional dependencies implied by a given set.

Signup and view all the flashcards

Closure of Attributes (ABC+)

Attributes reachable from set given functional dependencies.

Signup and view all the flashcards

Minimal Cover

Remove redundancies without losing information.

Signup and view all the flashcards

Normal Form

Good schema / database design.

Signup and view all the flashcards

Atomic

Each attribute value being indivisible.

Signup and view all the flashcards

First Normal Form (1NF)

Each attribute value must be atomic.

Signup and view all the flashcards

Boyce-Codd Normal Form (BCNF)

If each FD X→Y, either Y⊆X or X is superkey.

Signup and view all the flashcards

Third Normal From (3NF)

If X→Y, either Y⊆X or X is key or Y is part of a key.

Signup and view all the flashcards

Decomposition

Splitting a table into smaller, related tables.

Signup and view all the flashcards

Lossless-Join Decomposition

Reconstructing original tuples from decomposed tables.

Signup and view all the flashcards

R = R1 U R2

Not missing any attributes in the decomposition

Signup and view all the flashcards

R1 ∩ R2

R1 contains in it the key for R2, or R2 contains in it the key for R1

Signup and view all the flashcards

Foreign key pointing to R2’s primary key

A foreign key in R1 referring to R2.

Signup and view all the flashcards

Dependency Preserving

Guarantee functional dependencies can be checked.

Signup and view all the flashcards

Functional Dependencies (Preserved)

Fulfills 3NF properties.

Signup and view all the flashcards

Denormalization

Normalization breaks database, have performance problems.

Signup and view all the flashcards

Study Notes

Redundancy Elimination, Schema Refinement, and Normalization

  • The roadmap for designing relational database applications is to be reviewed
  • Anomalies of redundancy, including insert, delete, and update anomalies, are covered.
  • Functional dependencies as well as properties are described
  • Functional dependencies are used to eliminate redundancies
  • Normal forms (1st, 3rd, and Boyce Codd) are covered
  • Decomposition and its properties, inclusive of Lossy vs. Lossless Join as well as Dependency Preserving Decompositions is reviewed
  • Focus on BCNF vs 3NF and denormalization

Roadmap for Designing Relational Database Applications

  • Analyzing real-world applications for relational database design involves domain experts telling their story of what they need
  • A system analyst on the DBMS side maps the domain expert needs to a model that database developers use for application design
  • A modelling language like ER Model (Entity-Relationship Model) is used, which is a high-level data model
  • Data entities with clear relationships for the given real-world application need to be defined
  • Provide a simple conceptual design for the database and its view of data
  • The output is a high-level design document including ER diagrams and mock-up screen designs
  • The mock-up screen designs should reflect update screens, reports screens and query screens as well as workflows of processes

Transferring Design Document and Eliminating Redundancy

  • Design documents with ER Diagrams are transferred into the relational model's format, rendering relational schemas
  • DB Design tools or manual processes may be used
  • Relational schemas, which are the input to the model, get optimized which becomes the output
  • Normalization theory is the mathematical foundation
  • DB Design tools or manual processes tools may be used
  • Input: Optimized relational schemas, screen designs, process workflows
  • Output: Database Application

Data Redundancy

  • Data redundancy leads to wasted storage due to data duplication
  • Redundancy causes data maintenance problems or anomalies during insertion (Insertion Anomaly), update (Update Anomaly), and deletion (Delete Anomaly)

Data Redundancy Anomalies

  • Anomaly occurs if Steve who is a full professor retires and is deleted from the table
  • This means that because Steve was the only full professor at the time, important information on base salaries will be irretrievable
  • In the event Sue moves to, gets a job at another university, and is deleted from the table
  • The fact that Eduardo is the Math Head of Department is lost

Data Redundancy: Update Anomaly

  • Data Anomaly occurs if the university decides to make the base salary for assistant professors $36K
  • If this process only changes one location it creates inconsistency with Sallys record
  • To fix the inconsistency, the base salary needs updating in multiple locations
  • The Chemistry department experiences the same issue of an update anomaly if the head of Department changes
  • Maintaining data consistency is a challenge when entities are updated

Data Redundancy: Insertion Anomaly

  • Insertion Anomaly occurs if there is an attempt to insert a new instructor without knowing the new hire's base salary or department head
  • The table will need to insert nulls where the information is not available yet
  • If a new Data Science department is formed, there is a potential insertion anomaly
  • This is because the new department requires the table to insert nulls throughout the table which means missing the data for instructor information because there are no professors/hires yet

Functional Dependencies (FDs)

  • A and B are two attributes where A uniquely determines B; B is functionally dependent on A
  • The value of B = F(value of A), and F is a function where the function denotes a unique value of B for every input value of A
  • An example of FDs is AC→ BDE, with multiple attributes on either side
  • Given tuples t₁er, t₂∈ r, π₁(t₁) = π₁(t2), it can be implied that πρ(t₁) = πβ(t2)
  • With Attributes XY as the key for Table r with Schema, XY → R and, the key attributes determine the attributes in a given tuple in r

Functional Dependencies (FDs): Example Table

  • The instructor consists of a string iid, name, rank, and department along with a real base and strings dept_head
  • FDs exist between iid name, rank, base_salary, department, and dept_head
  • Department has an FD with Department Head and rank has an FD with base salary

Functional Dependencies (FDs): Domain Expert

  • Domain Experts decide that A determines B, such that A → B, iid → INSTRUCTOR, and Department → Dept-head
  • System Analysts supply programmers with FDs by the set of FDs determined by the Domain Experts, along with ER diagrams

Functional Dependencies (FDs): Reasoning

  • FDs exist between the iid, name, rank, and base_salary
  • The department has an FD with Department Head
  • Rank has an FD with base salary
  • Reflexivity exists between the idds, names and rank, base_salary, and department, and dept_head
  • In the Reflexivity, ∀ R⊆S, S➔ R, with Termed Trivial FD
  • Augmentation: If A→B then AC→ BC
  • Transitivity: If A→B and B→ C then A→ C
  • Decomposition: If A→BC then A➔ B and A→ C
  • Union: If A→B and A→ C then A→ BC

Functional Dependencies (FDs): Closure

  • The closure of a set of FDs is known as the F+, which indicates all the FDs derived or implied from F
  • With F = {A→B and B→C and B D}, A → C and A → D are in F+ by Transitivity
  • For F = {A→B and B→C and B D}, A→ ABCD is in F+ by Union
  • For F = {A→B and B→C and B D}, AD → CD is in F+ by Augmentation
  • For F = {A→B and B→C and B D}, AD → C is in F+ by Decomposition

Closure & Minimal Cover

  • ABC+ shows all the attributes reachable from ABC given a set of FDs
  • With F = {A→B and B→C and B D}, B+ is BCD
  • To demonstrate a set of attributes X is a key for R, it is necessary to Show that X+ R
  • Given the minimal cover equation for FDs:
  • iid name, rank, base_salary, department, dept_head
  • department → dept_head
  • Rank → base_salary
  • In general, the higher the redundancies, the more complex a scheme data maintenance
  • Ensuring FDs do not have redundancies or extraneous attributes minimizes scheme complexity
  • As an example, given F = {A→B, B→C, A → C, A→ BD, AB → D, B → D), A→ C is redundant
  • Since, it can be implied by transitivity since it is implied by A→B and B→ C which means A➔ C can be eliminated from F
  • With F = {A→B, B→C, A → C, A→ BD, AB → D, B → D}, A→ BD is redundant
  • A→ B and B→D implies A➔ D (by Transitivity), A→ B and A➔D implies that A→ BD (By Augmentation) and A➔BD can be eliminated

FDs, Schema, and Normal Forms

  • Database schemas and tables can be optimized and enhanced by eliminating redundancies
  • Scientists define quality schemas and denote them as being 'Good' database designs
  • They classify these 'good' designs into 'Normal Forms', which are ways of minimizing anomalies
  • There are several normal forms, such as the First, Second and Third normal form
  • In order to determine if a schema or table is of 'good design (BCNF), determine if the schema R satisfies BCNF requirements, then confirm by asking a systems analyst
  • To eliminate redundancies, you may see FDs where certain tuples seem to introduce redundancies
  • In order to work with specific values for X and then the correspoding values for Y to ensure you get that desired design

First Normal Form (1NF)

  • First Normal Form ensures that every attribute value is atomic, i.e is indivisible
  • It is important to confirm whether the 1NF is atomic or not
  • This can be verified from example where we impose that our course IDS should be atomic(CS451), or not atomic(CS,451)
  • An example of non-atomic data is "Mariam, Pamela, Bob", since here we'd want children's names to parse individually
  • All schemas should be checked to ensure attribute values are atomic
  • Multiple attributes as well as non-atomic attributes are indicators that you are breaking the 1st NF

Boyce Codd Normal Form (BCNF)

  • The Boyce Codd Normal Form is a "Good Form"
  • R is in BCNF when referring to F, if for all X➔Y in F⁺
  • Either X➔Y is a trivial Functional Dependency, i.e., Y ⊆ X, or; X is a super key for R
  • If R is in BCNF, then it is considered a good schema form and, therefore, a good design that will not experience anomalies or redundancies

Third Normal Form (3NF)

  • R is in 3NF when referring to F+, for all X➔Y
  • Either X➔Y is a trivial Functional Dependency, i.e., Y ⊆ X, or; X is a key for R, or; or, Y is part of a key for R
  • This is More relaxed than BCNF and includes being in 3NF and being in BCNF
  • Schemas in BCNF follow 3NF but the 3NF cannot necessarily be considered a BCNF

Schema Evaluation: Example Case

  • Given an Instructor table that is referenced by an iid, is instructor schema in BCNF?
  • Instructor(iid: string, name: string, rank: string, base_salary: real, department: string, dept_head: string)
  • It requires knowing all Functional Dependencies for a given Table
  • To determine the FD you must ask the Domain Expert
  • The question of if the current instructor schema is in the 'good design' has to be answered

Schema Evaluation: Instructor

  • Start by answering what the FDs with the instructor table are
  • As an example, the FD may suggest, iid → name, rank, base_salary, department, dept_head OR rank → base_salary OR department → dept_head
  • Answer this question for each of the above existing FD scenarios in the table
  • If the instructor schema is in BCNF for the FD with, iid → name, rank, base_salary, department, dept_head: Confirm via systems analyst
  • Instructor Schema in BCNF for FD of rank → base_salary (Not true, Rank is not key)
  • Conclude that rank and base salary is redundantly replicated
  • Instructor Schema in BCNF for FD of department → dept_head (Not true, departmentis not KEY)
  • Conclude: Department and Department head is redundantly replicated and therefore making instructor a bad design

Instructor Schema: Decomposition

  • Since it is concluded that instructor schema is a bad design determine what solution can be employed:
  • Decompose the instructor schema into smaller sub schemas that satisfy the conditions required
  • Therefore this instructor table may undergo decomposition into tables such as rank, name and other variations
  • One example can for the Schema of table instructor with parameters ranging from iid to dept head it is split into Department, dept head, and rank, base salary

Correctness in Decomposition

  • It is important to have lossless join decomposition when working with instructor tables
  • Decomposition may cause performance overhead when data in a query is taken out of the relation and requires joining
  • Efficiency is required when checking various functional dependencies
  • The challenge is to be able to avoid redundancy and manage performance

Lossy Decomposition

  • Consider the Instructor Table and the FD: department department_head with the appropriate table relations to each entity
  • Now decompose that table into the 2 tables of: iid, name, rank, base_salary and department, department_head
  • Take note that the two new tables are in BCNF, which allows us to reconstruct tuples back into their original schemas
  • To help with reconstruction, try to determine what Stephens department is or department information, we realize that is lost
  • This is now a Lossy Decomposition, because now the data must guarantee we can reconstruct the table while considering relations
  • The conclusion: decompose into iid name, rank, base_salary and department → dept_head when the name and rank are in the respective table and the arrow goes between dept table

LossLess Join Decomposition

  • Lossless Join looks at the correctness aspect of decomposition
  • In one example say table r has scheme which has FD X ➔ Y, but XUYCR and decompose the r into the two r1 r2 schemas from the table scheme
  • Say the schema can be considered a lossless join iff: R= R1 union R2
  • What that effectively is saying is you should be able to reconstruct the table from R1 and R2

Lossless-Join Decomposition: Condition 2

  • Since you cant miss any attributes in the decomposition, R= R1 u R2, you may need to consider the key for R2 in R1
  • This makes R1 have a key for R2s primary key, also R2 can be deemed to have a key pointing in R1, all of this allows one condition to be true
  • There may be different conditions and situations to make that work, with the result being foreign key primary key natural joins using a foreign key in R1

Lossless Decomposition: Instructor

  • Reiterate the lossless conditions regarding the instructor with R1 = (department, dept_head); R2 = (iid, name, rank, base_salary, department); where the R1 n R2 = dept

Dependency Preservation

  • If R is a schema with Functional Dependency denoted by F1 F this essentially means that one the functional attribute set of R1 is only R1
  • However if F2, F Functional Dependencies apply for attribute sets in R2 only
  • Therefore the decomp of R in tables R1 R2 is only dependency preserving iff = (F1 UF2)+ = F +
  • This decomposition of R1 and R2 means the overall schemas must be dependency performing

Dependency Preservation (DP): Benefits

  • Lossless join has an affect on the overall correctness
  • DP has overall affects to the performance
  • What happens if you perform actions where DP does not occur?
  • The FD is never captured after we decomposed in R1 and R2, so essentially whatever one was going enforce with data is not longer there and we lost some data in this process.
  • It is always important to avoid joining after any operation

Normal Forms: 3NF vs. BCNF

  • BCNF is not always dependency preserving whereas 3NF always has dependency preservation
  • Given the table: Consider the table T = (X, Y, Z) while the table is Not in BCNF
  • The table becomes the following decomposition: T1 = (X, Y), T2 = (X, Z), which does not preserve BCNF
  • There are multiple considerations as a result, mainly that there is no dependency preserving decomposition
  • You need to determine the need for Third Normal Form (3NF), and see why is 3NF needed
  • Third Normal Form (3NF) - In the case where you still need to run BCNF we must keep in the mind that the x can possibly match this 3rd condition that now makes this new table part of the new key, in other words their is no need to decompose into anything for this problem and all dependency is preserved.

Rationale Behind 3NF

  • There are lossless join decompositions, with the same data being copied into data
  • It always be the cheaper and allow you to do joins to check if F1 is not violated or to check efficiency
  • An adoption on this form a more efficient implementation for BCNF can be done by adding redundancies via a more efficient limited case
  • At this point, we need to look what this means because there are tradeoffs, such as giving way to limited redundancy.
  • But this is what makes 3NF, its guarantee of having decompositions with dependency and preservation via only lossless joins that allow you retain dependencies

Studying That Suits You

Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

Quiz Team

Related Documents

More Like This

Database Relations and Normal Forms
18 questions
Relational Database Design Quiz
9 questions
Relational Database Design Overview
10 questions
Use Quizgecko on...
Browser
Browser