Podcast
Questions and Answers
Which of the following aspects is primarily addressed during the 'Analyzing real-world applications' stage when designing relational databases?
Which of the following aspects is primarily addressed during the 'Analyzing real-world applications' stage when designing relational databases?
- Mapping the domain expert needs to a model. (correct)
- Optimizing relational schemas via normalization.
- Developing database application software.
- Transferring the design document to a relational model.
What is the primary input for the 'Transferring the design document to the relational model' step in designing relational database applications?
What is the primary input for the 'Transferring the design document to the relational model' step in designing relational database applications?
- Process workflows for the database application.
- A high-level design document containing ER diagrams. (correct)
- Screen designs reflecting update screens and query screens.
- Optimized relational schemas.
Which of the following best describes the role of normalization theory in the context of database design?
Which of the following best describes the role of normalization theory in the context of database design?
- It serves as a mathematical basis for optimizing relational schemas. (correct)
- It defines the processes for capturing real-world application requirements.
- It facilitates the translation of ER diagrams into relational schemas.
- It provides a foundation for developing database application software.
What is the primary goal of eliminating redundancy in database design?
What is the primary goal of eliminating redundancy in database design?
Which type of data anomaly is most likely to occur when deleting a record from a poorly normalized database table?
Which type of data anomaly is most likely to occur when deleting a record from a poorly normalized database table?
In a database table, if a professor retires and their record is deleted, also resulting in a loss of the only record of a specific salary for a professor's rank, what type of anomaly is this?
In a database table, if a professor retires and their record is deleted, also resulting in a loss of the only record of a specific salary for a professor's rank, what type of anomaly is this?
A database schema contains an 'Employee' table with attributes including 'employeeID', 'employeeName', and 'departmentName'. If every time a 'departmentName' is updated, multiple rows in the 'Employee' table must be modified to ensure consistency, which type of anomaly is likely?
A database schema contains an 'Employee' table with attributes including 'employeeID', 'employeeName', and 'departmentName'. If every time a 'departmentName' is updated, multiple rows in the 'Employee' table must be modified to ensure consistency, which type of anomaly is likely?
In a database, an update anomaly is least likely to directly cause which of the following problems?
In a database, an update anomaly is least likely to directly cause which of the following problems?
Which of the following scenarios exemplifies an insertion anomaly?
Which of the following scenarios exemplifies an insertion anomaly?
In the context of database design, what is the purpose of functional dependencies?
In the context of database design, what is the purpose of functional dependencies?
Given a relation R with attributes A and B, what does the functional dependency A → B signify?
Given a relation R with attributes A and B, what does the functional dependency A → B signify?
Which of the following real-world roles is primarily responsible for determining functional dependencies in database design?
Which of the following real-world roles is primarily responsible for determining functional dependencies in database design?
What is the significance of determining the closure of a set of functional dependencies?
What is the significance of determining the closure of a set of functional dependencies?
Given a set of functional dependencies F = {A → B, B → C}, which of the following dependencies can be derived using the transitivity rule?
Given a set of functional dependencies F = {A → B, B → C}, which of the following dependencies can be derived using the transitivity rule?
Considering the functional dependency A → BC, which of the following inferences using the decomposition rule is correct?
Considering the functional dependency A → BC, which of the following inferences using the decomposition rule is correct?
Which of the following is true about a trivial functional dependency?
Which of the following is true about a trivial functional dependency?
What is the purpose of finding a minimal cover for a set of functional dependencies?
What is the purpose of finding a minimal cover for a set of functional dependencies?
You are given relation R(A, B, C, D, E) with FDs A → B, B → C, and C → D. Which of the following dependencies indicates that attribute E is extraneous in the dependency A → BCDE?
You are given relation R(A, B, C, D, E) with FDs A → B, B → C, and C → D. Which of the following dependencies indicates that attribute E is extraneous in the dependency A → BCDE?
In database normalization, what does it mean for an attribute to be 'atomic'?
In database normalization, what does it mean for an attribute to be 'atomic'?
A database table contains a 'Contact' attribute that stores multiple phone numbers separated by commas. Which normal form does this violate?
A database table contains a 'Contact' attribute that stores multiple phone numbers separated by commas. Which normal form does this violate?
What condition must be met for a relation to be in Boyce-Codd Normal Form (BCNF)?
What condition must be met for a relation to be in Boyce-Codd Normal Form (BCNF)?
Which of the following is a key difference between BCNF and 3NF?
Which of the following is a key difference between BCNF and 3NF?
Which of the following conditions will exclude a table from being in BCNF?
Which of the following conditions will exclude a table from being in BCNF?
Which of the below statements best describes the third normal form?
Which of the below statements best describes the third normal form?
What is a key characteristic of a relation that satisfies Third Normal Form (3NF)?
What is a key characteristic of a relation that satisfies Third Normal Form (3NF)?
Why might a database designer choose 3NF over BCNF?
Why might a database designer choose 3NF over BCNF?
What is the primary goal of decomposing a relational schema?
What is the primary goal of decomposing a relational schema?
When is a decomposition considered a 'lossy' decomposition?
When is a decomposition considered a 'lossy' decomposition?
A database table 'StudentCourse' with attributes 'studentID', 'courseID', and 'grade' is decomposed into two tables: 'Student' (studentID) and 'Course' (courseID, grade). What is the most likely problem created by this decomposition?
A database table 'StudentCourse' with attributes 'studentID', 'courseID', and 'grade' is decomposed into two tables: 'Student' (studentID) and 'Course' (courseID, grade). What is the most likely problem created by this decomposition?
Under what condition is a lossless-join decomposition guaranteed?
Under what condition is a lossless-join decomposition guaranteed?
What does it mean for a decomposition to be 'dependency preserving'?
What does it mean for a decomposition to be 'dependency preserving'?
Which of the following is most likely to decrease when a database schema is decomposed without dependency preservation?
Which of the following is most likely to decrease when a database schema is decomposed without dependency preservation?
Given a relation R with functional dependencies F, and a lossless-join decomposition of R into R1 and R2 with corresponding dependencies F1 and F2, what condition signifies that the decomposition is dependency-preserving?
Given a relation R with functional dependencies F, and a lossless-join decomposition of R into R1 and R2 with corresponding dependencies F1 and F2, what condition signifies that the decomposition is dependency-preserving?
Why is dependency preservation considered important in database design?
Why is dependency preservation considered important in database design?
What is denormalization in the context of database design, and why is it sometimes used?
What is denormalization in the context of database design, and why is it sometimes used?
Which of the following is most likely to improve when a database is intentionally denormalized?
Which of the following is most likely to improve when a database is intentionally denormalized?
What is a key trade-off in the decision to denormalize a database?
What is a key trade-off in the decision to denormalize a database?
In databases, which of the following best describes a 'Normal Form'?
In databases, which of the following best describes a 'Normal Form'?
During the 'Eliminating Redundancy' stage of relational database design, what serves as the mathematical foundation?
During the 'Eliminating Redundancy' stage of relational database design, what serves as the mathematical foundation?
What input is required during the 'Eliminating Redundancy' step in relational database design?
What input is required during the 'Eliminating Redundancy' step in relational database design?
What is the primary role of domain experts in the initial phase of designing relational database applications?
What is the primary role of domain experts in the initial phase of designing relational database applications?
In the context of relational databases, what is a potential consequence of data redundancy?
In the context of relational databases, what is a potential consequence of data redundancy?
How does redundancy directly contribute to the occurrence of an update anomaly in a database?
How does redundancy directly contribute to the occurrence of an update anomaly in a database?
Consider a database table 'Employees' with columns 'employeeID', 'projectName', and 'projectLocation'. If deleting an employee also deletes the only record of a specific project location, what type of anomaly is this?
Consider a database table 'Employees' with columns 'employeeID', 'projectName', and 'projectLocation'. If deleting an employee also deletes the only record of a specific project location, what type of anomaly is this?
A database table 'BookAuthors' includes 'bookID', 'bookTitle', and 'authorName'. If updating an author's name requires updating multiple rows across different books, which type of anomaly is likely?
A database table 'BookAuthors' includes 'bookID', 'bookTitle', and 'authorName'. If updating an author's name requires updating multiple rows across different books, which type of anomaly is likely?
In a 'Customers' table, which of the following issues is least likely to be a direct consequence of an update anomaly?
In a 'Customers' table, which of the following issues is least likely to be a direct consequence of an update anomaly?
Which of the following scenarios primarily demonstrates an insertion anomaly?
Which of the following scenarios primarily demonstrates an insertion anomaly?
What principle underlies the concept of functional dependencies in database design?
What principle underlies the concept of functional dependencies in database design?
Given a relation R with attributes X and Y, what does the notation X → Y express?
Given a relation R with attributes X and Y, what does the notation X → Y express?
In database design, which stakeholder typically possesses the knowledge required to define functional dependencies accurately?
In database design, which stakeholder typically possesses the knowledge required to define functional dependencies accurately?
What is the primary application of the closure of a set of functional dependencies?
What is the primary application of the closure of a set of functional dependencies?
Given F = {A → B, B → C, C → D}, what dependency is derived by applying the transitivity rule?
Given F = {A → B, B → C, C → D}, what dependency is derived by applying the transitivity rule?
Given A → BC, which inference correctly applies the decomposition rule?
Given A → BC, which inference correctly applies the decomposition rule?
In the context of functional dependencies, what defines a trivial functional dependency?
In the context of functional dependencies, what defines a trivial functional dependency?
Why is it important to find a minimal cover for a set of functional dependencies?
Why is it important to find a minimal cover for a set of functional dependencies?
Given relation R(A, B, C, D, E) with dependencies A → B, B → C, C → D, which scenario suggests that attribute E is extraneous in A → BCDE?
Given relation R(A, B, C, D, E) with dependencies A → B, B → C, C → D, which scenario suggests that attribute E is extraneous in A → BCDE?
In the context of database normalization, what does it mean for an attribute to be 'atomic'?
In the context of database normalization, what does it mean for an attribute to be 'atomic'?
A database table contains a 'Skills' attribute storing multiple skills separated by semicolons (e.g., 'Programming; Design; Testing'). Which normal form does this violate?
A database table contains a 'Skills' attribute storing multiple skills separated by semicolons (e.g., 'Programming; Design; Testing'). Which normal form does this violate?
Which condition must be met for a relation to satisfy Boyce-Codd Normal Form (BCNF)?
Which condition must be met for a relation to satisfy Boyce-Codd Normal Form (BCNF)?
What scenario illustrates the violation of BCNF but adherence to 3NF in a database table?
What scenario illustrates the violation of BCNF but adherence to 3NF in a database table?
Which of the following defines a state where a table is excluded from being in BCNF?
Which of the following defines a state where a table is excluded from being in BCNF?
What condition is specific to Third Normal Form (3NF)?
What condition is specific to Third Normal Form (3NF)?
What situation does Third Normal Form (3NF) specifically aim to avoid?
What situation does Third Normal Form (3NF) specifically aim to avoid?
Under what circumstances might one opt for 3NF over BCNF in database design?
Under what circumstances might one opt for 3NF over BCNF in database design?
What is a significant reason for decomposing a relational schema?
What is a significant reason for decomposing a relational schema?
When does a decomposition become 'lossy'?
When does a decomposition become 'lossy'?
If a table 'StudentMajor' with 'studentID' and 'majorName' is split into 'Student' (studentID) and 'Major' (majorName), and the relation between students and majors is lost, what issue arises?
If a table 'StudentMajor' with 'studentID' and 'majorName' is split into 'Student' (studentID) and 'Major' (majorName), and the relation between students and majors is lost, what issue arises?
Under which condition is a lossless-join decomposition assured?
Under which condition is a lossless-join decomposition assured?
What is most likely to be affected negatively when a database schema is decomposed without ensuring dependency preservation?
What is most likely to be affected negatively when a database schema is decomposed without ensuring dependency preservation?
Given relation R with functional dependencies F, and a lossless-join decomposition of R into R1 and R2 leading to F1 and F2, what condition signifies that the decomposition also preserves dependencies?
Given relation R with functional dependencies F, and a lossless-join decomposition of R into R1 and R2 leading to F1 and F2, what condition signifies that the decomposition also preserves dependencies?
What is denormalization in database design, and why would a designer use it?
What is denormalization in database design, and why would a designer use it?
Which aspect of database performance is most likely to see improvement due to denormalization?
Which aspect of database performance is most likely to see improvement due to denormalization?
What key tradeoff is involved when deciding to denormalize a database?
What key tradeoff is involved when deciding to denormalize a database?
According to database normalization principles, what is the primary benefit of adhering to a higher normal form?
According to database normalization principles, what is the primary benefit of adhering to a higher normal form?
What is the primary trade-off when choosing between a higher normal form like BCNF and a lower normal form like 3NF during database design?
What is the primary trade-off when choosing between a higher normal form like BCNF and a lower normal form like 3NF during database design?
In the context of relational database design, what is a key characteristic of a lossless-join decomposition?
In the context of relational database design, what is a key characteristic of a lossless-join decomposition?
Considering a relation that has been decomposed into multiple relations, what condition signifies that the decomposition is dependency-preserving?
Considering a relation that has been decomposed into multiple relations, what condition signifies that the decomposition is dependency-preserving?
Which of the following scenarios would most likely lead a database designer to consider denormalization?
Which of the following scenarios would most likely lead a database designer to consider denormalization?
Within the context of database normalization, what does the term 'atomic' refer to regarding attributes?
Within the context of database normalization, what does the term 'atomic' refer to regarding attributes?
Flashcards
Anomalies of Redundancy
Anomalies of Redundancy
Problems that arise from data redundancy, affecting data maintenance.
Insertion Anomaly
Insertion Anomaly
Adding data introduces inconsistencies if not done uniformly.
Delete Anomaly
Delete Anomaly
Removing data unintentionally deletes related information.
Update Anomaly
Update Anomaly
Signup and view all the flashcards
Functional Dependency (A→B)
Functional Dependency (A→B)
Signup and view all the flashcards
Functional Dependency
Functional Dependency
Signup and view all the flashcards
Reflexivity Rule
Reflexivity Rule
Signup and view all the flashcards
Augmentation Rule
Augmentation Rule
Signup and view all the flashcards
Transitivity Rule
Transitivity Rule
Signup and view all the flashcards
Decomposition Rule
Decomposition Rule
Signup and view all the flashcards
Union Rule
Union Rule
Signup and view all the flashcards
Closure of F (F+)
Closure of F (F+)
Signup and view all the flashcards
Closure of Attributes (ABC+)
Closure of Attributes (ABC+)
Signup and view all the flashcards
Minimal Cover
Minimal Cover
Signup and view all the flashcards
Normal Form
Normal Form
Signup and view all the flashcards
Atomic
Atomic
Signup and view all the flashcards
First Normal Form (1NF)
First Normal Form (1NF)
Signup and view all the flashcards
Boyce-Codd Normal Form (BCNF)
Boyce-Codd Normal Form (BCNF)
Signup and view all the flashcards
Third Normal From (3NF)
Third Normal From (3NF)
Signup and view all the flashcards
Decomposition
Decomposition
Signup and view all the flashcards
Lossless-Join Decomposition
Lossless-Join Decomposition
Signup and view all the flashcards
R = R1 U R2
R = R1 U R2
Signup and view all the flashcards
R1 ∩ R2
R1 ∩ R2
Signup and view all the flashcards
Foreign key pointing to R2’s primary key
Foreign key pointing to R2’s primary key
Signup and view all the flashcards
Dependency Preserving
Dependency Preserving
Signup and view all the flashcards
Functional Dependencies (Preserved)
Functional Dependencies (Preserved)
Signup and view all the flashcards
Denormalization
Denormalization
Signup and view all the flashcards
Study Notes
Redundancy Elimination, Schema Refinement, and Normalization
- The roadmap for designing relational database applications is to be reviewed
- Anomalies of redundancy, including insert, delete, and update anomalies, are covered.
- Functional dependencies as well as properties are described
- Functional dependencies are used to eliminate redundancies
- Normal forms (1st, 3rd, and Boyce Codd) are covered
- Decomposition and its properties, inclusive of Lossy vs. Lossless Join as well as Dependency Preserving Decompositions is reviewed
- Focus on BCNF vs 3NF and denormalization
Roadmap for Designing Relational Database Applications
- Analyzing real-world applications for relational database design involves domain experts telling their story of what they need
- A system analyst on the DBMS side maps the domain expert needs to a model that database developers use for application design
- A modelling language like ER Model (Entity-Relationship Model) is used, which is a high-level data model
- Data entities with clear relationships for the given real-world application need to be defined
- Provide a simple conceptual design for the database and its view of data
- The output is a high-level design document including ER diagrams and mock-up screen designs
- The mock-up screen designs should reflect update screens, reports screens and query screens as well as workflows of processes
Transferring Design Document and Eliminating Redundancy
- Design documents with ER Diagrams are transferred into the relational model's format, rendering relational schemas
- DB Design tools or manual processes may be used
- Relational schemas, which are the input to the model, get optimized which becomes the output
- Normalization theory is the mathematical foundation
- DB Design tools or manual processes tools may be used
- Input: Optimized relational schemas, screen designs, process workflows
- Output: Database Application
Data Redundancy
- Data redundancy leads to wasted storage due to data duplication
- Redundancy causes data maintenance problems or anomalies during insertion (Insertion Anomaly), update (Update Anomaly), and deletion (Delete Anomaly)
Data Redundancy Anomalies
- Anomaly occurs if Steve who is a full professor retires and is deleted from the table
- This means that because Steve was the only full professor at the time, important information on base salaries will be irretrievable
- In the event Sue moves to, gets a job at another university, and is deleted from the table
- The fact that Eduardo is the Math Head of Department is lost
Data Redundancy: Update Anomaly
- Data Anomaly occurs if the university decides to make the base salary for assistant professors $36K
- If this process only changes one location it creates inconsistency with Sallys record
- To fix the inconsistency, the base salary needs updating in multiple locations
- The Chemistry department experiences the same issue of an update anomaly if the head of Department changes
- Maintaining data consistency is a challenge when entities are updated
Data Redundancy: Insertion Anomaly
- Insertion Anomaly occurs if there is an attempt to insert a new instructor without knowing the new hire's base salary or department head
- The table will need to insert nulls where the information is not available yet
- If a new Data Science department is formed, there is a potential insertion anomaly
- This is because the new department requires the table to insert nulls throughout the table which means missing the data for instructor information because there are no professors/hires yet
Functional Dependencies (FDs)
- A and B are two attributes where A uniquely determines B; B is functionally dependent on A
- The value of B = F(value of A), and F is a function where the function denotes a unique value of B for every input value of A
- An example of FDs is AC→ BDE, with multiple attributes on either side
- Given tuples t₁er, t₂∈ r, π₁(t₁) = π₁(t2), it can be implied that πρ(t₁) = πβ(t2)
- With Attributes XY as the key for Table r with Schema, XY → R and, the key attributes determine the attributes in a given tuple in r
Functional Dependencies (FDs): Example Table
- The instructor consists of a string iid, name, rank, and department along with a real base and strings dept_head
- FDs exist between iid name, rank, base_salary, department, and dept_head
- Department has an FD with Department Head and rank has an FD with base salary
Functional Dependencies (FDs): Domain Expert
- Domain Experts decide that A determines B, such that A → B, iid → INSTRUCTOR, and Department → Dept-head
- System Analysts supply programmers with FDs by the set of FDs determined by the Domain Experts, along with ER diagrams
Functional Dependencies (FDs): Reasoning
- FDs exist between the iid, name, rank, and base_salary
- The department has an FD with Department Head
- Rank has an FD with base salary
- Reflexivity exists between the idds, names and rank, base_salary, and department, and dept_head
- In the Reflexivity, ∀ R⊆S, S➔ R, with Termed Trivial FD
- Augmentation: If A→B then AC→ BC
- Transitivity: If A→B and B→ C then A→ C
- Decomposition: If A→BC then A➔ B and A→ C
- Union: If A→B and A→ C then A→ BC
Functional Dependencies (FDs): Closure
- The closure of a set of FDs is known as the F+, which indicates all the FDs derived or implied from F
- With F = {A→B and B→C and B D}, A → C and A → D are in F+ by Transitivity
- For F = {A→B and B→C and B D}, A→ ABCD is in F+ by Union
- For F = {A→B and B→C and B D}, AD → CD is in F+ by Augmentation
- For F = {A→B and B→C and B D}, AD → C is in F+ by Decomposition
Closure & Minimal Cover
- ABC+ shows all the attributes reachable from ABC given a set of FDs
- With F = {A→B and B→C and B D}, B+ is BCD
- To demonstrate a set of attributes X is a key for R, it is necessary to Show that X+ R
- Given the minimal cover equation for FDs:
- iid name, rank, base_salary, department, dept_head
- department → dept_head
- Rank → base_salary
- In general, the higher the redundancies, the more complex a scheme data maintenance
- Ensuring FDs do not have redundancies or extraneous attributes minimizes scheme complexity
- As an example, given F = {A→B, B→C, A → C, A→ BD, AB → D, B → D), A→ C is redundant
- Since, it can be implied by transitivity since it is implied by A→B and B→ C which means A➔ C can be eliminated from F
- With F = {A→B, B→C, A → C, A→ BD, AB → D, B → D}, A→ BD is redundant
- A→ B and B→D implies A➔ D (by Transitivity), A→ B and A➔D implies that A→ BD (By Augmentation) and A➔BD can be eliminated
FDs, Schema, and Normal Forms
- Database schemas and tables can be optimized and enhanced by eliminating redundancies
- Scientists define quality schemas and denote them as being 'Good' database designs
- They classify these 'good' designs into 'Normal Forms', which are ways of minimizing anomalies
- There are several normal forms, such as the First, Second and Third normal form
- In order to determine if a schema or table is of 'good design (BCNF), determine if the schema R satisfies BCNF requirements, then confirm by asking a systems analyst
- To eliminate redundancies, you may see FDs where certain tuples seem to introduce redundancies
- In order to work with specific values for X and then the correspoding values for Y to ensure you get that desired design
First Normal Form (1NF)
- First Normal Form ensures that every attribute value is atomic, i.e is indivisible
- It is important to confirm whether the 1NF is atomic or not
- This can be verified from example where we impose that our course IDS should be atomic(CS451), or not atomic(CS,451)
- An example of non-atomic data is "Mariam, Pamela, Bob", since here we'd want children's names to parse individually
- All schemas should be checked to ensure attribute values are atomic
- Multiple attributes as well as non-atomic attributes are indicators that you are breaking the 1st NF
Boyce Codd Normal Form (BCNF)
- The Boyce Codd Normal Form is a "Good Form"
- R is in BCNF when referring to F, if for all X➔Y in F⁺
- Either X➔Y is a trivial Functional Dependency, i.e., Y ⊆ X, or; X is a super key for R
- If R is in BCNF, then it is considered a good schema form and, therefore, a good design that will not experience anomalies or redundancies
Third Normal Form (3NF)
- R is in 3NF when referring to F+, for all X➔Y
- Either X➔Y is a trivial Functional Dependency, i.e., Y ⊆ X, or; X is a key for R, or; or, Y is part of a key for R
- This is More relaxed than BCNF and includes being in 3NF and being in BCNF
- Schemas in BCNF follow 3NF but the 3NF cannot necessarily be considered a BCNF
Schema Evaluation: Example Case
- Given an Instructor table that is referenced by an iid, is instructor schema in BCNF?
- Instructor(iid: string, name: string, rank: string, base_salary: real, department: string, dept_head: string)
- It requires knowing all Functional Dependencies for a given Table
- To determine the FD you must ask the Domain Expert
- The question of if the current instructor schema is in the 'good design' has to be answered
Schema Evaluation: Instructor
- Start by answering what the FDs with the instructor table are
- As an example, the FD may suggest, iid → name, rank, base_salary, department, dept_head OR rank → base_salary OR department → dept_head
- Answer this question for each of the above existing FD scenarios in the table
- If the instructor schema is in BCNF for the FD with, iid → name, rank, base_salary, department, dept_head: Confirm via systems analyst
- Instructor Schema in BCNF for FD of rank → base_salary (Not true, Rank is not key)
- Conclude that rank and base salary is redundantly replicated
- Instructor Schema in BCNF for FD of department → dept_head (Not true, departmentis not KEY)
- Conclude: Department and Department head is redundantly replicated and therefore making instructor a bad design
Instructor Schema: Decomposition
- Since it is concluded that instructor schema is a bad design determine what solution can be employed:
- Decompose the instructor schema into smaller sub schemas that satisfy the conditions required
- Therefore this instructor table may undergo decomposition into tables such as rank, name and other variations
- One example can for the Schema of table instructor with parameters ranging from iid to dept head it is split into Department, dept head, and rank, base salary
Correctness in Decomposition
- It is important to have lossless join decomposition when working with instructor tables
- Decomposition may cause performance overhead when data in a query is taken out of the relation and requires joining
- Efficiency is required when checking various functional dependencies
- The challenge is to be able to avoid redundancy and manage performance
Lossy Decomposition
- Consider the Instructor Table and the FD: department department_head with the appropriate table relations to each entity
- Now decompose that table into the 2 tables of: iid, name, rank, base_salary and department, department_head
- Take note that the two new tables are in BCNF, which allows us to reconstruct tuples back into their original schemas
- To help with reconstruction, try to determine what Stephens department is or department information, we realize that is lost
- This is now a Lossy Decomposition, because now the data must guarantee we can reconstruct the table while considering relations
- The conclusion: decompose into iid name, rank, base_salary and department → dept_head when the name and rank are in the respective table and the arrow goes between dept table
LossLess Join Decomposition
- Lossless Join looks at the correctness aspect of decomposition
- In one example say table r has scheme which has FD X ➔ Y, but XUYCR and decompose the r into the two r1 r2 schemas from the table scheme
- Say the schema can be considered a lossless join iff: R= R1 union R2
- What that effectively is saying is you should be able to reconstruct the table from R1 and R2
Lossless-Join Decomposition: Condition 2
- Since you cant miss any attributes in the decomposition, R= R1 u R2, you may need to consider the key for R2 in R1
- This makes R1 have a key for R2s primary key, also R2 can be deemed to have a key pointing in R1, all of this allows one condition to be true
- There may be different conditions and situations to make that work, with the result being foreign key primary key natural joins using a foreign key in R1
Lossless Decomposition: Instructor
- Reiterate the lossless conditions regarding the instructor with R1 = (department, dept_head); R2 = (iid, name, rank, base_salary, department); where the R1 n R2 = dept
Dependency Preservation
- If R is a schema with Functional Dependency denoted by F1 F this essentially means that one the functional attribute set of R1 is only R1
- However if F2, F Functional Dependencies apply for attribute sets in R2 only
- Therefore the decomp of R in tables R1 R2 is only dependency preserving iff = (F1 UF2)+ = F +
- This decomposition of R1 and R2 means the overall schemas must be dependency performing
Dependency Preservation (DP): Benefits
- Lossless join has an affect on the overall correctness
- DP has overall affects to the performance
- What happens if you perform actions where DP does not occur?
- The FD is never captured after we decomposed in R1 and R2, so essentially whatever one was going enforce with data is not longer there and we lost some data in this process.
- It is always important to avoid joining after any operation
Normal Forms: 3NF vs. BCNF
- BCNF is not always dependency preserving whereas 3NF always has dependency preservation
- Given the table: Consider the table T = (X, Y, Z) while the table is Not in BCNF
- The table becomes the following decomposition: T1 = (X, Y), T2 = (X, Z), which does not preserve BCNF
- There are multiple considerations as a result, mainly that there is no dependency preserving decomposition
- You need to determine the need for Third Normal Form (3NF), and see why is 3NF needed
- Third Normal Form (3NF) - In the case where you still need to run BCNF we must keep in the mind that the x can possibly match this 3rd condition that now makes this new table part of the new key, in other words their is no need to decompose into anything for this problem and all dependency is preserved.
Rationale Behind 3NF
- There are lossless join decompositions, with the same data being copied into data
- It always be the cheaper and allow you to do joins to check if F1 is not violated or to check efficiency
- An adoption on this form a more efficient implementation for BCNF can be done by adding redundancies via a more efficient limited case
- At this point, we need to look what this means because there are tradeoffs, such as giving way to limited redundancy.
- But this is what makes 3NF, its guarantee of having decompositions with dependency and preservation via only lossless joins that allow you retain dependencies
Studying That Suits You
Use AI to generate personalized quizzes and flashcards to suit your learning preferences.