Relational Model and Scheme

Choose a study mode

Play Quiz
Study Flashcards
Spaced Repetition
Chat to Lesson

Podcast

Play an AI-generated podcast conversation about this lesson

Questions and Answers

In the context of relational databases, what does a 'relation scheme' primarily define?

  • The physical storage location of the database files.
  • A specific set of data values stored in a table at a given time.
  • A set of rules for data validation and integrity.
  • The definition or structure of a table, including its attributes. (correct)

Which of the following best describes a 'relation' or 'relation instance' in the context of a relational database?

  • The primary key constraint enforced across multiple tables.
  • A set of possible domain values for an attribute.
  • The abstract definition of a table's structure and constraints.
  • A specific snapshot of data residing in a table at a particular moment. (correct)

What is the significance of 'domains' in relational database modeling?

  • They determine the relationships between tables.
  • They enforce primary key uniqueness across all tables.
  • They specify the range of acceptable values for an attribute. (correct)
  • They define the physical storage structure of database files.

Consider a database schema with a single table containing employee information and project assignments. If the employee's name, title, and salary are repeated for each project they are assigned to, which anomaly is most likely to occur when updating an employee's salary?

<p>Update Anomaly (C)</p> Signup and view all the answers

In a database table storing project information, if you cannot add a new project until at least one employee is assigned to it, which type of anomaly is present?

<p>Insertion Anomaly (D)</p> Signup and view all the answers

Consider a database where deleting the last employee assigned to a specific project inadvertently removes all information about that project. Which anomaly does this scenario exemplify?

<p>Deletion Anomaly (B)</p> Signup and view all the answers

What is the primary goal of normalizing a database?

<p>To eliminate data anomalies and improve data integrity. (B)</p> Signup and view all the answers

Which of the following is a key consideration when decomposing a database schema into a desirable normal form?

<p>Preserving all dependencies and ensuring lossless decomposition. (B)</p> Signup and view all the answers

Given a relation R with attributes A, B, and C, where A -> B and B -> C, which normal form is violated if R is not further decomposed?

<p>Third Normal Form (3NF) (B)</p> Signup and view all the answers

In relational algebra, what is the primary purpose of the 'selection' operator?

<p>To extract a subset of tuples based on a specified condition. (B)</p> Signup and view all the answers

What is the primary function of the 'projection' operator in relational algebra?

<p>Selecting a subset of columns from a relation. (B)</p> Signup and view all the answers

What is a key requirement for two relations to be 'union-compatible'?

<p>They must have same degree + corresponding attributes over same domain (B)</p> Signup and view all the answers

What is the result of a 'set difference' operation between two relations R and S?

<p>All tuples that exist in R but not in S. (D)</p> Signup and view all the answers

What operation combines tuples of two relations in all possible combinations?

<p>Cartesian product (D)</p> Signup and view all the answers

What is the primary purpose of the intersection operator in relational algebra?

<p>To select tuples that are common to two relations. (A)</p> Signup and view all the answers

What is the primary distinction between inner and outer joins?

<p>Inner joins only return matching tuples, while outer joins also include non-matching tuples from one or both tables. (A)</p> Signup and view all the answers

What is the result of a left outer join between two tables, A and B?

<p>All rows from A and matching rows from B, with NULLs for non-matching columns from B. (A)</p> Signup and view all the answers

How does a natural join differ from a theta join?

<p>A natural join requires attributes with the same name, while a theta join is based on an arbitrary comparison. (C)</p> Signup and view all the answers

Which operation is effectively a 'pre-join' filter, reducing the number of tuples before a join operation?

<p>Semijoin (C)</p> Signup and view all the answers

Which type of relational calculus is SQL based on?

<p>Tuple relational calculus (B)</p> Signup and view all the answers

What is the fundamental characteristic of a computer network?

<p>An interconnected collection of autonomous computers. (B)</p> Signup and view all the answers

What distinguishes a Wide Area Network (WAN) from a Local Area Network (LAN)?

<p>WANS cover larger geographic distances, higher latency and typically have different speeds, and different transmission media (A)</p> Signup and view all the answers

Which network topology is characterized by 'no regularity in the interconnection' between nodes, such as the Internet?

<p>Irregular (A)</p> Signup and view all the answers

What is the key characteristic of 'point-to-point' communication?

<p>Communication occurs only between two specific nodes. (B)</p> Signup and view all the answers

What distinguishes 'broadcast' communication from 'multi-cast' communication?

<p>Multi-cast is essentially a special case of broadcast where messages are sent to a defined subset of nodes whereas broadcast sends to all. (A)</p> Signup and view all the answers

In data communication, what is the term of the amount of information that can be transmitted over a channel in a given time unit?

<p>Bandwidth (C)</p> Signup and view all the answers

What is a key difference between circuit switching and packet switching?

<p>Circuit switching establishes a dedicated channel, while packet switching divides messages into packets. (C)</p> Signup and view all the answers

Which is a critical function of communication protocols?

<p>Ensuring error-free, reliable and efficient communication. (C)</p> Signup and view all the answers

Why is TCP/IP considered a 'layered architecture'?

<p>Because each function is divided into layers, or stack, of abstractions. (A)</p> Signup and view all the answers

What type of attribute is best suited to uniquely identify each record/tuple within a relational database table?

<p>Primary Key (B)</p> Signup and view all the answers

A functional dependency exists when?

<p>The value of one attribute can be determined by another attribute. (A)</p> Signup and view all the answers

Which normal form ensures that all non-key attributes are fully functionally dependent on the primary key?

<p>Second Normal Form (C)</p> Signup and view all the answers

Which one focuses on eliminating transitive dependencies?

<p>Third Normal Form (C)</p> Signup and view all the answers

How does BCNF enhance 3NF?

<p>Eliminating redundancy by enforcing that every determinant is a candidate key (C)</p> Signup and view all the answers

What protocol is commonly used for file transfer over the Internet?

<p>FTP (D)</p> Signup and view all the answers

What's the purpose of using CSMA/CD access method in bus networks?

<p>To listen before and during data transmission, thus detecting collisions. (D)</p> Signup and view all the answers

What happens to the tuples in outer join that do not satisfy the join predicate?

<p>Other relation's attribute values set to NULL (D)</p> Signup and view all the answers

What is the general meaning of the following expression: $R \Join_{R.A=S.B} S$?

<p>Equi-join of relations R and S (C)</p> Signup and view all the answers

Which SQL command retrieves every single column from a desired table?

<p>SELECT * FROM desiredTable (B)</p> Signup and view all the answers

What does the acronym DHCP stand for?

<p>Dynamic Host Configuration Protocol (C)</p> Signup and view all the answers

What is the primary use of SQL WHERE clause?

<p>To filter data based on a specified condition (D)</p> Signup and view all the answers

Flashcards

What is a Relation in the Relational Model?

A relation with attributes defined over 'n' domains, containing 'n'-tuples with values from these domains.

What is a relation scheme?

A definition; i.e., a set of attributes

What is a relational database scheme?

A set of relation schemes.

What is a Domain?

A type, defining acceptable values for a variable.

Signup and view all the flashcards

What is Repetition Anomaly?

The NAME, TITLE, and SAL attribute values are repeated for each project that an employee is involved in.

Signup and view all the flashcards

What is Update Anomaly?

If an attribute of project (say SAL of an employee) is updated, multiple tuples have to be updated to reflect the change.

Signup and view all the flashcards

What is Insertion Anomaly?

It may not be possible to store information about a new project until an employee is assigned to it.

Signup and view all the flashcards

What is Deletion Anomaly?

If an engineer, who is the only employee on a project, leaves, his personal information cannot be deleted, or the information about that project is lost.

Signup and view all the flashcards

What is Normalization?

A process of concept separation which applies a top-down methodology for producing a schema by subsequent refinements and decompositions.

Signup and view all the flashcards

What is First Normal Form (1NF)?

Eliminates the relations within relations or relations as attributes of tuples.

Signup and view all the flashcards

What is Second Normal Form (2NF)?

Eliminate the partial functional dependencies of non-prime attributes to key attributes

Signup and view all the flashcards

What is Third Normal Form (3NF)?

Eliminate the transitive functional dependencies of non-prime attributes to key attributes

Signup and view all the flashcards

What is Boyce-Codd Normal Form (BCNF)?

Eliminate the partial and transitive functional dependencies of prime (key) attributes to key.

Signup and view all the flashcards

What is Functional Dependence?

Given relation R defined over U = {A1, A2, ..., An} where X ⊆ U, Y ⊆ U. If, for all pairs of tuples t₁ and t₂ in any legal instance of relation scheme R, t1[X] = t2[X] ⇒ t₁[Y] = t2[Y].

Signup and view all the flashcards

What is Relational Algebra?

Specifies how to obtain the result using a set of operators

Signup and view all the flashcards

What is Selection?

Produces a horizontal subset of the operand relation

Signup and view all the flashcards

What is Projection?

Produces a vertical slice of a relation.

Signup and view all the flashcards

What is Union?

Result contains tuples that are in 'R' or in 'S', but not both (duplicates removed). R and S should be union-compatible.

Signup and view all the flashcards

What is Set Difference?

Result contains all tuples that are in 'R', but not in 'S'. R and S union-compatible.

Signup and view all the flashcards

What is Cartesian Product?

The result of R × S is a relation of degree (k₁+ k2) and consists of all (n₁* n2)-tuples where each tuple is a concatenation of one tuple of R with one tuple of S.

Signup and view all the flashcards

What is 0-Join?

Join is derivate of Cartesian product. There are various forms of join. The primary classification is between inner join and outer join.

Signup and view all the flashcards

What is Natural Join?

It is a equi-join of two relations R and S over an attribute (or attributes) common to both R and S and projecting out one copy of those attributes

Signup and view all the flashcards

What is Semijoin?

R F S = ΠΑ(RFS) = П₁(R) × ПАСB(S) = R F ΠАСВ(S)

Signup and view all the flashcards

What is Division (Quotient)?

T = R ÷ S gives T of degree k₁-k2 [i.e., T(Y) where Y = A-B] such that for a tuple t to appear in T, the values in t must appear in R in combination with every tuple in S.

Signup and view all the flashcards

What is Relational Calculus?

Specify the properties that the result should hold

Signup and view all the flashcards

What is a Computer Network?

An interconnected collection of autonomous computers that are capable of exchanging information among themselves.

Signup and view all the flashcards

What is the Internet?

Network of networks.

Signup and view all the flashcards

What is the Wide are network (WAN)?

Distance between any two nodes > 20km and can go as high as thousands of kms

Signup and view all the flashcards

What is the Local area network (LAN)?

Limited geographic scope (usually < 2km) and Speeds 10-1000 Mbps

Signup and view all the flashcards

What is Metropolitan area network (MAN)?

In between LAN and WAN

Signup and view all the flashcards

What is Irregular Topology?

No regularity in the interconnection - e.g., Internet

Signup and view all the flashcards

What is Bus Topology?

Typical in LANs – Ethernet Using Carrier Sense Medium Access with Collision Detection (CSMA/CD)

Signup and view all the flashcards

What is Point-to-point (unicast)?

One or more (direct or indirect) links between each pair of nodes Communication always between two nodes

Signup and view all the flashcards

What is Broadcast (multi-point)?

Messages are transmitted over a shared channel and received by all the nodes Each node checks the address and if it not the intended recipient, ignores

Signup and view all the flashcards

What is hosts connected by Links?

Hosts are connected by links, each of which can carry one or more channels

Signup and view all the flashcards

What is Capacity - bandwidth?

Host connected by multiple Channels, the the amount of information that can be transitive over the channel in a given time unit

Signup and view all the flashcards

What is Packet switching?

Messages are divided into fixed size packets, each of which is routed from the source to the destination

Signup and view all the flashcards

What is Circuit switching

A dedicated channel is established between the sender and receiver for the duration of the session

Signup and view all the flashcards

What is Communication Protocols?

Software that ensures error-free, reliable and efficient communication between hosts

Signup and view all the flashcards

Study Notes

Relational Model

  • A relation R features attributes A = {A1, A2, ..., An}.
  • It is defined over 'n' domains D = {D1, D2, ..., Dn}.
  • The domains are not necessarily distinct.
  • The relation uses values {Dom₁, Dom2, ..., Domn}.
  • It is a finite, time-varying set of n-tuples (d1, d2, ..., d₁) where each 'di' belongs to 'Domi'.
  • There are constraints such as A1 ⊆ D1, A2 ⊆ D2, ..., An⊆ Dn.
  • Represented as R(A1, A2, ..., An) or R(A1: D1, A2: D2, ..., An: Dn).
  • An instance of R at a given time is a set of n-tuples: {(A1: d₁, A2: d2, ..., An: dn) | d₁ ∈ Dom₁, d₂ ∈ Dom2, ..., dn ∈ Domn}.
  • It has a tabular structure where R is the table heading, attributes are table columns, and each tuple is a row.

Relational Scheme

  • A relational scheme defines a set of attributes.
  • Example: EMP(ENO, ENAME, TITLE, SAL, PNO, RESP, DUR) and PROJ (PNO, PNAME, BUDGET)

Relational Database Scheme

  • A relational database scheme is a set of relation schemes.
  • It is a set of sets of attributes.
  • A relation over a scheme R = {A1, ..., An} is a subset of the Cartesian product of the domains of all attributes: r ⊆ Dom₁ × Dom₂ × ... × Domn.

Domains

  • A domain is a type in the programming language sense.
  • Examples: Name: String, Salary: Real.
  • Domain values are a set of acceptable values for a variable of a given type.
  • Examples: CdnNames = {...}, ProfSalary = {45,000 - 150,000}.
  • Simple/Composite domains can include address which is Street name+street number+city+province+ postal code
  • Binary operations can be performed, such as comparisons and addition, where there is domain compatibility
  • Full support for domains is not fully provided in many current relational DBMSs.

Relation Keys

  • Underlined attributes are relation keys and tuple identifiers.
  • Tabular form is used

Repetition Anomaly

  • The NAME, TITLE, and SAL attribute values are repeated for each project an employee is involved in
  • This is a waste of space and complicates updates.

Update Anomaly

  • If any attribute of a project, like an employee's salary (SAL), is updated, multiple tuples must reflect the change.

Insertion Anomaly

  • It might not be possible to store information about a new project until an employee is assigned to it.

Deletion Anomaly

  • If an engineer, who is the only employee on a project, leaves the company, all information about that project might be lost.
  • Many tuples may be deleted.

Normalization

  • Improves each relation individually based on desired characteristics.
  • Involves normal forms.
  • Normalization is a process of concept separation using a top-down approach by subsequent refinements and decompositions.
  • Avoid combining unrelated sets of facts in one table. Each relation should contain an independent set of facts.
  • Universal relation assumption is applied.
  • This includes 1NF to 3NF and 1NF to BCNF.

Normalization Issues

  • How to decompose a schema into a desirable normal form.
  • Defined by lossless recomposability and dependency preservation.
  • Lossless decomposition means no information loss.
  • Reconstructability means the ability to recover the original relation with no spurious joins.
  • Dependency preservation means the constraints holding on the original relation should be enforceable by the decomposed relations.
  • Processing time may increase due to joins and there is denormalization.

Functional Dependence

  • Relation R is defined over U = {A1, A2, ..., An} where X ⊆ U, Y ⊆ U.
  • For all pairs of tuples t1 and t2 in any legal instance of relation scheme R, if t1[X] = t2[X] ⇒ t1[Y] = t2[Y], then X → Y holds in R.
  • In relation EMP: (ENO, PNO) → (ENAME, TITLE, SAL, DUR, RESP)
  • In relation PROJ: PNO → (PNAME, BUDGET)

Normal Forms Based on Functional Dependencies

  • 1NF: Eliminates relations within relations or relations as attributes of tuples
  • First Normal Form (1NF): Eliminates the partial functional dependencies of non-prime attributes to key attributes.
  • Second Normal Form (2NF): Eliminates the transitive functional dependencies of non-prime attributes to key attributes.
  • Third Normal Form (3NF): Eliminates the partial and transitive functional dependencies of prime (key) attributes to key.
  • Boyce-Codd Normal Form (BCNF).

Relational Algebra

  • Specifies how to obtain the result using a set of operators.
  • Expressed as <Operator>parameters(〈Operands) → (Result)
  • The operands are relations and result is also a relation.

Relational Algebra Operators

Fundamental

  • Selection
  • Projection
  • Union
  • Set difference
  • Cartesian product

Additional

  • Intersection
  • θ-join
  • Natural join
  • Semijoin
  • Division

Union Compatibility

  • Union operations must have the same degree
  • Corresponding attributes must be defined over the same domain

Selection

  • Produces a horizontal subset of the operand relation.
  • General form: σF(R) = {t | t ∈ R and F(t) is true}.
  • R is a relation, t is a tuple variable, and F is a formula.
  • F consists of operands (constants or attributes), arithmetic comparison operators (<, >, =, ≠, ≤, ≥), and logical operators (∧, ∨, ¬).

Projection

  • Produces a vertical slice of a relation.
  • General form: ΠA1,...,An(R) = {t[A1,..., An] | t ∈ R}.
  • R is a relation, t is a tuple variable, and {A1,..., An} is a subset of R's attributes.
  • Projection can generate duplicate tuples, with this feature being available in commercial systems and SQL.

Union

  • Similar to set union.
  • General form: R ∪ S = {t | t ∈ R or t ∈ S}.
  • R and S are relations, t is a tuple variable.
  • The result contains tuples that are in R or S but not both (duplicates removed), R and S must be union-compatible.

Set Difference

  • General form: R - S = {t | t ∈ R and t ∉ S}.
  • R and S are relations, t is a tuple variable.
  • Result contains all tuples that are in R but not in S, R - S ≠ S - R.
  • R and S must be union-compatible.

Cartesian (Cross) Product

  • For relations R of degree k1 and cardinality n1, and S of degree k2 and cardinality n2.
  • Cartesian product: R × S = {t [A1,...,Ak1, Ak1+1,...,Ak1+k2] | t[A1,………,Ak1] ∈ R and t[Ak1+1,...,Ak1+k2] ∈ S}.
  • The result of R × S is a relation of degree (k1 + k2) and consists of all (n1 * n2)-tuples.
  • Each tuple is a concatenation of one tuple of R with one tuple of S.

0-Join

  • Derived from Cartesian product.
  • Primary classification is between inner join and outer join.
  • General form: R ⋈F(R.Ai, S.Bj) S = {t[A1,...,An, B1,...,Bm] | t[A1,...,An] ∈ R and t[B1,...,Bm] ∈ S and F(R.Ai, S.Bj) is true}.
  • R and S are relations, t is a tuple variable, and F (R.Ai, S.Bj) is a selection formula
  • A derivative of Cartesian product: R ⋈F S = σF(R × S).

Types of Join

Equi Join

  • Is a special case of a 0-join
  • The formula F only contains equality (=) as an arithmetic operator.

Natural Join

  • It is a equi-join of two relations R and S over an attribute (or attributes) common to both R and S
  • It projects out one copy of those attributes
  • R ⋈ S = ΠR∪SσF(R × S)

Outer Join

  • Inner join requires that the joined tuples from two operand relations satisfy the join predicate, outer does not.
  • It ensures that tuples from one or both relations that do not satisfy the join condition still appear in the final result with other relation's attribute values set to NULL.
  • Three types: left (tuples from left operand are always in the result), ring (tuples from the right operand relation are always in the result), and full (Tuples from both relations are always in the result).

Semijoin

  • Semijoin of relation R, defined over attributes A, by relation S, defined over attributes B: a subset of tuples of R that participate in the join of R with S.
  • RF S = ΠA(RF S) = ΠA(R) ⋈ ΠA∩B(S) = R ⋈F ΠA∩B(S)
  • Decreases the number of tuples needed to form the join.
  • Results in a decreased number of secondary storage accesses by making better use of memory
  • Reduces the amount of data transmitted between sites to evaluate a query in a DDB
  • The resultant relation does not have the PAY attribute and is therefore smaller.

Division (Quotient)

  • For relations R of degree 'k1' and S of degree 'k2', where A (attributes of R) and B (attributes of S), and B is a subset of A.
  • T = R ÷ S gives T of degree k1-k2 where Y = A-B. For a tuple 't' to appear in T, the values in 't' must appear in R in combination with every tuple in S.
  • Derivation: R ÷ S = ΠY(R) – ΠY((ΠY(R) × S) – R)

Relational Calculus

  • Specifies the characteristics of the final result.
  • Includes tuple relational calculus and domain relational calculus.

Tuple Relational Calculus

  • Queries of the form {t | F{t}} where 't' is a tuple variable, and F is a well-formed formula.
  • Includes atomic formulas and tuple variable membership expressions.
  • An atomic formula is R.t or R(t) where tuple 't' belongs to relation R.
  • Conditions are s[A] θ t[B] or s[A] θ c where 's' and 't' are tuple variables, 'A' and 'B' are components of 's' and 't', 'c' is a constant, and θ is a comparison operator.
  • SQL is an example of tuple relational calculus, at least in simple examples

Domain Relational Calculus

  • Queries of the form {x1, x2, ..., xn | F(x1, x2, ..., xn)}
  • F is a well-formed formula in which x1, x2, ..., xn are the free variables.
  • QBE is an example

Computer Network

  • An interconnected collection of autonomous computers that exchange information among themselves.
  • Components are hosts (nodes, end systems), switches, and communication links.

Internet

  • A network of networks consisting of clients, intranets, ISP networks, and servers.

Types of Networks (by scale)

Wide Area Network (WAN)

  • Distance between nodes > 20km, can reach thousands of kms
  • Incurs long delays due to distances
  • Heterogeneous transmission media
  • Speeds of 150Mbps to 10Gbps (OC192 backbone).

Local Area Network (LAN)

  • Limited geographic scope (typically < 2km)
  • Speeds of 10-1000 Mbps
  • Short delays and low noise

Metropolitan Area Network (MAN)

  • In between LAN and WAN

Types of Networks (by topology)

Irregular

  • No consistent configuration.
  • E.g., the Internet.

Bus

  • Typical in LANs such as Ethernet
  • Uses Carrier Sense Medium Access with Collision Detection (CSMA/CD).

Others

  • Star
  • Ring
  • Mesh

Communication Schemes

Point-to-point (unicast)

  • One or more direct/indirect links between each pair of nodes.
  • Only communicates between two nodes
  • Sender and receiver are identified by addresses in the message header.
  • Message leverages one of multiple links between the sender and receiver using switching or routing.

Broadcast (multi-point)

  • Messages are transmitted over a shared channel.
  • Includes multi-cast, where the message is sent to a subset of nodes.

Communication Alternatives

  • Twisted pair
  • Coaxial
  • Fiber optic cable
  • Satellite
  • Microwave
  • Wireless.

Data Communication

  • Hosts are connected by "links," each of which can carry one or more "channels."
  • A "link" is a physical entity, whereas a "channel" is a logical one.
  • The amount of data transmitted over the channel is called "bandwidth."
  • Alternative messaging schemes include packet switching.
  • Messages are divided into fixed size packets routed from source to destination when packet switching
  • A dedicated channel between the sender and the receiver for the session with circuit switching

Communication Protocols

  • Software that ensures reliable and efficient communication between hosts.
  • Arranged in a layered architecture known as the protocol stack or protocol suite.
  • Transmission Control Protocol/Internet Protocol(TCP/IP) is the best-known which is used in the internet

Studying That Suits You

Use AI to generate personalized quizzes and flashcards to suit your learning preferences.

Quiz Team

Related Documents

More Like This

Use Quizgecko on...
Browser
Browser