Transparent Management of Distributed Data

What does transparency refer to in a system?

Transparency refers to the separation of the higher-level semantics of a system from lower-level implementation issues.

Which type of transparency protects the user from the details of the network and even hides the existence of the network from the user?

Network Transparency

Logical data independence refers to the immunity of user applications to changes in the physical structure of data.

False

Fragmentation refers to the division of each database relation into smaller fragments for reasons of ____, ____, and ____.

performance, availability, reliability

Match the fragmentation alternatives with their descriptions:

Horizontal fragmentation = Partitioning a relation into sub-relations, each holding a subset of the tuples
Vertical fragmentation = Defining sub-relations based on a subset of the attributes

What are the three major players responsible for providing transparency in a system?

Language/Compiler, Operating System, DDBMS

What are the two basic alternatives for placing data in a distributed database?

Partitioned (or non-replicated) and Replicated (fully or partially)

What factors need to be considered in distributed query processing?

All of the above

Concurrency control ensures the synchronization of accesses to a distributed database.

True

Distributed deadlock management includes alternatives for prevention, avoidance, and ________. (Fill in the blank)

detection/recovery

Match the following replication protocols with their descriptions:

Eager protocol = Force updates to all replicas before the transaction completes
Lazy protocol = Update one copy and propagate the update after the transaction completes

What are some additional issues faced in distributed databases due to the changing environment?

Looser federation, multi-database systems, growth of the Internet, peer-to-peer computing, and web growth

What is a distributed database defined as?

A collection of multiple, logically interrelated databases distributed over a computer network.

Which of the following are types of accesses involved in a Distributed Database System?

Local access, remote access, and global access

What are the two main characteristics of candidate applications for a Distributed Database System?

A large number of users, and users who are physically spread across a large geographical area.

In a centralized Client-Server System, data management is distributed across multiple systems.

False

In a Distributed Database System, the __________ contains the global schema.

DDBMS

What are the three orthogonal dimensions/Alternatives of DDBS?

Delivery modes, frequency, and communication methods

What is the purpose of periodic delivery in a Distributed Database System?

To send data from the server to clients at regular intervals.

Distributed Database Systems help reduce telecommunication costs.

True

Study Notes

Distributed Database Systems Overview

  • A distributed database system is a collection of multiple, logically interrelated databases distributed over a computer network.
  • A distributed DBMS (DDBMS) is the software system that permits the management of the distributed database and makes the distribution transparent to the users.

Characteristics of Distributed Database Systems

  • Data management at multiple sites: data is stored and managed at multiple, geographically separate sites.
  • Local requirements: each site storing data in a DDBS is called a local site, catering to local users.
  • Global perspective: the DDBS fulfills global requirements in a transparent way.

Access to a Distributed Database System

  • Three types of accesses:
    • Local access: access by users connected to a site, accessing data from the same site.
    • Remote access: a user connected to a site, accessing data from another site.
    • Global access: data is displayed after being collected from all locations, regardless of access location.
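
The three access types above can be told apart by comparing where the user is connected with where the requested data lives. The following is a minimal sketch under assumptions: the function name, the site labels, and the rule that "global" means every site contributes data are illustrative choices, not part of any particular DDBMS.

```python
from __future__ import annotations

from enum import Enum


class AccessType(Enum):
    LOCAL = "local"
    REMOTE = "remote"
    GLOBAL = "global"


def classify_access(user_site: str, data_sites: set[str], all_sites: set[str]) -> AccessType:
    """Classify an access from the user's site and the sites holding the data.

    Assumption: an access touching only some of the other sites is treated
    as remote; only an access that collects data from every site is global.
    """
    if len(all_sites) > 1 and data_sites == all_sites:
        return AccessType.GLOBAL   # data collected from all locations
    if data_sites == {user_site}:
        return AccessType.LOCAL    # data stored at the user's own site
    return AccessType.REMOTE       # data stored at another site
```

For example, a user connected to site "A" who reads data held only at site "B" would be classified as a remote access.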

Application of Distributed Databases

  • Candidates for a DDBS have two main characteristics:
    • Large number of users.
    • Users are physically spread across a large geographical area.
  • Examples of candidates:
    • Banking applications.
    • Air ticketing.
    • Business at multiple locations.

Data Delivery Alternatives

  • Three orthogonal dimensions/alternatives of DDBS exist:
    • Delivery modes (pull-only, push-only, hybrid).
    • Frequency (periodic, conditional, ad-hoc).
    • Communication methods (unicast, multicast).
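
"Orthogonal" here means that a choice along one dimension does not restrict the choice along the others. A small sketch, assuming the enum names below (which are illustrative), enumerates every combination of the values listed above:

```python
from enum import Enum
from itertools import product


class DeliveryMode(Enum):
    PULL_ONLY = "pull-only"
    PUSH_ONLY = "push-only"
    HYBRID = "hybrid"


class Frequency(Enum):
    PERIODIC = "periodic"
    CONDITIONAL = "conditional"
    AD_HOC = "ad-hoc"


class Communication(Enum):
    UNICAST = "unicast"
    MULTICAST = "multicast"


# Every (mode, frequency, communication) triple is a candidate design point,
# because the three dimensions are chosen independently of one another.
ALL_ALTERNATIVES = list(product(DeliveryMode, Frequency, Communication))
```

With three modes, three frequencies, and two communication methods, the list holds 3 × 3 × 2 = 18 combinations, although not every combination is equally sensible in practice.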

Communication Methods

  • Unicast: one-to-one communication from a server to a client.
  • Multicast: one-to-many communication from a server to multiple clients.

Distributed Database System vs Centralized Client-Server System

  • Centralized C/S System:
    • Data management is carried out on a single centralized system.
    • Data is accessed from different machines (clients) connected through a network.
    • Data storage and management is mainly done on the server.

Promises of Distributed Database Systems

  • Four fundamental advantages:
    • Transparent management of distributed and replicated data.
    • Reliable access to data through distributed transactions.
    • Improved performance.
    • Easier system expansion.

Transparency in Distributed Database Systems

  • Types of transparencies:
    • Data independence.
    • Network transparency or distribution transparency.
    • Replication transparency.
    • Fragmentation transparency.
  • Responsibility of transparency:
    • Access layer (language/compiler).
    • Operating system level.
    • DBMS level.
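
Fragmentation transparency means users query a whole relation while the system works with its fragments. The sketch below, using made-up sample data and hypothetical helper names, shows horizontal fragmentation (subsets of tuples), vertical fragmentation (subsets of attributes, with the key repeated in each fragment), and how each is reassembled:

```python
# Made-up sample relation; "id" is assumed to be the key.
EMPLOYEES = [
    {"id": 1, "name": "Ana", "city": "Lahore", "salary": 50},
    {"id": 2, "name": "Ben", "city": "Karachi", "salary": 60},
    {"id": 3, "name": "Cy", "city": "Lahore", "salary": 70},
]


def horizontal_fragment(rows, predicate):
    """Keep the tuples that satisfy the predicate."""
    return [r for r in rows if predicate(r)]


def vertical_fragment(rows, key, attrs):
    """Keep the key plus a subset of the attributes."""
    return [{key: r[key], **{a: r[a] for a in attrs}} for r in rows]


def reconstruct_horizontal(*fragments):
    """Union of the horizontal fragments."""
    return [r for frag in fragments for r in frag]


def reconstruct_vertical(left, right, key):
    """Join the vertical fragments on the shared key."""
    right_by_key = {r[key]: r for r in right}
    return [{**l, **right_by_key[l[key]]} for l in left]
```

Repeating the key in every vertical fragment is what makes the join in `reconstruct_vertical` possible; without it the original tuples could not be rebuilt.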

Layers of Transparency

  • Language/compiler provides transparency features.
  • Operating system provides network transparency.
  • DBMS provides fragmentation or replication transparency.

Distributed Database Systems (DDBS)

Critical Issues in DDBS

  • Consistency of data across multiple sites is a serious issue
  • Distributed transactions should be executed at multiple sites without fear of failure
  • Reliability through distributed transactions ensures user applications do not need to worry about coordinating accesses to individual local databases or site/communication link failures

Advantages of DDBS

Improved Performance

  • Data Localization: data is stored close to its point of use, reducing remote access delays and contention for CPU and I/O services
  • Query Parallelism: a query can be executed in parallel, improving performance

Easier System Expansion

  • Expansion can be handled by adding processing and storage power to the network
  • More economical than establishing a single, centralized system

Complications Introduced by Distribution

  • Selection of the Copy: choosing the right copy based on constraining variables (distance, load, site availability, etc.)
  • Failure Recovery: synchronization of copies after failure is more complex than in centralized systems
  • Complexity: overall system is more complex than centralized database systems
  • Cost: higher than for a centralized system, since hardware and trained staff are needed at multiple sites
  • Distribution of Control: access to data must be carefully managed, with well-defined rights for local sites
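
To illustrate the copy-selection problem listed above, here is one possible sketch. The cost formula (distance plus a weighted load term) and the names `Replica`, `select_copy`, and `load_weight` are assumptions made for illustration; real systems may weigh these variables quite differently.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    site: str
    distance_ms: float   # assumed network distance to the requester
    load: float          # assumed current load, from 0.0 (idle) to 1.0 (saturated)
    available: bool      # whether the site is currently reachable


def select_copy(replicas, load_weight=100.0):
    """Pick the available replica with the lowest combined cost."""
    candidates = [r for r in replicas if r.available]
    if not candidates:
        raise LookupError("no available replica")
    # Hypothetical cost: distance plus a penalty proportional to load.
    return min(candidates, key=lambda r: r.distance_ms + load_weight * r.load)
```

Under this cost function a nearby but heavily loaded site can lose to a more distant, lightly loaded one, and an unreachable site is never chosen however close it is.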

Design Issues in DDBS

  • Distributed Database Design: how to place the database and applications across sites
  • Distributed Directory Management: global or local directories, centralized or distributed
  • Distributed Query Processing: deciding on a strategy for executing queries over the network
  • Distributed Concurrency Control: maintaining consistency of multiple copies of the database
  • Distributed Deadlock Management: prevention, avoidance, and detection/recovery
  • Reliability of Distributed DBMS: ensuring consistency and detecting/recovering from failures
  • Replication: fully or partially replicating the database
  • Relationship among Problems: design issues affect each other

Distributed Concurrency Control

  • Ensuring integrity of the database despite concurrent access
  • Two classes: pessimistic (synchronizing before execution) and optimistic (executing and checking consistency)
  • Two fundamental primitives: Locking and Timestamping
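
As an illustration of the timestamping primitive, the sketch below applies the basic timestamp-ordering rules on a single site: an operation is rejected if a transaction with a later timestamp has already performed a conflicting operation on the same item. The class and method names are hypothetical, and a real distributed scheduler would also need globally unique timestamps and transaction restarts, which are omitted here.

```python
class TimestampOrdering:
    """Basic timestamp-ordering checks for reads and writes on data items."""

    def __init__(self):
        self.read_ts = {}    # item -> largest timestamp that has read it
        self.write_ts = {}   # item -> largest timestamp that has written it

    def read(self, ts, item):
        # Reject a read that arrives after a younger transaction's write.
        if ts < self.write_ts.get(item, 0):
            return False
        self.read_ts[item] = max(self.read_ts.get(item, 0), ts)
        return True

    def write(self, ts, item):
        # Reject a write that arrives after a younger read or write.
        if ts < self.read_ts.get(item, 0) or ts < self.write_ts.get(item, 0):
            return False
        self.write_ts[item] = ts
        return True
```

A rejected operation would normally cause its transaction to abort and restart with a new timestamp, whereas a locking scheme would make the transaction wait instead.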

Distributed Deadlock Management

  • Alternatives: prevention, avoidance, and detection/recovery
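
For the detection alternative, a common textbook approach is to build a wait-for graph and look for a cycle. The sketch below assumes each site reports a local graph as a dictionary mapping a transaction to the transactions it waits for; the function names are illustrative. It shows why detection is harder when distributed: a cycle may appear only after the local graphs are merged.

```python
def merge_wait_for(*local_graphs):
    """Combine per-site wait-for graphs into one global graph."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_for in graph.items():
            merged.setdefault(txn, set()).update(waits_for)
    return merged


def has_deadlock(graph):
    """Return True if the wait-for graph contains a cycle (depth-first search)."""
    WHITE, GREY, BLACK = 0, 1, 2   # unvisited, on the current path, finished
    color = {}

    def visit(txn):
        color[txn] = GREY
        for nxt in graph.get(txn, ()):
            state = color.get(nxt, WHITE)
            if state == GREY:
                return True        # back edge: a cycle, hence a deadlock
            if state == WHITE and visit(nxt):
                return True
        color[txn] = BLACK
        return False

    return any(color.get(t, WHITE) == WHITE and visit(t) for t in list(graph))
```

If T1 waits for T2 at one site and T2 waits for T1 at another, neither local graph has a cycle, but the merged graph does.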

Reliability of Distributed DBMS

  • Ensuring consistency of the database despite failures
  • Recovering from failures and bringing databases up-to-date

Replication

  • Ensuring consistency of replicas (copies of the same data item)
  • Protocols: Eager (forcing updates to all replicas before completion) or Lazy (propagating updates after completion)
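
The difference between the two protocols is when the other replicas see an update. The following is a minimal in-memory sketch, with hypothetical class and method names and no failure handling, meant only to contrast the timing:

```python
class ReplicatedItem:
    """One data item with a copy at each site."""

    def __init__(self, sites):
        self.copies = {site: None for site in sites}
        self.pending = []   # (site, value) updates not yet propagated

    def eager_write(self, value):
        # Eager: every replica is updated before the transaction completes.
        for site in self.copies:
            self.copies[site] = value

    def lazy_write(self, primary, value):
        # Lazy: update one copy now and queue the rest for later.
        self.copies[primary] = value
        self.pending.extend((s, value) for s in self.copies if s != primary)

    def propagate(self):
        # Runs after the transaction has completed.
        for site, value in self.pending:
            self.copies[site] = value
        self.pending.clear()
```

Between `lazy_write` and `propagate` the replicas disagree, which is the window of inconsistency that lazy protocols accept in exchange for faster transaction completion.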

Relationship among Problems

  • Design of distributed databases affects many areas
  • Directory management, query processing, concurrency control, and replication are interrelated

This quiz assesses your understanding of transparent management of distributed and replicated data, including the concept of transparency in system design. It covers how transparency enables the development of complex applications and allows users to pose queries without worrying about implementation details.
