20 Questions
What does transparency refer to in a system?
Transparency refers to the separation of the higher-level semantics of a system from lower-level implementation issues.
Which type of transparency protects the user from the details of the network and even hides the existence of the network from the user?
Network Transparency
Logical data independence refers to the immunity of user applications to changes in the physical structure of data.
False
Fragmentation refers to the division of each database relation into smaller fragments for reasons of ____, ____, and ____.
performance, availability, reliability
Match the fragmentation alternatives with their descriptions:
Horizontal fragmentation = Partitioning a relation into sub-relations, each holding a subset of the tuples
Vertical fragmentation = Defining sub-relations based on a subset of the attributes
What are the three major players responsible for providing transparency in a system?
Language/Compiler, Operating System, DDBMS
What are the two basic alternatives for partitioning a distributed database?
Partitioned (or non-replicated) and Replicated (fully or partially)
What factors need to be considered in distributed query processing?
All of the above
Concurrency control ensures the synchronization of accesses to a distributed database.
True
Distributed deadlock management includes alternatives for prevention, avoidance, and ________. (Fill in the blank)
detection/recovery
Match the following replication protocols with their descriptions:
Eager protocol = Force updates to all replicas before the transaction completes
Lazy protocol = Update one copy, then propagate the update after the transaction completes
What are some additional issues faced in distributed databases due to the changing environment?
Looser federation, multi-database systems, growth of the Internet, peer-to-peer computing, and web growth
What is a distributed database defined as?
A collection of multiple, logically interrelated databases distributed over a computer network.
Which of the following are types of accesses involved in a Distributed Database System?
Local access, remote access, and global access
What are the two main characteristics of candidate applications for a Distributed Database System?
A large number of users, and users who are physically spread across a large geographical area.
In a centralized Client-Server System, data management is distributed across multiple systems.
False
In a Distributed Database System, the __________ contains the global schema.
DDBMS
What are the three orthogonal dimensions/Alternatives of DDBS?
Delivery modes, frequency, and communication methods
What is the purpose of periodic delivery in a Distributed Database System?
To send data from the server to clients at regular intervals.
Distributed Database Systems help reduce telecommunication costs.
True
Study Notes
Distributed Database Systems Overview
- A distributed database system is a collection of multiple, logically interrelated databases distributed over a computer network.
- A distributed DBMS (DDBMS) is the software system that permits the management of the distributed database and makes the distribution transparent to the users.
Characteristics of Distributed Database Systems
- Data management at multiple sites: data is stored and managed at multiple, geographically separate sites.
- Local requirements: each site storing data in a DDBS is called a local site, catering to local users.
- Global perspective: the DDBS fulfills global requirements in a transparent way.
Access to a Distributed Database System
- Three types of accesses:
  - Local access: access by users connected to a site, accessing data from the same site.
  - Remote access: a user connected to a site, accessing data from another site.
  - Global access: data is displayed after being collected from all locations, regardless of access location.
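The three access types can be told apart by comparing where the user is connected with where the requested data lives. The following is a minimal sketch, not part of the original notes: the `classify_access` function and the site names are hypothetical, and it assumes that any request touching more than one site counts as global access.

```python
from enum import Enum


class AccessType(Enum):
    LOCAL = "local"
    REMOTE = "remote"
    GLOBAL = "global"


def classify_access(user_site, data_sites):
    """Classify an access from the user's site and the set of sites holding the data.

    Assumption for this sketch: data spread over more than one site is
    treated as a global access.
    """
    data_sites = set(data_sites)
    if not data_sites:
        raise ValueError("the request must touch at least one site")
    if data_sites == {user_site}:
        return AccessType.LOCAL      # data is at the user's own site
    if len(data_sites) == 1:
        return AccessType.REMOTE     # data is at one other site
    return AccessType.GLOBAL         # data is collected from several sites
```

For example, a user connected to site "A" who reads data stored only at site "B" makes a remote access.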
Application of Distributed Databases
- Candidates for a DDBS have two main characteristics:
  - Large number of users.
  - Users are physically spread across a large geographical area.
- Examples of candidates:
  - Banking applications.
  - Air ticketing.
  - Business at multiple locations.
Data Delivery Alternatives
- Three orthogonal dimensions/alternatives of DDBS exist:
  - Delivery modes (pull-only, push-only, hybrid).
  - Frequency (periodic, conditional, ad-hoc).
  - Communication methods (unicast, multicast).
Communication Methods
- Unicast: one-to-one communication from a server to a client.
- Multicast: one-to-many communication from a server to multiple clients.
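To make the delivery modes and communication methods concrete, here is a small in-memory sketch. It is an illustration only: the `Server` and `Client` classes and their method names are hypothetical, and no real networking is involved. Pull delivery is a client-initiated request, while push delivery is server-initiated and can go to one client (unicast) or to every subscriber (multicast).

```python
class Client:
    def __init__(self, name):
        self.name = name
        self.inbox = []  # data pushed to this client by the server

    def receive(self, key, value):
        self.inbox.append((key, value))


class Server:
    def __init__(self):
        self.data = {}
        self.subscribers = []  # clients registered for pushed updates

    def pull(self, key):
        """Pull mode: the client asks and the server answers."""
        return self.data.get(key)

    def push_unicast(self, client, key):
        """Push mode, unicast: the server sends to exactly one client."""
        client.receive(key, self.data[key])

    def push_multicast(self, key):
        """Push mode, multicast: the server sends to all subscribers."""
        for client in self.subscribers:
            client.receive(key, self.data[key])
```

A hybrid mode would combine both: clients pull on demand and also subscribe to pushed updates.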
Distributed Database System vs Centralized Client-Server System
- Centralized C/S System:
  - Data management is carried out on a single centralized system.
  - Data is accessed from different machines (clients) connected through a network.
  - Data storage and management is mainly done on the server.
Promises of Distributed Database Systems
- Four fundamental advantages:
  - Transparent management of distributed and replicated data.
  - Reliable access to data through distributed transactions.
  - Improved performance.
  - Easier system expansion.
Transparency in Distributed Database Systems
- Types of transparencies:
  - Data independence.
  - Network transparency or distribution transparency.
  - Replication transparency.
  - Fragmentation transparency.
- Responsibility of transparency:
  - Access layer (language/compiler).
  - Operating system level.
  - DBMS level.
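Fragmentation transparency means a user can query a relation as if it were whole, even though it is stored as fragments. The sketch below illustrates the two fragmentation alternatives from the quiz (horizontal: a subset of tuples; vertical: a subset of attributes) and how the original relation could be rebuilt from either. The sample `EMPLOYEES` relation and all function names are hypothetical, and the vertical case assumes every fragment keeps the key attribute `id`.

```python
EMPLOYEES = [
    {"id": 1, "name": "Ann", "city": "Paris", "salary": 50},
    {"id": 2, "name": "Bob", "city": "Tokyo", "salary": 60},
    {"id": 3, "name": "Cy", "city": "Paris", "salary": 70},
]


def horizontal_fragment(relation, predicate):
    """Keep only the tuples (rows) that satisfy the predicate."""
    return [row for row in relation if predicate(row)]


def vertical_fragment(relation, attributes, key="id"):
    """Keep only the chosen attributes, plus the key needed to rejoin later."""
    return [{a: row[a] for a in (key, *attributes)} for row in relation]


def rebuild_from_horizontal(*fragments, key="id"):
    """Union of the horizontal fragments, ordered by key."""
    rows = [row for fragment in fragments for row in fragment]
    return sorted(rows, key=lambda row: row[key])


def rebuild_from_vertical(left, right, key="id"):
    """Join two vertical fragments on the shared key."""
    right_by_key = {row[key]: row for row in right}
    return [{**row, **right_by_key[row[key]]} for row in left]
```

A DBMS that offers fragmentation transparency would perform the equivalent of the two `rebuild_*` steps on the user's behalf.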
Layers of Transparency
- Language/compiler provides transparency features.
- Operating system provides network transparency.
- DBMS provides fragmentation or replication transparency.
Distributed Database Systems (DDBS)
Critical Issues in DDBS
- Consistency of data across multiple sites is a serious issue
- Distributed transactions must execute correctly across multiple sites, even when a site or communication link fails
- Reliability through distributed transactions ensures user applications do not need to worry about coordinating accesses to individual local databases or site/communication link failures
Advantages of DDBS
Improved Performance
- Data Localization: data is stored close to its point of use, reducing remote access delays and contention for CPU and I/O services
- Query Parallelism: a query can be executed in parallel, improving performance
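Query parallelism can be sketched as running the same sub-query at every site at once and then combining the partial results. The example below is only an illustration under stated assumptions: the `SITES` dictionary stands in for data stored at three hypothetical sites, and threads stand in for the sites doing their local work concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical data: order amounts stored at three different sites.
SITES = {
    "site_a": [120, 80, 40],
    "site_b": [200, 10],
    "site_c": [55],
}


def local_sum(site):
    """Sub-query executed at one site, over that site's local data only."""
    return sum(SITES[site])


def parallel_total(sites):
    """Run the sub-query at all sites in parallel and combine the partial sums.

    Assumes `sites` is a non-empty list of site names.
    """
    with ThreadPoolExecutor(max_workers=len(sites)) as pool:
        return sum(pool.map(local_sum, sites))


total = parallel_total(list(SITES))  # 240 + 210 + 55 = 505
```

Because each site only scans its own data, this also reflects data localization: no raw rows cross the network, only the partial sums.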
Easier System Expansion
- Expansion can be handled by adding processing and storage power to the network
- More economical than establishing a single, centralized system
Complications Introduced by Distribution
- Selection of the Copy: choosing the right copy based on constraining variables (distance, load, site availability, etc.)
- Failure Recovery: synchronization of copies after failure is more complex than in centralized systems
- Complexity: overall system is more complex than centralized database systems
- Cost: costs are higher, because hardware and trained staff are needed at multiple sites
- Distribution of Control: access to data must be carefully managed, with well-defined rights for local sites
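Copy selection, the first complication in the list above, can be illustrated with a simple cost function over the constraining variables the notes mention (distance, load, site availability). This is a sketch only: the `Replica` class, the units, and the `load_weight` parameter are hypothetical choices for illustration, not a prescribed formula.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    site: str
    distance: float   # e.g. network latency to the site (hypothetical units)
    load: float       # current load as a fraction between 0 and 1
    available: bool   # whether the site can currently be reached


def choose_copy(replicas, load_weight=100.0):
    """Pick the available replica with the lowest combined cost.

    The cost (distance + load_weight * load) is an assumption of this sketch.
    """
    candidates = [r for r in replicas if r.available]
    if not candidates:
        raise RuntimeError("no replica is available")
    return min(candidates, key=lambda r: r.distance + load_weight * r.load)
```

With this cost, a nearby but heavily loaded site can lose out to a more distant idle one, and an unavailable site is never chosen regardless of distance.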
Design Issues in DDBS
- Distributed Database Design: how to place the database and applications across sites
- Distributed Directory Management: global or local directories, centralized or distributed
- Distributed Query Processing: deciding on a strategy for executing queries over the network
- Distributed Concurrency Control: maintaining consistency of multiple copies of the database
- Distributed Deadlock Management: prevention, avoidance, and detection/recovery
- Reliability of Distributed DBMS: ensuring consistency and detecting/recovering from failures
- Replication: fully or partially replicating the database
- Relationship among Problems: design issues affect each other
Distributed Concurrency Control
- Ensuring integrity of the database despite concurrent access
- Two classes: pessimistic (synchronizing transactions before they execute) and optimistic (executing first and checking consistency afterwards)
- Two fundamental primitives: Locking and Timestamping
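As an illustration of the timestamping primitive, here is a minimal sketch of basic timestamp ordering on a single data item. It assumes each transaction carries a unique timestamp, with older transactions having smaller values; the `Item` class, the `Abort` exception, and the function names are hypothetical. An operation that arrives "too late" relative to a younger transaction's access is rejected.

```python
class Abort(Exception):
    """Raised when an operation would violate timestamp order."""


class Item:
    def __init__(self, value=None):
        self.value = value
        self.read_ts = 0    # largest timestamp that has read the item
        self.write_ts = 0   # largest timestamp that has written the item


def read(item, ts):
    # A transaction may not read a value written by a younger transaction.
    if ts < item.write_ts:
        raise Abort(f"read at ts={ts} is older than write_ts={item.write_ts}")
    item.read_ts = max(item.read_ts, ts)
    return item.value


def write(item, ts, value):
    # A transaction may not overwrite data that a younger transaction
    # has already read or written.
    if ts < item.read_ts or ts < item.write_ts:
        raise Abort(f"write at ts={ts} arrives too late")
    item.value = value
    item.write_ts = ts
```

A locking-based scheme would instead make the late transaction wait for a lock rather than abort it.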
Distributed Deadlock Management
- Alternatives: prevention, avoidance, and detection/recovery
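The detection alternative is commonly explained with a wait-for graph, in which an edge from T1 to T2 means T1 is waiting for a resource held by T2, and a cycle means deadlock. The sketch below is an illustration under assumptions: the graph representation and function names are hypothetical, and it assumes the local graphs from each site can be merged at one place. It shows why the distributed case is harder: neither site sees a cycle on its own.

```python
def merge_graphs(*local_graphs):
    """Combine the wait-for graphs of several sites into one global graph."""
    merged = {}
    for graph in local_graphs:
        for txn, waits_for in graph.items():
            merged.setdefault(txn, set()).update(waits_for)
    return merged


def find_deadlock(wait_for):
    """Return the transactions on a cycle of the wait-for graph, or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    nodes = set(wait_for) | {t for ts in wait_for.values() for t in ts}
    colour = {node: WHITE for node in nodes}
    path = []

    def visit(node):
        colour[node] = GREY
        path.append(node)
        for nxt in wait_for.get(node, ()):
            if colour[nxt] == GREY:           # back edge: cycle found
                return path[path.index(nxt):]
            if colour[nxt] == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        colour[node] = BLACK
        path.pop()
        return None

    for node in sorted(nodes):
        if colour[node] == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


site_1 = {"T1": {"T2"}}   # at site 1, T1 waits for T2
site_2 = {"T2": {"T1"}}   # at site 2, T2 waits for T1
```

Recovery would then typically abort one transaction on the returned cycle so that the others can proceed.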
Reliability of Distributed DBMS
- Ensuring consistency of the database despite failures
- Recovering from failures and bringing databases up-to-date
Replication
- Ensuring consistency of replicas (copies of the same data item)
- Protocols: Eager (forcing updates to all replicas before completion) or Lazy (propagating updates after completion)
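The difference between the two protocols is when the other replicas see an update. The following in-memory sketch is illustrative only: the `ReplicatedItem` class and its method names are hypothetical, and real protocols must also handle failures and concurrent writers, which are ignored here.

```python
class ReplicatedItem:
    def __init__(self, sites):
        self.copies = {site: None for site in sites}  # one copy per site
        self.pending = []  # lazy updates not yet sent to the other replicas

    def eager_write(self, value):
        """Eager: every replica is updated before the write completes."""
        for site in self.copies:
            self.copies[site] = value

    def lazy_write(self, primary, value):
        """Lazy: only one copy is updated now; the rest catch up later."""
        self.copies[primary] = value
        self.pending.append(value)

    def propagate(self):
        """Apply the queued lazy updates, in order, to all replicas."""
        for value in self.pending:
            for site in self.copies:
                self.copies[site] = value
        self.pending.clear()
```

Between `lazy_write` and `propagate`, a reader at another site can see a stale value, which is the consistency cost that lazy protocols trade for faster transaction completion.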
Relationship among Problems
- Design of distributed databases affects many areas
- Directory management, query processing, concurrency control, and replication are interrelated
This quiz assesses your understanding of transparent management of distributed and replicated data, including the concept of transparency in system design. It covers how transparency enables the development of complex applications and allows users to pose queries without worrying about implementation details.