Questions and Answers
In a centralized system, how do general-purpose computers typically access shared memory?
- Via a common bus connecting CPUs and device controllers. (correct)
- By implementing a message-passing interface.
- Using a distributed cache coherency protocol.
- Through a dedicated high-speed network connection.
What is the primary role of the back-end in a client-server database system?
- Providing a graphical user interface for users.
- Handling communication between the client and the network.
- Generating reports and forms for data presentation.
- Managing access structures, query evaluation, and concurrency control. (correct)
Why is replacing mainframes with client-server architectures beneficial for organizations?
- Mainframes provide better user interfaces.
- Client-server architectures offer better functionality for the cost and easier maintenance. (correct)
- Mainframes are easier to scale and maintain.
- Client-server architectures centralize all processing tasks.
Which type of server system is commonly used in relational database systems?
What is the function of the Log writer process in a transaction server?
What is the purpose of implementing mutual exclusion in database systems?
What is the primary advantage of using data servers in high-speed LANs?
What is 'cache coherency' in the context of data caching in data servers?
What characterizes a 'coarse-grain parallel' machine?
How is 'speedup' measured in the context of parallel systems?
What is 'transaction scaleup' designed to address in parallel database systems?
Which factor contributes to sublinear speedup and scaleup in parallel systems?
What is a limitation of using a 'bus' interconnection network in parallel systems?
In a 'hypercube' interconnection network, how are components connected?
What is a characteristic of the 'shared memory' architecture in parallel database systems?
Where does the bottleneck typically occur in a 'shared disk' parallel database system?
What is the primary advantage of a 'shared nothing' architecture in parallel database systems?
What is a 'hierarchical' database architecture a combination of?
What is a key characteristic of distributed systems concerning data?
What is the primary goal of homogeneous distributed databases?
How does a 'global transaction' differ from a 'local transaction' in a distributed database?
What is a significant trade-off in distributed systems related to data management?
What is the purpose of the two-phase commit protocol (2PC) in distributed databases?
What is the primary difference between local-area networks (LANs) and wide-area networks (WANs)?
What is a key characteristic of groupware applications working on WANs with discontinuous connections?
In a client-server architecture, if the front-end requires data mining and analysis tools, where would these tools reside?
To ensure the safe concurrent access of shared data, database systems use mutual exclusion, which is typically implemented using:
Considering transaction server processes, what function does the checkpoint process serve?
What is the significance of message passing overhead in page-shipping versus item-shipping scenarios within data servers?
In data server architectures, 'lock caching' is employed between transactions. Which statement explains its main benefit?
How does 'skew' affect overall execution time in parallel systems?
How often does a server process add log records in a log record buffer?
Which of these choices is NOT involved in the structure of transaction server process?
To avoid overhead of interprocess communication for lock request/grant, what is an alternative?
What does a node function as, regarding shared nothing systems?
What is the disadvantage of added complexity required to ensure proper coordination among sites?
Why are wide-area networks with continuous connection (e.g. the Internet) needed for implementing distributed database systems?
What is a downside to shared memory systems?
What is it called when a fixed-size problem that runs on a small system is given to a system that is N times larger?
How does the mesh network typically scale?
What is a characteristic of the processes in Server Processes?
Flashcards
Centralized Systems
Run on a single computer system and do not interact with other computer systems.
General-purpose computer system
One to a few CPUs and device controllers connected through a common bus, providing access to shared memory.
Single-user system
Typically has one CPU, one or two hard disks, and an OS that may support only one user
Client-Server Systems
Back-end (Client-Server)
Front-end (Client-Server)
Transaction Servers
Transaction Server Process
Server Processes
Multithreaded Processes
Database writer process
Log writer process
Process monitor process
Mutual Exclusion
Data Servers
Page/Item Shipping
Cache Coherency
Parallel Systems
Coarse-grain Parallel
Fine-grain Parallel
Throughput
Response Time
Speedup
Scaleup
Batch Scaleup
Shared Memory Architecture
Transaction Scaleup
Startup Costs
Interference (Parallel)
Bus Network
Mesh Network
Hypercube Network
Shared Memory
Shared Disk
Shared Nothing
Hierarchical Architecture
Distributed Systems
Homogeneous Database
Local Transaction
Global Transaction
Local-Area Networks (LANs)
Wide-Area Networks (WANs)
Study Notes
- Covers database system architectures
- Includes: centralized and client-server systems, server system architectures, parallel systems, distributed systems, and network types.
Centralized Systems
- Run on a single computer and do not interact with other computer systems
- A general-purpose computer system has one to a few CPUs and a number of device controllers connected through a common bus.
- The common bus provides access to shared memory.
- A single-user system is typically a desktop unit with one CPU, one or two hard disks, and an OS that may support only one user.
- A multi-user system has more disks, memory, CPUs, uses a multi-user OS, and serves many users via terminals.
- Multi-user systems are often called server systems.
Client-Server Systems
- Server systems satisfy requests generated at m client systems.
- Database functionality in client-server systems can be divided into back-end and front-end components
- Back-end manages access structures, query evaluation/optimization, concurrency control, and recovery.
- Front-end consists of tools such as forms, report-writers, and graphical user interfaces.
- SQL or an application program interface provides the interface between the front-end and back-end.
- Replacing mainframes with networks of workstations or personal computers connected to back-end server machines provides better functionality for the cost.
- Client-server systems also offer flexibility in locating resources and expanding facilities, better user interfaces, and easier maintenance.
Server System Architecture
- Server systems are broadly categorized into transaction servers and data servers
- Transaction servers are widely used in relational database systems.
- Data servers are used in object-oriented database systems.
Transaction Servers
- Also called query server systems or SQL server systems.
- Clients send requests to the server which executes transactions, and then ships the results back to the client.
- Requests are specified in SQL and communicated via a remote procedure call (RPC) mechanism.
- Transactional RPC allows many RPC calls to form a transaction.
- Open Database Connectivity (ODBC) is a C language API from Microsoft used to connect to a server, send SQL requests, and receive results.
- The JDBC standard is the Java counterpart of ODBC.
Transaction Server Process Structure
- A transaction server consists of multiple processes accessing data in shared memory.
- Server processes receive user queries (transactions), execute them, and send results back.
- Server processes may be multithreaded, allowing a single process to execute several user queries concurrently
- A lock manager process is used.
- A database writer process outputs modified buffer blocks to disks continually
Transaction Server Processes
- Server processes add log records to the log record buffer, and the log writer process outputs them to stable storage.
- The checkpoint process performs periodic checkpoints.
- The process monitor process monitors other processes.
- It takes recovery actions if a process fails, such as aborting the transactions that a failed server process was executing and restarting the process.
- Shared memory contains all shared data: the buffer pool, lock table, log buffer, and cached query plans.
- All database processes can access shared memory
- Database systems implement mutual exclusion using operating system semaphores or atomic instructions to prevent two processes from accessing the same data structure simultaneously.
- To avoid interprocess communication overhead for lock request/grant, each database process operates directly on the lock table.
- The lock manager process is still used for deadlock detection
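The mutual exclusion described above can be sketched as follows. This is only an illustration under stated assumptions: Python threads stand in for database processes, `threading.Lock` stands in for an operating-system semaphore, and the `LockTable` class, its methods, and `run_demo` are hypothetical names, not part of any real database system.

```python
import threading


class LockTable:
    """Toy shared lock table; a mutex guards every access to it.

    Assumption: threads play the role of database processes and
    threading.Lock plays the role of an OS semaphore.
    """

    def __init__(self):
        self._mutex = threading.Lock()   # protects the table itself
        self._owners = {}                # data item -> transaction id

    def try_lock(self, item, txn_id):
        """Grant an exclusive lock if the item is free or already held by txn_id."""
        with self._mutex:                # only one thread inside the table at a time
            holder = self._owners.get(item)
            if holder is None or holder == txn_id:
                self._owners[item] = txn_id
                return True
            return False                 # conflict: the caller must wait or abort

    def unlock(self, item, txn_id):
        """Release the lock on item if txn_id holds it."""
        with self._mutex:
            if self._owners.get(item) == txn_id:
                del self._owners[item]


def run_demo(workers=8):
    """Let several threads race for the same item and collect the outcomes."""
    table = LockTable()
    results = []

    def worker(txn_id):
        results.append(table.try_lock("account:42", txn_id))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because every read and update of the table happens while holding the mutex, only one of the racing threads in `run_demo` can be granted the lock; the others see a conflict.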
Data Servers
- Utilized in high-speed LANs when clients have comparable processing power to the server and the tasks are compute-intensive.
- Data is shipped to clients for processing, and results are sent back to the server.
- This architecture requires full back-end functionality at the clients.
- Used in many object-oriented database systems where issues include: page-shipping vs item-shipping, locking, data caching, and lock caching
Data Servers - Page-Shipping vs Item-Shipping
- Page-shipping involves a larger unit but fewer messages.
- Item-shipping uses a smaller unit, thus involves more messages
- It is worthwhile to prefetch related items along with the requested item.
- Page shipping can be thought of as a form of prefetching.
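The message-count trade-off above can be made concrete with a small sketch. The layout is an assumption made for illustration: items are numbered from 0 and item `i` lives on page `i // items_per_page`; the function name `messages_needed` is hypothetical.

```python
def messages_needed(requested_items, items_per_page, page_shipping):
    """Count the server round trips needed to fetch the requested item ids.

    Assumption: item i is stored on page i // items_per_page.
    """
    if page_shipping:
        # One message per distinct page; the other items on the page come along,
        # which is why page shipping acts like prefetching.
        return len({item // items_per_page for item in requested_items})
    # Item shipping: one message per distinct item.
    return len(set(requested_items))
```

For example, fetching items 0 through 9 with 4 items per page takes 3 messages under page shipping and 10 under item shipping, at the price of shipping larger units.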
Data Servers - Locking
- Overhead of requesting and getting locks from the server is high due to message delays.
- Locks can be granted on requested and prefetched items
- With page shipping, the transaction is granted a lock on the whole page.
- Locks on prefetched items can be called back by the server and returned by the client if not used.
- Locks on a page can be deescalated to locks on items when lock conflicts occur, and locks from unused items can then be returned to the server.
Data Servers - Data Caching
- Data can be cached at the client, even in between transactions.
- Cached data must be checked to confirm that it is up to date before it is used (cache coherency).
- Checking can be done when requesting a lock on a data item.
Data Servers - Lock Caching
- Locks can be retained by the client system, even between transactions.
- Transactions can acquire cached locks locally, without contacting the server.
- The server calls back locks from clients when it receives conflicting lock requests.
- The client returns the lock once no local transaction is using it. This is similar to de-escalation, but across transactions.
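A minimal sketch of lock caching with callbacks follows. The `Server` and `Client` classes and their methods are hypothetical, and the sketch simplifies one point: when a callback arrives for a lock that is still in use, the conflicting request is simply refused, whereas a real client would return the lock later, once no local transaction is using it.

```python
class Server:
    """Tracks which client currently caches the lock on each item."""

    def __init__(self):
        self.holder = {}      # item -> client caching the lock
        self.messages = 0     # lock requests that actually reached the server

    def request(self, item, client):
        self.messages += 1
        current = self.holder.get(item)
        if current is not None and current is not client:
            # Conflicting request: call the lock back from the other client.
            if not current.call_back(item):
                return False  # still in use there; simplified to a refusal
        self.holder[item] = client
        return True


class Client:
    """Keeps locks cached across transactions."""

    def __init__(self, server):
        self.server = server
        self.cached = set()   # locks retained, even between transactions
        self.in_use = set()   # locks used by the current local transaction

    def acquire(self, item):
        if item not in self.cached:
            if not self.server.request(item, self):
                return False
            self.cached.add(item)
        self.in_use.add(item)  # a cached lock is acquired locally, no message
        return True

    def end_transaction(self):
        self.in_use.clear()    # the locks themselves stay cached

    def call_back(self, item):
        if item in self.in_use:
            return False
        self.cached.discard(item)
        return True
```

In this sketch a client that locks the same item in two successive transactions contacts the server only once; the second acquisition is served from the client's cache.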
Parallel Systems
- Contain multiple processors and disks connected by a fast interconnection network.
- A coarse-grain parallel machine has few powerful processors
- A massively parallel or fine-grain parallel machine utilizes thousands of smaller processors.
- The two main performance measures are throughput (the number of tasks completed in a given time interval) and response time (the time taken to complete a single task).
Speed-Up and Scale-Up
- Speedup: a fixed-size problem that runs on a small system is given to a system that is N times larger.
- Speedup = (elapsed time on the small system) / (elapsed time on the large system); speedup is linear if this ratio equals N.
- Scaleup: both the problem and the system grow, so an N-times larger system performs an N-times larger job.
- Scaleup = (elapsed time of the small problem on the small system) / (elapsed time of the large problem on the large system); scaleup is linear if this ratio equals 1.
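The two ratios above translate directly into code. The function and parameter names below are illustrative, and the timings in the usage note are made-up numbers, not measurements.

```python
def speedup(small_system_time, large_system_time):
    """Elapsed time on the small system divided by elapsed time on the large one."""
    return small_system_time / large_system_time


def scaleup(small_problem_small_system_time, large_problem_large_system_time):
    """Small problem on small system divided by large problem on large system."""
    return small_problem_small_system_time / large_problem_large_system_time
```

For instance, if a job takes 120 s on the small system and 20 s on a system 8 times larger, `speedup(120, 20)` is 6.0, which is sublinear because it is below N = 8. If the 8-times larger job takes 125 s on the larger system while the original took 100 s, `scaleup(100, 125)` is 0.8, again below the linear value of 1.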
Batch and Transaction Scaleup
- Batch scaleup applies to a single large job, which is typical of most decision support queries and scientific simulations.
- Batch scaleup uses an N-times larger computer on an N-times larger problem.
- Transaction scaleup involves numerous small queries from independent users to a shared database.
- Transaction scaleup involves N-times as many users submitting requests to an N-times larger database on an N-times larger computer
- Transaction scaleup is well-suited to parallel execution.
Factors Limiting Speedup and Scaleup
- Speedup and scaleup are often sublinear because of startup costs, interference, and skew.
- Startup costs: the cost of starting up multiple processes may dominate computation time.
- Interference: processes accessing shared resources compete with each other and spend time waiting on other processes rather than performing useful work.
- Skew: increased parallelism increases the variance in the service times of parallel tasks, and overall execution time is determined by the slowest task.
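The effect of skew and startup cost on elapsed time can be illustrated with a tiny model. It assumes all parallel tasks start together after a fixed startup cost and ignores interference; `parallel_elapsed_time` is a hypothetical helper and the task times are made-up numbers.

```python
def parallel_elapsed_time(task_times, startup_cost=0.0):
    """Elapsed time of tasks run in parallel: startup plus the slowest task.

    Assumption: every task starts at the same moment and there is no
    interference between tasks.
    """
    return startup_cost + max(task_times)
```

Splitting 100 units of work evenly over 4 processors gives `parallel_elapsed_time([25, 25, 25, 25])`, which is 25, a speedup of 4. With a skewed split of `[40, 20, 20, 20]` the elapsed time is 40, so the speedup drops to 2.5 even though the total work is unchanged.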
Interconnection Network Architectures
- Bus: system components send data on and receive data from a single communication bus. It does not scale well with increasing parallelism.
- Mesh: components are arranged as nodes in a grid, and each component is connected to all adjacent components. The number of communication links grows with the number of components, so a mesh scales better than a bus. Sending a message to a node may require 2√n hops, or √n with wraparound connections.
- Hypercube: components are numbered in binary, and two components are connected if their binary representations differ in exactly one bit. Each of the n components is connected to log(n) other components and can reach any other via at most log(n) links, which reduces communication delays.
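The hypercube connection rule can be sketched with bit operations. Nodes are assumed to be numbered from 0 to 2^d - 1 for a d-dimensional hypercube; the function names are illustrative.

```python
def hypercube_neighbors(node, dimensions):
    """Nodes whose binary label differs from `node` in exactly one bit."""
    return [node ^ (1 << bit) for bit in range(dimensions)]


def hypercube_hops(a, b):
    """Minimum number of links between a and b: the count of differing bits."""
    return bin(a ^ b).count("1")
```

In a hypercube of 8 nodes (3 dimensions), node 5 (binary 101) is linked to nodes 4, 7, and 1, and the farthest pair, 0 (000) and 7 (111), is 3 hops apart, matching log2(8) = 3.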
Parallel Database Architectures
- Shared memory: processors share a common memory
- Shared disk: processors share a common disk
- Shared nothing: processors share neither memory nor disk
- Hierarchical: hybrid of shared memory, disk, and nothing
Parallel Database Architectures - Shared Memory
- Processors and disks access a common memory, typically via a bus or interconnection network.
- There is efficient communication between processors
- Shared memory can be accessed by any processor without having software move it
- Architecture is not scalable past 32 or 64 processors because the bus becomes a bottleneck
- Shared-memory architectures are widely used for lower degrees of parallelism (4 to 8).
Parallel Database Architectures - Shared Disk
- All processors can directly access all disks via an interconnection network, but processors have private memories.
- The memory bus is not a bottleneck, and the architecture provides a degree of fault tolerance.
- If a processor fails, the other processors can take over its tasks, since the database resides on disks that are accessible from all processors.
- Shared-disk systems can scale to a somewhat larger number of processors than shared-memory systems, but communication between processors is slower.
- Examples: IBM Sysplex and DEC clusters (now part of Compaq) running Rdb (now Oracle Rdb) were early commercial users.
- Downside: the bottleneck now occurs at the interconnection to the disk subsystem.
Parallel Database Architectures - Shared Nothing
- A node consists of a processor, memory, and one or more disks. Nodes communicate via the interconnection network.
- Each node functions as the server for the data on its disk or disks.
- Data accessed from local disks does not pass through the interconnection network, which minimizes the interference of resource sharing.
- Shared-nothing multiprocessors can be scaled up to thousands of processors without interference.
- The main drawback is the cost of communication and of non-local disk access.
- Examples include Teradata, Tandem, and Oracle on nCUBE.
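The idea that each node serves the data on its own disks, and that only non-local access crosses the interconnection network, can be sketched as below. Hash partitioning by key is an assumption of this sketch (the notes do not say how data is placed), and `owner_node` and `SharedNothingCluster` are hypothetical names.

```python
import zlib


def owner_node(key, node_count):
    """Map a key to the node that stores it, using a stable hash (assumed scheme)."""
    return zlib.crc32(key.encode("utf-8")) % node_count


class SharedNothingCluster:
    """Each node has its own private storage; nothing is shared."""

    def __init__(self, node_count):
        self.nodes = [dict() for _ in range(node_count)]  # one "disk" per node
        self.network_messages = 0                         # non-local accesses

    def _route(self, key, from_node):
        owner = owner_node(key, len(self.nodes))
        if owner != from_node:
            self.network_messages += 1   # request crosses the interconnection network
        return owner

    def put(self, key, value, from_node):
        self.nodes[self._route(key, from_node)][key] = value

    def get(self, key, from_node):
        return self.nodes[self._route(key, from_node)].get(key)
```

A request issued at the owning node touches only local storage and adds no network message, while the same request issued at any other node is counted as one message, which is the communication cost named above as the main drawback.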
Parallel Database Architectures - Hierarchical
- Combines characteristics of shared-memory, shared-disk, and shared-nothing architectures
- The top level is a shared-nothing architecture: nodes are connected by an interconnection network and do not share disks or memory.
- Each node of the system could be a shared-memory system with a few processors.
- Alternatively, each node could be a shared-disk system, and each of the systems sharing a set of disks could be a shared-memory system.
- Distributed virtual-memory architectures, also called non-uniform memory architectures (NUMA), reduce the complexity of programming such systems.
Distributed Systems
- Data is spread over multiple machines, also called sites or nodes, connected by a network.
- Data is shared by users on multiple machines.
Distributed Databases
- In homogeneous distributed databases, all sites have the same software and schema, and data may be partitioned among sites.
- The goal is to provide a single database view, hiding distribution details.
- In heterogeneous distributed databases, different sites use different software/schema.
- The goal is to integrate existing databases to provide useful functionality
- A local transaction accesses data only at the site where the transaction was initiated.
- A global transaction accesses data in a different site than the initiating site or in multiple sites
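The distinction between local and global transactions can be expressed as a one-line check. The function name and the string site identifiers are purely illustrative.

```python
def classify_transaction(initiating_site, accessed_sites):
    """Return "local" if only the initiating site is accessed, else "global"."""
    return "local" if set(accessed_sites) <= {initiating_site} else "global"
```

A transaction started at site A that touches only A is local; one that touches only B, or both A and B, is global.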
Trade-offs in Distributed Systems
- Sharing data allows users at one site to access data at other sites.
- Each site retains a degree of control over locally stored data, referred to as autonomy.
- Higher system availability through redundancy: data can be replicated at remote sites, so the system can keep functioning even if a site fails.
- Coordination complexity, development cost, bug potential, and processing overhead are all disadvantages.
Implementation Issues for Distributed Databases
- Atomicity is needed even for transactions that update data at multiple sites.
- The two-phase commit protocol (2PC) ensures atomicity.
- Each site executes the transaction until right before the commit and leaves the final decision to a coordinator.
- Each site must follow the coordinator's verdict, even if failures occur while waiting.
- 2PC is not always appropriate.
- Alternative transaction models include persistent messaging and workflows.
- Distributed concurrency control (and deadlock detection) is required
- Data items may be replicated to improve data availability
- Details are in Chapter 22
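The voting structure of two-phase commit described above can be sketched as follows. This is a deliberately simplified model: it omits logging, timeouts, and failure recovery, and the `Participant` class and `two_phase_commit` function are hypothetical names rather than any real system's API.

```python
class Participant:
    """A site that has executed its part of the transaction up to the commit point."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "active"

    def prepare(self):
        """Phase 1: vote on whether this site is able to commit."""
        self.state = "ready" if self.can_commit else "aborted"
        return self.can_commit

    def finish(self, decision):
        """Phase 2: follow the coordinator's verdict."""
        self.state = decision


def two_phase_commit(participants):
    """Coordinator logic: commit only if every site votes yes."""
    votes = [p.prepare() for p in participants]      # phase 1: collect votes
    decision = "committed" if all(votes) else "aborted"
    for p in participants:                           # phase 2: broadcast verdict
        p.finish(decision)
    return decision
```

If any single site votes no, every site ends in the `aborted` state, which is how the protocol keeps the transaction atomic across sites.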
Network Types
- Local-area networks (LANs) are composed of processors distributed over small areas like a single building, or a few buildings
- Wide-area networks (WANs) are composed of processors distributed over a large geographical area.
- WANs with a continuous connection (e.g., the Internet) are needed for implementing distributed database systems
- Groupware applications such as Lotus Notes can work on WANs with discontinuous connections: data is replicated, and updates are propagated to the replicas periodically.
- Copies of data may be updated independently
- This can result in non-serializable executions due to different order of operations being executed.
- Resolution is application dependent.
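A small example shows why independently updated copies can diverge. The two operations and the starting balance are invented for illustration; the point is only that the same updates applied in a different order give different results.

```python
def apply_in_order(initial, operations):
    """Apply each update function to the value, in the given order."""
    value = initial
    for op in operations:
        value = op(value)
    return value


def add_ten(balance):
    return balance + 10


def double(balance):
    return balance * 2
```

Starting from 100, a replica that applies `add_ten` and then `double` ends at 220, while a replica that applies `double` and then `add_ten` ends at 210. Neither copy is wrong on its own terms, so reconciling them depends on the application.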
Description
Explanation of database system architectures. Covers centralized, client-server, parallel, and distributed systems. Explains network types and how servers handle requests from multiple clients.