Distributed Systems Exercise 03 PDF
Document Details
Uploaded by StatelyAgate7771
University of Zurich
2023
Dr. Bruno Rodrigues, Prof. Dr. Burkhard Stiller
Tags
Summary
This document is lecture notes on Distributed Systems for Exercise 03, provided by the University of Zurich. Topics include consistency, consensus, and related concepts in distributed systems.
Full Transcript
This is background and exercise material, only the teaching slides’ content and the book's chapters refer to the valid class information. Distributed Systems Exercise 03 Dr. Bruno Rodrigues, Prof. Dr. Burkhard Stiller Department of Informatics IfI, Communication Systems Group CSG, University of Zur...
This is background and exercise material, only the teaching slides’ content and the book's chapters refer to the valid class information. Distributed Systems Exercise 03 Dr. Bruno Rodrigues, Prof. Dr. Burkhard Stiller Department of Informatics IfI, Communication Systems Group CSG, University of Zurich UZH [rodrigues|stiller]@ifi.uzh.ch © 2023 UZH, CSG@IfI 1 Outline ❑ Summary Distributed Systems – DS part 3 – Consistency and Consensus ❑ Review E03 © 2023 UZH, CSG@IfI 2 Content DS ❑ Tour D‘Horizon on Distributed Systems and Encoding in Distributed Systems – Language-specific Formats – JSON, XML, Binary, ASN.1 – IPFS ❑ Unreliable Communication Systems – Faults – Unreliable Networks and Clocks – Knowledge, Truth, Lies ❑ Consistency and Consensus in Distributed Systems – Guarantees – Linearizability – Ordering © 2023 UZH, CSG@IfI 3 Summary DS Part 3 Consistency and Consensus in Distributed Systems © 2023 UZH, CSG@IfI 4 Consistency ❑ ❑ Order and causality matters to keep consistency Any two replicas (of an object, record, data) in a replicated database might return different values – Due to replication lag ❑ Eventual consistency – Eventually (i.e., at no defined time), if no writes are performed, all replicas of a database will show consistent data ❑ Strong consistency – Atomic consistency – Immediate consistency – External consistency © 2023 UZH, CSG@IfI Linearizability 5 Serializability ❑ Serial x serializability – Serializability is related to the concurrent execution (in arbitrary order) of transactions Order and causality matters to keep consistency ❑ – Requires support for locking and recoverability • Each operation executes completely or not at all (to be recovered) • No partial results exist – “all-or-nothing” – Serializability applies synchronization • Logical and vector clocks (see earlier module) © 2023 UZH, CSG@IfI 6 Notions of Linearizability ❑ Multiple nodes concurrently accessing replicated data – E.g., data bases in a Distributed System ❑ Linearizability shall allow a Distributed System to behave like a “centralized systems” – I.e., one replica of “the” data and atomic operations on these – Data replication is transparent to the application ❑ Linearizability allows applications to read the most upto-date value, not values from stale caches or replicas “the quality or state – Linearizability guarantees recency ❑ ❑ of being recent” Linearizability deals with a single operation (read/write) Linearizability is defined in terms of real-time © 2023 UZH, CSG@IfI 7 Linearizable System – Definition ❑ A system is called “linearizable”, if and only if (iff) the last read operation (based on a global time) on a single register or object returns the most up-to-date value ❑ A system might be serializable, but not linearizable – Serializable • Transactions are executed in some order as they were serial → Global time is not important – Linearizable • Operations in the system returning the most recent value → Global time is important ❑ Databases might provide serializability and linearizability ❑ → Strict serializability (e.g., finance, booking/reservation, ecommerce) © 2023 UZH, CSG@IfI 8 Causality Considerations ❑ ❑ ❑ ❑ ❑ Linearizability is stronger than causal consistency However, linearizability is not the only way to ensure causality Making a system linearizable can harm its performance and availability A system can be causally consistent without being linearizable Causal consistency can be implemented more efficiently © 2023 UZH, CSG@IfI 9 Atomic Commit Requirements ❑ For a database supporting transactions spanning several nodes or partitions the problem that a transaction may fail on some nodes, but succeed on others is obvious ❑ All nodes need to agree on outcome of the transaction – Either they all abort or roll back, if anything goes wrong – Or they all commit, if nothing goes wrong ❑ Maintenance of transactions’ ACID principle is key – Atomicity, Consistency, Isolation, Durability © 2023 UZH, CSG@IfI 10 Desirable Properties of an Atomic Commit ❑ Assume a Transaction Coordinator (TC), Client A, and Client B have separate notions of committing ❑ Correctness – If one commits, no one aborts – If one aborts, no one commits ❑ Liveness (in a sense related to performance) – If no failures happen and A and B can commit, then commit – If failures happen, come to “some” conclusion as soon as possible © 2023 UZH, CSG@IfI 11 One-Phase Commit (1PC) Protocol ❑ Create a Transaction Coordinator (TC) – A single authoritative entity client ❑ Four entities – Client, TC, Bank A, Bank B ❑ “OK” TC Operation – – – – Client sends “start” to TC TC sends “debit” to A TC sends “credit” to B TC reports “OK” to client © 2023 UZH, CSG@IfI “start” 12 “debit” “credit” A B Two-Phase Commit (2PC) Protocol (1) ❑ ❑ Same entities as in 1PC Operation – Client “starts” and TC sends client “prepare” messages to A and B – A and B respond, saying “OK” “start” whether they’re willing to commit “commit” – If both say “yes,” TC sends “commit” TC “commit” messages “prepare” “prepare” – If either says “no,” TC sends “yes” “abort” messages “yes” – A and B “decide to commit”, A B if they receive a commit message. © 2023 UZH, CSG@IfI 13 E03 Review © 2023 UZH, CSG@IfI 14 Exercise 1 1 Yes, it would be completed. Tx1 is independent of Tx2/3/4. Only a totally ordered multicast only ensure the delivery of msgs/tx (e.g., ZoeKeeper or etcd). Nodes are not available/responsive. Assuming 2PC, node fails/take too long to answer a ‘prepare’ Assuming data is replicated based on 1PC, no checks are performed. Thus, atomicity may not be reached. © 2023 UZH, CSG@IfI 15 Exercise 2 Replicas are delivered typically delayed, can return different values due to a lag Replicas eventually reach consistency when no writes are performed Data is consistent across replicas, and it is promptly available at all nodes. Linear © 2023 UZH, CSG@IfI 16 Exercise 3 Atomic: operations are fully completed or aborted Immediate: once an operation is completed, all subsequent readings reflects the new value External: ensures that all transactions are seen in the same order (crucial for financial operations, for example) © 2023 UZH, CSG@IfI 17 Exercise 4 Without a coordinator and enforcing locks, there are no guarantees that T1 will always execute before T2. Ordering preserves causality when a transaction coordinator is not present. For example, by using a lamport/vector clock. Still, ordering would not ensure consistency and delays/failures may occur. © 2023 UZH, CSG@IfI 18 Exercise 5 A linearized system returns the most recent value of a write, whereas a Serialized system concerns the execution order of operations. Yes, it is possible to achieve linearizability considering a strict lock or coordinator. If both are achieved: strict serializability. In this case, there are drawbacks wrt. performance to be considered © 2023 UZH, CSG@IfI 19 Exercise 6 Timestamps can be used to ensure partial causality in non-linearizable systems, i.e., it would be possible to determine the relation of certain/some transactions. Locks and a transaction coordinator can be used to ensure a system totally ordered and causally consistent, i.e., it is possible to determine the relation of all transactions in any point in time. © 2023 UZH, CSG@IfI 20 Exercise 7 X © 2023 UZH, CSG@IfI 21 Exercise 8 Act of making different entities (e.g., processes) work together in a distributed system for a goal or effect. Is an effect of serializability and recoverability in a distributed system, that guarantees that an operation is completed at all participants or at none. © 2023 UZH, CSG@IfI 22 Exercise 9 - Transaction Coordinator (TC) crashes before a transaction is completed - One of the nodes crashes - Failures of network links © 2023 UZH, CSG@IfI 23 Exercise 10 Whenever a nested transaction commits, it reports its status and the status of its descendants to its parent. Therefore, when a transaction enters the committed state, it has a correct list of its committed descendants. Therefore, when the top-level transaction starts the two-phase commit protocol, its list of committed descendants is correct. It checks the descendants and makes sure they can still commit or must abort. There may be nodes that ran unsuccessful descendants which are not included in the twophase commit protocol. These will discover the outcome by querying the top-level transaction. © 2023 UZH, CSG@IfI 24 A1 B1 A2 B2 Exercise 11 - P2 Sends “status” request to all processors - TC is waiting If P5 has sent “No” → Normally “abort”, but here we have the 80% case P2 needs 3 “Yes” from others in this case to commit. © 2023 UZH, CSG@IfI 25 Exercise 11 *Initial situation: response to TC’s request *P2 rebooting TC P1 P2 P3 P4 P1 P5 P2 TC *TC waits P3 P4 P5 *P2 sends status request to all processors TC P1 P2 © 2023 UZH, CSG@IfI P3 *TC receives 4/5 yes Sends ‘‘commit’’ to 4 processors and ‘‘abort’’ P4 P4 TC P5 P1 26 P2 P3 *TC receives 3/5 yes. Abort to all processors P4 P5 Questions? © 2023 UZH, CSG@IfI 27