Distributed System.pdf
Document Details
Related
- PCSII Depression/Anxiety/Strong Emotions 2024 Document
- A Concise History of the World: A New World of Connections (1500-1800)
- Human Bio Test PDF
- University of Santo Tomas Pre-Laboratory Discussion of LA No. 1 PDF
- Vertebrate Pest Management PDF
- Lg 5 International Environmental Laws, Treaties, Protocols, and Conventions
Full Transcript
1 UNIT Characterization of Distributed System CONTENTS 1-2B to 1-4B Part-1 : Introduction, Example of Distributed System...
1 UNIT Characterization of Distributed System CONTENTS 1-2B to 1-4B Part-1 : Introduction, Example of Distributed System 1-4B to 1-9B Part-2 : Resource Sharing and Web..eeech..+cese+e Challenges, Architectural Models, Fundamental Models Part-3: Theoretical Foundation for.............. l-10B to 1-12B Distributed System : Limitations of Distributed System, Absence of Global Clock, Shared Memory Part-4 : Logical Clock, Lamports 1-12B to 1-15B and Vectors Logical Clocks Part-5: Concept in Message System 1-16B to 1-20B Causal Order, Total Order, Total Causal Order, Techniques for Message Ordering, Causal Ordering of Messages, Global State and Termination Detection. 1-1 B (CS-Sem-7) 1-2 B (CS-Sem-7) Characterization of Distributed System PART- 1 Introduction, Example of Distributed System. Questions-Answers Long Answer Type and Medium Answer Type Questions main Que 1.1. What is a distributed system ? Describe the characteristics of distributed- system. Give two example of AKTU2014-15, Marks 05 distributed system. Answer Distributed system : 1. A distributed system is a system in which software or hardware communicates and components connected via communication network coordinates their actions only by passing messages. 2. Computers that are connected by a network may be spatially separated by distanee. 3. Resources may be managed by servers and accessed by clients. Characteristics of distributed system : 1. Heterogeneity : Distributed system enables the users to access services computers and and run application over a heterogeneous collection of networks. characteristics 2. Openness : The openness of a computer system is the that determine whether the system can be extended and re-implemented in various ways. 3. Concurrency : Concurrency in distributed system is use to help different time. users to access the shared resource at the same 4. Scalability : A system is described as scalable if it remains effective when there is significant increase in the number of resources and the numbers of users. 5. Security : Security provides confidentially, integrity and availability of the information resources. Example of distributed system: 1. Internet : The Internet is a very large distributed system. It enables users to make use of services such as the World Wide Web, e-mail and file transfer. Distributed System 1-3 B(CS-Sem-7) 2 Intranet : a An intranet is a private network that is contained within an enterprise. b An intranet is connected to the internet via router, which allows the users inside the intranet to make use of services such as web or e-mail. Que 1.2. What are distributed systems ? What are significant advantages and applications of distributed system ? AKTU2018-19, Marks 10 Answer Distributed system: Refer Q. 1.1, Page 1-2B, Unit-1. Advantages of distributed system : 1. Data sharing: It allows many users to access to a common database. 2 Resource sharing : Expensive peripherals like color printers can be shared among different nodes (or systems). 3. Communication: Enhance human-to-human communication, e.g., email, chat. 4. Flexibility :Spread the workload over the available machines. Applications of distributed systems : 1 Telecommunication networks such as telephone networks and cellular networks. 2. Network applications, world wide web and peer-to-peer networks. 3 Real-time process controls aircraft control systems. 4. Parallel computation. Que 1.3. How the distributed computing system is better than parallel processing system ? Explain. AKTU2017-18,Marks 10 Answer Distributed computing system is better than parallel processing system because of followingadvantages: 1. Economics : Microprocessors offer better performance than parallel processing system. 2. Speed : Adistributed system may have more total computing power than parallel processing system. 3. Reliability : If some of the machines are downed, the distributed system as a whole can stillsurvive with small degradation of performance. 14B (CS-Sem-7) Characterization ofDistributed System 4. Ineremental growth : Computing power can be added in small increments. 5. Data sharing:Allow many users access to a common database. 6 Device sharing : Allow many users to share expensive peripherals. 7. Flexibility : In distributed computing workload can be spread over the available machines in the most cost effective way. Que 1.4. What is distributed transparency? Explain the different types of distributed transparencies. AKTU 2017-18, Marks 10 Answer Distributed transparency is the property of distributed databases by the virtue of which the internal details of the distribution are hidden from the users. Types of transparencies : 1. Access transparency : It enables local and remote resources to be accessed using identical operations. 2. Location transparency : It enables resources to be accessed without knowledge of their physical or network location. 3. Concurrency transparency: It enables several processes to operate concurrently using shared resources without interference between them. 4 Replication transparency:It enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers. 5. Failure transparency :It enables the concealment of faults, allowing users and application program to complete their tasks despite the failure of hardware or software components. 6 Performance transparency:It allows the system to be reconfigured to improved performances as load varies. PART-2 Resource Sharing and Web Challenges, Architectural Models, Fundamental Models. Questions-Answers Long Answer Type and Medium Answer Type Questions Distributed System 1-5B (CS-Sem-7) Que 1.5.How the resource sharing is done in distributed system ? Explain with an example. Answer 1. Resource sharing is one of the major advantages which is obtained from distributed system. 2. In a distributed system, the resources are enclosed within computers and can only be accessed from other computers by 3. communication. Each resource must be managed by a program that offers a communication interface enabling the resource to be accessed. 4. The program also helps the resources to be updated reliably and consistently. For example : 1. The client-server model provides an effective general purpose approach to the sharing of information and resources in distributed systems. 2. In this model, a client sends a request to a server for getting some services such as reading a block of a file. 3. The server executes the request and sends back a reply to the client that contains the result of request processing. Que 1.6. Discuss the major issue in designing a distributed system. AKTU2017-18, Marks 10 Answer Major issues in designing a distributed system : 1, Heterogeneity : a. Distributed system must be constructed from variety of different networks, operating systems, computer hardware's and programming languages. b. Internet communication protocol mask the difference (heterogeneity) in networks and middleware can deal with the other differences. 2 Openness : Distributed system should be extensible i.e. to develop interface for the distributed system component so that they can be integrated to new extension of distributed system. 3. Security : Encryption can be used to provide adequate amount of shared resources and to keep sensitive information secret when it is transmitted in messages over a network. b. Denial of Services (DoS) attack is one of the big problems for security. 1-6 B (CS-Sem-7) Characterization ofDistributed System 4. Scalability : a. Scalability refers to the capability of a system to adapt to increased service load. b It is inevitable that a distributed system will grow with time since it is very common to add new machËnes to take care of increased work load. Therefore, a distributed system should be designed to easily cope with the growth of nodes and users in the system 5. Fault avoidance : Fault avoidance deals with designing the components ofthe system in such a way that the occurrence of faults in minimized. b. Conservative design practice such as using high reliability components are often employed for improving the system's reliability based on the idea offault avoidance. 6. Transparency : a. Transparency aims to hide the details ofdistribution from the users. b. For an example, user or programmer ned not be concerned with its location or the details of how its operations are accessed by other components, or whether it will be replicated or migrated. Que 1.7, Why is scalability an important feature in the design of distributed system ? Discuss some of the guiding principles for designing a scalable distributed system. |AKTU 2014-15, Marks 10 Answer Scalability is important features in design of distributed' system because : a. It helps the system to work efficiently with an increase in number of users. b. It increases the system performance by incorporating additional resources. Guiding principle for designing scalable distributed system: 1. Avoid centralized entities: Use ofcentralized entities should be avoided in the design of scalable distributed system because : In centralized system, if centralized entity fails then the entire system will also fail. i. Capacity of the network that connects the centralized entity gets saturated. iüi. In case of wide-area network system, traffic in the network increases. Distributed System 1-7B(CS-Sem-7) b. Replication of resources and distributed control algorithms are frequently used techniques to achieve scalable system. For better scalability,functionally symmetric configuration should be used in which all nodes of the system should play equal role in the operation of the system. 2. Avoid centralized algorithms: a. Acentralized algorithm is one that operates by collecting information from all nodes, processing this information on a single node and then distributing the result toother nodes. b. Time complexity of centralized algorithm may be very high which creates heavy network traffic and consumes network bandwidth. C. Therefore, in the design of a distributed operating system, only decentralized algorithm should be used. 3. Perform most operations on client workstations : If possible, an operation should be performed on the client's workstation. b This principle enhances the scalability of the system, since it allows graceful degradation of system performance as the system grows in size. C. Caching is a frequently used technique for the realization of this principle. 4 Controlling the cost of physical resources : As the demand for a resource grows, it should be possible to extend the system, at reasonable cost, to meet it. Que 1.8. Discuss architectural models of distributed system. Answer 1 An architecture model of a distributed system simplifies and abstracts the functions of the individual components of a distributed system 2. It also considers the placement of the components across a network of computers and the interrelationship between the components. 3. The main objective of these models is to make the system reliable, manageable, adaptable and cost-effective. 4. The two main types of architectural model are : a. Client-server model (Search engine): Fig. 1.8.1 illustrates the simple structure in which client processes interact with individual server processes in separate host computers in order to access the shared resources that they manage. 1-8 B (CS-Sem-7) Characterization of Distributed System Invocation Inyocation Client Server Result Server Result Client Key: process: Computer Fig. 1.8.1. Clients invoke individual servers. This is the architecture that is most widely employed. Client-server model offers a direct and simple approach tothe sharing of data and other resources. iv. Servers may acts as a client of other servers. V For example, a web server is often a client of a local file server that manages the files in which the web pages are stored. b. Peer-to-Peer model : In this architecture, all of the processes which are involved in task play similar roles, interacting cooperatively as peers without any distinction between client and server processes. ii The Fig. 1.8.2 illustrates the form of a peer-to-peer application. Application Application Coordination) Coordination Code Code Peer-2 Peer-1 /Application Coordination) Code Peer-3 Fig. 1.8.2. Distributed application based on peer-to-peer processes. Distributed System 1-9 B (CS-Sem-7) ii. Applications are composed of large numbers-of peer processes running on separate computers and the pattern of communication between them depends on application requirements. Que 1.9. Explain the fundamental models of distributed system. Answer 1 Fundamental models are based on the fundamental properties that allow us to be more specific about their characteristics, failures and security risks that they might exhibit. 2. The purpose of a model is : a. To make explicit all the relevant assumptions about the systems. b. To make generalizations concerning what is possible or impossible, given those assumptions. Following are the fundamental model : 1. Interaction models : a It is concerned with performance.of process communication channels and absence of global clock. b Interaction model is further classified as synchronous and asynchronous system. C. Interacting process perform all of the activity in adistributed system. d. Each process has its own state, consisting of the set of data that it can access and update, including the variable in its program. The state belonging toeach process is completely private. 2. Failure model : a. Ina distributed system both processes and communicationchannels may fail, sÍ this model is capable of handling all the failure. b. The failure model defines and classifies the faults that occur in the system. C. It provides a model to understand the effects offaults in the system. 3. Security model : a. It identifies the possible threats to processes and communication channels in an open distributed system such as integrity, authentication, privacy etc. b. The architectural model provides the basis for our security model: i The security ofa distributed system can be achieved by securing the processes and the channels used for their interactions and by protecting the objects that they encapsulate against unauthorized access. ii Protection is described in terms of objects, although the concepts apply equally well to resources of all types. 1-10B (CS-Sem-7) Characterization of Distributed System PART-3 Theoretical Foundation for Distributed System : Limitations of Distributed System, Absence of Global Clock, Shared Memory. Questions-Answers Long Answer Type and Medium Answer Type Questions Que 1.10. Explain the limitations of distributed system with example. AKTU 2018-19, Marks 10 Answer Limitations of distributed systems are as follows : 1. Absence of global clock: In a distributed system, global clock (or common clock) is not present. b Suppose a global clock is available for all the processes in the system. C In this case, two different processes can observe a global clock value at different instants due to unpredictable message transmission delays. d Therefore, two different processes, may falsely perceive two diferent instants in physical time to be a single instant in physical time. 2. Absence of shared memory : a The computer in a distributed system do not share common memory,an up-to-date state of the entire system is not available to any individual process. b. It is necessary for reasoning about the system's behaviour, debugging, recovering from failures, etc. C. Aprocess in a distributed system can obtain a coherent but partial view of the system or a complete but incoherent view ofthe system. d. A view is said to be coherent if allthe observations of different processes (computers) are made at the same physical time. e. Because of the absence of a global clock in'a distributed system, obtaininga coherent global state of the system is difficult. Distributed System 1-11 B(CS-Sem-7) Example: Local state Local state of A of B Initial state Communication Rs. 500 Rs. 200 Channels S1: A S2: B Fig. 110.1. Message under tran sit (Not yet reached to B) Rs. 450 Rs. 50 Rs. 200 S1: A S2: B Fig. 1.10.1. a. S1 records its local state (Rs. 450) just after debit (- 50) and S2 records its location (200) before receiving. b If transit message is not taken care off Global state = Local state S1+ Local state S2 = 450+ 200 = 650 = Rs. 50 missing i.e., in coherent system Que 1.11. What are distributed systems ? What are significant advantages, applications and limitations of distributed systems ? Explain with examples, what could be the impact of absen ce of global clock & shared memory. AKTU 2015-16, Marks 10 Answer Distributed systems: Refer Q.1.1, Page 1-2B, Unit-1. Significant advantages and applications of distributed systems : Refer Q. 1.2, Page 1-3B, Unit-1. Limitations of distributed systemns : 1. Absence of shared memory. 2. Absence of global clock. 3. The initial deployment cost of a distributed system is very high. 1-12 B(CS-Sem-7) Characterization of Distributed System Impact of the absence of global clock: 1 It is difficult in a distributed system to reason about the temporal order at events. 2. It is dificult to design and debug algorithms for adistributed system as compared to centralized systems. 3. Collecting up-to-date information on the state of the entire system in harder. Impact of the absence of shared memory : 1. An up-to-date state of the entire system is not available to any of the individual processes. 2 Recovery from failure cannot be possible. For example: Refer Q. 1.10, Page 1-1OB, Unit-1. PART-4 Logical Clock, Lamport's and Vectors Logical Clocks. Questions-Answers Long Answer Type and Medium Answer Type Questions the important Que 1.12. What are Lamport logical clocks ? List clocks. If A and B conditions to be satisfied by Lamport logical A > B then and if represent two distinct events in a process statement. Justify the CA) < C(B) but vice-versa not true. AKTU2015-16, Marks 10 Answer Lamport logical clocks : monotonically increasing software counter, A Lamport logical clock is a to any physical clock. whose value need bear no particular relationship Lamport logical clocks: Following conditions are to be satisfied by process P.,and a occurs before 1 Ifa andb are two events within the same b, then C(a) < C{b). P, and bis the receipt of that 2 It a is the sending of a message by process message by process P, then C{a) 0, solves the Byzantine agreement problem for 3m + lor more in the presence of at most m faulty processors processors. Let n denote the total number of processors(where, n >3m+ 1). The algorithm is recursively defined as follows : Algorithm OM(0): 1 The source processor sends its value to every processor. 2 Each processor uses the value it receives from the source. If it receives no value, then it uses a default value of 0. Algorithm OM (m), m>0: 1 The source processor sends its value to every processor. 2 For each i, let v, be the value processor i receives from the source (if it receives no value, then it uses a default value of 0). Processor iacts as the new source and initiates algorithm OM(m - 1) wherein, it sends the value v, to each of the other n - 2 processors. 3 For each i andj (where iandj are not equal), letv,be the value processor ireceived from processor j in step 2 using Algorithm OM (m - 1). If it receives no value, then it uses a default value of0. Processor i uses the value majority (v,, V,..). 4 The majority function is used to select the majority value out of values received in round of message exchange. 5. The function majority (u, V V,-) computes majority of values U,, Uy....Uif it exists (otherwise returns 0). Que 3.6. Describe Lamport-Shostak-Pease algorithm. How does vector clock overcome the disadvantages of Lamport clock ? Explain with an example. AKTU2016-17, Marks 15 Answer Lamport-Shostak-Pease algorithm : Refer Q. 3.5, Page 3-5B, Unit-3.. Vector clock overcome advantage of Lamport clock: With Lamport clocks, we cannot determine whether two events are casually related by looking at the timestamps, because if CA) < C(B) does not always mean A’ Bwhile veçtor clock allow to compare the timestamps of the events to determine whether they are casually related or not. Distributed System 3-7B (CS-Sem-7) For example :Vector timestamp is shown in Pig. 3.6.1. The event e, with vector timestamp (2, 1, 0) is causally ordered before the event e.,, with the vector timestamp (2,1, 4),but is concurrent with the event e having timestamp (0, 0, 2). (1, 1,0) (2, 1, 0) P, (0, 0, 0) e11 e12 Space (0, 1, 0), (2, 2, 4) To.o,o) e1 (2, 1, 3) ez2 (2, 1, 4) Pg (0, 0, 1) (0, 0, 2) (0, 0, 0) e31 E32 eg4 Time Fig.3.6.1. Que 3.7. What do you understand by Byzantine agreement problem? AKTU2018-19, Marks10 OR What is Byzantine agreement problem ? Provide the solution to Byzantine agreement problem. AKTU2018-19, Marks 10 Answer 1. In Byzantine agreement problem a single value, which is to be agreed on is initializes by an arbitrary processes and all non-faulty processes have to agree on that value. 2. There are n processes, n = lp Pgy.. P,} with unique names over N={1,.. , n) and at most Byzantine participants t 3t). Solution to Byzantine agreement : Solution to Byzantine problem is given by Lamport Shostak-Pease algorithm. agreement Lamport Shostak-Pease algorithm : Refer Q. 3.5, Page 3-5B, Unit-3. Que 3.8. Show that a Byzantine agreement cannot be reached among three processors, where one processor is faulty. OR Explain treatment of impossible result for the solution of Byzantine agreement problem. Answer 1 Sometimes, the agreement problem may lead to such a condition which is quite impossible to solve. 2 The situation where the agreement is impossible, called as impossible result. 3 This type of problem cannot be reached to agreement. 4 In a system, the impossible result situation is found with more than two processors. 5 Let us check the situation of impossible result in a system with three processors. 6. Consider a system with three processors, Po, P, and P: 7. We assume that there are only two values, 0 and 1, on which processors agree and processor P, initiates the initial value. 8 There are two possibilities: Case I: P, is not faulty : 1. Assume P, is faulty. 2 Suppose that P, broadcasts an initial value of 1to both P, and P,. 3 Processor P, acts maliciously and communicates a value of 0 to processor P: 4 Thus, P, receives conflicting values from P, and P,. 5 However, since P, is non-faulty, processor P, must accept 1 as the agreed upon value. 3-9 B (CS-Sem-7) Distributed System Po 1 1 P1 P. Fig. 3.8,1 Processor P, is non-faulty. Case II : P, is faulty : 1 Suppose that processor P, sends an initial value of 1 to P, and 0 to P, 2. Processor P, will communicate the value 0 to P,. 3 As far as P, is concerned, this case will look identical to Case I. 4 So any agreement protocol which works for three processors cannot distinguish between the two cases and must force P, to accept 1as the agreed upon value whenever P, is faced with such situations. 5 However, in case II, this will work only if P, is also made to accept 1as the agreed upon value. 6 Using a similar argument, we can show that ifP, receives an initial value of 0 from Po, then it must take 0 as the agreed upon value, even if P, communicatesa value of 1. 7 However, if this is followed in case II, P, will agree on a value of 1 and P, will agree on a value of 0. Po 1 P; 0 Fig. 3.8.2 Processor P, is faulty. Therefore, no solution exists for the Byzantine agreement problem for three processors,which can work under single processor failure. Que 3.9. Describe Byzantine agreement problem, and explain its solution. Show that Byzantine agreement cannot always be reached among four processors if two processors are faulty. AKTU2017-18, Marks 10 3-10 B (CS-Sem-7) Agreement Protocols Answer Byzantine agreement problem and its solution : Refer Q. 3.7, Page 3-7B, Unit-3. Proof: 1 Considera system with four processors as P,, P,, P, Pa Assume that processors are exchanging three values x, y and z to each other, P, initiate the initial value and processors P, andP, are faulty. 2 To initiate the agreement, processor P, execute algorithm OM(1)and sends its value x to all other processor as shown in Fig. 3.9.1. P1 X X P2 P3 P4 Fig. 3.9.1. 3. After receiving the value x from source processor P,, processors P,, P, and P, execute the algorithm OM(0). 4 Processor P, is non-faulty and send value x to processor P, and Pa Faulty processors P, and P, sends valuey to (Ps, P) and z to (P,, P;) respectively as shown in Fig. 3.9.2. P X P2 P3 PA Fig. 3.9.2. 5. After receiving all the messages, processor P;, P, Pz and P, decide on the majority value. Majority values for Byzantine solution : ProcessorReceived majority Common majority values values P (x, x, 2) (x, y, z) 0 P PA (x, x, y) Distributed System 3-11B (CS-Sem-7) 6 According to majority value table, processors does not agree on single common majority value, which violates the condition of Byzantine agreement problem. 7 "This proves that Byzantine agreement cannot always reach among four processors if two processors are faulty. PART-3 Application of AgreementProtocol, Atomic Commit in Distributed Database System. Questions-Answers Long Answer Type and Medium Answer Type Questions Que 3.10. What do you understand by the fault-tolerant clock synchronization ? Answer 1. In distributed systems, it is often necessary that sites (or processes) maintain physical clocks that are synchronized with one another. 2. Since physical clocks have a drift problem, they must be periodically resynthesized. 3. Such periodic synchronization becomes extremely difficult if the Byzantine failure is allowed because a faulty process can report different clock values to different processes. 4. Now the assumptions regarding thesystem are: a. All clocks are initially synchronized to approximately the same values. b Anon-faulty process's clock runs at approximately correct rate. C. A non-faulty process can read the clock value of another non faulty process with at most a small error e. 5. A clock synchronization algorithm should satisfy the following two conditions: a. At any time, the values of the clocks of all non-faulty processes must be approximately equal. b. There is a small bound on the amount by which the clock of a non faulty process is changed during each resynchronization. This condition implies that resynchronization does not cause a clock value to jump arbitrarily far, thereby preventing the clock rate from being too far from the real time. 3-12 B (CS-Sem-7) Agreement Protocols Que 3.11.| Explain the interactive convergence algorithm for clock synchronization. Answer The interactive convergence algorithm : 1. It is called interactive convergence algorithm because it causes the convergence of non-faulty clocks. 2 This algorithm assumes that the clocks are initially synchronized and they are resynthesized often enough so that two non-faulty clocks never differ by more than à. 3. According to the algorithm, each process reads the value of all other processes clocks and sets its clock value to the average of these values. 4. Ifa clock value difers from its own clock value by more than 8, it replaces that value by its own clock value when taking the average. 5. The processing of the algorithm is very simple. 6. Now, assume the situation where, there are two processes p and q respectively, use Co, and C as the clock values of a third process r when computing their averages. 7. Ifr is non-faulty, then Cpr =Cor 8 Ifr is faulty, then |Cpr-Car |s38 when p and g compute their averages for the n clock values, they both use identical values for the clocks of n-mnon-faulty processes and the difference in the clock values of mfaulty processes they use is bounded by 38. 9 In this way, the average value computed by p and q differ by at most (3m/n) 8. Since, n> 3m and (3m/n) 8 1) messages sent. 6 However, this method suffers from the domino effect. Que 4.7. Write short note on : i. Livelocks ii. Domino effects iii. Failure resilient processes iv. Consistent checkpoints AKTU2014-15, Marks 10 Answer i. Livelocks: 1. In rollback recovery, livelock is a situation in which a single failure can cause an infinite number of rollbacks, preventing the system from making progress. 2. A livelock situation in a distributed system is shown in Fig. 4.7.1. Here Fig. 4.7.1(a) illustrates the activity of twoprocesses X and Y until the failure of Y. 3. Process Yfails before receiving message n,, sent by X. 4 When Yrolls back to y,, there is no record of sending message m,, hence X must rollback to x,: 5 When process Y recovers, it sends out m, and receives n, [Fig. 4.7.1(6)]. 4-8 B (CS-Sem-7) Failure Recovery in Distributed System X m Time Failure (a) X n m Time 2nd rollback (6) Fig. 4.7.1. 6 Process X, after resuming from x,, sends n, and receives m 7. However, because Xis rolled back, there is no record of sending n, and hence Yhas to rollback for the second time. 8 This forces X to rollback too, as it has received m,, and there is no record of sending m, at Y. 9 This situation can repeat indefinitely, preventing the system from making any progress. ii. Domino effect: 1. Consider the system activity illustrated in the Fig. 4.7.2. 2 In the Fig. 4.7.2, X,Y, and Zare three processes that cooperate by exchanging information (shown by the arrows). 3. Each symbol marks a recovery point towhich a process can be rolled back in the event of afailure. X X X Y Time Fig. 4.7.2. Distributed System 4-9 B (CS-Sem-7) 4 If process Xis to be rolled back, it can be rolled back to the recovery point x, without affecting any other process. 5. Suppose that Yfails after sending message mand is rolled back to 6. In this case, the receipt of mis recorded in , but the sending of m is not recorded in y, 7. Now we have a situation where Xhas received message mfrom Y, but Y has no record of sending it, which corresponds to an inconsistent state. 8 Under such circumstances,m is referred to as an orphan message and process X must rollback. X must rollback because Yinteracted with X after establishing its recovery point y, 10. When Yis rolled back to y,, the event that is responsible for the interaction is undone. 11. Therefore, all the effects at Xcaused by the interaction must also be undone. 12. This can be achieved by rolling back Xto recovery point x 13. In the same way, ifZ is rolled back, all three processes must rollback to their very first recovery points, namely, , Y, and z, 14. This effect, where rolling back one process causes one or more other processes to rollback, is known as domino effect. ii. Failure resilient processes : 1 The fundamental unit of execution is a process. 2. Hence, in order for any system to be fault-tolerant., the processes of that system must be resilient to system failure. 3. A process is said to be resilient, if it marks failures and guarantees progress despite a certain number of system failures. 4. In other words, a minimum disruption is caused to service provided by the process in event of a system failure. 5 Two approaches have been proposed to implement resilient process : a. Backup processes b Replicated execution iv. Consistent checkpoints : Refer Q. 4.5, Page 4-6B, Unit-4. Que 4.8. Explain synchronous checkpointing algorithm. Answer Synchronous checkpointing : 1 so In this approach, processes synchronize their checkpointing activitythe that a globally consistent set of checkpoints is always maintained in system. 4-10 B (CS-Sem-7) Failure Recovery in Distributed System 2. In this method, consistent set of checkpoints are used which avoids livelock problems during recovery. Algorithm: 1 It assumes the following characteristics : a. Processes communicate by exchanging messages through communication channels. b. Channels are FIFO in nature. C. Communication failures do not partition the network. 2. The checkpoint algorithm takes two kinds of checkpoints on stable storage: a. Permanent checkpoint : A permanent checkpoint is a local checkpoint at aprocess and is apart ofa consistent global checkpoint. b. Tentative checkpoints : A tentative checkpoint is a temporary checkpoint that is made a permanent checkpoint on the successful termination of the checkpoint algorithm. 3. Processes rollback only totheir permanent checkpoints. 4 The algorithm has two phases : a. First phase: i An initiating process P, takes tentative checkpoint and requests all the processes to take tentative checkpoints. Each process informs process P; whether it accepts or rejects the request of taking tentative checkpoint. When all the process has successfully accepted the tentative checkpoints then P decides to make this checkpoint a permanent checkpoint. Otherwise tentative checkpoint is discarded. b. Second phase : i. P, informs all the processes of the decision it reached at the end of the first phase. Aprocess, on receiving the message from P,, will act accordingly. i. Therefore, either all or none of the processes accept permanent checkpoints. Que 4.9. Write short note on lost message. Answer Lost messages : a. Suppose that checkpoints x, and y,(Fig. 4.9.1) are chosen as the recovery points for processes Xand Y, respectively. Distributed System 4-11 B(CS-Sem-7) b In this case, the event that sent message m is recorded in x,, while the event ofits receipt at Yis not recorded in y,. C. If Y fails after receiving message m, the system is restored to state Y, in which message mis lost as process Xispast the point where it sendsmessage m. d This condition can also arise if m is lost in the communication channel and processes X and Yare in state x, and y,, respectively. e Both the above conditions are indistinguishable. X m Failure Time Fig. 4.9.1. Que 4.10. Explain the approaches to implement resilient process. Answer Two approaches have been proposed to implement resilient process: 1. Backup processes : a. In the backup processes approach, each resilient process is implemented by a primary process and one or more backup processes. The primary process executes while the backup processes are inactive. b. If the primary process terminates because of a failure, one of the backup processes becomes active and talkes over the functions of the primary process. C. To minimize the computation that has to be redone by the backup process, the state of the primary process is stored (checkpointed) at appropriate intervals. d. The checkpointed state is stored in a suitable place such that the failure of the primary process's machine does not affect the checkpoint's availability. 2. Replicated execution: a. In the replicated execution approach, several processes execute the same program concurrently. b As long as one of the processes survives failures, the computation or the service continues. 4-12B (CS-Sem-7) Failure Recovery in Distributed System C. A significant advantage of replicated execution is that it can be used to increase reliability as well as availability. d. The reliability of a computation can be increased by taking a majority consensus among the results generated by all the process. e This final result can then be used in subsequent computations. Que 4.11. Explain rollback recovery algorithm in distributed database system. Answer 1. This algorithm assumes that a single process invokes the algorithm as opposed to several processes concurrently invoking it to rollback and recover. 2. It also assumes that the checkpoint and the rollback recovery algorithms are not concurrently invoked. 3. This algorithm has two phases: First phase : i. An initiating process P, checks whether all the processes are willing to restart from their previous checkpoints. i. Aprocess may reject the request of P, ifit is already participating in acheckpointing or a recovering process initiated by some other process. iii. If all the processes accept the request of P, to restart from their previous checkpoints, P, decides to restart all the processes. iv. Otherwise all the processes continue with their normal activities. b. Second phase: i P.propagates its decision toall the processes.On receiving P's decision, a process will act accordingly. The recovery algorithm requires that every process do not send messages related to underlying computation while it is waiting for P's decision. Que 4.12. Write the difference between deadlock and livelock. Distributed System 4-13 B (CS-Sem-7) Answer Difference: S. No. Deadlock Livelock 1. A deadlock is a situation Livelock is a situation in which where a process or a set of single failure of aprocess can cause processes is blocked and an infinite number of rollbacks, waiting for an event that will preventing the systems from never occur. making progress. 2. The problem of deadlocks is LivelockS are common in common in computer system distributed system where where resource sharing is synchronization is maintained by frequent. message passing. 3 Deadlock can be handled by Livelocks can be prevented by various deadlock handling coordinating the processes either strategies like prevention, at the time of establishing detection and avoidance. checkpointing or at the beginning of recovery. PART-3 Fault Tolerance :Issues in Fault Tolerance. Questions-Answers Long Answer Type and Medium Answer Type Questions Que 4.13.Define fault and failure. What are different approaches to fault-tolerance ?Explain. AKTU 2014-15, Marks 05 Answer Faults : A fault is an anomalous physical condition. The causes of a fault include design errors, manufacturing problems, damage fatigue or other deterioration, and external disturbances. Failure: Failure of asystem occurs when the system does not perform its services in the manner specified. An erroneous state of the system is a state which could lead to a system failure by a sequence of valid state transaction. 4-14 B (CS-Sem-7) Failure Recovery in Distributed System Different fault tolerance approaches : 1. Replication : a. Replication is the process of creating and maintaining multiple copies of data objects or processes on several nodes. b Therefore, iffailure on one node occurs then data will be accessible to the user from other node. C. Replication provides high data availability and performance. 2. Checkpointing : a. Fault tolerance can be achieved through checkpointing. b. Checkpointing means to periodically save the consistent state of the system in a reliable storage medium. Each such instance when a system is in the consistent state is called a checkpoint. C. Checkpointing is primarily used to avoid losing all the useful processing done before a fault has occurred. In case of a fault, checkpoint enables the execútion of a program to be resumed fromn a previous consistent state rather than resuming the execution from the beginning. Que 4.14. Discuss at least three main issues that are relevant to the understanding of distributed fault tolerance system. Explain how that makes it important. AKTU2015-16, Marks 10 Answer Issues in the fault tolerance are as follows : 1. Process deaths: a. When a process dies, it is important that the resources allocated to that process are recouped, otherwise they may be permanently lost. b Many distributed systems are structured along the client-server model in which a client requests a service by sendinga message to a server. C. Ifthe server process fails, it is necessary that the client machine be informed so that the client process, waiting for a reply can be unblocked to take suitable action. d. Similarly, if a client process dies after sending a request toa server, it is imperative that the server be informed that the client process no longer exists. e This will facilitate the server in reclaiming any resources it has allocated to the client process. Distributed System 4-15 B (CS-Sem-7) 2. Machine failure : In the case of machine failure, all the processes running at the machine will die. b. As far as the behaviour of a client process or a server process 1s concerned, there is not much difference in their behaviour in the event of a machine failure or a process death. C. In case of machine failure, an absence of any kind of message indicates either process death or a failure. 3. Network failure: a. A communication link failure can partition a network into subnets, making it impossible for a machine to communicate with another machine in a different subnet. b. A process cannot give the difference between a machine and a communication link failure, unless the underlying communication network (such asa slotted ring network) can recognize a machine failure. C If the communication network cannot recognize machine failures and thus cannot return a suitable error code (such as ethernet), a fault-tolerant design will have to assume that a machine may be operating and processes on that machine are active. PART-4 Commit Protocol, Voting Protocol, Dynamic Voting Protocol. Questions-Answers Long Answer Type and Medium Answer TypeQuestions Que 4.15.Explain two phase commit protocol. Answer Two phase commit protocol : Two phase commit protocol is designed to allow any participant to abort its part oftransaction. Due to the requirement for atomicity, if one part of a transaction is aborted then the whole transaction must also be aborted. Phase 1 (Voting phase) : 1. The coordinator sends a canCommit request to each of the participants in the transaction. 2 When a participant receives a can Commit request it replies with its vote (yes or no) to the coordinator. Before voting yes, it prepares to commit 4-16 B (CS-Sem-7) Failure Recovery in Distributed System by saving objects in permanent storage. If the vote is no, the participant aborts immediately. Phase 2 (Completion according to outcome of vote) : 1 The coordinator collects the votes (including itsown): a. If there are no failures and all the votes are yes, the coordinator decides to commit the transaction and sends a doCommit request to each of the participants. b Otherwise, the coordinator decides to abort the transaction and sends doAbort requests to all participants that voted yes. 2 Participants that voted yes are waiting for a doCommit or doAbort request from the coordinator. When a participant receives one of thesemessages, it acts accordingly and in the case of commit, makes a haveCommitted calls as confirmation to the coordinator. coordinator participant step status step status can Commit ? 1. prepared to commit 2. prepared to commit (waiting for votes) yes (uncertain) doCommit ? 3.committed 4. committed | haveCommitted ? done Fig. 4.15.1. Communication in two phase commit protocol. Que 4.16. What are commit protocols ? Explain how two phase protocols respond to failure of participating site and failure of coordinator. AKTU 2014-15, Marks 05 Answer Commit protocols : 1. In distributed system commit protocols ensure the atomicity across the sites, i.e., when a transaction executes at multiple sites it must either be committed at all the sites or aborted at all the sites. 2. The goal of commit protocols is to have all the concern participants agree either to commit or to abort a transaction. Handlinga failure of a participating site : Let us assume that the failed site is S, and the Transaction Coordinator is TC. There are two things we need to look into to handle failure of a participating site: Distributed System 4-17 B (CS-Sem-7) 1. The response of the Transaction Coordinator of transaction T : If the failed site have not sent any message (), the TC cannot decide to commit the transaction. Hence, the transactionT should be aborted and other participating sites is to be informed. b. If the failed site have sent a message (), the TC can assume that the failed site also was ready to commit, hence the transaction can be committed by TC and the other sites will be informed to commit. In this case, the site which recovers from failure has to execute the two phase (2PC) protocol to set its local database up-to-date. 2. The response of the failed site when it recovers: When recover from failure, the recovering siteS, must identify the fate of the transactions which was going on during the failure of S. This can be done by examining the log file entries of site S;. b. This is how the two phase (2PC) protocol handles the failure of a participating Site. Handling the failure of a coordinator site : Let us suppose that the coordinator site failed during execution of two phase (2PC) protocol for atransaction T. This situation can be handled in following two way : 1 The other sites which are participating in the transaction T may try to decide the fate of the transaction. That is, they may try to decide on commit or abort of T using the control messages available in every site. 2. The second way is towait until the coordinator site recovers. Que 4.17.Deseribe three phase commit protocol. Howthree phase commit protocol is different than two phase commit protocol ? AKTU 2017-18,Marks 10 Answer Phases in three phase commit proto col : 1 The three-phase commit (3PC) protocol isa distributed algorithm which lets all nodes in a distributed system agrees to commit a transaction. 2. 3PC is non-blocking protocol. 3. 3PC places an upper bound on the amount of time required before a transaction either commits or aborts. 4. This property ensures that if a given transaction holds some resource locks, it will release the locks after the time-out. 5. The three-phase commit (3PC) protocol is more complicated and more expensive phase in 3PC protocol. 4-18 B (CS-Sem-7) Failure Recovery in Distributed System Phase 1:Voting /Prepare phase : 1. Transaction Coordinator (TC) of the transaction writes BEGIN_COMMIT message in its log file and sends PREPARE message to all the participating sites and waits. 2. On receiving PREPARE message, ifa site is ready to commit, then the site's Transaction Manager (TM) writes READY in its log and send VOTE_COMMIT to TC. 3. If any site is not ready to commit, it writes ABORT in its log and responds with VOTE ABORT to the TC. Phase 2: BufferingPre-commit phase : 1. IfTC received VOTE_COMMIT from all the participating sites, then it writes PREPARE_TO0_COMMIT in its log and sends PREPARE_TO_COMMITmessage to all the participating sites. 2 If TC receives any one VOTE ABORT message, it writes ABORT in its log and sends GLOBAL ABORT to all the participating sites and also writes END_OF_TRANSACTION message in its log. 3. On receiving the message PREPARE_TO_COMMIT, the TM of participating sites write PREPARE_TO_COMMIT in their log and respond with READY_ TO_COMMIT message to the TC. 4. If they receive GLOBAL_ABORT message, then TM of the sites write ABORT in their logs and acknowledge the abort. Phase 3: Decision/Commit or abort phase : 1 If all responses are READY_TO_COMMIT, then TC writes COMMIT in its log and send GLOBAL_COMMIT message to all the participating sites' TMs. 2. The TM of all sites then writes COMMIT in their log and sends an acknowledgement to the TC. Then, TC writes END_OF_TRANSACTION in its log. Three phase vs. two phase commit protocol : 1. In two-phase commit protocol, when coordinator fails during execution then participating sites are unable to determine whether the coordinator has made a decision to abort or commit the transaction, which cause participating sites to be in blocked state. 2. To remove this blocking problem in 2PC, three phase commit protocol was proposed. Three-Phase Commit protocol is able to. prevent this blocking problem by taking the decision based on the decision of all sites. Que 4.18. What is voting protocol ? Explain static voting and dynamic voting protocols. AKTU2014-15, Marks 05 Distributed System 4-19 B (CS-Sem-7) OR Describe in detail: a. Dynamic voting protocols b. Method to obtain consistent set of checkpoint AKTU2016-17, Marks 10 Answer Voting protocol : 1 Voting protocol is a common approach to provide fault tolerance in distributed system by replicating data at many sites (or nodes). 2 Ifa site is not available, the data can still be obtained from copies at other sites. 3 With the voting mechanism, each replica is assigned some number of votes, and a majority of votes must be collected from a process before it can access a replica. 4 The voting mechanism is more fault tolerant than a commit protocol in that it allows access to data under network partitions, site failures, and message losses without compromising the integrity of the data. Static voting protocol : 1 In static voting scheme, the replicas of files are stored at different sites. 2 Every file access operation requires that an appropriate lock is obtained. 3. The lock granting rules allow either 'one writer and no readers' or 'multiple readers and no writers' to access a file simultaneously. 4 It is assumed that at every site there is a lock manager that performs the lock related operations, and every file is associated with a version number, which gives the number of times the file has been updated. 5 The version numbers are stored on stable storage, and every successful write operation on a replica updates its version number. Dynamic voting protocol : 1 Suppose that in the system shown in Fig. 4.18.1, site 4 becomes unreachable from the rest of the sites due to its failure or due to a network partition. votes =1 (1) 75 msecs 3 votes = 1 votes = 1 750 msecs 750 msecs Votes = 2 100 msecs Fig. 4.18.1. 4-20 B (CS-Sem-7) Failure Recovery in Distributed System 2. Site 1, 2, and 3 can still collect a guorum (also referred to as while site 4 cannot collect a majority) 3. quorum. If another partition or a failure of a site occurs, making any site unavailable, the system cannot serve any read or write requests as a quorumn cannot be collected in any partition. 4. In other words, thà system is completely unavailable which is a serious problem. 5. Dynamic voting protocols solve this problem by adapting the number of votes or the set of sites that can form a quorum, to the changing state of the system due to site and communication failures: 6 In the dynamicprotocols, following two approaches are used to enhance availability: a. Majority based approach :The set of sitesthat can forma majority to allow access to replicated data changes with the changing state of the system. b. Dynamic vote reassignment :The number of votes assigned to a site changes dynamically. Method to obtain consistent set of checkpoint : Refer Q. 4.6, Page 4-7B, Unit-4. VERY IMPORTANT QUESTIONS |Following questions are very important. These questions may be asked in your SESSIONALS as well as UNIVERSITY EXAMINATION. Q. 1. Define forward recovery and backward recovery. List advantages and disadvantages of forward recovery. Explain two approaches of backward-error recovery. Ans. Refer Q. 4.1. Q.2. What do you mean by recovery in concurrent system ? Explain. Ans. Refer Q. 4.3. Q.3. Write a short note on method to obtain consistent set of checkpoints. Ans. Refer Q. 4.6. Q.4. Write short note on: i. Livelocks ii. Domino effects 5 UNIT Transaetion and Concurrency Control CONTENTS Part-1 Transaction and Concurrency 5-2B to 5-4B Control : Transaction, Nested Transactions Part-2 : Locks, Optimistic 5-4B to 5-10B Concurrency Control, Timestamp Ordering, Comparison of Methods for Concurrency Control Part-3 : Distributed Transaction 5-10B to 5-14B Flat and Nested Transaction Part-4 : Atomic Commit Protocol, 5-14B to 5-17B Concurrency Contro] in Distributed Transaction Part-5 : Distributed Deadlocks,.5-17B to 5-21B Transaction Recovery Part-6 : Replication : System Model. 5-21B to 5-31B and Group Communication, Fault Tolerant Services, Highly Available Services and Transaction With Replicated Data. 5-1 B (CS-Sem-7) 5-2 B (CS-Sem-7) Transaction and Concurrency Control PART-1 Transaction and Concurrency Control : Transaction, Nested Transactions. Questions-Answers Long Answer Type and Medium Answer Type Questions Que-5.1. Explain transaction and its properties. Answer Transaction is a sequence of read, compute and write statements that refer tothe data objects of a database. Properties of transactions: Transactions have four essential properties, known as ACID properties : 1. Atomicity : Atomicity means that the transaction completes its processing as an indivisible and atomic unit either successfully or unsuccessfully. After the transaction, the database changes consistency or not changed at all. 2. Consistency : Consistency means that the database is in consistent state before transaction and after the termination of transaction, the database will also be in the consistent state. 3. Isolation: Isolation property indicates that every action processed by a transaction is kept isolated until the completion of transaction. During the transaction processing, the processing will be hidden from outside the transaction. Durability : Durability means any failure made after the commit 4 operation will not affect the database. The commit action will reflect its results to the database after its termination. Que 5.2. Write short note on nested transaction. AKTU 2016-17, Marks 10 Answer Nested transaction : 1 In a nested transaction, the top-level transaction can open subtransactions, and each subtransaction can open further subtransactions down to any depth of nesting. Distributed System 5-3 B (CS-Sem-7) 2 The Fig. 5.2.1 shows a client's transaction T that opens two subtransactions T, and T,, which access objects at servers Xand Y. 3 The subtransactions T, and T, open further subtransactions T1) T and T Which access objects at servers M, N, O andP. 4 In the nested case, subtransactions at the same level can run concurrently, so Tand T, are concurrent, and as they invoke objects in different servers, they can run in parallel. 5 The four subtransactions T,,,Ti, To, and T, also run concurrently. M T Client T X T T1 T, Y T 22 Fig. 5.2.1. Que 5.3.Explain how the two phase commit protocol for nested transaction ensures that if the top level transactions commit, all the right descendents are committed or aborted ? AKTU2015-16, Marks 10 Answer 1 Consider the top-level transaction T and its subtransactions shown in Fig. 5.3.1. T, abort (at M) T, provisional commit (at X) T,, provisional commit (at N) T provisional commit (at N) T, aborted (at Y) T2 provisional commit (at P) Fig. 5.3.1. Transaction Tdecides whether to commit. 5-4B (CS-Sem-7) Transaction and Concurrency Control 2 Each subtransaction has either provisionally committed or aborted. For example, T, has provisionally committed and T,, has aborted, but the fate of T;, depends on its parent T, and eventually on the top-level transaction, T. b Although T, and T,, have both provisionally committed, T, has aborted and this means that T,, and To, must also abort. C. Suppose that T decides to commit in spite of the fact that T, has aborted, also that T,decides to commit in spite of thefact that Ti1 has aborted. 4 When a top-level transaction completes, its coordinator carries out a two phase commit protocol. The only reason for a participant subtransaction being unable to complete is if it has crashed since it completed its provisional commit. 5. When each subtransaction was created, it joined its parent transaction. 6 Therefore, the coordinator of each parent transaction has a list of its child subtransactions. 7. When a nested transaction provisionally commits, it reports its status and the status of its descendants toitsparent. 8 When a nested transaction aborts, it just reports abort to its parent without giving any information about its descendants. 9 Eventually, the top-level transaction receives a list of all the subtransactions in the tree, together with the status of each. 10. Descendants of aborted subtransactions are actually omitted from this list. PART-2 Locks, Optimistic Concurrency Control, Timesstamp Ordering, Comparison of Methods for Concurrency Control. Questions-Answers Long Answer Type and Medium Answer Type Questions Que 5.4. What is lock ? What are the different modes in which transaction can lock a data object. Distributed System 5-5 B (CS-Sem-7) Answer 1 A lock is avariable associated with shared resources such as data item that determines whether read/write operation can be performed on that data item. 2. In lock based techniques,each data object has a lock associated with it. 3. A transaction can hold, request or release the lock on a data object, as required by the transaction. 4 The transaction is said to have the locked data object, ifit holds a lock. 5. There are two modes of locking in which transaction can lock data object : a. Exclusive: If a transaction has locked the data object in exclusive mode, no other transaction can lock it in any mode. ii. In this locking scheme, the server attempts to lock any object. iii. If a client requests access to an object, the request is suspended and the client must wait until the object is unlocked. b. Shared: i If the transaction has locked the data object in shared mode, other transaction can concurrently lock it but only in shared mode. i. If a client requests access to an object, the request is always successful. ii. All the transactions reading the same object can share their read lock. Que 5.5. Discuss 2PL and strict 2PL in context of distributed system. AKTU 2015-16, 2016-17; Marks 7.5 Answer Two phase locking : 1 This locking scheme is also called as 2 PL. 2 Two phase locking is a dynamic locking scheme in which a transaction requests a lock on a data object when it needs the data object. Twophase locking is called two phase as it has two phases : a. Growingphase:It is aphase during which new locks are acquired. b. Shrinking phase : It is a phase during which locks are released. 4. Two phase locking imposes a constraint on lock acquisition and the lock release actions of a transaction to guarantee consistency. 5. The state ofa transaction in which it releases locks and hold locks on all the needed data objects is referred to as lock point. An execution is shown in Fig. 5.5.1. 5-6 B (CS-Sem-7) Transaction and Concurrency Control Number Lock point of locks Growing Shrinking phase phase Time First Second Release Release lock of first of second lock acquisition acquisition lock lock Fig. 5.5.1. Two phase locking scheme. Strict two phase locking : read or write an 1 Under a strict execution, a transaction that needs to the same object must be delayed until other transactions that wrote object have committed or aborted. 2. To enforce this rule, any locks applied during the progress of transaction or aborts. This is called are held until the transaction either commit strict two phase locking. because transaction 3 Strict two phase locking eliminates cascaded aborts after the transaction can read data objects modified by a transaction only has completed. as a transaction 4. However, strict two phase locking reduces concurrency consistency. holds locks for a longer period than required for implemented in distributed Que 5.6. How two phase locking is database systemn ? Answer distributed database system in Two phase locking can be implemented in a the following way : the locks associated with objects 1 ADataManager (DM) at a site controls stored at that site. appropriate DM to 2. Transaction Manager (TM) communicates with the A lock or unlock a data object. Distributed System 5-7 B (CS-Sem-7) 3. If arequest for lock cannot be granted, the DM puts it on the waiting queue of the object. 4 When a lock on an object is released, one of the waiting requests for the lock on that object is granted. Que 5.7. Explain optimistic concurrency control. AKTU2014-15, 2016-17; Marks 05 Answer Optimistic concurrency control states that the conflicts among the transactions are rare in distributed database system. It is only an assumption so it is also called optimistic. In optimistic concurrency control scheme, each transaction goes through three phase: 1. Working phase : During this phase, each transaction has a tentative version of each of the objects that it updates. The use oftentative versions allows the transaction to abort either during the working phase or other validation phase. The rules for read/write are: a. Read operation is performed if the tentative version for that transaction already exists. b. Write operation record the new values of several concurrent transaction objects as tentative values which are invisible to other transactions. 2. Validation phase: When the close transaction request is received, the transaction is validated to establish whether or not its operations on objects conflicts with operations of other transaction on same objects. 3. Update phase:If the transaction is validated, all the changes recorded in its tentative versions are made permanent -read only transaction can commit immediately after passing validation. Que 5.8. Discuss the optimistic methods for distributed concurrency control. What are the different validation conditions for optimistic concurrency control ? Explain. |AKTU2015-16, Marks 10 Answer Optimistic concurrency control : Refer Q. 5.7, Page 5-7B, Unit-5. Validation condition for optimisticconcurrency control: Let T, and T, be the two transactions. For a transaction T, to be serializable with respect tó an overlapping transaction T, their opérations must confirm to the following rules/conditions: 5-8 B (CS-Sem-7) Transaction and Concurrency Control 1. T, must not read objects written by T;. 2. T, must not read objects written by T;. 3 T, must not write objects written by T, and T. must not write objects written by T;, Que 5.9. What are the different validation conditions for optimistic concurrency control ? How it effects the transaction in distributed system. AKTU2018-19, Marks 10 Answer Validation condition for optimistic concurren cy control : Refer Q. 5.8, Page 5-7B, Unit-5. Effects of validation conditions on transaction in distributed system : 1. If the validation conditions are successful, then the transaction can commit. 2 If the validation conditions fail, then some form of conflict resolution must be used and the current transaction will be aborted. 3 Rule 1and 2 test whether there is a overlapping between the objects of pair of transaction T; and T;. 4 Rule 3 ensures that no two transactions can overlap in update phase. 5 Due to restriction on write operations no dirty read can occurs. Que 5.10. Write short notes on timestamp ordering transaction management. AKTU 2015-16, Marks 05 Answer 1. In distributed transaction, each coordinator issue globally unique timestamps. 2. Aglobally unique transaction timestamp is issued to the client by the first coordinator accessed by a transaction. 3 The transaction timestamp is passed to the coordinator at each server whose objects perform an operation in the transaction. 4 The servers of distributed transactions are jointly responsible for ensuring that they are performed in a serially equivalent manner. 5. Atimestamp consists of apair. 6. The agreed ordering of pairs of timestamps is based on a comparison in which the server-id part is less significant. Distributed System 5-9 B (CS-Sem-7) 7. The same ordering of transactions can be achieved at all the servers even if their local clocks are not synchronized. Que 5.11. Explain strict two phase locking with its rules. Answer Strict two phase locking: Refer Q. 5.5,Page 5-5B, Unit-5. The rules for the use of locks in a strict two phase locking implementation are as follows : 1 When an operation accesses an object within a transaction: a If the object is not already locked, it is locked and the operation proceeds. b If the object has the conflicting lock set by another transaction, the transaction wait until it is unlocked. C. Ifthe object has the non-conflicting lock set by another transaction, the lock is shared and the operation proceeds. d. If the object has already been locked in the same transaction, the lock will be promoted if necessary and the operation proceeds. 2. When a transaction is committed or aborted, the server unlocks all objects it locked for the transaction. These rules ensure strictness because the locks are held until a transaction has either committed or aborted. Que 5.12. Explain multiversion timestamp ordering protocol. Answer 1 In multiversion timestamp ordering, a list of old committed versions as well as tentative versions is kept for each object. 2 This list represents the history of the values of the object. 3 Each version has a read timestamp recording the largest timestamp of any transaction that has read from it in addition to a write timestamp. 4 Whenever a write operation is accepted, it is directed to a tentative version with the write timestamp of the transaction. 5 Whenever a read operation is carried out it is directed to the version with the largest write timestamp less than the transaction timestamp. 6 If the transaction timestamp is larger than the read timestamp of the version being used, the read timestamp of the version is set to the transaction timestamp. 7. When a read arrives late, it can be allowed to read from an old committed version, so there is no need to abort the read operations. 5-10 B (CS-Sem-7) Transaction and Concurrency Control 8. In multiversion timestamp ordering, read operations are always permitted although they may have to wait for earlier transaction to complete (either commit or abort), which ensures that executions are recoverable. Que 5.13, What are the advantages and drawback of multiversion timestamp ordering in comparison with the basic timestamp ordering? AKTU2014-15, Marks 10 Answer Advantages of multiversion timestamp ordering: 1 It allows more concurrency in distributed system. 2. Improved system responsiveness by providing multiple versions. 3 Reduces the probability of conflicts transaction. 4. Read request never fails and is never made to wait. Disadvantages of multiversion timestamp ordering : 1. Reading of adata item also requires the updating of the read timestamp field resulting in two potential disk accesses, rather than one. 2. The conflicts between transactions are resolved through rollbacks, rather than through waits. 3. It require huge amount of storage for storing multiple versions of data objects. 4. It does not ensure recoverability and cascadelessness. PART-3 Distributed Transaction : Flat and Nested Transaction. Questions-Answers Long Answer Type and Medium Answer Type Questions Que 5.14. Explain distributed transaction. Answer 1. Adistributed transaction is a database transaction in which two or more network hosts are involved. 2. These hosts provide transactional resources, while the transaction manager is responsible for creating and managing aglobal transaction that encompasses alloperations against such resources. Distributed System 5-11 B (CS-Sem-7) 3. Distributed transactions, as any other transactions, must have all four ACID (Atomicity, Consistency, Isolation and Durability) properties. 4. For distributed transactions, each computer (or nodes) acts as a local transaction manager. 5. If the transaction works at several computers, the transaction managers communicate with various other transaction managers by means of superior or subordinate relationships, which are accurate only for a specific transaction. Que 5.15. Explain distributed transactions. Discuss the functionality of flat and nested distributed transactions with example. |AKTU2018-19, Marks10 OR Write short note on flat and nested tran saction. AKTU 2016-17, Marks 7.5 Answer Distributed transaction : Refer Q.5.14, Page 5-10B, Unit-5. Functionality of flat with example : 1 In a flat transaction, a client makes requests to more than one server. 2 Aflat client transaction completes each ofits requests bfore going on to the next one. Therefore, each transaction accesses server objects sequentially. 3 For example, in the Fig. 5.15.1, transaction Tis a flat transaction that invokes operations on objects in servers X, Yand Z. X T T Y Client Fig. 5.15.1, Flat transactions. Functionality of nested transaction with example : Refer Q. 5.2, Page 5-2B, Unit-5. Que 5.16. i. What are the goals of distributed transaction ? Distinguish between flat and nested transaction along with its structure. 5-12 B(CS-Sem-7) Transaction and Concurrency Control ii. Explain optimistic concurrency control. AKTU 2014-15, 2016-17; Marks 10 Answer i. Goals of distributed transaction : 1. The goal of distributed transaction is to ensure that all of the objects managed by, a server remain in a consistent state when they are accessed by multiple transactions and in the presence of server crashes. 2. To maintain ACID properties of transaction in distributed system. 3 To complete overall transaction occurring at different nodes. 4. Toensure the consistency of a set of shared data objects accessed by user at the time of failures and concurrent access. Difference between flat and nested transaction: S. No. Flat transaction Nested transaction 1 A flat client transaction In a nested transaction, the top-level completes each of its | transaction can open subtransactions, requests before going on to and each subtransaction can open the next one. Therefore, further subtransactions down to any each transaction accesses depth of nesting. server objects sequentially. 2 In the Fig. 5.16.1, In nested transaction as shown in transaction T is a flat Fig. 5.16.2, subtransactions at the transaction that invokes same level can run concurrently, so operation on objects in T,and T, are concurrent, and as they servers X, Y and Z. invoke objects in different servers, they can run in parallel. M x T T Y Client, T, Client T12 T Fig. 5.16.1. P Fig. 5.16.2. Distributed System 5-13 B (CS-Sem-7) iü. Optimistic concurrency control: Refer Q. 5.7, Page 5-7B, Unit-5. Que 5.17. Draw a schematic diagram of the distributed transaction management model. Explain each component in brief. AKTU2017-18, Marks 05 Answer Transaction management model contain following three components : 1. Transaction manager (TM) : The transaction manager supervises.the execution of a transaction. b. It intercepts and executes all the submitted transactions. C. TM interacts with the DM to carry out the execution of atransaction. TM assigns a timestamp to a transaction or issue requests to lock and unlock data objects on behalf ofauser. e. TM acts as an interface between user and the database system. 2. Scheduler: Scheduler is used for enforcing concurrency control. b. It grants or releases locks on data objects as requested by a transaction. 3 Data manager : The data manager (DM) manages the database. b. It carries out the read-write requests issued by the TM on behalf of a transaction by operating them on the database. C. Thus, DM is an interface between scheduler and database. transactions TM Scheduler DM Database transactionsTM Scheduler DM D Database transactionsTM Scheduler DM D Database Fig. 5.17.1. Distributed transaction management model. Execution of a transaction at the TM results in the execution of its actions at the DM. 5-14 B (CS-Sem-7) Transaction and Concurrency Control e. So, the DM executes a stream of transaction actions, directed towards it by the TM. PART-4 Atomic Commit Protocol, Concurrency Control in Distributed System. Questions-Answers Long Answer Type and Medium Answer Type Questions Que 5.18. Write short note on atomic commit protocol. Answer 1. Atomic commit protocol (ACP) is a protocol used by database manager to ensure that all the sub-transactions are consistently committed or aborted. 2 In this each server applies local concurrency control to its own object, which ensures that transaction are serialized locally as well as serialized globally. 3. When a dístributed transaction comes to an end, either all the servers commit the transaction or abort the transaction. 4 There are two atomic commit protocol used in distributed database : a. Two phase commit protocol. b Three phase commit protocol. Que 5.19. Explain two phase and three phase commit protocol. Answer Two phase commit protocol : Refer Q. 4.15, Page 4-15B, Unit-4. Three phase commit protocol:Refer Q. 4.17, Page 417B, Unit-4. Que 5.20. Write short note on conflict resolution. Answer Aconflict is resolved by.taking one of the following actions: 1. Wait: The requesting transaction is made to wait until the conflicting transaction either completes or aborts. Distributed System 5-15 B (CS-Sem-7) 2. Restart: Either the requesting transaction or the transaction it conflicts with is aborted and started afresh. b. Restarting is achieved by using one of the following primitives : 3. Die: The requesting transaction aborts and starts afresh. 4 Wound : a. The transaction in conflict with the requesting transaction is tagged as wounded and a message "wounded" is sent to all sites that the wounded transaction has visited. b If the message is received before the wounded transaction has committed at a site, the concurrency control algorithm at that site initiates an abort of the wounded transaction, otherwise the message is ignored. C. Ifawounded transaction is aborted, it is started again. d The requesting transaction proceeds after the wounded transaction completes or aborts. Que 5.21. What are the algorithms for conflict resolution in timestamp concurrency control ? Answer Following are the algorithms for conflict resolution in timestamps concurrency control : 1. Wait-die algorithm : a The wait-die algorithm is a nonpreemptive algorithm because a requesting transaction never forces the transaction holding the requested data object to abort. b Suppose requesting transaction T, is in conflict with a transaction T, If T, is older (i.e., has a smaller timestamp), then T, waits, otherwise T, dies. C. The older transaction waits for the younger transaction if the younger has accessed the granule first. d. The younger transaction is aborted and restarted ifit tries to access a granule after an older concurrent transaction. 2 Wound-wait algorithm : The wound-wait algorithm is a preemptive algorithm. b. Supposea requesting transaction T, is in conflict with a transaction T,. If T, is older, it wounds T, otherwise it watts. C. The older transaction preempts the younger by suspending itif the younger transaction tries to access a granule af