chap-01.pdf
DISTRIBUTED SYSTEMS
Principles and Paradigms, Second Edition
ANDREW S. TANENBAUM, MAARTEN VAN STEEN
Instructor: Amril Nazir, Ph.D
email: [email protected]
twitter: @NurmanNazir

Chapter 1
Introduction

Tanenbaum & Van Steen, Distributed Systems: Principles and Paradigms, 2e, (c) 2007 Prentice-Hall, Inc. All rights reserved. 0-13-239227-5

Definition of a Distributed System (1)
A distributed system is: a collection of independent computers that appears to its users as a single coherent system.

History
Introduction
- Computer systems are undergoing a revolution.
- From 1945, when the modern computer era began, until about 1985, computers were large and expensive.
- Even minicomputers cost tens of thousands of dollars each.
- Most organizations had only a handful of computers, and for lack of a way to connect them, these operated independently from one another.
- Starting in the mid-1980s, two advances in technology began to change that situation.

History
Blue Gene Supercomputer at Argonne National Lab (250,000 processors!)

History
- The amount of improvement that has occurred in computer technology in the past half century is truly staggering.
- The first major advance is in price and performance: from a supercomputer that cost 100 million dollars and executed 1 instruction per second, we have come to commodity machines that cost 1000 dollars and run at least as fast!
- The IBM PC, in 1981, was the first to offer such commodity machines.
- The second major development was the invention of high-speed computer networks.
- Local-area networks allow hundreds of machines within a building to be connected in such a way that small amounts of information can be transferred between machines in a few microseconds.
- As the cost of commodity computers continues to drop, it becomes more feasible to interconnect multiple computing nodes that work together so closely that they form a single computer.

History
Commodity hardware is interconnected to form cluster machines!

History
- Wide-area networks allow millions of machines all over the earth to be connected at speeds varying from 64 Kbps to gigabits per second.
- The result of these technologies is that it is now not only feasible, but easy, to put together computing systems composed of large numbers of computers connected by a high-speed network.
- These are usually called computer networks or distributed systems.

Key Characterizations
The definition of distributed systems has two aspects:
- The first deals with hardware: the machines are autonomous.
- The second deals with software: the users think they are dealing with a single system.

Key Characterizations (2)
Multiple Computers
A distributed system contains more than one physical computer, each consisting of CPUs, some local memory, possibly some stable storage such as disks, and I/O paths to connect it with the environment.
Interconnections
Some of the I/O paths will interconnect the computers. If they cannot talk to each other, then it is not going to be a very interesting distributed system.
Shared State
The computers cooperate to maintain some shared state. That is, if the correct operation of the system is described in terms of some global invariants, then maintaining those invariants requires the correct and coordinated operation of multiple computers.

Key Characterizations (3)
Building a system out of interconnected computers requires several issues to be addressed:
Heterogeneity
The Internet enables users to access services and run applications over a heterogeneous collection of computers and networks.
Heterogeneity applies to all of the following:
- networks
- computer hardware
- operating systems
- programming languages
- implementations by different developers

Key Characterizations (4)
Security
Many of the information resources that are made available and maintained in distributed systems have a high intrinsic value to their users. Their security is therefore of considerable importance.
Security for information resources has three components:
- Confidentiality (protection against disclosure to unauthorized individuals)
- Integrity (protection against alteration or corruption)
- Availability (protection against interference with the means to access the resources)
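
As a minimal sketch of the integrity component listed above, the following Python example attaches a keyed hash (HMAC-SHA256 from the standard library) to a message so that the receiver can detect alteration or corruption. The function names `sign` and `verify` and the shared key are assumptions made for illustration only; key distribution, confidentiality, and availability are outside the scope of the sketch.

```python
import hashlib
import hmac


def sign(key: bytes, message: bytes) -> bytes:
    """Return an HMAC-SHA256 tag for the message under the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare in constant time.

    False means the message, the tag, or the key does not match.
    """
    return hmac.compare_digest(sign(key, message), tag)


if __name__ == "__main__":
    key = b"hypothetical-shared-secret"   # assumed to be known to both parties
    msg = b"transfer 10 units to account 42"
    tag = sign(key, msg)
    print(verify(key, msg, tag))          # unmodified message
    print(verify(key, msg + b"0", tag))   # message altered in transit
```

The sender transmits the message together with its tag; the receiver calls `verify` and discards the message if the check fails.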

Key Characterizations (5)
Scalability
- Distributed systems operate effectively and efficiently at many different scales, ranging from a small intranet to the Internet.
- A system is described as scalable if it will remain effective when there is a significant increase in the number of resources and the number of users.
- The Internet provides an illustration of a distributed system in which the number of computers and services has increased dramatically.
The design of scalable distributed systems presents the following challenges:
- Controlling the cost of physical resources
- Controlling the performance loss
- Preventing software resources from running out
- Avoiding performance bottlenecks

Key Characterizations (6)
Failure handling
- In a very large network, computer systems sometimes fail.
- When faults occur in hardware or software, programs may produce incorrect results or they may stop before they have completed the intended computation.
- Failures in a distributed system are partial: that is, some components fail while others continue to function. Therefore the handling of failures is particularly difficult.
- Recovery from failures. Recovery involves the design of software so that the state of permanent data can be recovered or ‘rolled back’ after a server has crashed.
- Redundancy. Services can be made to tolerate failures by the use of redundant components. For example, there should always be at least two different routes between any two routers in the Internet.

Key Characterizations (7)
Concurrency
Both services and applications provide resources that can be shared by clients in a distributed system.
There is therefore a possibility that several clients will attempt to access a shared resource at the same time.
Transparency
Transparency is defined as the concealment from the user and the application programmer of the separation of components in a distributed system.

Definition of a Distributed System (2)
Figure 1-1. A distributed system organized as middleware. The middleware layer extends over multiple machines, and offers each application the same interface.

Pitfalls when Developing Distributed Systems
False assumptions made by first-time developers:
- The network is reliable.
- The network is secure.
- The network is homogeneous.
- The topology does not change.
- Latency is zero.
- Bandwidth is infinite.
- Transport cost is zero.
- There is one administrator.

Transparency in a Distributed System
Figure 1-2. Different forms of transparency in a distributed system (ISO, 1995).

Scalability Problems
Figure 1-3. Examples of scalability limitations.

Scalability Problems
Characteristics of decentralized algorithms:
- No machine has complete information about the system state.
- Machines make decisions based only on local information.
- Failure of one machine does not ruin the algorithm.
- There is no implicit assumption that a global clock exists.
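
To make these characteristics concrete, here is a small single-process simulation of pairwise gossip averaging, one well-known decentralized technique (it is not taken from the slides). Each step, one node wakes up at an arbitrary moment, contacts one random peer, and both replace their values with the average of the two, so every decision uses only local information and no global clock. The function name `gossip_average` and its parameters are hypothetical, and the sketch does not model machine failures.

```python
import random


def gossip_average(values, rounds, seed=0):
    """Simulate pairwise gossip averaging among len(values) nodes.

    The list only stands in for separate machines: each step touches
    exactly two entries, the waking node and the peer it contacts.
    """
    if len(values) < 2:
        raise ValueError("need at least two nodes to gossip")
    rng = random.Random(seed)          # seeded so the simulation is repeatable
    state = [float(v) for v in values]
    n = len(state)
    for _ in range(rounds):
        i = rng.randrange(n)           # node that happens to wake up
        j = rng.randrange(n - 1)       # choose a peer different from i
        if j >= i:
            j += 1
        avg = (state[i] + state[j]) / 2.0
        state[i] = avg                 # both nodes keep the pairwise average
        state[j] = avg
    return state


if __name__ == "__main__":
    loads = [10, 20, 30, 40, 50, 60, 70, 80]
    print(gossip_average(loads, rounds=500))
```

Because each exchange preserves the sum of the two values involved, the nodes drift toward the global mean even though no node ever sees the complete system state.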

Performance Issues
Performance issues arising from the limited processing and communication capacities of computers and networks are considered under the following headings:
- Responsiveness. Users of interactive applications require fast and consistent response to interaction. When a remote service is involved, the speed at which the response is generated is determined not just by the load and performance of the server and the network but also by delays in all the software components involved: the client and server operating systems’ communication and middleware services (remote invocation support).
- Throughput. A traditional measure of performance for computer systems is the throughput: the rate at which computational work is done. We are interested in the ability of a distributed system to perform work for all its users.
- Balancing computational loads. One of the purposes of distributed systems is to enable application and service processes to proceed concurrently without competing for the same resources and to exploit the available computational resources (processor, memory, and network capabilities).

Dependability Issues
Dependability is a requirement in most application domains: the application depends on other distributed components. In such cases, fault tolerance must be preserved.
Fault tolerance. Dependable applications should continue to function correctly in the presence of faults in hardware, software and networks.
Reliability is achieved through redundancy: the provision of multiple resources so that the system and application software can reconfigure and continue to perform its tasks in the presence of faults.
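
The redundancy idea can be sketched as a simple failover loop: a client holds several replicas of the same service and moves on to the next one when a call fails. Everything named here (`call_with_failover`, `AllReplicasFailed`, and the two toy replicas) is hypothetical, and using `ConnectionError` to signal a crashed replica is an assumption of the sketch rather than a rule from the slides.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when every redundant replica has failed."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful reply."""
    failures = 0
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError:
            failures += 1              # this replica is down; try the next one
    raise AllReplicasFailed(f"{failures} replicas failed for request {request!r}")


def crashed_replica(request: str) -> str:
    """Toy replica that always behaves as if it has crashed."""
    raise ConnectionError("replica is down")


def healthy_replica(request: str) -> str:
    """Toy replica that answers every request."""
    return "ok: " + request


if __name__ == "__main__":
    print(call_with_failover([crashed_replica, healthy_replica], "read x"))
```

The caller sees a normal reply as long as at least one replica is reachable; only the loss of every replica surfaces as an error.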