Distributed Systems - Chapter One PDF
Debre Berhan University
Summary
This document provides an introduction to the fundamental concepts of distributed systems. It defines distributed systems, outlines their characteristics, and discusses various aspects including resource management, communication, and scalability. The paper also explores the issues concerning concurrency, security, and heterogeneous environments within this type of computing.
Full Transcript
Chapter One
Introduction to Distributed Systems

Contents
✓ Introduction
✓ Definition
✓ Goals of a Distributed System
✓ Types of Distributed Systems

Introduction: Definition of a Distributed System
System: “A complex whole; a set of connected parts; an organized assembly of resources and procedures (collection of …) united and regulated by interaction or interdependence to accomplish a set of specific functions.”
A distributed system is a collection of independent computers that appears to its users as a single coherent system, that is, as a single computer (Tanenbaum & Van Steen).
This definition has two aspects:
✓ Hardware: autonomous machines
✓ Software: a single-system view for the users

Other Definitions
A distributed system is a system designed to support the development of applications and services which can exploit a physical architecture consisting of multiple, autonomous processing elements that do not share primary memory but cooperate by sending asynchronous messages over a communication network (Blair & Stefani).
A distributed system is a computing platform built with many computers that:
✓ Operate concurrently
✓ Are physically distributed (and have their own failure modes)
✓ Are linked by a network
✓ Have independent clocks

Why Distributed Systems?
✓ Resource and data sharing: printers, databases, multimedia servers, ...
✓ Availability, reliability: the loss of some instances can be hidden
✓ Scalability, extensibility: the system grows with demand (e.g., extra servers)
✓ Performance: huge power (CPU, memory, ...) is available
✓ Inherent distribution, communication: organizational distribution, e-mail, video

Examples of Distributed Systems
✓ A collection of Web servers that jointly provide a distributed database of hypertext and multimedia documents
✓ A distributed file system on a LAN
✓ The Domain Name Service (DNS)
✓ Massively multiplayer online games

Centralized Vs Distributed System
Centralized System Characteristics
✓ One component with non-autonomous parts
✓ Component shared by users all the time
✓ All resources accessible
✓ Software runs in a single process
✓ Single point of control
✓ Single point of failure
Distributed System Characteristics
✓ Multiple autonomous components
✓ Components are not shared by all users
✓ Resources may not be accessible
✓ Software runs in concurrent processes on different processors
✓ Multiple points of control
✓ Multiple points of failure

Problems of Distribution
✓ Concurrency, security: clients must not disturb each other
✓ Privacy: e.g., when building a preference profile (such as by using cookies); unwanted communication such as spam
✓ Partial failure: we often do not know where the error is (e.g., RPC)
✓ Location, migration, replication: clients must be able to find their servers
✓ Heterogeneity: hardware, platforms, languages, management
✓ Concurrent execution of processes:
  o Non-determinism, race conditions, synchronization, deadlocks, …
✓ No global clock:
  o There are limits to the accuracy with which the computers in a network can synchronize their clocks
  o Coordination is done by message passing
  o There is no single global notion of the correct time
✓ No global state:
  o No process has knowledge of the current global state of the system
✓ Units may fail independently:
  o Network faults may isolate computers that are still running
  o System failures may not be immediately known

Characteristics of Distributed Systems
✓ Differences between the computers and the ways they communicate are hidden from users.
✓ Users and applications can interact with a distributed system in a consistent and uniform way regardless of location.
✓ Distributed systems should be easy to expand and scale.
✓ A distributed system is normally continuously available, even if there may be partial failures.

Organization and Goals of a Distributed System
To support heterogeneous computers and networks and to provide a single-system view, a distributed system is often organized by means of a layer of software called middleware that extends over multiple machines.
[Figure: a distributed system organized as middleware; note that the middleware layer extends over multiple machines]

Goals of a Distributed System
A distributed system should:
✓ Make resources accessible: easily connect users with resources (printers, computers, storage facilities, data, files, Web pages, ...); reasons: economics, e-commerce, collaboration and exchange of information
✓ Be transparent (distribution transparency): hide the fact that the resources and processes are distributed across multiple computers
✓ Be open
✓ Be scalable

Transparency in a Distributed System
A distributed system that is able to present itself to users and applications as if it were only a single computer system is said to be transparent.
✓ Access transparency: enables local and remote resources to be accessed using identical operations.
✓ Location transparency: enables resources to be accessed without knowledge of their physical or network location (for example, which building or IP address).
✓ Concurrency transparency: enables several processes to operate concurrently using shared resources without interference between them.
✓ Replication transparency: enables multiple instances of resources to be used to increase reliability and performance without knowledge of the replicas by users or application programmers.
✓ Failure transparency: enables the concealment of faults, allowing users and application programs to complete their tasks despite the failure of hardware or software components.
✓ Mobility transparency: allows the movement of resources and clients within a system without affecting the operation of users or programs.
✓ Performance transparency: allows the system to be reconfigured to improve performance as loads vary.
✓ Scaling transparency: allows the system and applications to expand in scale without change to the system structure or the application algorithms.

Openness in a Distributed System
A distributed system should be open (complete and neutral). We need:
✓ Well-defined interfaces
✓ Interoperability: components of different origin can communicate
✓ Portability: components work on different platforms
o Another goal of an open distributed system is that it should be flexible and extensible: it should be easy to configure the system out of different components, to add new components, and to replace existing ones.
o An open distributed system is a system that offers services according to standard rules that describe the syntax and semantics of those services, e.g., protocols in networks.

Standards: a necessity
✓ Standards should allow competition in non-normative areas.
✓ In distributed systems, such services are often specified through interfaces, which are often described using an Interface Definition Language (IDL).
✓ Interface definitions usually specify only syntax: the names of the functions, types of parameters, return values, possible exceptions, ...
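As a rough illustration of what an interface definition captures, the sketch below uses a Python abstract base class in place of a real IDL. Python is not an IDL, and the names FileService, FileNotFound, and InMemoryFileService are hypothetical, chosen only for this example. The abstract class fixes function names, parameter types, return types, and a possible exception; the behavior is left to whichever component implements it.

```python
from abc import ABC, abstractmethod


class FileNotFound(Exception):
    """Hypothetical exception that a file service may raise."""


class FileService(ABC):
    """Hypothetical interface: names, parameter types, return types, exceptions."""

    @abstractmethod
    def read(self, path: str, offset: int, length: int) -> bytes:
        """Return up to `length` bytes starting at `offset`; may raise FileNotFound."""

    @abstractmethod
    def write(self, path: str, offset: int, data: bytes) -> int:
        """Store `data` at `offset` and return the number of bytes written."""


class InMemoryFileService(FileService):
    """One possible implementation; clients depend only on FileService."""

    def __init__(self) -> None:
        self._files: dict[str, bytearray] = {}

    def read(self, path: str, offset: int, length: int) -> bytes:
        if path not in self._files:
            raise FileNotFound(path)
        return bytes(self._files[path][offset:offset + length])

    def write(self, path: str, offset: int, data: bytes) -> int:
        buf = self._files.setdefault(path, bytearray())
        if len(buf) < offset:
            # Pad with zero bytes when writing past the current end of the file.
            buf.extend(b"\x00" * (offset - len(buf)))
        buf[offset:offset + len(data)] = data
        return len(data)
```

A component of different origin could replace InMemoryFileService without changes to its clients, as long as it follows the same interface; that is the interoperability and portability argument made above.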

Scalability in Distributed Systems
A scalable system remains effective when there is a significant increase in the number of resources and the number of users. A distributed system should be scalable in terms of:
✓ Size: adding more users and resources to the system
  o Problem: overloading
✓ Geography: users and resources may be far apart
  o Problem: communication
✓ Administration: the system should be easy to manage even if it spans many administrative organizations
  o Problem: administrative mess

Scalability problems:
Concept                  Example
Centralized services     A single server for all users (mostly for security reasons)
Centralized data         A single on-line telephone book
Centralized algorithms   Doing routing based on complete information

Scaling Techniques
How to solve scaling problems?
✓ The problem is mainly performance, and arises as a result of limitations in the capacity of servers and networks (for geographical scalability)
✓ Three (3) possible solutions:
  1. Hiding communication latencies,
  2. Distribution, and
  3. Replication.

1. Hiding communication latencies
Try to avoid waiting for responses to remote service requests; let the requester do other useful work in the meantime.
✓ i.e., construct requesting applications that use only asynchronous communication instead of synchronous communication; when a reply arrives, the application is interrupted
✓ Good for batch processing and parallel applications, but not for interactive applications
✓ For interactive applications, move part of the job to the client to reduce communication; e.g., filling in a form and checking the entries, such as checking the completeness of mandatory fields
✓ Shipping code to the client is supported in Web applications, originally through Java applets and nowadays typically through JavaScript
[Figure: (a) a server checking the correctness of field entries; (b) a client doing the job]

2. Distribution
Another important scaling technique is distribution.
✓ Distribution involves taking a component, splitting it into smaller parts, and subsequently spreading those parts across the system.
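To make the idea of splitting and spreading concrete, here is a minimal sketch that takes the on-line telephone book mentioned above as centralized data and divides it into parts. Everything in it is an assumption made for illustration: the class PartitionedPhoneBook, the helper part_for, the choice of hashing as the splitting rule, and the use of in-memory dictionaries to stand in for separate servers.

```python
import hashlib


def part_for(name: str, num_parts: int) -> int:
    """Map a name to a part index with a hash that is stable across runs."""
    digest = hashlib.sha256(name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_parts


class PartitionedPhoneBook:
    """Hypothetical phone book whose entries are spread over several parts."""

    def __init__(self, num_parts: int) -> None:
        # Each dictionary stands in for a separate server holding one part.
        self.parts = [dict() for _ in range(num_parts)]

    def add(self, name: str, number: str) -> None:
        self.parts[part_for(name, len(self.parts))][name] = number

    def lookup(self, name: str):
        # Only the part responsible for this name is consulted.
        return self.parts[part_for(name, len(self.parts))].get(name)


book = PartitionedPhoneBook(num_parts=3)
book.add("Abebe", "555-0101")
book.add("Sara", "555-0102")
print(book.lookup("Sara"))
```

No single part holds the whole book, so the load of lookups is divided among the parts; the price is that every client needs the rule (here, part_for) that says which part is responsible for a given name.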
An excellent example of distribution is the Internet Domain Name System (DNS). The DNS name space is hierarchically organized into a tree of domains, which are divided into non-overlapping zones, as shown in the figure below.
[Figure: division of the DNS name space into zones]
The names in each zone are handled by a single name server. Without going into too many details, one can think of each path name as being the name of a host in the Internet, and thus associated with the network address of that host. Basically, resolving a name means returning the network address of the associated host. Consider, for example, the name nl.vu.cs.flits. To resolve this name, it is first passed to the server of zone Z1, which returns the address of the server for zone Z2, to which the rest of the name, vu.cs.flits, can be handed. The server for Z2 will return the address of the server for zone Z3, which is capable of handling the last part of the name and will return the address of the associated host.

3. Replication
✓ Replicate components across a distributed system to increase availability and to balance load, leading to better performance. Replication is decided by the owner of a resource.
✓ Caching (a special form of replication) also reduces communication latency. Caching is decided by the user.
✓ But caching and replication may lead to consistency problems.

Pitfalls when Developing Distributed Systems
Pitfalls arise from false assumptions made by first-time developers of distributed systems. These assumptions concern properties of distributed systems and do not come up in non-distributed applications:
✓ The network is reliable (a false assumption that makes it difficult to achieve failure transparency)
✓ The network is secure
✓ The network is homogeneous
✓ The topology does not change
✓ Latency is zero
✓ Bandwidth is infinite
✓ Transport cost is zero
✓ There is one administrator

Types of Distributed Systems
Three (3) types:
1. Distributed Computing Systems
2. Distributed Information Systems, and
3. Distributed Pervasive / Embedded Systems

1. Distributed Computing Systems
✓ Used for high-performance computing tasks
✓ Two (2) types:
  A. Cluster Computing and
  B. Grid Computing

A. Cluster Computing
✓ A collection of similar (homogeneous) workstations or PCs, closely connected by means of a high-speed LAN
✓ Each node runs the same operating system
✓ Used for parallel programming, in which a single compute-intensive program is run in parallel on multiple machines
[Figure: an example of a cluster computing system]
✓ A master node runs the middleware (containing libraries for parallel programs) and controls the other compute nodes
✓ It allocates tasks
✓ It provides an interface to users, etc.

B. Grid Computing
✓ “Resource sharing and coordinated problem solving in dynamic, multi-institutional virtual organizations” (Ian Foster)
✓ High degree of heterogeneity: no assumptions are made concerning hardware, operating systems, networks, administrative domains, security policies, etc.
✓ Globus is a software system for Grid Computing

2. Distributed Information Systems
✓ Problem: many networked applications with a problem of interoperability
✓ At the lowest level: wrap a number of requests into a single larger request and have it executed as a distributed transaction; all or none of the requests would be executed.
✓ A further question is how to let applications communicate directly with each other, i.e., Enterprise Application Integration (EAI).

Transaction Processing Systems
Consider database applications.
✓ Special primitives are required to program transactions, supplied either by the underlying distributed system or by the language runtime system.
✓ The exact list of primitives depends on the type of application; procedure calls, ordinary statements, etc. can also be included.
▪ e.g., assume the following banking operation:
  ▪ withdraw an amount x from account 1
  ▪ deposit the amount x to account 2
✓ What happens if there is a problem after the first activity is carried out?
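The banking operation above can be tried out with a small sketch that uses Python's standard sqlite3 module on a single in-memory database. This is only an approximation under stated assumptions: it is a local transaction rather than a distributed one, and the accounts table, the transfer and balances helpers, and the fail_midway switch are all hypothetical names introduced for the example. It shows the withdrawal being undone when a failure is simulated before the deposit.

```python
import sqlite3


def transfer(conn, src, dst, amount, fail_midway=False):
    """Withdraw from src and deposit to dst as one all-or-nothing unit."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
            if fail_midway:
                # Simulated problem after the first activity is carried out.
                raise RuntimeError("failure after the withdrawal")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
        return True
    except RuntimeError:
        return False


def balances(conn):
    """Return a dictionary mapping account id to current balance."""
    return dict(conn.execute("SELECT id, balance FROM accounts ORDER BY id"))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()

transfer(conn, 1, 2, 30, fail_midway=True)
print(balances(conn))  # the withdrawal was rolled back
transfer(conn, 1, 2, 30)
print(balances(conn))  # both updates were applied together
```

In a distributed transaction the two accounts could live on different servers, and the commit or rollback decision would have to be coordinated between them; the sketch leaves that coordination out entirely.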
✓ Group the two operations into one transaction: either both are carried out or neither is. We need a way to roll back when a transaction is not completed.

Properties of Transactions (ACID)
1. Atomic: to the outside world, the transaction happens indivisibly; a transaction either happens completely or not at all; intermediate states are not seen by other processes.
2. Consistent: the transaction does not violate system invariants; e.g., in an internal transfer in a bank, the amount of money in the bank must be the same as it was before the transfer (the law of conservation of money); this may be violated for a brief period of time, but the violation is not visible to other processes.
3. Isolated or Serializable: concurrent transactions do not interfere with each other; if two or more transactions are running at the same time, the final result must look as though all transactions ran sequentially in some order.
4. Durable: once a transaction commits, the changes are permanent.

3. Distributed Pervasive Systems
There are also mobile and embedded computing devices, which are small, battery-powered, mobile, and have a wireless connection.
Three (3) requirements for pervasive applications:
1. Embrace contextual changes: a device is aware that its environment may change all the time, e.g., by changing its network access point.
2. Encourage ad hoc composition: devices are used in different ways by different users.
3. Recognize sharing as the default: devices join a system to access or provide information. This calls for means to easily read, store, manage, and share information.
Examples of pervasive systems:
✓ Home systems that integrate consumer electronics
✓ Electronic health care systems to monitor the well-being of individuals
✓ Sensor networks, …

~ End of Chapter-1 ~