Summary

This document presents an overview of process and thread concepts, focusing on their implementation and related issues. It discusses the use of threads for communication and parallelism in distributed and non-distributed systems. The concepts of concurrent programming, thread implementation techniques, and examples in the context of client and server organizations are thoroughly explored.

Full Transcript

Chapter 3: Processes
- Threads and their implementation
- Client and server design issues
- Object servers and adaptors
- Code migration
- Software agents and agent technology
- Agent communication languages

Introduction
- communication takes place between processes
- a process is a program in execution
- from the OS perspective, management and scheduling of processes are important
- other important issues arise in distributed systems:
  - multithreading to enhance performance by overlapping communication and local processing
  - how clients and servers are organized, and server design issues
  - process or code migration to improve performance, reduce communication, exploit parallelism, and dynamically configure clients and servers
  - software agents that perform a task through cooperation, and agent technology

3.1 Threads and their Implementation
- how are processes and threads related?
- process tables or process control blocks (PCBs) are used to keep track of processes
- there are usually many processes executing concurrently
- processes should not interfere with each other; the sharing of resources among processes is transparent
- this concurrency transparency has a high price: allocating resources for a new process and context switching take time
- a thread also executes independently of other threads, but it does not need a high degree of concurrency transparency, which results in better performance
- threads can be used in both distributed and nondistributed systems

Threads in Nondistributed Systems
- a process has an address space (containing program text and data) and a single thread of control, as well as other resources such as open files, child processes, accounting information, etc.

[Figure: three processes, each with one thread, versus one process with three threads]

- each thread has its own program counter, registers, stack, and state; but all threads of a process share the address space, global variables, and other resources such as open files
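The split between per-thread state and shared state can be illustrated with a small sketch. It is only an illustration in Python, not part of the original slides; the names `counter`, `worker`, and `run` are made up for this example. Each thread keeps its own local variable (on its own stack), while all threads update one global variable in the shared address space, guarded by a mutex.

```python
import threading

counter = 0                  # global variable: shared by all threads of the process
lock = threading.Lock()      # mutex protecting the shared counter


def worker(n_increments, results, index):
    """Count locally (per-thread state) and globally (shared state)."""
    global counter
    local_count = 0          # local variable: lives on this thread's own stack
    for _ in range(n_increments):
        local_count += 1
        with lock:           # the read-modify-write on shared data needs the mutex
            counter += 1
    results[index] = local_count


def run(n_threads=3, n_increments=1000):
    """Start n_threads workers and return (shared total, per-thread counts)."""
    global counter
    counter = 0
    results = [0] * n_threads
    threads = [
        threading.Thread(target=worker, args=(n_increments, results, i))
        for i in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter, results


if __name__ == "__main__":
    total, per_thread = run()
    print("shared counter:", total)
    print("per-thread counts:", per_thread)
```

The per-thread counts stay independent, while the shared counter reflects the work of all threads; without the lock, concurrent updates to `counter` could be lost.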
- threads take turns running
- threads allow multiple executions to take place in the same process environment; this is called multithreading

Thread Usage: why do we need threads?
- e.g., a word processor has different parts for
  - interacting with the user
  - formatting the page as soon as changes are made
  - timed saving (for auto recovery)
  - spelling and grammar checking, etc.
1. Simplifying the programming model, since many activities are going on at once, more or less independently
2. Threads are easier to create and destroy than processes since they do not have any resources attached to them
3. Performance improves by overlapping activities when there is a lot of I/O, i.e., to avoid blocking when waiting for input or doing calculations, say in a spreadsheet
4. Real parallelism is possible in a multiprocessor system
- finer granularity, in the form of multiple threads per process rather than multiple processes, provides better performance and makes it easier to build distributed applications
- in nondistributed systems, threads sharing data can be used instead of processes to avoid the context-switching overhead of interprocess communication (IPC)

[Figure: context switching as the result of IPC]

Thread Implementation
- threads are usually provided in the form of a thread package
- the package contains operations to create and destroy threads, as well as operations on synchronization variables such as mutexes and condition variables
- there are two approaches to constructing a thread package
a. construct a thread library that is executed entirely in user mode (the OS is not aware of threads)
  - cheap to create and destroy threads; just allocate and free memory
  - context switching can be done in a few instructions; store and reload only the CPU register values
  - disadvantage: invocation of a blocking system call blocks the entire process to which the thread belongs, and therefore all other threads in that process
b.
implement them in the OS's kernel
  - let the kernel be aware of threads and schedule them
  - thread operations such as creation and deletion are expensive since each requires a system call
- solution: use a hybrid form of user-level and kernel-level threads, called a lightweight process (LWP)
  - an LWP runs in the context of a single (heavyweight) process, and there can be several LWPs per process
  - the system also offers a user-level thread package for operations such as creating and destroying threads and for thread synchronization (mutexes and condition variables)
  - the thread package can be shared by multiple LWPs

[Figure: combining kernel-level lightweight processes and user-level threads]

Threads in Distributed Systems
- threads allow blocking system calls without blocking the entire process; this means multiple logical connections (communications) can be maintained at the same time

Multithreaded Clients
- consider a Web browser; fetching the different parts of a page can be implemented as separate threads, each opening its own TCP connection to the server
- each thread can display its results as soon as it gets its part of the page
- parallelism can also be achieved with replicated servers, since each thread's request can be forwarded to a separate replica

Multithreaded Servers
- servers can be constructed in three ways
a. single-threaded process
  - it gets a request, examines it, and carries it out to completion before getting the next request
  - the server is idle while waiting for a disk read, i.e., system calls are blocking; other requests cannot be handled in the meantime
b. threads
  - threads are more important for implementing servers
  - e.g., a file server
  - the dispatcher thread reads incoming requests for file operations from clients and passes each one to an idle worker thread
  - the worker thread performs a blocking disk read, during which another thread may continue, say the dispatcher or another worker thread

[Figure: a multithreaded server organized in a dispatcher/worker model]

c.
finite-state machine
  - used if threads are not available
  - it gets a request, examines it, and tries to fulfill it from the cache, else it sends a request to the file system; but instead of blocking, it records the state of the current request and proceeds to the next request
  - but it is hard to program
- Summary

Model                     Characteristics
Single-threaded process   No parallelism, blocking system calls
Threads (thread only)     Parallelism, blocking system calls
Finite-state machine      Parallelism, nonblocking system calls

[Table: three ways to construct a server]

- read about virtualization (the illusion of having more resources than we actually have): pages 79 - 82

3.2 Anatomy of Clients
- two issues: user interfaces and client-side software for distribution transparency
a. User Interfaces
  - the goal is to create a convenient environment for the interaction of a human user with a remote server; e.g., mobile phones with simple displays and a set of keys
  - GUIs are most commonly used
  - the X Window System (or simply X) as an example
    - it has the X kernel: the part of the OS that controls the terminal (monitor, keyboard, pointing device such as a mouse) and is hardware dependent
    - the X kernel contains all terminal-specific device drivers; its interface is made available to applications through a library called Xlib

[Figure: the basic organization of the X Window System]

    - the window manager is a special application that is in charge of the "look and feel" of the screen presented to users
  - for controlling the client remotely, compression may be important to reduce bandwidth and latency; but decompression by the client is then required
b. Client-Side Software for Distribution Transparency
  - in addition to the user interface, parts of the processing and data levels of a client-server application are executed at the client side
  - an example is embedded client software for ATMs, cash registers, etc.
  - moreover, client software can also include components to achieve distribution transparency
  - e.g., replication transparency
    - assume a distributed system with replicated servers; the client proxy can send a request to each replica, and client-side software can transparently collect all responses and pass a single return value to the client application

[Figure: transparent replication of a server using a client-side solution]

  - location, migration, and relocation transparency can also be handled using naming (see Chapter 5) and client cooperation; e.g., when a server changes location, the client software can be informed without the user knowing
  - access transparency and failure transparency in communication (keep on trying) can also be achieved using client-side software

3.3 Servers and Design Issues
1. General Design Issues
- How to organize servers?
- Where do clients contact a server?
- Whether and how a server can be interrupted
- Whether or not the server is stateless
a. How to organize servers?
  - Iterative server
    - the server itself handles the request and returns the result
  - Concurrent server
    - it passes a request to a separate process or thread and waits for the next incoming request; e.g., a multithreaded server, or a server that forks a new process as is done in Unix
b. Where do clients contact a server?
  - at endpoints (ports) on the machine where the server is running; each server listens to a specific endpoint
  - how do clients know the endpoint of a service?
    - globally assign endpoints for well-known services; e.g.
FTP is on TCP port 21, HTTP is on TCP port 80
    - for services that do not require preassigned endpoints, an endpoint can be dynamically assigned by the local OS
  - IANA (Internet Assigned Numbers Authority) ranges
    - IANA divides the port numbers into three ranges
    - Well-known ports (0 to 1023): assigned and controlled by IANA for standard services; e.g., DNS uses port 53
    - Registered ports (1024 to 49151): not assigned or controlled by IANA; they can only be registered with IANA to prevent duplication; e.g., MySQL uses port 3306
    - Dynamic or ephemeral ports (49152 to 65535): neither controlled nor registered by IANA
  - how can the client know endpoints that are not well known? two approaches
    i. have a daemon running and listening to a well-known endpoint; it keeps track of all endpoints of services on the collocated server
      - the client first contacts the daemon, which provides it with the endpoint; the client then contacts the specific server

[Figure: client-to-server binding using a daemon]

    ii. use a superserver (as in UNIX) that listens to all endpoints and then forks a process to take care of each request; this avoids having a lot of servers running simultaneously with most of them idle

[Figure: client-to-server binding using a superserver]

c. Whether and how a server can be interrupted
  - for instance, a user may want to interrupt a file transfer, maybe because it was the wrong file
  - let the client exit the client application; this breaks the connection to the server, and the server tears down the connection assuming that the client has crashed, or
  - let the client send out-of-band data, i.e., data to be processed by the server before any other data from the client; the server may listen on a separate control endpoint, or the data may be sent on the same connection as urgent data, as in TCP
d.
Whether or not the server is stateless
  - a stateless server does not keep information on the state of its clients; for instance, a Web server
  - soft state: a server promises to maintain state for a limited time, e.g., to keep a client informed about updates; after the time expires, the client has to poll
  - a stateful server maintains information about its clients; for instance, a file server that allows a client to keep a local copy of a file and perform update operations on it
  - this may improve performance, but it requires a recovery procedure in case of a server crash, which is not the case for a stateless server
2. Server Clusters
- a server cluster is a collection of machines connected through a network (normally a LAN with high bandwidth and low latency), where each machine runs one or more servers
- it is logically organized into three tiers

[Figure: the general organization of a three-tiered server cluster]

Distributed Servers
- the problem with a server cluster is that when the logical switch (the single access point) fails, the cluster becomes unavailable
- hence, several access points can be provided, whose addresses are publicly available, leading to a distributed server
- e.g., the DNS can return several addresses for the same host name

3.4 Code Migration
- so far, communication was concerned with passing data
- we may also pass programs, even while they are running and in heterogeneous systems
- code migration also involves moving data: when a program migrates while running, its status, pending signals, and other parts of its environment such as the stack and the program counter also have to be moved

Reasons for Migrating Code
- to improve performance: move processes from heavily loaded to lightly loaded machines (load balancing)
- to reduce communication: move a client application that performs many database operations to the server if the database resides on the server; then send only the results to the client
- to exploit parallelism (for nonparallel programs): e.g., copies of
a mobile program (called a mobile agent, or a crawler in search engines) moving from site to site to search the Web
- to gain flexibility by dynamically configuring distributed systems, instead of deciding in advance which parts of a multitiered client-server application are to be run where

[Figure: the principle of dynamically configuring a client to communicate with a server; the client first fetches the necessary software, and then invokes the server]

Models for Code Migration
- code migration doesn't only mean moving code; in some cases, it also means moving the execution status of a program, pending signals, and other parts of the execution environment
- a process consists of three segments: the code segment (the set of instructions), the resource segment (references to external resources such as files, printers, ...), and the execution segment (which stores the current execution state of the process, such as private data, the stack, and the program counter)
- alternatives for code migration
  - weak versus strong mobility
  - is it sender- or receiver-initiated?
  - is the code executed at the target process or in a separate process (for weak mobility); is the process migrated or cloned (for strong mobility)?

Weak Mobility
- transfer only the code segment and maybe some initialization data; in this case a program always starts from its initial state, e.g.
Java Applets
- execution can be by
  - the target process (in its own address space, as with Java Applets), but the target process and local resources must be protected (security), or
  - a separate process; local resources must still be protected (security)
- in addition to security, the other issue is portability in heterogeneous systems

Strong Mobility (or process migration)
- transfer the code and execution segments; this makes it possible to migrate a process in execution: stop execution, move it, and then resume execution from where it stopped
- can also be supported by remote cloning: an exact copy of the original process is created on a different machine and executes in parallel to the original process; UNIX does this by forking a child process
- migration can be
  - sender-initiated: started by the machine where the code resides or is currently running; e.g., uploading programs to a server (which may need authentication or require that the client be a registered one), or crawlers sent out to index Web pages
  - receiver-initiated: started by the target machine; e.g., Java Applets; easier to implement
- in a client-server model, receiver-initiated migration (by the server) is easier to implement since security issues are minimized; if clients are allowed to send code (sender-initiated), the server must know them since they may access resources such as the disk on the server

Summary of models of code migration

[Figure: alternatives for code migration]

Migration and Local Resources
- how to migrate the resource segment
- it is not always possible to move a resource; e.g., a reference to a TCP port held by a process to communicate with other processes; the process should get a new port at the destination

Types of Process-to-Resource Bindings
- Binding by identifier (the strongest): a resource is referred to by its identifier; the process requires exactly that resource; e.g., a URL referring to a Web page, or an FTP server referred to by its Internet (IP) address
- Binding by value (weaker): only the value of a resource is needed; in this case another
resource can provide the same value; e.g., the standard libraries of programming languages such as C or Java, which are normally locally available, but whose location in the file system may vary from site to site
- Binding by type (weakest): a process needs a resource of a specific type; e.g., references to local devices such as monitors, printers, ...
- in migrating code, the above bindings cannot change, but the references to resources can
- how can a reference be changed? it depends on whether the resource can be moved along with the code, i.e., on the resource-to-machine binding

Types of Resource-to-Machine Bindings
- Unattached resources: can be easily moved with the migrating program (such as data files associated with the program)
- Fastened resources: such as local databases and complete Web sites; moving or copying may be possible, but is very costly
- Fixed resources: intimately bound to a specific machine or environment, such as local devices; they cannot be moved
- we have nine combinations to consider (three process-to-resource bindings times three resource-to-machine bindings)

Migration in Heterogeneous Systems
- distributed systems are constructed on a heterogeneous collection of platforms, each with its own OS and machine architecture
- heterogeneity problems are similar to those of portability
- migration is easier in some languages
  - for scripting languages, the source code is interpreted
  - for Java, the compiler generates intermediate code for a virtual machine
- in weak mobility
  - since there is no runtime information to transfer, it is enough to compile the source code for each potential platform
- in strong mobility
  - it is difficult to transfer the execution segment since it may contain platform-dependent information such as register values; there are some suggested solutions

3.5 Software Agents and Agent Technology
- a software agent is an autonomous unit (process) capable of performing a task in collaboration (i.e., it communicates) with other, possibly remote, agents
- it is capable of reacting to, and initiating changes (proactive) in, its environment, possibly with users
and other agents
- a collaborative agent is an agent that forms part of a multiagent system, in which agents seek to achieve some common goal through collaboration; e.g., in arranging meetings
- a mobile agent is an agent that can move between different machines; e.g., to retrieve information distributed across a large heterogeneous network such as the Internet
- an interface agent is an agent that assists an end user in the use of one or more applications and has learning capabilities (it is adaptive); e.g., those that bring buyers and sellers together
- an information agent is an agent that manages information from different sources, for example by ordering and filtering it; e.g., an e-mail agent that filters unwanted mail out of its owner's mailbox, or automatically distributes incoming mail into appropriate subject-specific mailboxes

Property        Common to all agents?   Description
Autonomous      Yes                     Can act on its own
Reactive        Yes                     Responds timely to changes in its environment
Proactive       Yes                     Initiates actions that affect its environment
Communicative   Yes                     Can exchange information with users and other agents
Continuous      No                      Has a relatively long lifespan
Mobile          No                      Can migrate from one site to another
Adaptive        No                      Capable of learning

[Table: some important properties by which different types of agents can be distinguished]

Agent Technology
- we need support to develop agent systems; for instance, a middleware consisting of generally used components of agents in distributed systems
- FIPA (Foundation for Intelligent Physical Agents) develops a general model for software agents
- agent platform: where agents are registered and operate; it provides basic services such as creating and deleting agents, locating agents, interagent communication, ...; it may include the following
  - agent management: keeps track of the agents for the associated platform; provides facilities for creating and deleting agents, and for looking up the current endpoint of a specific agent by providing a naming
service to globally and uniquely identify an endpoint
  - local directory service: where agents can look up what other agents on the platform have to offer
  - agent communication channel (ACC): lets agents communicate by exchanging messages in multiagent systems

[Figure: the general model of an agent platform]

- please visit http://www.fipa.org/ for more information on the activities of FIPA

Agent Communication Languages (ACL)
- ACL: an application-level protocol through which communication between agents takes place
- a message has a purpose and content
- purpose
  - to request a specific service
  - to respond to a request
  - to inform about an event
  - to propose during negotiation

Message purpose    Description                                  Message content
INFORM             Inform that a given proposition is true      Proposition
QUERY-IF           Query whether a given proposition is true    Proposition
QUERY-REF          Query for a given object                     Expression
CFP                Ask for a proposal                           Proposal specifics
PROPOSE            Provide a proposal                           Proposal
ACCEPT-PROPOSAL    Tell that a given proposal is accepted       Proposal ID
REJECT-PROPOSAL    Tell that a given proposal is rejected       Proposal ID
REQUEST            Request that an action be performed          Action specification
SUBSCRIBE          Subscribe to an information source           Reference to source

[Table: examples of different message types in the FIPA ACL, giving the purpose of a message along with a description of the actual message content]

- ACL messages consist of a header and the actual content
- the header has different fields:
  - a field to identify the purpose of the message,
  - fields to identify the sender and the receiver,
  - a field to identify the language or encoding scheme for the content (since an ACL does not prescribe the format or language in which the message content is expressed),
  - a field to identify a standardized mapping of symbols to their meaning, called an ontology (needed if there is no common understanding of how to interpret the data)

Group project
1. Write an RMI implementation in Java that performs the four basic mathematical operations on a different machine, accepting the input from the user. (15 points)
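
The ACL message layout described in section 3.5 (a header carrying the purpose, sender, receiver, content language, and ontology, followed by the content) can be sketched roughly as follows. This is an illustrative Python model only, not the FIPA wire format: the class `AclMessage`, the helper `reply`, and all field names and example values are assumptions made for this sketch; only the purpose names come from the message-type table above.

```python
from dataclasses import dataclass

# Message purposes taken from the table of FIPA ACL message types above.
PURPOSES = {
    "INFORM", "QUERY-IF", "QUERY-REF", "CFP", "PROPOSE",
    "ACCEPT-PROPOSAL", "REJECT-PROPOSAL", "REQUEST", "SUBSCRIBE",
}


@dataclass(frozen=True)
class AclMessage:
    """Hypothetical in-memory model of an ACL message (header plus content)."""
    purpose: str    # header: what the message is for
    sender: str     # header: who sends it
    receiver: str   # header: who should receive it
    language: str   # header: language/encoding of the content
    ontology: str   # header: mapping of symbols to their meaning
    content: str    # the actual content, not interpreted by the ACL itself

    def __post_init__(self):
        # Reject purposes that are not in the table above.
        if self.purpose not in PURPOSES:
            raise ValueError(f"unknown message purpose: {self.purpose}")


def reply(msg, purpose, content):
    """Build an answer: swap sender and receiver, keep language and ontology."""
    return AclMessage(purpose, msg.receiver, msg.sender,
                      msg.language, msg.ontology, content)


if __name__ == "__main__":
    cfp = AclMessage("CFP", "buyer-agent", "seller-agent",
                     "plain-text", "book-trade", "price for one textbook?")
    offer = reply(cfp, "PROPOSE", "20 USD")
    print(offer)
```

Keeping the language and ontology in the header lets the receiver decide how to interpret the content, which the ACL itself deliberately leaves open.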
