Messaging Lecture Notes Fall 2024
Document Details
Uploaded by BuoyantPeony
2024
Balkir Kayaalti
Tags
Summary
These lecture notes cover messaging and interprocess communication (IPC) concepts, including information sharing, computation speedup, modularity, and different communication models like shared memory and message passing. Different buffer types and their characteristics are also discussed.
Full Transcript
Messaging BALKIR KAYAALTI Fall 2024 Interprocess Communication Information sharing. Since several applications may be interested in the same piece of information (for instance, copying and pasting), we must provide an environment to allow concurrent access to such information. Comput...
Messaging BALKIR KAYAALTI Fall 2024 Interprocess Communication Information sharing. Since several applications may be interested in the same piece of information (for instance, copying and pasting), we must provide an environment to allow concurrent access to such information. Computation speedup. If we want a particular task to run faster, we must break it into subtasks, each of which will be executing in parallel with the others. Notice that such a speedup can be achieved only if the computer has multiple processing cores. Modularity. We may want to construct the system in a modular fashion, dividing the system functions into separate processes or threads, Cooperating processes require an interprocess communication (IPC) mechanism that will allow them to exchange data— that is, send data to and receive data from each other. There are two fundamental models of interprocess communication: shared memory and message passing Communication models Two types of buffers can be used. The unbounded buffer places no practical limit on the size of the buffer. The consumer may have to wait for new items, but the producer can always produce new items. The bounded buffer assumes a fixed buffer size. In this case, the consumer must wait if the buffer is empty, and the producer must wait if the buffer is full. #define BUFFER SIZE 10 typedef struct {... } item; item buffer[BUFFER SIZE]; int in = 0; int out = 0; } item next produced; while (true) { while (((in + 1) % BUFFER SIZE) == out) ; buffer[in] = next produced; in = (in + 1) % BUFFER SIZE; } IPC item next consumed; while (true) { while (in == out) ; next consumed = buffer[out]; out = (out + 1) % BUFFER SIZE; } A message-passing facility provides at least two operations: send(message) and receive(message) Direct or indirect communication Synchronous or asynchronous communication Automatic or explicit buffering 3.6.1 Naming Processes that want to communicate must have a way to refer to each other. They can use either direct or indirect communication. Under direct communication, each process that wants to communicate must explicitly name the recipient or sender of the communication. In this scheme, the send() and receive() primitives are defined as: send(P, message)—Send a message to process P. receive(Q, message)—Receive a message from process Q. send(P, message)—Send a message to process P. receive(Q, message)—Receive a message from process Q. A communication link in this scheme has the following properties: A link is established automatically between every pair of processes that want to communicate. The processes need to know only each other’s identity to communicate. A link is associated with exactly two processes. Between each pair of processes, there exists exactly one link. send(P, message)—Send a message to process P. receive(id, message)—Receive a message from any process. The variable id is set to the name of the process with which communication has taken place. Mailbox send(A, message)—Send a message to mailbox A. receive(A, message)—Receive a message from mailbox A. In this scheme, a communication link has the following properties: A link is established between a pair of processes only if both members of the pair have a shared mailbox. A link may be associated with more than two processes. Between each pair of communicating processes, a number of different links may exist, with each link corresponding to one mailbox. Allow at most one process at a time to execute a receive() operation. Allow the system to select arbitrarily which process will receive the message (that is, either P2 or P3, but not both, will receive the message). The system may define an algorithm for selecting which process will receive the message (for example, round robin, where processes take turns receiving messages). The system may identify the receiver to the sender. Blocking send. The sending process is blocked until the message is received by the receiving process or by the mailbox. Nonblocking send. The sending process sends the message and resumes operation. Blocking receive. The receiver blocks until a message is available. Nonblocking receive. The receiver retrieves either a valid message or a null. message next produced; while (true) { send(next produced); } Buffering Whether communication is direct or indirect,messages exchanged by communicating processes reside in a temporary queue. Basically, such queues can be implemented in three ways: Zero capacity. The queue has a maximum length of zero; thus, the link cannot have any messageswaiting in it. In this case, the sendermust block until the recipient receives the message. Bounded capacity. The queue has finite length n; thus, atmost n messages can reside in it. If the queue is not full when a new message is sent, the message is placed in the queue (either the message is copied or a pointer to the message is kept), and the sender can continue execution without message next consumed; while (true) { receive(next consumed); } Figure 3.15 IPC System examples POSIX shared memory fd = shm open(name, O CREAT | O RDWR, 0666); The first parameter specifies the name of the shared-memory object. Processes that wish to access this shared memory must refer to the object by this name. The subsequent parameters specify that the shared-memory object is to be cre- ated if it does not yet exist (O CREAT) and that the object is open for reading and writing (O RDWR). The last parameter establishes the file-access permissions of the shared-memory object. A successful call to shm open() returns an integer file descriptor for the shared-memory object. Once the object is established, the ftruncate() function is used to configure the size of the object in bytes. The call ftruncate(fd, 4096); Finally, the mmap() function establishes a memory-mapped file containing the shared-memory object. It also returns a pointer to the memory-mapped file that is used for accessing the shared-memory object. #include #include #include #include #include #include #include int main() { const int SIZE = 4096; const char *name = "OS"; const char *message 0 = "Hello"; const char *message 1 = "World!"; int fd; char *ptr; fd = shm open(name,O CREAT | O RDWR,0666); ftruncate(fd, SIZE); ptr = (char *) mmap(0, SIZE, PROT READ | PROT WRITE, MAP SHARED, fd, 0); sprintf(ptr,"%s",message 0); ptr += strlen(message 0); sprintf(ptr,"%s",message 1); ptr += strlen(message 1); return 0; } #include #include #include #include #include #include int main() { const int SIZE = 4096; const char *name = "OS"; int fd; char *ptr; fd = shm open(name, O RDONLY, 0666); ptr = (char *) mmap(0, SIZE, PROT READ | PROT WRITE, MAP SHARED, fd, 0); printf("%s",(char *)ptr); shm unlink(name); return 0; } Mach message passing As an example of message passing, we next consider the Mach operating system. Mach was especially designed for distributed systems, but was shown to be suitable for desktop and mobile systems as well, as evidenced by its inclusion in the macOS and iOS operating systems, as discussed in Chapter 2. The Mach kernel supports the creation and destruction of multiple tasks, which are similar to processes but have multiple threads of control and fewer associated resources. Most communication in Mach — including all inter- task communication — is carried out by messages. Messages are sent to, and received from, mailboxes, which are called ports in Mach. Ports are finite in size and unidirectional; for two-way communication, a message is sent to one port, and a response is sent to a separate reply port. Each port may have multiple senders, but only one receiver. Mach uses ports to represent resources such as tasks, threads, memory, and processors, while message passing provides an object-oriented approach for interacting with these system resources and services. Message passing may occur between any two ports on the same host or on separate hosts on a distributed system. Associated with each port is a collection of port rights that identify the capabilities necessary for a task to interact with the port. For example, for a task to receive a message from a port, it must have the capability MACH PORT RIGHT RECEIVE for that port. The task that creates a port is that port’s owner, and the owner is the only task that is allowed to receive messages from that port. A port’s owner may also manipulate the capabilities for a port. This is most commonly done in establishing a reply port. For example, assume that task T1 owns port P1, and it sends a message to port P2, which is owned by task T2. If T1 expects to receive a reply from T2, it must grant T2 the right MACH PORT RIGHT SEND for port P1. Ownership of port rights is at the task level, which means that all threads belonging to the same task share the same port rights. Thus, two threads belonging to the same task can easily communicate by exchanging messages through the per-thread port associated with each thread. When a task is created, two special ports— the Task Self port and the Notify port — are also created. The kernel has receive rights to the Task Self port, which allows a task to send messages to the kernel. The kernel can send notification of event occurrences to a task’s Notify port (to which, of course, the task has receive rights). mach port t port; // the name of the port right mach port allocate( mach task self(), // a task referring to itself MACH PORT RIGHT RECEIVE, // the right for this port &port); // the name of the port right Each task also has access to a bootstrap port, which allows a task to register a port it has created with a system-wide bootstrap server. Once a port has been registered with the bootstrap server, other tasks can look up the port in this registry and obtain rights for sending messages to the port. The queue associated with each port is finite in size and is initially empty. As messages are sent to the port, the messages are copied into the queue. All messages are delivered reliably and have the same priority. Mach guarantees that multiple messages from the same sender are queued in first-in, first- out (FIFO) order but does not guarantee an absolute ordering. For instance, messages from two senders may be queued in any order. Mach messages contain the following two fields: A fixed-size message header containing metadata about the message, including the size of the message as well as source and destination ports. Commonly, the sending thread expects a reply, so the port name of the source is passed on to the receiving task, which can use it as a “return address” in sending a reply. A variable-sized body containing data. Messages may be either simple or complex. A simple message contains ordinary, unstructured user data that are not interpreted by the kernel. A complex message may contain pointers to memory locations containing data (known as “out-of-line” data) or may also be used for transferring port rights to another task. Out-of-line data pointers are especially useful when a message must pass large chunks of data. A simple message would require copying and packaging the data in the message; out-of- line data transmission requires only a pointer that refers to the memory location where the data are stored. The function mach msg() is the standard API for both sending and receiving messages. The value of one of the function’s parameters — either MACH SEND MSG or MACH RCV MSG— indicates if it is a send or receive operation. We now illustrate how it is used when a client task sends a simple message to a server task. Assume there are two ports— client and server— associated with the client and server tasks, respectively. The code in Figure 3.18 shows the client task constructing a header and sending a message to the server, as well as the server task receiving the message sent from the client. The mach msg() function call is invoked by user programs for performing message passing. mach msg() then invokes the function mach msg trap(), which is a system call to the Mach kernel. Within the kernel, mach msg trap() next calls the function mach msg overwrite trap(), which then handles the actual passing of the message. #include struct message { mach msg header t header; int data; }; mach port t client; mach port t server; struct message message; // construct the header message.header.msgh size = sizeof(message); message.header.msgh remote port = server; message.header.msgh local port = client; // send the message mach msg(&message.header, // message header MACH SEND MSG, // sending a message sizeof(message), // size of message sent 0, // maximum size of received message - unnecessary MACH PORT NULL, // name of receive port - unnecessary MACH MSG TIMEOUT NONE, // no time outs MACH PORT NULL // no notify port ); 1. Wait indefinitely until there is room in the queue. 2. Wait at most n milliseconds. 3. Do not wait at all but rather return immediately. 4. Temporarily cache a message. Here, a message is given to the operating system to keep, even though the queue to which that message is being sent is full. When the message can be put in the queue, a notification message is sent back to the sender. Only one message to a full queue can be pending at any time for a given sending thread. Windows Connection request Connection Handle Port Handle Client Communication Port Server Handle Communication Port Shared Section Object (> 256 bytes) When an ALPC channel is created, one of three message-passing techniques is chosen: 1. For small messages (up to 256 bytes), the port’s message queue is used as intermediate storage, and the messages are copied from one process to the other. 2. Larger messages must be passed through a section object, which is a region of shared memory associated with the channel. 3. When the amount of data is too large to fit into a section object, an API is available that allows server processes to read and write directly into the address space of a client. Pipes 1. Pipes A pipe acts as a conduit allowing two processes to communicate. Pipes were one of the first IPC mechanisms in early UNIX systems. They typically pro- vide one of the simpler ways for processes to communicate with one another, although they also have some limitations. In implementing a pipe, four issues must be considered: 1. Does the pipe allow bidirectional communication, or is communication unidirectional? 2. If two-way communication is allowed, is it half duplex (data can travel only one way at a time) or full duplex (data can travel in both directions at the same time)? 3. Must a relationship (such as parent – child) exist between the communi- cating processes? 4. Can the pipes communicate over a network, or must the communicating processes reside on the same machine? Ordinary pipes allow two processes to communicate in standard producer– consumer fashion: the producer writes to one end of the pipe (the write end) and the consumer reads from the other end (the read end). As a result, ordinary pipes are unidirectional, allowing only one-way communication. If two-way communication is required, two pipes must be used, with each pipe sending data in a different direction. We next illustrate constructing ordinary pipes on both UNIX and Windows systems. In both program examples, one process writes the message Greetings to the pipe, while the other process reads this message from the pipe. On UNIX systems, ordinary pipes are constructed using the function pipe(int fd[]) Parent Child fd fd fd pipe fd #include #include #include #include #define BUFFER SIZE 25 #define READ END 0 #define WRITE END 1 int main(void) { char write msg[BUFFER SIZE] = "Greetings"; char read msg[BUFFER SIZE]; int fd; pid t pid; if (pipe(fd) == -1) { fprintf(stderr,"Pipe failed"); return 1; } pid = fork(); if (pid < 0) { fprintf(stderr, "Fork Failed"); return 1; } if (pid > 0) { close(fd[READ END]); write(fd[WRITE END], write msg, strlen(write msg)+1); close(fd[WRITE END]); } else { close(fd[WRITE END]); read(fd[READ END], read msg, BUFFER SIZE); printf("read %s",read msg); close(fd[READ END]); } return 0; } Windows pipe #include #include #include #define BUFFER SIZE 25 int main(VOID) { HANDLE ReadHandle, WriteHandle; STARTUPINFO si; PROCESS INFORMATION pi; char message[BUFFER SIZE] = "Greetings"; DWORD written; SECURITY ATTRIBUTES sa = {sizeof(SECURITY ATTRIBUTES),NULL,TRUE}; ZeroMemory(&pi, sizeof(pi)); if (!CreatePipe(&ReadHandle, &WriteHandle, &sa, 0)) { fprintf(stderr, "Create Pipe Failed"); return 1; } GetStartupInfo(&si); si.hStdOutput = GetStdHandle(STD OUTPUT HANDLE); si.hStdInput = ReadHandle; si.dwFlags = STARTF USESTDHANDLES; SetHandleInformation(WriteHandle, HANDLE FLAG INHERIT, 0); CreateProcess(NULL, "child.exe", NULL, NULL, TRUE, 0, NULL, NULL, &si, &pi); CloseHandle(ReadHandle); if (!WriteFile(WriteHandle, message,BUFFER SIZE,&written,NULL)) fprintf(stderr, "Error writing to pipe."); CloseHandle(WriteHandle); WaitForSingleObject(pi.hProcess, INFINITE); CloseHandle(pi.hProcess); CloseHandle(pi.hThread); return 0; } Windows pipe #include #include #define BUFFER SIZE 25 int main(VOID) { HANDLE Readhandle; CHAR buffer[BUFFER SIZE]; DWORD read; ReadHandle = GetStdHandle(STD INPUT HANDLE); if (ReadFile(ReadHandle, buffer, BUFFER SIZE, &read, NULL)) printf("child read %s",buffer); else fprintf(stderr, "Error reading from pipe"); return 0; } Client server system host X (146.86.5.20) socket (146.86.5.20:1625) web server (161.25.19.8) socket (161.25.19.8:80) A socket is defined as an endpoint for communication. A pair of processes com- municating over a network employs a pair of sockets— one for each process. A socket is identified by an IP address concatenated with a port number. In general, sockets use a client– server architecture. The server waits for incoming client requests by listening to a specified port. Once a request is received, the server accepts a connection from the client socket to complete the connection. Servers implementing specific services (such as SSH, FTP, and HTTP) listen to well-known ports (an SSH server listens to port 22; an FTP server listens to port 21; and a web, or HTTP, server listens to port 80). All ports below 1024 are considered well known and are used to implement standard services. Java data server import java.net.*; import java.io.*; public class DateServer { public static void main(String[] args) { try { ServerSocket sock = new ServerSocket(6013); while (true) { Socket client = sock.accept(); PrintWriter pout = new PrintWriter(client.getOutputStream(), true); pout.println(new java.util.Date().toString()); client.close(); } } catch (IOException ioe) { System.err.println(ioe); } } } Java client import java.net.*; import java.io.*; public class DateClient { public static void main(String[] args) { try { Socket sock = new Socket("127.0.0.1",6013); InputStream in = sock.getInputStream(); BufferedReader bin = new BufferedReader(new InputStreamReader(in)); String line; while ( (line = bin.readLine()) != null) System.out.println(line); sock.close(); } catch (IOException ioe) { System.err.println(ioe); } } } 1. Remote Procedure Calls One of the most common forms of remote service is the RPC paradigm, which was designed as a way to abstract the procedure-call mechanism for use between systems with network connections. It is similar in many respects to the IPC mechanism described in Section 3.4, and it is usually built on top of such a system. Here, however, because we are dealing with an environment in which the processes are executing on separate systems, we must use a message-based communication scheme to provide remote service. In contrast to IPC messages, the messages exchanged in RPC communi- cation are well structured and are thus no longer just packets of data. Each message is addressed to an RPC daemon listening to a port on the remote sys- tem, and each contains an identifier specifying the function to execute and the parameters to pass to that function. The function is then executed as requested, and any output is sent back to the requester in a separate message. A port in this context is simply a number included at the start of a message packet. Whereas a system normally has one network address, it can have many ports within that address to differentiate the many network services it supports. If a remote process needs a service, it addresses a message to the proper port. For instance, if a system wished to allow other systems to be able to list its current users, it would have a daemon supporting such an RPC attached to a port— say, port 3027. Any remote system could obtain the needed information (that is, the list of current users) by sending an RPC message to port 3027 on the server. The data would be received in a reply message. The semantics of RPCs allows a client to invoke a procedure on a remote host as it would invoke a procedure locally. The RPC system hides the details that allow communication to take place by providing a stub on the client side. Typically, a separate stub exists for each separate remote procedure. When the client invokes a remote procedure, the RPC system calls the appropriate stub, passing it the parameters provided to the remote procedure. This stub locates the port on the server and marshals the parameters. The stub then transmits a message to the server using message passing. A similar stub on the server side receives this message and invokes the procedure on the server. If necessary, return values are passed back to the client using the same technique. On Windows systems, stub code is compiled from a specification written in the Microsoft Interface Definitio Language (MIDL), which is used for defining the interfaces between client and server programs. Andoid RPC user calls kernel to send RPC message to procedure X From: client kernel sends To: server matchmaker message to Port: matchmaker receives matchmaker to Re: address message, looks find port number for RPC X up answer kernel places From: server port P in user To: client matchmaker RPC message Port: kernel replies to client Re: RPC X with port P Port: P daemon From: client kernel sends listening to RPC To: server port P receives Port: port P message kernel receives From: RPC daemon reply, passes Port: P processes it to user To: client request and Port: kernel processes send output Two approaches are common. First, the binding information may be prede- termined, in the form of fixed port addresses. At compile time, an RPC call has a fixed port number associated with it. Once a program is compiled, the server cannot change the port number of the requested service. Second, binding can be done dynamically by a rendezvous mechanism. Typically, an operating system provides a rendezvous (also called a matchmaker) daemon ona fixed RPC port. Aclient then sends a message containing the name of the RPC to the rendezvous daemon requesting the port address of the RPC it needs to execute. The port number is returned, and the RPC calls can be sent to that port until the process terminates (or the server crashes). This method requires the extra overhead of the initial request but is more flexible than the first approach. Figure 3.29 shows a sample interaction.