Distributed System Programming Lecture Notes PDF
AAIT
2022
Wondimagegn D. (AAIT)
Summary
These lecture notes provide an overview of distributed system programming. Topics covered include processes, threads, inter-process communication, and virtualization. The lecture was delivered on December 5, 2022.
Full Transcript
Overview
1. Processes
2. Creating/Deleting Unix Processes
3. Inter-Process Communication
4. Posix Thread Programming
5. Java Thread Programming
6. Golang: Goroutine
7. Virtualization

Wondimagegn D. (AAIT), Distributed System Programming, December 5, 2022

Processes: Introduction
Distributed programming is about processes that communicate over a network. An obvious requirement is a good knowledge of how to handle processes locally.

Processes: Basic idea
We build virtual processors in software, on top of physical processors:
- Processor: provides a set of instructions along with the capability of automatically executing a series of those instructions.
- Thread: a minimal software processor in whose context a series of instructions can be executed. Saving a thread context implies stopping the current execution and saving all the data needed to continue the execution at a later stage.
- Process: a software processor in whose context one or more threads may be executed. Executing a thread means executing a series of instructions in the context of that thread.

Processes: Threads and Distributed Systems
Multithreaded Web client (hiding network latencies):
- The Web browser scans an incoming HTML page and finds that more files need to be fetched.
- Each file is fetched by a separate thread, each doing a (blocking) HTTP request.
- As files come in, the browser displays them.
Multiple request-response calls to other machines (RPC), again hiding network latencies:
- A client does several calls at the same time, each one by a different thread.
- It then waits until all results have been returned.
- Note: if the calls are to different servers, we may have a linear speed-up.
Processes: Processes
One process is made of:
- A process identifier (an integer)
- One executing program
- Memory used to execute the program
- One program counter (indicates where in the program the process currently is)
- A number of signal handlers (they tell the program what to do when receiving signals)
A process can determine its own identifier:

    #include <sys/types.h>
    #include <unistd.h>
    pid_t getpid(void);

It can also determine the identifier of its parent:

    pid_t getppid(void);

Processes: Processes
- Computers can execute programs.
- A process is one instance of a program while it is executing. For example, I am currently using acroread to display these slides.
- At the same time, many other programs are executing: they are other processes.
- The same program can be executed multiple times in parallel. Somebody else may log on to my computer and start acroread: these are several separate processes executing the same program.
- A process can only be created by another process. E.g., when I type a command, my shell process creates an acroread process.

Processes: The fork() Primitive
There is exactly one way of creating a process in Unix:

    #include <sys/types.h>
    #include <unistd.h>
    pid_t fork(void);

Creating/Deleting Unix Processes: The fork() Primitive
The child process is an exact copy of the parent:
- It is running the same program.
- Its program counter is at the same position within the program, i.e., just after the fork() call.
- Its memory area is an exact copy of the parent's memory.
- Signal handlers and file descriptors are copied too.
Creating/Deleting Unix Processes: The fork() Primitive
There is one way of distinguishing between the two processes:
- fork() returns 0 to the child process
- fork() returns the child's pid to the parent (pid > 0)
- fork() returns -1 if an error occurred
Programs should therefore check the return value of fork().

Creating/Deleting Unix Processes: The fork() Primitive
After testing fork()'s return value, the two processes may diverge:

    pid_t pid;
    pid = fork();
    if (pid

Inter-Process Communication: Shared Memory
To create a shared memory segment:

    #include <sys/ipc.h>
    #include <sys/shm.h>
    int shmget(key_t key, size_t size, int shmflg);

- key: rendezvous point; use key = IPC_PRIVATE if the segment will only be used by child processes
- size: size of the segment in bytes
- shmflg: options (access control mask)
- Return value: a shm identifier (or -1 for error)

Inter-Process Communication: Shared Memory
To attach a shared memory segment:

    #include <sys/ipc.h>
    #include <sys/shm.h>
    void *shmat(int shmid, const void *shmaddr, int shmflg);

- shmid: shared memory identifier (returned by shmget)
- shmaddr: address where to attach the segment, or NULL if you don't care
- shmflg: options (e.g., SHM_RDONLY for a read-only attachment)
To detach a shared memory segment:

    int shmdt(const void *shmaddr);

- shmaddr: segment address
Attention: shmdt() does not destroy the segment!

Inter-Process Communication: Shared Memory
To destroy a segment (with cmd = IPC_RMID):

    #include <sys/ipc.h>
    #include <sys/shm.h>
    int shmctl(int shmid, int cmd, struct shmid_ds *buf);

- Shared memory segments stay persistent even after all processes have died!
- You must destroy them in your programs.
- The ipcs command shows existing segments (and semaphores).
- You can destroy them by hand with: ipcrm -m <id> (older syntax: ipcrm shm <id>)

Inter-Process Communication: Shared Memory Example

    #include <stdio.h>
    #include <unistd.h>
    #include <sys/ipc.h>
    #include <sys/shm.h>

    int main() {
        /* Create a private segment holding one int, readable/writable by the owner. */
        int shmid = shmget(IPC_PRIVATE, sizeof(int), 0600);
        int *shared_int = (int *) shmat(shmid, 0, 0);
        *shared_int = 42;
        if (fork() == 0) {
            /* Child: the attachment is inherited, so it sees 42 and overwrites it. */
            printf("The value is: %d\n", *shared_int);
            *shared_int = 12;
            shmdt((void *) shared_int);
        } else {
            /* Parent: crude synchronization, gives the child time to write. */
            sleep(1);
            printf("The value is: %d\n", *shared_int);
            shmdt((void *) shared_int);
            shmctl(shmid, IPC_RMID, 0);   /* destroy the segment */
        }
    }

Posix Thread Programming: Threads vs. Processes
Multi-process programs are expensive:
- fork() needs to copy all the process' memory, etc.
- Inter-process communication is hard
Threads: "lightweight processes"
- One process contains several "threads of execution"
- All threads execute the same program (but can be at different stages within it)
- All threads share process instructions, global memory, open files and signal handlers
- But each thread has its own thread ID, stack, program counter and stack pointer, errno and signal mask
- There are special synchronization primitives between threads of the same process
Posix Thread Programming: Threads in C and Java
Threads in C:
- Posix threads (pthreads) are standard among Unix systems
- The operating system must have special support for threads
- Programs must be linked with -lpthread (or compiled with -pthread)
Threads in Java:
- Threads are a native feature of Java: every virtual machine has thread support
- They are portable on any Java platform
- Java threads can be mapped to operating system threads (native threads) or emulated in user space (green threads)

Posix Thread Programming: Creating a Pthread
To create a pthread:

    #include <pthread.h>
    int pthread_create(pthread_t *thread, const pthread_attr_t *attr,
                       void *(*start_routine)(void *), void *arg);

- thread: thread id
- attr: attributes (i.e., options), or NULL for the defaults
- start_routine: function that the thread will execute
- arg: parameter to be passed to the thread
To initialize a pthread attribute:

    int pthread_attr_init(pthread_attr_t *attr);

Posix Thread Programming: Stopping a Pthread
A pthread stops when:
- Its process stops (this includes the main thread returning from main(), which ends every thread in the process)
- Its start_routine function returns
- It calls pthread_exit:

    #include <pthread.h>
    void pthread_exit(void *retval);

Like processes, stopped (joinable) threads must be waited for:

    #include <pthread.h>
    int pthread_join(pthread_t th, void **thread_return);
Posix Thread Programming: Pthread Create/Delete Example

    #include <stdio.h>
    #include <pthread.h>

    /* Thread body: prints the integer whose address was passed in. */
    void *func(void *param) {
        int *p = (int *) param;
        printf("New thread: param=%d\n", *p);
        return NULL;
    }

    int main() {
        pthread_t id;
        pthread_attr_t attr;
        int x = 42;
        pthread_attr_init(&attr);
        pthread_create(&id, &attr, func, (void *) &x);
        pthread_join(id, NULL);   /* wait for the thread before main() returns */
    }

Posix Thread Programming: Detached Threads
A "detached" thread:
- Does not need to be (and cannot be) pthread_join()ed; its resources are released automatically when it terminates
- Like any other thread, it still ends when the process terminates
By default, threads are "joinable" (i.e., "attached").
To create a detached thread, set an attribute before creating the thread:

    pthread_t id;
    pthread_attr_t attr;
    pthread_attr_init(&attr);
    pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    pthread_create(&id, &attr, func, NULL);

You can also detach a thread later with pthread_detach().

Posix Thread Programming: Pthread Synchronization with Mutex
Pthreads have two synchronization concepts: mutex and condition variables.
Mutex: mutual exclusion

    #include <pthread.h>
    int pthread_mutexattr_init(pthread_mutexattr_t *attr);
    int pthread_mutex_init(pthread_mutex_t *mutex,
                           const pthread_mutexattr_t *mutexattr);
    int pthread_mutex_lock(pthread_mutex_t *mutex);
    int pthread_mutex_trylock(pthread_mutex_t *mutex);
    int pthread_mutex_unlock(pthread_mutex_t *mutex);
    int pthread_mutex_destroy(pthread_mutex_t *mutex);
Posix Thread Programming: Pthread Synchronization with Mutex
Example: a mutex protecting a shared array.

    pthread_mutex_t array_mutex;
    int array[32];
    int nbelems = 0;

    /* Appends elem; returns the new element count, or -1 if the array is full. */
    int add_elem(int elem) {
        int n;
        pthread_mutex_lock(&array_mutex);
        if (nbelems == 32) {
            pthread_mutex_unlock(&array_mutex);
            return -1;
        }
        array[nbelems++] = elem;
        n = nbelems;
        pthread_mutex_unlock(&array_mutex);
        return n;
    }

    int main() {
        pthread_mutexattr_t attr;
        pthread_mutexattr_init(&attr);
        pthread_mutex_init(&array_mutex, &attr);
        ...
        pthread_mutex_destroy(&array_mutex);
    }

Java Thread Programming: Creating Java Threads
A Java thread is a class which inherits from Thread. You must override its run() method:

    public class MyThread extends Thread {
        private int argument;
        MyThread(int arg) { argument = arg; }
        public void run() {
            System.out.println("New thread started! arg=" + argument);
        }
    }

To start the thread:

    MyThread t = new MyThread(42);
    t.start();

Java Thread Programming: Stopping Java Threads
A Java thread stops when its run() method returns. You do not have to join() a Java thread, but join() lets you wait for it to finish:

    MyThread t = new MyThread(42);
    t.start();
    ...
    t.join();

Java Thread Programming: Java Thread Synchronization with Monitors
A monitor is similar to a mutex:

    public class AnotherClass {
        synchronized public void methodOne() { ... }
        synchronized public void methodTwo() { ... }
        public void methodThree() { ... }
    }

- Each object contains one mutex, which is locked when entering a synchronized method and unlocked when leaving
- Two synchronized methods of the same object cannot be executing simultaneously
- Two different objects of the same class can be executing the same synchronized method simultaneously

Java Thread Programming: Java Thread Synchronization with Monitors
So, the previous class is equivalent to:

    public class AnotherClass {
        private Mutex mutex;
        public void methodOne() { mutex.lock(); ...; mutex.unlock(); }
        public void methodTwo() { mutex.lock(); ...; mutex.unlock(); }
        public void methodThree() { ... }
    }

... except that the Mutex class does not exist!

Java Thread Programming: Condition Variables in Java
There is no condition variable built into the Java language itself, but you can explicitly block a thread. All Java classes inherit from class Object:

    class Object {
        void wait();       /* blocks the calling thread */
        void notify();     /* unblocks one thread blocked in this object */
        void notifyAll();  /* unblocks all threads blocked in the object */
        ...
    };

- wait(): causes the current thread to wait until another thread invokes the notify() method or the notifyAll() method for this object.
- notify(): wakes up a single thread that is waiting on this object's monitor.
- notifyAll(): wakes up all threads that are waiting on this object's monitor.
Java Thread Programming: Condition Variables in Java
- This means that each object contains one (and no more) condition variable
- The wait(), notify() and notifyAll() methods must be called inside a monitor

Golang: Goroutine
- A goroutine is a function or method that executes independently of, and concurrently with, any other goroutines present in your program.
- Every concurrently executing activity in the Go language is known as a goroutine.
- A goroutine is a lightweight thread managed by the Go runtime.
- Goroutines run in the same address space, so access to shared memory must be synchronized.

Golang: Goroutine vs Thread

    Goroutine                                           | Thread
    Managed by the Go runtime.                          | Operating system threads are managed by the kernel.
    Not hardware dependent.                             | Hardware dependent.
    Have an easy communication medium: channels.        | Do not have an easy communication medium.
    Thanks to channels, one goroutine can communicate   | For lack of an easy communication medium, inter-thread
    with another goroutine with low latency.            | communication takes place with high latency.
    Do not have an ID, because Go does not have         | Have their own unique ID, because they have
    Thread Local Storage.                               | Thread Local Storage.
    Cheaper than threads.                               | More costly than goroutines.
    Cooperatively scheduled (since Go 1.14 the runtime  | Preemptively scheduled.
    can also preempt them asynchronously).              |
    Faster startup time than threads.                   | Slower startup time than goroutines.
    Have growable stacks.                               | Do not have growable stacks.
Golang: Goroutine Example

    package main

    import (
        "fmt"
        "time"
    )

    func say(s string) {
        for i := 0; i < 5; i++ {
            time.Sleep(100 * time.Millisecond)
            fmt.Println(s)
        }
    }

    func main() {
        go say("world")
        say("hello")
    }

Virtualization
Virtualization is becoming increasingly important:
- Hardware changes faster than software
- Ease of portability and code migration
- Isolation of failing or attacked components
[Figure: (a) a program using interface A of hardware/software system A; (b) the same program using interface A, provided by an implementation that mimics A on top of interface B of hardware/software system B.]

Virtualization: Architecture of VMs
Virtualization can take place at very different levels, strongly depending on the interfaces offered by the various system components.
[Figure: layers application, library, operating system, hardware; interfaces between them: library functions, system calls, privileged instructions, general instructions.]