Chapter 7: User-Kernel Interactions And User Space Components PDF

Summary

These lecture notes cover Chapter 7 on User-Kernel Interactions and User Space Components, in the context of operating systems and system software. The topics include process implementation, user-kernel interactions, and shared libraries.

Full Transcript

Chapter 7: User-Kernel Interactions And User Space Components
Operating Systems and System Software

Overview

In this lecture, we will discuss two main topics:

1. Process implementation
   - We will look back at how processes are implemented in an operating system kernel, using everything we have learned since the first lectures
   - We will discuss interactions between user space and the kernel
2. User space operating system components
   - Shared libraries

User-Kernel Interactions

Process Implementation, Again

As we’ve seen earlier, a process is represented in the kernel by its process control block (PCB).

From the user perspective, programs manipulate:

- Integer values (e.g., process IDs) to designate runtime components in system calls
- Virtual addresses directly for memory
- Indexes in the file descriptor table for files

Signals - An End To End Example

Let’s assume we have two processes, 1 and 2, on our system, and that process 2 has PID 234.

1. Process 1 sends a STOP signal to process 2 with the system call kill(234, SIGSTOP), switching into supervisor mode
2. In the kernel code, the bit corresponding to SIGSTOP in process 2’s pending signals is set to 1
3. The kill() system call returns, switching back to user mode, and process 1 keeps running
4. When process 2 is scheduled again at some point in the future, the kernel code checks its pending signals and masked signals
5. Signal SIGSTOP is pending and not masked, so the kernel resumes process 2 in its signal handler
6. Process 2 executes the signal handler, and returns into the kernel with a syscall to acknowledge the signal
7. The kernel restores process 2 to its initial state (before the signal handler), and returns to user space

Note: on POSIX systems, SIGSTOP itself cannot be caught or masked, and the kernel stops the process directly. Steps 5 to 7 describe the general path taken by catchable signals such as SIGUSR1.

Pipes - An End To End Example

Let’s assume we have two processes that communicate through a pipe.

1. Process 1 opens the pipe in write mode (fd = 3), while process 2 opens it in read mode (fd = 4).
2. Process 1 writes "foo" in the pipe with the system call write(3, "foo", 4):
   a. The process switches to supervisor mode, goes through the file table at index 3, then the open file descriptor, which points to the in-kernel pipe buffer.
   b. The process writes in the buffer at the proper place. The kernel code handles the synchronization with the reader. If the buffer is full, the thread blocks until it can write into the buffer.
   c. When the data is written into the buffer, the process returns to user mode and the program continues its execution.
3. Process 2 reads from the pipe with the system call read(4, &buf, 10):
   a. The process switches to supervisor mode, goes through the file table at index 4, then the open file descriptor, which points to the in-kernel pipe buffer.
   b. The process reads in the buffer at the proper place. The kernel code handles the synchronization with the writer. If the buffer is empty, the thread blocks until data is available.
   c. When the data is read from the buffer, the process returns to user mode and the program continues its execution.

Files - An End To End Example

Let’s assume one process P manipulating a file.

1. Process P is created to run a given program
   - P is forked from its parent process, copying most of its PCB
   - P calls exec() to change to a new program
   - Three file descriptors are opened by default: stdin, stdout and stderr
2. P opens the file "foo.txt" in write-only mode
   - The open() system call switches to supervisor mode
   - An inode is created through the file system (assuming the file does not exist yet), allocating necessary blocks on the device
   - A file descriptor that points to this inode is created with the right mode and offset 0
   - The file descriptor is added into the file table of P’s PCB
   - The process switches back to user mode, returning from the system call
3. P writes "barfizz\n" into the file "foo.txt"
   - The write() system call switches to supervisor mode
   - The kernel code goes to the fd index in the file table and reaches the file descriptor
   - The file system code finds the block on the partition where the data needs to be written, using the inode and the offset in the file descriptor
   - The device driver writes the data into the proper block/sector
   - The system call returns to user mode

Shared Libraries

The Need For Shared Libraries

Some libraries are used by most/all programs. For example, most languages have a standard library, e.g., the libc for C.

If we embed this library code in every binary ELF file, i.e., static linking, we will duplicate the same code multiple times in memory. For example, the code of printf() or any system call wrapper would be loaded in memory by every binary program using it!

To avoid wasting memory by duplicating these code sections, you can build shared libraries! These libraries are loaded once in memory, and then shared by all processes using them through memory mappings, i.e., mmap().

Shared Libraries And Function Resolution

Since shared libraries are mapped into each process’ virtual address space, the same function might have different virtual addresses in different processes! Each process will have its own mapping, created when the process starts.

This means that:

- A shared function’s address varies across processes
- A shared function’s address is not known at compile time, but at run time only

For shared libraries to work, we need a way to resolve the address of a function when a process starts. When starting a program, the dynamic linker is tasked with loading and linking shared libraries and functions into the process’ virtual address space.

Dynamic Linker

In UNIX systems using the ELF executable format, one of the first things an executable does is run the linker code to:

1. Load the binary in memory
2. Set up the memory layout
3. Load shared libraries needed by the program in memory
4. Start the program

In order to perform the linking operations, the dynamic linker uses information available in the ELF file, mainly:

- The Procedure Linkage Table (PLT), which contains stubs to jump to the proper address when calling a function from a shared library
- The Global Offset Table (GOT), which maps symbols used by the program to their addresses in shared objects through relocations

There are two strategies to resolve dynamic relocations:

- Eager binding
- Lazy binding

Eager Binding

Eager binding, or immediate binding, consists in resolving all relocations when the program starts.

When the linker loads the program in memory, it also goes through all the symbols used by the program that are located in shared libraries, e.g., functions and variables, by querying information from the ELF file. For each symbol, it modifies the corresponding entry in the GOT.

When a shared library function is called, the program jumps to the corresponding PLT entry, which in turn jumps to the address stored in the GOT.

Example:

    #include <stdio.h>

    int main(void) {
        printf("foo\n"); // printf is located in a shared library

        return 0;
    }

compiles to:

    0000000000001139 <main>:
        1139: 55                      push   %rbp
        113a: 48 89 e5                mov    %rsp,%rbp
        113d: 48 8d 05 c0 0e 00 00    lea    0xec0(%rip),%rax    # 2004
        1144: 48 89 c7                mov    %rax,%rdi
        1147: e8 e4 fe ff ff          call   1030 <puts@plt>     # call to printf (emitted as puts by the compiler)
        114c: b8 00 00 00 00          mov    $0x0,%eax
        1151: 5d                      pop    %rbp
        1152: c3                      ret

    0000000000001030 <puts@plt>:
        1030: ff 25 ca 2f 00 00       jmp    *0x2fca(%rip)       # 4000 (the address of puts in the GOT)
        1036: c3                      ret

Lazy Binding

Lazy binding, or deferred binding, resolves the relocation on the first call to the shared function.

When a function is called, the program jumps to the PLT entry, which in turn jumps to the value in the GOT corresponding to the called function. However, the correct value is not known yet, so the default value leads to the relocation resolver code from the linker. This code finds the real address of the shared function and patches the GOT.

Future calls to the shared function will directly jump to the correct address through the GOT.

Example: with the same C code as earlier:

    0000000000001139 <main>:
        1139: 55                      push   %rbp
        113a: 48 89 e5                mov    %rsp,%rbp
        113d: 48 8d 05 c0 0e 00 00    lea    0xec0(%rip),%rax    # 2004
        1144: 48 89 c7                mov    %rax,%rdi
        1147: e8 e4 fe ff ff          call   1030 <puts@plt>     # call to printf (emitted as puts by the compiler)
        114c: b8 00 00 00 00          mov    $0x0,%eax
        1151: 5d                      pop    %rbp
        1152: c3                      ret

    0000000000001020:
        1020: ff 35 ca 2f 00 00       push   0x2fca(%rip)        # 3ff0
        1026: ff 25 cc 2f 00 00       jmp    *0x2fcc(%rip)       # 3ff8
        102c: 0f 1f 40 00             nopl   0x0(%rax)

    0000000000001030 <puts@plt>:
        1030: ff 25 ca 2f 00 00       jmp    *0x2fca(%rip)       # 4000
        1036: 68 00 00 00 00          push   $0x0
        103b: e9 e0 ff ff ff          jmp    1020
