COMP 3000 Notes PDF - Operating Systems Concepts

Document Details

Uploaded by EarnestAntigorite5965

Carleton University

Tags

operating systems processes file systems system calls

Summary

These notes, likely for COMP 3000 at an undergraduate level, cover fundamental concepts in operating systems. Topics detailed include resource management, processes, files, and system calls. Key concepts such as abstraction, system calls, and multitasking are described.

Full Transcript

1. Resource Management & Abstraction
● The operating system (OS) manages hardware resources such as the CPU, RAM, and storage.
● Abstraction hides hardware complexity, so programs can use resources without knowing low-level details.
Benefits of abstraction:
● Portability: Programs can run on different hardware without modification.
● Security: Programs cannot touch hardware directly, which reduces risk.
● Efficiency: The OS allocates resources dynamically and can optimize their use.

2. Preemptive Multitasking
● The OS can interrupt a running process to switch to another, ensuring fair CPU usage.
● This enables multi-user environments and prevents any single program from monopolizing the system.

3. Monolithic vs. Microkernels
● Monolithic kernel:
  ○ Everything (device drivers, memory management, file system, etc.) runs in kernel mode (Ring 0).
  ○ Pros: Faster execution, fewer context switches.
  ○ Cons: Less modular; a single failure can crash the whole system.
● Microkernel:
  ○ Only essential functions (IPC, scheduling, memory) run in kernel mode.
  ○ Other services (device drivers, file systems) run in user space.
  ○ Pros: More modular and more secure (a failed service does not crash the whole system).
  ○ Cons: More IPC overhead, so generally slower than monolithic kernels.
● FUSE/udev/sshfs:
  ○ Examples of moving services to user space, as in microkernel-like designs.
  ○ FUSE (Filesystem in Userspace) lets users create custom filesystems without kernel modifications.
  ○ udev manages device nodes dynamically in user space.
  ○ sshfs mounts a remote filesystem over SSH without kernel modifications.

4. Command-Line, Text UI, GUI
● Different ways for users to interact with the OS:
  ○ Command-line: Powerful and scriptable, but has a learning curve.
  ○ Text UI: Menu-based interfaces (e.g., ncurses programs like htop).
  ○ GUI: Graphical desktops; user-friendly but more resource-intensive.

5. Kernel Mode (Ring 0) vs. User Mode
● Kernel mode (Ring 0):
  ○ Full access to hardware.
  ○ Runs critical OS functions (e.g., memory management, device drivers).
● User mode:
  ○ Restricted access; a program must make system calls to reach hardware.
  ○ Protects system stability by preventing direct hardware access.
● Hardware-enforced: The CPU enforces the separation using privilege levels (e.g., x86 rings).

6. Processes & PIDs
● Process = a running instance of a program.
● PID (Process ID) = unique identifier assigned by the OS.
● Difference from program binaries:
  ○ A program binary (e.g., /bin/ls) is just a file.
  ○ When executed, it becomes a process with its own PID, memory, and state.

7. Process Lifecycle & Symbols
● Process states: New → Ready → Running → Terminated. A running process moves to Waiting when it blocks on I/O and returns to Ready when the I/O completes; a preempted process goes from Running back to Ready.
● Symbols: State codes shown by tools like ps (R for running, S for sleeping, etc.).

8. Address Space Layout
Each process gets a virtual address space:
● Code segment: Stores executable instructions.
● Data segment: Stores global/static variables.
● Heap: Dynamic memory (e.g., malloc in C).
● Stack: Function call frames and local variables.
● Arg/env: Command-line arguments and environment variables.

9. Users & UIDs
● Every user has a UID (User ID).
● Root (UID = 0) has full control over the system.
● A UID is just a software label: it is enforced by the OS, not by hardware.

10. Files & File Systems
● File = basic unit of data storage.
● File system = structure that organizes files (e.g., ext4, NTFS).
● Provides a common abstraction over different storage devices (HDD, SSD, USB, etc.).

11. Library Calls vs. System Calls
● Library calls (lib calls): Functions provided by libraries (printf(), malloc()).
● System calls (sys calls): Direct requests to the OS (open(), read(), write()).
Key difference:
● Lib calls run in user space.
● Sys calls require a mode switch to kernel space.

12. Abstraction & Portability
● Higher-level abstractions make software more portable across different OS/hardware.
● Example: Code written in C (a higher-level abstraction) can be compiled for both Windows and Linux, whereas assembly (low-level) is CPU-specific.

==Tut1==

1. Environment Variables
● Key-value pairs stored in the shell/process environment.
● Used for configuration (e.g., $PATH, $HOME, $USER).
● Global vs. local:
  ○ Global (exported, inherited by all child processes): export VAR=value
  ○ Local (visible only in the current shell): VAR=value

2. Internal vs. External Commands
● Internal (built-in): Executed directly by the shell (e.g., cd, echo, export).
● External: Separate binaries stored in system directories (e.g., /bin/ls, /usr/bin/cat).
● Internal commands are faster because no separate process is created.

==Lec3, Lec4 - Abstraction==

3. Execution Types
● Direct execution: Code runs directly on the CPU (e.g., kernel code, native processes).
● Indirect execution: Code needs an interpreter or virtual machine (e.g., Python scripts, Java bytecode).
● Limited direct execution: User programs run directly on the CPU but in user mode; the OS regains control through system calls (traps) and timer interrupts, which is what makes preemptive multitasking possible.

4. Execution Context & Context Switching
● Execution context = state of a process (registers, memory, open files, etc.).
● Context switch:
  ○ Happens when the CPU switches from one process to another.
  ○ Saves the state of the current process and loads the state of the next one.
  ○ Expensive because it involves saving/restoring registers, TLB flushes, etc.

5. Scheduling & Principles
● The OS decides which process/thread gets CPU time based on scheduling policies.
● Common scheduling strategies:
  ○ FIFO (First-In-First-Out): Simple, but short jobs can be stuck behind long ones.
  ○ Round Robin: Each process gets a time slice in turn.
  ○ Priority Scheduling: Higher-priority processes run first.
  ○ Multilevel Queue: Different queues for different process types (e.g., system vs. user).

6. fork() & exec(): Why Separate?
● fork(): Creates a new process by duplicating the parent.
● exec(): Replaces the process memory with a new program.
Why separate?
● Flexibility: A child process can modify its own state (e.g., redirect file descriptors or change directory) before calling exec().
● fork() on its own is useful for parallel execution, while exec() on its own replaces the program running in the current process.

7. Threads vs. Processes
● Processes have separate memory spaces.
● Threads within a process share its memory (code, globals, heap); each thread has its own stack and registers.
Pros of threads:
● Faster context switching (less overhead than processes).
● Easier communication (shared memory, no need for IPC).
Cons:
● Risk of race conditions (threads modifying shared data simultaneously).

8. pthreads vs. clone()
● pthreads (POSIX threads):
  ○ High-level API for threading.
  ○ Works across different OS implementations.
● clone():
  ○ A low-level Linux system call that creates a process or thread.
  ○ More control, but harder to use than pthreads.

9. Paging & Swapping
● Paging:
  ○ Memory is divided into fixed-size pages.
  ○ Virtual memory is mapped to physical memory using a page table.
● Swapping:
  ○ Moves entire processes from RAM to disk when memory is full.
  ○ Slower than paging (more data moved at once).

10. Role of the Kernel in Abstraction
● Provides a simplified interface for hardware (CPU, memory, I/O).
● Manages execution (processes, scheduling).
● Handles memory (paging, swapping).
● Controls access (user/kernel mode, permissions).

==Tut2==

1. Memory Allocation & Symbols
● Memory allocation:
  ○ Static (compile time): Fixed size, declared variables.
  ○ Stack (runtime): Function calls, local variables.
  ○ Heap (runtime): Dynamic allocation (malloc/free, new/delete).
● Symbols:
  ○ Represent functions, variables, etc., in compiled programs.
  ○ Can be listed by running nm on a binary file.
  ○ Some symbols exist only at compile time, others at runtime.

2. libcall → syscall: Differences, Consequences
● Lib calls (library calls):
  ○ Call functions from libraries (e.g., printf(), malloc()).
  ○ Run in user mode (no kernel involvement unless the function itself makes a syscall).
● Syscalls (system calls):
  ○ Directly request OS services (e.g., open(), read(), write()).
  ○ Trigger a mode switch into kernel mode (more overhead).
● Consequences:
  ○ Lib calls are faster when they stay in user mode.
  ○ Syscalls have more overhead, but they are the only way for a user program to reach hardware, and only through the kernel.
  ○ Compile time: Library functions are resolved/linked.
  ○ Load time: Dynamic linking loads shared libraries.
  ○ Runtime: Actual execution; a syscall transitions to kernel mode.

==Lec5, Lec6 - Facilities==

3. Steps for Talking to a Computer
1. Terminal (/sbin/getty)
  ○ getty starts when a terminal is opened and waits for a login.
2. Login (/usr/bin/login)
  ○ Handles user authentication.
  ○ Reads credentials from /etc/passwd and /etc/shadow.
3. Shell (/bin/bash, or a custom shell like 3000shell)
  ○ Starts after login.
  ○ Interprets user commands.

4. User Account Info
● /etc/passwd: Stores usernames, UIDs, GIDs, home directories, shell paths.
● /etc/shadow: Stores hashed passwords (not world-readable, so more secure).

5. UID, EUID, GID, EGID, setuid
● UID (User ID): Identifies a user.
● EUID (Effective UID): Determines the privileges the process actually has.
  ○ If a program file has the setuid bit set, it runs with the file owner's UID as its EUID, which can mean elevated privileges (e.g., sudo).
● GID (Group ID), EGID (Effective GID): Same as UID/EUID but for groups.

6. File Permission Bits & Notation Conversion
● Octal notation:
  ○ rwxr-xr-- → 754 (Owner: rwx=7, Group: r-x=5, Others: r--=4).
● Symbolic notation:
  ○ chmod u+x file → Adds execute permission for the owning user.
● Special bits:
  ○ setuid (chmod u+s): The program runs as the file's owner.
  ○ setgid (chmod g+s): The program runs with the file's group.
  ○ sticky bit (chmod +t): In a directory, a file can be deleted or renamed only by the file's owner, the directory's owner, or root (e.g., /tmp).

7. Workflow of a Shell
● Reads input (command line).
● Parses and tokenizes the command.
● Expands wildcards (*) and variables ($HOME).
● Determines whether the command is built-in (cd, export) or external (ls, grep).
● If external, forks a child process.
● The child uses exec() to replace its process memory with the command.
● The parent waits for the child to complete.

8. init (PID 1): Before Any User Process
● First process started by the kernel.
● Manages system initialization and background services.
● Implementations include systemd, SysV init, and OpenRC.

9. Zombie State & Reaping
● Zombie process:
  ○ A process that has finished executing but still has an entry in the process table.
  ○ It stays a zombie until its parent calls wait() to collect its exit status.
● Reaping:
  ○ The kernel removes the zombie's process-table entry once the parent calls wait().
  ○ If the parent exits first, the orphaned child is typically adopted by init, which reaps it.

10. Signals & Concurrency Issues
● Signals: Asynchronous notifications sent to processes (e.g., SIGKILL, SIGSTOP).
● Predefined, no payload: A signal carries only its number, no extra data.
● Concurrency issue: If a signal arrives before the process is ready for it (e.g., before its handler is installed), the intended handling is missed.
● SA_RESTART: A sigaction() flag that makes interrupted system calls restart automatically instead of failing with EINTR.

11. Signal Handling (sigaction()), Sending Signals (kill())
● sigaction(): Sets a signal handler for a process.
● kill(pid, SIGTERM): Sends a termination signal to a process.

12. Shell Pipeline (|) & Redirection (<, >, >>)
● Pipeline (|):
  ○ Chains commands together by passing the stdout of one to the stdin of the next.
  ○ Example: cat file.txt | grep "error" | wc -l
● Redirection:
  ○ < file: Reads input from a file.
  ○ > file: Overwrites a file with output.
  ○ >> file: Appends output to a file.

13. Ways to Provide Input (Args, Stdin, Files, etc.)
● Command-line args (./script arg1 arg2).
● Piped input (echo "Hello" | myprogram).
● Files (myprogram < input.txt).
● Standard input (stdin): Read interactively while the program runs.

14. Path, Pathname, Filename
● Path: Location of a file.
● Filename: The name of a file (without the path).
● Absolute pathname: Full path from root (/home/user/file.txt).
● Relative pathname: Based on the current directory (./file.txt).
● ~ → Home directory, / → Root directory.

15. open(), read(), write(), ioctl()
● open(): Opens a file and returns a file descriptor.
● read()/write(): Reads/writes data from/to a file descriptor.
● ioctl(): Sends control commands to devices (e.g., change terminal settings).

16. Mounting a File System (mount & mountpoint)
● Mounting: Attaching a filesystem (USB, network drive) to a directory.
● mount /dev/sdb1 /mnt/usb: Mounts the device at /mnt/usb.
● mountpoint /mnt/usb: Checks whether a directory is a mount point.

==Tut3 - Shell & Standard I/O==

1. The Shell
● A command-line interpreter that allows users to interact with the OS.
● Examples: bash, zsh, fish, sh.
● Executes internal commands (built into the shell) and external commands (separate executables).
● Provides features like variables, scripting, pipes, redirection, and job control.

2. /dev/stdout, /dev/stdin, /dev/stderr
● Special device files that represent the standard input, output, and error streams.
● They are just symbolic links:
    ls -l /dev/std*
  Example output:
    lrwxrwxrwx 1 root root 15 Feb 1 10:20 /dev/stderr -> /proc/self/fd/2
    lrwxrwxrwx 1 root root 15 Feb 1 10:20 /dev/stdin -> /proc/self/fd/0
    lrwxrwxrwx 1 root root 15 Feb 1 10:20 /dev/stdout -> /proc/self/fd/1
● What they mean:
  ○ /dev/stdin → Standard input (file descriptor 0)
  ○ /dev/stdout → Standard output (file descriptor 1)
  ○ /dev/stderr → Standard error (file descriptor 2)
Usage example:
    echo "Hello" > /dev/stdout           # Same as echo "Hello"
    cat /dev/stdin                       # Reads from the keyboard when stdin is the terminal
    ls non_existing_file 2> /dev/stderr  # Sends the error to stderr (already the default)

==Tut4 - Login & Environment Variables==

1. The login Process
When you log into a Linux system, the process follows these steps:
1. getty (or agetty) starts and displays a login prompt.
2. You enter a username → getty hands control to /usr/bin/login.
3. login checks credentials in /etc/passwd and /etc/shadow.
4. If authentication is successful, it starts a shell (e.g., /bin/bash).

2. /etc/passwd - User Account Information
● Contains basic user information (but not passwords).
Format:
    username:x:UID:GID:comment:home_directory:shell
Example entry:
    alice:x:1001:1001:Alice User:/home/alice:/bin/bash
  ○ alice → Username
  ○ x → Password is stored in /etc/shadow (for security).
  ○ 1001 → User ID (UID)
  ○ 1001 → Group ID (GID)
  ○ Alice User → Description
  ○ /home/alice → Home directory
  ○ /bin/bash → Default shell
Check your UID/GID:
    id

3. Understanding the "Environment"
● The environment consists of variables that a process inherits and that affect its behavior.
View all environment variables:
    env
Examples of common variables:
    echo $HOME   # User's home directory
    echo $PATH   # Directories where the shell looks for commands
    echo $SHELL  # Default shell
    echo $USER   # Current username
    echo $PWD    # Current working directory
Modify environment variables:
    export MY_VAR="Hello"
    echo $MY_VAR

==Lec7, Lec8, Lec9, Lec10 - Filesystems==

1. RAM vs. Storage (I/O Operations)
● RAM is fast, volatile memory (data is lost on power-off).
● Storage (HDD/SSD) is non-volatile but much slower than RAM.
● I/O operations refer to reading/writing data between RAM and storage.

2. Block Device Layer & Block Size
● Storage is managed in blocks, the smallest unit of disk I/O.
● The block device layer abstracts hardware differences and allows buffering of read/write operations.

3. File Types & Inodes
● File types: Regular files, directories, symbolic links, device files, sockets, FIFOs (named pipes).
● Inodes store metadata about files, such as:
  ○ File type
  ○ Permissions
  ○ Owner (UID/GID)
  ○ Size
  ○ Data block pointers (where the file's actual data is stored)

4. File Descriptors (FDs)
● Per-process file descriptors: Each process tracks its open files using small integer IDs.
Standard file descriptors:
    stdin  (0) - Standard input
    stdout (1) - Standard output
    stderr (2) - Standard error

5. File System Structure
● A file system is composed of:
  ○ Superblock: Contains metadata about the file system itself.
  ○ Inodes: Store file metadata and pointers to data blocks.
  ○ Data blocks: Hold the actual file content.

6. Directories & Dentries
● Dentry (directory entry): Maps a file name to an inode number.
● A directory is a file that contains dentries (mappings of file names to inodes).

7. Hard Links vs. Symbolic Links
● Hard link: Another name for the same inode (i.e., multiple names pointing to the same file).
● Symbolic (soft) link: A separate file that stores the pathname of the target file.

8. Special Files & Device Files
● Special files exist only in the Virtual File System (VFS), not on a physical disk.
● Device files are stored in /dev and represent hardware:
  ○ Character devices (e.g., /dev/tty, /dev/random): Handle data as a stream, one character at a time.
  ○ Block devices (e.g., /dev/sda, /dev/loop0): Handle data in fixed-size blocks.

9. Physical Size vs. Logical Size
● Logical size (l): The amount of data a file contains.
● Physical size (p): The actual space it occupies on disk.
  ○ If l > p, the file is sparse (contains "holes") or is stored compressed.
  ○ If p > l, the usual cause is allocation in whole blocks: the unused tail of the last block still counts as occupied (internal fragmentation).

10. File System Crash Consistency & Recovery
● Crash consistency: Ensuring data integrity after crashes (e.g., journaling file systems like ext4).
● Recovery methods:
  ○ fsck (file system consistency check) repairs disk inconsistencies.
  ○ Data recovery tools attempt to restore deleted files.
  ○ Device repair involves replacing faulty hardware.

11. Special File Systems & User-Space File Systems
● Special file systems (e.g., procfs, sysfs, tmpfs) don't store regular files on disk; procfs and sysfs expose OS information, and tmpfs keeps files in memory.
● User-space file systems (e.g., FUSE, sshfs) allow implementing custom file systems without modifying the kernel.

==Connections==

1. External vs. Internal Fragmentation
● External fragmentation: Free space is split into small, unusable chunks.
● Internal fragmentation: Allocated blocks contain unused space.

2. Memory vs. Storage Allocation
● Memory uses pages (fixed-size units, typically 4 KB).
● Storage uses blocks (also fixed-size; the size varies by file system).

3. Dynamic vs. Static Linking
● Static linking: All dependencies are included in the binary (larger size, independent).
● Dynamic linking: The program loads shared libraries at runtime (smaller, but depends on system libraries).

4. Per-Process Properties
Each process has:
● Environment variables (env, $HOME, $PATH, etc.).
● File descriptors (open files, pipes, sockets).
● A current working directory (cwd), which determines how relative paths are resolved.

5. Reading Command Output
Understanding the output of commands like ls, stat, df, and du helps diagnose file system and storage issues.
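
Sketch for Tut1, item 1 (global vs. local variables): a minimal demo, assuming a POSIX-style sh is on the PATH. LOCAL_NOTE and EXPORTED_NOTE are made-up names, not from the course material. It suggests how a child process inherits only the exported variable.

```bash
# LOCAL_NOTE and EXPORTED_NOTE are hypothetical names used only for this demo.
LOCAL_NOTE="shell only"            # local: stays in the current shell
export EXPORTED_NOTE="inherited"   # global: copied into every child's environment

# A child shell reports what it can see; unset variables show up as "unset".
child_view=$(sh -c 'echo "local=[${LOCAL_NOTE:-unset}] exported=[${EXPORTED_NOTE:-unset}]"')
echo "$child_view"   # local=[unset] exported=[inherited]
```

The single quotes matter here: they stop the parent shell from expanding the variables, so the expansion happens inside the child.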
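
Sketch for Lec3/Lec4, item 6 (why fork() and exec() are separate): in a shell, "( ... ) &" forks a child and exec replaces that child with a program, so the gap between the two can be shown directly. run_in_dir is a hypothetical helper name, and this is only one way to illustrate the idea, not how any particular shell implements it.

```bash
# run_in_dir is a hypothetical helper: run a command in another directory
# without changing the caller's own working directory.
run_in_dir() {
    dir=$1
    shift
    # "( ... ) &" forks a child shell. The child changes its own state (cd),
    # then exec replaces the child with the requested program.
    ( cd "$dir" && exec "$@" ) &
    # The parent waits for the child and returns the child's exit status.
    wait "$!"
}

before=$PWD
run_in_dir / pwd                   # the child reports /
echo "parent is still in $PWD"     # the parent's directory is unchanged
```

Because the cd happens after the fork and before the exec, it affects only the child. If fork and exec were one combined call, there would be no point at which to make that change.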
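
Worked example for Lec3/Lec4, item 9 (paging): with fixed-size pages, a virtual address splits into a page number and an offset within that page. split_address is a hypothetical helper, and the 4096-byte page size is an assumption (the "typically 4 KB" figure from the Connections section).

```bash
# split_address is a hypothetical helper; 4096 bytes is an assumed page size.
split_address() {
    addr=$1
    page_size=${2:-4096}
    # Integer division gives the page number, the remainder gives the offset.
    echo "page=$(( addr / page_size )) offset=$(( addr % page_size ))"
}

split_address $(( 0x12345 ))   # page=18 offset=837
```

The page table then maps page 18 to some physical frame; the offset 837 is carried over unchanged.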
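
Sketch for Lec5/Lec6, item 6 (notation conversion): the rwxr-xr-- → 754 rule can be written as a small function. to_octal is a hypothetical helper; it assumes exactly nine characters of r, w, x, or - and ignores the setuid, setgid, and sticky bits.

```bash
# to_octal is a hypothetical helper. It handles only the nine rwx characters
# and ignores the setuid, setgid, and sticky bits.
to_octal() {
    [ "${#1}" -eq 9 ] || return 1      # reject anything that is not 9 characters
    perms=$1
    result=""
    while [ -n "$perms" ]; do
        rest=${perms#???}              # everything after the first three characters
        triplet=${perms%"$rest"}       # the first three characters
        digit=0
        case $triplet in r??) digit=$(( digit + 4 )) ;; esac   # read    = 4
        case $triplet in ?w?) digit=$(( digit + 2 )) ;; esac   # write   = 2
        case $triplet in ??x) digit=$(( digit + 1 )) ;; esac   # execute = 1
        result="$result$digit"
        perms=$rest
    done
    echo "$result"
}

to_octal rwxr-xr--   # 754
to_octal rw-r--r--   # 644
```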
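
Sketch for Lec5/Lec6, items 9 to 11 (reaping, signals, kill()): the shell's kill and wait builtins are thin wrappers over the same ideas. This assumes a Unix-like system where SIGTERM is signal 15 and the shell reports a signal death as 128 plus the signal number.

```bash
sleep 30 &                  # start a child process in the background
child=$!
kill -TERM "$child"         # send SIGTERM to it, like kill(pid, SIGTERM)

status=0
wait "$child" || status=$?  # reap the child and collect its status
echo "exit status: $status" # 143, i.e. 128 + 15 (SIGTERM)
```

The wait call is what reaps the terminated child; without it, the parent would never collect the child's status.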
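
Sketch for Lec5/Lec6, item 12 (pipeline and redirection): the operators from the notes, applied to a scratch file. The file contents are made up for the demo, and mktemp is assumed to be available.

```bash
workdir=$(mktemp -d)                 # scratch directory for the demo
log="$workdir/file.txt"              # same file name as in the notes' example

printf 'ok\nerror: disk\nok\nerror: net\n' > "$log"   # ">" creates or overwrites
first_count=$(cat "$log" | grep "error" | wc -l)      # the pipeline from the notes

echo "error: late" >> "$log"                          # ">>" appends one line
second_count=$(grep -c "error" < "$log")              # "<" feeds the file to stdin

echo "before append: $first_count, after append: $second_count"
rm -r "$workdir"
```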
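
Sketch for Tut3, item 2 (stdout vs. stderr): a short demo that file descriptors 1 and 2 are separate streams and can be routed independently. emit is a hypothetical function written for this example.

```bash
# emit is a hypothetical function that writes one line to each output stream.
emit() {
    echo "normal output"          # goes to stdout (fd 1)
    echo "error output" >&2       # goes to stderr (fd 2)
}

only_stdout=$(emit 2>/dev/null)       # discard fd 2, capture fd 1
only_stderr=$(emit 2>&1 >/dev/null)   # send fd 2 into the capture, then discard fd 1
echo "stdout carried: $only_stdout"
echo "stderr carried: $only_stderr"
```

Redirections are applied left to right, which is why "2>&1 >/dev/null" captures only the error line.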
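
Sketch for Tut4, item 2 (/etc/passwd format): the seven colon-separated fields can be split with the shell's read builtin. The entry is the sample line from the notes; the variable names are illustrative only.

```bash
# The sample entry from the notes; the field variable names are illustrative.
entry='alice:x:1001:1001:Alice User:/home/alice:/bin/bash'

# Split on ":" into the seven fields.
IFS=: read -r name pw uid gid comment home shell <<EOF
$entry
EOF

echo "user=$name uid=$uid gid=$gid home=$home shell=$shell"
```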
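
Sketch for Filesystems, item 7 (hard vs. symbolic links): a demo in a scratch directory, assuming the file system supports both link types. inode_of is a hypothetical helper built on ls -di.

```bash
workdir=$(mktemp -d)
echo "data" > "$workdir/original"
ln "$workdir/original" "$workdir/hard"       # hard link: second name, same inode
ln -s "$workdir/original" "$workdir/soft"    # symbolic link: new file storing a path

# inode_of is a hypothetical helper that prints a file's own inode number.
inode_of() { ls -di "$1" | awk '{ print $1 }'; }
original_inode=$(inode_of "$workdir/original")
hard_inode=$(inode_of "$workdir/hard")
soft_inode=$(inode_of "$workdir/soft")

rm "$workdir/original"                       # remove one of the two names
hard_content=$(cat "$workdir/hard")          # data is still reachable via the other name
if [ -e "$workdir/soft" ]; then soft_state=valid; else soft_state=dangling; fi

echo "hard link content: $hard_content, symlink: $soft_state"
rm -r "$workdir"
```

The hard link shares the original's inode number, while the symlink has its own; after the original name is removed, only the symlink breaks.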
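
Sketch for Filesystems, item 9 (logical vs. physical size): one way to create a sparse file and compare the two sizes. The file name is made up, and the physical size that du reports depends on the file system, so no expected value is given for it.

```bash
workdir=$(mktemp -d)
sparse="$workdir/sparse.bin"         # hypothetical file name

# Seek 1 MiB into a new file and write nothing: the file gets a size but no data.
dd if=/dev/null of="$sparse" bs=1 seek=1048576 2>/dev/null

logical_bytes=$(wc -c < "$sparse")                     # logical size in bytes
physical_kib=$(du -k "$sparse" | awk '{ print $1 }')   # allocated space in KiB

echo "logical: $logical_bytes bytes, physical: $physical_kib KiB"
rm -r "$workdir"
```

On a file system that supports holes, the physical figure should come out far below 1024 KiB.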
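
Worked example for Connections, item 1 (internal fragmentation): space is handed out in whole blocks, so the last block is usually only partly used. allocated_bytes is a hypothetical helper, and the 4096-byte block size is an assumption.

```bash
# allocated_bytes is a hypothetical helper; 4096 bytes is an assumed block size.
allocated_bytes() {
    logical=$1
    block_size=${2:-4096}
    blocks=$(( (logical + block_size - 1) / block_size ))   # round up to whole blocks
    echo $(( blocks * block_size ))
}

logical=5000
physical=$(allocated_bytes "$logical")
echo "logical=$logical physical=$physical wasted=$(( physical - logical ))"
# logical=5000 physical=8192 wasted=3192
```

The 3192 wasted bytes sit inside an allocated block, which is what makes this internal rather than external fragmentation.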
