Virtualization
Radboud University

Summary
This document provides a comprehensive overview of virtualization concepts, including important aspects like sandboxing and containerization. It explains how these techniques help isolate processes and improve efficiency in software development and deployment.
Full Transcript
VIRTUALIZATION

The last two bullet points on the slide highlight two categories of attacks or misuse that an operating system (OS) must prevent to maintain stability and security.

Stealing service
This refers to unauthorized use of system resources for unintended or malicious purposes. Examples include:
1. Cryptominers: malicious programs that use a system's computing power to mine cryptocurrency without the owner's knowledge or consent.
2. Abusing free Continuous Integration (CI) tiers: CI services (like GitHub Actions or CircleCI) often offer free usage tiers for running software builds or tests. Attackers exploit these free resources to run unrelated, resource-intensive tasks (e.g., mining cryptocurrency), effectively stealing the service.

Denying service
This category covers attacks or scenarios where resources are consumed in a way that makes the system or its services unavailable to legitimate users. Examples include:
1. Fork bombs (e.g., the Morris worm): a fork bomb is a denial-of-service attack in which a process replicates itself continuously, rapidly consuming all available system resources (CPU, memory, process-table entries) until the system becomes unresponsive.
2. Zip bombs: malicious archive files (e.g., ZIP files) designed to expand to a massive size when decompressed, overwhelming the system's storage or processing capacity.
3. Users killing each other's processes: on systems with inappropriate permissions, one user can terminate processes owned by others, disrupting their workflows and making services unavailable.

The OS plays a critical role in enforcing policies and mechanisms to prevent these kinds of misuse, ensuring fair resource use and system stability.

This slide explains two important concepts in modern computing, sandboxing and containerization, that are often confused or used interchangeably. Let's break them down clearly.

1. Sandboxing
Definition: securely isolating one or more processes so that they cannot interfere with or harm the rest of the system.
Purpose: to create a "safe zone" where untrusted code or applications can run without risking the security, integrity, or functionality of the host system.
Key features:
◦ Isolation: each sandboxed process is kept separate from the rest of the system, with limited access to system resources.
◦ Protection: even if the process is malicious or misbehaves, it is confined to the sandbox, which prevents it from damaging the OS or other processes.
◦ Example use cases: running web browsers or downloaded applications in a secure environment; testing untrusted software without risking the entire system.
Challenges:
◦ Hard to implement: strong sandboxing requires precise control over resource access, permissions, and communication between processes. This complexity makes effective sandboxing difficult.

2. Containerization
Definition: packaging an application and all its dependencies into a lightweight, portable runtime image, so it can be deployed and run consistently across environments such as local machines, data centers, or the cloud.
Purpose: to ensure that applications run reliably in any environment by encapsulating everything they need (libraries, binaries, etc.) in a self-contained "container."
Key features:
◦ Portability: containers run on any system with the appropriate container runtime (e.g., Docker).
◦ Reproducibility: applications behave the same way in development, testing, and production environments.
◦ Efficiency: containers share the host OS kernel, making them lighter and faster than virtual machines.
◦ Example technologies: Docker, Snap, Flatpak, AppImage.
Notable difference from sandboxing: containerization primarily targets application packaging and portability rather than strict security isolation.
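The zip-bomb idea from the "denying service" list above can be illustrated safely in miniature. This is a minimal, standard-library-only sketch (the 10 MiB payload size is an arbitrary choice): it compresses a buffer of zero bytes and reports the expansion ratio. Real zip bombs push this ratio far higher, which is why decompressors should enforce output-size limits.

```python
import io
import zipfile

# 10 MiB of zero bytes: trivially compressible, so the resulting archive
# is tiny compared with what it expands back into on extraction.
payload = b"\x00" * (10 * 1024 * 1024)

buf = io.BytesIO()
with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("zeros.bin", payload)

compressed_size = len(buf.getvalue())
ratio = len(payload) / compressed_size
print(f"{compressed_size} bytes compressed -> {len(payload)} bytes "
      f"extracted (~{ratio:.0f}x expansion)")
```

A decompressor that trusts the archive's size metadata, or that extracts without a quota, lets such a file exhaust disk or memory.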
Common Confusion Between the Two
Sandboxing and containerization rely on similar underlying technologies, such as kernel namespaces and cgroups in Linux, which blurs the line between the two concepts. They nevertheless serve distinct purposes: sandboxing focuses on security and isolation; containerization focuses on portability and reproducibility.

Important Points on the Slide
1. Containerization ≠ sandboxing:
◦ Tools like Docker, Snap, Flatpak, and AppImage were designed with containerization in mind, not sandboxing. Their goal was to make applications portable and easy to deploy.
◦ While containers provide some level of isolation, they do not inherently offer the strong security guarantees of sandboxing.
2. Sandboxing is hard:
◦ Achieving true sandboxing is challenging because it requires careful control of permissions and resource access. Even minor errors can lead to security vulnerabilities.
3. Containerization without sandboxing is irresponsible:
◦ Using containerization without proper sandboxing risks exposing the host system to security threats. For example, a compromised container could affect the entire system, and misconfigured containers might inadvertently leak sensitive data or access resources they shouldn't.

Key takeaway: while both sandboxing and containerization are valuable technologies, they serve different purposes. Effective sandboxing adds a critical layer of security that should not be overlooked when using containerization. Combining the two ensures both portability and safety.

Explanation of Namespaces in Simple Terms
Namespaces are a feature of modern operating systems (especially Linux) that let the kernel create isolated environments for processes. Each namespace provides a customized view of certain parts of the system for the processes running inside it.

What do namespaces do?
Normally, all processes in an operating system share the same "global" view of system resources: the filesystem, network interfaces, process IDs, and so on. A namespace changes this behavior by isolating what a process can see and interact with. Processes inside a namespace get their own private view of certain resources, as if they were running in their own mini-system.

How does this help?
Namespaces are especially useful for building containers (e.g., Docker, Kubernetes), where you want processes to think they are running on their own system even though they share the same physical machine.

How do namespaces provide isolation?
The kernel enforces the isolation, ensuring that processes in one namespace cannot see or interact with resources outside their namespace. This makes processes in a namespace appear to run on their own system, even though they share the underlying hardware with the host and other namespaces.

Why are namespaces important?
Process-level isolation: each process or container feels like it is running in its own independent environment.
Security: processes inside a namespace cannot see or interfere with the host system or other namespaces.
Scalability: namespaces allow multiple containers to run on the same system without interfering with each other.

Explanation of Mount Namespaces (mnt) in Simple Terms
A mount namespace allows processes to have a different view of the filesystem than other processes. This is useful for containerization or isolation, where each container (or process) should interact only with a specific set of files and directories while others stay hidden.

Purpose of mount namespaces
Processes in one namespace see a customized filesystem hierarchy, which may differ from what processes in another namespace see. This isolation is achieved by manipulating mount points (the locations in the filesystem where directories or devices are attached).
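The "private view" idea above can be made concrete on a Linux system (an assumption: this relies on procfs being mounted, as it normally is). Each entry under /proc/self/ns is a symlink identifying one namespace the current process belongs to:

```python
import os

# Linux-only: each entry in /proc/self/ns is a symlink such as
# "uts:[4026531838]"; the number identifies the namespace instance.
NS_DIR = "/proc/self/ns"
namespaces = sorted(os.listdir(NS_DIR))
for name in namespaces:
    print(name, "->", os.readlink(os.path.join(NS_DIR, name)))
```

Two processes print identical link targets exactly when they share that namespace; a containerized process shows different numbers from the host for the namespaces it has unshared.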
Mount Types
In a mount namespace, each mount point can behave differently depending on its propagation type. There are three main types:
1. Shared mount:
◦ Changes at the mount point (e.g., mounting or unmounting a device) are visible in all other shared namespaces, and vice versa.
◦ Example: if a USB drive is mounted in one shared namespace, it automatically becomes visible in all other shared namespaces. Think of it as two-way synchronization.
2. Slave mount:
◦ Changes in the original (master) namespace are reflected in the slave namespace, but changes made in the slave namespace do not propagate back to the original.
◦ Example: if the host system mounts a drive, it becomes visible in the slave namespace; if the slave namespace mounts a drive, that mount stays private to the slave and does not appear in the original namespace.
3. Private mount:
◦ The default type. Changes at the mount point are completely isolated and do not propagate to any other namespace, and changes in other namespaces have no effect on this mount point.
◦ Example: if a private namespace mounts or unmounts a directory, this is completely hidden from all other namespaces.

Restricting the View with pivot_root and chroot
To fully isolate a process's view of the filesystem:
1. pivot_root changes the root directory of a process to a new location (inside the namespace), ensuring the process cannot "see" files outside the new root. For example, if /path/to/container is set as the root, the process will only see files from there down.
2. chroot is similar to pivot_root in that it sets a new root directory, but it is simpler and less secure; it is used for basic isolation during development or testing.
3. Unmounting the old root: after setting the new root, the original root must be unmounted to hide it completely.
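The propagation rules above can be sketched with a toy model. This is not a kernel API, and the class and method names are made up; it only illustrates how mount events flow between shared, slave, and private mounts:

```python
class MountNamespace:
    """Toy model of mount-event propagation between namespaces."""

    def __init__(self, name):
        self.name = name
        self.mounts = set()       # what this namespace currently sees
        self.shared_peers = []    # shared: events propagate both ways
        self.slaves = []          # slave: events propagate master -> slave only

    def mount(self, target, _seen=None):
        seen = _seen if _seen is not None else {self}
        self.mounts.add(target)
        for peer in self.shared_peers:          # two-way propagation
            if peer not in seen:
                seen.add(peer)
                peer.mount(target, seen)
        for slave in self.slaves:               # one-way propagation
            if slave not in seen:
                seen.add(slave)
                slave.mount(target, seen)


host = MountNamespace("host")
shared = MountNamespace("shared")
slave = MountNamespace("slave")
private = MountNamespace("private")   # no links: fully isolated

# Wire up propagation: host <-> shared (shared), host -> slave (slave).
host.shared_peers.append(shared)
shared.shared_peers.append(host)
host.slaves.append(slave)

host.mount("/mnt/usb")     # propagates to the shared peer and the slave
slave.mount("/mnt/local")  # stays private to the slave
print(sorted(host.mounts))     # ['/mnt/usb']
print(sorted(shared.mounts))   # ['/mnt/usb']
print(sorted(slave.mounts))    # ['/mnt/local', '/mnt/usb']
print(sorted(private.mounts))  # []
```

The slave never leaks its own mounts upward, and the private namespace sees nothing at all, mirroring the three propagation types described above.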
Challenges
Configuring mount namespaces correctly is hard:
◦ You must carefully decide which files and directories should be visible.
◦ Setting up dependencies for programs (e.g., shared libraries, binaries) requires preparation.
◦ Mistakes can lead to unintended resource sharing, breaking the isolation.

Why is this important?
Mount namespaces are critical for technologies like Docker and Kubernetes, where containers must be isolated. They ensure that each container has access only to the specific files it needs, without interfering with the host system or other containers.

Key takeaway: mount namespaces isolate how processes see and interact with the filesystem. Shared, slave, and private mounts control whether filesystem changes propagate between namespaces. Advanced tools like pivot_root and chroot provide complete isolation, but they require careful configuration to get right.

Explanation of User (UID) Namespaces in Simple Terms
A user namespace isolates user and group IDs (UIDs and GIDs) for processes. Processes in one namespace can have a completely different view of user permissions than processes in other namespaces or on the host system.

Purpose of user namespaces
1. Isolation of users: processes in one user namespace are unaware of the users in other namespaces; they can only see and interact with the users defined inside their own namespace.
2. Mapping UIDs (user IDs): a user namespace can map UIDs on the host system (real UIDs) to virtual UIDs inside the namespace. For example, a process in the namespace might think it is running as root (UID 0) inside the namespace, while on the host it is actually mapped to a non-root, unprivileged UID.

How UID mapping works
Real UIDs on the host are mapped to virtual UIDs inside the namespace by specifying a range of UIDs that the namespace can use.
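That range-based translation can be sketched as a small function. The (inside-start, outside-start, length) triple mirrors the column layout of the kernel's /proc/<pid>/uid_map file; the function name itself is made up for illustration:

```python
def map_uid(host_uid, mappings):
    """Translate a host ("outside") UID to a namespace ("inside") UID.

    `mappings` holds (inside_start, outside_start, length) triples, the
    same layout the kernel uses in /proc/<pid>/uid_map.
    """
    for inside_start, outside_start, length in mappings:
        if outside_start <= host_uid < outside_start + length:
            return inside_start + (host_uid - outside_start)
    return None  # this host UID is not visible inside the namespace


# Host UIDs 1001-1003 appear inside as namespace UIDs 0-2.
mappings = [(0, 1001, 3)]
print(map_uid(1001, mappings))  # 0: "root" inside the namespace
print(map_uid(1003, mappings))  # 2
print(map_uid(5000, mappings))  # None: unmapped host UID
```

The kernel performs this translation on every permission check, which is why a process can be UID 0 inside the namespace yet an ordinary unprivileged user outside it.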
Example of mapping:

  Host UID    Namespace UID
  1001        0
  1002        1
  1003        2

In this example, host UID 1001 is mapped to UID 0 (root) inside the namespace, and host UID 1002 to UID 1. A process inside the namespace can therefore appear to be root while remaining a non-root user outside the namespace.

Key Features
1. Full root privileges inside the namespace: a process running as UID 0 inside the namespace has full root privileges for operations scoped to the namespace (e.g., file creation, process control), so it can behave like a root user for tasks within the namespace.
2. Unprivileged outside the namespace: the same process does not gain root privileges on the host system; its host UID determines its actual privileges there. For example, if it is mapped to host UID 1001, it can only perform actions permitted to UID 1001 on the host.

Benefits of User Namespaces
1. Enhanced security: processes can be given root-like privileges within a controlled namespace without compromising the security of the host. Even if an attacker exploits a vulnerability to gain root inside the namespace, they remain unprivileged on the host.
2. Flexible user management: different namespaces can have different user mappings, allowing multiple isolated environments on the same system.
3. Lightweight isolation: unlike full virtualization, namespaces do not require separate operating systems, making them efficient.

Example use case: imagine a containerized environment where a web server needs root privileges to bind to port 80 inside its container. With a user namespace, the web server can run as root (UID 0) inside the namespace, while on the host the process is mapped to a regular user (e.g., UID 1001), ensuring it cannot perform privileged operations on the host.

Key Challenges
1. Complex configuration: correctly mapping UIDs between the host and the namespace can be challenging; mistakes might lead to unintended privilege escalation or failure to isolate users properly.
2. Compatibility issues: some programs may not work well if their expected UIDs are mapped differently in the namespace.

Summary: user namespaces isolate and virtualize user IDs to provide user-level isolation for processes. Processes can be root inside the namespace but remain unprivileged on the host, enhancing security. UIDs are mapped between the host and namespace, allowing flexible control over privileges. This feature is key to lightweight isolation in containerization technologies like Docker and Kubernetes.

The UTS (Unix Timesharing System) namespace is a Linux kernel feature that isolates system identifiers such as the hostname and domain name for processes within a namespace. This allows each namespace to act as though it is running on a completely separate system, even though it is hosted on the same underlying operating system.

Purpose of the UTS Namespace
The UTS namespace enables processes running in isolated environments (like containers) to have their own unique hostname and domain name. This isolation is particularly important for:
1. Customizing system identity: processes in different namespaces can be configured with their own hostnames, and scripts or programs running inside a namespace use the customized hostname/domain name as if they were running on a separate machine.
2. Avoiding conflicts: it prevents processes in one namespace from affecting or interacting with the global hostname/domain name of the host system or other namespaces.

Key Features of the UTS Namespace
1. Hostname isolation: each namespace can have a unique hostname, which is useful for identifying systems in scripts or logs. For example: namespace A: hostname = web-server; namespace B: hostname = db-server; host machine: hostname = host-system.
2. Domain name isolation: each namespace can likewise define its own domain name (e.g., example.com), allowing processes in that namespace to operate as though they belong to that domain. This is particularly useful for network-related configurations.

Use Cases
1. Initialization and configuration scripts: many configuration scripts consult the system's hostname or domain name to make decisions about setup or behavior. For example, a script might check whether the hostname is web-server before configuring web services; by isolating the hostname, each namespace can run its own configuration scripts independently of others.
2. Running namespaced servers: servers (e.g., web servers or databases) often use the system's hostname for their operation or identification. By isolating hostnames with UTS namespaces, you can run multiple servers on the same physical machine under different hostnames, each server "believing" it operates on a unique system.

Why is this useful?
Containerization: when running containers (or other isolated environments), the UTS namespace gives each container its own unique system identity. This makes it possible to run multiple containers with different hostnames/domains on the same host and to avoid conflicts where multiple applications depend on specific hostname configurations.
Testing and development: developers can test applications under different hostnames/domains without needing multiple physical or virtual machines.
Network configurations: Network Information Service (NIS) configurations often rely on the domain name; isolating domain names lets namespaces have independent network setups.

Example Scenario
1. Host system: global hostname host-system.
2. Namespace A: UTS hostname web-server; processes in namespace A "see" the hostname as web-server and configure themselves accordingly.
3. Namespace B: UTS hostname db-server; processes in namespace B "see" the hostname as db-server.
Despite sharing the same physical machine, these namespaces operate as though they are independent systems.

How does it work?
When a UTS namespace is created, a copy of the current system's hostname and domain name is made. Processes inside the namespace can modify these identifiers (e.g., via the sethostname or setdomainname system calls), and the changes affect only that namespace, remaining invisible to the host and other namespaces.

Control Groups (cgroups)
Control groups, or cgroups, are a Linux kernel feature used to manage and limit the resources consumed by groups of processes. They enable administrators and developers to enforce resource-usage policies on applications or services, ensuring better performance, security, and isolation in multi-tenant environments.

Key Features of cgroups
1. Resource management: cgroups allow the operating system to control how resources such as CPU, memory, I/O, and network bandwidth are allocated to and used by different processes or groups of processes.
2. Limits and isolation: you can set upper bounds on resource usage for specific groups of processes. This prevents a single group from monopolizing resources and ensures fair sharing among other processes.
3. Proportional sharing: resources can be shared proportionally based on weights or configurations. For example, one application might get more CPU time than another according to its assigned weight.

Integration with Namespaces
Namespaces isolate the system's global resources (e.g., filesystems, network interfaces) for processes; cgroups complement this by managing the resources available to those isolated processes.
Together, they provide:
Isolation: namespaces ensure processes cannot interfere with each other.
Management: cgroups ensure processes use only their allocated resources.

Common cgroup Subsystems (Controllers)
1. CPU controller (cpu): ensures fair, weighted access to the CPU. You can assign a weight to each process group so that higher-weight groups get more CPU time.
2. CPU set controller (cpuset): restricts which CPU cores a group of processes may run on. Pinning a group to specific cores (e.g., cores 2 and 3) can improve cache locality and performance for certain workloads.
3. Memory controller (memory): sets a maximum memory limit for a group; if the group exceeds it, the kernel can reclaim memory or even terminate processes in that group. This ensures a process cannot consume all system memory and destabilize the OS.
4. Block I/O controller (blkio): controls access to storage devices; groups can be assigned a weighted share of I/O bandwidth, for example giving critical processes priority access to disk operations over less important ones.
5. Network controllers (net_cls and net_prio): manage network bandwidth and traffic priority for groups, e.g., allocating more bandwidth to an application requiring low latency.

Use Cases
1. Virtualization and containers: cgroups are a core feature of container technologies like Docker and Kubernetes, ensuring containers do not overuse shared system resources.
2. Multi-tenant systems: in cloud or shared hosting environments, cgroups enforce resource limits to isolate tenants and prevent resource contention.
3. Performance tuning: by restricting processes to specific cores or memory nodes, you can optimize application performance.
4. System stability: cgroups prevent runaway processes from consuming all system resources and impacting other workloads.

The diagram represents the architecture of a Docker container and how it interacts with the underlying operating system. Step by step:

1. The Container Concept
A container is a lightweight, standalone unit that encapsulates an application or service together with everything it needs to run (libraries, dependencies, runtime files). Containers are portable and ensure that applications run consistently across environments, from development to production.

2. Layers in the Diagram
The diagram highlights three main layers:
a. App/Services: the topmost layer, representing the application or service you want to run inside the container. It could be a web server (like Nginx), a database (like PostgreSQL), or a custom application you have built. Containers isolate the app so that it does not interfere with other applications on the same machine.
b. Supporting Files/Runtime: the libraries, dependencies, and runtime environment the application needs. If the app is written in Python, this layer includes the Python runtime; for a Node.js app, it includes Node.js and the required modules. This lets the app run independently of what is installed on the host operating system.
c. Host Operating System: the base layer, the OS of the machine running the container. Containers do not include a separate operating system; they share the kernel of the host. This is what makes containers lightweight compared to virtual machines (VMs): VMs include an entire guest OS, whereas containers package only the app and its dependencies.

What is Docker?
Docker is an open-source platform designed to automate the deployment of applications inside containers.
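Coming back to the cgroup CPU controller described above, its proportional-sharing idea reduces to simple arithmetic. A hedged sketch (the function and group names are invented; this is not a cgroup interface, just the weighting math):

```python
def cpu_shares(weights, total_cpu=1.0):
    """Split CPU time under contention in proportion to group weights,
    the idea behind weighted CPU scheduling in cgroups."""
    total = sum(weights.values())
    return {group: total_cpu * w / total for group, w in weights.items()}


# Hypothetical weights: "web" counts double the other two groups.
shares = cpu_shares({"web": 200, "batch": 100, "backup": 100})
print(shares)  # {'web': 0.5, 'batch': 0.25, 'backup': 0.25}
```

Note that these shares only bind under contention: an idle machine lets any group use spare CPU beyond its proportional slice.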
It adds a layer of abstraction and automation on top of OS-level virtualization, making it easier to create, deploy, and run applications in isolated environments.

Key Points on Docker
1. Automating deployment with containers: Docker bundles an application with all of its dependencies into a container that can run reliably on any machine, regardless of its underlying configuration. This simplifies deployment by keeping the application and its environment consistent across development, testing, and production.
2. LXC comparison: LXC (Linux Containers) is an earlier technology that introduced containerization using Linux kernel features like namespaces and cgroups. Docker initially used LXC as its runtime but later moved to its own container runtime for more flexibility. Depending on your use case, Docker can be seen as an improvement or a drawback compared to LXC: better, because Docker provides tooling, APIs, and workflows that simplify container management; worse, because some developers find Docker heavier than bare LXC for lightweight use cases.
3. Docker's use of namespaces and cgroups: Docker relies on Linux kernel features for containerization. Namespaces provide isolation for resources like process IDs, filesystems, and networks, giving each container its own isolated view of the system; cgroups (control groups) manage resource allocation and limits (CPU, memory, I/O, etc.) so containers do not exceed their quotas.
4. Docker's runtime, opencontainers/runc: runc is a low-level container runtime used by Docker to interface directly with the Linux kernel features above (namespaces, cgroups, etc.). It is part of the Open Container Initiative (OCI), which standardizes container runtime specifications. Docker also supports other runtime environments, such as libvirt and systemd-nspawn, for interacting with virtualized and containerized systems. This makes it flexible, but you must choose your runtime ("pick your poison").

What is OS-Level Virtualization?
OS-level virtualization is a method of virtualization in which the operating system kernel allows the creation of multiple isolated user-space instances. These instances, often called containers, share the same OS kernel but operate as if they were independent systems.

Advantages of OS-Level Virtualization
1. Fairly easy to set up: setting up containers is straightforward compared to traditional virtualization (e.g., setting up virtual machines); tools like Docker simplify the creation, configuration, and management of containers.
2. Convenient for testing new configurations: containers provide an isolated environment where new configurations or applications can be tested without affecting the host system. Changes are confined to the container, so there is no risk of breaking the host or other containers.
3. Essentially no overhead or performance loss: containers use the host system's native OS kernel and do not need to emulate hardware or run a separate guest operating system (unlike VMs). As a result, containers are lightweight, fast, and close to native performance, avoiding the significant resource overhead of full virtualization or emulation. By comparison, VMs require a full OS for each instance, which consumes more CPU, memory, and storage.
4. System-level backups: containers can easily be backed up or cloned using snapshots. A snapshot captures the entire state of the container (files, processes, configuration), allowing quick restoration to a previous state and making disaster recovery and rollback testing straightforward.
5. Run multiple environments in parallel: containers let you run multiple isolated environments on the same host; for example, Python, Node.js, and Java applications can run simultaneously in separate containers without conflicts.
Each container operates as if it were a separate machine with its own environment, isolated from the host and other containers.
6. Maximum use of hardware resources: containers share the host's resources (CPU, memory, storage) more efficiently than VMs. By avoiding duplicated operating-system overhead, more containers can run on the same hardware.
7. Examples: the slide provides real-world examples of the flexibility and efficiency of OS-level virtualization:
◦ Multiple virtual server applications on a single box: you can host multiple services (e.g., a web server, database, cache) in isolated containers on the same hardware.
◦ Instant updates with patches or modifications: changes made to the host system (e.g., applying a security patch) are immediately reflected across all containers, because they share the host kernel.
◦ Massive scalability in cloud computing: containers are ideal for cloud environments, where workloads can be scaled up or down quickly on demand; you can spin up hundreds of containers during peak usage and remove them when demand drops.

Key Takeaways
Performance and efficiency: containers offer near-native performance with minimal overhead because they rely on the host OS kernel.
Flexibility and scalability: they enable efficient multi-environment setups and dynamic resource scaling, making them ideal for cloud-native applications.
Ease of use: snapshots, isolation, and simple configuration make containers a powerful tool for development, testing, and production deployment.

1. OS-Level Virtualization Security
The slide emphasizes that OS-level virtualization can provide sufficient security when implemented properly, but it is not foolproof. Unlike full virtualization (e.g., virtual machines), containers share the host's operating system kernel, which can expose security vulnerabilities if not managed correctly.

2. Role of chroot
chroot is a Unix/Linux mechanism that changes the apparent root directory of a process, confining it to a specific filesystem subtree.
Limitation: it provides some separation, but not enough for robust container security. A process running as root inside a chroot environment may escape the confinement and gain access to the host system.

3. Privileged Containers and LXC
Privileged containers: the root user inside the container maps directly to the root user on the host. If an attacker gains root access inside such a container, they effectively have unrestricted root access to the host system.
LXC (Linux Containers): a containerization tool similar to Docker. Privileged LXC containers likewise map the container's root user to the host's root user, exposing the system to container-escape risks.

4. Unprivileged Containers
Introduced in LXC 1.0, unprivileged containers map the root user inside the container to a non-root user on the host. Even if an attacker compromises the container, their privileges on the host are restricted to those of a non-root user.
Benefits: reduces the impact of a security breach and prevents an attacker from gaining full root access to the host system.

5. Challenges with Unprivileged User Namespaces
Unprivileged user namespaces are a kernel feature that enables containers to run as unprivileged processes, mapping container users (including root) to less-privileged users on the host.
Security concern: kernel code paths that were traditionally reachable only by the real root user can now be invoked by a "virtual" root user (mapped to an unprivileged user). This increases the attack surface, as kernel vulnerabilities may become exploitable by malicious processes inside the container.
Kernel and Drivers as Part of the Trusted Computing Base (TCB) The kernel and drivers remain part of the Trusted Computing Base (TCB): ◦ The TCB is the set of components that must be trusted to enforce security policies. ◦ If the kernel or a driver is compromised, it could lead to a total breach of the host system. Containers inherently depend on the host kernel, so vulnerabilities in the kernel directly affect container security. \Slide 1: Meltdown – The Attack What is Meltdown? Meltdown is a hardware vulnerability in CPUs that exploits out-of-order execution, allowing an attacker to read memory that should be inaccessible, such as kernel memory. Attack Explanation: 1. Attack Mechanics: ◦ The goal is to read a protected memory location (e.g., kernel memory at a speci c address). ◦ The steps in the attack: ▪ x = kernel_mem[address]; fi fi ▪ This line attempts to read a value from a protected memory address. ▪ Normally, this would trigger a fault since user processes shouldn't access kernel memory. ▪ y = x & 0x100; ▪ This isolates a speci c bit (in this case, bit 0x100) from the protected value. ▪ z = user_mem[base + y]; ▪This line translates the isolated bit (y) into an offset and accesses a user memory location. 2. Out-of-Order Execution: ◦Modern CPUs execute instructions out of order to improve speed. ◦During out-of-order execution: ▪ Instructions that access protected memory might be executed speculatively, even if they will ultimately fail due to access violations. ▪ These failed instructions leave traces in the CPU cache, which can be measured and exploited. 3. Cache Side Channel: ◦ The key to the attack is using timing analysis to infer cached values: ▪ Test memory access times for different offsets (base + 0, base + 1, etc.). ▪ The faster access indicates which value was cached and thus reveals the bit value (y). Slide 2: Meltdown – The Aftermath What Happens After the Attack? 
The vulnerability in Meltdown compromises the isolation between:
◦ Kernel Space: Reserved for the operating system and inaccessible to user processes.
◦ User Space: Memory available to user-level applications.
Kernel Page-Table Isolation (KPTI):
To mitigate Meltdown, the operating system enforces Kernel Page-Table Isolation (KPTI):
◦ Before the fix:
▪ Both kernel space and user space were mapped into the same virtual address space.
▪ This mapping allowed speculative execution to leak kernel memory through timing attacks.
◦ After the fix:
▪ The kernel space is isolated from the user space.
▪ User processes no longer have access to the kernel's page tables, even during speculative execution.
Slide 3: Kernel Page Table Isolation (KPTI)
What is KPTI?
Kernel Page-Table Isolation introduces two separate page tables:
◦ One for user mode.
◦ One for kernel mode.
Depending on the mode, the appropriate page table is used, isolating kernel memory from user space.
How It Works:
1. Separation of Address Space:
◦ Most of the kernel address space is removed from the user-level virtual address space.
◦ This prevents user processes from accessing or speculating about kernel memory.
2. Challenges Introduced by KPTI:
◦ System Call Overhead:
▪ Every time a user process transitions to kernel mode (e.g., when making a system call), the CPU must switch page tables.
▪ This switch involves a Translation Lookaside Buffer (TLB) flush, which incurs a performance penalty.
◦ Noticeable Impact on Performance:
▪ Some workloads, such as font rendering in Windows 7, experienced a significant slowdown because they interacted heavily with the kernel.
3. CPU Design Changes:
◦ Future CPU designs aim to address these vulnerabilities at the hardware level, reducing the need for software mitigations like KPTI.
What is Emulation?
Emulation is the process of mimicking the behavior of one system (the guest) on another system (the host) by recreating its hardware or software environment.
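The definition above can be made concrete with a toy emulator for a made-up two-register guest instruction set. The opcodes and register names are invented for illustration, but the fetch–decode–dispatch loop is exactly the per-instruction overhead that makes emulation slower than native execution:

```python
# Toy emulator for a hypothetical 2-register guest ISA. Every guest
# instruction costs several host steps (fetch, decode, dispatch, execute)
# instead of one native instruction -- the root of emulation's slowness.

def emulate(program):
    regs = {"r0": 0, "r1": 0}
    for instr in program:                 # fetch the next guest instruction
        op, *args = instr.split()         # decode it
        if op == "mov":                   # dispatch and execute on the host
            regs[args[0]] = int(args[1])
        elif op == "add":
            regs[args[0]] += regs[args[1]]
        else:
            raise ValueError(f"unknown guest instruction: {op}")
    return regs

# A single native addition (5 + 7) costs a dozen host operations here:
print(emulate(["mov r0 5", "mov r1 7", "add r0 r1"]))  # {'r0': 12, 'r1': 7}
```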
It allows software designed for one platform to run on a completely different platform.
2. Disadvantage: Serious Performance Penalty
Performance Penalty:
◦ Emulation comes at a cost: it is significantly slower than running software natively, because:
▪ The host system must translate the instructions of the emulated system into instructions it understands.
▪ This translation process often requires substantial computational resources.
◦ For example:
▪ Emulating a Raspberry Pi's ARM processor on an x86-based system requires additional layers of processing to translate ARM-specific instructions.
This slide provides an overview of hardware virtualization techniques, which are used to create virtual machines (VMs) and allow multiple operating systems to run on a single physical machine. Let's break it down in detail:
What is Hardware Virtualization?
Hardware virtualization is a technique where the physical hardware of a computer is abstracted to allow multiple virtual machines (VMs) to share the same physical resources (CPU, memory, storage, etc.). Each VM operates independently, as if it had its own dedicated hardware.
1.
Software-Based Virtualization
Software-based virtualization relies on software (called a hypervisor) to emulate the hardware environment and manage virtual machines. There are two main types:
a. Full Virtualization
Definition:
◦ In full virtualization, the hypervisor fully emulates the underlying hardware.
◦ The guest operating system (OS) is unaware that it is running in a virtualized environment.
◦ This allows unmodified guest operating systems to run in the VM.
How It Works:
◦ The hypervisor intercepts and translates privileged operations (e.g., accessing hardware) performed by the guest OS.
◦ This adds an additional software layer between the guest OS and the physical hardware.
Example Hypervisors:
◦ VMware Workstation
◦ VirtualBox
◦ QEMU (in emulation mode)
Pros:
◦ Flexibility: Can run any OS without modification.
◦ Isolation: Strong isolation between VMs.
Cons:
◦ Performance: Translation and emulation introduce overhead, leading to slower performance.
b. Paravirtualization
Definition:
◦ Paravirtualization requires modifications to the guest operating system to make it aware that it is running in a virtualized environment.
◦ The guest OS communicates directly with the hypervisor, reducing the need for emulation.
How It Works:
◦ Instead of fully emulating hardware, the hypervisor provides an API that the guest OS can use to perform privileged operations efficiently.
Example Hypervisors:
◦ Xen (in paravirtualization mode)
◦ KVM (Kernel-based Virtual Machine) with paravirtualized drivers
Pros:
◦ Better Performance: Eliminates the need for hardware emulation.
Cons:
◦ Guest OS Modification: Requires a modified OS, which limits flexibility.
2. Hardware-Assisted Virtualization
Definition:
◦ In hardware-assisted virtualization, the physical CPU provides built-in support to improve the efficiency of virtualization.
◦ Special instructions and features are added to the processor to help the hypervisor manage virtual machines more effectively.
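The trap-and-emulate control flow underlying both full virtualization and the hardware-assisted "exit to the hypervisor" mechanism can be sketched as follows. The instruction names and the PRIVILEGED set are hypothetical, chosen only to illustrate the idea:

```python
# Sketch of trap-and-emulate: the guest runs until it issues a privileged
# operation, which traps ("VM exit") to the hypervisor to be emulated safely.
# The operation names below are made up for illustration.

PRIVILEGED = {"out_port", "load_cr3"}    # ops the guest must not run directly

class Hypervisor:
    def __init__(self):
        self.log = []

    def vm_exit(self, op):
        # Emulate the privileged operation on the guest's behalf,
        # keeping the guest isolated from the real hardware.
        self.log.append(f"emulated {op}")

def run_guest(instructions, hypervisor):
    for op in instructions:
        if op in PRIVILEGED:
            hypervisor.vm_exit(op)       # trap: control moves to the hypervisor
        else:
            pass                         # unprivileged ops run at native speed

hv = Hypervisor()
run_guest(["add", "load_cr3", "mul", "out_port"], hv)
print(hv.log)  # ['emulated load_cr3', 'emulated out_port']
```

Hardware assistance moves the "is this privileged?" check and the mode switch into the CPU itself, which is why it shrinks the overhead that software-based hypervisors pay for every intercepted operation.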
Pros:
◦ High Performance: Hardware assistance minimizes the performance overhead of virtualization.
◦ Compatibility: Can run unmodified guest operating systems.
Cons:
◦ Requires modern CPUs with virtualization support.
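On Linux, CPU support for hardware-assisted virtualization shows up as the vmx (Intel VT-x) or svm (AMD-V) flag in /proc/cpuinfo. A small helper to check for it — the parsing logic is a sketch of the common convention, not an official API:

```python
# Check /proc/cpuinfo flags for hardware virtualization support.
# "vmx" indicates Intel VT-x; "svm" indicates AMD-V.

def hw_virtualization_support(cpuinfo_text):
    """Return 'Intel VT-x', 'AMD-V', or None based on the flags line."""
    for line in cpuinfo_text.splitlines():
        if line.startswith("flags"):
            flags = set(line.split(":", 1)[1].split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None

# Typical usage on a Linux host:
# with open("/proc/cpuinfo") as f:
#     print(hw_virtualization_support(f.read()))
```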