Intro to Cloud Computing

Questions and Answers

How does a distributed computing system typically address large-scale problems over the Internet?

  • By isolating computations to individual machines to minimize network usage.
  • By focusing only on problems that can be solved with minimal data transfer.
  • By utilizing multiple computers to solve different parts of the problem. (correct)
  • By using a single, powerful computer to process all data.

Which factor has primarily driven the evolutionary changes in distributed and cloud computing over the last 30 years?

  • Advancements in cooling technologies for data centers.
  • Applications with variable workloads and large datasets. (correct)
  • Introduction of new programming languages.
  • Decreased costs of network hardware.

What is the primary focus shift in data-intensive computing?

  • From network speed to CPU speed.
  • From data storage to data compression algorithms.
  • From computation to the data itself. (correct)
  • From data security to data accessibility.

What does Gilder's law predict regarding network bandwidth?

  • Network bandwidth doubles each year. (correct)

What vision did Fernando Corbató of MIT's Multics operating system have in 1965?

  • A computer facility operating like a power or water company. (correct)

Which of the following is a key characteristic of today's clouds?

  • On-demand access with pay-as-you-go pricing. (correct)

Which of the following best describes the concept of utility computing?

  • Using computing resources from a paid service provider. (correct)

How does a single-site cloud, also known as a 'Datacenter', organize its components?

  • Compute nodes grouped into racks, connected by switches, with a hierarchical network topology. (correct)

What is a key difference between cloud computing and distributed computing?

  • Cloud computing offers utility or service computing, while distributed computing focuses on solving problems using multiple autonomous computers. (correct)

Which of the following is a primary characteristic of 'Massive Scale' in the context of cloud computing?

  • Data centers containing tens to hundreds of thousands of servers. (correct)

Which of the following is an example of Infrastructure as a Service (IaaS)?

  • Amazon EC2 and S3 (correct)

What is a key characteristic of Platform as a Service (PaaS)?

  • It offers flexible computing and storage infrastructure coupled with a software platform. (correct)

How does cloud computing address the nature of large datasets?

  • By focusing on storing data at datacenters and using compute nodes nearby to run computation services. (correct)

What programming paradigms are commonly associated with cloud computing?

  • MapReduce/Hadoop, NoSQL/Cassandra/MongoDB (correct)

What was the primary origin of virtualization technology?

  • IBM in the 1960s. (correct)

What is the role of a Virtual Machine Monitor (VMM) in virtualization?

  • To manage the physical hardware and allow multiple operating systems to run concurrently. (correct)

What represents a Virtual Machine (VM)?

  • Operating System + Applications + Virtual resources. (correct)

What environment does a Virtual Machine Monitor (VMM) provide?

  • An environment essentially identical with the original machine. (correct)

In virtualization, a key benefit is running multiple VMs on a single machine. What is the primary advantage of this?

  • Reduced cost and improved manageability (correct)

Why is the security aspect important in virtualization?

  • The OS and application are encapsulated, and more easily contained if bugs or malicious behavior happen to a virtual machine. (correct)

In bare-metal virtualization, what is the role of the hypervisor?

  • To manage all hardware resources and support the execution of virtual machines. (correct)

Which hypervisor(s) use the bare-metal virtualization model?

  • Citrix XenServer and VMware ESX (correct)

In a hosted virtualization model, what primarily manages the hardware resources?

  • A full-fledged host OS (correct)

What role does the hardware play with Kernel-based Virtual Machine (KVM)?

  • Hardware-assisted virtualization through the use of a QEMU hardware emulator (correct)

What is the key function of binary translation in virtualization?

  • Rewriting the virtual machine binary to avoid problematic instructions. (correct)

How does paravirtualization improve system performance?

  • By modifying the guest OS to make explicit calls to the hypervisor (correct)

Why is the Memory Management Unit (MMU) virtualized?

  • To allow multiple virtual machines to run on a single system. (correct)

What best describes Memory Virtualization using Shadow Page Tables?

  • It maps guest physical memory to the actual machine memory, using this to accelerate mappings (correct)

What is a challenge that device virtualization needs to address?

  • A lack of standard specification of device interfaces. (correct)

What is a primary characteristic of the Passthrough Model in device virtualization?

  • Device virtualization requires that a VM has exclusive use of the device (correct)

What is a primary characteristic of the Hypervisor-Direct Model regarding device accesses?

  • The VMM intercepts all the accesses (correct)

What is a key feature of the Split-Device Driver Model in device virtualization?

  • All of the above (correct)

What is the goal of hotspot mitigation techniques?

  • Automating the task of monitoring and detecting hotspots (correct)

What would be something common about both the White and Gray box approaches?

  • Resource information (correct)

What are enterprise data centers composed of?

  • Large clusters of servers with network attached storage devices (correct)

Why is multi application support an indicator for data centers?

  • Multi-tier, may span multiple servers (correct)

The nature of the demand of different tasks requires dynamic resource allocation. The resource allocation has three stages:

  • Virtual machine migration (correct)

Web applications and enterprise systems see dynamic workloads, and the resulting workload fluctuations are caused by multiple factors. Which of the following is NOT a factor?

  • Time-of-year (correct)

Dynamic provisioning relies on three approaches:

  • Over-provisioning and dynamic provisioning (correct)

Which question(s) do Research Challenges respond to?

  • When to migrate? Where to move to? How much of each resource to allocate? How much information is needed to make decisions? (correct)

When performing application migration, the VM migration is performed for dynamic provisioning. What action is required?

  • Migration is transparent to applications (correct)

What action does the Sandpiper Nucleus perform?

  • Monitors resources (correct)

The Gray-box has access to which metrics?

  • All the below (correct)

There are multiple stages to VM migration. In which stage is an ARP generated to redirect traffic to the new host?

  • Stage 3 (correct)

Flashcards

What is Cloud Computing?

Computing over a network, often the Internet, in which dynamically scalable and often virtualized resources are provided as a service.

What drives cloud evolution?

Evolutionary changes driven by applications with variable workloads and large datasets.

What is Distributed computing?

A system using multiple computers to solve large-scale problems over the Internet, becoming data-intensive and network-centric.

Computing Clouds emergence

Systems built with distributed computing technologies demanding high throughput

High-throughput computing (HTC)

A type of computing appearing as computer clusters used for computational grids.

What are private clouds?

Accessible only to company employees.

What are public clouds?

Provides service to any paying customer.

What is Utility computing?

A computing model where clients plug into computing resources like utilities.

Massive Scale

Very large data centers with thousands of servers.

On-demand Access

Pay-as-you-go access to computing resources without upfront commitment.

Data-intensive Nature

Computing where the focus shifts from computation to data.

What is Amazon EC2?

AWS: Elastic Compute Cloud; pay per CPU hour.

What is Amazon S3?

AWS: Simple Storage Service; pay per GB-month.

What is Hardware as a Service (HaaS)?

Rent barebones hardware machines.

What is Infrastructure as a Service (IaaS)?

Access flexible computing and storage infrastructure using virtualization.

Datacenter

A single-site cloud: compute nodes grouped into racks, connected by switches in a network topology.

Geographically Distributed Cloud

A cloud made up of multiple such sites, each perhaps with a different structure and services.

Racks

Compute nodes grouped into racks

What is Virtualization?

A virtual machine is an efficient, isolated duplicate of the real machine.

Advantages of Virtualization

Allows the concurrent execution of multiple operating systems on a single physical machine.

Consolidation

Run multiple virtual machines, with their operating systems on a single platform.

Migration

Migrate OS and applications from one physical machine to another.

Security in Virtualization

The OS and applications are encapsulated in a virtual machine, which makes bugs or malicious behavior easier to contain.

What is the Hypervisor?

The virtual machine monitor (VMM): the layer that manages hardware resources and supports the execution of VMs.

Hardware Resources

In the bare-metal virtualization model, the hypervisor manages the hardware resources and supports the execution of the VMs.

Virtualization used by Xen

The bare-metal model is adopted by Xen (Citrix XenServer).

Full-fledged host OS

In the hosted virtualization model, a full-fledged host OS manages the hardware resources.

What is data protection?

Commodity hardware has more than two protection levels to secure data.

Ring 3

The least privileged ring, where applications reside.

Ring 0

The most privileged ring; code running here can access all resources and execute any hardware-supported instruction.

Legal operation

When a trapped operation is legal, the hypervisor emulates the behavior the guest OS was expecting from the hardware.

Virtual Machine binary

Binary translation rewrites the virtual machine binary so that it never issues the seventeen problematic instructions.

What is Paravirtualization?

A technique used to improve performance: the guest operating system is modified to make explicit calls (hypercalls) to the hypervisor.

MMU

Memory management unit; it is virtualized to support guest operating systems when multiple VMs run on a single system.

VMM

Virtual machine monitor; it maps guest physical memory to the actual machine memory and uses shadow page tables to accelerate the mappings.

TLB

Translation lookaside buffer: hardware used to map virtual memory directly to machine memory.

Hotspot Mitigation

Automated monitoring and detection of hotspots, determining a new mapping of physical to virtual resources, and initiating the necessary migrations.

Black-box approach

OS- and application-agnostic

Gray-box approach

Exploits OS- and application-level statistics

Data centers

Large clusters of servers

Study Notes

  • Introduction to Cloud Computing

Content of the Lecture

  • The lecture covers a brief introduction to Cloud Computing, focusing on:
    • Why Clouds are needed
    • Defining what a Cloud is
    • What's new in today's Clouds
    • Distinguishing Cloud Computing from previous distributed systems

Scalable Computing Over the Internet

  • Distributed and cloud computing have seen evolutionary changes over 30 years
  • These changes are driven by applications with variable workloads and large data sets
  • Machine architectures, operating system platforms, network connectivity, and application workloads have all changed over this period
  • Distributed computing systems use multiple computers to solve large-scale problems over the Internet, becoming data-intensive and network-centric
  • Computing clouds demand high-throughput computing (HTC) systems built with distributed computing technologies
  • High-throughput computing (HTC) appears as computer clusters, service-oriented architecture, computational grids, peer-to-peer networks, Internet clouds, and the future Internet of Things

The Hype of Cloud: Forecasting

  • Gartner in 2009 forecasted that cloud computing revenue would exceed $150 billion by 2013, representing 19% of IT spending by 2015.
  • IDC in 2009 projected that spending on IT cloud services would triple in 5 years, reaching $42 billion.
  • Forrester in 2010 estimated cloud computing would increase from $40.7 billion in 2010 to $241 billion in 2020.
  • Companies and federal and state governments now use cloud computing (fbo.gov).

Cloud Providers

  • AWS (Amazon Web Services) includes:
    • EC2 (Elastic Compute Cloud)
    • S3 (Simple Storage Service)
    • EBS (Elastic Block Store)
  • Microsoft Azure is a cloud service provider
  • Google Compute Engine/App Engine is another cloud provider

Categories of Clouds

  • Clouds can be either public or private
  • Private clouds are only accessible by company employees
  • Public clouds provide services to paying customers
  • Amazon S3 stores datasets and charges per GB-month
  • Amazon EC2 allows uploading/running OS images and charges per CPU hour
  • Google App Engine/Compute Engine facilitates application development within their framework and data uploading for execution

Customer Savings with Cloud Computing

  • AWS is reported to enable a new server to be up and running in three minutes
  • This is in contrast to seven and a half weeks deploying a server internally
  • A 64-node Linux cluster can be online in five minutes using AWS, compared to three months internally
  • Online services are reported to reduce IT operational costs by roughly 30% of spending
  • Private clouds of virtual servers in datacenters are reported to save companies crores of rupees annually through shared computing and storage
  • Hundreds of startups can leverage large computing resources without buying their own machines

What is a Cloud?

  • Advances in virtualization enable the development of Internet clouds as a new computing paradigm
  • There are dramatic differences between developing software for millions to use as a service versus software to run on individual PCs
  • In 1984, John Gage of Sun Microsystems coined the slogan, "The network is the computer."
  • In 2008, David Patterson of UC Berkeley stated, "The data center is the computer."
  • More recently, Rajkumar Buyya of the University of Melbourne said, "The cloud is the computer."
  • Clouds are viewed as grids or clusters with virtualization, processing large data sets from the Internet, social networks, and IoT

Definition of a Cloud

  • A single-site cloud, also known as a "Datacenter", comprises:
    • Compute nodes grouped into racks
    • Switches connecting the racks
    • A network topology, for example a hierarchical one
    • Storage nodes connected to the network
    • A front-end for job submission and client request reception
    • Often called "three-tier architecture"
    • Software Services
  • A geographically distributed cloud consists of multiple such sites, each with different structures and services

Computing Paradigm Distinctions

  • Cloud computing overlaps with distributed computing
  • Distributed computing: A distributed system consists of multiple autonomous computers, each having its own memory, that communicate through message passing
  • Cloud Computing: Clouds are built with physical/virtualized resources over large data centers in distributed systems; a form of utility or service computing

"A Cloudy History of Time"

  • Historical progression of computing:
    • 1940s: Vacuum tubes and mechanical relays in machines like ENIAC, ORDVAC and ILLIAC (the first large datacenters)
    • 1950s: Computer development
    • 1960s: Honeywell, Xerox, and the timesharing and data processing industry developing UNIVAC (UNIVersal Automatic Computer) technology
    • 1970s: Supercomputers
    • 1980s: OS Grid developments
    • 1990s: P2P systems development
    • 2000s: Cloud developments
  • Doubling periods: storage (12 months), bandwidth (9 months), and CPU compute capacity (18 months)
  • Moore's law indicates processor speed doubles every 18 months
  • Gilder's law indicates network bandwidth has doubled each year
  • Bandwidth capacity grew from mostly 56 Kbps links nationwide in 1985 to widespread Tbps links in 2015
  • Today's PCs have TBs of disk capacity, surpassing a 1990 supercomputer
  • Focuses on autonomic operations for dynamic discovery
  • Major computing paradigms are composable with QoS and SLAs
  • In 1965, Fernando Corbató of MIT envisioned computing as "like a power company or water company"
  • Utility computing offers a business model where customers receive computing resources from a paid service provider
  • All grid/cloud platforms are regarded as utility service providers
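
As a rough worked example of the doubling periods listed above: a quantity that doubles every T months grows by a factor of 2^(t/T) after t months. The sketch below applies that formula to the three periods quoted in these notes. The function name `growth_factor` and the 36-month horizon are illustrative assumptions, not part of the lecture.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Factor by which a quantity grows if it doubles every `doubling_period_months`."""
    return 2 ** (months / doubling_period_months)


# Doubling periods quoted in the notes, in months.
DOUBLING_PERIODS = {"storage": 12, "bandwidth": 9, "cpu": 18}

# Assumed horizon of 36 months, chosen so that every exponent is a whole number.
HORIZON_MONTHS = 36

for resource, period in DOUBLING_PERIODS.items():
    factor = growth_factor(HORIZON_MONTHS, period)
    print(f"{resource}: grows {factor:.0f}x in {HORIZON_MONTHS} months")
```

Under these periods, bandwidth outgrows both storage and CPU capacity over the same horizon, which is one way to read the shift toward network-centric, data-intensive computing.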

Features of Cloud Computing

  • Massive Scale: large data centers hold tens to hundreds of thousands of servers, so computations can run across as many servers as the application needs
  • On-Demand Access: a pay-as-you-go model eliminates the need for upfront commitment and allows anyone access
  • Data-Intensive Nature: what was MBs has become TBs, PBs, and XBs (exabytes) for things like daily logs, forensics, and web data
  • New Cloud Programming Paradigms: MapReduce/Hadoop, NoSQL/Cassandra/MongoDB, and others
  • Combinations of these features give rise to novel and unsolved distributed computing problems in cloud computing

Massive Scale in Cloud Computing

  • Facebook had 30K servers in 2009, which increased to 60K in 2010 and 180K in 2012
  • Microsoft had 150K machines with a growth rate of 10K per month, of which 80K ran Bing
  • Microsoft Cosmos had 110K machines across 4 sites in 2013
  • Yahoo! had 100K machines split into clusters of 4000
  • AWS EC2 had 40K machines with 8 cores/machine
  • eBay had 50K machines, and HP had 380K machines in 180 datacenters (DCs)

On-Demand Access (*aaS Classification)

  • On-demand is like renting rather than buying
    • AWS Elastic Compute Cloud (EC2) charges a few cents to a few dollars per CPU hour
    • AWS Simple Storage Service (S3) charges a few cents per GB-month
  • HaaS (Hardware as a Service): access barebones hardware machines, useful for your own cluster, but has security risks
  • IaaS (Infrastructure as a Service): access flexible computing and storage infrastructure
    • One way of achieving this is through virtualization and it subsumes HaaS
    • Ex: Amazon Web Services (AWS: EC2 and S3), OpenStack, Eucalyptus, Rightscale, Microsoft Azure, Google Cloud
  • PaaS (Platform as a Service): Get access to flexible computing and storage infrastructure, coupled with a software platform (often tightly coupled)
    • Ex: Google's AppEngine (Python, Java, Go)
  • SaaS: Software as a Service: access software services when required, subsuming SOA (Service Oriented Architectures)
    • Ex: Google Docs, MS Office on demand

Data-Intensive Computing

  • Computation-Intensive Computing: MPI-based, high-performance computing, Grids; Typically run on supercomputers like NCSA Blue Waters
  • Data-Intensive Computing: typically stores data at datacenters and uses compute nodes nearby, which run computation services
  • In data-intensive computing, focus shifts from computation to data
  • CPU utilization is secondary, I/O (disk and/or network) is the important resource metric

New Cloud Programming Paradigms

  • Easy to write and run highly parallel programs in new cloud programming paradigms
  • Google uses MapReduce and Sawzall
  • Amazon provides Elastic MapReduce service (pay-as-you-go)
  • Google's indexing pipeline is a chain of 24 MapReduce jobs; Google ran ~200K jobs processing 50 PB/month (in 2006)
  • Yahoo! uses Hadoop + Pig; its WebMap is a chain of several MapReduce jobs that processes 300 TB of data on 10K cores for many tens of hours (~2008)
  • Facebook uses Hadoop + Hive, with ~300 TB total and 2 TB/day added (in 2008), and runs 3K jobs processing 55 TB/day
  • MySQL is an industry standard, but the NoSQL store Cassandra is cited in the lecture as 2400 times faster
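
To make the MapReduce paradigm mentioned above more concrete, here is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to word counting. It is only an illustration of the idea: it does not use the Hadoop API, and the function names `map_phase`, `shuffle`, `reduce_phase`, and `word_count` are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def map_phase(document: str) -> List[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single count."""
    return key, sum(values)


def word_count(documents: List[str]) -> Dict[str, int]:
    # In a real cluster each document (or split) would be mapped on a
    # different node; here everything runs sequentially in one process.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


counts = word_count(["the cloud is the computer", "the network is the computer"])
print(counts)
```

Because each map call looks at only one document and each reduce call at only one key, both phases can be spread over many machines, which is what makes such programs easy to parallelize.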

Cloud Categories

  • Clouds can be either public or private
  • Private clouds are accessible to company employees only
  • Examples of private cloud vendors are VMware, Microsoft Azure, and Eucalyptus
  • Public clouds provide services to any paying customer
  • Examples of public cloud services are Amazon EC2, Google AppEngine, Gmail, Office365, and Dropbox

Single-Site Cloud: Outsource or Own?

  • Startups tend to use clouds a lot
  • Cloud providers benefit most monetarily from storage
  • Example: a medium-sized organization wishes to run a cloud service for M months
  • The service requires 128 servers (1024 cores) and 524 TB of storage
  • Example costs:
    • S3: $0.12 per GB month
    • EC2: $0.10 per CPU hour
  • Owning can be preferable if the service runs over a long period
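
The outsourcing side of this comparison can be worked through with the figures above. The sketch below is a back-of-the-envelope calculation only. It assumes 1 TB = 1000 GB, a 30-day month, and all 1024 cores running around the clock; none of these assumptions is stated in the notes, and the cost of owning is not modeled because the notes give no figures for it.

```python
# Figures taken from the notes.
S3_DOLLARS_PER_GB_MONTH = 0.12
EC2_DOLLARS_PER_CPU_HOUR = 0.10
STORAGE_TB = 524
CORES = 1024

# Assumptions made for this example (not stated in the notes).
GB_PER_TB = 1000
HOURS_PER_MONTH = 24 * 30


def outsourcing_cost(months: int) -> dict:
    """Estimated cost in dollars of renting storage and compute for `months` months."""
    storage = S3_DOLLARS_PER_GB_MONTH * STORAGE_TB * GB_PER_TB * months
    compute = EC2_DOLLARS_PER_CPU_HOUR * CORES * HOURS_PER_MONTH * months
    return {"storage": storage, "compute": compute, "total": storage + compute}


monthly = outsourcing_cost(1)
print(f"storage ${monthly['storage']:,.0f}  "
      f"compute ${monthly['compute']:,.0f}  "
      f"total ${monthly['total']:,.0f} per month")
```

Under these assumptions the monthly bill is roughly $63K for storage plus $74K for compute. Because the rental cost scales linearly with M while purchased hardware is largely a one-time expense, a long enough timeframe favors owning.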

Conclusion

  • Clouds build on many previous generations of distributed systems
  • Characteristics of cloud computing include massive scale, on-demand access, data-intensive nature, and new programming paradigms

Virtualization

What is Virtualization?

  • Virtualization originated in the 1960s at IBM
  • It allows concurrent execution of multiple Operating Systems (OSs) and their applications on the same physical machine
  • Each OS uses virtual resources and behaves as if it "owns" the hardware resources
  • A Virtual Machine (VM) consists of OS, applications, and virtual resources (guest domain)
  • The virtualization layer manages physical hardware through a virtual machine monitor, or hypervisor

Defining Virtualization

  • A virtual machine is an efficient, isolated duplicate of the real machine. It is supported by a virtual machine monitor (VMM), which:
    • Provides an environment essentially identical to the original machine
    • Runs programs at worst with only a minor decrease in speed
    • Maintains complete control of system resources
  • VMM Goals: Fidelity, Performance, and Safety & Isolation

Benefits of Virtualization

  • Consolidation: multiple virtual machines, each with its own operating system and applications, run on a single physical machine
    • This decreases cost and improves manageability, with lower electricity bills and fewer administrators
  • Migration: an OS and its applications are moved from one physical machine to another
    • This improves availability and reliability
  • Security: the OS and applications are encapsulated in a VM, so bugs or malicious behavior are contained to that VM, which minimizes damage
  • Other benefits include debugging and affordable support for legacy OSs

Virtualization Models

  • There are two popular models for virtualization
    • Bare-metal hypervisor or Native Hypervisor (Type 1)
      • The hypervisor runs directly on the shared hardware, and the virtual machines run on top of the hypervisor
    • Hosted Hypervisor (Type 2)
      • The hypervisor runs on top of a host OS, which in turn runs on the shared hardware; the guest OSs run on top of the hypervisor

Bare-Metal Virtualization Model

  • Bare-metal hypervisor (Type 1) components:
    • VMM (hypervisor) manages hardware resources and supports the execution of VMs
    • Privileged service VM deals with devices and other configuration/management tasks

Bare-Metal Virtualization Model details

  • This model is adopted by Xen and by VMware's ESX hypervisor
  • Xen (open source):
    • VMs are referred to as domains: a privileged domain (dom0) and guest VMs (domUs)
    • Xen is the hypervisor, and drivers run in the privileged domain
  • ESX (VMware): VMware still owns the largest percentage of virtualized server cores
    • Drivers for the different devices run in the ESX hypervisor itself, and VMware exports interfaces for a third-party community of developers

Hosted Virtualization Model

  • Hosted Hypervisor (Type 2) architecture:
    • In this model the lowest level is a full-fledged host OS that manages all hardware resources
    • The host OS integrates a VMM module that is responsible for providing the virtual machines with their virtual platform interface and for managing all context switching and scheduling

Hosted Virtualization Model Example

  • An example is KVM (Kernel-based VM):
    • Based on Linux
    • The KVM kernel module works together with QEMU, a hardware emulator, to provide hardware virtualization
    • Leverages a huge existing open-source community

Hardware Protection Levels

  • Commodity hardware has multiple protection levels
  • The x86 architecture uses four protection levels known as rings
  • Ring 3 is used for applications because it is the least privileged
  • Ring 0 has the highest privilege and can access all resources, executing any hardware-supported instruction

x86 Hardware w/o Virtualization

  • The x86 architecture has four privilege levels, rings 0 through 3
  • The rings control access to the computer hardware
  • User-level applications run in ring 3, while the operating system needs direct access to hardware and runs in ring 0

Processor Virtualization (Trap-and-Emulate)

  • Guest instructions are executed directly by the hardware
    • Non-privileged operations: run at hardware speed
    • Privileged operations: trap to the hypervisor
  • The hypervisor determines what needs to occur:
    • If the operation is illegal, terminate the VM
    • If the operation is legal, emulate the behavior the guest OS was expecting from the hardware
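
The trap-and-emulate control flow described above can be illustrated with a toy simulation. This is a sketch only: the instruction names, the `PRIVILEGED` and `LEGAL_FOR_GUEST` sets, and the `Hypervisor` class are hypothetical and do not correspond to any real hypervisor's interface.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical instruction names, used only for this illustration.
PRIVILEGED = {"set_timer", "disable_interrupts", "halt_machine"}
# Assumed policy: privileged operations a guest is allowed to request.
LEGAL_FOR_GUEST = {"set_timer", "disable_interrupts"}


class VMTerminated(Exception):
    """Raised when the hypervisor terminates a VM for an illegal operation."""


@dataclass
class Hypervisor:
    log: List[str] = field(default_factory=list)

    def trap(self, vm_name: str, instruction: str) -> None:
        """Entered whenever a guest issues a privileged instruction."""
        if instruction not in LEGAL_FOR_GUEST:
            self.log.append(f"terminate {vm_name}: {instruction}")
            raise VMTerminated(vm_name)
        # Legal operation: emulate what the guest expected from the hardware.
        self.log.append(f"emulate {instruction} for {vm_name}")


def run_guest(vm_name: str, instructions: List[str], hypervisor: Hypervisor) -> int:
    """Run guest instructions; return how many executed directly on 'hardware'."""
    executed_directly = 0
    for instruction in instructions:
        if instruction in PRIVILEGED:
            hypervisor.trap(vm_name, instruction)  # privileged: trap
        else:
            executed_directly += 1                 # non-privileged: hardware speed
    return executed_directly


hv = Hypervisor()
print(run_guest("vm1", ["add", "set_timer", "load"], hv), hv.log)
```

The point of the design is that the common case (non-privileged instructions) never involves the hypervisor, so the cost of virtualization is paid only on the rare privileged operations.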

Trap-and-Emulate Problems

  • Trap-and-emulate ran into problems on x86 CPUs before 2005
  • These CPUs had 4 rings but no root/non-root modes for the hypervisor
  • The hypervisor resides in ring 0, while the guest OS runs in ring 1
  • 17 privileged instructions did not trap and instead failed silently, for example POPF/PUSHF, which access the interrupt enable/disable bit in a privileged register
  • The hypervisor does not know the guest attempted the change, so it cannot emulate it, and the guest OS does not know the instruction failed, so it assumes the setting changed

Binary Translation

  • The main idea is to rewrite the VM binary so that it never issues the 17 problematic instructions of pre-2005 x86 CPUs
  • The technique was pioneered by Mendel Rosenblum's group at Stanford and commercialized as VMware
  • Rosenblum was named an ACM Fellow for his work on virtualization

Binary Translation Details

  • Goal: full virtualization with a guest OS that is not modified
  • Approach: dynamic binary translation
  1. Inspect code blocks to be executed
  2. If needed, translate them to an alternate instruction sequence
  3. Otherwise, run them at hardware speed, so that translation is paid for only a fraction of the code
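
The dynamic binary translation approach described above can be sketched as a small translator that inspects each code block, rewrites it only when it contains a problematic instruction, and otherwise passes it through unchanged. The instruction strings, the `emulate_` prefix, and the `BinaryTranslator` class are hypothetical. Caching translated blocks is an assumption added here as a common way to keep the translation cost low; it is not spelled out in the notes.

```python
from typing import Dict, Tuple

# Hypothetical stand-ins for the instructions that fail silently instead of trapping.
PROBLEMATIC = {"popf", "pushf"}

Block = Tuple[str, ...]


class BinaryTranslator:
    def __init__(self) -> None:
        self.cache: Dict[Block, Block] = {}  # original block -> translated block
        self.translations = 0                # how many blocks were actually rewritten

    def translate(self, block: Block) -> Block:
        """Replace each problematic instruction with an emulation call."""
        return tuple(f"emulate_{ins}" if ins in PROBLEMATIC else ins for ins in block)

    def prepare(self, block: Block) -> Block:
        """Return the instruction sequence that should actually be executed."""
        # Step 1: inspect the block that is about to run.
        if not any(ins in PROBLEMATIC for ins in block):
            return block  # Step 3: safe block, run unmodified at hardware speed.
        # Step 2: translate, reusing an earlier translation when one exists.
        if block not in self.cache:
            self.cache[block] = self.translate(block)
            self.translations += 1
        return self.cache[block]


translator = BinaryTranslator()
print(translator.prepare(("mov", "add")))
print(translator.prepare(("mov", "popf")))
```

The guest binary itself is never modified on disk in this sketch; only the sequence handed to the "hardware" changes, which matches the goal of running an unmodified guest OS.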

Paravirtualization

  • Goal: improve performance by giving up on the notion of unmodified guests
  • Approach: modify the guest OS to make explicit calls to the hypervisor
  • The explicit calls are known as hypercalls
  • The guest OS is modified so that it knows it is running as a guest on a hypervisor rather than on bare metal

Paravirtualization Review

  • With paravirtualization, less than 2% of the guest OS code may need modification

Memory Virtualization

  • To run multiple virtual machines on a single system, one has to virtualize the MMU (memory management unit) to support the guest OS
  • The VMM maps guest physical memory to the machine memory, using shadow page tables to make the mappings faster
  • The VMM uses TLB (translation lookaside buffer) hardware to directly map virtual memory to machine memory
    • This avoids levels of translation on every access for speed

Shadow Page Table

  • Address translation converts a virtual address (VA) to a physical address (PA)
  • On a TLB hit, the VA is translated directly
  • On a TLB miss, the VA is converted via the page table (PT)
  • Under virtualization, the hardware PT is really the S PT, where S stands for shadow and PT stands for page table

Memory Virtualization using Full Virtualization

  • Full virtualization aspects:
    • All guests expect contiguous physical memory, starting at 0
    • Handles different virtual vs physical vs machine addresses and page frame numbers
    • Still uses hardware MMU, TLB
  • Option 1:
    • Guest page table (VA => PA)
    • Hypervisor (PA => MA)
      • But this is too expensive
  • Option 2:
    • Hypervisor shadow page table (VA => MA), managed by the hypervisor
    • The hypervisor keeps the shadow page table consistent with the guest page table, for example on context switches and when the guest writes new mappings
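
The relationship between the two options can be shown with a small sketch: the shadow page table is the composition of the guest's VA => PA mapping with the hypervisor's PA => MA mapping, computed once so that a lookup needs a single step. Page numbers, dictionary-based tables, and the function names are illustrative assumptions; real page tables are multi-level hardware structures.

```python
from typing import Dict

PageTable = Dict[int, int]


def translate_two_level(va: int, guest_pt: PageTable, pa_to_ma: PageTable) -> int:
    """Option 1: two lookups on every access (VA => PA, then PA => MA)."""
    return pa_to_ma[guest_pt[va]]


def build_shadow_page_table(guest_pt: PageTable, pa_to_ma: PageTable) -> PageTable:
    """Option 2: precompute VA => MA so each access needs one lookup."""
    return {va: pa_to_ma[pa] for va, pa in guest_pt.items() if pa in pa_to_ma}


# Hypothetical page numbers.
guest_page_table = {0: 2, 1: 0, 2: 1}      # guest virtual page -> guest physical page
hypervisor_map = {0: 40, 1: 41, 2: 57}     # guest physical page -> machine page

shadow = build_shadow_page_table(guest_page_table, hypervisor_map)
print(shadow)

# When the guest installs a new mapping, the hypervisor must refresh the
# shadow table, otherwise the two views become inconsistent.
guest_page_table[3] = 2
shadow = build_shadow_page_table(guest_page_table, hypervisor_map)
print(shadow[3])
```

Rebuilding the whole table on every guest update is done here only for brevity; the sketch is meant to show why consistency maintenance is the hypervisor's main burden in this scheme.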

Memory Virtualization using Para-Virtualization

  • Paravirtualization:
    • The guest OS knows it has been virtualized, so it does not require contiguous physical memory
    • The guest registers its page tables with the hypervisor
    • It can "batch" page table updates to reduce VM exits
    • Other optimizations are possible
  • Overheads are reduced on later platforms

Device Virtualization

  • For CPUs and memory:
    • There is some diversity, but the interface is standardized at the ISA level
  • For devices:
    • There is higher diversity and a lack of standardized specifications
  • There are three key models for device virtualization:
    • Passthrough model
    • Hypervisor-direct model
    • Split device driver model

Passthrough Model device virtualization

  • A VMM-level driver configures device access permissions
    • The VM is provided with exclusive access to the device
    • This is also called the "VMM-bypass" model
    • Device sharing becomes hard
    • An exact match is needed between the physical device and the type of device the guest VM expects

Hypervisor-Direct Model

  • In the hypervisor-direct model:
    • The hypervisor intercepts all device accesses so that it can emulate the device operation
    • The VM is decoupled from the physical device
  • Key benefits:
    • Decoupling from the physical device
    • Migration and device specifics become easier to deal with
  • Downside: high ecosystem complexity in the hypervisor

Split Device Driver Model

  • Device access is split between a front-end driver in the guest VM and a back-end driver in a service VM
  • This eliminates emulation overhead and allows better management of shared devices

Conclusion

  • This lecture defined virtualization and discussed the main approaches to it
  • It covered processor virtualization, memory virtualization, and device virtualization as used in virtualization solutions such as Xen, KVM, and VMware

Hotspot Mitigation for Virtual Machine Migration

Content of the Lecture

  • This lecture discusses techniques for automated hotspot detection and mitigation, involving:
    • Monitoring
    • Determining new physical-to-virtual resource mappings
    • Initiating necessary migrations
  • Two main approaches:
    • Black-box: fully OS- and application-agnostic
    • Gray-box: exploits OS- and application-level statistics

Enterprise Data Centers

  • Data Centers are composed of:
    • Large clusters of servers
    • Network attached storage devices
  • Multiple applications per server:
    • Shared hosting environment
    • Applications may be multi-tier and span multiple servers
  • They allocate resources to meet Service Level Agreements (SLAs)
  • Virtualization increasingly common

Benefits of Virtualization

  • Run multiple applications on one server
  • Each application runs in its own virtual machine
  • Maintains isolation and security
  • Allows rapid adjustment of resource allocations via CPU priority and memory allocation
  • VM migration provides:
    • "Transparency" to application
    • No downtime, but incurs overhead

Data Center Workloads

  • Web applications, enterprise systems, e-commerce sites see dynamic workloads
  • Caused by incremental growth, time-of-day effects, and flash crowds
  • Provisioning resources to meet demand while satisfying SLAs is a complex task

Provisioning Methods

  • Hotspots form if resource demand exceeds capacity
  • Over-provisioning:
    • Allocate for peak load, wastes resources
    • Unsuitable for dynamic workloads or when requirements are difficult to predict
  • Dynamic provisioning: adjust based on workload
    • Often done manually but becoming easier with virtualization

Problem Statement

  • How can we automatically monitor resource usage for hotspot detection, mitigation, and initiating the necessary migrations in virtualized data centers?

Hotspot Mitigation Problem

  • The migration manager uses an algorithm to determine which virtual servers to migrate, and where, to avoid threshold violations
    • Determining a new VM mapping is NP-hard: the NP-hard bin packing problem can be reduced to it
    • Even determining whether a valid packing exists is NP-hard
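
Because an exact solution is intractable in general, placement is usually done with a heuristic. The sketch below uses first-fit decreasing, a standard bin packing heuristic chosen here only for illustration; it is not presented as the lecture's algorithm, and it packs a single resource dimension with hypothetical VM names and loads.

```python
def first_fit_decreasing(vm_loads, capacity):
    """Place VMs on servers, largest load first, into the first server that fits.

    Returns (placement, server_loads), where placement maps VM name -> server index.
    """
    server_loads = []
    placement = {}
    by_load = sorted(vm_loads.items(), key=lambda item: item[1], reverse=True)
    for vm, load in by_load:
        if load > capacity:
            raise ValueError(f"{vm} does not fit on any server")
        for index, used in enumerate(server_loads):
            if used + load <= capacity:
                server_loads[index] += load
                placement[vm] = index
                break
        else:
            # No existing server has room, so open a new one.
            server_loads.append(load)
            placement[vm] = len(server_loads) - 1
    return placement, server_loads


vm_loads = {"a": 60, "b": 50, "c": 40, "d": 30, "e": 20}  # hypothetical loads
placement, server_loads = first_fit_decreasing(vm_loads, capacity=100)
```

A real migration manager would also have to weigh several resources at once (CPU, memory, network) and the cost of each move, which this sketch ignores.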

Research Challenges with VM Migration

  • Automatically detecting hotspots and mitigating them through VM migration
  • When to migrate?
  • Where to move to?
  • How much resource to allocate?
  • How much information is needed?
  • Dynamic replication:
    • A dynamic provisioning approach in which the number of servers allocated to an application is varied
  • Dynamic slicing:
    • The fraction of a server allocated to an application is varied
  • Application migration:
    • VM migration is performed transparently to the applications running within the VM

Sandpiper Architecture

  • Nucleus:
    • Monitors resources
    • Reports to the control plane
    • One per server
    • Runs in a special virtual server (domain 0 in Xen)
  • Control plane:
    • Centralized server
    • Hotspot detection determines when to migrate
    • Decides how much to allocate
    • Decides where to migrate to

Black-Box Vs Gray-Box

  • Black-box: only uses data observed from outside the VM; OS- and application-agnostic
  • Gray-box: uses statistics from the OS and other internal structures
    • Not always feasible, since the customer may control the OS
  • Is black-box data sufficient?
  • What is gained from gray-box data?

Black-box Monitoring

  • Monitoring runs in a special VM that hosts the network and disk drivers in Xen
  • CPU usage is derived from scheduler information
  • Memory pressure is inferred by detecting swapping to disk, so memory information is poor

Black-box Monitoring in depth

  • What can be observed for each resource:
    • CPU
      • Scheduling events are provided by the hypervisor
      • The tracked events determine per-VM CPU usage
    • Network: interface driver abstractions in Xen are used to determine per-VM network usage
    • Memory: difficult to monitor, since only the allocated amount is specified and the actual utilization is unknown

Black-box Implementation Details

  • Black-box monitoring collects data without looking inside the VM
    • Suits elastic computing services that allow customers to load their own OS
  • Other infrastructures may provide OS-level state

Gray-box Monitoring details

  • Gray-box: a lightweight daemon is installed inside the virtual server
    • Uses /proc to access statistics on resource usage and more

What Indicates a Hotspot

  • A resource deficit relative to service-level objectives: how much more is needed to fulfill them
    • Additional resources must be located to cover the deficit

Hotspot Detection

  • Hotspots are detected per VM, either from black-box measurements or from the gray-box approach
  • A hotspot is flagged, for example, based on memory overuse or request rate

Hotspot Detection Approach

  • Thresholds define when resource usage is excessive
    • Exceeding a threshold triggers migration
    • The settings control how aggressively hotspots are flagged
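
One way to make "how aggressively" concrete is to flag a hotspot only when the threshold is exceeded in at least k of the last n observations. This k-of-n rule, the class name, and the parameter values are assumptions for illustration; the notes only say that thresholds trigger migration.

```python
from collections import deque


class HotspotDetector:
    """Flags a hotspot when at least k of the last n samples exceed a threshold."""

    def __init__(self, threshold, k, n):
        self.threshold = threshold
        self.k = k
        self.window = deque(maxlen=n)  # keeps only the n most recent samples

    def observe(self, utilization):
        """Record one sample and report whether a hotspot is currently flagged."""
        self.window.append(utilization > self.threshold)
        return sum(self.window) >= self.k


detector = HotspotDetector(threshold=0.75, k=3, n=5)
samples = [0.9, 0.5, 0.8, 0.6, 0.95, 0.4]  # hypothetical utilization trace
flags = [detector.observe(sample) for sample in samples]
```

Lowering k or the threshold flags hotspots sooner at the risk of migrating on transient spikes; raising them does the opposite.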

More on Hotspots

  • What resources are needed, and where
  • What a migration must take into account

Black-box Resource Provisioning Approach

  • Each component's need for memory and other resources is estimated from the workload it is observed to serve

Black-box Estimation for Network

  • The tail of the usage profile for bandwidth is used to estimate future needs
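
Estimating from the tail can be sketched as taking a high percentile of recent bandwidth samples rather than the mean. The nearest-rank percentile below, the choice of the 90th percentile, and the sample values are all assumptions made for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer pct between 1 and 100."""
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100, so no floating-point rounding is involved.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical bandwidth usage profile in Mbit/s, with two bursts.
bandwidth_samples = [10, 12, 11, 50, 13, 12, 11, 14, 12, 40]

mean_usage = sum(bandwidth_samples) / len(bandwidth_samples)
tail_estimate = percentile(bandwidth_samples, 90)
```

The mean (18.5) would badly under-provision for the bursts, while the tail estimate (40) covers most of them without sizing for the single largest spike.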

Resource Provisioning (i) Black-box Provisioning

  • Estimates of resource needs, including memory, are bounded by the current allocation
    • Observed usage cannot exceed the allocated amount, so demand beyond that bound is not visible

Resource Provisioning (ii)

  • Approach: the resource allocation is set according to the estimated need for that resource

Conclusion

  • This lecture covered hotspot detection and mitigation, including memory allocation

  • What the black-box approach can observe

  • How black-box and gray-box data are used
