AHV Storage I/O Path

Questions and Answers

AHV leverages a traditional storage stack, similar to ESXi or Hyper-V, for managing disk I/O.

False (B)

The iSCSI redirector in AHV uses NOP commands to periodically check the health of Stargates within the cluster.

True (A)

QEMU is configured to directly use the Stargate service as the iSCSI target portal.

False (B)

IDE devices are the preferred controller type for virtual disks in AHV for optimal performance.

False (B)

In the traditional I/O path, the QEMU main loop processes I/O requests concurrently using multiple threads.

False (B)

Frodo, also known as AHV Turbo Mode, is disabled by default on VMs powered on after AOS 5.5.X.

False (B)

Frodo accelerates I/O by replacing the QEMU main loop with vhost-user-scsi.

True (A)

With Frodo enabled, a VM with one vCPU will create two Frodo threads per disk device to maximize throughput.

False (B)

Acropolis IP Address Management (IPAM) relies on traditional DHCP servers in an 'unmanaged' network configuration.

True (A)

Acropolis IPAM uses VXLAN and OpenFlow to intercept and respond to DHCP requests, providing IP address management.

True (A)

VM High Availability in AHV is triggered immediately after the Acropolis Leader detects a disruption in the libvirt connection to a host.

False (B)

In an AHV cluster, VM HA guarantees VM restarts within 5 minutes of a host failure.

False (B)

In Guarantee mode, AHV HA restarts all VMs if sufficient resources are available, but does not reserve resources to guarantee restart capability.

False (B)

When using Guarantee mode with all containers at RF2, AHV reserves resources equivalent to two hosts' worth of memory.

False (B)

Segment-based reservations distribute the HA resource reservation evenly across all hosts in the cluster.

True (A)

In AHV, only host-based reservations are automatically implemented when the Guarantee HA mode is selected.

False (B)

When a local CVM's Stargate recovers, remote Stargates immediately transfer all iSCSI sessions back to the local Stargate.

False (B)

In AHV, a virtual machine's OS sends SCSI commands directly to the physical storage devices.

False (B)

If the Acropolis Leader is running remotely, the Nutanix IPAM solution uses a VXLAN tunnel to carry DHCP requests to it over the network.

True (A)

The algorithm used to determine the total number of reserved segments and per host reservation is called MTKM.

False (B)

Flashcards

AHV Storage I/O Path

AHV does not use a traditional storage stack, passing disks to VMs as raw SCSI block devices for a lightweight I/O path.

iSCSI Redirector

An iSCSI redirector on each AHV host checks Stargate health using NOP commands.

QEMU and iSCSI Redirector

QEMU is configured with the iSCSI redirector as the target, which redirects login requests to a healthy Stargate.

Preferred Controller Type

Virtio-scsi is the preferred controller type and the default for SCSI devices.

Frodo I/O Path

Frodo is an optimized I/O path for AHV providing higher throughput, lower latency, and reduced CPU overhead.

Frodo Key Features

Frodo replaces the QEMU main loop with vhost-user-scsi and uses multiple virtual queues (VQs), leveraging multiple threads for multi-vCPU VMs.

Acropolis IPAM

Acropolis IPAM provides DHCP scope and address assignment to VMs using VXLAN and OpenFlow for request interception and response.

AHV VM High Availability

AHV VM HA restarts VMs on healthy nodes after a host failure, with the Acropolis Leader managing the restarts.

Acropolis Leader Role in HA

The Acropolis Leader monitors host health via libvirt connections, initiating HA restart countdown upon connection loss.

HA Modes in AHV

Default HA restarts VMs on available resources; Guarantee HA reserves resources to ensure all VMs restart after a failure.

Resource Reservations for HA

Guarantee mode reserves one host's worth of resources for RF2 containers, and two hosts' worth for RF3 containers.

Segment Based HA

In 5.0 and later, AHV supports segment-based reservations, which distribute the HA failover capacity across all hosts.

Study Notes

  • AHV does not use a traditional storage stack and passes disks to VMs as raw SCSI block devices

Storage I/O Path

  • AOS handles backend configuration of kvm, virsh, qemu, libvirt, and iSCSI, abstracting these from the user
  • Each AHV host runs an iSCSI redirector which monitors the health of Stargates via NOP commands
  • The iscsi_redirector log shows the health of each Stargate
  • The local Stargate is shown via its internal address (192.168.5.254)

iSCSI Configuration

  • The iSCSI redirector listens on 127.0.0.1:3261
  • QEMU is configured with the iSCSI redirector as the iSCSI target portal
  • Upon a login request, the redirector performs an iSCSI login redirect to a healthy Stargate (preferably the local one)
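
The redirect decision described above can be sketched in a few lines. This is only an illustration, not Nutanix code: the `Stargate` class, its `healthy` flag (standing in for the result of the last NOP check), and the remote addresses are all assumptions; only the local address 192.168.5.254 comes from the notes.

```python
from dataclasses import dataclass

LOCAL_STARGATE_IP = "192.168.5.254"  # internal address of the local Stargate (from the notes)


@dataclass
class Stargate:
    ip: str
    healthy: bool  # hypothetical flag: outcome of the most recent NOP health check


def pick_redirect_target(stargates):
    """Return the IP a login should be redirected to, or None if nothing is healthy."""
    healthy = [s for s in stargates if s.healthy]
    # Prefer the local Stargate whenever it is healthy.
    for s in healthy:
        if s.ip == LOCAL_STARGATE_IP:
            return s.ip
    # Otherwise fall back to a healthy remote Stargate (simply the first one here).
    return healthy[0].ip if healthy else None


cluster = [
    Stargate("192.168.5.254", healthy=True),
    Stargate("10.0.0.12", healthy=True),   # made-up remote CVM addresses
    Stargate("10.0.0.13", healthy=False),
]
print(pick_redirect_target(cluster))
```

How a real redirector chooses among several healthy remote Stargates is not stated in the notes; taking the first one is a placeholder.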

iSCSI Multi-Pathing

  • Virtio-scsi is the preferred controller type (default for SCSI devices)
  • IDE devices are possible but not recommended
  • Virtio drivers, Nutanix mobility drivers, or Nutanix guest tools must be installed for Windows to use virtio
  • Modern Linux distros ship with virtio pre-installed
  • If the active Stargate goes down, the iSCSI redirector marks it as unhealthy and redirects the login to another healthy Stargate

Local CVM Handling

  • Once the local CVM’s Stargate comes back up, the remote Stargate quiesces and then kills all connections to the remote iSCSI sessions
  • QEMU then attempts an iSCSI login again and is redirected to the local Stargate
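
The failover and failback sequence can be traced with a small simulation. It is a simplified model under stated assumptions: the `login` and `replay` helpers and the remote address are hypothetical, and dropping a session is modeled as an immediate re-login.

```python
LOCAL = "192.168.5.254"   # local Stargate's internal address (from the notes)
REMOTE = "10.0.0.12"      # made-up remote Stargate address


def login(health):
    """Mimic a redirected iSCSI login: local Stargate if healthy, else a healthy remote."""
    if health.get(LOCAL):
        return LOCAL
    remotes = [ip for ip, ok in health.items() if ok and ip != LOCAL]
    return remotes[0] if remotes else None


def replay(events):
    """Apply (ip, healthy) changes in order; return the session target after each step."""
    health = {LOCAL: True, REMOTE: True}
    session = login(health)
    history = [session]
    for ip, ok in events:
        health[ip] = ok
        lost_target = not health.get(session, False)
        local_recovered = ip == LOCAL and ok and session != LOCAL
        # Either case ends the current session, so QEMU logs in again.
        if lost_target or local_recovered:
            session = login(health)
        history.append(session)
    return history


print(replay([(LOCAL, False), (LOCAL, True)]))
```

The session starts on the local Stargate, moves to the remote one when the local Stargate fails, and returns to the local one after it recovers.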

Traditional I/O Path

  • The VM’s OS issues SCSI commands to its virtual devices
  • Virtio-scsi places requests in the guest’s memory
  • The QEMU main loop handles these requests
  • Libiscsi inspects each request and forwards it
  • The network layer forwards requests to the local CVM (or externally if the local CVM is unavailable)
  • Stargate handles the requests

Network Sessions

  • qemu-kvm establishes sessions with a healthy Stargate using the local bridge and IPs
  • For external communication, the external host and Stargate IPs are used
  • There is one session per disk device

Inefficiencies

  • The main QEMU loop is single-threaded
  • libiscsi inspects every SCSI command

Frodo I/O Path (AHV Turbo Mode)

  • Frodo is an optimized I/O path for AHV that allows for higher throughput, lower latency, and less CPU overhead
  • It is enabled by default on VMs powered on after AOS 5.5.X

Frodo I/O Process

  • The VM’s OS issues SCSI commands to its virtual devices
  • Virtio-scsi places requests in the guest’s memory
  • Frodo handles the requests
  • A custom libiscsi appends the iSCSI header and forwards the requests
  • The network layer forwards requests to the local CVM (or externally if the local one is unavailable)
  • Stargate handles the requests

Key Differences

  • Replaces the QEMU main loop (qemu-kvm) with Frodo (vhost-user-scsi)
  • Exposes multiple virtual queues (VQs) to the guest (one per vCPU) and leverages multiple threads for multi-vCPU VMs
  • The standard libiscsi is replaced by a lightweight version
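
To make the "one virtual queue per vCPU" idea concrete, here is a toy model. It assumes each request lands on the queue of the vCPU that issued it, which the notes do not state explicitly; `build_virtual_queues` and `submit` are hypothetical helpers, not part of Frodo or QEMU.

```python
from collections import deque


def build_virtual_queues(vcpus):
    """Create one virtual queue per vCPU, as the notes describe for Frodo."""
    if vcpus < 1:
        raise ValueError("a VM needs at least one vCPU")
    return [deque() for _ in range(vcpus)]


def submit(queues, vcpu_id, request):
    """Place a request on the queue of the vCPU that issued it (assumed mapping)."""
    queues[vcpu_id].append(request)


queues = build_virtual_queues(4)
for i in range(8):
    submit(queues, vcpu_id=i % 4, request=f"read-{i}")

depths = [len(q) for q in queues]
print(depths)  # [2, 2, 2, 2]
```

With a single queue, all eight requests would wait behind one consumer; with four queues they can be drained by separate threads.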

Performance

  • VMs will have multiple queues for disk devices
  • Achieves up to 3x performance increases compared to QEMU
  • Results in a CPU overhead reduction of 25% for I/O
  • A Frodo process is visible for each running VM (qemu-kvm process) that has Frodo enabled

vCPU Considerations

  • To take advantage of multiple threads / connections, a VM must have >= 2 vCPUs when powered on
    • 1 vCPU UVM: 1 Frodo thread / session per disk device
    • >= 2 vCPU UVM: 2 Frodo threads / sessions per disk device

  • Frodo establishes sessions with a healthy Stargate using the local bridge and IPs
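
The thread and session rule above reduces to a short calculation. The function names are illustrative only; the rule itself (one thread per disk for 1 vCPU, two for 2 or more) is taken from the notes.

```python
def frodo_threads_per_disk(vcpus_at_power_on):
    """Threads / sessions per disk device, based on the vCPU count at power-on."""
    if vcpus_at_power_on < 1:
        raise ValueError("a VM needs at least one vCPU")
    return 1 if vcpus_at_power_on == 1 else 2


def total_sessions(vcpus_at_power_on, disk_devices):
    """Total sessions for a VM: per-disk sessions times the number of disk devices."""
    return frodo_threads_per_disk(vcpus_at_power_on) * disk_devices


print(total_sessions(1, 3))  # 3
print(total_sessions(4, 3))  # 6
```

Because the count is evaluated at power-on, a VM would need to be powered on with at least 2 vCPUs to get the second thread per disk.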

IP Address Management

  • The Acropolis IP address management (IPAM) solution can establish a DHCP scope and assign addresses to VMs
  • It uses VXLAN and OpenFlow rules to intercept DHCP requests and answer them with DHCP responses
  • Traditional DHCP / IPAM solutions can be used in an ‘unmanaged’ network scenario
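
A DHCP scope of the kind described above can be modeled as a pool of addresses handed out per MAC address. This sketch only covers the address-assignment step; the `DhcpScope` class, the network, and the MAC addresses are made up, and the VXLAN / OpenFlow interception is not modeled.

```python
import ipaddress


class DhcpScope:
    """Hypothetical in-memory DHCP scope keyed by VM MAC address."""

    def __init__(self, network, first, last):
        self.network = ipaddress.ip_network(network)
        start, end = ipaddress.ip_address(first), ipaddress.ip_address(last)
        if start not in self.network or end not in self.network or start > end:
            raise ValueError("pool must be an ordered range inside the network")
        self.free = [ipaddress.ip_address(n) for n in range(int(start), int(end) + 1)]
        self.leases = {}  # MAC address -> assigned IP

    def handle_request(self, mac):
        """Answer an intercepted DHCP request; the same MAC always gets the same IP."""
        if mac not in self.leases:
            if not self.free:
                return None  # pool exhausted
            self.leases[mac] = self.free.pop(0)
        return str(self.leases[mac])


scope = DhcpScope("10.10.10.0/24", "10.10.10.50", "10.10.10.52")
print(scope.handle_request("50:6b:8d:00:00:01"))
print(scope.handle_request("50:6b:8d:00:00:02"))
print(scope.handle_request("50:6b:8d:00:00:01"))
```

The first and third requests come from the same MAC address, so they receive the same address.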

VM High Availability (HA)

  • AHV VM HA ensures VM availability in the event of a host or block outage
  • VMs previously running on a failed host will be restarted on other healthy nodes in the cluster
  • The Acropolis Leader is responsible for restarting the VMs on the healthy host(s)
  • The Acropolis Leader tracks host health by monitoring its connections to libvirt on all cluster hosts

HA Process

  • Once the libvirt connection goes down, a countdown to the HA restart begins
  • If the libvirt connection fails to be re-established within the timeout, Acropolis restarts VMs that were running on the disconnected host, generally within 120 seconds
  • If the Acropolis Leader fails or becomes partitioned, a new Acropolis Leader is elected
  • If the cluster becomes partitioned, the side with quorum remains up and VMs are restarted on those hosts
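
The countdown logic can be expressed as a check over the time each host's libvirt connection dropped. The helper name, the host names, and the 60-second timeout are assumptions for illustration; the notes do not give the actual timeout value.

```python
def hosts_needing_restart(disconnected_at, now, timeout_s):
    """Return hosts whose libvirt connection has been down for at least timeout_s.

    disconnected_at maps host name -> time the connection dropped,
    or None while the connection is up.
    """
    return sorted(
        host
        for host, dropped in disconnected_at.items()
        if dropped is not None and now - dropped >= timeout_s
    )


disconnected_at = {"host-a": None, "host-b": 100.0, "host-c": 150.0}
# Illustrative timeout only; the real value is not stated in the notes.
print(hosts_needing_restart(disconnected_at, now=170.0, timeout_s=60.0))  # ['host-b']
```

A host whose connection is re-established before the timeout has its entry reset to `None` and never triggers a restart, which matches the countdown behavior described above.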

HA Modes

  • Default: Requires no configuration; VMs restart on available hosts, depending on resource availability
  • Guarantee: Reserves space on AHV hosts to guarantee that all failed VMs can restart

Resource Reservations

  • Guarantee mode reserves host resources for VMs
  • The amount reserved depends on the containers’ replication factor:
    • If all containers are RF2 (FT1): one “host” worth of resources
    • If any containers are RF3 (FT2): two “hosts” worth of resources
  • When hosts have uneven memory capacities, the system uses the largest host’s memory capacity to determine how much to reserve per host
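
As a rough worked example of the sizing rule, the sketch below treats one "host" worth of resources as the memory of the largest host. That reading of the notes, the function names, and the sample capacities are assumptions.

```python
def hosts_worth_to_reserve(container_rfs):
    """One host if every container is RF2, two hosts if any container is RF3."""
    if not container_rfs or any(rf not in (2, 3) for rf in container_rfs):
        raise ValueError("expected a non-empty list of replication factors 2 or 3")
    return 2 if 3 in container_rfs else 1


def reserved_memory_gib(container_rfs, host_memory_gib):
    """Total memory to reserve, sizing each 'host' as the largest host (assumption)."""
    return hosts_worth_to_reserve(container_rfs) * max(host_memory_gib)


hosts = [256, 512, 256, 256]  # made-up memory capacities in GiB
print(reserved_memory_gib([2, 2], hosts))  # 512
print(reserved_memory_gib([2, 3], hosts))  # 1024
```

Sizing against the largest host means the reservation still covers the worst case, where the host with the most memory is the one that fails.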

Segment Based Reservations

  • As of 5.0, resource reservations are segment based and are implemented automatically when Guarantee HA mode is selected
  • Reserved segments distribute the resource reservation across all hosts
  • Each host shares a portion of the reservation for HA, ensuring the cluster has failover capacity
  • VMs are restarted throughout the cluster on the remaining healthy hosts

Reservations Calculation

  • The system automatically calculates the total number of reserved segments and per-host reservation
  • The algorithm used for memory reservations is called MTHM
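
The sketch below is not MTHM. It is a deliberately naive even split, assuming identical hosts and ignoring how individual VM sizes fit into segments, meant only to show why spreading one host's worth of memory across all hosts still leaves enough failover capacity. All names and numbers are illustrative.

```python
def even_split_reservation(host_memory_gib, hosts_worth=1):
    """Spread the total reservation equally across hosts (naive model, not MTHM)."""
    total = hosts_worth * max(host_memory_gib)
    per_host = total / len(host_memory_gib)
    return total, per_host


hosts = [256, 256, 256, 256]  # four identical hosts, capacities in GiB
total, per_host = even_split_reservation(hosts)

usable_per_host = hosts[0] - per_host            # memory VMs may use on each host
spare_after_failure = per_host * (len(hosts) - 1)  # reserve left on surviving hosts

print(total, per_host, usable_per_host, spare_after_failure)
```

Each host reserves 64 GiB and can run up to 192 GiB of VMs. If one host fails, the three survivors hold 192 GiB of reserve between them, exactly enough for the failed host's VMs in aggregate. Whether each VM actually fits on a single surviving host is the harder packing problem that the real algorithm has to solve.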
