vSAN Storage Notes (PDF)
Uploaded by GreekMichigander
CMU
Summary
These notes provide an overview of vSAN, a policy-driven storage system within VMware Cloud Foundation (VCF). They detail the differences between traditional and policy-driven storage, and explain storage objects, redundancy, policies, failures to tolerate (FTT), and failure tolerance methods (FTMs).
Full Transcript
**vSAN: Policy-Driven Storage System**

With SPBM (Storage Policy-Based Management), VCF assets use storage that guarantees a specified level of:

1. Capacity
2. Performance
3. Availability
4. Redundancy

Storage policies help you to meet the following goals:

- Categorize storage based on desired levels of service
- Provision VM disks for optimal configuration
- Protect data through object-based fault tolerance

A storage policy is defined as a set of rules.

**Traditional Storage**

Typical VM writing/reading/failure scenario

**Policy-Driven Object-Based Storage**

**Objects are made from components**

- Example: a flat.vmdk file on vSAN is an object. It is made up of different components and needs redundancy.
- With a RAID-1 policy on the VM (mirrored copies of this flat.vmdk), the components need to be placed on 3 ESXi hosts.
- Types of objects can be treated differently within a virtual machine by policy, which gives huge flexibility.
- Rules (capacity, performance, and so on) are defined and applied to objects on VMs to create a policy.

![A screenshot of a computer Description automatically generated](media/image2.png)

- All of the rules have to be based on something vSAN can deliver (capacity, etc.).
- Failures to tolerate is established through rules on VMs.
- Default policy -- 1 failure to tolerate
- Can even do 3 host failures to tolerate (FTT)

![A screenshot of a computer Description automatically generated](media/image4.png)

**Failures to Tolerate (FTT)**

- Failure tolerance methods (FTMs) -- RAID mirroring, or erasure coding for capacity savings
- Extra ESXi hosts are needed for higher FTT values and for some methods
- Minimum of 3 ESXi hosts!
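As a rough sketch of how FTT and the failure tolerance method translate into host minimums and raw capacity, the Python below encodes the usual vSAN OSA rules: RAID-1 needs 2 x FTT + 1 hosts and stores FTT + 1 full copies, RAID-5 (FTT=1) needs 4 hosts at 1.33x, and RAID-6 (FTT=2) needs 6 hosts at 1.5x. The function name `policy_footprint` and the rounding to one decimal are illustrative assumptions, not part of any VMware tool.

```python
from fractions import Fraction

# (method, ftt) -> (minimum hosts, capacity multiplier) for erasure coding.
ERASURE_CODING = {
    ("RAID-5", 1): (4, Fraction(4, 3)),  # 3 data + 1 parity
    ("RAID-6", 2): (6, Fraction(3, 2)),  # 4 data + 2 parity
}


def policy_footprint(method: str, ftt: int, object_gb: float) -> tuple[int, float]:
    """Return (minimum hosts, raw GB consumed) for one object. Hypothetical helper."""
    if method == "RAID-1":
        if not 0 <= ftt <= 3:
            raise ValueError("RAID-1 supports FTT 0 to 3")
        # A vSAN cluster needs at least 3 hosts even when FTT is 0.
        hosts = max(3, 2 * ftt + 1)
        multiplier = Fraction(ftt + 1)  # one full copy per tolerated failure, plus the original
    elif (method, ftt) in ERASURE_CODING:
        hosts, multiplier = ERASURE_CODING[(method, ftt)]
    else:
        raise ValueError(f"unsupported combination: {method} with FTT={ftt}")
    return hosts, round(float(multiplier * Fraction(object_gb)), 1)


if __name__ == "__main__":
    for method, ftt in [("RAID-1", 1), ("RAID-1", 2), ("RAID-5", 1), ("RAID-6", 2)]:
        print(method, f"FTT={ftt}", policy_footprint(method, ftt, 100))
```

Keeping the erasure coding cases in a lookup table rather than a formula reflects that only those two combinations exist; mirroring is the only method where the host count scales with FTT.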
Host minimums by FTT and method:

- FTT of 0 or 1 = 3 ESXi hosts minimum for RAID-1
- FTT of 1 = 4 ESXi hosts minimum for RAID-5 (1.33 x object size = 33% erasure coding savings versus mirroring)
- FTT of 2 = 5 ESXi hosts minimum for RAID-1
- FTT of 2 = 6 ESXi hosts minimum for RAID-6

**Storage Policy Space Consumption Comparison**

![A screenshot of a computer Description automatically generated](media/image6.png)

For a 100 GB object:

- RAID-0, FTT=0: 1 component = 1 host, 100 GB of storage
- RAID-1, FTT=1: 3 components = 3 hosts, 200 GB
- RAID-1, FTT=2: 5 hosts, 300 GB
- RAID-5, FTT=1: 4 components = 4 hosts, 133 GB
- RAID-6, FTT=2: 6 hosts, 150 GB

RAID-5 requires a minimum of 4 hosts. RAID-6 requires a minimum of 6 hosts.

Consider more than the minimum number of hosts. Example: with RAID-5 on exactly 4 hosts, there is no spare host if one host is in maintenance mode and a failure happens at that time.

**Monitoring compliance (you can find the compliance status for all objects in a policy on the VM Compliance tab for that policy)**

- Policies and Profiles tab
- VM Storage Policies tab
- VM Compliance tab
- Compliance Status of host

![A screenshot of a computer Description automatically generated](media/image8.png)

**Deduplication and Compression**

The deduplication and compression option reduces the amount of data that is stored on capacity drives by ensuring that only a single instance of redundant data is stored on each disk group. Enabling the deduplication and compression option **requires an all-flash architecture**. Deduplication and compression are fully compatible with vSAN datastore encryption, but they are ineffective with VMs that use VM encryption.

Deduplication removes redundant data blocks, whereas compression removes additional redundant data in each data block. These techniques work together to reduce the amount of space for storing data.
vSAN first applies deduplication, and then it performs data compression as it moves data from the cache tier to the capacity tier.

Deduplication occurs when data is destaged nearline, that is, from the cache tier to the capacity tier. The deduplication algorithm uses a 4K fixed block size and is performed in each disk group. Redundant copies of a block within the same disk group are reduced to one copy. However, redundant blocks across multiple disk groups are not deduplicated. Deduplication at the disk group level using a 4K block size provides a good balance of efficiency and performance.

The compression algorithm is applied after deduplication occurs, before the data is written to the capacity tier. Considering the additional compute resource and allocation map overhead of compression, vSAN stores compressed data only if a unique 4K block can be reduced to 2K or less. Otherwise, the block is written uncompressed.

Enable deduplication and compression during bring-up. In vSAN OSA (Original Storage Architecture), the two are enabled together.

**Existing Cluster Deduplication and Compression Guideline**

- Evacuate the disk group.
- Format the evacuated disk group.
- Move data back to the newly formatted disk group.
- Deduplicate and compress data on the disk groups.
- Repeat the steps on each disk group in the cluster.
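The destaging behavior described above (4K fixed-block deduplication scoped to one disk group, followed by compression that is kept only when a block shrinks to 2K or less) can be sketched in a few lines of Python. This is a toy model under stated assumptions: SHA-1 and zlib are stand-ins chosen for illustration, since these notes do not say which hash or compression algorithm vSAN uses, and the dictionary that represents a disk group is hypothetical.

```python
import hashlib
import zlib

BLOCK_SIZE = 4096          # 4K fixed block size used for deduplication
COMPRESS_THRESHOLD = 2048  # keep the compressed form only if it is 2K or less


def destage(data: bytes, disk_group: dict[str, bytes]) -> None:
    """Destage data from the cache tier to one disk group (illustrative model)."""
    for offset in range(0, len(data), BLOCK_SIZE):
        # Pad the final block so every block is exactly 4K.
        block = data[offset:offset + BLOCK_SIZE].ljust(BLOCK_SIZE, b"\0")
        digest = hashlib.sha1(block).hexdigest()  # stand-in fingerprint
        if digest in disk_group:
            continue  # duplicate within this disk group: nothing new is stored
        # Compression runs after deduplication, before the capacity-tier write.
        compressed = zlib.compress(block)
        if len(compressed) <= COMPRESS_THRESHOLD:
            disk_group[digest] = compressed
        else:
            disk_group[digest] = block  # written uncompressed


def stored_bytes(disk_group: dict[str, bytes]) -> int:
    """Total bytes held on the capacity tier of one disk group."""
    return sum(len(payload) for payload in disk_group.values())
```

Because each disk group is a separate dictionary, writing the same block to two disk groups stores it twice, which mirrors the note that redundant blocks across disk groups are not deduplicated.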
![A screenshot of a computer Description automatically generated](media/image10.png)

Example screen: storage policy

Physical disk placement of components on a VM

![A screenshot of a computer Description automatically generated](media/image12.png)

![A screenshot of a white box Description automatically generated](media/image14.png)

**vSAN ESA (Express Storage Architecture) in VMware Cloud Foundation (VCF)**

Uses current and future hardware

**vSAN ESA and NVMe (direct on PCIe bus)**

**vSAN ESA (Express Storage Architecture) Characteristics**

- vSAN ESA delivers up to four-times better performance that scales with the latest NVMe and server technologies to meet the needs of the most demanding applications.
- vSAN ESA has new features to handle the data and services more efficiently.
- vSAN ESA builds on existing vSAN technology to bring new features, functionality, and enhancements instead of rewriting and building an entirely new solution.
- vSAN ESA supports only vSAN ReadyNode validated server configurations.
- No in-place upgrades
- No going from OSA to ESA on the same hardware in the same vSAN datastore
- Only high-performance NVMe TLC SSD devices are supported.
- Only greenfield deployments are supported.
- Administrators can use vCenter to manage their vSAN ESA environments the same way as vSAN Original Storage Architecture (OSA).
**Comparing Architectures**

| Feature | vSAN OSA | vSAN ESA |
|---|---|---|
| Disk Support | Supports hybrid and all-flash configurations | Only supports all-flash configurations |
| Storage Configuration | Two-tier disk construct known as disk groups | Single-tier structure called a storage pool |
| Disk Maximums | 40 disks per vSAN host (5x cache, 35x capacity) | No defined upper limit |
| Compression | Off by default | Off by default |
| Compression Scope | Cluster-wide setting | Individual object configuration |
| Encryption | One Key Encryption Key (KEK) per cluster and one Disk Encryption Key (DEK) per disk | One KEK per cluster and one DEK per cluster |
| Supported vSphere Versions | 5.5, 6.x, 7.0, and 8.0 | 8.0 |

\*Deduplication is NOT available on ESA

\*Compression on ESA is MUCH MORE efficient

Disk structure is the biggest difference between vSAN ESA and vSAN OSA. However, there are also differences in how the I/O flows through vSAN, how space efficiency works, and how data is encrypted when it lands on disk. For vSAN 8, vSAN ESA can only be deployed in a greenfield deployment.

**vSAN ESA (Express Storage Architecture) Requirements**

vSAN ESA has the following requirements:

- vSAN ReadyNodes (which meet the requirements listed below)
- Minimum of 32 CPUs
- Minimum of 512 GB of RAM
- One 25-Gbps NIC or faster for vSAN traffic
- One 10-Gbps NIC or faster for VM and management traffic
- Minimum of four NVMe-based SSDs per host, Class D or better for endurance or Class F for performance
- No support for SAS and SATA devices
- Disks at least 1.6 TB in size

**vSAN Disk Configuration Comparison**

This image shows the disk layout differences between vSAN OSA and vSAN ESA.
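The ESA host requirements listed above lend themselves to a simple pre-check. The sketch below is only an illustration: `HostSpec` and `esa_requirement_gaps` are hypothetical names, the thresholds are copied from the list in these notes, and the endurance and performance classes are not modeled. It does not replace the HCL validation that VCF performs.

```python
from dataclasses import dataclass


@dataclass
class HostSpec:
    """Hypothetical description of one candidate ESXi host."""
    cpus: int
    ram_gb: int
    vsan_nic_gbps: float
    mgmt_nic_gbps: float
    nvme_disk_tb: list[float]  # size of each NVMe SSD; SAS and SATA are not supported


def esa_requirement_gaps(host: HostSpec) -> list[str]:
    """Return the requirements from the notes that this host does not meet."""
    gaps = []
    if host.cpus < 32:
        gaps.append("needs at least 32 CPUs")
    if host.ram_gb < 512:
        gaps.append("needs at least 512 GB of RAM")
    if host.vsan_nic_gbps < 25:
        gaps.append("needs a 25-Gbps or faster NIC for vSAN traffic")
    if host.mgmt_nic_gbps < 10:
        gaps.append("needs a 10-Gbps or faster NIC for VM and management traffic")
    if len(host.nvme_disk_tb) < 4:
        gaps.append("needs at least four NVMe SSDs")
    if any(size < 1.6 for size in host.nvme_disk_tb):
        gaps.append("every NVMe SSD must be at least 1.6 TB")
    return gaps


if __name__ == "__main__":
    candidate = HostSpec(cpus=16, ram_gb=512, vsan_nic_gbps=10,
                         mgmt_nic_gbps=10, nvme_disk_tb=[1.6, 0.8])
    for gap in esa_requirement_gaps(candidate):
        print(gap)
```

Returning a list of gaps rather than a single boolean makes it easier to see which requirement a host fails.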
**Single point of failure -- if a cache disk in the disk group goes out, it brings down the ENTIRE disk group in vSAN OSA. Not so in vSAN storage pools. A single failure does NOT bring down the pool.**

vSAN ESA provides the following advantages:

- More efficient use of disks: Disks no longer must be allocated only to cache.
- Better resiliency: A failure of a cache disk does not impact other disks.
- Improved I/O flow: Data does not need to land on cache before being destaged to capacity.

**vSAN ESA (Express Storage Architecture) Storage Pools**

A vSAN ESA storage pool has the following features:

- Single-tier architecture
- Only supports NVMe-based flash devices
- Pool of independent storage devices
- Reduced I/O flow (no two-tier architecture)
- Maximum number of disks defined by the number of disk slots (NVMe devices you can put in a host)

With ESA, the new disk architecture for vSAN 8 uses storage pools, which makes each disk its own independent disk. Because only NVMe disks are supported, the disks serve both reads and writes. Storage pools simplify the I/O process because data only has to be written to one disk. In addition, storage pools allow other features to be more streamlined, such as encryption and compression. Encryption and compression are discussed in more detail in a later module.
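The difference in failure impact between an OSA disk group and an ESA storage pool can be made concrete with a small model. This is a simplified sketch: the function names and disk labels are hypothetical, and an OSA capacity-disk failure is modeled as affecting only that disk, which is a simplification made here rather than something these notes state.

```python
def osa_offline_after_failure(cache: str, capacity: list[str], failed: str) -> set[str]:
    """Disks taken offline in one OSA disk group when `failed` goes out."""
    if failed == cache:
        # The cache disk is a single point of failure for the whole disk group.
        return {cache, *capacity}
    if failed in capacity:
        return {failed}  # simplification: only the failed capacity disk is lost
    raise ValueError(f"{failed} is not part of this disk group")


def esa_offline_after_failure(pool: list[str], failed: str) -> set[str]:
    """Disks taken offline in an ESA storage pool when `failed` goes out."""
    if failed not in pool:
        raise ValueError(f"{failed} is not part of this storage pool")
    # Every disk in the pool is independent, so only the failed disk is affected.
    return {failed}


if __name__ == "__main__":
    print(osa_offline_after_failure("cache0", ["cap0", "cap1", "cap2"], "cache0"))
    print(esa_offline_after_failure(["nvme0", "nvme1", "nvme2", "nvme3"], "nvme0"))
```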
**vSAN ESA New I/O Engine**

![A screenshot of a computer Description automatically generated](media/image23.png)

- Compresses data once at ingest to reduce network traffic and CPU resources
- Encrypts data once at ingest to reduce CPU resources
- Checksums data at ingest to reuse already calculated CRCs
- Performs full-stripe writes in parallel and asynchronously, eliminating read-modify-write activities

**vSAN Log-Structured File System**

- Reduces I/O amplification
- Low overhead
- Compatible with future device types
- Allows for high-performance snapshots

vSAN ESA introduces the Log-Structured File System (LFS), which works as follows:

- Ingests smaller incoming I/O from the guest VM to an in-memory stripe buffer
- Packages the smaller writes into larger I/O blocks
- Performs encryption, compression, and checksum
- Prepares the data to be written to the performance leg
- When a full-stripe write accumulates in the performance leg, it is written to the capacity leg of the corresponding object.
- Reduces I/O amplification by performing encryption, compression, and checksum in one location instead of on each individual host

**Storage Policies for VMware Cloud Foundation vSAN ESA Clusters**

Defaults:

- vSAN ESA Default Policy RAID 5 -- provides FTT=1
- vSAN ESA Default Policy RAID 6 -- provides FTT=2

\*Recommendation: use the ESA default policy.

\*Note -- vSAN ESA Auto-Policy Management: you must turn this ON in VCF for the default policies to take effect. Path: Configure > General > Default Storage Policy > Edit (example for RAID 5).

New default vSAN ESA storage policies have the following characteristics:

- **Compression is enabled by default** unless it is disabled because of space efficiency.
- The number of disk stripes per object, flash-read cache reservation, and storage tier are not relevant for vSAN ESA.
- Granular storage policies (per VMDK) are not supported for vSAN ESA. vSAN ESA supports per-object storage policies.
Storage policies (per VMDK) are NOT supported. **vSAN ESA supports per OBJECT**

![A screenshot of a computer Description automatically generated](media/image25.png)

When a new VM is created on the vSAN ESA datastore, the datastore default policy is used, which corresponds to the cluster-specific default storage policy created by Auto-Policy Management.

VM components (vSAN ESA): concatenates RAID-0 INTO the RAID-5 components

vSAN ESA capacity reporting: vSAN cluster > Monitoring > Capacity

Which two characteristics apply to default vSAN ESA storage policies? (Select two.)

- Enabling Auto-Policy Management on the cluster changes the storage policy to vSAN Default Storage Policy.
- **Compression is enabled by default unless it is disabled because of space efficiency.**
- **Granular storage policies (per VMDK) are not supported for vSAN ESA.**
- vSAN ESA Auto-Policy Management configures optimized storage policies based on cluster size.
- **The number of disk stripes per object, flash-read cache reservation, and storage tier are not relevant for vSAN ESA.**

Quiz feedback: Incorrect. vSAN ESA supports per-object storage policies. vSAN ESA Auto-Policy Management configures optimized storage policies based on the cluster type and the number of hosts in the cluster inventory. Enabling Auto-Policy Management on the cluster changes the default storage policy from vSAN Default Storage Policy to the new cluster-specific default storage policy.
**vSAN Requirements for Management and Workload Domains (VCF 5.2)**

Supported:

- Management and VI workload domain creation with vSAN ESA
- vSAN ESA host commission
- **Additional clusters with vSAN ESA (requires vSphere Lifecycle Manager to be enabled during VI workload domain creation)**
- Cross Cluster Capacity Sharing

Unsupported:

- Cross vCenter Cross Cluster Capacity Sharing (supported from vCenter, not VCF 5.2)
- Conversion from vSAN OSA to vSAN ESA
- Conversion from vSAN ESA to vSAN OSA

**vSAN ESA clusters require vSphere Lifecycle Manager to be enabled.**

- You can perform bring-up using the UI or API with vSAN ESA enabled.
- You create additional vSAN ESA clusters on the management domain.
- You can create additional vSAN ESA clusters through the UI or the API.
- You use the Hardware Compatibility List (HCL) disk claim feature to auto-claim disks.
- \*Must turn this on!
- The vSAN datastore uses the new vSAN ESA storage type.
- You use the default storage policy selection to create virtual machines.

Parameters for vSAN ESA

**You create a vSAN ESA cluster-based deployment by modifying the bring-up parameter sheet.**

![A screenshot of a computer Description automatically generated](media/image32.png)

The user has the option of providing the path to the HCL JSON file manually. This is useful for customers in air-gapped environments without Internet connectivity.

The vSAN Health Check Plugin performs automatic verification of your underlying hardware (hosts, disks, storage controller, and drivers) by automatically checking it against VMware's vSAN HCL. The vSAN HCL database can either be downloaded automatically from VMware.com or manually uploaded if you do not have direct or proxy Internet access.

Proxy Configuration for HCL Management

A proxy server can be used for HCL life cycle management for disconnected environments.

The proxy server configuration is not a required field.
However, it is useful for customers who want to use a proxy server to connect to the Internet to download the latest HCL files.

![A screenshot of a computer Description automatically generated](media/image34.png)

**Configure vSAN ESA on VMware Cloud Foundation**

Initial bring-up: Cloud Builder performs validation on the following:

- Proxy configuration
- HCL (hardware compatibility list) JSON file
- A newer HCL file is downloaded if out of date (if Internet access is available)
- vSAN ESA disk eligibility (NVMe disk)
- ESXi host vSAN HCL compatibility

VI workload domains and clusters:

- vSAN ESA can be enabled on a new VI workload domain or a new cluster in an existing VI workload domain.
- The HCL disk claim feature can be used to automatically claim disks.
- A new VI workload domain or cluster is created using the VMware Cloud Foundation UI or API.
- A default datastore policy is created based on the number of hosts.

Host Commissioning with vSAN ESA

- Select the Enabled check box next to vSAN ESA during host commissioning.
- The host version must support vSAN ESA.
- Hardware is compatible with vSAN ESA (HCL checks).
- A host commissioned with vSAN OSA cannot be used for vSAN ESA, and vice versa.

VI Workload Domain Creation with vSAN ESA

Host commissioning:

- Commission a free host.
- Select vSAN ESA as the storage type.
- HCL and disk compatibility is verified during the process.
- Validate and commission the host.

VI workload domain creation:

- Enable the vSAN ESA option in the GUI during VI workload domain creation.
- If you use the API, enable the vSAN ESA option in the datastore specification.
- Trigger the VI workload domain deployment.
- The disks are automatically claimed during cluster creation.
- Using Auto-Policy Management, the default storage policy is created based on the cluster size.
- Cluster creation completes.
Adding a Cluster to a Workload Domain

- **Use commissioned vSAN ESA hosts to create clusters.**
- Use SDDC Manager to commission new hosts for vSAN ESA.
- vSAN ESA supports HCL-aware disk claim:
  - All eligible disks are consumed during the cluster creation process.
  - This process also claims disks for newly added hosts.
  - If the disk claim fails to consume disks, a vSAN health check is shown with possible causes.
- Prechecks are performed to check whether the underlying hardware supports vSAN ESA. Automatic claiming is only performed if the hardware (disks and controllers) is validated by the vSAN HCL.
- **vSAN Direct is not supported with vSAN ESA.**
- Configuration options for enabling deduplication are not available for vSAN ESA clusters. **Deduplication and compression are policy-based in vSAN ESA and not cluster wide.**

Removing Hosts

The following points must be considered before removing hosts from a vSAN ESA cluster:

- The vSAN ESA default storage policy is created based on the number of hosts.
- Removing a host can make the policy noncompliant.
- The vSAN Health UI generates health alerts in response to policy noncompliance.
- The host decommission workflow in SDDC Manager is the same as for non-ESA hosts.

**Introducing vSAN Max Support**

vSAN Max is supported as principal storage in VMware Cloud Foundation 5.2 and vSAN 8 U3.

In VMware Cloud Foundation 5.2, customers can use vSAN Max as their primary VMware Cloud Foundation storage. Customers running VMware Cloud Foundation 5.2 can easily choose whether they want to deploy their clusters as aggregated vSAN HCI clusters or disaggregated vSAN Max clusters, offering tremendous flexibility.

Overview:

**\*Creates storage-ONLY ESXi hosts and makes them available to vSAN clusters**

**No virtual machines in this cluster -- storage-ONLY clusters**

- Provide shared storage for VMware Cloud Foundation using vSAN Max.
- Integrated in deployment and management workflows. Choose:
  - Aggregated storage using vSAN HCI
  - Disaggregated storage using vSAN Max
- From 20 to 360 TB per vSAN Max host
- VMware Cloud Foundation licensing includes vSAN capacity entitlements.

Limitations:

- When using VMware Cloud Foundation, a disaggregated vSAN Max deployment is not supported in a stretched cluster topology. VMware Cloud Foundation only supports a stretched cluster when using an aggregated vSAN HCI deployment option.
- As with environments not using VMware Cloud Foundation, the selection of vSAN HCI or vSAN Max must be performed at cluster creation time. You cannot retroactively change this setting.
- Other limitations of vSAN Max apply to VMware Cloud Foundation environments just as they do in non-VMware Cloud Foundation environments. Such limitations include but are not limited to cluster host count sizes, connection limits, and so on. See the vSAN Max documentation for more information.
- Presentation of vSAN Max storage is limited to within the same vCenter system.

Common use cases of vSAN Max:

- Storage refreshes: Replace aging storage arrays while keeping storage independent from compute.
- Application and hardware cost optimization: Minimize the costs of application licensing by keeping compute clusters small.
- Extend the life of existing hardware assets: Introducing vSAN Max into a workload domain allows you to maintain existing investments.
- Normalize storage management: vSAN Max is managed the same way as vSAN HCI, providing a consistent management experience across environments.
- Cloud native environments: Keep compute and storage independent for optimal scaling of cloud native applications.
- Private AI: Store large data sets for large language models (LLMs).
- vSAN Max landing page:

![A screenshot of a computer Description automatically generated](media/image43.png)

![A screenshot of a computer Description automatically generated](media/image47.png)

![A screenshot of a computer Description automatically generated](media/image51.png)

**vSAN VCF Design Considerations**

ESA

- Choose a vSAN ReadyNode from an OEM partner
- Build your own is NOT supported at this time for ESA

OSA

- Build a server with the Hardware Compatibility Guide
- Use a jointly engineered hyperconverged infrastructure with VMware Cloud Foundation integration on Dell EMC VxRail

CPU and Memory

- CPU
  - Hardware compatibility list
  - Sockets per host and cores per socket
  - vCPU-to-core ratio
  - Add 10% CPU overhead for vSAN
- Memory
  - Memory for VMs
  - Full vSAN requires 32 GB per host
  - Deduplication and compression: 30 MB per disk group
  - Compression only: 39.5 MB per capacity disk

Boot Device

- UEFI firmware (Unified Extensible Firmware Interface)
- TPM 2.0 (Trusted Platform Module)
- High-endurance device such as SSD or SATADOM
- 128-GB boot device to maximize the space available for ESX-OS Data

Disk Controllers

- Multiple storage controllers
  - Higher performance
  - Redundancy: a single controller failure affects a smaller portion of the overall storage on a host
- RAID-0 and pass-through
- Queue depth
- vSAN HCL

vSAN Performance on Hardware

Different storage device types provide different cost and performance advantages. Device combinations, better performance first:

- NVMe cache with SAS
- All SAS
- NVMe cache with SATA
- SAS cache with SATA
- All SATA

All-flash is more predictable and responsive than hybrid.
![A blue triangle with white text Description automatically generated](media/image53.png)

**vSAN Sizing: Levels of Overhead**

Consider different levels of overhead when sizing vSAN. In the example, 50% of capacity is left available after all other overhead options.

- The vSAN metadata overhead is calculated with **5 percent**, a bit more than required for on-disk format version 3, but it includes space for checksum.
- Slack space is calculated with **25 percent**. You might consider **30 percent** in your calculations when sizing for a **production** environment.
- **The effective capacity is calculated for RAID 5 (FTT=1).**
- The vSAN design and sizing guide states that formatting overhead is **1 percent plus deduplication overhead**. Consider this calculation if deduplication is used.

Sizing Considerations

- Impact of Storage Policy-Based Management (SPBM)
  - Failure tolerance method -- RAID 1 or RAID 5/6
  - FTT
  - Object space reservations
- VM home namespace
  - Thin provisioned by default
  - Max of 255 GB
  - FTT or FTM inherited by VM
- Swap space -- thin provisioning
- vSphere HA
- Maintenance mode
  - Using FTT=1 might require a full data migration for production workloads
  - Entering maintenance mode can take several HOURS
- vSAN Performance Service
  - This feature has its own policy
  - Redundancy built in
- Size of the virtual machine disks (VMDKs)
- vSAN File Services
  - Object contains system shares and file shares
- vSAN iSCSI Services
  - For applications that require an iSCSI target

![A screenshot of a computer Description automatically generated](media/image55.png)

vSAN VCF Design Principles

- Verify hardware, vSAN ReadyNode recommendations, firmware, drivers, and setup
  - ESA -- requirement
  - OSA -- HIGHLY recommended
- Verify that storage controllers provide the required level of performance
  - Use multiple storage I/O controllers per host to help reduce the failure domain
  - Choose the storage I/O controller that has the HIGHEST queue depth
  - Prefer **pass-through** over RAID-0; pass-through eliminates overhead
  - Disable controller cache and advanced features
- Recognize workload performance and availability requirements
- Consider vSAN CPU and memory overhead (CPU 10%)
- Consider enabling vSAN Reserved Capacity for vSAN cluster maintenance

Operations Considerations

- vSphere admins are now storage admins
- **Maintenance mode counts as a failure -- consider FTT=2 for production**
- Discuss resync traffic and duration, and failure scenarios

![A screenshot of a computer error Description automatically generated](media/image57.png)

**Stretched vSAN Clusters in VCF**

Availability Zones (subset of an instance of VCF)

VCF uses specific terms for availability design options and site locations:

- Independent power, cooling, network, security
- Physically separated so they are not affected by the same disaster
- Connected using high-bandwidth (10 Gbps), low-latency (\< 5 ms) networks
- One management domain and one or more workload domains in each region
- Less than 150-ms latency between regions
- Provides resilience across availability zones

A stretched cluster consists of two active availability zones and one witness site:

a. Each availability zone represents its own fault domain, allowing the failure of an entire availability zone in the stretched cluster.
b. Each availability zone must contain the same number of hosts.
c. A witness site contains a single host that maintains witness components for objects that need them.

- Each availability zone in the stretched cluster is configured as a fault domain. When used in a stretched cluster, fault domains spread redundancy components across availability zones and can tolerate the failure of an entire availability zone.
- For a given availability zone, you must have a minimum of four ESXi hosts in the default management domain. The minimum number of hosts in a VI workload domain is three.
- For additional information about the number of ESXi hosts required per availability zone, see the section on vSphere cluster design for VMware Cloud Foundation in the *VMware Cloud Foundation Design Guide*.

![A diagram of a cloud computing system Description automatically generated](media/image70.png)

**Stretched Cluster Use Cases**

Planned maintenance:

- For planned maintenance of one availability zone without any service downtime
- For migrating applications back after maintenance is complete

Disaster avoidance:

- To prevent production outages before an impending service outage, such as power outages
- To avoid downtime, not to recover from it

Automated recovery:

- For auto initiation of VM restart or recovery
- When you want a low recovery time objective (RTO) for most unplanned failures
- When you want users to focus on app health after recovery, not on how to recover VMs

Requirements:

- vSAN Enterprise license
- The management cluster MUST be stretched before any VI workload clusters are stretched.
- vCenter instances for all VI workload domains are hosted in the management domain. You must protect the management domain to ensure that you can access and manage the workload domains in case a disaster occurs in one availability zone.
- In a campus area network, the vSphere vMotion, vSAN, and host overlay networks might be stretched across the availability zones. In most other cases, these networks typically have different VLANs in each availability zone. The vSphere vMotion, vSAN, and host overlay networks must route between availability zones if they are not stretched.
- If vSphere vMotion networks are not in the same layer 2 domain, you must configure gateways for the vSphere vMotion VMkernel ports.
vSphere vMotion VMkernel ports use the vMotion TCP/IP stack instance by default, and they can therefore use a different gateway than the default TCP/IP stack. VMware Cloud Foundation defines a gateway for vSphere vMotion in the network pool configuration. You must ensure that this gateway is reachable and can route traffic between the availability zones.

Networking in both availability zones must meet the following requirements:

- Round-trip time (RTT) between availability zones must be less than or equal to 5 ms.
- Enough IP addresses must be available on the IP pool configured for the Host Overlay Transport in each availability zone.
- The vSphere vMotion, vSAN, host overlay, and management networks must be stretched (L2) or routed (L3) between availability zones.
- NSX Edge Uplink and NSX Edge Overlay Transport VLANs must be stretched (L2) across availability zones.
- The vSAN and management networks must have routing to the witness site.
- The RTT between the availability zones and the witness host must be less than or equal to 200 ms.
- With 11 or more hosts per availability zone, this requirement changes to less than or equal to 100 ms.

![A screenshot of a computer Description automatically generated](media/image73.png)

**Commissioning Hosts in Availability Zone 2**

- You must commission hosts in the second AZ BEFORE you can stretch the cluster.
- To commission hosts in the second AZ:
  - Create the AZ2 network pool.
  - Ensure that vSphere vMotion and vSAN network types are configured.
  - Commission the hosts in AZ2 and associate them with the AZ2 network pool.

vSAN witness host: a dedicated ESXi host whose purpose is to host the witness components of VM objects. The vSAN witness host has the following characteristics:

- Managed by the same vCenter instance managing the vSAN cluster
- Hosted in a third, external site or in the cloud
- Can be a physical ESXi host or the vSAN witness appliance (nested ESXi)
- Must have connectivity to both availability zones on the management network

**Deploying vSAN Stretched Clusters using APIs**

1. **Stretching a Cluster Using APIs: Obtaining the Cluster ID**
   a. Open the APIs for managing clusters in the API Explorer:
      i. Expand GET /v1/clusters.
      ii. Read the PageOfCluster API response to obtain the cluster ID for the relevant cluster.
   b. Alternatively, from the command line:
      1. Obtain a bearer token for a VCF user.
      2. Run the following command:

   ![](media/image75.png)

2. **Stretching a Cluster Using APIs: Obtaining the Host ID**
   a. Expand **GET /v1/hosts** in the APIs for managing hosts section of the API Explorer.
   b. Run the following command: `curl -X GET --insecure https://localhost/v1/hosts -H 'Content-Type: application/json' -H "Authorization: Bearer $VCFADMIN_TOKEN" | json_pp`
3. **Creating JSON Input**
   a. Download a sample clusterUpdateSpec JSON file in the API Explorer.
   b. Edit the JSON input to include only the clusterStretchSpec section.
   c. Add the host IDs and license keys for each host and witness host configuration to the clusterStretchSpec file.
4. **Validating the JSON Input**
5. **Executing the Stretch Cluster Workflow**

   ![A screenshot of a computer Description automatically generated](media/image78.png)

6. **Stretched Cluster Workflow: Complete**

**Configuring NSX Edge for Failover to AZ2**

- You **configure IP prefix lists for outbound route advertisements**.
- You **configure inbound and outbound route maps with local preferences and AS-path prepend values for AZ2**.
- You **add BGP neighbors for AZ2, using the route maps as route filters**.
Storage policies in a stretched cluster:

- Site disaster tolerance governs fault tolerance between sites.
- Failures to tolerate governs fault tolerance within each site.

![A screenshot of a computer Description automatically generated](media/image87.png)

Example: dual site mirroring with RAID 5 in local sites means RAID 1 BETWEEN sites and RAID 5 within each site.

Expanding a stretched workload domain cluster

![A computer screen shot of a computer Description automatically generated](media/image89.png)

Replacing Failed Hosts in a Stretched Workload Domain Cluster (quiz question)

1. Remove the failed host from the cluster using the cluster compaction API.
2. Decommission the failed host.
3. Commission the new host with the correct network pool.
4. Add the newly commissioned host to the cluster using the cluster expansion API.
5. Update vSphere DRS and HA.

![A screenshot of a computer Description automatically generated](media/image91.png)

![A screenshot of a computer Description automatically generated](media/image93.png)
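As a closing sketch for the stretched cluster material, the latency requirements listed in the networking section (5 ms RTT or less between availability zones, 200 ms or less to the witness host, tightening to 100 ms with 11 or more hosts per availability zone) can be checked with a short function before attempting to stretch a cluster. The function name and its parameters are hypothetical, and the RTT values are assumed to have been measured separately.

```python
def stretched_cluster_latency_issues(az_rtt_ms: float,
                                     witness_rtt_ms: float,
                                     hosts_per_az: int) -> list[str]:
    """Return the latency requirements from the notes that are not met."""
    issues = []
    if az_rtt_ms > 5:
        issues.append(f"RTT between availability zones is {az_rtt_ms} ms (limit 5 ms)")
    # The witness limit tightens once an availability zone has 11 or more hosts.
    witness_limit_ms = 100 if hosts_per_az >= 11 else 200
    if witness_rtt_ms > witness_limit_ms:
        issues.append(
            f"RTT to the witness host is {witness_rtt_ms} ms (limit {witness_limit_ms} ms)"
        )
    return issues


if __name__ == "__main__":
    for issue in stretched_cluster_latency_issues(az_rtt_ms=6, witness_rtt_ms=150, hosts_per_az=12):
        print(issue)
```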