VCFDCM52 PDF: VMware Cloud Foundation Architecture and Components
Summary
This document provides an overview of VMware Cloud Foundation (VCF), covering its architecture, components, and functionalities. It details the role of SDDC Manager in managing and configuring the VCF environment, including NSX Edge clusters and network traffic separation. Additionally, it explains the processes for importing existing vSphere infrastructure into VCF and managing NSX Edge networking components.
Full Transcript
VCFDCM52
Thursday, January 2, 2025 12:01 PM

1. VCF Architecture and Components
You must use SDDC Manager to perform configuration changes, update software, add or remove hosts from workload domain clusters, and perform other key operations so that SDDC Manager maintains accurate inventory information.

SDDC Manager provides the following key services:
- UI: Provides an HTML5-based interface that offers a consistent look and feel across VMware UIs, such as the vSphere Client and SDDC Manager.
- Lifecycle Manager: Monitors and performs updates to software components. This service ensures that products run software versions that are tested and proven to work together. Lifecycle Manager also automates and orchestrates the upgrade of components that require updating, and it ensures that the inventory is updated when upgrades are performed.
- Domain Manager: Orchestrates the creation, deletion, and scaling of workload domains in VMware Cloud Foundation.
- SoS Utility: Performs health checks and log collection from the command line or API.
- Network Pools: Provide a pool of IP addresses that can be assigned to deployed resources, such as ESXi hosts.
- Inventory: Maintains an inventory of managed entities in an internal database.

User management roles: Admin, Operator, View Only.

Examples of actions performed in SDDC Manager:
- Changing host and other component passwords
- Deploying and configuring NSX Edge clusters
- Adding hosts to an existing cluster
- Creating a new vSphere cluster

Examples of actions performed outside SDDC Manager:
- Changing roles and permissions for Active Directory users and groups
- Adding new resource pools
- Creating port groups
- Applying updated license keys to vSphere that are added to SDDC Manager

Actions that vSphere administrators are accustomed to performing have serious ramifications when performed on VMware Cloud Foundation. Only vSphere administrators with a clear understanding of such possible ramifications should have privileges to VMware Cloud Foundation assets in the vSphere Client. For example, if a vSphere administrator renames a vSphere cluster that was deployed and managed by VMware Cloud Foundation, the mismatch makes that cluster inaccessible to SDDC Manager. Broadcom technical support should be contacted before attempting to fix the problem.

Password Management
SDDC Manager uses the admin account for internal API calls. The password for this account should be changed periodically. Changing passwords is different for the local accounts in SDDC Manager.

To change the password for root and vcf:
- Access SDDC Manager with SSH as the vcf account.
- Run the su command to change to the root account.
- Run the passwd command for the desired account (passwd root, passwd vcf).
The vcf account is used to access SDDC Manager with SSH. The root account cannot log in with SSH by default.

To change the API admin password:
- Access SDDC Manager with SSH as the vcf account.
- Run the su command to change to the root account.
- Run the /opt/vmware/vcf/commonsvcs/scripts/auth/set-basicauth-password.sh admin command.
- Ensure that special characters are properly escaped.
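Put together as a single SSH session, the password procedure looks like the sketch below. The SDDC Manager FQDN is a placeholder, and the script path is the one quoted in these notes; verify both against your environment before use.

# Connect as vcf (root cannot log in over SSH by default), then switch to root.
ssh vcf@sddc-manager.example.local
su -

# Rotate the local account passwords.
passwd root
passwd vcf

# Rotate the API admin (basic auth) password.
# If the new password is supplied on the command line, quote it so that special characters are not interpreted by the shell.
/opt/vmware/vcf/commonsvcs/scripts/auth/set-basicauth-password.sh admin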
NSX Overview
A segment, also known as a logical switch, reproduces switching functionality in an NSX virtual environment. Segments are similar to VLANs: they separate networks and provide network connections to which you can attach VMs. If VMs are connected to the same segment, they can communicate with each other over tunnels between hypervisors. Each segment has a virtual network identifier (VNI), similar to a VLAN ID. However, unlike VLANs, VNIs scale well beyond the limits of VLAN IDs. A segment contains multiple segment ports. Entities such as routers, VMs, or containers are connected to a segment through the segment ports. (A hedged API sketch of creating a segment appears at the end of this overview.)

Segment profiles include layer 2 networking configuration details for logical switches and logical ports. NSX Manager supports several types of switching profiles and maintains one or more system-defined default switching profiles for each profile type. Segment profiles contain different configurations of the logical port. These profiles can be applied at a port level or at a segment level. Profiles applied on a segment apply to all ports of the segment unless they are explicitly overridden at the port level. Multiple segment profiles are supported, including QoS, port mirroring, IP Discovery, SpoofGuard, segment security, MAC management, and Network I/O Control.

NSX Edge nodes are appliances with pools of capacity that can host distributed routing and non-distributed services. NSX Edge nodes provide high availability, using active-active and active-standby models for resiliency. NSX Edge is commonly deployed in DMZs and multitenant cloud environments, where it creates virtual boundaries for each tenant. VMware Cloud Foundation automates NSX Edge deployment.

When you create a VI workload domain, NSX deployments work in the following ways:
- An NSX Manager three-node cluster is deployed in the management domain. The cluster is deployed when you create the first VI workload domain that is supported by NSX, and it can be shared by other VI workload domains or dedicated to a single VI workload domain.
- NSX Manager adds the VI workload domain vCenter instance as a compute manager.
- NSX Manager creates VLAN and overlay transport zones to which all host transport nodes are added.
- NSX Manager creates an uplink profile and applies it to all host transport nodes.
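To make the segment concept from this overview concrete, the sketch below creates an overlay segment through the NSX Policy API with curl. It is an illustration only, not the VMware Cloud Foundation workflow: the NSX Manager FQDN, segment name, transport zone ID, and gateway address are placeholder assumptions, and in a VCF environment segments are normally created through NSX Manager or SDDC Manager automation.

# Create (or update) an overlay segment via the NSX Policy API. All names, IDs, and addresses are placeholders.
curl -k -u admin -X PATCH \
  https://nsx-manager.example.local/policy/api/v1/infra/segments/app-segment-01 \
  -H 'Content-Type: application/json' \
  -d '{
        "display_name": "app-segment-01",
        "transport_zone_path": "/infra/sites/default/enforcement-points/default/transport-zones/<overlay-tz-id>",
        "subnets": [ { "gateway_address": "172.16.10.1/24" } ]
      }'

VMs attached to this segment on different hosts then communicate over Geneve tunnels between the host TEPs, as described above.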
An uplink profile applies to NSX Edge nodes and host transport nodes and declares how they connect to the VLAN and overlay networks (which VLAN they use and which NICs carry the traffic).

Workload Domain Traffic Separation
VMkernel traffic types that can be separated include:
- Management
- Storage
- vSphere vMotion
- Overlay

Use cases for multi-NIC hosts include the following situations (a minimum of 2 NICs per host applies in a multi-NIC setup):
- Adherence to legacy practices: Some customers still segregate different traffic types in either separate vSphere standard switch or VDS configurations, or in a single VDS using port groups and policy to dictate which NICs carry specific traffic types. VMware Cloud Foundation replaces this practice by using a consolidated VDS (NSX), with each traffic type separated by VLANs and Geneve tunnel endpoints (TEPs).
- Traffic segregation for security: For environments that require physical segregation, VMware Cloud Foundation can support the mapping of management, storage, or virtual machine traffic to a logical switch that is connected to dedicated physical NICs. Management VMkernel adapters can be on one pair of physical NICs, with all other traffic (vSphere vMotion, vSAN, VM, and so on) on another pair. You can also isolate VM traffic so that one pair of physical NICs carries all management VMkernel adapters, vSphere vMotion, and vSAN, and a second pair handles the overlay and VM traffic.
- Traffic separation due to bandwidth limitations: Even with aggregation, some customers might still need to separate traffic types to guarantee bandwidth, especially when they cannot use 25 GbE adapters. For example, you can separate VM traffic onto a port group supported by separate physical adapters.

You import an existing vSphere infrastructure into VMware Cloud Foundation in the following scenarios:
- Scenario 1: You do not already have SDDC Manager deployed. You begin by identifying a vSphere cluster to host SDDC Manager. You download and deploy the SDDC Manager appliance (OVA) into the cluster and copy the import scripts (tar archive) to SDDC Manager. You then use SSH to connect to the SDDC Manager instance, extract the import tarball, and run the import script with the convert option to convert the vSphere cluster into a management domain.
- Scenario 2: SDDC Manager has already been deployed. You use SSH to connect to the SDDC Manager instance, extract the import tarball (if it does not already exist), and run the import script with the import option to import vSphere clusters as VI workload domains.

NSX Edge Networking
VMware Cloud Foundation automates the following tasks during the deployment and configuration of NSX Edge clusters:
- Deploys the NSX Edge VMs
- Creates and configures NSX Edge uplink profiles
- Creates and configures uplink segments
- When Border Gateway Protocol (BGP) is used: enables, configures, and verifies Tier-0 BGP
- Creates and configures the initial Tier-0 and Tier-1 gateways
- Expands or shrinks an existing NSX Edge cluster

NSX Edge clusters are associated with workload domains. The north-south routing and networking services provided by the NSX Edge cluster are shared by the entire workload domain and any other workload domains that use the same NSX Manager cluster. SDDC Manager does not support deploying NSX Edge clusters on L2 non-uniform and L3 vSphere clusters.
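Because SDDC Manager drives Edge cluster deployment, the same operation is exposed through its public API, and validating the specification first is the usual pattern. The sketch below is a hedged outline only: the endpoint names follow the SDDC Manager API explorer, but the FQDN, credentials, and spec file are placeholders, and the full edgeClusterCreationSpec (Edge node, uplink, and Tier-0/Tier-1 details) is omitted here.

# Obtain an API access token from SDDC Manager (credentials are placeholders).
TOKEN=$(curl -sk -X POST https://sddc-manager.example.local/v1/tokens \
  -H 'Content-Type: application/json' \
  -d '{"username": "administrator@vsphere.local", "password": "<password>"}' | jq -r .accessToken)

# Validate the NSX Edge cluster creation spec before submitting it.
curl -k -X POST https://sddc-manager.example.local/v1/edge-clusters/validations \
  -H "Authorization: Bearer $TOKEN" -H 'Content-Type: application/json' \
  -d @edge-cluster-spec.json

# After the validation reports SUCCEEDED, trigger the deployment with the same spec.
curl -k -X POST https://sddc-manager.example.local/v1/edge-clusters \
  -H "Authorization: Bearer $TOKEN" -H 'Content-Type: application/json' \
  -d @edge-cluster-spec.json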
NSX Edge Cluster Profile Options
You can choose between Default and Custom NSX Edge cluster profiles. You select Default unless you require a Bidirectional Forwarding Detection (BFD) configuration for the BGP peering: using BFD with BGP peering enhances the reliability and responsiveness of your network by allowing faster detection and response to link failures.

Selecting the Routing Type
You can choose between the following routing types when you deploy an NSX Edge cluster:
- BGP: Dynamic route advertisement requires a physical switch configuration. NSX Edge Tier-0 gateways are configured to distribute routes automatically with physical peers using BGP.
- Static: You must configure all routes to external networks manually.
External BGP (eBGP) is usually the preferred option because route redistribution takes place automatically. For environments with high security requirements, using static routes might be preferable to ensure that secure networks are not unintentionally advertised.

Open Shortest Path First (OSPF) is supported by VMware Cloud Foundation, but it is not the recommended solution:
- OSPF is not integrated in SDDC Manager.
- OSPF is not supported on Application Virtual Network (AVN) uplinks.
- VMware Cloud Foundation does not support the use of OSPF with stretched clusters.
- VMware Cloud Foundation does not support the use of OSPF with federated NSX.
- VMware Cloud Foundation does not support the combined use of BGP and OSPF in a single Tier-0.
To use OSPF: deploy using static routing, deploy the NSX Edge nodes, and then configure OSPF.

Configuring NSX Edge Appliances
You must configure the network interfaces for each NSX Edge appliance:
- Management IP address on the management VLAN
- One edge tunnel endpoint (TEP) IP address for each edge node on the edge TEP VLAN
- One uplink IP address for each edge node on separate uplink VLANs

You can remove NSX Edge nodes only if the following prerequisites are met:
- The NSX Edge cluster must be available in the SDDC Manager inventory and must be active.
- The NSX Edge node must be available in the SDDC Manager inventory.
- The NSX Edge cluster must be hosted on one or more vSphere clusters from the same VI workload domain.
- The NSX Edge cluster must contain more than two NSX Edge nodes.
- The NSX Edge cluster must not be federated or stretched.
- For active-active configurations, the NSX Edge cluster must contain two or more NSX Edge nodes with two or more Tier-0 routers after the NSX Edge nodes are removed.
- For active-standby configurations, you cannot remove NSX Edge nodes that are the active or standby node for the Tier-0 router.

Equal-cost multipath routing (ECMP) is used to load balance traffic across layer 3 connections. BGP is used to exchange routes and peer with NSX Edge nodes. BGP is an interdomain routing protocol that provides loop-free routing between separate routing domains that contain independent routing policies (autonomous systems). The implementation of BGP version 4 includes multiprotocol extensions so that BGP can carry routing information for IP multicast routes and multiple layer 3 protocol address families, including IP version 4 (IPv4), IP version 6 (IPv6), and virtual private networks version 4 (VPNv4). BGP is primarily used to connect a local network to an external network to gain access to the Internet or to connect to other organizations. When connecting to an external organization, eBGP peering sessions are created.
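When BGP is the selected routing type, the peering state can be checked from an NSX Edge node's command line (nsxcli). The commands below are a hedged sketch drawn from common NSX Edge troubleshooting practice; the Edge FQDN and the VRF number are placeholder assumptions, and command availability and output vary between NSX versions.

# SSH to an NSX Edge node as admin; the shell that opens is the NSX CLI.
ssh admin@edge01.example.local

# List logical router instances to find the Tier-0 service router and its VRF ID.
get logical-routers

# Enter the Tier-0 SR VRF (1 is only an example) and check the BGP neighbor state.
vrf 1
get bgp neighbor summary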
Although BGP is referred to as an exterior gateway protocol (EGP), many networks within an organization are becoming so complex that BGP can be used to simplify the organization's internal network. New dvPortgroups in vSphere and new logical switches provisioned in NSX are connected to the north-south ECMP on-ramp.

L2 fabric design (where the ToR switches are L2 devices) has the following characteristics:
- NSX Edge nodes peer with the L3 devices to which the ToR switches connect.
- In this mode, the design is vendor neutral.
- Both BGP and OSPF protocols can be used.

L3 ToR design has the following characteristics:
- The simple design is based on non-LACP uplink teaming.
- With MLAG (Arista and others), LACP is terminated at the ToR.
- Either BGP or OSPF can be used.

Connectivity to Physical Layer 3 Devices: Routing Feature Set
The routing feature set applies to the Tier-0 logical router that peers with the layer 3 device:
- Static routes to the physical network.
- Multihop external BGP (eBGP) to the physical network.
- Equal-cost multipath (ECMP) is supported using static routes and eBGP (8-way ECMP).
- BFD is available to enhance network resilience through rapid and reliable detection of link and path failures between network devices, enabling fast failover.

A BGP session refers to the established adjacency between two BGP routers. BGP sessions are always point-to-point and are categorized into the following types:
- Internal BGP (iBGP): Sessions established with an iBGP router that is in the same AS or participates in the same BGP confederation. iBGP sessions are considered more secure, and some of BGP's security measures are lowered in comparison to eBGP sessions. iBGP prefixes are assigned an administrative distance (AD) of 200 when being installed into the router's routing information base (RIB).
- External BGP (eBGP): Sessions established with a BGP router that is in a different AS. eBGP prefixes are assigned an AD of 20 on installation into the router's RIB.

ECMP (Equal-Cost Multi-Path) is not a routing protocol itself. Instead, it is a routing strategy used to distribute traffic across multiple paths that have the same cost or metric. Key points:
- Routing protocols: Protocols such as OSPF and BGP determine the best path for data to travel through a network.
- ECMP: This technique works with those routing protocols to balance traffic across multiple equal-cost paths, enhancing performance and reliability.

Application Virtual Networks
AVNs are software-defined networks that serve a specialized purpose in the SDDC. These networks can span a defined zone of clusters and traverse NSX Edge service gateways for their north-south ingress and egress (like an on-ramp to the SDDC). They implement software-defined networking based on NSX in the management domain. In the management domain, application virtual networks (AVNs) are created by NSX after the management domain bring-up process and NSX Edge cluster deployment.

AVNs consist of the following components:
- One Tier-0 gateway:
→ Peers with the physical network
→ Configured with equal-cost multipath (ECMP) routing
- One Tier-1 gateway:
→ Services two NSX segments for VMware Aria Suite solution deployments
- One NSX Edge cluster

These AVNs are pre-provisioned for all SDDC management VMs that reside on the overlay network. VMware Cloud Foundation requires these networks for the automated deployment of VMware Aria Suite Lifecycle.
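Because SDDC Manager tracks AVNs in its inventory, you can confirm after bring-up and Edge cluster deployment that they exist by querying the public API. This is a hedged sketch: the endpoint follows the SDDC Manager API explorer naming, and the FQDN and $TOKEN variable (obtained as in the Edge cluster example earlier) are placeholders.

# List the application virtual networks known to SDDC Manager.
curl -k -H "Authorization: Bearer $TOKEN" \
  https://sddc-manager.example.local/v1/avns

If the call returns no AVNs, VMware Aria Suite Lifecycle must be deployed manually, as noted below.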
If the AVNs are not present, you must manually deploy VMware Aria Suite Lifecycle.

The AVNs have specific uses in VMware Cloud Foundation:
- Mgmt-RegionA01-VXLAN is used for VMware Aria Suite workloads that do not require portability between regions:
→ You use this network for VMware Aria Operations for Logs and VMware Aria Automation Proxy Servers guided deployment.
- Mgmt-xRegion01-VXLAN is used for VMware Aria Suite workloads that require portability across regions or failover capability:
→ VMware Cloud Foundation deploys VMware Aria Suite Lifecycle on this network.
→ You use this network for VMware Aria Automation and VMware Aria Operations guided deployment.

AVNs can be one of the following types:
- Overlay-backed NSX segments:
→ Layer 2 traffic is carried by a tunnel between VMs on different hosts.
→ This type requires a direct connection to the Tier-1 gateway.
- VLAN-backed NSX segments:
→ Layer 2 traffic is carried across the VLAN network between VMs on different hosts.

WLD SSO
Workload Domain Preparation
Host Commissioning
- Hosts must be pre-imaged with the correct ESXi version.
- Host names must be resolvable in DNS.
- You cannot change the storage type after the host is commissioned.
- Network pools must exist before you commission hosts (they are created in SDDC Manager).
- Hosts can be commissioned individually or in bulk.
- NTP and Syslog are enabled.
- You can use a JSON template file to bulk-commission a maximum of 32 hosts at a time.

Network Pools
A network pool is a range of IP addresses for specific services that VMware Cloud Foundation assigns to hosts. VMware Cloud Foundation assigns IP addresses for the following host network types:
- vMotion (required)
- vSAN (optional)
- NFS (optional)
- iSCSI (optional)

Network pools automatically assign static IP addresses to vMotion, vSAN, NFS, and iSCSI VMkernel ports on a vSphere standard switch when hosts are commissioned. The VMkernel ports are migrated to a vSphere Distributed Switch (VDS) when hosts are added to a VI workload domain. You must configure a subnet for vSphere vMotion as part of every network pool that you create. You optionally configure subnets for the vSAN, NFS, and iSCSI network types. You can only commission hosts with a particular storage type if subnets are configured for the network pool. When creating a network pool, you enable the network types that you plan to implement on hosts that consume IP addresses from the network pool. You cannot add a network type to a network pool after it is created.

Network Pool Implementation
When a host is initially commissioned, the management network is added to a vSphere standard switch. When hosts are added to a VI workload domain, the networking uses a vSphere Distributed Switch (VDS):
- The management networking is migrated from the old vSphere standard switch to a newly created VDS port group.
- A VDS port group is created for each of the networks defined in the network pool.
- Hosts are connected to each port group and assigned addresses from the ranges in the network pool.

Network Pool Settings
You must provide the following information when creating a network pool:
- VLAN ID for each network type
- Subnet for each network type
- Range of IP addresses (inclusion range)
- MTU settings
- Gateway settings
Each network requires a reserved subnet and VLAN for the hosts. The VLAN must be configured on the physical network before you add hosts that use the VLAN defined in the network pool.
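These settings map directly onto the network pool specification accepted by the SDDC Manager API. The sketch below is illustrative only: the endpoint follows the API explorer naming, but the exact field names, and all VLAN, MTU, subnet, and range values, are assumptions to be checked against your SDDC Manager's API reference before use.

# Create a network pool with vMotion and vSAN networks (placeholder values throughout).
curl -k -X POST https://sddc-manager.example.local/v1/network-pools \
  -H "Authorization: Bearer $TOKEN" -H 'Content-Type: application/json' \
  -d '{
        "name": "wld01-np01",
        "networks": [
          { "type": "VMOTION", "vlanId": 1712, "mtu": 9000,
            "subnet": "172.17.12.0", "mask": "255.255.255.0", "gateway": "172.17.12.1",
            "ipPools": [ { "start": "172.17.12.10", "end": "172.17.12.50" } ] },
          { "type": "VSAN", "vlanId": 1713, "mtu": 9000,
            "subnet": "172.17.13.0", "mask": "255.255.255.0", "gateway": "172.17.13.1",
            "ipPools": [ { "start": "172.17.13.10", "end": "172.17.13.50" } ] }
        ]
      }'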
You must also ensure that your network switches are configured to support the MTU that you define in the network pool. You cannot change the subnets assigned to a network pool after the network pool is created.

Sizing and Managing Network Pools
Correctly sizing your subnets helps to prevent problems related to insufficient IP addresses. You must specify a subnet with adequate available addresses because the subnet cannot be changed after the network pool is created. Before sizing your network pool, you must consider the number of hosts that you plan to have in each cluster. You must also consider how many clusters will use the network pool. All hosts in a cluster must use the same network pool, and each host consumes one IP address from each configured subnet. Subnets must also have sufficient IP addresses for other services such as DHCP, DNS, gateways, and firewalls.

You can only adjust an existing network pool in the following ways:
- Change the name of the network pool
- Add additional included IP address ranges
You should plan your network pools to reserve what you need for a pool rather than larger blocks of addresses: after an IP address range is included in a network pool, those addresses are locked to that pool. A network pool cannot be deleted if any IP addresses in the pool are in use.

Workload Domain Configuration
A transport zone defines the span of a logical network over the physical infrastructure. Transport nodes can participate in the following transport zones:
- Overlay:
→ Used as the internal tunnel between ESXi hosts and NSX Edge transport nodes
→ Carries Geneve-encapsulated traffic
- VLAN:
→ Used at NSX Edge uplinks to establish northbound connectivity to top-of-rack (ToR) switches
→ Can be used to connect workload VLANs to the physical network

Sub-transport node (introduced in VMware Cloud Foundation 5.1): per-rack transport node NSX configuration.

You must follow these guidelines:
- A transport node can have multiple NSX virtual switches if the transport node has more than two pNICs.
- A transport zone can only be attached to a single NSX virtual switch on a given transport node.
- An NSX virtual switch can attach to only a single overlay transport zone.
- An NSX virtual switch can attach to multiple VLAN transport zones.
- A segment can belong to only one transport zone.

Transport zones determine which hosts can participate in a network and have the following characteristics:
- A single transport zone can have all types of transport nodes (ESXi and NSX Edge).
- A transport zone does not represent a security boundary.
- A hypervisor transport node can belong to multiple transport zones.
- An NSX Edge node can belong to one overlay transport zone and multiple VLAN transport zones.

VMware Cloud Foundation supports a single overlay transport zone per NSX instance, which is particularly suitable for multitier network architectures in which L2 VLANs extend across racks. In a Layer 2 transport switch fabric design (also called a multitier architecture):
- ToR switches and upstream Layer 3 devices, such as core switches or routers, form a switched fabric.
- L2 VLANs and NSX overlay networks are extended across racks.
- The upstream Layer 3 device terminates each VLAN and provides default gateway functionality.
- Uplinks from the ToR switch to the upstream Layer 3 devices are 802.1Q trunks carrying all required VLANs.
The multitier architecture might require a specialized data center switching fabric product from a single vendor.

Sub-transport node profiles can be used to define per-rack NSX configurations that are applied to sub-clusters. Sub-transport node profiles are particularly suitable for spine-leaf network architectures. In NSX, a sub-cluster is a grouping of hosts that share common network configuration attributes, typically including elements such as IP address subnets and other networking parameters.

From VMware Cloud Foundation version 5.1:
- The Default profile uses two or more pNICs and a single VDS prepared for NSX.
- The Custom profile option is also available using the SDDC Manager UI.
After hosts are commissioned to the inventory as Unassigned, they can be assigned during VI workload domain creation.

You must make the following preparations before creating a workload domain:
- You can use a static IP pool for the NSX host overlay network.
- If you plan to use DHCP for the NSX host overlay network, verify that a DHCP server is available to provide IP addresses to the NSX tunnel endpoints (TEPs).
- Verify that DNS records exist for vCenter instances and NSX Manager.
- Verify that enough unassigned hosts with the correct storage type are available.
- Verify that all hosts are associated with the same network pool.
- Appropriate license keys with sufficient capacity must be added in SDDC Manager.

The VLAN in which the TEPs are assigned must have an available DHCP server to assign IP addresses for the overlay network VMkernel ports. You cannot assign static TEP address pools using VMware Cloud Foundation automated deployment of NSX. VMware Cloud Foundation creates a DHCP IP address pool in NSX Manager by API during VI workload domain creation. VI workload domain creation fails if a DHCP server is unavailable on the overlay network VLAN.

Clarification: TEPs and Overlay Transport Zones
- TEPs (tunnel endpoints): Used for encapsulating and decapsulating overlay traffic in NSX. They are essential for the overlay network to function.
- Overlay transport zones: Allow the creation of logical networks that span multiple physical networks and data centers using encapsulation protocols such as VXLAN or Geneve.

VLANs for TEPs
- VLAN assignment: Even though TEPs are part of the overlay transport zone, they still need to communicate over the physical network. This is where VLANs come in. VLANs provide the Layer 2 network segmentation required for TEPs to communicate with each other across the physical network.
- DHCP requirement: During the automated deployment of NSX in VMware Cloud Foundation, DHCP is used to dynamically assign IP addresses to the TEPs. This ensures that each TEP gets a unique IP address without manual intervention.

Bridging the Concepts
- Overlay and VLAN integration: The overlay network (using TEPs) operates on top of the physical network, which uses VLANs for segmentation. The VLAN assigned to the TEPs ensures that the encapsulated overlay traffic can traverse the physical network efficiently.
- Communication: TEPs need to communicate with each other, and this communication happens over the physical network. By assigning a VLAN to the TEPs, you ensure that this communication is properly segmented and managed.
In summary, while TEPs are indeed part of the overlay transport zone, they still rely on VLANs for their underlying physical network communication.
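As a concrete illustration of the DHCP requirement, the sketch below configures a dnsmasq scope for the host overlay (TEP) VLAN on a Linux machine attached to that VLAN. This is a lab-style example under stated assumptions, not a VCF procedure: the interface name, VLAN ID, subnet, gateway, and lease range are placeholders, and production environments typically use an existing enterprise DHCP service reached through a relay (IP helper) on the TEP VLAN.

# Example dnsmasq scope for the host overlay (TEP) VLAN; all values are placeholders.
cat > /etc/dnsmasq.d/host-overlay-tep.conf <<'EOF'
# Listen only on the VLAN subinterface that carries TEP traffic (VLAN 1714 here).
interface=ens192.1714
# TEP address range, netmask, and lease time.
dhcp-range=172.17.14.50,172.17.14.150,255.255.255.0,12h
# Default gateway and interface MTU handed out with each lease.
dhcp-option=option:router,172.17.14.1
dhcp-option=option:mtu,9000
EOF
systemctl restart dnsmasq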
This setup allows the overlay network to function seamlessly on top of the physical infrastructure. To make a DHCP server available on a VLAN, you need to ensure that the DHCP server can communicate with clients on that VLAN. When you configure an overlay transport zone in NSX, you also need to configure VLANs for the underlying physical network that the TEPs use.

Steps to Configure VLANs for Overlay Transport Zones
1. Create the overlay transport zone: Log in to NSX Manager. Navigate to System > Fabric > Transport Zones. Click Add Zone. Enter a name and description for the transport zone. Select Overlay as the traffic type. Click Save.
2. Configure VLANs for TEPs: Ensure that the physical network is configured with the VLANs that will be used for TEP communication. This involves setting up the VLANs on your physical switches and ensuring that they are properly trunked to the ESXi hosts. Also ensure that a DHCP server is available on the VLAN that will be used for TEPs; this is necessary for dynamically assigning IP addresses to the TEPs during the automated deployment process.
3. Assign VLANs to TEPs: When you prepare the ESXi hosts for NSX, you assign the VLANs to the VMkernel adapters used for TEPs. In NSX Manager, navigate to System > Fabric > Nodes > Host Transport Nodes, select the host, and click Configure NSX. In the transport node profile, specify the VLAN ID for the TEPs under the N-VDS/VDS configuration section.
4. Verify the configuration: After configuring the VLANs and TEPs, verify that the TEPs have obtained IP addresses from the DHCP server and that the TEPs can communicate with each other across the physical network.

Summary:
- Overlay transport zones: Used for encapsulated traffic.
- VLANs: Required for the physical network communication of TEPs.
- DHCP: Necessary for dynamically assigning IP addresses to TEPs.

Workload Domain Design and Sizing
vSphere Networking: Distributed Port Group Design Example
When you design vSphere networking, consider the configuration of the VDS instances, distributed port groups, and VMkernel adapters in the VMware Cloud Foundation environment. Some networks, such as vSphere vMotion and vSAN, are created by VMware Cloud Foundation, whereas others, such as NFS, are optional. The NFS network is used when NFS is the principal storage in the workload domain. You must create a network pool, specifying an NFS VLAN ID and subnet IP addresses.

Separating different types of traffic is required for access security and to reduce contention and latency. High latency on any network can negatively affect performance. Some components are more sensitive to high latency than others. For example, reducing latency is important on the IP storage and vSphere FT logging networks because latency on these networks can negatively affect the performance of multiple VMs.
Depending on the application or service, high latency on specific VM networks can also negatively affect performance.

Software-Defined Networking Design: Shared NSX Manager Instances
VI workload domains can share existing NSX Manager instances. For different technical requirements or business reasons, you might need dedicated NSX Manager instances that are deployed for an individual workload domain. Examples include:
- Test and development workloads that do not use the production network
- Business units that require separate management and billing for their consumption
- Applications in the VI workload domain that have an NSX version dependency
- Multitenant workload requirements

Overview of vSAN in VCF
VMware Cloud Foundation automates the following tasks related to vSAN during both bring-up and VI workload domain creation:
- Creates vSphere Distributed Switch (VDS) instances and configures Network I/O Control
- Creates a vSAN VMkernel port on the VDS
- Enables vSAN and applies licenses on the cluster
- Creates disk groups on each host
- Configures the vSAN Default Storage Policy, if necessary
- Configures vSphere High Availability (HA)

You must use a network pool with the vSAN storage type with sufficient IP addresses included for all hosts. vSAN uses the fastest device type in all-flash disk groups for caching. NVMe is determined to be the fastest device type, SAS is the second fastest, and SATA is the slowest.

If you want to scale an existing vSAN cluster, you must consider the following details:
- Hosts must be added to the cluster using SDDC Manager.
- You must commission hosts using the same network pool as other hosts in the cluster.
- You should use homogeneous hardware across the cluster. If you add disks or disk groups to one host, you should do the same for all hosts.
- You add disks and disk groups using the vSphere Client.

The best practice is to use homogeneous hardware in vSAN clusters. Using homogeneous hardware results in more stable and predictable performance. If you keep hosts equally sized from both a compute and a storage perspective, host failure responses are easier to plan. If you add hosts to a cluster built with servers that are no longer available from your server vendor, you should strive to obtain server hardware with a configuration as close as possible to that of the rest of the cluster. This guidance is especially true for the storage components.

Storage Policy-Based Management (SPBM)
vSAN policies protect components such as NSX, SDDC Manager, workload VMs, and vSAN data by placing data objects strategically across the datastore. VM storage policies apply to the following objects:
- VM home namespaces
- VMDK objects
- Thin-provisioned VM swap object
- One or more snapshot delta objects
- One or more VM memory objects
- vSAN performance data objects
- vSAN File System Service

RAID 5 in Four-Node Clusters
Before you use RAID 5 in the four-node management domain cluster, consider the following points:
- The same object protected with RAID 1 has more rebuild flexibility.
- A RAID 5 object creates four components spread across all four hosts. When a host is in maintenance mode, vSAN cannot recreate the component from that host.
- Changing from RAID 1 (mirroring) to RAID 5 (erasure coding) requires additional temporary disk space.
- You can add a fifth host where possible to ensure redundancy levels if a host fails.
Consider the following scenario: the VMware Cloud Foundation management domain four-node cluster includes NSX Manager on a RAID 5 FTT=1 policy and SDDC Manager on a RAID 1 FTT=1 policy. One host enters maintenance mode with the Ensure accessibility option, which affects a component of each VM. Both NSX Manager and SDDC Manager continue to function if no other failures occur.

When the host maintenance extends beyond one hour, the following events occur:
- vSAN rebuilds the missing RAID 1 component for SDDC Manager on the third host, using the component that still exists.
- vSAN cannot rebuild the NSX RAID 5 component because no host is available.

If another failure occurs, affecting the remaining components of the two objects, the following events occur:
- SDDC Manager keeps running because two of its three components are available.
- NSX Manager becomes unavailable and might become stale and inoperable, depending on which components are restored later.

Deduplication removes redundant data blocks, whereas compression removes additional redundant data in each data block.

vSAN ESA in VCF
NVMe is a high-performance, non-uniform memory access (NUMA) optimized, and highly scalable storage protocol that connects the host to the memory subsystem. The protocol is feature-rich and newly designed for NVM media (NAND and persistent memory) that is directly connected to the CPU through the PCIe interface. NVMe provides businesses with more options in terms of data (especially fast data) for real-time analytics and emerging technologies. The protocol is built on high-speed PCIe lanes. A PCIe generation 3.0 link can offer transfer speeds more than two times faster than a SATA interface. The NVMe protocol capitalizes on parallel, low-latency data paths to the underlying media, similar to high-performance processor architectures. This protocol offers significantly higher performance and lower latencies compared to legacy SAS and SATA protocols. The protocol accelerates existing applications that require high performance, and it supports new applications and capabilities for real-time workload processing in the data center.

In vSAN ESA, all disks are NVMe SSDs, and a separate cache tier is no longer needed. vSAN ESA provides the following advantages:
- More efficient use of disks: Disks no longer must be allocated only to cache.
- Better resiliency: A failure of a cache disk does not impact other disks.
- Improved I/O flow: Data does not need to land on cache before being destaged to capacity.

A vSAN ESA storage pool has the following features:
- Single-tier architecture
- Only supports NVMe-based flash devices
- Pool of independent storage devices
- Reduced I/O flow (no two-tier architecture)
- Maximum number of disks defined by the number of disk slots

With ESA, the new disk architecture for vSAN 8 uses storage pools, which makes each disk its own independent device. Because only NVMe disks are supported, the disks serve both reads and writes. Storage pools simplify the I/O process by only having to write to one disk. In addition, storage pools allow other features to be more streamlined, such as encryption and compression.
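On an individual ESXi 8.x host that is part of an ESA cluster, the single-tier storage pool can be inspected from the host shell. The commands below are a hedged sketch: the esxcli vsan namespaces shown exist on recent ESXi builds, but output formats and availability can vary by release.

# Confirm the host's vSAN cluster membership.
esxcli vsan cluster get

# List the NVMe devices claimed into the vSAN ESA storage pool on this host.
esxcli vsan storagepool list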
vSAN ESA New I/O Engine
vSAN ESA uses a highly parallel and efficient I/O engine, which produces the following results:
- Compresses data once at ingest to reduce network traffic and CPU resources
- Encrypts data once at ingest to reduce CPU resources
- Checksums data at ingest to reuse already calculated CRCs
- Performs full-stripe writes in parallel and asynchronously, eliminating read-modify-write activities

vSAN Log-Structured File System
vSAN ESA introduces a new log-structured file system, known as the vSAN LFS, which provides the following benefits:
- Reduces I/O amplification
- Low overhead
- Compatible with future device types
- Allows for high-performance snapshots

The Log-Structured Filesystem (LFS) works as follows:
- Ingests smaller incoming I/O from the guest VM into an in-memory stripe buffer
- Packages the smaller writes into larger I/O blocks
- Performs encryption, compression, and checksum
- Prepares the data to be written to the performance leg
- When a full-stripe write accumulates in the performance leg, it is written to the capacity leg of the corresponding object
- Reduces I/O amplification by performing encryption, compression, and checksum in one location instead of on each individual host

New default vSAN ESA storage policies have the following characteristics:
- Compression (space efficiency) is enabled by default unless it is disabled.
- The number of disk stripes per object, flash read cache reservation, and storage tier are not relevant for vSAN ESA.
- Granular storage policies (per VMDK) are not supported for vSAN ESA; vSAN ESA supports per-object storage policies.

vSAN ESA Auto-Policy Management configures optimized storage policies based on the cluster type and the number of hosts in the cluster inventory. Enabling Auto-Policy Management on the cluster changes the default storage policy from the vSAN Default Storage Policy to the new cluster-specific default storage policy. Changes to the number of hosts in the cluster or to the Host Rebuild Reserve prompt you to make a suggested adjustment to the optimized storage policy. Turn on vSAN ESA Auto-Policy Management for vSAN ESA clusters in VMware Cloud Foundation.

VM Components (vSAN ESA)
For the components of a single hard disk with the default policy created by Auto-Policy Management, the RAID 1 section refers to the performance leg in vSAN ESA, whereas the RAID 5 components constitute the capacity leg.

vSAN Requirements for Management and Workload Domains
Supported:
- Management and VI workload domain creation with vSAN ESA
- vSAN ESA host commissioning
- Additional clusters with vSAN ESA (requires vSphere Lifecycle Manager to be enabled during VI workload domain creation)
- Cross Cluster Capacity Sharing
Unsupported:
- Cross vCenter Cross Cluster Capacity Sharing (supported from vCenter)
- Conversion from vSAN OSA to vSAN ESA
- Conversion from vSAN ESA to vSAN OSA
vSAN ESA clusters require vSphere Lifecycle Manager to be enabled. Cross vCenter HCI Mesh is not supported from the VMware Cloud Foundation UI or API. However, it can continue to be used if the user has configured it from vCenter.

Parameters for vSAN ESA
You create a vSAN ESA cluster-based deployment by modifying the bring-up parameter sheet. The user has the option of providing the path to the HCL JSON file manually.
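One way to supply that file is to fetch the vSAN HCL database on an Internet-connected machine and copy it to a location the deployment can reach. This is a hedged sketch only: the URL shown is the one historically used by the vSAN health check and may have changed (verify it in current Broadcom documentation), and the destination host and path are placeholders.

# On a machine with Internet access, download the vSAN HCL database (verify the current URL).
curl -o all.json https://partnerweb.vmware.com/service/vsan/all.json

# Copy it to a path reachable by the deployment, then reference that path in the bring-up parameter sheet.
scp all.json admin@cloud-builder.example.local:/tmp/vsan-hcl.json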
This is useful for customers in air-gapped environments without Internet connectivity. The vSAN Health Check plugin performs automatic verification of your underlying hardware (hosts, disks, storage controllers, and drivers) by automatically checking it against VMware's vSAN HCL. The vSAN HCL database can either be downloaded automatically from VMware.com or manually uploaded if you do not have direct or proxy Internet access.

VMware Cloud Builder Validation
VMware Cloud Builder performs the following validations for vSAN ESA deployment:
- Proxy configuration validation
- HCL JSON file validation (a newer HCL file is downloaded if the file is out of date)
- vSAN ESA disk eligibility validation
- ESXi host vSAN HCL compatibility validation
A host commissioned with vSAN OSA cannot be used for vSAN ESA, and vice versa. During VI workload domain creation, you select vSAN ESA as the storage type and a vSphere Lifecycle Manager image for the cluster.

VI Workload Domain Creation Workflow
Host commissioning and VI workload domain creation with vSAN ESA have the following critical steps in their workflow:
Host commissioning:
- Commission a free host.
- Select vSAN ESA as the storage type.
- HCL and disk compatibility is verified during the process.
- Validate and commission the host.
VI workload domain creation:
- Enable the vSAN ESA option in the GUI during VI workload domain creation. If you use the API, enable the vSAN ESA option in the datastore specification.
- Trigger the VI workload domain deployment.
- The disks are automatically claimed during cluster creation.
- Using Auto-Policy Management, the default storage policy is created based on the cluster size.
- Cluster creation completes.

Adding a Cluster to a Workload Domain
The cluster addition workflow is similar to creating the vSAN ESA cluster for a new VI workload domain:
- Use commissioned vSAN ESA hosts to create clusters.
- vSAN ESA supports HCL-aware disk claim:
→ All eligible disks are consumed during the cluster creation process.
→ This process also claims disks for newly added hosts.
→ If the disk claim fails to consume disks, a vSAN health check is shown with possible causes.
- Prechecks are performed to check whether the underlying hardware supports vSAN ESA. Automatic claiming is only performed if the hardware (disks and controllers) is validated by the vSAN HCL list.
- vSAN Direct is not supported with vSAN ESA.
- Configuration options for enabling deduplication are not available for vSAN ESA clusters. Deduplication and compression are policy-based in vSAN ESA and not cluster-wide.

vSAN Max Support
vSAN Max is supported as principal storage in VMware Cloud Foundation 5.2 with vSAN 8 U3. When using VMware Cloud Foundation, a disaggregated vSAN Max deployment is not supported in a stretched cluster topology. VMware Cloud Foundation only supports a stretched cluster when using an aggregated vSAN HCI deployment option. As with environments not using VMware Cloud Foundation, the selection of vSAN HCI or vSAN Max must be performed at cluster creation time. You cannot retroactively change this setting.

VCF Storage Design
vSAN design principles: hardware
- Verify that storage controllers provide the required level of performance:
→ To help reduce the failure domain and improve performance, use multiple storage I/O controllers per host.
→ Use the recommended minimum of two disk groups. One to five disk groups per host are supported.
→ Choose storage I/O controllers that have as large a queue depth as possible.
→ Prefer pass-through mode over RAID-0.
→ Disable controller cache and advanced features for acceleration.

vSAN Design Principles: Performance and Availability
- Recognize workload performance and availability requirements.
- Check the network infrastructure on site, including throughput, VLAN separation, and jumbo frames.
- Use Network I/O Control. VMware Cloud Foundation includes default configurations, but they might require modification.
- Consider vSAN CPU and memory overhead.
- Consider host versus custom failure domains. The domains are dependent on the available FTM and FTT options.
- Consider enabling vSAN Reserved Capacity for vSAN cluster maintenance. The Reactive Rebalance threshold is set at 80 percent by default.

Operations
vSphere administrators are now storage administrators.
- Create strategies for vSphere host maintenance mode: Ensure accessibility and full data migration.
- Maintenance mode counts as a failure: consider FTT = 2 for production workloads.
- Discuss resync traffic and duration, and failure scenarios.
- Discuss deduplication and compression change error handling on capacity devices.

Management Domain Availability and Backup
Availability of Key Infrastructure Components
VMware Cloud Foundation requires the availability of the following infrastructure services to function:
- Directory services
- DNS
- Network Time Protocol (NTP)
- Dynamic Host Configuration Protocol (DHCP)
- Certificate authority
- Border Gateway Protocol (BGP) peers

Of note: VMware Cloud Foundation supports both Active Directory (AD) and OpenLDAP as identity sources. You should configure at least two domain controllers in your domain to ensure directory services availability. Each domain controller should be located in a different physical environment. You can use AD for authentication and authorization with VMware Cloud Foundation, but it is not a requirement.

DNS servers must be highly available. If name resolution cannot be performed, VMware Cloud Foundation software components cannot communicate. If DNS resolution fails, you cannot create workload domains or run other workflows.

NTP is critical for authentication and troubleshooting purposes. vCenter Single Sign-On is especially sensitive to time drift because tokens for authentication are issued with a time restriction. If the time drift between components is too great, authentication fails. Multiple NTP servers should be available to all components in the VMware Cloud Foundation instance. Using at least three NTP sources ensures that a quorum can be provided in case one NTP server drifts or has an outage.

VMware Cloud Foundation uses static IP pools or DHCP to obtain IP addresses for NSX tunnel endpoints on ESXi hosts during workload domain creation. If you are using DHCP, you must ensure that DHCP servers are available on the VLAN assigned to the tunnel endpoints. If an ESXi host is rebooted, DHCP must be available to reassign IP addresses to the VMkernel ports used as the tunnel endpoints.

A certificate authority must be available when you want to replace the VMCA-signed certificates that are created during the bring-up process. BGP peers must be in place and available by the time you deploy an NSX Edge cluster for the management domain.

Configuring an external SFTP server as a backup location is a best practice for the following reasons:
- An external SFTP server is a prerequisite for restoring SDDC Manager file-based backups.
- An external SFTP server provides better protection against failures because it decouples NSX backups from SDDC Manager backups.
- An external SFTP server also means that a failed SDDC Manager does not take its own backups down with it.

Schedule your backups on an hourly, daily, or weekly basis, depending on how frequently your environment changes. If you make significant changes to your VMware Cloud Foundation environment, you can make an unscheduled backup to ensure that a failure does not cause significant data loss.

Restoring SDDC Manager Backups
Restoring SDDC Manager involves the following steps:
- Use the vSphere Client to deploy a new SDDC Manager OVA into the management cluster.
- Take a snapshot of the newly deployed SDDC Manager instance as a safety measure.
- Power on SDDC Manager and use the CLI to restore the SDDC Manager backup to the newly deployed VM.

NSX Manager backs up each node of the NSX Management cluster every hour by default.

Using vSphere HA in the Management Domain
VMware Cloud Foundation deploys the management domain with vSphere High Availability (HA) settings that are configured according to VMware best practices. The default vSphere HA settings are as follows:
- Host Failure Response: Restart VMs
- Response for Host Isolation: Power off and restart VMs
- Datastore with Permanent Device Loss: Disabled
- Datastore with All Paths Down: Disabled
- VM Monitoring: VM Monitoring Only
You should contact Broadcom Support before changing any vSphere HA configurations.

The option to power off and restart VMs on isolated hosts is required in vSAN clusters because of the nature of data placement in the vSAN datastore. If a VM is running on an isolated host, it cannot access its data components, and applications running in the VM fail. vSphere HA restarts the VM automatically on hosts that can access the data components. The VM restart inevitably incurs a small amount of downtime, but the application is promptly made available again. vSphere HA restarts VMs and contributes to service availability for most failures. If a host fails, vSphere HA restarts the VMs elsewhere in the cluster. In the management cluster, vSphere HA also monitors VMware Tools in each VM to ensure that the OS is healthy and running. If VMware Tools heartbeating fails for a continuous period of 50 seconds, the VM is reset to ensure that services are restored as quickly as possible. All components deployed as part of the VMware Cloud Foundation software stack run VMware Tools by default.

Restoring NSX Manager appliances uses the SFTP server directly:
- Power off any failed NSX Manager appliances from the old NSX Management cluster.
- Deploy one new NSX Manager appliance using the same IP and FQDN as the previous node.
- Log in to the NSX Manager UI at https:///
- Make the new NSX Manager active.
- Navigate to the Backup & Restore page.
- Configure the SFTP server details used by the previous NSX Manager node.
- In the list of NSX Manager backups, select the desired backup and click RESTORE.
- Deploy additional NSX Manager nodes for the newly restored cluster.

Stretched vSAN Clusters in VCF
Availability zones are collections of infrastructure components in different physical locations managed by a single VMware Cloud Foundation instance.
Availability zones have the following characteristics:
- Independent power, cooling, network, and security
- Physically separate so that they are not affected by the same disaster
- Connected using high-bandwidth (10 Gbps), low-latency (less than 5 ms) networks

Stretched Cluster Architecture
Stretched clusters provide resilience across availability zones. A stretched cluster consists of two active availability zones and one witness site:
- Each availability zone represents its own fault domain, allowing the failure of an entire availability zone in the stretched cluster.
- Each availability zone must contain the same number of hosts.
- A witness site contains a single host that maintains witness components for objects that need them.
Each availability zone in the stretched cluster is configured as a fault domain.

Stretched Cluster Use Cases
Planned maintenance:
- For planned maintenance of one availability zone without any service downtime
- For migrating applications back after maintenance is complete
Service outages:
- To prevent production outages before an impending service outage, such as power outages
- To avoid downtime, not to recover from it
Automated recovery:
- For automated initiation of VM restart or recovery
- When you want a low recovery time objective (RTO) for most unplanned failures
- When you want users to focus on application health after recovery, not on how to recover VMs

VMware Cloud Foundation 5.2 supports vSAN Express Storage Architecture (ESA) in a stretched cluster topology.

Stretched Cluster Requirements
vSAN stretched clusters in VMware Cloud Foundation have the following requirements:
- You must have a vSAN Enterprise license.
- The management domain cluster must be stretched before any VI workload clusters are stretched.
- Networking in both availability zones must meet the following requirements:
→ The round-trip time (RTT) between availability zones must be less than or equal to 5 milliseconds.
→ Enough IP addresses must be available on the IP pool configured for the Host Overlay Transport in each availability zone.
→ The vSphere vMotion, vSAN, host overlay, and management networks must be stretched (L2) or routed (L3) between availability zones.
→ The NSX Edge Uplink and NSX Edge Overlay Transport VLANs must be stretched (L2) across both availability zones.

In a campus area network, the vSphere vMotion, vSAN, and host overlay networks might be stretched across the availability zones. In most other cases, these networks typically have different VLANs in each availability zone. The vSphere vMotion, vSAN, and host overlay networks must route between availability zones if they are not stretched. If vSphere vMotion networks are not in the same layer 2 domain, you must configure gateways for the vSphere vMotion VMkernel ports. vSphere vMotion VMkernel ports use the vMotion TCP/IP stack instance by default, and they can therefore use a different gateway than the default TCP/IP stack. VMware Cloud Foundation defines a gateway for vSphere vMotion in the network pool configuration. You must ensure that this gateway is reachable and can route traffic between the availability zones.

Stretched Cluster Requirements: Witness Sites
You must have a third witness site for the witness host or appliance. The witness site must meet the following requirements:
- The vSAN and management networks must have routing to the witness site.
- The RTT between the availability zones and the witness host must be less than or equal to 200 milliseconds.
With 11 or more hosts per availability zone, this requirement changes to less than or equal to 100 milliseconds.

When configuring a stretched vSAN cluster in VMware Cloud Foundation, you must create separate network pools for each availability zone. From VMware Cloud Foundation 5.1, you can use NSX sub-transport node profiles with static IP pools or DHCP to assign different tunnel endpoint (TEP) IP pools to sub-clusters in the same availability zone. In the example in the slide, Host 5 and Host 6 can form a sub-cluster and be assigned a particular TEP IP pool. Host 7 and Host 8 can form a second sub-cluster in the same availability zone and can be configured with a different TEP IP pool using sub-transport node profiles.

When configuring a vSAN stretched cluster, you begin by configuring the hosts in the first availability zone (AZ1). You then stretch the cluster by adding the hosts in the second availability zone (AZ2). You must use APIs to stretch a VMware Cloud Foundation vSAN cluster.

Validating the JSON Input
To prevent undesired consequences, you must always validate the JSON input before executing the APIs in your production environment:
- You validate the JSON input using the POST /v1/clusters/{id}/validations API.
- You must resolve any issues before proceeding with the API execution. Failure to validate the JSON input results in a failed deployment and might require manual cleanup.
- You must see a status of "SUCCEEDED" in the Response section of the validation before you proceed with the deployment.
To stretch the cluster, you paste the JSON spec that you created into the PATCH /v1/clusters/{id} API. The hosts that are commissioned with the original network pool are placed in the AZ1 fault domain. The hosts commissioned with the AZ2 network pool are placed in the AZ2 fault domain.

You must configure the NSX Edge Tier-0 gateways to ensure that they can continue to route traffic after failing over to AZ2:
- You configure IP prefix lists for outbound route advertisements.
- You configure inbound and outbound route maps with local preference and AS-path prepend values for AZ2.
- You add BGP neighbors for AZ2, using the route maps as route filters.

In plain terms, the goal is that if the primary availability zone fails, the AZ2 uplinks can take over and traffic keeps flowing:
1. IP prefix lists: Ranges of IP addresses (for example, a default-route list and an "any" list) that control which routes are advertised to BGP peers. They are not routes themselves; they act as match criteria.
2. Route maps: Rules that reference the prefix lists and mark paths as more or less preferred (local preference for inbound traffic, AS-path prepend for outbound traffic).
3. BGP neighbors: The physical routers in each availability zone with which the Tier-0 gateway exchanges routes; the route maps are applied to the AZ2 neighbors as route filters.
4. Traffic preference: Under normal conditions the AZ1 paths are preferred; if AZ1 fails, traffic switches to the AZ2 paths.
Deploying a vSAN Stretched Cluster
Download a sample clusterUpdateSpec JSON file in the API Explorer.
Expand Clusters > POST /v1/clusters/{id}/validations in the API Explorer.
DIGITAL CERTIFICATE OVERVIEW
Key Components of PKI
Public key infrastructure (PKI) consists of the following key components:
Certificate authority (CA)
Digital certificate
Certificate signing request (CSR)
Public key
Private key
Root certificate
More details about key components of PKI:
CA: The CA is a trusted body that is authorized to issue digital certificates to an organization. The CA also performs the background verification of the organization requesting certificates.
Digital certificate: A trusted CA issues a digital certificate to an organization to secure an FQDN or host name. The digital certificate proves the identity of the server.
CSR: The first step to obtaining a digital certificate is to generate a CSR and send it to your CA. Upon receiving a CSR, the CA performs a background check and issues a certificate.
Public key: The public key is distributed publicly so that client devices can encrypt data using the server's public key.
Private key: The server uses the private key to decrypt the data.
Root certificate: The root certificate identifies the root CA in the chain of trust. A trusted root CA is usually preinstalled in operating systems and browsers.
SDDC Manager provides the following options for replacing self-signed certificates that offer different levels of automation:
Certificate management fully automated by VMware Cloud Foundation:
- Integrated external Microsoft CA
- Integrated internal OpenSSL CA
Certificate management partially automated by VMware Cloud Foundation:
- Third-party CA
SDDC Manager can generate the CSR, send the CSR, and download certificates automatically from a Microsoft CA.
OpenSSL is a built-in CA in SDDC Manager. OpenSSL is fully automated and does not require additional configuration. Certificates issued by OpenSSL are self-signed certificates and do not have a chain of trust to a root CA.
SDDC Manager also provides an option to install certificates issued by a third-party CA, but you must manually send the CSR to the CA and download the generated certificate.
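For context on what a CSR actually contains (a public key plus subject information, signed with the matching private key), the following is a generic OpenSSL sketch. It is illustrative only; in VMware Cloud Foundation, SDDC Manager generates CSRs for the managed components, and the FQDN and subject values shown here are placeholders. The -addext flag requires OpenSSL 1.1.1 or later.

# Illustrative only: generate a 2048-bit RSA private key and a CSR for a placeholder FQDN.
# SDDC Manager generates CSRs for the managed components; this simply shows the
# relationship between the private key, the public key, and the CSR subject/SAN.
openssl req -new -newkey rsa:2048 -nodes \
  -keyout sddc-manager.key \
  -out sddc-manager.csr \
  -subj "/C=US/O=Example Org/CN=sddc-manager.example.local" \
  -addext "subjectAltName=DNS:sddc-manager.example.local"

# Inspect the CSR contents before sending it to the CA.
openssl req -in sddc-manager.csr -noout -text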
After the certificate is downloaded and saved in the correct location on SDDC Manager, SDDC Manager can orchestrate the certificate installation on the relevant software component of VMware Cloud Foundation.
Configuring Microsoft CA: Requirements
SDDC Manager uses the Certification Authority Web Enrollment role in AD to obtain signed certificates:
Configure and issue a VMware Certificate template for Machine SSL and Solution User certificates on this CA server.
Configure the web server (IIS) security setting to use basic authentication.
Ensure that the SDDC Manager service account has the least privileges.
From a high level, the process of preparing the certificate service template is as follows:
Create and configure a Microsoft Active Directory CA with the Certificate Authority Web Enrollment role.
Configure a VMware Certificate template for Machine SSL and Solution User certificates.
Configure the certificate service template and all sites, including the default website, for basic authentication.
For the steps to create the certificate service template, see "Creating a Microsoft Certificate Authority Template for SSL certificate creation in vSphere 6.x/7.x" at https://knowledge.broadcom.com/external/article?legacyId=2112009.
Workflow: Installing Certificates Using an Integrated CA
To install certificates in SDDC Manager using a supported integrated CA:
Select the resource type whose certificates you want to replace.
Click GENERATE SIGNED CERTIFICATES.
After the certificates generate, click INSTALL CERTIFICATES.
Workflow: Installing Certificates Using an External (Third-Party) CA
The workflow to install certificates is slightly longer if you use an external CA:
Click GENERATE CSRS.
Click DOWNLOAD CSR.
- Upload the CSR to the desired CA.
- The CA signs the CSR.
Download the signed certificates from the CA.
Click UPLOAD AND INSTALL CERTIFICATES to supply the signed certificate files to SDDC Manager.
vSphere Lifecycle Manager helps to automate the following tasks:
Managing VMware Tools and VM hardware upgrades
Upgrading and patching ESXi hosts
Installing and updating third-party software on ESXi hosts
Installing and updating ESXi drivers and firmware
Standardizing ESXi images across hosts in a cluster
Overview of LCM
A vSphere Lifecycle Manager image consists of several elements:
ESXi base image: Update that provides software fixes and enhancements
Components: Logical grouping of one or more VIBs that encapsulates functionality in ESXi
Vendor add-ons: Sets of components that OEMs create and distribute
Firmware and driver add-ons: Firmware and driver bundles that you can define for your cluster image
These add-ons require the Hardware Support Manager plug-in for the desired server family.
The ESXi base image is a complete ESXi installation package that is enough to start an ESXi host. Only VMware by Broadcom creates and releases ESXi base images. The ESXi base image is a grouping of components. You must select at least the base image or vSphere version when creating an image.
The component is the smallest unit that is used by vSphere Lifecycle Manager to install VMware and third-party software on ESXi hosts. Components are the basic packaging for vSphere Installation Bundles (VIBs) and metadata. The metadata provides the name and version of the component. On installation, a component provides you with a visible feature. For example, a third-party vendor's network driver is provided as a component. Components are optional elements to add to a cluster image.
Vendor add-ons are custom OEM images. Each add-on is a collection of components customized for a family of servers. OEMs can add, update, or remove components from a base image to create an add-on. Selecting an add-on is optional.
The firmware and driver add-ons are provided by the vendor. They contain the components that encapsulate firmware and driver update packages for a specific server type. To add a firmware or driver add-on to your image, you must first install the Hardware Support Manager plug-in for the respective family of servers.
VMware Cloud Foundation LCM automatically invokes the NSX upgrade coordinator to upgrade the NSX Edge nodes and local NSX Manager instances in your VMware Cloud Foundation domains. The NSX upgrade coordinator ensures that the NSX components are upgraded in the correct order:
Upgrade the NSX Edge clusters.
Upgrade the NSX components for the ESXi hosts.
Upgrade NSX Manager.
The current version of SDDC Manager is not capable of life cycle operations for NSX Global Manager. To upgrade NSX Global Manager in an NSX Federation environment, you must manually use the NSX upgrade coordinator from NSX Global Manager.
You can deploy VMware Aria Suite Lifecycle from SDDC Manager, but after deployment, VMware Aria Suite Lifecycle performs its own life cycle management independent from SDDC Manager. You use VMware Aria Suite Lifecycle to deploy and automatically update the following products:
Workspace ONE Access
VMware Aria Operations
VMware Aria Operations for Logs
VMware Aria Automation
With LCM, you can update ESXi hosts and NSX.
Installation and Upgrade Bundles
Users can download bundles directly from the SDDC Manager UI when SDDC Manager has Internet access. VMware Cloud Foundation LCM requires your Broadcom support credentials to connect to the online depot to download software bundles. SDDC Manager polls depot.broadcom.com on port 443 every 5 minutes by default. If new bundles are available, VMware Cloud Foundation LCM generates a notification in SDDC Manager. You can select one of the following options to download: Schedule Download or Download Now.
Bundles can be delivered as online bundles, through a proxy server, or from offline depots.
Configuring Offline Depots
To configure an offline depot:
Configure a Windows or Linux VM as a web server.
Download and install the latest version of the Offline Bundle Transfer Utility to the web server.
Configure the offline depot directory structure and download the required bundles.
Connect SDDC Manager to the offline depot.
Before you can configure an offline depot, you must deploy a dedicated Windows or Linux VM to host the VMware Cloud Foundation offline bundles. For more information about how to set up an offline depot, see "VCF Offline Depot deployment" at https://knowledge.broadcom.com/external/article?legacyId=95552.
vSphere LCM Workflow for Using vSphere Lifecycle Manager Images in VMware Cloud Foundation
To create a vSphere Lifecycle Manager image and apply it to a VMware Cloud Foundation cluster:
Use the vSphere Client to create a vSphere Lifecycle Manager image.
Make the image available in VMware Cloud Foundation with one of the following methods:
- Export the vSphere Lifecycle Manager image from vSphere and import it into VMware Cloud Foundation.
- Extract a vSphere Lifecycle Manager image from an existing workload domain cluster in VMware Cloud Foundation.
After the image is available in VMware Cloud Foundation, you can reuse it for clusters across workload domains.
Apply the image to the default cluster when creating a new workload domain or when adding a cluster to an existing workload domain.
Upgrade all hosts in a cluster using the vSphere Lifecycle Manager image.
Step 1: Creating vSphere Lifecycle Manager Images
You must create cluster images using vSphere 7.0 or later. You can create an image on the vCenter instance for the management domain or a VI workload domain, or for an external vCenter instance.
To create images using the vSphere Client:
Create an empty cluster; you do not need any hosts.
Select the Manage all hosts in the cluster with a single image check box and then select Compose a new image.
Select the ESXi build in the VMware Cloud Foundation BOM.
Select any additional vendor add-ons.
Step 2: Exporting vSphere Lifecycle Manager Images in the vSphere Client
To export a vSphere Lifecycle Manager image:
In the Hosts and Clusters inventory list, select the cluster where you created the image.
Click the Updates tab.
Select Hosts > Image from the navigation pane on the Updates tab.
Click the ellipsis icon in the upper-right corner of the Image page.
Select Export. The image can be exported in JSON, ISO, and ZIP formats.
Step 2: Exporting Cluster Settings from the vSphere Client
You export the cluster settings JSON file from the vSphere Client by navigating to the Configure tab on the vSphere cluster and selecting Desired State > Configuration from the navigation pane.
Step 2: Importing vSphere Lifecycle Manager Images into VMware Cloud Foundation
To import a vSphere Lifecycle Manager image into VMware Cloud Foundation:
In the navigation pane on the left, select Lifecycle Management > Image Management.
In the right pane, click the Import Image tab.
Under Import a Cluster Image, upload all four of the required files.
Step 2B: Extracting vSphere Lifecycle Manager Images from VMware Cloud Foundation
You can extract an image that is already applied to a cluster, and you can use it to create a new workload domain or a new cluster in an existing workload domain.
To extract an image:
In the navigation pane on the left, select Lifecycle Management > Image Management.
In the right pane, click the Import Image tab.
Configure the options under Extract a Cluster Image.
About Custom ESXi ISO Images
Certain vendors might provide custom ESXi images for their hardware. These custom images include VIBs that might not exist in a standard version of ESXi:
The functionality of the server can be severely affected if these VIBs are not included.
Supportability by the vendor might be questioned if their ISO images are not used.
To create a custom ESXi ISO image with vSphere Lifecycle Manager:
Create a new temporary cluster and select the Manage all hosts in the cluster with a single image check box.
Select the ESXi version and any vendor add-ons needed.
Export the vSphere Lifecycle Manager image as an ISO file.
Delete the temporary cluster.
vSphere Lifecycle Manager images simplify the firmware update operation. To apply firmware updates to ESXi hosts in a cluster, you must deploy and configure a hardware support manager:
A hardware support manager is a vendor-provided software module.
You register the hardware support manager as a vCenter extension.
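As a quick way to confirm which base image and VIBs a host actually ended up with after remediation (for example, to verify that vendor driver components from a custom image are present), you can query the host directly from an SSH session. This is an illustrative check only and not part of the SDDC Manager workflow; SDDC Manager and vSphere Lifecycle Manager remain the authoritative view of image compliance.

# Run on an ESXi host (SSH session) after remediation to verify the applied image contents.
esxcli software profile get     # shows the image profile currently applied to the host
esxcli software vib list        # lists installed VIBs; vendor drivers from add-ons appear here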
Upgrading Workload Domains
You upgrade VMware Cloud Foundation management domain components in the following order:
SDDC Manager and VMware Cloud Foundation services
VMware Aria Suite Lifecycle, VMware Aria Suite products, and Workspace ONE Access
NSX
vCenter
ESXi hosts
You must first upgrade the management workload domain before upgrading VI workload domains.
From VMware Cloud Foundation 5.2, you can upgrade SDDC Manager without updating the other infrastructure components in the management domain. Before an administrator can independently update SDDC Manager to a newer version, all components in the management domain and all VI workload domains in the VMware Cloud Foundation instance must be upgraded to at least VMware Cloud Foundation 5.0.
VMware Cloud Foundation 5.x includes mixed-mode support for VI workload domains. With mixed-mode support, administrators can update the management domain and one or more VI workload domains to the latest VMware Cloud Foundation version without needing to upgrade all VI workload domains.
Before upgrading your VMware Cloud Foundation environment, you must meet the following prerequisites:
Verify VMware Aria Suite compatibility.
Verify that no passwords are expired or expiring.
Verify that no certificates are expired or expiring.
Back up SDDC Manager, all vCenter instances, and NSX Manager appliances.
Ensure that SDDC Manager has no running or failed workflows.
Ensure that none of the VMware Cloud Foundation resources are in Activating or Error states.
Ensure that upgrade bundles for all VMware Cloud Foundation components are downloaded (an API sketch for checking workflows and bundles follows at the end of this section).
From VMware Cloud Foundation 5.2, you can apply patches directly from the SDDC Manager UI without using the Async Patch Tool:
The SDDC Manager graphical workflow allows you to patch multiple components at the same time.
Being able to choose specific versions and patches during upgrades offers the following benefits:
Eliminates the need to apply patches after the upgrade
Allows each domain to have a different combination of component versions based on workload requirements
VMware Aria Suite requires AVNs. An NSX Edge cluster is required to configure AVNs in the management domain.
AVNs allow the cloud administrator to optimally configure VMware Aria Suite management applications for SDN through NSX. AVNs configure local-region and cross-region SDN segments, providing flexibility, mobility, and security to VMware Aria Suite management applications. VMware Aria Suite components can be moved between regions to maintain operations during planned migration, maintenance, or in the case of a disaster recovery (DR) event.
Before you can deploy VMware Aria Suite components, you must also deploy AVNs in the management domain. SDDC Manager is used to automate the deployment of AVNs, which are configured through NSX. After they are deployed, AVN configuration is visible in SDDC Manager, but it cannot be changed (read only).
Before configuring AVNs, an NSX Edge cluster is required. The workflow of NSX Edge cluster and AVN deployment is as follows:
The NSX Edge cluster is deployed from SDDC Manager.
Configure the NSX Edge cluster for AVNs.
Configure two-tier routing for VMware Aria Suite local-region and cross-region segments.
Configure NSX load balancers for VMware Aria components.
AVNs are deployed from SDDC Manager to the management domain. AVNs are used to configure local-region and cross-region SDN segments for VMware Aria Suite management applications.
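To double-check the workflow and bundle prerequisites listed above before starting an upgrade, you can query the SDDC Manager API. This is a minimal sketch, assuming a valid bearer token and the public /v1/tasks and /v1/bundles endpoints named in the SDDC Manager API reference; confirm the response format for your VCF version in the API Explorer.

# Pre-upgrade sanity checks against the SDDC Manager API (illustrative only).
# Assumes $TOKEN holds a valid bearer token; the FQDN is a placeholder.
SDDC_MANAGER="sddc-manager.example.local"

# Look for running or failed workflows that would block the upgrade
# (inspect the returned task list for anything not completed).
curl -k -s -H "Authorization: Bearer $TOKEN" "https://$SDDC_MANAGER/v1/tasks"

# Confirm that the required upgrade bundles are present and downloaded.
curl -k -s -H "Authorization: Bearer $TOKEN" "https://$SDDC_MANAGER/v1/bundles"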
After they are deployed, AVN configuration is visible in SDDC Manager. When deploying AVNs to implement VMware Aria Suite components, you can choose to use either overlay-backed or VLAN-backed network segments.
A cross-region segment is a network segment deployed across a multi-region architecture to support VM failover.
Overlay-backed network segments are the preferred choice, allowing you to make full use of automation and the powerful features of NSX SDN. For the NSX Edge cluster to be deployed, an upstream router must be configured as a Border Gateway Protocol (BGP) peer to support dynamic routing of the overlay networks.
VLAN-backed network segments are also supported for customers that do not want to deploy VMware Aria Suite products on an overlay network with BGP dynamic routing. This configuration supports static routing and VLAN-backed networks. An NSX Edge cluster is still required to support load-balancing services for the VMware Aria Suite products and Workspace ONE Access.
Overlay-backed versus VLAN-backed segments:
Overlay-backed NSX segments:
- Overlay-backed segments provide flexibility for workload placement by removing the dependence on traditional data center networks. Using overlay-backed segments improves the security and mobility of management applications and reduces the integration effort with existing networks.
- Overlay-backed segments are created in an overlay transport zone.
- Two VMs on different hosts that are attached to the same overlay segment have their layer-2 traffic carried by a tunnel between the hosts. NSX instantiates and maintains this IP tunnel without the need for any segment-specific configuration in the physical infrastructure. As a result, the virtual network infrastructure is decoupled from the physical network infrastructure. That is, you can create segments dynamically without any configuration of the physical network infrastructure.
VLAN-backed NSX segments:
- VLAN-backed segments use the physical data center networks to isolate management applications while still taking advantage of NSX to manage these networks.
- VLAN-backed network segments ensure the security of management applications without requiring support for overlay networking.
- VLAN-backed segments are created in a VLAN transport zone.
An NSX Edge cluster with AVN configured for VMware Aria Suite SDN can use either a region-specific or cross-region network:
Region-specific network:
- VMware Aria Operations for Logs
- VMware Workspace ONE Access
- VMware Aria Operations Cloud Proxies
Cross-region network:
- VMware Aria Operations
- VMware Aria Automation
- VMware Workspace ONE Access
- VMware Aria Suite Lifecycle
VMware Aria Suite Lifecycle and Workspace ONE Access Deployment on VCF
The end-to-end workflow for deploying VMware Aria Suite on VMware Cloud Foundation is as follows:
Planning and preparation are best performed by using the Planning and Preparation Workbook and by consulting the VMware Cloud Foundation documentation.
Deploy an NSX Edge cluster, which is required on the default management vSphere cluster to deploy VMware Aria Suite products. SDDC Manager contains built-in automation to deploy an NSX Edge cluster. One or more NSX Edge nodes are deployed with two-tier routing and network services: a T1 router to manage east-west traffic between segments and a T0 router for north-south traffic.
After the NSX Edge cluster is deployed, the cloud administrator can then configure the AVN.
AVNs are SDNs that allow VMware Aria Suite components to be deployed locally or cross-region.
After the NSX Edge cluster and AVN are configured, SDDC Manager is used to download the VMware Aria Suite Lifecycle installation bundle from the VMware depot. After it is downloaded, the VMware Aria Suite Lifecycle appliance can be deployed from SDDC Manager.
The operations team can then run through a series of documented steps following the VMware Validated Solutions and VMware Cloud Foundation documentation to configure certificates, SSO access, global permissions, and Workspace ONE Access.
The VMware Aria Suite products are then ready to be deployed. All VMware Aria Suite components are deployed and run in the management domain default cluster. VMware Aria Operations, VMware Aria Operations for Logs, and VMware Aria Automation can then be configured by customers to best suit their application requirements. Individual VI workload domains can then be connected to them.
You can create overlay-backed NSX segments or VLAN-backed NSX segments. Both options create two NSX segments (Region-A and X-Region) on the NSX Edge cluster deployed in the default management vSphere cluster. Those NSX segments are used when you deploy the VMware Aria Suite products. Region-A segments are local instance NSX segments, and X-Region segments are cross-instance NSX segments.
To establish a trusted connection to VMware Aria Suite Lifecycle, you use the SDDC Manager UI to replace the SSL certificate on the appliance.
In the navigation pane, click Inventory > Workload Domains.
On the Workload Domain page, click the management domain name in the workload domains table.
On the domain summary page, click the Certificates tab.
Select the check box for the VMware Aria Suite Lifecycle resource type and then click GENERATE CSRS.
On the Details page, configure the required settings and click NEXT.
On the Subject Alternative Name page, leave the default SAN and click NEXT.
On the Summary page, click GENERATE CSRS.
After the operation completes successfully, select the VMware Aria Suite Lifecycle resource type again and click GENERATE SIGNED CERTIFICATES.
In the Generate Certificates dialog box, select Microsoft or OpenSSL from the Select Certificate Authority drop-down menu, depending on which certificate authority is configured for your environment.
Click GENERATE CERTIFICATES.
After the operation completes successfully, select the VMware Aria Suite Lifecycle resource type again and click INSTALL CERTIFICATES. Wait until the certificates are installed successfully.
Before you can create a global environment for product deployments, you must add a cross-instance data center and the associated management domain vCenter system to VMware Aria Suite Lifecycle.
To configure the global data center in VMware Aria Suite Lifecycle:
In a web browser, go to https:// to access the VMware Aria Suite Lifecycle UI.
Log in with the VMware Cloud Foundation vcfadmin@local user.
On the My Services page, click Lifecycle Operations.
In the navigation pane on the left, select Datacenters.
Click ADD DATACENTER, configure the values for the global data center, and click SAVE.
Configuring the Management Domain vCenter System
You add the management domain vCenter system to the global data center that you configured in VMware Aria Suite Lifecycle.
To add the management domain vCenter system to the global data center:
On the Datacenters page, expand the global data center and click ADD VCENTER.
Enter the management domain vCenter information and click VALIDATE.
After the successful vCenter validation, click SAVE.
In the navigation pane on the left, select Requests and verify that the state of the vCenter data collection request is Completed.
To add VMware Aria Suite products, you create an environment for them by using CREATE ENVIRONMENT in VMware Aria Suite Lifecycle.
Deploying Workspace ONE Access
Workspace ONE Access provides secure authentication to VMware Aria Suite products and allows for role-based access control (RBAC) to VMware Aria Suite products. VMware Aria Suite Lifecycle deploys a single-node or three-node Workspace ONE Access cluster. It then automatically creates the load balancer in NSX.
Options for Upgrading VMware Aria Suite Lifecycle
Download the VMware Aria Suite Lifecycle PSPAK and the VMware Aria Suite Lifecycle build. Applying the PSPAK updates the interoperability data for the new VMware Aria Suite versions and is needed to allow the upgrade. After the VMware Aria Suite Lifecycle PSPAK is applied, VMware Aria Suite Lifecycle can then be upgraded.
After the latest version of VMware Aria Suite Lifecycle is running, VMware Aria Suite Lifecycle checks all installed VMware Aria Suite components. It then provides the operator with a selection of compatible upgrade options that have been validated for the VMware Cloud Foundation version being used. The VMware Cloud Foundation operator can then download and install the latest VMware Aria products in the VMware Cloud Foundation environment.
VMware Validated Solutions for VMware Aria Suite Products
The available VMware Validated Solutions are as follows:
Identity and Access Management
Developer Ready Infrastructure
Health Reporting and Monitoring
Intelligent Logging and Analytics
Intelligent Operations Management
Intelligent Network Visibility
Private Cloud Automation
Site Protection and Disaster Recovery
Advanced Load Balancing
Cloud-Based Ransomware Recovery
Cross-Cloud Mobility
Private AI Ready Infrastructure
Deploying Aria Solutions with Validated Solutions
The recommended VMware Aria Operations deployment on VMware Cloud Foundation is as follows. Following the VMware Validated Solutions guidance, you deploy VMware Aria Operations in the cross-region network segment and VMware Aria Operations Cloud Proxies in the region-specific network segment.
The X-Region VMware Aria Operations instance is deployed from VMware Aria Suite Lifecycle. VMware Aria Suite Lifecycle deploys a three-node cluster of VMware Aria Operations. It can deploy two collectors in the region-specific site. The VMware Aria Operations cluster is secured with RBAC and signed SSL certificates. During installation, VMware Aria Suite Lifecycle also configures monitoring for vSphere, vSAN, and NSX.
VMware Aria Operations for Logs and VMware Aria Automation are deployed following the same Validated Solutions guidance.
VMware Private AI Foundation with NVIDIA
Generative AI and Large Language Models
Generative AI refers to deep-learning models that can generate high-quality text, images, and other content based on the data they were trained on:
Generative AI, in the form of large language models (LLMs), offers human-like creativity, reasoning, and language understanding.
LLMs have revolutionized natural language processing tasks, enabling machines to understand, generate, and interact with human language in a human-like manner.
LLMs such as GPT-4, MPT, Vicuna, and Falcon have gained popularity because of their ability to process vast amounts of text data and produce coherent and contextually relevant responses.
LLMs rely on several operational components and processes to achieve their capabilities:
Deep-learning (transformer) neural nets: LLMs are built upon complex neural networks based on the transformer architecture. These models consist of multiple layers of self-attention mechanisms and feed-forward neural networks with billions of neurons and parameters that must be trained over terabytes of data.
Hardware accelerators: LLMs are very demanding computationally and require specialized hardware to achieve optimal performance. LLM training and inference processes often rely on high-performance GPUs, RDMA networking, and high-speed storage to handle immense computational loads.
Machine learning software stack: Multiple open-source software choices provide tools to work with LLMs and Generative AI, for example:
- Hugging Face Transformers: Hugging Face (HF) is a popular platform where the machine learning community collaborates on models, datasets, and applications. HF creators are the authors of one of the most adopted open-source PyTorch implementations of the NLP transformer architecture.
- Ray Serve: This parallel computing platform lets you serve machine learning models (in real time or batch) using a simple Python API.
- Kubeflow: Kubeflow is an end-to-end machine learning platform for Kubernetes. It provides components for each stage in the machine learning life cycle, from exploration to training to deployment. The Kubeflow on vSphere project provides code and documentation to enable Kubeflow to run better on vSphere and VMware Cloud.
Pre-training tasks:
- The first stage of LLM development involves pre-training on massive amounts of text data from the internet. During this phase, the model learns to predict the next word in a sentence given the context of the preceding words. This process helps the model build a foundation of language understanding and grammar.
- The HF models repository provides access to over 285,000 language and computer vision machine learning models that can be used for many types of tasks.
- Pre-training LLMs is a difficult and expensive task. Because of the high cost and complexity of pre-training tasks, it is more convenient to use an open-source pre-trained LLM suitable for the use cases that you have in mind.
Fine-tuning tasks:
- After pre-training, the model can be fine-tuned on specialized datasets for specific tasks. This process adapts the general language model to perform more specialized tasks such as text generation, translation, sentiment analysis, or question-answering.
- Fine-tuning is crucial to tailoring the model's capabilities to the desired application.
Inference (prompt completion) tasks: After the LLM is pre-trained and fine-tuned, it enters the inference stage, where it processes users' prompts and generates completions in real time. The model uses the previously learned information to make predictions and generate coherent and contextually relevant text.
Graphics processing units (GPUs) are preferred over CPUs to accelerate computational workloads in modern high-performance computing (HPC) and machine learning or deep learning landscapes.
Latency versus throughput:
CPUs are optimized to reduce latency for processing tasks in a serialized way.
GPUs focus on high throughput volumes.
A GPU has significantly more cores than a CPU. These additional cores can be used for processing tasks in parallel. The GPU architecture is tolerant of memory latency because it is designed for higher throughput. A GPU works with fewer, relatively small memory cache layers because it has more components dedicated to computation.
Machine learning models, especially deep neural networks, involve a large number of matrix multiplications that can be done in parallel, and GPUs can compute these operations much faster than CPUs.
A host VIB is installed from the NVIDIA AI Enterprise Suite, allowing GPUs to be shared or partitioned into several GPU instances for running multiple VM and Tanzu Kubernetes Grid workloads. VM or Tanzu Kubernetes Grid workloads can then be configured to consume GPU resources from a selection of preconfigured profiles. NVIDIA guest OS drivers are then installed, which allow the workload to integrate with the many AI/ML functions contained in the NVIDIA AI Enterprise Suite.
Considerations for using NVIDIA GPUs on vSphere:
NVIDIA GPUs for machine learning and deep learning have near-native performance when running on vSphere.
Host servers must be compatible with GPU devices as published by the server OEM and GPU vendor.
The use of NVIDIA drivers to support vGPU is a licensed component of the NVIDIA AI Enterprise Suite.
By using the NVIDIA vGPU technology with vSphere, you can either dedicate one full GPU device to one VM, or you can share a GPU device across multiple VMs by using vGPU profiles.
The NVIDIA vGPU software includes two separate components:
NVIDIA management server (the NVIDIA vGPU manager), which is loaded as a VMware Installation Bundle (VIB) into the vSphere hypervisor
Separate guest OS NVIDIA driver, which is installed in the guest OS of your VM (the guest VM driver)
GPU Configuration Modes
Dynamic DirectPath I/O (pass-through mode):
- The entire GPU device is allocated to a specific VM-based workload.
NVIDIA vGPU (shared GPU):
- NVIDIA vGPU enables multiple running VM workloads on a host to have direct access to parts of the physical GPU at the same time.
Shared GPU Modes
NVIDIA GPUs can be shared with traditional VMs or Tanzu worker node VMs in Time-Slicing mode or Multi-Instance GPU (