OKE Node Manager

- Objectives
- Requirements
- Vocabulary
- Background
- Features
  - Node Monitoring & Operations
  - Tunables
  - Network Tests
  - VNIC Attachments
  - Interfaces
- Design
  - Controller manager
  - Alternatives
  - Node Agent
  - Option: VCN-Native DaemonSet (target)
  - Option: new DaemonSet (prototype)

Objectives

Enhance OCI Compute and VCN integration for OKE worker nodes with Kubernetes-native management and monitoring:
- Lightweight, active monitoring for out-of-bounds system configuration with options to report/mitigate (auto-repair)
- Capture data required to troubleshoot customer issues
- On-demand network test execution for faster identification of problems
- Active management of VNIC attachments and other supported network configurations

Not a replacement for kube-state-metrics or node-problem-detector. This will be complementary to them.

Requirements

- Scale to 10,000 nodes
- Minimize the amount of infrastructure we deploy to the customer's data plane
- Minimize storage space + reads + writes to etcd (and therefore kube-apiserver as well)
- We do not need real-time data. It's sufficient if the data is 5 minutes old or less
- We do not need high availability. It is okay to have some downtime as long as it's not for prolonged periods of time

Vocabulary

Cloud Controller Manager (CCM): Kubernetes master component responsible for cloud provider-specific work.
Container Networking Interface (CNI): Kubernetes interface for setting up pod networking.
Custom Resource (CR): An instantiation of a Custom Resource Definition.
Custom Resource Definition (CRD): A user-defined Kubernetes object definition to be reconciled by an operator. Custom Resources follow all the same conventions as first-class citizens of the Kubernetes CP API (like pods). This means they follow RBAC rules, and you can make GET, PATCH, DELETE, and other calls to them.
Kubernetes Control Plane: Consists of Kube apiserver, Kube controller manager, Kube scheduler, Cloud Controller Manager, and Proxymux. These are components used to manage the user's cluster.
The control plane does not host user workloads.
Kubernetes Manager Instance (KMI): Servers in the OKE Service tenancy that host the Kubernetes Control Plane.
TODO: Informer/Watch/Controller
Instance Metadata (IMDS): An endpoint that provides metadata for the instance, for example the IP address of the instance and which VNICs are attached.
OKE Management Plane: OKE service that manages KMIs.
Native Pod Networking (NPN) (aka VCN-Native CNI): OKE-offered CNI. Pods talk to each other via native VCN constructs (VNICs).
Pod: The smallest unit of application within a Kubernetes cluster. A pod is typically a single application container; however, a pod can contain multiple containers. Pods typically get their own networking setup inside of a node, unless the pod is designated as a HostNetwork pod.
Proxymux: A component in the KMI that is responsible for proxying traffic for Kubernetes endpoints that require long-lived connections (e.g. kubectl logs, exec, port-forward). This component also hosts many of OKE's custom solutions. In this case, we can add a webhook to Proxymux to intercept all pod creations and check or perform specific actions, like adding a pod annotation or checking labels.
Virtual Cloud Network (VCN): An overlay cloud network. A user will typically create their own VCN and run OKE inside of it.

Content on these Oracle Cloud Infrastructure pages is classified Confidential-Oracle Internal and is intended to support Oracle internal customers & partners only using Oracle Cloud Infrastructure.

Background

See also: Managed Observability and Operation Cruise

TODO: summary, example failure modes, monitored signals, remediation, other ops

Features

Node Monitoring & Operations: Collect health and performance indicators and perform node operations, incl.
OCA plugin state and GPU information.
Tunables: Key health and performance indicators are collected, with optional repair automation.
Network Tests: Run automated health checks on worker nodes.
VNIC Attachments: Configure secondary VNIC attachments on worker nodes.
Interfaces: Configure network interfaces on worker nodes.

Design

Installed through OKE add-ons and/or open source manifests/charts, a controller running in the customer data plane gives the customer useful control and visibility without limiting OKE's ability to provide a managed experience by default. It also lends customers some natural ownership over components that many of them already understand well. Kubernetes-standard logging and telemetry from the application are more easily available to users, who depend less on the service itself to provide (or not provide) them.

Controller manager

A set of controllers deployed as a single container will watch for create/update/delete events of OKE-managed Kubernetes resources. In practice, a Kubernetes Deployment with 2 replicas holding a Lease, together with a PodDisruptionBudget, helps guarantee maximum availability for control operations enacted by the user. When notified of an event, the associated controller will inspect the actual state of the system and compare it to the desired user state specified by the CR. Any deviations can be corrected and/or reported appropriately, with user control over enabled features through add-on configuration or a k8s ConfigMap.
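The compare-and-correct step described above can be sketched as follows. This is a minimal illustration only, not the design's implementation: the `DesiredState` type, the `reconcile` function, the `auto_repair` flag, and the example tunable names are all hypothetical, and actual state is modelled as a plain dictionary rather than anything read from a node or the Kubernetes API.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DesiredState:
    """Hypothetical stand-in for the desired state carried by a CR."""
    tunables: dict[str, str] = field(default_factory=dict)


@dataclass
class Deviation:
    key: str
    desired: str
    actual: Optional[str]  # None when the setting is missing on the node


def diff_state(desired: DesiredState, actual: dict[str, str]) -> list[Deviation]:
    """Return every tunable whose observed value differs from the desired one."""
    return [
        Deviation(key, want, actual.get(key))
        for key, want in sorted(desired.tunables.items())
        if actual.get(key) != want
    ]


def reconcile(desired: DesiredState, actual: dict[str, str], auto_repair: bool) -> dict:
    """Report all deviations; plan corrections only when auto-repair is enabled."""
    result: dict = {"report": [], "repair": []}
    for dev in diff_state(desired, actual):
        result["report"].append(f"{dev.key}: want {dev.desired}, got {dev.actual}")
        if auto_repair:
            # A real controller would apply the change; here we only plan it.
            result["repair"].append((dev.key, dev.desired))
    return result
```

Keeping the diff separate from the decision to repair mirrors the report-or-mitigate choice in the objectives: the same deviations are always reported, and the user-controlled flag only decides whether a correction is planned.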
New controllers will be implemented in the existing VCN-Native DaemonSet:
- Current NPN CNI plugin installation made optional, and disabled by default on Flannel
- Add configuration to add-on & controllers to existing controller manager

Alternatives

Controller manager

Background: Cloud Controller Manager (CCM)

A Kubernetes component managed by OKE within the service tenancy for each cluster. Supports, among other things, reconciliation of a custom resource for secondary VNIC attachments with VCN-Native Pod Networking.

The implementation for this function may fit best in one of three OKE components:

Option B: Cloud Controller Manager (CCM)
- CCM already performs this function in the case of VCN-Native Pod Networking.
- VNIC management is fully obfuscated and controlled by the service; possible API optimization, e.g. batching.
- Dependent on another mechanism (e.g. DaemonSet) for individual host configuration - more moving parts for function.
- Internal ownership spans across teams, release coordination with other cluster components.

Option C: Node Pool Workflows (NPWF)
- Additional VNICs are not supported; must first implement, e.g. through added Instance Configuration.
- Additional effort beyond attachment on new pools to support additions on existing, mitigate unexpected detachment/failure.
- Dependent on another mechanism (e.g. DaemonSet) for individual host configuration - more moving parts for function.

Node Agent

Lightweight DaemonSet for Node network configuration. When called by the controller manager component, the agent will return status and health information for the node. Management rules may be pushed down into the agent to influence automated observation and enact repairs per user configuration.
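One way to picture the agent's role is a small rule evaluator: rules are pushed in, signals are read locally, and a status summary is returned on request. The sketch below is an assumption-laden illustration; `Rule`, `NodeAgent`, `push_rules`, `status`, and the signal names are hypothetical, and signal values come from an injected callable instead of the real node.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional


@dataclass(frozen=True)
class Rule:
    """Hypothetical management rule: acceptable bounds for one signal."""
    signal: str
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    repair: bool = False  # whether the user allows automated repair


def in_bounds(rule: Rule, value: float) -> bool:
    """True when the value satisfies every bound the rule defines."""
    if rule.minimum is not None and value < rule.minimum:
        return False
    if rule.maximum is not None and value > rule.maximum:
        return False
    return True


class NodeAgent:
    def __init__(self, read_signal: Callable[[str], float]) -> None:
        self._read = read_signal  # injected so the sketch needs no real node
        self._rules: Dict[str, Rule] = {}

    def push_rules(self, rules: Iterable[Rule]) -> None:
        """Replace the active rule set with the one pushed by the controller."""
        self._rules = {rule.signal: rule for rule in rules}

    def status(self) -> dict:
        """Evaluate every rule and summarize node health for the caller."""
        violations, repairs = [], []
        for rule in self._rules.values():
            value = self._read(rule.signal)
            if not in_bounds(rule, value):
                violations.append({"signal": rule.signal, "value": value})
                if rule.repair:
                    repairs.append(rule.signal)
        return {"healthy": not violations, "violations": violations, "repairs": repairs}
```

For example, an agent given `Rule("mtu", minimum=9000, repair=True)` and a reader that reports an MTU of 1500 would return an unhealthy status listing `mtu` under both `violations` and `repairs`.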
References

Oracle Cloud Agent HPC plugin state files

Option: VCN-Native DaemonSet (target)

Background

The vcn-native DaemonSet is deployed by OKE on clusters with the Native Pod Networking CNI selected (vs. Flannel), starting a container on each worker node to perform the following tasks:
- Install CNI plugin binaries to the host filesystem (/opt/cni/bin)
- Install CNI network configuration to the host filesystem (/etc/cni/net.d)
- Sleep forever; re-reconcile the above on restart (e.g. new container image with updated CNI plugins)

Included in this codebase are the CNI plugins themselves, which use networking libraries to perform setup for NPN that is very similar to what we require here for VNIC attachments more generally: conditional VLAN creation for BM vs. VM, address/route assignment, and other link configuration.

Proposal

The term "VCN-Native" can more broadly encompass other plugins/modes of operation that integrate physical and virtual OCI network resources with Kubernetes, providing management for extended network configurations and integrating seamlessly with existing VCN-Native Pod Networking state.

The vcn-native DaemonSet will now be scheduled on all clusters, incl. those with Flannel as the base CNI, with conditional installation of the NPN CNI binaries and IPAM state. Node agent logic will be implemented in the existing application, already deployed as a cluster add-on.

Pros: efficient use of existing application with shared function, smaller footprints for service maintenance and user deployment
Cons: more impact to existing CNI codebase and application behavior

Option: new DaemonSet (prototype)

Background

A prototype exists to demonstrate aspects of node monitoring and control between a controller manager and node agent implementation.
This is new code in its own repository built with the standard operator SDK. If opting for a separate DaemonSet, this existing codebase could be considered for implementation/reference.

Proposal

The node agent logic will be implemented in a new DaemonSet, deployed as a cluster add-on.

Pros: fewer changes to existing components, lower design and implementation effort
Cons: higher overhead on worker nodes with multiple add-ons enabled, higher maintenance of added repos/release cycles