Full Transcript


PaaS - Kubernetes
Michael Gerndt, Technische Universität München

What is Container Orchestration?
- A single VM runs a Docker container. Container deployment at scale means 100s of VMs and many Docker containers — how to deploy and manage them?
- This management and deployment of containers is called container orchestration.

Container Orchestration Platform Services
- Container placement: selects a specific host for a specific container or a set of containers using different rules.
- Resource usage monitoring: resource usage like CPU and RAM is required at different levels — at the container level, at the logical group level, and at the cluster level.
- Health checks: used to check a container's liveness or readiness status.
- Container scaling: scaling of containers up or down based upon the requirements.
- Access to services: IP management and load balancing.
- Networking of containers: efficient implementation of microservice communication.
- Persistent storage management, ...

Container Orchestration Platforms
- Amazon Elastic Container Service (Amazon ECS), Azure Container Service, ...
- Recommended book: Kindle edition for 9,20 €, updated version 2024, with a continuously updated git repository of code examples. Many figures and examples are from this book.

What is Kubernetes?
- Development started by Google; donated as open source to the Cloud Native Computing Foundation in 2014.
- Current version 1.30.
- It is one of the most feature-rich and widely used container orchestration frameworks.
Its key features include:
- Automated deployment and replication of containers
- Online scale-in or scale-out of container clusters
- Load balancing over groups of containers
- Rolling upgrades of application containers
- Resilience, with automated rescheduling of failed containers
- Controlled exposure of network ports to systems outside of the cluster

Name
- Kubernetes comes from the Greek word (κυβερνήτης) for helmsman.
- Shortened to K8s (pronounced "Kates").
- "Operating system for the cloud": the de facto platform for deploying cloud-native applications. It abstracts the cloud's resources and schedules microservices like a traditional OS does for standard systems and processes.

Kubernetes from 40K feet
Kubernetes is two things:
- A cluster for running applications
- An orchestrator of cloud-native microservices apps

Cluster
- Consists of a bunch of nodes and a control plane.
- The control plane provides the API, a scheduler to assign work to nodes, and persistent storage for the cluster state.

Orchestrator
- Services to run and coordinate microservice applications.
- To run an application you follow this workflow:
  1. Write microservices.
  2. Package each service in a container.
  3. Wrap each container in its own Pod.
  4. Deploy Pods to the cluster via workload resources that manage a set of Pods: Deployments, DaemonSets, StatefulSets, Jobs, CronJobs, ...

Supports Multiple Container Runtimes
- Different nodes of a cluster can have a different container runtime.
- A container runtime provides the means to pull and start containers, manage stdio, ... — e.g., Docker, containerd, kata, ... — using a lower-level runtime like runc to start the container process.
- Container Runtime Interface (CRI): abstraction layer for 3rd-party container runtimes.
- Runtime Classes: configuration of the container runtime, e.g. with respect to performance or security. The configuration options depend on the CRI implementation. A Pod can then specify a runtime class to select certain option settings.
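As a sketch of how a Pod selects a runtime class: a RuntimeClass object names a handler, and a Pod opts in via runtimeClassName. The class name gvisor and handler runsc below are illustrative — the handler must match what the node's CRI implementation actually configures.

```yaml
# Hypothetical RuntimeClass; "runsc" must be a handler configured
# in the node's container runtime (e.g. in containerd's config).
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: gvisor
handler: runsc
---
# A Pod selecting that runtime class for its containers.
apiVersion: v1
kind: Pod
metadata:
  name: sandboxed-pod
spec:
  runtimeClassName: gvisor
  containers:
  - name: app
    image: nginx
```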
Declarative Management
- Applications are managed declaratively:
  - Describe how you want the application to run in YAML files.
  - POST the descriptions to Kubernetes.
  - Follow how Kubernetes manages the application to match the descriptions.
- Advantage: the implementation is entirely on Kubernetes. It oversees the application during runtime to always match the requirements.

Pods
- The term Pod comes from a pod (group) of whales.
- A Pod is a group of containers. Examples of multi-container Pods:
  - Service meshes, e.g., Istio
  - Web containers supported by a helper container that pulls the latest content
  - Containers with a tightly coupled log scraper
- Use multiple containers in a Pod only in case of tight coupling through memory or volumes.
- A Pod provides the environment for containers: IP address, shared memory, volumes, network stack, ... Containers inside a Pod can use ports on the Pod's localhost interface.

Kubernetes Overall Architecture
[Figure: master node (kube-apiserver, etcd, kube-controller-manager, scheduler) and worker nodes (Pods with containers, kubelet, kube-proxy); users interact via kubectl and a load balancer. For high availability, a multi-master configuration is used.]

Master Node Components
kube-apiserver
- Provides the REST interface for the Kubernetes control plane and datastore.
- Receives the configuration files (manifests) describing the desired state.
- Acts as the gatekeeper to the cluster by handling authentication and authorization.
- All clients and other applications interact with Kubernetes through the API server.

Cluster store
- etcd is currently used as Kubernetes' backing store; all cluster data is stored here.
- Key-value database with multiple replicas for high availability.
- Prefers consistency over availability (CAP theorem): if etcd becomes unavailable, applications can continue to run but cannot be updated.
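The web-plus-helper pattern from the Pods slide can be sketched as a single two-container Pod sharing a volume — the tight coupling that justifies co-location. The helper image name and the emptyDir volume are illustrative assumptions.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sync
spec:
  volumes:
  - name: content            # shared volume: the tight coupling between the containers
    emptyDir: {}
  containers:
  - name: web                # serves the content
    image: nginx
    volumeMounts:
    - name: content
      mountPath: /usr/share/nginx/html
  - name: content-sync       # hypothetical helper that pulls the latest content
    image: example.com/content-sync:latest
    volumeMounts:
    - name: content
      mountPath: /content
```

Both containers also share the Pod's network namespace, so they could alternatively talk over ports on the Pod's localhost interface.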
Controller manager
- Launches and controls independent control loops for, e.g., nodes, endpoints, replicasets.
- Each control loop: observe the current state, obtain the desired state, determine the differences, reconcile the differences.

kube-scheduler
- Selects a node for newly created Pods to run on, in two steps:
  1. Determine the nodes that are capable of running the Pod: check whether the node is tainted, excluded by anti-affinity rules, lacks a required port, has insufficient resources, ...
  2. Rank the capable nodes depending on the amount of free resources, how many Pods are running, ...
- The API server's RESTful endpoint is on port 443.

Node Components
Kubelet
- The node-level manager in Kubernetes.
- Responsible for managing the lifecycle of every Pod on the node.
- Receives new Pod assignments from the apiserver.

Container runtime
- Performs container-related tasks like pulling images and starting and stopping containers.
- Kubernetes moved from native Docker support to a plugin model called the Container Runtime Interface (CRI). A popular implementation is cri-containerd (CNCF).

Kube-proxy
- Manages local cluster networking: ensures that each node gets its own IP and handles routing and load balancing on the Pod network for services.

Pods
- A Pod provides the environment for containers. Pods create their own network namespace: a single IP address, a single range of TCP and UDP ports, and a routing table, shared among all containers of the Pod.
- External access to a container in the Pod: the Pod IP address combined with the port of the container.
- Container-to-container communication in a Pod: the localhost adapter and a port number.

Pod Network
- The Pod network enables Pod-to-Pod communication: Kubernetes sets up bridge networks and routing tables such that Pods can reach other Pods via their IP address without network address translation.
- Setting all components up is done by Pod network plugins like AWS VPC CNI, Calico, ...
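The scheduler's filtering and ranking react to fields in the Pod spec. A sketch (all values illustrative) combining the three filters named above — resource requests, taints/tolerations, and anti-affinity:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: sched-demo
  labels:
    app: sched-demo
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:            # step 1: nodes without this much free CPU/RAM are filtered out
        cpu: "500m"
        memory: 256Mi
  tolerations:             # permits scheduling onto nodes carrying this (hypothetical) taint
  - key: "dedicated"
    operator: "Equal"
    value: "batch"
    effect: "NoSchedule"
  affinity:
    podAntiAffinity:       # never co-locate two of these Pods on the same node
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: sched-demo
        topologyKey: kubernetes.io/hostname
```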
[Figure: network bridge connecting Pods. Source: https://medium.com/google-cloud/understanding-kubernetes-networking-pods-7117dd28727]

Pod Resource Limits
- cgroups are used to specify the resource limits for CPU, RAM, IOPS, ...
- The limits of a Pod are the aggregated limits of its containers plus possibly some Pod overhead.

Pods
- Pods are the unit of scaling: scaling of application components is done by adding or removing Pods.
- Pods are an atomic unit: starting and stopping a Pod starts and stops all of its containers. A Pod always runs on a single node.
- Pods are mortal: a replacement of a Pod will have a new ID and IP address and might run on any other node.
- Pods are managed by higher-level controllers:
  - Deployment: enables scaling, zero-downtime updates, and versioned rollbacks
  - DaemonSet: ensures a Pod is running on each node
  - StatefulSet: adds guarantees on ordering and uniqueness of Pods

Creation of a Pod

Pod Deployments
- Deployments provide self-healing, scalability, and rolling updates.
- They control a single Pod and its replicas through a ReplicaSet object.
- Defined declaratively and pushed to the API server.
- Self-healing: failed Pods will be replaced.
- Scaling: the number of replicas in the ReplicaSet can be adapted as required. The current state is automatically adapted to match the desired state (declarative model) through a background control loop.

Example Deployment manifest:

    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: hello-deploy
    spec:
      replicas: 10
      selector:
        matchLabels:
          app: hello-world
      minReadySeconds: 10
      strategy:
        type: RollingUpdate
        rollingUpdate:
          maxUnavailable: 1
          maxSurge: 1
      template:
        metadata:
          labels:
            app: hello-world
        spec:
          containers:
          - name: hello-pod
            image: k8sbook:latest
            ports:
            - containerPort: 8080

Pod Deployments: Rolling update
- POST a new version of the Deployment YAML file with a new version of a container image.
- Kubernetes creates a new ReplicaSet. When a new Pod is created in the new ReplicaSet, an old Pod in the previous ReplicaSet is deleted, providing a rolling update with zero downtime.
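The cgroup-backed limits mentioned above are declared per container; the Pod's effective limit is the sum over its containers (plus optional overhead). A minimal sketch with illustrative values:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: limited-pod
spec:
  containers:
  - name: app
    image: nginx
    resources:
      requests:            # used by the scheduler when placing the Pod
        cpu: "250m"
        memory: 128Mi
      limits:              # enforced by the kubelet via cgroups at runtime
        cpu: "500m"
        memory: 256Mi
```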
- Additional support exists, e.g. to wait a certain time period after a new Pod was started, ...
- Versioned rolling update: the old ReplicaSet still exists with the old configuration. Rolling back simply winds up the old ReplicaSet. Check the kubectl rollout command.

Services
- Services provide reliable networking for a set of Pods: a stable DNS name, IP address, and port.
- Service discovery is through the Kubernetes DNS service.
- Load balancing across a dynamic set of Pods.
- Pods are connected to Services loosely, via labels and a label selector.
- Each Service is associated with its own Endpoints object: a dynamic list of Pods that match the label selector. The Service balances requests over the Pods in the Endpoints object.
(Video: youtu.be/nKKTdo2Yo6Y)

Types of Services
- ClusterIP Service (default): IP address and port are only accessible inside of the cluster.
- NodePort Service: the Service has an additional port called the NodePort. It can be reached from outside by sending a request to the IP address of any cluster node on the NodePort. kube-proxy listens on that port and replaces the target by the cluster IP of the Service and its port; then it is a cluster-local request.
- LoadBalancer Service: integrates with load balancers from cloud providers. An extension of the NodePort Service that allows clients to reach the Pods via a cloud load balancer.
- ExternalName Service: allows routing traffic to systems outside of your Kubernetes cluster. The service is implemented outside of the cluster and is accessible through a domain name which is specified in the service YAML.

[Figure: NodePort and LoadBalancer Service types — users reach Pods (labels app=nginx) either via NodePort 30000 and kube-proxy on each host, or via a cloud load balancer in front of the NodePorts.]

Service Discovery
- Cluster DNS service: the Service is called kube-dns; its Pods are managed by a Deployment called coredns.
- Services register automatically with the cluster DNS.
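A NodePort Service selecting the hello-world Pods from the Deployment example could be sketched as follows; the Service name and the port values are illustrative (the nodePort must lie in the cluster's NodePort range, 30000-32767 by default):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: hello-svc
spec:
  type: NodePort
  selector:
    app: hello-world       # loose coupling: matches Pod labels, not specific Pods
  ports:
  - port: 8080             # cluster-internal Service port
    targetPort: 8080       # container port inside the selected Pods
    nodePort: 30001        # reachable on every node's IP from outside the cluster
```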
- The cluster DNS IP address is passed to containers through /etc/resolv.conf.
- Service discovery: a service request is handled on the node by rewriting the Service's IP address, obtained from the cluster DNS, into the IP address of a Pod in the Endpoints object of the Service. This rewriting is set up by the kube-proxy process.

Accessing Services
- Different implementations of the proxier: userspace, IPTABLES, IPVS.
- Video explaining the role and implementation of kube-proxy: https://youtu.be/nKKTdo2Yo6Y

Kubernetes Storage
- Persistent volume: any storage exposed on your Kubernetes cluster, e.g. an Azure File resource or AWS Elastic Block Store block devices.
- Container Storage Interface (CSI): storage providers write plugins against the CSI. It hides the internal Kubernetes storage details.
- Persistent volume subsystem:
  - The PersistentVolume (PV) object allows mapping external storage onto the cluster.
  - PersistentVolumeClaims (PVC) are consumed by Pods to get access to PVs.
  - A StorageClass (SC) automates the creation of PVs.

Persistent Volume
- Represents a storage device in Kubernetes. Properties: capacity, storage class, access mode.
- Access modes:
  - ReadWriteOnce (RWO): can be bound only by a single node (PVC) as read/write, but might be accessed by multiple Pods on the same node.
  - ReadWriteMany (RWM): can be bound by multiple nodes.
  - ReadOnlyMany (ROM): can be bound by multiple nodes as read-only.
  - ReadWriteOncePod (RWOP): can be bound to a single Pod only.
- Pods only act through the PVC object that is bound to a PV.
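For completeness, the PV side of the subsystem: a statically provisioned PersistentVolume that a PVC with matching capacity, access mode, and storage class could bind. The GCE persistent-disk backing and the disk name are illustrative assumptions (the in-tree gcePersistentDisk driver matches the gce-pd provisioner used later in the slides, though newer clusters would use a CSI driver instead):

```yaml
apiVersion: v1
kind: PersistentVolume
metadata:
  name: pv1
spec:
  capacity:
    storage: 25Gi
  accessModes:
  - ReadWriteOnce          # RWO: bindable as read/write by a single node
  storageClassName: slow
  gcePersistentDisk:       # illustrative, pre-provisioned backing disk
    pdName: uber-disk
```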
- The specification of a PVC object must match that of the bound PV.
- Containers can mount a PV through a PVC.

PV and PVC
Example Pod consuming a PVC:

    apiVersion: v1
    kind: Pod
    metadata:
      name: volpod
    spec:
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: pvc1
      containers:
      - name: ubuntu-ctr
        image: ubuntu:latest
        command:
        - /bin/bash
        - "-c"
        - "sleep 60m"
        volumeMounts:
        - mountPath: /data
          name: data

StorageClass
- Automates the creation of PVs.
- Specifies the provisioner (CSI plugin) that provides the volume, e.g. classes for disk- and SSD-based volumes.

Example StorageClass, PVC, and Pod:

    kind: StorageClass
    apiVersion: storage.k8s.io/v1
    metadata:
      name: slow
      annotations:
        storageclass.kubernetes.io/is-default-class: "true"
    provisioner: kubernetes.io/gce-pd
    parameters:
      type: pd-standard
    reclaimPolicy: Retain

    apiVersion: v1
    kind: PersistentVolumeClaim
    metadata:
      name: pv-ticket
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: slow
      resources:
        requests:
          storage: 25Gi

    apiVersion: v1
    kind: Pod
    metadata:
      name: class-pod
    spec:
      volumes:
      - name: data
        persistentVolumeClaim:
          claimName: pv-ticket
      containers:
      - name: ubuntu-ctr
        image: ubuntu:latest
        command:
        - /bin/bash
        - "-c"
        - "sleep 60m"
        volumeMounts:
        - mountPath: /data
          name: data

ConfigMap
- ConfigMap (CM) objects store configuration data outside of a Pod, which can be dynamically injected into a Pod at runtime.
- Key-value pairs; can be created declaratively.
- Injected via environment variables, arguments to container startup commands, or files in a volume.

Example Pod mounting a ConfigMap as a volume:

    apiVersion: v1
    kind: Pod
    metadata:
      name: cmvol
    spec:
      volumes:
      - name: volmap
        configMap:
          name: multimap
      containers:
      - name: ctr
        image: nginx
        volumeMounts:
        - name: volmap
          mountPath: /etc/name

Workload Resources
- A workload is an application -> Pod template. Workload resources control an application:
- Deployment
- DaemonSet: a Pod is running on each node of the cluster.
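The cmvol Pod above mounts a ConfigMap named multimap that the slides do not show. A matching object might look like this — the keys and values are purely illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: multimap
data:                       # plain key-value pairs (illustrative)
  hostname: db.example.internal
  log-level: info
# When mounted as a volume, each key becomes a file under the
# mountPath (/etc/name in the Pod above) containing its value.
```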
- CronJobs: ...
- StatefulSet: adds additional guarantees for the state of a Pod to the Deployment: predictable and persistent Pod names, predictable and persistent DNS hostnames, and predictable and persistent volume bindings.

StatefulSet
- Pod names: ordered creation and termination. Pods are created sequentially, waiting for the previous Pod to be running and ready. The same holds for scaling out and scaling in.
- Volumes: created and named when the Pod is created; reattached to replaced failed Pods or during scaling.
- Headless service: StatefulSets can be connected to a headless service (no clusterIP) to get predictable DNS hostnames for every Pod replica. It creates DNS records for all Pods matching the label selector of the service. Other Pods can find members of the StatefulSet and connect to a specific one.

Volume Claim Template
- Remember: a StorageClass is used in PersistentVolumeClaims to create volumes.
- A StatefulSet requires individual volumes for each Pod; their creation is automated through the volume claim template.

Example StatefulSet:

    apiVersion: apps/v1
    kind: StatefulSet
    metadata:
      name: tkb-sts
    spec:
      replicas: 3
      selector:
        matchLabels:
          app: web
      serviceName: "dullahan"
      template:
        metadata:
          labels:
            app: web
        spec:
          terminationGracePeriodSeconds: 10
          containers:
          - name: ctr-web
            image: nginx:latest
            ports:
            - containerPort: 80
              name: web
            volumeMounts:
            - name: webroot
              mountPath: /usr/share/nginx/html
      volumeClaimTemplates:
      - metadata:
          name: webroot
        spec:
          accessModes: [ "ReadWriteOnce" ]
          storageClassName: "flash"
          resources:
            requests:
              storage: 1Gi

Kubernetes Dashboard

Autoscaling
- Two autoscalers: the horizontal Pod autoscaler and the vertical Pod autoscaler.

Horizontal Pod Autoscaler
- A controller for ReplicaSets: it modifies the desired number of replicas within declared bounds.
- Supports metrics of Kubernetes objects, Pods, resources, and external metrics.
- Resource metrics are typically collected through the metrics-server and are accessible from the apiserver through the Metrics API.
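A horizontal Pod autoscaler targeting the hello-deploy Deployment from the earlier example could be declared like this; the replica bounds and the 60 % CPU target are illustrative:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: hello-hpa
spec:
  scaleTargetRef:          # the workload whose replica count is managed
    apiVersion: apps/v1
    kind: Deployment
    name: hello-deploy
  minReplicas: 2           # declared bounds for the control loop
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu            # collected via metrics-server / the Metrics API
      target:
        type: Utilization
        averageUtilization: 60
```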
- Calculates the desired replica count such that the threshold is enforced.

Vertical Pod Autoscaler
- Calculates the resource requests for Pods based on usage.
- The specification of the vertical Pod autoscaler includes:
  - The Deployment or StatefulSet.
  - Update policy: how the changes in resources are applied to the Pods.
    - Off: resources are not modified according to the computed recommendation.
    - Initial: assign the resources only at the start of the Pod; no dynamic change.
    - Recreate: deletion and recreation of a Pod.
    - Auto: can use any method.
  - Resource policy: computes the recommended resources for Pods.
    - Computes the resource requirements for each container in the Pod; different policies can be used for different containers.
    - Limited by minimum and maximum allowed resources.
- Recommendations are given as an interval of recommended resources, which can be useful for manual scaling as well.
- Recommendations are based on historic and current resource usage.

Cluster Autoscaler
- Adapts the number of nodes of the Kubernetes cluster; runs on the master node.
- Scaling is implemented through the cloud provider interface.
- Scaling policy: if Pods could not be scheduled due to a shortage of resources, the number of nodes is increased. It estimates the required number of nodes to fulfill the resource requests of the waiting Pods.
- The cluster autoscaler considers unneeded nodes for scale-down, i.e. nodes whose Pods can be scheduled somewhere else.

Kubernetes Monitoring
- Kubernetes started with its own monitoring tool called Heapster.
- It now offers the Metrics Server for basic resource metrics and integration with Prometheus.
- Nice installation tutorial: https://sysdig.com/blog/kubernetes-monitoring-prometheus/

Prometheus Architecture

Monitoring
- The Prometheus operator makes it easy to set Prometheus up on a Kubernetes cluster: start the operator and create one or more Prometheus servers.
- The operator takes care of version upgrades, persistent volume claims, connecting Prometheus to Alertmanager instances, etc.
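With the Prometheus operator installed, pointing Prometheus at a metrics-exposing Service is done through a ServiceMonitor resource. A sketch — the label selector and the named port are assumptions about the target Service:

```yaml
apiVersion: monitoring.coreos.com/v1   # CRD installed by the Prometheus operator
kind: ServiceMonitor
metadata:
  name: hello-monitor
spec:
  selector:
    matchLabels:
      app: hello-world     # Services carrying this label get scraped
  endpoints:
  - port: metrics          # named Service port exposing a /metrics endpoint
    interval: 30s
```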
- The Prometheus operator also configures Prometheus with the services exporting metrics. It is informed via ServiceMonitor resources.
- Service monitor: allows finding a service that exposes metrics. It points to the service through a selector, and Prometheus will automatically start scraping it.

What to monitor?
- Application (container) monitoring: custom metrics, provisioned through instrumentation; process metrics such as CPU usage, ...
- Infrastructure monitoring (information about the nodes): the Node Exporter pulls the Linux telemetry from the node.
- Kubernetes monitoring:
  - kube-state-metrics exporter: metrics about Kubernetes objects such as deployments and pods (CPU, memory, IOPS), ...
  - apiserver: provides workload information, e.g. crash loops etc.
  - kubelet: provides metrics on the containers and on the node.
- Inform Prometheus to scrape the metric sources via ServiceMonitors.

Other Kubernetes Monitoring Solutions
- Many Kubernetes monitoring solutions exist: Kubernetes' own monitoring based on the Metrics Server and the Kubernetes Dashboard, Dynatrace, Epsagon, Datadog, Weave Scope, ...

Cloud Provider Offerings
- Cloud providers offer managed Kubernetes clusters in addition to their earlier container services: AWS Elastic Kubernetes Service, Azure Kubernetes Service, Google Kubernetes Engine.
- Comparison at https://logz.io/blog/kubernetes-as-a-service-gke-aks-eks/
- Cloud aspects:
  - Google first; AWS and Azure in 2018.
  - Support for cluster autoscaling and node pools.
  - Integration with the provider's monitoring solution.
  - High availability by spreading master nodes across different availability zones.
  - AWS also provides bare-metal hardware.
  - GKE and AKS do not bill for an empty cluster (just a master node), while AWS EKS costs $0.10 per hour for a deployed cluster in addition to the resources for the workers.

Cloud Native Applications
Cloud native software architectures are a completely new paradigm that uses new methods to solve business problems that can typically only be achieved at the scale of cloud computing.
(Laszewski, Tom; Arora, Kamal; Farr, Erik; Zonooz, Piyum: Cloud Native Architectures: Design high-availability and cost-effective applications for the cloud. Packt Publishing, Kindle edition.)

Cloud Native Maturity Model
1. Cloud native services
   - Basic cloud services: e.g., storage, compute, ...
   - Managed services: e.g., databases, analytics, ...
   - Advanced cloud-managed services: e.g., container services, FaaS, ...
2. Application design
   - General guidelines for scalable, robust cloud applications (12factor.net).
   - Microservices: considered to be the most mature cloud application model.
   - Advanced design considerations: e.g., instrumentation, security, concurrency, resiliency.
3. Automation
   - Management, configuration, deployment: e.g., infrastructure as code, CI/CD.
   - Monitoring, compliance, optimization: e.g., monitoring for performance analysis and compliance auditing, autoscaling.
   - Predictive analytics, AI & ML: anomaly detection, event prediction and its impact.

The Twelve Factors of Cloud Applications
I. Codebase: one codebase tracked in revision control, multiple deployments.
II. Dependencies: explicitly declare and isolate dependencies.
III. Config: store config in the environment.
IV. Backing services: treat backing services as attached resources.
V. Build, release, run: strictly separate build and run stages.
VI. Processes: execute the app as one or more stateless processes.
VII. Port binding: export services via port binding.
VIII. Concurrency: scale out via the process model.
IX. Disposability: maximize robustness with fast startup and graceful shutdown.
X. Dev/prod parity: keep development, staging, and production as similar as possible.
XI. Logs: treat logs as event streams.
XII. Admin processes: run admin/management tasks as one-off processes in the production environment.

Netflix: Cloud Native Company
Netflix announced in 2010 that it would run its streaming services in the AWS cloud.
- They realized that they are an entertainment creation and distribution company, not a data center operations company.
- Migration timeline:
  - 2009: migrating the video master content system and logs into AWS S3
  - 2010: DRM, CDN routing, web signup, search, movie choosing, metadata, device management
  - 2011: customer service, international lookup, call logs, and customer service analytics
  - 2012: search pages, eCommerce, and Your Account
  - 2013: big data and analytics
  - 2016: billing and payment
- After seven years they completely shut down their own data centers.

Cloud Native Maturity
- Cloud native services: Netflix uses numerous AWS services, including infrastructure, security, application, analytics, dev tools, and artificial intelligence services. They also use other services like their own content delivery network, as well as open source tools like Cassandra for their NoSQL database or Kafka for their event streams.
- Application design: decoupled components in the form of microservices. They redesigned their Oracle database into a scalable NoSQL data structure for subscription processing and a regionally distributed MySQL database for user-transactional processing. Decisions were made with long-term impact in mind, to ensure a future-proofed architecture that could scale as they grew internationally.
- Automation: Simian Army (tool suite to check resiliency), Atlas (monitoring tool suite), Spinnaker (CI/CD platform).

Summary
- Kubernetes is the major platform for cloud-native applications.
- Based on host-level virtualization: distributing Pods with one or more containers across a cluster of servers enables resilience and elasticity.
- All cloud providers offer managed Kubernetes as a service, besides their own platforms for containerized applications.

Test of Screen Recording Infrastructure
- We will provide a previous exam and take you through the screen recording.
- We will discuss the answers to the questions in the exam.
- Participate if you want to learn about the exam!
- When: next Wednesday.
