Topic 3: Containers - Part 1
Dr. Daniel Yellin
Summary
This document presents an overview of Docker containers, focusing on the benefits of Docker including dependency management, portability, and scalability. The material includes various diagrams and figures.
Topic 3: Containers part 1
Dr. Daniel Yellin

THESE SLIDES ARE THE PROPERTY OF DANIEL YELLIN. THEY ARE ONLY FOR USE BY STUDENTS OF THE CLASS. THERE IS NO PERMISSION TO DISTRIBUTE OR POST THESE SLIDES TO OTHERS.

Docker overview
Images, Containers, Layers

What problems does Docker solve?
1. Packaging all the software dependencies together with the application to avoid conflicts
You want to run multiple applications, or multiple modules of a single application, on one computer. Each application uses different versions of the same package. How do you avoid conflicts when building and running the application?
Dependency management: package your application with all the required packages and dependencies into one container.

[Figure: multiple programs running on top of the OS. Diagram from: Docker and Kubernetes: The Complete Guide, Udemy course by Stephen Grider]

By default, multiple programs running on top of the OS use the same versions of shared programs. Containers help to avoid conflicts like this, for example between Python v2 and v3.

[Figure: programs that need Python v2 and Python v3 on the same OS. Same source.]

A container partitions system resources so that different containers are isolated from one another.

[Figure: containers partitioning system resources. Same source.]

What problems does Docker solve?
2. Portability
You often want to run your application on your desktop, on your company’s test environment, and on your company’s production environment in the Cloud. They run on different hardware and different operating systems. How can you port code from one environment to another without having to make a separate version for each one?
How can multiple apps built on different hardware with different operating systems run on the same server?
Heterogeneous applications, built on different hardware, on different operating system distributions (sharing the same kernel), and with different packages installed, can run on a single server.
[Figure: two apps, one built on RedHat and one built on Ubuntu, both run on Linux. App 1: Numpy v3.6, Django v5.0. App 2: Numpy v3.8, Django v5.0.]

What problems does Docker solve?
3. Independent lifecycle management of application modules
Whenever you fix a bug or add a feature to one module, you do not want to rebuild, test and deploy the entire application. You want to make the change to a small unit, then build, test and deploy just that unit.
You want an independent lifecycle (build/test/deploy) for each “unit” of your application that still integrates with the entire application.
This follows the single responsibility principle, which we will discuss more when we discuss microservices.

Independent lifecycle management
[Figure: a server running an application made of a Shopping Cart module (owned by the shopping cart team) and an Inventory module (owned by the inventory team).]

What problems does Docker solve?
4. Sharing and reusability
Reusing code from others can be complex: a lengthy install process and an error-ridden build process. Git helps but does not solve the problem, because it focuses on source code.
Containers allow others to take your code and update it without having to tamper with your source code.
5. Scalability
You want to deploy your SaaS application. You do not have a lot of users yet, but your user base will grow rapidly. You do not want to spend a ton of money on Cloud hardware or even VMs.
Containers allow you to easily increase the number of instances handling requests as traffic to your application grows.

Docker bridges development and deployment
A container is a standardized software unit that bundles the code and all its dependencies so that a program can be executed quickly, reliably and with high portability on a container engine using operating system kernel services.
Docker fits perfectly with modern development and Cloud:
Apps are built from lots of code written by many different teams.
The same apps and services run in many different environments.
Apps need to be able to scale to support both a small and a large number of users.
App components need to be integrated into automated software pipelines, making it much easier to configure and deploy updates.

THE ULTIMATE DOCKER CHEAT SHEET provides a good summary of many of the concepts we will cover in the rest of this section.

Images
Three fundamental concepts in Docker are images, layers and containers.
An image is a collection of files, metadata, and a command. It contains executable application source code as well as all the tools, libraries, and dependencies that the application code needs to run as a container.

Images and containers
A container is created from an image. It is an isolated running process (or processes) started using the command, with access only to the files in the image. The metadata defines properties of the process, such as the network ports accessible to the process.
You can create, start, stop, move, or delete a container using the Docker API or CLI. You can connect a container to one or more networks, attach storage to it, or even create a new image based on its current state.
[Figure: an image (read-only file) next to a container (live, executable content).]

Dockerfiles
A Dockerfile automates the process of Docker image creation. It is a list of command-line interface (CLI) instructions that Docker Engine will run in order to assemble the image.
Often, an image is based upon another image. To build your own image, you create a Dockerfile with a simple syntax for defining the steps needed to create your desired image “on top of” the other image.
[Figure labels: build, run.]

Docker by example
Build, share, run
You can download docker from here: https://www.docker.com
4 docker commands
1. docker build --tag
2. docker run --name
3. docker images
4. docker ps
More Docker commands in the appendix.

A Dockerfile to containerize our toys.py app
    FROM python:alpine3.17
    WORKDIR ./app
    COPY toys.py .
    RUN pip install Flask
    ENV FLASK_APP=toys.py
    ENV FLASK_RUN_PORT=8001
    ENV NINJA_API_KEY...
    EXPOSE 8001
    CMD ["flask", "run", "--host=0.0.0.0"]
You start with a base image. In this case, a “slim” python image (alpine).
Each command is applied on top of the result of the previous one; commands that change the filesystem add a new layer on top of the previous layer.
When you finish building the image using this Dockerfile, you have a new image.

Dockerfile: instructions on making an image
FROM python:alpine3.17
Image to use as base. In this case, we chose the official Python image that has all the tools and packages we need to run Python (e.g., pip).
WORKDIR ./app
Creates and sets the working directory inside the container to be ./app. All further commands are executed in this directory.
COPY toys.py .
Copies the Python file from the host into the working directory of the container (in this case, ./app).
RUN pip install flask
RUN is executed at build time. It runs the given commands to install the packages needed by our application into the image.
ENV FLASK_APP=toys.py
ENV FLASK_RUN_PORT=8001
ENV NINJA_API_KEY...
These commands set environment variables inside the Docker container.
EXPOSE 8001
Port 8001 is the port the container will listen on. This is documentation only, as the actual listening port is set by the FLASK_RUN_PORT variable and/or when starting the container.
CMD ["flask", "run", "--host=0.0.0.0"]
Executes the command “flask run --host=0.0.0.0” when the container starts up. The parameters to the command are given as a list, each one in quotes and separated by commas.

Remember to always add the “build context”.
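The instruction-by-instruction walkthrough above can be mirrored in a few lines of Python. This is only a minimal sketch for reading the example, not how Docker itself parses a Dockerfile: it assumes one instruction per line, single KEY=value pairs for ENV, and an exec-form CMD. The helper names parse_dockerfile and summarize are made up for illustration, and the ENV NINJA_API_KEY line is left out because its value is elided in the slides.

```python
import json

# The example Dockerfile from this section, minus the elided NINJA_API_KEY line.
DOCKERFILE = """\
FROM python:alpine3.17
WORKDIR ./app
COPY toys.py .
RUN pip install flask
ENV FLASK_APP=toys.py
ENV FLASK_RUN_PORT=8001
EXPOSE 8001
CMD ["flask", "run", "--host=0.0.0.0"]
"""


def parse_dockerfile(text):
    """Split each non-empty, non-comment line into (INSTRUCTION, arguments)."""
    steps = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        instruction, _, arguments = line.partition(" ")
        steps.append((instruction.upper(), arguments.strip()))
    return steps


def summarize(steps):
    """Collect the base image and the metadata described in the walkthrough."""
    summary = {"base": None, "workdir": None, "env": {}, "expose": [], "cmd": None}
    for instruction, arguments in steps:
        if instruction == "FROM":
            summary["base"] = arguments
        elif instruction == "WORKDIR":
            summary["workdir"] = arguments
        elif instruction == "ENV":
            key, _, value = arguments.partition("=")
            summary["env"][key] = value
        elif instruction == "EXPOSE":
            summary["expose"].append(int(arguments))
        elif instruction == "CMD":
            # An exec-form CMD is written as a JSON array of strings.
            summary["cmd"] = json.loads(arguments)
    return summary


steps = parse_dockerfile(DOCKERFILE)
summary = summarize(steps)
print(summary["base"])
print(summary["env"])
print(summary["cmd"])
```

Reading the summary side by side with the Dockerfile makes the split visible: FROM, COPY and RUN determine what files end up in the image, while ENV, EXPOSE and CMD only record metadata that is applied when a container starts.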
Build image
“.” means the current directory. That is where the resources for building an image (referenced by the Dockerfile) are located, unless specified otherwise explicitly in the Dockerfile.
    docker build --tag toysimage:v1 .
This command tells docker to build a new image and give it the name:tag “toysimage:v1”. You have to tell docker where to find the Dockerfile. The “.” at the end of the cmd tells docker to use the Dockerfile in the current directory.
    docker images
This command tells docker to list all the images it is storing on the host. You can see listed the newly created docker image toysimage:v1.

Explaining the docker build cmd messages
Docker uploads files to the Docker Builder.
The FROM cmd loads python:alpine3.17 into the image file system. Each layer in the image is uploaded.
The rest of the Dockerfile commands are run. The commands are numbered. Each command creates a new layer in the image.
If there are cached images (e.g., from a pull cmd, or a previous build), then they will be loaded and not rebuilt from scratch.
The final image is created with a unique ID, and then given the name:tag specified in the cmd.

Docker images
Docker images residing in my local repository. You can view your local images either by using the cmd docker images, or by opening Docker Desktop and going to the Dashboard.

Dockerfiles and layers
Each instruction in a Dockerfile that modifies the filesystem creates a layer in the image. Each layer encodes the changes made to the filesystem by executing that instruction. Other instructions just modify the image’s metadata.
This view is available on Docker Desktop.
[Figure: layers in the toys container, showing the layers from python:alpine and the layers from my Dockerfile commands. Note the size of the layers.]

Looking inside an image
In Docker Desktop you can get a breakdown of what packages are included per layer of your image.
Docker Desktop also shows vulnerabilities in each layer, giving the Common Vulnerabilities & Exposures (CVE) score for each vulnerability.
    docker sbom toysimage:v1
Gets the software bill of materials for the toysimage:v1 image. These are all the packages making up this image. (This is currently an experimental cmd.)
[Figure: Docker Desktop, CVE vulnerabilities.]

Create and run container
docker run
The docker run command = docker create + docker start cmds. It tells docker to create a container from the given image and start the container execution (using its init command).
    docker run --publish 80:8001 --name toys-cont-v1 toysimage:v1
The first port is the host port. The second port is the container port.
This cmd creates and starts a docker container based upon the image “toysimage:v1”. It gives the container the name “toys-cont-v1”. It maps the container port 8001 to the host port 80.

Ports and port forwarding
What are ports?
A port is a number assigned to uniquely identify a connection endpoint and to direct data to a specific service. At the software level, within an operating system, a port is a logical construct that identifies a specific process or a type of network service. (Wikipedia on port)
Port forwarding or port mapping is an application of network address translation (NAT) that redirects a communication request from one address and port number combination to another. (Wikipedia on port forwarding)

Docker containers have their own IP addresses and ports
Each container has its own internal IP address. Normally this is not visible to you, and it may change if the container is restarted.
Each container may listen on specific ports. For instance, our example toys app in the container will listen on port 8001 of the container for HTTP requests (API calls).
What IP address and port does a client use if the client wishes to talk to the container?
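One way to keep the --publish argument order straight (host port first, container port second) while thinking about that question is to model the host as a small forwarding table. The following is a toy sketch under stated assumptions: the Host class and parse_publish helper are hypothetical, and only the simple HOST_PORT:CONTAINER_PORT form used in this section is handled.

```python
def parse_publish(spec):
    """Parse a 'HOST_PORT:CONTAINER_PORT' string into two integers."""
    host_text, separator, container_text = spec.partition(":")
    if not separator:
        raise ValueError(f"expected HOST_PORT:CONTAINER_PORT, got {spec!r}")
    return int(host_text), int(container_text)


class Host:
    """Toy model of a host that forwards published ports to containers."""

    def __init__(self):
        # host port -> (container name, container port)
        self.forwarding = {}

    def run(self, name, publish):
        """Record the mapping that 'docker run --publish ... --name ...' asks for."""
        host_port, container_port = parse_publish(publish)
        if host_port in self.forwarding:
            # A host port can be forwarded to only one container at a time.
            raise ValueError(f"host port {host_port} is already published")
        self.forwarding[host_port] = (name, container_port)

    def route(self, host_port):
        """Return where a request arriving at host_port would be delivered."""
        return self.forwarding[host_port]


host = Host()
host.run("toys-cont-v1", "80:8001")
print(host.route(80))
```

In this model a request sent to the host on port 80 is delivered to toys-cont-v1 on port 8001. The client never needs to know the container's internal IP address.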
The client sends messages to the IP address of the host running the container. Docker provides a mechanism to specify which port on the host to use in order to reach the appropriate container port. This is called port publishing.

Port publishing
To make a port available to services outside of Docker, or to Docker containers running on a different network, use the --publish or -p flag. (Docker doc)
Example:
    docker run --publish 8080:80 nginx
The client would send to the host IP at port 8080. Docker uses port forwarding to send the message to the container running nginx, which listens on container port 80.

Port mapping in Docker
[Figure: a host with IP 192.166.1.7 running two containers. container1 (container IP 172.17.0.4) is started with docker run -p 8000:8000 --name container1, so host port 8000 maps to container port 8000. container2 (container IP 172.17.0.5) is started with docker run -p 80:8000 --name container2, so host port 80 maps to container port 8000.]

Build and run container (cont)
    docker ps
This cmd lists all running containers. We can see that “toys-cont-v1” is up and running. The host is listening on port 80 and forwards requests to the container port 8001. The docker container is based upon the image “toysimage:v1”.
Note that the host (0.0.0.0) port of 80 will forward requests to the container port 8001, the port we specified in the Dockerfile.
    docker ps -a
This cmd lists all containers, even those not running currently.

Checking that the server is running correctly
The server should respond to requests just as it did previously, when running outside the container. Check it out by:
Writing a program to issue REST requests to the server,
Using Postman to issue requests, or
Using the curl program, which allows you to issue HTTP requests from your terminal. Curl stands for “Client URL”.
We issue 3 curl cmds on the next slide:
The first two POST toys to the toys server.
The last one GETs the collection of toys posted so far.
If you are not familiar with curl, see the curl documentation.

[Figure: curl requests to the toys server.]

What does “docker run” really do?
It finds the image. If it does not exist in the local registry, it will search the public registry DockerHub (or other registries you specify) and pull it down into your local registry.
It creates a new container from the image.
It allocates a file system to the container. It mounts a read-write layer.
It attaches stdout and stderr to the terminal.
It creates a network interface that allows the container to talk to the host.
It sets up an IP address for the container.
It invokes the start cmd that you specified for the image.

Common container lifecycle commands
docker rm: Removes the container. All container resources (including the container file system) are released.
docker stop: Stops the running container(s). The main process inside the container will receive SIGTERM and, after a grace period, SIGKILL. The container’s memory (not its file system) will be released.
docker start: Starts one or more stopped containers.
docker kill: Kills the container(s). The main process inside the container is sent the SIGKILL signal.
docker pause: Suspends all processes in the specified container(s). On Linux this uses the freezer cgroup rather than sending SIGSTOP, so the processes do not observe that they are being suspended. The container’s resources are not released, except for the CPU.
docker unpause: Un-suspends all processes in the specified container(s).
For a complete list of docker container commands and explanations, see the Docker command line documentation.

Union File System
How containers share and modify images

Union file system
Union file systems: the Union File System (in the Linux kernel) allows contents from one file system to be merged with the contents of another, while keeping the "physical" content separate. Different “branches” can thereby share files.
When you have multiple layers in an image, the Union File System combines these into a single image.
UnionFS, AUFS and OverlayFS are implementations of a Linux union filesystem.

Union file system
The merged layer reflects the combined image layer and container layer. It contains those files visible to the container during execution.
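The merged view described above can be approximated in Python with collections.ChainMap, which looks keys up in a stack of dictionaries and sends all writes to the first one. This is only an analogy, assuming a dictionary per layer with made-up file names and contents; a real union filesystem works on directories in the kernel and also has to represent deleted files, which this sketch ignores.

```python
from collections import ChainMap

# Read-only image layers, oldest first. File names and contents are made up.
base_layer = {"/etc/os-release": "alpine", "/usr/bin/python": "python 3"}
app_layer = {"/app/toys.py": "original source"}


def start_container(*image_layers):
    """Return a merged view: an empty writable layer on top of the image layers."""
    writable = {}
    # ChainMap searches its maps left to right, so upper layers shadow lower ones.
    return ChainMap(writable, *reversed(image_layers))


container_1 = start_container(base_layer, app_layer)
container_2 = start_container(base_layer, app_layer)

# Writes to a ChainMap always land in its first map: the container's own layer.
container_1["/app/toys.py"] = "patched source"

print(container_1["/app/toys.py"])
print(container_2["/app/toys.py"])
print(app_layer["/app/toys.py"])
```

After the write, container_1 sees "patched source" while container_2 and the shared image layer still hold "original source". Both containers read /etc/os-release from the same base_layer dictionary, which is the sharing that keeps containers lightweight.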
The image layer: think of this as the filesystem of the image after you build it from the Dockerfile. It is read only.
The writeable container layer contains any files added, modified, or deleted when you run your image in a container.

Copy-on-write
Containers share the files of their image. But when a container modifies a file, it gets its own modified copy of that file. The copy is written in the writeable container layer. Hence that container will see the changes, but other containers pointing to the same image will see the original file.
This enables a lot of sharing between containers, and fast container start-up time.

Union file system helps avoid dependency conflicts
[Figure: two stacks on a shared Debian base. Stack 1: App 1 source, App 1 dependencies, Python 2. Stack 2: App 2 source, App 2 dependencies, Python 3.]
Hence each container has its own filesystem, and this avoids dependency conflicts between containers.
In this case, both use the same Debian base image but have different versions of Python.

What we have learnt about Docker so far
Docker images consist of a file system, some metadata, and a startup cmd. Metadata includes things like exposed ports, environment variables, etc.
When you run an image, the Docker engine creates a container. The container is isolated, having its own filesystem and system settings.
Because images are built using a union filesystem, containers share files, and only get their own copy when they modify a file.
This keeps containers lightweight, as they share files with other containers. It also makes it easy to make your own image by modifying an existing image: you just add or modify files on top of the existing image.

What we have learnt about Docker so far (cont)
A Dockerfile is a recipe for making a new image. You specify an image to start with, and then issue commands to change that image. You also specify the startup command.
We saw how easy it is to take our toys.py code and use it to create a Docker container.
There are many commands that operate on images and containers.
We saw just a few. See the official Docker documentation or other Internet documentation. You will most likely need a few more cmds for completing the assignments. The reference material I included at the end (in the section “Additional reference material”) provides some useful commands and links.

Question
How would you run two separate instances of our toysimage:v1 on your host?
You must be able to communicate with each one separately! Hence the host must listen for each one on a separate port.

Running two separate toys server instances
1. Issue a command to run the image toysimage:v1 in a container named toys-cont-v1 (just like we did before):
    docker run --publish 80:8001 --name toys-cont-v1 toysimage:v1
2. Issue another command to run the image toysimage:v1 in a container named toys-cont-v1-n2:
    docker run --publish 85:8001 --name toys-cont-v1-n2 toysimage:v1
While both are listening on the container port 8001, requests to the first container must be addressed to host port 80, and requests to the second one to host port 85.

Running two separate toys server instances (cont)
3. docker ps
You see both containers running and mapped to different host ports.
4. Run some requests on each container and see that we get the right results (next slide).
[Figure: issuing curl requests to the two containers.]

Docker Architecture
The main components comprising Docker

Docker Architecture
[Figure: your host machine runs the Docker client and the Docker daemon. The daemon talks to Docker Hub through a RESTful API over the Internet, and to a private Docker registry using HTTP over a private network.]
You talk to the Docker client, a CLI that enables you to talk to the Docker daemon over HTTP.
A daemon is a process that runs in the background. It is part of the Docker Engine.
The Docker Engine daemon is a server supporting a RESTful API. It receives requests, processes them, and sends responses.
The daemon may communicate with other services. It manages the host’s images and containers.
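Because the daemon is a RESTful server, a CLI command such as docker ps roughly corresponds to an HTTP request. The sketch below only builds the request line for the Engine API's container listing endpoint and does not contact a daemon; the function name list_containers_request is made up for illustration, and v1.41 is simply the API version linked in these slides.

```python
from urllib.parse import urlencode

API_VERSION = "v1.41"  # Engine API version used as an example in this section


def list_containers_request(show_all=False):
    """Build the HTTP request line for listing containers via the Engine API."""
    path = f"/{API_VERSION}/containers/json"
    if show_all:
        # The 'all' query parameter includes containers that are not running.
        path += "?" + urlencode({"all": "true"})
    return f"GET {path} HTTP/1.1"


print(list_containers_request())               # roughly: docker ps
print(list_containers_request(show_all=True))  # roughly: docker ps -a
```

On a typical Linux installation the daemon listens on a local Unix socket rather than a TCP port, so actually sending such a request needs a client that can speak HTTP over that socket. That step is left out here.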
Docker Architecture
[Figure: Docker architecture, including data volumes and network configuration.]
Docker Engine provides a REST API: https://docs.docker.com/reference/api/engine/version/v1.41/

Docker Architecture (cont)
A Docker registry stores Docker images. Docker Hub is a public registry that anyone can use, and Docker is configured to look for images on Docker Hub by default.
When you use the docker pull or docker run commands, the required images are taken from your local daemon if present, or otherwise pulled from a configured registry.
When you use the docker push command, your image is pushed to your configured registry or to a specified registry.

Registries and Repositories
A Docker registry is a storage and distribution system for named Docker images.
The registry allows Docker users to pull images locally, as well as push new images to the registry (given adequate access permissions when applicable).
A Docker registry is organized into Docker repositories, where a repository holds all the versions of a specific image, i.e., images that have the same name but different tags.
An image may have 0, 1, or many tags.

Docker Hub and private registries
Docker Hub hosts a large library of pre-built images (operating systems, development stacks, etc.). It is a great way to accelerate development, as one can almost instantly get started using an image by downloading it from Docker Hub and running it.
Many organizations, however, deploy private registries for enhanced security control. These private registries contain public and proprietary images, all of which have been scanned for vulnerabilities and checked for conformance to corporate standards. The private registry facilitates incorporation of these images into development pipelines.
By default, the Docker engine interacts with DockerHub, Docker’s public registry instance. However, it is possible to run the open-source Docker registry/distribution on-premise, as well as a commercially supported version called Docker Trusted Registry.
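The statement that a repository holds all images with the same name but different tags can be made concrete with a small Python sketch. It is a simplification under stated assumptions: split_reference and group_by_repository are hypothetical helpers, digests (name@sha256:...) are not handled, and toysimage:v2 and the registry host localhost:5000 are invented for the example. The default tag "latest" is what the docker CLI assumes when no tag is given.

```python
from collections import defaultdict


def split_reference(reference, default_tag="latest"):
    """Split an image reference into (repository, tag)."""
    # A colon separates a tag only if it comes after the last '/'.
    # Otherwise it may belong to a registry host such as 'localhost:5000'.
    last_slash = reference.rfind("/")
    colon = reference.rfind(":")
    if colon > last_slash:
        return reference[:colon], reference[colon + 1:]
    return reference, default_tag


def group_by_repository(references):
    """Group image references so each repository lists its tags."""
    repositories = defaultdict(list)
    for reference in references:
        repository, tag = split_reference(reference)
        repositories[repository].append(tag)
    return dict(repositories)


images = ["toysimage:v1", "toysimage:v2", "python:alpine3.17", "nginx"]
print(group_by_repository(images))
print(split_reference("localhost:5000/toysimage"))
```

Here toysimage ends up as one repository with the tags v1 and v2, while nginx, given without a tag, is treated as nginx:latest.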
There are other public registries available online.
hub.docker.com: sign up for a free account. You will need it later in the course.

Docker best practices
1. Run a single process per container. This helps to build loosely coupled applications. Use container linking or container networking to communicate between containers.
2. Treat containers as ephemeral. They should be immutable entities, capable of being stopped and restarted. Store runtime configuration and data outside the image; e.g., use Docker volumes.
3. Build your image with the minimum files required. Use .dockerignore or other techniques to keep the build context minimal.
4. Use prebuilt images from Dockerhub whenever you can.
5. Minimize the number of layers in your Docker image.
Source: Docker Cookbook by Sébastien Goasguen

How to share using Docker
Option A: create an image and put it in a public repository
Quick to download and start up.
Users will always get the exact replica of the image. There are pros and cons to this.*
Does not require availability of any source files.
Does not provide a repeatable way to reproduce the image.
Hard to know exactly what is in an image and what dependencies it uses.
Option B: create a Dockerfile and share it with others
Users need to build the image, which may be slow for large projects.
Users create a new image. It may not be an exact replica of the original. There are pros and cons to this.*
May require source files not available to others.
Provides a repeatable way to reproduce the image.
Easy to know what is in the image and what dependencies it has.
*Do you want the latest version of a dependency or the original version? Do you specify in the Dockerfile nginx:3.4 or nginx:latest? What about security vulnerability patches?

Docker Summary
The main value of containers
Docker helps developers:
1. Simplicity of reusing and customizing pre-built images
2. Portable and platform independent by encapsulating dependencies
3. Lightweight environment for executing small, loosely coupled microservices
4. Facilitates automation of CI/CD processes

But Docker does not solve all problems
Dependencies and isolation:
When importing multiple packages into a single container, there can still be dependency conflicts.
This is mitigated if each container has a single responsibility and uses just a few integrated packages.
When a Dockerfile updates packages, or is based upon an image that is tagged “latest”, a rebuild may break a previously working image.
And one often needs to use the latest version, e.g., to make sure the latest vulnerabilities are patched.
Still need to test!

Docker compose common commands
Compose Up
Compose Down
Compose stats
Docker Volumes
prune commands on docker images, containers, …
See https://arxiv.org/pdf/2403.17940, page 55

References
Docker
1. https://docs.docker.com/get-started/overview/
2. Docker in Practice, by Ian Miell and Aidan Hobson Sayers, Manning, 2nd edition, 2019
3. Docker Cookbook, by Sébastien Goasguen, O’Reilly, 2016
4. High level overview and value description: https://www.ibm.com/in-en/cloud/learn/containers
5. A good overview of Docker: https://www.freecodecamp.org/news/what-is-docker-used-for-a-docker-container-tutorial-for-beginners/
6. Examples of Dockerfiles: https://github.com/topics/dockerfile-examples
7. Explanation of 4 key Linux technologies that Docker makes use of: https://opensource.com/article/21/8/container-linux-technology
8. Explanation of Linux namespaces and cgroups: https://www.nginx.com/blog/what-are-namespaces-cgroups-how-do-they-work/ and https://www.codementor.io/blog/docker-technology-5x1kilcbow
9. https://pythonspeed.com/articles/base-image-python-docker-images
10. Overlay filesystem: https://jvns.ca/blog/2019/11/18/how-containers-work--overlayfs, https://gdevillele.github.io/engine/userguide/storagedriver/overlayfs-driver/, https://martinheinz.dev/blog/44 and https://docs.docker.com/engine/storage/drivers/#images-and-layers
11. Dockerfiles depend on containers: https://iximiuz.com/en/posts/you-need-containers-to-build-an-image/
12. Docker and DevOps: https://learn.microsoft.com/en-us/dotnet/architecture/containerized-lifecycle/docker-application-lifecycle/containers-foundation-for-devops-collaboration
13. Publishing ports: https://iximiuz.com/en/posts/docker-publish-container-ports/
14. Differences between a VM and Docker: https://stackoverflow.com/questions/16047306/how-is-docker-different-from-a-virtual-machine
15. “Navigating the Docker Ecosystem: A comprehensive survey”, https://arxiv.org/pdf/2403.17940.pdf
16. Research on how to automate the construction/optimization of a Dockerfile: “Automatic Service Containerization with Docker”, João Carlos Maduro, https://repositorio-aberto.up.pt/bitstream/10216/135486/2/487218.pdf
Docker with VMs, applications integrated into VMs
“Experimental Assessment of Containers Running on Top of Virtual Machines”, https://arxiv.org/pdf/2401.07539.pdf
“Live Objects All The Way Down”, https://arxiv.org/pdf/2312.16973.pdf
Management of Docker containers across multi-cloud environments
“Containerization in Multi-Cloud Environment: Roles, Strategies, Challenges, and Solutions for Effective Implementation”, https://arxiv.org/pdf/2403.12980.pdf (surveys many papers on using containers in complex multicloud environments)

Additional reference material
Comes in handy when building docker containers: https://devopscycle.com/blog/the-ultimate-docker-cheat-sheet/

Some useful commands on images
When you create a new image, it gets a unique ID. An image can be referred to either by its ID (a unique prefix is good enough) or by its name. Hence, in any command requiring an image, the image can be specified by its ID or its name.
docker images: Lists all existing docker images in the local registry.
docker rmi: Removes an existing image from the local registry.
docker tag: Tags the image with a new tag. The source may be the image’s unique ID or an already specified name:tag for the image. In the latter case, the original tag remains; this command creates an additional tag for the same image.
docker run, docker run --name, docker run --publish hport:cport: Creates a container based upon the given image and starts the container. The --name flag gives a user-friendly name to the container. The --publish (-p) flag maps the host’s port hport to the container’s port cport.
docker history: Shows the history of an image, i.e., the layers comprising the image.
docker sbom: Displays the Software Bill of Materials of a Docker image. New experimental cmd.

Common Dockerfile commands
Command      Purpose
FROM         To specify the parent image.
WORKDIR      To set the working directory for any commands that follow in the Dockerfile.
RUN          To install any applications and packages required for your container.
COPY         To copy over files or directories from a specific location.
ADD          As COPY, but also able to handle remote URLs and unpack compressed files.
ENTRYPOINT   Command that will always be executed when the container starts.
CMD          Default arguments passed to the entrypoint. If ENTRYPOINT is not set, the CMD is the command the container executes (a shell-form CMD is run through /bin/sh -c).
EXPOSE       To document the port through which your container application is accessed.
LABEL        To add metadata to the image.

Documentation to get started using Docker with Python
These links basically cover the steps we just went over. You can also use these links as a starting point to explore the much richer documentation on Docker.
Docker overview: https://docs.docker.com/get-docker/
Install Docker: https://docs.docker.com/get-docker/
Build: https://docs.docker.com/language/python/build-images/
Run: https://docs.docker.com/language/python/run-containers/
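When scripting the build and run steps from this section, it can help to assemble each docker command as a list of arguments rather than one long string. The following is a minimal sketch using the image and container names from the slides; build_command and run_command are hypothetical helper names, and nothing is executed, so Docker does not need to be installed to try it.

```python
import shlex


def build_command(tag, context="."):
    """Argument list for building an image from the Dockerfile in 'context'."""
    return ["docker", "build", "--tag", tag, context]


def run_command(image, name, host_port, container_port):
    """Argument list for running a container with one published port."""
    return [
        "docker", "run",
        "--publish", f"{host_port}:{container_port}",  # host port comes first
        "--name", name,
        image,
    ]


print(shlex.join(build_command("toysimage:v1")))
print(shlex.join(run_command("toysimage:v1", "toys-cont-v1", 80, 8001)))
print(shlex.join(run_command("toysimage:v1", "toys-cont-v1-n2", 85, 8001)))
```

The printed lines match the commands used earlier in this section. To actually execute one of them, the list could be passed to subprocess.run(command, check=True) on a machine where Docker is installed.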