NGINX Cookbook, Third Edition
Advanced Recipes for High-Performance Load Balancing
Compliments of Derek DeJonghe

NGINX is one of the most widely used web servers available today, in part because of its capabilities as a load balancer and reverse proxy server for HTTP and other network protocols. This revised cookbook provides easy-to-follow examples of real-world problems in application delivery. Practical recipes help you set up and use either the open source or commercial offering to solve problems in various use cases.

For professionals who understand modern web architectures such as n-tier or microservice designs and common web protocols such as TCP and HTTP, these recipes include proven solutions for security and software load balancing and for monitoring and maintaining NGINX's application delivery platform. You'll also explore advanced features of both NGINX and NGINX Plus, the free and licensed versions of this server.

You'll find recipes for:
- High-performance load balancing with HTTP, TCP, and UDP
- Securing access through encrypted traffic, secure links, HTTP authentication subrequests, and more
- Deploying NGINX to Google, AWS, and Azure cloud services
- NGINX Plus as a service provider in a SAML environment
- HTTP/3 (QUIC), OpenTelemetry, and the njs module

Derek DeJonghe, an Amazon Web Services Certified Professional, specializes in Linux/Unix-based systems and web applications. His background in web development, system administration, and networking makes him a valuable cloud resource. Derek focuses on infrastructure management, configuration management, and continuous integration. He also develops DevOps tools and maintains the systems, networks, and deployments of multiple multi-tenant SaaS offerings.

Twitter: @oreillymedia
linkedin.com/company/oreilly-media
youtube.com/oreillymedia

SYSTEM ADMINISTRATION    US $79.99    CAN $99.99
ISBN: 978-1-098-15844-6

Try F5 NGINX Plus and F5 NGINX App Protect Free

Get high-performance application delivery and security for microservices. NGINX Plus is a software load balancer, API gateway, and microservices proxy. NGINX App Protect is a lightweight, fast web application firewall (WAF) built on proven F5 technology and designed for modern apps and DevOps environments.

Cost Savings: Significant cost savings compared to hardware application delivery controllers and WAFs, with all the performance and features you expect.
Reduced Complexity: The only all-in-one load balancer, API gateway, microservices proxy, and web application firewall helps reduce infrastructure sprawl.
Enterprise Ready: NGINX Plus and NGINX App Protect deliver enterprise requirements for security, scalability, and resiliency while integrating with DevOps and CI/CD environments.
Advanced Security: Keep your apps and APIs safe from the ever-expanding range of cyberattacks and security vulnerabilities.

Download at nginx.com/freetrial

©2024 F5, Inc. All rights reserved. F5, the F5 logo, NGINX, the NGINX logo, NGINX App Protect, and NGINX Plus are trademarks of F5, Inc. in the U.S. and in certain other countries. Other F5 trademarks are identified at f5.com. Any other products, services, or company names referenced herein may be trademarks of their respective owners with no endorsement or affiliation, expressed or implied, claimed by F5, Inc.

THIRD EDITION
NGINX Cookbook
Advanced Recipes for High-Performance Load Balancing
Derek DeJonghe

NGINX Cookbook
by Derek DeJonghe
Copyright © 2024 O'Reilly Media, Inc.
All rights reserved. Printed in the United States of America. Published by O'Reilly Media, Inc., 1005 Gravenstein Highway North, Sebastopol, CA 95472.

O'Reilly books may be purchased for educational, business, or sales promotional use. Online editions are also available for most titles (http://oreilly.com). For more information, contact our corporate/institutional sales department: 800-998-9938 or [email protected].

Acquisitions Editor: John Devins
Development Editor: Gary O'Brien
Production Editor: Clare Laylock
Copyeditor: Piper Editorial Consulting, LLC
Proofreader: Kim Cofer
Indexer: Potomac Indexing, LLC
Interior Designer: David Futato
Cover Designer: Karen Montgomery
Illustrator: Kate Dullea

November 2020: First Edition
May 2022: Second Edition
February 2024: Third Edition

Revision History for the Third Edition
2024-01-29: First Release

See http://oreilly.com/catalog/errata.csp?isbn=9781098158439 for release details.

The O'Reilly logo is a registered trademark of O'Reilly Media, Inc. NGINX Cookbook, the cover image, and related trade dress are trademarks of O'Reilly Media, Inc. The views expressed in this work are those of the author, and do not represent the publisher's views. While the publisher and the author have used good faith efforts to ensure that the information and instructions contained in this work are accurate, the publisher and the author disclaim all responsibility for errors or omissions, including without limitation responsibility for damages resulting from the use of or reliance on this work. Use of the information and instructions contained in this work is at your own risk. If any code samples or other technology this work contains or describes is subject to open source licenses or the intellectual property rights of others, it is your responsibility to ensure that your use thereof complies with such licenses and/or rights. This work is part of a collaboration between O'Reilly and NGINX. See our statement of editorial independence.

978-1-098-15844-6
[LSI]

Table of Contents

Foreword
Preface

1. Basics
1.0 Introduction
1.1 Installing NGINX on Debian/Ubuntu
1.2 Installing NGINX Through the YUM Package Manager
1.3 Installing NGINX Plus
1.4 Verifying Your Installation
1.5 Key Files, Directories, and Commands
1.6 Using Includes for Clean Configs
1.7 Serving Static Content

2. High-Performance Load Balancing
2.0 Introduction
2.1 HTTP Load Balancing
2.2 TCP Load Balancing
2.3 UDP Load Balancing
2.4 Load-Balancing Methods
2.5 Sticky Cookie with NGINX Plus
2.6 Sticky Learn with NGINX Plus
2.7 Sticky Routing with NGINX Plus
2.8 Connection Draining with NGINX Plus
2.9 Passive Health Checks
2.10 Active Health Checks with NGINX Plus
2.11 Slow Start with NGINX Plus

3. Traffic Management
3.0 Introduction
3.1 A/B Testing
3.2 Using the GeoIP Module and Database
3.3 Restricting Access Based on Country
3.4 Finding the Original Client
3.5 Limiting Connections
3.6 Limiting Rate
3.7 Limiting Bandwidth

4. Massively Scalable Content Caching
4.0 Introduction
4.1 Caching Zones
4.2 Caching Hash Keys
4.3 Cache Locking
4.4 Use Stale Cache
4.5 Cache Bypass
4.6 Cache Purging with NGINX Plus
4.7 Cache Slicing

5. Programmability and Automation
5.0 Introduction
5.1 NGINX Plus API
5.2 Using the Key-Value Store with NGINX Plus
5.3 Using the njs Module to Expose JavaScript Functionality Within NGINX
5.4 Extending NGINX with a Common Programming Language
5.5 Installing with Ansible
5.6 Installing with Chef
5.7 Automating Configurations with Consul Templating

6. Authentication
6.0 Introduction
6.1 HTTP Basic Authentication
6.2 Authentication Subrequests
6.3 Validating JWTs with NGINX Plus
6.4 Creating JSON Web Keys
6.5 Authenticate Users via Existing OpenID Connect SSO with NGINX Plus
6.6 Validate JSON Web Tokens (JWT) with NGINX Plus
6.7 Automatically Obtaining and Caching JSON Web Key Sets with NGINX Plus
6.8 Configuring NGINX Plus as a Service Provider for SAML Authentication

7. Security Controls
7.0 Introduction
7.1 Access Based on IP Address
7.2 Allowing Cross-Origin Resource Sharing
7.3 Client-Side Encryption
7.4 Advanced Client-Side Encryption
7.5 Upstream Encryption
7.6 Securing a Location
7.7 Generating a Secure Link with a Secret
7.8 Securing a Location with an Expire Date
7.9 Generating an Expiring Link
7.10 HTTPS Redirects
7.11 Redirecting to HTTPS Where SSL/TLS Is Terminated Before NGINX
7.12 HTTP Strict Transport Security
7.13 Restricting Access Based on Country
7.14 Satisfying Any Number of Security Methods
7.15 NGINX Plus Dynamic Application Layer DDoS Mitigation
7.16 Installing and Configuring NGINX Plus with the NGINX App Protect WAF Module

8. HTTP/2 and HTTP/3 (QUIC)
8.0 Introduction
8.1 Enabling HTTP/2
8.2 Enabling HTTP/3
8.3 gRPC

9. Sophisticated Media Streaming
9.0 Introduction
9.1 Serving MP4 and FLV
9.2 Streaming with HLS with NGINX Plus
9.3 Streaming with HDS with NGINX Plus
9.4 Bandwidth Limits with NGINX Plus

10. Cloud Deployments
10.0 Introduction
10.1 Auto-Provisioning
10.2 Deploying an NGINX VM in the Cloud
10.3 Creating an NGINX Machine Image
10.4 Routing to NGINX Nodes Without a Cloud Native Load Balancer
10.5 The Load Balancer Sandwich
10.6 Load Balancing over Dynamically Scaling NGINX Servers
10.7 Creating a Google App Engine Proxy

11. Containers/Microservices
11.0 Introduction
11.1 Using NGINX as an API Gateway
11.2 Using DNS SRV Records with NGINX Plus
11.3 Using the Official NGINX Container Image
11.4 Creating an NGINX Dockerfile
11.5 Building an NGINX Plus Container Image
11.6 Using Environment Variables in NGINX
11.7 NGINX Ingress Controller from NGINX

12. High-Availability Deployment Modes
12.0 Introduction
12.1 NGINX Plus HA Mode
12.2 Load Balancing Load Balancers with DNS
12.3 Load Balancing on EC2
12.4 NGINX Plus Configuration Synchronization
12.5 State Sharing with NGINX Plus and Zone Sync

13. Advanced Activity Monitoring
13.0 Introduction
13.1 Enable NGINX Stub Status
13.2 Enabling the NGINX Plus Monitoring Dashboard
13.3 Collecting Metrics Using the NGINX Plus API
13.4 OpenTelemetry for NGINX
13.5 Prometheus Exporter Module

14. Debugging and Troubleshooting with Access Logs, Error Logs, and Request Tracing
14.0 Introduction
14.1 Configuring Access Logs
14.2 Configuring Error Logs
14.3 Forwarding to Syslog
14.4 Debugging Configs
14.5 Request Tracing

15. Performance Tuning
15.0 Introduction
15.1 Automating Tests with Load Drivers
15.2 Controlling Cache at the Browser
15.3 Keeping Connections Open to Clients
15.4 Keeping Connections Open Upstream
15.5 Buffering Responses
15.6 Buffering Access Logs
15.7 OS Tuning

Index

Foreword

Welcome to the 2024 edition of the NGINX Cookbook. O'Reilly has been publishing the NGINX Cookbook for nine years, and we continue to update the content to reflect the many improvements that regularly go into NGINX. Today, NGINX is the world's most popular web server. We first released NGINX in 2004, and the product continues to evolve to meet the needs of those charged with scaling, securing, and delivering modern applications. We architected NGINX for flexibility and scale, which has made it possible to extend its capabilities beyond web serving to load balancing, reverse proxy, and API gateway. And we take it as a testament to the value of NGINX that many of the load-balancing services offered by major public clouds and CDNs are actually based on NGINX code.

NGINX also continues to expand into new realms and add critical capabilities. NGINX Ingress Controller for Kubernetes is natively integrated with NGINX and provides key capabilities for managing both east-west and north-south traffic, a critical requirement in the world of Kubernetes. NGINX also has consistently expanded its authentication and security capabilities. None of this would matter if NGINX did not continue to deliver speed, resilience, and agility, all fundamental requirements of modern distributed applications.

The NGINX Cookbook remains the go-to guide to NGINX from the people who know the code best. Whether you are running NGINX Open Source for a small project or NGINX Plus for an enterprise deployment across multiple regions, and whether you are running locally or in the cloud, NGINX Cookbook helps you get the most out of NGINX. This book features more than one hundred easy-to-follow recipes covering how to install NGINX, configure it for almost any use case, secure it, scale it, and troubleshoot common issues.

The 2024 edition includes updates to many sections to reflect new functionality in NGINX and adds entirely new sections focused on expanded security capabilities and effectively leveraging HTTP/2, HTTP/3 (QUIC), and gRPC. The world of technology and applications is changing fast, and NGINX is changing with it to continue contributing to your success.
The original vision of NGINX was highly scalable, reliable, fast, and secure web serving. Everything we do today is built on that original vision and designed to help our community secure and deploy the apps they need, in any environment, at any scale, with confidence and trust.

Please enjoy this latest edition and feel free to let us know what you think. We are listening and want to hear from you.

Peter Beardmore
Director of Product Marketing, F5 NGINX

Preface

The NGINX Cookbook aims to provide easy-to-follow examples of real-world problems in application delivery. Throughout this book, you will explore the many features of NGINX and how to use them. This guide is fairly comprehensive, and touches on most of the main capabilities of NGINX.

The book will begin by explaining the installation process of NGINX and NGINX Plus, as well as some basic getting-started steps for readers new to NGINX. From there, the sections will progress to load balancing in all forms, accompanied by chapters about traffic management, caching, and automation. Chapter 6, "Authentication", covers a lot of ground, but it is important because NGINX is often the first point of entry for web traffic to your application, and the first line of application-layer defense against web attacks and vulnerabilities. There are a number of chapters that cover cutting-edge topics such as HTTP/3 (QUIC), media streaming, cloud, SAML Auth, and container environments, wrapping up with more traditional operational topics such as monitoring, debugging, performance, and operational tips.

I personally use NGINX as a multitool, and I believe this book will enable you to do the same. It's software that I believe in and enjoy working with. I'm happy to share this knowledge with you, and I hope that as you read through this book you relate the recipes to your real-world scenarios and will employ these solutions.

Conventions Used in This Book

The following typographical conventions are used in this book:

Italic
Indicates new terms, URLs, email addresses, filenames, and file extensions.

Constant width
Used for program listings, as well as within paragraphs to refer to program elements such as variable or function names, databases, data types, environment variables, statements, and keywords.

This element signifies a general note.

This element indicates a warning or caution.

O'Reilly Online Learning

For more than 40 years, O'Reilly Media has provided technology and business training, knowledge, and insight to help companies succeed. Our unique network of experts and innovators share their knowledge and expertise through books, articles, and our online learning platform. O'Reilly's online learning platform gives you on-demand access to live training courses, in-depth learning paths, interactive coding environments, and a vast collection of text and video from O'Reilly and 200+ other publishers. For more information, visit http://oreilly.com.

How to Contact Us

Please address comments and questions concerning this book to the publisher:

O'Reilly Media, Inc.
1005 Gravenstein Highway North
Sebastopol, CA 95472
800-889-8969 (in the United States or Canada)
707-827-7019 (international or local)
707-829-0104 (fax)
[email protected]
https://www.oreilly.com/about/contact.html

We have a web page for this book, where we list errata, examples, and any additional information. You can access this page at https://oreil.ly/nginx-cookbook-3e. For news and information about our books and courses, visit https://oreilly.com.
Find us on LinkedIn: https://linkedin.com/company/oreilly-media
Follow us on Twitter: https://twitter.com/oreillymedia
Watch us on YouTube: https://youtube.com/oreillymedia

CHAPTER 1
Basics

1.0 Introduction

To get started with NGINX Open Source or NGINX Plus, you first need to install it on a system and learn some basics. In this chapter, you will learn how to install NGINX, where the main configuration files are located, and what the commands are for administration. You will also learn how to verify your installation and make requests to the default server. Some of the recipes in this book will use NGINX Plus. You can get a free trial of NGINX Plus at https://nginx.com.

1.1 Installing NGINX on Debian/Ubuntu

Problem
You need to install NGINX Open Source on a Debian or Ubuntu machine.

Solution
Update package information for configured sources and install some packages that will assist in configuring the official NGINX package repository:

    $ apt update
    $ apt install -y curl gnupg2 ca-certificates lsb-release \
        debian-archive-keyring

Download and save the NGINX signing key:

    $ curl https://nginx.org/keys/nginx_signing.key | gpg --dearmor \
        | tee /usr/share/keyrings/nginx-archive-keyring.gpg >/dev/null

Use lsb_release to set variables defining the OS and release names, then create an apt source file:

    $ OS=$(lsb_release -is | tr '[:upper:]' '[:lower:]')
    $ RELEASE=$(lsb_release -cs)
    $ echo "deb [signed-by=/usr/share/keyrings/nginx-archive-keyring.gpg] \
        http://nginx.org/packages/${OS} ${RELEASE} nginx" \
        | tee /etc/apt/sources.list.d/nginx.list

Update package information once more, then install and start NGINX:

    $ apt update
    $ apt install -y nginx
    $ systemctl enable nginx
    $ nginx

Discussion
The commands provided in this section instruct the advanced package tool (APT) package management system to utilize the official NGINX package repository. The NGINX GPG package signing key was downloaded and saved to a location on the filesystem for use by APT. Providing APT the signing key enables the APT system to validate packages from the repository. The lsb_release command was used to automatically determine the OS and release name so that these instructions can be used across all release versions of Debian or Ubuntu. The apt update command instructs the APT system to refresh its package listings from its known repositories. After the package list is refreshed, you can install NGINX Open Source from the official NGINX repository. After you install it, the final command starts NGINX.

1.2 Installing NGINX Through the YUM Package Manager

Problem
You need to install NGINX Open Source on Red Hat Enterprise Linux (RHEL), Oracle Linux, AlmaLinux, Rocky Linux, or CentOS.

Solution
Create a file named /etc/yum.repos.d/nginx.repo that contains the following contents:

    [nginx]
    name=nginx repo
    baseurl=http://nginx.org/packages/OS/$releasever/$basearch/
    gpgcheck=0
    enabled=1

Alter the file, replacing OS in the middle of the URL with rhel or centos, depending on your distribution. Then, run the following commands:

    $ yum -y install nginx
    $ systemctl enable nginx
    $ systemctl start nginx
    $ firewall-cmd --permanent --zone=public --add-port=80/tcp
    $ firewall-cmd --reload

Discussion
The file you just created for this solution instructs the YUM package management system to utilize the official NGINX Open Source package repository. The commands that follow install NGINX Open Source from the official repository, instruct systemd to enable NGINX at boot time, and tell it to start NGINX now.
If necessary, the firewall commands open port 80 for the transmission control protocol (TCP), which is the default port for HTTP. The last command reloads the firewall to commit the changes.

1.3 Installing NGINX Plus

Problem
You need to install NGINX Plus.

Solution
Visit the NGINX docs. Select the OS you're installing to and then follow the instructions. The instructions are similar to those of the installation of the open source solutions; however, you need to obtain a certificate and key in order to authenticate to the NGINX Plus repository.

Discussion
NGINX keeps this repository installation guide up-to-date with instructions on installing NGINX Plus. Depending on your OS and version, these instructions vary slightly, but there is one commonality. You must obtain a certificate and key from the NGINX portal, and provide them to your system, in order to authenticate to the NGINX Plus repository.

1.4 Verifying Your Installation

Problem
You want to validate the NGINX installation and check the version.

Solution
You can verify that NGINX is installed and check its version by using the following command:

    $ nginx -v
    nginx version: nginx/1.25.3

As this example shows, the response displays the version.

You can confirm that NGINX is running by using the following command:

    $ ps -ef | grep nginx
    root   1738     1  0 19:54 ?  00:00:00 nginx: master process
    nginx  1739  1738  0 19:54 ?  00:00:00 nginx: worker process

The ps command lists running processes. By piping it to grep, you can search for specific words in the output. This example uses grep to search for nginx. The result shows two running processes: a master and a worker. If NGINX is running, you will always see a master and one or more worker processes. Note that the master process runs as root because, by default, NGINX needs elevated privileges in order to function properly. For instructions on starting NGINX, refer to the next recipe. To see how to start NGINX as a daemon, use the init.d or systemd methodologies.

To verify that NGINX is returning requests correctly, use your browser to make a request to your machine or use curl. When making the request, use the machine's IP address or hostname. If installed locally, you can use localhost as follows:

    $ curl localhost

You will see the NGINX Welcome default HTML site.

Discussion
The nginx command allows you to interact with the NGINX binary to check the version, list installed modules, test configurations, and send signals to the master process. NGINX must be running in order for it to serve requests. The ps command is a surefire way to determine whether NGINX is running either as a daemon or in the foreground. The configuration provided by default with NGINX runs a static-site HTTP server on port 80. You can test this default site by making an HTTP request to the machine at localhost, or at the host's IP address or hostname.

1.5 Key Files, Directories, and Commands

Problem
You need to understand the important NGINX directories and commands.

Solution
The following configuration directories and file locations can be changed during the compilation of NGINX and therefore may vary based on your installation.

NGINX files and directories

/etc/nginx/
The /etc/nginx/ directory is the default configuration root for the NGINX server. Within this directory you will find configuration files that instruct NGINX on how to behave.

/etc/nginx/nginx.conf
The /etc/nginx/nginx.conf file is the default configuration entry point used by the NGINX daemon.
This configuration file sets up global settings for things like worker processes, tuning, logging, loading dynamic modules, and references to other NGINX configuration files. In a default configuration, the /etc/nginx/nginx.conf file includes the top-level http block, or context, which includes all configuration files in the directory described next.

/etc/nginx/conf.d/
The /etc/nginx/conf.d/ directory contains the default HTTP server configuration file. Files in this directory ending in .conf are included in the top-level http block from within the /etc/nginx/nginx.conf file. It's best practice to utilize include statements and organize your configuration in this way to keep your configuration files concise. In some package repositories, this folder is named sites-enabled, and configuration files are linked from a folder named sites-available; this convention is deprecated.

/var/log/nginx/
The /var/log/nginx/ directory is the default log location for NGINX. Within this directory you will find an access.log file and an error.log file. By default the access log contains an entry for each request NGINX serves. The error logfile contains error events and debug information if the debug module is enabled.

NGINX commands

nginx -h
Shows the NGINX help menu.

nginx -v
Shows the NGINX version.

nginx -V
Shows the NGINX version, build information, and configuration arguments, which show the modules built into the NGINX binary.

nginx -t
Tests the NGINX configuration.

nginx -T
Tests the NGINX configuration and prints the validated configuration to the screen. This command is useful when seeking support.

nginx -s signal
The -s flag sends a signal to the NGINX master process. You can send signals such as stop, quit, reload, and reopen. The stop signal discontinues the NGINX process immediately. The quit signal stops the NGINX process after it finishes processing in-flight requests. The reload signal reloads the configuration. The reopen signal instructs NGINX to reopen logfiles.

Discussion
With an understanding of these key files, directories, and commands, you're in a good position to start working with NGINX. Using this knowledge, you can alter the default configuration files and test your changes with the nginx -t command. If your test is successful, you also know how to instruct NGINX to reload its configuration using the nginx -s reload command, as sketched below.
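As a minimal illustration of that workflow (assuming a shell where && stops on failure), the two commands can be chained so that a reload happens only when the configuration test passes:

    # Validate the configuration; reload only if the test succeeds
    $ nginx -t && nginx -s reload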
1.6 Using Includes for Clean Configs

Problem
You need to clean up bulky configuration files to keep your configurations logically grouped into modular configuration sets.

Solution
Use the include directive to reference configuration files, directories, or masks:

    http {
        include conf.d/compression.conf;
        include ssl_config/*.conf;
    }

The include directive takes a single parameter of either a path to a file or a mask that matches many files. This directive is valid in any context.

Discussion
By using include statements you can keep your NGINX configuration clean and concise. You'll be able to logically group your configurations to avoid configuration files that go on for hundreds of lines. You can create modular configuration files that can be included in multiple places throughout your configuration to avoid duplication of configurations. Take the example fastcgi_param configuration file provided in most package management installs of NGINX. If you manage multiple FastCGI virtual servers on a single NGINX box, you can include this configuration file for any location or context where you require these parameters for FastCGI without having to duplicate this configuration. Another example is Secure Sockets Layer (SSL) configurations. If you're running multiple servers that require similar SSL configurations, you can simply write this configuration once and include it wherever needed, as sketched below. By logically grouping your configurations together, you can rest assured that your configurations are neat and organized. Changing a set of configuration files can be done by editing a single file rather than changing multiple sets of configuration blocks in multiple locations within a massive configuration file. Grouping your configurations into files and using include statements is good practice for your sanity and the sanity of your colleagues.
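To sketch that SSL example (the snippet path, certificate paths, and server names below are hypothetical, not from the book), the shared settings live in one file and each virtual server includes it:

    # /etc/nginx/ssl_config/common_ssl.conf -- shared TLS settings
    ssl_certificate     /etc/nginx/ssl/example.com.crt;
    ssl_certificate_key /etc/nginx/ssl/example.com.key;
    ssl_protocols       TLSv1.2 TLSv1.3;

    # Each server block reuses the snippet instead of repeating the directives
    server {
        listen 443 ssl;
        server_name www.example.com;
        include ssl_config/common_ssl.conf;
    }

    server {
        listen 443 ssl;
        server_name api.example.com;
        include ssl_config/common_ssl.conf;
    }

Updating the protocol list or rotating the certificate then means editing a single file.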
1.7 Serving Static Content

Problem
You need to serve static content with NGINX.

Solution
Overwrite the default HTTP server configuration located in /etc/nginx/conf.d/default.conf with the following NGINX configuration example:

    server {
        listen 80 default_server;
        server_name www.example.com;

        location / {
            root /usr/share/nginx/html;
            # alias /usr/share/nginx/html;
            index index.html index.htm;
        }
    }

Discussion
This configuration serves static files over HTTP on port 80 from the directory /usr/share/nginx/html/. The first line in this configuration defines a new server block. This defines a new context that specifies what NGINX listens for. Line two instructs NGINX to listen on port 80, and the default_server parameter instructs NGINX to use this server as the default context for port 80. The listen directive can also take a range of ports. The server_name directive defines the hostname or the names of requests that should be directed to this server. If the configuration had not defined this context as the default_server, NGINX would direct requests to this server only if the HTTP host header matched the value provided to the server_name directive. With the default_server context set, you can omit the server_name directive if you do not yet have a domain name to use.

The location block defines a configuration based on the path in the URL. The path, or portion of the URL after the domain, is referred to as the uniform resource identifier (URI). NGINX will best match the URI requested to a location block. The example uses / to match all requests. The root directive shows NGINX where to look for static files when serving content for the given context. The URI of the request is appended to the root directive's value when looking for the requested file. If we had provided a URI prefix to the location directive, this would be included in the appended path, unless we used the alias directive rather than root; the sketch after this discussion illustrates the difference. The location directive is able to match a wide range of expressions. Visit the first link in the "See Also" section for more information. Finally, the index directive provides NGINX with a default file, or list of files to check, in the event that no further path is provided in the URI.
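The root-versus-alias difference is easiest to see with a prefixed location. The following sketch (the /downloads/ prefix and paths are illustrative, not from the recipe) shows the two alternatives; only one would be used at a time:

    # With root, the full URI is appended to the value:
    # GET /downloads/file.txt -> /usr/share/nginx/html/downloads/file.txt
    location /downloads/ {
        root /usr/share/nginx/html;
    }

    # With alias, the matching location prefix is replaced by the value:
    # GET /downloads/file.txt -> /usr/share/nginx/html/file.txt
    location /downloads/ {
        alias /usr/share/nginx/html/;
    }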
See Also
NGINX HTTP location Directive Documentation
NGINX Request Processing

CHAPTER 2
High-Performance Load Balancing

2.0 Introduction

Today's internet user experience demands performance and uptime. To achieve this, multiple copies of the same system are run, and the load is distributed over them. This architecture technique is called horizontal scaling. Software-based infrastructure is increasing in popularity because of its flexibility, opening up a vast world of possibilities. Whether the use case is as small as a set of two system copies for high availability, or as large as thousands around the globe, there's a need for a load-balancing solution that is as dynamic as the infrastructure. NGINX fills this need in a number of ways, such as HTTP, transmission control protocol (TCP), and user datagram protocol (UDP) load balancing, which we cover in this chapter.

When balancing load, it's important that the impact to the client's experience is entirely positive. Many modern web architectures employ stateless application tiers, storing state in shared memory or databases. However, this is not the reality for all. Session state is immensely valuable and vastly used in interactive applications. This state might be stored locally to the application server for a number of reasons; for example, in applications for which the data being worked is so large that network overhead is too expensive in performance. When state is stored locally to an application server, it is extremely important to the user experience that the subsequent requests continue to be delivered to the same server. Another facet of the situation is that servers should not be released until the session has finished. Working with stateful applications at scale requires an intelligent load balancer. NGINX offers multiple ways to solve this problem by tracking cookies or routing. This chapter covers session persistence as it pertains to load balancing with NGINX.

It's important to ensure that the application that NGINX is serving is healthy. Upstream requests may begin to fail for a number of reasons. It could be because of network connectivity, server failure, or application failure, to name a few. Proxies and load balancers must be smart enough to detect failure of upstream servers (servers behind the load balancer or proxy) and stop passing traffic to them; otherwise, the client will be waiting, only to be delivered a timeout. A way to mitigate service degradation when a server fails is to have the proxy check the health of the upstream servers. NGINX offers two different types of health checks: passive, available in NGINX Open Source; and active, available only in NGINX Plus. Active health checks at regular intervals will make a connection or request to the upstream server, and can verify that the response is correct. Passive health checks monitor the connection or responses of the upstream server as clients make the request or connection. You might want to use passive health checks to reduce the load of your upstream servers, and you might want to use active health checks to determine failure of an upstream server before a client is served a failure. The tail end of this chapter examines monitoring the health of the upstream application servers for which you're load balancing.

2.1 HTTP Load Balancing

Problem
You need to distribute load between two or more HTTP servers.

Solution
Use NGINX's HTTP module to load balance over HTTP servers using the upstream block:

    upstream backend {
        server 10.10.12.45:80 weight=1;
        server app.example.com:80 weight=2;
        server spare.example.com:80 backup;
    }

    server {
        location / {
            proxy_pass http://backend;
        }
    }

This configuration balances load across two HTTP servers on port 80, and defines one as a backup, which is used when the two primary servers are unavailable. The optional weight parameter instructs NGINX to pass twice as many requests to the second server. When not used, the weight parameter defaults to 1.

Discussion
The HTTP upstream module controls the load balancing for HTTP requests. This module defines a pool of destinations: any combination of Unix sockets, IP addresses, and server hostnames, or a mix. The upstream module also defines how any individual request is assigned to any of the upstream servers. Each upstream destination is defined in the upstream pool by the server directive. Along with the address of the upstream server, the server directive also takes optional parameters. The optional parameters give more control over the routing of requests. These parameters include the weight of the server in the balancing algorithm; whether the server is in standby mode, available, or unavailable; and how to determine if the server is unavailable. NGINX Plus provides a number of other convenient parameters, like connection limits to the server, advanced DNS resolution control, and the ability to slowly ramp up connections to a server after it starts.
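The sketch below combines several of those server parameters in one pool; the hostnames and values are illustrative only, and the parameters shown are documented upstream-module options rather than part of this recipe:

    upstream backend {
        server app1.example.com:80 weight=3 max_fails=2 fail_timeout=10s;
        server app2.example.com:80 max_conns=500;
        server spare.example.com:80 backup;   # used only when the primaries are unavailable
        server old.example.com:80 down;       # kept in the config, but taken out of rotation
    }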
A number of options that alter the properties of the reverse proxy of the TCP connection are available for configuration. Some of these include SSL/TLS validation limitations, timeouts, and keepalives. Some of the values of these proxy options can 12 | Chapter 2: High-Performance Load Balancing be (or contain) variables, such as the download rate and the name used to verify an SSL/TLS certificate. The upstream for TCP load balancing is much like the upstream for HTTP, in that it defines upstream resources as servers, configured with Unix socket, IP, or FQDN, as well as server weight, maximum number of connections, DNS resolvers, connection ramp-up periods, and if the server is active, down, or in backup mode. NGINX Plus offers even more features for TCP load balancing. These advanced features can be found throughout this book. Health checks for all load balancing will be covered later in this chapter. 2.3 UDP Load Balancing Problem You need to distribute load between two or more UDP servers. Solution Use NGINX’s stream module to load balance over UDP servers using the upstream block defined as udp: stream { upstream ntp { server ntp1.example.com:123 weight=2; server ntp2.example.com:123; } server { listen 123 udp; proxy_pass ntp; } } This section of configuration balances load between two upstream network time protocol (NTP) servers using the UDP protocol. Specifying UDP load balancing is as simple as using the udp parameter on the listen directive. If the service over which you’re load balancing requires multiple packets to be sent back and forth between client and server, you can specify the reuseport parameter. Examples of these types of services are OpenVPN, Voice over Internet Protocol (VoIP), virtual desktop solutions, and Datagram Transport Layer Security (DTLS). The following is an example of using NGINX to handle OpenVPN connections and proxy them to the OpenVPN service running locally: 2.3 UDP Load Balancing | 13 stream { server { listen 1195 udp reuseport; proxy_pass 127.0.0.1:1194; } } Discussion You might ask, “Why do I need a load balancer when I can have multiple hosts in a DNS A or service record (SRV record)?” The answer is that not only are there alternative balancing algorithms with which we can balance, but we can also load balance the DNS servers themselves. UDP services make up a lot of the services that we depend on in networked systems, such as DNS, NTP, QUIC, HTTP/3, and VoIP. UDP load balancing might be less common to some, but it’s just as useful in the world of scale. You can find UDP load balancing in the stream module, just like TCP, and configure it mostly in the same way. The main difference is that the listen directive specifies that the open socket is for working with datagrams. When working with datagrams, there are some other directives that might apply where they would not in TCP, such as the proxy_responses directive, which specifies to NGINX how many expected responses can be sent from the upstream server. By default, this is unlimited until the proxy_timeout limit is reached. The proxy_timeout directive sets the time between two successive read-or-write operations on client or proxied server connections before the connection is closed. The reuseport parameter instructs NGINX to create an individual listening socket for each worker process. This allows the kernel to distribute incoming connections between worker processes to handle multiple packets being sent between client and server. 
2.3 UDP Load Balancing

Problem
You need to distribute load between two or more UDP servers.

Solution
Use NGINX's stream module to load balance over UDP servers using the upstream block defined as udp:

    stream {
        upstream ntp {
            server ntp1.example.com:123 weight=2;
            server ntp2.example.com:123;
        }

        server {
            listen 123 udp;
            proxy_pass ntp;
        }
    }

This section of configuration balances load between two upstream network time protocol (NTP) servers using the UDP protocol. Specifying UDP load balancing is as simple as using the udp parameter on the listen directive.

If the service over which you're load balancing requires multiple packets to be sent back and forth between client and server, you can specify the reuseport parameter. Examples of these types of services are OpenVPN, Voice over Internet Protocol (VoIP), virtual desktop solutions, and Datagram Transport Layer Security (DTLS). The following is an example of using NGINX to handle OpenVPN connections and proxy them to the OpenVPN service running locally:

    stream {
        server {
            listen 1195 udp reuseport;
            proxy_pass 127.0.0.1:1194;
        }
    }

Discussion
You might ask, "Why do I need a load balancer when I can have multiple hosts in a DNS A or service record (SRV record)?" The answer is that not only are there alternative balancing algorithms with which we can balance, but we can also load balance the DNS servers themselves. UDP services make up a lot of the services that we depend on in networked systems, such as DNS, NTP, QUIC, HTTP/3, and VoIP. UDP load balancing might be less common to some, but it's just as useful in the world of scale.

You can find UDP load balancing in the stream module, just like TCP, and configure it mostly in the same way. The main difference is that the listen directive specifies that the open socket is for working with datagrams. When working with datagrams, there are some other directives that might apply where they would not in TCP, such as the proxy_responses directive, which specifies to NGINX how many expected responses can be sent from the upstream server. By default, this is unlimited until the proxy_timeout limit is reached. The proxy_timeout directive sets the time between two successive read-or-write operations on client or proxied server connections before the connection is closed.

The reuseport parameter instructs NGINX to create an individual listening socket for each worker process. This allows the kernel to distribute incoming connections between worker processes to handle multiple packets being sent between client and server. The reuseport feature works only on Linux kernels 3.9 and higher, DragonFly BSD, and FreeBSD 12 and higher.
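As a minimal sketch of those datagram-specific directives (the addresses and values are illustrative, not part of the recipe), a DNS-style service in which each client packet expects a single reply might be proxied like this:

    stream {
        upstream dns_servers {
            server 10.0.0.53:53;
            server 10.0.0.54:53;
        }

        server {
            listen 53 udp;
            proxy_pass dns_servers;
            proxy_responses 1;    # each client datagram expects exactly one response
            proxy_timeout 20s;    # close the session after 20 seconds without activity
        }
    }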
2.4 Load-Balancing Methods

Problem
Round-robin load balancing doesn't fit your use case because you have heterogeneous workloads or server pools.

Solution
Use one of NGINX's load-balancing methods, such as least connections, least time, generic hash, random, or IP hash. This example sets the load-balancing algorithm for the backend upstream pool to choose the server with the least amount of connections:

    upstream backend {
        least_conn;
        server backend.example.com;
        server backend1.example.com;
    }

All load-balancing algorithms, with the exception of generic hash, random, and least time, are standalone directives, such as the preceding example. The parameters to these directives are explained in the following discussion.

The next example uses the generic hash algorithm with the $remote_addr variable. This example renders the same routing algorithm as IP hash; however, generic hash works in the stream context, whereas IP hash is only available in the http context. You can replace the variable used, or add more to alter the way the generic hash algorithm distributes load. The following is an example of an upstream block configured to use the client's IP address with the generic hash algorithm:

    upstream backend {
        hash $remote_addr;
        server backend.example.com;
        server backend1.example.com;
    }

Discussion
Not all requests or packets carry equal weight. Given this, round robin, or even the weighted round robin used in previous examples, will not fit the need of all applications or traffic flow. NGINX provides a number of load-balancing algorithms that you can use to fit particular use cases. In addition to being able to choose these load-balancing algorithms or methods, you can also configure them. The following load-balancing methods, with the exception of IP hash, are available for upstream HTTP, TCP, and UDP pools. A sketch of the parameterized methods follows this list.

Round robin
This is the default load-balancing method, which distributes requests in the order of the list of servers in the upstream pool. You can also take weight into consideration for a weighted round robin, which you can use if the capacity of the upstream servers varies. The higher the integer value for the weight, the more favored the server will be in the round robin. The algorithm behind weight is simply the statistical probability of a weighted average.

Least connections
This method balances load by proxying the current request to the upstream server with the least number of open connections. Least connections, like round robin, also takes weights into account when deciding which server to send the connection to. The directive name is least_conn.

Least time
Available only in NGINX Plus, least time is akin to least connections in that it proxies to the upstream server with the least number of current connections, but favors the servers with the lowest average response times. This method is one of the most sophisticated load-balancing algorithms and fits the needs of highly performant web applications. This algorithm is a value-add over least connections because a small number of connections does not necessarily mean the quickest response. When using this algorithm, it is important to take into consideration the statistical variance of services' request times. Some requests may naturally take more processing and thus have a longer request time, increasing the range of the statistic. Long request times do not always mean a less performant or overworked server. However, requests that require more processing may be candidates for asynchronous workflows. A parameter of header or last_byte must be specified for this directive. When header is specified, the time to receive the response header is used. When last_byte is specified, the time to receive the full response is used. If the inflight parameter is specified, incomplete requests are also taken into account. The directive name is least_time.

Generic hash
The administrator defines a hash with the given text, variables of the request or runtime, or both. NGINX distributes the load among the servers by producing a hash for the current request and placing it against the upstream servers. This method is very useful when you need more control over where requests are sent or for determining which upstream server most likely will have the data cached. Note that when a server is added or removed from the pool, the hashed requests will be redistributed. This algorithm has an optional parameter, consistent, to minimize the effect of redistribution. The directive name is hash.

Random
This method is used to instruct NGINX to select a random server from the group, taking server weights into consideration. The optional two [method] parameter directs NGINX to randomly select two servers and then use the provided load-balancing method to balance between those two. By default the least_conn method is used if two is passed without a method. The directive name for random load balancing is random.

IP hash
This method works only for HTTP. IP hash uses the client IP address as the hash. Slightly different from using the remote variable in a generic hash, this algorithm uses the first three octets of an IPv4 address or the entire IPv6 address. This method ensures that clients are proxied to the same upstream server as long as that server is available, which is extremely helpful when the session state is of concern and not handled by shared memory of the application. This method also takes the weight parameter into consideration when distributing the hash. The directive name is ip_hash.
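To illustrate the parameterized methods described above, the following sketch (server names are illustrative) shows a least-time pool measured to the first response header and a random pool that picks two candidates and then applies least connections; least_time is available in NGINX Plus only:

    # NGINX Plus: favor the server with the fastest time to the response header
    upstream backend_least_time {
        least_time header;
        server app1.example.com;
        server app2.example.com;
    }

    # Choose two servers at random, then pick between them by least connections
    upstream backend_random {
        random two least_conn;
        server app1.example.com;
        server app2.example.com;
        server app3.example.com;
    }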
2.5 Sticky Cookie with NGINX Plus

Problem
You need to bind a downstream client to an upstream server using NGINX Plus.

Solution
Use the sticky cookie directive to instruct NGINX Plus to create and track a cookie:

    upstream backend {
        server backend1.example.com;
        server backend2.example.com;
        sticky cookie affinity expires=1h domain=.example.com
            httponly secure path=/;
    }

This configuration creates and tracks a cookie that ties a downstream client to an upstream server. In this example, the cookie is named affinity, is set for example.com, expires in an hour, cannot be consumed client-side, can be sent only over HTTPS, and is valid for all paths.

Discussion
Using the cookie parameter on the sticky directive creates a cookie on the first request that contains information about the upstream server. NGINX Plus tracks this cookie, enabling it to continue directing subsequent requests to the same server. The first positional parameter to the cookie parameter is the name of the cookie to be created and tracked. Other parameters offer additional control, informing the browser of the appropriate usage, like the expiry time, domain, path, and whether the cookie can be consumed client-side or whether it can be passed over unsecure protocols. Cookies are a part of the HTTP protocol, and therefore sticky cookie only works in the http context.

2.6 Sticky Learn with NGINX Plus

Problem
You need to bind a downstream client to an upstream server by using an existing cookie with NGINX Plus.

Solution
Use the sticky learn directive to discover and track cookies that are created by the upstream application:

    upstream backend {
        server backend1.example.com:8080;
        server backend2.example.com:8081;
        sticky learn
            create=$upstream_cookie_cookiename
            lookup=$cookie_cookiename
            zone=client_sessions:1m;
    }

This example instructs NGINX to look for and track sessions by looking for a cookie named COOKIENAME in response headers, and looking up existing sessions by looking for the same cookie on request headers. This session affinity is stored in a shared memory zone of one megabyte, which can track approximately 4,000 sessions. The name of the cookie will always be application-specific. Commonly used cookie names, such as jsessionid or phpsessionid, are typically defaults set within the application or the application server configuration.

Discussion
When applications create their own session-state cookies, NGINX Plus can discover them in request responses and track them. This type of cookie tracking is performed when the sticky directive is provided the learn parameter. Shared memory for tracking cookies is specified with the zone parameter, with a name and size. NGINX Plus is directed to look for cookies in the response from the upstream server via specification of the create parameter, and it searches for prior registered server affinity using the lookup parameter. The values of these parameters are variables exposed by the HTTP module.
The client is first routed to an upstream server based on the route specified, and then subsequent requests will carry the routing information in a cookie or the URI. sticky route takes a number 2.7 Sticky Routing with NGINX Plus | 19 of positional parameters that are evaluated. The first nonempty variable is used to route to a server. map blocks can be used to selectively parse variables and save them as other variables to be used in the routing. Essentially, the sticky route directive creates a session within the NGINX Plus shared memory zone for tracking any client session identifier you specify to the upstream server, consistently delivering requests with this session identifier to the same upstream server as its original request. 2.8 Connection Draining with NGINX Plus Problem You need to gracefully remove servers for maintenance or other reasons, while still serving sessions with NGINX Plus. Solution Use the drain parameter through the NGINX Plus API, described in more detail in Chapter 5, to instruct NGINX to stop sending new connections that are not already tracked: $ curl -X POST -d '{"drain":true}' \ 'http://nginx.local/api/9/http/upstreams/backend/servers/0' { "id":0, "server":"172.17.0.3:80", "weight":1, "max_conns":0, "max_fails":1, "fail_timeout":"10s", "slow_start":"0s", "route":"", "backup":false, "down":false, "drain":true } Discussion When session state is stored locally to a server, connections and persistent sessions must be drained before the server is removed from the pool. Draining connections is the process of letting sessions to a server expire natively before removing the server from the upstream pool. You can configure draining for a particular server by adding the drain parameter to the server directive. When the drain parameter is set, NGINX Plus stops sending new sessions to this server but allows current sessions to continue being served for the length of their session. You can also toggle this 20 | Chapter 2: High-Performance Load Balancing configuration by adding the drain parameter to an upstream server directive, then reloading the NGINX Plus configuration. 2.9 Passive Health Checks Problem You need to passively check the health of upstream servers to ensure that they’re successfully serving proxied traffic. Solution Use NGINX health checks with load balancing to ensure that only healthy upstream servers are utilized: upstream backend { server backend1.example.com:1234 max_fails=3 fail_timeout=3s; server backend2.example.com:1234 max_fails=3 fail_timeout=3s; } This configuration passively monitors upstream health by monitoring the response of client requests directed to the upstream server. The example sets the max_fails directive to three and fail_timeout to three seconds. These directive parameters work the same way in both stream and HTTP servers. Discussion Passive health checking is available in NGINX Open Source, and it is configured by using the same server parameters for HTTP, TCP, and UDP load balancing. Passive monitoring watches for failed or timed-out connections as they pass through NGINX as requested by a client. Passive health checks are enabled by default; the parameters mentioned here allow you to tweak their behavior. The default max_fails value is 1, and the default fail_timeout value is 10s. Monitoring for health is important on all types of load balancing, not only from a user-experience standpoint but also for business continuity. 
2.9 Passive Health Checks

Problem
You need to passively check the health of upstream servers to ensure that they're successfully serving proxied traffic.

Solution
Use NGINX health checks with load balancing to ensure that only healthy upstream servers are utilized:

    upstream backend {
        server backend1.example.com:1234 max_fails=3 fail_timeout=3s;
        server backend2.example.com:1234 max_fails=3 fail_timeout=3s;
    }

This configuration passively monitors upstream health by monitoring the response of client requests directed to the upstream server. The example sets the max_fails directive to three and fail_timeout to three seconds. These directive parameters work the same way in both stream and HTTP servers.

Discussion
Passive health checking is available in NGINX Open Source, and it is configured by using the same server parameters for HTTP, TCP, and UDP load balancing. Passive monitoring watches for failed or timed-out connections as they pass through NGINX as requested by a client. Passive health checks are enabled by default; the parameters mentioned here allow you to tweak their behavior. The default max_fails value is 1, and the default fail_timeout value is 10s. Monitoring for health is important on all types of load balancing, not only from a user-experience standpoint but also for business continuity. NGINX passively monitors upstream HTTP, TCP, and UDP servers to ensure that they're healthy and performing.

See Also
HTTP Health Checks Admin Guide
TCP Health Checks Admin Guide
UDP Health Checks Admin Guide

2.10 Active Health Checks with NGINX Plus

Problem
You need to actively check your upstream servers for health with NGINX Plus to ensure that they're ready to serve proxied traffic.

Solution
For HTTP, use the health_check directive in a location block:

    http {
        server {
            # ...
            location / {
                proxy_pass http://backend;
                health_check interval=2s
                    fails=2
                    passes=5
                    uri=/
                    match=welcome;
            }
        }
        # status is 200, content type is "text/html",
        # and body contains "Welcome to nginx!"
        match welcome {
            status 200;
            header Content-Type = text/html;
            body ~ "Welcome to nginx!";
        }
    }

This health-check configuration for HTTP servers checks the health of the upstream servers by making an HTTP GET request to the URI "/" every two seconds. The HTTP method can't be defined for health checks; only GET requests are performed, as other methods may change the state of backend systems. The upstream servers must pass five consecutive health checks to be considered healthy. An upstream server is considered unhealthy after failing two consecutive checks and is taken out of the pool. The response from the upstream server must match the defined match block, which defines the status code as 200, the header Content-Type value as text/html, and the string "Welcome to nginx!" in the response body. The HTTP match block has three directives: status, header, and body. All three of these directives have comparison flags as well.

Stream health checks for TCP/UDP services are very similar:

    stream {
        # ...
        server {
            listen 1234;
            proxy_pass stream_backend;
            health_check interval=10s passes=2 fails=3;
            health_check_timeout 5s;
        }
        # ...
    }

In this example, a TCP server is configured to listen on port 1234, and to proxy to an upstream set of servers, for which it actively checks for health. The stream health_check directive takes all the same parameters as in HTTP, with the exception of uri, and the stream version has a parameter to switch the check protocol to udp. In this example, the interval is set to 10 seconds, requires two passes to be considered healthy, and requires three fails to be considered unhealthy. The active-stream health check is also able to verify the response from the upstream server. The match block for stream servers, however, has just two directives: send and expect. The send directive is raw data to be sent, and expect is an exact response or a regular expression to match.

Discussion
In NGINX Plus, passive or active health checks can be used to monitor the source servers. These health checks can measure more than just the response code. In NGINX Plus, active HTTP health checks monitor based on a number of acceptance criteria of the response from the upstream server. You can configure active health-check monitoring for how often upstream servers are checked, how many times a server must pass this check to be considered healthy, how many times it can fail before being deemed unhealthy, and what the expected result should be. For more complex logic, a require directive for the match block enables the use of variables whose value must not be empty or zero. The match parameter points to a match block that defines the acceptance criteria for the response. The match block also defines the data to send to the upstream server when used in the stream context for TCP/UDP. These features enable NGINX to ensure that upstream servers are healthy at all times.
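A minimal sketch of that require usage follows; the $health variable here is hypothetical and assumed to be populated elsewhere (for example, by the njs module inspecting the health-check response), so the block is illustrative rather than a drop-in configuration:

    match backend_ok {
        status 200;
        # The check passes only if every listed variable is nonempty and not "0"
        require $health;
    }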
The match block also defines the data to send to the upstream server when used in the stream context for TCP/UDP. These features enable NGINX to ensure that upstream servers are healthy at all times. See Also HTTP Health Checks Admin Guide TCP Health Checks Admin Guide UDP Health Checks Admin Guide 2.10 Active Health Checks with NGINX Plus | 23 2.11 Slow Start with NGINX Plus Problem Your application needs to ramp up before taking on full production load. Solution Use the slow_start parameter on the server directive to gradually increase the number of connections over a specified time as a server is reintroduced to the upstream load-balancing pool: upstream { zone backend 64k; server server1.example.com slow_start=20s; server server2.example.com slow_start=15s; } The server directive configurations will slowly ramp up traffic to the upstream servers after they’re reintroduced to the pool. server1 will slowly ramp up its number of connections over 20 seconds, and server2 will ramp up over 15 seconds. Discussion Slow start is the concept of slowly ramping up the number of requests proxied to a server over a period of time. Slow start allows the application to warm up by populating caches, initiating database connections without being overwhelmed by connections as soon as it starts. This feature takes effect when a server that has failed health checks begins to pass again and re-enters the load-balancing pool, and it is only available in NGINX Plus. Slow start can’t be used with hash, IP hash, or random load-balancing methods. 24 | Chapter 2: High-Performance Load Balancing CHAPTER 3 Traffic Management 3.0 Introduction NGINX is also classified as a web-traffic controller. You can use NGINX to intelli‐ gently route traffic and control flow based on many attributes. This chapter covers NGINX’s ability to split client requests based on percentages; utilize the geographical location of the clients; and control the flow of traffic in the form of rate, connection, and bandwidth limiting. As you read through this chapter, keep in mind that you can mix and match these features to enable countless possibilities. 3.1 A/B Testing Problem You need to split clients between two or more versions of a file or application to test acceptance or engagement. Solution Use the split_clients module to direct a percentage of your clients to a different upstream pool: split_clients "${remote_addr}AAA" $variant { 20.0% "backendv2"; * "backendv1"; } The split_clients directive hashes the string provided by you as the first parameter and divides that hash by the percentages provided to map the value of a variable provided as the second parameter. The addition of AAA to the first parameter is to 25 demonstrate that this is a concatenated string that can include many variables, as mentioned in the generic hash load-balancing algorithm. The third parameter is an object containing key-value pairs where the key is the percentage weight and the value is the value to be assigned. The key can be either a percentage or an asterisk. The asterisk denotes the rest of the whole after all percentages are taken. The value of the $variant variable will be backendv2 for 20% of client IP addresses and backendv1 for the remaining 80%. In this example, backendv1 and backendv2 represent upstream server pools and can be used with the proxy_pass directive as such: location / { proxy_pass http://$variant } Using the variable $variant, our traffic will be split between two different application server pools. 
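Note that backendv1 and backendv2 must also be declared as upstream blocks for the proxy_pass above to resolve; a minimal sketch with hypothetical server addresses follows:

upstream backendv1 {
    server 10.0.0.10:80;
    server 10.0.0.11:80;
}

upstream backendv2 {
    server 10.0.0.20:80;
}

Because proxy_pass references a variable, NGINX resolves the name at request time: it first searches the declared upstream groups and only falls back to DNS (which would require a resolver directive) when no group matches.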
To demonstrate the wide variety of uses split_clients can have, the following is an example of splitting between two versions of a static site: http { split_clients "${remote_addr}" $site_root_folder { 33.3% "/var/www/sitev2/"; * "/var/www/sitev1/"; } server { listen 80 _; root $site_root_folder; location / { index index.html; } } } Discussion This type of A/B testing is useful when testing different types of marketing and front‐ end features for conversion rates on ecommerce sites. It’s common for applications to use a type of deployment called canary release. In this type of deployment, traffic is slowly switched over to the new version by gradually increasing the percentage of users being routed to the new version. Splitting your clients between different versions of your application can be useful when rolling out new versions of code, to limit the blast radius in the event of an error. Even more common is the blue-green deployment style, where users are cut over to a new version and the old version is still available while the deployment is validated. Whatever the reason for splitting clients between two different application sets, NGINX makes this simple because of the split_clients module. 26 | Chapter 3: Traffic Management See Also split_clients Module Documentation 3.2 Using the GeoIP Module and Database Problem You need to install the GeoIP database and enable its embedded variables within NGINX to utilize the physical location of your clients in the NGINX log, proxied requests, or request routing. Solution The official NGINX Open Source package repository, configured in Chapter 2 when installing NGINX, provides a package named nginx-module-geoip. When using the NGINX Plus package repository, this package is named nginx-plus-modulegeoip. NGINX Plus, however, offers a dynamic module for GeoIP2, which is an updated module that works with NGINX stream as well as HTTP. The GeoIP2 module will be explained later in this section. The following examples show how to install the dynamic NGINX GeoIP module package, as well as how to download the GeoIP country and city databases. It’s important to note that the databases for the original GeoIP module are no longer maintained. NGINX Open Source with YUM package manager: $ yum install nginx-module-geoip NGINX Open Source with APT package manager: $ apt install nginx-module-geoip NGINX Plus with YUM package manager: $ yum install nginx-plus-module-geoip NGINX Plus with APT package manager: $ apt install nginx-plus-module-geoip Download the GeoIP country and city databases and unzip them: $ mkdir /etc/nginx/geoip $ cd /etc/nginx/geoip $ wget "http://geolite.maxmind.com/\ download/geoip/database/GeoLiteCountry/GeoIP.dat.gz" $ gunzip GeoIP.dat.gz $ wget "http://geolite.maxmind.com/\ download/geoip/database/GeoLiteCity.dat.gz" $ gunzip GeoLiteCity.dat.gz 3.2 Using the GeoIP Module and Database | 27 This set of commands creates a geoip directory in the /etc/nginx directory, moves to this new directory, and downloads and unzips the packages. With the GeoIP database for countries and cities on the local disk, you can now instruct the NGINX GeoIP module to use them to expose embedded variables based on the client IP address: load_module modules/ngx_http_geoip_module.so; http { geoip_country /etc/nginx/geoip/GeoIP.dat; geoip_city /etc/nginx/geoip/GeoLiteCity.dat; #... } The load_module directive dynamically loads the module from its path on the filesys‐ tem. The load_module directive is only valid in the main context. 
The geoip_country directive takes a path to the GeoIP.dat file containing the database that maps IP addresses to country codes and is valid only in the http context. The GeoIP2 module is available as a dynamic module and can be compiled at build time of NGINX Open Source. Compiling NGINX Open Source with the GeoIP2 module is outside the scope of this book. The following material will explain how to install the nginx-plus-module-geoip2 module. Install the NGINX Plus dynamic module for GeoIP2 with APT package manager: $ apt install nginx-plus-module-geoip2 Install the NGINX Plus dynamic module for GeoIP2 with YUM package manager: $ yum install nginx-plus-module-geoip2 Load the dynamic module for GeoIP2: load_module modules/ngx_http_geoip2_module.so; load_module modules/ngx_stream_geoip2_module.so; http { #... } Download the free MaxMind GeoLite2 database (requires sign up). Then configure NGINX to use the databases and expose Geo information through variables: http {... geoip2 /etc/maxmind-country.mmdb { auto_reload 5m; $geoip2_metadata_country_build metadata build_epoch; $geoip2_data_country_code default=US source=$variable_with_ip country iso_code; $geoip2_data_country_name country names en; 28 | Chapter 3: Traffic Management } geoip2 /etc/maxmind-city.mmdb { $geoip2_data_city_name default=London city names en; }.... fastcgi_param COUNTRY_CODE $geoip2_data_country_code; fastcgi_param COUNTRY_NAME $geoip2_data_country_name; fastcgi_param CITY_NAME $geoip2_data_city_name;.... } stream {... geoip2 /etc/maxmind-country.mmdb { $geoip2_data_country_code default=US source=$remote_addr country iso_code; }... } Discussion To use this functionality, you must have the NGINX GeoIP or GeoIP2 module installed and a local GeoIP country and city database. Installation and retrieval of these prerequisites was demonstrated in this section. In the original GeoIP module, geoip_country and geoip_city directives expose a number of embedded variables available in this module. The geoip_country direc‐ tive enables variables that allow you to distinguish the country of origin of your client. These variables include $geoip_country_code, $geoip_country_code3, and $geoip_country_name. The country code variable returns the two-letter country code, and the variable with a 3 at the end returns the three-letter country code. The country name variable returns the full name of the country. The geoip_city directive enables quite a few variables. The geoip_city directive enables all the same variables as the geoip_country directive, just with differ‐ ent names, such as $geoip_city_country_code, $geoip_city_country_code3, and $geoip_city_country_name. Other variables include $geoip_city, $geoip_latitude, $geoip_longitude, $geoip_city_continent_code, and $geoip_postal_code, all of which are descriptive of the value they return. $geoip_region and $geoip_region_ name describe the region, territory, state, province, federal land, and the like. Region is the two-letter code, whereas region name is the full name. $geoip_area_code, only valid in the US, returns the three-digit telephone area code. When using the GeoIP2 module, the geoip2 directive exposes the same variables, but the prefix of the variables is geoip2_data_ rather than geoip_. The geoip2 directive 3.2 Using the GeoIP Module and Database | 29 also configures the defaults, and the interval in which the database is reloaded from MaxMind. With these variables, you’re able to log information about your client. 
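As a brief sketch of the logging case, the following defines a custom log format that records the country and city variables configured above; the format name and log path are arbitrary choices:

http {
    # Assumes the geoip2 blocks shown earlier have populated these variables
    log_format geoip_combined '$remote_addr - $remote_user [$time_local] '
                              '"$request" $status $body_bytes_sent '
                              'country=$geoip2_data_country_code '
                              'city=$geoip2_data_city_name';
    access_log /var/log/nginx/access.log geoip_combined;
}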
You could optionally pass this information to your application as a header or variable, or use NGINX to route your traffic in particular ways. See Also geoip Module Documentation GeoIP Update GitHub NGINX GeoIP2 Dynamic Module Documentation Source Repository for GeoIP2 with Documentation on GitHub 3.3 Restricting Access Based on Country Problem You need to restrict access from particular countries for contractual or application requirements. Solution Map the country codes you want to block or allow to a variable: load_module modules/ngx_http_geoip2_module.so; http { geoip2 /etc/maxmind-country.mmdb { auto_reload 5m; $geoip2_metadata_country_build metadata build_epoch; $geoip2_data_country_code default=US source=$variable_with_ip country iso_code; $geoip2_data_country_name country names en; } map $geoip2_data_country_code $country_access { "US" 0; default 1; } #... } This mapping will set a new variable, $country_access, to 1 or 0. If the client IP address originates from the US, the variable will be set to 0. For any other country, the variable will be set to 1. 30 | Chapter 3: Traffic Management Now, within our server block, we’ll use an if statement to deny access to anyone not originating from the US: server { if ($country_access = '1') { return 403; } #... } This if statement will evaluate True if the $country_access variable is set to 1. When True, the server will return a 403 Unauthorized. Otherwise the server operates as normal. So this if block is only there to deny access to people who are not from the US. Discussion This is a short but simple example of how to only allow access from a couple of countries. This example can be expounded on to fit your needs. You can utilize this same practice to allow or block based on any of the embedded variables made available from the geoip2 module. 3.4 Finding the Original Client Problem You need to find the original client IP address because there are proxies in front of the NGINX server. Solution Use the geoip2_proxy directive to define your proxy IP address range and the geoip_proxy_recursive directive to look for the original IP: load_module "/usr/lib64/nginx/modules/ngx_http_geoip_module.so"; http { geoip2 /etc/maxmind-country.mmdb { auto_reload 5m; $geoip2_metadata_country_build metadata build_epoch; $geoip2_data_country_code default=US source=$variable_with_ip country iso_code; $geoip2_data_country_name country names en; } geoip2_proxy 10.0.16.0/26; geoip2_proxy_recursive on; #... } 3.4 Finding the Original Client | 31 The geoip2_proxy directive defines a classless inter-domain routing (CIDR) range in which our proxy servers live and instructs NGINX to utilize the X-Forwarded-For header to find the client IP address. The geoip2_proxy_recursive directive instructs NGINX to recursively look through the X-Forwarded-For header for the last client IP known. A header named Forwarded has become the standard header for adding proxy information for proxied requests. The header used by the NGINX GeoIP2 module is X-Forwarded-For and cannot be configured otherwise at the time of writing. While X-ForwardedFor is not an official standard, it is still very widely used, accepted, and set by most proxies. Discussion You may find that if you’re using a proxy in front of NGINX, NGINX will pick up the proxy’s IP address rather than the client’s. In this case, you can use the geoip2_proxy directive to instruct NGINX to use the X-Forwarded-For header when connections are opened from a given range. The geoip2_proxy directive takes an address or a CIDR range. 
When there are multiple proxies passing traffic in front of NGINX, you can use the geoip2_proxy_recursive directive to recursively search through X-Forwarded-For addresses to find the originating client. You will want to use some‐ thing like this when utilizing load balancers, such as Amazon Web Services Elastic Load Balancing (AWS ELB), Google Cloud Platform’s load balancer, or Microsoft Azure’s load balancer, in front of NGINX. 3.5 Limiting Connections Problem You need to limit the number of connections based on a predefined key, such as the client’s IP address. Solution Construct a shared memory zone to hold connection metrics, and use the limit_conn directive to limit open connections: 32 | Chapter 3: Traffic Management http { limit_conn_zone $binary_remote_addr zone=limitbyaddr:10m; limit_conn_status 429; #... server { #... limit_conn limitbyaddr 40; #... } } This configuration creates a shared memory zone named limitbyaddr. The pre‐ defined key used is the client’s IP address in binary form. The size of the shared memory zone is set to 10 MB. The limit_conn directive takes two param‐ eters: a limit_conn_zone name and the number of connections allowed. The limit_conn_status sets the response when the connections are limited to a status of 429, indicating too many requests. The limit_conn and limit_conn_status direc‐ tives are valid in the http, server, and location contexts. Discussion Limiting the number of connections based on a key can be used to defend against abuse and share your resources fairly across all your clients. It is important to be cautious with your predefined key. Using an IP address, as we do in the previous example, could be dangerous if many users are on the same network that originates from the same IP, such as when behind a network address translation (NAT). The entire group of clients will be limited. The limit_conn_zone directive is only valid in the http context. You can utilize any number of variables available to NGINX within the http context in order to build a string by which to limit. Utilizing a variable that can identify the user at the application level, such as a session cookie, may be a cleaner solution depending on the use case. The limit_conn_status defaults to 503 service unavailable. You may find it preferable to use a 429, as the service is available, and 500-level responses indicate server error, whereas 400-level responses indicate client error. Testing limitations can be tricky. It’s often hard to simulate live traffic in an alter‐ nate environment for testing. In this case, you can set the limit_conn_dry_run directive to on, then use the variable $limit_conn_status in your access log. The $limit_conn_status variable will evaluate to either PASSED, DELAYED, REJECTED, DELAYED_DRY_RUN, or REJECTED_DRY_RUN. With dry run enabled, you’ll be able to analyze the logs of live traffic and tweak your limits as needed before rejecting requests over the limit, providing you with assurance that your limit configuration is correct. 3.5 Limiting Connections | 33 3.6 Limiting Rate Problem You need to limit the rate of requests by a predefined key, such as the client’s IP address. Solution Utilize the rate-limiting module to limit the rate of requests: http { limit_req_zone $binary_remote_addr zone=limitbyaddr:10m rate=3r/s; limit_req_status 429; #... server { #... limit_req zone=limitbyaddr; #... } } This example configuration creates a shared memory zone named limitbyaddr. The predefined key used is the client’s IP address in binary form. 
The size of the shared memory zone is set to 10 MB. The zone sets the rate with a keyword argument. The limit_req directive takes a required keyword argument: zone. zone instructs the directive on which shared memory request–limit zone to use. Requests that exceed the expressed rate are returned a 429 HTTP code, as defined by the limit_req_ status directive. It’s advised to set a status in the 400-level range, as the default is a 503, implying a problem with the server, when the issue is actually with the client. Use optional keyword arguments to the limit_req directive to enable two-stage rate limiting: server { location / { limit_req zone=limitbyaddr burst=12 delay=9; } } In some cases, a client will need to make many requests all at once, and then it will reduce its rate for a period of time before making more. You can use the keyword argument burst to allow the client to exceed its rate limit but not have requests rejected. The rate-exceeded requests will have a delay in processing to match the rate limit up to the value configured. A set of keyword arguments alter this behavior: delay and nodelay. The nodelay argument does not take a value, and simply allows the client to consume the burstable value all at once; however, all requests will be 34 | Chapter 3: Traffic Management rejected until enough time has passed to satisfy the rate limit. In this example, if we used nodelay, the client could consume 12 requests in the first second, but would have to wait four seconds after the initial request to make another. The delay key‐ word argument defines how many requests can be made up front without throttling. In this case, the client can make nine requests up front with no delay, the next three will be throttled, and any more within a four-second period will be rejected. Discussion The rate-limiting module is very powerful for protecting against abusive rapid requests, while still providing a quality service to everyone. There are many reasons to limit rate of request, one being security. You can deny a brute-force attack by putting a very strict limit on your login page. You can set a reasonable limit on all requests, thereby disabling the plans of malicious users who might try to deny service to your application or to waste resources. The configuration of the rate-limiting module is much like the connection-limiting module described in Recipe 3.5, and many of the same concerns apply. You can specify the rate at which requests are limited in requests per second or requests per minute. When the rate limit is reached, the incident is logged. There’s also a directive not included in the example, limit_req_log_level, which defaults to error, but can be set to info, notice, or warn. In NGINX Plus, rate limiting is now cluster-aware (see Recipe 12.5 for a zone sync example). Testing limitations can be tricky. It’s often hard to simulate live traffic in an alter‐ native environment for testing. In this case, you can set the limit_conn_dry_run directive to on, then use the variable $limit_conn_status in your access log. The $limit_conn_status variable will evaluate to PASSED, REJECTED, or REJECTED_DRY_RUN. With dry run enabled, you’ll be able to analyze the logs of live traffic and tweak your limits as needed before rejecting requests over the limit, providing you with assurance that your limit configuration is correct. 3.7 Limiting Bandwidth Problem You need to limit download bandwidth per client for your assets. 
Solution Utilize NGINX’s limit_rate and limit_rate_after directives to limit the band‐ width of response to a client: 3.7 Limiting Bandwidth | 35 location /download/ { limit_rate_after 10m; limit_rate 1m; } The configuration of this location block specifies that for URIs with the prefix down‐ load, the rate at which the response will be served to the client will be limited after 10 MB to a rate of 1 MB per second. The bandwidth limit is per connection, so you may want to institute a connection limit as well as a bandwidth limit where applicable. Discussion Limiting the bandwidth for particular connections enables NGINX to share its upload bandwidth across all of the clients in a manner you specify. These two directives do it all: limit_rate_after and limit_rate. The limit_rate_after directive can be set in almost any context: http, server, location, and if when the if is within a location. The limit_rate directive is applicable in the same con‐ texts as limit_rate_after; however, it can alternatively be set by a variable named $limit_rate. The limit_rate_after directive specifies that the connection should not be rate limited until after a specified amount of data has been transferred. The limit_rate directive specifies the rate limit for a given context in bytes per second by default. However, you can specify m for megabytes or g for gigabytes. Both directives default to a value of 0. The value 0 means not to limit download rates at all. This module allows you to programmatically change the rate limit of clients. 36 | Chapter 3: Traffic Management CHAPTER 4 Massively Scalable Content Caching 4.0 Introduction Caching accelerates content serving by storing responses to be served again in the future. By serving from its cache, NGINX reduces load on upstream servers by offloading them of expensive, repetitive work. Caching increases performance and reduces load, meaning you can serve faster with fewer resources. Caching also reduces the time and bandwidth it takes to serve resources. The scaling and distribution of caching servers in strategic locations can have a dramatic effect on user experience. It’s optimal to host content close to the consumer for the best performance. You can also cache your content close to your users. This is the pattern of content delivery networks, or CDNs. With NGINX you’re able to cache your content wherever you can place an NGINX server, effectively enabling you to create your own CDN. With NGINX caching, you’re also able to passively cache and serve cached responses in the event of an upstream failure. Caching features are only available within the http context. This chapter will cover NGINX’s caching and content delivery capabilities. 4.1 Caching Zones Problem You need to cache content and need to define where the cache is stored. Solution Use the proxy_cache_path directive to define shared memory-cache zones and a location for the content: 37 proxy_cache_path /var/nginx/cache keys_zone=main_content:60m levels=1:2 inactive=3h max_size=20g min_free=500m; proxy_cache CACHE; The cache definition example creates a directory for cached responses on the filesys‐ tem at /var/nginx/cache and creates a shared memory space named main_content with 60 MB of memory. This example sets the directory structure levels, defines the eviction of cached responses after they have not been requested in three hours, and defines a maximum size of the cache of 20 GB. 
The min_free parameter instructs NGINX on how much disk space to keep free of the max_size before evicting cached resources. The proxy_cache directive informs a particular context to use the cache zone. The proxy_cache_path is valid in the http context, and the proxy_cache directive is valid in the http, server, and location contexts. Discussion To configure caching in NGINX, it’s necessary to declare a path and zone to be used. A cache zone in NGINX is created with the proxy_cache_path directive. The proxy_cache_path directive designates a location to store the cached information and a shared memory space to store active keys and response metadata. Optional parameters to this directive provide more control over how the cache is maintained and accessed. The levels parameter defines how the directory structure is created. The value is a colon-separated list of numbers that defines the length of successive subdirectory names. Deeper structures help to avoid too many cached files appearing in a single directory. NGINX then stores the result in the file structure provided, using the cache key as a filepath and breaking up directories based on the levels value. The inactive parameter allows for control over the length of time a cache item will be hosted after its last use. The size of the cache is also configurable with the use of the max_size parameter. Other parameters relate to the cache-loading process, which loads the cache keys into the shared memory zone from the files cached on disk, along with many other options. For more information about the proxy_cache_path directive, find a link to the documentation in the following “See Also” section. See Also proxy_cache_path Documentation 38 | Chapter 4: Massively Scalable Content Caching 4.2 Caching Hash Keys Problem You need to control how your content is cached and retrieved. Solution Use the proxy_cache_key directive along with variables to define what constitutes a cache hit or miss: proxy_cache_key "$host$request_uri $cookie_user"; This cache hash key will instruct NGINX to cache pages based on the host and URI being requested, as well as a cookie that defines the user. With this you can cache dynamic pages without serving content that was generated for a different user. Discussion The default proxy_cache_key, which will fit most use cases, is "$scheme$proxy_ host$request_uri". The variables used include the scheme (http, or https); the proxy_host, where the request is being sent; and the request URI. All together, this reflects the URL that NGINX is proxying the request to. You may find that there are many other factors that define a unique request per application—such as request arguments, headers, session identifiers, and so on—to which you’ll want to create your own hash key.1 Selecting a good hash key is very important and should be thought through with understanding of the application. Selecting a cache key for static content is typically pretty straightforward; using the hostname and URI will suffice. Selecting a cache key for fairly dynamic content like pages for a dashboard application requires more knowledge around how users interact with the application and the degree of variance between user experiences. When caching dynamic content, using session identifiers such as cookies or JWT tokens is especially useful. Due to security concerns, you may not want to present cached data from one user to another without fully understand‐ ing the context. The proxy_cache_key directive configures the string to be hashed for the cache key. 
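As a sketch of that session-identifier approach, the following assumes the application issues a cookie named session_id, that a cache zone named main_content has been declared with proxy_cache_path, and that a backend upstream exists; all of these names are placeholders:

location /dashboard/ {
    proxy_cache main_content;
    # Include the session cookie so each user's dashboard is cached separately
    proxy_cache_key "$scheme$proxy_host$request_uri $cookie_session_id";
    proxy_pass http://backend;
}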
The proxy_cache_key can be set in the context of http, server, and location blocks, providing flexible control on how requests are cached. 1 Any combination of text or variables exposed to NGINX can be used to form a cache key. A list of variables is available in NGINX. 4.2 Caching Hash Keys | 39 4.3 Cache Locking Problem You want to control how NGINX handles concurrent requests for a resource for which the cache is being updated. Solution Use the proxy_cache_lock directive to ensure only one request is able to write to the cache at a time, where subsequent requests will wait for the response to be written to the cache and served from there: proxy_cache_lock on; proxy_cache_lock_age 10s; proxy_cache_lock_timeout 3s; Discussion We don’t want to proxy requests for which earlier requests for the same content are still in flight and are currently being written to the cache. The proxy_cache_lock directive instructs NGINX to hold requests destined for a cached resource that is currently being populated. The proxied request that is populating the cache is limited in the amount of time it has before another request attempts to populate the resource, defined by the proxy_cache_lock_age directive, which defaults to five seconds. NGINX can also allow requests that have been waiting a specified amount of time to pass through to the proxied server, which will not attempt to populate the cache, by use of the proxy_cache_lock_timeout directive, which also defaults to five seconds. You can think of proxy_cache_lock_age and proxy_cache_lock_timeout as “You’re taking too long, I’ll populate the cache for you,” and “You’re taking too long for me to wait, I’m going to get what I need and let you populate the cache in your own time,” respectively. 4.4 Use Stale Cache Problem You want to send expired cache entries when the upstream sever is unavailable. Solution Use the proxy_cache_use_stale directive with a parameter value defining for which cases NGINX should use stale cache: 40 | Chapter 4: Massively Scalable Content Caching proxy_cache_use_stale error timeout invalid_header updating http_500 http_502 http_503 http_504 http_403 http_404 http_429; Discussion NGINX’s ability to serve stale cached resources while your application is not return‐ ing correctly can really save the day. This feature can give the illusion of a working web service to the end user while the backend may be unreachable. Serving from stale cache also reduces the stress on your backend servers when there are issues, which can in turn provide some breathing room for the engineering team to debug the issue. The configuration tells NGINX to use stale cached resources when the upstream request times out, returns an invalid_header error, or returns 400- and 500-level response codes. The error and updating parameters are a little special. The error parameter permits use of stale cache when an upstream server cannot be selected. The updating parameter tells NGINX to use stale cached resources while the cache is updating with new content. See Also NGINX Use Stale Cache Directive Documentation 4.5 Cache Bypass Problem You need the ability to sometimes bypass caching. Solution Use the proxy_cache_bypass directive with a nonempty or nonzero value. 
One way to do this dynamically is with a variable set to anything other than an empty string or 0 within a location block that you do not want cached: proxy_cache_bypass $http_cache_bypass; The configuration tells NGINX to bypass the cache if the HTTP request header named cache_bypass is set to any value that is not 0. This example uses a header as the variable to determine if caching should be bypassed—the client would need to specifically set this header for their request. 4.5 Cache Bypass | 41 Discussion There are a number of scenarios that demand that the request is not cached. For this, NGINX exposes a proxy_cache_bypass directive so that when the value is nonempty or nonzero, the request will be sent to an upstream server rather than be pulled from the cache. Different needs and scenarios for bypassing the cache will be dictated by your application’s use case. Techniques for bypassing the cache can be as simple as using a request or response header, or as intricate as multiple map blocks working together. You may want to bypass the cache for many reasons, such as troubleshooting or debugging. Reproducing issues can be hard if you’re consistently pulling cached pages or if your cache key is specific to a user identifier. Having the ability to bypass the cache is vital. Options include, but are not limited to, bypassing the cache when a particular cookie, header, or request argument is set. You can also turn off the cache completely for a given context, such as a location block, by setting proxy_cache off;. 4.6 Cache Purging with NGINX Plus Problem You need to invalidate an object from the cache. Solution Use the purge feature of NGINX Plus, the proxy_cache_purge directive, and a non‐ empty or zero-value variable: map $request_method $purge_method { PURGE 1; default 0; } server { #... location / { #... proxy_cache_purge $purge_method; } } In this example, the cache for a particular object will be purged if it’s requested with the PURGE method. The following is a curl example of purging the cache of a file named main.js: $ curl -X PURGE http://www.example.com/main.js 42 | Chapter 4: Massively Scalable Content Caching Discussion A common way to handle static files is to put a hash of the file in the filename. This ensures that as you roll out new code and content, your CDN recognizes it as a new file because the URI has changed. However, this does not exactly work for dynamic content to which you’ve set cache keys that don’t fit this model. In every caching scenario, you must have a way to purge the cache. NGINX Plus provides a simple method for purging cached responses. The proxy_cache_purge directive, when passed a zero or nonempty value, will purge the cached items matching the request. A simple way to set up purging is by mapping the request method for PURGE. However, you may want to use this in conjunction with the geoip module or simple authentication to ensure that not just anyone can purge your precious cache items. NGINX has also allowed for the use of *, which will purge cache items that match a common URI prefix. To use wildcards, you will need to configure your proxy_cache_path directive with the purger=on argument. See Also NGINX Cache Purge Example 4.7 Cache Slicing Problem You need to increase caching efficiency by segmenting the resource into fragments. Solution Use the NGINX slice directive and its embedded variables to divide the cache result into fragments: proxy_cache_path /tmp/mycache keys_zone=mycache:10m; server { #... 
proxy_cache mycache; slice 1m; proxy_cache_key $host$uri$is_args$args$slice_range; proxy_set_header Range $slice_range; proxy_http_version 1.1; proxy_cache_valid 200 206 1h; location / { proxy_pass http://origin:80; } } 4.7 Cache Slicing | 43 Discussion This configuration defines a cache zone and enables it for the server. The slice directive is then used to instruct NGINX to slice the response into 1 MB file segments. The cached resources are stored according to the proxy_cache_key direc‐ tive. Note the use of the embedded variable named slice_range. That same variable is used as a header when making the request to the origin, and that request HTTP version is upgraded to HTTP/1.1 because 1.0 does not support byte-range requests. The cache validity is set for response codes of 200 or 206 for one hour, and then the location and origins are defined. The cache slice module was developed for delivery of HTML5 video, which uses byte-range requests to pseudostream content to the browser. By default, NGINX is able to serve byte-range requests from its cache. If a request for a byte range is made for uncached content, NGINX requests the entire file from the origin. When you use the cache slice module, NGINX requests only the necessary segments from the origin. Range requests that are larger than the slice size, including the entire file, trigger subrequests for each of the required segments, and then those segments are cached. When all of the segments are cached, the response is assembled and sent to the client, enabling NGINX to more efficiently cache and serve content requested in ranges. The cache slice module should be used only on large resources that do not change. NGINX validates the ETag each time it receives a segment from the origin. If the ETag on the origin changes, NGINX aborts the cache population of the segment because the cache is no longer valid. If the content does change and the file is smaller, or your origin can handle load spikes during the cache fill process, it’s better to use the cache_lock directive described in Recipe 4.3. Learn more about cache-slicing techniques in the blog listed in the “See Also” section. See Also “Smart and Efficient Byte-Range Caching with NGINX & NGINX Plus” 44 | Chapter 4: Massively Scalable Content Caching CHAPTER 5 Programmability and Automation 5.0 Introduction Programmability refers to the ability to interact with something through program‐ ming. The API for NGINX Plus provides just that: the ability to interact with the configuration and behavior of NGINX Plus through an HTTP interface. This API provides the ability to reconfigure NGINX Plus by adding or removing upstream servers through HTTP requests. The key-value store feature in NGINX Plus enables another level of dynamic configuration—you can utilize HTTP calls to inject infor‐ mation that NGINX Plus can use to route or control traffic dynamically. This chapter will touch on the NGINX Plus API and the key-value store module exposed by that same API. Configuration management tools automate the installation and configuration of servers, which is an invaluable utility in the age of the cloud. Engineers of large-scale web applications no longer need to configure servers by hand; instead, they can use one of the many configuration management tools available. With these tools, engineers only need to write configurations and code once to produce many servers with the same configuration in a repeatable, testable, and modular fashion. 
This chapter covers a few of the most popular configuration management tools available and how to use them to install NGINX and template a base configuration. These examples are extremely basic but demonstrate how to get an NGINX server started with each platform. 5.1 NGINX Plus API Problem You have a dynamic environment and need to reconfigure NGINX Plus on the fly. 45 Solution Configure the NGINX Plus API to enable adding and removing servers through API calls: upstream backend { zone http_backend 64k; } server { #... # enable /api/ location with appropriate access # control in order to make use of NGINX Plus API location /api { # Set write=off for read only mode, recommended api write=on; # Directives limiting access to the API # See chapter 7 } # enable NGINX Plus Dashboard; requires /api/ location to be # enabled and appropriate access control for remote access location = /dashboard.html { root /usr/share/nginx/html; } } This NGINX Plus configuration creates an upstream server with a shared memory zone, enables the API in the /api location block, and provides a location for the NGINX Plus dashboard. You can utilize the API to add servers when they come online: $ curl -X POST -d '{"server":"172.17.0.3"}' \ 'http://nginx.local/api/9/http/upstreams/backend/servers/' { "id":0, "server":"172.17.0.3:80", "weight":1, "max_conns":0, "max_fails":1, "fail_timeout":"10s", "slow_start":"0s", "route":"", "backup":false, "down":false } The curl call in this example makes a request to NGINX Plus to add a new server to the backend upstream configuration. The HTTP method is a POST, a JSON object is passed as the body, and a JSON response is returned. The JSON response shows the server object configuration—note that a new id was generated, and other configuration settings were set to default values. 46 | Chapter 5: Programmability and Automation The NGINX Plus API is RESTful; therefore, there are parameters in the request URI. The format of the URI is as follows: /api/{version}/http/upstreams/{httpUpstreamName}/servers/ You can utilize the NGINX Plus API to list the
