Lecture 04-05 Service Demand PDF

What is (service) demand?
Any thoughts?
Demand means the various quantities of goods (or services!) that would be purchased per time period at different prices in a given market. (It could also be free.)
(Image: Futurama, season 6, episode 3)

Example of Service Demand: Skype
If not for the outage, the service demand was quite seasonal.

Example of Service Demand Increase: YouTube et al.
During the COVID-19 pandemic, the main video streaming companies lowered their video quality.
o To support heavy internet traffic amid quarantine.
o To avoid breaking the internet (as 80%+ of the total internet traffic is video nowadays!)
o Part capacity planning, part governments' requests.

Example of Service Demand: Academic Life at TUD
Have you noticed how the library tends to be more crowded at the end of the semester?
o Probably saturated (or close to it).
Likewise, Brightspace is mostly (almost!) idle during the summer.
o Significant spare (unused!) capacity.
Which scenario is worse?

Best Practice
Scalability of Infrastructure
Btw, do you know what a pattern and an anti-pattern are? They are extensively used in IT/CS. Thus, be sure to know what they are!

Best Practice
Best practice (aka a pattern!) (event!)

Vertical versus Horizontal Scaling
Btw, do you know what garbage collection is? It is a key feature of several of the most widely used programming languages (e.g., Java, Python, C#). Thus, you should certainly familiarize yourself with this concept!

Vertical versus Horizontal Scaling
Vertical scaling:
o May introduce a longer pause time.
o May require the server to be restarted.
o May need the app to be designed to leverage it.
Horizontal scaling:
o Is virtually limitless.
o Is the ideal solution to handle a growing workload.
o Can be used effectively without application redesign.
o Does not require server restarts.
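
To make the saturated-versus-idle contrast and the horizontal scaling option above more concrete, here is a minimal sketch. All numbers and names (a capacity of 200 requests per second per instance, a fixed fleet of 2 instances, the two demand levels) are hypothetical and chosen only for illustration; they do not come from the lecture or from any real TUD system.

```python
import math


def instances_needed(demand_rps: int, capacity_per_instance_rps: int) -> int:
    """Smallest number of identical instances whose combined capacity covers the demand."""
    if capacity_per_instance_rps <= 0:
        raise ValueError("capacity per instance must be positive")
    # Always keep at least one instance so the service stays reachable.
    return max(1, math.ceil(demand_rps / capacity_per_instance_rps))


def utilization(demand_rps: int, instances: int, capacity_per_instance_rps: int) -> float:
    """Fraction of the provisioned capacity that the demand actually uses."""
    return demand_rps / (instances * capacity_per_instance_rps)


# Hypothetical numbers: each instance serves 200 requests per second.
CAPACITY = 200
FIXED_FLEET = 2

for label, demand in [("end of semester", 950), ("summer", 40)]:
    fixed = utilization(demand, FIXED_FLEET, CAPACITY)
    needed = instances_needed(demand, CAPACITY)
    print(f"{label}: fixed fleet utilization {fixed:.2f}, "
          f"horizontal scaling would run {needed} instance(s)")
```

A utilization above 1.0 corresponds to the saturated library, and a value close to 0 corresponds to Brightspace in the summer. Horizontal scaling addresses both cases by changing the instance count rather than the size of a single machine.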

Instrumentation
Instrumentation is the process of adding code to your application so you can understand its inner state (collecting data like metrics, events, logs, and traces).
(Figure labels: Capture Information; Your infra/system's Information; You, Without Information. Tool shown: https://prometheus.io/)

Monitoring
Practically the same principle used in project management! Important to emphasise its iterative nature!

Gathering Statistics
Default Metrics in GCP
https://cloud.google.com/monitoring/api/metrics_gcp#gcp-compute
Hundreds of metrics to choose from!

Custom Metrics
Prometheus Example (again)

Custom Metrics
Have you heard the term rule-based system? (A system that applies human-made rules to store, sort and manipulate data.)

Alarms / Actions / State

Load Balancing
Portillo-Dominguez, A. Omar. Performance optimisation of clustered Java systems. UCD, 2016.
**A collection of instances (e.g., containers, VMs). Also known as auto-scale groups or managed instance groups.

Load Balancing
Portillo-Dominguez, A. Omar, et al. "Trini: an adaptive load balancing strategy based on garbage collection for clustered Java systems." Software: Practice and Experience (2016).

Balancing Modes
Target Capacity

Rate-Balanced Permutations
Instance: Represents a computing resource (e.g., VM, container).
Endpoint: Represents the access point/network location where resources can be accessed (e.g., URLs, IP addresses, 127.0.0.1).

Utilization Balanced
https://cloud.google.com/load-balancing/docs/backend-service#utilization_balancing_mode

Key Differences
TCP stands for "Transmission Control Protocol". It is one of the core protocols behind the Internet. It provides reliable, ordered, and error-checked delivery of a data stream between a server and a client.

Managed Instance Groups
MIG and Load Balancer

Stateless / Stateful Applications
Real-life Example: Imagine you call a help desk to explain an issue that takes a long time to describe. Suddenly, the call drops, and you need to call back!
Option A) If you can continue your conversation where you left off (suggesting your information was recorded somewhere!), it is a stateful transaction.
Option B) Otherwise, it is stateless.

HALF

Periodic Traffic Patterns
What do you think about the above plot? Is the provisioned capacity correct? Why, or why not?

Variance in Demand
The goal of capacity planning! (It uses inputs like current usage/logs, (business) forecasts, simulation models, etc.)

Underutilization
Do you remember this?
=> You can "easily" convert customer (lack of) confidence into lost revenue (and it would almost always be higher than the cost of the extra capacity).

How Does Auto-Scaling Work?
Scale out, and in.

Example: How AWS does it
Launch Configurations

Auto-Scaling Group
Subnet: a network inside a network. Why? Many advantages, like more granular control/security and more efficient data transfer.
Which is more accurate for delivering a letter? Eircodes (e.g., D24 FKT9) or old postcodes (D24)?

Auto-Scaling Policy

Auto-Scaling Schedule
Seasonality: A characteristic of a time series that refers to periodic, generally regular, and predictable changes that occur over a period of time (e.g., a year).

Warm-Up / Cool Down
E.g., the initial memory usage might take a while to stabilize (what is known as the application/system memory footprint).

Scale In/Out During Cool Down
Design decision, optimistic approach: (Usage+0)/(#instances+1) will always be lower than Usage/#instances.
Why? Any thoughts?

GCP Auto vs Manual Scale-In
There are two configs here (green and blue):

Scale In Controls - GCP Example

Scale In Controls

Thrashing
Do you remember the auto-scalers?
(The technical decisions taken in their design aim to reduce the likelihood of thrashing.)

Cloud Vendors

Managed Instance Groups

Autoscaling Policies

Target Utilization Metrics
Monitor, Analyze, Plan, Execute
Anyone interested in the theory behind the above should have a look at statistical process control.

Stabilization Period

Lifecycle Management
Do you remember how life cycles are extensively used in IT/CS/SE? This is a good example of why!

Shutdown Event
Multiple technologies follow the event-driven paradigm (e.g., native/web/mobile GUI).

Real World Example: Slack
Slack's messaging platform faced significant growth (in user traffic and data storage) when it expanded into enterprise markets.
Solution: Slack uses Amazon S3 for (auto-)scalable storage and AWS Lambda for serverless computing to handle unpredictable traffic spikes.

Real World Example: Fortnite
Fortnite experiences extreme spikes in player traffic during game launches, tournaments, and special in-game events.
Solution: The game uses AWS and Kubernetes to auto-scale its infrastructure in response to player demand.
Examples of specific non-functional requirements:
o Handling over 10M concurrent players during peak times.
o Ensuring sub-100ms latency across diverse global regions.
https://www.simcentric.com/hong-kong-dedicated-server/how-epic-games-uses-kubernetes-to-power-fortnite-servers/

Summary - Scaling Types

Summary - Scaling Rules

Summary - Load Balancing

That is all, folks!
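
As a recap of the Target Utilization Metrics, Stabilization Period, and Thrashing slides, the sketch below shows one simplified way an auto-scaler can pick a group size. It is an assumption-level model, not the actual algorithm of GCP, AWS, or any other vendor. The proportional rule (new size = ceil(current size * current utilization / target utilization)) has the same shape as the formula documented for the Kubernetes Horizontal Pod Autoscaler. The class name, the window length, and the size limits are hypothetical.

```python
from collections import deque


class TargetUtilizationScaler:
    """Toy auto-scaler: proportional sizing plus a stabilization window.

    Utilization values are integer percentages (e.g., 60 means 60%).
    """

    def __init__(self, target_pct: int, window: int, min_size: int = 1, max_size: int = 10):
        self.target_pct = target_pct
        self.min_size = min_size
        self.max_size = max_size
        # Keep only the most recent `window` recommendations.
        self.recent = deque(maxlen=window)

    def recommend(self, current_size: int, avg_utilization_pct: int) -> int:
        # Analyze: size that would bring average utilization back to the target
        # (integer ceiling division avoids floating-point surprises).
        load = current_size * avg_utilization_pct
        raw = -(-load // self.target_pct)
        raw = max(self.min_size, min(self.max_size, raw))
        # Plan: scale out immediately, but scale in only once every recent
        # recommendation agrees, which reduces the likelihood of thrashing.
        self.recent.append(raw)
        return max(self.recent)


scaler = TargetUtilizationScaler(target_pct=60, window=3)
size = 4
for observed_pct in [90, 30, 30, 30]:  # Monitor: hypothetical utilization samples
    size = scaler.recommend(size, observed_pct)  # Execute: apply the new size
    print(f"utilization {observed_pct}% -> group size {size}")
```

Taking the maximum over the window means a short dip in demand does not remove instances right away, while a spike still adds them at once. That asymmetry matches the optimistic design decision from the cool down slide: an extra instance can only lower the average utilization.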
