Cloud Computing Scalability and Performance

Study Notes

Scalability: Cloud computing allows for horizontal scaling (increasing or decreasing the number of instances) and vertical scaling (increasing or decreasing the size of an instance) to match changing workload demands.
On-demand self-service: Users can provision and de-provision cloud resources themselves, typically through a console or API, without requiring human interaction with the provider.
Resource pooling: Cloud providers pool resources together to provide a multi-tenant environment, maximizing resource utilization.
Rapid elasticity: Cloud resources can be quickly scaled up or down to match changing workload demands.

Definition: Load balancing is a technique to distribute incoming traffic across multiple servers to improve responsiveness, reliability, and scalability.
Types of load balancing:
- Hardware-based: Using a dedicated hardware device to balance traffic.
- Software-based: Using software to balance traffic, often running on a virtual machine or container.
- Cloud-based: Cloud providers offer load balancing services, often integrated with their infrastructure.
Load balancing algorithms:
- Round-robin: Each incoming request is sent to the next available server in a predetermined sequence.
- Least connection: Incoming requests are sent to the server with the fewest active connections.
- IP Hash: A hash of the client's IP address selects the server, so requests from the same client consistently reach the same server.
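
The three algorithms above can be sketched as small selection functions. This is a minimal illustration, not how any particular load balancer implements them; the server names, the `active` connection counts, and the choice of SHA-256 for the hash are all assumptions made for the example.

```python
import hashlib
from itertools import cycle

servers = ["app-1", "app-2", "app-3"]  # hypothetical server names

# Round-robin: walk the server list in a fixed, repeating order.
_rotation = cycle(servers)

def round_robin() -> str:
    return next(_rotation)

# Least connection: pick the server with the fewest active connections.
# `active` maps each server name to its current connection count (assumed input).
def least_connection(active: dict[str, int]) -> str:
    return min(active, key=active.get)

# IP hash: hash the client address so the same client always maps
# to the same server while the server list stays unchanged.
def ip_hash(client_ip: str) -> str:
    digest = hashlib.sha256(client_ip.encode("utf-8")).digest()
    return servers[int.from_bytes(digest[:8], "big") % len(servers)]
```

Note that the IP hash mapping changes for many clients if a server is added to or removed from the list.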

Active-passive failover: One active server handles all requests, while a passive server waits in standby mode, ready to take over in case of failure.
Active-active failover: Both servers are active and handle requests, with load balancing and synchronization mechanisms to ensure data consistency.
N+1 redundancy: A cluster that needs N servers to carry its load runs one additional server, so the system remains operational even if one server fails.
Failback: After a failover, the failed server is repaired or replaced and then returned to service, often through automated processes that keep downtime to a minimum.
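
A rough sketch of active-passive failover, failback, and the N+1 capacity check follows. The class and function names, the server names, and the hand-set health flags are hypothetical; a real system would drive the health state from actual health checks, and many setups fail back only after an operator confirms it rather than automatically as shown here.

```python
class ActivePassivePair:
    """Route requests to the primary; use the standby while the primary is down."""

    def __init__(self, primary: str, standby: str) -> None:
        self.primary = primary
        self.standby = standby
        self.healthy = {primary: True, standby: True}

    def mark(self, server: str, is_healthy: bool) -> None:
        # In practice a health check would call this; here it is set by hand.
        self.healthy[server] = is_healthy

    def active(self) -> str:
        # Failover: the standby serves only while the primary is unhealthy.
        # Failback: once the primary is healthy again, it takes traffic back.
        if self.healthy[self.primary]:
            return self.primary
        if self.healthy[self.standby]:
            return self.standby
        raise RuntimeError("no healthy server available")


def tolerates_one_failure(server_count: int, capacity_per_server: int,
                          peak_load: int) -> bool:
    """N+1 check: can the remaining servers carry the peak load if one fails?"""
    return (server_count - 1) * capacity_per_server >= peak_load
```

For example, with a peak load of 300 requests per second and servers that each handle 100, three servers are exactly enough (N = 3), and a fourth provides the +1.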

Cloud computing allows for horizontal scaling: increasing or decreasing the number of instances to match changing workload demands.
Cloud computing allows for vertical scaling: increasing or decreasing the size of an instance to match changing workload demands.
Through on-demand self-service, users can provision and de-provision cloud resources themselves, without requiring human interaction with the provider.
Cloud providers pool resources together to provide a multi-tenant environment, maximizing resource utilization through resource pooling.
Cloud resources can be quickly scaled up or down to match changing workload demands through rapid elasticity.

Load balancing is a technique to distribute incoming traffic across multiple servers to improve responsiveness, reliability, and scalability.
Load balancing can be categorized into three types: hardware-based, software-based, and cloud-based.

Cloud computing resources can be scaled up or down to match changing business needs, which is known as scalability.
Resources can be provisioned and de-provisioned automatically without human intervention through on-demand self-service.
Multiple customers share the same infrastructure, reducing costs and improving utilization, which is known as multi-tenancy.
Cloud computing resources can be quickly scaled up or down to match changing business needs, which is known as rapid elasticity.

Incoming traffic is distributed across multiple servers to improve responsiveness and reliability.
There are three types of load balancing:
- Hardware-based: Dedicated hardware devices distribute traffic.
- Software-based: Software applications distribute traffic.
- Cloud-based: Cloud providers offer load balancing as a service.

Three common load balancing algorithms are:
- Round Robin: Requests are sent to each server in turn, in a fixed sequence.
- Least Connection: Traffic is directed to the server with the fewest connections.
- IP Hash: Each client is directed to a specific server based on their IP address.

There are two types of scaling:
- Horizontal scaling (Scaling out): Adding more servers to distribute the workload and increase processing power.
- Vertical scaling (Scaling up): Upgrading individual servers (for example, with more CPU or memory) to increase processing power.
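
The difference between the two types can be shown with a small sizing sketch. The instance sizes and their requests-per-second capacities below are made-up numbers for illustration only, and the function names are hypothetical.

```python
import math

# Hypothetical instance sizes and the requests/second each can serve.
SIZES = {"small": 100, "medium": 200, "large": 400}

def scale_out(load_rps: int, size: str = "small") -> int:
    """Horizontal scaling: how many instances of one size cover the load."""
    return max(1, math.ceil(load_rps / SIZES[size]))

def scale_up(load_rps: int) -> str:
    """Vertical scaling: the smallest single instance that covers the load."""
    for name, capacity in sorted(SIZES.items(), key=lambda item: item[1]):
        if capacity >= load_rps:
            return name
    raise ValueError("load exceeds the largest instance size")
```

Under these assumed numbers, a load of 350 requests per second needs four small instances when scaling out or one large instance when scaling up. Scaling up stops working once the load passes the largest available size, which is why `scale_up` raises an error in that case.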

There are two scaling methods:
- Manual scaling: Scaling is performed manually by administrators.
- Auto-scaling: Scaling is performed automatically based on predefined rules and metrics.
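
A predefined auto-scaling rule might look like the sketch below. The CPU thresholds, the one-instance step, and the minimum and maximum limits are assumptions chosen for the example, not values from any specific cloud provider.

```python
def desired_instances(current: int, cpu_percent: float,
                      scale_out_above: float = 70.0,
                      scale_in_below: float = 30.0,
                      minimum: int = 1, maximum: int = 10) -> int:
    """Return the instance count a simple threshold rule would ask for."""
    if cpu_percent > scale_out_above:
        target = current + 1   # busy: add one instance
    elif cpu_percent < scale_in_below:
        target = current - 1   # idle: remove one instance
    else:
        target = current       # within the normal band: no change
    # Never go below the minimum or above the maximum.
    return max(minimum, min(maximum, target))
```

The gap between the two thresholds keeps the rule from adding and removing instances repeatedly when the metric hovers around a single value.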