AWS Sunil Halai Study notes.pdf
AWS Certified Developer Associate exam prep

General Concepts
- AWS Well-Architected Framework
- AWS Global architectures
- Accessing AWS

Major AWS Services

Analysis
- Athena
- Kinesis
- OpenSearch Service

Application Integration
- SQS
- SNS
- EventBridge
- AppSync
- Step Functions

Compute
- EC2
- Lambda
- AWS Serverless Application Model (SAM)
- Elastic Beanstalk

Containers:
- AWS Copilot
- Amazon Elastic Container Registry (Amazon ECR)
- Amazon Elastic Container Service (Amazon ECS)
- Amazon Elastic Kubernetes Service (Amazon EKS)

Database:
- Relational vs NoSQL
- Amazon Aurora: Basic, Writer / Reader endpoint
- Amazon DynamoDB
- Amazon ElastiCache: Basic, Memcached vs Redis, Caching Design Patterns, Cache Eviction / TTL (Time to Live)
- Amazon MemoryDB for Redis
- Amazon RDS: Basic, RDS Storage Auto Scaling, Read Replicas, RDS Multi AZ (Disaster Recovery), Encrypting an Unencrypted RDS DB

Developer Tools:
- AWS Amplify
- AWS Cloud9
- AWS CloudShell
- AWS CodeArtifact
- AWS CodeBuild
- AWS CodeCommit
- AWS CodeDeploy
- Amazon CodeGuru
- AWS CodePipeline
- AWS CodeStar
- Amazon CodeWhisperer
- AWS X-Ray

Management and Governance:
- AWS AppConfig
- AWS CLI
- AWS Cloud Development Kit (AWS CDK)
- AWS CloudFormation
- AWS CloudTrail
- Amazon CloudWatch
- Amazon CloudWatch Logs
- AWS Systems Manager

Networking and Content Delivery:
- Amazon API Gateway
- Amazon CloudFront
- Elastic Load Balancing (ELB)
- Amazon Route 53
- Amazon VPC

Security, Identity, and Compliance:
- AWS Certificate Manager (ACM)
- Amazon Cognito
- AWS Identity and Access Management (IAM)
- AWS Key Management Service (AWS KMS)
- AWS Private Certificate Authority
- AWS Secrets Manager
- AWS Security Token Service (AWS STS)
- AWS WAF

Storage:
- Amazon Elastic Block Store (Amazon EBS)
- EC2 Instance Store
- Amazon Elastic File System (Amazon EFS)
- Amazon S3
- Amazon S3 Glacier

General Concepts

AWS Well-Architected Framework
- Operational Excellence
- Security
- Reliability
- Performance Efficiency
- Cost Optimization
- Sustainability

AWS Global architectures
- Availability Zones: physically separate data centers, within about 60 miles (100 km) of each other
- Regions are made up of multiple Availability Zones (e.g. ap-southeast-2, Sydney, has 3 Availability Zones)
- Some AWS services are global. Some are only available in certain Regions, which is an important factor when choosing a Region
- Edge networks allow content to be cached closer to users in other locations
  - Also known as a CDN
  - Uses CloudFront to serve content at edge locations. CloudFront can also run Lambda functions (Lambda@Edge)

Accessing AWS
- AWS Management Console (protected by password and MFA)
- AWS CLI (protected by access keys)
- AWS SDK (protected by access keys)

Major AWS Services

Analysis

Athena
- Interactive query service for data stored in S3
- vs S3 Select: with Athena you can query the entire bucket, instead of just a subset of the data as with S3 Select
- Athena is serverless, so it costs less compared with Redshift / EMR or ES

Kinesis
- Data Streams
  - Similar to SQS, but Kinesis is real-time, provides the ability to perform analysis, and preserves the order of messages by default (SQS has two types and you have to select a FIFO queue)
  - Ordered
  - Supports multiple data sources
- Data Analytics
- Firehose
  - Can stream data without the need for a consumer (e.g. delivering directly to an S3 bucket)
- Video Streams

OpenSearch Service

Application Integration

SQS
- Standard vs FIFO (benefits)

SNS

EventBridge

AppSync

Step Functions
- Orchestrates Lambda functions
- State machine: a serverless workflow that lets you review the flow visually
- 8 state types:
  - Task: single unit of work
  - Choice: if-then-else logic
  - Parallel: run units of work in parallel
  - Wait: delay execution for a time period
  - Fail: stop execution, mark as failure
  - Succeed: stop execution, mark as success
  - Pass: passes input to its output
  - Map: for-each loop
- Has built-in retry / error handling that you can implement at each state

Compute

EC2
- Sizing and configuration options:
  - OS (Linux, Windows or macOS)
  - CPU
  - RAM
  - EBS and EFS (network attached)
  - EC2 Instance Store (hardware)
  - Network card (speed of card / public IP address)
  - Security Group (firewall rules)
  - Bootstrap script (configure at first launch: EC2 User Data)
- EC2 User Data
  - Bootstrap script, run once only at the first instance start, e.g. installing updates / software
  - Runs as the root user
- EC2 instance types: e.g. t2.micro, c5d.4xlarge (many different types)
  - Naming convention: m5.2xlarge (m = instance class, where the M family is general purpose; 5 = generation, as AWS improves hardware over time; 2xlarge = size within the class)
  - Compute optimized. Use cases: batch processing / media transcoding / machine learning / dedicated gaming servers etc.
  - Memory optimized. Use cases: high performance DBs (with memory), web-scale cache stores (e.g.
ElastiCache), real-time processing of big unstructured data
  - Storage optimized. Use cases: high-frequency OLTP systems / DBs (relational and NoSQL) / data warehouses etc.
  - General purpose: good general diversity, e.g. for general websites
  - HPC optimized
- If running Amazon Linux on EC2, you can use EC2 Instance Connect to effectively SSH into the box
- Can add IAM roles for EC2 instances
- Setting up SSH (and other ports) using attached Security Groups
- Purchasing options
  - On-Demand Instances: short workloads, predictable pricing, pay by the second
  - Reserved (1 & 3 years):
    - Reserved Instances: long workloads
    - Convertible Reserved Instances: long workloads with flexible instances
  - Savings Plans (1 & 3 years): commitment to an amount of usage / long workloads, saves money
  - Spot Instances: short workloads, cheap, but you can lose instances (less reliable), e.g. processing
  - Dedicated Hosts: book an entire physical server and control instance placement. Useful for bring-your-own-licence and regulatory requirements; gives access to the physical server itself
  - Dedicated Instances: no other customers will share your hardware (your own instance on your own hardware)
  - Capacity Reservations: reserve capacity in a specific AZ for a duration

Lambda
- Asynchronous vs synchronous invocation
- Execution lifecycle of a function
  - Cold start / warm start: you don't pay for cold start up to 10 secs
  - Init / Invoke / Shutdown
- Execution environment / context reuse (can speed up execution by reusing resources from the INIT phase, up to 512 MB)
- Event object / Context object: the parameters passed into the Lambda function handler
  - Event object: JSON data for your Lambda function to process (e.g. SNS notification event, Amazon S3 event)
  - Context object: describes the current invocation and execution environment of the Lambda function (e.g.
memory of the function, remaining time in milliseconds, etc.)
- Lambda Layers
  - Allow you to reuse external dependencies that will be used by multiple Lambda functions
  - Deployed as zips that can be reused
  - Advantages:
    - Can be shared with all Lambda functions inside a Region
    - Faster deployments
    - Separation of concerns: can separate business logic from its dependencies
    - Can manage all dependencies for shared resources in a single layer, rather than repeating the same dependency / utility function in each Lambda function
- Lambda Versions / Aliases
  - Allow a new version of a function to be created without affecting Prod. Useful for canary deployments
  - Version numbers auto-increment
  - An alias is like a nickname for a version; you can change the version it points to (so your other code can refer to the alias)
  - Like a symlink: referring code does not need to update the version it points to, it can just point to the alias
- Lambda / VPC integration
  - Need to attach the IAM managed policy AWSLambdaVPCAccessExecutionRole to the execution role to allow access (for the Lambda)
  - Lambda function will lose access to the internet after it connects to a VPC (unless its traffic is routed through a NAT)

AWS Serverless Application Model (SAM)

Elastic Beanstalk

Containers:

AWS Copilot

Amazon Elastic Container Registry (Amazon ECR)

Amazon Elastic Container Service (Amazon ECS)

Amazon Elastic Kubernetes Service (Amazon EKS)

Database:

Relational vs NoSQL
- Harder to make schema changes with relational

Amazon Aurora
Basic
- Autoscales in increments of 10 GB up to 128 TB
- Supports MySQL and PostgreSQL
- 20% more expensive than RDS but "AWS cloud optimized", e.g.
claims 5x performance over MySQL on RDS
- Higher availability: 6 copies of data across 3 AZs, with automatic failover for reads / writes
- Supports cross-Region replication
Writer / Reader endpoint
- Writer endpoint: provides a single DNS endpoint pointing to the master instance for writes (instead of a specific write instance)
- Reader endpoint: provides a single DNS endpoint to access read replicas (via connection load balancing); applications point to the DNS name instead of a specific read instance

Amazon DynamoDB
- NoSQL database

Amazon ElastiCache
Basic
- Managed Redis or Memcached instances: in-memory DBs with high performance and low latency
- Need to heavily modify application code to query the cache appropriately instead of the DB
- Cache hit (get from cache) / cache miss (fetch from DB)
- Use cases / advantages:
  - Reduce load on DBs for read-intensive workloads
  - Make your application stateless, e.g. cache user sessions across application servers in ElastiCache
- Maximum number of read replicas for an ElastiCache Redis cluster with cluster mode disabled = 5

Memcached vs Redis
Redis:
- Multi AZ with auto-failover
- Read replicas to horizontally scale and provide availability
- Backup and restore features
- Supports Sets and Sorted Sets
Memcached:
- Multi-node for partitioning (sharding) of data
- No high availability (replication)
- Non-persistent: no backup and restore
- Multi-threaded architecture

Caching Design Patterns
https://aws.amazon.com/caching/implementation-considerations/
Lazy Loading / Cache-Aside / Lazy Population
- What: check if cached data is present in the application; if not, load it from the DB into the cache
- Pros:
  - Only requested data is cached (the cache isn't filled up with unused data)
  - Node failures are not fatal (just increased latency to warm the cache)
- Cons:
  - Cache miss penalty that results in 3 round trips, a noticeable delay for that request
  - Stale data: data can be updated in the database and outdated in the cache
Write Through
- What: write to the cache when the DB is updated
- Pros:
  - Data in cache is never stale, reads are quick
  - Write penalty vs read penalty (each write requires 2 calls)
- Cons:
  - Missing data until it is added / updated in the DB. Mitigation is to implement the Lazy Loading strategy as well
  - Cache churn: a lot of the data will never be read

Cache Eviction / TTL (Time to Live)
- A cached item can be evicted if:
  - the item is deleted explicitly from the cache
  - memory is full and the item was not recently used (LRU)
  - the TTL that has been set is exceeded (can range from a few seconds to a few days)
- If too many evictions happen due to memory limits, you should scale out or up

Amazon MemoryDB for Redis
- Redis-compatible, durable, in-memory DB
- Ultra-high performance with over 160 million requests / sec
- Scales seamlessly to 100s of TB of storage
- Use cases: web apps / online gaming / media streaming...
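The Lazy Loading and Write Through patterns from the ElastiCache notes above, together with TTL-based eviction, can be sketched in a few lines of Python. This is a minimal illustration only: `TTLCache`, `read_lazy`, and `write_through` are hypothetical names, the "cache" is a plain in-memory dict standing in for a Redis or Memcached node, and the "DB" is another dict.

```python
import time


class TTLCache:
    """Hypothetical in-memory stand-in for a Redis/Memcached node with a TTL."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so expiry can be simulated
        self._items = {}        # key -> (value, expires_at)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._items[key]  # TTL exceeded: evict the item
            return None
        return value

    def set(self, key, value):
        self._items[key] = (value, self.clock() + self.ttl)


def read_lazy(cache, db, key):
    """Lazy loading / cache-aside: check the cache first, fall back to the DB."""
    value = cache.get(key)
    if value is not None:
        return value, "hit"
    value = db[key]             # cache miss: extra round trip to the DB
    cache.set(key, value)       # populate the cache for later reads
    return value, "miss"


def write_through(cache, db, key, value):
    """Write through: every DB write also updates the cache (2 calls per write)."""
    db[key] = value
    cache.set(key, value)
```

Combining both patterns, as the notes suggest, means reads go through `read_lazy` and writes go through `write_through`, with the TTL acting as a safety net against stale entries written by anything that bypasses the cache.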

Amazon RDS
Basic
- RDS = Relational Database Service
- Managed DB service that supports major technologies including MySQL, Oracle, PostgreSQL and Aurora
- Advantage over EC2: managed
  - Automated provisioning / OS patching / maintenance windows / backups / storage on EBS (gp2 or io1)
  - Dashboards for monitoring
  - Supports read replicas / Multi AZ setup for disaster recovery
- Disadvantage over EC2: cannot SSH into it
RDS Storage Auto Scaling
- Automatically scales DB storage up to a set Maximum Storage Threshold
- Useful for applications with unpredictable workloads
Read Replicas
- Replication is asynchronous, so data will be eventually consistent
- Up to 15 read replicas within the same AZ, cross AZ or cross Region (same Region is free)
- Replicas can be promoted to their own DB
- Applications must update their connection string to use a read replica
- Usages: reporting / analytics / read-only high-load environments
RDS Multi AZ (Disaster Recovery)
- Synchronous replication: main purpose is to increase availability, not scaling
- One DNS name: automatic app failover to the standby
- Read replicas can be set up as Multi AZ for DR (Disaster Recovery)
- Going from Single AZ to Multi AZ is a zero-downtime operation (no need to stop the DB): just click on modify
Amazon RDS Proxy
- Fully managed, serverless, autoscaling, highly available (multi-AZ) DB proxy for RDS
- Allows apps to pool and share DB connections, improving DB efficiency by reducing stress on DB resources
- Reduces RDS and Aurora failover time by up to 66%
- Can enforce IAM authentication for the DB and securely store credentials in AWS Secrets Manager
- RDS Proxy is never publicly accessible (must be accessed from within the VPC)
Encrypting an Unencrypted RDS DB
- Create a snapshot of the DB, copy the snapshot, click "Enable Encryption", then restore the DB instance from the encrypted snapshot
- An unencrypted RDS DB will always have unencrypted read replicas

Developer Tools:

AWS Amplify
- Deploy applications in a serverless architecture; allows auto-deployment / scaling / management of the application and
underlying resources
- A complete solution that allows frontend web and mobile developers to easily build, connect, and host fullstack applications
  - Amplify Studio (WYSIWYG)
  - Amplify libraries
  - Amplify CLI
  - Amplify Hosting
- Can export to a CloudFormation template

AWS Cloud9

AWS CloudShell
- Available in a few AWS Regions only (not every Region): a terminal in the cloud (works similarly to the AWS CLI)
- Advantage over a local terminal: no need to configure AWS with an access key (already set up for you with your logged-in AWS user)
- Supports Linux commands like ls, echo, cat etc.
- Stateful if you create or edit files

AWS CodeArtifact

AWS CodeBuild
- Serverless CI server for AWS
- Reduces the need for patching / maintaining a dedicated server
- Only pay for the time it takes to build (not idle time)
- Provides pre-packaged environments such as Docker containers
- Build environment = OS + programming environment + tools used by CodeBuild to run the build
- AWS CodeBuild agent can test / run the application locally
- You can create a build project using the CodeBuild console / AWS CLI / AWS SDK / creation of a CodePipeline
- buildspec.yml defines the build to run (on the code pulled from the source repo)
- Can upload the build artifact to CodeArtifact or another artifact repo
- Supports Amazon SNS for build notifications, e.g. build failure

AWS CodeCommit
- Basically GitHub / Stash (code repo) but for Amazon (Git-based code repo)
- Hosted in S3, which gives it high availability and resiliency
- Advantage over GitHub / Stash: integrates well with other AWS services (e.g.
can emit an event when code has been committed and changed, for other services to use)

AWS CodeDeploy

Amazon CodeGuru

AWS CodePipeline

AWS CodeStar

Amazon CodeWhisperer

AWS X-Ray

Management and Governance:

AWS AppConfig

AWS CLI
- Protected by access keys
- Command line tool to interact with AWS services using commands in your shell
- Direct access to the public APIs of AWS services
- Open source
- Alternative to the AWS Management Console
- AWS CLI is built on the AWS SDK for Python

AWS SDK
- Set of libraries to access language-specific APIs (embedded within the application): high-level SDK and low-level SDK (for API-level commands)
- Programming language specific (e.g. JavaScript, Java, Python, PHP, Go etc.)

AWS Cloud Development Kit (AWS CDK)

AWS CloudFormation

AWS CloudTrail

Amazon CloudWatch

Amazon CloudWatch Logs

AWS Systems Manager

Networking and Content Delivery:

Amazon API Gateway

Amazon CloudFront

Elastic Load Balancing (ELB)
Scalability vs High Availability
- Vertical scalability (e.g. increase the instance size of EC2)
- Horizontal scalability (add a load balancer / Auto Scaling group)
- Scalability is linked to, but different from, high availability (which means your application is running in at least 2 data centers, to survive the loss of a data center) (e.g. Auto Scaling group multi AZ / load balancer multi AZ)
Load balancing: forward traffic to multiple servers downstream (e.g. EC2 instances)
ELB: managed load balancer
- More cost effective compared with setting up your own load balancer; AWS guarantees it is working and handles upgrades / maintenance etc.
- Integrates with many AWS offerings / services: e.g. EC2, EC2 Auto Scaling groups, ECS, ACM, CloudWatch, Route 53, AWS WAF, AWS Global Accelerator
Health Checks: Done by ELB on a port and route (/health eg.)
to check for a 200 response to make sure the downstream server is healthy

Types of load balancer:

Classic Load Balancer (CLB)
- Protocols: HTTP, HTTPS, TCP, SSL
- Don't worry about it (replaced with v2 load balancers)

Application Load Balancer (ALB)
- Protocols: HTTP, HTTPS, WebSocket
- Target groups: EC2 (or Auto Scaling groups), private IPs, ECS, Lambda (via HTTP JSON)
- Uses: standard load balancer for general purpose websites etc.
- Features:
  - Load balance to multiple applications (e.g. containers) on the same machine or different machines (target groups, even Lambda functions)
  - Can support redirects from HTTP to HTTPS
  - Supports query string / parameter routing
  - Supports sticky sessions
  - Port mapping

Network Load Balancer (NLB)
- Protocols: TCP, TLS, UDP
- Target groups: EC2, private IPs, Application Load Balancer
- Uses:
  - High throughput, low latency load balancer (millions of requests per sec)
  - Static IP provisioning: has one static IP per AZ and supports assigning an Elastic IP (helpful for whitelisting a specific IP)
  - Health check supports TCP, HTTP and HTTPS protocols
- Features:
  - You can set up an NLB (which has a static IP) in front of an ALB to enable a static IP address for your HTTP traffic
  - Supports sticky sessions

Gateway Load Balancer (GWLB)
- Protocols: GWLB
- Target groups: EC2 instances, private IPs
- Uses: deploy, scale and manage a fleet of 3rd party network virtual appliances (e.g. firewalls, intrusion detection systems, payload manipulation)
- Features:
  - Operates at layer 3 (network layer)
  - Supports sticky sessions

Security Groups / Use cases
- Load balancers can have security groups that allow HTTP traffic, and the application's security group can then be set up to allow access from the load balancer's security group only.
- IP addresses to load balance to must be private IPs
Sticky sessions / Session affinity
- Ensure a user's requests are always routed to the same target
- CLB and ALB use a cookie with an expiration date
- Use case: make sure the user doesn't lose their session data
- Enabling stickiness may cause EC2 instances to not be equally balanced
- Application-based cookies: check for custom attributes required by the application
- Duration-based cookies: generated by the load balancer
Cross-zone load balancing (cross-AZ load balancing)
- Enabled: each load balancer instance distributes evenly across all registered instances in all AZs
- Disabled: requests are distributed only among the instances behind that node of the Elastic Load Balancer (i.e. in its own AZ)
- ALB: enabled by default (can be disabled at target group level), no charge for inter-AZ data
- NLB: disabled by default (you pay charges for cross-AZ data)
SSL / TLS
- Can use an SSL / TLS certificate between your clients and your load balancer to allow encryption in transit (in-flight encryption); TLS is newer
- Managed via AWS ACM (Certificate Manager): the load balancer uses an X.509 certificate, but you can upload your own certificates to ACM
- Set a default certificate on the HTTPS listener (with an optional list of domains)
- Clients can use SNI (Server Name Indication) to indicate the hostname they would like to reach in the initial SSL handshake
- SNI solves the problem of loading multiple SSL certificates onto one web server (you may have more than one domain's SSL certificate at the ALB level)
- SNI only works for ALB / NLB and CloudFront
Connection Draining / Deregistration delay
- Can set a time to complete in-flight requests while target group instances are de-registering or unhealthy. While an instance is draining, new requests are routed to other instances
- Default 300 secs, can be set to 0-3600 secs

Auto Scaling groups (ASG)
- ASG is free (you only pay for the underlying EC2 instances).
- Automatically scales out (adds EC2 instances) or scales in (removes EC2 instances) to match load, per requirements
- Configured via a Launch Template containing:
  - AMI + instance type
  - EC2 User Data
  - EBS volumes
  - Security Groups
  - SSH key pair
  - IAM roles for your EC2 instances
  - Network + subnets information
  - Load balancer information
- Also specify minimum size, maximum size and initial capacity (number of instances)
- Can attach security groups to an ASG just like EC2
- Scaling policies: can be based on a CloudWatch alarm (e.g. average CPU, or another metric)
  - Dynamic scaling:
    - Target tracking scaling (e.g. I want the average ASG CPU to stay at around 40%)
    - Simple / step scaling (e.g. when a CloudWatch alarm is triggered (example: CPU > 70%), then add 2 units)
  - Scheduled scaling: e.g. increase the min capacity to 10 at 5pm on Fridays (e.g. shopping site with a specific sale or predictable workload)
  - Predictive scaling: continuously forecast load and schedule scaling ahead
- Scaling cooldown = time period after a scaling activity during which another scale-in or scale-out is not allowed to happen (default 300 sec), to allow metrics to stabilise
- Instance Refresh = after updating the launch template, you can recreate all EC2 instances (can specify a minimum healthy percentage and a warm-up time, i.e. the time before an instance can be used)

Amazon Route 53

Amazon VPC
- A VPC can only exist within one Region
- Private subnet: within one Availability Zone only (one subnet cannot span two or more AZs); for backend systems like DBs and app servers; not accessible from the internet
- Public subnet: can have multiple subnets in the same AZ (e.g.
publicly accessible web servers)
- CIDR block: lets you specify the size of the network, between a /16 netmask (65,536 IP addresses) and a /28 netmask (16 IP addresses); the netmask determines the total number of available hosts for the network
  - IPv4 / IPv6 CIDR range
  - First 4 and last 1 IP addresses are reserved for Amazon
- DHCP options set
  - Automatically provisions IP addresses for EC2 instances and other resources
  - Configures DNS, NetBIOS name server and NTP
- NAT devices
  - Enable EC2 instances in a private subnet to connect to the public internet or other AWS services (the NAT device lives in a public subnet and has a route to the Internet Gateway)
  - Like a gateway, but prevents the public internet from initiating connections with your private EC2 instances
  - Two types:
    - NAT Instance: virtualized, running in EC2, managed by the customer, not highly scalable or available
    - NAT Gateway: managed by AWS, highly available and scalable. Associated with a particular AZ only (so you can implement redundancy by creating one in each AZ separately)
- Route table
  - Controls the network traffic in your VPC through subnet routing
  - Allows access between subnets / to the internet
  - One route table can be associated with multiple subnets, but each subnet must have exactly one route table associated.
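The CIDR sizing figures above (/16 gives 65,536 addresses, /28 gives 16, with 5 reserved per subnet) can be checked with Python's standard `ipaddress` module. This is a small sketch; the function names and the 10.0.0.0/16 range are illustrative assumptions, not anything AWS-specific.

```python
import ipaddress

# Per the notes above: the first 4 and the last 1 address of each subnet are reserved.
AWS_RESERVED_PER_SUBNET = 5


def subnet_capacity(cidr: str) -> tuple[int, int]:
    """Return (total addresses, addresses left after the 5 reserved ones)."""
    network = ipaddress.ip_network(cidr)
    total = network.num_addresses
    return total, total - AWS_RESERVED_PER_SUBNET


def split_into_subnets(vpc_cidr: str, new_prefix: int) -> list[str]:
    """Carve a VPC range into equally sized subnets, e.g. one or more per AZ."""
    vpc = ipaddress.ip_network(vpc_cidr)
    return [str(subnet) for subnet in vpc.subnets(new_prefix=new_prefix)]


if __name__ == "__main__":
    for cidr in ("10.0.0.0/16", "10.0.1.0/24", "10.0.0.0/28"):
        total, usable = subnet_capacity(cidr)
        print(f"{cidr}: {total} addresses, {usable} usable")
    print(split_into_subnets("10.0.0.0/16", 18))
```

Splitting a /16 into four /18 ranges, for example, leaves room for a public and a private subnet in each of two AZs.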
- VPC peering
  - Connects two VPCs privately using AWS' network, making them behave as if they were in the same network
  - The VPCs must not have overlapping CIDRs (IP address ranges)
  - A VPC peering connection is not transitive (if a is connected to b, and b is connected to c, then a is not connected to c unless a direct connection exists)
- VPC Endpoints
  - Endpoints allow you to connect to AWS services using a private network instead of the public internet
  - Enhanced security and lower latency when accessing AWS services
- Site-to-Site VPN: connects an on-premises VPN to AWS (encrypted over the public internet)
- Direct Connect (DX): physical connection between on-premises and AWS; a secure, fast and private network
- Security features
  - Network ACL
    - Firewall to allow or deny at a subnet level: explicitly allow or deny traffic by port / IP address / destination
  - Security Groups
    - Work at instance level (e.g. EC2), on the ENI (Elastic Network Interface)
    - Can only specify ALLOW rules, not DENY
    - Inbound traffic is denied by default; outbound traffic is allowed by default
    - A security group rule consists of an IP / port (e.g. SSH) or other security groups
    - Rules can be added to authorize another security group (useful for load balancers, where EC2 instances can connect without needing to specify IPs all the time)
- Virtual private gateway / public gateway:
  - Internet Gateway: allows connection to the internet at the VPC level
  - Customer Gateway: a virtual private gateway can be used to establish an AWS Direct Connect connection to a Customer Gateway (which could be a hardware or virtual gateway in the customer's own on-premises data centre) at VPC level
- A VPC Endpoint can be used to connect an EC2 instance in the VPC to AWS global services like AWS Lambda / S3; traffic does not pass through the internet
- Amazon S3 and DynamoDB have a VPC gateway endpoint available.
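The two VPC peering rules noted above (no overlapping CIDRs, and no transitive connectivity) can be modelled in a few lines of Python. This is only a toy model under stated assumptions: the VPC names, CIDR ranges, and the `can_peer` / `is_reachable` helpers are hypothetical, and nothing here calls AWS.

```python
import ipaddress


def can_peer(cidr_a: str, cidr_b: str) -> bool:
    """Two VPCs can only be peered if their CIDR ranges do not overlap."""
    return not ipaddress.ip_network(cidr_a).overlaps(ipaddress.ip_network(cidr_b))


def is_reachable(peerings: set, src: str, dst: str) -> bool:
    """Peering is not transitive: only a direct peering connection counts."""
    return frozenset((src, dst)) in peerings


# Hypothetical setup: a <-> b and b <-> c are peered, a <-> c is not.
peerings = {
    frozenset(("vpc-a", "vpc-b")),
    frozenset(("vpc-b", "vpc-c")),
}

if __name__ == "__main__":
    print(can_peer("10.0.0.0/16", "10.1.0.0/16"))      # disjoint ranges
    print(can_peer("10.0.0.0/16", "10.0.128.0/20"))    # second range sits inside the first
    print(is_reachable(peerings, "vpc-a", "vpc-c"))    # no direct peering exists
```

Using `frozenset` pairs keeps the lookup symmetric, which matches the idea that a peering connection works in both directions once it is established.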
- VPC Flow Logs: capture information about IP traffic going to instances; there are also subnet flow logs and ENI (Elastic Network Interface) flow logs
  - Monitor network traffic through the VPC
  - Can be sent to S3, CloudWatch, or Kinesis Data Firehose
- Example architectures:
  - Three-tier architecture: understanding of the general diagram (e.g. users connect via Route 53; Tier 1: ELB in a public subnet; Tier 2: EC2 Auto Scaling group in private subnets; Tier 3: RDS / ElastiCache in a private subnet)
  - LAMP stack on EC2
  - WordPress on AWS

Security, Identity, and Compliance:

AWS Certificate Manager (ACM)

Amazon Cognito

AWS Identity and Access Management (IAM)
Best Practices
- Root account is created by default; it shouldn't be used or shared
- Do not use root (except for AWS account setup)
- Always apply the principle of least privilege
Users
- Users are people in your organisation and can be grouped
- Users can be part of multiple groups
Groups
- Can group users together (e.g. developers / sales etc.)
Policies
- Users or groups can be assigned policies (which are JSON documents)
- If a policy is attached at group level, all users in the group get the policy
- Policy structure:
  - Version: policy language version, e.g. 2012-10-17
  - Id: identifier for the policy (optional)
  - Statement: one or more individual statements (required):
    - Sid: identifier for the statement (optional)
    - Effect: Allow / Deny
    - Principal: which account / user / role the policy applies to, e.g. "AWS": "arn:aws:iam::123456789012:root" for the root user
    - Action: list of actions this policy allows or denies (e.g. s3:GetObject, s3:PutObject); supports wildcards, e.g. s3:Get* or just *
    - Resource: list of resources to which the actions apply; supports * as a wildcard
    - Condition: conditions for when this policy is in effect (optional)
Roles
- Can assign permissions to AWS services (e.g. EC2 instances / Lambda functions)
Password policy
- Can be set up for minimum length, specific character types (e.g.
uppercase, numbers, lowercase, special characters)
- Can be set up for password expiry
- Can be enabled to prevent password reuse
- Can be set up to allow IAM users to change their own passwords
MFA (Multi-factor authentication)
- MFA = password you know + security device you own
- Ideally should apply to root and all IAM users
- Supports:
  - Virtual MFA device (Google Authenticator / Authy)
  - U2F (Universal 2nd Factor) security key hardware
  - Hardware key fob MFA device (e.g. Gemalto)
  - Hardware key fob MFA device for AWS GovCloud (US) (e.g. SurePassID)
Security tools
- IAM Credentials Report (account level): a report that lists all of the account's users and the status of their various credentials (CSV)
- IAM Access Advisor: tool in the AWS Management Console that shows the service permissions granted to a user and when those were last accessed
Shared Responsibility of IAM:
- AWS:
  - Infrastructure
  - Configuration and vulnerability analysis
  - Compliance validation
- You:
  - Users, groups, roles, policies, monitoring
  - MFA enablement / access key rotation policy (should be often)
  - Applying appropriate permissions in IAM
  - Analysing access patterns and reviewing permissions
AWS Access Keys:
- Generated through the AWS Console; users are responsible for their own access keys (should be kept secret, just like a password)
- Access Key ID ~= username, Secret Access Key ~= password

AWS Key Management Service (AWS KMS)
- Can be used to encrypt EBS at rest

AWS Private Certificate Authority

AWS Secrets Manager

AWS Security Token Service (AWS STS)

AWS WAF

Storage:

Amazon Elastic Block Store (Amazon EBS)
General
- Network drive you can attach to your instances (like a network USB stick); low latency compared to S3
- Persists data even after EC2 termination
- Bound to a specific AZ (need to snapshot it to move it across)
- Can be detached from one EC2 instance and attached to another one quickly
- Has a provisioned capacity (IOPS, space etc.); IOPS = input/output operations per second
- Delete on Termination attribute: controls EBS behaviour when the EC2 instance terminates, by
default (can be set / changed using the AWS console / AWS CLI)
  - By default the root EBS volume is deleted
  - By default any other EBS volume is not deleted
EBS Snapshots
- Backup of your EBS volume
- Can copy snapshots across AZs
- Other features:
  - Can move to the archive tier (75% cheaper)
  - Recycle Bin for snapshots: prevents accidental deletion; specify retention (1 day to 1 year)
  - Fast Snapshot Restore (FSR): no latency on first use (full initialization of the snapshot); very expensive
AMI overview
- Customization of an EC2 instance: add your own software, OS, monitoring. Pre-packages the software with the EC2 instance
- An AMI is Region specific
- EC2 instances can be launched from public AMIs (made by AWS), private AMIs (made yourself) or AWS Marketplace AMIs (made, and potentially sold, by someone else)
EBS Volume types
- gp2 / gp3: cost-effective storage, low latency, general purpose SSD volumes (can be used for the boot volume). gp3 can set throughput and storage independently, whereas for gp2 they are preconfigured together (gp2 is older). Up to 16,000 IOPS
- io1 / io2: Provisioned IOPS SSD, for applications that need sustained IOPS performance, e.g. database workloads sensitive to storage performance and consistency
  - io1 can set IOPS independently, up to 64,000 IOPS for Nitro instances; io2 (Block Express) has a max PIOPS of 256,000 with an IOPS to GiB ratio of 1,000:1. If you want over 32,000 IOPS you need Nitro
  - Supports the EBS Multi-Attach feature
  - Can be used for the boot volume
- st1: hard disk drives (HDD), suitable for big data, data warehouses, log processing (500 IOPS)
- sc1: cold HDD, suitable for archiving
EBS Multi-Attach
- Attach the same EBS volume to multiple EC2 instances in the same AZ
- Each instance has full read / write permissions to the high-performance volume
- Use case: achieve higher application availability in clustered Linux applications (e.g. Teradata)
- Applications must manage concurrent write operations
- Up to 16 EC2 instances at a time
- Must use a file system that's cluster-aware (not XFS, EXT4..)

EC2 Instance Store
- High performance, low latency, better I/O performance
- Hardware disk physically attached to the EC2 host (not a network drive)
- Ephemeral storage: data is lost if the EC2 instance is stopped
- Good as a buffer / cache / scratch data / temporary content

Amazon Elastic File System (Amazon EFS)
- Managed NFS (network file system that can be mounted on many EC2 instances)
- EFS can work across multiple AZs
- Highly available, scalable, expensive (3x the cost of gp2), and pay per use
- Use cases: content management, web serving, data sharing, WordPress
- Only compatible with Linux-based AMIs (not Windows)
- Encryption at rest with KMS
- POSIX file system with a standard file API
- File system scales automatically: no capacity planning, pay per use
Scalability and performance modes
- EFS scale: 1000s of concurrent NFS clients, 10 GB+ throughput; grows to petabyte scale automatically
- Performance mode (set at EFS creation time):
  - General purpose: use case is general sites
  - Max I/O: e.g. big data
- Throughput mode:
  - Bursting
  - Provisioned
  - Elastic
Storage tiers:
- Standard: for frequently accessed files
- Infrequent Access (EFS-IA): cost to retrieve files, lower price to store
- Archive: rarely accessed data (a few times per year), 50% cheaper
- Can implement lifecycle policies to move files between storage tiers
Availability and durability
- Standard = multi-AZ
- One Zone: great for dev, backup enabled by default, compatible with IA (EFS One Zone-IA)

Amazon S3
- Highly durable and available data (eleven 9's of durability). Serverless storage
- Different tiers

Amazon S3 Glacier