Cloud Computing Roles and AWS Well-Architected Framework PDF
Summary
This document summarizes common cloud computing roles, focusing on IT professionals, IT leaders, developers, DevOps engineers, and cloud architects. It also explains the AWS Well-Architected Framework and best practices for building solutions on AWS, and covers core AWS services such as Amazon S3, EC2, IAM, and databases, along with the challenges these services were built to address.
Cloud Computing Roles Summary In the field of cloud computing, various roles exist, each with specific skills and responsibilities. Understanding these roles is essential for anyone looking to start or transition into a cloud computing career. Below are some common roles in cloud computing: 1. IT Professional Skills and Responsibilities: ○ Generalists who manage applications and production environments. ○ Highly technical with varying experience in cloud technologies. ○ May specialize in areas like security or storage. Job Titles: ○ IT Administrator ○ Systems Administrator ○ Network Administrator 2. IT Leader Skills and Responsibilities: ○ Lead teams of IT professionals. ○ Oversee day-to-day operations and manage budgets. ○ Stay informed about technologies and choose new ones for projects. ○ Hands-on in early project stages, then delegates tasks. Job Titles: ○ IT Manager ○ IT Director ○ IT Supervisor 3. Developer Skills and Responsibilities: ○ Write, test, and fix code. ○ Focus on application-level project development. ○ Work with APIs and SDKs, using sample code. ○ May specialize in areas such as security or storage. Job Titles: ○ Software Developer ○ System Architect ○ Software Development Manager 4. DevOps Engineer Skills and Responsibilities: ○ Build and maintain the infrastructure for applications, often in the cloud. ○ Follow guidelines provided by cloud architects. ○ Experiment to enhance deployment processes. Job Titles: ○ DevOps Engineer ○ Build Engineer ○ Reliability Engineer 5. Cloud Architect Skills and Responsibilities: ○ Stay updated on new technologies and determine which to use. ○ Provide documentation, processes, and tools for developers. ○ Facilitate developer innovation while managing costs, performance, reliability, and security. Job Titles: ○ Cloud Architect ○ Systems Engineer ○ Systems Analyst Course Perspective Throughout this course, participants will take on the perspective of a cloud architect. This involves understanding the architecture of applications, selecting appropriate technologies based on business needs, and addressing challenges related to resource management, cost optimization, and best practices for performance, reliability, and security. The roles described align with the principles outlined in the AWS Well-Architected Framework, which will be discussed in detail throughout the course. Section 1 Cloud Architecting Introduction to Cloud Architecting Overview of Cloud Computing and AWS Early Challenges: In the early 2000s, Amazon faced difficulties in creating an ecommerce service for third-party sellers. Their initial attempts at building a scalable and highly available shopping platform were hindered by poorly planned architectures and long development times, often taking three months just to establish database and storage components. Transition to Cloud Services: To address these issues, Amazon introduced well-documented APIs, allowing for better organization and development. In 2006, they launched Amazon Web Services (AWS), starting with core services like Amazon Simple Queue Service (SQS), Amazon Simple Storage Service (S3), and Amazon Elastic Compute Cloud (EC2). What is Cloud Architecture? Definition: Cloud architecture involves designing solutions that leverage cloud services to fulfill an organization’s technical requirements and business objectives. Analogy: Similar to constructing a building, cloud architecture requires a solid foundation. Here’s how the process parallels construction: 1. 
Customer Requirements: The customer defines the business needs. 2. Architectural Design: The cloud architect creates a design blueprint to meet these needs. 3. Implementation: The delivery team constructs the solution based on the architect’s design. Benefits: Well-architected systems enhance the likelihood that technology solutions will align with and support business goals. Role of a Cloud Architect Key Responsibilities: ○ Planning: Develop a technical cloud strategy in collaboration with business leaders and analyze solutions to meet business needs. ○ Research: Investigate cloud service specifications, workload requirements, and existing architectures; design prototype solutions. ○ Building: Create a transformation roadmap with milestones, manage adoption and migration efforts. Engagement: Cloud architects work closely with decision-makers to identify business goals and ensure that technology deliverables align with these objectives. They collaborate with delivery teams to ensure the appropriate implementation of technology features. Knowledge: They possess in-depth knowledge of architectural principles and are responsible for: ○ Developing cloud strategies based on business needs. ○ Assisting with cloud migration. ○ Reviewing workload requirements and addressing high-risk issues. ○ Implementing the AWS Well-Architected Framework. Key Takeaways Cloud Architecture: It is about applying cloud characteristics to create solutions that meet technical needs and business use cases. AWS Services: These services enable the creation of highly available, scalable, and reliable architectures. Role of Cloud Architects: They play a crucial role in managing cloud architectures, ensuring that solutions are well-aligned with business objectives and leveraging best practices for implementation. Section 2 AWS Well-Architected Framework Overview The AWS Well-Architected Framework offers a systematic approach to evaluate cloud architectures and implement best practices. Developed from reviewing thousands of customer architectures, it consists of six pillars: Operational Excellence, Security, Reliability, Performance Efficiency, Cost Optimization, and Sustainability. Six Pillars of the AWS Well-Architected Framework 1. Operational Excellence ○ Focuses on running and monitoring systems to deliver business value. ○ Emphasizes continuous improvement of processes and procedures. ○ Encourages viewing the entire workload as code, enabling automation and efficient operations. 2. Security ○ Ensures the protection of information, systems, and assets while delivering business value. ○ Stresses implementing a strong identity foundation, maintaining traceability, and applying security across all layers. ○ Involves risk assessment and mitigation strategies to prepare for security events. 3. Reliability ○ Addresses the ability of systems to recover quickly from disruptions and dynamically meet compute demand. ○ Helps mitigate issues such as misconfigurations and network problems. ○ Promotes designs that ensure high availability, fault tolerance, and redundancy. 4. Performance Efficiency ○ Aims to maximize performance by efficiently utilizing computational resources and maintaining that efficiency as demand changes. ○ Encourages the democratization of advanced technologies by using vendors to manage complexity. ○ Advocates for “mechanical sympathy,” which involves understanding how systems operate best. 5. Cost Optimization ○ Focuses on measuring efficiency and eliminating unnecessary expenses. 
○ Emphasizes adopting the right consumption model and considering managed services to lower costs. ○ Views cost optimization as an ongoing iterative process that evolves throughout the production lifecycle. 6. Sustainability ○ Concentrates on building architectures that maximize efficiency and minimize waste. ○ Addresses the long-term environmental, economic, and societal impacts of business activities. ○ Involves reducing the downstream impact by choosing efficient hardware, software, and programming practices. Using the AWS Well-Architected Tool (WA Tool) The AWS Well-Architected Tool helps evaluate workloads against AWS best practices. It provides: ○ Guidance: Access to knowledge used by AWS architects. ○ Action Plan: Step-by-step recommendations for building better workloads. ○ Consistent Process: A method to review and measure cloud architectures, enabling identification of next steps for improvement. The tool allows users to define their workloads and answer questions across the six pillars, delivering insights that help minimize operational costs and enhance system reliability. Key Takeaways The AWS Well-Architected Framework provides a structured way to evaluate and improve cloud architectures. Each of the six pillars has foundational questions to assess alignment with cloud best practices. The AWS WA Tool offers resources for reviewing workloads and implementing best practices effectively. Section 3 best practices for building solutions on AWS: 1. Using Loosely Coupled Components Independent Components: Design architectures with independent components to avoid disruptions and scaling issues. Intermediaries: Use managed solutions like load balancers (e.g., Elastic Load Balancing) and message queues to handle failures and scale components. 2. Designing Services, Not Servers Leverage AWS Services: Use a wide range of AWS services beyond traditional servers, such as containers and serverless solutions (e.g., AWS Lambda, Amazon SQS, Amazon DynamoDB). Managed Solutions: Consider using managed services to reduce overhead and improve performance. 3. Choosing the Right Database Solution Match Technology to Workload: Select database solutions based on specific needs like read/write requirements, storage needs, latency, and query nature. AWS Recommendations: Use AWS guidance for selecting an appropriate database service based on your application environment. 4. Avoiding Single Points of Failure Assume Failure: Design architectures that can recover from failures by implementing secondary databases or replicas. Disposable Resources: Treat resources as disposable and design applications to support hardware changes. 5. Optimizing for Cost Variable Expenses: Take advantage of AWS’s pay-as-you-go model to only pay for what you use. Resource Monitoring: Regularly assess resource sizes and types, monitor usage metrics, and shut down unused resources. 6. Using Caching Improve Performance: Minimize redundant data retrieval by temporarily storing frequently accessed data (e.g., using Amazon CloudFront with Amazon S3). Reduced Costs: Caching can lower latency and reduce costs by serving repeated requests from closer locations. 7. Securing Your Entire Infrastructure Layered Security: Build security into every layer of your infrastructure through managed services, logging, data encryption, access controls, and multi-factor authentication (MFA). Granular Access Control: Use principles like least privilege to limit access across your infrastructure components. 8. 
Key Takeaways for Building Solutions on AWS Evaluate trade-offs based on empirical data. Follow best practices including scalability, automation, treating resources as disposable, using loosely-coupled components, and optimizing for cost. This framework and its associated best practices aim to enhance the reliability, efficiency, and security of architectures built on AWS, ultimately leading to better operational performance and cost management. Section 4 AWS Global Infrastructure: AWS Global Infrastructure Overview 1. Key Components: ○ Regions: Geographical areas that consist of two or more Availability Zones (AZs). ○ Availability Zones: Isolated locations within a Region, made up of one or more data centers designed for fault isolation. ○ Local Zones: Extensions of Regions that bring services closer to large population centers for low-latency applications. ○ Data Centers: Facilities where data is processed and stored, typically containing tens of thousands of servers. ○ Points of Presence (PoPs): Network locations that reduce latency by caching content closer to end users. 2. Infrastructure Details: ○ AWS spans 102 Availability Zones across 32 geographic regions. ○ Each Region is connected via a private global network backbone, offering lower costs and consistent network latency. ○ AWS data centers provide high availability, redundancy, and load balancing during failures. 3. Choosing Regions and Availability Zones: ○ The choice of Region often depends on compliance and latency requirements. ○ AWS recommends deploying applications across multiple AZs for resilience against failures. 4. Local Zones and PoPs: ○ Local Zones enable running latency-sensitive applications closer to end users. ○ PoPs enhance content delivery by caching popular data at edge locations, significantly improving performance. 5. Key Takeaways: ○ The AWS Global Infrastructure includes Regions, AZs, and edge locations to support scalable, reliable applications. ○ Effective architecture design considers the geographic location of resources to meet performance and compliance needs. This summary encapsulates the main elements of the AWS Global Infrastructure, providing an overview of its architecture and operational considerations. Module check 1,5,6 d 7 Section 1 AWS Shared Responsibility Model Customer Responsibilities (Security IN the Cloud): ○ Data Management: Customer data, applications, identity and access management. ○ Configuration: Operating system, network, and firewall settings. ○ Encryption: Client-side and server-side data encryption, as well as ensuring data integrity and authentication. ○ Networking: Protecting networking traffic through encryption and identity assurance. AWS Responsibilities (Security OF the Cloud): ○ Infrastructure: Securing the foundational services (compute, storage, database, networking) and the global infrastructure (Regions, Availability Zones, Edge Locations). Security as a Well-Architected Framework Pillar Six Pillars of the Well-Architected Framework: ○ Security ○ Operational Excellence ○ Reliability ○ Performance Efficiency ○ Cost Optimization ○ Sustainability AWS Well-Architected Tool: Helps implement best practices from the framework. Design Principles for the Security Pillar 1. Strong Identity Foundation: Enforce least privilege and separate duties for authorization. 2. Protect Data in Transit and at Rest: Use encryption and tokenization based on data sensitivity. 3. Security at All Layers: Apply multiple security controls across all layers of the architecture. 4. 
Minimize Direct Data Access: Reduce the need for direct access to sensitive data. 5. Maintain Traceability: Monitor and audit actions in real time. 6. Prepare for Security Events: Have policies and tools ready for incident management. 7. Automate Security Best Practices: Use automated security mechanisms to enhance scalability and cost-effectiveness. Key Security Practices User Permissions: Use policies to manage access to AWS resources based on roles and needs. Principle of Least Privilege: Grant minimum permissions required for tasks. Data Protection: ○ In Transit: Use TLS to protect data during transfer. ○ At Rest: Implement client-side and server-side encryption to protect stored data. Takeaways Security is a shared responsibility between AWS and customers. Implementing a strong identity foundation and protecting data are crucial to enhancing security posture in the cloud. This summary encapsulates the essential aspects of AWS security responsibilities and principles, highlighting the collaborative nature of cloud security between AWS and its customers. Section 2 IAM Overview IAM User: Represents an individual (person or application) with a permanent set of credentials to access AWS services. IAM Group: A collection of IAM users granted identical permissions, simplifying permission management. IAM Role: An identity with temporary permissions used for specific tasks; it does not have long-term credentials. IAM Policy: A document defining permissions, specifying what resources can be accessed and the level of access granted. Authentication Credentials AWS Management Console: Requires a username and password for sign-in. AWS CLI and API Calls: Use an access key (combination of an access key ID and secret key) for programmatic access. Best Practices for Secure Access 1. Principle of Least Privilege: Assign minimum permissions necessary for users. 2. Enable MFA: Adds an extra security layer for the root user and other accounts. 3. Use Temporary Credentials: Prefer temporary credentials over long-term ones for human users. 4. Rotate Access Keys: Regularly rotate access keys to maintain security. 5. Use Strong Passwords: Ensure all accounts have complex passwords. 6. Secure Local Credentials: Store sensitive credentials securely. 7. Utilize AWS Organizations: Manage multiple AWS accounts centrally. 8. Enable AWS CloudTrail: Monitor actions taken in the account for security auditing. 9. Protect the Root User: Limit use of the root user account and monitor its activity. Protecting the Root User Create an Admin User: For daily tasks, create a user with administrative privileges instead of using the root account. Use MFA: Always set up multi-factor authentication for the root user. IAM User and Group Management Attach Policies to Groups: For efficient access control, attach IAM policies to groups rather than individual users. IAM Roles Temporary Security Credentials: Roles provide temporary credentials without long-term commitments. Common Use Cases: Used for applications running on EC2, cross-account access, and mobile applications. Examples of IAM Role Usage 1. EC2 Instance Access: An IAM user accesses an application on an EC2 instance by assuming a role. 2. S3 Bucket Access: An application on EC2 accesses an S3 bucket through an attached role. 3. Cross-Account Access: An IAM user in one account accesses resources in another account through a trusted role. Key Takeaways Utilize IAM for fine-grained access control. Avoid using the root account for everyday tasks. 
Use groups for managing user permissions effectively. Roles allow for temporary access to resources as needed. This summary encapsulates the fundamental concepts and best practices related to IAM in AWS, emphasizing the importance of security and proper access management. Section 3 authorizing users with IAM policies in AWS: IAM Policies and Permissions 1. Types of Policies: ○ Identity-based Policies: Attached to IAM users, groups, or roles. These policies specify what actions the identity can perform. ○ Resource-based Policies: Attached to AWS resources. They define what actions specific users or groups can perform on those resources. 2. Policy Structure: ○ Policies are defined in JSON format. ○ Each policy specifies the resources and operations that are allowed or denied. ○ Follow the principle of least privilege, granting only the permissions necessary for a user or resource to perform their tasks. Permission Evaluation Logic 1. Default Deny: By default, all requests are denied. 2. Explicit Allow and Deny: ○ An explicit allow will override the default deny. ○ An explicit deny will override any explicit allow. 3. Implicit Deny: If there is no applicable explicit allow or deny, access is denied by default. 4. Conflict Resolution: In cases where one policy allows an action and another denies it, the more restrictive (deny) policy takes precedence. IAM Policy Simulation The IAM policy simulator is a tool that helps test and troubleshoot policies to determine if access will be granted to IAM entities. Examples of Policies 1. Example 1: ○ Identity-based Policy: Grants Bob permission to GET, PUT, and LIST objects in bucket X. ○ Resource-based Policy: Allows Bob to GET and LIST, but denies PUT. Therefore, Bob cannot PUT objects into bucket X despite the identity-based policy. 2. Example 2: ○ Bob's identity-based policy allows LIST for bucket Y but does not specify GET or PUT. ○ The resource-based policy for bucket Y allows GET and LIST, but not PUT. Therefore, Bob can read objects from the bucket. Key Takeaways A policy defines permissions associated with an identity or resource. IAM supports two types of policies (identity-based and resource-based) to control access. Permissions determine the outcome of requests: by default denied, overridden by explicit allows or denies. This overview encapsulates the concepts of IAM policies and permissions, emphasizing their role in securing access to AWS resources. Section 4 IAM (Identity and Access Management) policies IAM Policy Structure and Concepts 1. Policy Document Structure: ○ Version: Specifies the policy language version (e.g., "2012-10-17"). ○ Statement: Contains one or more individual permission statements. Effect: Indicates whether access is allowed (Allow) or denied (Deny). Principal: Defines the user, account, role, or federated user that the policy applies to (for resource-based policies). Action: Lists the actions that are allowed or denied (e.g., s3:GetObject). Resource: Specifies the resources to which the actions apply (e.g., Amazon Resource Names, or ARNs). Condition (optional): Specifies conditions that must be met for the policy to apply. 2. Types of Policies: ○ Resource-based Policy: Attached to resources like S3 buckets or DynamoDB tables. Allows specifying the principal and can grant access to other accounts. ○ Identity-based Policy: Attached to IAM users, groups, or roles. Specifies what actions the principal can perform on specified resources. 3. 
Policy Evaluation Logic: ○ AWS applies a logical OR across statements when evaluating permissions. If any statement allows access, the permission is granted, unless there is an explicit deny. 4. Explicit Deny and Allow: ○ An explicit deny takes precedence over any allow statement. This means that even if a user has permission to access a resource, if a deny statement applies, access will be denied. 5. Examples of Policies: ○ Resource-based Policy Example: Allows any S3 action on specified S3 resources but denies all other actions on DynamoDB or S3 resources not listed. ○ Identity-based Policy Example: Allows a user to manage their IAM login profile, access keys, and SSH keys. ○ Cross-account Policy Example: Allows one AWS account to access resources in another account. 6. CIDR and IP Conditions: ○ Policies can include conditions based on IP addresses using CIDR notation, allowing or denying actions based on the source IP address. 7. Key Takeaways: ○ IAM policies are JSON documents defining permissions. ○ Key elements include effect, action, and resources, which together determine what permissions are granted or denied. ○ Understanding the structure and evaluation of IAM policies is crucial for managing access to AWS resources securely. Conclusion IAM policies play a vital role in securing access to AWS resources. By understanding their structure and how to apply conditions effectively, you can implement a robust access control strategy in your AWS environment. Module checks Section 1 Defining Amazon S3: Types of Storage 1. Block Storage: Data is stored in fixed-sized blocks, managed by applications and file systems. Each block is identifiable and can be stored efficiently across systems. 2. File Storage: Data is organized in a hierarchical structure, resembling a shared network drive. 3. Object Storage: Data is stored as objects with associated attributes and metadata. Each object contains data, metadata, and a unique identifier (object key). Amazon S3 Overview Object Storage Service: Amazon S3 is designed to store massive amounts of unstructured data. Buckets and Objects: ○ Data is stored as objects within buckets defined by the user. ○ Each bucket has a globally unique name and is associated with a specific AWS Region. ○ Each object can range in size from 0 bytes to a maximum of 5 TB. ○ Objects include a key, version ID, content (immutable), metadata, and sub-resources. Components of Amazon S3 Bucket: A container for objects that organizes the S3 namespace and manages storage costs and access control. Object: The basic entity stored in S3, including data and metadata. Object Key: A unique identifier for each object in a bucket. Folder Structure Using Prefixes Amazon S3 allows for a folder-like structure by using prefixes in object names. This helps organize objects within a bucket for easier retrieval. Benefits of Amazon S3 Durability: 99.999999999% durability, ensuring data is not lost and is redundantly stored across multiple devices. Availability: 99.99% availability, allowing quick access to data with virtually unlimited storage capacity. High Performance: Capable of handling thousands of transactions per second and scales automatically to high request rates. Key Takeaways Amazon S3 is an object storage service that supports massive unstructured data storage, organized into unique buckets and objects. The service offers significant benefits, including durability, availability, and performance. Section 2 use cases for Amazon S3: Common Use Cases for Amazon S3 1. 
Spikes in Demand: ○ Web Hosting: Amazon S3 is ideal for hosting web content, particularly when there's a need to handle extreme spikes in traffic. 2. Static Website Hosting: ○ S3 can host static websites consisting of HTML files, images, and videos. This setup eliminates the need for a dedicated web server, as S3 can serve the content directly via HTTP URLs. It offers a cost-effective solution with high performance, scalability, and availability. 3. Data Store for Computation and Analytics: ○ Amazon S3 is suitable for large-scale analytics and computation tasks, such as financial transaction analysis and media transcoding. It can efficiently handle multiple concurrent transactions, making it a vital component in data processing workflows. 4. Example Workflow: ○ A compute cluster (like Amazon EC2 or EMR) extracts raw data from S3, processes it, and stores the transformed data back in another S3 bucket. Analytics tools, such as Amazon QuickSight, can then analyze this data for insights. 5. Backup and Archiving: ○ Due to its durability and scalability, Amazon S3 is effective for data backup and archival purposes. Data from on-premises data centers and EC2 instances can be backed up to S3 buckets. ○ Cross-Region Replication: To enhance durability, S3 supports cross-Region replication, automatically copying objects from one bucket to others in different Regions. ○ Long-term Storage Options: Users can transition long-term data from S3 standard storage to Amazon S3 Glacier for cost-effective archival. Key Benefits of Amazon S3 Durability: Ensures data integrity with high durability ratings (11 nines). Availability: Offers 99.99% availability, ensuring quick access to data. High Performance: Capable of handling thousands of transactions per second. Summary of S3's Capabilities Amazon S3 serves a broad range of needs, such as: Storing and distributing media files (videos, photos, music). Hosting static content. Supporting data processing for analytics. Providing backup and archival solutions. This versatility makes Amazon S3 a foundational service for many cloud-based applications and workflows. Section 3 Move data - Amazon S3 Amazon S3 Overview Object Storage: Amazon S3 allows for an unlimited number of objects to be stored in a bucket. Objects are automatically encrypted during upload using server-side encryption (SSE-S3). Upload Methods: Objects can be uploaded via: ○ AWS Management Console: A user-friendly interface that allows drag-and-drop uploads (max 160 GB). ○ AWS Command Line Interface (CLI): Command-line operations for bulk uploads or downloads. ○ AWS SDKs: Programmatic uploads using various supported programming languages. ○ Amazon S3 REST API: For direct PUT requests to upload data. Multipart Upload: For files over 100 MB, multipart uploads improve throughput, allow parts to be uploaded independently, and support pause/resume functionality. Key Features of Amazon S3 Transfer Acceleration: Enhances upload speed over long distances by routing data through globally distributed CloudFront edge locations. It can significantly improve upload times for large files. Use Cases: ○ Media Hosting: Storing and distributing videos, photos, and music. ○ Static Website Hosting: Hosting static HTML sites without needing a dedicated server. ○ Data Store for Analytics: Supporting large-scale analytics and computations. ○ Backup and Archiving: Reliable solutions for data backup with options like cross-Region replication and integration with Amazon S3 Glacier for long-term storage. 
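To make the upload options above concrete, here is a minimal boto3 (Python) sketch of a multipart upload using the high-level transfer manager, plus enabling Transfer Acceleration on a bucket. The bucket, key, and file names are placeholders, not values from the course material.

```python
import boto3
from boto3.s3.transfer import TransferConfig

s3 = boto3.client("s3")

# Files larger than the threshold are split into parts that upload in
# parallel and can be retried independently (multipart upload).
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,   # 100 MB
    multipart_chunksize=16 * 1024 * 1024,    # 16 MB parts
    max_concurrency=8,
)

s3.upload_file("backup.tar.gz", "example-bucket",
               "backups/backup.tar.gz", Config=config)

# Optionally enable Transfer Acceleration so long-distance uploads are
# routed through CloudFront edge locations.
s3.put_bucket_accelerate_configuration(
    Bucket="example-bucket",
    AccelerateConfiguration={"Status": "Enabled"},
)
```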
AWS Transfer Family Overview: A fully managed service for transferring files to/from Amazon S3 and Amazon EFS using protocols like SFTP, FTPS, FTP, and AS2. Benefits: ○ Scales in real-time without the need for infrastructure changes. ○ Seamless integration with AWS services for analytics and processing. ○ Serverless file transfer workflows to automate uploads. ○ Pay-as-you-go pricing with no upfront costs. Common Use Cases for Transfer Family With Amazon S3: ○ Data lakes for third-party uploads. ○ Subscription-based data distribution. ○ Internal data transfers. With Amazon EFS: ○ Data distribution and supply chain management. ○ Content management and web-serving applications. Key Takeaways Utilize various methods for uploading objects to S3, including the Management Console, CLI, SDKs, and REST API. Implement multipart upload for large files to optimize transfer efficiency. Leverage Transfer Acceleration for faster uploads over long distances and use Transfer Family for secure file transfers into/out of AWS storage. This summary encapsulates the various functionalities and use cases of Amazon S3 and AWS Transfer Family, providing a clear understanding of their capabilities and applications. Section 4 Storing Data s3 Amazon S3 Versioning 1. Deleting Objects: ○ When an object is deleted in a version-enabled bucket, all versions remain in the bucket, and a delete marker is inserted. For example, if photo.gif is deleted, versions can still be accessed using their IDs (e.g., 121212 or 111111). 2. Retrieving Versions: ○ Requests for an object key return the most recent version. If the latest version is a delete marker, a 404 error is returned, indicating no object found. ○ Specific versions can be retrieved by using their version ID. For example, a GET request for photo.gif with version ID 121245 will return that specific version. 3. Permanently Deleting Objects: ○ Only the owner of the bucket can permanently delete a specific object version using its version ID. This action does not add a delete marker, making the version irrecoverable. Cross-Origin Resource Sharing (CORS) CORS allows client web applications loaded in one domain to interact with resources in a different domain. By creating a CORS configuration, you specify: ○ Allowed origins. ○ Supported operations (e.g., GET requests). This is useful for scenarios such as hosting web fonts in S3 buckets that need to be accessed by webpages from different domains. Amazon S3 Data Consistency Model Amazon S3 provides strong read-after-write consistency for all GET, LIST, PUT, and DELETE operations on objects, simplifying the migration of on-premises analytics workloads. If a PUT request is successful, any subsequent GET or LIST request will return the most recently written data. While object operations are strongly consistent, bucket configurations may have eventual consistency; for instance, a deleted bucket might still appear in listings shortly after deletion. Key Takeaways S3 Standard Storage is ideal for cloud applications, dynamic websites, and big data analytics. Implementing an S3 lifecycle policy enables automatic transfer of data between storage classes without application changes. Versioning helps recover from unintended deletions and application failures. Understanding CORS is essential for building web applications that interact with S3 resources. The data consistency model in S3 supports seamless migration of existing workloads by eliminating the need for additional infrastructure changes. 
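The versioning and delete-marker behavior described in this section can be sketched with boto3 as follows; the bucket and object key are hypothetical, and the snippet assumes the bucket already exists.

```python
import boto3

s3 = boto3.client("s3")
bucket = "example-bucket"   # hypothetical bucket name

# Turn on versioning so overwrites and deletes are recoverable.
s3.put_bucket_versioning(
    Bucket=bucket,
    VersioningConfiguration={"Status": "Enabled"},
)

s3.put_object(Bucket=bucket, Key="photo.gif", Body=b"v1")
s3.put_object(Bucket=bucket, Key="photo.gif", Body=b"v2")

# A plain DELETE only inserts a delete marker; older versions remain,
# and a plain GET on the key now returns a 404.
s3.delete_object(Bucket=bucket, Key="photo.gif")

# Any stored version can still be retrieved by its version ID.
versions = s3.list_object_versions(Bucket=bucket, Prefix="photo.gif")
oldest_id = versions["Versions"][-1]["VersionId"]
obj = s3.get_object(Bucket=bucket, Key="photo.gif", VersionId=oldest_id)
print(obj["Body"].read())   # b"v1"
```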
This summary captures the core ideas and functionality of Amazon S3 versioning, CORS, and data consistency Section 5 Amazon S3 Tools for Protection 1. Block Public Access: Prevents public access to buckets. 2. IAM Policies: Authenticate users for specific bucket/object access. 3. Bucket Policies: Define access rules for buckets or objects, useful for cross-account access. 4. Access Control Lists (ACLs): Set specific rules for bucket/object access (less commonly used). 5. S3 Access Points: Customized network endpoints for applications. 6. Presigned URLs: Provide temporary access to objects. 7. AWS Trusted Advisor: Checks bucket permissions for global access. Approaches to Access Configuration 1. Default Security Settings: All new buckets and objects are private. 2. Controlled Access: Specific users granted access while others are denied. 3. Public Access: Not recommended for sensitive data; suitable for static websites only. Considerations for Choosing a Region Data Privacy Laws: Compliance with regulations in the selected region. User Proximity: Lower latency improves user experience. Service Availability: Some AWS services may not be available in all regions. Cost-Effectiveness: Costs vary by region; consider overall expenses. Amazon S3 Inventory Helps manage storage, audit, and report on object statuses. Provides scheduled reports in various formats (CSV, ORC, Parquet). Cost Structure Pay-as-you-go: Charges based on stored data, requests made, and data transfer. Free Tier: Offers 5 GB of storage and limited requests for new AWS customers. Example Activity Controlled Access Setup: Create folders for content and IAM policies for employee-level access. Key Takeaways S3 buckets are private by default; controlled access is crucial for security. SSE-S3 is the standard encryption method. Consider multiple factors when selecting a region, including compliance and cost. Understand the cost structure to manage expenses effectively. This summary encapsulates the essential features, configurations, and considerations related to Amazon S3. Section 6 key points from the AWS Training and Certification Module on adding a storage layer with Amazon S3, particularly in the context of the AWS Well-Architected Framework principles: AWS Well-Architected Framework for Storage 1. Security ○ Data Protection: Enforce encryption at rest to ensure confidentiality. Implement access control to limit data exposure. Use regular backups and versioning to safeguard against data loss. ○ Best Practices: S3 buckets and objects are encrypted by default. Access is private by default; explicit permissions are required. Versioning helps protect against accidental deletions or overwrites. 2. Reliability ○ Failure Management: Utilize multi-Availability Zone (AZ) deployments for high availability. Amazon S3 offers 99.999999999% (11 nines) durability and 99.99% (4 nines) availability. Data is redundantly stored across multiple AZs, with regular integrity checks. 3. Performance Efficiency ○ Architecture Selection: Understand available services and choose based on workload needs. Amazon S3 is suitable for storing massive amounts of unstructured data. Performance can be enhanced with S3 Transfer Acceleration and multipart uploads. 4. Cost Optimization ○ Cost-Effective Resources: Conduct cost analyses based on usage patterns over time. Utilize S3’s various storage classes and lifecycle rules to manage costs effectively. S3 Intelligent-Tiering automatically optimizes storage costs based on access patterns.
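As an illustration of the lifecycle rules and S3 Intelligent-Tiering mentioned under Cost Optimization, the following boto3 sketch applies a lifecycle configuration to a hypothetical bucket; the bucket name, prefixes, and day counts are assumptions for the example, not recommendations from the course.

```python
import boto3

s3 = boto3.client("s3")

# Move aging objects to cheaper storage classes and expire old logs.
s3.put_bucket_lifecycle_configuration(
    Bucket="example-log-bucket",             # hypothetical bucket
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "archive-logs",
                "Filter": {"Prefix": "logs/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 30, "StorageClass": "STANDARD_IA"},
                    {"Days": 90, "StorageClass": "GLACIER"},
                ],
                "Expiration": {"Days": 365},
            },
            {
                # Let S3 manage objects with unknown access patterns.
                "ID": "auto-tier-data",
                "Filter": {"Prefix": "data/"},
                "Status": "Enabled",
                "Transitions": [
                    {"Days": 0, "StorageClass": "INTELLIGENT_TIERING"},
                ],
            },
        ]
    },
)
```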
Key Takeaways Amazon S3 supports data protection through default encryption and access restrictions. Its architecture is designed for efficiency and can handle unstructured data at scale. Cost management features, such as lifecycle policies and intelligent tiering, aid in long-term savings. S3’s reliability features, including durability and availability, ensure data resilience. This summary encapsulates how the AWS Well-Architected Framework principles are applied to storage solutions using Amazon S3, focusing on security, reliability, performance, and cost-effectiveness. Module check Section 1 Adding a Compute Layer Using Amazon EC2 AWS compute service categories, Amazon EC2, and the process of provisioning an EC2 instance: 1. Compute Service Categories Differentiators Management Responsibilities: ○ VMs, Containers, VPS: Offer more control over infrastructure, allowing for high customization of the environment. ○ PaaS and Serverless: Abstract infrastructure management, letting developers focus primarily on application code and functionality. Scalability: ○ VMs, Containers, VPS: Provide flexibility to scale manually or automatically based on workload demands. ○ PaaS and Serverless: Automatically manage scaling in response to traffic, reducing manual intervention. Suitability for Workloads: ○ VMs, Containers, VPS: Ideal for workloads requiring specific configurations, high control, or custom applications. ○ PaaS and Serverless: Best suited for applications with variable workloads that need rapid deployment and minimal management overhead. 2. Amazon EC2 Overview Definition: Amazon EC2 (Elastic Compute Cloud) provides virtual servers (EC2 instances) that can be provisioned within minutes. Capabilities: ○ Automatic scaling of capacity based on demand. ○ Pay-as-you-go pricing model for cost efficiency. 3. EC2 Instance Components Physical Host: The underlying server where EC2 instances run. Availability Zones: Geographic locations that contain multiple data centers to ensure redundancy and fault tolerance. VMs: Virtual machines that operate with their own OS and applications. Operating Systems: Various OS options, including Amazon Linux, Microsoft Windows, and macOS. Hypervisor: Software that manages VM access to physical resources like CPU, memory, and storage. 4. Storage Options Instance Store: Temporary storage physically attached to the host. Amazon EBS (Elastic Block Store): Persistent storage that retains data even when instances are stopped. 5. Network Connectivity EC2 instances can connect to other AWS resources, allowing configuration of network access to balance security and accessibility. 6. Use Cases for Amazon EC2 Complete control over computing resources. Cost optimization options through On-Demand, Reserved, and Spot Instances. Hosting a variety of applications, from simple websites to complex enterprise applications. 7. Steps for Provisioning an EC2 Instance 1. Choose an Amazon Machine Image (AMI): The template for launching instances. 2. Select Instance Type: Based on CPU, memory, and storage requirements. 3. Specify a Key Pair: For secure SSH or RDP access. 4. Network Placement: Define how the instance connects to your network. 5. Assign a Security Group: Controls inbound and outbound traffic. 6. Specify Storage Options: Choose between instance store and EBS volumes. 7. Attach an IAM Role: For AWS service API calls. 8. User Data: Automate configurations and installations upon launch. 8. Key Takeaways Amazon EC2 allows running VMs with complete control over resources. 
Essential to choose an AMI and instance type during provisioning. Configuration settings include network, security, storage, and user data. This summary encapsulates the differentiators of AWS compute service categories, the role of EC2, and the key steps for provisioning instances, emphasizing the flexibility and control that AWS provides for various workloads. Section 2 Choosing an AMI to launch an EC2 Instance 1. Definition and Purpose of AMIs AMIs provide the necessary information to launch an Amazon EC2 instance, including: ○ Template for the root volume: Contains the guest operating system (OS) and potentially other software. ○ Launch permissions: Control access to the AMI. ○ Block device mappings: Specify additional storage volumes to attach to the instance at launch. 2. Benefits of Using AMIs Repeatability: AMIs allow efficient and precise instance launches. Reusability: Instances launched from the same AMI are identically configured, making it easy to create clusters or recreate environments. Recoverability: AMIs serve as restorable backups; a new instance can be launched from the same AMI if the original instance fails. 3. Choosing an AMI When selecting an AMI, consider the following characteristics: Region: Each AMI exists in a specific AWS Region. Operating System: Options include various Linux distributions and Microsoft Windows. Storage Type: AMIs can be Amazon EBS-backed (persistent storage) or instance store-backed (temporary storage). Architecture: Options include 32-bit or 64-bit, x86 or ARM instruction sets. Virtualization Type: Use HVM AMIs for better performance. 4. Sources of AMIs Quick Start: AMIs provided by AWS. My AMIs: Custom AMIs created by the user. AWS Marketplace: Pre-configured AMIs from third parties. Community AMIs: AMIs shared by others, used at your own risk. 5. Instance Store-Backed vs. EBS-Backed AMIs Boot Time: EBS-backed instances boot faster. Root Device Size: EBS-backed instances can have a maximum root device size of 16 TiB; instance store-backed instances are limited to 10 GiB. Stopping and Changing Instance Type: EBS-backed instances can be stopped and have their type changed; instance store-backed instances cannot. Cost Structure: Different charging mechanisms for EBS and instance store-backed instances. 6. Instance Lifecycle An instance transitions through states: pending, running, stopping, stopped, and terminated. EBS-backed instances can be stopped and restarted, while instance store-backed instances cannot. 7. Creating New AMIs AMIs can be created from existing EC2 instances, allowing for customized configurations to be captured and reused. 8. EC2 Image Builder Automates the creation, management, and deployment of up-to-date AMIs, providing a graphical interface, version control, and security validations. Key Takeaways AMIs are crucial for launching EC2 instances efficiently and consistently. Benefits include repeatability, reusability, and recoverability. Consider performance and source when choosing an AMI. This overview captures the essential information about AMIs, their benefits, usage, and lifecycle in the context of Amazon EC2. Section 3 Here's a summary of the key points regarding Amazon EC2 instance types and their configuration: 1. EC2 Instance Types Overview Definition: EC2 instance types define configurations of CPU, memory, storage, and network performance. Importance: Choosing the right instance type is crucial for matching workload performance and cost requirements. 2. 
Instance Type Configuration Example Instance Types: ○ m5d.large: 2 vCPUs, 8 GiB memory, 1 x 75 GB NVMe SSD, Up to 10 Gbps network performance. ○ m5d.xlarge: 4 vCPUs, 16 GiB memory, 1 x 150 GB NVMe SSD, Up to 10 Gbps network performance. ○ m5d.8xlarge: 32 vCPUs, 128 GiB memory, 2 x 600 GB NVMe SSD, 10 Gbps network performance. Network Performance: Generally more static across instance sizes; m5d.large and m5d.xlarge both support up to 10 Gbps, while m5d.8xlarge provides a consistent 10 Gbps. 3. Instance Type Naming Convention Components: ○ Family: e.g., c for compute optimized. ○ Generation: Indicated by a number (higher means newer and generally better). ○ Processor Family: Optional indicator (e.g., g for AWS Graviton). ○ Size: Indicates performance category (e.g., xlarge). 4. Suitability for Workloads General Purpose: Suitable for web/app servers, gaming, etc. (e.g., M5, T3). Compute Optimized: Best for compute-bound applications (e.g., C5). Storage Optimized: Designed for high-performance databases (e.g., I3). Memory Optimized: For applications with large datasets (e.g., R5). Accelerated Computing: For machine learning and AI (e.g., P3). High Performance Computing (HPC): For complex simulations (e.g., Hpc7). 5. Choosing the Right Instance Type Considerations: Analyze workload requirements and cost. Start with a smaller instance and scale as necessary. Resources: Use the EC2 console’s Instance Types page and AWS Compute Optimizer for recommendations. 6. AWS Compute Optimizer Functionality: Recommends optimal instance types and configurations based on workload patterns. Classification: Results are classified as Under-provisioned, Over-provisioned, Optimized, or None. This summary captures the essential details regarding EC2 instance types, their configurations, naming conventions, suitability for different workloads, and strategies for selecting the right type based on performance and cost considerations. Section 4 Shared File System Options for EC2 Instances 1. Amazon EBS: ○ Attaches to a single instance at a time. ○ Not suitable for sharing data among multiple instances. 2. Amazon S3: ○ An object store, not a block store. ○ Suitable for storing files, but changes overwrite entire files rather than individual blocks. ○ Not ideal for applications requiring high-throughput and consistent read/write access. 3. Amazon EFS (Elastic File System): ○ Fully managed service for Linux workloads. ○ Scales automatically from gigabytes to petabytes. ○ Supports Network File System (NFS) protocols. ○ Allows multiple EC2 instances to access the file system simultaneously. ○ Common use cases include home directories, enterprise applications, application testing, database backups, web serving, media workflows, and big data analytics. 4. Amazon FSx for Windows File Server: ○ Fully managed shared file system for Microsoft Windows EC2 instances. ○ Supports NTFS and SMB protocols. ○ Integrates with Microsoft Active Directory for permissions and access control. ○ Suitable for home directories, lift-and-shift applications, media workflows, data analytics, web serving, and software development environments. Key Takeaways Storage Options: ○ For a root volume: Use instance store or SSD-backed Amazon EBS. ○ For data volume serving a single instance: Use instance store or Amazon EBS. ○ For data volume serving multiple Linux instances: Use Amazon EFS. ○ For data volume serving multiple Windows instances: Use Amazon FSx for Windows File Server.
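A minimal boto3 sketch of provisioning the shared Linux option above (Amazon EFS): it creates a file system and one mount target so EC2 instances in that subnet can mount it over NFS. The subnet and security group IDs are placeholders, not values from the course.

```python
import boto3

efs = boto3.client("efs")

# Create a shared POSIX file system for multiple Linux EC2 instances.
fs = efs.create_file_system(
    CreationToken="shared-app-data",          # idempotency token (hypothetical)
    PerformanceMode="generalPurpose",
    Encrypted=True,
    Tags=[{"Key": "Name", "Value": "app-shared-storage"}],
)

# One mount target per Availability Zone lets instances in that AZ
# reach the file system over NFS (TCP 2049 must be open).
efs.create_mount_target(
    FileSystemId=fs["FileSystemId"],
    SubnetId="subnet-0123456789abcdef0",       # hypothetical subnet ID
    SecurityGroups=["sg-0123456789abcdef0"],   # hypothetical security group
)

# Each instance then mounts it, e.g.:
#   sudo mount -t nfs4 <fs-id>.efs.<region>.amazonaws.com:/ /mnt/shared
```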
This summary provides a clear overview of the available shared file system options for EC2 instances, focusing on their suitability and primary use cases. Section 5 Shared File System Options for EC2 1. Amazon EBS (Elastic Block Store): ○ Attaches to one instance only; not suitable for sharing. 2. Amazon S3 (Simple Storage Service): ○ Object storage; not ideal for high-throughput applications due to file overwriting behavior. 3. Amazon EFS (Elastic File System): ○ Fully managed service for Linux workloads. ○ Supports NFS protocols and allows multiple EC2 instances to access the same file system. ○ Ideal for applications like web serving, media workflows, and big data analytics. 4. Amazon FSx for Windows File Server: ○ Fully managed shared file system for Windows instances. ○ Supports NTFS and SMB protocols; integrates with Active Directory. ○ Suitable for home directories, lift-and-shift applications, and media workflows. Key Takeaways Root Volume: Use instance store or Amazon EBS. Single Instance Data Volume: Use instance store or Amazon EBS. Multiple Linux Instances: Use Amazon EFS. Multiple Windows Instances: Use Amazon FSx for Windows File Server. This summary highlights the key options and considerations for shared file systems in Amazon EC2 environments. Section 6 Amazon EC2 Pricing Models 1. On-Demand Instances: ○ Pay per second or hour without long-term commitments. ○ Recommended for: Spiky workloads, experimentation, short-term applications. 2. Reserved Instances: ○ Commit to a 1-year or 3-year term for significant discounts on On-Demand prices. ○ Recommended for: Committed workloads, steady-state applications. 3. Savings Plans: ○ Flexible pricing model that offers discounts in exchange for a commitment to a consistent usage amount (in $/hour) over 1 or 3 years. ○ Types: Compute Savings Plans: Most flexibility, applies to various services. EC2 Instance Savings Plans: Less flexible, specific to instance family and region. ○ Recommended for: All EC2 workloads, particularly those needing flexibility. 4. Spot Instances: ○ Bid on unused EC2 capacity for substantial savings. ○ Recommended for: Fault-tolerant, flexible, or stateless workloads. ○ Can be interrupted with a 2-minute notice; billed in increments (1-second for Amazon Linux/Ubuntu, 1-hour for others). 5. Capacity Reservations: ○ Reserve compute capacity in a specific Availability Zone to ensure availability. ○ Types: On-Demand Capacity Reservations: For workloads needing guaranteed capacity. EC2 Capacity Blocks for ML: Reserve GPU instances for machine learning workloads. 6. Dedicated Options: ○ Dedicated Instances: Run on dedicated hardware isolated from other accounts. ○ Dedicated Hosts: Fully dedicated physical servers, useful for licensing and compliance. ○ Recommended for: Workloads requiring dedicated hardware or specific software licenses. Cost Optimization Guidelines Combine Models: Use a mix of purchasing options to optimize costs. ○ Reserved Instances or Savings Plans for steady-state workloads. ○ On-Demand Instances for stateful spiky workloads. ○ Spot Instances for fault-tolerant, flexible workloads. Key Takeaways Models Include: On-Demand, Reserved, Savings Plans, Spot Instances, and Dedicated Hosts. Billing: Per-second billing is available for specific models. Optimization: A combination of the pricing models helps reduce overall EC2 costs effectively. 
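To illustrate the Spot pricing model described above, here is a boto3 sketch that launches a single Spot Instance through run_instances; the AMI ID and instance type are placeholders, and the workload is assumed to be fault tolerant enough to handle the 2-minute interruption notice.

```python
import boto3

ec2 = boto3.client("ec2")

# Request Spot capacity for a fault-tolerant, stateless worker.
# Spot Instances can be interrupted with a 2-minute notice, so the
# workload should checkpoint its progress.
response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",   # hypothetical AMI ID
    InstanceType="c5.large",
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            "SpotInstanceType": "one-time",
            "InstanceInterruptionBehavior": "terminate",
        },
    },
)
print(response["Instances"][0]["InstanceId"])
```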
This summary encapsulates the essential points regarding Amazon EC2's pricing strategies and recommendations for optimizing costs based on usage patterns Section 6 AWS Well-Architected Framework Overview The AWS Well-Architected Framework consists of six pillars, each addressing critical areas in cloud architecture. This summary emphasizes security, performance efficiency, cost optimization, and sustainability. Key Principles 1. Automate Compute Protection (Security) ○ Implement automation for protective measures like vulnerability management and resource management to reduce human error and enhance security. ○ Utilize tools such as EC2 Image Builder and user data scripts to strengthen defenses against threats. 2. Control Traffic at All Layers (Security) ○ Establish robust traffic controls across your network, focusing on secure operation of workloads. ○ Use VPCs to manage network topology, employing security groups to regulate inbound and outbound traffic. 3. Scale the Best Compute Options (Performance Efficiency) ○ Choose the most suitable compute options based on workload requirements to optimize performance and resource efficiency. ○ Continuously evaluate compute choices to match application design and usage patterns. 4. Configure and Right-Size Compute Resources (Performance Efficiency) ○ Ensure compute resources are properly sized to meet performance requirements, preventing over- or under-utilization. ○ Optimize configurations to enhance customer experience while minimizing costs. 5. Select the Correct Resource Type, Size, and Number (Cost Optimization) ○ Choose appropriate resource types and sizes to meet technical needs at the lowest cost. ○ Engage in right-sizing as an iterative process influenced by usage patterns and external factors. 6. Select the Best Pricing Model (Cost Optimization) ○ Analyze workload requirements to determine the most cost-effective pricing models, including On-Demand, Reserved, Spot instances, and Savings Plans. 7. Use the Minimum Amount of Hardware (Sustainability) ○ Right-size resources to minimize environmental impact while maintaining performance. ○ Take advantage of AWS's flexible resource modification capabilities. 8. Use Instance Types with the Least Impact (Sustainability) ○ Monitor and adopt new instance types for enhanced energy efficiency, particularly for specific workloads. 9. Use Managed Services (Sustainability) ○ Shift operational responsibilities to AWS through managed services, allowing teams to focus on innovation while AWS maintains efficiency. Key Takeaways Automate security measures and compute protection. Optimize and right-size compute resources to meet workload demands efficiently. Select cost-effective resource types and pricing models. Focus on sustainability by minimizing hardware usage and utilizing managed services. Module Checks Section 1 Key Considerations for Database Selection 1. Scalability: ○ Assess the required throughput and ensure the database can scale efficiently without downtime. ○ Avoid underprovisioning (leading to application failures) or overprovisioning (increased costs). 2. Storage Requirements: ○ Determine the necessary storage capacity (gigabytes, terabytes, or petabytes) based on your data needs. ○ Different architectures support varying maximum data capacities. 3. Data Characteristics: ○ Understand your data model (relational, structured, semi-structured, etc.) and access patterns. ○ Consider latency needs and specific data record sizes. 4. 
Durability and Availability: ○ Define the level of data durability and availability required. ○ Consider regulatory obligations for data residency and compliance. Types of Databases Relational Databases: ○ Structure: Tabular form (columns and rows), with strict schema rules. ○ Benefits: Data integrity, ease of use, and SQL compatibility. ○ Use Case: Ideal for online transactional processing and structured data. Non-Relational Databases (NoSQL): ○ Structure: Varied models (key-value, document, graph), with flexible schemas. ○ Benefits: Scalability, high performance, and suitable for semi-structured/unstructured data. ○ Use Case: Effective for caching, JSON storage, and low-latency access. AWS Database Options Relational Databases: ○ Amazon RDS, which offers multiple familiar database engines (e.g., Aurora, MySQL, PostgreSQL). Non-Relational Databases: ○ Amazon DynamoDB (key-value), Amazon Neptune (graph), Amazon ElastiCache (in-memory). Managed Database Services Managed services like Amazon RDS reduce the administrative burden by handling tasks such as backups, scaling, and maintenance. As you move to managed services, your responsibilities shift primarily to optimizing queries and application efficiency. Database Capacity Planning Process: 1. Analyze current capacity. 2. Predict future requirements. 3. Decide on horizontal or vertical scaling. Vertical Scaling: Increasing resources of existing servers; may require downtime. Horizontal Scaling: Adding more servers to handle increased load without downtime. Key Takeaways Consider scalability, storage, data characteristics, costs, and durability when selecting a database. Relational databases are suited for structured data and transactional applications, while non-relational databases offer flexibility for diverse data types. AWS managed services minimize operational responsibilities, allowing focus on application optimization.
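To make the managed-service point concrete, here is a minimal boto3 sketch (identifiers and credentials are placeholders, not from the course) that asks Amazon RDS for a Multi-AZ MySQL instance; backups, patching, and standby failover are then handled by the service rather than by the application team.

```python
import boto3

rds = boto3.client("rds")

# Launch a managed MySQL instance: AWS handles backups, patching,
# and (with MultiAZ=True) a synchronous standby in another AZ.
rds.create_db_instance(
    DBInstanceIdentifier="app-db",              # hypothetical identifier
    Engine="mysql",
    DBInstanceClass="db.t3.micro",
    AllocatedStorage=20,                        # GiB
    MasterUsername="admin",
    MasterUserPassword="CHANGE_ME_example",     # use Secrets Manager in practice
    MultiAZ=True,
    BackupRetentionPeriod=7,                    # days of automated backups
    StorageEncrypted=True,
)
```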
Managed Database Services
Managed services like Amazon RDS reduce the administrative burden by handling tasks such as backups, scaling, and maintenance. As you move to managed services, your responsibilities shift primarily to optimizing queries and application efficiency.
Database Capacity Planning
Process:
1. Analyze current capacity.
2. Predict future requirements.
3. Decide on horizontal or vertical scaling.
Vertical Scaling: Increasing resources of existing servers; may require downtime.
Horizontal Scaling: Adding more servers to handle increased load without downtime.
Key Takeaways
Consider scalability, storage, data characteristics, costs, and durability when selecting a database.
Relational databases are suited for structured data and transactional applications, while non-relational databases offer flexibility for diverse data types.
AWS managed services minimize operational responsibilities, allowing focus on application optimization.
Section 3
Key Database Considerations
1. Scalability:
○ Ensure the database can handle current and future throughput without downtime. Avoid under- or overprovisioning.
2. Storage Requirements:
○ Determine the necessary capacity (gigabytes, terabytes, petabytes) based on data needs.
3. Data Characteristics:
○ Understand the data model (relational, structured, semi-structured) and access patterns, including latency requirements.
4. Durability and Availability:
○ Define needed data durability and availability levels. Consider regulatory compliance for data residency.
Database Types
Relational Databases:
○ Structured data in tables with strict schemas; ideal for online transactional processing.
Non-Relational Databases (NoSQL):
○ Flexible schemas and various models (key-value, document, graph); suitable for semi-structured and unstructured data.
AWS Database Options
Relational: Amazon RDS (supports multiple engines like Aurora, MySQL).
Non-Relational: Amazon DynamoDB (key-value), Amazon Neptune (graph), Amazon ElastiCache (in-memory).
Managed Database Services
AWS manages tasks like backups and scaling, allowing focus on query optimization and application efficiency.
Database Capacity Planning
Analyze current capacity, predict future needs, and choose between vertical scaling (increasing server resources) and horizontal scaling (adding servers).
Key Takeaways
Consider scalability, storage, data characteristics, costs, and durability in database selection.
Use relational databases for structured data; non-relational databases for flexibility and diverse data types.
Section 3
Features and Structure of Amazon DynamoDB
Key Features of Amazon DynamoDB
1. Serverless Performance with Limitless Scalability:
○ Global and Local Secondary Indexes: Enable flexible data access through alternate keys and can be provisioned with lower write throughput for cost-effective performance.
○ DynamoDB Streams: Records item-level changes in near real-time, facilitating event-driven architectures.
○ Global Tables: Support multi-Region, multi-active data replication for fast local access, automatically scaling to accommodate workloads.
2. Built-in Security and Reliability:
○ Data Encryption: Automatically encrypts all customer data at rest.
○ Point-in-Time Recovery (PITR): Protects data from accidental write or delete operations with continuous backups for up to 35 days.
○ Fine-Grained Access Control: Uses IAM for authentication and allows access control at the item and attribute level.
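As a rough illustration of these features, the sketch below uses boto3 to create a table with a composite primary key and a global secondary index, then enables point-in-time recovery. The table, attribute, and index names are hypothetical, and on-demand (pay-per-request) billing is an assumption.

import boto3

dynamodb = boto3.client("dynamodb")

# Table keyed by device ID (partition key) and timestamp (sort key),
# with a GSI that allows querying by error status instead.
dynamodb.create_table(
    TableName="SensorReadings",  # hypothetical table name
    AttributeDefinitions=[
        {"AttributeName": "DeviceId", "AttributeType": "S"},
        {"AttributeName": "Timestamp", "AttributeType": "S"},
        {"AttributeName": "ErrorStatus", "AttributeType": "S"},
    ],
    KeySchema=[
        {"AttributeName": "DeviceId", "KeyType": "HASH"},    # partition key
        {"AttributeName": "Timestamp", "KeyType": "RANGE"},  # sort key
    ],
    GlobalSecondaryIndexes=[
        {
            "IndexName": "ErrorStatus-Timestamp-index",
            "KeySchema": [
                {"AttributeName": "ErrorStatus", "KeyType": "HASH"},
                {"AttributeName": "Timestamp", "KeyType": "RANGE"},
            ],
            "Projection": {"ProjectionType": "ALL"},
        }
    ],
    BillingMode="PAY_PER_REQUEST",  # on-demand capacity
)

# Wait for the table to become active, then turn on continuous backups (PITR).
dynamodb.get_waiter("table_exists").wait(TableName="SensorReadings")
dynamodb.update_continuous_backups(
    TableName="SensorReadings",
    PointInTimeRecoverySpecification={"PointInTimeRecoveryEnabled": True},
)

The attribute choices mirror the IoT sensor example used in the data-structure discussion that follows.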
Data Structure in DynamoDB
Tables: A table is a collection of data containing items that are uniquely identifiable by their attributes.
Items: An item consists of one or more attributes, similar to a row in a relational database.
Attributes: Fundamental data elements consisting of key-value pairs.
Primary Keys:
○ Simple Primary Key: Composed of a single attribute known as the partition key.
○ Composite Primary Key: Combines the partition key and an optional sort key for richer query capabilities.
Examples of Data Representation
Base Table: Represents IoT sensor data with attributes like temperature and error status, indexed by device ID and timestamp.
Global Secondary Index (GSI): Provides alternate schemas to query data using different partition and sort keys while ensuring eventual consistency.
Local Secondary Index (LSI): Allows strongly consistent reads and must be created with the table; supports alternate sort keys based on the same partition key.
Multi-Region Replication
Global Tables: Enable multi-Region, multi-active databases that replicate data automatically across selected AWS Regions, enhancing availability and performance.
Security Best Practices
Preventative Measures:
○ Use IAM roles for authentication.
○ Apply IAM policies for resource access and fine-grained control.
○ Implement VPC endpoints to limit access to DynamoDB.
Detective Measures:
○ Utilize AWS CloudTrail for monitoring and logging usage.
○ Employ AWS Config to track configuration changes and compliance with rules.
By leveraging these features, DynamoDB provides a scalable, secure, and highly available database solution suitable for various applications.
Section 4
1. Amazon DocumentDB
Suitable Workloads: Flexible schema, dynamic data storage, online customer profiles.
Data Model: Document data model using JSON-like documents.
Key Features: MongoDB-compatible, high performance for complex documents.
Common Use Cases: Content management systems, customer profiles, operational data storage.
2. Amazon Keyspaces
Suitable Workloads: Fast querying of high-volume data, scalability, heavy write loads.
Data Model: Wide-column data model, flexible columns.
Key Features: Managed Apache Cassandra-compatible service, scalability, and high availability.
Common Use Cases: Industrial equipment maintenance, trade monitoring, route optimization.
3. Amazon MemoryDB
Suitable Workloads: Latency-sensitive applications, high request rates, high throughput.
Data Model: In-memory database.
Key Features: In-memory speed, data durability, compatible with Redis.
Common Use Cases: Caching, game leaderboards, banking transactions.
4. Amazon Neptune
Suitable Workloads: Finding connections in data, complex relationships, highly connected datasets.
Data Model: Graph data model with nodes and edges.
Key Features: High throughput and low latency for graph queries, supports multiple graph query languages.
Common Use Cases: Recommendation engines, fraud detection, knowledge graphs.
5. Amazon Timestream
Suitable Workloads: Identifying trends over time, efficient data processing, ease of data management.
Data Model: Time series data model.
Key Features: Serverless, built-in time series functions, automatic data replication.
Common Use Cases: IoT applications, web traffic analysis.
6. Amazon QLDB
Suitable Workloads: Accurate history of application data, financial transaction tracking, data lineage verification.
Data Model: Ledger database with immutable transaction logs.
Key Features: Cryptographically verifiable transaction log, built-in data integrity, flexible querying with PartiQL.
Common Use Cases: Financial transactions, claims history tracking, supply chain data management.
Key Takeaways
Each AWS database service is purpose-built to address specific application needs:
○ Amazon Redshift for data warehousing.
○ Amazon DocumentDB for JSON document storage.
○ Amazon Keyspaces for wide-column data.
○ Amazon MemoryDB for in-memory applications.
○ Amazon Neptune for graph databases.
○ Amazon Timestream for time series data.
○ Amazon QLDB for ledger functionality.
This comprehensive overview serves to guide organizations in selecting the appropriate database solutions based on their unique requirements.
Section 5
Overview of AWS Database Migration Service (AWS DMS)
Managed Service: AWS DMS is a managed service that facilitates the migration and replication of existing databases and analytics workloads to and within AWS.
Supported Databases: It supports a wide range of commercial and open-source databases, including Oracle, Microsoft SQL Server, MySQL, PostgreSQL, and more. It can replicate data on demand or on a schedule.
Endpoints: AWS DMS allows migration between homogeneous (same database engine) and heterogeneous (different database engines) endpoints. One endpoint must be an AWS service.
Continuous Replication: It can continuously replicate data with low latency, supporting various use cases such as building data lakes in Amazon S3 or consolidating databases into a data warehouse using Amazon Redshift.
Homogeneous Migration
Simplification: Homogeneous migrations simplify moving databases with the same engine (e.g., from on-premises PostgreSQL to Amazon RDS for PostgreSQL).
Serverless: These migrations are serverless, automatically scaling resources as needed.
Instance Profiles: Migrations use instance profiles for network and security settings, ensuring a managed environment for the migration project.
Tools for Heterogeneous Migrations
Database Discovery Tool: AWS DMS Fleet Advisor automatically inventories and assesses on-premises databases, identifying migration paths.
Schema Conversion Tools:
○ AWS Schema Conversion Tool (SCT): Converts source schema and SQL code into compatible formats for the target database.
○ AWS DMS Schema Conversion: A managed service for schema assessment and conversion integrated into AWS DMS workflows.
Example Use Case
Data Lake Replication: AWS DMS can replicate data from an on-premises database (such as a Student Information System) into an Amazon S3 data lake for analytics, allowing easy access to data for analysis and visualization.
Key Takeaways
AWS DMS is efficient for migrating data quickly and securely to AWS, supporting both homogeneous and heterogeneous migrations.
Tools like AWS SCT and AWS DMS Schema Conversion help streamline schema and code conversion processes.
This information provides a solid foundation for understanding AWS DMS and its role in database migration and replication.
Section 6
AWS Well-Architected Framework Database Pillars
The framework consists of six pillars, and this section highlights best practices relevant to database management:
1. Performance Efficiency:
○ Architecture Selection: Use a data-driven approach to evaluate database options. Consider the trade-offs that architectural choices have on customer experience and system performance.
Understand data characteristics and access patterns to select optimal data services.
2. Security:
○ Data Protection: Implement secure key management and enforce encryption at rest. Utilize AWS Key Management Service (KMS) to manage encryption keys effectively. Ensure data confidentiality through encryption to mitigate risks of unauthorized access.
3. Cost Optimization:
○ Cost-effective Resources: Select the correct database type, size, and number based on workload characteristics to minimize costs. Engage in right-sizing practices to balance performance and cost effectively. Consider the use of serverless options like Aurora Serverless for on-demand scaling.
Key Takeaways
Performance Efficiency: Make selection choices based on data characteristics and access patterns to optimize workloads.
Security: Implement secure key management and data protection measures to ensure data durability and safety.
Cost Optimization: Assess and select resource types, sizes, and numbers based on workload requirements to achieve the lowest possible costs while meeting technical needs.
By adhering to these principles, cloud architects can create a robust database layer that enhances performance, security, and cost-effectiveness in AWS environments.
Module Checks
Section 1
Here's a summary of the key points regarding Amazon VPC (Virtual Private Cloud) networking, focusing on public and private subnets, Elastic IP addresses, and NAT devices:
Subnets in Amazon VPC
1. Public Subnets:
○ Have direct access to the internet through an Internet Gateway.
○ Require instances to have both private and public IP addresses for internet connectivity.
○ A public subnet route table includes a route for 0.0.0.0/0 that points to the Internet Gateway.
2. Private Subnets:
○ Do not have direct access to the internet.
○ Instances in private subnets cannot be reached from the internet, enhancing security.
○ The private subnet route table typically mirrors the main VPC route table.
3. Elastic IP Addresses:
○ Static public IP addresses associated with an EC2 instance.
○ Can be transferred between instances if needed.
○ There is no cost for the first Elastic IP associated with a running instance; additional Elastic IPs incur charges.
NAT Devices
1. NAT (Network Address Translation) Devices:
○ Allow instances in a private subnet to initiate outbound traffic to the internet while remaining unreachable from the internet.
○ Two types of NAT devices:
NAT Gateway: A managed AWS service that incurs an hourly cost, providing better availability and bandwidth.
NAT Instance: An EC2 instance configured to perform NAT, which incurs EC2 usage costs.
2. Connecting Private Subnets to the Internet:
○ To allow instances in private subnets to access the internet, route traffic through a NAT device.
○ NAT gateways and instances are placed in public subnets to access the Internet Gateway.
Use Cases for Subnets
Database Instances: Recommended in private subnets for security.
Batch-Processing Instances: Should also reside in private subnets.
Web Application Instances: Can be placed in public or private subnets, but AWS recommends placing them in private subnets behind a load balancer for enhanced security.
Key Takeaways
Amazon VPC creates a logically isolated virtual network.
Public subnets allow direct internet access; private subnets do not.
NAT gateways enable outbound internet connectivity for private subnet resources.
Elastic IPs can be reassigned to different instances as needed.
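As a brief sketch of the routing just described, the boto3 calls below add the default routes that distinguish a public subnet (0.0.0.0/0 pointing to an internet gateway) from a private subnet (0.0.0.0/0 pointing to a NAT gateway). The route table, internet gateway, and NAT gateway IDs are placeholders for resources assumed to already exist in the VPC.

import boto3

ec2 = boto3.client("ec2")

# Placeholder IDs for resources that already exist in the VPC.
PUBLIC_RT_ID = "rtb-0public0000000000"
PRIVATE_RT_ID = "rtb-0private000000000"
IGW_ID = "igw-0example0000000000"
NAT_GW_ID = "nat-0example0000000000"

# Public subnet: default route points at the internet gateway.
ec2.create_route(
    RouteTableId=PUBLIC_RT_ID,
    DestinationCidrBlock="0.0.0.0/0",
    GatewayId=IGW_ID,
)

# Private subnet: default route points at the NAT gateway, so instances
# can initiate outbound connections but remain unreachable from the internet.
ec2.create_route(
    RouteTableId=PRIVATE_RT_ID,
    DestinationCidrBlock="0.0.0.0/0",
    NatGatewayId=NAT_GW_ID,
)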
This summary encapsulates the main points from the training module on creating a networking environment in AWS.
Section 2
This text provides a detailed overview of network security mechanisms in AWS, focusing on Security Groups, Network ACLs (Access Control Lists), AWS Network Firewall, and Bastion Hosts. Here's a breakdown of the key concepts and differences:
1. Security Groups vs. Network ACLs
Scope:
○ Security Groups: Operate at the resource level (e.g., EC2 instances).
○ Network ACLs: Operate at the subnet level.
Traffic Rules:
○ Security Groups: Support allow rules only (stateful). By default, no inbound traffic is allowed, but all outbound traffic is allowed.
○ Network ACLs: Can specify both allow and deny rules (stateless). By default, they allow all inbound and outbound traffic.
Statefulness:
○ Security Groups: Stateful, meaning return traffic is automatically allowed.
○ Network ACLs: Stateless, so return traffic must be explicitly allowed.
Rule Evaluation:
○ Security Groups: All rules are evaluated.
○ Network ACLs: Rules are evaluated in number order, and evaluation stops at the first match.
2. AWS Network Firewall
Acts as an additional layer of security for VPCs, providing intrusion detection and prevention.
It is a stateful, managed service that inspects incoming traffic to protect subnet resources.
Requires modification of VPC route tables to route external traffic through the firewall.
3. Bastion Hosts
A bastion host provides secure access to private subnets from an external network, minimizing direct access to instances.
Security groups are used to control access; the bastion host connects to resources in the private subnet through SSH (port 22).
Ideally, only the bastion host should be the source of SSH traffic to the private instances.
Key Takeaways
Implement multiple layers of defense for securing AWS infrastructure.
Use security groups for resource-level traffic control and network ACLs for subnet-level control.
Route external VPC traffic through AWS Network Firewall for enhanced security.
Utilize bastion hosts to securely administer resources in private subnets from external environments.
This overview highlights the key differences and functionalities of AWS networking security features.
Section 3
1. Connecting EC2 Instances to Managed Services
Use Case: After deploying a workload on an EC2 instance in a private subnet, you may need to access an Amazon S3 bucket in the same AWS Region. Since Amazon S3 operates outside your VPC, direct connectivity is not possible.
Challenges: Accessing S3 through the public internet incurs data transfer costs and may expose your traffic.
2. VPC Endpoints
There are two main types of VPC endpoints for secure, private connectivity to AWS managed services:
A. Interface VPC Endpoints
Description: These use AWS PrivateLink and allow private connections to AWS services.
Components: AWS creates an Elastic Network Interface (ENI) with a private IP address in each specified subnet.
IAM Policies: You can control access to the endpoint using IAM resource policies.
Cost: Charged for hourly usage and data processed.
B. Gateway VPC Endpoints
Description: These provide direct connections to Amazon S3 and DynamoDB using route tables, without AWS PrivateLink.
Cost: No additional charge for usage, with no throughput limitations.
Route Table: Configured in the private subnet's route table to direct traffic to S3 and DynamoDB.
3. Steps to Set Up an Interface VPC Endpoint
1. Specify the name of the AWS service you want to connect to.
2. Choose the VPC and specify subnets in different Availability Zones for redundancy.
3. Select a subnet for the interface endpoint, which creates a network interface in that subnet.
4. Specify security groups to control traffic to the network interface.
4. Gateway Load Balancer Endpoint
Description: This type of endpoint connects security appliances in one VPC to application instances in another, allowing for traffic inspection.
Traffic Flow: Incoming and outgoing traffic is routed through the Gateway Load Balancer for inspection before reaching the EC2 application instance.
5. Key Takeaways
VPC Resources: Can access AWS managed services using VPC endpoints.
Cost Considerations: Interface endpoints incur costs, while gateway endpoints do not.
Throughput: Interface endpoints have limitations, whereas gateway endpoints do not.
Security: The use of private IP addresses enhances security and eliminates internet exposure.
6. Factors for Choosing Between Endpoint Types
Access Method: Can S3 objects be accessed via public IP or only through a private IP?
On-Premises Access: Do you need on-premises connectivity?
Region Access: Is access needed from another AWS Region?
Cost: Is the budget sufficient for interface endpoint costs?
Bandwidth and Packet Size: What are the throughput requirements and maximum packet size?
This information should provide a solid understanding of how to effectively connect to AWS managed services from a VPC, along with considerations for cost and security.
Section 4
Network Troubleshooting Scenarios
1. Common Issues:
○ Slow EC2 instance response times
○ Inability to access EC2 instances via SSH
○ EC2 database instances not applying patches
2. Investigation:
○ Monitor network traffic to identify issues like unnecessary traffic (e.g., DDoS attacks).
○ Verify security group rules for allowed traffic (e.g., port 22 for SSH).
○ Check configurations, such as NAT gateways and route tables, for private subnets.
Amazon VPC Flow Logs
Purpose: To capture and log network traffic for troubleshooting.
Log Types:
○ VPC flow logs: General traffic monitoring.
○ Elastic network interface flow logs: Specific to network interfaces.
○ Subnet flow logs: Focused on specific subnets.
Log Delivery:
○ Logs can be sent to Amazon CloudWatch Logs, Amazon S3, or Amazon Kinesis Data Firehose.
○ Logs can be queried using Amazon Athena or visualized in Amazon OpenSearch Service.
IAM Access for Flow Logs
Users must have appropriate IAM permissions to create, describe, and delete flow logs. An example IAM policy allows these actions across all AWS resources.
Flow Log Record Structure
Flow logs consist of multiple fields, including:
○ Version, account ID, interface ID, source and destination addresses, ports, and protocol.
○ Packets transferred, bytes transferred, and action (accept/reject).
Log Status:
○ OK: Normal logging.
○ NODATA: No traffic during the interval.
○ SKIPDATA: Some records skipped due to constraints or errors.
Additional VPC Troubleshooting Tools
1. Reachability Analyzer:
○ Tests connectivity between source and destination resources.
○ Identifies any blocking components, such as security groups or network ACLs.
2. Network Access Analyzer:
○ Identifies unintended network access and helps improve security posture.
○ Useful for compliance verification (e.g., isolating networks processing sensitive information).
3. Traffic Mirroring:
○ Creates copies of network traffic for analysis.
○ Enables deep inspection of actual packet content for troubleshooting performance issues or detecting network attacks.
These tools and practices help maintain a secure and efficient networking environment in AWS, ensuring that issues can be quickly identified and resolved.
Section 5
AWS Well-Architected Framework Network Pillars
When designing a VPC network, it's essential to consider future workloads to ensure they are resilient, secure, performant, and cost-effective. The network design should support these standards, as the AWS Well-Architected Framework provides best practices for workload design, operation, and maintenance, helping you make informed architectural decisions.
Key Best Practices for Network Design
1. Plan Your Network Topology:
○ Resiliency: The network should perform reliably, anticipating failures and accommodating future traffic growth.
○ IP Address Allocation: Ensure sufficient IP subnet allocation for expansion and future needs:
Use CIDR blocks that allow multiple subnets across Availability Zones (AZs).
Reserve CIDR space for future growth and be mindful of reserved IP addresses.
Deploy large CIDR blocks, as they cannot be changed or deleted later.
2. Infrastructure Protection:
○ Network Layers: Group components based on sensitivity. For example, databases with no internet access should reside in isolated subnets.
○ Control Traffic: Implement security controls at all layers using security groups, network ACLs, subnets, and route tables.
○ Inspection and Protection: Use tools like the VPC Network Access Analyzer to inspect and filter traffic.
3. Network Architecture Selection:
○ Performance Impact: Analyze how network-related decisions affect workload performance (e.g., consider latency between AZs versus Regions).
○ Networking Features: Continuously benchmark and evaluate workload performance metrics.
○ Network Protocols: Select appropriate protocols (e.g., using TCP for critical data and UDP for real-time data) to enhance performance.
4. Select the Best Pricing Model:
○ Region Selection: Choose AWS Regions based on cost and proximity to users to minimize latency and meet data privacy requirements.
Identifying Network Design Issues
In the provided scenario for Company A, the following design mistakes were identified:
Small VPC/Subnets: A small VPC with limited IP addresses hinders future growth.
Permissive Security Groups: Security groups allowed broad internet access to both web and database servers.
Direct Database Access: The design permitted direct internet access to database servers, which is insecure.
Poor Region Choice: Deploying resources in a European Region while the customer base is in the US leads to higher latency.
Recommendations for Improvement
1. Use Large VPCs: Implement larger VPCs with sufficient IP addresses for anticipated growth.
2. Separate Security Groups: Configure distinct security groups for web and database servers to restrict access appropriately.
3. Private Subnets for Databases: Place databases in private subnets without direct internet access, allowing maintenance through secure channels.
4. Choose a Closer AWS Region: Deploy in an AWS Region located near the customer base to reduce latency and ensure compliance with data sovereignty.
Key Takeaways
Plan for IP subnet allocation that accommodates growth.
Establish network layers and control traffic effectively.
Understand the impact of networking on performance.
Optimize performance through suitable network protocols.
Select AWS Regions based on cost and proximity to users for efficient service delivery.
These principles will guide the design of a robust, secure, and efficient VPC network that aligns with AWS best practices.
Module Checks
Section 1
This section covers key concepts related to scaling VPC networks with AWS Transit Gateway, focusing on centralized routing and peering strategies. Here's a summary:
Centralized Outbound Routing Pattern:
Purpose: Centralizes egress internet traffic for enhanced security and cost efficiency by routing outbound internet traffic from multiple VPCs through a dedicated egress VPC containing a NAT gateway.
Benefits: Simplifies monitoring and control, reduces NAT gateway costs by centralizing the function in one VPC, and enhances security. A NAT gateway is recommended in each Availability Zone for redundancy.
Transit Gateway Peering:
Scenario: When VPCs in different AWS Regions or accounts need to communicate, transit gateway peering is used. This allows network traffic to flow between VPCs across different accounts and Regions without traversing the public internet, enhancing security.
Configuration: To enable this, both transit gateways must create and accept peering connections, and the route tables should be updated to point to the transit gateway attachments.
Company Scenario - Connecting Multiple Departments:
Problem: A company with multiple VPCs (in the same or different AWS accounts) needs full resource sharing between departments.
Solution: The simplest and most scalable solution is to connect all departmental VPCs to a transit gateway, ensuring full connectivity and simplifying future expansion.
Configuration Activity:
VPC Route Tables: Configure the VPC route tables by setting each VPC's route destination as the CIDR range of all connected VPCs and routing it through the transit gateway.
Transit Gateway Route Tables: The transit gateway route table must have entries for each VPC CIDR block, with the respective transit gateway VPC attachment as the target.
Key Takeaways:
Transit Gateway: Acts as a centralized regional router connecting multiple VPCs, providing a scalable solution for managing network traffic across Regions and AWS accounts.
Costs: Charges are based on the number of connections and the amount of traffic passing through the transit gateway.
This pattern simplifies VPC management, reduces costs, and improves security for larger organizations or those needing cross-Region or cross-account connectivity.
Section 2
This section focuses on connecting multiple VPCs in AWS using the VPC Peering feature. VPC Peering establishes a one-to-one, point-to-point connection between two VPCs, allowing Amazon EC2 instances in those VPCs to communicate over private IP addresses, similar to being on the same network. It is ideal for smaller environments or when the budget is constrained, as it does not incur costs (except for inter-Region or inter-Availability Zone data transfers). The feature allows connections between VPCs owned by the same account, different accounts, or even across different AWS Regions, enabling resource sharing or geographic redundancy. Inter-Region traffic is encrypted, and VPC peering ensures that all traffic remains on the AWS backbone, reducing exposure to common internet threats.
VPC Peering Architecture and Use
Mesh Architecture: When a small number of VPCs need to be connected without the need for transit gateways, you can use peering for each VPC pair.
No Transitive Peering: Traffic between two VPCs is isolated to the peering connection, meaning traffic from VPC A to VPC B does not automatically pass through to VPC C unless direct peering connections are established.
Establishing VPC Peering
1. Requesting and Accepting: One VPC sends a request to peer, and the other accepts. The CIDR blocks of the VPCs cannot overlap.
2. Updating Route Tables: Each VPC owner updates their route table to add a route to the other's CIDR block, specifying the peering connection as the target.
3. Security Groups: The VPC owners may also need to update security group rules to allow traffic between the peered VPCs.
Limitations
Transitive Peering: Not supported. Peering connections are direct and isolated.
CIDR Block Restrictions: Peering is not possible with overlapping CIDR blocks.
Internet/NAT Gateway: VPCs in a peering connection cannot use each other's internet or NAT gateway.
PrivateLink Architecture
For application-level connections or overlapping CIDR blocks, AWS PrivateLink can be used with a Network Load Balancer. This allows consumer VPCs to connect to service provider VPCs within the same AWS Region, without needing a peering connection.
Use Cases
1. File Sharing: Peering VPCs to share data without exposing traffic to the internet.
2. Customer Access: Providing limited access to customers by peering their VPCs with your central VPC.
3. Active Directory Integration: Using VPC peering for centralized services like Active Directory while restricting traffic flow to other VPCs.
This helps create secure, low-latency, private connections between VPCs for resource sharing and communication.
Section 3
This section covers how to connect on-premises environments to an Amazon Virtual Private Cloud (VPC) using a Site-to-Site VPN, AWS VPN CloudHub, or AWS Global Accelerator. The primary method is AWS Site-to-Site VPN, which creates encrypted IPsec VPN tunnels between the on-premises customer gateway and the AWS virtual private gateway (or transit gateway). This setup provides a secure connection to your VPC and ensures high availability by creating two VPN tunnels, one for primary traffic and the other for redundancy.
The process involves:
1. Creating a customer gateway for the on-premises device.
2. Setting up a virtual private gateway, using an Autonomous System Number (ASN) that differs from the customer gateway's ASN.
3. Configuring routing tables and updating security groups for protocols like SSH or RDP.
4. Establishing the VPN connection and downloading the configuration details.
For larger organizations with multiple on-premises networks, AWS VPN CloudHub enables centralized connectivity, allowing communication between multiple customer gateways using unique BGP ASNs. AWS Global Accelerator can also be used to improve VPN connection performance by routing traffic over the AWS global network infrastructure. Lastly, by using Transit Gateway, you can isolate VPCs from each other while still providing full VPN access to the on-premises network. This setup ensures VPC isolation while routing traffic efficiently between on-premises and VPC environments.
Key Takeaways
Site-to-Site VPN ensures secure, encrypted communication between on-premises networks and AWS.
Global Accelerator improves performance for transit gateway-attached VPN connections.
Transit Gateway can be configured to isolate VPCs, preventing cross-communication while maintaining VPN connectivity to on-premises networks.
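To tie the Site-to-Site VPN steps above to concrete API calls, here is a minimal boto3 sketch that creates a customer gateway, a virtual private gateway, and the VPN connection. The public IP address, ASNs, and VPC ID are hypothetical, and tunnel options are left at their defaults.

import boto3

ec2 = boto3.client("ec2")

# Customer gateway: represents the on-premises VPN device.
cgw = ec2.create_customer_gateway(
    Type="ipsec.1",
    PublicIp="203.0.113.10",  # placeholder public IP of the on-premises device
    BgpAsn=65010,             # placeholder on-premises ASN
)["CustomerGateway"]

# Virtual private gateway: the VPN concentrator on the AWS side of the VPC.
vgw = ec2.create_vpn_gateway(
    Type="ipsec.1",
    AmazonSideAsn=64512,      # differs from the customer gateway ASN
)["VpnGateway"]
ec2.attach_vpn_gateway(
    VpcId="vpc-0example0000000000",  # placeholder VPC ID
    VpnGatewayId=vgw["VpnGatewayId"],
)

# Site-to-Site VPN connection: AWS provisions two IPsec tunnels for redundancy.
vpn = ec2.create_vpn_connection(
    Type="ipsec.1",
    CustomerGatewayId=cgw["CustomerGatewayId"],
    VpnGatewayId=vgw["VpnGatewayId"],
    Options={"StaticRoutesOnly": False},  # use BGP dynamic routing
)["VpnConnection"]
print(vpn["VpnConnectionId"])

The configuration file for the on-premises device would then be downloaded from the console or retrieved from the VPN connection's customer gateway configuration, as described in step 4 above.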
Section 4
This section discusses how to connect an on-premises network to an Amazon Virtual Private Cloud (VPC) using AWS Direct Connect. AWS Direct Connect is a dedicated, private network connection that uses industry-standard 802.1Q virtual LANs (VLANs) to link your on-premises network to AWS resources. It offers a consistent network experience with predictable performance, high bandwidth, and low latency compared to Site-to-Site VPN, which uses encrypted tunnels over the public internet.