Cloud Computing: Principles and Technology Lecture Notes PDF

Cloud Computing: Principles and Technology (4IT482)
January 2025
Dr George Feuerlicht
Department of Information Technology, Prague University of Economics

Introduction
• IT trends
• Historical perspective
• Technology and business drivers
• What is Cloud Computing?
• Course objectives and topics

Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 1

Learning Objectives
• understand cloud computing motivations
• understand business and technology drivers
• appreciate the evolution of cloud computing
• appreciate the benefits of cloud computing

IT trends and predictions
• According to a recent IBM Institute for Business Value (IBV) study, 74% of executives believe that AI will fundamentally change their business operations within the next 5 years. The same study indicates that companies leveraging AI at scale are expected to outperform their peers by 25% in profitability by 2025.
• Bloomberg predicts the generative AI market will grow from $40 billion in 2022 to $1.3 trillion by 2032. According to research by Goldman Sachs, generative AI could raise global GDP by 7% and, as McKinsey suggests, save up to 70% of workers’ time.

Physical AI [Jensen Huang, Nvidia CEO]
• “The internet in the next couple of years will produce more data than all of humanity has ever produced since the beginning.
• Machine learning has changed how every application is going to be built, how computing will be done, and the possibilities, beyond.
• Nvidia’s new Blackwell RTX 50-series GPUs - 92 billion transistors + Cosmos platform
• Physical AI enables autonomous machines like robots and self-driving cars to perceive, understand, and perform complex actions in the real (physical) world”

AI and Cloud [Jeffrey Erickson, Oracle]
• AI is integral to cloud computing, enhancing cloud services' automation, decision-making, and scalability.
• Cloud computing provides the necessary infrastructure for AI, enabling businesses to leverage AI technologies without significant investments in hardware and software.
• The synergy between AI and cloud computing drives innovative applications like generative AI, IoT, and AI-assisted business intelligence.

Historical Perspective
• 1960s: Mainframes/Data Processing Bureau
• 1970s-80s: In-house Development
• 1990s: ERP Applications
• 2000s: Outsourcing/ASP
• 2007: SOA/SaaS, Utility Computing
• 2010 …: Cloud Computing

ASP (Application Service Provider) Model
• ASP - providing application services to customers over a network (Internet or WAN)
• Precursor to cloud computing (early 2000s)
• Unsuitable architecture and unviable business model
• Multi-instance architecture - separate instance for each client
  - poor scalability
  - slow and unreliable connectivity

Utility Computing
“As information technology's power and ubiquity have grown, its strategic importance has diminished. The way you approach IT investment and management will need to change dramatically” [N. G. Carr, IT Doesn't Matter, 2003]
“After pouring millions of dollars into in-house data centers, companies may soon find that it’s time to start shutting them down. IT is shifting from being an asset companies own to a service they purchase.” [N. G. Carr, The End of Corporate Computing, 2005]

Commoditization of IT
N. G. Carr, IT Doesn't Matter, 2003
“The arrival of the Internet has accelerated the commoditization of IT by providing a perfect delivery channel for generic applications. More and more, companies will fulfill their IT requirements simply by purchasing fee-based “Web services” from third parties - similar to the way they currently buy electric power or telecommunications services.”

Google Data Centre Containers
Extending a data center by adding self-contained units with thousands of machines, air-conditioning, power supply, etc.

20th Century paradigm
• Software vendors license software to customers
• Customers install, customize and maintain software on premises
• Challenges
  - Under-utilization of hardware resources
  - Excessive demand on IT skills in client organizations to perform customization, integration, upgrades, bug fixes, etc.
  - Costly, inefficient and unsustainable

On-premises Costs
• Capital expenses, hardware & software cost, utility bills, etc.
• Technical personnel: upgrades and patches, testing and deployment cycles, security and access control, etc.
• Administrative staff: keep track of licenses and support arrangements
• Frequent project failures
• Clients need to focus on core business, not IT!

21st Century paradigm
• Fast and reliable connectivity and highly scalable computer infrastructure - practically unlimited compute and data storage capacity
• Economies of scale - concentration of computing resources in large data centers
• Software developed, deployed and maintained by software vendors and delivered as services to end users
• In most cases on-premises deployment no longer makes any sense.

Technology Drivers
• Moore’s Law
  - Increased processing power
  - Increased storage capacity
  - Increased network bandwidth
  - Reduced cost & size
  - Advanced processor architectures
• Network Effect and Economies of Scale
  - Reduced cost of shared services as the number of users increases

Moore's Law [Intel co-founder Gordon E. Moore, 1965]
The number of transistors that can be placed on an integrated circuit doubles approximately every two years, with a corresponding impact on the processing speed and memory capacity of computers.
Limits to Moore’s law:
• transistor sizes approach atomic scales
• high manufacturing costs
• heat generation
• but there are new chip architectures, GPUs, specialized processors for AI, etc.
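
The doubling rule stated above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are invented for this example, and the two data points (Intel 4004, 1971, 2,250 transistors; Apple M1, 2020, 16 billion transistors) are the figures quoted in the transistor density table in these notes.

```python
import math


def projected_transistors(base_count, base_year, target_year, doubling_years=2.0):
    """Project a transistor count assuming one doubling every `doubling_years`."""
    return base_count * 2 ** ((target_year - base_year) / doubling_years)


def implied_doubling_period(count_a, year_a, count_b, year_b):
    """Doubling period (in years) implied by two observed data points."""
    doublings = math.log2(count_b / count_a)
    return (year_b - year_a) / doublings


# Figures quoted in these notes: Intel 4004 (1971) and Apple M1 (2020).
projection_2020 = projected_transistors(2_250, 1971, 2020)
period = implied_doubling_period(2_250, 1971, 16_000_000_000, 2020)

print(f"Projected for 2020 at a two-year doubling: {projection_2020:.3g}")
print(f"Doubling period implied by the two data points: {period:.2f} years")
```

With these two data points the implied doubling period comes out a little over two years, so the quoted figures are broadly in line with the rule.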

Increasing Transistor Density

Processor                     Year   Transistors       Resolution
Intel 4004                    1971   2,250             10,000 nm
Intel 8008                    1972   2,500
Intel 8080                    1974   5,000
Intel 8086                    1978   29,000
Intel 80286                   1982   120,000
Intel 80386                   1985   275,000
Intel 80486 DX                1989   1,180,000
Pentium®                      1993   3,100,000
Pentium II                    1997   7,500,000
Pentium III                   1999   24,000,000
Pentium 4                     2000   42,000,000
Core i7                       2008   731,000,000
POWER7                        2012   2,100,000,000
Apple M1                      2020   16,000,000,000    5 nm
Blackwell RTX 50-series GPU   2025   92,000,000,000

Cost of Storage
• Cost of a gigabyte of disk storage:
  - 1956 - $10 million
  - 1990 - $10,000
  - 2000 - $10
  - 2010 - less than $0.10
• 1 TB SSD ~ $120 -> $0.12 per GB
• Amazon AWS Glacier ~ $0.004 per gigabyte per month

Fast Connectivity
• New ultra-fast networks like 5G and Wi-Fi 6E enable more data to be streamed, and the lower latency supports the proliferation of cloud-based applications
• 5G: 10 Gb/s, latency < 4 ms
• Near real-time connectivity enables IoT applications, autonomous vehicles, at-home healthcare services, VR apps, etc.
• Faster and more reliable network connectivity favours cloud computing - avoids the need for on-premises hardware and software

Network Effect - Economies of Scale
• The cost of a product/service reduces as the number of consumers of the product/service increases.
• The value of a product/service increases with the number of users of that product/service, i.e. the more people use a product/service, the more valuable it becomes to each individual user.
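
The first point above (cost falls as the number of consumers grows) can be illustrated with a simple cost model in which a fixed infrastructure cost is shared by all users. This is a minimal sketch under assumed numbers: the function name, the fixed cost of 1,000,000 and the per-user cost of 2.0 are hypothetical and do not come from the lecture.

```python
def cost_per_user(fixed_cost, variable_cost_per_user, users):
    """Average cost per user when a fixed cost is shared by `users` consumers."""
    if users <= 0:
        raise ValueError("users must be positive")
    return fixed_cost / users + variable_cost_per_user


# Hypothetical figures: a shared data center costing 1,000,000 per period,
# plus 2.0 per user in usage-dependent costs.
for users in (1_000, 100_000, 10_000_000):
    print(users, round(cost_per_user(1_000_000, 2.0, users), 2))
```

As the user count grows, the shared fixed cost per user shrinks towards zero and the average cost approaches the per-user variable cost, which is the effect large cloud data centers rely on.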

Business Drivers
• Reduction of cost of entry (start-up costs) - converts CapEx into OpEx
• Reduction of TCO (Total Cost of Ownership)
  - minimize hiring technical and administrative personnel
  - avoid utility bills (electricity, etc.)
• Scale up or down (elasticity)
• Flexibility and Agility - fast deployment of resources when needed, experimentation and innovation
• Global reach (multiple region deployment)

What is Cloud Computing?
Delivery of virtualized IT resources as services over the Internet. Cloud Computing services are delivered in a scalable and secure manner on-demand from a remote data center on a pay-per-use basis and can be categorized into infrastructure services (IaaS), platform services (PaaS) and software services (SaaS).

Cloud Computing
• Elastic IT services delivered on demand on a pay-as-you-go basis
• Benefits
  - Agility
  - Elasticity - up and down scalability
  - Cost-predictability - no upfront CapEx
  - Economies of scale
  - Global deployment
  - Allows focus on core business

Industrialization of IT
• Economies of scale - massive data centers with users sharing H/W and S/W services
• Commoditization of hardware, ubiquitous network connectivity, virtualization, Infrastructure as Code
• Democratization of access to advanced technologies without requiring specialized skills or extensive resources
• Increasing maturity and specialization results in redistribution of responsibilities between cloud providers, end-user organizations and various types of third parties

Course Objectives
• Appreciate cloud computing challenges and opportunities
• Understand the main principles of distributed computing
• Understand components of the cloud computing environment, and the delivery and deployment models
• Understand the underlying principles and techniques of NoSQL databases
• Gain practical experience with selected AWS services

Topics
• Introduction: current trends, evolution of IT, business and technology drivers
• Distributed computing concepts: transactions, data replication, APIs, RPCs, message queuing, virtualization, etc.
• Cloud computing concepts: deployment models (private, public and hybrid clouds), delivery models (SaaS, IaaS, PaaS), multitenancy, cloud benefits and challenges, etc.
• Microservices Architecture, DevOps: continuous delivery, continuous deployment and continuous integration

Topics …
• Amazon Web Services (AWS): AWS security, EC2, S3, RDS, MongoDB, Lambda, ML, etc.
• NoSQL databases: principles and technology: CAP theorem, Document databases, Graph databases, Column databases, etc.
• Cloud migration: identifying opportunities for cloud, cost-benefit analysis, provider selection, etc.
• Cloud computing trends

Tutorials and AWS Exercises
• Videos and tutorial questions
• AWS Lab Exercises using AWS Academy
  - S3 storage services
  - EC2 service deployment
  - Lambda functions
  - AWS RDS service: MySQL
  - NoSQL databases: MongoDB
  - Machine Learning
  - Speech and Image recognition
  - Programming assistant - Amazon Q

Distributed Computing Concepts and Techniques
• Evolution of Distributed Computing
• Client/server Architecture
• Remote Procedure Calls (RPCs)
• Message Queuing
• Application Programming Interfaces (APIs)
• Transactions and Consistency
• Data Replication
• Service Oriented Architecture (SOA)
• Microservices Architecture

Copyright© George Feuerlicht 2025 Concepts - Lecture 2

Learning Objectives
• understand core distributed computing concepts
• understand SOA
• understand basic client/server communication models (RPCs and message queuing)
• understand transactions
• understand data replication
• understand the operation of APIs

Enterprise Computing Objectives
• Functionality as per user requirements
• Maintainability, flexibility and reuse
• Reliability (consistency, fault tolerance, recovery)
• Elasticity (up and down scalability)
• Security (authentication, authorization, physical security)
• Cost minimization (ROI)
• How to configure computing resources to support the above requirements (architecture)?
• Rapidly evolving technology

IT Architecture
• Presentation Layer
• Business Process Layer
• Data Layer
Decisions about implementation of the layers:
What devices/platforms should they run on?
How do they communicate?
How to maximize performance and minimize the cost?
As IT evolves the answers to these questions may change.

Evolution of Distributed Computing
• Centralized mainframe computing (1960s)
• Client/server Computing (1980s)
• Distributed Objects and Components (1990s)
• Service Oriented Computing (2000s)
• Cloud & Microservices Architecture (2010 - )
• Architectures evolve, but most concepts remain the same

Centralized Mainframe Computing
[figure]

IBM 1800 Computer (1976)
• 64 KB memory
• 1-5 MB disks
• punch cards
• magnetic tape
• Fortran/Cobol

2-Tier Client/Server Architecture
PC revolution …

3-Tier Client/Server Architecture
More tiers -> better performance but more complexity …

Client/Server and Middleware
[Figure: three clients connected through middleware (SOA, ESB, MQS) to two servers]

Client/Server Communications
• Synchronous vs Asynchronous communications
  - Synchronous - client waits for the response before proceeding with the next request
  - Asynchronous - client proceeds without waiting for a response from the server
• Message vs Procedure Call (RPC)
  - Message-style interaction (REST) - client sends a message with an (XML/JSON) document as payload
  - RPC-style interaction - client executes a remote procedure call

Synchronous vs Asynchronous Communication
[figure]

Remote Procedure Calls
• RPCs are a basic client/server communication mechanism - an extension of procedure calls
• RPCs are supported at programming, database, and operating system levels
• RPCs can operate synchronously or asynchronously
• Examples: DCE RPC, SOAP RPC, Database RPC, etc.

Message Queuing
• Message queuing is a communication model suitable for highly distributed, heterogeneous and autonomous applications
• Message queuing applications communicate using the store-and-forward paradigm and are more resilient to network, machine, and application failures
• The asynchronous nature of message queuing enables applications to be loosely coupled and highly autonomous

Message Queuing
• Asynchronous operation
• Store and Forward communication model
• Reliable, recoverable queues
[Figure: Site A and Site B, each with its own queue, connected by Message Queuing Middleware]

Transactions
• Centralized and distributed DBMS
• TP monitor middleware - X/Open DTP
• Corba - Object Transaction Service (OTS)
• EJB - Java Transaction Service (JTS)
• Web Services (XLANG, BPEL)
• NoSQL databases: MongoDB, Neo4j, etc.

Full Consistency Model
• Synchronous tightly coupled transactions
• Consistent state is reached at the end of each transaction
• All operations (sub-transactions) complete in the context of a single atomic transaction
• All resources must be available for the transaction to complete
• Failure recovery (1PC, 2PC, 3PC..) - all sub-transactions must be undone

ACID Transactions
• Atomicity - all operations or none
• Consistency - data consistent after transaction completes
• Isolation - partial results not revealed to other transactions
• Durability - committed transactions cannot be undone
• Gold standard

Commit and Rollback
[Figure: two transaction timelines, each running operations u1, I1, u2, d1 after begin; the first ends with commit, the second with rollback]

Failure Recovery
• Rollback Recovery
  Rollback recovery takes place when a failure occurs during the execution of a transaction; the original state of the database (i.e. the state at the beginning of the transaction) is restored.
• Rollforward Recovery
  Rollforward recovery is used to recover from media failures (e.g. disk crashes) and involves re-applying all committed transactions to the database, starting from a specific point in time. Rollforward recovery is preceded by restoring the database from a backup.
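
Atomicity and rollback recovery, as described above, can be tried out locally with an embedded database. The following is a minimal sketch, assuming a single in-memory SQLite database rather than the DBMS products discussed in the lecture; the table layout, account numbers, balances and function names are illustrative only. A failed debit causes the whole transfer, including the already applied credit, to be rolled back.

```python
import sqlite3


def open_bank():
    """Create an in-memory database with two illustrative accounts."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE account ("
        "accno INTEGER PRIMARY KEY, "
        "amount INTEGER NOT NULL CHECK (amount >= 0))"
    )
    conn.executemany("INSERT INTO account VALUES (?, ?)", [(77608, 500), (99609, 100)])
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one atomic transaction."""
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if an exception is raised inside it.
        with conn:
            conn.execute("UPDATE account SET amount = amount + ? WHERE accno = ?", (amount, dst))
            # An overdraft violates the CHECK constraint and raises IntegrityError.
            conn.execute("UPDATE account SET amount = amount - ? WHERE accno = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {accno: amount} for all accounts."""
    return dict(conn.execute("SELECT accno, amount FROM account ORDER BY accno"))


if __name__ == "__main__":
    bank = open_bank()
    print(transfer(bank, 77608, 99609, 100), balances(bank))   # committed
    print(transfer(bank, 77608, 99609, 1000), balances(bank))  # rolled back
```

The second transfer credits the destination account first and then fails on the debit; after the rollback both balances are the same as before the transfer started, which is the "all operations or none" property.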

Distributed Transactions (synchronous)
    whenever error rollback;
    BEGIN T1:
      update savings_account@syd set amount = amount - 100 where accno = 77608;
      update cheque_account@mel set amount = amount + 100 where accno = 99609;
      commit;
    END
[Figure: the two sites involved, Sydney (syd) and Melbourne (mel)]

2-Phase Commit Protocol
• Coordinator (C)
• Participants (P)
• Two phases are involved:
  - During phase one the coordinator messages participants to establish if they are ready to commit
  - During phase two all participants must complete the transaction as instructed by the coordinator
• Coordinator and participants record all events in local logs before sending messages

2PC Suitability
• 2PC is a synchronous protocol resilient to all failures
• Some drawbacks
  - Complexity and performance issues
  - Blocking - participants must retain resources until the transaction completes
  - All participants must be available for the transaction to commit (limited scalability)
  - Not suitable for unreliable networks and complex network topologies

Eventual Consistency Model
• Asynchronous operation
• Sub-transactions proceed independently
• Loose (eventual) consistency - consistency is eventually restored
• Failure recovery may be required
• Suitable for distributed systems with high latency and low reliability

Data Replication
• Maintaining multiple copies of data objects
• Replication transparency - controlled redundancy
• Synchronous vs. asynchronous replication
• Replication can improve performance and availability
• Replication is used to synchronize data across multiple databases or storage systems

Synchronous Replication
• Tight consistency model
• All copies are always up-to-date
• Updated in a single transaction (2PC)
• Suitable for reliable and fast network environments
• Only needed for applications that cannot tolerate asynchrony

Asynchronous Replication
• Loose consistency model - deferred updates, eventual consistency
• Inconsistencies may arise during the latency period (the time it takes to update secondary sites)
• Suitable for unreliable or intermittently connected networks, e.g. mobile computing
• Most applications can tolerate some degree of asynchrony

Master-Slave Replication
• table snapshots, materialized views, etc.
• refreshed at regular intervals
• full or incremental refresh
• can be performed during off-peak periods
[Figure: Master -> Replica, regular refresh]

Master-Slave Replication …
• store and forward replication
• each transaction is queued up separately
• shorter latency period
[Figure: Master -> Replica, queued transactions]

Multi-Master Replication
• All sites are equal - any copy can be targeted for updates
• Conflicts may arise when multiple sites update the same records within the latency interval
[Figure: Master <-> Master, queued transactions]

Conflict Resolution
• Conflict resolution
  - System detects conflicts
  - Application defines a resolution algorithm
• Standard resolution algorithms
  - timestamp: the most recent update wins
  - commutative resolution of additive updates
  - site with the highest priority value
  - min/max selection of updates

Application Programming Interfaces (APIs)
• Primary objective - to gain independence of the underlying platforms
• Portability across different platforms
• Standard APIs avoid the need for point-to-point integration solutions
• JDBC, Google Maps API, AWS service API, Kubernetes...
• Extensive use in web and cloud applications

Service-Oriented Architecture (SOA)
• SOA is a set of architectural concepts and principles that include design patterns, development methods and related technologies for the implementation of service-oriented applications
• Services are a basic SOA abstraction; SOA applications are compositions of loosely coupled, autonomous services
• Web services - standards for machine-to-machine communication using SOAP or REST

SOA Motivations
• Requirement to support inter-enterprise business processes, i.e. interactions between organizations by forming supply chains, or by outsourcing individual business functions to external (cloud) service providers
• Need for a more flexible and responsive intra-enterprise computing architecture
• Standard interfaces (XML, JSON) - implementation using different languages and technologies

Properties of Services
• Functional properties
  - Interface definition (e.g. WSDL)
  - Service methods
  - Protocols - REST, SOAP
• Non-Functional properties (QoS)
  - Security
  - Availability
  - Response time
  - Price, etc.

Web Services Standards
[figure]

SOAP
• Web Services SOAP is a standard protocol for communication between services and a mechanism for error handling
• Extensibility mechanism
• Conventions for representing data structures in XML
• Binding to network protocols (HTTP)
• Supports Remote Procedure Calls (RPCs) and XML document interchange

REST (Representational State Transfer)
• Architectural Styles and the Design of Network-based Software Architectures, Roy Fielding, 2000
• Stateless operation: each request must contain all the information necessary to perform the operation, i.e. no context is stored on the server
• Named resources using URLs
• Uniform API: all resources are accessed with a generic interface (HTTP GET, POST, PUT, DELETE).
• Uses web infrastructure to implement Web Services: HTTP, but not SOAP

Microservices Architecture
• Development of applications as loosely-coupled independently deployable services
• Each fine-grained service typically implements a single (cohesive) function (or set of functions) and communicates using a well-defined interface
• Microservices are typically implemented and deployed as (Docker) containers
• Light-weight messaging protocols
• Scalability and fault tolerance via container replication

Microservices Architecture …
The microservice architectural style is an approach to developing an application as a suite of small services, each running in its own process and communicating with lightweight mechanisms.
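
To make "small services communicating with lightweight mechanisms" concrete, here is a minimal sketch of one service exposing a stateless HTTP/JSON resource and a client calling it, using only the Python standard library. Everything specific in it is an assumption made for illustration: the appointments data, the /appointments/<patient_id> URL scheme and the function names are hypothetical, and the service runs in a background thread of the same process purely to keep the example self-contained (a real microservice would run in its own process or container).

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical in-memory data for an illustrative "appointments" service.
APPOINTMENTS = {"p1": ["2025-01-20 09:00"], "p2": []}


class AppointmentsHandler(BaseHTTPRequestHandler):
    """Serves GET /appointments/<patient_id> as JSON; keeps no per-client state."""

    def do_GET(self):
        parts = self.path.strip("/").split("/")
        if len(parts) == 2 and parts[0] == "appointments" and parts[1] in APPOINTMENTS:
            status = 200
            payload = {"patient": parts[1], "appointments": APPOINTMENTS[parts[1]]}
        else:
            status, payload = 404, {"error": "not found"}
        body = json.dumps(payload).encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress request logging to keep the example output quiet


def start_service():
    """Start the service on a free local port; returns (server, base_url)."""
    server = HTTPServer(("127.0.0.1", 0), AppointmentsHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_port}"


def get_appointments(base_url, patient_id):
    """Client side: how another service could call the API over HTTP."""
    # An empty ProxyHandler makes the request go directly to localhost.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(f"{base_url}/appointments/{patient_id}") as response:
        return json.loads(response.read())


if __name__ == "__main__":
    server, base_url = start_service()
    try:
        print(get_appointments(base_url, "p1"))
    finally:
        server.shutdown()
        server.server_close()
```

The caller depends only on the URL and the JSON document, not on how the service is implemented, which is what allows each service to be developed, deployed and scaled independently.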
• Benefits
  - Short development cycles
  - Incremental development
  - Fast deployment
  - Independent scalability

Medical Surgery Management Microservices
• Surgery Management Service
• Doctor Management Service
• Nurse Management Service
• Admin. Service
• Prescription Service
• Staff Management Service
• Patient Management Service
• Appointments Service
- each microservice is deployed on a number of separate nodes
- microservices communicate via API calls

DevOps (Development and Operations)
• DevOps combines software development (Dev) and software operation (Ops)
• Automation and monitoring of the entire SDLC (development, integration, testing, deployment and infrastructure management)
• Small focused teams, shorter development cycles, increased deployment frequency
• Closely linked with the use of microservices and cloud computing

DevOps and Microservices Architecture
• Microservices are typically implemented using (Docker) containers deployed on a cloud infrastructure and communicate via stable APIs.
• Containerization of applications enables rapid deployment and scalability and is essential for the implementation of microservices architecture.
• DevOps teams can rapidly respond to user requests for enhancements and bug fixes without impacting other components of the application.

DevOps and the Cloud
• Cloud brings many benefits for both development and operations teams.
• Cloud enhances DevOps by abstracting the complexity of the underlying hardware and software environments.
• H/W and S/W resources are treated as services that are controlled programmatically via service APIs.
• Cloud platforms support automatic infrastructure provisioning, configuration, and management, and this enables shorter development cycles as well as improved collaboration between development and operations teams.

Scalability and Fault Tolerance Considerations
• Modern cloud applications need to be designed for fault tolerance and scalability.
• Vertical scalability - a system scales vertically (scale up) when it is expanded by adding computing resources (increasing the number of CPUs, main memory, storage, and network interfaces) to an existing computing node
• Horizontal scalability - a system scales horizontally (scale out) by adding share-nothing processing nodes. The workload and data are distributed across multiple nodes. Fault tolerance is achieved by multiple nodes running identical microservices.

Vertical vs. Horizontal Scalability
[figure]

Horizontal Scalability
• Share-nothing architecture - each node contains all the resources (CPUs, RAM, storage, etc.)
• Scalability is achieved by adding cluster nodes
• Sharding data into partitions across multiple nodes allows storage of very large data volumes
• Asynchronous data replication is used to improve availability (3-4 copies of each record)
• Data consistency becomes an issue in systems that use asynchronous replication

Fault Tolerance
• Failures occur when system capacity is exceeded (out of memory, disk space, or computing capacity) or due to an unexpected event (node failure, power interruption, disaster, etc.)
• Redundancy - all critical components of the system are implemented redundantly, no single point of failure
• Reserve capacity - all resources should have reserve capacity to handle peak loads
• Recoverability - the ability of the system to recover from a failure. This includes backup and recovery, stand-by systems, etc.

Summary
• IT architecture evolves to take advantage of more powerful, faster and less expensive hardware components, balancing the various requirements of enterprise applications.
• The latest iteration is cloud computing, which utilizes large numbers of independent commodity computing units (nodes) interconnected with a very fast network to service large user populations and manage massive volumes of data.

Cloud Computing
• What is Cloud Computing?
• Cloud service models: SaaS, IaaS, PaaS
• Cloud deployment models
• Cloud Computing pre-requisites
• Virtualization: virtual machines and containers
• Multitenant architecture
• Cloud Computing benefits and challenges
• Impact of Cloud Computing

Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3

Learning Objectives
• understand the motivations and key concepts of Cloud Computing
• understand Cloud Computing models: SaaS, IaaS, PaaS
• understand the benefits and challenges of Cloud Computing
• appreciate the impact of Cloud Computing

What is Cloud Computing?
• Cloud Computing is an umbrella term with no precise definition.
• There is some disagreement about the definition of Cloud Computing. Some authors insist on multitenancy for SaaS applications. There is also a discussion about the minimal scale of a cloud, automatic provisioning, etc. Some experts dismiss the concept of a private cloud.
Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3 3 3 NIST Definition “Cloud computing is a pay-per-use model for enabling available, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.” [The National Institute of Standards and Technology (NIST), Information Technology Laboratory] Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3 4 4 2 Cloud Computing Pre-requisites l Reliable and low-cost communications l Virtualized commodity hardware resources l Up and down scalability - elasticity l Fault tolerance and recoverability l Autonomic operation – minimal manual intervention l Multitenant architecture for SaaS and PaaS Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3 5 5 Virtualization l Infrastructure as software - software representation of servers, storage, networks, etc. 
l Virtual Machine Images (VMI) l Benefits of virtualization l Improves efficiency and agility l Faster server provisioning and recovery from failure l Isolation of applications l Hardware independence Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3 6 6 3 Virtual Machines l Complete pre-configured images of applications and OS - abstraction of physical hardware l Amazon AMIs, DMTF's (Distributed Management Task Force) OVF (Open Virtualization Format) l Hypervisor manages multiple VMs that run on a single physical machine l VMs can be slow to deploy and boot (GBs in size) Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3 7 7 Container-Based Virtualization l Application layer abstraction that packages application code and all its dependencies l Multiple containers can run on the same machine and share the OS kernel, each running as an isolated process l Containers take up less space than VMs (typically tens of MBs) l Faster provisioning and recovery Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3 8 8 4 Container-based Virtualization Virtual Machines Containers Copyright© George Feuerlicht 2025 Cloud Computing - Lecture 3 9 9 Container Management l Management of large-scale container-based environments requires automation to ensure fast and predictable application deployment, auto-scaling, load balancing and control of resource usage l Requirement for portability across different public and private clouds (AWS, Azure, GCP, on-premises, etc.) 
• Amazon Elastic Container Service (Amazon ECS), Google Kubernetes Engine (GKE), Azure Kubernetes Service (AKS), Amazon Elastic Kubernetes Service (Amazon EKS)

Kubernetes Project
https://kubernetes.io/
• Kubernetes is a system for automating deployment, scaling, and management of containerized applications
• Initiated by Google in 2014 as an open-source cluster manager for Docker containers
• Hosted by Cloud Native Computing Foundation (CNCF)
• Participants: Google Cloud Platform, Microsoft Azure and most recently AWS

Kubernetes Concepts
• Environment for services, not machines
• Abstract the complexity of the underlying cloud infrastructure using a set of well-designed APIs
• Self-healing: auto-placement, auto-restart, auto-replication, auto-scaling
• Portable: public, private, hybrid, multi-cloud
• Well-defined APIs as the main mechanism for ensuring extensibility and portability

Multitenant Architecture
• Tenants operate in virtual isolation from one another
• Clear separation of tenant data and the metadata that describes each application
• Each tenant has its own virtual database
• Multitenant Data Model – each tenant can extend its database (e.g. add columns to tables, add new database objects, etc.)

Multi-instance vs Multitenant Architecture
• A tenant is a group of users (organization) that shares common access to its own version of the application and data, and may be subject to a separate SLA (Service Level Agreement)
• Multi-instance architecture – a separate software instance operates on behalf of each tenant (e.g.
separate databases for every tenant) – does not scale
• Multitenant architecture - a single instance of software serves multiple tenants

Polymorphic Applications
• Customizable functionality: user interface, business logic, database schema
• Different runtime application behavior for different tenants
• Application components configured at runtime; tenants can use different versions of application modules
• Multitenant query optimization
• Multitenant indexing

Cloud Service Models
• IaaS (Infrastructure as a Service)
  • Networks, computers, storage, …
• PaaS (Platform as a Service)
  • Development and deployment of applications
• SaaS (Software as a Service)
  • Complete applications, e.g. Gmail, CRM, …
• Different levels of control and responsibilities

[Figure: Cloud Service Models. Source: Microsoft Azure]

AWS (IaaS)
• Amazon EC2 compute services
• Amazon S3 storage services
• Amazon RDS database services
  • MySQL, PostgreSQL, Oracle, SQL Server, etc.
• Amazon NoSQL database services
  • DynamoDB, Amazon Neptune, DocumentDB
• Amazon VPC service
• https://aws.amazon.com/

Microsoft Azure (IaaS)
• Virtual Machines: Windows and Linux
• Networking: virtual networks, load balancer, VPN gateway
• Container services: Azure Kubernetes Service (AKS), Container Registry, Kubernetes
• Databases: MySQL, PostgreSQL, etc.
• Data and Analytics: SQL Data Warehouse, Machine Learning
• https://azure.microsoft.com/en-us/

Google App Engine (PaaS)
• Google App Engine for Java
  • Develop and deploy Java and J2EE applications
  • Support for Java Servlets, Java Server Pages (JSP), Java Data Objects (JDO), Java Persistence API (JPA), etc.
• Ruby, C#, Go, Python, or PHP, or bring your own language runtime
• https://cloud.google.com/appengine/
• https://cloud.google.com/products/

Salesforce.com (SaaS) (www.salesforce.com)
• Over 150,000 companies sharing Salesforce CRM applications and infrastructure
• Multitenant architecture
• Metadata-based customisation enables users to retain the changes to their applications with future software upgrades
• Integration via Web Services, REST services, and social networking platforms: Facebook, etc.
• Analytics and forecasting

Function as a Service (FaaS)
• Serverless computing is a cloud execution model that dynamically allocates compute resources needed to execute a particular piece of code.
• Avoids the need for server provisioning
• The cloud provider is responsible for provisioning and maintenance of all the resources required to execute the code; consumers pay for the duration of the function execution (in seconds).
• Examples: AWS Lambda, Azure Functions, Google Cloud Functions, IBM OpenWhisk, etc.
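
As a rough illustration of the FaaS model described above, the sketch below shows a stateless function written in the handler style that AWS Lambda uses for Python (a function that receives an `event` and a `context`). The event shape, the `name` field and the function name `handler` are assumptions made for this example only. The function is called locally here rather than deployed to a provider.

```python
import json


def handler(event, context):
    """Stateless function: everything it needs arrives in the event."""
    # 'name' is a hypothetical field chosen for this illustration.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "body": json.dumps({"message": f"Hello, {name}!"}),
    }


if __name__ == "__main__":
    # A local call standing in for an invocation by the platform.
    print(handler({"name": "cloud"}, None))
```

Because the function keeps no state between calls, the platform is free to run as many copies as the incoming event rate requires and to bill only for the time each call takes.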

Characteristics of Serverless Computing
• Supports event-driven programming
• Fully managed stateless services
• Rapid transparent scalability
• On-demand services - pay for execution time
• Built-in logging & monitoring
• Next step in virtualization – unit of compute
• https://aws.amazon.com/serverless/

Cloud Deployment Models
• Public cloud - cloud infrastructure is owned and managed by the cloud provider and is available to the public over the Internet. Public cloud services are typically offered on a pay-per-usage basis.
• Private cloud - cloud infrastructure is owned or leased by a single organization and is operated solely for that organization.
• Hybrid cloud - combination of public cloud and on-premises systems/private cloud.

Government Cloud
“AWS GovCloud (US) is an isolated AWS region, subject to FedRAMP High and Moderate baselines, that allows customers to host sensitive Controlled Unclassified Information (CUI) and all types of regulated workloads. The region is operated by employees who are U.S. citizens on U.S. soil. The region is only accessible to vetted U.S. entities and root account holders, who must confirm they are U.S.
Persons to gain access to this region.” https://aws.amazon.com/govcloud-us/

[Figure: Government Cloud]

Benefits of Cloud Computing
• Reduced cost – elimination of up-front costs
• Predictability of costs – “pay as you use”
• Elasticity – up and down scalability
• Possibility of short-term renting of IT resources
• Transference of risk: responsibility for operation and upgrades transferred to the provider
• Rapid implementation
• Reduced demand for skills and IT staff
• Enables new innovative solutions and experimentation

Cost Reduction
• Improved hardware utilization from < 20% to > 80%
• Additional savings by the placement of data centers in locations to minimize the cost of labor, electricity, accommodation, taxes, etc.
• Overall cost reduction by a factor of 5-7

Cloud Computing Challenges
• Customer lock-in
• Business continuity and service availability
• Data confidentiality and security
• Data transfer bottlenecks
• Performance unpredictability
• Most of these concerns have been addressed by large cloud providers (AWS, MS Azure, Google Cloud, etc.)

Customer lock-in
• Concerns about the difficulty of extracting data from the cloud are preventing some organizations from adopting cloud computing
• Users may be vulnerable to price increases, reliability problems, providers going out of business or providers de-platforming user organizations
• Cloud native vs.
cloud agnostic solutions: Kubernetes, open-source software

Business Continuity and Service Availability
• Service availability of large cloud computing providers is typically > 99.9%, i.e. less than 8.77 hours of downtime per year
• Multiple data centers in different locations and multiple network providers
• Amazon EC2 service level agreement guarantees 99.95% availability for each region for multiple availability zone operation
• Non-technical outages: provider going out of business, target of regulatory action, de-platforming

Provider Liability – Small Print
• “We will have no liability to you for any unauthorized access or use, corruption, deletion, destruction or loss of Your Content or Applications” [Customer Agreement, Amazon Web Services]
• “Salesforce.com shall not be responsible or liable for the deletion, correction, destruction, damage, loss or failure to store any customer data”

Data Confidentiality and Security
• Security has been one of the most often-cited concerns about cloud computing
• Auditability requirements (Sarbanes-Oxley in USA)
• Regulations that affect location of corporate data
• Responsibility is divided among parties, including the cloud user and the cloud provider
• Cloud providers must guard against theft or denial-of-service attacks. Users need to be protected from one another, and from the provider.
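
The availability percentages quoted under Business Continuity and Service Availability can be turned into a yearly downtime budget with simple arithmetic. The sketch below assumes a year of 365.25 days (8,766 hours), which reproduces the 8.77 hours quoted for 99.9%. The function name is chosen for this illustration.

```python
HOURS_PER_YEAR = 365.25 * 24  # 8766 hours, averaging over leap years


def annual_downtime_hours(availability_percent):
    """Maximum downtime per year implied by an availability percentage."""
    return (1 - availability_percent / 100) * HOURS_PER_YEAR


if __name__ == "__main__":
    for pct in (99.9, 99.95, 99.99):
        hours = annual_downtime_hours(pct)
        print(f"{pct}% availability -> {hours:.2f} hours of downtime per year")
```

Each additional "nine" cuts the permitted downtime by a factor of ten, which is why small differences in an SLA percentage matter in practice.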

Salesforce Security Certifications
[Figure headings: Most Complete Certifications, Most Trusted, Largest Security Team]
• Certifications: ISO 27001; SAS 70 Type II; SysTrust certified; FDA 21 CFR Part 11 & Part 820
• Application Security: SSL encryption; identity confirmation
• Network Security: IP restrictions; fault tolerant, multi-layered firewall; intrusion detection; 3rd party assessments
• Facility Security: 24 x 365 security; biometric readers; silent alarms; CCTV; motion detection

Data Transfer Bottlenecks
• Data transfer can be expensive – for AWS: $100 per terabyte (depends on the region)
• It is often less expensive and quicker to physically ship disks than to use cloud data transfer
• 10 TB at 20 Mbit/s = more than 45 days ($1,000)
• AWS provides import/export courier service

Performance Unpredictability
• Problem of I/O interference between virtual machines, large variability in disk write bandwidth
• Unpredictability in running large batch jobs on large clusters

Identifying Opportunities for Cloud Deployment
• Good solution for unpredictable or variable demand on compute and storage resources
• Batch processing applications – can take advantage of parallelism
• Suitable for new startup operations – no need for on-premises IT
• On-going costs need to be evaluated

Conclusions
• Most enterprise applications will be delivered as cloud services in the future
• Each scenario requires careful evaluation balancing advantages against drawbacks
• Some standardization efforts in progress (Open Cloud Consortium), but existing providers use proprietary architectures and APIs
• Rapid developments – many players, consolidation is inevitable

Amazon Web Services
• Introduction to AWS
• Service categories
• Global infrastructure
• AWS security and Virtual Private Cloud (VPC)
• AWS Service examples
  • EC2, Elastic Beanstalk, ELB, Lambda, EKS, ML, CloudFormation
• Demos/Exercises
  • EC2, AWS Lambda
Copyright© George Feuerlicht 2025 Compute Services - Lecture 4

Learning Objectives
• gain familiarity with a range of AWS compute services
• demonstrate and gain hands-on experience with selected AWS compute services

AWS Core Services
• Provision of IT resources (compute, network, database, etc.) on-demand using the pay-as-you-go model over the internet
• Platform to build scalable applications using services as building blocks
• Enterprise applications are a collection of remotely hosted services controlled via well-defined APIs
• Improved productivity throughout the SDLC

Service Categories
• Compute (EC2, etc.)
• Storage (S3, etc.)
• Database (RDS, NoSQL, etc.)
• Security (IAM)
• Cost management and budgeting (AWS Budgets)
• Governance (CloudTrail, etc.)
• Machine Learning
• and many more
• https://aws.amazon.com/

Interacting with AWS Services
• using AWS Management Console
• via REST-like interfaces
• using AWS Command Line Interface (CLI)
• using SDKs (Software Development Kits) for C#, Java, Node.js, etc.

AWS Global Infrastructure
• Regions, Availability Zones, Edge Locations
• Ensures elasticity, fault-tolerance and high availability

Regions
• Region is a geographic location: US East (Ohio), (N. Virginia), US West (N.
California), (Oregon), Asia Pacific (Mumbai), (Osaka-Local), (Singapore), (Sydney), etc. + GovCloud region
• Proximity reduces latency
• Services may not be available in all regions and cost of services can vary
• Specific region deployment is used to satisfy data residency requirements (e.g. for healthcare data)
• Data is not replicated outside of the region

Availability Zones
• Region contains Availability Zones (AZs)
• AZ consists of one or more data centers within a region isolated from failures in other AZs.
• A data center typically contains 50-100 thousand servers
• AZs are isolated from each other, but are within 100 km of each other and connected with a fast low-latency link
• Independent power supply and utility companies, backup generators, cooling equipment, network connectivity (different network providers)
• Replicating across availability zones improves resiliency

Edge Locations
• Content delivery network – Amazon CloudFront
• Located in highly populated locations
• Content Delivery Network (CDN)
  • Amazon CloudFront speeds up distribution of static and dynamic web content, such as image files
  • CloudFront delivers content through a worldwide network of data centers called edge locations
  • User is routed to the edge location that provides the lowest latency using Amazon Route 53 DNS

AWS Security
• Data center physical security
  • Electronic surveillance
  • 24/7 security guards
• AWS Identity and Access Management (IAM)
• Multi-factor access control
  • username/password, SMS code, MFA device
• AWS Shield (Denial of Service)
• VPC (Virtual Private Cloud)
• Security Groups (Firewall)

Security and Access Control
• IAM (Identity and Access Management) defines who can access which services
• CloudTrail – audit trail of all user activity and API access
• CloudHSM (Hardware Security Module) – storage of encryption keys
• VPC (Virtual Private Cloud) – control of network addresses, rules for allowing traffic flows
• Direct Connect – into AWS region (uses AWS global network, not the public internet)

Amazon Virtual Private Cloud
• Creates a private network within AWS cloud
• Several layers of security controls – ability to allow and deny specific internet and internal traffic
• VPC integrates with other AWS services, e.g. EC2, RDS, etc.
• VPC lives within a region
• Multiple VPCs can be created for each account

VPC …
• VPC defines an IP address space that can be divided into subnets – each subnet is deployed in an AZ, so that VPC can span AZs
• Route tables control traffic out of subnets
• Subnets can be public (with internet access) or private (without internet access)
• Public subnets need an internet gateway (IGW)
• EC2 instances need a public IP address to route to the IGW

[Figure: VPC]

Security Groups
• Firewalls for virtual servers
• Filter traffic to instances
• Security rules for inbound and outbound traffic
• Specifies allowed sources (IP addresses or other security groups) and the permitted ports and protocols, e.g.
  • Web tier accepts traffic on port 80/443
  • Application tier only accepts traffic from the Web tier
  • Database tier only accepts traffic from the Application tier
  • Remote administration (ssh port 22)

[Figure: Security Groups]

Compute Services
• Flexible and cost-effective
• Elastic - scale computing needs to match workloads
• Flexible configuration, e.g. for machine learning
• Management via API
• High reliability: 99.99%

Types of Compute Services
• Infrastructure services (EC2)
• Serverless (AWS Lambda)
• Container-based services: ECS (Elastic Container Service), EKS (Elastic Kubernetes Service), etc.
• Web applications (AWS Elastic Beanstalk)
• Select service based on application design, availability requirements and usage pattern

AWS EC2 (Elastic Compute Cloud)
• Resizable compute services running a range of O/S (virtual machine templates – AMIs)
• Selection of H/W & S/W and the hosting site
• Optimized EC2 instances
  • M1, M3 - general purpose
  • C1, C3, CC2 - compute optimized
  • I2, HS1 - storage and IO optimized
  • GPU (floating point operations) - G2, CG1
  • T2 - low cost instances (burst performance)
  • X1 - large-scale in-memory applications
  • C7gn with Nitro5 chip, 200 Gbps bandwidth for high performance computing – ML applications, etc.

EC2 Machine Images (AMIs)
• Standard and modified AMIs
• Different types and sizes and networking capacity - t3.large – general purpose, 3rd gen.
large
• Choice of region, subnet and VPC for AMI deployment
• Allocation of public IP address, roles, storage options (EBS, S3, size, encryption)

Classic Load Balancer (CLB)
• Operates at Layer 4 (Transport layer) of OSI
• Access through a single point to a collection of EC2 instances
• Provides high availability and fault tolerance
• Increased scalability and elasticity
• Round robin distribution for HTTP requests
• Distributes requests across availability zones

[Figure: Classic Load Balancer (CLB)]

Application Load Balancer (ALB)
• Operates at Layer 7 (Application layer) of OSI
• Native support for microservices and container-based architectures
• Directs traffic based on the content of the URL to containerized microservices listening on different ports
• Protocols: HTTP, HTTPS, HTTP/2, etc.
• Health checks: ping, repeat interval, timeout

[Figure: Application Load Balancer]

Auto Scaling
• Auto scaling ensures that the correct number of EC2 instances are available for the service to meet the workload requirements
• Maintain performance and minimize costs
• On-demand provisioning
• AWS CloudWatch monitoring
• Variable load requires allocating EC2 instances to handle peak loads and at the same time avoiding over-allocation and additional costs

Auto Scaling …
• Based on utilization
  • Adding EC2 instances – scaling out
  • Removing EC2 instances – scaling in
• Auto scaling components
  • Auto scaling policy: on-demand, scale out/in policy

[Figure: Auto Scaling]

CloudWatch Alarms
• Monitor EC2 instances and load balancer
• Triggered when a threshold (e.g. 80% utilization) is exceeded for a given duration (e.g. 5 mins)
• Action
  • increase group size by two instances if CPUUtilization > 80% for 5 mins
  • decrease group size by one instance if CPUUtilization < 20% for 5 mins

Cost Optimization Options
• On-Demand Instances – pay per second, no long-term commitment
• Spot Instances - applications that can tolerate interruptions, up to 90% discount, instance can be reclaimed with 2 mins warning
• Reserved Instances - long-term workloads with predictable usage patterns
• Dedicated Hosts – use existing per-core or per-VM software licenses

AWS Lambda
• Serverless computing - run code without provisioning and managing servers
• Event-driven microservices – changes to data (S3), HTTP requests, API calls, etc.
• Pay only when the application is running
• Autoscaling and logging
• Support for multiple languages: Java, C#, etc.
• Disk limited to 512 MB, memory limited to 1,536 MB, function execution max. 5 mins

Lambda Use Cases
• Image recognition, automated backups, IoT apps
• Trigger - upload image to S3, Lambda function used to re-size images to view on different devices
• Analysis of real time streaming data
• ETL pipelines for DW applications
• Sorting, data transformations

Amazon Elastic Container Service for Kubernetes (EKS)
• EKS runs Kubernetes instances across multiple Availability Zones to ensure high availability
• EKS automatically detects and replaces unhealthy instances, and provides automated version upgrades and patching
• EKS supports autoscaling to maintain desired application performance metrics

Horizontal Pod Autoscaler
• Metrics Server collects resource utilization metrics and the Horizontal Pod Autoscaler creates Pod replicas on demand

Machine Learning (ML) Services
• Text-to-speech (Polly)
• Image recognition (Rekognition)
• SageMaker – build and train ML models
• Amazon CodeWhisperer – coding recommendations
• Fraud Detector
• etc.
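
The scaling decision made by the Horizontal Pod Autoscaler described above can be sketched in a few lines. The Kubernetes documentation gives the rule as desired replicas = ceil(current replicas × current metric / target metric), with no change when the ratio is close to 1. The function name and the 10% tolerance default below are assumptions for this illustration, and the sketch ignores details such as minimum and maximum replica bounds and pods that are not yet ready.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric, tolerance=0.1):
    """Replica count suggested by the ratio of observed to target metric."""
    ratio = current_metric / target_metric
    # Inside the tolerance band the replica count is left unchanged,
    # which avoids constant small adjustments.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # Four pods at 90% average CPU against a 60% target: scale out.
    print(desired_replicas(4, 90, 60))
    # Four pods at 30% average CPU against a 60% target: scale in.
    print(desired_replicas(4, 30, 60))
```

Rounding up means the autoscaler errs on the side of extra capacity when the computed value is fractional.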

Automating IT Operations (AWS CloudFormation)
• Creates a set of resources such as Amazon EC2 instances, Amazon RDS database instances and Elastic Load Balancers needed to run an application
• The template describes what resources are needed and CloudFormation provisions the resources in a predictable fashion, handling and recovering from any failures

Tools and Utilities
• AWS Trusted Advisor – checks utilization, security, etc. and suggests how cost could be minimized, security improved, etc.

[Figure: AWS Billing]

[Figure: AWS Free Tier]

[Figure: Setting Budget]

Summary
• Hundreds of new services released each year
• Services available immediately to all regions and customers
• New features released almost daily
• Build IT infrastructure using AWS services

Amazon Storage and Database Services
• Amazon storage services
• AWS Relational Database Services (RDS)
Copyright© George Feuerlicht 2025 Storage Services - Lecture 5

Learning Objectives
• gain familiarity with AWS storage and database services
• demonstrate selected AWS services

AWS Storage Services
• Simple Storage Service - S3
• Elastic Block Store – EBS (use with EC2)
• Elastic File System – EFS (Network File System)

Amazon Simple Storage Service (S3)
• Managed secure cloud storage – not associated with any server
• Almost unlimited number of objects
(trillions)
• Almost unlimited object size (TBs)
• Any object type: images, documents, database snapshots, etc.
• Low latency access from anywhere via HTTP and HTTPS
• Security: VPC endpoint, data encryption, etc.

S3 …
• Durability: 99.999999999% = average annual expected loss of 0.000000001% of objects.
  • For example, if you store 10,000 objects with Amazon S3, you can on average expect to incur a loss of a single object once every 10,000,000 years
• Availability: 99.99%
• Regional assurance – data stays within a given region

[Figure: S3 Bucket Replication]

S3 Scalability
• S3 scales with growing data volumes
• S3 scales with growing number of requests
• S3 access via AWS Management Console, AWS CLI, AWS SDKs or via REST HTTP
• Cross-region replication – minimize latency
• S3 use cases include storing globally accessible application data, log files, backup and disaster recovery data

[Figure: S3 standard vs Glacier storage]

S3 storage classes
• S3 Standard for frequently accessed data
  • ms latency, SLA availability 99.9%, AZ >=3
• S3 Intelligent-Tiering - automatic cost savings for data with unknown or changing access patterns
  • ms latency, SLA availability 99%, AZ >=3
• S3 Express One Zone for most frequently accessed data
  • Single-digit ms latency, SLA availability 99.9%, AZ=1, MinStoreDuration=1h
• S3 Standard-Infrequent Access (S3 Standard-IA)
  • ms latency, SLA availability 99%, AZ >=3, retrieval charge, MinStoreDuration=30d

S3 storage classes …
• S3 One Zone-Infrequent Access (S3 One Zone-IA) for less frequently accessed data
  • ms latency, SLA availability 99%, AZ=1, retrieval
charge, MinStoreDuration=30d
• S3 Glacier Instant Retrieval for archive data that needs instant access
  • ms latency, SLA availability 99%, AZ>=3, retrieval charge, MinStoreDuration=90d
• S3 Glacier Flexible Retrieval for rarely accessed long-term data that does not require immediate access
  • SLA availability 99.9%, AZ>=3, retrieval charge, MinStoreDuration=90d
• Amazon S3 Glacier Deep Archive (S3 Glacier Deep Archive) for long-term archive and digital preservation with retrieval in hours at the lowest cost storage in the cloud.
  • SLA availability 99.9%, AZ>=3, retrieval charge, MinStoreDuration=180d

Uploading Data
• Drag and drop files
• Using CLI:
  • $ aws s3 cp filename s3://bucketname
  • $ aws s3 sync foldername s3://bucketname

Elastic Block Store
• EBS volumes – high performance block storage for EC2 instances (HDD and SSD), up to 16 TB, 250 MB/s
• Associated with a single instance – low latency
• Automatically replicated across multiple servers within the availability zone
• Point-in-time snapshots – volumes can be recreated from snapshots
• Snapshots can be copied to another AWS region for disaster recovery
• Resize volumes and change type on the fly

EBS Use Cases
• Big Data analytics engines
• Relational and NoSQL databases (Microsoft SQL Server and MySQL or Cassandra and MongoDB)
• Each Amazon EBS volume is designed for 99.999% availability and automatically replicates within its Availability Zone
• EBS encryption

Elastic File System (EFS)
• Shared file storage for multiple EC2 instances with automatic, high-performance scaling
• High throughput - up to 500,000 IOPS or 10 GB per second
• Use cases: content management systems, application development, storing code and media files

AWS RDS
• AWS RDS is a service - no ongoing administration
• Resizable capacity
• OS installation and patches
• Database installation and patches
• Automated backups
• High availability
• Server maintenance

Managing Relational Databases on Premises
• Server maintenance
• Ongoing energy costs
• O/S software installation and patches
• Database software installation and patches
• Database backups and restore
• Scalability limitations
• Responsibility over data security

AWS RDS Database Instance
• Database instance is an isolated database environment that can contain multiple databases
• Database instance
  • Class: CPU, memory, network performance
  • Storage: SSD, provisioned IOPS
• Database type
  • MySQL, Amazon Aurora, SQL Server, PostgreSQL, MariaDB, Oracle

[Figure: Running RDS DB in VPC Environment]

Stand-by Instances
• Database instance in a private subnet can only be accessed by selected application instances
• Database instance is isolated in a subnet associated with an AZ (physical location)
• Stand-by copy of database instance in a separate AZ
• Synchronous replication of transactions
• Automatic failover to a stand-by database if the master instance fails – no loss of data

[Figure: Multiple AZ Deployment]

Asynchronous Read Replicas
• Updates are asynchronously copied to the replicas
• Reduce load on master database by routing queries to read replicas
• Read replicas can be created in a different region than the master database

[Figure: Asynchronous Read Replicas]

RDS Use Cases
• Web and mobile applications
  • high throughput, massive storage, high availability, database monitoring
• e-commerce applications
  • low cost, data security, fully managed, auto-scaling, database monitoring

Amazon Aurora
• MySQL and PostgreSQL compatible enterprise-class database built for the cloud
• High performance and availability
• Managed service: AWS manages scaling, availability, backups, DBMS install and patches, OS install and patches, H/W maintenance
• Continuous backups to Amazon S3

Amazon Aurora …
• Up to 5 times the throughput of MySQL and 3 times the throughput of PostgreSQL
• Up to 64 TB of auto-scaling SSD storage
• 6-way replication across three AZs
• Up to 15 Read Replicas with sub-10 ms latency
• Automatic monitoring and failover in < 30 s

[Figure: Aurora Cluster]

Amazon Redshift
• Fast fully managed data warehouse
• Complex queries against PBs of data
• Parallel query processing
• Columnar storage (column-oriented)
• Strong encryption
• Standard SQL support
• Load data from external sources: S3, etc.
• Compatibility with SQL clients

Summary
• AWS supports an extensive range of storage and database services
• High scalability (elasticity) and performance
• High availability and durability of data
• Extensive security features

NoSQL Databases
• Database trends
• Motivations for NoSQL
• Types of NoSQL data stores
• Amazon DynamoDB
• MongoDB
• Neo4j

Copyright© George Feuerlicht 2025 NoSQL Databases - Lecture 6

Learning Objectives
• appreciate current database trends
• understand the motivations for NoSQL
• understand the differences between SQL and NoSQL databases
• understand the concept of eventual consistency
• understand different types of NoSQL databases

Big Data
• The amount of data collected by various IoT devices (sensors, computers, sound and video recorders, mobile phones, RFID readers, etc.) is growing at a compound annual rate of 60%
• Scientific (e-science) applications in astronomy, earth sciences, etc. produce massive amounts of data; the Large Hadron Collider at CERN produces 40 terabytes/second
• Structured data constitutes only about 5% of the total volume of generated data

Motivations for NoSQL
• Semi-structured data doesn’t fit naturally into relational tabular structures
• Need for adaptable database schemas
• Traditional RDBMSs cannot handle the high throughput required by internet-scale companies (Google, Facebook, Amazon, etc.)
• Vertical scaling is expensive and cannot deal with vast amounts of data; horizontal scaling provides a less expensive alternative

What problems are NoSQL databases addressing?
• Modern database challenges
  - Very large data sizes
  - Very large user populations
  - Data complexity
  - Reliability: avoid a single point of failure
• Solution: share-nothing cluster database architecture
• Problems with this solution
  - Complexity of the architecture
  - Maintaining data consistency

NoSQL Databases
• Key-value stores (DynamoDB, Berkeley DB)
• Column-oriented databases (Vertica)
• Document databases (MongoDB, CouchDB)
• XML databases (myXMLDB, Tamino)
• Graph databases (Neo4j)
• Google BigTable (Hypertable)
• In-memory data stores (VoltDB)
• Facebook Cassandra

NoSQL Databases …
• Divergent group of products, but with some common features
  - Most NoSQL databases are open source
  - Various data models: document, graph, etc.
  - Storage of aggregate/compound documents
  - Horizontal scaling (scale out)
  - Data sharding
  - Data replication
  - Relaxed data consistency (eventual consistency)
  - Relaxed database schema

Share-nothing cluster architecture
• Horizontal scaling typically involves thousands of machines in a share-nothing cluster
• This architecture typically involves data replication across nodes of the cluster to improve availability
• Data replication in a share-nothing cluster architecture forces a trade-off between consistency and availability

Trading Consistency for Availability
• Horizontal scalability: distributed share-nothing architecture; each processing node is independent and has all the required resources (i.e. processors, storage, and memory)
• Data partitioning and replication
• Asynchronous replication to ensure availability (typically 3-4 nodes hold copies of data records)
• Loose consistency model

CAP Theorem [Eric Brewer, 2001]
• Consistency (all replicas have the same value, every user has the same view of the database)
• Availability (continuous operation even when parts of the system fail, users can always read and write data)
• Partition tolerance (continuous operation when connectivity between segments of the network is interrupted)
• Brewer’s CAP Theorem: “distributed system sharing data cannot guarantee simultaneously all of three properties”
• ACID properties cannot be guaranteed for asynchronously replicated data in a distributed system

CAP Theorem …
• Strong consistency: every read receives the most recent write or an error
• High availability: every read receives a response, without a guarantee that it contains the most recent write
• Partition tolerance: the system continues to operate despite an arbitrary number of messages being dropped (or delayed) by the network between nodes

Network Partitioning
[Figure: an application and two asynchronously replicated copies of record R1, copy 1 (value = 100) and copy 2 (value = 110), held in data centre A and data centre B and separated by a network partition]

Eventual Consistency
• BASE (Basically Available, Soft state, Eventual consistency) as an alternative to ACID transactions
• Soft state: replicas may be inconsistent following a failure until consistency is restored
• Eventual consistency: consistency of replicas is
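
The network-partitioning scenario (record R1 holding 100 in one copy and 110 in the other) can be illustrated with a short Python sketch. It is only a toy model under stated assumptions, not how any particular NoSQL product is implemented: the `Replica` class, the `replicate` helper, the per-key version counter used for last-write-wins, and the choice of which data centre holds which value are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """One copy of the data held in a single data centre (hypothetical model)."""
    name: str
    # key -> (version, value); the version counter is an assumption used
    # to decide which write is newer (last-write-wins).
    store: dict = field(default_factory=dict)

    def write(self, key, value):
        version = self.store.get(key, (0, None))[0] + 1
        self.store[key] = (version, value)

    def read(self, key):
        return self.store[key][1]


def replicate(source, target, partitioned):
    """Copy newer versions from source to target.

    Returns False without copying anything while the network is partitioned.
    """
    if partitioned:
        return False
    for key, (version, value) in source.store.items():
        if version > target.store.get(key, (0, None))[0]:
            target.store[key] = (version, value)
    return True


a = Replica("data centre A")
b = Replica("data centre B")

# Both copies start consistent: R1 = 100.
a.write("R1", 100)
replicate(a, b, partitioned=False)

# A network partition occurs; the application keeps writing to B,
# so the system stays available but the copies diverge (soft state).
b.write("R1", 110)
replicate(b, a, partitioned=True)
during_partition = (a.read("R1"), b.read("R1"))

# Connectivity is restored and replication catches up: the replicas converge.
replicate(b, a, partitioned=False)
after_recovery = (a.read("R1"), b.read("R1"))

print(during_partition)  # (100, 110)
print(after_recovery)    # (110, 110)
```

During the partition a reader in data centre A still gets an answer (availability) but sees the stale value 100, which is the consistency that was traded away. The simple version counter is sufficient here only because one side accepts writes during the partition; concurrent writes on both sides would need a proper conflict-resolution scheme.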
