Cloud Computing - NIST Architecture PDF
Summary
This document provides an overview of cloud computing, including the NIST architecture and the roles of various actors involved in cloud computing, such as consumers and providers. It also discusses different types of cloud services like SaaS, PaaS, and IaaS.
Full Transcript
Cloud Computing Unit 1

NIST Architecture (National Institute of Standards and Technology)

The NIST cloud computing reference architecture defines five major actors: cloud consumer, cloud provider, cloud carrier, cloud auditor and cloud broker. Each actor is an entity (a person or an organization) that participates in a transaction or process and/or performs tasks in cloud computing.

![](media/image2.png)

***1. Cloud Consumer***: The cloud consumer is the principal stakeholder for the cloud computing service. A cloud consumer represents a person or organization that maintains a business relationship with, and uses the service from, a cloud provider. A cloud consumer browses the service catalog from a cloud provider, requests the appropriate service, sets up service contracts with the cloud provider, and uses the service. The cloud consumer may be billed for the service provisioned, and needs to arrange payments accordingly. Cloud consumers need SLAs, which cover terms regarding quality of service, security, and remedies for performance failures.

The activities of cloud consumers differ by service model:

i) SaaS: Consumers can be organizations that provide their members with access to software applications.

ii) PaaS: Consumers employ the tools and execution resources provided by cloud providers to develop, test, deploy and manage the applications hosted in a cloud environment. They can be application developers who design and implement application software, application testers who run and test applications in cloud-based environments, application deployers who publish applications into the cloud, and application administrators who configure and monitor application performance on a platform.
iii) IaaS: Consumers have access to virtual computers, network-accessible storage, network infrastructure components, and other fundamental computing resources on which they can deploy and run arbitrary software. Consumers can be system developers, system administrators and IT managers who are interested in creating, installing, managing and monitoring services for IT infrastructure operations.

***2. Cloud Provider***: A cloud provider acquires and manages the computing infrastructure required for providing the services, runs the cloud software that provides the services, and makes arrangements to deliver the cloud services to the cloud consumers through network access.

SaaS: The cloud provider deploys, configures, maintains and updates the operation of the software applications on a cloud infrastructure so that the services are provisioned at the expected service levels to cloud consumers.

PaaS: The cloud provider manages the computing infrastructure for the platform and runs the cloud software that provides the components of the platform, such as the runtime software execution stack, databases, and other middleware components.

IaaS: The cloud provider acquires the physical computing resources underlying the service, including the servers, networks, storage and hosting infrastructure.

***3. Cloud Auditor***: A cloud auditor is a party that can perform an independent examination of cloud service controls with the intent to express an opinion thereon. Audits are performed to verify conformance to standards through review of objective evidence. A cloud auditor can evaluate the services provided by a cloud provider in terms of security controls, privacy impact, performance, etc.

***4. Cloud Broker***: A cloud broker is an entity that manages the use, performance and delivery of cloud services and negotiates relationships between cloud providers and cloud consumers.
A cloud broker can provide services in three categories:

Service Intermediation: A cloud broker enhances a given service by improving some specific capability and providing value-added services to cloud consumers.

Service Aggregation: A cloud broker combines and integrates multiple services into one or more new services. The broker provides data integration and ensures secure data movement between the cloud consumer and multiple cloud providers.

Service Arbitrage: Service arbitrage is similar to service aggregation except that the services being aggregated are not fixed. Service arbitrage means a broker has the flexibility to choose services from multiple agencies. The cloud broker, for example, can use a credit-scoring service to measure and select an agency with the best score.

***5. Cloud Carrier***: Cloud carriers provide access to consumers through network, telecommunication and other access devices.

**Cloud Computing Reference Architecture**

NIST Definition of Cloud Computing (Special Publication 800-145): Provides broad cloud computing definitions in terms of characteristics and models. The aim is to develop industry standards with minimal restrictions to avoid specifications that inhibit innovation.

NIST Guidelines on Security and Privacy in Public Cloud Computing (Special Publication 800-144): Provides an overview of the security and privacy challenges pertinent to public cloud computing and points out considerations organizations should take when outsourcing data, applications, and infrastructure to a public cloud environment.

NIST Cloud Computing Standards Roadmap (Special Publication 500-291): Surveys the existing standards landscape for security, portability, and interoperability standards, models, and use cases that are relevant to cloud computing, as well as identifying current standards, gaps, and priorities.
NIST Cloud Computing Reference Architecture (Special Publication 500-292): Describes a cloud computing reference architecture, designed as an extension to the NIST Cloud Computing Definition, that depicts a generic high-level conceptual model for discussing the requirements, structures, and operations of cloud computing.

**Cloud Security Threats**

The encryption mechanism is a digital coding system dedicated to preserving the confidentiality and integrity of data. It is used for encoding plaintext data into a protected and unreadable format. Encryption technology commonly relies on a standardized algorithm called a cipher to transform original plaintext data into encrypted data, referred to as ciphertext. When encryption is applied to plaintext data, the data is paired with a string of characters called an encryption key, a secret message that is established by and shared among authorized parties. The encryption key is used to decrypt the ciphertext back into its original plaintext format. The encryption mechanism can help counter the traffic eavesdropping, malicious intermediary, insufficient authorization, and overlapping trust boundaries security threats.

Symmetric Encryption: Also known as secret key cryptography, symmetric encryption uses the same key for both encryption and decryption. Messages that are encrypted with a specific key can be decrypted by only that same key.

Asymmetric Encryption: Asymmetric encryption relies on the use of two different keys, namely a private key and a public key. With asymmetric encryption (which is also referred to as public key cryptography), the private key is known only to its owner while the public key is commonly available. A message that was encrypted with a private key can be decrypted by any party holding the corresponding public key. This method of encryption therefore does not offer any confidentiality protection, even though successful decryption proves that the text was encrypted by the rightful private key owner. Private key encryption therefore offers integrity protection in addition to authenticity and non-repudiation.
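The defining property of symmetric encryption described above, that the same shared key both encrypts and decrypts, can be illustrated with a small sketch. This is a toy construction for illustration only, assuming a keystream derived from SHA-256; the function names and the sample key are hypothetical, and it is not a secure cipher. Real systems would rely on a vetted algorithm and library.

```python
import hashlib


def keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        # Hash the key together with a block counter to get the next 32 bytes.
        out.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypt or decrypt: XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


# Hypothetical key established by and shared among the authorized parties.
shared_key = b"key shared by authorized parties"
plaintext = b"quarterly usage report"

ciphertext = xor_cipher(shared_key, plaintext)   # sender encrypts
recovered = xor_cipher(shared_key, ciphertext)   # recipient decrypts
```

Because encryption and decryption are the same operation under the same key, anyone who obtains the key can read the data, which is why the key has to stay restricted to authorized parties.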
**Hashing**

Hashing technology can be used to derive a hashing code or message digest from a message, which is often of a fixed length and smaller than the original message. The message sender can then utilize the hashing mechanism to attach the message digest to the message. The recipient applies the same hash function to the message to verify that the produced message digest is identical to the one that accompanied the message. Any alteration to the original data results in an entirely different message digest and clearly indicates that tampering has occurred.

![](media/image4.png)

**Digital Signature**

The digital signature mechanism is a means of providing data authenticity and integrity through authentication and non-repudiation. A message is assigned a digital signature prior to transmission, which is then rendered invalid if the message experiences any subsequent, unauthorized modifications. A digital signature provides evidence that the message received is the same as the one created by its rightful sender. The digital signature mechanism helps mitigate the malicious intermediary, insufficient authorization, and overlapping trust boundaries security threats.

**Public Key Infrastructure (PKI)**

A common approach for managing the issuance of asymmetric keys is based on the public key infrastructure (PKI) mechanism, which exists as a system of protocols, data formats, rules, and practices that enable large-scale systems to securely use public key cryptography. This system is used to associate public keys with their corresponding key owners (known as public key identification) while enabling the verification of key validity. PKIs rely on the use of digital certificates, which are digitally signed data structures that bind public keys to certificate owner identities, as well as to related information, such as validity periods.
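The digest check described under Hashing above can be sketched with Python's standard `hashlib` module. This is a minimal illustration, assuming SHA-256 as the hash function (the source does not name one); the `send` and `verify` helpers and the sample message are hypothetical.

```python
import hashlib


def digest(message: bytes) -> str:
    """Return the fixed-length message digest as a hex string."""
    return hashlib.sha256(message).hexdigest()


def send(message: bytes) -> tuple[bytes, str]:
    """Sender side: attach the message digest to the message."""
    return message, digest(message)


def verify(message: bytes, attached_digest: str) -> bool:
    """Recipient side: recompute the digest and compare it to the attached one."""
    return digest(message) == attached_digest


message, attached = send(b"transfer 100 units to account 7")
tampered = b"transfer 900 units to account 7"
```

Here `verify(message, attached)` succeeds while `verify(tampered, attached)` fails, since changing even one character yields a different digest. A bare digest only detects alteration when the digest itself cannot be replaced in transit, which is the gap the digital signature mechanism addresses.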
Digital certificates are usually digitally signed by a third-party certificate authority (CA), as illustrated in Figure 10.7. The PKI is a dependable method for implementing asymmetric encryption, managing cloud consumer and cloud provider identity information, and helping to defend against the malicious intermediary and insufficient authorization threats. The PKI mechanism is primarily used to counter the insufficient authorization threat.

![](media/image6.png)

**Identity and Access Management (IAM)**

The identity and access management (IAM) mechanism encompasses the components and policies necessary to control and track user identities and access privileges for IT resources, environments, and systems. Specifically, IAM mechanisms exist as systems comprised of four main components:

- Authentication -- Username and password combinations remain the most common forms of user authentication credentials managed by the IAM system, which also can support digital signatures, digital certificates, biometric hardware (fingerprint readers), specialized software (such as voice analysis programs), and locking user accounts to registered IP or MAC addresses.
- Authorization -- The authorization component defines the correct granularity for access controls and oversees the relationships between identities, access control rights, and IT resource availability.
- User Management -- Related to the administrative capabilities of the system, the user management program is responsible for creating new user identities and access groups, resetting passwords, defining password policies, and managing privileges.
- Credential Management -- The credential management system establishes identities and access control rules for defined user accounts, which mitigates the threat of insufficient authorization.

The IAM mechanism is primarily used to counter the insufficient authorization, denial of service, overlapping trust boundaries, virtualization attack and containerization attack threats.
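To make the split between the authentication, authorization and user management components more concrete, here is a minimal in-memory sketch. The class name `IamSketch`, its methods, and the sample users and groups are all hypothetical; password hashing uses the standard library's PBKDF2 function, and a real IAM system would add persistence, password policies and auditing.

```python
import hashlib
import hmac
import os


class IamSketch:
    def __init__(self) -> None:
        self._users = {}   # username -> (salt, password hash, set of groups)
        self._grants = {}  # group -> set of (resource, action) pairs

    @staticmethod
    def _hash(password: str, salt: bytes) -> bytes:
        return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

    # User management: create identities and place them in access groups.
    def create_user(self, username: str, password: str, groups) -> None:
        salt = os.urandom(16)
        self._users[username] = (salt, self._hash(password, salt), set(groups))

    def grant(self, group: str, resource: str, action: str) -> None:
        self._grants.setdefault(group, set()).add((resource, action))

    # Authentication: check the username and password combination.
    def authenticate(self, username: str, password: str) -> bool:
        record = self._users.get(username)
        if record is None:
            return False
        salt, stored, _groups = record
        return hmac.compare_digest(stored, self._hash(password, salt))

    # Authorization: check access rights granted through group membership.
    def authorize(self, username: str, resource: str, action: str) -> bool:
        record = self._users.get(username)
        if record is None:
            return False
        return any((resource, action) in self._grants.get(group, set())
                   for group in record[2])


iam = IamSketch()
iam.create_user("alice", "correct horse", groups=["admins"])
iam.grant("admins", "virtual-server-1", "restart")
```

With this setup, `iam.authenticate("alice", "correct horse")` and `iam.authorize("alice", "virtual-server-1", "restart")` both succeed, while a wrong password or an action that was never granted is rejected.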
**Single Sign-On (SSO)**

The single sign-on (SSO) mechanism enables one cloud service consumer to be authenticated by a security broker, which establishes a security context that is persisted while the cloud service consumer accesses other cloud services or cloud-based IT resources. Otherwise, the cloud service consumer would need to re-authenticate itself with every subsequent request. The SSO mechanism essentially enables mutually independent cloud services and IT resources to generate and circulate runtime authentication and authorization credentials. The credentials initially provided by the cloud service consumer remain valid for the duration of a session, while its security context information is shared (Figure 10.9). The SSO mechanism's security broker is especially useful when a cloud service consumer needs to access cloud services residing on different clouds.

**Cloud-Based Security Groups**

Resource segmentation is used to enable virtualization by allocating a variety of physical IT resources to virtual machines. It needs to be optimized for public cloud environments, since organizational trust boundaries from different cloud consumers overlap when sharing the same underlying physical IT resources. The cloud-based resource segmentation process creates cloud-based security group mechanisms that are determined through security policies. Networks are segmented into logical cloud-based security groups that form logical network perimeters. Each cloud-based IT resource is assigned to at least one logical cloud-based security group. Each logical cloud-based security group is assigned specific rules that govern the communication between the security groups. Multiple virtual servers running on the same physical server can become members of different logical cloud-based security groups (Figure 10.11).
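The relationship between group membership and the rules that govern communication between groups can be sketched as a small data model. Everything here is an illustrative assumption: the `SecurityGroupPolicy` class, the rule shape of (source group, destination group, port), the default-deny behavior, and the server and group names.

```python
from dataclasses import dataclass, field


@dataclass
class SecurityGroupPolicy:
    # virtual server name -> names of the security groups it belongs to
    membership: dict = field(default_factory=dict)
    # allowed flows as (source group, destination group, port)
    rules: set = field(default_factory=set)

    def assign(self, server: str, group: str) -> None:
        self.membership.setdefault(server, set()).add(group)

    def allow(self, src_group: str, dst_group: str, port: int) -> None:
        self.rules.add((src_group, dst_group, port))

    def is_allowed(self, src: str, dst: str, port: int) -> bool:
        """Default deny: traffic passes only if some rule matches."""
        return any(
            (s, d, port) in self.rules
            for s in self.membership.get(src, ())
            for d in self.membership.get(dst, ())
        )


# Two virtual servers that could share one physical server, in different groups.
policy = SecurityGroupPolicy()
policy.assign("web-1", "public")
policy.assign("db-1", "private")
policy.allow("public", "private", 5432)
```

In this sketch `web-1` may reach `db-1` on port 5432, but the reverse direction and every other port stay blocked because no rule covers them.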
Virtual servers can further be separated into public-private groups, development-production groups, or any other designation configured by the cloud resource administrator.

![](media/image8.png)

Cloud-based security groups delineate areas where different security measures can be applied. Properly implemented cloud-based security groups help limit unauthorized access to IT resources in the event of a security breach. This mechanism can be used to help counter the denial of service, insufficient authorization, overlapping trust boundaries, virtualization attack and container attack threats, and is closely related to the logical network perimeter mechanism.

**Hardened Virtual Server Images**

As previously discussed, a virtual server is created from a template configuration called a virtual server image (or virtual machine image). Hardening is the process of stripping unnecessary software from a system to limit potential vulnerabilities that can be exploited by attackers. Removing redundant programs, closing unnecessary server ports, and disabling unused services, internal root accounts, and guest access are all examples of hardening. A hardened virtual server image is a template for virtual service instance creation that has been subjected to a hardening process (Figure 10.13). This generally results in a virtual server template that is significantly more secure than the original standard image. Hardened virtual server images help counter the denial of service, insufficient authorization, and overlapping trust boundaries threats.

**Cloud Computing Mechanisms**

Cloud Infrastructure Mechanisms

- Logical Network Perimeter
- Virtual Server
- Cloud Storage Device
- Cloud Usage Monitor
- Resource Replication
- Ready-Made Environment
**Logical Network Perimeter**

Defined as the isolation of a network environment from the rest of a communications network, the logical network perimeter establishes a virtual network boundary that can encompass and isolate a group of related cloud-based IT resources that may be physically distributed. This mechanism can be implemented to:

- isolate IT resources in a cloud from non-authorized users
- isolate IT resources in a cloud from non-users
- isolate IT resources in a cloud from cloud consumers
- control the bandwidth that is available to isolated IT resources

Logical network perimeters are typically established via network devices that supply and control the connectivity of a data center and are commonly deployed as virtualized IT environments that include:

- Virtual Firewall -- An IT resource that actively filters network traffic to and from the isolated network while controlling its interactions with the Internet.
- Virtual Network -- Usually acquired through VLANs, this IT resource isolates the network environment within the data center infrastructure.

Figure 7.2 introduces the notation used to denote these two IT resources. Figure 7.3 depicts a scenario in which one logical network perimeter contains a cloud consumer's on-premise environment, while another contains a cloud provider's cloud-based environment. These perimeters are connected through a VPN that protects communications, since the VPN is typically implemented by point-to-point encryption of the data packets sent between the communicating endpoints.

![](media/image10.png)

**Virtual Server**

A virtual server is a form of virtualization software that emulates a physical server. Virtual servers are used by cloud providers to share the same physical server with multiple cloud consumers by providing cloud consumers with individual virtual server instances. Figure 7.5 shows three virtual servers being hosted by two physical servers.
The number of instances a given physical server can share is limited by its capacity.

![](media/image12.png)

As a commodity mechanism, the virtual server represents the most foundational building block of cloud environments. Each virtual server can host numerous IT resources, cloud-based solutions, and various other cloud computing mechanisms. The instantiation of virtual servers from image files is a resource allocation process that can be completed rapidly and on-demand. Cloud consumers that install or lease virtual servers can customize their environments independently from other cloud consumers that may be using virtual servers hosted by the same underlying physical server. Figure 7.6 depicts a virtual server that hosts a cloud service being accessed by Cloud Service Consumer B, while Cloud Service Consumer A accesses the virtual server directly to perform an administration task.

![](media/image14.png)

**Cloud Storage Device**

The cloud storage device mechanism represents storage devices that are designed specifically for cloud-based provisioning. Instances of these devices can be virtualized, similar to how physical servers can spawn virtual server images. Cloud storage devices are commonly able to provide fixed-increment capacity allocation in support of the pay-per-use mechanism. Cloud storage devices can be exposed for remote access via cloud storage services.

A primary concern related to cloud storage is the security, integrity, and confidentiality of data, which becomes more prone to being compromised when entrusted to external cloud providers and other third parties. There can also be legal and regulatory implications that result from relocating data across geographical or national boundaries. Another issue applies specifically to the performance of large databases.
LANs provide locally stored data with network reliability and latency levels that are superior to those of WANs.

Cloud Storage Levels

Cloud storage device mechanisms provide common logical units of data storage, such as:

- Files -- Collections of data are grouped into files that are located in folders.
- Blocks -- The lowest level of storage and the closest to the hardware, a block is the smallest unit of data that is still individually accessible.
- Datasets -- Sets of data are organized into a table-based, delimited, or record format.
- Objects -- Data and its associated metadata are organized as Web-based resources.

Each of these data storage levels is commonly associated with a certain type of technical interface which corresponds to a particular type of cloud storage device and cloud storage service used to expose its API (Figure 7.9).

**Cloud Usage Monitor**

The cloud usage monitor mechanism is a lightweight and autonomous software program responsible for collecting and processing IT resource usage data. Depending on the type of usage metrics they are designed to collect and the manner in which usage data needs to be collected, cloud usage monitors can exist in different formats. The upcoming sections describe three common agent-based implementation formats. Each can be designed to forward collected usage data to a log database for post-processing and reporting purposes.

Monitoring Agent

A monitoring agent is an intermediary, event-driven program that exists as a service agent and resides along existing communication paths to transparently monitor and analyze dataflows (Figure 7.12). This type of cloud usage monitor is commonly used to measure network traffic and message metrics.

Resource Agent

A resource agent is a processing module that collects usage data by having event-driven interactions with specialized resource software (Figure 7.13).
This module is used to monitor usage metrics based on predefined, observable events at the resource software level, such as initiating, suspending, resuming, and vertical scaling.

![](media/image16.png)

Polling Agent

A polling agent is a processing module that collects cloud service usage data by polling IT resources. This type of cloud service monitor is commonly used to periodically monitor IT resource status, such as uptime and downtime (Figure 7.14). In the figure, the polling agent polls for a number of polling cycles until it receives a usage status of "B" (1), upon which it records the new usage status in the log database (2).

**Resource Replication**

Defined as the creation of multiple instances of the same IT resource, replication is typically performed when an IT resource's availability and performance need to be enhanced. Virtualization technology is used to implement the resource replication mechanism to replicate cloud-based IT resources (Figure 7.16).

![](media/image18.png)

**Ready-Made Environment**

The ready-made environment mechanism (Figure 7.20) is a defining component of the PaaS cloud delivery model that represents a predefined, cloud-based platform comprised of a set of already installed IT resources, ready to be used and customized by a cloud consumer. These environments are utilized by cloud consumers to remotely develop and deploy their own services and applications within a cloud. Typical ready-made environments include pre-installed IT resources, such as databases, middleware, development tools, and governance tools.

A ready-made environment is generally equipped with a complete software development kit (SDK) that provides cloud consumers with programmatic access to the development technologies that comprise their preferred programming stacks. Middleware is available for multitenant platforms to support the development and deployment of Web applications.
Some cloud providers offer runtime execution environments for cloud services that are based on different runtime performance and billing parameters. For example, a front-end instance of a cloud service can be configured to respond to time-sensitive requests more effectively than a back-end instance. The former variation will be billed at a different rate than the latter. As further demonstrated in the upcoming case study example, a solution can be partitioned into groups of logic that can be designated for both front-end and back-end instance invocation so as to optimize runtime execution and billing.

**Specialized Cloud Mechanisms**

The following specialized cloud mechanisms are described in this chapter:

- Automated Scaling Listener
- Load Balancer
- SLA Monitor
- Pay-Per-Use Monitor
- Audit Monitor
- Failover System
- Hypervisor
- Resource Cluster
- Multi-Device Broker
- State Management Database

**Automated Scaling Listener**

The automated scaling listener mechanism is a service agent that monitors and tracks communications between cloud service consumers and cloud services for dynamic scaling purposes. Automated scaling listeners are deployed within the cloud, typically near the firewall, from where they automatically track workload status information. Workloads can be determined by the volume of cloud consumer-generated requests or via back-end processing demands triggered by certain types of requests. For example, a small amount of incoming data can result in a large amount of processing.

Automated scaling listeners can provide different types of responses to workload fluctuation conditions, such as:

- Automatically scaling IT resources out or in based on parameters previously defined by the cloud consumer (commonly referred to as auto-scaling).
- Automatic notification of the cloud consumer when workloads exceed current thresholds or fall below allocated resources (Figure 8.1). This way, the cloud consumer can choose to adjust its current IT resource allocation.
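The two response types listed above, auto-scaling and notification, can be sketched as a single decision function. This is a simplified illustration under stated assumptions: workload is measured as requests per instance, scaling moves one instance at a time, and the names `ScalingPolicy` and `on_workload` and the threshold values are hypothetical rather than taken from any provider's product.

```python
from dataclasses import dataclass


@dataclass
class ScalingPolicy:
    """Parameters defined in advance by the cloud consumer."""
    scale_out_above: int      # requests per instance that trigger scaling out
    scale_in_below: int       # requests per instance that trigger scaling in
    min_instances: int = 1
    max_instances: int = 10


def on_workload(policy: ScalingPolicy, instances: int, requests: int,
                auto_scale: bool) -> tuple[int, str]:
    """Return the new instance count and the action the listener took."""
    per_instance = requests / instances
    if per_instance > policy.scale_out_above:
        if auto_scale and instances < policy.max_instances:
            return instances + 1, "scaled out"
        return instances, "notified consumer: workload exceeds threshold"
    if per_instance < policy.scale_in_below and instances > policy.min_instances:
        if auto_scale:
            return instances - 1, "scaled in"
        return instances, "notified consumer: workload below allocation"
    return instances, "no action"


policy = ScalingPolicy(scale_out_above=100, scale_in_below=20)
```

For example, `on_workload(policy, 2, 500, auto_scale=True)` adds an instance, while the same call with `auto_scale=False` leaves the allocation unchanged and reports a notification so that the cloud consumer can decide.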
![](media/image20.png)

Different cloud provider vendors have different names for service agents that act as automated scaling listeners.

**Load Balancer**

A common approach to horizontal scaling is to balance a workload across two or more IT resources to increase performance and capacity beyond what a single IT resource can provide. The load balancer mechanism is a runtime agent with logic fundamentally based on this premise. Beyond simple division of labor algorithms (Figure 8.5), load balancers can perform a range of specialized runtime workload distribution functions that include:

- Asymmetric Distribution -- larger workloads are issued to IT resources with higher processing capacities
- Workload Prioritization -- workloads are scheduled, queued, discarded, and distributed according to their priority levels
- Content-Aware Distribution -- requests are distributed to different IT resources as dictated by the request content

A load balancer is programmed or configured with a set of performance and QoS rules and parameters with the general objectives of optimizing IT resource usage, avoiding overloads, and maximizing throughput. The load balancer mechanism can exist as a:

- multi-layer network switch
- dedicated hardware appliance
- dedicated software-based system (common in server operating systems)
- service agent (usually controlled by cloud management software)

The load balancer is typically located on the communication path between the IT resources generating the workload and the IT resources performing the workload processing. This mechanism can be designed as a transparent agent that remains hidden from the cloud service consumers, or as a proxy component that abstracts the IT resources performing their workload.

**SLA Monitor**

The SLA monitor mechanism is used to specifically observe the runtime performance of cloud services to ensure that they are fulfilling the contractual QoS requirements that are published in SLAs (Figure 8.7).
The data collected by the SLA monitor is processed by an SLA management system to be aggregated into SLA reporting metrics. The system can proactively repair or failover cloud services when exception conditions occur, such as when the SLA monitor reports a cloud service as "down."

**Pay-Per-Use Monitor**

The pay-per-use monitor mechanism measures cloud-based IT resource usage in accordance with predefined pricing parameters and generates usage logs for fee calculations and billing purposes. Some typical monitoring variables are:

- request/response message quantity
- transmitted data volume
- bandwidth consumption

The data collected by the pay-per-use monitor is processed by a billing management system that calculates the payment fees. The billing management system mechanism is covered in Chapter 9. Figure 8.12 shows a pay-per-use monitor implemented as a resource agent used to determine the usage period of a virtual server.

![](media/image22.png)

**Audit Monitor**

The audit monitor mechanism is used to collect audit tracking data for networks and IT resources in support of (or dictated by) regulatory and contractual obligations. Figure 8.15 depicts an audit monitor implemented as a monitoring agent that intercepts "login" requests and stores the requestor's security credentials, as well as both failed and successful login attempts, in a log database for future audit reporting purposes.

![](media/image24.png)

**Failover System**

The failover system mechanism is used to increase the reliability and availability of IT resources by using established clustering technology to provide redundant implementations. A failover system is configured to automatically switch over to a redundant or standby IT resource instance whenever the currently active IT resource becomes unavailable. Failover systems are commonly used for mission-critical programs and reusable services that can introduce a single point of failure for multiple applications.
A failover system can span more than one geographical region so that each location hosts one or more redundant implementations of the same IT resource. The resource replication mechanism is sometimes utilized by the failover system to provide redundant IT resource instances, which are actively monitored for the detection of errors and unavailability conditions. Failover systems come in two basic configurations:

Active-Active

In an active-active configuration, redundant implementations of the IT resource actively serve the workload synchronously (Figure 8.17). Load balancing among active instances is required. When a failure is detected, the failed instance is removed from the load balancing scheduler (Figure 8.18). Whichever IT resource remains operational when a failure is detected takes over the processing (Figure 8.19).

Active-Passive

In an active-passive configuration, a standby or inactive implementation is activated to take over the processing from the IT resource that becomes unavailable, and the corresponding workload is redirected to the instance taking over the operation (Figures 8.20 to 8.22).

![](media/image26.png)

Some failover systems are designed to redirect workloads to active IT resources that rely on specialized load balancers that detect failure conditions and exclude failed IT resource instances from the workload distribution. This type of failover system is suitable for IT resources that do not require execution state management and provide stateless processing capabilities. In technology architectures that are typically based on clustering and virtualization technologies, the redundant or standby IT resource implementations are also required to share their state and execution context. A complex task that was executed on a failed IT resource can remain operational in one of its redundant implementations.
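The difference between the two configurations can be sketched in a few lines. This is a stateless illustration only: health checking is reduced to a caller-supplied function, the active-active scheduler is assumed to be simple round-robin, and the class names and instance labels are hypothetical. Sharing execution state between instances, as discussed above, is out of scope here.

```python
class ActiveActiveFailover:
    """All instances serve the workload; failed ones leave the scheduler."""

    def __init__(self, instances):
        self.instances = list(instances)
        self._next = 0

    def route(self, is_healthy):
        # Remove failed instances from the load balancing scheduler.
        self.instances = [i for i in self.instances if is_healthy(i)]
        if not self.instances:
            raise RuntimeError("no operational instance left")
        chosen = self.instances[self._next % len(self.instances)]
        self._next += 1
        return chosen


class ActivePassiveFailover:
    """One instance serves the workload; a standby is activated on failure."""

    def __init__(self, active, standby):
        self.active = active
        self.standby = list(standby)

    def route(self, is_healthy):
        # Activate standby instances until an operational one is found.
        while not is_healthy(self.active):
            if not self.standby:
                raise RuntimeError("no standby instance available")
            self.active = self.standby.pop(0)
        return self.active


down = set()  # instances currently marked as unavailable


def is_healthy(instance):
    return instance not in down
```

In the active-active case requests alternate between instances until one fails and is dropped; in the active-passive case every request goes to the same instance until it fails, at which point the workload is redirected to the standby.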
**Hypervisor**

The hypervisor mechanism is a fundamental part of virtualization infrastructure that is primarily used to generate virtual server instances of a physical server. A hypervisor is generally limited to one physical server and can therefore only create virtual images of that server (Figure 8.27). Similarly, a hypervisor can only assign virtual servers it generates to resource pools that reside on the same underlying physical server. A hypervisor has limited virtual server management features, such as increasing the virtual server's capacity or shutting it down. The VIM provides a range of features for administering multiple hypervisors across physical servers.

Hypervisor software can be installed directly in bare-metal servers and provides features for controlling, sharing and scheduling the usage of hardware resources, such as processor power, memory, and I/O. These can appear to each virtual server's operating system as dedicated resources.

**Resource Cluster**

Cloud-based IT resources that are geographically diverse can be logically combined into groups to improve their allocation and use. The resource cluster mechanism (Figure 8.30) is used to group multiple IT resource instances so that they can be operated as a single IT resource. This increases the combined computing capacity, load balancing, and availability of the clustered IT resources.

![](media/image28.png)

Resource cluster architectures rely on high-speed dedicated network connections, or cluster nodes, between IT resource instances to communicate about workload distribution, task scheduling, data sharing, and system synchronization. A cluster management platform that is running as distributed middleware in all of the cluster nodes is usually responsible for these activities. This platform implements a coordination function that allows distributed IT resources to appear as one IT resource, and also executes IT resources inside the cluster.
Common resource cluster types include:

Server Cluster -- Physical or virtual servers are clustered to increase performance and availability. Hypervisors running on different physical servers can be configured to share virtual server execution state (such as memory pages and processor register state) in order to establish clustered virtual servers. In such configurations, which usually require physical servers to have access to shared storage, virtual servers are able to live-migrate from one physical server to another. In this process, the virtualization platform suspends the execution of a given virtual server at one physical server and resumes it on another physical server. The process is transparent to the virtual server operating system and can be used to increase scalability by live-migrating a virtual server that is running at an overloaded physical server to another physical server that has suitable capacity.

Database Cluster -- Designed to improve data availability, this high-availability resource cluster has a synchronization feature that maintains the consistency of data being stored at different storage devices used in the cluster. The redundant capacity is usually based on an active-active or active-passive failover system committed to maintaining the synchronization conditions.

Large Dataset Cluster -- Data partitioning and distribution are implemented so that the target datasets can be efficiently partitioned without compromising data integrity or computing accuracy. Each cluster node processes workloads without communicating with other nodes as much as in other cluster types.

There are two basic types of resource clusters:

Load Balanced Cluster -- This resource cluster specializes in distributing workloads among cluster nodes to increase IT resource capacity while preserving the centralization of IT resource management. It usually implements a load balancer mechanism that is either embedded within the cluster management platform or set up as a separate IT resource.
HA Cluster -- A high-availability cluster maintains system availability in the event of multiple node failures, and has redundant implementations of most or all of the clustered IT resources. It implements a failover system mechanism that monitors failure conditions and automatically redirects the workload away from any failed nodes.

The provisioning of clustered IT resources can be considerably more expensive than the provisioning of individual IT resources that have an equivalent computing capacity.

**Multi-Device Broker**

An individual cloud service may need to be accessed by a range of cloud service consumers differentiated by their hosting hardware devices and/or communication requirements. To overcome incompatibilities between a cloud service and a disparate cloud service consumer, mapping logic needs to be created to transform (or convert) information that is exchanged at runtime. The multi-device broker mechanism is used to facilitate runtime data transformation so as to make a cloud service accessible to a wider range of cloud service consumer programs and devices (Figure 8.35).

Multi-device brokers commonly exist as gateways or incorporate gateway components, such as:

- XML Gateway -- transmits and validates XML data
- Cloud Storage Gateway -- transforms cloud storage protocols and encodes storage devices to facilitate data transfer and storage
- Mobile Device Gateway -- transforms the communication protocols used by mobile devices into protocols that are compatible with a cloud service

The levels at which transformation logic can be created include:

- transport protocols
- messaging protocols
- storage device protocols
- data schemas/data models

For example, a multi-device broker may contain mapping logic that converts both transport and messaging protocols for a cloud service consumer accessing a cloud service with a mobile device.
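As a rough illustration of the mapping logic described above, the sketch below shows a broker that converts between two message formats at runtime. It assumes, purely for the example, that a mobile consumer sends flat JSON objects while the cloud service exchanges flat XML documents; the `MultiDeviceBroker` class and the helper functions are hypothetical names, and transport-level conversion is not shown.

```python
import json
import xml.etree.ElementTree as ET


def json_to_xml(payload: str, root_tag: str = "request") -> str:
    """Map a flat JSON object from a device into XML for the service."""
    data = json.loads(payload)
    root = ET.Element(root_tag)
    for key, value in data.items():
        child = ET.SubElement(root, key)
        child.text = str(value)
    return ET.tostring(root, encoding="unicode")


def xml_to_json(document: str) -> str:
    """Map a flat XML response from the service back into JSON."""
    root = ET.fromstring(document)
    return json.dumps({child.tag: child.text for child in root})


class MultiDeviceBroker:
    """Sits between a JSON-speaking consumer and an XML-speaking service."""

    def __init__(self, service):
        # `service` is any callable that accepts and returns XML text.
        self.service = service

    def handle_mobile_request(self, json_payload: str) -> str:
        xml_request = json_to_xml(json_payload)
        xml_response = self.service(xml_request)
        return xml_to_json(xml_response)
```

The cloud service itself is unchanged; only the broker knows both formats, which is what lets a wider range of consumer devices reach the same service.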
**State Management Database**

A state management database is a storage device that is used to temporarily persist state data for software programs. As an alternative to caching state data in memory, software programs can off-load state data to the database in order to reduce the amount of runtime memory they consume (Figures 8.37 and 8.38). By doing so, the software programs and the surrounding infrastructure are more scalable. State management databases are commonly used by cloud services, especially those involved in long-running runtime activities.

**Cloud Management Mechanisms**

The following management-related mechanisms are described in this chapter:

- Remote Administration System
- Resource Management System
- SLA Management System
- Billing Management System

These systems typically provide integrated APIs and can be offered as individual products, custom applications, or combined into various product suites or multi-function applications.

**Remote Administration System**

The remote administration system mechanism (Figure 9.1) provides tools and user-interfaces for external cloud resource administrators to configure and administer cloud-based IT resources.

![](media/image30.png)

A remote administration system can establish a portal for access to administration and management features of various underlying systems, including the resource management, SLA management, and billing management systems described in this chapter (Figure 9.2). The tools and APIs provided by a remote administration system are generally used by the cloud provider to develop and customize online portals that provide cloud consumers with a variety of administrative controls.

The following are the two primary types of portals that are created with the remote administration system:

Usage and Administration Portal -- A general purpose portal that centralizes management controls to different cloud-based IT resources and can further provide IT resource usage reports.
This portal is part of numerous cloud technology architectures covered in Chapters 11 to 13.

![](media/image32.png)

Self-Service Portal -- This is essentially a shopping portal that allows cloud consumers to search an up-to-date list of cloud services and IT resources that are available from a cloud provider (usually for lease). The cloud consumer submits its chosen items to the cloud provider for provisioning. This portal is primarily associated with the rapid provisioning architecture described in Chapter 12.

Figure 9.3 illustrates a scenario involving a remote administration system and both usage and administration and self-service portals.

![](media/image34.png)

Depending on:

- the type of cloud product or cloud delivery model the cloud consumer is leasing or using from the cloud provider,
- the level of access control granted by the cloud provider to the cloud consumer, and
- which underlying management systems the remote administration system interfaces with,

tasks that can commonly be performed by cloud consumers via a remote administration console include:

- configuring and setting up cloud services
- provisioning and releasing IT resources for on-demand cloud services
- monitoring cloud service status, usage, and performance
- monitoring QoS and SLA fulfillment
- managing leasing costs and usage fees
- managing user accounts, security credentials, authorization, and access control
- tracking internal and external access to leased services
- planning and assessing IT resource provisioning
- capacity planning

While the user-interface provided by the remote administration system will tend to be proprietary to the cloud provider, there is a preference among cloud consumers to work with remote administration systems that offer standardized APIs. This allows a cloud consumer to invest in the creation of its own front-end with the foreknowledge that it can reuse this console if it decides to move to another cloud provider that supports the same standardized API.
Additionally, the cloud consumer would be able to further leverage standardized APIs if it is interested in leasing and centrally administering IT resources from multiple cloud providers and/or IT resources residing in cloud and on-premise environments.

**Resource Management System**

The resource management system mechanism helps coordinate IT resources in response to management actions performed by both cloud consumers and cloud providers (Figure 9.5). Core to this system is the virtual infrastructure manager (VIM) that coordinates the server hardware so that virtual server instances can be created from the most expedient underlying physical server. A VIM is a commercial product that can be used to manage a range of virtual IT resources across multiple physical servers. For example, a VIM can create and manage multiple instances of a hypervisor across different physical servers or allocate a virtual server on one physical server to another (or to a resource pool).

Tasks that are typically automated and implemented through the resource management system include:

- managing virtual IT resource templates that are used to create pre-built instances, such as virtual server images
- allocating and releasing virtual IT resources into the available physical infrastructure in response to the starting, pausing, resuming, and termination of virtual IT resource instances
- coordinating IT resources in relation to the involvement of other mechanisms, such as resource replication, load balancer, and failover system
- enforcing usage and security policies throughout the lifecycle of cloud service instances
- monitoring operational conditions of IT resources

Resource management system functions can be accessed by cloud resource administrators employed by the cloud provider or cloud consumer. Those working on behalf of a cloud provider will often be able to directly access the resource management system's native console.
Resource management systems typically expose APIs that allow cloud providers to build remote administration system portals that can be customized to selectively offer resource management controls to external cloud resource administrators acting on behalf of cloud consumer organizations via usage and administration portals. Both forms of access are depicted in Figure 9.6.

![](media/image36.png)

**SLA Management System**

The SLA management system mechanism represents a range of commercially available cloud management products that provide features pertaining to the administration, collection, storage, reporting, and runtime notification of SLA data (Figure 9.7).

An SLA management system deployment will generally include a repository used to store and retrieve collected SLA data based on pre-defined metrics and reporting parameters. It will further rely on one or more SLA monitor mechanisms to collect the SLA data that can then be made available in near-real time to usage and administration portals to provide on-going feedback regarding active cloud services (Figure 9.8). The metrics monitored for individual cloud services are aligned with the SLA guarantees in corresponding cloud provisioning contracts.

**Billing Management System**

The billing management system mechanism is dedicated to the collection and processing of usage data as it pertains to cloud provider accounting and cloud consumer billing. Specifically, the billing management system relies on pay-per-use monitors to gather runtime usage data that is stored in a repository that the system components then draw from for billing, reporting, and invoicing purposes (Figures 9.9 and 9.10).

![](media/image38.png)

The billing management system allows for the definition of different pricing policies, as well as custom pricing models on a per cloud consumer and/or per IT resource basis. Pricing models can vary from the traditional pay-per-use models, to flat-rate or pay-per-allocation models, or combinations thereof.
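The pricing models mentioned above can be made concrete with a small sketch. Everything here is illustrative: the class names, the shape of the usage records, and the rates are assumptions for the example, not part of any real billing product.

```python
from dataclasses import dataclass


@dataclass
class PayPerUse:
    rate_per_unit: float  # illustrative price per unit actually consumed

    def charge(self, units_used: float, units_allocated: float) -> float:
        return self.rate_per_unit * units_used


@dataclass
class FlatRate:
    fee: float  # fixed fee per usage record, regardless of consumption

    def charge(self, units_used: float, units_allocated: float) -> float:
        return self.fee


@dataclass
class PayPerAllocation:
    rate_per_unit: float  # illustrative price per unit reserved

    def charge(self, units_used: float, units_allocated: float) -> float:
        return self.rate_per_unit * units_allocated


def invoice(usage_records, pricing_by_consumer, default_model):
    """Total the charges per cloud consumer.

    usage_records: iterable of (consumer, units_used, units_allocated),
    standing in for the data gathered by pay-per-use monitors.
    pricing_by_consumer: custom pricing model per consumer; consumers
    without an entry fall back to default_model.
    """
    totals = {}
    for consumer, used, allocated in usage_records:
        model = pricing_by_consumer.get(consumer, default_model)
        totals[consumer] = totals.get(consumer, 0.0) + model.charge(used, allocated)
    return totals
```

Keeping each pricing model behind the same `charge` method is one way to support custom models per cloud consumer without changing the invoicing loop.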
Billing arrangements can be based on pre-usage and post-usage payments. The latter type can include pre-defined limits or it can be set up (with the mutual agreement of the cloud consumer) to allow for unlimited usage (and, consequently, no limit on subsequent billing). When limits are established, they are usually in the form of usage quotas. When quotas are exceeded, the billing management system can block further usage requests by cloud consumers.

Unit 3

**Cloud Computing Architecture**

Fundamental Cloud Architectures

**Workload Distribution Architecture**

IT resources can be horizontally scaled via the addition of one or more identical IT resources, and a load balancer that provides runtime logic capable of evenly distributing the workload among the available IT resources (Figure 11.1). The resulting workload distribution architecture reduces both IT resource over-utilization and under-utilization to an extent dependent upon the sophistication of the load balancing algorithms and runtime logic.

![](media/image40.png)

This fundamental architectural model can be applied to any IT resource, with workload distribution commonly carried out in support of distributed virtual servers, cloud storage devices, and cloud services.
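To show how the sophistication of the load balancing logic affects the outcome, here is a minimal sketch of two common strategies, round-robin and least-loaded. The class names and the notion of a numeric request "cost" are assumptions made for the example.

```python
import itertools


class RoundRobinBalancer:
    """Hands requests to identical IT resources in a fixed rotation."""

    def __init__(self, resources):
        if not resources:
            raise ValueError("at least one IT resource is required")
        self._cycle = itertools.cycle(list(resources))

    def route(self) -> str:
        return next(self._cycle)


class LeastLoadedBalancer:
    """Sends each request to the resource with the least assigned work."""

    def __init__(self, resources):
        if not resources:
            raise ValueError("at least one IT resource is required")
        self.load = {name: 0 for name in resources}

    def route(self, cost: int = 1) -> str:
        # Ties are broken in favor of the resource listed first.
        target = min(self.load, key=self.load.get)
        self.load[target] += cost
        return target
```

Round-robin spreads requests evenly only when they cost about the same; the least-loaded variant tracks assigned work, so one expensive request does not leave a resource over-utilized while another sits idle.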
Load balancing systems applied to specific IT resources usually produce specialized variations of this architecture that incorporate aspects of load balancing, such as:

- the service load balancing architecture explained later in this chapter
- the load balanced virtual server architecture covered in Chapter 12
- the load balanced virtual switches architecture described in Chapter 13

In addition to the base load balancer mechanism, and the virtual server and cloud storage device mechanisms to which load balancing can be applied, the following mechanisms can also be part of this cloud architecture:

Audit Monitor -- When distributing runtime workloads, the type and geographical location of the IT resources that process the data can determine whether monitoring is necessary to fulfill legal and regulatory requirements.

Cloud Usage Monitor -- Various monitors can be involved to carry out runtime workload tracking and data processing.

Hypervisor -- Workloads between hypervisors and the virtual servers that they host may require distribution.

Logical Network Perimeter -- The logical network perimeter isolates cloud consumer network boundaries in relation to how and where workloads are distributed.

Resource Cluster -- Clustered IT resources in active/active mode are commonly used to support workload balancing between different cluster nodes.

Resource Replication -- This mechanism can generate new instances of virtualized IT resources in response to runtime workload distribution demands.

**Resource Pooling Architecture**

A resource pooling architecture is based on the use of one or more resource pools, in which identical IT resources are grouped and maintained by a system that automatically ensures that they remain synchronized.

Provided here are common examples of resource pools:

Physical server pools are composed of networked servers that have been installed with operating systems and other necessary programs and/or applications and are ready for immediate use.
Virtual server pools are usually configured using one of several available templates chosen by the cloud consumer during provisioning. For example, a cloud consumer can set up a pool of mid-tier Windows servers with 4 GB of RAM or a pool of low-tier Ubuntu servers with 2 GB of RAM.

![](media/image42.png) ![](media/image44.png)

Storage pools, or cloud storage device pools, consist of file-based or block-based storage structures that contain empty and/or filled cloud storage devices.

Network pools (or interconnect pools) are composed of different preconfigured network connectivity devices. For example, a pool of virtual firewall devices or physical network switches can be created for redundant connectivity, load balancing, or link aggregation.

![](media/image46.png)

CPU pools are ready to be allocated to virtual servers, and are typically broken down into individual processing cores.

Pools of physical RAM can be used in newly provisioned physical servers or to vertically scale physical servers.

Dedicated pools can be created for each type of IT resource and individual pools can be grouped into a larger pool, in which case each individual pool becomes a sub-pool (Figure 11.2).

Resource pools can become highly complex, with multiple pools created for specific cloud consumers or applications. A hierarchical structure can be established to form parent, sibling, and nested pools in order to facilitate the organization of diverse resource pooling requirements (Figure 11.3).

![](media/image48.png)

Sibling resource pools are usually drawn from physically grouped IT resources, as opposed to IT resources that are spread out over different data centers. Sibling pools are isolated from one another so that each cloud consumer is only provided access to its respective pool. In the nested pool model, larger pools are divided into smaller pools that individually group the same type of IT resources together (Figure 11.4).
Nested pools can be used to assign resource pools to different departments or groups in the same cloud consumer organization. After resource pools have been defined, multiple instances of IT resources from each pool can be created to provide an in-memory pool of "live" IT resources.

In addition to cloud storage devices and virtual servers, which are commonly pooled mechanisms, the following mechanisms can also be part of this cloud architecture:

Audit Monitor -- This mechanism monitors resource pool usage to ensure compliance with privacy and regulation requirements, especially when pools contain cloud storage devices or data loaded into memory.

Cloud Usage Monitor -- Various cloud usage monitors are involved in the runtime tracking and synchronization that are required by the pooled IT resources and any underlying management systems.

Hypervisor -- The hypervisor mechanism is responsible for providing virtual servers with access to resource pools, in addition to hosting the virtual servers and sometimes the resource pools themselves.

Logical Network Perimeter -- The logical network perimeter is used to logically organize and isolate resource pools.

Pay-Per-Use Monitor -- The pay-per-use monitor collects usage and billing information on how individual cloud consumers are allocated and use IT resources from various pools.

Remote Administration System -- This mechanism is commonly used to interface with backend systems and programs in order to provide resource pool administration features via a front-end portal.

Resource Management System -- The resource management system mechanism supplies cloud consumers with the tools and permission management options for administering resource pools.
Resource Replication -- This mechanism is used to generate new instances of IT resources for resource pools.

**Dynamic Scalability Architecture**

The dynamic scalability architecture is an architectural model based on a system of predefined scaling conditions that trigger the dynamic allocation of IT resources from resource pools. Dynamic allocation enables variable utilization as dictated by usage demand fluctuations, since unnecessary IT resources are efficiently reclaimed without requiring manual interaction.

The automated scaling listener is configured with workload thresholds that dictate when new IT resources need to be added to the workload processing. This mechanism can be provided with logic that determines how many additional IT resources can be dynamically provided, based on the terms of a given cloud consumer's provisioning contract.

The following types of dynamic scaling are commonly used:

Dynamic Horizontal Scaling -- IT resource instances are scaled out and in to handle fluctuating workloads. The automated scaling listener monitors requests and signals resource replication to initiate IT resource duplication, as per requirements and permissions.

Dynamic Vertical Scaling -- IT resource instances are scaled up and down when there is a need to adjust the processing capacity of a single IT resource. For example, a virtual server that is being overloaded can have its memory dynamically increased or it may have a processing core added.

Dynamic Relocation -- The IT resource is relocated to a host with more capacity. For example, a database may need to be moved from a tape-based SAN storage device with 4 GB per second I/O capacity to another disk-based SAN storage device with 8 GB per second I/O capacity.

Figures 11.5 to 11.7 illustrate the process of dynamic horizontal scaling.

![](media/image50.png)

The dynamic scalability architecture can be applied to a range of IT resources, including virtual servers and cloud storage devices.
Besides the core automated scaling listener and resource replication mechanisms, the following mechanisms can also be used in this form of cloud architecture:

Cloud Usage Monitor -- Specialized cloud usage monitors can track runtime usage in response to dynamic fluctuations caused by this architecture.

Hypervisor -- The hypervisor is invoked by a dynamic scalability system to create or remove virtual server instances, or to be scaled itself.

Pay-Per-Use Monitor -- The pay-per-use monitor is engaged to collect usage cost information in response to the scaling of IT resources.
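The behavior of an automated scaling listener performing dynamic horizontal scaling can be sketched as follows. This is a simplified model under stated assumptions: the workload threshold is expressed as a hypothetical `capacity_per_instance` (concurrent requests one instance can handle), and `max_instances` stands in for the limit set by the cloud consumer's provisioning contract. A real listener would signal resource replication rather than just return a number.

```python
import math


class AutomatedScalingListener:
    """Decides how many IT resource instances a workload needs."""

    def __init__(self, capacity_per_instance: int,
                 min_instances: int = 1, max_instances: int = 4):
        self.capacity_per_instance = capacity_per_instance
        self.min_instances = min_instances
        # Upper bound assumed to come from the provisioning contract.
        self.max_instances = max_instances
        self.instances = min_instances

    def observe(self, concurrent_requests: int) -> int:
        """Return the change in instance count for the observed workload.

        A positive value means scale out (replicate instances), a negative
        value means scale in (reclaim instances), zero means no change.
        """
        needed = math.ceil(concurrent_requests / self.capacity_per_instance)
        target = max(self.min_instances, min(self.max_instances, needed))
        delta = target - self.instances
        self.instances = target
        return delta
```

For instance, with a capacity of 2 requests per instance and a contract limit of 3 instances, a jump to 5 concurrent requests yields a delta of +2, and any further growth is capped at the contract limit.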