Cloud Computing Slides - Service Level Agreements (SLA)

Document Details


Manipal University Jaipur

Sukhwinder Sharma


Summary

These slides cover Cloud Computing and Service Level Agreements (SLAs). Topics include SLA management, cloud security, AWS services, virtualization concepts, and real-world cloud computing case studies. Presented by Dr. Sukhwinder Sharma from the Department of Data Science and Engineering at Manipal University Jaipur.

Full Transcript

Cloud Computing
Presented By: Dr. Sukhwinder Sharma, Associate Professor, Department of Data Science and Engineering, Manipal University Jaipur, Jaipur

Syllabus

Introduction to Cloud Computing: Definition, Characteristics, History, Deployment Models (Public, Private, Hybrid, Community), Service Models (IaaS, PaaS, SaaS), Cloud Architecture, Cloud Providers and Services, Cost-Benefit Analysis of Cloud Adoption.

Virtual Machines Provisioning and Migration Services: Virtualization Concepts, Types of Virtualization, Hypervisors, Creating and Managing Virtual Machines, Containerization (Docker, Kubernetes), High Availability and Disaster Recovery, Cloud Migration Concepts, Cloud Migration Techniques.

SLA Management, Cloud Security and AWS Services: Service Level Agreement (SLA), SLA Management in Cloud, Automated Policy-based Management, Cloud Security Fundamentals, Security Challenges in the Cloud, Vulnerability Assessment, Security and Privacy, Cloud Computing Security Architecture, Amazon Web Services (AWS), AWS Services - Identity and Access Management (IAM), and Virtual Private Cloud (VPC).

Advanced Topics: Serverless Computing, Edge Computing, Managed Databases (RDS, NoSQL), Data Warehousing Solutions (Redshift, BigQuery), AI/ML Services in the Cloud (AWS SageMaker, Google AI Platform), Real-world Cloud Computing Case Studies, Discussion on Cloud Adoption in Various Industries.

Services and Service-Oriented Architectures

In the early days of web-application deployment, the performance of an application at peak load was the single most important criterion for provisioning server resources. Provisioning then involved deciding the hardware configuration, determining the number of physical machines, and acquiring them upfront so that the overall business objectives could be achieved. The web applications were hosted on these dedicated individual servers within the enterprises' own server rooms.
These web applications provided different kinds of e-services to various clients. Typically, the service-level objectives (SLOs) for these applications were the response time and throughput of end-user requests. Capacity was built up to cater to the estimated peak load experienced by the application. The activity of determining the number of servers, and their capacity, that can satisfactorily serve the application's end-user requests at peak load is called capacity planning.

Example Scenario

Two web applications, application A and application B, are hosted on separate sets of dedicated servers within the enterprise-owned server rooms. The planned capacity for each application to run successfully is three servers. As the number of web applications grew, the server rooms in the organization became large; such server rooms came to be known as data centers. These data centers were owned and managed by the enterprises themselves.

Service Level Agreement (SLA)

Enterprises then began developing web applications and deploying them on the infrastructure of third-party service providers, who acquire the required hardware and make it available for application hosting. This necessitated that the enterprises enter into a legal agreement with the infrastructure service providers to guarantee a minimum quality of service (QoS). Typically, the QoS parameters relate to the availability of the system's CPU, data storage, and network for efficient execution of the application at peak loads. This legal agreement is known as the service-level agreement (SLA).

SLA Examples

One SLA may state that the application's server machine will be available for 99.9% of the key business hours of the application's end users (also called core time) and 85% of the non-core time. Another SLA may state that the service provider will respond to a reported issue in less than 10 minutes during core time, but within one hour during non-core time.
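As a rough illustration of the capacity-planning activity described above, the sketch below estimates the server count needed for an assumed peak load. The request rates, per-server throughput, and headroom figure are hypothetical assumptions, not values from the slides.

```python
# Capacity planning: estimate the number of servers needed to satisfy
# the SLOs (response time, throughput) at the estimated peak load.
# All figures below are illustrative assumptions.

import math

def servers_for_peak(peak_rps: float, per_server_rps: float,
                     headroom: float = 0.2) -> int:
    """Servers needed so that the peak load fits with some spare headroom."""
    usable = per_server_rps * (1.0 - headroom)  # keep 20% spare per server
    return math.ceil(peak_rps / usable)

# Application A: assumed peak of 900 requests/s, 400 requests/s per server.
print(servers_for_peak(900, 400))  # 3 servers, as in the example scenario
```

The headroom term reflects why capacity was provisioned for peak load rather than average load: without spare capacity, any burst above the estimate breaks the SLO.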
These SLAs are known as infrastructure SLAs, and these infrastructure service providers are known as Application Service Providers (ASPs).

Infrastructure SLA Scenario

The enterprise applications are hosted on dedicated servers belonging to an ASP. Consequently, a set of tools for monitoring and measuring the availability of the infrastructure had to be developed; these tools helped track SLA adherence. However, availability of the infrastructure does not automatically guarantee the availability of the application to its end users. The responsibility for making the application available to its end users remains with the enterprise. Therefore, the enterprise's IT team performs capacity planning, and the infrastructure provider procures the corresponding hardware.

The dedicated hosting practice resulted in massive redundancies within the ASPs' data centers because many of their servers were underutilized: the applications did not fully utilize their servers' capacity at non-peak loads. To reduce these redundancies and increase server utilization, ASPs started co-hosting applications with complementary workload patterns. Co-hosting means deploying more than one application on a single server. This led to further cost advantages for both the ASPs and the enterprises. Figure 16.3 shows the enterprise and third-party perspectives before and after the applications are co-located. Figures 16.3a and 16.3c show the underutilized capacity of a server during dedicated hosting. Figure 16.3b shows the scenario in which the same system is multiplexed between two applications, application A and application B; the capacity of the server visible to the enterprise owning application A is only the amount consumed by it.
Figure 16.3d depicts the ASP's perspective of server capacity utilization when two applications with complementary workload patterns, application A and application B, are co-located.

However, newer challenges, such as application performance isolation and security guarantees, emerged and needed to be addressed. Performance isolation implies that one application should not steal the resources being utilized by co-located applications. For example, assume that application A needs to use a greater quantity of a resource than originally allocated to it for a duration of time t. For that duration, the amount of the same resource available to application B is decreased, which could adversely affect the performance of application B. Similarly, one application should not be able to access or destroy the data and other information of co-located applications. Hence, appropriate measures are needed to guarantee security and performance isolation. These challenges prevented ASPs from fully realizing the benefits of co-hosting.

Virtualization technologies were proposed to overcome these challenges. ASPs could exploit the isolation features of virtualization technologies to provide performance isolation and guarantee data security to the different co-hosted applications [2, 3]. The applications, instead of being hosted directly on physical machines, are encapsulated in virtual machines, which are then mapped to the physical machines. System resources can be allocated to these virtual machines in two modes: (1) conserving and (2) nonconserving. In the conserving mode, a virtual machine demanding more system resources (CPU and memory) than its specified quota cannot be allocated the spare resources that remain unutilized by the other co-hosted virtual machines.
In the nonconserving mode, the spare resources not utilized by the co-hosted virtual machines can be used by a virtual machine needing an extra amount of a resource. If the resource requirements of a virtual machine cannot be fulfilled by its current physical host, the virtual machine can be migrated to another physical machine capable of fulfilling the additional resource requirements.

This development enabled ASPs to allocate system resources to different competing applications on demand. Because system resources are allocated to applications based on their needs at different times, the notion of capacity planning becomes redundant: neither the enterprises nor the ASPs need to provision resources for the peak load. Adoption of virtualization technologies required ASPs to gain a more detailed insight into application runtime characteristics, with high accuracy. Based on these characteristics, ASPs can allocate system resources to applications more efficiently and on demand, so that application-level metrics can be monitored and met effectively. These metrics are request rates and response times. Therefore, SLAs different from the infrastructure SLAs are required; these are called application SLAs. Such service providers are known as Managed Service Providers (MSPs), because they are responsible for managing application availability too. This scenario is shown in Figure 16.4, where both application A and application B share the same set of virtualized servers.

To fulfill the SLOs mentioned in the application SLA and also make their IT infrastructure elastic, MSPs require an in-depth understanding of the application's behavior. Elasticity implies progressively scaling up the IT infrastructure to take the increasing load of an application. The customer is billed based only on their application's usage of infrastructure resources for a given period.
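The two allocation modes can be sketched as follows. The quotas, demands, and capacity figures are illustrative assumptions; real hypervisors implement these modes at a much lower level (e.g., capped versus uncapped CPU scheduling).

```python
# Sketch of the two resource-allocation modes for co-hosted VMs.
# Units are abstract shares of one resource (e.g., CPU) on one host.

def allocate(demands, quotas, capacity, conserving=True):
    """Grant each VM up to its quota; in nonconserving mode, spare
    capacity left by under-demanding VMs may be given to VMs that
    demand more than their quota."""
    grants = {vm: min(demands[vm], quotas[vm]) for vm in demands}
    if not conserving:
        spare = capacity - sum(grants.values())
        for vm in demands:
            extra = demands[vm] - grants[vm]   # unmet demand beyond quota
            give = min(extra, spare)
            grants[vm] += give
            spare -= give
    return grants

demands = {"A": 70, "B": 20}      # A wants more than its quota of 50
quotas  = {"A": 50, "B": 50}
print(allocate(demands, quotas, 100, conserving=True))   # {'A': 50, 'B': 20}
print(allocate(demands, quotas, 100, conserving=False))  # {'A': 70, 'B': 20}
```

In the conserving run, A is capped at its quota even though B leaves 30 shares idle; in the nonconserving run, A absorbs the spare capacity, which is exactly the behavior that can later trigger a migration when no spare capacity exists on the host.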
The infrastructure can be augmented by procuring resources dynamically from multiple sources, including other MSPs, if resources are scarce in a provider's own data centers. This kind of new hosting infrastructure is called a cloud platform. Cloud platforms introduce another set of challenges in fulfilling the SLOs agreed between the cloud owners and the application owners. Due to the non-availability of high-level design documents, the cloud owners have to treat the customer application, which might include many third-party components and packaged applications, as a black box. To address these challenges in meeting SLAs, service providers are required to follow a meticulous process for understanding and characterizing the application's runtime behavior.

Types of SLA

A service-level agreement provides a framework within which both the seller and the buyer of a service can pursue a profitable service business relationship. It outlines the broad understanding between the service provider and the service consumer for conducting business, and it forms the basis for maintaining a mutually beneficial relationship. From a legal perspective, the necessary terms and conditions that bind the service provider to provide services continually to the service consumer are formally defined in the SLA. An SLA can be modeled using the Web Service Level Agreement (WSLA) language specification. Although WSLA is intended for web-service-based applications, it is equally applicable to the hosting of applications.

Key Components of WSLA

There are two types of SLAs from the perspective of application hosting:
1. Infrastructure SLA
2. Application SLA

Infrastructure SLA

The infrastructure provider manages, and offers guarantees on, the availability of the infrastructure, namely the server machines, power, network connectivity, and so on. The enterprises themselves manage the applications that are deployed on these server machines.
The machines are leased to the customers and are isolated from the machines of other customers. A practical example of the service-level guarantees offered by infrastructure providers in such dedicated hosting environments is shown in Table 16.2.

Application SLA

In the application co-location hosting model, server capacity is made available to the applications based solely on their resource demands. Hence, the service providers are flexible in allocating and de-allocating computing resources among the co-located applications, and they are therefore also responsible for ensuring that their customers' application SLOs are met. For example, an enterprise can have the following application SLA with a service provider for one of its applications, as shown in Table 16.3.

Challenges

It is also possible for a customer and the service provider to mutually agree upon a set of SLAs with different performance and cost structures, rather than a single SLA. The customer then has the flexibility to choose any of the agreed SLAs from the available offerings and, at runtime, can switch between them. However, from the SLA perspective, there are multiple challenges in provisioning the infrastructure on demand:

a. The application is a black box to the MSP; the MSP has virtually no knowledge of the application's runtime characteristics. Therefore, the MSP needs to determine the right amount of computing resources required for the different components of an application at various workloads.
b. The MSP needs to understand the performance bottlenecks and the scalability of the application.
c. The MSP analyzes the application before it goes live. However, subsequent operations or enhancements by the customers to their applications, automatic updates, and the like can impact application performance, thereby putting the application SLA at risk.
d. The risk of capacity planning lies with the service provider instead of the customer.
If every customer decided to select the highest grade of SLA simultaneously, there might not be a sufficient number of servers for provisioning and meeting the SLA obligations of all the customers.

Life Cycle of SLA

Each SLA goes through a sequence of steps, starting from the identification of terms and conditions, through activation and monitoring of the stated terms and conditions, to the eventual termination of the contract once the hosting relationship ceases to exist. Such a sequence of steps is called the SLA life cycle and consists of the following five phases:

1. Contract definition
2. Publishing and discovery
3. Negotiation
4. Operationalization
5. De-commissioning

Contract Definition

Generally, service providers define a set of service offerings and corresponding SLAs using standard templates. These service offerings form a catalog. Individual SLAs for enterprises can be derived by customizing these base SLA templates.

Publishing and Discovery

The service provider advertises these base service offerings through standard publication media, and the customers should be able to locate the service provider by searching the catalog. The customers can search different competitive offerings and shortlist a few that fulfill their requirements for further negotiation.

Negotiation

Once the customer has discovered a service provider who can meet their application-hosting needs, the SLA terms and conditions need to be mutually agreed upon before signing the agreement for hosting the application. For a standard packaged application offered as a service, this phase can be automated; for customized applications hosted on cloud platforms, this phase is manual. The service provider needs to analyze the application's behavior with respect to scalability and performance before agreeing on the SLA specification. At the end of this phase, the SLA is mutually agreed upon by both customer and provider and is eventually signed off. SLA negotiation can utilize the WS-Negotiation specification.
Operationalization

SLA operation consists of SLA monitoring, SLA accounting, and SLA enforcement. SLA monitoring involves measuring parameter values, calculating the metrics defined as part of the SLA, and determining deviations; on identifying a deviation, the concerned parties are notified. SLA accounting involves capturing and archiving SLA adherence for compliance. As part of accounting, the application's actual performance is reported against the performance guaranteed in the SLA. Apart from the frequency and duration of SLA breaches, the report should also cover the penalties paid for each SLA violation. SLA enforcement involves taking appropriate action when runtime monitoring detects an SLA violation. Such actions could include notifying the concerned parties and charging the penalties, among other things. The different policies can be expressed using a subset of the Common Information Model (CIM). The CIM model is an open standard that allows the managed elements of a data center to be expressed via relationships and common objects.

De-commissioning

SLA de-commissioning involves the termination of all activities performed under a particular SLA when the hosting relationship between the service provider and the service consumer has ended. The SLA specifies the terms and conditions of contract termination and the situations under which the relationship between a service provider and a service consumer can be considered legally ended.

SLA Management in Cloud

SLA management of applications hosted on cloud platforms involves five phases:

1. Feasibility
2. On-boarding
3. Pre-production
4. Production
5. Termination

Feasibility Analysis

The MSP conducts a feasibility study of hosting an application on its cloud platform. This study covers three kinds of feasibility: (1) technical feasibility, (2) infrastructure feasibility, and (3) financial feasibility. Assessing the technical feasibility of an application involves determining the following:

1. The ability of the application to scale out.
2. The compatibility of the application with the cloud platform being used within the MSP's data center.
3. The need for, and availability of, any specific hardware and software required for hosting and running the application.
4. Preliminary information about the application's performance requirements and whether they can be met by the MSP.

Assessing infrastructure feasibility involves determining whether infrastructural resources are available in sufficient quantity to meet the projected demands of the application. The financial feasibility study involves determining the approximate cost to be incurred by the MSP and the price the MSP charges the customer, so that the hosting activity is profitable to both of them. A feasibility report consists of the results of these three feasibility studies and forms the basis for further communication with the customer. Once the provider and customer agree upon the findings of the report, the outsourcing of the application-hosting activity proceeds to the next phase, called "on-boarding" of the application. Only the basic feasibility of hosting an application is carried out in this phase; the detailed runtime characteristics of the application are studied as part of the on-boarding activity.

On-Boarding of Application

Once the customer and the MSP agree in principle to host the application based on the findings of the feasibility study, the application is moved from the customer's servers to the hosting platform. Moving an application to the MSP's hosting platform is called on-boarding. As part of the on-boarding activity, the MSP studies the application's runtime characteristics using runtime profilers. This helps the MSP identify the possible SLAs that can be offered to the customer for that application, and it also helps in the creation of the necessary policies (also called rule sets) required to guarantee the SLOs mentioned in the application SLA.
The application is accessible to its end users only after the on-boarding activity is completed. The on-boarding activity consists of the following steps:

a. Packaging the application for deployment in a physical or virtual environment. Application packaging is the process of creating components deployable on the hosting platform (physical or virtual). The Open Virtualization Format (OVF) standard is used for packaging the application for a cloud platform.
b. The packaged application is executed directly on the physical servers to capture and analyze its performance characteristics. This allows the functional validation of the customer's application and provides a baseline performance value for the application in a non-virtual environment. This baseline can be used as one of the data points for the customer's performance expectations and for the application SLA. Additionally, it helps identify the nature of the application, that is, whether it is CPU-intensive, I/O-intensive, or network-intensive, and the potential performance bottlenecks.
c. The application is executed on a virtualized platform, and its performance characteristics are noted again. Important characteristics such as the application's ability to scale (out and up) and its performance bounds (minimum and maximum performance) are noted.
d. Based on the measured performance characteristics, the different possible SLAs are identified. The resources required and the costs involved for each SLA are also computed.
e. Once the customer agrees to the set of SLOs and the cost, the MSP starts creating the different policies required by the data center for automated management of the application. This implies that the management system should automatically infer the amount of system resources that should be allocated to, or de-allocated from, the appropriate components of the application when the load on the system increases or decreases.
These policies are of three types: (1) business, (2) operational, and (3) provisioning.

Business policies help prioritize access to resources in case of contention. Business policies take the form of weights for different customers or groups of customers.

Operational policies specify the actions to be taken when different thresholds or conditions are reached, as well as the actions to be taken when thresholds, conditions, or triggers on service-level parameters are breached or are about to be breached. The corrective action could be a provisioning operation such as scale-up, scale-down, scale-out, or scale-in of a particular tier of an application. Additionally, notification and logging actions (notify the enterprise application's administrator, etc.) are also defined. An operational policy (OP) is represented as a <condition, action> pair, where the action could be a workflow defining the sequence of actions to be undertaken. For example, one OP is <average latency of web server > 0.8 sec, scale out web-server tier>: if the average latency of the web server exceeds 0.8 seconds, the web-server tier is automatically scaled out, i.e., the MSP increases the number of web-server instances.

Provisioning policies help define a sequence of actions corresponding to external inputs or user requests. Scale-out, scale-in, start, stop, suspend, and resume are examples of provisioning actions. A provisioning policy (PP) maps such a request to its sequence of actions. For example, a provisioning policy to start a web site consists of the following sequence: start the database server, start web-server instance 1, then start web-server instance 2, and so on.

After defining these policies, the packaged application is deployed on the cloud platform and tested to validate whether the policies are able to meet the SLA requirements. This step is iterative and is repeated until all the infrastructure conditions necessary to satisfy the application SLA are identified.
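A minimal sketch of how operational policies (condition/action pairs) and provisioning policies (request-to-workflow mappings) might be represented and evaluated is shown below. The metric name, the 0.2-second scale-in threshold, and the engine itself are illustrative assumptions; only the 0.8-second latency example and the start-website sequence follow the text.

```python
# Operational policies as (condition, action) pairs, and provisioning
# policies as request -> ordered workflow of provisioning actions.

operational_policies = [
    (lambda m: m["web_avg_latency_sec"] > 0.8, "scale-out web-server tier"),
    (lambda m: m["web_avg_latency_sec"] < 0.2, "scale-in web-server tier"),
]

provisioning_policies = {
    "start website": ["start database server",
                      "start web-server instance 1",
                      "start web-server instance 2"],
}

def evaluate(metrics):
    """Return the corrective actions triggered by current metric values."""
    return [action for cond, action in operational_policies if cond(metrics)]

print(evaluate({"web_avg_latency_sec": 0.9}))     # ['scale-out web-server tier']
print(provisioning_policies["start website"][0])  # start database server
```

In a real rules engine the conditions would be declarative rules (e.g., CIM-based) rather than Python lambdas, but the evaluation loop is the same idea.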
Once the different infrastructure policies needed to guarantee the SLOs mentioned in the SLA are completely captured, the on-boarding activity is complete.

Pre-production

Once the determination of policies is completed, as discussed in the previous phase, the application is hosted in a simulated production environment. This allows the customer to verify and validate the MSP's findings on the application's runtime characteristics and to agree on the defined SLA. Once both parties agree on the cost and the terms and conditions of the SLA, the customer's sign-off is obtained. On successful completion of this phase, the MSP allows the application to go live.

Production

In this phase, the application is made accessible to its end users under the agreed SLA. However, there could be situations in which the managed application behaves differently in the production environment than in the pre-production environment. This may in turn cause sustained breaches of the terms and conditions mentioned in the SLA. Additionally, the customer may request the MSP to include new terms and conditions in the SLA. If the application SLA is breached frequently, or if the customer requests a new, not-yet-agreed SLA, the on-boarding process is performed again. In the former case, the on-boarding activity is repeated to analyze the application and its policies with respect to SLA fulfillment; in the latter case, a new set of policies is formulated to meet the fresh terms and conditions of the SLA.

Termination

When the customer wishes to withdraw the hosted application and no longer avail itself of the MSP's services for managing its hosting, the termination activity is initiated. On initiation of termination, all data related to the application are transferred to the customer, and only the essential information is retained for legal compliance.
This ends the hosting relationship between the two parties for that application, and the customer's sign-off is obtained.

Automated Policy-based Management

This section explains in detail the operationalization of the "operational" and "provisioning" policies defined as part of the on-boarding activity. The policies specify the sequence of actions to be performed under different circumstances.

Operational policies specify the functional relationship between the system-level infrastructural attributes and the business-level SLA goals. Knowledge of this relationship helps in identifying the quantum of system resources to be allocated to the various components of the application, for different system attributes, at various workloads, workload compositions, and operating conditions, so that the SLA goals are met. Figure 16.8 illustrates the importance of such a relationship. For example, consider a three-tier web application consisting of a web server, an application server, and a database server. Each of the servers is encapsulated in a virtual machine and hosted on virtualized servers. Furthermore, assume that the web tier and the database tier of the application have been provisioned with sufficient resources at a particular workload. The effect of varying the system resources (such as CPU) on the SLO, in this case the average response time for customer requests, is shown in Figure 16.8.

Understanding the system resource requirements of each tier of an application at different workloads necessitates deploying the application on a test system. The test system is used to collect low-level system metrics, such as memory and CPU usage, at different workloads, and to observe the corresponding high-level service-level objectives, such as average response time. The metrics thus collected are used to derive the functional relationship between the SLOs and the low-level system attributes. These functional relations are called policies.
For example, a classification technique can be used to derive policies [12, 13]. The triggering of operational and provisioning policies results in a set of actions to be executed by the service provider's platform. It is possible that some of these actions contend for the same resources. In such a case, the execution of certain actions needs to be prioritized over the execution of others. The rules that govern this prioritization of request execution in case of resource contention are specified as part of the business policy. Some of the parameters often used to prioritize actions and resolve resource contention are:

- The SLA class (Platinum, Gold, Silver, etc.) to which the application belongs.
- The amount of penalty associated with an SLA breach.
- Whether the application is at the threshold of breaching the SLA.
- Whether the application has already breached the SLA.
- The number of applications belonging to the same customer that have breached the SLA.
- The number of applications belonging to the same customer that are about to breach the SLA.
- The type of action to be performed to rectify the situation.

Priority ranking algorithms use these parameters to derive scores, which are used to rank each of the actions contending for the same resources. Actions with higher scores get higher priority and hence receive access to the contended resources.

Figure 16.9. Component diagram of a policy-based automated management system.

The basic functionality of these components is described below:

1. Prioritization Engine. Requests from different customers' web applications contending for the same resource are identified, and their execution is prioritized accordingly. The business policies defined by the MSP help identify the requests whose execution should be prioritized in case of resource contention, so that the MSP can realize higher benefits.
2. Provisioning Engine. Every user request of an application is enacted by the system. The set of steps necessary to enact a user request is defined in the provisioning policy; these steps are used to fulfill application requests such as starting or stopping an application. This set of steps can be visualized as a workflow; hence, the execution of a provisioning policy requires a workflow engine.
3. Rules Engine. The operational policy defines a sequence of actions to be enacted under different conditions or trigger points. The rules engine evaluates the data captured by the monitoring system against the predefined operational rules and triggers the associated action if required. The rules engine and the operational policy are the key to guaranteeing the SLA in a self-healing system.
4. Monitoring System. The monitoring system collects the metrics defined in the SLA. These metrics are used for monitoring resource failures, evaluating operational policies, and for auditing and billing purposes.
5. Auditing. Adherence to the predefined SLA needs to be monitored and recorded. It is essential to monitor SLA compliance because any non-compliance leads to strict penalties. The audit report forms the basis for strategizing and long-term planning by the MSP.
6. Accounting/Billing System. Based on the payment model, chargebacks can be made according to the resources utilized by the process during operation. The fixed costs and recurring costs are computed and billed accordingly.

The policies and the packaged application are deployed on the platform after the on-boarding activity is completed. The customer is provided with options to start the application under any of the agreed SLAs. The application request is sent via the access layer to the system. Using the provisioning policy, the provisioning engine determines how, and in what sequence, the different components or tiers of an application should be started and configured.
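The business-policy prioritization used by the prioritization engine can be sketched as a scoring function over the parameters listed earlier. The weights and the linear scoring rule are illustrative assumptions; only the input parameters (SLA class, penalty, breach status) come from the text.

```python
# Sketch of a priority-ranking score for actions contending for the
# same resource. Weights are arbitrary illustrative choices.

SLA_CLASS_WEIGHT = {"Platinum": 3, "Gold": 2, "Silver": 1}

def score(action):
    s = SLA_CLASS_WEIGHT[action["sla_class"]] * 10
    s += action["penalty"]                       # penalty for an SLA breach
    s += 20 if action["already_breached"] else 0
    s += 10 if action["near_breach"] else 0
    return s

contenders = [
    {"app": "A", "sla_class": "Gold", "penalty": 5,
     "already_breached": False, "near_breach": True},
    {"app": "B", "sla_class": "Platinum", "penalty": 8,
     "already_breached": True, "near_breach": False},
]

# The highest-scoring action wins access to the contended resource.
winner = max(contenders, key=score)
print(winner["app"])  # B
```

Here B wins because its Platinum class and existing breach outweigh A's near-breach status, mirroring the idea that the MSP prioritizes the action that avoids the largest penalty.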
If the start operation requires a resource that is also contended for by a different application request, the provisioning engine interacts with the prioritization engine to determine which request should have access to the contended resource in case of conflict. This conflict resolution is guided by the business policy defined in the prioritization engine. Once an application begins execution, it is continuously monitored by the monitoring system. Monitoring involves collecting statistics on the key metrics and evaluating them against the rules defined in the operational policy to validate SLA adherence. An SLA violation triggers rules that initiate the appropriate corrective action automatically. For example, whenever the performance of the application degrades and the chances of violating the agreed SLO limits are high, the rules that scale out the bottleneck tier of the application are triggered. This ensures that performance does not degenerate to the point of violating the SLA. Periodically, the amount of resources utilized by the application is calculated, the corresponding cost is computed, and the bill is generated. The bill, along with a report on the performance of the application, is sent to the customer.

Alternatively, the monitoring system can interact with the rules engine through an optimization engine, as shown in Figure 16.10. The role of the optimization engine is to decide the migration strategy that optimizes certain objective functions for virtual machine migration. The objective could be to minimize the number of virtual machines migrated, or to minimize the number of physical machines affected by the migration process. The following example highlights the importance of the optimization engine within a policy-based management system. Assume an initial assignment of seven virtual machines (VMs) to three physical machines (PMs) at time t1, as shown in Figure 16.11.
Also, each of the three PMs has memory and CPU capacity of 100. At time t1, the CPU usage of VM1, VM2, and VM3 on PMA is 40, 40, and 20, respectively, and their memory consumption is 20, 10, and 40, respectively. Similarly, at time t1 the CPU and memory requirements of VM4, VM5, and VM6 on PMB are 20, 10, 40 and 20, 40, 20, respectively. VM7 consumes only 20% of the CPU and 20% of the memory on PMC. Thus, PMB and PMC are underloaded but PMA is overloaded. Assume VM1 is the cause of the overload situation on PMA.

Figure 16.11: (a) Initial configuration of the VMs and the PMs at time t1. (b) Configuration resulting from event-based migration of VM1 at time t1. (c) Resource requirement situation at time t2. (d) Configuration resulting from event-based migration of VM4 at time t2. (e) Alternate configuration resulting from optimization-based migration at time t2.

In the above scenario, event-based migration will result in the migration of VM1 out of PMA to PMC. Furthermore, consider that at time t2 (t2 > t1), PMB becomes overloaded as the memory requirement of VM4 increases to 40. Consequently, an event-based scheme results in the migration of VM4 to PMC. If at time t3 (t3 > t2) a new VM, VM8, with CPU and memory requirements of 70 each, needs to be allocated to one of the PMs, then a new PM, PMD, needs to be switched on to host it, because VM8 cannot be hosted on any of the three existing PMs: PMA, PMB, and PMC.

However, assume that the duration of the time window t2 - t1 is such that the QoS and SLA violations due to the continued hosting of VM1 on PMA are well within the permissible limits. In that case, migrating both VMs (VM1 to PMB and VM4 to PMA) at time t2 means fewer PMs need to be switched on. This results in a global resource assignment that may be better than local resource management. In such an environment, consider a case wherein a virtual machine is overloaded.
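The capacity arithmetic in this example can be checked with a short sketch. The data structures below simply encode the numbers stated in the example (capacity 100 per PM, the per-VM CPU/memory demands) and are illustrative, not part of any cited system.

```python
# Reproduce the migration example: three PMs, each with CPU and memory
# capacity 100, and the VM demands stated at time t1.
CAPACITY = 100
vms_t1 = {  # vm: (cpu, memory)
    "VM1": (40, 20), "VM2": (40, 10), "VM3": (20, 40),  # on PMA
    "VM4": (20, 20), "VM5": (10, 40), "VM6": (40, 20),  # on PMB
    "VM7": (20, 20),                                    # on PMC
}
placement_t1 = {"PMA": ["VM1", "VM2", "VM3"],
                "PMB": ["VM4", "VM5", "VM6"],
                "PMC": ["VM7"]}

def load(pm, placement, vms):
    """Total (cpu, memory) demand placed on a PM."""
    cpu = sum(vms[v][0] for v in placement[pm])
    mem = sum(vms[v][1] for v in placement[pm])
    return cpu, mem

def fits(vm, pm, placement, vms):
    """Would adding this VM keep the PM within capacity?"""
    cpu, mem = load(pm, placement, vms)
    return cpu + vms[vm][0] <= CAPACITY and mem + vms[vm][1] <= CAPACITY

print(load("PMA", placement_t1, vms_t1))  # (100, 70): PMA is at its CPU limit

# At t2 the memory demand of VM4 rises to 40. The optimization-based plan
# swaps VM1 to PMB and VM4 to PMA; PMC then has room for VM8 (70, 70),
# so no fourth PM has to be switched on.
vms_t2 = dict(vms_t1, VM4=(20, 40), VM8=(70, 70))
optimized = {"PMA": ["VM2", "VM3", "VM4"],
             "PMB": ["VM1", "VM5", "VM6"],
             "PMC": ["VM7"]}
print(fits("VM8", "PMC", optimized, vms_t2))  # True
```

The same check confirms that under the optimized assignment VM8 does not fit on PMA or PMB, so PMC is the only viable host, exactly as the example argues.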
The optimization module needs to determine not only the virtual machine that should be migrated out of its current physical machine but also the new physical machine on which the migrating virtual machine should be hosted. The Sandpiper technique has been proposed for monitoring and detecting hotspots, determining new assignments of virtual resources to physical resources, and initiating the necessary migrations.

Conclusion

The chapter presented a detailed overview of SLAs and their importance from the service provider's perspective. It briefly described how the SLA has evolved from a state in which infrastructure availability was the prime consideration to today, where complex application SLOs can be included as part of it. The chapter provided the necessary mechanisms that make it possible for a service provider to evaluate the infrastructure needed to meet the provisions mentioned in the SLA. A complete view of the process involved, as well as an overview of the architectural stack for achieving it, was presented.

References

1. D. Menasce and V. Almeida, Capacity Planning for Web Performance: Metrics, Models and Methods, Prentice Hall, Englewood Cliffs, NJ, 1998.
2. P. Barham, B. Dragovic, K. Fraser, S. Hand, T. Harris, A. Ho, R. Neugebauer, I. Pratt, and A. Warfield, Xen and the art of virtualization, in Proceedings of the 19th ACM Symposium on Operating Systems Principles, New York, October 19-22, 2003, pp. 164-177.
3. G. Popek and R. Goldberg, Formal requirements for virtualizable third generation architectures, Communications of the ACM, 17(7):412-421, 1974.
4. E. de Souza e Silva and M. Gerla, Load balancing in distributed systems with multiple classes and site constraints, in Proceedings of the 10th International Symposium on Computer Performance Modeling, Measurement and Evaluation, Paris, France, December 19-21, 1984, pp. 17-33.
5. J. Carlstrom and R. Rom, Application-aware admission control and scheduling in web servers, in Proceedings of the 21st IEEE Infocom, New York, June 23-27, 2002, pp. 824-831.
6. X. Chen, P. Mohapatra, and H. Chen, An admission control scheme for predictable server response time for web accesses, in Proceedings of the 10th International Conference on World Wide Web, Hong Kong, China, May 1-5, 2001, pp. 545-554.
7. Web Service Level Agreement (WSLA) Language Specification Version 1.0, www.research.ibm.com/wsla/WSLASpecV1-20030128.pdf, accessed on April 16, 2010.
8. Web Services Agreement Specification (WS-Agreement), http://ogsa.gridforum.org/Public_Comment_Docs/Documents/Oct-2006/WS-AgreementSpecificationDraftFinal_sp_tn_jpver_v2.pdf, accessed on April 16, 2010.
9. Common Information Model (CIM) Standards, DMTF standard version 2.25.0, March 2010, http://www.dmtf.org/standards/cim/cim_schema_v2250/, accessed on April 16, 2010.
10. S. Bose, N. Tiwari, A. Pasala, and S. Padmanabhuni, SLA-aware "on-boarding" of applications on the cloud, SETLabs Briefings, 7(7):27-32, 2009.
11. Open Virtualization Format Specification, DMTF standard version 1.0.0, Doc. no. DSP0243, February 2009, http://www.dmtf.org/standards/published_documents/DSP0243_1.0.0.pdf, accessed on April 16, 2010.
12. Y. Udupi, A. Sahai, and S. Singhal, A classification-based approach to policy refinement, in Proceedings of the 10th IFIP/IEEE International Symposium on Integrated Network Management, Munich, Germany, May 21-25, 2007, pp. 785-788.
13. Y. Chen, S. Iyer, X. Liu, D. Milojicic, and A. Sahai, SLA decomposition: Translating service level objectives to system level thresholds, in Proceedings of the 4th International Conference on Autonomic Computing (ICAC), Florida, June 11-15, 2007, pp. 3-3.
14. K. Gor, D. Ra, S. Ali, L. Alves, N. Arurkar, I. Gupta, A. Chakrabarti, A. Sharma, and S.
Sengupta, Scalable enterprise level workflow and infrastructure management in a grid computing environment, in Proceedings of the 5th IEEE International