TT284 Web Technologies Tutorial 10 PDF

Document Details

Uploaded by TopsRetinalite8582

2015

Arab Open University

Dr. Monif Jazzar

Tags

web technologies, web application, web development, computer science

Summary

This document is a tutorial for a web technologies course from the Arab Open University, dated 4/11/2015. It covers topics such as managing the development of web applications, maintaining service, managing projects, and more.

Full Transcript


TT284: Web Technologies, Tutorial 10
Arab Open University
By: Dr. Monif Jazzar, AOU-KW
4/11/2015

Block 4: Managing application development

This block introduces some of the important concepts and techniques associated with managing the development of a web application. The key question addressed in the block is how you ensure that a web application is always available to users. In this block:

Block 4, Part 1, 'Maintaining service', introduces some basic definitions, reviews the key factors that influence the availability of a web application, and explores how performance can be maintained through the use of fault tolerance, load sharing, and virtualisation.

Block 4, Part 2, 'Managing projects', explores the management of a web application through the development of a project plan. Building on the software development life cycle, it describes how the plan is used to communicate the project's requirements for quality, resources, and risks.

Optional, self-study:
1. Block 4, Part 3, 'Managing assets'
2. Block 4, Part 4, 'Practical version control'

Block 4, Part 5, 'System testing', explores the part that testing plays in ensuring the quality and reliability of an application. Building on the life cycle, it describes the activities to be undertaken as development proceeds through unit, integration, and system test phases, leading to the development of a test plan.

Block 4, Part 6, 'System security', examines why web applications become vulnerable to attack when poor coding goes undetected or servers are not configured correctly. It also looks at how web servers implement access controls through authentication and authorisation, and use encryption to safeguard data.

Block 4, Part 1: Maintaining service

Contents
1 Introduction
2 Learning outcomes
3 Downtime
4 Metrics
5 Another look at availability
6 Increasing availability
7 Virtualisation and clouds
8 Disaster recovery

1 Introduction

The growth of eBusiness and eCommerce means that for many companies, regardless of their size, web technologies are essential to everyday operations. This part explores the theme of 'maintaining service': the tools and techniques that can be used to ensure that web applications, web services, and host systems run 24 hours a day, 365 days a year.

The key metric used in the industry is 'availability', the probability that an application, service or system is available for use. The basic definition of availability (A) is the ratio of the 'uptime' to the 'total time', which can be expressed as:

A = Uptime / (Uptime + Downtime)

where Uptime is the period of time the system is operational and Downtime is the period the system is non-operational. For a retail web application the typical target is an availability of 0.999, or 99.9% of the time. A service or system handling financial transactions would have a target availability of 0.99999, or 99.999%.

3 Downtime

The term 'downtime' refers to the total time that a system is unavailable for use, and every hour of downtime can cost an organisation a lot of money.
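To make the availability definition and those targets concrete, here is a minimal Python sketch (not part of the module materials; the function names are my own, for illustration) that applies A = Uptime / (Uptime + Downtime) and converts an availability target into the downtime budget it allows per year:

```python
# Minimal sketch (illustrative only): the availability definition and the
# yearly downtime budget implied by an availability target.

HOURS_PER_YEAR = 8760  # 365 days x 24 hours

def availability(uptime_hours: float, downtime_hours: float) -> float:
    """A = Uptime / (Uptime + Downtime)."""
    return uptime_hours / (uptime_hours + downtime_hours)

def allowed_downtime_hours(target: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Downtime budget implied by an availability target over a period."""
    return period_hours * (1.0 - target)

if __name__ == "__main__":
    # Targets quoted in the text: 99.9% for retail, 99.999% for financial systems.
    for target in (0.999, 0.99999):
        print(f"{target:.5f} -> {allowed_downtime_hours(target):.2f} hours of downtime per year")
    # A system that was down for 8.76 hours in a year has A = 0.999.
    print(availability(HOURS_PER_YEAR - 8.76, 8.76))
```

Running this gives roughly 8.8 hours of permitted downtime per year for the 99.9% target and about 0.09 hours (around 5 minutes) for the 99.999% target, which is why the 'five nines' level is so demanding.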
Do Activity 1 (exploration): the causes of downtime.

There are many sources of information about downtime, and they suggest that the majority of downtime is the result of planned maintenance and upgrades:

"More than 80 per cent of downtime results not from unplanned incidents, but rather from planned, unavoidable activities such as data backups, database reorganizations and hardware and software upgrades. The fact that they are planned may reduce the impact these events have on business operations, but it does not eliminate their cost. Whatever the reason, if the system is not available, users cannot access the data and applications they need to do their jobs. Consequently, the business starts losing money with the very first minute of downtime, regardless of whether it was planned or unplanned. In the past, organizations took advantage of 'maintenance and batch windows', which were hours when the business was closed and allowed IT to shut systems down with minimal impact on the business. In today's global e-business environment, critical systems must be available 24 hours a day, seven days a week. Planned or unplanned downtime can hurt a business because information is not available, decisions are not made, orders are not shipped, funds are not transferred and customers cannot interact with the organization — in short, business stops." Source: Arnold (2009).

Arnold (2009) suggests that all downtime can be broadly classified under seven headings, as detailed in Table 1.

Table 1 Causes of downtime (cause: percentage of downtime)
- Backup and restoration: 10%
- Hardware, network, operating system upgrades: 10%
- Batch processing transactions: 10%
- Application and database maintenance: 50%
- Environmental factors: 4%
- Application errors: 8%
- Operator and user errors: 8%

The bottom three rows are classed as unplanned downtime.

Backup and restoration of data is an everyday task and is essential to safeguard data, applications and services. The 'upgrades' category covers all hardware replacements and enhancements, and the regular updates issued to enhance operating system software and fix security 'loop-holes'. The 'batch processing' category covers a special type of transaction processing carried out offline, for example payroll payments and inter-bank payments. The 'application maintenance' category takes account of the maintenance of application software and the distribution of that software to users. The 'database maintenance' category covers tasks such as updating indexes, partitioning tables, and data replication, but excludes backup. The 'environmental factors' category appears to cover failures in cooling plant and power generation as well as bad weather events such as flooding and tornadoes. The 'application errors' category covers problems arising from software design errors, such as overwriting data or failing to save data. The 'operator and user errors' category covers downtime arising from simple 'people' mistakes, such as turning off the wrong switch or following the wrong procedure, as well as malicious actions.
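As a quick arithmetic check on Table 1, the short Python snippet below (my own illustration, not part of the tutorial) groups the three unplanned categories and confirms the roughly 80/20 planned-versus-unplanned split quoted from Arnold (2009):

```python
# Grouping the Table 1 causes into planned and unplanned downtime.
causes = {
    "Backup and restoration": 10,
    "Hardware, network, operating system upgrades": 10,
    "Batch processing transactions": 10,
    "Application and database maintenance": 50,
    "Environmental factors": 4,
    "Application errors": 8,
    "Operator and user errors": 8,
}

unplanned = {"Environmental factors", "Application errors", "Operator and user errors"}

planned_pct = sum(p for cause, p in causes.items() if cause not in unplanned)
unplanned_pct = sum(p for cause, p in causes.items() if cause in unplanned)

print(planned_pct, unplanned_pct)  # 80 20
```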
Downtime in data centres

The majority of small- and medium-sized companies outsource the provision of computers and networks to commercial data centres: large warehouses containing several thousand servers, special cooling equipment, backup generators and multiple suppliers of power and internet access. These data centres can spread the cost of the specialised equipment and staff across all the servers, and incorporate disaster recovery techniques to safeguard customer operations. The primary causes of downtime reported are listed in Table 2, from which you can see that power-related failures account for nearly 40% of reported incidents.

The costs of downtime

Various organisations have been tracking the costs of downtime for more than 30 years, but my research suggests that prior to the year 2000 the only costs tracked were the direct operating costs of the IT equipment. More recent studies reflect the growth of online business and have attempted to track the loss of revenue when systems become unavailable. The tables below show the results of several studies.

Table 4 Revenue loss estimates per hour of downtime

Table 6 Average annual revenue loss per company according to size

Table 6 gives data on average revenue loss according to the size of the company. On average, downtime for all the companies surveyed was 14 hours, during which time employees were reduced to 63% of normal productivity. A further 9 hours of reduced productivity (75%) is incurred while data is recovered.

Reviewing the causes and costs of downtime shows that it is a major factor for any organisation that relies on computers and networks to conduct its business. The causes of downtime are numerous, but perhaps the most significant lesson is that planned activities, such as backup, maintenance, and upgrades, contribute 80% of all downtime.

Read: the Patterson model.

4 Metrics

Reliability: 'the ability of a system or component to perform its required functions under stated conditions for a specified period of time' (IEEE). This can include the performance of normal functions in hostile or unexpected circumstances. Mathematically, reliability is the probability R(t) that at some time in the future an item of equipment will be operational. The formula to calculate reliability is:

R(t) = 1 - (integral from 0 to t of f(x) dx)

where f(x) is the failure probability density function and t is the time at which reliability is calculated (which is assumed to start from time zero).

Do Activity 2.

Mean time to failure (MTTF)

One of the more common metrics associated with reliability is the 'mean time to failure' (MTTF), which is a statistical measure of reliability for items that cannot be repaired. It is based on testing a batch of items over a short period of time, where 'short' is taken to mean short when compared to the expected life of the item, perhaps two months of testing for a product with a 5-year lifespan. A simple definition for calculating the mean time to failure is:

MTTF = (number of items tested x hours of testing) / number of failures

As an example, a disk drive manufacturer tests a sample of 1,000 drives by running them for a period of 1,000 hours (just over 41.5 days). At the end of that period one disk drive was found to have failed. The MTTF is calculated as:

MTTF = (1,000 x 1,000) / 1 = 1,000,000 hours

Unfortunately this does not mean the average disk drive will last for 1,000,000 hours (approximately 114 years). The more appropriate interpretation is that if you ran 114 disk drives for 1 year, then 1 drive would fail.
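The MTTF arithmetic in the disk-drive example can be written out as a short Python sketch (illustrative only; the function name is mine, not from the module):

```python
# Minimal sketch of the MTTF arithmetic for the disk-drive example:
# 1,000 drives run for 1,000 hours with one failure.

HOURS_PER_YEAR = 8760

def mttf_hours(units_tested: int, test_hours: float, failures: int) -> float:
    """MTTF = total unit-hours of testing / number of failures."""
    return (units_tested * test_hours) / failures

mttf = mttf_hours(units_tested=1000, test_hours=1000, failures=1)
print(mttf)                   # 1,000,000 hours
print(mttf / HOURS_PER_YEAR)  # about 114 years -- but, as the text warns, the
                              # better reading is one failure per year across
                              # roughly 114 drives, not a 114-year lifespan.
```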
Annualised failure rate (AFR)

Some manufacturers have stopped publishing MTTF data and instead quote the annualised failure rate (AFR), expressed as a percentage:

AFR = (8760 / MTTF) x 100%

The factor 8760 is the number of hours in one year. Using the same example of 1 failure per year in a batch of 114 disk drives (an MTTF of 1,000,000 hours), the AFR = 0.876%.

We can use the AFR to estimate the performance of the disk drives in a server farm as follows: number of drive failures per year = number of drives x AFR. If there are 2,000 disk drives we can expect 2,000 x 0.00876 (that is, 0.876% of 2,000), or approximately 18 drive failures per year.

Note: there is a quicker way to calculate the AFR. It is 100/MTTF, where the MTTF is expressed in years.

Do Activity 3 (another example).

One of the problems with MTTF and AFR is that they represent statistical values obtained from special test set-ups, so they share little in common with the average computer. Furthermore, attempting to calculate the MTTF value for a computer from the MTTF data of the individual components is not a simple task.

5 Another look at availability

Availability is the probability that a service or system is available to be used: it is fully functional, performing within specified limits, and delivering the appropriate level of quality from a user's perspective. There are a couple of points to appreciate from the availability expression. First, if the uptime is very large compared to the downtime, then the availability will be high because problems are infrequent. Second, if the downtime is very small compared to the uptime, indicating that problems are quick to fix, the availability is also high.

Table 7 The 'nines' measure of resilience

Inherent availability

If a system has not yet been built, the normal calculation for availability cannot be applied, so we have to turn to a slightly different notion of availability based on the mean time to failure (MTTF) and the 'mean time to repair' (MTTR), also known as 'mean time to replace'. Using these values the inherent availability (Ai) is given by the expression:

Ai = MTTF / (MTTF + MTTR)

The MTTF values are typically provided by equipment manufacturers based on data from accelerated-use testing. MTTF specifically excludes failures in what is known as the 'infant mortality' period, the early failure of products because of weak parts or bad production. Values for MTTR can be estimated based on how quickly replacement parts can be obtained and how quickly staff can replace those parts.

The expression tells us that as reliability decreases, that is as MTTF gets shorter, better maintainability is needed (i.e. a shorter MTTR) to achieve the same level of availability. Thus trade-offs can be made between reliability and maintainability to achieve a target level of availability.

The values provided in Table 8 give some idea of the range of MTTR values for computing equipment, from when spare parts and staff are available on site to when they have to be transported to site. The estimated values take into account weekends, annual leave, and public holidays. Given the MTTR values from Table 8 it should be clear that, even with parts and staff available on site 24 x 7, it would be difficult to achieve a 'five nines' level of availability (99.999%) without taking other measures to reduce the impact of downtime.
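The reliability/maintainability trade-off can be seen by plugging a few numbers into the inherent availability expression. The sketch below is my own illustration (the MTTF and MTTR figures are hypothetical, chosen only to show the shape of the trade-off, not taken from Table 8):

```python
# Inherent availability, Ai = MTTF / (MTTF + MTTR), for a few hypothetical
# combinations of reliability (MTTF) and maintainability (MTTR).

def inherent_availability(mttf_hours: float, mttr_hours: float) -> float:
    return mttf_hours / (mttf_hours + mttr_hours)

for mttf, mttr in [(10_000, 24), (10_000, 4), (2_000, 4), (2_000, 0.5)]:
    ai = inherent_availability(mttf, mttr)
    print(f"MTTF={mttf:>6} h, MTTR={mttr:>4} h -> Ai = {ai:.5f}")

# A shorter MTTF (less reliable hardware) needs a much shorter MTTR
# (better maintainability) to reach the same number of 'nines'.
```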
Operational availability

Another version of availability you might encounter is operational availability (Ao), which is a measure of the availability actually observed for a real system. It is defined as:

Ao = MTBM / (MTBM + MDT)

The mean time between maintenance (MTBM) takes account of any corrective and preventive maintenance that might be carried out, whereas MTTF only accounts for failures. The mean downtime (MDT) accounts for all the time the system is unavailable, no matter the cause. It will include downtime for corrective maintenance (fixing a problem), preventive maintenance (changing a noisy fan, a software update) and human errors. An estimate based on inherent availability, Ai, will generally give a higher value than operational availability, Ao, because the former does not account for any downtime for preventive actions.

As an example, do Activity 4 (important).

6 Increasing availability

Given the importance of availability to successful business operations, what measures can we adopt to increase the availability of the system that will host our application? This section looks at three approaches: load sharing, clustering and fault tolerance, and the more recent development of virtualisation.

Load sharing

Load sharing is intended to improve availability by sharing the total workload across a number of computers, hence the name. A typical load sharing configuration comprises a number of independent 'nodes' and some form of 'load sharing' monitor, as illustrated in Figure 2. In the case of a web application the 'nodes' would be web servers, each with its own local storage for the operating system, web server software, content, and application data, and the combination of nodes and monitor is referred to as a 'web server farm'. Other applications might comprise multiple file servers, print servers, or DNS servers, hence the use of the generic term 'node'. The 'load sharing' monitor provides a single global IP address for the web server farm and communicates with the individual web servers over a private IP network.

The basic load sharing configuration is cheap to set up because it can utilise commodity computers and disks, and it is quickly expanded to cope with growth by the addition of more nodes, but there are several disadvantages.

1. There is no means for the monitor to track that each request receives an appropriate response; its task is to distribute incoming requests and monitor availability. So if a node does fail, the user could be left wondering what has gone wrong.
2. The monitor represents a single point of failure, in that if the monitor fails the entire farm ceases to function.
3. Content and software upgrades must be replicated to each node independently, increasing the workload of the system administrator.
4. There is no guarantee that multiple requests from a single client are directed to the same node, which has implications for web applications that rely on accumulated data (also known as 'session data') such as a 'shopping cart'. For example, if Node 1 has been processing the user's requests to add items to the cart, the data relating to those items will be stored on its local disk. If Node 1 fails there is no means for nodes 2 or 3 to access the stored data.
One solution to the user data problem is to incorporate cookies into the application, so that the data is held by the client's browser and returned to the web server with every request. Another solution is to add some form of shared storage to the architecture, as illustrated in Figure 3. In this configuration the local disk stores the operating system and web server software, and all application data (data arising from client requests) is saved to the shared data store. If a node should fail, then another node can take over by retrieving the user's data from the shared store.

Both solutions can handle the situation of a node that fails after completing a single user request, because any new request will be directed to an operational node. That still leaves the problem of a node failing while processing a user request, as there is no way to monitor completion. Overcoming the limitations of the basic configuration requires that the nodes are more tightly integrated and somehow share their current state across the entire farm. One solution, illustrated in Figure 4, is to link the nodes together by means of a dedicated 'heartbeat' network and a special heartbeat monitor. Software in each node communicates with the monitor to report the node's current state; this is the heartbeat signal. The heartbeat has little impact on overall performance as the messages are short and sent over the dedicated network. Should a node fail it can report the problem to the monitor, or, if no heartbeat is received, the monitor will assume failure.

Figure 4 Load sharing with heartbeat messaging

The higher availability of the heartbeat configuration comes with additional costs:
- dual network interfaces in each node, one for application data and one for the heartbeat
- an additional computer to function as the heartbeat monitor
- software to provide the heartbeat function in each node
- extra system administration activities to install, configure, and monitor the farm.

Availability enhancement

It is important to appreciate that simple load sharing offers limited availability gains because of the independence of each node. While it is true that the application will continue to function with multiple node failures, a user connected to a failed node will perceive zero availability. Additional measures are required to benefit from multiple nodes, such as application data sharing and integration by means of heartbeat monitoring. It is not until nodes become tightly integrated, and additional hardware and software are added, that there are significant improvements in availability. The real benefit of load sharing is what is termed 'scaling out': the ability to process more requests by the simple addition of an extra node. Load sharing can also reduce planned downtime, because a single node can be updated, or backed up, in isolation while the application continues to run on the other nodes.
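As a rough illustration of the heartbeat idea described above, the sketch below (my own, not from the module; the node names and timeout value are invented) shows a monitor that records each node's reported state and assumes a node has failed when its heartbeat goes quiet:

```python
# Illustrative heartbeat monitor: each node periodically reports its state;
# the monitor assumes failure if a node's heartbeat stops arriving.
import time

HEARTBEAT_TIMEOUT = 5.0  # seconds of silence before a node is assumed failed

class HeartbeatMonitor:
    def __init__(self, nodes):
        now = time.monotonic()
        self.last_seen = {node: now for node in nodes}

    def receive_heartbeat(self, node, status="OK"):
        # Called when a node reports its current state over the heartbeat network.
        self.last_seen[node] = time.monotonic()
        if status != "OK":
            print(f"{node} reported a problem: {status}")

    def failed_nodes(self):
        # Nodes not heard from within the timeout are assumed to have failed.
        now = time.monotonic()
        return [n for n, t in self.last_seen.items() if now - t > HEARTBEAT_TIMEOUT]

monitor = HeartbeatMonitor(["node1", "node2", "node3"])
monitor.receive_heartbeat("node1")
monitor.receive_heartbeat("node2")
# node3 stays silent; after the timeout it would appear in monitor.failed_nodes().
```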
Clustering

A cluster is a collection of independent computer nodes that function as a single 'logical' server to the user. Tight integration between the members of the cluster allows one node to take over a running application without the user being aware that such a takeover has occurred. Although not generally used for web servers, clustering does feature in the design of high-availability (HA) back-end data storage and file servers.

The primary goal of a cluster is to increase availability by means of redundant nodes. Clusters typically operate in two forms, active-passive and active-active. To explain the difference, let's assume a cluster comprises two nodes, each with its own storage devices, hosting a single application. A simplified representation of this configuration is shown in Figure 5.

In the active-passive configuration one node is actively dealing with application requests; this is the 'active' node on the left, labelled System A. The other, 'passive', node, System B, is in a standby state ready to take over in case of failure. Each node is aware of its current role in the cluster, ensuring that only System A responds to client requests. In the active-active configuration (as shown in Figure 6) both nodes are fully operational, each hosting a single application. If the heartbeat messages indicate a failure of Server A, then Server B will start its own copy of the application, in addition to the application it is already hosting. The data for the new application will already have been replicated to Disk B.

The gains of clustering come with additional costs:
- dual network interfaces in each node, one for application data and one for the heartbeat
- special software to support the clustering function and failover
- extra system administration work to install, configure, and monitor the cluster
- the high cost of removing single points of hardware failure (power supplies, disks, networks, etc.).

Do Activity 5.
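The failover logic for the active-passive case can be sketched as follows (again my own illustration; the class, method and node names are invented, and a real cluster manager handles far more detail than this):

```python
# Illustrative active-passive failover: the passive node promotes itself
# when heartbeats from the active node stop arriving.

class ClusterNode:
    def __init__(self, name, role):
        self.name = name
        self.role = role        # "active" or "passive"
        self.peer_alive = True  # updated from heartbeat messages

    def on_heartbeat_lost(self):
        # Called when no heartbeat has been received from the peer node.
        self.peer_alive = False
        if self.role == "passive":
            self.promote()

    def promote(self):
        # Take over the application; its data has already been replicated to
        # this node's storage, so clients still see a single logical server.
        self.role = "active"
        print(f"{self.name} is now active and serving client requests")

system_a = ClusterNode("System A", role="active")
system_b = ClusterNode("System B", role="passive")

# Simulate System B detecting the failure of System A.
system_b.on_heartbeat_lost()
```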
Fault tolerant systems

Although hardware failures account for less than 20% of downtime, there is no way to predict when such a failure will occur: during a quiet period in the middle of the night, or at the busiest period of the day. Of course, a global organisation may have no quiet period, and any reduction in availability is unacceptable. Consider also the consequences if the IT systems of air-traffic control, the emergency services, or an off-shore oil platform were to fail.

One of the strategies adopted by organisations that require continuous availability is fault tolerance, whereby a system is designed so that it will continue to operate in the event of a single hardware failure. The general technique is to design the system with redundant elements and extra hardware such as dual power supplies, dual network interfaces, and RAID disks. What is special about fault tolerant systems such as the HP 'NonStop' (http://h20223.www2.hp.com/nonstopcomputing/cache/76385-0-0-0-121.html?404m=cache-ccto0) and the Stratus 'ftserver' (http://www.stratus.com/Products/ftServerSystems.aspx) is the design of the processing and input-output (I/O) units to add redundancy. In essence, the processor, memory, and I/O modules from two computers are linked tightly together to form the core of a single computer, as illustrated in Figure 7. You can see that, in addition to the basic modules, there are special fault-detecting components to identify any failure of the computer.

Figure 7 Block diagram of a fault tolerant core

The two CPUs are configured in such a way that both execute the individual instructions of the program at exactly the same time, a technique referred to as 'lock-step'. At the end of each instruction the detection circuits compare the results from the two CPUs and memory. If the results are the same, the next instruction is executed. Should the results differ, each CPU will run diagnostic software to verify its operation. The diagnostics may identify a faulty CPU, in which case it is isolated from service. If no fault is found then a single CPU is declared the winner and the other CPU is reset.

A similar process takes place to test the operation of the I/O of the computer, such as reading and writing data to disk or transferring data over a network. If any I/O chips are diagnosed as faulty they are isolated from use. Fault tolerant computer systems, such as the 'NonStop' or the 'ftserver', are significantly more expensive than commodity computers, but for some business sectors (e.g. ATM networks, telecommunications, stock exchanges) the extra cost is justified by the potential losses from downtime. The Stratus website indicates a typical availability in the range of four to six 'nines'.

7 Virtualisation and clouds

Virtualisation

The techniques that I have covered for increasing availability are based on the traditional deployment model of one application per server farm or server cluster. Combining these techniques with the typical 3-tier architecture of a business application requires a lot of equipment, as illustrated in Figure 8. This is a very expensive deployment model in terms of computing equipment, floor space (or rack space), cooling equipment, power consumption and personnel.

Figure 8 Equipment requirements for a 3-tier architecture

The recent growth of raw processing power, as provided by faster processing speeds and multi-processor and multi-core computers, means that the typical single-application server is under-utilised, running at less than 10-15% of its potential. The solution touted to solve both rising costs and under-utilisation is 'virtualisation': making one computer host multiple virtual computers. They are virtual because each computer appears to have its own processor, memory, and peripherals, but in reality they are sharing the resources of a single host computer.

When you start your computer, the operating system gets loaded into memory along with special software 'device drivers' that handle the exchange of data with the input and output devices (e.g. disk drives, network cards, keyboard and mouse, etc.). Next you start an application, perhaps a word-processor, which also gets loaded into memory. A simple representation of the computer's memory and input-output devices might look something like Figure 9.

Figure 9 Device and memory model for a single server application

As you can see, there is no direct link between the application and the I/O devices. Instead the application will make a request to the operating system to complete any I/O tasks. In its turn the operating system will make its own requests to the low-level drivers to perform the actual data transfer.

Now let's look at a model corresponding to a computer supporting a pair of virtualised servers, as illustrated in Figure 10. The left-hand boxes represent the same input-output devices, which will be shared between the virtual servers.
The right-hand blocks can be grouped into three parts: the virtualisation software and two virtualised servers (also known as virtual machines or VMs). Each virtual server represents a complete computer in terms of its hardware and software.

Figure 10 Device and memory model for virtualised applications

The virtualisation software, or hypervisor, provides the code to manage and protect the virtual servers together with the code for device drivers. It is responsible for creating each virtual server, protecting a virtual server's memory space from other virtual servers, scheduling usage of the processors, and cleaning up when a virtual server is removed. Within each virtual server is contained the code for a 'guest' operating system and an application. The guest operating system, or guest OS, is responsible for managing its own application. A single guest OS can't be allowed direct control because the I/O devices are shared by all the virtual servers. When a guest OS wants to use any I/O it makes a request to the corresponding virtual driver, which in turn makes a request through the hypervisor to a real driver. Another complication is how memory gets allocated to a virtual server, because each virtual machine must be protected from the other virtual machines.

Given the complexity of virtualisation, why would anyone bother? What are the potential benefits? Here are a few of the commonly cited ones, but there are plenty more.

Infrastructure consolidation: the term used when multiple server applications are virtualised to run on a single hardware platform. So instead of two physical computers, one to host a file server and another to host a print server, two virtual servers are created to operate as guests on a single computer. Infrastructure consolidation is the main contributor to reducing the total cost of ownership for business applications.

Sandboxing: the term used to isolate applications for testing or to enforce a high level of security. An untrusted application can be combined with a guest OS and executed within a virtual machine, with the benefit of the additional protection afforded by virtualisation.

Legacy systems: many virtualisation solutions support operating systems, and hence applications, that cannot be executed on newer hardware platforms. It may also be possible to emulate older peripherals that are no longer manufactured.

Quick installation: a software vendor can provide an image of a virtual server to ship an entire application. For example, you could download a virtualised and pre-configured Apache web server and copy the image to your own hardware. Installation is quick and could be completed by someone who knows little about Apache.

Recovery: an image of a virtual machine can be restarted, or migrated from one virtual/physical computer to another, very quickly. Most of the commercial virtualisation products support automatic restart or migration in the event of virtual server failure.

Testing and debugging: by combining the features of sandboxing and recovery it is possible to create test configurations, take images of the configuration at different stages of testing, and revert to a saved configuration in the event of a fault.

Clouds

There is considerable talk about 'cloud computing' as the solution for providing a rapid or agile response to the changing business environment.
In a period of rapid growth, additional computing resources (computers, storage or network) can be brought on stream with minimal delay. If business requirements change, then a new application can be brought into play. VMware (2011), one of the largest players in the market, describes cloud computing as follows:

"Cloud computing promises a more agile and efficient IT environment. It replaces traditional, costly and inefficient computing silos with elastic, self-managed, dynamic IT infrastructure. It enables IT to intelligently anticipate and respond to business needs. As organizations transform their IT environments, they want to achieve the benefits of cloud computing with a scalable, secure and manageable solution that addresses their unique business challenges."

From a technology perspective, cloud computing is little more than a combination of the high-availability hardware and virtualised servers I've already described. If the equipment is owned and managed by the business for its own use then the infrastructure is referred to as a private cloud. Alternatively, a business can buy the same facilities from a third party offering a public cloud. The key point about a cloud is that everything is fluid, both the physical infrastructure and the virtualised services. What makes things interesting is the variety of business models to choose from.

Infrastructure as a Service (IaaS) provides a business with a complete set of computers (servers, firewalls, load monitors), network links and storage devices on which to host its own software. It is up to the business to install, update, and generally manage everything. The costs are typically based on the physical equipment utilised.

Platform as a Service (PaaS) offers a business a computing platform, typically a virtual server and guest OS, on which it can run its applications. The costs are typically based on the resources used, such as the proportion of the processor utilisation, the amount of data stored on disk, or the number of bytes transmitted across the network.

Software as a Service (SaaS) provides a business with an entire application, such as an email service for all employees. In this model the business relinquishes control of where the computing equipment is located, what operating system is used, or where data is stored, but pays a fixed monthly charge for each user.

A number of concerns have been raised about the level of security and privacy offered by public clouds. Many commentators have questioned whether data stored within a cloud (given that the physical location may not be static) is as safe as data stored in a fixed location under the direct management of the business. A related issue concerns privacy legislation, which governs how personal data is handled. If a UK company, for example, deploys an IaaS solution and that solution relies on cloud storage in another country, which country's laws protect the data? Although such issues are important, they extend beyond the scope of our discussion, which is limited to technical solutions that enhance the availability of web applications.

Availability

Virtualisation in itself does not increase availability; it simply makes better utilisation of the performance of modern processors.
A single physical server has the same probability of failure whether it is running a single application or multiple virtual servers. What does help is that, should a virtual server fail, it can be closed down and restarted by the hypervisor, which reduces the MTTR; alternatively, the entire system image can be migrated automatically to a standby server with little downtime.

8 Disaster recovery

Disaster recovery (DR) is closely linked to business contingency planning. In simple terms it means putting in place a plan that will enable a company to recover its IT systems following a disaster, or enable the systems to continue functioning during a disaster. The cause of the disaster may be environmental (flood, earthquake, or hurricane), equipment failure, loss of power or communications, or a security incident that prevents normal operation.

The focus for disaster recovery here is the web applications used by organisations, but any organisation should consider the broader implications of business continuity management (BCM). The British Standards Institution has published standards that provide an overview of the principles, processes and terminology of BCM and a framework of best practice for management. They are BSI (2006) BS 25999-1, Business Continuity Management: Code of Practice and BSI (2007) BS 25999-2, Business Continuity Management: Specification.

All disaster recovery is based on an assessment of the potential risks to a business should a disaster occur, followed by the creation of a strategy for minimising the impact of that disaster. This strategy is generally expressed in a formal plan that describes:
- what activities should be undertaken in the event of a disaster
- who is responsible for undertaking each of the tasks
- when, and in what sequence, the tasks should be completed.

The primary goal of the plan is to minimise the impact of a disaster and to restore all systems and applications to a fully functional state as quickly as possible. Given the available technology it is possible to design an IT solution that would exhibit no loss of function and no loss of data, but the costs would be very high. That means someone has to decide between what a business can afford to pay and how much loss of functionality can be accepted. DR solutions build on the technologies described in this part, such as clustering, fault-tolerant design, and active-active systems, with the added requirement of geographic separation. To get some flavour of how these technologies come together you should complete the activities that follow.

Do Activity 6.
Do Activity 7.

End of Tutorial 10. Have a nice day.

By: Dr. Monif Jazzar, AOU-KW
