Performance Evaluation PDF

Performance Evaluation PA RT I A L LY B A S E D O N R. J A I N , “A RT OF CO M P UT E R S Y S TE M S P E RF O R M A NC E A NA LY S I S ,” W I L E Y, 1 9 9 1 , I S B N: 0 4 7 1 5 0 3 3 6 3 OT H E R S U P P L E M E N TA RY M AT ER I A L D O UG L A S C. M ON TG O M ER Y, « D E S I G N A ND A N A LY S I S O F E X P E R I M E N TS » , W I L E Y, 2 0 1 3 I S B N 9 7 8 - 1 1 1 8 - 1 4 6 9 2 - 7 LINFO2241 - TOM BARBETTE 1 Why do we want to evaluate the performance of a computer system? HIGHER PERFORMANCE AT LOWEST COST Faster Cheaper Better Stronger LINFO2241 - TOM BARBETTE 2 Why do we need to characterize the performance of a system? Satisfy a service level objective Find the ressource needed to match a set of constraints Capacity Planning How many resource do we need for 500 clients? For the superbowl? LINFO2241 - TOM BARBETTE 3 When? ▪When do we need to characterize the performance of a computer system? IMPROVE AN EXISTING COMPARE ALTERNATIVE SYSTEM DESIGNS LINFO2241 - TOM BARBETTE 5 More examples ▪How much users can a single server handle? ▪ Scalability → handle more users by planning how much servers we need ▪How much memory will a certain program need? ▪ Reliability → prevent failures by not going over a certain threshold ▪What algorithm will perform better for a specific task? ▪ Optimization → compare multiple algorithms and take the fastest one ▪… LINFO2241 - TOM BARBETTE 6 What is a system? ▪Process, transformation, machine which transforms an input to an output.... y1 y2 yr Observations (Metrics) ▪Examples ▪ a program that takes a CSV in input and gives a statistics in output ▪ a program that sorts an array ▪ A web server that serves web pages to customers ▪ Pétanque game ▪ A cooking process LINFO2241 - TOM BARBETTE 7 Web service « system » ▪Controllable factors ▪ The number of servers ▪ The web server software being used ▪ Various parameters of the web servers ▪ The characteristics of the server (if it’s yours) ▪Uncontrollable factors ▪ The load of whatever else is running on the server ▪ The temperature of the room ▪ The distance of the clients ▪Input characteristics ▪ The distribution of users ▪ The arrival rate ▪ The page each user visits ▪Output : web pages ▪ Observation we can make: ▪ Time to receive a page « Response time » - Do users come back to the website? - Percentage of users who buy articles? - … LINFO2241 - TOM BARBETTE 9 Simplified experiment ▪We want to know the time to receive a page according to the number of users trying to make requests to the web server ▪Workload / input: 1 request per second, following an exponential distribution ▪Metric observed : the time to return the page by the sever ▪Controllable parameters ▪ The number of servers : 1 (simplification) ▪Uncontrollable parameters ▪ The distance of users ▪ The quality of their connection ▪ … ▪ → Ignored for now ! LINFO2241 - TOM BARBETTE 10 Simple experiment A machine generates the requests, another machine responds to those requests Request Rate Service Time Exponential Distribution Deterministic Rate 𝜆 = 1 request / seconds Rate 𝜇 = 1.1 requests / second 1 1 i.e. mean inter-arrival time = s i.e. s per request 1 1.1 Network queuing and latency is ignored (very small compared to the processing time) LINFO2241 - TOM BARBETTE 11 How can we estimate the performance of a system? MEASUREMENTS SIMULATION NUMERICAL ANALYSIS LINFO2241 - TOM BARBETTE 12 Basic terms System Any collection of hardware, software, and firmware Metrics Criteria used to evaluate the performance of the system component Workloads The requests made by the users of the system LINFO2241 - TOM BARBETTE 13 Exponential distribution 𝜆 =1 𝜇 = 1.1 rps exponential 1 i.e. spr 1.1 deterministic Performance? Requests arrive at those time : [ 2.4042086 4.74039826 7.12515926 7.40495355 7.49139095 8.94405146 10.35401216 13.47830812 13.55760231 14.60416316 14.67459947 15.76362309 17.49495045 17.88184527 19.11343093 19.26720419 19.35878146 19.67396066 … ] https://tinyurl.com/2yryy74f LINFO2241 - TOM BARBETTE 14 Simulation ▪Building a model of the system and simulate it ▪ Advantage: Relatively easy* ▪ Disadvantage: For complex systems, the simulation has to run for a long time (sometimes several days) to give reliable results. *unless there are many external factors impacting the system under test ▪ Example ▪ Create a timeline of web server requests and responses, e.g. users arrive at time t=1.1, get response at t=2.1 ▪ Simulating packets in a network ▪ Simulate all paths a program can take instead of running it ▪ Models how a storage system will behave under various read/write patterns, data redundancy schemes, or failures LINFO2241 - TOM BARBETTE 15 Measurements ▪Observing a real system ▪Advantage: We get realistic results ▪Disadvantage: We have to build the system first before we can evaluate its performance. This is not always possible, for example when our job is to design a new system. ▪Examples: ▪Implement 3 sorting algorithms and run them with N=1, 100, 1000 … ▪Run a web server and generate requests. Measure the response time, the bytes downloaded, … ▪Ask 10 users how a website looks ▪ Try multiple versions of a website and check which target group buys the most stuffs LINFO2241 - TOM BARBETTE 17 Numerical analysis ▪Building a mathematical model of the system ▪ Advantage: only requires a pen and paper ▪ Disadvantage: it’s near impossible to take all components of the system into account. ▪ Example ▪ With queuing theory, compute the response time of queuing systems ▪ Physical systems (gravitation, collision) LINFO2241 - TOM BARBETTE 19 Analytical Analysis ▪This is an M/D/1 queue ▪𝐸 𝑋 =𝑑 ▪ Expected service time, First moment ▪ 𝐸 𝑋 2 = 𝑑2 ▪ Second moment ▪ 𝜎𝑠2 = 0 ▪ Variance ▪Pollaczek Khinchine 𝜌2 + 𝜆²𝜎𝑠2 𝜆 1 𝐸 𝑄 = 𝜌+ with the utilization 𝜌 = = 2(1 − 𝜌) µ 1.1 For an M/D/1 system 𝜌2 𝐸 𝑄 = 𝜌+ 2(1 − 𝜌) 12 1 1.1 𝐸𝑄 = + 1.1 2(1 − 1 ) 1.1 𝐸 𝑄 = 5,4545 … LINFO2241 - TOM BARBETTE 21 Final results Mean service times ▪Simulation 5.66121808675306 ▪Measurement 5.460817964558457 ▪Analytical 5.454545454545452 LINFO2241 - TOM BARBETTE 22 Selecting a technique Analytical Simulation Measurement Stage Any Any Post Time required Small Medium Varies Tools Analysts Computer languages Instrumentation Accuracy Low Moderate High (Varies) Cost Small Medium High Saleability Low Medium High LINFO2241 - TOM BARBETTE 23 Three rules of validation Do not trust the results of an analytical model until they have been validated by a simulation model or measurements. Do not trust the results of a simulation model until they have been validated by analytical modeling or measurements. Do not trust the results of a measurement until they have been validated by simulation or analytical modeling. LINFO2241 - TOM BARBETTE 24 More complex systems ▪We over-simplified the web example ▪There might be many more parameters ▪Let’s take an example where you probably have more glimpse into LINFO2241 - TOM BARBETTE 26 Cooking rice ▪Controllable factors ▪ Cooking time Parameters ▪ From 1 to 30? What do you think ▪ Amount of water Factors of the experiment ▪ From 100ml to 1liter? Levels of the factors ▪ Quantity of salt ▪ 1 to 5 spoon ▪Uncontrollable factors ▪ Room temperature ▪ At the time of the experiment, it was 20° ▪Input characteristics « Workload » in ▪ White rice of the brand « everyday » computer systems ▪ We could try to explain the distribution if rice grains, their color, etc Distribution of clients ▪Output DNA Samples ▪ Cooked rice Observations Packet trace ▪ Is it good? Metrics … ▪ Is it sticky? LINFO2241 - TOM BARBETTE 27 Experiment ▪Repeatedly try: ▪ Execute/Simulate/Compute the system with a set of levels ▪ E.g. try to cook the rice with WATER=800ml, SALT=5spoons,TIME=20min ▪ Measure the outcomes ▪Repeat with another set of factors ▪ Again ▪ And again ▪Each experimental run is a test. ▪More formally, we can define an experiment as a test or series of runs in which purposeful changes are made to the input variables of a process or system so that we may observe and identify the reasons for changes that may be observed in the output response. LINFO2241 - TOM BARBETTE 28 Experimental design ▪Experimental design is the process of deciding which levels will be combined for each runs... y1 y2 yr Observations (Metrics) LINFO2241 - TOM BARBETTE 29 Full factorial experimental design ▪First idea : try everything! ▪ 30 different levels for time ▪ 10 different quantity of water ▪ 5 different level of salt ▪1500 different runs or test ▪This is called a « full factorial design » or « grid search » in ML LINFO2241 - TOM BARBETTE 30 Design space LINFO2241 - TOM BARBETTE 32 What should we do?! ▪First focus on one factor ▪ Study the influence of cooking time with X ml of water, X spoons of salt? ▪ Then the influence of water with a cooking time of whatever we found above? ▪ Then … ▪ Very dangerous! ▪ You could miss interactions --> With not enough water, the rice will be always bad ! ▪ You conclude that the cooking time is not very important ▪ And you then select a wrong cooking time ▪ Everything is biased ▪ Interactions are very present in computer science experiments ▪ This is called one-at-a-time experimental design this is often a bad idea before you ruled out interactions LINFO2241 - TOM BARBETTE 34 Design space of OAT No idea what happens here LINFO2241 - TOM BARBETTE 35 One solution 1) Sample a few points of the design space 2) Check what is important 3) Conduct a full experiment on what’s important only ▪ Now it can be one-at-a-time, when you know what you’re doing ▪ Eg. ▪ We observe that without water, the rice is always bad no matter the quantity of salt ▪ However with an appropriate amount of water, the influence of salt is … LINFO2241 - TOM BARBETTE 36 Random experimental design LINFO2241 - TOM BARBETTE 37 Terminology ▪Response Variable: Outcome. ▪ E.g., throughput, response time, taste of the rice ▪Factors: Variables that affect the response variable. ▪ E.g., CPU type, memory size, number of disk drives, workload used, and user's educational level. Also called predictor variables or predictors. ▪Levels: The values that a factor can assume, E.g., the CPU type has three levels: 68000, 8080, or Z80. # of disk drives has four levels. ▪ Also called treatment. ▪Experiment/Test: Execution of the setup for a single level of each factor ▪Replication/Runs: Repetition of all or some experiments. ▪Design: The number of experiments, the factor level and number of replications for each experiment. LINFO2241 - TOM BARBETTE 38

Performance Evaluation PDF

Document Details

Tags

Related

Summary

Full Transcript