Verification and Validation of Simulation Models

Chapter 10: Verification and Validation of Simulation Models
Banks, Carson, Nelson & Nicol, Discrete-Event System Simulation

Model building, validation, and verification
- Observe the real system and the interactions between its entities.
- Get knowledge from the experts.
- Construct the conceptual model (assumptions and hypotheses).
- Validate the concepts of the model (comparison).
- Implement the operational model using simulation software.
- Validate and verify the two models.
- Modify if needed.

Model Building, Verification & Validation

Model building, validation, and verification
The goal of the validation process is:
- To produce a model that represents true behavior closely enough for decision-making purposes.
- To increase the model's credibility to an acceptable level.
Validation is an integral part of model development:
- Verification: building the model correctly (correctly implemented, with good input and structure).
- Validation: building the correct model (an accurate representation of the real system).
Most methods are informal subjective comparisons, while a few are formal statistical procedures.

Verification
Purpose: ensure the conceptual model is reflected accurately in the computerized representation.
Many common-sense suggestions, for example:
- Have someone else check the model.
- Make a flow diagram that includes each logically possible action a system can take when an event occurs.
- Closely examine the model output for reasonableness under a variety of input parameter settings.
- Use simplifying assumptions, e.g., exponential interarrival and service times with c servers, so that the results can be checked against the M/M/c queueing model.
- Print the input parameters at the end of the simulation to make sure they have not been changed.

Examination of Model Output for Reasonableness [Verification]
Example: A model of a complex network of queues consisting of many service centers.
- Response time is the primary interest; however, it is important to collect and print out many statistics in addition to response time.
Two statistics that give a quick indication of model reasonableness are current contents and total counts, for example:
- If the current contents grow in a more or less linear fashion as the simulation run time increases, it is likely that a queue is unstable.
- If the total count for some subsystem is zero, no items entered that subsystem.
- If the total and current counts are both equal to one, an entity may have captured a resource but never freed it.
Compute certain long-run measures of performance, e.g., compute the long-run server utilization and compare it to the simulation results.

Other Important Tools [Verification]
Documentation
- A means of clarifying the logic of a model and verifying its completeness.
Use of a trace
- A detailed printout of the state of the simulation model over time.

Use of a trace
Is the model correct? Is the calculated parameter correct?

Calibration and Validation
Validation: the overall process of comparing the model and its behavior to the real system.
Calibration: the iterative process of comparing the model to the real system and making adjustments to the model (illustrated by a figure on the original slide).

Calibration and Validation
No model is ever a perfect representation of the system.
- The modeler must weigh the possible, but not guaranteed, increase in model accuracy against the cost of increased validation effort.
- "Time and money control the level of detail (and the validity of the model)."
Three-step approach:
1. Build a model that has high face validity.
2. Validate model assumptions.
3. Compare the model input-output transformations with the real system's data.

High Face Validity [Calibration & Validation]
Ensure a high degree of realism: potential users should be involved in model construction (from its conceptualization to its implementation).
Sensitivity analysis can also be used to check a model's face validity.
- Example: In most queueing systems, if the arrival rate of customers were to increase, server utilization, queue length, and delays would be expected to increase.

Validate Model Assumptions [Calibration & Validation]
General classes of model assumptions:
- Structural assumptions: how the system operates.
- Data assumptions: reliability of data and its statistical analysis.
Bank example: customer queueing and service facility in a bank.
- Structural assumptions, e.g., customers waiting in one line versus many lines, served FCFS versus by priority.
- Data assumptions, e.g., interarrival times of customers, service times for commercial accounts.
  - Verify data reliability with bank managers.
  - Test correlation and goodness of fit for data (see Chapter 9 for more details).

Validate Input-Output Transformation [Calibration & Validation]
Goal: validate the model's ability to predict future behavior.
- The only objective test of the model.
- The structure of the model should be accurate enough to make good predictions for the range of input data sets of interest.
Possible approaches:
- Use historical data that have been reserved for validation purposes only.
- Use a Turing test: people knowledgeable about the system are asked to examine model output data alongside system data without knowing which is which.
Criteria: use the main responses of interest.

Bank Example [Validate I-O Transformation]
Example: One drive-in window serviced by one teller; only one or two transactions are allowed.
- Data collection: 90 customers between 11 am and 1 pm.
  - Observed service times {S_i, i = 1, 2, …, 90}.
  - Observed interarrival times {A_i, i = 1, 2, …, 90}.
- Data analysis led to the conclusion that:
  - Interarrival times: exponentially distributed with rate λ = 45 per hour.
  - Service times: N(1.1, 0.2²) minutes.

The Black Box [Bank Example: Validate I-O Transformation]
A model was developed in close consultation with bank management and employees.
Model assumptions were validated.
The resulting model is now viewed as a "black box", f(X, D) = Y:

Input variables
- Uncontrolled variables, X:
  - Poisson arrivals, λ = 45/hr: X11, X12, …
  - Service times, N(D2, 0.2²): X21, X22, …
- Controlled decision variables, D:
  - D1 = 1 (one teller)
  - D2 = 1.1 min (mean service time)
  - D3 = 1 (one line)
Model output variables, Y
- Primary interest:
  - Y1 = teller's utilization
  - Y2 = average delay
  - Y3 = maximum line length
- Secondary interest:
  - Y4 = observed arrival rate
  - Y5 = average service time
  - Y6 = sample std. dev. of service times
  - Y7 = average length of the line

Comparison with Real System Data [Bank Example: Validate I-O Transformation]
Real system data are necessary for validation.
- System responses should have been collected during the same time period (from 11 am to 1 pm on the same Friday).
Compare the average delay from the model, Y2, with the actual delay, Z2:
- Average delay observed: Z2 = 4.3 minutes; consider this to be the true mean value μ0 = 4.3.
- When the model is run with generated random variates X1n and X2n, Y2 should be close to Z2.
- Six statistically independent replications of the model, each of 2-hour duration, are run.

Hypothesis Testing [Bank Example: Validate I-O Transformation]
Compare the average delay from the model, Y2, with the actual delay, Z2 (continued):
- Null hypothesis testing: evaluate whether the simulation and the real system are the same (w.r.t. output measures):
  $H_0: E(Y_2) = 4.3$ minutes
  $H_1: E(Y_2) \neq 4.3$ minutes
  - If H0 is not rejected, there is no reason to consider the model invalid.
  - If H0 is rejected, the current version of the model is rejected, and the modeler needs to improve the model.
- Conduct the t test:
  - Choose the level of significance (α = 0.05) and sample size (n = 6); see the results in Table 10.2.
  - Compute the sample mean and sample standard deviation over the n replications:
    $\bar{Y}_2 = \frac{1}{n}\sum_{i=1}^{n} Y_{2i} = 2.51$ minutes
    $S = \sqrt{\frac{\sum_{i=1}^{n}(Y_{2i} - \bar{Y}_2)^2}{n-1}} = 0.82$ minutes
  - Compute the test statistic:
    $t_0 = \frac{\bar{Y}_2 - \mu_0}{S/\sqrt{n}} = \frac{2.51 - 4.3}{0.82/\sqrt{6}} \approx -5.35$, so $|t_0| > t_{\text{critical}} = 2.571$
    where the critical value is $t_{\alpha/2,\,n-1} = t_{0.025,\,5}$ (two-sided test, n - 1 degrees of freedom).
  - Hence, reject H0 and conclude that the model is inadequate.
  - Check the assumptions justifying a t test: that the observations (Y2i) are normally and independently distributed.
- Similarly, compare the model output with the observed output for other measures: Y4 with Z4, Y5 with Z5, and Y6 with Z6.

Type I and II Error [Validate I-O Transformation]
Type I error (α):
- Error of rejecting a valid model.
- Controlled by specifying a small level of significance α.
Type II error (β):
- Error of accepting a model as valid when it is invalid.
- Controlled by specifying the critical difference and finding the required n.
For a fixed sample size n, increasing α will decrease β.

Type II Error [Validate I-O Transformation]
For validation, the power of the test is:
- P(detecting an invalid model) = 1 - β
- β = P(Type II error) = P(failing to reject H0 | H1 is true)
- Because failure to reject H0 is treated as a strong conclusion, the modeler wants β to be small.
- The value of β depends on:
  - The sample size, n
  - The true difference, δ, between E(Y) and μ: $\delta = \frac{|E(Y) - \mu|}{\sigma}$
- In general, the best approach to controlling β error is:
  1. Specify the critical difference, δ.
  2. Choose a sample size, n, by making use of the operating characteristic curve (OC curve).

Example (figures not preserved; surviving labels: 1, 2, 3 and 0.6)

Confidence Interval Testing [Validate I-O Transformation]
Confidence interval testing: evaluate whether the simulation and the real system are close enough.
If Y is the simulation output and μ = E(Y), the confidence interval (C.I.) for μ is:
  $\bar{Y} \pm t_{\alpha/2,\,n-1}\, S/\sqrt{n}$
Validating the model:
- Suppose the C.I. does not contain μ0 (Figure a):
  - If the best-case error is > ε, the model needs to be refined (invalid).
  - If the worst-case error is ≤ ε, accept the model (valid).
  - If the best-case error is ≤ ε but the worst-case error is > ε, additional replications are necessary.
- Suppose the C.I. contains μ0 (Figure b):
  - If the worst-case error is ≤ ε, accept the model (valid).
  - If either the best-case or worst-case error is > ε, additional replications are necessary.

Confidence Interval Testing [Validate I-O Transformation]
Bank example: μ0 = 4.3, and "close enough" is ε = 1 minute of expected customer delay.
- A 95% confidence interval based on the 6 replications is [1.65, 3.37], because:
  $\bar{Y} \pm t_{0.025,\,5}\, S/\sqrt{n} = 2.51 \pm 2.57\,(0.82/\sqrt{6})$
- μ0 = 4.3 falls outside the confidence interval. The best case is |3.37 - 4.3| = 0.93 < 1, but the worst case is |1.65 - 4.3| = 2.65 > 1, so additional replications are needed to reach a decision.

Confidence Interval Testing (cont.) [Validate I-O Transformation]
The following approach is for situations where it is possible to collect a large amount of data.
Suppose that we collected m independent sets of data from the system and n independent sets of data from the model (n independent replications or runs):
- Z_j: average of the system observations in data set j, j = 1, 2, …, m
- Y_j: average of the model observations in run j, j = 1, 2, …, n
The Z_j's are IID random variables with mean E(Z_j).
The Y_j's are IID random variables with mean E(Y_j).
Confidence interval: construct a confidence interval for the difference E(Y_j) - E(Z_j) (the formula is not preserved in this transcript).
Special case, ε = 0:
- If the confidence interval does not contain 0, the difference between the model and system means is said to be statistically significant.
- If the confidence interval contains 0, the difference is not significant.

Confidence Interval Testing (cont.) Example [Validate I-O Transformation]
Run #          1       2       3       4       5       6       7       8       9       10
Z_j (system)   0.548   0.491   0.49    0.454   0.567   0.486   0.419   0.527   0.521   0.461
Y_j (model)    0.613   0.618   0.63    0.732   0.548   0.614   0.463   0.614   0.463   0.572
Y_j - Z_j      0.065   0.127   0.14    0.278   -0.019  0.128   0.044   0.087   -0.058  0.111
With α = 0.1, n = m, and ε = 0: since the confidence interval for the difference does not contain 0, the difference is significant (the model is invalid).
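
The run-by-run comparison in the example table can be checked with a short calculation. The interval formula itself did not survive in this transcript, so the sketch below assumes a paired-t interval on the differences Y_j - Z_j, which fits the example's n = m setup but is not confirmed by the slides. The function name `paired_t_interval` is illustrative, and the critical value 1.833 is the standard t-table entry for t(0.05, 9), matching α = 0.1 with 9 degrees of freedom.

```python
import math
from statistics import mean, stdev

# Data copied from the example table (system Z_j, model Y_j).
Z = [0.548, 0.491, 0.49, 0.454, 0.567, 0.486, 0.419, 0.527, 0.521, 0.461]
Y = [0.613, 0.618, 0.63, 0.732, 0.548, 0.614, 0.463, 0.614, 0.463, 0.572]

# Assumption: two-sided 90% interval (alpha = 0.1) with n - 1 = 9 degrees
# of freedom; 1.833 is the standard t-table value t_{0.05, 9}.
T_CRIT_90_DF9 = 1.833


def paired_t_interval(model, system, t_crit):
    """Return (lower, upper) of the paired-t interval for mean(model - system)."""
    diffs = [y - z for y, z in zip(model, system)]
    n = len(diffs)
    half_width = t_crit * stdev(diffs) / math.sqrt(n)  # stdev uses n - 1
    center = mean(diffs)
    return center - half_width, center + half_width


lower, upper = paired_t_interval(Y, Z, T_CRIT_90_DF9)
# Special case epsilon = 0: significant if the interval excludes zero.
significant = not (lower <= 0.0 <= upper)
print(f"90% CI for the mean difference: [{lower:.4f}, {upper:.4f}]")
print("difference significant:", significant)
```

Under these assumptions the interval lies entirely above zero, which agrees with the slide's conclusion that the difference is significant.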

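Returning to the bank example, both the t test and the ε-based confidence-interval rule can be sketched from the summary statistics quoted on the slides (mean 2.51, S = 0.82, n = 6, μ0 = 4.3, ε = 1, critical value 2.571). The individual replication values from Table 10.2 are not reproduced in this transcript, so the sketch works from the summaries only. The function names and verdict labels are illustrative choices, not part of the source.

```python
import math


def t_statistic(ybar, mu0, s, n):
    """One-sample t statistic for H0: E(Y) = mu0."""
    return (ybar - mu0) / (s / math.sqrt(n))


def ci_decision(ybar, mu0, s, n, t_crit, eps):
    """Apply the best-case / worst-case error rule to the t interval."""
    half = t_crit * s / math.sqrt(n)
    lower, upper = ybar - half, ybar + half
    worst = max(abs(lower - mu0), abs(upper - mu0))
    # The best-case error is zero when mu0 lies inside the interval.
    if lower <= mu0 <= upper:
        best = 0.0
    else:
        best = min(abs(lower - mu0), abs(upper - mu0))
    if worst <= eps:
        verdict = "accept"
    elif best > eps:
        verdict = "refine"
    else:
        verdict = "more replications"
    return (lower, upper), best, worst, verdict


# Summary values quoted on the slides for the bank example.
YBAR, S, N, MU0, EPS, T_CRIT = 2.51, 0.82, 6, 4.3, 1.0, 2.571

t0 = t_statistic(YBAR, MU0, S, N)
reject_h0 = abs(t0) > T_CRIT
(ci_lower, ci_upper), best, worst, verdict = ci_decision(
    YBAR, MU0, S, N, T_CRIT, EPS
)
print(f"t0 = {t0:.2f}, reject H0: {reject_h0}")
print(f"CI = [{ci_lower:.2f}, {ci_upper:.2f}], verdict: {verdict}")
```

With these inputs the t test rejects H0, while the interval rule asks for more replications, because the best-case error is below ε and the worst-case error is above it.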