B. Advanced Reserving Methods PDF

B. Advanced Reserving Methods
Taylor
© 2024 The Infinite Actuary, LLC July 27, 2024

Tasks:
9. Calculate the mean and prediction error of a reserve
10. Derive predictive distributions using stochastic methods

In this paper, we will look more closely at the chain ladder method of reserving. We will also look at some additional distributions that can be used with a GLM Bootstrap model, and we will see that the GLM Bootstrap model used with the over-dispersed Poisson distribution gives us the chain ladder results. Taylor also proves that, under certain conditions, the chain ladder estimates of the loss development factors are the minimum variance unbiased estimators (MVUE).

Notation

For origin period $k$ reported in period $j$:
- $Y_{kj}$ = incremental losses from $j-1$ to $j$
- $X_{kj}$ = cumulative losses
- Experience period = calendar period
- $f_{kj} = X_{k,j+1} / X_{kj}$ = age-to-age factors
- $f_j = \sum_{k=1}^{K-j} w_{kj} f_{kj}$, where $w_{kj}$ is some set of weights and $\sum_{k=1}^{K-j} w_{kj} = 1$

Exponential Dispersion Family

GLMs require a distribution from the exponential dispersion family ("EDF"). This is a family of distributions whose log probability density functions have the form:

$$\ln[\pi(y; \theta, \phi)] = \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi)$$

where
- $y$ is the value of an observation;
- $\theta$ is a location parameter, called the canonical parameter;
- $\phi$ is a dispersion parameter, sometimes called the scale parameter;
- $b(\theta)$ is the cumulant function, and determines the shape of the distribution;
- $\exp[c(y, \phi)]$ is a normalizing factor, which makes the area under the probability density function equal to 1.

The functions $a$, $b$, and $c$ should be continuous. In addition, $b$ should be one-to-one and twice differentiable, and its first derivative should also be one-to-one.

Here are some well-known distributions from the EDF:

| Distribution | $b(\theta)$ | $a(\phi)$ | $c(y, \phi)$ |
|---|---|---|---|
| Normal | $\frac{1}{2}\theta^2$ | $\phi$ | $-\frac{1}{2}\left[\frac{y^2}{\phi} + \ln(2\pi\phi)\right]$ |
| Poisson | $\exp(\theta)$ | $1$ | $-\ln(y!)$ |
| Binomial | $\ln(1 + \exp(\theta))$ | $n^{-1}$ | $\ln\binom{n}{ny}$ |
| Gamma | $-\ln(-\theta)$ | $v^{-1}$ | $v\ln(vy) - \ln(y) - \ln\Gamma(v)$ |
| Inverse Gaussian | $-(-2\theta)^{1/2}$ | $\phi$ | $-\frac{1}{2}\left[\ln(2\pi\phi y^3) + \frac{1}{\phi y}\right]$ |

Note: $n$ and $v$ are alternate representations of $\phi$, but we aren't given much additional information about them.

We would select a distribution based on what we are modeling. For example, claim counts are often modeled using the Poisson or binomial distributions, whereas the other distributions are commonly used for loss amounts.

We also need the following information to make sense of the table:
- $E(Y) = b'(\theta) = \mu$
- $\theta = (b')^{-1}(\mu)$
- $V(\mu) = b''(\theta)$, also called the variance function, but, confusingly, it does not equal the variance
- $Var(Y) = a(\phi)V(\mu) = a(\phi)b''(\theta)$

Example: Show that the following characteristics of the Poisson distribution can be derived using the EDF terminology.

$$E(Y) = \lambda, \qquad Var(Y) = \lambda, \qquad \pi(y; \theta, \phi) = \text{PDF} = \frac{\lambda^y e^{-\lambda}}{y!}$$

Step 1: Solve for $E(Y)$. Set $\mu = \lambda$.

$b(\theta) = \exp(\theta)$ (from the table above)
$b'(\theta) = \exp(\theta) = \mu$
$\theta = (b')^{-1}(\mu) = \ln(\mu) = \ln(\lambda)$
$E(Y) = b'(\theta) = \exp(\theta) = \exp(\ln(\lambda)) = \lambda$

Step 2: Solve for $Var(Y)$.

$V(\mu) = b''(\theta) = \exp(\theta)$
$Var(Y) = a(\phi)V(\mu) = 1 \times \exp(\theta) = \exp(\ln(\lambda)) = \lambda$

Step 3: Solve for $\pi(y; \theta, \phi)$.

$$\ln[\pi(y; \theta, \phi)] = \frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi) = \frac{y\ln(\lambda) - \exp(\ln(\lambda))}{1} - \ln(y!) = y\ln(\lambda) - \lambda - \ln(y!)$$

$$\pi(y; \theta, \phi) = \exp(y\ln(\lambda) - \lambda - \ln(y!)) = \frac{\lambda^y e^{-\lambda}}{y!}$$

Tweedie Sub-Family

The Tweedie sub-family is the subset of EDF distributions where:

$$V(\mu) = \mu^p, \qquad p \le 0 \text{ or } p \ge 1$$

If we restrict $a(\phi)$ to $\phi$ (which we will assume for the remainder of the paper), this means $Var(Y) = \phi\mu^p$, so the variance is proportional to a power of the mean.
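The relationships $E(Y) = b'(\theta)$ and $V(\mu) = b''(\theta) = \mu^p$ can be checked numerically. The following is a minimal sketch, assuming the cumulant functions from the EDF table and the standard Tweedie powers (0 for normal, 1 for Poisson, 2 for gamma, 3 for inverse Gaussian). The function names, the test values of $\theta$, and the finite-difference step are illustrative choices, not part of Taylor's paper.

```python
import math

# (cumulant function b, an admissible test value of theta, Tweedie power p)
# The theta values are arbitrary illustrative choices inside each domain.
MEMBERS = {
    "normal": (lambda t: 0.5 * t * t, 1.3, 0),
    "poisson": (lambda t: math.exp(t), 0.7, 1),
    "gamma": (lambda t: -math.log(-t), -0.4, 2),
    "inverse_gaussian": (lambda t: -math.sqrt(-2.0 * t), -0.4, 3),
}


def mean_and_variance_function(b, theta, h=1e-4):
    """Approximate mu = b'(theta) and V(mu) = b''(theta) by central differences."""
    mu = (b(theta + h) - b(theta - h)) / (2.0 * h)
    v = (b(theta + h) - 2.0 * b(theta) + b(theta - h)) / (h * h)
    return mu, v


def check_tweedie_power(name):
    """Return (mu, b''(theta), mu**p) for one member; the last two should agree."""
    b, theta, p = MEMBERS[name]
    mu, v = mean_and_variance_function(b, theta)
    return mu, v, mu ** p


for name in MEMBERS:
    mu, v, mu_p = check_tweedie_power(name)
    print(f"{name:17s} mu={mu:.4f}  b''(theta)={v:.4f}  mu^p={mu_p:.4f}")
```

For each member the last two printed columns should agree up to finite-difference error, which is the statement that the variance function is a power of the mean.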
We have the following well-known members of the Tweedie family:

| Distribution | $p$ | $b(\theta)$ | $\mu$ |
|---|---|---|---|
| Normal | 0 | $\frac{1}{2}\theta^2$ | $\theta$ |
| Over-Dispersed Poisson | 1 | $\exp(\theta)$ | $\exp(\theta)$ |
| Gamma | 2 | $-\ln(-\theta)$ | $-1/\theta$ |
| Inverse Gaussian | 3 | $-(-2\theta)^{1/2}$ | $(-2\theta)^{-1/2}$ |

When $1 < p < 2$, we have a compound Poisson distribution with a gamma severity distribution.

The larger the value of $p$, the heavier the tail of the distribution. We can use the data to help us determine how heavy the tail is (and thus how large $p$ should be). If we notice that the dispersion of the residuals is greater than we would expect, then we should consider increasing $p$. (Taylor doesn't mention how to determine what the expected dispersion should be.)

Over-Dispersed Poisson

The over-dispersed Poisson (ODP) distribution is similar to the traditional Poisson, with $E(Y) = \lambda$. However, a slight adjustment is made to the variance, such that $Var(Y) = \phi\lambda$. When $\phi = 1$, we have the traditional Poisson distribution.

The ODP distribution works well when little is known of the actual data distribution. It has much of the simplicity of the traditional Poisson, while acquiring additional flexibility from the second parameter, $\phi$. This makes it valuable in GLM models.

Stochastic Models Supporting the Chain Ladder Method

The bottom line of this section is that the chain ladder method provides the maximum likelihood estimate (MLE) of loss reserves. We will also look at some additional models with conditions that further strengthen this result.

Non-Parametric Mack Model

Mack introduced some assumptions around the chain ladder model that help to integrate variance into the estimate. These may (hopefully!) look familiar:

1. Accident years are independent of one another
2. For each accident year, the losses form a Markov chain (this is stated differently than we've seen before, but basically means that a loss in one period depends only on the losses in the period immediately prior and nothing else)
3. (a) The expected cumulative losses in the next period are proportional to the losses to date
   (b) The variance of losses in the next period is proportional to the losses to date

Two of the results of this model are:
- The conventional chain ladder estimates of the loss development factors are:
  - Unbiased
  - Minimum variance among estimators that are linear combinations of the loss development factors
- The conventional chain ladder estimates of the reserves are unbiased

Parametric (EDF) Mack Models

If we change assumption 3b above to say that the cumulative losses in the next period are distributed according to a distribution from the exponential dispersion family, then we have a parametric model, specifically the EDF Mack Model. All other assumptions must continue to hold.

As long as we have a full triangle of data (where the number of accident periods equals the number of development periods) in our model, the following are true:
- The model's MLEs of the loss development factors will equal the conventional chain ladder LDFs
- The model's MLEs of the loss development factors are unbiased estimators
- If we also have the EDF distribution restricted to an ODP distribution (we call this the ODP Mack Model), and if the dispersion parameters, $\phi$, don't vary by accident period (they can vary by development period), then
  - The conventional chain ladder LDFs are minimum variance unbiased estimators (MVUE)
  - The future cumulative loss estimates and reserve estimates are also MVUEs

Note that the MVUE result is much stronger than the result from the non-parametric Mack model, which was limited to linear combinations. This result gives us the minimum variance estimators out of all unbiased estimators.
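To connect the Mack results back to the notation, the conventional chain ladder LDF is the special case of $f_j = \sum_k w_{kj} f_{kj}$ in which the weights are proportional to the cumulative losses, $w_{kj} = X_{kj} / \sum_k X_{kj}$. The sketch below illustrates this on a small cumulative triangle; the function names and the list-of-rows layout are assumptions made for the illustration.

```python
def age_to_age_factors(triangle, j):
    """Pairs (X_kj, f_kj) for rows observed at both ages j and j+1 (j is 0-based)."""
    return [(row[j], row[j + 1] / row[j]) for row in triangle if len(row) > j + 1]


def volume_weighted_ldf(triangle, j):
    """f_j = sum_k w_kj * f_kj with weights w_kj = X_kj / sum_k X_kj."""
    pairs = age_to_age_factors(triangle, j)
    total = sum(x for x, _ in pairs)
    return sum((x / total) * f for x, f in pairs)


def project_ultimates(triangle):
    """Return the LDFs and each accident period's projected ultimate loss."""
    n = len(triangle[0])
    ldfs = [volume_weighted_ldf(triangle, j) for j in range(n - 1)]
    ultimates = []
    for row in triangle:
        value = row[-1]  # latest diagonal
        for j in range(len(row) - 1, n - 1):
            value *= ldfs[j]
        ultimates.append(value)
    return ldfs, ultimates


# Illustrative cumulative triangle: one row per accident period.
triangle = [[120, 155, 185], [130, 170], [125]]
ldfs, ultimates = project_ultimates(triangle)
print([round(f, 3) for f in ldfs])
print([round(u, 1) for u in ultimates])
```

Because the weights are proportional to $X_{kj}$, the weighted average collapses to the familiar ratio of column sums, $\sum_k X_{k,j+1} / \sum_k X_{kj}$.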
Cross-Classified Models

A cross-classified model is defined by the following:

1. The incremental losses are independent
2. The incremental losses have a distribution belonging to the exponential dispersion family
3. $E[Y_{kj}] = \alpha_k \beta_j$, where $Y_{kj}$ is the incremental loss and $\beta_j > 0$
4. $\sum_{j=1}^{J} \beta_j = 1$

The last requirement is only there to eliminate redundancy. Another constraint (like $\beta_1 = 1$ or $\alpha_1 = 1$) would work equally well.

Note that this model has row and column parameters, whereas the Mack models only have column parameters (the LDFs). However, in the Mack models, the losses reported to date serve as an implied row parameter in the forecasting of future losses.

If we also have the following conditions:
- We have a full triangle of data (where the number of accident periods equals the number of development periods)
- The EDF distribution is restricted to an over-dispersed Poisson distribution
- The dispersion parameter, $\phi$, is identical for all cells (it doesn't vary by accident period or development period)

Then we have the following results:
- The MLE future incremental loss estimates and reserve estimates are the same as those given by the conventional chain ladder
- If the future incremental loss estimates and reserve estimates are corrected for bias, then they are the MVUEs
- With the bias correction, reserve estimates for the ODP Mack and ODP Cross-Classified models are identical

Example: You are given the following cumulative loss data. Calculate the $\alpha_k$ and $\beta_j$ parameters and show that future incremental loss estimates are the same using the cross-classified model and the conventional chain ladder methods.

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 120 | 155 | 185 |
| 2 | 130 | 170 | |
| 3 | 125 | | |

Step 1: Calculate the loss development factors and future expected losses using the chain ladder method.
LDF(1-2) = 1.300
LDF(2-3) = 1.194

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | | | 185 |
| 2 | | 170 | 203 |
| 3 | 125 | 163 | 194 |

Note that 185, 170, and 125 are the last diagonal from the actual losses. The other numbers are obtained by multiplying the prior losses by the appropriate LDF.

Step 2: Calculate the incremental losses.

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 120 | 35 | 30 |
| 2 | 130 | 40 | |
| 3 | 125 | | |

Steps 3-8 iteratively calculate the $\alpha_k$ and $\beta_j$ parameters. Here this takes 6 steps because we have 6 $\alpha_k$ and $\beta_j$ parameters. A larger triangle would require more steps to finish the iteration. We will move down the column to calculate the $\alpha$s and from right to left to calculate the $\beta$s. I recommend memorizing the pattern of the calculations, rather than the formulas, but generally speaking, we will calculate

$$\alpha_k = \frac{\text{losses to date}}{1 - \text{sum of the } \beta_j \text{ calculated so far}} \qquad \beta_j = \frac{\text{sum of incremental losses at time } j}{\text{sum of corresponding values of } \alpha_k}$$

Step 3: Calculate $\alpha_1$. Since we don't have any values of $\beta_j$ yet, this is ultimate losses for accident year 1.

$$\alpha_1 = 120 + 35 + 30 = 185$$

Step 4: Calculate $\beta_3$ (or $\beta_j$ for the last development period).

$$\beta_3 = \frac{30}{185} = 0.162$$

Step 5: Calculate $\alpha_2$ = (losses to date in year 2) / $(1 - \beta_3)$.

$$\alpha_2 = \frac{130 + 40}{1 - 0.162} = 203$$

Step 6: Calculate $\beta_2$.

$$\beta_2 = \frac{35 + 40}{185 + 203} = 0.193$$

Step 7: Calculate $\alpha_3$ = (losses to date in year 3) / $(1 - \beta_3 - \beta_2)$.

$$\alpha_3 = \frac{125}{1 - 0.162 - 0.193} = 194$$

Step 8: Calculate $\beta_1$.

$$\beta_1 = \frac{120 + 130 + 125}{185 + 203 + 194} = 0.644$$

The completed table of incremental losses and parameters is:

| Accident period | 1 | 2 | 3 | $\alpha_k$ |
|---|---|---|---|---|
| 1 | 120 | 35 | 30 | 185 |
| 2 | 130 | 40 | | 203 |
| 3 | 125 | | | 194 |
| $\beta_j$ | 0.644 | 0.193 | 0.162 | |

Step 9: Check that the $\beta_j$ parameters sum to 1. Due to rounding, it appears that our $\beta_j$ parameters don't quite sum to 1, but if we look at the unrounded values, they do sum to 1, so no adjustment is needed.
If we did need to make an adjustment, we would divide each $\beta_j$ by the sum of all the $\beta_j$'s and then multiply each $\alpha_k$ by the same number.

Step 10: Show that the incremental estimates made by the cross-classified model equal the incremental losses estimated using the chain ladder method.

We already calculated the cumulative loss estimates using the chain ladder method, so we just need to subtract those. Then multiply the appropriate $\alpha_k$ and $\beta_j$ values to get the cross-classified estimate. We will use $q(k, j)$ to represent the incremental loss at accident period $k$, development period $j$.

| | Conventional Chain Ladder | Cross-Classified Model |
|---|---|---|
| $q(2, 3)$ | 203 − 170 = 33 | 203 × 0.162 = 33 |
| $q(3, 2)$ | 163 − 125 = 38 | 194 × 0.193 = 38 |
| $q(3, 3)$ | 194 − 163 = 31 | 194 × 0.162 = 31 |

Additional Calculation: You can also use the values of $\beta_j$ to calculate the values of the LDFs.

$$f_j = \frac{\sum_{i=1}^{j+1} \beta_i}{\sum_{i=1}^{j} \beta_i}$$

From our example:

$$\frac{0.644 + 0.193}{0.644} = 1.300 \qquad \frac{0.644 + 0.193 + 0.162}{0.644 + 0.193} = 1.194$$

Generalized Linear Models

These stochastic models for which the chain ladder results are maximum likelihood can be represented as GLMs. The advantage of this is that the parameters can be estimated by statistical software, which also returns a lot of additional, useful information about the model and especially about the dispersion of the parameter estimates. We can then use this information as the basis for the prediction error associated with the model.

We are going to use matrix notation to describe the set-up of the GLM. You may recognize this from the GLM Framework file as part of the Shapland paper. We will assume a 3x3 triangle for demonstration purposes.

We will use $X$ to represent our design matrix. Each cell in this matrix contains what we'll call a covariate. Remember that each row is a cell from our loss development triangle and shows which $\alpha$ and $\beta$ parameters apply.
For example, the first row in the matrix represents the top left cell in the triangle. Its formula is $1\alpha_1 + 0\alpha_2 + 0\alpha_3 + 0\beta_2 + 0\beta_3$. The last row in the matrix represents the top right cell in the triangle. Its formula is $1\alpha_1 + 0\alpha_2 + 0\alpha_3 + 1\beta_2 + 1\beta_3$.

$$X = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 & 0 \\ 1 & 0 & 0 & 1 & 1 \end{pmatrix}$$

Our parameter vector is $A$.

$$A = \begin{pmatrix} \alpha_1 \\ \alpha_2 \\ \alpha_3 \\ \beta_2 \\ \beta_3 \end{pmatrix}$$

We will use $Y$ to indicate our actual observed losses and $\mu$ to indicate our fitted or expected losses based on our parameters.

We will use $h(\cdot)$ to represent our link function. Shapland exclusively uses the log-link function, but it could really be any one-to-one function with a range $(-\infty, \infty)$.

Thus we have $h(\mu) = XA$.

If we set $h(\cdot)$ to the identity function and select a normal distribution for our error terms, then we have $Y_i = (XA)_i + \epsilon_i$, with $\epsilon_i \sim N(0, \phi_i)$, which is a traditional weighted linear regression model. The GLM is a generalized version of this where the following are true:
- The relation between observations and covariates may be non-linear
- Error terms may be non-normal

The calculation of the parameters in the $A$ matrix is usually done by maximum likelihood estimation (we select the parameters that maximize the likelihood between the actual losses and the fitted losses).

The dispersion parameter, $\phi$, is usually unknown, and we usually assume that $\phi_i = \phi / w_i$, where $w_i$ is a known weight given to each observation. Remember that the dispersion parameter is used to calculate the variance ($Var(Y) = a(\phi)V(\mu)$).
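The design matrix above follows a simple pattern, which the sketch below reproduces for an n x n triangle. It assumes the row ordering shown in the matrix (development period by development period, and accident period within each) and the column ordering $\alpha_1, \dots, \alpha_n, \beta_2, \dots, \beta_n$. The function names and the parameter values used in the demonstration are hypothetical.

```python
def design_matrix(n):
    """Design matrix for an n x n triangle.

    Rows: observed cells ordered by development period j, then accident period k.
    Columns: alpha_1..alpha_n followed by beta_2..beta_n.
    A cell (k, j) uses alpha_k and every beta_d with 2 <= d <= j.
    """
    rows = []
    for j in range(1, n + 1):            # development period
        for k in range(1, n - j + 2):    # accident periods observed at age j
            alpha_part = [1 if a == k else 0 for a in range(1, n + 1)]
            beta_part = [1 if d <= j else 0 for d in range(2, n + 1)]
            rows.append(alpha_part + beta_part)
    return rows


def linear_predictor(X, A):
    """Compute XA, one value per observed cell."""
    return [sum(x * a for x, a in zip(row, A)) for row in X]


X = design_matrix(3)
for row in X:
    print(row)

# Hypothetical parameter values, for illustration only.
A = [4.8, 4.9, 4.8, -1.2, -0.3]
print(linear_predictor(X, A))
```

With a log link, exponentiating each entry of `linear_predictor(X, A)` would give the fitted value for the corresponding cell.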
Running a GLM requires making the following selections:
- Cumulant function, $b(\theta)$, which controls the model's assumed error distribution
- $p$, which controls the relationship between the model mean and variance
- Covariates, which influence the mean, $\mu$
- Link function, $h(\cdot)$, which specifies the functional relationship between $\mu$ and the covariates

Covariates

When producing a GLM of a loss development triangle, we have used covariates with the values of 1 or 0 to represent our use (or non-use) of the $\alpha$ and $\beta$ parameters. These are categorical covariates. We would say that the values associated with $\beta$ are one set of categorical variates, where the number of levels equals the number of development periods. Similarly, the values associated with $\alpha$ would be a second set of categorical variates, and the number of levels would equal the number of accident years.

We could also have a continuous variate, such as age, where instead of selecting discrete intervals, we use a continuous range of numerical values. Instead of having multiple levels for each possible interval, we would just have one variate that could use its actual value (rather than 0 or 1).

The value of a continuous variate could also be represented as a transformation of its actual value by using a linear spline (a piecewise function). This linear spline uses the basis function

$$L_{mM}(x) = \min[M - m, \max(0, x - m)]$$

where $m$ and $M$ are the lower and upper bounds of the range that we desire to be non-constant. Additional information on how to use these continuous variates is not on the syllabus.

GLM Representations of the Chain Ladder Method

The following section will show how we can set up some of our various representations of the chain ladder method in a GLM.
On an exam question, however, if you are given characteristics of a GLM model that fit the requirements to match the chain ladder output, then you should solve the problem using the traditional chain ladder LDFs, and not try to set up a GLM.

Parametric Mack Model

Let's consider the Parametric Mack Model, specifically the ODP Mack Model, which was our requirement to get the conventional chain ladder estimates to be MVUEs.

That model can be written as:

$$Y_{k,j+1} \mid X_{kj} \sim \text{ODP}(\mu_{kj}, \phi_j)$$

In English, this is saying that the incremental loss in the next period, given the cumulative losses to date, has an ODP distribution with parameters $\mu$ and $\phi$. Notice that $\phi$ only varies by development period, as that was also one of the requirements to get the MVUE result.

The parameters $\mu$ and $\phi$ tell us that

$$E[Y_{k,j+1} \mid X_{kj}] = \mu_{kj} \qquad Var[Y_{k,j+1} \mid X_{kj}] = \phi_j \mu_{kj}$$

Knowing that the expected value of the next incremental loss is the same for the ODP Mack Model as it is for the conventional chain ladder method tells us that

$$\mu_{kj} = (f_j - 1) X_{kj}$$

So, we can rewrite the model as:

$$Y_{k,j+1} \mid X_{kj} \sim \text{ODP}((f_j - 1) X_{kj}, \phi_j)$$

Which leads us to the following results:

$$\hat{f}_{kj} - 1 = \frac{Y_{k,j+1}}{X_{kj}}$$

$$E[\hat{f}_{kj} - 1 \mid X_{kj}] = \frac{E[Y_{k,j+1} \mid X_{kj}]}{X_{kj}} = f_j - 1$$

$$Var[\hat{f}_{kj} - 1 \mid X_{kj}] = \frac{Var[Y_{k,j+1} \mid X_{kj}]}{X_{kj}^2} = \frac{\phi_j \mu_{kj}}{X_{kj}^2} = \frac{\phi_j (f_j - 1) X_{kj}}{X_{kj}^2} = \frac{\phi_j (f_j - 1)}{X_{kj}}$$

Note that we can pull $X_{kj}$ out of the expected value and variance formulas because it is the losses to date and is a known value.

The ODP family is closed under scaling, which means that an ODP divided by a constant produces another ODP, so we have:

$$\hat{f}_{kj} - 1 \mid X_{kj} \sim \text{ODP}\left(f_j - 1, \frac{\phi_j}{X_{kj}}\right)$$

We're also told that $\phi$ is unnecessary for the calculation of $f$, so it can be set to 1.

We can set this up as a GLM where the output will be a distribution around our LDFs rather than our losses.
We'll set $h(\cdot)$ = identity. For a 3x3 triangle, we would also have the following:

$$X = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad A = \begin{pmatrix} f_1 - 1 \\ f_2 - 1 \end{pmatrix}$$

Note that this is a trivial GLM.

ODP Cross-Classified Model

The ODP Cross-Classified Model can be written as:

$$Y_{kj} \sim \text{ODP}(\mu_{kj}, \phi) = \text{ODP}(\alpha_k \beta_j, \phi)$$

Remember that the requirement for the cross-classified model is that $\phi$ doesn't vary, so there is no subscript here. Also note that $Y_{kj}$ is not dependent on losses to date; it is solely a function of $\alpha_k$ and $\beta_j$.

To enter this into a GLM, we will use $h(\cdot)$ = the log-link function and set

$$\mu_{kj} = \exp[\ln(\alpha_k) + \ln(\beta_j)]$$

We will also use a design matrix like the following (this is an example for a 3x3 triangle):

$$X = \begin{pmatrix} 1 & 0 & 0 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 & 0 & 0 \end{pmatrix} \qquad A = \begin{pmatrix} \ln\alpha_1 \\ \ln\alpha_2 \\ \ln\alpha_3 \\ \ln\beta_1 \\ \ln\beta_2 \\ \ln\beta_3 \end{pmatrix}$$

We want $\sum \beta_j = 1$ in order to eliminate redundancy. Most GLM software will automatically remove that redundancy by setting one of the parameters equal to 0 (i.e. $\ln(\beta_1) = 0$, so $\beta_1 = 1$). This may cause the parameter estimates to be different than what we calculated in our cross-classified example, but the results of the model will be the same. We could also manually make the same adjustment here that we discussed before, namely divide each $\beta_j$ by the sum of all $\beta_j$s and multiply each $\alpha_k$ by the same sum of all $\beta_j$s.

Deviance

The most common way to measure goodness-of-fit for a GLM is the scaled deviance, which is calculated as:

$$D(Y, \hat{Y}) = 2 \sum_{i=1}^{n} \left[\ln[\pi(Y_i; \hat{\theta}^{(s)}, \phi)] - \ln[\pi(Y_i; \hat{\theta}, \phi)]\right]$$

scaled deviance = 2 × sum of (loglikelihood of the saturated model − loglikelihood of a nested model)

where $\hat{Y}$ is the MLE of $\mu$, $\hat{\theta}$ is the MLE of $\theta$ for a nested model (i.e. a model with fewer parameters than the saturated model), and $\hat{\theta}^{(s)}$ is the estimate of $\theta$ in the saturated model (i.e. the model with a parameter for every observation).
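As a rough illustration of the definition, the sketch below computes the scaled deviance for an ODP model. It assumes the ODP loglikelihood kernel $(y \ln\mu - \mu)/\phi$ and a saturated model that fits every observation exactly ($\mu_i = Y_i$). The function names and the sample values are hypothetical, and observations are assumed to be strictly positive so that the logarithms are defined.

```python
import math


def odp_loglik(y, mu, phi=1.0):
    """ODP loglikelihood kernel (y ln(mu) - mu) / phi; terms free of mu are omitted."""
    return (y * math.log(mu) - mu) / phi


def scaled_deviance(actual, fitted, phi=1.0):
    """2 * sum(saturated loglik - nested loglik), with the saturated fit mu_i = y_i."""
    return 2.0 * sum(
        odp_loglik(y, y, phi) - odp_loglik(y, m, phi)
        for y, m in zip(actual, fitted)
    )


# Hypothetical incremental losses and nested-model fitted values.
actual = [100.0, 50.0, 25.0]
fitted = [90.0, 60.0, 25.0]

print(scaled_deviance(actual, fitted))           # phi = 1: the unscaled deviance
print(scaled_deviance(actual, fitted, phi=2.0))  # half of the line above
print(scaled_deviance(actual, actual))           # exact fit
```

Setting `phi=1.0` gives the unscaled version, and dividing by any other `phi` rescales every term by the same factor.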
The minimum possible deviance is 0, which happens in the saturated model when there is no difference between observations and fitted values.

We have the following table of loglikelihoods for various distributions. My guess is that you would be given these formulas if you were asked to calculate the deviance on the exam.

| Distribution | $\mu$ | $\ln[\pi(y; \mu, \phi)]$ |
|---|---|---|
| Normal | $\theta$ | $\dfrac{y\mu - \frac{1}{2}\mu^2}{\phi}$ |
| ODP | $\exp(\theta)$ | $\dfrac{y\ln\mu - \mu}{\phi}$ |
| Gamma | $-1/\theta$ | $\dfrac{-\frac{y}{\mu} - \ln\mu}{\phi}$ |
| Inverse Gaussian | $(-2\theta)^{-1/2}$ | $\dfrac{-\frac{y}{2\mu^2} + \frac{1}{\mu}}{\phi}$ |

It turns out that $\phi$ is irrelevant to the MLE of parameters (since the loglikelihood is always proportional to $1/\phi$, it can be factored out), so we can also calculate the unscaled deviance as:

$$D^*(Y, \hat{Y}) = 2 \sum_{i=1}^{n} \left[\ln[\pi(Y_i; \hat{\theta}^{(s)}, 1)] - \ln[\pi(Y_i; \hat{\theta}, 1)]\right]$$

Since we assume that $\phi$ is unknown, the unscaled deviance is more useful.

We can also estimate $\phi$ from the unscaled deviance:

$$\hat{\phi} = \frac{D^*(Y, \hat{Y})}{n - p}$$

where $n - p$ is the degrees of freedom ($n$ is the number of data points, which would also be the number of parameters in the saturated model, and $p$ is the number of parameters in the nested model).

Example: You are given the following actual losses, as well as the losses fitted using the saturated model (with 3 accident year parameters and 3 development year parameters) and fitted losses using a model with 1 accident year parameter and 2 development period parameters. You fit the model using an ODP distribution. Find the unscaled deviance.

Actual Cumulative Losses

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 120 | 155 | 185 |
| 2 | 130 | 170 | |
| 3 | 125 | | |

Fitted Losses (Saturated)

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 119.23 | 155.00 | 185.00 |
| 2 | 130.77 | 170.00 | |
| 3 | 125.00 | | |

Fitted Losses (Unsaturated)

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 114.17 | 162.17 | 184.67 |
| 2 | 114.17 | 162.17 | |
| 3 | 114.17 | | |

Step 1: Calculate the loglikelihood for the saturated model, $\ln[\pi(Y_i; \hat{\theta}^{(s)}, 1)]$.

Because this is an ODP, we are using the formula $\ln[\pi(Y_i; \hat{\theta}^{(s)}, 1)] = \dfrac{y\ln\mu - \mu}{\phi}$, with $\phi = 1$ (because we were told to find the unscaled deviance).
We are given the fitted cumulative values, so we need to subtract to get the incremental values, $\mu$. If we were given $\theta$ instead, we would need to convert it using $\mu = \exp(\theta)$.

| $Y_i$ | $\mu$ | $\ln[\pi(Y_i; \hat{\theta}^{(s)}, 1)]$ |
|---|---|---|
| 120 | 119.23 | 454.50 |
| 130 | 130.77 | 502.78 |
| 125 | 125.00 | 478.54 |
| 35 | 35.77 | 89.43 |
| 40 | 39.23 | 107.55 |
| 30 | 30.00 | 72.04 |

For example: 454.50 = 120 × ln(119.23) − 119.23

Step 2: Calculate the loglikelihood for the nested model, $\ln[\pi(Y_i; \hat{\theta}, 1)]$.

| $Y_i$ | $\mu$ | $\ln[\pi(Y_i; \hat{\theta}, 1)]$ |
|---|---|---|
| 120 | 114.17 | 454.35 |
| 130 | 114.17 | 501.73 |
| 125 | 114.17 | 478.04 |
| 35 | 48.00 | 87.49 |
| 40 | 48.00 | 106.85 |
| 30 | 22.50 | 70.91 |

Step 3: Calculate the deviance. We will use $d_i$ to signify the unscaled deviance for each component, and then $D^*(Y, \hat{Y}) = \sum d_i$.

$d_i$ = 2 × (loglikelihood for the saturated model − loglikelihood for the nested model):

| $\ln[\pi(Y_i; \hat{\theta}^{(s)}, 1)]$ | $\ln[\pi(Y_i; \hat{\theta}, 1)]$ | $d_i$ |
|---|---|---|
| 454.50 | 454.35 | 0.29 |
| 502.78 | 501.73 | 2.10 |
| 478.54 | 478.04 | 1.00 |
| 89.43 | 87.49 | 3.87 |
| 107.55 | 106.85 | 1.40 |
| 72.04 | 70.91 | 2.26 |

$D^*(Y, \hat{Y}) = 10.91$

We could then compare the deviance for various nested models. We don't want to automatically select the model with the smallest deviance (that will always be the saturated model). We would also want to incorporate a penalty for overparameterization, but Taylor doesn't get into that.

Residuals

The standardized Pearson residuals are commonly used in GLMs.

$$R_i^P = \frac{Y_i - \hat{Y}_i}{\hat{\sigma}_i}$$

Pearson residual = (actual − expected) / standard deviation

Note: You may recall other papers where the standardized residual requires adjustment with a hat matrix. That's because in those papers our residual didn't include $\phi$ (the hat matrix is an estimate of $\phi$). Here we are applying $\phi$ directly, assuming we know what it is, by using the standard deviation in the denominator.
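For an ODP model the variance is $\phi\mu$, so the standard deviation in the Pearson residual is $\sqrt{\phi\hat{Y}_i}$. The sketch below applies that to the incremental actual and nested fitted values from the deviance example above. The value $\phi = 4$ is an assumption chosen for illustration, and the function name is hypothetical.

```python
import math


def odp_pearson_residual(actual, fitted, phi):
    """(actual - fitted) / sqrt(phi * fitted), using the ODP variance phi * mu."""
    return (actual - fitted) / math.sqrt(phi * fitted)


# Incremental actual and nested fitted values from the deviance example.
actual = [120, 130, 125, 35, 40, 30]
fitted = [114.17, 114.17, 114.17, 48.00, 48.00, 22.50]
phi = 4.0  # assumed dispersion parameter, for illustration only

residuals = [odp_pearson_residual(y, m, phi) for y, m in zip(actual, fitted)]
for y, m, r in zip(actual, fitted, residuals):
    print(f"actual={y:6.2f}  fitted={m:6.2f}  residual={r:6.3f}")
```

A residual is positive when the actual loss exceeds the fitted loss and negative otherwise, so the sign pattern across development periods is a quick check on the fit.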
Example: Given the following actual and fitted incremental losses for a model using the ODP distribution with $\phi = 4$, calculate the standardized Pearson residuals.

Actual Incremental Losses

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 120 | 35 | 30 |
| 2 | 130 | 40 | |
| 3 | 125 | | |

Fitted Losses

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 114.17 | 48.00 | 22.50 |
| 2 | 114.17 | 48.00 | |
| 3 | 114.17 | | |

Solution:

Pearson Residuals

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 0.273 | -0.938 | 0.791 |
| 2 | 0.741 | -0.577 | |
| 3 | 0.507 | | |

For example: $0.273 = \dfrac{120 - 114.17}{\sqrt{4 \times 114.17}}$

Note: The residuals are all negative at development period 2 and all positive at the other development periods, so they aren't random around 0, which means this model doesn't fit very well. Nevertheless, it will suffice for purposes of this example.

Standardized deviance residuals are another valid option for calculating residuals. One of the benefits of deviance residuals is that they tend to be closer to normally distributed (whereas Pearson residuals don't have to be). The standardized deviance residual is calculated as:

$$R_i^D = \text{sign}(Y_i - \hat{Y}_i)\sqrt{\frac{d_i}{\hat{\phi}}}$$

Deviance residual = the sign of (actual − expected) × square root of (unscaled deviance for that component / estimated scale parameter)

Example: Given the following actual and fitted incremental losses for a model using the ODP distribution with $\phi = 4$, and the unscaled deviance, calculate the standardized deviance residuals.

Actual Incremental Losses

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 120 | 35 | 30 |
| 2 | 130 | 40 | |
| 3 | 125 | | |

Fitted Losses

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 114.17 | 48.00 | 22.50 |
| 2 | 114.17 | 48.00 | |
| 3 | 114.17 | | |

Unscaled Deviance

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 0.29 | 3.87 | 2.26 |
| 2 | 2.10 | 1.40 | |
| 3 | 1.00 | | |

Solution:

Deviance Residuals

| Accident period | 1 | 2 | 3 |
|---|---|---|---|
| 1 | 0.268 | -0.984 | 0.752 |
| 2 | 0.724 | -0.591 | |
| 3 | 0.499 | | |

For example: $0.268 = \sqrt{\dfrac{0.29}{4}}$ (using the unrounded deviance), and we keep this positive because 120 − 114.17 > 0.

Adjustments to the Model

Heteroscedasticity is when the variance of the residuals varies from one period to the next. This can be seen in a plot of the residuals when the residuals aren't evenly spread.
Heteroscedasticity can be solved by using non-constant values for the scale parameter $\phi$. This is discussed in more depth in the Shapland paper; however, Taylor also mentions weighting the scale parameter differently to account for heteroscedasticity, which would be an identical adjustment to the non-constant scale parameter adjustment.

Outliers can also be addressed by use of weights. We just assign a weight of 0 to the outlier. Of course, it's important to take care when removing outliers so that we are not removing events that reflect potential future variability.

We could also choose to only rely on recent experience years by setting weights for observations outside of the last $n$ diagonals to 0 (where $n$ is the number of years we want to use in our analysis).

Problem Knowledge Checklist

1. Know the definition of the exponential dispersion family
2. Know the parts of an EDF, their names and purposes
   - $b(\theta)$
   - $a(\phi)$
   - $c(y, \phi)$
   - $V(\mu)$
3. Know $b(\theta)$, $a(\phi)$, $c(y, \phi)$, and $\mu$ for the most common EDF members (at least Poisson)
4. Know how to get $E(Y)$ and $Var(Y)$ for an EDF
5. Know the definition of the Tweedie sub-family (including the restriction on $p$)
6. Know the purpose of $p$
7. Know $p$ for common Tweedie distributions
8. Be able to list the assumptions required for the following stochastic models to replicate chain ladder and know the results
   - Non-parametric Mack
   - EDF/ODP Mack
   - Cross-classified
9. Be able to calculate $\alpha_k$ and $\beta_j$ values for the cross-classified model
10. Know how to set up a GLM
   - Design matrix, $X$
   - Parameter matrix, $A$
   - $h(\cdot)$
   - Required selections
11. Be able to describe the difference between categorical and continuous covariates
12. Be able to describe the GLM set up for either a parametric Mack model or a cross-classified model
13. Be able to calculate deviance (including the loglikelihood for common distributions)
14. Be able to calculate standardized Pearson residuals
15. Be able to calculate standardized deviance residuals
16. Be able to state the advantage of deviance residuals over Pearson residuals
17. Be able to state how to make adjustments to the model for the following issues:
   - Heteroscedasticity
   - Outliers
   - Only using recent experience
