Lecture 3 (Week 3) - Comparing and Combining Forecasts
David Ubilava, University of Sydney
Summary
This lecture discusses comparing and combining forecasts in economics and business. It covers in-sample measures such as R-squared, adjusted R-squared, and information criteria, and out-of-sample measures such as the root mean squared forecast error. It also covers the Diebold-Mariano test for comparing predictive accuracy, optimal forecast combination, and forecast encompassing.
Full Transcript
Forecasting for Economics and Business
Lecture 3: Comparing and Combining Forecasts
David Ubilava, University of Sydney

Comparing forecasts using in-sample measures

We often have several forecasts, each generated from a specific model or using a specific method. The obvious aim, at that point, is to select the model or method that generates the most accurate forecasts. One way of doing so is by selecting the model that best fits the data; that is, using in-sample goodness-of-fit measures.

R-squared and adjusted R-squared

Recall the most frequently used (and often abused) R-squared:

$$R^2 = 1 - \frac{\sum_{t=1}^{T} \hat{e}_t^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2}$$

Adjusted R-squared accounts for the loss in degrees of freedom:

$$\bar{R}^2 = 1 - \frac{\sum_{t=1}^{T} \hat{e}_t^2}{\sum_{t=1}^{T} (y_t - \bar{y})^2} \left(\frac{T-1}{T-k}\right),$$

where k denotes the number of estimated parameters.

Information criteria

The adjustment made to the R-squared might not be 'enough' to select a 'good' forecasting model, however. Information criteria penalize the loss in degrees of freedom more 'harshly' than the adjusted R-squared:

$$AIC = \ln\left(\sum_{t=1}^{T} \hat{e}_t^2\right) + \frac{2k}{T}$$

$$SIC = \ln\left(\sum_{t=1}^{T} \hat{e}_t^2\right) + \frac{k \ln T}{T}$$

Information criteria are relative measures

Things to remember about the information criteria:
- Less is better.
- Relative (not absolute) values of the criteria matter.
- SIC selects a more parsimonious model than AIC.
- The measures can be used to compare fit so long as the dependent variable is the same across the models.

Comparing forecasts using out-of-sample measures

Another way of doing this, which may be viewed as more sensible, at least from a forecaster's perspective, is by evaluating forecasts in an out-of-sample environment. Recall that models with the best in-sample fit do not necessarily produce the most accurate forecasts.

[Figure: A snapshot of multi-step-ahead El Nino forecasts]

[Figure: Historical data of one-step-ahead El Nino forecasts]

Selecting based on a forecast accuracy measure

Thus far we have implied the following "algorithm" for selecting the most accurate among available forecasts:
- Decide on a loss function (e.g., quadratic loss).
- Obtain forecasts, the forecast errors, and the corresponding sample expected loss (e.g., root mean squared forecast error) for each model or method under consideration.
- Rank the models according to their sample expected loss values.
- Select the model with the lowest sample expected loss.

Ranking of the models of El Nino forecasts

Model        MAFE    RMSFE
ECMWF        0.193   0.249
JMA          0.271   0.326
CPC MRKOV    0.297   0.362
KMA SNU      0.314   0.376
CPC CA       0.300   0.377
CSU CLIPR    0.311   0.392
LDEO         0.310   0.395
AUS/POAMA    0.321   0.417

Are they statistically significantly different?

But the loss function is a function of a random variable, and in practice we deal with sample information, so sampling variation needs to be taken into account. Statistical methods of evaluation are, therefore, desirable.

Forecast errors from two competing models

Consider a time series of length T. Suppose h-step-ahead forecasts for periods R+h through T have been generated from two competing models, denoted i and j, yielding $\hat{y}_{i,t+h|t}$ and $\hat{y}_{j,t+h|t}$, for all $t = R, \ldots, T-h$, with corresponding forecast errors $\hat{e}_{i,t+h|t}$ and $\hat{e}_{j,t+h|t}$.

Loss differential

The null hypothesis of equal predictive ability can be given in terms of the unconditional expectation of the loss differential:

$$H_0: E\left[d(\hat{e}_{t+h|t})\right] = 0,$$

where, assuming quadratic loss,

$$d(\hat{e}_{t+h|t}) = \hat{e}_{i,t+h|t}^2 - \hat{e}_{j,t+h|t}^2.$$
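Before turning to the formal test, here is a minimal Python sketch of how the accuracy measures and the loss differential might be computed. The data and variable names (yhat_i, yhat_j) are illustrative assumptions, not from the lecture:

```python
import numpy as np

def mafe(y, yhat):
    """Mean absolute forecast error."""
    return np.mean(np.abs(y - yhat))

def rmsfe(y, yhat):
    """Root mean squared forecast error."""
    return np.sqrt(np.mean((y - yhat) ** 2))

# Illustrative data: realized values and two competing forecasts.
rng = np.random.default_rng(42)
y = rng.normal(size=200)
yhat_i = y + rng.normal(scale=0.3, size=200)  # model i: smaller errors
yhat_j = y + rng.normal(scale=0.5, size=200)  # model j: larger errors

# Rank the models by sample expected loss.
for name, f in [("model i", yhat_i), ("model j", yhat_j)]:
    print(name, "MAFE:", round(mafe(y, f), 3), "RMSFE:", round(rmsfe(y, f), 3))

# Quadratic-loss differential d_t = e_i^2 - e_j^2, the input to the DM test.
d = (y - yhat_i) ** 2 - (y - yhat_j) ** 2
print("mean loss differential:", d.mean())
```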
Diebold-Mariano test

The Diebold-Mariano (DM) test relaxes the aforementioned requirements on the forecast errors. The DM test statistic is:

$$DM = \frac{\bar{d}}{\sqrt{\sigma_d^2 / P}} \sim N(0, 1),$$

where $\bar{d} = P^{-1} \sum_{t=1}^{P} d(\hat{e}_{t+h|t})$, and where $P = T - h - R + 1$ is the total number of forecasts.

Modified Diebold-Mariano test

A modified version of the DM statistic, due to Harvey, Leybourne, and Newbold (1998), addresses the finite-sample properties of the test, so that:

$$\sqrt{\frac{P + 1 - 2h + P^{-1} h(h-1)}{P}}\, DM \sim t_{P-1},$$

where $t_{P-1}$ is a Student t distribution with $P-1$ degrees of freedom.

A regression-based Diebold-Mariano test

In practice, the test of equal predictive ability can be applied within the framework of a regression model:

$$d(\hat{e}_{t+h|t}) = \delta + \upsilon_{t+h}, \quad t = R, \ldots, T-h.$$

The null of equal predictive ability is equivalent to testing $H_0: \delta = 0$. Because $d(\hat{e}_{t+h|t})$ may be serially correlated, autocorrelation-consistent standard errors should be used for inference.

Predictive accuracy of El Nino forecasts

Are the forecasts from ECMWF statistically significantly more accurate than those from JMA (the next best model)? As it turns out, the DM statistic is −2.930, which means we reject the null hypothesis of equal predictive accuracy. So, yes, ECMWF forecasts are more accurate than those of JMA.
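A sketch of the DM test with the Harvey-Leybourne-Newbold correction follows. The long-run variance is estimated here with a rectangular window over autocovariances up to lag h-1, which is one common choice but an assumption of this sketch; the function name dm_test is hypothetical:

```python
import numpy as np
from scipy import stats

def dm_test(e_i, e_j, h=1):
    """Diebold-Mariano test of equal predictive accuracy under quadratic
    loss, with the Harvey-Leybourne-Newbold small-sample correction.
    e_i, e_j : arrays of h-step-ahead forecast errors from models i and j."""
    d = e_i ** 2 - e_j ** 2          # loss differential
    P = d.size                       # number of forecasts
    d_bar = d.mean()
    # Long-run variance of d: autocovariances up to lag h-1 (an assumption).
    gamma = [np.mean((d[k:] - d_bar) * (d[: P - k] - d_bar)) for k in range(h)]
    lrv = gamma[0] + 2 * sum(gamma[1:])
    dm = d_bar / np.sqrt(lrv / P)
    # HLN finite-sample adjustment; compare against a t(P-1) distribution.
    hln = np.sqrt((P + 1 - 2 * h + h * (h - 1) / P) / P) * dm
    p_value = 2 * stats.t.sf(abs(hln), df=P - 1)
    return hln, p_value
```

A negative statistic favors model i (smaller squared errors), consistent with the ECMWF-versus-JMA example above, where the statistic of −2.930 favors ECMWF.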
Combining forecasts

By choosing the most accurate of the forecasts, we discard all others. But other forecasts may not be completely useless. They could potentially contain information that is absent in the most accurate forecast. Thus, merely selecting the best model may be a sub-optimal strategy. An optimal strategy may be using some information from all forecasts, i.e., forecast combination.

Why might we combine?

Several factors support the idea of forecast combination:
- The concept is intuitively appealing;
- The method is computationally simple;
- The outcome is surprisingly good.

How can we combine?

Consider two forecasting methods (or models), i and j, each respectively yielding h-step-ahead forecasts $\hat{y}_{i,t+h|t}$ and $\hat{y}_{j,t+h|t}$, and the associated forecast errors $\hat{e}_{i,t+h|t} = y_{t+h} - \hat{y}_{i,t+h|t}$ and $\hat{e}_{j,t+h|t} = y_{t+h} - \hat{y}_{j,t+h|t}$. A combined forecast, $\hat{y}_{c,t+h|t}$, is expressed as:

$$\hat{y}_{c,t+h|t} = (1-w)\hat{y}_{i,t+h|t} + w\hat{y}_{j,t+h|t},$$

where $0 \le w \le 1$ is a weight. Thus, a combined forecast is the weighted average of the two individual forecasts.

Mean of a combined forecast error

Likewise, a combined forecast error is the weighted average of the two individual forecast errors:

$$\hat{e}_{c,t+h|t} = (1-w)\hat{e}_{i,t+h|t} + w\hat{e}_{j,t+h|t}$$

The mean of a combined forecast error (under the assumption of forecast error unbiasedness) is zero:

$$E(e_{c,t+h}) = E\left[(1-w)\hat{e}_{i,t+h|t} + w\hat{e}_{j,t+h|t}\right] = 0$$

Variance of a combined forecast error

The variance of a combined forecast error is:

$$V(\hat{e}_{c,t+h|t}) = (1-w)^2\sigma_i^2 + w^2\sigma_j^2 + 2w(1-w)\rho\sigma_i\sigma_j,$$

where $\sigma_i$ and $\sigma_j$ are the standard deviations of the forecast errors from models i and j, and $\rho$ is the correlation between these two forecast errors.

The optimal weight

We can obtain an optimal weight, i.e., a weight that minimizes the variance of the combined forecast error, by taking the derivative of the variance of a combined forecast error and equating it to zero. Solving this first-order condition for w yields the optimal weight:

$$w^* = \frac{\sigma_i^2 - \rho\sigma_i\sigma_j}{\sigma_i^2 + \sigma_j^2 - 2\rho\sigma_i\sigma_j}$$

The optimal weight for different ρ and σj/σi

[Figure: the optimal weight for different ρ and σj/σi]

- When the two forecasts are equally accurate (or inaccurate), the optimal weight is 0.5, regardless of the correlation between the forecasts.
- The less accurate a forecast is, the smaller the weight attached to it.
- When the forecasts are highly correlated, a negative optimal weight may be attached to the less accurate forecast. This happens when $\rho > \sigma_i/\sigma_j$.
- When the forecasts are uncorrelated, the weights attached to the forecasts are inversely proportional to the variances of these forecasts:

$$w^* = \frac{\sigma_i^2}{\sigma_i^2 + \sigma_j^2} = \frac{\sigma_j^{-2}}{\sigma_i^{-2} + \sigma_j^{-2}}$$

At least as efficient as the most efficient forecast

Substitute $w^*$ in place of w in the formula for the variance to obtain:

$$V\left[\hat{e}_{c,t+h|t}(w^*)\right] = \sigma_c^2(w^*) = \frac{\sigma_i^2\sigma_j^2(1-\rho^2)}{\sigma_i^2 + \sigma_j^2 - 2\rho\sigma_i\sigma_j}$$

As it turns out, $\sigma_c^2(w^*) \le \min\{\sigma_i^2, \sigma_j^2\}$. That is, by combining forecasts we are not making things worse (so long as we use optimal weights).

[Figure: variance for different ρ and σj/σi]

Combining equally accurate forecasts

Assumption: $\sigma_i = \sigma_j = \sigma$. Suppose the individual forecasts are equally accurate; then the combined forecast error variance reduces to:

$$\sigma_c^2(w^*) = \frac{\sigma^2(1+\rho)}{2} \le \sigma^2$$

The equation shows there are diversification gains even when the forecasts are equally accurate (unless the forecasts are perfectly correlated, in which case there are no gains from combination).

Combining uncorrelated forecasts

Assumption: $\rho = 0$. Suppose the forecast errors are uncorrelated; then the sample estimator of $w^*$ is given by:

$$w^* = \frac{\sigma_i^2}{\sigma_i^2 + \sigma_j^2} = \frac{\sigma_j^{-2}}{\sigma_i^{-2} + \sigma_j^{-2}}$$

Thus, the weights attached to the forecasts are inversely proportional to the variances of these forecasts.

Combining equally accurate uncorrelated forecasts

Assumption: $\sigma_i = \sigma_j = \sigma$ and $\rho = 0$. Suppose the individual forecasts are equally accurate and the forecast errors are uncorrelated; then the sample estimator of $w^*$ reduces to 0.5, resulting in the equal-weighted forecast combination:

$$\hat{y}_{c,t+h|t} = 0.5\hat{y}_{i,t+h|t} + 0.5\hat{y}_{j,t+h|t}$$

Predictive accuracy of combined El Nino forecasts

Are the combined forecasts statistically significantly more accurate than those from ECMWF (the best model)? When applying "equal weights", the DM statistic is 1.543, which means we fail to reject the null hypothesis of equal predictive accuracy. When applying "inversely proportional weights", the DM statistic is 2.062, which means we reject the null hypothesis of equal predictive accuracy in favor of the combined forecast.
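As a complement to the formulas above, a minimal sketch of the sample analogue of $w^*$, assuming two numpy arrays of forecast errors; the function name optimal_weight is hypothetical:

```python
import numpy as np

def optimal_weight(e_i, e_j):
    """Sample analogue of w*: the variance-minimizing weight on forecast j."""
    s_i = np.std(e_i, ddof=1)
    s_j = np.std(e_j, ddof=1)
    rho = np.corrcoef(e_i, e_j)[0, 1]
    return (s_i ** 2 - rho * s_i * s_j) / (s_i ** 2 + s_j ** 2 - 2 * rho * s_i * s_j)

# Combined forecast: yhat_c = (1 - w) * yhat_i + w * yhat_j.
# With rho = 0 this collapses to inverse-variance weights, and with
# s_i = s_j it collapses to 0.5, matching the special cases above.
```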
The optimal weight in a regression setting

The optimal weight has a direct interpretation in a regression setting. Consider the combined forecast equation as:

$$y_{t+h} = (1-w)\hat{y}_{i,t+h|t} + w\hat{y}_{j,t+h|t} + \varepsilon_{t+h},$$

where $\varepsilon_{t+h} \equiv \hat{e}_{c,t+h|t}$.

We can re-arrange the equation so that:

$$\hat{e}_{i,t+h|t} = w(\hat{y}_{j,t+h|t} - \hat{y}_{i,t+h|t}) + \varepsilon_{t+h},$$

or that:

$$\hat{e}_{i,t+h|t} = w(\hat{e}_{i,t+h|t} - \hat{e}_{j,t+h|t}) + \varepsilon_{t+h}, \quad t = R, \ldots, T-h,$$

where w is obtained by estimating a linear regression with an intercept restricted to zero.

We can also estimate a variant of the combined forecast equation:

$$y_{t+h} = \alpha + \beta_i\hat{y}_{i,t+h|t} + \beta_j\hat{y}_{j,t+h|t} + \varepsilon_{t+h},$$

which relaxes the assumption of forecast unbiasedness, as well as of weights adding up to one or, indeed, of non-negative weights.

Forecast encompassing

A special case of forecast combination is when $w = 0$. Such an outcome (of the optimal weights) is known as forecast encompassing. It is said that $\hat{y}_{i,t+h|t}$ encompasses $\hat{y}_{j,t+h|t}$ when, given that the former is available, the latter provides no additional useful information.

Forecast encompassing in a regression setting

We can test for forecast encompassing by regressing the realized value on the individual forecasts:

$$y_{t+h} = \alpha + \beta_1\hat{y}_{i,t+h|t} + \beta_2\hat{y}_{j,t+h|t} + \varepsilon_{t+h},$$

and testing the null hypothesis that $\beta_2 = 0$, given that $\beta_1 = 1$. A sketch of both regressions follows the key takeaways below.

Key takeaways

- All forecasts are wrong, but some are less wrong than others. We can compare and select the more accurate of the available forecasts using the Diebold-Mariano test.
- Less accurate forecasts may still be useful. We can combine several forecasts to achieve improved accuracy.
- Not all forecasts can be useful. The forecast encompassing test allows us to "weed out" the "useless" forecasts.
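As referenced above, here is a sketch of the regression-based weight estimation and of an encompassing check. It uses statsmodels OLS with HAC standard errors; the simulated data are purely illustrative, and the sketch tests $\beta_2 = 0$ in the unrestricted regression rather than imposing $\beta_1 = 1$, which is a simplifying assumption:

```python
import numpy as np
import statsmodels.api as sm

# Illustrative (hypothetical) data: realized values and two forecasts.
rng = np.random.default_rng(7)
y = rng.normal(size=200)
yhat_i = y + rng.normal(scale=0.3, size=200)
yhat_j = y + rng.normal(scale=0.5, size=200)
e_i, e_j = y - yhat_i, y - yhat_j

# Zero-intercept regression e_i = w * (e_i - e_j) + error gives the weight w.
w_hat = sm.OLS(e_i, e_i - e_j).fit().params[0]
print("estimated combination weight:", w_hat)

# Encompassing regression: y = a + b1 * yhat_i + b2 * yhat_j + error,
# estimated with autocorrelation-consistent (HAC) standard errors.
X = sm.add_constant(np.column_stack([yhat_i, yhat_j]))
fit = sm.OLS(y, X).fit(cov_type="HAC", cov_kwds={"maxlags": 1})
# Failing to reject b2 = 0 suggests forecast i encompasses forecast j.
print("b2 t-stat:", fit.tvalues[2], "p-value:", fit.pvalues[2])
```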