Applied Econometrics Lecture Handout 9 - Cointegration and Error-Correction PDF
Document Details
University of Oulu, Oulu Business School
2023
Elias Oikarinen
Summary
This handout provides an overview of cointegration and error-correction models in applied econometrics. It covers the concept and implications of cointegration, testing for cointegration with the Engle-Granger method (and alternatives such as the Johansen ML method), FMOLS estimation, and error-correction models, illustrated with an empirical example on Finnish GDP and housing prices.
Full Transcript
Applied Econometrics
Cointegration and Error-Correction
Lecture handout 9, Autumn 2023
D.Sc. (Econ.) Elias Oikarinen, Professor (Associate) of Economics, University of Oulu, Oulu Business School

This Handout
- The concept of cointegration
- How to test for cointegration
- Implications of cointegration
- Error-correction models
- POLL: Do you readily know what cointegration means?
- Readings: Brooks, pp. 373-411; Enders, pp. 343-401; in Moodle: the articles "Explaining cointegration", parts I and II

Unit Root in Regression Model
Let's assume a simple model: yt = a0 + a1zt + et, where et is typically NOT white noise.
There are four different cases for this model:
1) Both yt and zt are stationary: OLS is a suitable estimator.
2) yt and zt are integrated of different orders [for instance, one being I(0) and the other I(1)]: the residual is not I(0), so it is not meaningful to estimate such a model.
3) yt and zt are both non-stationary, I(1)*, and the variables are not cointegrated: the model cannot be estimated with OLS, as et is non-stationary; there is a threat of spurious regression.
4) yt and zt are both non-stationary, I(1)*, and cointegrated: it is meaningful to estimate the model (even with OLS).
* More generally: both are integrated of the same order [in some, rare, cases both could be I(2)].
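To make cases 3 and 4 concrete, the following Python sketch (not from the handout; numpy and statsmodels assumed, all series simulated) regresses one random walk on another. With two independent random walks the slope t-statistic is typically "significant" even though no relationship exists, which is the spurious-regression problem; with a cointegrated pair the regression is meaningful.

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
T = 200

# Case 3: two independent random walks, no true relationship (spurious regression risk)
y_spur = np.cumsum(rng.normal(size=T))
z_spur = np.cumsum(rng.normal(size=T))

# Case 4: z is a random walk and y follows it up to a stationary error (cointegration)
z_coin = np.cumsum(rng.normal(size=T))
y_coin = 0.5 + 0.8 * z_coin + rng.normal(scale=0.5, size=T)

for label, y, z in [("independent I(1) series", y_spur, z_spur),
                    ("cointegrated I(1) series", y_coin, z_coin)]:
    res = sm.OLS(y, sm.add_constant(z)).fit()
    print(label, "slope t-statistic:", round(res.tvalues[1], 2))
# The 'independent' t-statistic is typically far beyond +/-2 even though the
# true slope is zero: this is the spurious regression problem of case 3.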
Cointegration
Engle and Granger (1987) show that in some cases the residual of the model yt = a0 + a1zt + et can be stationary even if yt and zt are I(1). (Robert Engle and Clive Granger received the economics Nobel prize in 2003.)
In other words: variables that have a unit root may have a linear combination that is stationary, i.e., the variables are cointegrated; they have a cointegrating relation.
In that case, there is a tight long-run relationship between yt and zt that we may call a long-term equilibrium relation. The series cannot wander away from the long-term relationship over the long run, and their development paths depend on each other. At least one variable reacts to deviations from the cointegrating relation by adjusting towards the long-term equilibrium relationship.
In such a case, significant information on the long-term relationship (and hence on long-term dynamics) is lost if the series are differenced.

Granger (1986): "At the least sophisticated level of economic theory lies the belief that certain pairs of economic variables should not diverge from each other by too great an extent, at least in the long run. Thus, such variables may drift apart in the short run or according to seasonal factors, but if they continue to be too far apart in the long run, then economic forces, such as the market mechanism or government intervention, will begin to bring them together again."

Cointegration: Implications
An intuitive and widely used concept in time series modelling, applicable to numerous different research questions. Cointegration has several important implications concerning e.g.:
- Predictability
- Long-term dynamics
- Optimal portfolio allocation
For instance:
- Integration across various asset markets or goods markets
- Long-term risk diversification: cointegration between the return indices of two different assets indicates that over the long horizon their return correlation tends towards one, so long-horizon diversification benefits are considerably weaker than the contemporaneous correlation would suggest
- Cointegration analysis can be used to test theories such as PPP

Examples of Potential Cointegration
Financial theory should suggest where two or more variables would be expected to hold some long-run relationship with one another. There are many examples, e.g.:
- Spot and futures prices for a given commodity or asset
- The ratio of relative prices and an exchange rate
- Equity prices and dividends
In all three cases, market forces arising from no-arbitrage conditions suggest that there should be an equilibrium relationship between the series concerned (for equity prices and dividends there are some 'buts', though).
Other examples of variables that may be pairwise cointegrated:
- Two different interest rates (e.g. 12-month and 3-month Euribor)
- Price levels in two different countries (corrected for the exchange rate)
- Stock price (or return) indices of two different stock exchanges
- Housing prices in two nearby areas or for two different dwelling types

Cointegrating Vector & Equilibrium Error
Let's assume a simple pairwise case: two I(1) variables, yt and zt, that have a stationary linear combination.
Long-term equilibrium relationship: β0 + β1yt + β2zt = 0
Cointegrating vector: β = (β0, β1, β2)
Equilibrium error: et = β0 + β1yt + β2zt (= deviation from the long-term relationship)
The equilibrium relation can also be presented as yt = -(β0 + β2zt)/β1 = β0* + β2*zt, so that et = yt - β0* - β2*zt.
For instance: zt = income level and yt = consumption, or yt = 12-month Euribor and zt = 1-month Euribor.

More generally: there are n I(1) variables that have a stationary linear combination.
Long-term equilibrium relationship: β0 + β1x1t + β2x2t + ... + βnxnt = 0
Cointegrating vector: β = (β0, β1, β2, ..., βn)
Equilibrium error: et = β0 + β1x1t + β2x2t + ... + βnxnt
Cointegration (a stationary combination among the variables) means that the error term et, i.e. the equilibrium error, is stationary. The system is in equilibrium when et = 0.
Cointegration can also take place with I(2) variables, but the I(1) case is the typical one and the one considered in this course.

Cointegrating Vector
Typically the cointegrating vector is normalized w.r.t. one of the variables, that is, the coefficient of a given variable is set to 1.
For instance, in the pairwise case yt = -(β0 + β2zt)/β1 = β0* + β2*zt, we normalize w.r.t. yt so that
- Cointegrating relation: yt - β0* - β2*zt = 0
- Cointegrating vector: (-β0*, 1, -β2*)
It makes sense to normalize w.r.t. a variable that adjusts towards the relationship (i.e. that reacts to deviations from the relation).
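As a small illustration (again a simulated sketch, not from the handout; the coefficient values are made up), the equilibrium error implied by a normalized cointegrating vector is simply the deviation series et = yt - β0* - β2*zt, which should be stationary under cointegration:

import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(1)
T = 300
z = np.cumsum(rng.normal(size=T))                    # e.g. log income, an I(1) series
y = 1.0 + 0.9 * z + rng.normal(scale=0.3, size=T)    # e.g. log consumption

beta0_star, beta2_star = 1.0, 0.9                    # assumed (known) normalized coefficients
e = y - beta0_star - beta2_star * z                  # equilibrium error

# No deterministic terms in the test equation, since e should have zero mean.
# Standard DF critical values are fine here only because the vector is assumed known, not estimated.
adf_stat, pvalue, *_ = adfuller(e, regression="n")
print(f"ADF statistic on the equilibrium error: {adf_stat:.2f} (p = {pvalue:.3f})")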
Testing for Cointegration: Engle-Granger Method
Engle & Granger (1987). The method suits a case where there can be only one equilibrium relationship among the variables (or the aim is to study or estimate the equilibrium relationship of one particular variable, which then is the LHS variable).
1. Test the order of integration of each variable. Cointegration requires that at least two variables are I(1); otherwise it is not sensible to test for cointegration.
2. If there are at least two I(1) variables, a regression is estimated; in a bivariate system: yt = β0 + β1zt + et. If yt and zt are cointegrated (so that et is stationary), an OLS regression yields a "super-consistent" estimator of the cointegrating parameters β0 and β1; Stock (1987) proves that the OLS estimates converge faster than they do in OLS models using stationary variables.
3. Investigate the stationarity of et. If the residual series et is stationary, yt and zt are cointegrated and the estimated relationship can be regarded as a stable long-term relation. Residual stationarity is tested with the DF test, or the ADF test if et is autocorrelated (which it usually is):
- The tested equation should not include deterministic variables, since et should have zero mean.
- If the null hypothesis a1 = 0 cannot be rejected, we conclude that et is non-stationary, in which case yt and zt are not cointegrated.
- Rejection of the null hypothesis implies that et is stationary, and yt and zt are cointegrated.
Since we are testing residuals from an estimated regression model, the critical values presented in the Dickey-Fuller tables used in basic unit root tests are not suitable. Engle and Yoo (1987) provide a critical value table for the Engle-Granger test, covering systems of 2-5 variables and a sample of 100 observations at the 1%, 5% and 10% levels of significance [the table itself is not reproduced in this transcript]. MacKinnon (1996) presents response-surface coefficients for computing the critical values.

Engle-Granger Method: Empirical Example
Finnish GDP and housing price index; both are I(1).
[Figure: log housing price index (hp) and log GDP, quarterly, 1985-2020]
Theoretically, the estimated long-run relation makes sense: housing prices increase by 0.8% over the long run when GDP increases by 1%.
[Figure: equilibrium error EQE_HP_GDP, 1985-2020]
From the Engle-Yoo tables: the residual test statistic is significant at roughly the 6% level (the 5% critical value is -3.37). (POLL)
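A minimal sketch of the three Engle-Granger steps in Python (statsmodels). The handout's example uses EViews and Finnish data; here synthetic hp and gdp series stand in for the real ones, so the numbers are purely illustrative.

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller, coint

# Stand-in data: a synthetic I(1) 'gdp' and a cointegrated 'hp'
# (replace with the actual log series when reproducing the example).
rng = np.random.default_rng(2)
gdp = pd.Series(np.cumsum(rng.normal(0.005, 0.01, 140)), name="gdp")
hp = (-3.0 + 0.8 * gdp + rng.normal(0, 0.02, 140)).rename("hp")

# Step 1: each variable should itself be I(1): unit root in levels, none in first differences.
for s in (hp, gdp):
    print(s.name, "level p:", round(adfuller(s)[1], 3),
          "| first-difference p:", round(adfuller(s.diff().dropna())[1], 3))

# Step 2: static OLS of the candidate long-run relation hp_t = b0 + b1*gdp_t + e_t.
step2 = sm.OLS(hp, sm.add_constant(gdp)).fit()
eqe = step2.resid                       # estimated equilibrium error

# Step 3: unit-root test on the residual. Standard ADF critical values are not valid for
# estimated residuals; coint() applies the MacKinnon (1996) response-surface values instead.
t_stat, pvalue, _ = coint(hp, gdp, trend="c")
print("Engle-Granger statistic:", round(t_stat, 2), " p-value:", round(pvalue, 3))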
Fully-Modified OLS Estimation
(EViews UG II, pp. 269-275)
It is well known that if the series are cointegrated, OLS estimation ('static OLS') of the cointegrating vector is consistent, converging at a faster rate than is standard. However, static OLS (SOLS) has some shortcomings that can cause bias in the estimates, especially due to the long-run correlation between the cointegrating equation errors and the innovations of the stochastic regressors.
Hence, it is recommended to use estimation techniques that are specifically aimed at cointegrating regressions. These include (among others):
- Fully-Modified OLS (FMOLS; Phillips and Hansen, 1990)
- Dynamic OLS (DOLS; Saikkonen, 1992; Stock and Watson, 1993)
Phillips and Hansen (1990) show that the FMOLS estimator removes the bias and performs well even in small samples when doing inference on a cointegrated system. It also allows standard (Wald) F-tests.
NOTE: the cointegration test should be based on the (S)OLS estimation residual.

FMOLS Estimation: Empirical Example
[EViews FMOLS output for the HP-GDP relation; annotations from the slide:]
- The standard F-test can be used to test coefficient restrictions.
- The estimated long-run coefficient is slightly greater now than with static OLS.
- The cointegration test gives exactly the same test value, but with the correct p-value provided (NOTE: the reported test statistic is automatically based on the static OLS estimate).
- On cointegration tests with EViews, see UG II, pp. 282-.

Error-Correction Model
If variables are cointegrated, at least one of them adjusts towards the cointegrating relationship, i.e. reacts to deviations from the long-run relation. Error correction means adjustment towards the long-run equilibrium of the system; in other words, correction of the equilibrium error through the dynamics of at least one variable. Cointegration analysis enables testing for cointegration among the variables.
The dynamics of cointegrated variables can be presented with error-correction models; in the simplest, i.e. two-variable, case:
Δyt = α1 + αy(yt-1 - β1zt-1) + εyt
Δzt = α2 + αz(yt-1 - β1zt-1) + εzt
yt-1 - β1zt-1 is the equilibrium error in period t-1 (note: here the constant terms α1 and α2 cater for the constant of the cointegrating relationship as well).
In practice, one can also include the constant term in the equilibrium error, yt-1 - β1zt-1 - β0, in which case the equilibrium error in brackets is simply the residual from the estimated regression.
Together, the two equations form a vector error-correction model (VECM).
The error-correction model (ECM), or VECM, enables us to take into account the influence of the long-term cointegrating relationship on the dynamics of the variable(s), including forecasts (vs. a basic ARIMAX or VAR in differences). It enables us to keep the information contained in the levels of the variables.
In the Engle-Granger method, yt-1 - β0 - β1zt-1 is estimated in the first step. εyt and εzt are white noise error terms.
αy and αz are the error-correction coefficients that indicate the adjustment speeds of y and z towards the equilibrium relation: αi shows how large a share of the equilibrium error disappears during a period due to the adjustment (i.e. "error correction") of variable i.
If αy = αz = 0, y and z are not cointegrated, and the system reduces to a conventional VAR model in differences. In a cointegrated system, at least one variable needs to adjust towards the equilibrium, i.e. either αy < 0 or αz > 0, or both.
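Continuing the synthetic sketch above (hp, gdp and the first-step residual eqe), the two ECM equations of the bivariate VECM can be estimated equation by equation with OLS, using the lagged first-step residual as the error-correction term. This is an illustrative sketch of the Engle-Granger second step, not the handout's own code.

import pandas as pd
import statsmodels.api as sm

# Continuing the sketch above (hp, gdp and the first-step residual eqe).
d = pd.DataFrame({"dhp": hp.diff(), "dgdp": gdp.diff(),
                  "eqe_lag": eqe.shift(1),
                  "dhp_lag": hp.diff().shift(1),
                  "dgdp_lag": gdp.diff().shift(1)}).dropna()

# One OLS equation per variable; together the two equations form the VECM.
for dep in ("dhp", "dgdp"):
    rhs = sm.add_constant(d[["eqe_lag", "dhp_lag", "dgdp_lag"]])
    res = sm.OLS(d[dep], rhs).fit()
    print(dep, "error-correction coefficient:", round(res.params["eqe_lag"], 3))
# A negative coefficient in the dhp equation (alpha_y < 0) and/or a positive one in the
# dgdp equation (alpha_z > 0) indicates adjustment towards the long-run relation.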
Error-Correction Model
- Why αy < 0? Because yt-1 - β0 - β1zt-1 (= êt-1) > 0 means that yt-1 is greater than its equilibrium value; hence yt needs to decrease for the error to correct.
- A greater absolute value of αy means faster adjustment towards the cointegrating relation and thereby more rapid equilibrium correction. The adjustment parameters should be smaller than one in absolute value.
- Similarly, αz > 0 if zt adjusts towards the long-run relation.
- The error-correction mechanism in the equation for yt is αy·êt-1 = αy(yt-1 - β0 - β1zt-1).
- If αy = 0, y is weakly exogenous, i.e., y cannot be predicted with the equilibrium error and does not adjust towards the CI relation. If αz = 0, z is weakly exogenous.
An ECM is a one-equation model, and it can be estimated in a "flexible" form:
- The RHS variables can also include (stationary) variables other than those in the CI vector.
- It does not need to include lags of the cointegrated variables.
- The number of lags can be different for different variables.

Error-Correction Model: An Example
Often an ECM is estimated for one variable only, e.g. for forecasting purposes or to study the dynamics of that variable. Here, an ECM for the housing price change is based on the CI relation between HP and GDP. AIC and/or SBC can be used to select the RHS variables to be included.
POLL: Do the models make sense in terms of error-correction? Which one is preferred?

Model 1. Dependent variable: DP; Method: Least Squares; Sample (adjusted): 1985Q3 2020Q1; Included observations: 139; Huber-White-Hinkley (HC1) heteroskedasticity-consistent standard errors and covariance.

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -0.003343     0.002033     -1.644575     0.1024
EQE(-1)         -0.037899     0.014020     -2.703257     0.0077
D(P(-1))         0.673796     0.119343      5.645852     0.0000
D(LOANS(-1))     0.311760     0.127355      2.447952     0.0157

R-squared 0.625556; Adjusted R-squared 0.617235; S.E. of regression 0.016758; Sum squared resid 0.037910; Mean dependent var 0.004175; S.D. dependent var 0.027086; Akaike info criterion -5.311586; Schwarz criterion -5.227141

Model 2. Same sample and estimator, adding D(Y(-1)).

Variable        Coefficient   Std. Error   t-Statistic   Prob.
C               -0.003246     0.002139     -1.517859     0.1314
EQE(-1)         -0.038260     0.014238     -2.687146     0.0081
D(P(-1))         0.678396     0.120937      5.609520     0.0000
D(Y(-1))        -0.027329     0.103647     -0.263670     0.7924
D(LOANS(-1))     0.312280     0.127564      2.448026     0.0157

R-squared 0.625698; Adjusted R-squared 0.614525; S.E. of regression 0.016817; Sum squared resid 0.037895; Mean dependent var 0.004175; S.D. dependent var 0.027086; Akaike info criterion -5.297578; Schwarz criterion -5.192021

Note (from the slide): the specification includes a variable that is not present in the CI vector or the VECM.
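A rough Python analogue of the first specification above, again continuing the synthetic example (dloans is a made-up stand-in for loan-stock growth; the real example uses Finnish data in EViews):

import numpy as np
import pandas as pd
import statsmodels.api as sm

# Continuing the synthetic example; 'dloans' is a made-up stand-in for loan-stock growth.
rng = np.random.default_rng(3)
dp = hp.diff()
dloans = pd.Series(rng.normal(0.01, 0.01, len(hp)), index=hp.index)

X = pd.DataFrame({"eqe_lag": eqe.shift(1),
                  "dp_lag": dp.shift(1),
                  "dloans_lag": dloans.shift(1)}).dropna()
y = dp.loc[X.index]

# Huber-White (HC1) standard errors, as in the slide's EViews output.
ecm = sm.OLS(y, sm.add_constant(X)).fit(cov_type="HC1")
print(ecm.params)          # the coefficient on eqe_lag is the error-correction term
print(ecm.aic, ecm.bic)    # AIC / SBC for comparing alternative right-hand-side sets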
Error-Correction Model: An Example
Usual diagnostic checks:
- Some remaining autocorrelation in the residuals: maybe one should add e.g. an AR(2) component. [Residual correlogram, 8 lags, not reproduced; the Q-statistics are borderline significant at some lags, e.g. Q(2) = 7.20 (p = 0.03) and Q(4) = 10.66 (p = 0.03).]
- Clearly heteroscedastic: estimation with "Huber-White" standard errors; one could possibly add a GARCH model. ARCH LM test (4 lags): F-statistic 6.49 (Prob. F(4,130) = 0.0001), Obs*R-squared 22.48 (Prob. Chi-Square(4) = 0.0002).
- Non-normal residuals: t-tests (p-values) should be taken with caution. Residual skewness 0.61, kurtosis 8.99, Jarque-Bera 216.47 (p = 0.0000).

Error-Correction Model: An Example
In the same manner as with ARMA models, one can compute forecasts and compare forecasting accuracy. In the specification below, seasonal dummies are included and the RHS variables were selected based on SBC.

Dependent variable: D(HP); Method: Least Squares; Sample (adjusted): 1985Q3 2020Q1; Included observations: 139; Huber-White-Hinkley (HC1) heteroskedasticity-consistent standard errors and covariance.

Variable          Coefficient   Std. Error   t-Statistic   Prob.
C                 -0.006847     0.003527     -1.941048     0.0544
DHP(-1)            0.629208     0.108084      5.821472     0.0000
DGDP(-1)           0.193030     0.091608      2.107130     0.0370
DLOAN(-1)          0.317865     0.132003      2.408023     0.0174
EQE_HP_GDP(-1)    -0.030049     0.011580     -2.594857     0.0105
@QUARTER=1         0.002796     0.007886      0.354598     0.7235
@QUARTER=2         0.021168     0.006746      3.137685     0.0021
@QUARTER=3        -0.010136     0.007541     -1.344181     0.1812

R-squared 0.599181; Adjusted R-squared 0.577763; S.E. of regression 0.018438; Sum squared resid 0.044534; Mean dependent var 0.004204; S.D. dependent var 0.028375; Akaike info criterion -5.092993; Schwarz criterion -4.924102

[Figure: actual, fitted and residual series of the model, 1990-2020]
Slightly slower adjustment now: the error-correction coefficient of about -0.03 per quarter implies roughly 12% per year.
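The forecast comparison mentioned above can be sketched as a simple pseudo out-of-sample exercise: re-estimate the model on an expanding window, form one-step-ahead forecasts of dp, and compare root mean squared errors against a plain AR(1) in differences. This design is an illustration, not the handout's procedure; it continues the synthetic example (dp, eqe) from the earlier sketches.

import numpy as np
import pandas as pd
import statsmodels.api as sm

def one_step_rmse(y, X, start=100):
    # Expanding-window one-step-ahead forecasts; the model is re-estimated at each step.
    errors = []
    for t in range(start, len(y)):
        fit = sm.OLS(y.iloc[:t], X.iloc[:t]).fit()
        forecast = np.asarray(fit.predict(X.iloc[t:t + 1]))[0]
        errors.append(y.iloc[t] - forecast)
    return float(np.sqrt(np.mean(np.square(errors))))

# ECM-style regressors (with the error-correction term) vs. a plain AR(1) in differences.
X_ecm = sm.add_constant(pd.DataFrame({"eqe_lag": eqe.shift(1), "dp_lag": dp.shift(1)})).dropna()
X_ar = sm.add_constant(pd.DataFrame({"dp_lag": dp.shift(1)})).dropna()
y_ecm, y_ar = dp.loc[X_ecm.index], dp.loc[X_ar.index]

print("ECM one-step RMSE  :", one_step_rmse(y_ecm, X_ecm))
print("AR(1) one-step RMSE:", one_step_rmse(y_ar, X_ar))
# If the series really are cointegrated, keeping the error-correction term should
# help forecasting, especially at longer horizons.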
Error-Correction Model: An Example
- We might want to include a 2nd lag of dhp to remove the remaining residual autocorrelation; the residuals are again heteroscedastic and non-normal.
- [Residual correlogram, 12 lags, not reproduced; Q-statistic probabilities are adjusted for 7 dynamic regressors and may not be valid for this equation specification; the lag-2 statistic is borderline (p = 0.052).]
- Breusch-Godfrey serial correlation LM test (null hypothesis: no serial correlation at up to 4 lags): F-statistic 3.866 (Prob. F(4,127) = 0.0054), Obs*R-squared 15.088 (Prob. Chi-Square(4) = 0.0045).

Engle-Granger Method: Complications
The Engle-Granger method is intuitive and easy to implement, but it has some shortcomings:
- Conclusions may depend on the selection of the LHS variable in the regression equation.
- It can estimate at most one cointegrating relationship between the variables.
- It relies on a two-step estimator: the tested coefficient is obtained by estimating a regression using the residuals from another regression, so any error from the first-stage regression is carried into the ADF test (which is conducted on the first-stage residual).

Other Possible CI Tests
- Phillips-Ouliaris (1990): a residual-based test, same basic approach as in the Engle-Granger method.
- Hansen instability test (1992): a residual-based test, same basic approach as in the Engle-Granger method.
- Johansen Maximum Likelihood (ML) method.
All are readily available e.g. in EViews (UG II, pp. 282-291). Regardless of the testing method, the selection of the start and end dates of the sample period may affect the conclusions if the sample period is relatively short.

Johansen ML Method*
- A more "sophisticated" approach; very commonly used, especially if there are more than two non-stationary variables.
- A system of variables can include more than one CI vector: if there are n stochastic variables, there can be at most n-1 cointegrating relations. The number of CI vectors is called the cointegration rank.
- The Johansen (1988) Maximum Likelihood (ML) method:
  - can estimate and test for the presence of multiple cointegrating vectors;
  - circumvents the use of two-step estimators;
  - can be used to test restrictions on the CI vector(s) [testing whether some variable(s) can be removed from a CI vector] and on the speed-of-adjustment parameters (testing for weak exogeneity of a variable);
  - has greater power in cointegration testing than the E-G method.
- Detailed information on the tests and estimation: Brooks, pp. 386-; Enders, pp. 373- (pp. 389- for an illustration of the method); EViews UG II, pp. 1023-.
- The testing procedure is based on a vector error-correction (VEC) form.
- Testing for the number of cointegrating vectors:
  - Trace test: a joint test where the null is that the number of cointegrating vectors is less than or equal to r; if rejected, it indicates that there are more than r CI vectors.
  - Max test (maximum eigenvalue test): has as its null hypothesis that the number of CI vectors is r, against an alternative of r + 1.
- Issues with the tests:
  - Lag length selection: information criteria; non-autocorrelated residuals (LM test; portmanteau test).
  - Which deterministic variables to include: the typical case is a constant included in the CI vector and in the tested VECM (this is the default option in EViews).
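For completeness, a hedged sketch of the Johansen procedure in Python (statsmodels), applied to the synthetic hp and gdp series from the earlier sketches. The handout works in EViews; the det_order and deterministic settings below are illustrative choices that roughly correspond to the "constant in the CI vector" case.

import pandas as pd
from statsmodels.tsa.vector_ar.vecm import VECM, select_coint_rank

data = pd.concat([hp, gdp], axis=1)      # the synthetic series from the earlier sketches

# Trace test for the cointegration rank; k_ar_diff is the number of lagged differences
# in the VECM, and det_order = 0 includes a constant term.
rank_test = select_coint_rank(data, det_order=0, k_ar_diff=1, method="trace", signif=0.05)
print("Selected cointegration rank:", rank_test.rank)

# Estimate the VECM for the selected rank; deterministic="ci" puts the constant
# inside the cointegrating relation.
rank = rank_test.rank if rank_test.rank > 0 else 1   # fall back to 1 just for the illustration
vecm_res = VECM(data, k_ar_diff=1, coint_rank=rank, deterministic="ci").fit()
print("beta (cointegrating vector):", vecm_res.beta.ravel())
print("alpha (adjustment speeds):  ", vecm_res.alpha.ravel())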