Applied Econometrics: VAR Models, Lecture Handout 10, Autumn 2022

D.Sc. (Econ.) Elias Oikarinen
Professor (Associate) of Economics
University of Oulu, Oulu Business School

This Handout
• Basics of vector autoregressive (VAR) models
• Multivariate & multiequation time series analysis
• Additional readings marked with *
• Brooks, pp. 326-352
• Enders, pp. 285-335 (for this course, the relevant pages are 285-313; the other parts are also recommended for an interested student)
• EViews UG II, Chapter 40
[p. 2]

Overview of VAR Models
⚫ Vector autoregressive model; Sims (1980); the "original" article is included on the course webpages
⚫ At least two stochastic variables that are assumed to interact or have predictive power w.r.t. each other
⚫ System of two or more equations
⚫ Same explanatory (i.e. RHS) variables in each estimated equation
⚫ Christopher A. Sims: Nobel Prize for Economics in 2011
  ⚫ Illustrates the great importance of VAR modelling in economics!
⚫ A full course could be lectured on VAR models alone!
⚫ An interested student can find much more information e.g. in Enders, also Brooks, and many other textbooks & articles
[p. 3]

Structural Form VAR Model
⚫ Structural form of the simplest VAR model, i.e.
a model with two stochastic variables and lag length 1 (1st order VAR):
  yt = b10 – b12zt + γ11yt-1 + γ12zt-1 + εyt
  zt = b20 – b21yt + γ21yt-1 + γ22zt-1 + εzt
⚫ Assumptions:
  ⚫ yt and zt are stationary
  ⚫ Error terms / disturbance terms εyt and εzt are white noise processes with variances σy² and σz² and E(εit) = 0
  ⚫ εyt and εzt are uncorrelated
⚫ System includes feedback, since yt and zt affect each other
⚫ Error terms εyt and εzt are "innovations" or "shocks" to yt and zt
⚫ εyt has a contemporaneous effect on zt, if b21 ≠ 0
⚫ εzt has a contemporaneous impact on yt, if b12 ≠ 0
[p. 4]

Reduced Form VAR Model
⚫ In the structural form VAR, yt and zt affect each other simultaneously → OLS estimates would suffer from simultaneous equation bias, since the regressors and the error terms would be correlated
⚫ Fortunately, it is possible to transform the system of equations into a more usable form:
  yt = a10 + a11yt-1 + a12zt-1 + e1t
  zt = a20 + a21yt-1 + a22zt-1 + e2t
  ⚫ No simultaneity problem, as the RHS includes only lagged (and thereby pre-determined) variables
⚫ This is the standard form, or reduced form, VAR model
⚫ The reduced form model is the one that we estimate
⚫ For a formal derivation of the reduced form equations, see e.g. Enders, pp. 285-286
[p. 5]

Reduced Form VAR Model
⚫ The reduced form residuals e1t and e2t are composites of the two shocks (εyt and εzt):
  e1t = (εyt – b12εzt) / (1 – b12b21)
  e2t = (εzt – b21εyt) / (1 – b12b21)
⚫ Since εyt and εzt are white-noise processes, it follows that both e1t and e2t have zero means and constant variances and are individually serially uncorrelated
⚫ In general, e1t and e2t will be correlated
⚫ In the special case where b12 = b21 = 0 (i.e., if there are no contemporaneous effects of yt on zt and zt on yt), e1t and e2t are uncorrelated
⚫ These issues are also formally demonstrated in Enders
⚫ VAR stationarity conditions are presented in Enders, pp.
287-288
[p. 6]

Advantages of VAR Models
⚫ The researcher does not need to specify which variables are endogenous or exogenous – all are endogenous
⚫ VARs allow the value of a variable to depend on more than just its own lags, so VARs are more flexible and include more features than univariate AR(MA) models
⚫ It is possible to simply use OLS separately on each equation (since all RHS variables are pre-determined in the reduced form model)
⚫ VARs often work well as forecast models
⚫ Allow for detailed and versatile analysis of economic dynamics
⚫ Some more details on the advantages: Brooks pp. 328-329
[p. 7]

Problems of VAR Models
⚫ VARs are atheoretical (as are ARMA models), since they use little theoretical information about the relationships between the variables to guide the specification of the model
  ⚫ However, one should base the selection of variables and some other key issues in VAR analysis, to be discussed later on, on theoretical considerations (unless the aim is just to have a forecasting model)
⚫ The questions / problems discussed in the next two slides
⚫ Some more details on the complications: Brooks pp. 329-330
[p. 8]

VAR Model Estimation
⚫ In the estimated (i.e. reduced form) model, all RHS variables are lagged → equations can be estimated separately with OLS
⚫ Key questions in the estimation:
  ⚫ What variables to include?
  ⚫ What is the lag length?
  ⚫ Any necessary transformation in the variables (e.g. differencing)?
[p. 9]

VAR Model Estimation: Variables
⚫ The number of variables is typically relatively small → the number of estimated parameters does not become very large → the degrees of freedom (dof) do not get overly small
⚫ One should be able to justify logically the variables to be included
⚫ The variables to be included depend on the aim of the model
  ⚫ For instance, when examining questions w.r.t.
monetary policy, the model should typically include at least an interest rate variable, the inflation rate and GDP
⚫ Sims (1980) and some other econometricians suggest that the variables should not be differenced even if they are nonstationary, since:
  ⚫ VAR analysis aims to investigate relationships among variables and not to provide accurate estimates for individual parameters
  ⚫ Information is lost when data are differenced
⚫ The majority of econometricians think that the variables should be stationary
⚫ Solution for keeping the information on long-run relationships in a stationary model: vector error-correction model (VECM)
[p. 10]

Lag Length Selection
⚫ In VAR models, long lag lengths quickly consume degrees of freedom
⚫ For instance, a 3-variable 1st order VAR model (in reduced form):
  yt = a10 + a11yt-1 + a12zt-1 + a13kt-1 + e1t
  zt = a20 + a21yt-1 + a22zt-1 + a23kt-1 + e2t
  kt = a30 + a31yt-1 + a32zt-1 + a33kt-1 + e3t
  Corresponding 2nd order VAR:
  yt = a10 + a11yt-1 + a12zt-1 + a13kt-1 + a14yt-2 + a15zt-2 + a16kt-2 + e1t
  zt = a20 + a21yt-1 + a22zt-1 + a23kt-1 + a24yt-2 + a25zt-2 + a26kt-2 + e2t
  kt = a30 + a31yt-1 + a32zt-1 + a33kt-1 + a34yt-2 + a35zt-2 + a36kt-2 + e3t
⚫ With lag length p and n equations, each equation includes n*p lagged RHS variables + an intercept (unless it is excluded)
  ⚫ Each equation contains np + 1 coefficients to be estimated
  ⚫ The VAR model contains overall n*(np + 1) estimated coefficients
[p. 11]

Lag Length Selection
⚫ Appropriate lag length selection can be critical:
  ⚫ If p is too small, the model is misspecified
  ⚫ If p is too large, degrees of freedom are wasted
⚫ Typical procedure: begin with the longest plausible lag length or the longest feasible lag length given degrees-of-freedom considerations, and remove insignificant lags (starting from the longest one) concurrently from all equations
  ⚫ In monthly data a suitable starting point is often e.g.
12 lags – in that case we would assume that a one-year lag is enough to capture the dynamics of the system
  ⚫ In quarterly data, the starting point is often 4 lags; it could also be e.g. 8 lags
  ⚫ If the number of observations (nobs) is relatively small (and especially if there are many equations in the model), one needs to consider the degrees of freedom when selecting the lag length
[p. 12]

Lag Length Selection: LR Test
⚫ The likelihood ratio (LR) test can be used to select the lag length
  ⚫ More detailed discussion (including e.g. formal computation of the test statistic) in Brooks, pp. 330-331 & Enders, pp. 303-305
⚫ Null hypothesis: the last w lags in the model are jointly insignificant, and can thereby be removed from the model (i.e. from all equations)
⚫ For instance: the longest plausible lag length is assumed to be 12, and we wish to test whether lags 9-12 could be removed
  ⚫ Estimate separately both the 12-lag VAR and the 8-lag VAR, and compare these models
  ⚫ By restricting to 8 lags, the number of estimated coefficients drops by 4*n in each equation: e.g. in a 2-variate model altogether by 2*(4*2) = 16, in a 3-variate model by 3*(4*3) = 36 etc.
  ⚫ As the aim is to test whether dropping lags 9-12 is suitable for every equation in the system, an F-test on separate equations does not work
  ⚫ If the null hypothesis is not rejected, lags 9-12 can be removed from the model
[p. 13]

Lag Length Selection: LR Test
⚫ The test statistic asymptotically has a χ² distribution, the degrees of freedom in the test being the same as the number of tested restrictions (i.e. n*n*4 in our example)
⚫ If the null hypothesis is rejected, the tested/restricted lags cannot be removed from the model; if the test does not reject, the lags can be removed
⚫ Note: the same sample period and effective sample size have to be used when comparing the models with different lag lengths
  ⚫ With a lag length of 12, the first 12 observations of the overall sample are needed for the prediction of the 13th observation - i.e.
they are lost from the effective sample
  ⚫ This needs to be considered when estimating the models to be compared in the LR test
  ⚫ EViews does this in the correct manner automatically
[p. 14]

Lag Length Selection: LR Test
⚫ Complication with the LR test: it may give a different conclusion depending on the testing approach
  ⚫ For instance: it accepts the restrictions p = 12 → p = 8 and then p = 8 → p = 4, but rejects restricting directly p = 12 → p = 4
⚫ The LR test is based on asymptotic theory, which may not be very useful in the small samples available for time series
⚫ It also assumes that the errors from each equation are normally distributed
⚫ Moreover, the LR test is only applicable when one model is a restricted version of the other
[p. 15]

Lag Length Selection: Info Criteria
⚫ Alternative test criteria are the multivariate versions of the AIC and SBC:*
  AIC = T ln|Σ| + 2N
  SBC = T ln|Σ| + N ln(T)
  T = number of usable observations
  |Σ| = determinant of the variance/covariance matrix of the residuals
  N = total number of parameters estimated in the whole model = n*(n*p) + n, assuming that there is only one deterministic variable in the equations (i.e. the constant, intercept)
⚫ Smaller info criteria values are preferred
⚫ Adding additional regressors will reduce ln|Σ| at the expense of increasing N
⚫ The sample period and effective sample size need to be the same in all models that are compared
⚫ The info criteria do not require residual normality
⚫ If residuals are autocorrelated, it may be necessary to include more lags than suggested by e.g. SBC
* In software packages, these are sometimes reported a bit differently, see Enders, p. 305
[p. 16]

Diagnostic Checks
⚫ Residuals should not exhibit autocorrelation; can be tested by:
  ⚫ Lagrange multiplier (LM) test: preferred when testing relatively short lag autocorrelations; thereby good especially for relatively low frequency data
  ⚫ Portmanteau test (comparable to the Ljung-Box test for univariate models) is better for testing for autocorrelation at long lag lengths; thereby suits high frequency (e.g.
daily) data particularly well
  ⚫ Tests can be conducted separately for each equation, or for all equations at the same time
⚫ Residual normality can be tested with the JB test: each equation separately, or all equations simultaneously (with the multivariate JB test)
⚫ White heteroscedasticity test (for instance)
⚫ Heteroscedasticity or non-normality of residuals is not a serious problem in VAR models, if the main aim is to investigate the linear interactions / interdependences among the included variables (as it typically is)
⚫ A GARCH dimension can be included in VAR models
[p. 17]

Empirical Example
⚫ Let's consider a simple model of interaction between housing prices, housing loans, and the overall economy
⚫ Data for the Finnish market: 1985Q1 to 2020Q1, not seasonally adjusted
⚫ Variables to be included:
  ⚫ Real GDP – I(1)
  ⚫ Real housing price index – I(1)
  ⚫ Housing loan-to-GDP ratio – I(1)
  ⚫ Real after-tax mortgage interest rate – I(1)
⚫ In natural logs except for the interest rate
[p. 18]

Empirical Example: Graphs
⚫ In the VAR window: View – Endogenous graph (apparently clear seasonality in dGDP and dMR)
[p. 19]

Empirical Example: Lag Length
⚫ VAR with differenced variables, as the differences are stationary
⚫ EViews: Quick – Estimate VAR
⚫ In the VAR equation box: View – Lag length criteria
⚫ In the VAR equation box: View – Lag exclusion test (LR test for lag exclusion; note, you need to estimate the VAR with 6 lags to get this)
[p. 20]

Empirical Example: Lag Length
⚫ Long lag length suggested by all criteria: maybe seasonal variation?
⚫ Indeed, when seasonal dummies are added, fewer lags are needed, and all the info criteria prefer a model with the dummies → we continue with a VAR that includes the three seasonal dummies
⚫ We could also test for the significance of the dummies with the LR test (not readily available in EViews)
⚫ As often, different criteria suggest different lag lengths
[p. 21]

Empirical Example: Diagnostic Checks
⚫ This is what the model with one lag looks like…
⚫ …but the LM test rejects the hypothesis of no autocorrelation → need to include more lags (POLL)
• Note: there are a large number of insignificant parameters, but that is not a problem in VAR models
[p. 22]

Granger causality
⚫ Enders: pp. 305-307
⚫ Granger causality does not imply actual causality!
  ⚫ Rather, Granger "causality" is about a predictive relationship
⚫ If zt Granger causes yt → zt has predictive power w.r.t. yt
⚫ If zt does not improve the (in-sample) prediction of yt → zt does not Granger cause yt
⚫ If zt Granger causes yt, but yt does not Granger cause zt, we can call this a lead-lag relationship: zt "leads" yt
⚫ NOTE: If zt does not Granger cause yt, it does NOT automatically mean that yt is exogenous w.r.t. zt: zt can have a simultaneous impact on yt! How?
[p. 23]

Granger causality
⚫ Let's consider the simplest possible example:
  yt = a10 + a11yt-1 + a12zt-1 + e1t
  zt = a20 + a21yt-1 + a22zt-1 + e2t
  ⚫ If a12 ≠ 0, zt Granger causes yt
  ⚫ If a21 ≠ 0, yt Granger causes zt
⚫ More generally, let the reduced form equation for zt be:
  zt = a20 + a21yt-1 + a22yt-2 +…+ a2pyt-p + a2p+1zt-1 + a2p+2zt-2 +…+ a2p+pzt-p + ezt
⚫ When we test the null hypothesis of no Granger causality from yt to zt, we test the hypothesis a21 = a22 = a23 = … = a2p = 0; i.e.
that all the coefficients on the lags of yt are zero in the equation for zt
⚫ Granger (non-)causality is tested with the standard (Wald) F-test
⚫ A "block-exogeneity" test is used to investigate whether a variable Granger causes any of the other variables; the LR test is typically used for this
[p. 24]

Empirical Example: Granger causality
⚫ We need to include five lags to get rid of significant autocorrelation
  ⚫ Depending on the aim of the model, we might include fewer lags so as not to have to estimate that many parameters and lose degrees of freedom (e.g. a model for forecasting purposes)
⚫ Granger (non-)causality test in EViews: View – Lag structure – Granger causality/Block exogeneity tests
⚫ For instance: housing price movements Granger cause (i.e. have predictive power w.r.t.) GDP growth, but the other variables do not
⚫ Note: EViews does not provide the actual block exogeneity test
[p. 25]

VAR Model Identification
⚫ More formally: Enders, pp. 292-294
⚫ Identification is not necessary if the model is just for forecasting purposes & investigating Granger causalities
⚫ Due to the feedback inherent in a VAR process, the structural equations (p. 4) cannot be estimated directly
  ⚫ Hence, we estimate the reduced form equations
⚫ A key question: is it possible to recover all of the information present in the structural (also called "primitive") system? In other words, is the structural system identifiable given the OLS estimates of the VAR model?
[p. 26]

VAR Model Identification
⚫ In the two-variable 1st order (structural) VAR, there are 10 parameters:
  2 constants (b10 and b20)
  4 autoregressive coefficients (γ11, γ12, γ21, and γ22)
  2 feedback coefficients (b12 and b21)
  2 variances (σy² and σz²)
⚫ Reduced form estimation only provides 9 estimates (p.
5):
  6 coefficient estimates (a10, a20, a11, a12, a21, and a22)
  2 variances and 1 covariance [var(e1t), var(e2t) and cov(e1t, e2t)]
→ To identify the structural model, we need to impose at least one restriction
[p. 27]

VAR Model Identification
⚫ In the two-equation case:
  ⚫ If exactly one parameter of the structural form is restricted, the system is just identified
  ⚫ If more than one parameter is restricted, the system is overidentified
⚫ Typically the identification is achieved by restricting either b12 = 0 or b21 = 0, i.e. that only one of the variables has a contemporaneous impact on the other variable
[p. 28]

VAR Model Identification
⚫ Let's assume that we impose the restriction b21 = 0
  ⚫ A theoretical model, for instance, suggests that yt does not contemporaneously affect zt
⚫ Hence, we restrict the structural form to be:
  yt = b10 – b12zt + γ11yt-1 + γ12zt-1 + εyt
  zt = b20 + γ21yt-1 + γ22zt-1 + εzt
⚫ We estimate the reduced form:
  yt = a10 + a11yt-1 + a12zt-1 + e1t
  zt = a20 + a21yt-1 + a22zt-1 + e2t
[p. 29]

*VAR Model Identification
⚫ Based on the estimated reduced form model, the structural parameters can be computed
[p. 30]

VAR Model Identification
⚫ We also get estimates for the shocks εyt and εzt
  ⚫ The residual series e2t directly provides estimates of εzt
  ⚫ From the equation e1t = εyt – b12εzt we get the estimate: εyt = e1t + b12εzt
⚫ Both shocks, εyt and εzt, contemporaneously influence yt, but only εzt affects zt
⚫ The other option to identify the model would be to set b12 = 0
⚫ In a model with more endogenous variables, n(n-1)/2 restrictions are needed: in a 3-variate model 3 restrictions, in a 4-variate model 6 restrictions etc.
[p. 31]

Innovation Accounting
⚫ Innovation accounting is often a key part of VAR analysis
  ⚫ Dynamic interrelationships among the endogenous variables in the VAR are investigated
  ⚫ Brooks, pp. 336-338
⚫ Impulse response functions:
  ⚫ What is the effect of each shock on the variable values / development paths
⚫ Variance decomposition:
  ⚫ Gives the proportion of the movements in the variables that are due to their "own" shocks, versus shocks to each of the other variables
⚫ More formal derivations are presented in Enders, pp.
294-301 (impulse responses) and pp. 301-303 (variance decomposition)
[p. 32]

Impulse Response Functions
⚫ The reactions of each variable to a given shock (in the structural form, i.e. to εyt and εzt) are called impulse response functions (IRFs)
⚫ Computed in practice by expressing the VAR model as a vector moving average (VMA) process
  ⚫ That is, similarly to an AR model, a VAR model can be expressed in a moving average form
  ⚫ In the VMA representation, the variables are expressed in terms of current and past values of the structural shocks (and the mean of the variable)
[p. 33]

Impulse Response Function
⚫ Provided that the system is stable, the effect of a shock should gradually die away (assuming that the variables are stationary)
⚫ Impulse response functions can be (and typically are) investigated graphically
⚫ The VAR needs to be identified – i.e. some restriction(s) need to be imposed – in order to derive impulse response functions
⚫ Regardless of the identification approach, IRFs are computed from coefficient estimates that do not represent the DGP precisely → confidence bands can be computed for impulse responses
[p. 34]

Choleski Decomposition
⚫ One possible way to identify the model is the Choleski decomposition:
  ⚫ Restrict the contemporaneous effects among the variables (as discussed earlier on)
  ⚫ Even if a variable has no contemporaneous impact on another variable, it can have a lagged effect
  ⚫ A recursive ordering is imposed among the variables: the 1st variable has a contemporaneous effect on all other variables, the 2nd variable has a contemporaneous effect on all other variables except for the 1st one, …, the nth (last) variable has no contemporaneous impact on any other variable
[p. 35]

Choleski Decomposition
⚫ How to select the ordering of variables?
  ⚫ Sometimes economic theory gives the "predicted" ordering
  ⚫ Previous empirical examinations can provide information on the ordering
  ⚫ Sometimes neither theoretical nor empirical literature gives sufficient guidance on the ordering
⚫ The importance of the ordering depends on the magnitude of the correlation coefficient between the error terms (e1t and e2t in the 2-variate case)
  ⚫ If the error terms are not correlated, the ordering does not matter
  ⚫ Statistical significance of the correlation can be tested
  ⚫ Rule of thumb: if |corr| > 0.2 with approximately 100 usable observations, it is significant
[p. 36]

Choleski Decomposition
⚫ If there are significant correlations (and no prior knowledge on the correct ordering), the usual procedure is:
  1. Obtain the impulse response function using a particular ordering
  2. Compare the results to the impulse response function obtained by reversing the ordering
  3. If the implications are quite different, additional investigation into the relationships between the variables is necessary
[p. 37]

Other Decomposition Approaches
⚫ There are also other approaches to identify the model (and thereby to compute the impulse response functions)
⚫ In a structural VAR (SVAR) model the restrictions are based on economic theory, and there is generally no particular ordering between the variables
  ⚫ Enders, pp. 313-335
⚫ Pesaran & Shin (1998) generalized impulse responses (GIRFs)
  ⚫ Do not depend on the VAR ordering
  ⚫ GIRFs from a shock to the jth variable are derived by applying a variable-specific Cholesky factor computed with the jth variable placed 1st in the ordering
  ⚫ Criticized for the lack of theoretical assumptions
[p. 38]

Example of (Theoretical) IRFs (Enders, p. 297)
There is clear asymmetry due to the decomposition:
• A one-unit shock in εyt causes yt to increase by one unit; however, there is no contemporaneous effect on zt
• Both variables immediately react to the shock in zt, εzt
• That is, coefficient b21 is restricted to equal zero (p.
28)
• Since the system is stationary, the impulse responses ultimately decay and converge to some value
[p. 39]

Empirical Example: IRFs
⚫ Choleski decomposition with ordering: GDP, HP, Loan, MR
⚫ Confidence bands: Monte Carlo simulated with 1000 repetitions
⚫ EViews: Select "Impulse" in the VAR window
[Figure: Response to Cholesky One S.D. (d.f. adjusted) Innovations ± 2 S.E. 4 × 4 grid of panels, "Response of D(X) to D(Y)" for X, Y in GDP, HP, LOAN, MR; horizontal axis: horizon in quarters (ticks 2 to 16)]
[p. 40]

Empirical Example: IRFs
⚫ E.g.: the 1st column shows the influence of GDP shocks on the 4 variables
⚫ E.g.: the 3rd row shows the reactions of "d(loan)" to the 4 shocks
⚫ Note: 3 + 2 + 1 = 6 restrictions on the contemporaneous parameters
[Figure: Response to Cholesky One S.D. (d.f. adjusted) Innovations ± 2 S.E. 4 × 4 grid of panels, "Response of D(X) to D(Y)" for X, Y in GDP, HP, LOAN, MR; horizontal axis: horizon in quarters (ticks 2 to 16)]
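As a rough illustration of how Cholesky-based responses of the kind plotted on this slide are obtained, the sketch below computes orthogonalized impulse responses for a two-variable VAR(1) in Python/NumPy (the handout itself works in EViews). The function name `cholesky_irf`, the coefficient matrix `A1` and the residual covariance `sigma` are assumptions chosen for illustration only; they are not estimates from the Finnish data.

```python
import numpy as np

def cholesky_irf(A1, sigma, horizons):
    """Orthogonalized impulse responses for a reduced-form VAR(1)
    x_t = A1 @ x_{t-1} + e_t with cov(e_t) = sigma.

    Returns irf[h, i, j]: response of variable i at horizon h to a
    one-standard-deviation shock in variable j.
    """
    P = np.linalg.cholesky(sigma)   # lower triangular, sigma = P @ P.T
    Phi = np.eye(A1.shape[0])       # VMA coefficient matrix at horizon 0
    out = []
    for _ in range(horizons + 1):
        out.append(Phi @ P)
        Phi = A1 @ Phi              # VMA coefficients of a VAR(1) are powers of A1
    return np.array(out)

# Illustrative (assumed) values, not estimates from the handout's data
A1 = np.array([[0.7, 0.2],
               [0.2, 0.7]])
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

irf = cholesky_irf(A1, sigma, horizons=16)
print(irf[0])   # impact responses: the last variable's shock does not move the first variable
```

Because the Cholesky factor is lower triangular, the shock of the variable ordered last has no impact-period effect on the variables ordered before it, which is the recursive structure described on the Choleski decomposition slides.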
With stationary variables, each IRF should converge to zero over the long run!
[p. 41]

Empirical Example: Accumulated IRFs
⚫ Accumulated IRFs show the accumulated effect of the quarterly (in this case) reactions – i.e. the sum of the responses of a given variable to a given shock over time
⚫ Hence, they essentially show the impact on the levels of the variables here
[Figure: Accumulated Response to Cholesky One S.D. (d.f. adjusted) Innovations ± 2 S.E. 4 × 4 grid of panels, "Accumulated Response of D(X) to D(Y)" for X, Y in GDP, HP, LOAN, MR; horizontal axis: horizon in quarters (ticks 2 to 16)]
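The statement that accumulated IRFs are sums of the period-by-period responses can be sketched numerically. The Python/NumPy example below is only an illustration under assumed values: `accumulated_irf`, `A1` and `sigma` are hypothetical names and numbers, not taken from the handout's model. It also checks that, for a stable VAR(1), the accumulated responses settle at (I - A1)^(-1) P, which is generally not zero.

```python
import numpy as np

def accumulated_irf(A1, P, horizons):
    """Accumulated responses of a VAR(1): running sum of A1**h @ P over h."""
    Phi = np.eye(A1.shape[0])
    responses = []
    for _ in range(horizons + 1):
        responses.append(Phi @ P)   # response at the current horizon
        Phi = A1 @ Phi
    return np.cumsum(np.array(responses), axis=0)

# Illustrative (assumed) values
A1 = np.array([[0.7, 0.2],
               [0.2, 0.7]])
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
P = np.linalg.cholesky(sigma)       # Cholesky impact matrix

acc = accumulated_irf(A1, P, horizons=400)
long_run = np.linalg.inv(np.eye(2) - A1) @ P   # limit of the accumulated responses
print(acc[16])
print(long_run)
```

With variables in differences, as in the empirical example, these accumulated responses correspond to responses of the levels.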
⚫ The IRFs should make sense in order to make interpretations regarding the nature of the shocks
⚫ For accumulated IRFs, each should converge to some value (which does not need to be zero) over the long run
[p. 42]

Empirical Example: Accumulated IRFs
⚫ Residuals are highly correlated across equations → let's compare IRFs with the reverse ordering
⚫ Mostly similar IRFs, but some notable differences (especially w.r.t. the last shock)
[Figure: Accumulated Response to Cholesky One S.D. (d.f. adjusted) Innovations ± 2 S.E., reverse ordering. 4 × 4 grid of panels, "Accumulated Response of D(X) to D(Y)" for X, Y in GDP, HP, LOAN, MR; horizontal axis: horizon in quarters (ticks 2 to 16)]
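Why the ordering matters only when the reduced-form residuals are correlated can be seen from the impact matrices alone. The following Python/NumPy sketch is a hypothetical two-variable illustration (the helper `impact_matrix` and both covariance matrices are assumptions, not the handout's estimates): it computes the Cholesky impact matrix under an ordering and under its reverse.

```python
import numpy as np

def impact_matrix(sigma, order):
    """Cholesky impact matrix for a given recursive ordering.

    order lists the variable indices from first to last in the ordering.
    Entry [i, j] is the contemporaneous response of variable i to a
    one-standard-deviation shock in variable j, with rows and columns
    kept in the original variable order.
    """
    order = np.asarray(order)
    P = np.linalg.cholesky(sigma[np.ix_(order, order)])
    B = np.zeros_like(P)
    B[np.ix_(order, order)] = P     # undo the permutation
    return B

sigma_corr = np.array([[1.0, 0.8],   # strongly correlated residuals (assumed)
                       [0.8, 1.0]])
sigma_diag = np.array([[1.0, 0.0],   # uncorrelated residuals (assumed)
                       [0.0, 4.0]])

fwd = impact_matrix(sigma_corr, [0, 1])
rev = impact_matrix(sigma_corr, [1, 0])
print(fwd)   # shock 2 has no impact effect on variable 1
print(rev)   # shock 1 has no impact effect on variable 2
```

Both matrices reproduce the same residual covariance, so the data cannot choose between them; with a diagonal covariance matrix the two orderings give identical impact matrices.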
(POLL)
[p. 43]

Variance Decomposition
⚫ As with IRFs, variance decomposition is based on the VMA form, and the VAR needs to be identified
⚫ Shows how much of the forecast error variance (for 1, 2, . . .
, j periods ahead) for each variable can be explained by innovations in the variable itself and in each of the other variables
  ⚫ Therefore also called forecast error variance decomposition
⚫ In other words: shows the proportion of the movements in the dependent variables that are due to their own shocks, versus shocks to the other variables
⚫ Helps in investigating interrelations between the variables: how important are the shocks in one variable for the time path of another variable…
[p. 44]

Variance Decomposition
⚫ In practice, it is useful to examine the decompositions at various forecast horizons
  ⚫ As the horizon increases, the variance decompositions should converge
⚫ If innovations in zt, εzt, cannot explain any of the forecast error variance of yt at any forecast horizon, yt is exogenous w.r.t. zt (= yt is independent of zt)
⚫ Another extreme would be if εzt explained 100% of the yt forecast error variance at all forecast horizons: in this case yt is entirely endogenous to zt
⚫ In applied research, it is typical for a variable to explain almost all of its own forecast error variance at short horizons and smaller proportions at longer horizons
  ⚫ In the 2-variate case, we would expect this pattern if εzt shocks had little contemporaneous effect on yt but affected the yt sequence with a lag
[p. 45]

Variance Decomposition
⚫ Similar complications are present as with IRF computations
⚫ If the (reduced form) error terms exhibit notable correlation, it is good to experiment with different orderings (at least in the case where the theory does not give clear guidance on the identifying restrictions)
⚫ A practical tip for the ordering (if no clear theoretical, or prior empirical, guidance is available): if a variable has only little predictive power, especially at short horizons, w.r.t.
the other variables regardless of the variable ordering, it should be placed last in the ordering (since the variance decomposition analysis then indicates that the importance of its shocks for the other variables is tiny or nonexistent)
⚫ In any analysis, the reported IRFs and variance decompositions should be based on the same variable ordering in the Choleski decomposition (or, more generally, on the same identifying restrictions, whatever decomposition approach is used)
[p. 46]

Empirical Example: Variance Decompositions
⚫ Variance decomposition for housing price growth
By construction (i.e. the ordering in the Choleski decomposition) the one-period "share" of the two last variables is zero.
Shocks in the loan-to-GDP ratio growth explain approx. 10% of the one-year forecast error variance.
Own shocks explain over 70% of the long-run forecast error variance.
[p. 47]

Empirical Example: Variance Decompositions
[Figure: Variance Decomposition using Cholesky (d.f. adjusted) Factors ± 2 S.E. 4 × 4 grid of panels, "Percent D(X) variance due to D(Y)" for X, Y in GDP, HP, LOAN, MR; vertical axis: 0 to 100 percent; horizontal axis: horizon in quarters (ticks 2 to 16)]
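The shares shown in a variance decomposition can be computed directly from the orthogonalized VMA coefficients. The sketch below, in Python/NumPy, does this for a two-variable VAR(1); the function name `fevd` and the values of `A1` and `sigma` are assumptions made for illustration and do not come from the handout's estimated model.

```python
import numpy as np

def fevd(A1, sigma, horizons):
    """Forecast error variance decomposition for a VAR(1) with Cholesky
    identification.

    Returns shares[h, i, j]: share of the (h+1)-step forecast error
    variance of variable i that is due to shocks in variable j.
    """
    P = np.linalg.cholesky(sigma)
    n = A1.shape[0]
    Phi = np.eye(n)
    cum = np.zeros((n, n))
    shares = []
    for _ in range(horizons):
        theta = Phi @ P                     # orthogonalized responses at this horizon
        cum += theta ** 2                   # squared responses accumulate over horizons
        shares.append(cum / cum.sum(axis=1, keepdims=True))
        Phi = A1 @ Phi
    return np.array(shares)

# Illustrative (assumed) values
A1 = np.array([[0.7, 0.2],
               [0.2, 0.7]])
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])

shares = fevd(A1, sigma, horizons=16)
print(shares[0])    # one-step decomposition: first row is [1, 0] by construction
print(shares[15])   # 16-step decomposition
```

The zero in the first row at the one-step horizon mirrors the remark above that, by construction of the ordering, later-ordered variables have a zero one-period share for the variables ordered before them.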
[p. 48]

Empirical Example: Variance Decompositions
(POLL)
⚫ Reversed Choleski ordering
[Figure: Variance Decomposition using Cholesky (d.f. adjusted) Factors ± 2 S.E., reversed ordering. 4 × 4 grid of panels, "Percent D(X) variance due to D(Y)" for X, Y in GDP, HP, LOAN, MR; vertical axis: 0 to 100 percent; horizontal axis: horizon in quarters (ticks 2 to 16)]
[p. 49]

SUR Estimation*
⚫ In a VAR model, all equations include exactly the same RHS variables with the same lag lengths for each variable and equation
  ⚫ In this case, OLS estimation separately for each equation provides the same results as joint estimation of all the equations
⚫ If there are differences across equations regarding the RHS variables or their lags, the model is called a near-VAR
⚫ A near-VAR should be estimated with the SUR
method (seemingly unrelated regression)
⚫ Efficiency gain compared with OLS, if the model residuals are correlated with each other
⚫ IRFs and variance decompositions can be computed for near-VAR models as well
⚫ Some econometricians recommend that all variables and lags should be included in all equations (and they typically are)

SUR Estimation*
⚫ Sometimes restricting the model so that the regressors vary between equations can have benefits:
• If one wishes to restrict the number of estimated parameters to save degrees of freedom
• If it is evident based on theory that some variable in the model does not (directly) affect all other variables in the model
⚫ For instance:
$y_t = a_{10} + a_{11} y_{t-1} + a_{12} z_{t-1} + a_{13} k_{t-1} + a_{14} y_{t-2} + a_{15} z_{t-2} + a_{16} k_{t-2} + e_{1t}$
$z_t = a_{20} + a_{21} y_{t-1} + a_{22} z_{t-1} + a_{23} k_{t-1} + a_{24} y_{t-2} + a_{25} z_{t-2} + a_{26} k_{t-2} + e_{2t}$
$k_t = a_{30} + a_{32} z_{t-1} + a_{33} k_{t-1} + a_{35} z_{t-2} + a_{36} k_{t-2} + e_{3t}$

Forecasting with VAR Models
⚫ Brooks, p. 334

VAR Model "Extensions"
⚫ VAR-GARCH
⚫ VARX: VAR model with exogenous RHS variables
⚫ VARMA: Includes lagged error terms as well
⚫ Bayesian VAR
⚫ GVAR: Global VAR
⚫ FAVAR: Factor-augmented VAR
⚫ Cointegrated VAR: In other words, VECM (vector error-correction model)

Vector Error-Correction Model
⚫ Recall from the previous lecture handout:
⚫ The dynamics of cointegrated variables can be presented with error-correction models; in the simplest, i.e. two-variable, case:
[Equations for $\Delta y_t$ and $\Delta z_t$ not preserved in the transcript]
⚫ $y_{t-1} - \beta_1 z_{t-1}$ is the equilibrium error in period t-1 (note: the constant terms of the two equations cater for the constant of the cointegrating relationship as well)
⚫ Together, the two equations form a vector error-correction model (VECM)
• Error-correction model (ECM), or VECM, enables us to take account of the influence of the long-term cointegrating relationship on the dynamics of the variables, including forecasts (vs.
basic ARIMAX or VAR in differences)
• Enables us to keep the information contained in the levels of the variables

Vector Error-Correction Model
⚫ $\hat{e}$ = estimated equilibrium error
⚫ $\varepsilon_{yt}$ and $\varepsilon_{zt}$ are white noise error terms
⚫ $\alpha_y$ and $\alpha_z$ are the error-correction coefficients that indicate the adjustment speeds of y and z towards the equilibrium relation
⚫ $\alpha_i$ shows how large a share of the equilibrium error disappears during a period due to the adjustment (i.e. "error correction") of variable i
⚫ If $\alpha_y = \alpha_z = 0$, y and z are not cointegrated → conventional VAR model in differences
⚫ In a cointegrated system, at least one variable needs to adjust towards the equilibrium, i.e. either $\alpha_y < 0$ or $\alpha_z > 0$ or both

Vector Error-Correction Model
• Why $\alpha_y < 0$? Because $y_{t-1} - \beta_1 z_{t-1}$ $(= \hat{e}_{t-1}) > 0$ means that $y_{t-1}$ is greater than its equilibrium value; hence $y_t$ needs to decrease for the error to correct
• A greater absolute value of $\alpha_y$ means faster adjustment towards the cointegrating relation and thereby more rapid equilibrium correction
• Adjustment parameters should be smaller than one in absolute value
• Similarly, $\alpha_z > 0$ if $z_t$ adjusts towards the long-run relation
• Error-correction mechanism in the equation for $y_t$: $\alpha_y \hat{e}_{t-1} = \alpha_y (y_{t-1} - \beta_1 z_{t-1})$
• If $\alpha_y = 0$, y is weakly exogenous, i.e. it cannot be predicted with the equilibrium error and does not adjust towards the CI relation
• If $\alpha_z = 0$, z is weakly exogenous
• IRFs and variance decompositions can be computed for a VECM

Vector Error-Correction Model
⚫ It is necessary to reinterpret Granger causality in a cointegrated system: in a cointegrated system, $y_t$ does not Granger cause $z_t$ if: 1) lagged values $\Delta y_{t-i}$ do not enter the $\Delta z_t$ equation (significantly) AND 2) $z_t$ does not respond to the deviation from long-run equilibrium (i.e.
$z_t$ is weakly exogenous)

Vector Error-Correction Model: Empirical Example

Cell format: coefficient (standard error) [t-statistic]

                    D(HP)                             D(GDP)
EQE_HP_GDP(-1)      -0.026303 (0.01013) [-2.59677]    -0.006537 (0.01084) [-0.60298]
D(HP(-1))            0.651891 (0.08685) [ 7.50582]     0.009769 (0.09296) [ 0.10509]
D(HP(-2))            0.183362 (0.10294) [ 1.78127]     0.002233 (0.11018) [ 0.02027]
D(HP(-3))           -0.126493 (0.10291) [-1.22921]     0.075700 (0.11014) [ 0.68731]
D(HP(-4))            0.126135 (0.08790) [ 1.43499]     0.239852 (0.09408) [ 2.54946]
D(GDP(-1))           0.147188 (0.08161) [ 1.80345]    -0.061990 (0.08735) [-0.70966]
D(GDP(-2))          -0.193447 (0.08066) [-2.39842]    -0.134261 (0.08633) [-1.55527]
D(GDP(-3))          -0.080086 (0.08002) [-1.00089]    -0.060694 (0.08564) [-0.70870]
D(GDP(-4))          -0.161666 (0.08010) [-2.01822]     0.039740 (0.08573) [ 0.46352]
C                    0.013114 (0.01096) [ 1.19600]     0.062512 (0.01174) [ 5.32681]
@QUARTER=1          -0.022936 (0.02013) [-1.13921]    -0.143145 (0.02155) [-6.64276]
@QUARTER=2           0.020110 (0.00976) [ 2.06029]    -0.005979 (0.01045) [-0.57233]
@QUARTER=3          -0.040190 (0.02023) [-1.98682]    -0.078326 (0.02165) [-3.61775]
0.632684 580.916887 R-squared

Based on AIC
Based on SBC
Significant at the 6% level (5% sig. level = -3.37)
Speed of adjustment parameters: Do they make sense? (POLL)
We could restrict GDP to be weakly exogenous…
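
The sign reasoning for the speed-of-adjustment parameters can be checked on simulated data. The following is a minimal sketch, not the EViews workflow behind the table: it simulates a two-variable cointegrated system and then applies a two-step procedure in the spirit of Engle-Granger (long-run regression first, then a regression of each first difference on the lagged estimated equilibrium error) using plain NumPy least squares. The function names, the names `alpha_y`, `alpha_z` and `beta1` for the adjustment and cointegrating coefficients, the parameter values and the sample size are all illustrative assumptions, and lagged differences are left out to keep the example short.

```python
import numpy as np


def simulate_vecm(n_obs=2000, beta1=1.0, alpha_y=-0.2, alpha_z=0.1, seed=0):
    """Simulate y_t and z_t whose changes react to the lagged equilibrium error.

    All parameter values are hypothetical and chosen only for illustration.
    """
    rng = np.random.default_rng(seed)
    y = np.zeros(n_obs)
    z = np.zeros(n_obs)
    for t in range(1, n_obs):
        error = y[t - 1] - beta1 * z[t - 1]  # equilibrium error in period t-1
        y[t] = y[t - 1] + alpha_y * error + rng.normal()
        z[t] = z[t - 1] + alpha_z * error + rng.normal()
    return y, z


def two_step_ecm(y, z):
    """Two-step estimate of the long-run slope and the adjustment coefficients."""
    # Step 1: long-run relation y_t = c + beta1 * z_t + e_t, estimated by OLS.
    X = np.column_stack([np.ones_like(z), z])
    (c_hat, beta1_hat), *_ = np.linalg.lstsq(X, y, rcond=None)
    e_hat = y - c_hat - beta1_hat * z  # estimated equilibrium error

    # Step 2: regress each first difference on a constant and e_hat(t-1).
    W = np.column_stack([np.ones(len(y) - 1), e_hat[:-1]])
    coef_y, *_ = np.linalg.lstsq(W, np.diff(y), rcond=None)
    coef_z, *_ = np.linalg.lstsq(W, np.diff(z), rcond=None)
    return beta1_hat, coef_y[1], coef_z[1]


y, z = simulate_vecm()
beta1_hat, alpha_y_hat, alpha_z_hat = two_step_ecm(y, z)
print(f"beta1 = {beta1_hat:.3f}, alpha_y = {alpha_y_hat:.3f}, alpha_z = {alpha_z_hat:.3f}")
```

With these assumed values, the estimated adjustment coefficient of y should come out negative and that of z positive: when y is above its long-run level relative to z, y falls and z rises until the equilibrium error is corrected. Setting `alpha_z=0.0` in the simulation gives a system in which z is weakly exogenous.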
