EMF Part 2: Tilburg Past Paper PDF Oct-Dec 2023
Summary
This EMF Part 2: Tilburg past paper, from October-December 2023, covers various financial topics including event studies, panel data analysis, and time-series models.
EMF Part 2: Tilburg Oct-Dec 2023 PP

Read me

This file has been created by PP; its usage is intended to be free for all students. This LaTeX document is made with notes, documents, and textbooks from previous years, along with everything I have personally understood and believe to be useful for explaining the subject in question. Additionally, there may be errors, so always double-check. Good luck! If you have appreciated my work, feel free to donate to the following BTC address or PayPal account: bc1q29ylyh4u9kj9pqzn6fa2e5vlumta07j6e7dzrs @pietropist

Table of Contents

1 Event Studies
  1.1 Identifying the event
  1.2 Models for the Abnormal Returns
    1.2.1 Mean-adjusted returns
    1.2.2 Market Adjusted Returns
    1.2.3 Market Model Residuals and CAPM
  1.3 Analysing Abnormal Returns
  1.4 Testing Abnormal Performance
    1.4.1 The basis: t-tests
    1.4.2 Testing the Significance of Cumulative Abnormal Returns
    1.4.3 Cross-sectional Heteroskedasticity
    1.4.4 Standardization
  1.5 Some Problems with the tests
  1.6 Cross-sectional Dependence
  1.7 Conducting Event Studies With Small Cross-sections
    1.7.1 Sign Test
    1.7.2 Rank Test
  1.8 Long Horizon Event Studies
    1.8.1 Defining Abnormal Returns
    1.8.2 Calendar Time Abnormal Returns
2 Panel Data
  2.1 Estimation Methods
    2.1.1 Pooled OLS
    2.1.2 Fixed Effects Model
    2.1.3 Briefly: The Random Effects Model
  2.2 Standard Errors and Clustering
  2.3 Panel models for asset returns: the cross-sectional approach
  2.4 Panel models for asset returns: the Fama-MacBeth procedure
  2.5 The Fama-French (1993) Approach
3 Time-series Data
  3.1 Revisit the Classical Linear Regression Model with Time-series Data
    3.1.1 Tricky Assumptions
  3.2 The Normality Assumption
    3.2.1 What if your sample is small?
  3.3 How to adjust standard errors for serial correlation?
    3.3.1 Newey-West: HAC Standard Errors
    3.3.2 Including Lagged Effects
  3.4 Is measurement error in the variables an issue?
  3.5 How to test if parameters are stable over time?
4 Time-Series Models and Forecasting
  4.1 The Autocorrelation Function
    4.1.1 A White Noise Process
  4.2 Autoregressive (AR) Model
    4.2.1 The Stationary Condition for AR(p) Model
    4.2.2 How many lags for AR(p) Model
    4.2.3 ARMA processes
  4.3 AR models for Scenario Analysis and Forecasting
    4.3.1 Forecasting
  4.4 VAR Models
    4.4.1 Contemporaneous Terms
    4.4.2 Impulse Responses
5 Non-stationarity and Time-varying Volatility
  5.1 Types of Non-stationarity
  5.2 Process with Unit Root
    5.2.1 Deal With Close to Unit Root Processes
  5.3 Time-varying Volatility
    5.3.1 Exponentially Weighted Moving Average Models (EWMA)
    5.3.2 Autoregressive Conditionally Heteroscedastic Models (ARCH)
    5.3.3 Generalised ARCH Models (GARCH)

1 Event Studies

What is an event study? A test of the change in stock (or bond) prices around specific 'events'. Such events can be:

- corporate earnings announcements / merger announcements
- macro-economic news
- natural disasters
- auctions of treasury bonds

Event studies have been used for two major reasons:

1. To examine the impact of some event on the wealth of the security holders
2. To test the hypothesis that the market efficiently incorporates new information (EMH testing)

Pay attention to Figure 1 (not reproduced here), which plots the cumulative average abnormal return against days relative to an announcement. We want to understand how to construct this plot from scratch; the following pages describe how. Note that the announcement is often more important than the actual event: in the case of semi-strong market efficiency, there should be a stock price reaction on the day the announcement is made.

A market in which prices always fully reflect available and relevant information is called an efficient market. Prices will change only when (new) information arrives. But information, by definition, cannot be predicted ahead of time; therefore, price changes cannot be predicted ahead of time. Efficiency refers to two aspects of the price adjustment to information: the speed and the quality (direction and magnitude) of the adjustment. Note also that the primary role of the capital market is the allocation of ownership of the economy's capital stock; asset prices should therefore provide accurate signals for resource allocation.

We can list the three forms of the Efficient Market Hypothesis (EMH):

1. Weak Form: prices reflect all information contained in the record of past prices
2. Semi-strong Form: prices reflect not only past information but all other published information
3. Strong Form: prices reflect not just public information but any information that might be relevant

In order to conduct an event study we need to follow three main steps:

1. Identify the event of interest and the timing of the event
2. Specify a "benchmark" model for normal stock return behavior
3. Calculate and analyze abnormal returns around the event date

1.1 Identifying the event

The type of event studied is usually motivated by economic theory. For example, consider the ex-dividend day behaviour of stock prices. The simplest efficient markets model predicts that the decline in price is equal to the dividend paid. However, numerous studies find price changes significantly smaller than the dividend. In this example, the timing of the event is trivial, as the ex-dividend day is known.
However, in many event studies the timing of the event is more problematic. Consider, for example, the stock price behaviour of a target firm in a corporate takeover. Identifying the actual date of the takeover as the event date would not yield meaningful results, as the takeover is usually announced a long time before, and potential changes in the value of the target and bidder firms should already be reflected in the stock price. It is much more interesting to see what happens on the day that the takeover plans become public knowledge (the announcement day), or before that, to form an idea of the profits from insider trading. A common procedure is to pick as the event date the date of the first announcement of the takeover bid in the Wall Street Journal, or another financial news source. Some uncertainty regarding the event date is often unavoidable, and one has to take some care in interpreting the results of an event study in such cases.

1.2 Models for the Abnormal Returns

An important step in conducting an event study is the choice of a benchmark model for stock return behaviour. There is a wide variety of models available in the literature, which we try to summarize here. The main differences among the models are the chosen benchmark return model and its estimation interval. Abnormal returns (AR) are defined as the return (R) minus a benchmark or normal return (NR):

\[ AR_{it} = R_{it} - NR_{it} \tag{1.1} \]

For several methods, the determination of the normal returns requires estimation of some parameters. This estimation is typically performed over an estimation period, $[T_1, T_2]$, which precedes the event period, $[t_1, t_2]$. The event date is indicated by $t = 0$. Notice that the time index $t$ counts "event time", i.e. the number of periods (days, months) from the event, and not the usual calendar time. (Figure 1.2 in the original shows the time line around the event; it is not reproduced here.)

1.2.1 Mean-adjusted returns

In the mean-adjusted model, the benchmark is the average return over some period, say between $T_1$ and $T_2$. The normal returns are then defined as

\[ NR_i = \frac{1}{T} \sum_{t=T_1}^{T_2} R_{it} \tag{1.2} \]

where $T = T_2 - T_1 + 1$ equals the number of time periods used to calculate the average return (the length of the estimation period). The choice of the benchmark period is rather arbitrary.

1.2.2 Market Adjusted Returns

An obvious disadvantage of the mean-adjusted method is the omission of marketwide stock price movements from the benchmark return. Especially if the events for different firms occur at the same point in time (e.g. option expiration), the results might be biased if the whole market goes up or down in the event period. In such a case, significant abnormal returns will be detected which may not be due to the event but rather to marketwide price movements. To correct for this omission, the return on a market index, $R_{mt}$, can be chosen as the benchmark:

\[ NR_{it} = R_{mt} \tag{1.3} \]

The resulting abnormal returns are referred to as market adjusted returns. A point of interest is which market index to choose. The standard choices for US research are the CRSP equally weighted and value weighted indexes.
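As a quick illustration of these two benchmarks, here is a minimal sketch (all array names are hypothetical; `returns` is a time-by-firms matrix of returns in event time and `market` is the index return series):

```python
import numpy as np

# Hypothetical inputs: daily returns for N stocks and a market index,
# aligned in event time; est is the estimation window, evt the event window.
rng = np.random.default_rng(0)
returns = rng.normal(0.0005, 0.02, size=(250, 50))   # (time, firms)
market = rng.normal(0.0004, 0.015, size=250)
est = slice(0, 200)    # estimation period [T1, T2]
evt = slice(200, 250)  # event period [t1, t2]

# Mean-adjusted: NR_i is the average return over the estimation window (eq. 1.2)
nr_mean_adj = returns[est].mean(axis=0)              # one benchmark per firm
ar_mean_adj = returns[evt] - nr_mean_adj             # eq. (1.1)

# Market-adjusted: NR_it is the market return itself (eq. 1.3)
ar_mkt_adj = returns[evt] - market[evt, None]
```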
1.2.3 Market Model Residuals and CAPM

The market adjusted returns method implicitly assumes that the "beta" of each stock is equal to one. This is obviously not always the case, and it is better to account for differences in beta when defining abnormal returns. Therefore, a good way to define abnormal returns is as residuals of the market model:

\[ R_{it} = \underbrace{\alpha_i + \beta_i R_{mt}}_{\text{Normal Return}} + \underbrace{\varepsilon_{it}}_{\text{Abnormal Return}} \tag{1.4} \]

The abnormal returns are then defined as the residuals or prediction errors of this model, while the normal returns are defined as

\[ NR_{it} = \hat{\alpha}_i + \hat{\beta}_i R_{mt} \tag{1.5} \]

where

\[ \hat{\beta}_i = \frac{\mathrm{Cov}(R_i, R_m)}{\mathrm{V}[R_m]} \qquad \text{and} \qquad \hat{\alpha}_i = \bar{R}_i - \hat{\beta}_i \bar{R}_m \]

so that equation (1.1) holds. Notice that $\hat{\alpha}$ and $\hat{\beta}$ are the OLS estimates of model (1.4) and are estimated in the estimation window (from $T_1$ to $T_2$). We can also use a version of the CAPM where the beta is estimated with excess returns:

\[ R_{it} - R_{ft} = \beta_i (R_{mt} - R_{ft}) + \varepsilon_{it} \tag{1.6} \]

so that the normal returns are

\[ NR_{it} = R_{ft} + \hat{\beta}_i (R_{mt} - R_{ft}) \tag{1.7} \]

1.3 Analysing Abnormal Returns

In analysing abnormal returns, it is conventional to label the event date as time $t = 0$. Hence, from now on $AR_{i,0}$ denotes the abnormal return on the event date and, for example, $AR_{i,t}$ denotes the abnormal return $t$ periods after the event. If there is more than one event relating to one firm or stock price series, they are treated as if they concern separate firms. We typically consider an event period running from $t_1$ to $t_2$, where the time index $t$ is counted from the event date. Assuming there are $N$ firms in the sample, we can construct a matrix to store abnormal returns:

\[
\begin{pmatrix}
AR_{1,t_1} & \cdots & AR_{1,-1} & AR_{1,0} & AR_{1,1} & \cdots & AR_{1,t_2} \\
AR_{2,t_1} & \cdots & AR_{2,-1} & AR_{2,0} & AR_{2,1} & \cdots & AR_{2,t_2} \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
AR_{N,t_1} & \cdots & AR_{N,-1} & AR_{N,0} & AR_{N,1} & \cdots & AR_{N,t_2}
\end{pmatrix} \tag{1.8}
\]

Each row of matrix (1.8) is a time series of abnormal returns for firm $i$, and each column is a cross-section of abnormal returns for time period $t$. In order to study stock price changes around events, each firm's return data could be analysed separately. However, this is not very informative, because a lot of stock price movements are caused by information unrelated to the event under study. The informativeness of the analysis is greatly improved by averaging the information over a number of firms. Typically, the unweighted cross-sectional average of abnormal returns in period $t$ is considered:

\[ AAR_t = \frac{1}{N} \sum_{i=1}^{N} AR_{it} \tag{1.9} \]

Thus, we have an $AAR_t$ for each $t$: the average of each column of matrix (1.8). Large deviations of the average abnormal returns from zero indicate abnormal performance. Because these abnormal returns are all centered around one particular event, the average should reflect the effect of that particular event; all other information, unrelated to the event, should cancel out on average.

Often, we are interested not only in performance at the event date, but also over longer periods surrounding the event. The usual way to study performance over longer intervals is by means of cumulative abnormal returns, where the abnormal returns for company $i$ are aggregated from the start of the event period, $t_1$, up to time $t_2$:

\[ CAR_i = AR_{i,t_1} + \cdots + AR_{i,t_2} = \sum_{t=t_1}^{t_2} AR_{it} \tag{1.10} \]

That is, the sum of each row of matrix (1.8). Again, in event studies the CARs are aggregated over the cross-section of events to obtain the cumulative average abnormal return (CAAR):

\[ CAAR = \frac{1}{N} \sum_{i=1}^{N} CAR_i = \sum_{t=t_1}^{t_2} AAR_t \tag{1.11} \]

Notice that the previous formulas for $CAR_i$ and $CAAR$ were computed with respect to the full event period. We can also cumulate from the start of the event period to any day $t \le t_2$:

\[ CAR_{it} = \sum_{s=t_1}^{t} AR_{is} \tag{1.12} \]

so that we have the sum of abnormal returns up to an arbitrary $t$ for every firm $i$. Then, the cumulative average abnormal return for day $t$ is

\[ CAAR_t = \frac{1}{N} \sum_{i=1}^{N} CAR_{it} \tag{1.13} \]

and this is the solid black line of Figure 1. Notice that in order to plot the graph we need to calculate $CAAR_t$ for each $t$ in the intervals $[t_1, t_2]$ and $[T_1, T_2]$.
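Putting the pieces together, here is a minimal sketch of the whole computation up to this point (hypothetical simulated inputs; in practice `returns` and `market` come from your data source):

```python
import numpy as np

rng = np.random.default_rng(1)
returns = rng.normal(0.0005, 0.02, size=(250, 50))   # (time, firms), event time
market = rng.normal(0.0004, 0.015, size=250)
est = slice(0, 200)    # estimation window [T1, T2]
evt = slice(200, 250)  # event window [t1, t2]

# Estimate the market model (1.4) per firm over the estimation window
beta = np.array([np.cov(returns[est, i], market[est])[0, 1] / market[est].var(ddof=1)
                 for i in range(returns.shape[1])])
alpha = returns[est].mean(axis=0) - beta * market[est].mean()

# Abnormal returns in the event window: AR_it = R_it - (alpha_i + beta_i * R_mt)
ar = returns[evt] - (alpha + np.outer(market[evt], beta))   # (t, i)

aar = ar.mean(axis=1)      # AAR_t, eq. (1.9): average over firms per event day
car = ar.sum(axis=0)       # CAR_i, eq. (1.10): sum over the event window per firm
caar = car.mean()          # CAAR, eq. (1.11)
caar_t = aar.cumsum()      # CAAR_t path, eqs. (1.12)-(1.13): the line in Figure 1
```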
1.4 Testing Abnormal Performance

Although graphical reporting of cumulative abnormal returns is instructive and often suggestive, in almost all event studies the graphical analysis is supported by statistical tests. These tests are designed to answer the question whether the calculated abnormal returns are significantly different from zero at a certain, a priori specified, significance level. The null hypothesis to be tested is of the form

\[ H_0 : \mathrm{E}[AR_{it}] = 0 \tag{1.14} \]

Which statistical test of this hypothesis is appropriate depends on the way in which the abnormal returns are constructed and on the statistical properties of stock returns. In this section we discuss several test statistics, paying particular attention to some pitfalls that a researcher might encounter when conducting an event study.

1.4.1 The basis: t-tests

The most common test of the null hypothesis of no abnormal return, equation (1.14), is a simple t-test. In order to introduce this test, we make some restrictive assumptions, which will be relaxed later on. Specifically, assume that

\[ AR_{it} \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2) \implies \mathrm{E}[AR_{it} \cdot AR_{jt}] = 0 \quad \forall\, i \neq j \tag{1.15} \]

so that the distribution of the $AR_{it}$ that together make up the average abnormal return $AAR_t$ is Normal with mean zero, and all abnormal returns are cross-sectionally uncorrelated (by independence, the covariance is zero). Hence, the variance of the average, $AAR_t$, is equal to $1/N$ times the variance of a single abnormal return, so that

\[ AAR_t \sim N\!\left(0, \frac{\sigma^2}{N}\right) \]

If $\sigma^2$ were known, a test statistic for (1.14) would be given by

\[ \overline{TS}_1 = \sqrt{N}\, \frac{AAR_t}{\sigma} \sim N(0, 1) \]

Under the stated assumptions, $\overline{TS}_1$ follows a Standard Normal distribution. Of course, in practice $\sigma$ is unknown. An estimator of $\sigma$ (denoted $s_t$) can be constructed from the cross-sectional variance of the abnormal returns in period $t$:

\[ s_t = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (AR_{it} - AAR_t)^2} \tag{1.16} \]

This yields the following test statistic for the average abnormal return:

\[ TS_1 = \sqrt{N}\, \frac{AAR_t}{s_t} \sim t_{N-1} \tag{1.17} \]

Under the stated assumptions, this test statistic follows a Student-t distribution with $N - 1$ degrees of freedom. However, there is strong evidence that stock returns do not satisfy the Normality assumption imposed to derive the distributions of the $\overline{TS}_1$ and $TS_1$ test statistics. If stock returns do not follow a normal distribution, the exact small-sample result that $TS_1$ follows a Student-t distribution no longer holds. If we maintain the assumptions that the abnormal returns are independent and have the same mean and variance, it can be shown that in large samples $TS_1$ approximately follows a Standard Normal distribution. This is a result of the Central Limit Theorem, which states that under these assumptions $\sqrt{N}$ times the average, divided by the standard deviation, converges to a Standard Normal random variable:

\[ TS_1 = \sqrt{N}\, \frac{AAR_t}{s_t} \overset{a}{\sim} N(0, 1) \tag{1.18} \]

Hence, if $N$ is large enough, the quantiles of the Normal distribution can be used as critical values for the t-test. In event studies, $N > 30$ is typically sufficient for this.

Significance Level    Two-Sided Critical Value
10%                   1.64
5%                    1.96
1%                    2.58
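A minimal sketch of this per-day test, continuing the hypothetical event-window matrix of abnormal returns from the sketch above:

```python
import numpy as np

rng = np.random.default_rng(2)
ar = rng.normal(0, 0.02, size=(50, 40))   # hypothetical event-window ARs (t x N)
N = ar.shape[1]

aar = ar.mean(axis=1)                     # AAR_t for each event day
s_t = ar.std(axis=1, ddof=1)              # cross-sectional std, eq. (1.16)
ts1 = np.sqrt(N) * aar / s_t              # eqs. (1.17)-(1.18)
reject = np.abs(ts1) > 1.96               # 5% two-sided Normal critical value
```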
1.4.2 Testing the Significance of Cumulative Abnormal Returns

Often, one is interested in the abnormal performance of event firms over a longer event period. One common situation is when the date at which the event took place cannot be determined exactly. For example, the news of a possible takeover bid might spread gradually to the public, and may be reported with some lag in the press. If there is such event date uncertainty, the abnormal returns may be spread out around the chosen event date ($t = 0$). In such circumstances, it is often necessary to test the significance of abnormal returns in a window around $t = 0$. Also, it might be interesting to test the significance of cumulative abnormal returns over the complete pre-event or post-event period. In this section, we describe how to test the significance of the abnormal returns over an arbitrary event interval $[t_1, t_2]$. The null hypothesis is that the expected cumulative price change over this period is zero. First we define the cumulative abnormal return (over the event interval) as before:

\[ CAR_i = AR_{i,t_1} + \cdots + AR_{i,t_2} = \sum_{t=t_1}^{t_2} AR_{it} \]

Then, the null hypothesis to be tested is

\[ H_0 : \mathrm{E}[CAR_i] = 0 \tag{1.19} \]

This hypothesis can be tested in a similar way as a one-period abnormal return. The most common procedure is the following. First, calculate $CAR_i$ for every event $i$. Then, calculate the cross-sectional average

\[ CAAR = \frac{1}{N} \sum_{i=1}^{N} CAR_i \]

and standard deviation

\[ s = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (CAR_i - CAAR)^2} \tag{1.20} \]

The t-test then simply is

\[ TS_2 = \sqrt{N}\, \frac{CAAR}{s} \overset{a}{\sim} N(0, 1) \tag{1.21} \]

1.4.3 Cross-sectional Heteroskedasticity

The returns of some firms in the event study may be more volatile than the returns of other firms. For example, small-firm returns are typically more volatile than large-firm returns. This can make the AAR or CAAR sensitive to "outliers" from one or a few very volatile firms or events. There are two principal ways to deal with this:

1. Adjust the standard errors of the AAR and CAAR
2. Standardize the AR by firm, and then calculate average standardized abnormal returns

We explain the first method here; the second is explained in subsection 1.4.4. First, we need the firm-specific means and variances of the abnormal returns over the estimation window:

\[ \overline{AR}_i = \frac{1}{T} \sum_{t=T_1}^{T_2} AR_{it} \tag{1.22} \]

Then,

\[ s_i = \sqrt{\frac{1}{T_2 - T_1} \sum_{t=T_1}^{T_2} (AR_{it} - \overline{AR}_i)^2} \tag{1.23} \]

If the returns are independent across firms, the variance of $AAR_t$ is given by

\[ s^2_{AAR} = \frac{1}{N^2} \sum_{i=1}^{N} s_i^2 \implies s_{AAR} = \frac{1}{N} \sqrt{\sum_{i=1}^{N} s_i^2} \tag{1.24} \]

The t-test statistic for $AAR_t$ then is approximately

\[ TS = \frac{AAR_t}{s_{AAR}} \overset{a}{\sim} N(0, 1) \tag{1.25} \]

Notice the absence of $\sqrt{N}$ in this formula. For the CAAR, we get a similar test statistic:

\[ TS = \frac{CAAR}{\sqrt{t}\, s_{AAR}} \overset{a}{\sim} N(0, 1) \tag{1.26} \]

with $t = t_2 - t_1 + 1$ the number of days in the event period.
1.4.4 Standardization

The assumption that all abnormal returns are identically distributed is usually too strong. Especially the assumption that the variance of the abnormal returns is equal for all series $i$ (cross-sectional homoskedasticity, $\sigma_i^2 = \sigma^2$) is not likely to be true, simply because some stocks are more volatile than others. Including one or two very volatile stocks in the analysis might cause a large variation in the $AAR_t$ and hence a low power of the test. Therefore, it seems natural to use a weighted average of abnormal returns that puts a lower weight on abnormal returns with a high variance. A frequently used weight is the time-series estimate of the standard deviation of the abnormal returns, as obtained in equation (1.23). We now divide each stock's CAR by its firm-specific volatility to get the scaled (or standardized) CAR:

\[ SCAR_i = \frac{\sum_{t=t_1}^{t_2} AR_{it}}{s_i} = \frac{CAR_i}{s_i} \tag{1.27} \]

so that

\[ ASCAR = \frac{1}{N} \sum_{i=1}^{N} SCAR_i \tag{1.28} \]

Since we have scaled the abnormal returns already, our test statistic becomes somewhat easier:

\[ TS_4 = \sqrt{\frac{N}{t}}\, ASCAR \overset{a}{\sim} N(0, 1) \tag{1.29} \]

with $t = t_2 - t_1 + 1$ the number of days in the event period.

1.5 Some Problems with the tests

All tests discussed so far maintained the important assumption that the abnormal returns are uncorrelated, both serially and cross-sectionally (although they were allowed to be cross-sectionally heteroskedastic). Moreover, it was assumed that the variance of the returns in event periods is the same as the variance in non-event periods. Both assumptions are sometimes violated, especially when dealing with daily observations. In general, the following complications can arise when running event analyses:

- Event-induced variance
  - Higher variance at or around the event date
  - Remedy: cross-section estimators of the standard deviation are robust against this
- Event clustering
  - Multiple events in the same calendar time period
  - This induces cross-sectional correlation, which makes the t-test invalid
  - A good benchmark often solves this problem; if not, there are two solutions: average all returns of the same calendar day into a portfolio return and treat that as one observation in the t-test, or apply a crude dependence adjustment of the standard error (see 1.6)
- Non-normality of the return distribution
  - Skewness and outliers
  - Mainly a problem in small samples (N < 30)
  - Remedy: rank or sign tests perform better than the t-test (see 1.7.1 and 1.7.2)

1.6 Cross-sectional Dependence

So far, we assumed that the abnormal returns are uncorrelated between two different events, i.e. $\mathrm{Cov}(AR_{it}, AR_{jt}) = 0$ for all $i \neq j$. However, if there is event clustering (several events occur in the same calendar period), there may be cross-sectional correlation between abnormal returns. In that case, the variance of the average of $N$ abnormal returns is no longer equal to $1/N$ times the variance of a single return, but larger (if the correlation is positive, as it typically is). As a consequence, the usual variance estimator underestimates the variance of the average abnormal returns, the usual t-statistics are biased upward, and the null hypothesis is rejected too often. To deal with cross-sectional dependence, Brown and Warner (1980) propose the so-called crude dependence adjustment method. This method estimates the variance of the average abnormal returns directly from the time series of average abnormal returns in the estimation window:

\[ \bar{s} = \sqrt{\frac{1}{T-1} \sum_{t=T_1}^{T_2} (AAR_t - AR^*)^2} \tag{1.30} \]

where

\[ AAR_t = \frac{1}{N} \sum_{i=1}^{N} AR_{it} \qquad AR^* = \frac{1}{T} \sum_{t=T_1}^{T_2} AAR_t \]

with $T = T_2 - T_1 + 1$. The associated t-test statistic is

\[ TS_5 = \frac{AAR_t}{\bar{s}} \overset{a}{\sim} N(0, 1) \tag{1.31} \]

Notice the absence of $\sqrt{N}$ in the formula.
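A minimal sketch of the crude dependence adjustment (hypothetical simulated abnormal returns; in practice these come from the earlier benchmark-model step):

```python
import numpy as np

# Hypothetical inputs: abnormal returns in the estimation window (T x N)
# and in the event window (t x N).
rng = np.random.default_rng(3)
ar_est = rng.normal(0, 0.02, size=(200, 50))
ar_evt = rng.normal(0, 0.02, size=(50, 50))

# Crude dependence adjustment (Brown-Warner): estimate the std of AAR_t
# from the time series of estimation-window averages, eq. (1.30)
aar_est = ar_est.mean(axis=1)            # AAR_t over the estimation window
s_bar = aar_est.std(ddof=1)              # eq. (1.30)

# TS5 for each event day, eq. (1.31); note the absence of sqrt(N)
aar_evt = ar_evt.mean(axis=1)
ts5 = aar_evt / s_bar
```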
1.7 Conducting Event Studies With Small Cross-sections

All tests discussed so far invoke the central limit theorem to show that their distribution under the null is Standard Normal. However, sometimes very small cross-sections of events are used. Especially when using daily data, the underlying abnormal returns have very fat tails. The approximation of the test statistics' distributions by a Normal distribution may be very poor in such small cross-sections. Typically, because stock returns are fat-tailed, the critical values of the Normal distribution will be too small. Hence, one will reject the null hypothesis (1.14) too often.

Some authors argue that even in large samples the approximation by the normal distribution is poor. FFJR (Fama, Fisher, Jensen and Roll) claim that stock returns have distributions which resemble so-called sum-stable distributions. For such distributions, the variance does not exist and therefore the Central Limit Theorem does not apply. As a result, the t-tests are invalid, even asymptotically. To resolve these problems, non-parametric tests can be used. These tests are valid under very general distributional assumptions regarding the abnormal returns. Non-parametric tests may also be more robust to outliers and other data imperfections. In this section we discuss two non-parametric tests, the sign test and the rank test.

1.7.1 Sign Test

The sign test checks whether there are as many positive as negative abnormal returns on event dates (i.e. whether the distribution is symmetric around zero). The sign test statistic is based on the fraction of positive abnormal returns in the event period (denoted by $p$). Under the null, and if the return distribution is symmetric, the expectation of $p$ is 0.5. The test statistic

\[ TS_9 = 2\sqrt{N}(p - 0.5) \overset{a}{\sim} N(0, 1) \tag{1.32} \]

has a standard normal distribution under the null that the median of the abnormal returns is zero. Therefore, the sign test in this form only tests hypothesis (1.14), that the mean of the abnormal returns is zero, if the distribution of the abnormal returns is symmetric.

1.7.2 Rank Test

Sign tests suffer from a common weakness: they do not take the magnitude of the abnormal return into account. In contrast, the t-tests of the previous subsections are very sensitive to the magnitude of an abnormal return. The rank test proposed by Corrado (1989) is a non-parametric way to account for the magnitude of an abnormal return, but without the distributional assumptions needed to make the t-tests valid. The test works as follows. Denote the rank of the abnormal return $AR_{it}$ in the whole series (including the event period) of abnormal returns by $K_{it}$. This rank is transformed into the statistic

\[ U_{it} = \frac{K_{it}}{T} \tag{1.33} \]

which should be uniformly distributed under the null that event periods are not different from non-event periods. To test this hypothesis, define the following test statistic:

\[ TS_{11} = \sqrt{N} \left[ \frac{1}{N} \sum_{i=1}^{N} \frac{U_{it} - 0.5}{s_{ut}} \right] \overset{a}{\sim} N(0, 1) \tag{1.34} \]

with

\[ s_{ut} = \sqrt{\frac{1}{N-1} \sum_{i=1}^{N} (U_{it} - 0.5)^2} \tag{1.35} \]

where 0.5 is the expected value of $U_{it}$. The central limit theorem can be invoked to show that the test statistic (1.34) approximately follows a Normal distribution in large samples. Compared with the usual t-tests, the convergence to the Normal distribution of averages of ranks may be faster than that of averages of returns, especially when the latter have fat tails. It is therefore expected that the rank test gives better results in small cross-sections. The non-parametric tests were devised to mitigate one potential problem with the t-tests, namely non-normality. All the other potential problems mentioned before, such as event clustering and event-induced variance, remain. The adjustments to the tests used before in section 1.5, such as the crude dependence adjustment method, are in most cases also applicable to the rank test.
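A minimal sketch of both non-parametric tests (hypothetical simulated data; `t0` marks the event date within the full abnormal-return series):

```python
import numpy as np
from scipy.stats import rankdata

rng = np.random.default_rng(4)
ar = rng.normal(0, 0.02, size=(250, 20))   # full AR series (time, firms)
t0 = 200                                   # index of the event date
T, N = ar.shape

# Sign test, eq. (1.32): fraction of positive event-date ARs versus 0.5
p = (ar[t0] > 0).mean()
ts_sign = 2 * np.sqrt(N) * (p - 0.5)

# Corrado rank test, eqs. (1.33)-(1.35): rank each firm's AR within its
# own time series (column-wise), then rescale to U_it = K_it / T
U = rankdata(ar, axis=0) / T
u_evt = U[t0]                              # U_it on the event date
s_ut = np.sqrt(((u_evt - 0.5) ** 2).sum() / (N - 1))
ts_rank = np.sqrt(N) * (u_evt - 0.5).mean() / s_ut
```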
1.8 Long Horizon Event Studies

The discussion in the preceding sections focused on events that last for a fairly short time period. Most corporate events, for example mergers and takeovers, take only a few months, with most of the abnormal returns occurring in a few days around the announcement. There is a fairly large literature, however, that considers the long-term impact of corporate events. The prime example here is the study of equity offerings, both initial public offerings (IPOs) and seasoned equity offerings (SEOs). The literature in this area typically studies the returns of firms that have an IPO or SEO over a horizon of 36 to 60 months. In this section, we discuss some methodological issues in conducting such long-horizon event studies. We do not review the empirical literature on long-horizon event studies; a good reference paper for the empirical IPO and SEO literature is Ritter and Welch (2002).

1.8.1 Defining Abnormal Returns

The correction for market returns is typically sufficient for short-horizon event studies, but in long-horizon event studies the market model or the CAPM have several disadvantages. There are several well-known deviations from the CAPM, such as the size effect, the book-to-market effect and the momentum effect; see for example Bodie, Kane and Marcus (2005, Chapter 12). In long-horizon event studies, the Fama and French (1996) three-factor model is therefore often used as the benchmark model to generate normal returns. This model extends the market model with the returns on a "size" portfolio (SMB) and a "value" portfolio (HML):

\[ R_{it} - R_{ft} = \alpha_i + \beta_i (R_{mt} - R_{ft}) + \gamma_i SMB_t + \xi_i HML_t + \varepsilon_{it} \tag{1.36} \]

where SMB ("small minus big") is the difference in return between a portfolio of small firms and a portfolio of large firms, and HML ("high minus low") is the difference in return between a portfolio of firms with a high book-to-market ratio ("value" firms) and a portfolio of firms with a low book-to-market ratio ("growth" firms). The normal return for firm $i$ in period $t$ is then defined by

\[ NR_{it} = R_{ft} + \hat{\beta}_i (R_{mt} - R_{ft}) + \hat{\gamma}_i SMB_t + \hat{\xi}_i HML_t \tag{1.37} \]

The abnormal returns constructed from the three-factor model are not only more accurate than market-model-based abnormal returns, but they also show less cross-sectional correlation. IPOs are typically small growth firms, and will exhibit similar exposures to the size and value factors. Omitting these factors from the normal return benchmark model will lead to abnormal returns that are correlated across firms that have an IPO in the same month. Cross-sectional correlation is an important issue for the inference. As an alternative to the Fama-French three-factor model, Barber and Lyon (1997) advocate a non-parametric approach, where the benchmark return equals the return on a firm (or a portfolio return on a small group of firms) with similar size and book-to-market ratio. This approach is more flexible than the linear regression (1.36), but it needs more data to obtain the same power. In practice, therefore, the Fama-French three-factor model remains a popular benchmark in long-horizon event studies. The formal definition of the CAR in long-horizon event studies is

\[ CAR_i = \sum_{t=1}^{H} AR_{it} = \sum_{t=1}^{H} (R_{it} - NR_{it}) \tag{1.38} \]

where the event period runs from time $t = 0$ to $t = H$, with $H$ the holding period; $R_{it}$ is the return on firm $i$ in month $t$ after the event; and $NR_{it}$ the corresponding normal (or benchmark) return. The CAR methodology implicitly assumes a monthly rebalancing of the portfolio to an equal weighting of the return on event firms. In practice, however, many investors have longer holding periods than one month and do not rebalance their portfolios on a regular basis.
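A minimal sketch of the three-factor benchmark and the long-horizon CAR (all inputs are hypothetical simulated series; in practice the factors come from a data library and the loadings are estimated on a pre-event sample):

```python
import numpy as np

rng = np.random.default_rng(5)
H = 36                                                # holding period in months
mkt, smb, hml = rng.normal(0.005, 0.04, (3, 120))     # estimation-sample factors
r_est = 0.002 + 1.1 * mkt + 0.4 * smb - 0.2 * hml + rng.normal(0, 0.05, 120)

# Estimate the three-factor loadings of eq. (1.36) by OLS on excess returns
X = np.column_stack([np.ones(120), mkt, smb, hml])
alpha, beta, gamma, xi = np.linalg.lstsq(X, r_est, rcond=None)[0]

# Normal and abnormal returns over the post-event horizon, eqs. (1.37)-(1.38)
mkt_e, smb_e, hml_e = rng.normal(0.005, 0.04, (3, H))
rf_e = np.full(H, 0.002)
r_evt = rf_e + 1.1 * mkt_e + 0.4 * smb_e - 0.2 * hml_e + rng.normal(0, 0.05, H)
nr = rf_e + beta * mkt_e + gamma * smb_e + xi * hml_e
car = np.sum(r_evt - nr)                              # eq. (1.38)
```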
An alternative approach to performance measurement is therefore to construct returns until the end of the event period (typically 36 or 60 months) without rebalancing. These returns are so-called buy-and-hold abnormal returns, or BHARs. Barber and Lyon (1997) give the formal definition of the buy-and-hold abnormal return:

\[ BHAR_i = \prod_{t=1}^{H} [1 + R_{it}] - \prod_{t=1}^{H} [1 + NR_{it}] \tag{1.39} \]

The distribution of these BHARs is much more skewed than the distribution of the CAR, because over such a long period typically a few firms have extremely high returns, whereas the majority of firms have moderate or even negative returns. However, in large samples this skewness is not a problem: by the Central Limit Theorem, t-statistics based on the BHAR are also approximately Normally distributed. In smaller samples, however, this skewness may create problems.

1.8.2 Calendar Time Abnormal Returns

A third way to test the significance of long-horizon event returns is the "calendar time returns" approach of Jaffe (1974) and Mandelker (1974), advocated by Fama (1998). This approach first constructs a time series of portfolio returns, where for each month the portfolio consists of all firms that had an IPO in the last $H$ months (typically, $H = 36$ or $H = 60$). If there is a month in which no single firm had an IPO in the previous $H$ months, the portfolio return is set equal to the risk-free return. This procedure gives a monthly time series of event portfolio returns, denoted by $R_{pt}$. This return is then regressed on the three Fama-French factors:

\[ R_{pt} - R_{ft} = \alpha_p + \beta_p (R_{mt} - R_{ft}) + \gamma_p SMB_t + \xi_p HML_t + \varepsilon_{pt} \tag{1.40} \]

The intercept of this regression, $\alpha_p$, measures the abnormal performance with respect to the three-factor benchmark. The significance of the abnormal performance can be tested by the t-test on $\alpha_p$ in this regression. The advantage of the calendar time returns approach over the usual event study method, which uses event-time CARs or BHARs, is that cross-sectional and serial correlation are not a problem.
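A minimal sketch of the calendar-time test (the portfolio return series is simulated here; in practice $R_{pt}$ would be built from actual event-firm portfolios as described above):

```python
import numpy as np

rng = np.random.default_rng(6)
T = 300                                   # months of calendar time
mkt, smb, hml = rng.normal(0.005, 0.04, (3, T))
rp_excess = 0.002 + 0.9 * mkt + 0.5 * smb + 0.1 * hml + rng.normal(0, 0.03, T)

# Regress the event-portfolio excess return on the three factors, eq. (1.40);
# the intercept alpha_p measures abnormal performance.
X = np.column_stack([np.ones(T), mkt, smb, hml])
coef = np.linalg.lstsq(X, rp_excess, rcond=None)[0]
alpha_p = coef[0]

# t-statistic of alpha_p from the usual OLS covariance matrix
resid = rp_excess - X @ coef
s2 = resid @ resid / (T - X.shape[1])
cov = s2 * np.linalg.inv(X.T @ X)
t_alpha = alpha_p / np.sqrt(cov[0, 0])
```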
2 Panel Data

Panel data have both a time-series and a cross-sectional dimension: they measure the same collection of people, firms, assets, or objects over several periods. Econometrically, the simplest setup is

\[ y_{it} = \alpha + \beta_1 x_{1,it} + \cdots + \beta_k x_{k,it} + u_{it} \quad \text{with } t \in \{1, \ldots, T\} \text{ and } i \in \{1, \ldots, N\} \]

so that we have $T$ dates and $N$ cross-sectional objects. For example, $y_{it}$ could be the stock return of firm $i$ in year $t$. When dealing with panel data we need variation both over time and in the cross-section; if there is little time variation in the variables, just do a cross-sectional regression. By structuring the model in an appropriate way, we can remove the impact of certain forms of omitted variables bias in regression results. Using panel data can also help to study how the relationships between objects change over time.

2.1 Estimation Methods

There are mainly five types of estimation methods:

1. Pooled OLS
2. Seemingly unrelated regressions (SUR)
3. Fixed effects estimator
4. Random effects estimator
5. Fama-MacBeth estimator

These approaches differ primarily in how the constant term and slope coefficients vary in the cross-section ($i$) and over time ($t$). We are going to focus (mainly) on three of them:

1. Pooled OLS: no heterogeneity in $\alpha$ and $\beta$; they remain constant
2. Fixed effects estimator: the constant term can vary across $i$ ($\alpha_i$); very popular in corporate finance/banking
3. Fama-MacBeth estimator: constant and slope vary over time ($\alpha_t$ and $\beta_t$); very popular for models of asset returns

2.1.1 Pooled OLS

The simplest way to deal with panel data is to estimate a single, pooled regression on all the observations together: simply use all observations across $i$ and $t$ in a single regression. Pooling the data in this way assumes that the constant term and slope coefficients do not vary across $i$ and $t$:

\[ y_{it} = \alpha + \beta_1 x_{1,it} + \cdots + \beta_k x_{k,it} + u_{it} \]

This approach is advisable in the case of a small sample (both $N$ and $T$ small); in that case it is simply hard to reliably estimate a more general model. The model pools observations together from different time periods, ignoring that they belong to specific firms. As a result, all observations are treated as if they come from a single group, effectively combining the panel data into a cross-sectional data set. The model coefficients can then be estimated using OLS.

2.1.2 Fixed Effects Model

Often used in finance. The slope coefficients are the same across $i$ (firms), but the constant term is allowed to differ across $i$:

\[ y_{it} = \alpha_i + \beta_1 x_{1,it} + \cdots + \beta_k x_{k,it} + u_{it} \]

One can think of $\alpha_i$ as capturing all (omitted) variables that affect $y_{it}$ cross-sectionally but do not vary over time: for example, the sector that a firm operates in, a person's gender, or the country where a bank has its headquarters. A general worry with regressions is omitted variables (i.e. anything you do not observe). By including fixed effects you control for unobservable differences across the units (firms, individuals) that you analyze, as long as these are roughly constant over time. The downside of including fixed effects: you cannot identify the effects of variables that are constant over time.

Fixed effects example. We will use the results of Murphy (1985), "Corporate performance and managerial remuneration: An empirical analysis", Journal of Accounting and Economics. Does executive compensation increase after good firm performance? And if so, by how much? The challenge is that many (omitted) variables may affect compensation.

In order to estimate this model we can incorporate $N$ dummy variables:

\[ y_{it} = \alpha_1 D1_i + \alpha_2 D2_i + \cdots + \alpha_N DN_i + \beta_1 x_{1,it} + \cdots + \beta_k x_{k,it} + u_{it} \]

Here, for example, $D1_i$ is a dummy variable equal to 1 for observations on the first entity (e.g., the first firm) in the sample and zero otherwise. This is just a standard regression model and can be estimated using OLS. However, it implies $N + k$ parameters to estimate, which is not practical if $N$ is large, so we can use the "within" transformation instead. First, take the time-series mean of each entity (firm):

\[ \bar{y}_i = \frac{1}{T} \sum_{t=1}^{T} y_{it} \]

Then subtract this value from the observations $y_{i1}, \ldots, y_{iT}$ (and do the same for all explanatory variables). Note that such a regression does not require an intercept term, since the demeaned dependent variable has zero mean by construction. The model containing the demeaned variables is

\[ y_{it} - \bar{y}_i = \beta (x_{it} - \bar{x}_i) + u_{it} - \bar{u}_i \tag{2.1} \]

This model can be estimated using OLS, and the estimates are identical to those from the approach with $N$ dummy variables.
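A minimal sketch of the within transformation on a simulated panel (all names hypothetical), illustrating why fixed effects matter when the regressor is correlated with the unobserved firm effect:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
N, T = 100, 10
firm = np.repeat(np.arange(N), T)
alpha_i = rng.normal(0, 1, N)                      # unobserved firm effects
x = rng.normal(0, 1, N * T) + 0.5 * alpha_i[firm]  # x correlated with the effect
y = alpha_i[firm] + 2.0 * x + rng.normal(0, 1, N * T)
df = pd.DataFrame({"firm": firm, "x": x, "y": y})

# Within transformation (eq. 2.1): demean y and x by firm, then OLS
# without an intercept on the demeaned variables.
dm = df[["x", "y"]] - df.groupby("firm")[["x", "y"]].transform("mean")
beta_fe = (dm["x"] * dm["y"]).sum() / (dm["x"] ** 2).sum()
# beta_fe is close to the true slope 2.0; pooled OLS without fixed effects
# would be biased upward here because x correlates with alpha_i.
```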
It is also possible to have a time-fixed effects model rather than an entity-fixed effects model. If we think that the average value of $y_{it}$ changes over time but not cross-sectionally, we can write the time-fixed effects model as

\[ y_{it} = \alpha + \lambda_t + \beta_1 x_{1,it} + \cdots + \beta_k x_{k,it} + u_{it} \]

where $\lambda_t$ is a time-varying intercept. For example, if the regulatory environment or the tax rate changes halfway through the sample period, this change may influence $y$, but in the same way for all firms. Time fixed effects can be allowed for by adding $T$ time dummies, where $D1_t$, for example, denotes a dummy variable that takes the value 1 in the first time period and zero elsewhere, and so on. Similarly, to avoid estimating a model containing all $T$ dummies, a within transformation can be conducted. Now take the cross-sectional average at time $t$,

\[ \bar{y}_t = \frac{1}{N} \sum_{i=1}^{N} y_{it} \]

and subtract it from each observation:

\[ y_{it} - \bar{y}_t = \beta (x_{it} - \bar{x}_t) + u_{it} - \bar{u}_t \tag{2.2} \]

It is also possible to allow for both entity fixed effects and time fixed effects within the same model; such a model would contain both cross-sectional and time dummies. By including time fixed effects, you control for any common time-series variation in the (dependent) variables. But there is a drawback: you can no longer identify the effect of a variable that only varies over time (constant in the cross-section), for example market-wide financial or macro variables.

2.1.3 Briefly: The Random Effects Model

As with fixed effects, the random effects approach proposes different intercept terms for each entity, and again these intercepts are constant over time. However, under the random effects model, the intercepts for each cross-sectional unit are assumed to arise from:

- a common intercept $\alpha$ (the same for all cross-sectional units and over time),
- plus a random variable $\varepsilon_i$ that varies cross-sectionally but is constant over time.

We can write the random effects panel model as

\[ y_{it} = \alpha + \beta x_{it} + \omega_{it} \quad \text{with } \omega_{it} = \varepsilon_i + \nu_{it} \tag{2.3} \]

Heterogeneity (variation) in the cross-sectional dimension occurs via the $\varepsilon_i$ terms (and not via dummies). This framework requires the assumptions that the new cross-sectional error term, $\varepsilon_i$:

- has zero mean and constant variance,
- is independent of the individual observation error term $\nu_{it}$,
- is independent of the explanatory variables.

The parameters are estimated consistently but inefficiently by OLS; a generalized least squares (GLS) procedure is more efficient. The random effects model has two main advantages:

1. Fewer parameters to estimate compared to fixed effects (especially important if T is small)
2. One can still estimate the effect of variables that are constant over time

But this comes at the cost of additional assumptions on the error term. Hence, the fixed effects model is more robust as it uses fewer assumptions, but it may work less well in small samples (it creates noise).

2.2 Standard Errors and Clustering

The simplest case is when the error term $u_{it}$ is independent across entities/firms $i$ and over time $t$; then you can simply apply the standard OLS standard errors. But this is typically not the case for financial and economic data. Often the error terms across individuals (firms, households, etc.) are positively correlated (even if you include fixed effects), and the error terms over time are positively correlated (persistence). If you neglect these positive correlations, the standard errors are usually too low. Intuition: if you have positively correlated observations, you have less information in the data compared to the case with no correlations. Thus, we need to correct the standard errors for such patterns by using clustered standard errors. Error terms are then allowed to be correlated within each cluster, but assumed not to be correlated across clusters. For example, consider a panel with observations for $N$ firms and $T$ years.

If you cluster standard errors by firm, each firm is a cluster and you allow for time-series correlation:

- Error terms within each firm are allowed to be correlated across time: $\mathrm{Cov}(u_{it}, u_{is}) \neq 0$
- But there is no correlation across firms: $\mathrm{Cov}(u_{it}, u_{jt}) = 0$

If you cluster standard errors by year, each year is a cluster and you allow for cross-sectional correlation:

- Error terms within each year are allowed to be correlated across firms: $\mathrm{Cov}(u_{it}, u_{jt}) \neq 0$
- But there is no correlation across time: $\mathrm{Cov}(u_{is}, u_{it}) = 0$

One can also cluster by firm and year (double or two-way clustering):

- $\mathrm{Cov}(u_{it}, u_{is}) \neq 0$: correlation across years within a firm
- $\mathrm{Cov}(u_{it}, u_{jt}) \neq 0$: correlation across firms within a year
- $\mathrm{Cov}(u_{is}, u_{jt}) = 0$: no correlation across both year and firm

Usually, two-way clustering is used, unless there are good reasons why there are no correlations for some type of cluster. If $N$ or $T$ is small, two-way clustering might lead to noisy standard errors; it is not clear what is best in that case. Importantly, fixed effects (or time-fixed effects) do not take away the need for clustering: fixed effects take out "average" effects, not correlations across time or firms.
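A minimal sketch of one-way clustering by firm, using statsmodels' cluster-robust covariance on a simulated panel with persistent within-firm errors (all names and parameter values hypothetical):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(8)
N, T = 100, 10
firm = np.repeat(np.arange(N), T)
x = rng.normal(0, 1, N * T)
u = rng.normal(0, 1, (N, T)).cumsum(axis=1).ravel()   # persistent within-firm errors
y = 1.0 + 2.0 * x + u

X = sm.add_constant(x)
ols = sm.OLS(y, X).fit()                              # plain OLS standard errors
by_firm = sm.OLS(y, X).fit(cov_type="cluster", cov_kwds={"groups": firm})
# by_firm.bse is typically larger than ols.bse here, because the errors are
# serially correlated within firms and plain OLS standard errors ignore that.
```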
2.3 Panel models for asset returns: the cross-sectional approach

In many investment analyses, one has panel data of returns: for example, monthly returns from 1960 to 2023 for all firms in the U.S. stock market, or return data for a set of portfolios over time. Such data are used to estimate and test asset pricing models like the CAPM, to document asset pricing anomalies (size, momentum, value), and to study whether some variables predict returns. The CAPM establishes that

\[ \mathrm{E}[R_i] = R_f + \beta_i (\mathrm{E}[R_m] - R_f) \]

The expected return on stock $i$ is equal to the risk-free rate plus a risk premium. The risk premium is equal to the market risk premium, $\mathrm{E}[R_m] - R_f$, multiplied by the risk exposure of the stock, known as 'beta', $\beta_i$. Beta is not observable and must be estimated, and hence tests of the CAPM are usually done in two steps:

- Step 1: estimate the stock betas
- Step 2: analyze whether average returns indeed increase with stock beta (in the cross-section)

In order to estimate $\beta_i$ we have two equivalent ways:

\[ \hat{\beta}_i = \frac{\mathrm{Cov}(R_i^e, R_m^e)}{\mathrm{V}[R_m^e]} \qquad \text{or} \qquad R^e_{i,t} = \alpha_i + \beta_i R^e_{m,t} + u_{i,t} \]

For the second step we need to run the following cross-sectional regression:

\[ \bar{R}_i^e = \lambda_0 + \lambda_1 \hat{\beta}_i + \nu_i \]

where $\bar{R}_i^e$ is the excess return on portfolio $i$ averaged over the whole period. According to the CAPM,

\[ \lambda_0 = 0 \qquad \text{and} \qquad \lambda_1 = \mathrm{E}[R_m] - R_f \]

With the empirical analysis we can test whether these values for the $\lambda$'s hold. For long samples of U.S. stock returns, it turns out that the empirical relation between CAPM beta and average return is negative, contrary to the economic intuition given by the CAPM. But there are problems with this cross-sectional approach:

- It does not allow for time variation in the explanatory variables: the CAPM beta of a firm can vary over time, and size, book-to-market, and other variables do vary over time
- It assumes that investors know the CAPM beta (and other variables) when pricing the stocks at each point in time
- The usual OLS standard errors are wrong and need to be corrected for correlation across the $n$ portfolio returns

The solution to the above problems is the Fama-MacBeth approach.

2.4 Panel models for asset returns: the Fama-MacBeth procedure

Instead of one single cross-sectional regression, run a cross-sectional regression each month:

\[ R^e_{i,t+1} = \lambda_{1,t} + \lambda_{2,t} \hat{\beta}_{i,t} + \lambda_{3,t} BTM_{i,t} + \lambda_{4,t} MV_{i,t} + \nu_{i,t+1} \]

Here, $\hat{\beta}_{i,t}$ is an estimate of the CAPM beta using information up to month $t$. Often, researchers use a rolling window, e.g.
the last 60 months of returns, to estimate this beta. We then obtain a time series of coefficient estimates $\hat{\lambda}_{1,t}, \hat{\lambda}_{2,t}, \ldots$, with $t = 1, \ldots, T_{FMB}$. The next step in the FMB procedure is simply to take the average of the coefficients over time: do this for each coefficient separately and do a t-test on this average, using the standard error of the average. Perhaps we want to use White standard errors and/or correct for serial correlation using Newey-West (discussed later). The average value of each lambda over $t$ can be calculated as

\[ \hat{\lambda}_j = \frac{1}{T_{FMB}} \sum_{t=1}^{T_{FMB}} \hat{\lambda}_{j,t} \qquad j = 1, \ldots, k \tag{2.4} \]

In our case $k = 4$, since we have 4 parameters to estimate. $T_{FMB}$ is the number of cross-sectional regressions used in the second stage of the test, and the $\hat{\lambda}_j$ are the four different parameters. The standard deviation is

\[ \hat{\sigma}_j = \sqrt{\frac{1}{T_{FMB} - 1} \sum_{t=1}^{T_{FMB}} \left( \hat{\lambda}_{j,t} - \hat{\lambda}_j \right)^2} \tag{2.5} \]

The t-test statistic is then simply

\[ \sqrt{T_{FMB}}\, \frac{\hat{\lambda}_j}{\hat{\sigma}_j} \sim t_{T_{FMB}-1} \tag{2.6} \]

which follows a t-distribution with $T_{FMB} - 1$ degrees of freedom. The FMB procedure automatically corrects for correlation of the error term across portfolios. To summarize, the main advantages of the Fama-MacBeth procedure over the simple cross-sectional approach are:

- Betas are not assumed to be constant over time
- It does not use forward-looking information: every investor uses information up to each date $t$
- It automatically corrects for correlation of the error term across portfolios
- It allows you to study time variation in the lambda estimates

2.5 The Fama-French (1993) Approach

In multi-factor models, stock returns depend on multiple risk factors, and each factor has its own risk premium. Fama and French (1993) propose to use return factors, leading to the famous FF three-factor model. The Fama-French (1993) first-step time-series regression is run separately on each portfolio $i$:

\[ R^e_{i,t} = \alpha_i + \beta_{i,MKT} MKT_t + \beta_{i,SMB} SMB_t + \beta_{i,HML} HML_t + \varepsilon_{i,t} \tag{2.7} \]

where $R^e_{i,t}$ is the (excess) return on portfolio $i$ at time $t$, and where:

- MKT is the market index return minus the risk-free rate
- SMB is the difference in returns of a portfolio of small stocks and a portfolio of large stocks ('Small Minus Big')
- HML is the difference in returns of a portfolio of value stocks and a portfolio of growth stocks ('High Minus Low')

We now have three betas: three forms of risk exposure (factor loadings). The second stage in this approach is to use the factor loadings from the first stage as explanatory variables in cross-sectional regressions, Fama-MacBeth style:

\[ R^e_{i,t+1} = \lambda_{1,t} + \lambda_{2,t} \hat{\beta}_{i,MKT,t} + \lambda_{3,t} \hat{\beta}_{i,SMB,t} + \lambda_{4,t} \hat{\beta}_{i,HML,t} + \nu_{i,t+1} \tag{2.8} \]

We can interpret the second-stage regression parameters (the $\lambda$'s) as factor risk premia that show the amount of extra return generated from taking on an additional unit of that source of risk. This is an example of an APT (Arbitrage Pricing Theory) model.
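A minimal sketch of the second stage of the Fama-MacBeth procedure (the first-stage betas are simply taken as given here rather than estimated on rolling windows, and only one characteristic is used; all names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(9)
T_fmb, N = 240, 100
betas = rng.normal(1.0, 0.3, N)             # first-stage betas, taken as given
lam_true = 0.005
r_excess = lam_true * betas + rng.normal(0, 0.05, size=(T_fmb, N))

# Second stage: one cross-sectional regression per month of R^e_{i,t}
# on the betas; collect the slope lambda_{2,t} each month.
X = np.column_stack([np.ones(N), betas])
lambdas = np.array([np.linalg.lstsq(X, r_excess[t], rcond=None)[0][1]
                    for t in range(T_fmb)])

# Time-series average, standard deviation and t-test, eqs. (2.4)-(2.6)
lam_bar = lambdas.mean()
sigma_hat = lambdas.std(ddof=1)
t_stat = np.sqrt(T_fmb) * lam_bar / sigma_hat
```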
3 Time-series Data

What are time-series data? Observations of a variable or several variables over time. Time-series observations are often correlated over time (denoted "serial correlation" or "autocorrelation"), and time series may include trends and seasonality. What are the main differences compared to cross-sectional data?

- Notation: in time series, denote each observation by $t$ and the total number of observations by $T$; in cross-sections, denote each observation by $i$ and the total number of observations by $N$
- Cross-sectional data are usually a sample drawn from a population
- A time-series data sample is determined by the frequency (day, week, month, ...) and by the starting and end dates of the sample time period; the choice of sample period is often determined by data availability

In which ways are time series used in finance? There are many applications; the three main categories are:

1. Estimating, testing and using asset pricing models (such as the CAPM)
2. Establishing (causal) relations between financial and economic variables
3. Forecasting financial market variables

3.1 Revisit the Classical Linear Regression Model with Time-series Data

We need to tighten the classic assumptions of the Classical Linear Regression Model (CLRM) a little. Let $x$ be the set of all $x_t$, so that $x = \{x_1, \ldots, x_T\}$; then we assume:

- [A1] Linear Model: $y_t = \alpha + \beta x_t + u_t$
- [A2] No Autocorrelation: $\mathrm{Cov}(u_t, u_{t+j} \mid x) = 0 \quad \forall\, j \neq 0$
- [A3] Sample Variation: $\mathrm{V}[x] > 0$
- [A4] No Endogeneity: $\mathrm{E}[u_t \mid x] = 0 \implies \mathrm{Cov}(u_t, x_t) = 0$
- [A5] Homoskedasticity: $\mathrm{V}[u_t \mid x] = \sigma^2 < \infty$
- [A6] Normality: $u_t \sim N(0, \sigma^2)$

So we assume that no past or future values of $x_t$ (contained in $x$) are informative about $u_t$. If these assumptions hold, one can simply apply OLS estimation as before.

3.1.1 Tricky Assumptions

Remember the no-endogeneity assumption:

\[ \mathrm{E}[u_t \mid x] = 0 \]

Often $x$ is not exogenous, but jointly determined at the same time as $y$. In this case it is hard to establish a causal relation from $x$ to $y$; OLS then establishes correlation, but not causation. Sometimes the goal is only to estimate correlations, for example the CAPM beta regression of a stock return on the stock market return. Then we have the no-autocorrelation assumption,

\[ \mathrm{Cov}(u_t, u_{t+j} \mid x) = 0 \quad \forall\, j \neq 0 \]

which states that the serial correlation (or autocorrelation) of the error term is zero: the error term at time $t$ is uncorrelated with the error term at any other time $t + j$. The good news: we can test this assumption, and also correct for the presence of nonzero serial correlation. Do not underestimate the homoskedasticity (and finite variance) assumption,

\[ \mathrm{V}[u_t \mid x] = \sigma^2 < \infty \]

The important point here is that the variance should be finite; for this, we need the time-series variable to be stationary. We will go deeper into this later. Then we have the Normality assumption,

\[ u_t \sim N(0, \sigma^2) \]

This assumption matters for hypothesis testing. Often, financial variables such as asset returns exhibit fat tails: a higher chance of extreme outcomes, and in particular extreme crashes (a low probability of a very negative return). We can test this assumption, and it can be relaxed: if the sample size becomes very large, one can effectively drop the normality assumption. But in small samples non-normality can be problematic.

3.2 The Normality Assumption

The assumption of Normality is needed for hypothesis testing. A very important result: if the sample size gets larger and larger, we can drop the normality assumption even for hypothesis testing. Formally, if assumptions [A1] to [A5] hold (so without normality) and the sample size $T$ is large, then

\[ \frac{\hat{\beta} - \beta}{se(\hat{\beta})} \overset{a}{\sim} N(0, 1) \]

and the approximation error goes to zero as $T$ goes to infinity. Recall that if we assume normality, we have

\[ \frac{\hat{\beta} - \beta}{se(\hat{\beta})} \sim t_{T-k-1} \]

Indeed, note that as $T$ gets large, the t-distribution $t_{T-k-1}$ converges to the Standard Normal distribution. In other words, if we have many observations, we can drop the assumption that the error term is normal and still do hypothesis testing (t-tests, F-tests, p-values) in the same way as before.

3.2.1 What if your sample is small?

Basically we want to follow a two-step procedure:

1. First step: test for Normality
2. Second step: if Normality is rejected, this can sometimes be resolved

In order to test for Normality we can use the Bera-Jarque Normality test. Skewness ($b_1$) and kurtosis ($b_2$) are the (standardised) third and fourth moments of a distribution. A Normal distribution is not skewed and has a kurtosis of 3, so that the excess kurtosis ($b_2 - 3$) is zero. We want to test whether the skewness and excess kurtosis are jointly zero. Skewness and kurtosis can be expressed respectively as

\[ b_1 = \frac{\mathrm{E}[u^3]}{\sigma^3} \qquad \text{and} \qquad b_2 = \frac{\mathrm{E}[u^4]}{\sigma^4} \]

The Bera-Jarque test statistic is given by

\[ W = T \left[ \frac{b_1^2}{6} + \frac{(b_2 - 3)^2}{24} \right] \sim \chi^2_2 \]

with two degrees of freedom, since two restrictions (zero skewness and zero excess kurtosis) are tested jointly. We estimate $b_1$ and $b_2$ using the residuals from the OLS regression, $\hat{u}$.

What do we do if we find evidence of non-Normality in a small sample? It is not obvious. Sometimes one can transform the dependent variable (for example, transform the size of firms to the log of size). Another thing to pay attention to is outliers in the residuals: often one or two very extreme residuals cause us to reject the normality assumption. Then we can winsorize the data: put a lower and an upper bound on a variable (say the 1% and 99% percentiles) so that outliers are cut off. Alternatively, we can use dummy variables: if we estimate a monthly model of asset returns over 1980-1990, plot the residuals, and find a large outlier, e.g. in October 1987, we can add a dummy for that observation. In general, even if Normality is not rejected, in small samples one has to be careful when drawing conclusions based on OLS estimates and standard errors. The modern approach is to show robustness of an OLS model towards several modeling and data choices (use subsets of observations and/or explanatory variables, use different transformations of variables, use different approaches to deal with outliers, use data from another country/market).

3.3 How to adjust standard errors for serial correlation?

We need to check whether $\mathrm{Cov}(u_t, u_{t+j} \mid x) = 0$ for all $j \neq 0$. Obviously we never observe the "true" $u$'s, so we use their sample counterparts, the residuals $\hat{u}_t$, and calculate $\mathrm{Corr}(\hat{u}_t, \hat{u}_{t+j})$. Some stereotypical patterns we may find when plotting the residuals (the relationship between $\hat{u}_{t+1}$ and $\hat{u}_t$, or the residuals over time) are illustrated in three figures, not reproduced here:

[Figure 1: positive autocorrelation is indicated by a cyclical residual plot over time.]
[Figure 2: negative autocorrelation is indicated by an alternating pattern where the residuals cross the time axis more frequently than if they were distributed randomly.]
[Figure 3: no pattern in the residuals at all: this is what we would like to see.]

We can test for autocorrelation with the Breusch-Godfrey test. To test for $r$th-order autocorrelation, consider the auxiliary regression

\[ u_t = \rho_1 u_{t-1} + \rho_2 u_{t-2} + \cdots + \rho_r u_{t-r} + \nu_t \quad \text{with } \nu_t \sim N(0, \sigma^2_\nu) \]

We want all the $\rho$'s to be jointly equal to zero; hence, the null hypothesis is

\[ H_0 : \rho_1 = \rho_2 = \cdots = \rho_r = 0 \]

Then we follow a three-step approach:

1. Estimate the linear regression using OLS and obtain the residuals
2. Regress $\hat{u}_t$ on all of the regressors from stage 1 (the $x$'s) plus $\hat{u}_{t-1}, \hat{u}_{t-2}, \ldots, \hat{u}_{t-r}$, and obtain the $R^2$ from this regression
3. It can be shown that $(T - r)R^2 \sim \chi^2_r$; if the test statistic exceeds the critical value from the statistical tables, reject the null
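A minimal sketch of the Breusch-Godfrey procedure, implemented by hand on simulated data with AR(1) errors (all names and parameter values hypothetical):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(10)
T, r = 200, 3                       # sample size, autocorrelation order tested

# Simulated data: y regressed on x, with AR(1) errors so H0 should be rejected
x = rng.normal(0, 1, T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = 0.5 * u[t - 1] + rng.normal(0, 1)
y = 1.0 + 2.0 * x + u

# Step 1: OLS of y on x, keep the residuals
X = np.column_stack([np.ones(T), x])
uhat = y - X @ np.linalg.lstsq(X, y, rcond=None)[0]

# Step 2: regress uhat on x and r lags of uhat (usable observations t = r..T-1)
lags = np.column_stack([uhat[r - j - 1:T - j - 1] for j in range(r)])
Z = np.column_stack([np.ones(T - r), x[r:], lags])
resid = uhat[r:] - Z @ np.linalg.lstsq(Z, uhat[r:], rcond=None)[0]
r2 = 1 - resid.var() / uhat[r:].var()

# Step 3: (T - r) * R^2 is chi-squared with r degrees of freedom under H0
stat = (T - r) * r2
pval = 1 - stats.chi2.cdf(stat, df=r)
```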
What if the residuals do have autocorrelation? Under the assumptions of covariance stationarity and weak dependence, OLS is still unbiased, but not efficient. What are covariance stationarity and weak dependence?

- Covariance stationarity: $\mathrm{Cov}(u_t, u_{t+j})$ depends only on $j$ and tends to zero as $j$ becomes large
- Weak dependence: $\mathrm{E}[u_t] = 0$ and $\mathrm{V}[u_t] = \sigma^2 < \infty$

To continue: if the residuals have autocorrelation, the OLS standard error estimates are inappropriate. If the autocorrelation is positive (as is often the case), the standard errors are likely to be too low relative to their "correct" value. Intuition: if the residuals have positive correlation, a given sample of $T$ observations is less informative than in the case where the residuals are uncorrelated. The most common approach to correct for autocorrelation of the residuals is the Newey-West correction of the standard errors. An alternative approach: if there is autocorrelation in the residuals, apparently the model misses something, and we can try to extend the model, for example by including lagged effects.

3.3.1 Newey-West: HAC Standard Errors

Denoting by $se_{OLS}$ the regular OLS standard error, the Newey-West standard error equals

\[ se^2_{NW} = se^2_{OLS} + \frac{2}{T} \sum_{i=1}^{m} \underbrace{\left(1 - \frac{i}{m+1}\right)}_{\text{less weight on higher lags}} \underbrace{\frac{1}{T} \sum_{t=i+1}^{T} \hat{u}_t \hat{u}_{t-i}}_{\text{estimate of autocovariance } \mathrm{Cov}(u_t, u_{t-i}) \text{ of lag } i} \tag{3.1} \]

The main point of the correction is to add covariance terms to the standard error, correcting for the otherwise too-low estimate. The Newey-West correction also corrects for heteroskedasticity; the resulting standard errors are often referred to as "heteroskedasticity and autocorrelation consistent" (HAC) standard errors. Newey-West standard errors can be poorly behaved if there is substantial serial correlation and the sample size $T$ is small. The lag number $m$ must be chosen by the researcher; an often-used rule of thumb is $m = 0.75 \cdot T^{1/3}$. Still, HAC standard errors are used very often.

3.3.2 Including Lagged Effects

The model we have considered so far was static:

\[ y_t = \beta_0 + \beta_1 x_{1t} + \beta_2 x_{2t} + \cdots + \beta_k x_{kt} + u_t \]

We can easily extend this to the case where the current value of $y_t$ depends on previous values of the $x$'s, e.g.

\[ y_t = \beta_0 + \beta_1 x_{1t} + \cdots + \beta_k x_{kt} + \gamma_1 x_{1,t-1} + \cdots + \gamma_k x_{k,t-1} + u_t \]

Why might we want or need to include lags in a regression?

- Inertia of the dependent variable / delayed response due to illiquidity: for example, household or central bank decisions are usually taken with some lag; the interest rate might respond slowly to inflation (central bank inertia / waiting to see whether inflation persists)
- Overreaction / underreaction
- To reduce serial correlation (autocorrelation) of the error term

One issue with this method is that with lagged effects OLS may become biased (but it is still consistent, that is, it approaches the true parameters as the sample becomes very large, i.e. it converges in probability).

3.4 Is measurement error in the variables an issue?

Measurement error in one or more of the explanatory variables can lead to bias in the OLS estimates; this is sometimes known as the errors-in-variables problem. Measurement errors can occur in a variety of circumstances:

- Macroeconomic variables are almost always estimated quantities (GDP, inflation, and so on), as is most information contained in company accounts
Measurement errors can occur in a variety of circumstances Macroeconomic variables are almost always estimated quantities (GDP, inflation, and so on), as is most information contained in company accounts 22 Financial market prices may contain noise if markets are illiquid Sometimes we have to estimate explanatory variables Consider a model with one explanatory variable yt = β1 + β2 xt + ut Suppose that xt is measured with error: Instead of observing true value, we observe a noisy version, x̃t , which is equal to the true xt plus a nois, νt , that is independent of xt and ut x̃t = xt + νt Combining the two equations: yt = β1 + β2 (x̃t − νt ) + ut =⇒ yt = β1 + β2 x̃t + (ut − β2 νt ) this is the regression we actually run. We can observe that x̃t and the composite error term (ut − β2 νt ) are correlated since both depends on νt. This causes the parameters to be estimated inconsistently. Indeed, starting from the definition for βOLS (notice that the covariance of the constant β1 is always zero with the other terms so we don’t include it in the covariance) Cov(yt , x̃t ) Cov(β2 xt + ut , xt + νt ) = V [x̃t ] V [xt + νt ] Cov(β2 xt , xt ) + Cov(ut , xt ) + Cov(β2 xt , νt ) + Cov(ut , νt ) = V [xt ] + V [νt ] β2 V [xt ] = V [xt ] + V [νt ] V [xt ] = β2 · V [xt ] + V [νt ] | {z }