Lecture on Panel Data and Fixed Effects
Magnus Carlsson
Summary
This lecture, presented by Magnus Carlsson, explores panel data and fixed effects. It discusses the estimation of causal effects, methods for controlling omitted variables, and the application of models such as random effects and fixed effects approaches. It offers different estimation techniques and provides insights into interpreting the results.
Full Transcript
Panel data and fixed effects
Magnus Carlsson

Fixed effects, difference-in-differences, and panel data
To estimate causal effects, experiments, IV, or RD methods are preferred. In many applications, however, experiments are impossible and there are simply no good instruments or discontinuities to exploit. An alternative is then to use methods that allow us to at least control for certain types of omitted variables. These methods allow us to control for unobserved factors that are fixed over time or space.

Fixed effects, difference-in-differences, and panel data: roadmap
1. Fixed effects and panel data
   1. Panel data: random vs fixed effects
   2. Fixed effects estimation with panel data
   3. Pitfalls
   4. Fixed effects estimation with other data structures
2. Difference-in-differences
   1. Estimation
   2. Pitfalls and sensitivity checks

Panel data and fixed effects
This lecture deals with panel data and fixed effects estimation. Panel data follow the outcomes and characteristics of individuals at multiple points in time. Typically, the number of individuals N is relatively large, while the number of time periods T over which they are observed is short. Fixed effects are one way of analyzing panel data, but the approach can be applied to other data structures as well, including data without a time dimension such as family-level data and twin data.

The simplest case: running OLS on panel data
One can always analyze panel data by pooling observations over time and running OLS, treating all observations as independent:
$$Y_{it} = \alpha + X_{it}\beta + u_{it}, \quad i = 1, \dots, N, \quad t = 1, \dots, T,$$
where the data contain N individuals observed over T periods. The pooled model only provides consistent estimators for $\alpha$ and $\beta$ if the zero conditional mean assumption
$$E[u_{it} \mid X_{i1}, \dots, X_{iT}] = 0$$
is satisfied. Violation of this assumption causes the estimators to be biased and inconsistent.
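As a rough illustration of the last point, the sketch below runs pooled OLS on two simulated panels: one where the zero conditional mean assumption holds, and one where an omitted, time-constant individual factor drives both the regressor and the outcome. Everything here is an assumption made for illustration and not part of the lecture: the simulated data, the parameter values, and the helper name `pooled_ols_slope`.

```python
import numpy as np

def pooled_ols_slope(y, x):
    """Slope from pooled OLS of y on a constant and one regressor x.

    y and x are (N, T) arrays; all N*T observations are stacked and
    treated as independent, as in the pooled model.
    """
    X = np.column_stack([np.ones(x.size), x.ravel()])
    coef, *_ = np.linalg.lstsq(X, y.ravel(), rcond=None)
    return float(coef[1])

rng = np.random.default_rng(0)
N, T, beta = 2000, 4, 1.0  # illustrative values, not from the lecture

# Case 1: the error is unrelated to x, so the assumption holds.
x_ok = rng.normal(size=(N, T))
y_ok = 0.5 + beta * x_ok + rng.normal(size=(N, T))

# Case 2: an omitted time-constant factor a affects both x and y,
# so the pooled error (a + noise) is correlated with x.
a = rng.normal(size=(N, 1))
x_bad = a + rng.normal(size=(N, T))
y_bad = 0.5 + beta * x_bad + a + rng.normal(size=(N, T))

slope_ok = pooled_ols_slope(y_ok, x_ok)
slope_bad = pooled_ols_slope(y_bad, x_bad)
print(f"true beta = {beta}, pooled OLS (assumption holds) = {slope_ok:.3f}")
print(f"true beta = {beta}, pooled OLS (omitted factor)   = {slope_bad:.3f}")
```

In this particular design the second slope should settle near 1.5 rather than 1, because the omitted factor accounts for half of the variance of the regressor; the size of the bias depends entirely on these assumed variances.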
The panel data model
Consider the basic linear panel data model, where we instead follow individuals over time:
$$Y_{it} = \alpha + X_{it}\beta + u_{it}, \quad i = 1, \dots, N, \quad t = 1, \dots, T.$$
In panel data models, we assume that the error term $u_{it}$ can be divided into two parts that enter additively:
$$u_{it} = a_i + \varepsilon_{it} \quad (1)$$
Note that $a_i$ in this expression has no time subscript! In this specification, the error term is thus divided into one part that does not vary over time, $a_i$, and one part that does vary over time, $\varepsilon_{it}$.

What is then $a_i$? It reflects factors that are unobserved by the econometrician and that do not vary over time for the individual. This may include factors such as genes, early childhood environment, parental background, certain personality traits, "deep" preferences, etc. The assumptions we make about $a_i$ determine what type of panel data model to use.

The panel data model: random effects
Panel data models are analyzed as either random effects models or fixed effects models. In the random effects model, we assume that:
$$E[a_i \mid X_{i1}, \dots, X_{iT}] = 0 \quad (2)$$
Here, we assume that the unobserved factors that are fixed over time are independent of the values of the X variables in all time periods. Is the random effects assumption realistic?

Note that the random effects assumption is similar to the zero conditional mean assumption. It says that unobserved, time-invariant factors such as ability, preferences, and parental background are independent of all included X variables. For instance, if Y denotes earnings and X schooling, it says that such unobserved factors are independent of the level of schooling. But this is exactly what we (often) do not believe and what we want to address!
The panel data model: fixed effects
In the fixed effects model, we relax the assumption and allow for:
$$E[a_i \mid X_{i1}, \dots, X_{iT}] \neq 0 \quad (3)$$
Here, we allow the unobserved factors that are fixed over time to be related to the values of the X variables in any time period. Without an experiment, this is the more realistic case. As we will see, even if we allow for this particular type of violation of the zero conditional mean assumption, the fixed effects model may still give us consistent estimates of the causal effect!

The fixed effects model
To be more specific, consider the fixed effects model:
$$Y_{it} = \alpha + X_{it}\beta + a_i + \varepsilon_{it},$$
where $X_{it}$ is a vector of exogenous regressors and $\varepsilon_{it}$ is independent over time and across individuals. Under the fixed effects assumption, we do not rule out correlation between $a_i$ and $X_{it}$. As an example, $a_i$ could represent unobserved ability (at least the part of it that does not vary over time).

Estimating the fixed effects model
In the specification above, $a_i$ is a constant (or "fixed") for each $i$ and therefore looks like the coefficient on a "dummy" variable for each $i$! Thus, by including a dummy variable for each $i$ in the regression, we could control for $a_i$. But note that there are as many $a_i$ parameters as individuals, which could mean thousands of $a_i$ to estimate! What if we could get rid of the $a_i$ instead?

Even if we cannot estimate all the $a_i$ parameters, we can get rid of them using within estimation. We use a trick: it turns out that including a dummy variable for each $i$ is algebraically the same as estimating the model in deviations from individual means. To implement the trick, we first calculate the individual-specific averages over time, so that:
$$\bar{Y}_i = \alpha + \bar{X}_i\beta + \bar{a}_i + \bar{\varepsilon}_i,$$
where
$$\bar{Y}_i = \frac{1}{T}\sum_{t=1}^{T} Y_{it}, \quad \bar{X}_i = \frac{1}{T}\sum_{t=1}^{T} X_{it}, \quad \bar{a}_i = \frac{1}{T}\sum_{t=1}^{T} a_i, \quad \bar{\varepsilon}_i = \frac{1}{T}\sum_{t=1}^{T} \varepsilon_{it}.$$

Estimating the fixed effects model: "the trick"
Next, subtract $\bar{Y}_i$ from $Y_{it}$:
$$Y_{it} - \bar{Y}_i = (\alpha + X_{it}\beta + a_i + \varepsilon_{it}) - (\alpha + \bar{X}_i\beta + \bar{a}_i + \bar{\varepsilon}_i) = (X_{it} - \bar{X}_i)\beta + (a_i - \bar{a}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i).$$
Note now that $\bar{a}_i = a_i$, since $a_i$ is always the same across time periods, so the individual effect cancels. This implies that we get the specification
$$\tilde{Y}_{it} = \tilde{X}_{it}\beta + \tilde{\varepsilon}_{it}, \quad i = 1, \dots, N, \quad t = 1, \dots, T,$$
with
$$\tilde{Y}_{it} = Y_{it} - \bar{Y}_i, \quad \tilde{X}_{it} = X_{it} - \bar{X}_i, \quad \tilde{\varepsilon}_{it} = \varepsilon_{it} - \bar{\varepsilon}_i.$$
The within estimator $\hat{\beta}_{within}$ is then obtained by applying OLS. In this specification, we got rid of $a_i$, i.e. the fixed effects!

Interpreting the fixed effects model
Removing $a_i$, i.e. the fixed effects, means that we implicitly control for all individual-specific factors, whether observable or unobservable, that are constant over time. Thus we have removed a potentially large source of omitted variables bias. We can do this even though we may never be able to observe or measure these time-constant individual-specific factors. We now interpret the estimated effect, $\beta$, as the effect of a within-unit change in treatment. For this reason, the FE estimator is also called the within estimator.

Some remarks on the fixed effects estimator
The parameters $\beta$ are identified from (within) variation in $X_{it}$ over time. Estimators for $a_i$ and $\beta$ are consistent if the asymptotics imply that $T$ becomes large. If instead $T$ is fixed and $N$ goes to infinity, only $\hat{\beta}_{within}$ is consistent, while the estimators of $a_i$ are not (the so-called incidental parameters problem). If $N$ is not too large, one could simply include dummy variables for each individual and estimate the original model by OLS. This provides the within estimator and $\hat{a}_i$ in a single step.

An alternative way to "kill" the fixed effects: the first-differences estimator
Instead of the within estimation procedure, one could also use first differences over time:
$$Y_{it} - Y_{i,t-1} = (\alpha + X_{it}\beta + a_i + \varepsilon_{it}) - (\alpha + X_{i,t-1}\beta + a_i + \varepsilon_{i,t-1}) = (X_{it} - X_{i,t-1})\beta + (\varepsilon_{it} - \varepsilon_{i,t-1}), \quad t = 2, \dots, T,$$
or
$$\Delta Y_{it} = \Delta X_{it}\beta + \Delta\varepsilon_{it},$$
where taking first differences eliminates $a_i$ from the model. If we perform OLS on this equation, we obtain the first-difference estimator $\hat{\beta}_{FD}$.

What additional assumptions are needed for the FE or first-difference model?
So far we have only addressed the assumptions about $a_i$. But what about $\varepsilon_{it}$, i.e. unobserved factors that are allowed to vary over time? For both the first-differences and the within estimator to provide consistent estimates, we now need the regressors to be strictly exogenous:
$$E[\varepsilon_{it} \mid X_{i1}, \dots, X_{iT}, a_i] = 0, \quad i = 1, \dots, N, \quad t = 1, \dots, T.$$

The strict exogeneity assumption
The strict exogeneity assumption is a version of the zero conditional mean assumption. It says that the part of the error term that is allowed to vary over time, $\varepsilon_{it}$, must be unrelated to the value of the treatment indicator or other control variables in any time period. It would typically fail if there is some time-specific unobserved shock that affects both the outcome and our X variable of interest. What looks like an effect of X on Y may then simply reflect the influence of this shock.

The fixed effects model: Angrist and Pischke
In the book (MHE), the authors use potential outcomes notation and the example of the effect of union membership on wages. Let $Y_{it}$ equal the (log) earnings of worker $i$ at time $t$ and let $D_{it}$ denote his union status. The observed $Y_{it}$ is either $Y_{0it}$ or $Y_{1it}$, depending on union status. Suppose further that:
$$E(Y_{0it} \mid A_i, X_{it}, t, D_{it}) = E(Y_{0it} \mid A_i, X_{it}, t) \quad (4)$$
The expression above says that the potential outcome as untreated is independent of actual treatment status, conditional on unobserved worker ability, $A_i$, other observed covariates, $X_{it}$, and time $t$. In other words, union status is as good as randomly assigned conditional on these factors. If this is true, we would be able to get a consistent estimate of the effect of $D_{it}$ if we could control for, or somehow account for, $A_i$. With a fixed effects model, we can accomplish this, as long as the unobserved factors are constant over time.
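The estimators discussed above (the within transformation, the regression with one dummy per individual, and first differences) can be compared on simulated data. The sketch below is only an illustration under assumed values: the data-generating process, the parameter choices, and all variable names are hypothetical and not taken from the lecture. Strict exogeneity holds by construction, and the individual effect is correlated with the regressor.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T, beta = 500, 5, 2.0  # illustrative values, not from the lecture

a = rng.normal(size=(N, 1))             # time-constant individual effect a_i
x = 0.8 * a + rng.normal(size=(N, T))   # regressor correlated with a_i
y = 1.0 + beta * x + a + rng.normal(size=(N, T))

def slope_through_origin(y, x):
    """OLS slope of y on a single regressor x without an intercept."""
    return float((x * y).sum() / (x * x).sum())

# Pooled OLS ignores a_i and should be biased in this design.
beta_pooled = float(np.polyfit(x.ravel(), y.ravel(), 1)[0])

# Within estimator: OLS on deviations from individual means.
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_within = slope_through_origin(y_w, x_w)

# Dummy-variable regression: x plus one dummy per individual.
# Rows are ordered individual by individual, matching ravel().
dummies = np.kron(np.eye(N), np.ones((T, 1)))
coef, *_ = np.linalg.lstsq(np.column_stack([x.ravel(), dummies]),
                           y.ravel(), rcond=None)
beta_lsdv = float(coef[0])

# First-difference estimator: OLS on changes between adjacent periods.
beta_fd = slope_through_origin(np.diff(y, axis=1), np.diff(x, axis=1))

print(f"pooled {beta_pooled:.3f}  within {beta_within:.3f}  "
      f"dummies {beta_lsdv:.3f}  first-diff {beta_fd:.3f}")
```

The within and dummy-variable slopes should agree up to numerical precision, which is the algebraic equivalence the lecture refers to. The first-difference slope differs slightly in any given sample but targets the same parameter.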
Example: Freeman (1984), returns to union membership
Freeman (1984) estimates the effect of union membership on wages. Ideally, we would like to observe each individual's potential outcome with and without union membership. In general, could we get at the potential outcomes by just observing the wages of members and non-members? If not, maybe the unobserved differences between members and non-members are constant over time? "Whatever makes us special is timeless" (Angrist and Krueger 1998).

Why are Freeman's fixed effects estimates smaller than his cross-sectional estimates? Two explanations:
1. The fixed effects estimates are closer to the "true" causal effect of union membership. This would suggest that the effect was overestimated in the cross-sectional estimates.
2. There are measurement errors in the union status variable. Unfortunately, the role of measurement errors is normally amplified in the fixed effects model.
The measurement error problem brings us to the potential pitfalls of the fixed effects method.

Pitfalls of the fixed effects approach: the measurement error problem
The intuition for the measurement error problem is that fixed effects models restrict the variation in the Xs to variation within individuals. This also means that the fraction of the variation that reflects measurement error may increase. It can be shown that the downward bias that results from "classical" measurement error is greater in fixed effects models than in OLS. The downward bias gets stronger the stronger the correlation is between the X variables in different periods.

Pitfalls of the fixed effects approach
Impossible to estimate the effect of time-invariant regressors. Why?
Because the deviation from the individual-specific mean will always be zero for such a variable. We can therefore not estimate the effect of time-invariant factors such as gender, ethnicity, education (at least as an adult), etc. With a random effects specification we can estimate the effects of time-invariant factors, but the problem is that the underlying assumptions of the RE model are unrealistic in most applications.

Pitfalls of the fixed effects approach
The effect is only identified for those who actually change treatment status. Those who do not change treatment status do not contribute to the estimates, since we are relying on within variation. The problem is that the sample who actually change status in the causal variable of interest may be selective. This may make it difficult to compare the OLS estimates with the FE estimates, since OLS exploits all observations.

Pitfalls of the fixed effects approach: violation of the strict exogeneity assumption
In many applications, the strict exogeneity assumption may be criticized. Selection into treatment may be based on unobserved factors that do vary over time, such as shocks, which would violate the strict exogeneity assumption. Thus, we have:
$$E[\varepsilon_{it} \mid X_{i1}, \dots, X_{iT}, a_i] \neq 0 \quad (7)$$

Fixed effects without a time dimension: exploiting family data such as siblings and twins
The fixed effects approach does not require a time dimension! As long as important unobserved variables are shared by some group of individuals, they can be cancelled out using a fixed effects approach. Examples:
1. Twins: identical (monozygotic) twins share genetics and family background.
2. Siblings: share some genetics and family background.

Example 1 of fixed effects with twins: The effects of birth weight over the life cycle (Bharadwaj, Lundborg, Rooth 2015)
What is the effect of early childhood health over the life cycle?
The role of birth weight over the life cycle is examined in order to answer questions about the persistence of health inequalities at birth. With detailed register data on income, welfare payments, etc., the study examines the extent to which lower birth weight children ever catch up to their heavier counterparts.

Example 1 of fixed effects with twins:
Note that identifying the effect of birth weight is normally very difficult. If one compares the birth weight of babies born in different families, there is a concern that those with low birth weight differ in unobservable ways from those with high birth weight. For instance, those with low birth weight are more likely to be born in poorer families or in less healthy families. To the extent that such factors are unobserved, it is likely that the birth weight coefficient partly or fully picks up such unobserved background factors. We thus have an omitted variables problem (or "endogeneity" problem).

In order to deal with omitted variables, data on the birth weight of almost all twins born in Sweden between 1926 and 1958 are exploited. These data are linked to register data on incomes and education from 1968 onwards. Empirical specification:
$$Y_{ijt} = \beta_t BW_{ij} + a_j + \varepsilon_{ijt},$$
where $Y_{ijt}$ is log income of twin $i$ in twin pair $j$ at time $t$, $\beta_t$ is the coefficient on the birth weight variable $BW_{ij}$, $a_j$ captures twin-pair-specific unobserved factors, and $\varepsilon_{ijt}$ captures unobserved factors specific to twin $i$.

In this specification, it is (partly) possible to deal with the omitted variables problem. The reason is that the fixed effect $a_j$ will account for the influence of genetics, the background and behaviors of the parents, any environmental problems shared by the twins, etc.
To see this clearly, we can take the difference between twins within twin pairs (where the other twin is $i'$):
$$Y_{ijt} - Y_{i'jt} = \beta_t (BW_{ij} - BW_{i'j}) + (a_j - a_j) + (\varepsilon_{ijt} - \varepsilon_{i'jt}) = \beta_t (BW_{ij} - BW_{i'j}) + (\varepsilon_{ijt} - \varepsilon_{i'jt}).$$

Example 1 of fixed effects with twins:
The key identifying assumption needed in order to give $\beta_t$ a causal interpretation is that $BW_{ij} - BW_{i'j}$ is uncorrelated with $\varepsilon_{ijt} - \varepsilon_{i'jt}$. Is this a strong assumption? Maybe, if we for instance think that parental investments in their kids are a function of birth weight, or if differences in birth weight are related to differences in cognition. If parents try to compensate for the lower birth weight of one twin, the estimated effect is the effect of birth weight that remains despite the parents' best attempts to compensate.

Example 1 of fixed effects with twins: results

Example 2: Critical periods in the development of cognitive skills and health (van den Berg, Lundborg 2014)
It is very difficult to study the causal effect of poor circumstances during childhood on later life outcomes. Periods during childhood in which poor conditions have particularly bad consequences for development are called "critical periods". Some famous evidence comes from studies of Romanian orphans, who were rescued from incredibly bad conditions by adoptive parents from Western countries. These results may reflect self-selection, however, if adoptive parents picked out the least "damaged" children from the orphanages.

Van den Berg et al.
(2014) exploit data on immigrant brothers who migrated to Sweden from different countries. The study exploits the fact that the brothers migrated from more or less poor conditions to a richer country. The brothers entered Sweden at the same point in calendar time, but at different developmental stages (ages). By comparing the outcomes of brothers who entered at the same time (fixed effects) but at different stages, we may be able to identify "critical periods" in the development of height.

Critical periods in the development of height at age 18: results

Example 2: Critical periods in the development of height
By imposing brother fixed effects, it is possible to difference out everything unobserved at the family level. This includes factors such as reasons for migrating, parental background, and language, which may all be important omitted variables. Brother fixed effects also address the self-selection of the families who migrate, since the unobserved selection factors are at the family level! The results show larger effects for those migrating from poorer countries, as one would expect.

Summary fixed effects
A fixed effects estimator allows us to control for certain types of omitted variables. The method allows us to control for unobserved factors that are fixed over time (time-invariant) or space. The method cannot account for the influence of unobserved factors that vary over time (or space). A useful application of fixed effects estimators is to sibling or twin data.

Appendix: Panel data and fixed effects

Comparing the first-differences and the within estimators
If $T = 2$, the within estimator and the first-difference estimator are the same. To see this, note that the within model computes individual means:
$$\bar{Y}_i = \frac{1}{2}(Y_{i1} + Y_{i2}) \quad (5)$$
And, in mean deviation form, we have:
$$\tilde{Y}_{i2} = Y_{i2} - \bar{Y}_i = Y_{i2} - \frac{1}{2}(Y_{i1} + Y_{i2}) = \frac{1}{2}Y_{i2} - \frac{1}{2}Y_{i1}, \quad (6)$$
which will always be half the first difference. So the two are effectively the same model.
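The T = 2 equivalence can be checked numerically. The following is a minimal sketch on simulated data; the data-generating process, parameter values, and variable names are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
N, beta = 300, 1.5  # illustrative values, not from the lecture

a = rng.normal(size=(N, 1))          # time-constant individual effect
x = a + rng.normal(size=(N, 2))      # two periods per individual
y = beta * x + a + rng.normal(size=(N, 2))

# Within estimator: deviations from the two-period individual mean.
x_w = x - x.mean(axis=1, keepdims=True)
y_w = y - y.mean(axis=1, keepdims=True)
beta_within = float((x_w * y_w).sum() / (x_w ** 2).sum())

# First-difference estimator: period 2 minus period 1.
dx = x[:, 1] - x[:, 0]
dy = y[:, 1] - y[:, 0]
beta_fd = float((dx * dy).sum() / (dx ** 2).sum())

# The period-2 deviation from the mean is exactly half the first difference.
half_difference_holds = bool(np.allclose(x_w[:, 1], dx / 2))

print(beta_within, beta_fd, half_difference_holds)
```

Because every demeaned observation is plus or minus half the first difference, the factor of one half cancels between numerator and denominator, and the two slopes coincide.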
Comparing the first-differences and the within estimators
If $\varepsilon_{it}$ is uncorrelated over time, the within estimator is more efficient than the first-difference estimator. If the $\varepsilon_{it}$ are strongly positively serially correlated (in the limit, if they follow a random walk), the first-difference estimator is more efficient. If strict exogeneity is violated, both the first-difference estimator and the within estimator are inconsistent, and they generally have different probability limits.
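A small Monte Carlo experiment can illustrate the efficiency comparison. The sketch below is an assumption-laden illustration, not part of the lecture: it compares the sampling variance of the two estimators under serially uncorrelated errors and under random-walk errors (the extreme case of serial correlation), with hypothetical sample sizes, parameter values, and function names.

```python
import numpy as np

def within_and_fd(y, x):
    """Return (within slope, first-difference slope) for (N, T) arrays."""
    x_w = x - x.mean(axis=1, keepdims=True)
    y_w = y - y.mean(axis=1, keepdims=True)
    dx, dy = np.diff(x, axis=1), np.diff(y, axis=1)
    b_within = (x_w * y_w).sum() / (x_w ** 2).sum()
    b_fd = (dx * dy).sum() / (dx ** 2).sum()
    return b_within, b_fd

def simulate_variances(random_walk_errors, reps=500, N=100, T=6,
                       beta=1.0, seed=3):
    """Sampling variances [within, first-diff] across simulated panels."""
    rng = np.random.default_rng(seed)
    draws = np.empty((reps, 2))
    for r in range(reps):
        a = rng.normal(size=(N, 1))          # individual effect
        x = a + rng.normal(size=(N, T))      # strictly exogenous regressor
        shocks = rng.normal(size=(N, T))
        # Random walk: each period's error is the running sum of shocks.
        eps = np.cumsum(shocks, axis=1) if random_walk_errors else shocks
        y = beta * x + a + eps
        draws[r] = within_and_fd(y, x)
    return draws.var(axis=0)

var_within_iid, var_fd_iid = simulate_variances(random_walk_errors=False)
var_within_rw, var_fd_rw = simulate_variances(random_walk_errors=True)

print(f"uncorrelated errors: within {var_within_iid:.5f}, FD {var_fd_iid:.5f}")
print(f"random-walk errors:  within {var_within_rw:.5f}, FD {var_fd_rw:.5f}")
```

With uncorrelated errors the within estimator should show the smaller variance, and with random-walk errors the ranking should reverse. Intermediate degrees of serial correlation fall between these two cases, so the ranking there depends on how persistent the errors are.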