Difference-in-Difference Analysis PDF

Unit 8: Difference-in-difference
Martin / Zulehner: Introductory Econometrics

Outline

1 Interactions between independent variables
    Interactions between two binary variables
    Binary-continuous interactions
2 Difference-in-difference
    Classical example: Card and Krueger (1994)
    What is the DID estimator estimating?
    Estimation with two periods
3 Application: School program in Kenya (Lucas and Mbiti, 2012)

Interactions between independent variables

Test scores and student-to-teacher ratios:
- perhaps a class size reduction is more effective in some circumstances than in others
- perhaps smaller classes help more if there are many English learners, who need individual attention
- that is, $\Delta \text{test score} / \Delta STR$ might depend on $PctEL$
- more generally, $\Delta y / \Delta x_1$ might depend on $x_2$
- how to model such "interactions" between $x_1$ and $x_2$?
- we first consider binary $x$'s, then continuous $x$'s

Wages and education, age/experience (potential vs actual):
- perhaps age/potential experience affects wages differently for males and females: why? education?
Interactions between two binary variables

$$y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + u_i$$

- $D_{1i}$, $D_{2i}$ are binary
- $\beta_1$ is the effect of changing $D_1 = 0$ to $D_1 = 1$: in this specification, this effect doesn't depend on the value of $D_2$
- To allow the effect of changing $D_1$ to depend on $D_2$, include the "interaction term" $D_{1i} \times D_{2i}$ as a regressor:

$$y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 (D_{1i} \times D_{2i}) + u_i$$

Interpreting the coefficients

$$y_i = \beta_0 + \beta_1 D_{1i} + \beta_2 D_{2i} + \beta_3 (D_{1i} \times D_{2i}) + u_i$$

General rule: compare the various cases

$$E(y_i \mid D_{1i} = 0, D_{2i} = d_2) = \beta_0 + \beta_2 d_2 \qquad \text{(b)}$$
$$E(y_i \mid D_{1i} = 1, D_{2i} = d_2) = \beta_0 + \beta_1 + \beta_2 d_2 + \beta_3 d_2 \qquad \text{(a)}$$

subtract (a) − (b):

$$E(y_i \mid D_{1i} = 1, D_{2i} = d_2) - E(y_i \mid D_{1i} = 0, D_{2i} = d_2) = \beta_1 + \beta_3 d_2$$

- the effect of $D_1$ depends on $d_2$ (what we wanted)
- $\beta_3$ = increment to the effect of $D_1$, when $D_2 = 1$

Example: wages

    . gen bachelor_female=bachelor*female
    . reg ahe bachelor female bachelor_female

          Source |       SS           df       MS      Number of obs   =     7,098
    -------------+----------------------------------   F(3, 7094)      =    501.45
           Model |  182530.858         3  60843.6192   Prob > F        =    0.0000
        Residual |  860753.793     7,094  121.335466   R-squared       =    0.1750
    -------------+----------------------------------   Adj R-squared   =    0.1746
           Total |  1043284.65     7,097  147.003614   Root MSE        =    11.015

                 ahe |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
    -----------------+-----------------------------------------------------------------
            bachelor |    10.5569     .343367   30.75    0.000     9.883797       11.23
              female |  -3.289496     .400951   -8.20    0.000     -4.07548   -2.503513
     bachelor_female |  -1.726883    .5393244   -3.20    0.001     -2.78412   -.6696465
               _cons |   17.49846    .2336802   74.88    0.000     17.04038    17.95654

bachelor is a 0/1 dummy variable, as is female

Binary-continuous interactions

Regression model:

$$y_i = \beta_0 + \beta_1 D_i + \beta_2 x_i + \beta_3 (D_i \times x_i) + u_i$$

- observations with $D_i = 0$ (the "$D = 0$" group):
  $$y_i = \beta_0 + \beta_2 x_i + u_i \qquad \text{the } D = 0 \text{ regression line}$$
- observations with $D_i = 1$ (the "$D = 1$" group):
  $$y_i = \beta_0 + \beta_1 + \beta_2 x_i + \beta_3 x_i + u_i = (\beta_0 + \beta_1) + (\beta_2 + \beta_3) x_i + u_i \qquad \text{the } D = 1 \text{ regression line}$$

General rule: compare the various cases

$$y = \beta_0 + \beta_1 D + \beta_2 x + \beta_3 (D \times x) \qquad \text{(b)}$$

Now change $x$:

$$y + \Delta y = \beta_0 + \beta_1 D + \beta_2 (x + \Delta x) + \beta_3 [D \times (x + \Delta x)] \qquad \text{(a)}$$

subtract (a) − (b):

$$\Delta y = \beta_2 \Delta x + \beta_3 D \Delta x \quad \text{or} \quad \frac{\Delta y}{\Delta x} = \beta_2 + \beta_3 D$$

- the effect of $x$ depends on $D$ (what we wanted)
- $\beta_3$ = increment to the effect of $x$, when $D = 1$

[Figure: three panels of regression lines for the two groups. (a) Different intercepts, same slope; (b) Different intercepts, different slopes; (c) Same intercept, different slopes]

Example: wages

    . reg ahe bachelor female bachelor_female age age_female

          Source |       SS           df       MS      Number of obs   =     7,098
    -------------+----------------------------------   F(5, 7092)      =    335.09
           Model |  199372.791         5  39874.5581   Prob > F        =    0.0000
        Residual |   843911.86     7,092  118.994904   R-squared       =    0.1911
    -------------+----------------------------------   Adj R-squared   =    0.1905
           Total |  1043284.65     7,097  147.003614   Root MSE        =    10.908

                 ahe |      Coef.   Std. Err.       t    P>|t|     [95% Conf. Interval]
    -----------------+-----------------------------------------------------------------
            bachelor |   10.44409     .340215   30.70    0.000     9.777163    11.11101
              female |   2.611809    2.733557    0.96    0.339    -2.746779    7.970397
     bachelor_female |  -1.524202    .5344241   -2.85    0.004    -2.571832   -.4765709
                 age |   .6111904     .059267   10.31    0.000     .4950094    .7273715
          age_female |  -.1996865    .0912412   -2.19    0.029    -.3785465   -.0208265
               _cons |  -.6064738    1.770818   -0.34    0.732    -4.077806    2.864858

bachelor is a 0/1 dummy variable, as is female; age ∈ [25; 34]

Difference-in-difference

- lay out key identifying assumptions for the simplest difference-in-differences estimator
    - "no anticipation" assumption and its economic content
    - "parallel trends" assumption and its economic content
- generalize assumptions for popular extensions to the estimator (unit 18), when
    - treatment lasts several periods
    - treatment is introduced to different units at different times

Classical example: Card and Krueger (1994)

- Measured employment before and after a minimum wage increase for a sample of fast-food restaurants
- Motivated the difference-in-differences (DID) estimator by the following:
    "Moreover, since seasonal patterns of employment are similar in New Jersey and eastern Pennsylvania, as well as across high- and low-wage stores within New Jersey, our comparative methodology effectively 'differences out' any seasonal employment effects."

Table 3 of Card and Krueger (1994)

                                          Stores by state
                                          PA         NJ         Difference, NJ − PA
    Variable                              (i)        (ii)       (iii)
    1. FTE employment before,             23.33      20.44      −2.89
       all available observations         (1.35)     (0.51)     (1.44)
    2. FTE employment after,              21.17      21.03      −0.14
       all available observations         (0.94)     (0.52)     (1.07)
    3. Change in mean FTE employment      −2.16       0.59       2.76
                                          (1.25)     (0.54)     (1.36)

    Standard errors in parentheses.

- Binary treatment $D_i$: for PA, $D_i = 0$; for NJ, $D_i = 1$
- Two periods: $t \in \{-1, 0\}$ and treatment is implemented at $t = 0$
- Four sample averages of the outcome $\bar{y}_{t,D}$: before vs after and PA vs NJ

A simple DID estimator

- Row 3, column (iii) of Table 3 is their DID estimate
- We can write the estimator as

$$\hat{\beta}^{DID} = (\bar{y}_{t=0,D=1} - \bar{y}_{t=-1,D=1}) - (\bar{y}_{t=0,D=0} - \bar{y}_{t=-1,D=0})$$

What is the DID estimator estimating?

- Potential outcomes $y_{i,t}(d)$ for $d \in \{0, 1\}$: the employment that would have occurred if the minimum wage had increased ($d = 1$) or had not increased ($d = 0$)
- For PA, observe $y_{i,t}(0)$; for NJ, observe $y_{i,t}(1)$
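As a quick numerical check, the DID formula can be evaluated directly from the four group means reported in rows 1 and 2 of Table 3. The sketch below is only an illustration: the dictionary layout and the helper name `did_estimate` are assumptions made here, not part of the lecture. Because it uses the rounded means printed in the table, it gives 2.75 rather than the 2.76 reported in row 3, which was presumably computed from unrounded data.

```python
# Mean FTE employment from rows 1 and 2 of Table 3 in Card and Krueger (1994),
# keyed by (state, period). The layout is an illustrative choice.
mean_fte = {
    ("PA", "before"): 23.33,
    ("PA", "after"): 21.17,
    ("NJ", "before"): 20.44,
    ("NJ", "after"): 21.03,
}


def did_estimate(means, treated="NJ", control="PA"):
    """Return (change in treated group) minus (change in control group)."""
    change_treated = means[(treated, "after")] - means[(treated, "before")]
    change_control = means[(control, "after")] - means[(control, "before")]
    return change_treated - change_control


beta_did = did_estimate(mean_fte)
print(round(beta_did, 2))  # 2.75 from the rounded means
```

The same number results from differencing the other way round, (NJ − PA after) minus (NJ − PA before), that is, −0.14 − (−2.89).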
Interested in the average impact for NJ after the minimum wage increased; formally, the average treatment effect on the treated (ATT):

$$\text{ATT} = E[\, \underbrace{y_{i,0}(1)}_{\text{observed}} - \underbrace{y_{i,0}(0)}_{\text{counterfactual}} \mid D_i = 1 \,]$$

where the difference $y_{i,0}(1) - y_{i,0}(0)$ is the treatment effect.

Since counterfactual outcomes are never observed, we need to impose some assumptions to estimate the ATT

Sufficient assumptions (1): No anticipation

- "no anticipation" assumption: the outcome is not affected by the treatment prior to its implementation:
  $$y_{i,-1}(0) = y_{i,-1}(1) \quad \text{for all } i \text{ with } D_i = 1$$
- Assuming "no anticipation," the outcomes we observe, $y_{i,t}$, can be written as

                         PA (D_i = 0)     NJ (D_i = 1)
    before (t = −1)      y_{i,−1}(0)      y_{i,−1}(0)
    after  (t = 0)       y_{i,0}(0)       y_{i,0}(1)

- Example violation: fast food restaurants laying off minimum wage workers in advance of the wage increase
- Other examples: consumption smoothing for anticipated job loss (Hendren 2017)

Sufficient assumptions (2): Parallel trends

- "parallel trends" assumption:
  $$\underbrace{E[y_{i,0}(0) - y_{i,-1}(0) \mid D_i = 1]}_{\text{NJ counterfactual trend}} = E[y_{i,0}(0) - y_{i,-1}(0) \mid D_i = 0]$$
- if the minimum wage had never increased for NJ, average trends would coincide between NJ and PA
- Example violation: NJ labor market was improving compared to PA
- Other examples: downward trend in wage income leading to participation in job training programs (Ashenfelter's dip)
- Parallel trends assumption allows for potentially non-zero selection bias:
  $$\underbrace{E[y_{i,-1}(0) \mid D_i = 1] - E[y_{i,-1}(0) \mid D_i = 0]}_{\text{selection bias at } t = -1} = \underbrace{E[y_{i,0}(0) \mid D_i = 1] - E[y_{i,0}(0) \mid D_i = 0]}_{\text{selection bias at } t = 0}$$
- Sensitive to the scale: if parallel trends holds for the level of employment, it might fail for the log of employment, and vice versa (Roth and Sant'Anna 2023)

DID is unbiased for ATT

The DID estimator

$$\hat{\beta}^{DID} = (\bar{y}_{t=0,D=1} - \bar{y}_{t=-1,D=1}) - (\bar{y}_{t=0,D=0} - \bar{y}_{t=-1,D=0})$$

is therefore unbiased for

$$\begin{aligned}
& E[y_{i,0} - y_{i,-1} \mid D_i = 1] - E[y_{i,0} - y_{i,-1} \mid D_i = 0] \\
&= \underbrace{E[y_{i,0}(1) - y_{i,-1}(0) \mid D_i = 1]}_{\text{``no anticipation''}} - E[y_{i,0}(0) - y_{i,-1}(0) \mid D_i = 0] \\
&= \underbrace{E[y_{i,0}(1) - y_{i,0}(0) \mid D_i = 1]}_{\text{ATT}} + \underbrace{E[y_{i,0}(0) - y_{i,-1}(0) \mid D_i = 1] - E[y_{i,0}(0) - y_{i,-1}(0) \mid D_i = 0]}_{= 0 \text{ under ``parallel trends''}}
\end{aligned}$$

Regression representation with two periods

- Recall the DID estimator $\hat{\beta}^{DID}$; it can be implemented via regression as follows
- Define $z_{it}$ as
    - 1 if $i$ is treated ($D_i = 1$) and $t$ is after treatment ($t = 0$)
    - 0 otherwise
- Estimate
  $$y_{it} = \alpha_d + \gamma_t + \beta z_{it} + \varepsilon_{it}$$
    - Group fixed effect $\alpha_d$ for $d \in \{0, 1\}$
    - Time fixed effect $\gamma_t$
- The OLS estimate $\hat{\beta}$ is numerically equivalent to $\hat{\beta}^{DID}$

Grouped data and repeated cross sections

- This regression representation is also useful for non-panel datasets
- For repeated cross sections, $\hat{\beta}^{DID}$ is still an unbiased estimate of the ATT, and so is the regression representation
- We can also collapse to the group level and obtain group-level panel data
- WLS coincides exactly with $\hat{\beta}^{DID}$

Two-way fixed effects (TWFE)

- Common to implement DID via a two-way fixed effects (TWFE) regression
- Estimate
  $$y_{it} = \alpha_i + \gamma_t + \beta z_{it} + \varepsilon_{it}$$
    - Unit fixed effect $\alpha_i$
    - Time fixed effect $\gamma_t$
- Large subsequent literature on the minimum wage (for example, Neumark and Wascher 2007) estimates this model allowing for continuous treatment, covariates, multiple time periods, etc.
- Will return to some of these topics in Unit 13 and Unit 14 (Panel data methods)

Application: School program in Kenya (Lucas and Mbiti, 2012)

Lucas, Adrienne M., and Isaac M. Mbiti. "Access, sorting, and achievement: The short-run effects of free primary education in Kenya." American Economic Journal: Applied Economics 4, no. 4 (2012): 226-253.

- Education as key driver of economic development (and key contribution to inequality within and across countries)
- Large literature on access, sorting, assignment, ...
- School fees as deterrent to education ⇒ free education programs
- This paper: assess success of the Free Primary Education (FPE) program in Kenya

Descriptive Evidence

⇒ suggestive evidence that FPE increased attendance (test takers) at public schools, but decreased it at private schools

Identification

- Main idea: identify the effect of FPE by exploiting its effective differential impact across Kenyan districts (proportional to the dropout rate prior to the policy change)
- The policy should have no effect where the prior dropout rate was 0, but a stronger effect where the dropout rate was high
- ⇒ diff-in-diff strategy (compare outcomes before and after the policy across regions):

$$y_{sjt} = \beta_0 + \beta_1 (\text{intensity}_{jt} \times \text{public}_s) + \beta_2 (\text{intensity}_{jt} \times \text{private}_s) + \delta_j + \delta_j \times \text{public}_s + \delta_j \times \text{trend}_t + \delta_t + \varepsilon_{sjt}$$

where $s \in \{\text{public}, \text{private}\}$ is the school type, and $\text{intensity}_{jt}$ is the effective intensity (0 for all districts initially, then depending on dropout rates)

Results

⇒ FPE increased the number of test takers

Additional results in the paper: (i) especially children from disadvantaged backgrounds gain access and (ii) school quality does not decrease
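Returning to the two-period regression representation: the claim that the OLS coefficient on $z_{it}$ is numerically equal to the difference of group means can be checked on simulated data. The sketch below is a minimal illustration under stated assumptions, not the lecture's own code: the data are artificial, the coefficient values (20, −3, −2, 2.5) and the sample size are arbitrary, and the group and time fixed effects are written as an intercept plus one dummy each, which is equivalent with two groups and two periods.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_cell = 100

# Balanced artificial design: each (group, period) cell has n_per_cell rows.
d = np.repeat([0, 1], 2 * n_per_cell)    # D_i: 0 = control, 1 = treated
post = np.tile([0, 1], 2 * n_per_cell)   # 1 if the period is t = 0 (after)
z = d * post                             # z_it = 1 only for treated units after treatment

# Simulated outcome; all coefficient values are arbitrary assumptions.
y = 20.0 - 3.0 * d - 2.0 * post + 2.5 * z + rng.normal(0.0, 1.0, size=d.size)


def cell_mean(group, period):
    """Sample mean of y within one (group, period) cell."""
    return y[(d == group) & (post == period)].mean()


# DID from the four sample means.
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# OLS of y on a constant, the group dummy, the period dummy, and z.
X = np.column_stack([np.ones(d.size), d, post, z])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
beta_ols = coef[3]

print(did_means, beta_ols)  # the two numbers agree up to floating-point error
```

The agreement is exact rather than approximate because the regression is saturated: four parameters fit the four cell means, so the coefficient on `z` has to equal the double difference.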
