Research Methods: Applied Empirical Economics Lecture 3 PDF
Document Details
Leiden University
2024
Dr. M. van Lent
Summary
This document summarizes Instrumental Variables from a lecture on Applied Empirical Economics. The lecture covers learning goals, concepts, and examples.
Full Transcript
Research Methods: Applied Empirical Economics
Lecture 3: Instrumental Variables
Dr. M. van Lent | Leiden University
September 9, 2024

Roadmap of the lectures
1. Introduction
2. RCTs/Regression
3. Instrumental Variables
4. Regression Discontinuity
5. Difference-in-Difference

Recap from previous lectures
- Interested in learning the effect of a treatment
- Two potential outcomes, only one of them is observed
- Need to search for a proxy of the unobserved potential outcome (i.e., treatment or control)
- When doing so, be aware of selection bias:
  Difference treatment - control = treatment effect + selection bias
- Randomized trials eliminate selection bias by creating a comparable control group

In the absence of a randomized trial with full compliance…
- One has to come up with a story about (non-)random selection into treatment
- One can either try to control for omitted variables or use variation in treatment that is not influenced by omitted variables
- This lecture, we focus on the second route: instrumental variables
- Chapter 3 of Mastering Metrics

Learning goals
- Understand how instrumental variables break endogeneity
  - Field experimental setting (RCT)
  - Naturally occurring data (discrete and continuous treatment)
- Be able to apply instrumental variables estimation in STATA

The field experiment of Bolhaar et al. (2016)
RCT where some caseworkers were randomly assigned to impose a period of job search before workers receive unemployment benefits.
Compliance among caseworkers was not 100%, so IV is used to exploit variation in treatment that is not influenced by omitted variable bias (OVB).
Caseworkers were randomized to follow one default option:
- Always apply treatment (i.e., imposition of job search requirement)
- Never apply treatment
Treatment effect = mean(job finding rate | always) - mean(job finding rate | never)
Thus, what we estimate is the "Intention-to-Treat".

Intention-to-treat (ITT)
- Term stems from medical trials
- What we know with ITT is the effect of PRESCRIBING the treatment, not the effect of actually TAKING the treatment
- Likewise, in Bolhaar et al. we know the effect of telling caseworkers to impose a job search requirement, not the effect of the job search requirement itself
- Two types of non-compliance
  - Treatment migration: control group gets treated
  - Treatment dilution: assigned to treatment but not treated

The caseworker experiment
[Figure: the caseworker experiment]

Instrumental variable isolates exogenous variation
- Not all variation in treatment (i.e., job search requirement) is random because of potential selection bias
- Example of potential selection bias: client selection. Clients who are expected to be more successful in finding a job are more likely to be required to search for a job
- Therefore, we need to isolate the part of the variation in the actual treatment delivered that is random ... by using the original random assignment as an instrumental variable!

How an instrumental variable works
[Diagram: instrumental variable -> treatment -> outcome]
- Instrumental variable: randomly assigned (independence assumption)
- Treatment: partly non-random
- Relevance: the instrument should be sufficiently strong, i.e., deliver sufficient variation in treatment
- Exclusion restriction: no direct impact of the instrument on the outcome

Assumptions
Exclusion restriction: The instrument only affects the outcome through treatment. In other words: the instrument cannot have a direct effect on the outcome. Being assigned to treatment doesn't affect the outcome itself, only through treatment.
Independence restriction: The instrument should be randomly assigned. This means the instrument is not correlated with the omitted variables we want to control for.
Relevance: Assignment of caseworkers to treatment or control affects whether the caseworker gives treatment to the client.

Instrument isolates the random part in total variation in treatment delivered
- You obtain purely random variation in treatment by going through your first-stage estimation
- First stage: regress treatment on treatment assignment
- Next, you use these first-stage estimates (the fitted values of treatment) in the regression to estimate the causal effect of receiving treatment (instead of using actual treatment!)
- This is referred to as Two-Stage Least Squares (TSLS) or Instrumental-Variables (IV) estimation

Effect of treatment is a "LATE"
- LATE = "Local" Average Treatment Effect
- By using only variation in treatment due to randomization, you effectively estimate the treatment effect for compliers
  - not for those who would never take the treatment (never-takers)
  - not for those who would always take the treatment (always-takers)
- This is why the estimate is "local"
- Stated differently, we do not know how the treatment would work out for never-takers and always-takers

ITT and LATE
Suppose we have outcome Y, treatment D, and instrument Z.
We define ρ as the ITT effect:
$\rho = E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]$
- $E[Y_i \mid Z_i = 1]$: the expected value of the outcome for those with prescribed treatment
- $E[Y_i \mid Z_i = 0]$: the expected value of the outcome for those without prescribed treatment
ρ: the effect of being assigned to treatment. In BKK: the effect of the case manager being instructed to implement job search on workers' employment.
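The ITT definition above can be illustrated with a minimal sketch in Python (not the STATA workflow the lecture uses). With a binary instrument, ρ is simply the difference in mean outcomes between the two assignment groups, and it coincides with the slope from an OLS regression of Y on Z. The data and the helper names `diff_in_means` and `ols_slope` are hypothetical and serve only as an illustration.

```python
from statistics import mean


def diff_in_means(outcome, assigned):
    """ITT with a binary instrument: E[Y | Z = 1] - E[Y | Z = 0]."""
    with_z = [y for y, z in zip(outcome, assigned) if z == 1]
    without_z = [y for y, z in zip(outcome, assigned) if z == 0]
    return mean(with_z) - mean(without_z)


def ols_slope(y, x):
    """Slope from a simple OLS regression of y on x (with a constant)."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    return s_xy / s_xx


# Made-up data: Z = assignment (1 = treatment prescribed), Y = found a job (1 = yes).
Z = [1, 1, 1, 1, 0, 0, 0, 0]
Y = [1, 1, 0, 1, 0, 1, 0, 0]

rho = diff_in_means(Y, Z)       # 0.75 - 0.25 = 0.5
rho_from_ols = ols_slope(Y, Z)  # same number, obtained as a regression slope
print(rho, rho_from_ols)
```

Nothing in this calculation uses the treatment actually received, which is why ρ only captures the effect of prescribing the treatment.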
ITT and LATE
Suppose we have outcome Y, treatment D and instrument Z.
We define ρ as the ITT effect:
$\rho = E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]$
In the first stage of IV, we calculate the difference in the likelihood of treatment between those with and without the prescribed treatment, Φ:
$\Phi = E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]$
- $E[D_i \mid Z_i = 1]$: the likelihood of treatment of those with prescribed treatment
- $E[D_i \mid Z_i = 0]$: the likelihood of treatment of those without prescribed treatment
The LATE (λ) in the second stage then is:
$\lambda = \rho / \Phi$
With Φ < 1, we have incomplete compliance and LATE > ITT (in absolute value).

Bolhaar et al. (2016)
[Diagram: randomization by experiment (instrument) -> job search period in welfare (treatment, partly non-random) -> outcome]

ITT and LATE in Bolhaar et al. (2016)
- Welfare applicants dealing with an "Always"-caseworker receive treatment in 0.55 of cases (on average): treatment dilution (of 45%)
- Welfare applicants dealing with a "Never"-caseworker receive treatment in 0.09 of cases (on average): treatment migration (of 9%)
- Difference (first-stage estimate) = 0.55 - 0.09 = 0.46
- LATE = ITT / 0.46

Impact estimates: ITT and LATE
[Figure: impact estimates, ITT and LATE]

Recap so far
- Instrumental variables are used in field experiments with noncompliance
- They show the treatment effect for those who comply with the experiment
- Conditions for IV to work:
  - The instrument impacts the treatment (relevance): condition usually met in experiments
  - The independence assumption: with field experiments, this condition is satisfied
  - The exclusion restriction: the instrument should affect the outcome only via treatment

Let's move to observational data (not field experiments!)
Section 3.3 in MM: "The population bomb", the quality/quantity tradeoff in number of children
- With observational data, we cannot exploit random assignment of experiments
- But other variables may satisfy IV conditions

Examples of instruments
Angrist, Lavy and Schlosser (2010):
- RQ: Do more children lead to less educated children?
- Outcome measure: educational attainment of the first child in families with at least two children
- "Treatment": number of children in the family
- Instrument: second child a twin or singleton (dummy variable)
Relevance: Indeed more children when the second (and third) are twins.
Independence: Having a twin is not correlated with omitted variables that explain the relation between child education and number of children.
Exclusion: In itself, having twin brothers/sisters should not affect educational attainment of the first born, except through the number of children.

More examples of instruments
Miguel, Satyanath, and Sergenti (2004):
- RQ: Do worse economic conditions lead to more civil conflict?
- Outcome measure: civil conflict
- Treatment: economic growth
- Instrument: rainfall
Relevance: In Africa, rainfall and economic conditions (quality of agricultural land) are strongly (positively) correlated.
Independence: Rainfall should not be correlated with omitted variables that affect the relation between economic conditions and conflict.
Exclusion: In itself, rainfall should not affect civil conflict except through economic conditions.

Two-stage least squares (TSLS)
- Treatments and instruments can be continuous variables
- This calls for a setup with D and Z as continuous variables, and with control variables (A)
- For now, strong advice to stick to a model with only one instrument (see also MM, pp. 145-146)
- In the example: instrument = rainfall, treatment = economic growth

To test whether the instrument has sufficient strength to begin with, you can start by estimating a reduced-form model (ITT) for outcome Y with A as controls:
$Y_i = \alpha_0 + \rho Z_i + \gamma_0 A_i + e_{0i}$
ρ measures the effect of the instrument Z on the outcome measure Y. If the exclusion restriction holds, this effect runs through the treatment variable D!
If there is an effect, you can try TSLS. You then first estimate the impact of the instrument (Z) on the treatment (D), while including A as controls:
$D_i = \alpha_1 + \Phi Z_i + \gamma_1 A_i + e_{1i}$
Next, the first-stage fitted values of D, $\hat{D}_i$, can be used in the second-stage equation:
$Y_i = \alpha_2 + \lambda_{2SLS} \hat{D}_i + \gamma_2 A_i + e_{2i}$
Φ measures the effect of the instrument Z on the treatment D. This effect is needed to rescale the ITT effect (ρ).
$\lambda_{2SLS}$ is the causal impact of D on Y. First-stage fitted values of D are used in order to isolate the random variation induced by the instrument!
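The reduced form, first stage and second stage can be traced by hand in a minimal Python sketch (not the STATA workflow the lecture uses). It assumes one instrument and, unlike the equations in the lecture, no control variables A. The data and the helper `ols` are hypothetical. In this simple case the second-stage coefficient equals the reduced-form coefficient rescaled by the first stage, ρ / Φ.

```python
from statistics import mean


def ols(y, x):
    """Return (intercept, slope) of a simple OLS regression of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Made-up data: Z = instrument, D = treatment, Y = outcome (no controls A).
Z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
D = [1.0, 1.5, 3.5, 3.0, 5.5, 5.0]
Y = [2.0, 2.5, 6.0, 7.5, 10.0, 11.0]

_, rho = ols(Y, Z)                     # reduced form (ITT): Y on Z
alpha1, phi = ols(D, Z)                # first stage: D on Z
D_hat = [alpha1 + phi * z for z in Z]  # first-stage fitted values of D
_, lam_2sls = ols(Y, D_hat)            # second stage: Y on fitted D

print(lam_2sls, rho / phi)             # the two numbers coincide
```

Running the two regressions by hand reproduces the point estimate only. The standard errors of a manual second stage are not correct, which is one reason to use a dedicated TSLS routine in practice.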
Miguel, Satyanath, and Sergenti (2004)
- Reduced form: less rainfall is associated with more conflict
- First stage: more rainfall is associated with more economic growth
- Second stage: economic growth decreases civil conflict

Test your assumptions
The instrumental variable (IV) method requires assumptions. How to test for these assumptions?
- Relevance assumption: test by studying the F-statistic of the first-stage regression. Rule of thumb: F > 10 is fine.
- Independence assumption: the instrument (treatment assignment) is not correlated with omitted variables that we want to control for. Compare the values of the control variables of the group assigned to treatment and the group assigned to control.
- Exclusion restriction: the untestable one. You need to have a "story".

Summarizing
- In field experiments with non-compliance, instrumental variable estimation infers true effects from Intention-to-Treat (ITT) effects
- IV estimation can also be used with observational data. This also requires three assumptions to be met:
  - Relevance: the instrument should impact the treatment
  - The independence assumption: as-good-as-random assignment of the instrument
  - The exclusion restriction: the instrument should affect the outcome only via treatment
- To implement all this, one can use the TSLS (or 2SLS) framework.

How does this all work in STATA?
- General warning: STATA can do a whole lot for you, but you have to understand what's going on!
- Prior to using TSLS, you can use the command "regress" or "reg" to obtain estimates of the reduced form (i.e., regression of 'y' on the instrument), which gives the ITT effect of the instrument
  - So perform this regression without using the treatment variable!
- "ivregress" is the command in STATA that allows you to perform TSLS (or: "ivreg")
  - Use the option "first", so as to make sure you can see and interpret the first-stage estimates!
  - Read the instruction manual of STATA first, concentrate on 2SLS

https://www.youtube.com/watch?v=lbnswRJ1qV0

Data in STATA: example
The effect of education on wages

ivregress