Principles of Empirical Analysis: Quasi-Experiments and Instrumental Variables

Summary

This document introduces principles of empirical analysis, focusing on quasi-experiments, instrumental variables, and regression discontinuity design. It covers key concepts such as treatment effects, causal inference, and how to address endogeneity issues. The material is valuable for students and researchers seeking to understand and apply these methods.

Full Transcript

**Principles of Empirical Analysis**

Section 2

**Lecture 6 -- Introduction to Quasi-Experiments**

T = the independent variable, the variable of interest; the treatment.

Y = the dependent variable, the outcome variable.

Limits of RCTs:

- Costly
- Unethical
- Unhelpful when studying historical questions.
- Unhelpful for understanding the market level.

Using observational data and clever designs can allow researchers to study causal questions without needing to perform a specific experiment.

Observational data: data collected as part of the normal functioning of society's institutions, etc.

Observational study: draws inferences from a sample of a population where the independent variable is not under the control of the researcher.

1. Selection on observables: T and C groups differ from each other only w.r.t. observable characteristics.
2. Selection on unobservables: T and C groups differ from each other in unobservable characteristics.
   - Something unexpected happened, affecting some people and not others in an almost random manner.
   - Exogenous variable induces variation in treatment -- IV.
   - Selection mechanism is known -- RDD.
   - Treatment and controls are observed before and after treatment -- DiD.

Natural/quasi-experiment: when something unexpected happens, like a government policy or a natural event, affecting some households in a way that resembles an experiment.

- Provides a treatment and a control group.

Example: Quasi-experiment

Two areas with fixed housing supply (city, suburbs) and two fixed types of households (rich, poor).

- Segregation by income: rich in city, poor in suburbs.
- Due to income inequality, quality of neighbourhoods, optimizing behaviour.
- Neighbourhood effect: where you live can have direct or indirect effects on socio-economic outcomes.

RQ: would the children in a family benefit if the family moved next to high-income families?

How to isolate the effect of the treatment (living in a rich neighbourhood) on our outcome (benefits to children)?

1. Control for observable differences: compare people with similar observable and measurable characteristics.
   - But if the families are assumed to be similar, why would they make different residential location choices?
   - Those who choose to move may put more resources into parenting than observably similar parents who remain in poor neighbourhoods. -\> unobservable differences.

Public housing demolition:

- Provides low-income households with resources to move to a different residential area.
- Households forced to relocate due to the demolition received housing vouchers.
- Treatment and control occur naturally -\> the division of residents was not planned for research purposes -\> quasi-experiment.
- Compare the outcomes, as young adults, of children who were displaced with those of non-displaced children from the same public housing project.
- T = displaced
- C = non-displaced

Key assumption I:

1. The decisions about which buildings to demolish were unrelated to the characteristics of the tenants.
2. Households and children were similar in T and C.
   - If they are similar in characteristics the researcher can observe, then it is plausible they are similar in characteristics the researcher cannot observe.
   - Balance tests.

Key assumption II:

1. The demolition has no effect on the children who were not displaced; the treatment had no effect on the control group.

Balance test: a test to assess whether T and C groups are comparable across various observable characteristics.

- Crucial because, unlike true random assignment, which ensures that the groups are statistically equivalent, quasi-experiments rely on non-random assignment.

Main effects

Heterogeneous effects: how the effect on the outcome variable differs by subgroup.

**Lecture 7 -- Instrumental Variable (IV) pt1.**

An exogenous variable induces variation in treatment -\> instrumental variable.

Imperfect compliance: some units randomized into treatment do not get treated, and some randomized into the control group still get treated.
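To make imperfect compliance concrete, the following is a minimal sketch with an invented toy population. The three type labels (compliers, always-takers, never-takers) are the standard names for the behaviours described above; the function names and the 6/2/2 split are assumptions chosen purely for illustration.

```python
# Hypothetical labels for the three compliance types.
COMPLIER, ALWAYS_TAKER, NEVER_TAKER = "complier", "always-taker", "never-taker"


def receives_treatment(unit_type: str, assigned: bool) -> bool:
    """Return whether a unit ends up treated, given its type and assignment."""
    if unit_type == ALWAYS_TAKER:
        return True   # treated even when randomized into control
    if unit_type == NEVER_TAKER:
        return False  # untreated even when randomized into treatment
    return assigned   # compliers follow their assignment


# Invented toy population: 6 compliers, 2 always-takers, 2 never-takers.
population = [COMPLIER] * 6 + [ALWAYS_TAKER] * 2 + [NEVER_TAKER] * 2


def treated_share(assigned: bool) -> float:
    """Share of the population that is treated under a given assignment."""
    treated = sum(receives_treatment(u, assigned) for u in population)
    return treated / len(population)


share_if_assigned = treated_share(True)       # compliers + always-takers
share_if_not_assigned = treated_share(False)  # always-takers only
complier_share = share_if_assigned - share_if_not_assigned

print(share_if_assigned, share_if_not_assigned, complier_share)
```

Individual types are never observed in real data, but the difference in treated shares between the two assignment groups recovers the share of compliers, which is the quantity the later IV material relies on.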
Randomization ensures that the share of each compliance type is equally large in the treatment and control groups.

- Comparing always-takers + compliers with never-takers + compliers, or always-takers + compliers with always-takers + never-takers + compliers, is not valid: the groups differ in composition.
- Comparing everyone randomized into treatment with everyone randomized into control remains valid.

![A diagram of a group of people](media/image2.png)

Intention to treat effect (ITT): measures the impact of being assigned to the treatment group versus being assigned to the control group, regardless of compliance.

Local average treatment effect (LATE): estimates the effect of the treatment on the compliers in the T and C groups.

- The issue is that we cannot directly observe who the compliers are.
- However, the share of compliers can be estimated. -\> Wald estimator.

![A math equation with black text](media/image4.png)

For LATE with IV:

- ITT: the difference in the expected value of the outcome variable between those assigned to T and those assigned to C.
- Share of compliers: the difference between the groups in the proportion who actually receive the treatment (the compliance rate); how many received the treatment because they were assigned, relative to those who were not assigned.
- The impact of receiving the treatment may differ for compliers from that for never- and always-takers.

Instrumental variables

- Answers the causal question: does T affect Y?
- Instrument: find something exogenous that we can measure that:
  1. Only affects T, whose effect on Y we want to estimate.
  2. Cannot affect Y directly.

Example: Does the amount of schooling a person gets affect their future wages?

- Instrument: an exogenous factor, as good as random, that affects only the amount of schooling but not wages directly.

IV conditions:

1. Relevance condition: the instrument should be correlated with the variable of interest.
   - The instrument has a causal effect on the variable whose effects we want to measure.
2. Exogeneity: the instrument is randomly assigned, or as good as randomly assigned -\> unrelated to the omitted variables we would control for if we could.
3. Exclusion restriction: the chosen instrument should only affect outcomes through the treatment variable.

![A diagram of a performance](media/image6.png)

First stage: the relationship between the IV and the explanatory variable.

- IV is winning the lottery.
- Explanatory variable is the likelihood to attend.

Reduced form:

- Relates the IV to Y.

Second stage:

- Outcome is Y and treatment T is attending.
- The causal effect of attending the school is isolated, controlling for confounding factors correlated with attendance and grades.

1. Relevance condition: winning the lottery is correlated with the likelihood to attend.
2. Exogeneity: winning the lottery is not correlated with omitted variables, like motivation or grades, because the lottery is randomly assigned.
3. Exclusion restriction: winning the lottery has no direct impact on student grades except through the treatment (attending the school).

**Lecture 9 -- Instrumental Variable (IV) pt2.**

Starting point: we want to estimate the effect of T on the outcome, but we suspect there are variables that affect both treatment status and outcome: confounding factors.

Possible solution: we find some variable which is random or almost random and which affects treatment, introducing exogenous random variation in treatment: an IV.

Threats: the IV should not affect the outcome directly, and cannot be systematically correlated with confounding factors affecting T and Y.

Testing IV conditions:

1. Relevance
   - First-stage regression to check for correlation between the instrument and the variable of interest (T).
2. Exogeneity
   - It is not fully testable, as we cannot check for correlation of the IV with unobservables -\> but we can check for correlation between the IV and observable confounding factors.
3. Exclusion restriction
   - Cannot be tested, so we must provide arguments in favour of the exclusion restriction.
   - Suggest threats to the exclusion restriction and show that they are not problematic.

We often face endogeneity issues: the treatment is often correlated with unobservables that also affect the outcome.

- An IV is an exogenous variable, not affected by unobserved factors influencing Y, that only affects the outcome through the treatment.
- It creates exogenous variation in treatment, allowing us to isolate the causal effect of the treatment.
- We estimate the treatment effect for the specific group whose treatment status changes because of the IV.

Example: T = economic shock, Y = conflict, Z = rainfall shocks

1. Relevance condition: rainfall shocks are correlated with economic shocks.
2. Exogeneity: rainfall shocks and conflict do not have correlated unobserved factors; rainfall should be random and not influenced by pre-existing conflict or economic shocks.
3. Exclusion restriction: rainfall does not affect conflict directly, only through the effect of rainfall on economic shocks, which in turn affect conflict.

IV:

1. A situation where potential OVB affects both T and Y.
2. Find an IV that is correlated with T, which affects Y.
3. The IV must not affect Y directly (exclusion restriction -- must be justified).
4. The IV is randomly assigned (exogeneity -- balance tests).
5. The IV is strongly correlated with T (relevance -- tested via the first stage).

Reduced form (RF): the effect of Z (the IV) on Y.

- We want the effect of T on Y; since Z is exogenous/as good as random/uncorrelated with confounding factors, the only reason Z affects Y is through T.

First stage (FS): how the IV affects who actually gets treated; the share of compliers.

RF/FS = rescale the effect of the IV on Y by the effect of the IV on T -\> causal effect in treatment units.
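The RF/FS rescaling can be sketched numerically. The example below is a minimal illustration with a binary instrument and invented toy data (an assumed baseline outcome of 10 and an assumed treatment effect of 5); the function names `mean` and `wald_estimate` are hypothetical helpers, not part of the lecture material.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def wald_estimate(z, t, y):
    """Return (reduced form, first stage, RF / FS) for a binary instrument z."""
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    t1 = [ti for zi, ti in zip(z, t) if zi == 1]
    t0 = [ti for zi, ti in zip(z, t) if zi == 0]
    rf = mean(y1) - mean(y0)  # effect of Z on Y (the ITT)
    fs = mean(t1) - mean(t0)  # effect of Z on T (share of compliers)
    if fs == 0:
        raise ValueError("First stage is zero: the instrument is not relevant.")
    return rf, fs, rf / fs


# Invented toy data: 10 units with Z = 1 and 10 units with Z = 0.
# With Z = 1, 8 units take the treatment; with Z = 0, 2 units still do.
z = [1] * 10 + [0] * 10
t = [1] * 8 + [0] * 2 + [1] * 2 + [0] * 8
# Assumed outcome model: baseline of 10 plus an effect of 5 when treated.
y = [10 + 5 * ti for ti in t]

reduced_form, first_stage, late = wald_estimate(z, t, y)
print(reduced_form, first_stage, late)
```

In this toy setup the reduced form is 3 outcome units and the first stage is about 0.6, so the ratio is approximately 5, the effect that was built into the data. In real applications the estimate applies only to compliers, and a weak first stage makes the ratio unstable.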
**Lecture 9 -- Regression Discontinuity Design (RDD)**

Selection mechanism is known

Start with a causal relationship in mind: we want to estimate the effect of some treatment on some outcome.

- We suspect some selection into the treatment: T is correlated with unobservables that may affect Y.

RDD: isolates the causal effect of T in situations where individuals become treated after crossing some arbitrary cutoff.

Sharp RDD: treatment is received with probability one above the cutoff and zero below.

Fuzzy RDD: the probability of receiving treatment increases discontinuously at the threshold. -\> imperfect compliance.

Assumption: the potential outcomes evolve smoothly across the cutoff.

- There is no precise manipulation of the running variable; observations just below the threshold are very similar to those just above it and therefore constitute a valid control group.

The problem of causal inference in RDD: you cannot observe both potential outcomes for the same unit.

- We cannot see the outcome of a unit both with and without treatment; we only see treated units on one side of the cutoff and untreated units on the other.
- Absence of common support: you cannot observe treated and untreated units with the same value of the running variable.

Local causal effect in RDD:

- Uses the discontinuity around the cutoff to estimate the treatment effect; those just above and just below are similar in all respects except the treatment assignment.
- The treatment effect is localized around the cutoff -\> local causal effect.
- Units just below the cutoff serve as the control group for those just above it.

Key assumption and points of RDD:

Units close to the cutoff are comparable to each other in all relevant aspects, except for their treatment status.

1. In a small neighbourhood around the cutoff we obtain conditions that mimic a randomized experiment; units on each side are as good as randomly assigned to either receive the treatment or not.
2. There is continuity of average potential outcomes near the cutoff; if we could observe the potential outcomes for all units near the cutoff, they would change gradually rather than abruptly.

Example: US legal drinking age

T = legal access to alcohol, Y = likelihood of dying and cause of death

Running variable: age

Cutoff: 21

Treatment: legal access to alcohol

![A graph of a treatment](media/image8.png)

Left side is the control. Right side is the treatment.

Issue: how do we know the jump in death rates is due to alcohol consumption? Additional data is required:

1. Data on alcohol consumption by age.
2. Data on causes of death by age.

Testing RDD:

- Underlying assumption: units cannot precisely manipulate their own value of the running variable.

1. Sorting of the running variable
   - If units could manipulate their running variable, they would sort to the right side if the treatment was beneficial.
   - With no manipulation, the number of observations just above the cutoff should be approximately the same as the number just below it.
   - Create a histogram of the running variable and inspect whether the numbers around the cutoff are similar.

   [Figure: two histograms of the running variable around the cutoff]

   - The graph on the right side shows signs of sorting: a noticeable jump in the number of treatment observations around the cutoff.
   - The left graph has a smooth transition in the number of observations from control to treatment.
2. Falsification tests: examining whether treatment units near the cutoff are similar to control units in terms of observable characteristics.
   - If units don't have the ability to manipulate the running variable, there should be no systematic differences between units with similar running variable values.

   ![A graph of health care](media/image10.png)

   - All predetermined variables should be analysed in the same way as the outcome of interest.
   - We can see that the graph is smooth, meaning the units have the same characteristics on both sides.
3. Placebo test 1: replacing the true cutoff value with a fake cutoff value in the running variable.
   - A significant treatment effect should only occur at the true cutoff value, and not at other values, where treatment status does not change.
4. Placebo test 2: run placebos at the true cutoff for different outcomes (Y).
   - Other outcomes should not be affected by the treatment.

Limitations of RDD:

1. Local randomization interpretation.
   - It is a randomized experiment within a window around the cutoff -\> only locally random.
   - Results can be generalized only to a narrow segment of the running variable.
   - A lot of data around the cutoff is needed.
2. Technical issues
   - RDD compares the means of those just above and just below, so how large should the bandwidth be (how much data away from the cutoff should we use)?
   - Bias-variance trade-off: the closer we stay to the cutoff, the less biased the causal effect estimate is, but the standard error is larger -\> more noise.

Addressing RDD limitations:

1. Authors must show their results are robust to different modelling and data choices.
   - Different bandwidths.
   - Different specifications of the relationship between Y and the running variable.
   - Linear vs non-linear, adding an interaction term.

Fuzzy RDD:

- When passing the cutoff changes the treatment probability, instead of switching treatment on and off completely.
- Imperfect compliance: the cutoff is an instrument for being treated -\> it increases the probability of treatment, but not to 1.

RDD:

Idea:

- If a rule determines treatment through a hard cutoff, we can use the rule to estimate the causal effect, without an RCT.

Criteria:

- Running variable
- Cutoff
- Treatment
- The probability of treatment is a function of the running variable that changes discontinuously at the cutoff.

Assumption:

- Units just below and just above are similar and comparable -\> they cannot manipulate their running variable.
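As an illustration of how a sharp RDD estimate and a crude sorting check could be computed, here is a minimal sketch. It fits a separate straight line on each side of the cutoff within a bandwidth and compares the two predictions at the cutoff. The data are invented and noise-free, and the names `fit_line` and `sharp_rdd` are hypothetical helpers; this is one possible specification among the several that the robustness discussion above calls for.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def sharp_rdd(running, outcome, cutoff, bandwidth):
    """Jump in the fitted outcome at the cutoff, plus counts on each side."""
    pairs = list(zip(running, outcome))
    below = [(x, y) for x, y in pairs if cutoff - bandwidth <= x < cutoff]
    above = [(x, y) for x, y in pairs if cutoff <= x <= cutoff + bandwidth]
    a0, b0 = fit_line(*zip(*below))  # control side
    a1, b1 = fit_line(*zip(*above))  # treatment side
    estimate = (a1 + b1 * cutoff) - (a0 + b0 * cutoff)
    return estimate, len(below), len(above)


# Invented data: the running variable goes from -3 to 3 in steps of 0.25,
# units at or above the cutoff (0) are treated, and the assumed true jump is 2.
cutoff = 0.0
running = [-3 + 0.25 * i for i in range(25)]
outcome = [1.0 + 0.5 * x + (2.0 if x >= cutoff else 0.0) for x in running]

jump, n_below, n_above = sharp_rdd(running, outcome, cutoff, bandwidth=2.0)
print(jump, n_below, n_above)
```

Re-running `sharp_rdd` with different bandwidths is a simple way to see the bias-variance trade-off in noisy data, and a large imbalance between the two counts near the cutoff would be a warning sign of sorting.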
Testing:

- Placebo tests, falsification tests, sorting (density tests)

Challenges:

- A lot of data around the cutoff is needed.
- Cannot extrapolate far from the cutoff (local causal effects).

**Lecture 11-12 -- Difference-in-differences (DiD)**

Treatment and controls are observed before and after treatment

- Weaker assumption: in the absence of treatment, the difference between the treatment and control groups is constant over time (parallel trends).
- With this assumption we can relax the requirement that the treatment and control groups are almost identical/as good as randomly assigned.

1. The pre-treatment difference between the groups is the normal difference.
2. The post-treatment difference between the groups is the normal difference + the causal effect of the treatment.
3. The difference-in-differences is the causal effect.

- We use the outcome of the control group as a counterfactual -\> what would have happened without treatment.

DiD: two groups, two time periods

- Two time periods and two groups, where the timing of the treatment is the same for all.

![A diagram of a treatment group](media/image12.png)

The control group captures common changes in both groups.

- We can use the control group's trend line, shifted by the mean difference before the treatment, as the counterfactual for what would have happened to the treatment group if they hadn't been treated.
- This is the key assumption.

2x2 regression model

\[ y_{it} = \alpha + \beta\,\mathit{treated}_i + \gamma\,\mathit{after}_t + \delta\,(\mathit{treated}_i \times \mathit{after}_t) + u_{it} \]

Treated = 1 if the observation is in the treatment group, otherwise 0.

After = 1 if the observation is from the after period, otherwise 0.

Treated\*after = 1 if the observation is from the treatment group and observed after treatment.

- Treated and after are dummy variables and their product is the interaction term.

Alpha is the intercept/constant term.
Control before: α

Control after: α + γ

Treatment before: α + β

Treatment after: α + β + γ + δ

Control after - control before = γ

Treatment after - treatment before = γ + δ

DiD = difference treatment - difference control = δ

- This gives us the causal effect.

Key assumptions of DiD:

1. Parallel trends
   - The outcomes of the treatment and control groups would follow the same trend in the absence of treatment.

Testing this assumption: since we cannot directly observe this counterfactual situation, we need to show the following things:

1. Parallel pre-trends: show that trends in the pre-period develop in a similar manner.

   [Figure: trends of the two groups in the pre-treatment periods]

   - Here we see similar trends in previous periods.
2. Common shocks: show that if other policies or changes coincided with the treatment period, they affected the control and treatment groups in the same way, or that there were no other shocks during the same time that would affect the groups differently.
   - Research the reform and institutional details.

Staggered timing: groups of units receive treatment at different times.

![A graph of different colored lines](media/image14.png)

- At the first dotted line: comparison of the early treated group with the control and the late treated group.
- At the second dotted line: comparison of the late treated group with the control and the early treated group.

Example: how does a reduction in air pollution affect the health of infants?

T = more vs less exposed to pollution

Y = premature birth and low birth weight.

Air pollution is not randomly assigned; studies comparing health outcomes across different levels of pollution exposure may ignore confounding factors.

- Higher-income households sort into locations with better air quality -\> overestimates the effect of pollution on health.
- Higher pollution in urban areas with more educated people who have better access to healthcare -\> underestimates the effect of pollution on health.
Problems with staggered treatment:

- The analysis consists of multiple 2x2 comparisons around the time windows when groups are treated.
- If treatment effects are heterogeneous (differ over time), then analysing with regular DiD methods can lead to biased results -\> type I and type II errors.

DiD:

Idea:

- Even if treated and control units differ in baseline characteristics, we can observe the treatment and control groups before and after treatment to estimate the causal effect.

Assumptions:

- The potential outcomes (control and counterfactual) would have developed in a parallel manner in the absence of treatment.
- Common shocks assumption: no differential changes over time for the treated and control groups.

Testing DiD:

- Visualization and testing: are the trends in outcomes parallel before treatment?
- Parallel trends in the pre-treatment period.
- Common shocks: same reaction or no shocks.
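The 2x2 DiD computation can be sketched with cell means. In the saturated 2x2 regression, the OLS coefficients coincide with differences of the four group-by-period means, so the sketch below computes them directly. The data are invented for illustration and `did_estimate` is a hypothetical helper name.

```python
def mean(values):
    """Arithmetic mean of a non-empty sequence."""
    return sum(values) / len(values)


def did_estimate(rows):
    """rows: iterable of (treated, after, y). Returns (alpha, beta, gamma, delta)."""
    cells = {}
    for treated, after, y in rows:
        cells.setdefault((treated, after), []).append(y)
    m = {key: mean(values) for key, values in cells.items()}
    alpha = m[(0, 0)]                     # control group, before period
    beta = m[(1, 0)] - m[(0, 0)]          # baseline difference between groups
    gamma = m[(0, 1)] - m[(0, 0)]         # common change over time
    change_treated = m[(1, 1)] - m[(1, 0)]
    change_control = m[(0, 1)] - m[(0, 0)]
    delta = change_treated - change_control  # difference-in-differences
    return alpha, beta, gamma, delta


# Invented toy data: two observations per group-by-period cell.
rows = [
    (0, 0, 9.0), (0, 0, 11.0),   # control, before
    (0, 1, 11.0), (0, 1, 13.0),  # control, after
    (1, 0, 12.0), (1, 0, 14.0),  # treated, before
    (1, 1, 18.0), (1, 1, 20.0),  # treated, after
]

alpha, beta, gamma, delta = did_estimate(rows)
print(alpha, beta, gamma, delta)
```

Here the treated group starts 3 units higher and both groups share a common change of 2, yet the estimate of delta is 4: a baseline difference does not bias the estimate as long as the trends would have been parallel without treatment.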