Regression Discontinuity Design

Questions and Answers

In a Sharp Regression Discontinuity (RD) design, why is the lack of common support a critical feature?

  • It necessitates extrapolation towards the cutoff point to compare control and treatment units, fundamentally shaping the RD analysis. (correct)
  • It ensures that the potential outcomes are identical for all units, simplifying the analysis.
  • It allows for direct comparison of control and treatment units across the entire range of the running variable.
  • It guarantees that units in the control and treatment groups have the same value of the running variable, ensuring a balanced comparison.

In Regression Discontinuity designs, if individuals cannot perfectly sort themselves around the cutoff, what does the discontinuous change in the probability of treatment allow us to estimate?

  • The local causal effect of the treatment around the cutoff. (correct)
  • The global causal effect of the treatment.
  • The average treatment effect across the entire population.
  • The administrative burden of treatment assignment.

What is the fundamental challenge in estimating the average treatment effect at a specific value of the score, $E[Y_i(1)|X_i = x] - E[Y_i(0)|X_i = x]$, in a Regression Discontinuity design?

  • The regression curves are always parallel, making it impossible to determine any difference.
  • The location of the cutoff is unknown.
  • The value of x is always zero.
  • Both potential outcomes (treated and untreated) are never observed for the same individual at the specific value of x. (correct)

In Regression Discontinuity, how are units with scores just below the cutoff utilized in estimating the local causal effect?

  • They serve as a control group for units with scores just above the cutoff. (correct)

In the context of Regression Discontinuity, the potential outcomes for individuals with high and low scores relative to the cutoff are likely to differ based on what pre-treatment characteristics?

  • Ability and motivation. (correct)

In regression discontinuity design (RDD), what is the primary trade-off when choosing a bandwidth around the cutoff point?

  • Narrower bandwidths reduce bias but increase variance; wider bandwidths reduce variance but may increase bias. (correct)

What is the key difference between sharp and fuzzy regression discontinuity designs (RDD)?

  • In sharp RDD, the probability of treatment changes discontinuously from 0 to 1 at the cutoff, while in fuzzy RDD, the probability of treatment changes discontinuously by an amount less than 1. (correct)

In the context of fuzzy Regression Discontinuity Design (RDD), what role does being above the cutoff serve?

  • It serves as an instrument for receiving the treatment because it increases the probability of treatment, though not necessarily to 1. (correct)

Why is it important to check the robustness of results to different modeling and data choices in a Regression Discontinuity Design (RDD)?

  • To address the limitations of RDD and ensure that results are not driven by specific, arbitrary choices. (correct)

Consider a scenario where admission to a specialized school is determined by a score on an entrance exam. School A admits students scoring above 90, School B admits students above 85. What RDD challenge does this situation exemplify?

  • There are multiple cutoffs, leading to potential complications in identifying treatment effects. (correct)

In the context of regression discontinuity design (RDD), what does the term 'bandwidth' refer to?

  • The range of values around the cutoff point of the running variable that are included in the local regression. (correct)

In a fuzzy RDD, which of the following statistical techniques is commonly employed to estimate the treatment effect, acknowledging the imperfect compliance?

  • Instrumental Variables (IV) regression, where the assignment based on the cutoff is used as an instrument for actual treatment. (correct)

Suppose a researcher is using RDD to study the effect of a scholarship on college enrollment. The cutoff is a minimum GPA of 3.5. The researcher finds that students with a GPA of 3.49 have a 60% enrollment rate, while those with a GPA of 3.51 have an 80% enrollment rate. What concern is MOST relevant?

  • Students may be manipulating their GPA to be above the cutoff, which could bias the results. (correct)

A researcher wants to study the effect of a new job training program on employment rates. Eligibility for the program is determined by an index score; individuals scoring above 50 are eligible. However, not everyone eligible enrolls in the program. Assuming the researcher uses a Regression Discontinuity Design (RDD), which of the following is the most critical assumption for valid causal inference?

  • Individuals near the cutoff are similar in all respects except for their eligibility status and subsequent program enrollment. (correct)

In Regression Discontinuity Design (RDD), what key assumption must hold true to ensure the validity of causal inferences?

  • Units do not have the ability to precisely manipulate their own value of the running variable. (correct)

When testing the assumptions of Regression Discontinuity Design (RDD), what does plotting the histogram of the running variable help to identify?

  • Whether observations are evenly distributed around the threshold. (correct)

In the context of Regression Discontinuity Design (RDD), what is the purpose of checking if observations just above and just below the threshold are similar with respect to other observables?

  • To validate the assumption that assignment to treatment is essentially random at the cutoff. (correct)

In Regression Discontinuity Design, the treatment effect at the cutoff is estimated as $\lim_{x \downarrow c} E[Y_i|X_i = x] - \lim_{x \uparrow c} E[Y_i|X_i = x]$, where $c$ is the cutoff. What does this expression represent?

  • The difference in expected outcomes for units just above and just below the cutoff. (correct)

What additional data would be most useful to strengthen the conclusion that a jump in death rates after age 21 in the US is due to increased alcohol access and consumption?

  • Data on alcohol consumption by age and causes of death by age. (correct)

When using binned scatter plots in RDD, what is the primary reason for fitting regression lines separately on each side of the cutoff?

  • To allow for different relationships between the running variable and the outcome on either side of the threshold. (correct)

What is the potential consequence of units being able to precisely manipulate their value of the running variable in Regression Discontinuity Design (RDD)?

  • It biases the estimation of the treatment effect at the cutoff. (correct)

In a Regression Discontinuity Design (RDD) studying the effect of a policy change at age 21 on health outcomes, what is the most plausible threat to the validity of the design?

  • Individuals accurately sorting themselves around the age 21 threshold to gain the policy benefit. (correct)

How does Regression Discontinuity Design (RDD) address the challenge of confounding variables when estimating treatment effects?

  • By assuming that treatment assignment is essentially random in a narrow window around the cutoff. (correct)

In Regression Discontinuity Design (RDD), what critical assumption ensures the validity of causal inference around the cutoff?

  • Units on either side of the cutoff are comparable in all relevant aspects, except for their treatment status. (correct)

Within the local randomization framework of Regression Discontinuity Design (RDD), how are units near the cutoff treated?

  • They are considered to be randomly assigned to treatment or control groups. (correct)

What does the continuity-based framework in Regression Discontinuity Design (RDD) assume regarding potential outcomes near the cutoff?

  • Potential outcomes are continuous, meaning average potential outcomes change smoothly near the cutoff. (correct)

Considering the example of the minimum legal drinking age in the US, what serves as the running variable in a Regression Discontinuity Design (RDD) examining the effect of alcohol access on mortality?

  • Age. (correct)

In the context of Regression Discontinuity Design (RDD), what characterizes the relationship between the running variable ('a') and the treatment status ('Da') in the legal drinking age example?

  • Da is a deterministic and discontinuous function of 'a', changing abruptly at the cutoff. (correct)

How does Regression Discontinuity Design (RDD) address the issue of treatment assignment not being random?

  • By focusing on the discontinuity in treatment assignment at the cutoff point. (correct)

In Regression Discontinuity Design (RDD), what is the significance of observing a discontinuity in the outcome variable at the cutoff point?

  • It provides evidence of a causal effect of the treatment on the outcome. (correct)

Flashcards

E[Y(1)|X]

The average potential outcome when treated, observed for those with high scores.

E[Y(0)|X]

The average potential outcome when NOT treated, observed for those with low scores (below the cutoff).

Local Causal Effect

The effect of a treatment at the cutoff point in a Regression Discontinuity design. Control units with scores just below the cutoff are compared to treated units with scores just above it.

Lack of Common Support

In RD designs, units in control and treatment groups cannot have the same value of the running variable.

Extrapolation in RD

Estimating treatment effects at the cutoff by extending the observed relationships. We can't observe both treated and untreated states for the same value of X.

RDD Key Assumption

Units near the cutoff are similar in all aspects except treatment status.

RDD as Local Random Experiment

Units near cutoff are as if randomly assigned to treatment or control groups. (Local Randomization)

Continuity of Potential Outcomes Framework

Average potential outcomes change smoothly near the cutoff. (Continuity-Based Framework)

Minimum Legal Drinking Age Example

Legal drinking age at 21 in the US

Treatment (T) in Drinking Age Example

Legal access to alcohol.

Outcome (Y) in Drinking Age Example

Likelihood of death (or specific cause of death).

Discontinuity of Treatment Status

Treatment status abruptly changes at a specific point based on a running variable.

Sharp Cutoff

The value of the outcome variable changes sharply at a specific cutoff point of the running variable.

Treatment Effect at Cutoff

The difference in the outcome variable at the cutoff point, revealing the immediate impact of the treatment.

Treatment Effect Formula

Calculated as the difference between the regression line of the treated group (α1) and the control group (α0) at the cutoff point.

Binned Scatter Plots

Papers often aggregate individual level micro data into smaller bins to show relationships more clearly.

RDD Mechanisms

Data is often needed regarding alcohol consumption and causes of death by age to verify that the increase of death rates is actually caused by increased alcohol consumption.

RDD Assumption

Individuals should not be able to precisely control or manipulate their value of the running variable around the cutoff.

Testing for Sorting

Checking whether the observations are distributed evenly around the cutoff.

Checking Observables

Verifying if observable characteristics of individuals just above and below the threshold are similar.

Histogram of Running Variable

A graph displaying how observations are grouped across the range of the running variable.

Impact of Limited Data in RDD

Using fewer data points results in a larger variance or standard error in the estimate, indicating more noise.

Bandwidth in RDD

The segment of observations around the cutoff used for estimating the local linear regression.

Local Linear Regression (Below Cutoff)

Estimates the expected outcome (e.g., death rate) given age, specifically for those slightly younger than the cutoff (e.g., between 19 and 21). It models the relationship as a linear function.

Local Linear Regression (Above Cutoff)

Estimates the expected outcome (e.g., death rate) given age, specifically for those slightly older than the cutoff (e.g., between 21 and 23). It models the relationship as a linear function.

Robustness in RDD

Verifying that the RDD results are consistent and reliable under different analysis choices.

Bandwidth Variation

Varying the range of data points considered around the cutoff to ensure the results are not overly sensitive to the specific bandwidth chosen.

Specification Variation

Testing different equations or models that describe the relationship between the running variable and the outcome to ensure the findings are not model-dependent.

Fuzzy Regression Discontinuity (Fuzzy RD)

An RDD where the cutoff leads to a change in the probability or intensity of treatment, but not a complete switch.

Discontinuous Treatment Probability

The probability of receiving treatment changes discontinuously at the cutoff.

Study Notes

  • Regression discontinuity design (RDD) is an approach to estimate the causal effect of treatment

Observational Alternatives to Experiments

  • Selection on observables: treatment and control groups differ in observable characteristics
  • Selection on unobservables: treatment and control groups differ in unobservable characteristics
  • Instrumental variables: an exogenous variable induces variation in treatment
  • Regression discontinuity designs: the selection mechanism is known
  • Difference-in-differences: treatment and control groups are observed before and after treatment

RDD Basics

  • RDD was introduced by Thistlethwaite and Campbell in 1960
  • They used it to study the impact of merit awards, which were given if a test score exceeded a cutoff
  • RDD reappeared and was formalized in economics in the late 1990s
  • Has proven to be a powerful causal tool in empirical economics and other disciplines
  • Disciplines include: political science, education, epidemiology, criminology
  • RDD has strong internal validity but is very data intensive
  • It requires many observations near the cutoff

Causality and RDD

  • RDD isolates the causal effect of treatment where individuals become treated after crossing an arbitrary cutoff
  • RDD has three fundamental components: running variable, cutoff, and treatment
  • Individuals become treated after crossing a cutoff in the running variable
  • Sharp RDD involves treatment received with probability zero below the cutoff and one above it
  • Fuzzy RDD involves the probability of receiving treatment increasing discontinuously at the threshold
  • RDD assumes potential outcomes evolve smoothly across the cutoff
  • If there is no manipulation of the running variable, observations just below the threshold are similar to those just above, forming a control group

Running Variables and Cutoffs

  • Examples of running variables and cutoffs: entry to high school/university depending on test scores/GPA
  • Eligibility to vote or buy alcohol after a certain age
  • Access to services based on residential location and catchment areas
  • Candidate vote share determining election status (treatment)
  • Speeding fines in Finland based on exceeding the speed limit by more than 20 km/h

Sharp RDD

  • Example: test scores determine admission to a course or school
  • The running variable is the test score
  • The cutoff is the threshold score required to pass the exam, for example 50/100
  • Treatment is attending the course or school

Causal Inference Problem in RDD

  • The fundamental problem of causal inference appears here because the outcome under control can be observed only for units below the cutoff
  • Likewise, the outcome under treatment can be observed only for units above the cutoff
  • Example: test scores determine entry to education
  • Two people with the same score are never observed with one accepted and the other rejected, so there is no common support in scores between the accepted and rejected groups. Potential outcomes are also likely to differ between those who score high and low, because of pre-treatment characteristics such as ability and motivation

Local Causal Effect

  • If units cannot perfectly "sort" around the cutoff, the discontinuous change in treatment probability can still be used to understand the local causal effect
  • Units with scores barely below the cutoff can be used as a control group
  • Units with scores barely above the cutoff can be used as the treatment group
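
As a rough illustration of the comparison described above, the sketch below takes the difference in mean outcomes between units barely above and barely below the cutoff. The function name `local_mean_difference`, the `window` parameter, and the data are hypothetical and not from the lesson; a real analysis would normally fit local regressions instead of comparing raw means.

```python
from statistics import mean


def local_mean_difference(scores, outcomes, cutoff, window):
    """Mean outcome just above the cutoff minus mean outcome just below it.

    Units with cutoff <= score < cutoff + window act as the treatment group;
    units with cutoff - window <= score < cutoff act as the control group.
    """
    above = [y for x, y in zip(scores, outcomes) if cutoff <= x < cutoff + window]
    below = [y for x, y in zip(scores, outcomes) if cutoff - window <= x < cutoff]
    if not above or not below:
        raise ValueError("no observations on one side of the cutoff within the window")
    return mean(above) - mean(below)


# Made-up test scores and outcomes; the cutoff of 50 is an assumption.
scores = [46, 47, 48, 49, 50, 51, 52, 53]
outcomes = [10.0, 11.0, 10.0, 11.0, 14.0, 15.0, 14.0, 15.0]

estimate = local_mean_difference(scores, outcomes, cutoff=50, window=2)
print(estimate)
```

Only the four units with scores 48 to 51 enter the estimate, which is why the result is a local effect and says nothing about units far from the cutoff.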

Key Point and Assumption of RDD

  • Units with similar score values on opposite sides of the cutoff are comparable in all aspects except treatment status
  • Local randomization framework: in a small neighborhood around the cutoff, conditions mimic a randomized experiment
  • Continuity-based framework: average potential outcomes are continuous near the cutoff
  • Example: the legal drinking age in the U.S. is 21
  • T is legal access to alcohol and Y is the likelihood of dying (overall or from a specific cause)

Alcohol and Deaths

  • Treatment status is a deterministic function of age: age alone determines whether a person is treated
  • Treatment status is a discontinuous function of age: however close age gets to the cutoff, treatment status remains unchanged until the cutoff is reached
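
In symbols, using the lesson's notation of $a$ for age and $D_a$ for treatment status, the two properties above can be written as follows. The symbol $\tau$ and the limit form of the estimand are added here as a common way to express the jump at the cutoff; the notes do not state them explicitly.

```latex
D_a =
\begin{cases}
  1 & \text{if } a \geq 21 \\
  0 & \text{if } a < 21
\end{cases}
\qquad
\tau = \lim_{a \downarrow 21} E[Y \mid a] - \lim_{a \uparrow 21} E[Y \mid a]
```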

Testing RDD Assumptions

  • RDD assumes that units cannot precisely manipulate their value of the running variable
  • This can be tested by checking for sorting and by checking whether observations just above and just below the threshold are similar in their observable characteristics
  • Placebo tests can also support the assumption, using cutoffs where treatment does not change (for example, other ages in the drinking example)

Sorting or Manipulation of the Running Variables

  • RDD assumes units cannot precisely manipulate their value of the running variable
  • If treatment is beneficial, units would want to receive it and would sort onto the treated side of the cutoff
  • With no manipulation, the number of treated observations just above the cutoff should be similar to the number of control observations just below it
  • Test: use a histogram of the running variable to check whether the numbers of observations on either side of the cutoff are similar
  • A formal statistical density test (the McCrary test) can also be used
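
The histogram check above can be sketched in code by counting observations in a narrow window on each side of the cutoff. The example below adds an exact two-sided binomial test of an even split; this is a simplified stand-in chosen for illustration, not the McCrary test. The function name `sorting_check`, the `window` parameter, and the data are hypothetical.

```python
from math import comb


def sorting_check(running, cutoff, window):
    """Count units just below and just above the cutoff and test for an even split.

    Returns (n_below, n_above, p_value), where p_value comes from an exact
    two-sided binomial test of the hypothesis that each unit in the window is
    equally likely to fall on either side of the cutoff.
    """
    n_below = sum(1 for x in running if cutoff - window <= x < cutoff)
    n_above = sum(1 for x in running if cutoff <= x < cutoff + window)
    n = n_below + n_above
    k = min(n_below, n_above)
    # P(X <= k) for X ~ Binomial(n, 0.5); doubled because the null is symmetric.
    lower_tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    p_value = min(1.0, 2 * lower_tail)
    return n_below, n_above, p_value


# Made-up running-variable values around an assumed cutoff of 50.
balanced = [48, 49, 49, 50, 51, 51]
bunched = [49] + [50] * 5 + [51] * 4  # suspicious pile-up just above the cutoff

print(sorting_check(balanced, cutoff=50, window=2))
print(sorting_check(bunched, cutoff=50, window=2))
```

A small p-value for the bunched sample flags possible sorting, while the balanced sample gives no such signal. The equal-split null is only reasonable in a narrow window where the density of the running variable is roughly flat.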

Test of Observable Variables

  • One of the most important RDD falsification tests examines if treated units are similar to control units near the cutoff in terms of observable characteristics
  • If units lack the ability to alter the running variable, there should be no systematic differences
  • Each predetermined variable should be analyzed with the same RDD procedure, taking that variable as the outcome of interest

Placebo Tests

  • Placebo test 1 replaces the true cutoff with a fake cutoff in the running variable, a value at which treatment status does not actually change, and performs estimation and inference at this “fake” cutoff. A significant treatment effect should appear only at the true cutoff and not elsewhere.
  • Placebo test 2 keeps the true cutoff but replaces the outcome Y with other outcomes that should not be affected by the treatment
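
A minimal sketch of placebo test 1, under the assumption that a simple difference in local means is an acceptable estimator for illustration. The function name `jump_at`, the fake cutoffs, and the data are hypothetical. The fake cutoffs are chosen so that their windows do not overlap the true cutoff.

```python
from statistics import mean


def jump_at(running, outcome, cutoff, window):
    """Difference in mean outcomes just above versus just below a candidate cutoff."""
    above = [y for x, y in zip(running, outcome) if cutoff <= x < cutoff + window]
    below = [y for x, y in zip(running, outcome) if cutoff - window <= x < cutoff]
    return mean(above) - mean(below)


# Made-up data: the outcome shifts by 4 at the true cutoff of 50 and nowhere else.
running = list(range(40, 60))
outcome = [10.0 if x < 50 else 14.0 for x in running]

true_jump = jump_at(running, outcome, cutoff=50, window=3)
placebo_jumps = {c: jump_at(running, outcome, cutoff=c, window=3) for c in (44, 56)}

print("true cutoff:", true_jump)
print("fake cutoffs:", placebo_jumps)
```

Placebo test 2 could reuse the same function by passing a predetermined variable in place of `outcome` and keeping the true cutoff.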

Technical Issues

  • In principle, RDD compares means for those just above and those just below the cutoff
  • Often there is not enough data to estimate the treatment effect simply by comparing means at the cutoff
  • This requires choosing a bandwidth, which balances accurate estimation (low bias) against having enough data points (low variance)
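
One common way to use observations within a bandwidth is to fit a separate straight line on each side of the cutoff and take the difference of the two fitted values at the cutoff (α1 minus α0 in the flashcards). The sketch below does this with ordinary least squares written out by hand. The function names, the bandwidth of 2, and the noise-free data are assumptions made for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def local_linear_rd(running, outcome, cutoff, bandwidth):
    """Fit separate lines below and above the cutoff, within the bandwidth.

    The running variable is centered at the cutoff, so each intercept is the
    fitted outcome at the cutoff. Returns (alpha0, alpha1, alpha1 - alpha0).
    """
    left_x, left_y, right_x, right_y = [], [], [], []
    for x, y in zip(running, outcome):
        if cutoff - bandwidth <= x < cutoff:
            left_x.append(x - cutoff)
            left_y.append(y)
        elif cutoff <= x <= cutoff + bandwidth:
            right_x.append(x - cutoff)
            right_y.append(y)
    alpha0, _ = fit_line(left_x, left_y)
    alpha1, _ = fit_line(right_x, right_y)
    return alpha0, alpha1, alpha1 - alpha0


# Made-up death rates by age; these numbers are not real mortality data.
ages = [19.0, 19.5, 20.0, 20.5, 21.0, 21.5, 22.0, 22.5]
rates = [88.0, 88.5, 89.0, 89.5, 98.0, 99.0, 100.0, 101.0]

alpha0, alpha1, effect = local_linear_rd(ages, rates, cutoff=21, bandwidth=2)
print(alpha0, alpha1, effect)
```

Fitting each side separately lets the slope differ across the cutoff, which matches the reason given in the quiz for drawing separate regression lines in binned scatter plots.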

How to Address Limitations - Robustness

  • An RDD paper must show robustness to different modelling and data choices, and demonstrate that results are similar with different bandwidths around the cutoff
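
A bandwidth robustness check can be sketched as a loop that re-estimates the jump for several bandwidths and compares the results. The example below uses a local linear fit on each side of the cutoff. The function names, the bandwidth grid, and the made-up data (a jump of 8 plus mild curvature) are assumptions for illustration only.

```python
def intercept_at_zero(points):
    """OLS intercept of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return mean_y - (sxy / sxx) * mean_x


def rd_estimate(running, outcome, cutoff, bandwidth):
    """Local linear RD estimate: right intercept minus left intercept at the cutoff."""
    pairs = list(zip(running, outcome))
    left = [(x - cutoff, y) for x, y in pairs if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in pairs if cutoff <= x <= cutoff + bandwidth]
    return intercept_at_zero(right) - intercept_at_zero(left)


def made_up_rate(age):
    """Hypothetical outcome: smooth curved trend plus a jump of 8 at age 21."""
    u = age - 21
    return 90 + u + 0.1 * u ** 2 + (8 if age >= 21 else 0)


ages = [19 + 0.25 * i for i in range(17)]  # 19.0, 19.25, ..., 23.0
rates = [made_up_rate(a) for a in ages]

estimates = {h: rd_estimate(ages, rates, cutoff=21, bandwidth=h) for h in (1.0, 1.5, 2.0)}
for h, est in estimates.items():
    print(f"bandwidth {h}: estimated jump {est:.3f}")
```

Because the linear fit only approximates the curved trend, the estimate moves slightly as the bandwidth widens. Estimates that stay close to each other across bandwidths are what a robustness table is meant to show.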

Fuzzy RD (Regression Discontinuity)

  • In a fuzzy RD, passing the cutoff creates a jump in treatment probability or treatment intensity, rather than switching the treatment on or off completely
  • Example: different schools have different cutoffs. Suppose there are 3 schools, and school 1 has the highest cutoff for intake. Examples include, but are not limited to, exam schools and charter schools.
  • Scoring above the cutoff for school 1 increases the probability of attending an exam school, but not to p=1
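
Because crossing the cutoff acts as an instrument for treatment in a fuzzy RD, one simple estimator divides the jump in the outcome by the jump in the treatment share at the cutoff (a Wald-type ratio). The sketch below uses local means on each side; the function name `fuzzy_rd_wald`, the `window` parameter, and the data are hypothetical, and applied work would typically use IV regression with local linear fits instead.

```python
from statistics import mean


def fuzzy_rd_wald(running, treated, outcome, cutoff, window):
    """Wald-type fuzzy RD estimate: jump in outcome divided by jump in treatment share."""
    rows = list(zip(running, treated, outcome))
    above = [(d, y) for x, d, y in rows if cutoff <= x < cutoff + window]
    below = [(d, y) for x, d, y in rows if cutoff - window <= x < cutoff]
    jump_outcome = mean([y for _, y in above]) - mean([y for _, y in below])
    jump_treatment = mean([d for d, _ in above]) - mean([d for d, _ in below])
    if jump_treatment == 0:
        raise ValueError("treatment share does not change at the cutoff")
    return jump_outcome / jump_treatment


# Made-up data with imperfect compliance: 25% treated below the cutoff, 75% above.
running = [48, 48, 49, 49, 50, 50, 51, 51]
treated = [0, 0, 0, 1, 1, 1, 1, 0]
outcome = [10.0, 10.0, 10.0, 14.0, 14.0, 14.0, 14.0, 10.0]

estimate = fuzzy_rd_wald(running, treated, outcome, cutoff=50, window=2)
print(estimate)
```

In this toy sample the outcome jumps by 2 while the treatment share jumps by 0.5, so the ratio recovers the effect of 4 built into the treated units' outcomes.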

Sarvimäki, Uusitalo & Jäntti (2021)

  • After World War II, approximately 11% of the Finnish population was displaced from areas ceded to the Soviet Union.
  • They were resettled in the remaining parts of Finland.
  • Displaced farmers made up approximately 50% of those displaced. They were given land and assistance to establish new farms in areas with soil and climate similar to those of their origin regions.
  • Former neighbours were resettled close to each other to preserve social networks.
  • From 1948 onward, the displaced no longer received subsidies and were free to sell their land and move away from the area.

Meyersson (2014)

  • Uses a regression discontinuity design to compare municipalities where the Islamic party barely won or barely lost elections
  • Despite negative raw correlations, the RD results reveal that, over six years, Islamic rule increased female secular high school education

Speeding Tickets

  • In Finland, speeding tickets become income-dependent if the driver's speed exceeds the speed limit by more than 20 km/h
  • This leads to a substantial jump in the size of the fine.
  • There is no bunching below the threshold; the distribution of speeds is smooth around it.
  • The discontinuity can be used to estimate the effect of the size of the fine on the likelihood of re-offending.
  • The idea is to compare similar individuals who drove 19 and 21 km/h too fast.

RDD Recap

  • If a rule determines treatment through a hard-to-predict cutoff, the rule can be used to estimate a causal effect without an RCT.
  • RDD requires a running variable, a cutoff, and a treatment whose assignment changes discontinuously as a function of the running variable at the cutoff.
  • Units just below and just above the cutoff are very similar and comparable.
  • Tests of validity and design include density tests, covariate balance tests, and placebo tests.
  • Challenges: RDD requires many observations near the cutoff, and because the estimate is a local causal effect, results cannot be extrapolated to units far from the cutoff.

Description

This lesson covers the features of Regression Discontinuity (RD) designs, including common support, estimation with imperfect sorting, and challenges in estimating treatment effects. It also discusses how units near the cutoff are used and the role of pre-treatment characteristics.
