Causal Inference & Jakarta HOV Restrictions
25 Questions

Questions and Answers

In the context of causal inference, what does the potential outcome $y_{1i}$ represent?

  • The outcome for individual _i_ if they receive the treatment. (correct)
  • The outcome for individual _i_ if they do not receive the treatment.
  • The outcome for individual _i_ regardless of whether they receive the treatment.
  • The average outcome for all individuals in the study.

In the Hanna et al. (2017) study, what was identified as the 'treatment' when examining Jakarta's high-occupancy vehicle (HOV) restrictions?

  • The lifting of the HOV restriction. (correct)
  • Increased enforcement of HOV restrictions.
  • The average commute time of drivers in Jakarta.
  • The number of vehicles on Jakarta's roads during peak hours.

In the study of Jakarta’s HOV restrictions, which of the following best exemplifies the 'population' of interest?

  • All residents of Jakarta, including those who do not drive.
  • Drivers on trips who might use the affected routes. (correct)
  • Companies operating vehicle fleets in Jakarta.
  • All drivers globally.

What serves as a counterfactual in the Hanna et al. (2017) study of Jakarta's HOV restrictions?

  • The state of the world just before the treatment (lifting of the HOV restriction). (correct)

How is the outcome of interest defined in the study examining the impact of Jakarta's high-occupancy vehicle (HOV) restriction?

  • The delay per kilometer traveled. (correct)

In the context of examining the relationship between age and income, why might simply relying on summary statistics like correlation ($\rho$) and OLS regression estimates ($\beta_0$, $\beta_1$) be insufficient?

  • Because summary statistics fail to capture non-linear relationships and variations across different age groups. (correct)

When analyzing the relationship between two variables, such as age and income, what is the primary advantage of using nonparametric estimates for $E[Y|X=x]$ compared to relying solely on linear regression?

  • Nonparametric estimates allow for the detection of non-linear relationships between the variables. (correct)

Suppose you are analyzing the relationship between education level (X) and annual salary (Y). An OLS regression yields the equation $\hat{Y} = 20000 + 5000X$. What does the coefficient 5000 represent?

  • For each additional year of education, the predicted annual salary increases by 5000 units. (correct)

What does a correlation coefficient ($\rho$) of 0.28 between two variables X and Y suggest?

  • A weak positive linear relationship between X and Y. (correct)

Why might adding a small amount of random noise to a scatter plot of data points be useful?

  • It can make patterns and clusters more visually discernible when data points overlap. (correct)

Why is informal reasoning generally discouraged in serious economic research when constructing counterfactuals?

  • It is inherently biased and lacks empirical support, often resembling an unsubstantiated guess. (correct)

In the context of causal research, what is the primary purpose of constructing a counterfactual?

  • To establish a benchmark for what would have occurred in the absence of a specific intervention. (correct)

A researcher is studying the impact of a new job training program on employment rates. Which approach would involve comparing the employment outcomes of individuals who participated in the program with those of a similar group who did not?

  • Control groups (correct)

What is the fundamental assumption when using a control group to establish causality?

  • All factors other than the treatment affect both groups similarly. (correct)

Which of the following methods for constructing counterfactuals involves using a quantitative model to simulate alternative scenarios?

  • Structural models (correct)

A researcher aims to evaluate the impact of a new agricultural technique on crop yield. To do this, they compare fields where the technique was applied with similar fields where it was not. What is a critical assumption the researcher must make to ensure the validity of their causal inference?

  • Factors such as soil quality, weather patterns, and pest exposure affect both sets of fields similarly. (correct)

A city implements a new policy aimed at reducing traffic congestion during peak hours. To assess the policy's effectiveness, transportation officials compare traffic flow during peak hours after the policy was implemented with traffic flow during the same hours before the policy. What unaddressed factor could undermine the validity of the findings?

  • Changes in road maintenance schedules unrelated to the new policy. (correct)

What is the primary challenge in causal inference when trying to determine the treatment effect for an individual?

  • The inability to observe both potential outcomes (with and without treatment) for the same individual. (correct)

What does the Average Treatment Effect (ATE) represent?

  • The average difference in potential outcomes (with and without treatment) across the entire population. (correct)

The Average Treatment Effect on the Treated (ATT) is defined as:

  • $E[y_{1i} - y_{0i} \mid D_i = 1]$ (correct)

Why might the Average Treatment Effect (ATE) and the Average Treatment Effect on the Treated (ATT) differ?

  • Because the treatment may have different effects on those who receive it compared to what the effect would be on those who do not. (correct)

What is the primary purpose of constructing counterfactuals in causal research?

  • To estimate what would have happened to the treated group in the absence of the treatment. (correct)

A researcher is studying the effect of a new teaching method on student test scores. They find that students who were taught with the new method (the treated group) scored significantly higher than students taught with the traditional method. However, students in the treated group were also more motivated and had access to better resources. What validity issue does this study likely face?

  • Internal validity (correct)

A study finds a significant Average Treatment Effect (ATE) of a job training program in a specific city. However, when policymakers attempt to implement the same program in a rural area with a different demographic, they observe minimal impact. What type of validity is most likely compromised in this scenario?

  • External validity (correct)

In a study examining the impact of a new drug on blood pressure, researchers use a randomized controlled trial. After analyzing the data, they find a statistically significant reduction in blood pressure for the treatment group compared to the control group. However, some participants in the control group also started exercising regularly, which could also lower blood pressure. What is the most appropriate next step for the researchers?

  • Analyze the data to account for the effect of exercise as a potential confounding variable. (correct)

Flashcards

Marginal Distribution

A distribution that examines the probabilities of a single variable, disregarding others.

Conditional Distribution

A distribution showing the probability of a variable given the value of another variable.

Conditional Expectation Function

Shows the expected value of a variable, given the value of another variable.

Covariance and Correlation

A measure of the linear association between two variables.

Ordinary Least Squares (OLS) Regression

A line of best fit that shows a linear relationship between two variables.

Treatment (Hanna et al., 2017)

The lifting of the High-Occupancy Vehicle (HOV) restriction in Jakarta.

Counterfactuals (Hanna et al., 2017)

  1. State of the world just before the treatment.
  2. Google’s prediction under "typical traffic conditions".

Population (Hanna et al., 2017)

Drivers on trips that might use the restricted routes.

Outcome (Hanna et al., 2017)

The delay per kilometer traveled.

Treatment Status (Di)

A variable indicating whether individual i receives treatment (1) or not (0).

Counterfactual

A benchmark of what would have happened to the treated group without the treatment.

Informal Reasoning

A non-rigorous approach, often an unsubstantiated guess.

Structural Models

Using a quantitative model to simulate alternative scenarios.

Control Groups

Comparing a treatment group to a similar control group.

Causal Questions

Questions about comparing counterfactual states, such as "how would Y change if we changed X?"

Constructing Counterfactuals

Constructing a suitable benchmark to analyze causal effects.

Treatment Effect

Differences between treatment and control groups are attributed to treatment.

Treatment Effect (Individual)

The difference in potential outcomes with and without a treatment for an individual.

Average Treatment Effect (ATE)

The average difference in potential outcomes (with and without treatment) across a population.

ATE for the Treated (ATT)

The average treatment effect specifically for those who received the treatment.

E[a|b]

The expectation of a value 'a', given that 'b' is true.

Internal vs. External Validity (Treatment Effect)

Internal validity assesses whether the study truly measures the treatment effect for the treated group, while external validity assesses whether the treatment effect can be generalized to other populations.

ATE vs. ATT: Why the difference?

ATE might differ from ATT because the treatment effect may vary between those who receive the treatment and those who do not.

Study Notes

  • The lecture focuses on causality, potential outcomes, and research design
  • Today's learning objectives include understanding causality, counterfactuals, potential outcomes, treatment effects, and selection bias
  • A key objective is understanding how randomization eliminates selection bias

Logistics

  • Homework 1 deadline has been extended to Thursday, January 16 at 23:59
  • Homework 2 is posted on MyCourses, due next Wednesday, January 22
  • Stata code for previous examples is available on MyCourses under "More Materials"

Quick Recap

  • Joint distributions and associations between variables covered: marginal and conditional distribution, conditional expectation function, scatter plots, covariance/correlation, regression, and OLS

Association Between Age and Income

  • How income varies with age can be visualized using a scatter plot
  • Let's use measures of dependence
  • The correlation between income and age is 0.28
  • Estimating a regression of income (Y) on age (X) yields estimates for the intercept ($\beta_0$ = 10,654) and the age coefficient ($\beta_1$ = 297)
  • The estimates are in euros, while the y-axis is in thousands of euros
  • On their own, these summary statistics are not very helpful: they do not capture non-linear relationships or variation across age groups
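
As a rough illustration of why these two numbers can mislead, the sketch below computes the correlation, the OLS line, and simple age-band means on made-up data. The `ages` and `incomes` values and the three age bands are hypothetical, chosen only to have a hump shape; they are not the lecture's data.

```python
# Hypothetical (age, income in euros) data with a hump shape; NOT the lecture's data.
ages = [20, 25, 30, 35, 40, 45, 50, 55, 60, 65, 70]
incomes = [12000, 20000, 26000, 30000, 33000, 34000,
           33500, 31000, 27000, 22000, 18000]


def mean(values):
    return sum(values) / len(values)


def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def ols_line(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def group_mean(x, y, low, high):
    """Average of y over observations with low <= x < high (a crude E[Y|X])."""
    return mean([b for a, b in zip(x, y) if low <= a < high])


rho = correlation(ages, incomes)
beta0, beta1 = ols_line(ages, incomes)
young = group_mean(ages, incomes, 0, 35)     # hypothetical age bands
middle = group_mean(ages, incomes, 35, 55)
older = group_mean(ages, incomes, 55, 200)

print(f"correlation = {rho:.2f}, intercept = {beta0:.0f}, slope = {beta1:.0f}")
print(f"mean income by age band: {young:.0f}, {middle:.0f}, {older:.0f}")
```

On data like these the correlation is weak and the fitted line slopes upward, yet the age-band means rise and then fall, a pattern a single straight line cannot show.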

Flexibility of Regression

  • A multivariate regression model can provide a more flexible fit: $Y = \beta_0 + \beta_1 X + \beta_2 X^2 + \varepsilon$
  • The estimates that fit the data best are $\beta_0$ = -37,549, $\beta_1$ = 2,857, $\beta_2$ = -31
  • In general, it is good to look at the data in several ways
  • Correlation measures the linear dependence of two variables
  • With multiple variables, "goodness of fit" is measured instead

Measuring How Well a Model Fits the Data

  • The coefficient of determination ($R^2$) is typically used
  • For a regression model $Y = f(X) + \varepsilon$, where $X$ is a vector of independent variables: $R^2 = \frac{\sum_i (f(X_i) - \bar{Y})^2}{\sum_i (Y_i - \bar{Y})^2}$
  • $R^2$ measures the share of the variability of the dependent variable that the model explains
  • An $R^2$ of 1 means a perfect prediction
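
The formula above can be sketched in a few lines. The data in `x` and `y` are hypothetical, and the helper names `ols_line` and `r_squared` are chosen here for illustration; the fit is a simple linear OLS, used only as one example of $f(X)$.

```python
# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 7.8, 10.1]


def mean(values):
    return sum(values) / len(values)


def ols_line(xs, ys):
    """Intercept and slope of the least-squares line of ys on xs."""
    mx, my = mean(xs), mean(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return my - slope * mx, slope


def r_squared(ys, fitted):
    """Explained sum of squares divided by total sum of squares."""
    my = mean(ys)
    explained = sum((f - my) ** 2 for f in fitted)
    total = sum((b - my) ** 2 for b in ys)
    return explained / total


intercept, slope = ols_line(x, y)
fitted = [intercept + slope * a for a in x]
r2 = r_squared(y, fitted)
print(f"R^2 of the linear fit: {r2:.4f}")
```

Predicting every observation exactly gives an $R^2$ of 1, while predicting the sample mean for every observation gives 0.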

Causal Questions

  • Prior lectures focused on descriptive questions such as "What is the joint distribution of X and Y?" to measure the actual state of the world
  • Often we need to evaluate the impact of X on Y, for example: education on earnings, marketing on sales, a carbon tax on emissions, R&D on innovation, or fiscal stimulus on unemployment
  • Causal questions are about comparing counterfactual states, like "how would Y change if we changed X?"
  • Y is the outcome and X is the treatment

Counterfactual States

  • Counterfactual states are almost impossible to observe for any single individual or entity
  • In a counterfactual comparison, everything except the treatment stays the same (ceteris paribus)
  • This is possible in lab experiments in the natural sciences
  • It is more challenging when studying people
  • Counterfactuals can, however, be constructed for the average person in a sample

Identifying Causal Relationships via Experiments

  • The lecture focuses on answering causal questions using experimental designs
  • It is helpful to design comparisons to test for causality
  • It can be helpful to consider the ideal experiment
  • The ideal experiment is a helpful benchmark for naturally occurring (quasi-)experiments
  • Natural experiments involving randomization will be discussed next week

Elements of Causal Questions

  • (1) Treatment: Impact of
  • (2) Counterfactual: Impact in comparison to
  • (3) Outcome: Impact on
  • (4) Population: Impact for
  • Worksheet (WS) 3.1: Think of a causal question and write it down
  • Example from Jakarta: what is the impact of high-occupancy vehicle restrictions, compared with unrestricted road travel, on drivers' travel times?

The Causal Question

  • What is the impact of Jakarta's high-occupancy vehicle restrictions on drivers' travel times, compared with unrestricted road travel?
  • Treatment: Lifting of the HOV restriction
  • Counterfactuals:
    • State of the world just before the treatment
    • Google's prediction under "typical traffic conditions"
  • Population: drivers taking those routes
  • Outcome: the delay per km travelled

Potential Outcomes

  • The focus is on binary (0/1) treatments; $D_i$ denotes the treatment status of individual i
  • $D_i = 1$ if she receives the treatment, $D_i = 0$ if she does not
  • Outcomes are denoted by y
  • Potential outcomes: $y_{1i}$ if $D_i = 1$, $y_{0i}$ if $D_i = 0$
    • $y_{1i}$ is the outcome of individual i if she is treated
    • $y_{0i}$ is her outcome if she is not treated

Treatment Effect

  • The treatment effect for individual i is the difference between $y_{1i}$ and $y_{0i}$
  • The fundamental problem of causal inference: we can never observe both $y_{1i}$ and $y_{0i}$ for the same unit
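
Written out, the two statements above look as follows. The symbol $\tau_i$ for the individual effect and $y_i$ for the observed outcome are notation introduced here for illustration; the notes themselves only name $y_{1i}$, $y_{0i}$ and $D_i$.

$$
\tau_i = y_{1i} - y_{0i}, \qquad
y_i = D_i \, y_{1i} + (1 - D_i) \, y_{0i} = y_{0i} + D_i \, (y_{1i} - y_{0i})
$$

For a treated individual ($D_i = 1$) the data reveal $y_i = y_{1i}$ and $y_{0i}$ stays unobserved; for an untreated individual ($D_i = 0$) it is the other way round, so $\tau_i$ cannot be computed for anyone.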

Average Treatment Effect

  • The treatment effect for an individual cannot be identified
  • But average treatment effects can be estimated
    • $ATE = E[y_{1i} - y_{0i}]$
    • $ATT = E[y_{1i} - y_{0i} \mid D_i = 1]$
  • Why the distinction between ATE and ATT matters: the treatment effect may differ between those who get the treatment and those who do not
    • Internal validity: Do we learn the true effect for the population that is being treated?
    • External validity: Can we extrapolate the effect to other populations?
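
A toy calculation may help separate the two averages. The six individuals below are invented, and the example assumes we could see both potential outcomes for everyone, which real data never allow; here the treated happen to be those with the largest gains.

```python
# Hypothetical individuals: (y0, y1, d) = outcome without treatment,
# outcome with treatment, and treatment status. Invented for illustration;
# in real data only one of y0 and y1 is ever observed per person.
people = [
    (10, 16, 1),
    (12, 17, 1),
    (11, 15, 1),
    (10, 11, 0),
    (13, 14, 0),
    (12, 14, 0),
]


def mean(values):
    return sum(values) / len(values)


# ATE: average effect over everyone.
ate = mean([y1 - y0 for y0, y1, d in people])

# ATT: average effect over the treated only (d == 1).
att = mean([y1 - y0 for y0, y1, d in people if d == 1])

# What a researcher would actually observe: y1 for the treated, y0 otherwise.
observed = [y1 if d == 1 else y0 for y0, y1, d in people]

print(f"ATE = {ate:.2f}, ATT = {att:.2f}")
print("observed outcomes:", observed)
```

Because the treated individuals in this made-up population gain more than the untreated would, the ATT exceeds the ATE.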

Approaches for Constructing Counterfactuals

  • Answering a causal question requires knowing what would have happened to the treatment group without the treatment
  • Researchers construct a counterfactual to determine causation
  • Approaches for constructing the counterfactual include:
    • Informal reasoning: essentially a guess, not acceptable in economics!
    • Structural models that use quantitative models to construct alternative states of the world
    • Control groups that compare the treatment group with a similar control group

Research Designs and Control Groups

  • A control group is used to approximate what would have happened to the treated group without the treatment
  • In economics, this design-based (experimental) approach estimates the counterfactual $E[y_{0i} \mid D_i = 1]$
  • An invalid control group leads to selection bias
  • The key question is whether the control group provides a valid counterfactual

How to Find Control Group

  • Hanna et al. (2017) include a few different routes to consider

Regression Estimation

  • Dependent (outcome) variable: travel delay on a route segment, on date d and at departure hour h
  • Independent (explanatory) variable: an indicator $Post_d$ for whether date d is after the lifting of the policy
    • $Post_d = 0$: "control" group, observations before the policy was lifted
    • $Post_d = 1$: "treatment" group, observations after the policy was lifted
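
A minimal sketch of such a before/after regression, assuming the simplest specification $delay = \alpha + \beta \, Post_d + \varepsilon$ with no further controls (the specification in the paper may well be richer). The `post` and `delay` values are hypothetical, with delay taken to be in minutes per km. With a single 0/1 regressor, the OLS slope equals the difference in mean delay after versus before.

```python
# Hypothetical observations; NOT data from Hanna et al. (2017).
post = [0, 0, 0, 0, 1, 1, 1, 1]                    # 1 = after the policy was lifted
delay = [2.0, 2.2, 1.8, 2.0, 3.0, 3.4, 3.1, 2.9]   # assumed unit: minutes per km


def mean(values):
    return sum(values) / len(values)


def ols_line(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    mx, my = mean(x), mean(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


alpha, beta = ols_line(post, delay)

# Direct comparison of group means for reference.
mean_before = mean([d for p, d in zip(post, delay) if p == 0])
mean_after = mean([d for p, d in zip(post, delay) if p == 1])

print(f"alpha = {alpha:.2f} (mean delay before)")
print(f"beta  = {beta:.2f} (after minus before)")
```

The intercept reproduces the average delay before the change, and the coefficient on the indicator reproduces the change in average delay. Whether that change can be read as a causal effect depends on how good the "before" period is as a counterfactual.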

How Good is the Counterfactual?

  • What if the timing of the event merely coincides with other changes in outcomes, as opposed to the changes being caused by the treatment?
  • What would outcomes have been in the absence of the policy change?
  • Can you average delay at a?
  • Key assumption: without the treatment, the treated observations would have resembled the control observations
  • WS 3.3: To answer a causal question using data, what is a reasonable "control group" for the treatment?

Selection Bias

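  • As the amount of data increases, sample averages approximate population averages
  • $\text{Avg}[y_i \mid D_i = 1] - \text{Avg}[y_i \mid D_i = 0] \to E[y_i \mid D_i = 1] - E[y_i \mid D_i = 0]$, the difference in expected outcomes between the treatment group and the control group; with selection bias, this difference generally equals neither the ATE nor the ATT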
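
The link between the observed difference in means and the ATT can be made explicit by adding and subtracting the unobserved counterfactual $E[y_{0i} \mid D_i = 1]$. This is the standard decomposition, written here in the notes' notation:

$$
\begin{aligned}
E[y_i \mid D_i = 1] - E[y_i \mid D_i = 0]
&= E[y_{1i} \mid D_i = 1] - E[y_{0i} \mid D_i = 0] \\
&= \underbrace{E[y_{1i} - y_{0i} \mid D_i = 1]}_{ATT}
 + \underbrace{E[y_{0i} \mid D_i = 1] - E[y_{0i} \mid D_i = 0]}_{\text{selection bias}}
\end{aligned}
$$

The selection bias term is zero only when the treated and the controls would have had the same average outcome without the treatment.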

Randomized Selection

  • Randomly assign people to treatment and control, creating comparable groups
  • Potential outcomes then have the same expectation in both groups
  • The control group tells us what would have happened to the treated group without the treatment
  • WS 3.4: Identify selection issues for given control groups
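
A small simulation can make this concrete. Everything below is hypothetical: untreated outcomes are drawn from a normal distribution, the true effect is fixed at 2 for everyone, and "self-selection" means that people with above-average untreated outcomes take the treatment.

```python
import random

TRUE_EFFECT = 2.0          # assumed constant treatment effect (hypothetical)
rng = random.Random(0)     # fixed seed so the sketch is reproducible


def mean(values):
    return sum(values) / len(values)


def naive_difference(y0, effect, treated):
    """Observed mean outcome of the treated minus that of the controls."""
    treated_outcomes = [y + effect for y, d in zip(y0, treated) if d]
    control_outcomes = [y for y, d in zip(y0, treated) if not d]
    return mean(treated_outcomes) - mean(control_outcomes)


# Untreated potential outcomes y0 for a hypothetical population.
y0 = [rng.gauss(10.0, 2.0) for _ in range(20000)]

# Self-selection: people with high y0 choose the treatment.
self_selected = [y > 10.0 for y in y0]

# Randomization: a coin flip decides, independent of y0.
randomized = [rng.random() < 0.5 for _ in y0]

diff_selected = naive_difference(y0, TRUE_EFFECT, self_selected)
diff_randomized = naive_difference(y0, TRUE_EFFECT, randomized)

print(f"true effect:              {TRUE_EFFECT:.2f}")
print(f"self-selected comparison: {diff_selected:.2f}")
print(f"randomized comparison:    {diff_randomized:.2f}")
```

Under self-selection the comparison mixes the effect with pre-existing differences in $y_{0i}$, so it overstates the effect in this setup. Under random assignment both groups have the same expected $y_{0i}$, and the difference in means lands close to the true effect.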

Summary

  • Causality requires comparison of counterfactual states
  • Only one of the states is observed
  • Control groups are used to infer what would have happened to the treatment group in the absence of treatment
  • Selection bias occurs when the treatment and control groups are not comparable
  • Randomization eliminates selection bias: in expectation, the only difference between the groups is that one receives the treatment

Upcoming

  • Pre-class assignment 4: read and summarize an article
  • Homework 2 is due Jan 22 at 14:00
  • Tips: Don't wait until the last minute; building the skills to work with a data set takes time
  • Session 2 is tomorrow
  • Help for the course is available on "Zulip"
  • Participation is incentivized

Description

Questions cover potential outcomes in causal inference, with specific reference to the Hanna et al. (2017) study of Jakarta's high-occupancy vehicle (HOV) restrictions, counterfactuals, and the limitations of linear regression.