Mediation Models: Effects of X on Y

Questions and Answers

Explain the difference between the Structural Equation Modeling (SEM) approach and the Causal Inference approach to mediation analysis, highlighting their focus and the types of variables they typically consider.

The SEM approach focuses on estimating the parameters of an entire model with at least two equations. The Causal Inference approach focuses on the causal effect of X on Y, defined using counterfactuals, and often treats X, Y, and M as binary variables and allows for an X-M interaction.

Describe three potential reasons why researchers often fail to adequately consider the causal assumptions underlying mediation analysis.

  1. Psychologists historically avoided causal language.
  2. Researchers mistakenly believed mediation analyses are robust to assumption violations much like multiple regression.
  3. The influential Baron and Kenny (1986) paper did not emphasize confounding as an assumption.

Define what a “post-treatment confounder” is. How should it be handled in mediation analysis and what is the consequence to estimates if handled inappropriately?

A post-treatment confounder is a variable C that is caused by X and in turn causes both M and Y. It can be treated as a second mediator, which allows all indirect effects to be identified. If C is not treated as a mediator, potential outcomes cannot be used to define the natural indirect effect of X on Y.

Explain why the proportion of the effect that is mediated ($ab/c$) can be an unstable measure of mediation, and under what conditions is it advisable to compute this measure.

The proportion mediated ($ab/c$) is unstable because it is very sensitive to small values of the total effect ($c$). It is advisable to compute this measure only if the standardized total effect is at least .2 in absolute value. It can be informative when the direct effect is not statistically significant.

Describe the joint significance test for the indirect effect, its strengths and weaknesses, and new findings regarding its properties.

The joint significance test evaluates the significance of both path a and path b. If both are significant, the indirect effect is considered nonzero. Strengths: it is simple and can be used in power computations. Weaknesses: it does not provide a confidence interval and it is affected by heteroscedasticity. It is not affected by non-normality.

Flashcards

Direct Effect

The effect of X on Y not due to the mediator.

Indirect Effect

The amount of mediation, typically measured as a times b.

Confounding Variable

A variable that affects both the mediator (M) and the outcome (Y), potentially distorting the observed relationships.

Causal Inference Approach

Effects are defined using potential outcomes rather than in terms of structural equations, accommodating non-interval variables and allowing for X and M interaction

Collider Variable

A variable caused by both the mediator and the outcome. Conditioning on a collider can create spurious associations between the mediator and outcome.

Study Notes

Introduction

  • A variable X is presumed to cause another variable Y.
  • X refers to the causal variable.
  • Y refers to the outcome variable.
  • Path c in the unmediated model represents the total effect of X on Y.
  • The effect of X on Y can be mediated by a mediating variable M.
  • Path a signifies the effect of X on M.
  • Path b signifies the effect of M on Y.
  • Path c' signifies the direct, unmediated effect of X on Y.
  • The focus is on estimating and testing linear mediation models where M and Y (and sometimes X) are measured at the interval level.
  • This approach is termed the Structural Equation Modeling or SEM approach, although it does not always require an SEM program.
  • The discussion includes the interaction of X and M, a nonlinear effect.

The Three Effects of X on Y

  • The total effect of X on Y is denoted as c.
  • The direct effect is denoted as c'.
  • The indirect effect is denoted as ab.
  • Total effect equals direct effect plus indirect effect: c = c' + ab.
  • Indirect effect equals the reduction of the effect of the causal variable on the outcome: ab = c - c'.
  • In contemporary mediational analyses, the indirect effect or ab quantifies the mediation amount.
  • The equation c = c' + ab holds when multiple regression or SEM without latent variables is used.
  • The equation also requires the same cases in all analyses and the same covariates in all equations.
  • For logistic analysis and SEM with latent variables, c = c' + ab is only approximately equal.
  • For multilevel models, the indirect effect has an additional covariance term.
  • Inconsistent mediation occurs when the indirect and direct effects have different signs.
  • The amount of reduction in the effect of X on Y due to M should be quantified by the indirect effect (ab).
  • The proportion of the effect that is mediated, ab/c or 1 - c'/c, is theoretically informative but unstable if c is small.
  • For linear models, the indirect effect is computed as the product of a and b (product method).
  • The difference method uses c - c' as the measure of the indirect effect.
  • The Causal Inference approach provides more formal definitions of the total, direct, and indirect effects.
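
The identity c = c' + ab can be checked numerically. The following is a minimal sketch using only the Python standard library. The simulated data, the population values (0.5, 0.4, 0.2), and the helper names `cov`, `slope`, and `slopes2` are illustrative assumptions, not part of the original notes.

```python
import random

def cov(u, v):
    """Sample covariance (n - 1 in the denominator)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def slope(x, y):
    """OLS slope from regressing y on the single predictor x."""
    return cov(x, y) / cov(x, x)

def slopes2(x, m, y):
    """OLS slopes (for x, for m) from regressing y on x and m together."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det

# Simulated data; the population coefficients are arbitrary assumptions.
rng = random.Random(1)
n = 500
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.2 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

c = slope(x, y)                # total effect (Y on X)
a = slope(x, m)                # path a (M on X)
c_prime, b = slopes2(x, m, y)  # direct effect c' and path b (Y on X and M)

print("c        =", round(c, 6))
print("c' + ab  =", round(c_prime + a * b, 6))  # agrees with c up to rounding
print("proportion mediated ab/c =", round(a * b / c, 3))
```

Because the same cases and the same predictors are used in every equation, the product method (ab) and the difference method (c - c') give the same value here.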

Causal Assumptions

  • The residuals of M and Y are assumed to be normally distributed, with homogeneous variances and independence.
  • Four key causal assumptions of mediational analyses:
    • No X-M Interaction: The effect of M on Y (b) does not vary across levels of X.
    • Causal Direction: M causes Y, but Y does not cause M.
    • Perfect Reliability in M: The reliability of M is perfect.
    • No Confounding: There is no variable that causes both M and Y.
  • It is also assumed that X is randomly assigned.

The Indirect Effect

  • The indirect effect is the standard measure of mediation in linear models, estimated by the product method (ab).
  • Four causal assumptions must hold for the indirect effect to be valid, especially no X-M interaction.
  • Four different tests of ab are available

Sobel Test

  • An early test of the indirect effect (also called the delta method).
  • The test divides ab by its standard error (the square root of its estimated variance) and treats the ratio as a Z test.
  • This test is very conservative since it falsely assumes that the indirect effect has a normal distribution.
  • Should no longer be used.
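
For reference, the computation behind the Sobel test can be sketched as follows. The sketch assumes the usual first-order delta-method variance, a²·se_b² + b²·se_a². The function names and the input values are hypothetical.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Z statistic for ab using the first-order delta-method standard error."""
    se_ab = math.sqrt(a ** 2 * se_b ** 2 + b ** 2 * se_a ** 2)
    return (a * b) / se_ab

def two_sided_p(z):
    """Two-sided p value, treating z as standard normal."""
    return math.erfc(abs(z) / math.sqrt(2))

# Hypothetical estimates and standard errors for paths a and b.
z = sobel_z(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
p = two_sided_p(z)
print("Z =", round(z, 3), "p =", round(p, 4))
```

The p value relies on the normality assumption that the notes identify as false, which is why the test is shown only for illustration.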

Joint Significance of Paths a and b

  • A nonzero indirect effect is inferred if tests of paths a and b are both significant.
  • The joint significance test appears to work well, but is rarely used as the definitive test.
  • Joint significance presumes that a and b are uncorrelated.
  • Simulation results have shown that this test performs about as well as a bootstrap test.
  • The test is affected by heteroscedasticity but not affected by non-normality.
  • Provides a straightforward way to determine the power of the test of the indirect effect.
  • Does not provide a confidence interval for the indirect effect

Bootstrapping

  • The currently recommended test for the indirect effect.
  • Bootstrapping is a non-parametric method based on resampling with replacement.
  • The indirect effect is computed from each resample to empirically generate a sampling distribution.
  • A correction for bias can be made.
  • A confidence interval, a p value, or a standard error can be determined from the distribution.
  • Determine if zero is in the confidence interval.
  • Several SEM programs can be used to bootstrap.
  • Hayes and Preacher have written SPSS and SAS macros that can be downloaded for tests of indirect effects.
  • The current recommendation is to use the percentile bootstrap, that is, the bootstrap without bias correction.
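
The resampling steps above can be sketched in a few lines. This is a minimal illustration, not a replacement for an SEM program or the Hayes and Preacher macros. The simulated data, the helper names, and the default of 1000 resamples are assumptions.

```python
import random

def cov(u, v):
    """Sample covariance (n - 1 in the denominator)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y):
    """Product-method estimate ab: a from M on X, b from Y on X and M."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    a = sxm / sxx
    b = (sxx * smy - sxm * sxy) / (sxx * smm - sxm ** 2)
    return a * b

def percentile_bootstrap_ci(x, m, y, n_boot=1000, level=0.95, seed=0):
    """Percentile bootstrap confidence interval for ab (no bias correction)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample cases with replacement
        estimates.append(indirect_effect([x[i] for i in idx],
                                         [m[i] for i in idx],
                                         [y[i] for i in idx]))
    estimates.sort()
    lo = estimates[round((1 - level) / 2 * n_boot)]
    hi = estimates[round((1 + level) / 2 * n_boot) - 1]
    return lo, hi

# Simulated data; the population coefficients are arbitrary assumptions.
rng = random.Random(42)
n = 300
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.6 * xi + rng.gauss(0, 1) for xi in x]
y = [0.1 * xi + 0.6 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

estimate = indirect_effect(x, m, y)
lo, hi = percentile_bootstrap_ci(x, m, y, n_boot=500)
print(f"ab = {estimate:.3f}, 95% percentile CI = [{lo:.3f}, {hi:.3f}]")
print("zero inside the interval:", lo <= 0 <= hi)
```

Whole cases are resampled, so the pairing of X, M, and Y within each person is preserved in every resample.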

Monte Carlo Bootstrap

  • A computer simulation test of the indirect effect is proposed.
  • Starts with the estimates a and b, and their standard errors.
  • Random normal variables for a and b are generated to create a distribution of ab values.
  • Confidence intervals and a p value can be created with these values.
  • The test is useful in situations in which it is not easy to bootstrap (e.g., the raw data are unavailable).
  • If estimates of a and b are correlated, then that correlation can be included in the Monte Carlo simulation.
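
A minimal sketch of the Monte Carlo procedure follows. It needs only the two estimates and their standard errors. The function name, the optional correlation argument `rho`, the number of draws, and the input values are assumptions made for illustration.

```python
import math
import random

def monte_carlo_ci(a, se_a, b, se_b, rho=0.0, n_draws=20000, level=0.95, seed=0):
    """Confidence interval for ab from simulated normal draws of a and b.

    rho is the assumed correlation between the estimates of a and b.
    """
    rng = random.Random(seed)
    products = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        a_draw = a + se_a * z1
        # Mixing z1 into the draw for b induces correlation rho with a_draw.
        b_draw = b + se_b * (rho * z1 + math.sqrt(1 - rho ** 2) * z2)
        products.append(a_draw * b_draw)
    products.sort()
    lo = products[round((1 - level) / 2 * n_draws)]
    hi = products[round((1 + level) / 2 * n_draws) - 1]
    return lo, hi

# Hypothetical estimates and standard errors taken from a published table.
lo, hi = monte_carlo_ci(a=0.5, se_a=0.1, b=0.4, se_b=0.1)
print(f"95% Monte Carlo CI for ab: [{lo:.3f}, {hi:.3f}]")
```

With `rho=0.0` the draws for a and b are independent. A nonzero value includes the correlation between the two estimates, as described in the last bullet above.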

Failure to Consider Causal Assumptions

  • In 1981, the four causal assumptions (direction, interaction, reliability, and confounding) were described.
  • Despite emphasis on these assumptions, most mediation analysis papers fail to discuss them.
  • Gelfand et al. (2009) surveyed mediation studies published in 2002
  • Rijnhart et al. (2022) examined 175 mediation papers published from 2005 to 2009.
  • 10% of these papers reported testing for X-M interaction.
  • Possible reasons for the failure to consider assumptions:
    • Psychologists avoided causal language and believed that the only way to establish causality was through experimentation.
    • Researchers presumed that mediation analyses were robust to violations of assumptions, so the assumptions could be safely ignored.
    • The most prominent paper discussing the SEM approach to mediation does not mention confounding as an assumption.

Causal Inference Approach

  • The approach mainly discussed is the SEM approach.
  • An alternative and very different approach is the Causal Inference approach.
  • The Causal Inference approach uses the same basic causal structure as the linear approach.
  • The relationships between variables need not be linear, and the variables need not be interval.
  • X is commonly called Exposure and is often symbolized by an A.
  • The Causal Inference approach focuses on the causal effect of X on Y, and not on the entire model.
  • The Causal Inference approach attempts to develop a formal basis for causal inference in general and mediation in particular, and it typically uses counterfactuals or potential outcomes.

Assumptions

  • Necessary assumptions for mediation are discussed using SEM terms.
  • The Causal Inference approach makes the following assumptions about confounding:
    • No unmeasured confounding of the X-Y relationship
    • No unmeasured confounding of the M-Y relationship
    • No unmeasured confounding of the X-M relationship
    • Variable X must not cause any known confounder of the M-Y relationship.
  • There are ways to avoid the fourth condition (that X must not cause a confounder of the M-Y relationship).
  • These conditions are sufficient but not necessary.
  • The Causal Inference approach emphasizes sensitivity analyses. These are analyses that ask questions such as, “What would happen to the results if there were an M-Y confounder that had a moderate effect on both M and Y?”

Definitions of the Direct, Indirect, and Total Effects

  • Effects are defined using counterfactuals, not structural equations.
  • For person i, it can be asked: What would person i's score on Y be if person i had scored 0 on X? That value, called the potential outcome, is denoted Yi(0).
  • The population average of these potential outcomes across persons is denoted as E[Y(0)].
  • The effect of X on Y, or total effect, is defined as E[Y(1)] - E[Y(0)].
  • The Controlled Direct Effect or CDE is defined with the mediator set equal to a particular value M: CDE(M) = E[Y(1,M)] - E[Y(0,M)].
  • The Natural Direct Effect or NDE sets the mediator to M0, the value it would take if X were 0: NDE = E[Y(1,M0)] - E[Y(0,M0)].
  • The parallel Natural Indirect Effect or NIE holds X at 1 and changes the mediator from M0 to M1, the value it would take if X were 1: NIE = E[Y(1,M1)] - E[Y(1,M0)].
  • The Total Effect becomes the sum of the two: TE = NIE + NDE = E[Y(1,M1)] - E[Y(0,M0)] = E[Y(1)] - E[Y(0)].
  • Note that both the CDE and the NDE would equal the regression slope or path c’ if the model is linear
  • The NIE would equal ab, and the TE would equal ab + c’ if assumptions are met.

Meeting the Causal Assumptions of Mediation

  • Mediation is a hypothesis about a causal network.
  • The focus here is on the case in which X is a randomized variable.
  • Remedies must be undertaken to remove bias in the effect from X to M and from X to Y should X not be a randomized variable.

Direction

  • Measure X before M and Y to ensure that X is not caused by M or Y.
  • Similarly, measure M before Y to make sure that Y does not cause M.
  • The mediator may be caused by the outcome variable (Y causing M), which is commonly called a feedback model.
  • When the causal variable is a manipulated variable, it cannot be caused by either the mediator or the outcome.
  • If it can be assumed that c' is zero, then a model with reciprocal causal effects can be estimated.
  • Smith (1982) has developed another method for the estimation of reciprocal causal effects.
  • Treat the mediator and the outcome variables as outcome variables and they each may mediate the effect of the other.
  • Both the mediator and the outcome must have a variable that causes each of them but not the other.

Interaction

  • M might interact with X to cause Y.
  • The X with M interaction should always be estimated and tested and added to the model if present.
  • The Causal Inference approach begins with the assumption that X and M interact and treats the interaction as part of the mediation.
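  • The interaction is defined as the product of X and M, or XM, and its effect on Y is denoted d: Y = i + c'X + bM + dXM + e, where i is the intercept and e is the residual.
  • The interaction greatly complicates the interpretation of the indirect and direct effects.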
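
To see how the interaction changes the effects, the counterfactual definitions can be evaluated for a linear model. This is a minimal sketch assuming X is coded 0/1, M = i_m + aX, and Y = i_y + c'X + bM + dXM. The intercept names `i_m` and `i_y`, the function names, and the numeric values are hypothetical.

```python
def expected_y(x, m, i_y, c_prime, b, d):
    """Expected Y when X is set to x and M is set to m."""
    return i_y + c_prime * x + b * m + d * x * m

def expected_m(x, i_m, a):
    """Expected M when X is set to x."""
    return i_m + a * x

def effects(i_m, a, i_y, c_prime, b, d, m_fixed=0.0):
    """CDE (at M = m_fixed), NDE, NIE, and TE for binary X coded 0/1.

    Y is linear in M for a fixed x, so the mean potential outcome can be
    evaluated at the mean of the mediator.
    """
    m0 = expected_m(0, i_m, a)  # mediator level when X = 0
    m1 = expected_m(1, i_m, a)  # mediator level when X = 1

    def ey(x, m):
        return expected_y(x, m, i_y, c_prime, b, d)

    return {
        "CDE": ey(1, m_fixed) - ey(0, m_fixed),
        "NDE": ey(1, m0) - ey(0, m0),
        "NIE": ey(1, m1) - ey(1, m0),
        "TE": ey(1, m1) - ey(0, m0),
    }

no_interaction = effects(i_m=1.0, a=0.5, i_y=0.0, c_prime=0.2, b=0.4, d=0.0)
with_interaction = effects(i_m=1.0, a=0.5, i_y=0.0, c_prime=0.2, b=0.4, d=0.3)
print(no_interaction)    # NDE and CDE equal c', NIE equals ab
print(with_interaction)  # NDE = c' + d*i_m, NIE = a*(b + d)
```

When d is zero the natural effects reduce to c' and ab. When d is nonzero, the direct effect depends on the level of the mediator and the indirect effect includes the interaction, which is the complication noted above.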

Reliability

  • Unreliability is very often present in measurement.
  • If the reliability of the mediator is less than a perfect 1.00, then the effects (b and c') are likely biased.
  • The effect of the mediator on the outcome (path b) is attenuated.
  • The effect of the causal variable on the outcome (path c') is likely over-estimated if ab is positive and under-estimated if ab is negative.
  • Measurement error in Y does not bias unstandardized estimates, but it does bias standardized estimates, attenuating them.
  • If two or more of the variables have measurement error, the biases are more complicated
  • Biases due to measurement error are rarely discussed.
  • The researcher can a priori fix the reliability of M in an SEM analysis or adjust estimates or there can be multiple indicators of M within an SEM analysis.
  • Power can be dramatically reduced if the mediator is a latent variable.
  • To allow for measurement error in M and to allow true M to interact with X, new analytic methods are needed.
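
The direction of the bias from an unreliable mediator can be illustrated with a small simulation. This is a sketch under assumed values: the population coefficients, the sample size, and a reliability of about .50 are all arbitrary choices, and the helper names are hypothetical.

```python
import random

def cov(u, v):
    """Sample covariance (n - 1 in the denominator)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def direct_and_b(x, m, y):
    """OLS slopes (c', b) from regressing y on x and m together."""
    sxx, smm, sxm = cov(x, x), cov(m, m), cov(x, m)
    sxy, smy = cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    return (smm * sxy - sxm * smy) / det, (sxx * smy - sxm * sxy) / det

rng = random.Random(7)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
m_true = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.2 * xi + 0.4 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m_true)]

# Observed mediator = true score + random error. The error variance equals the
# true-score variance (0.5**2 + 1 = 1.25), giving a reliability of about .50.
m_obs = [mi + rng.gauss(0, 1.25 ** 0.5) for mi in m_true]

c_true, b_true = direct_and_b(x, m_true, y)
c_obs, b_obs = direct_and_b(x, m_obs, y)
print("with true M:     b =", round(b_true, 3), " c' =", round(c_true, 3))
print("with observed M: b =", round(b_obs, 3), " c' =", round(c_obs, 3))
```

In this setup ab is positive, so path b is attenuated and path c' is over-estimated when the error-laden mediator is used.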

Confounding

  • Confounding has many different names: omitted variable, spurious variable, third variable, and selection.
  • There is a variable that causes both variables in the equation.
  • Two different strategies for dealing with confounding: design and analysis.
  • The two design strategies discussed are randomization and holding the confounder constant.
  • Five statistical strategies are discussed: instrumental variable estimation, front-door adjustment, confounding as the null hypothesis, matching on propensity scores, and inverse propensity weighting.

Design Strategies

  • Randomization: By randomizing X, it is known that both M and Y do not cause X.
  • Randomization of the mediator is discussed; it can be difficult to manipulate mediators since they are often internal variables.
  • Hold the Confounder Constant: To remove the effects of a confounding variable, the researcher holds a variable constant.
  • If a variable is held constant, it no longer varies and so is no longer a variable and cannot be a confounder.

Statistical Strategies

  • Instrumental Variable: Used to remove the effects of an unmeasured confounding variable if c' is zero.
  • Front-Door Adjustment: Uses a variable B that completely mediates the M to Y relationship but is not affected by C, an unmeasured confounder.
  • The path from M to B is identified by regressing B on X and M.
  • The path from B to Y is estimated by regressing Y on X and B.
  • Confounding as the Null Hypothesis: It might be that a single unmeasured variable can explain the covariation between all the variables.
  • Multiple Regression or Analysis of Covariance Adjustment: The most often used strategy to remove the effects of confounding variables is to include those variables in the analysis as covariates, what Pearl and Mackenzie (2018, pp. 158-159) call backdoor path.
  • p hacking: Selecting a particular combination of covariates to show a desired result.
  • Inverse Propensity Weighting: Also called inverse probability weighting or marginal structural models.
  • Inverse propensity weighting uses propensity scores to re-weight the analysis and so reduce the effect of confounding.

Strong Assumptions

  • It is assumed that ALL the confounding variables are included in the set of covariates and that all the covariates are measured without error.
  • A reasonable expectation is that these methods would typically remove only some but not all of the effects of confounding, so sensitivity analyses are strongly recommended.

Some Confounders May Not Be Confounders

  • Researchers need to be careful about what variables are treated as confounders.
  • Bowtie Confounders: C is correlated with M and Y but does not cause either and so would not be treated as confounder.

Offsetting Confounders

  • There are two confounders whose effects offset each other.
  • Controlling for only one of the confounders biases the estimate.

Collider and Mediating Variables

  • A collider variable is caused by M and Y.
  • Holding a collider constant also leads to problems.
  • A mediator too should not be treated as a confounder.
  • Post-treatment Confounders: G Estimation

Sensitivity Analyses

  • Sensitivity analyses determine what would happen to the mediational paths if one or more of these assumptions were violated, for example what value of the reliability of M would make c' equal zero.
  • A thorough mediation analysis should be accompanied by a sensitivity analysis.
  • One way to conduct a sensitivity analysis is to estimate the mediational model using SEM.
  • When X is manipulated, M has measurement error, and there is a confounder of M and Y, the two biases can to some degree offset each other.

Current State of Mediational Analysis

  • Multiple Regression: Minimally two equations, one for M and one for Y, need to be estimated.
  • The major difficulty is in the testing of the indirect effect for statistical significance.
  • One alternative is to use PROCESS, which has macros (add-ons) for SPSS, SAS, and R that perform bootstrap analyses.
  • Structural Equation Modeling: All the coefficients are estimated in a single run and most SEM programs provide estimates of indirect effects and bootstrapping.
  • Also SEM with FIML estimation can allow for a more complex model of missing data and conduct sensitivity analyses.
  • A fit statistic, such as an "information" measure like the AIC or BIC, can be computed.
  • Causal Inference Software: Mplus and the CAUSALMED procedure within SAS are highly recommended.
  • Stata 18 is another option to estimate effects using the Causal Inference approach.

Reporting Results

  • Mediation papers need to report the indirect effect and its confidence interval.
  • Paths a, b, c, and c', as well as their statistical significance (or confidence intervals), are reported.
  • One must discuss the likelihood of meeting the assumptions of mediational analysis, reporting on the results of sensitivity analyses.

Power

  • The indirect effect is the product of two effects.
  • Possibilities for small, medium, and large effect sizes.

Distal and Proximal Mediators

  • Paths a and b are presumed to be positive. Maximum size of the product ab equals a value near c, and so as path a increases, path b must decrease and vice versa.
  • Hoyle and Kenny (1999) define a proximal mediator as path a being greater than path b and a distal mediator as b being greater than a.
  • A mediator can be too close in time or in the process to the causal variable (a proximal mediator), so that path a is relatively large and path b relatively small, or too close to the outcome (a distal mediator), so that path b is large and path a is small.
  • Standardized a and b should be comparable in size for power.

Multicollinearity

  • If M is a successful mediator, it is necessarily correlated with X due to path a.
  • Given that path a is nonzero, the power of the tests of the coefficients b and c’ is lowered.
  • The effective sample size for the tests of coefficients b and c' is approximately N(1 - r²), where N is the total sample size and r is the correlation between the causal variable and the mediator, which equals standardized a.
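
The effective sample size approximation is simple arithmetic. The helper below is a hypothetical illustration with made-up values for N and r.

```python
def effective_n(n, r):
    """Approximate effective sample size for the tests of b and c': N(1 - r^2)."""
    return n * (1 - r ** 2)

# With N = 100 and a standardized path a of .50, the tests of b and c'
# behave roughly as if only 75 cases were available.
print(effective_n(100, 0.5))
```

The stronger path a is, the more the power of the tests of b and c' is reduced.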

Low Power for Tests of c and c'

  • The tests of c and c' have relatively low power, especially in comparison to the test of the indirect effect. It can easily happen that ab is statistically significant but c is not.
  • There is a power advantage in testing ab over c'.
  • One needs to be very careful about any claim of complete mediation based on the non-significance of c’.

Power Program and Apps

  • Power programs and apps can be used to forecast the power of the test that the indirect effect is zero.
  • They can also be used to forecast the minimum sample size needed to achieve a desired level of power.
  • Alternatively, and more generally, one could use a structural equation modeling program to run a simulation to conduct a power analysis.
  • MedPower: computes the power of the tests of paths a and b and then multiplies the two to obtain the power of the test of the indirect effect.
  • mc_power_med: uses a Monte Carlo bootstrap.
  • pwr2ppl: an R package to compute power.
  • Allows for up to four X variables and four mediators
  • Computes power for only the indirect effect.
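
The simulation route to power can be sketched as follows. This is not the algorithm of MedPower, mc_power_med, or pwr2ppl. It is a minimal illustration that uses the joint significance test, a normal critical value of 1.96 in place of the t distribution, and assumed population values. All function names are hypothetical.

```python
import math
import random

def cov(u, v):
    """Sample covariance (n - 1 in the denominator)."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return sum((ui - mu) * (vi - mv) for ui, vi in zip(u, v)) / (len(u) - 1)

def ratios_for_a_and_b(x, m, y):
    """Estimate/standard-error ratios for path a (M on X) and path b (Y on X, M)."""
    n = len(x)
    sxx, smm, syy = cov(x, x), cov(m, m), cov(y, y)
    sxm, sxy, smy = cov(x, m), cov(x, y), cov(m, y)
    det = sxx * smm - sxm ** 2
    a = sxm / sxx
    c_prime = (smm * sxy - sxm * smy) / det
    b = (sxx * smy - sxm * sxy) / det
    mse_m = (n - 1) * (smm - a * sxm) / (n - 2)                 # residual variance of M
    mse_y = (n - 1) * (syy - c_prime * sxy - b * smy) / (n - 3)  # residual variance of Y
    se_a = math.sqrt(mse_m / ((n - 1) * sxx))
    se_b = math.sqrt(mse_y * sxx / ((n - 1) * det))
    return a / se_a, b / se_b

def simulate_power(a, b, c_prime, n, reps=500, crit=1.96, seed=0):
    """Proportion of simulated samples in which a, b, and both are significant."""
    rng = random.Random(seed)
    hits_a = hits_b = hits_joint = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        m = [a * xi + rng.gauss(0, 1) for xi in x]
        y = [c_prime * xi + b * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]
        za, zb = ratios_for_a_and_b(x, m, y)
        sig_a, sig_b = abs(za) > crit, abs(zb) > crit
        hits_a += sig_a
        hits_b += sig_b
        hits_joint += sig_a and sig_b
    return hits_a / reps, hits_b / reps, hits_joint / reps

power_a, power_b, power_joint = simulate_power(a=0.4, b=0.4, c_prime=0.0, n=100, reps=300)
print("power for a:", power_a, " power for b:", power_b)
print("joint power:", power_joint, " product of powers:", round(power_a * power_b, 3))
```

Comparing the joint rate with the product of the two separate powers shows how close the multiplication shortcut is for a given design.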

Criticism of Mediation Analyses and the Response

  • Mediational analyses of cross-sectional studies very often yield invalid estimates of mediational effects.
  • Mediational analyses make strong assumptions that are rarely fully met, and so conclusions from mediational analyses should be very cautious.
  • Authors should list the actual specific assumptions under which their central estimate of the causal effects can be interpreted, along with statements about the degree to which conclusions are sensitive to violations of these assumptions.

Briefly Discussed Topics

  • X Within-Participants
  • Multiple Causal or X Variables
  • Multiple Mediators
  • Total effect or c is near zero, because there are two indirect effects that work in the opposite direction: offsetting mediators.
  • Test hypotheses about the linear combinations of indirect effects
  • Multiple Outcomes: Can be tested simultaneously or separately. 
  • Covariates
  • Mediated Moderation and Moderated Mediation
  • Clustered Data
  • Longitudinal Data
  • Baseline measures of M and Y might be a way to control for confounders.
  • Intensive Longitudinal Data

Conclusion

  • Mediational analyses continue to be a topic of intense interest because they provide answers to important research questions.
  • Mediational analyses have advanced our understanding in many areas, e.g., prevention science and medicine (MacKinnon, 2024).
  • Researchers have learned not only which mediators are successful but also which are unsuccessful.
  • Mediational analyses are very popular because they help researchers answer a fundamental research question: How? Methodologists and statisticians need to continue their effort to make methods of mediation analysis comprehensible to researchers. The existence of poor mediation analyses does not mean that mediation analyses should never be undertaken!
  • Mediational analyses require strong causal assumptions that must be stated and taken into account. The two major assumptions concern confounding and unreliability. Mediation researchers need to state what those assumptions are and attempt to satisfy them, as well as conduct sensitivity analyses.
  • Structural Equation Modeling can be used to deal with many of the assumptions such as measurement error in M, multiple mediator and outcome models, and models which fix the c' path to zero.
  • To become fully proficient, one must learn the Causal Inference approach with potential outcomes analysis. This approach is beneficial not only for understanding nonlinear models, but also linear models. However, the Causal Inference approach needs to become more cognizant of the assumption of perfect reliability of causal variables.
