A national experiment reveals where a growth mindset improves achievement
David S. Yeager
Summary
A research article details a national experiment on secondary education in the United States. The study reveals that a short online growth mindset intervention can improve the academic outcomes of adolescents, improving grades and course enrollment. The article identifies school contexts that sustained the effects of the growth mindset intervention. Keywords: student achievement, growth mindset, secondary education, education.
Article
https://doi.org/10.1038/s41586-019-1466-y
OPEN

A national experiment reveals where a growth mindset improves achievement

David S. Yeager1*, Paul Hanselman2*, Gregory M. Walton3, Jared S. Murray1, Robert Crosnoe1, Chandra Muller1, Elizabeth Tipton4, Barbara Schneider5, Chris S. Hulleman6, Cintia P. Hinojosa7, David Paunesku8, Carissa Romero9, Kate Flint10, Alice Roberts10, Jill Trott10, Ronaldo Iachan10, Jenny Buontempo1, Sophia Man Yang1, Carlos M. Carvalho1, P. Richard Hahn11, Maithreyi Gopalan12, Pratik Mhatre1, Ronald Ferguson13, Angela L. Duckworth14 & Carol S. Dweck3

1University of Texas at Austin, Austin, TX, USA. 2University of California, Irvine, Irvine, CA, USA. 3Stanford University, Stanford, CA, USA. 4Northwestern University, Evanston, IL, USA. 5Michigan State University, East Lansing, MI, USA. 6University of Virginia, Charlottesville, VA, USA. 7University of Chicago, Chicago, IL, USA. 8Project for Education Research that Scales, San Francisco, CA, USA. 9Paradigm Strategy Inc., San Francisco, CA, USA. 10ICF, Fairfax, VA, USA. 11Arizona State University, Tempe, AZ, USA. 12The Pennsylvania State University, University Park, PA, USA. 13Harvard University, Cambridge, MA, USA. 14University of Pennsylvania, Philadelphia, PA, USA. *e-mail: [email protected]; [email protected]

364 | NATURE | VOL 573 | 19 SEPTEMBER 2019

A global priority for the behavioural sciences is to develop cost-effective, scalable interventions that could improve the academic outcomes of adolescents at a population level, but no such interventions have so far been evaluated in a population-generalizable sample. Here we show that a short (less than one hour), online growth mindset intervention, which teaches that intellectual abilities can be developed, improved grades among lower-achieving students and increased overall enrolment to advanced mathematics courses in a nationally representative sample of students in secondary education in the United States. Notably, the study identified school contexts that sustained the effects of the growth mindset intervention: the intervention changed grades when peer norms aligned with the messages of the intervention. Confidence in the conclusions of this study comes from independent data collection and processing, pre-registration of analyses, and corroboration of results by a blinded Bayesian analysis.

About 20% of students in the United States will not finish high school on time1. These students are at a high risk of poverty, poor health and early mortality in the current global economy2–4. Indeed, a Lancet commission concluded that improving secondary education outcomes for adolescents “presents the single best investment for health and wellbeing”5.

The transition to secondary school represents an important period of flexibility in the educational trajectories of adolescents6. In the United States, the grades of students tend to decrease during the transition to the ninth grade (age 14–15 years, UK year 10), and often do not recover7. When such students underperform in or opt out of rigorous coursework, they are far less likely to leave secondary school prepared for college or university or for advanced courses in college or university8,9. In this way, early problems in the transition to secondary school can compound over time into large differences in human capital in adulthood.

One way to improve academic success across the transition to secondary school is through social–psychological interventions, which change how adolescents think or feel about themselves and their schoolwork and thereby encourage students to take advantage of learning opportunities in school10,11. The specific intervention evaluated here, a growth mindset of intelligence intervention, addresses the beliefs of adolescents about the nature of intelligence, leading students to see intellectual abilities not as fixed but as capable of growth in response to dedicated effort, trying new strategies and seeking help when appropriate12–16. This can be especially important in a society that conveys a fixed mindset (a view that intelligence is fixed), which can imply that feeling challenged and having to put in effort means that one is not naturally talented and is unlikely to succeed12.

The growth mindset intervention communicates a memorable metaphor: that the brain is like a muscle that grows stronger and smarter when it undergoes rigorous learning experiences14. Adolescents hear the metaphor in the context of the neuroscience of learning, they reflect on ways to strengthen their brains through schoolwork, and they internalize the message by teaching it to a future first-year ninth grade student who is struggling at the start of the year. The intervention can lead to sustained academic improvement through self-reinforcing cycles of motivation and learning-oriented behaviour. For example, a growth mindset can motivate students to take on more rigorous learning experiences and to persist when encountering difficulties. Their behaviour may then be reinforced by the school context, such as more positive and learning-oriented responses from peers or instructors10,17.

Initial intervention studies with adolescents taught a growth mindset in multi-session (for example, eight classroom sessions15), interactive workshops delivered by highly trained adults; however, these were not readily scalable. Subsequent growth mindset interventions were briefer and self-administered online, although lower effect sizes were, of course, expected. Nonetheless, previous randomized evaluations, including a pre-registered replication, found that online growth mindset interventions improved grades for the targeted group of students in secondary education who previously showed lower achievement13,16,18. These findings are important because previously low-achieving students are the group that shows the steepest decline in grades during the transition to secondary school19, and these findings are consistent with theory because a growth mindset should be most beneficial for students confronting challenges20.

Here we report the results of the National Study of Learning Mindsets, which examined the effects of a short, online growth mindset intervention in a nationally representative sample of high schools in the United States (Fig. 1). With this unique dataset we tested the hypotheses that the intervention would improve grades among lower-achieving students and overall uptake of advanced courses in this national sample.

[Fig. 1 schematic. Before data collection: recruit representative sample. Ninth grade: student session 1 (intervention and survey). 1–4 weeks later*: student session 2 (intervention and survey; make-a-math worksheet task). After ninth grade: grades and course taking. Analysis phase: professional data processing, pre-registered analyses, blinded Bayesian analyses.]

Fig. 1 | Design of the National Study of Learning Mindsets. Between August and November 2015, 82% of schools delivered the intervention; the remaining 18% delivered the intervention in January or February of 2016. Asterisk indicates that the median number of days between sessions 1 and 2 among schools implementing the intervention in the autumn was 21 days; for spring-implementing schools it was 27 days. The coin-tossing symbol indicates that random assignment was made during session 1. The tick symbol indicates that a comprehensive analysis plan was pre-registered at https://osf.io/tn6g4. The blind-eye symbol indicates that, first, teachers and researchers were kept blinded to students’ random assignment to condition, and, second, the Bayesian, machine-learning robustness tests were conducted by analysts who at the time were blinded to study hypotheses and to the identities of the variables.

A focus on heterogeneity

The study was also designed with the purpose of understanding for whom and under what conditions the growth mindset intervention improves grades. That is, it examined potential sources of cross-site treatment effect heterogeneity. Understanding heterogeneity of effects is important because most interventions that are effective in initial efficacy trials go on to show weaker or no effects when they are scaled up in effectiveness trials that deliver treatments under everyday conditions to more heterogeneous samples21–23. Without clear evidence about why average effect sizes differ in later-conducted studies (evidence that could be acquired from a systematic investigation of effect heterogeneity), researchers may prematurely discard interventions that yield low average effects but could provide meaningful and replicable benefits at scale for targeted groups21,23.

Further, analyses of treatment effect heterogeneity can reveal critical evidence about contextual mechanisms that sustain intervention effects. If school contexts differ in the availability of the resources or experiences needed to sustain the offered belief change and enhanced motivation following an intervention, then the effects of the intervention should differ across these school contexts as well10,11.

Sociological theory highlights two broad dimensions of school contexts that might sustain or impede belief change and enhanced motivation among students treated by a growth mindset intervention6. First, schools with the least ‘formal’ resources, such as high-quality curricula and instruction, may not offer the learning opportunities for students to be able to capitalize on the intervention, while those with the most resources may not need the intervention. Second, some schools may not have the ‘informal’ resources needed to sustain the intervention effect, such as peer norms that support students when they take on challenges and persist in the face of intellectual difficulty. We hypothesized that both of these dimensions would significantly moderate growth mindset intervention effects.

Historically, the scientific methods used to answer questions about the heterogeneity of intervention effects have been underdeveloped and underused21,24,25. Common problems in the literature are: (1) imprecise site-level impact estimates (because of cluster-level random assignment); (2) inconsistent fidelity to intervention protocols across sites (which can obscure the workings of the cross-site moderators of interest); (3) non-representative sampling of sites (which causes site selection bias22,26); and (4) multiple post hoc tests for the sources of treatment effect size heterogeneity (which increases the probability of false discoveries24).

We overcame all of these problems in a single study. We randomized students to condition within schools and consistently had high fidelity of implementation across sites (see Supplementary Information section 5). We addressed site selection bias by contracting a professional research company, which recruited a sample of schools that generalized to the entire population of ninth-grade students attending regular US public schools27 (that is, schools that run on government funds; see Supplementary Information section 3). Next, the study used analysis methods that avoided false conclusions about subgroup effects, by generating a limited number of moderation hypotheses (two), pre-registering a limited number of statistical tests and conducting a blinded Bayesian analysis that can provide rigorous confirmation of the results (Fig. 1).

Expected effect sizes

In this kind of study, it is important to ask what size of effect would be meaningful. As a leading educational economist concluded, “in real-world settings, a fifth of a standard deviation [0.20 s.d.] is a large effect”28. This statement is justified by the ‘best evidence synthesis’ movement29, which recommends the use of empirical benchmarks, not from laboratory studies, but from the highest-quality field research on factors affecting objective educational outcomes30,31. A standardized mean difference effect size of 0.20 s.d. is considered ‘large’ because it is: (1) roughly how much improvement results from a year of classroom learning for ninth-grade students, as shown by standardized tests30; (2) at the high end of estimates for the effect of having a very high-quality teacher (versus an average teacher) for one year32; and (3) at the uppermost end of empirical distributions of real-world effect sizes from diverse randomized trials that target adolescents31. Notably, the highly-cited ‘nudges’ studied by behavioural economists and others, when aimed at influencing real-world outcomes that unfold over time (such as college enrolment or energy conservation33) rather than one-time choices, rarely, if ever, exceed 0.20 s.d. and typically have much smaller effect sizes.

Returning to educational benchmarks, 0.20 s.d. and 0.23 s.d. were the two largest effects observed in a recent cohort analysis of the results of all of the pre-registered, randomized trials that evaluated promising interventions for secondary schools funded as part of the US federal government’s i3 initiative34 (the median effect for these promising interventions was 0.03 s.d.; see Supplementary Information section 11). The interventions in the i3 initiative typically targeted lower-achieving students or schools, involved training teachers or changing curricula, consumed considerable classroom time, and cost several thousand US dollars per student. Moreover, they were all conducted in non-representative samples of convenience that can overestimate effects. Therefore, it would be noteworthy if a short, low-cost, scalable growth mindset intervention, conducted in a nationally representative sample, could achieve a meaningful proportion of the largest effects seen for past traditional interventions, within the targeted, pre-registered group of lower-achieving students.

Defining the primary outcome and student subgroup

The primary outcome was the post-intervention grade point average (GPA) in core ninth-grade classes (mathematics, science, English or language arts, and social studies), obtained from administrative data sources of the schools (as described in the pre-analysis plan found in the Supplementary Information section 13 and at https://osf.io35). Following the pre-registered analysis plan, we report results for the targeted group of n = 6,320 students who were lower-achieving relative to peers in the same school. This group is typically targeted by comprehensive programmes evaluated in randomized trials in education, as there is an urgent need to improve their educational trajectories.

The justification for predicting effects in the lower-achieving group is that (1) this group benefitted in previous growth mindset trials; (2) lower-achieving students may be undergoing more academic difficulties and therefore may benefit more from a growth mindset that alters the interpretation of these difficulties; and (3) students who already have a high GPA may have less room to improve their GPAs. We defined students as relatively lower-achieving if they were earning GPAs at or below the school-specific median in the term before random assignment or, if they were missing prior GPA data, if they were below the school-specific median on academic variables used to impute prior GPA (as described in the analysis plan). Supplementary analyses for the sample overall can be found in Extended Data Table 1, and robustness analyses for the definition of lower-achieving students are included in Extended Data Fig. 1 (Supplementary Information section 7).

Average effects on mindset

Among lower-achieving adolescents, the growth mindset intervention reduced the prevalence of fixed mindset beliefs relative to the control condition, reported at the end of the second treatment session, unstandardized B = −0.38 (95% confidence interval = −0.46, −0.31), standard error of the regression coefficient (s.e.) = 0.04, n = 5,650 students, k = 65 schools, t = −10.14, P < 0.001, standardized mean difference effect size of 0.33.

Average effects on core course GPAs

In line with our first major prediction, lower-achieving adolescents earned higher GPAs in core classes at the end of the ninth grade when assigned to the growth mindset intervention, B = 0.10 grade points (95% confidence interval = 0.04, 0.16), s.e. = 0.03, n = 6,320, k = 65, t = 3.51, P = 0.001, standardized mean difference effect size of 0.11, relative to comparable students in the control condition. This conclusion is robust to alternative model specifications that deviate from the pre-registered model (Extended Data Fig. 1).

To map the growth mindset intervention effect onto a policy-relevant indicator of high school success, we analysed poor performance rates, defined as the percentage of adolescents who earned a GPA below 2.0 on a four-point scale (that is, a ‘D’ or an ‘F’; as described in the pre-analysis plan). Poor performance rates are relevant because recent changes in US federal laws (the Every Student Succeeds Act36) have led many states to adopt reductions in the poor performance rates in the ninth grade as a key metric for school accountability. More than three million ninth-grade students attend regular US public schools each year, and half are lower-achieving according to our definition. The model estimates that 5.3% (95% confidence interval = 1.7, 9.0), s.e. = 1.8, t = 2.95, P = 0.005 of 1.5 million students in the United States per year would be prevented from being ‘off track’ for graduation by the brief and low-cost growth mindset intervention, representing a reduction from 46% to 41%, which is a relative risk reduction of 11% (that is, 0.05/0.46).

Average effects on mathematics and science GPAs

A secondary analysis focused on the outcome of GPAs in only mathematics and science (as described in the analysis plan). Mathematics and science are relevant because a popular belief in the United States links mathematics and science learning to ‘raw’ or ‘innate’ abilities37, a view that the growth mindset intervention seeks to correct. In addition, success in mathematics and science strongly predicts long-term economic welfare and well-being38. Analyses of outcomes for mathematics and science supported the same conclusions (B = 0.10 for mathematics and science GPAs compared to B = 0.10 for core GPAs; Extended Data Tables 1–3).

Quantifying heterogeneity

The intervention was expected to homogeneously change the mindsets of students across schools (as this would indicate high fidelity of implementation); however, it was expected to heterogeneously change lower-achieving students’ GPAs, as this would indicate potential school differences in the contextual mechanisms that sustain an initial treatment effect. As predicted, a mixed-effects model found no significant variability in the treatment effect on self-reported mindsets across schools (unstandardized τ̂ = 0.08, Q₆₄ = 57.2, P = 0.714), whereas significant variability was found in the effect on GPAs among lower-achieving students across schools (unstandardized τ̂ = 0.09, Q₆₄ = 85.5, P = 0.038)39 (Extended Data Fig. 2).

Moderation by school achievement level

First, we tested competing hypotheses about whether the formal resources of the school explained the heterogeneity of effects. Before analysing the data, we expected that in schools that are unable to provide high-quality learning opportunities (the lowest-achieving schools), treated students might not sustain a desire to learn. But we also expected that other schools (the highest-achieving schools) might have such ample resources to prevent failure that a growth mindset intervention would not add much.

The heterogeneity analyses found support for the latter expectation, but not the former. Treatment effects on ninth-grade GPAs among lower-achieving students were smaller in schools with higher achievement levels, intervention × school achievement level (continuous) interaction, unstandardized B = −0.07 (95% confidence interval = −0.13, −0.02), s.e. = 0.03, z = −2.76, n = 6,320, k = 65, P = 0.006, standardized β = −0.25. In follow-up analyses with categorical indicators for school achievement, medium-achieving schools (middle 50%) showed larger effects than higher-achieving schools (top 25%). Low-achieving schools (bottom 25%) did not significantly differ from medium-achieving schools (Extended Data Table 2); however, this non-significant difference should be interpreted cautiously, owing to wide confidence intervals for the subgroup of lowest-achieving schools.

Moderation by peer norms

Second, we examined whether students might be discouraged from acting on their enhanced growth mindset when they attend schools in which peer norms were unsupportive of challenge-seeking, whereas peer norms that support challenge-seeking might function to sustain the effects of the intervention over time. We measured peer norms by administering a behavioural challenge-seeking task (the ‘make-a-math-worksheet’ task) at the end of the second intervention session (Fig. 1) and aggregating the values of the control group to the school level.

The pre-registered mixed-effects model yielded a positive and significant intervention × behavioural challenge-seeking norms interaction for GPA among the targeted group of lower-achieving adolescents, such that the intervention produced a greater difference in end-of-year GPAs relative to the control group when the behavioural norm that surrounded students was supportive of the growth mindset belief system, B = 0.11 (95% confidence interval = 0.01, 0.21), s.e. = 0.05, z = 2.18, n = 6,320, k = 65, P = 0.029, β = 0.23. The same conclusion was supported in a secondary analysis of only mathematics and science GPAs (Extended Data Table 2).

Subgroup effect sizes

Putting together the two pre-registered moderators (school achievement level and school norms), the conditional average treatment effects (CATEs) on core GPAs within low- and medium-achieving schools (combined) were 0.14 grade points when the school was in the third quartile of behavioural norms and 0.18 grade points when the school was in the fourth and highest quartile of behavioural norms, as shown in Fig. 2. For mathematics and science grades, the CATEs ranged from 0.16 to 0.25 grade points in the same subgroups of low- and medium-achieving schools with more supportive behavioural norms (for results separating low- and medium-achieving schools, see Fig. 2c, d and Extended Data Table 3). We also found that even the high-achieving schools showed meaningful treatment effects among their lower achievers on mathematics and science GPAs when they had norms that supported challenge seeking: 0.08 and 0.11 grade points for the third and fourth quartiles of school norms, respectively, in the high-achieving schools (P = 0.002; Extended Data Table 3).

[Fig. 2 plots. a, b: conditional average treatment effect by quartile of norms (first to fourth), shown separately for low- and medium-achieving schools and for high-achieving schools. c, d: box plots of treatment effects on core GPAs (c) and on mathematics/science GPAs (d) from the linear mixed-effects model, by school achievement level (low, medium, high), shown separately for schools with unsupportive and supportive norms.]

Fig. 2 | The growth mindset intervention effects on grade point averages were larger in schools with peer norms that were supportive of the treatment message. a, c, Treatment effects on core course grade point averages (GPAs). b, d, Treatment effects on GPAs of only mathematics and science. a, b, The CATEs represent the estimated subgroup treatment effects from the pre-registered linear mixed-effects model, with survey weights, when fixing the racial/ethnic composition of the schools to the population median to remove any potential confounding effect of that variable on moderation hypothesis tests. Achievement levels: low, 25th percentile or lower; middle, 25th–75th percentile; high, 75th percentile or higher, which follows the categories set in the sampling plan and in the pre-registration. Norms indicate the behavioural challenge-seeking norms, as measured by the responses of the control group to the make-a-math-worksheet task after session 2. c, d, Box plots represent unconditional treatment effects (one for each school) estimated in the pre-registered linear mixed-effects regression model with no school-level moderators, as specified for research question 3 in the pre-analysis plan and described in the Supplementary Information section 7.4. The distribution of the school-level treatment effects was re-scaled to the cross-site standard deviation, in accordance with standard practice. Dark lines correspond to the median school in a subgroup and the boxes correspond to the middle 50% of the distribution (the interquartile range). Supportive schools are defined as above the population median (third and fourth quartiles); unsupportive schools are defined as those below the population median (first and second quartiles). n = 6,320 students in k = 65 schools.

Bayesian robustness analysis

A team of statisticians, at the time blind to study hypotheses, re-analysed the dataset using a conservative Bayesian machine-learning algorithm, called Bayesian causal forest (BCF). BCF has been shown by both its creators and other leading statisticians in open head-to-head competitions to be the most effective of the state-of-the-art methods for identifying systematic sources of treatment effect heterogeneity, while avoiding false positives40,41.

The BCF analysis assigned a near-certain posterior probability that the population-average treatment effect (PATE) among lower-achieving students was greater than zero, P(PATE > 0) ≥ 0.999, providing strong evidence of positive average treatment effects. BCF also found stronger CATEs in schools with positive challenge-seeking norms, and weaker effects in the highest-achieving schools (Extended Data Fig. 3 and Supplementary Information section 8), providing strong correspondence with the primary analyses.

Advanced mathematics course enrolment in tenth grade

The intervention showed weaker benefits on ninth-grade GPAs in high-achieving schools. However, students in these schools may benefit in other ways. An analysis of enrolment in rigorous mathematics courses in the year after the intervention examined this possibility. The enrolment data were gathered with these analyses in mind, but since the analyses were not pre-registered, they are exploratory.

Course enrolment decisions are potentially relevant to all students, both lower- and higher-achieving, so we explored them in the full cohort. We found that the growth mindset intervention increased the likelihood of students taking advanced mathematics (algebra II or higher) in tenth grade by 3 percentage points (95% confidence interval = 0.01, 0.04), s.e. = 0.01, n = 6,690, k = 41, t = 3.18, P = 0.001, from a rate of 33% in the control condition to a rate of 36% in the intervention condition, corresponding to a 9% relative increase. Notably, we discovered a positive intervention × school achievement level (continuous) interaction (B = 0.04 (95% confidence interval = 0.00, 0.08), s.e. = 0.02, z = 2.26, P = 0.024), the opposite of what we found for core course GPAs. Within the highest-achieving 25% of schools, the intervention increased the rate at which students took advanced mathematics in tenth grade by 4 percentage points (t = 2.37, P = 0.018). In the lower 75% of schools, where we found stronger effects on GPA, the increase in the rate at which students took advanced mathematics courses was smaller: 2 percentage points (t = 2.00, P = 0.045). Thus an exclusive focus on GPA would have obscured intervention benefits among students attending higher-achieving schools.

Discussion

The National Study of Learning Mindsets showed that a low-cost treatment, delivered in less than an hour, attained a substantial proportion of the effects on grades of the most effective rigorously evaluated adolescent interventions of any cost or duration in the literature within the pre-registered group of lower-achieving students. Moreover, the intervention produced gains in the consequential outcome of advanced mathematics course-taking for students overall, which is meaningful because the rigour of mathematics courses taken in high school strongly predicts later educational attainment8,9, and educational attainment is one of the leading predictors of longevity and health38,42. The finding that the growth mindset intervention could redirect critical academic outcomes to such an extent (with no training of teachers; in an effectiveness trial conducted in a population-generalizable sample; with data collected by an independent research company using repeatable procedures; with data processed by a second independent research company; and while adhering to a comprehensive pre-registered analysis plan) is a major advance.

Furthermore, the evidence about the kinds of schools where the growth mindset treatment effect on grades was sustained, and where it was not, has important implications for future interventions. We might have expected that the intervention would compensate for unsupportive school norms, and that students who already had supportive peer norms would not need the intervention as much. Instead, it was when the peer norm supported the adoption of intellectual challenges that the intervention promoted sustained benefits in the form of higher grades. Perhaps students in unsupportive peer climates risked paying a social price for taking on intellectual challenges in front of peers who thought it undesirable to do so. Sustained change may therefore require both a high-quality seed (an adaptive belief system conveyed by a compelling intervention) and conducive soil in which that seed can grow (a context congruent with the proffered belief system). A limitation of our moderation results, of course, is that we cannot draw causal conclusions about the effects of the school norm, as the norms were measured, not manipulated. It is encouraging that a Bayesian analysis, reported in the Supplementary Information section 8, yielded evidence consistent with a causal interpretation of the school norms variable. The present research therefore sets the stage for a new era of experimental research that seeks to enhance both students’ mindsets and the school environments that support student learning.

We emphasize that not all forms of growth mindset interventions can be expected to increase grades or advanced course-taking, even in the targeted subgroups11,12. New growth mindset interventions that go beyond the module and population tested here will need to be subjected to rigorous development and validation processes, as the current programme was13.

Finally, this study offers lessons for the science of adolescent behav- ... transition to secondary school6. Indeed, new interventions in the future should address the interpretation of other challenges that adolescents experience, including social and interpersonal difficulties, to affect outcomes (such as depression) that thus far have proven difficult to address43. And the combined importance of belief change and school environments in our study underscores the need for interdisciplinary research to understand the numerous influences on adolescents’ developmental trajectories.

Online content

Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/s41586-019-1466-y.

Received: 11 May 2018; Accepted: 7 July 2019; Published online 7 August 2019.

1. McFarland, J., Stark, P. & Cui, J. Trends in High School Dropout and Completion Rates in the United States: 2013 (US Department of Education, 2016).
2. Autor, D. H. Skills, education, and the rise of earnings inequality among the “other 99 percent”. Science 344, 843–851 (2014).
3. Fischer, C. S. & Hout, M. Century of Difference (Russell Sage Foundation, 2006).
4. Rose, H. & Betts, J. R. The effect of high school courses on earnings. Rev. Econ. Stat. 86, 497–513 (2004).
5. Patton, G. C. et al. Our future: a Lancet commission on adolescent health and wellbeing. Lancet 387, 2423–2478 (2016).
6. Crosnoe, R. Fitting In, Standing Out: Navigating the Social Challenges of High School to Get an Education (Cambridge Univ. Press, 2011).
7. Sutton, A., Langenkamp, A. G., Muller, C. & Schiller, K. S. Who gets ahead and who falls behind during the transition to high school? Academic performance at the intersection of race/ethnicity and gender. Soc. Probl. 65, 154–173 (2018).
8. Adelman, C. The Toolbox Revisited: Paths to Degree Completion from High School through College (US Department of Education, 2006).
9. Schiller, K. S., Schmidt, W. H., Muller, C. & Houang, R. Hidden disparities: how courses and curricula shape opportunities in mathematics during high school. Equity Excell. Educ. 43, 414–433 (2010).
10. Walton, G. M. & Wilson, T. D. Wise interventions: psychological remedies for social and personal problems. Psychol. Rev. 125, 617–655 (2018).
11. Yeager, D. S. & Walton, G. M. Social–psychological interventions in education: they’re not magic. Rev. Educ. Res. 81, 267–301 (2011).
12. Dweck, C. S. & Yeager, D. S. Mindsets: a view from two eras. Perspect. Psychol. Sci. 14, 481–496 (2019).
13. Yeager, D. S. et al. Using design thinking to improve psychological interventions: the case of the growth mindset during the transition to high school. J. Educ. Psychol. 108, 374–391 (2016).
14. Aronson, J. M., Fried, C. B. & Good, C. Reducing the effects of stereotype threat on African American college students by shaping theories of intelligence. J. Exp. Soc. Psychol. 38, 113–125 (2002).
15. Blackwell, L. S., Trzesniewski, K. H. & Dweck, C. S. Implicit theories of intelligence predict achievement across an adolescent transition: a longitudinal study and an intervention. Child Dev. 78, 246–263 (2007).
16. Paunesku, D. et al. Mind-set interventions are a scalable treatment for academic underachievement. Psychol. Sci. 26, 784–793 (2015).
17. Cohen, G. L., Garcia, J., Purdie-Vaughns, V., Apfel, N. & Brzustoski, P. Recursive processes in self-affirmation: intervening to close the minority achievement gap. Science 324, 400–403 (2009).
18. Good, C., Aronson, J. & Inzlicht, M. Improving adolescents’ standardized test performance: an intervention to reduce the effects of stereotype threat. J. Appl. Dev. Psychol. 24, 645–662 (2003).
19. Benner, A. D. The transition to high school: current knowledge, future directions. Educ. Psychol. Rev. 23, 299–328 (2011).
20. Burnette, J. L., O’Boyle, E. H., VanEpps, E. M., Pollack, J. M. & Finkel, E. J. Mind-sets matter: a meta-analytic review of implicit theories and self-regulation. Psychol. Bull. 139, 655–701 (2013).
21. Greenberg, M. T. & Abenavoli, R. Universal interventions: fully exploring their impacts and potential to produce population-level impacts. J. Res. Educ. Eff. 10, 40–67 (2017).
22. Allcott, H. Site selection bias in program evaluation. Q. J. Econ. 130, 1117–1165 (2015).
23. Singal, A. G., Higgins, P. D. R. & Waljee, A. K. A primer on effectiveness and efficacy trials. Clin. Transl. Gastroenterol. 5, e45 (2014).
24. Bloom, H. S. & Michalopoulos, C. When is the story in the subgroups? Strategies for interpreting and reporting intervention effects for subgroups. Prev. Sci. 14, 179–188 (2013).
25.
Reardon, S. F. & Stuart, E. A. Editors’ introduction: theme issue on variation in iour change. Beliefs—and particularly beliefs that affect how students treatment effects. J. Res. Educ. Eff. 10, 671–674 (2017). 26. Stuart, E. A., Bell, S. H., Ebnesajjad, C., Olsen, R. B. & Orr, L. L. Characteristics of make sense of ongoing challenges—are important during high-stakes school districts that participate in rigorous national educational evaluations. developmental turning points such as pubertal maturation43,44 or the J. Res. Educ. Eff. 10, 168–206 (2017). 3 6 8 | N A T U RE | V O L 5 7 3 | 1 9 S E P TE M B ER 2 0 1 9 Article RESEARCH 27. Gopalan, M. & Tipton, E. Is the National Study of Learning Mindsets 40. Hahn, P. R., Murray, J. S. & Carvalho, C. Bayesian regression tree models for nationally-representative? https://psyarxiv.com/dvmr7/ (2018). causal inference: regularization, confounding, and heterogeneous effects. 28. Dynarski, S. M. For Better Learning in College Lectures, Lay Down The Laptop Preprint at https://arxiv.org/abs/1706.09523 (2017). and Pick Up a Pen (The Brookings Institution, 2017). 41. Dorie, V., Hill, J., Shalit, U., Scott, M. & Cervone, D. Automated versus 29. Slavin, R. E. Best-evidence synthesis: an alternative to meta-analytic and do-it-yourself methods for causal inference: lessons learned from a traditional reviews. Educ. Res. 15, 5–11 (1986). data analysis competition. Statist. Sci. 34, 43–68 (2019). 30. Hill, C. J., Bloom, H. S., Black, A. R. & Lipsey, M. W. Empirical benchmarks for 42. Kaplan, R. M. More Than Medicine: The Broken Promise of American Health interpreting effect sizes in research. Child Dev. Perspect. 2, 172–177 (2008). (Harvard Univ. Press, 2019). 31. Kraft, M. Interpreting Effect Sizes of Education Interventions https://scholar. 43. Yeager, D. S., Dahl, R. E. & Dweck, C. S. 
Why interventions to influence adolescent harvard.edu/files/mkraft/files/kraft_2018_interpreting_effect_sizes.pdf (Brown behavior often fail but could succeed. Perspect. Psychol. Sci. 13, 101–122 (2018). University, 2018). 44. Dahl, R. E., Allen, N. B., Wilbrecht, L. & Suleiman, A. B. Importance of investing 32. Hanushek, E. Valuing teachers: how much is a good teacher worth? Educ. Next in adolescence from a developmental science perspective. Nature 554, 11, 40–45 (2011). 441–450 (2018). 33. Benartzi, S. et al. Should governments invest more in nudging? Psychol. Sci. 28, 1041–1055 (2017). Publisher’s note: Springer Nature remains neutral with regard to jurisdictional 34. Boulay, B. et al. The Investing in Innovation Fund: Summary of 67 Evaluations claims in published maps and institutional affiliations. (US Department of Education, 2018). 35. Yeager, D. S. National Study of Learning Mindsets - One Year Impact Analysis. Open Access This article is licensed under a Creative Commons https://osf.io/tn6g4 (2017). Attribution 4.0 International License, which permits use, sharing, 36. Alexander, L. Every Student Succeeds Act. 114th Congress Public Law No. adaptation, distribution and reproduction in any medium or 114-95 https://www.congress.gov/bill/114th-congress/senate-bill/1177/text format, as long as you give appropriate credit to the original author(s) and the (US Congress, 2015). source, provide a link to the Creative Commons license, and indicate if changes 37. Leslie, S.-J., Cimpian, A., Meyer, M. & Freeland, E. Expectations of brilliance were made. The images or other third party material in this article are included underlie gender distributions across academic disciplines. Science 347, in the article’s Creative Commons license, unless indicated otherwise in a credit 262–265 (2015). line to the material. If material is not included in the article’s Creative Commons 38. Carroll, J. M., Muller, C., Grodsky, E. & Warren, J. R. 
Tracking health inequalities license and your intended use is not permitted by statutory regulation or from high school to midlife. Soc. Forces 96, 591–628 (2017). exceeds the permitted use, you will need to obtain permission directly from the 39. Bloom, H. S., Raudenbush, S. W., Weiss, M. J. & Porter, K. Using multisite copyright holder. To view a copy of this license, visit http://creativecommons. experiments to study cross-site variation in treatment effects: a hybrid org/licenses/by/4.0/. approach with fixed intercepts and a random treatment coefficient. J. Res. Educ. Eff. 10, 817–842 (2017). © The Author(s) 2019 1 9 S E P TE M B ER 2 0 1 9 | V O L 5 7 3 | N A T U RE | 3 6 9 RESEARCH Article Methods Growth mindset intervention content. In preparing the intervention to be Ethics approval. Approval for this study was obtained from the Institutional scalable, we revised past growth mindset interventions to focus on the perspectives, Review Board at Stanford University (30387), ICF (FWA00000845), and the concerns and reading levels of ninth-grade students in the United States, through University of Texas at Austin (#2016-03-0042). In most schools this experiment an intensive research and development process that involved interviews, focus was conducted as a programme evaluation carried out at the request of the partici- groups and randomized pilot experiments with thousands of adolescents13. pating school district45. When required by school districts, parents were informed The control condition, focusing on brain functions, was similar to the growth of the programme evaluation in advance and given the opportunity to withdraw mindset intervention, but did not address beliefs about intelligence. Screenshots their children from the study. Informed student assent was obtained from all from both interventions can be found in Supplementary Information section 4, participants. and a detailed description of the general intervention content has previously been Participants. 
Data came from the National Study of Learning Mindsets45, which published13. The intervention consisted of two self-administered online sessions is a stratified random sample of 65 regular public schools in the United States that that lasted approximately 25 min each and occurred roughly 20 days apart during included 12,490 ninth-grade adolescents who were individually randomized to regular school hours (Fig. 1). condition. The number of schools invited to participate was determined by a power The growth mindset intervention aimed to reduce the negative effort beliefs of analysis to detect reasonable estimates of cross-site heterogeneity; as many of the students (the belief that having to try hard or ask for help means you lack ability), invited schools as possible were recruited into the study. Grades were obtained fixed-trait attributions (the attribution that failure stems from low ability) and from the schools of the students, and analyses focused on the lower-achieving performance avoidance goals (the goal of never looking stupid). These are the subgroup of students (those below the within-school median). The sample reflected documented mediators of the negative effect of a fixed mindset on grades12,15,48 the diversity of young people in the United States: 11% self-reported being black/ and the growth mindset intervention aims to reduce them. The intervention did African-American, 4% Asian-American, 24% Latino/Latina, 43% white and 18% not only contradict these beliefs but also used a series of interesting and guided another race or ethnicity; 29% reported that their mother had a bachelor’s degree or exercises to reduce their credibility. higher. 
To prevent deductive disclosure for potentially-small subgroups of students, The first session of the intervention covered the basic idea of a growth mindset— and consistent with best practices for other public-use datasets, the policies for the that an individual’s intellectual abilities can be developed in response to effort, National Study of Learning Mindsets require analysts to round all sample sizes to taking on challenging work, improving one’s learning strategies, and asking the nearest 10, so this was done here. for appropriate help. The second session invited students to deepen their Data collection. To ensure that the study procedures were repeatable by third understanding of this idea and its application in their lives. Notably, students were parties and therefore scalable, and to increase the independence of the results, not told outright that they should work hard or employ particular study or learning two different professional research companies, who were not involved in devel- strategies. Rather, effort and strategy revision were described as general behav- oping the materials or study hypotheses, were contracted. One company (ICF) iours through which students could develop their abilities and thereby achieve drew the sample, recruited schools, arranged for treatment delivery, supervised their goals. and implemented the data collection protocol, obtained administrative data, and The materials presented here sought to make the ideas compelling and help cleaned and merged data. They did this work blind to the treatment conditions of adolescents to put them into practice. It therefore featured stories from both older the students. 
This company worked in concert with a technology vendor (PERTS), students and admired adults about a growth mindset, and interactive sections in which delivered the intervention, executed random assignment, tracked student which students reflected on their own learning in school and how a growth mindset response rates, scheduled make-up sessions and kept all parties blind to condition could help a struggling ninth-grade student next year. The intervention style is assignment. A second professional research company (MDRC) processed the data described in greater detail in a paper reporting the pilot study for the present merged by ICF and produced an analytic grades file, blind to the consequences of research13 and in a recent review article12. their decisions for the estimated treatment effects, as described in Supplementary Among these features, our intervention mentioned effort as one means to Information section 12. Those data were shared with the authors of this paper, develop intellectual ability. Although we cannot isolate the effect of the growth who analysed the data following a pre-registered analysis plan (see Supplementary mindset message from a message about effort alone, it is unlikely that the mere Information section 13; MDRC will later produce its own independent report using mention of effort to high school students would be sufficient to increase grades its processed data, and retained the right to deviate from our pre-analysis plan). and challenge seeking. In part this is because adolescents often already receive a Selection of schools was stratified by school achievement and minority com- great deal of pressure from adults to try hard in school. position. A simple random sample would not have yielded sufficient numbers of Intervention delivery and fidelity. 
The intervention and control sessions were rare types of schools, such as high-minority schools with medium or high levels delivered as early in the school year as possible, to increase the opportunity to of achievement. This was because school achievement level—one of the two can- set in motion a positive self-reinforcing cycle. In total 82% of students received didate moderators—was strongly associated with school racial/ethnic composi- the intervention in the autumn semester before the Thanksgiving holiday in the tion46 (percentage of Black/African-American or Hispanic/Latino/Latina students, United States (that is, before late November) and the rest received the intervention r = −0.66). in January or February; see Supplementary Information section 5 for more detail. A total of 139 schools were selected without replacement from a sampling frame The computer software of the technology vendor randomly assigned adolescents to of roughly 12,000 regular US public high schools, which serve the vast majority intervention or control materials. Students also answered various survey questions. of students in the United States. Regular US public schools exclude charter or pri- All parties were blind to condition assignment, and students and teachers were not vate schools, schools serving speciality populations such as students with physical told the purpose of the study to prevent expectancy effects. disabilities, alternative schools, schools that have fewer than 25 ninth-grade The data collection procedures yielded high implementation fidelity across the students enrolled and schools in which ninth grade is not the lowest grade in participating schools, according to metrics listed in the pre-registered analysis the school. plan. In the median school, treated students viewed 97% of screens and wrote a Of the 139 schools, 65 schools agreed, participated and provided student response for 96% of open-ended questions. In addition, in the median school 91% records. 
Another 11 schools agreed and participated but did not provide student students reported that most or all of their peers worked carefully and quietly on grades or course-taking records; therefore, the data of their students are not ana- the materials. Fidelity statistics are reported in full in Supplementary Information lysed here. School nonresponse did not appear to compromise representativeness. section 5.6; Extended Data Table 2 shows that the treatment effect heterogeneity We calculated the Tipton generalizability index47, a measure of similarity between conclusions were unchanged when controlling for the interaction of treatment and an analytic sample and the overall sampling frame, along eight student demo- school-level fidelity as intended. graphic and school achievement benchmarks obtained from official government Measures. Self-reported fixed mindset. Students indicated how much they agreed sources27. The index ranges from 0 to 1, with a value of 0.90 corresponding to with three statements such as “You have a certain amount of intelligence, and you essentially a random sample. The National Study of Learning Mindsets showed really can’t do much to change it” (1, strongly disagree; 6, strongly agree). Higher a Tipton generalizability index of 0.98, which is very high (see Supplementary values corresponded to a more fixed mindset; the pre-analysis plan predicted that Information section 3). the intervention would reduce these self-reports. Within schools, the average student response rate for eligible students GPAs. Schools provided the grades of each student in each course for the eight was 92% and the median school had a response rate of 98% (see definitions and ninth grade. Decisions about which courses counted for which content area in Supplementary Information section 5). 
This response rate was obtained by were made independently by a research company (MDRC; see Supplementary extensive efforts to recruit students into make-up sessions if students were absent Information section 12). The GPAs are a theoretically relevant outcome because and it was aided by a software system, developed by the technology vendor grades are commonly understood to reflect sustained motivation, rather than only (PERTS), that kept track of student participation. A high within-school response prior knowledge. It is also a practically relevant outcome because, as noted, GPA rate was important because lower-achieving students, our target group, are typically is a strong predictor of adult educational attainment, health and well-being, even more likely to be absent. when controlling for high school test scores38. Article RESEARCH School achievement level. The school achievement level moderator was a latent moderators of treatments: BCF40. The BCF algorithm uses machine learning tools variable that was derived from publicly available indicators of the performance of to discover (or rule out) higher-order interactions and nonlinear relations among the school on state and national tests and related factors45,46, standardized to have covariates and moderators. It is conservative because it uses regularization and mean = 0 and s.d. = 1 in the population of the more than 12,000 US public schools. strong prior distributions to prevent false discoveries. Evidence for the robustness Behavioural challenge-seeking norms of the schools. The challenge-seeking norm of the moderation analysis in our pre-registered model comes from correspond- of each school was assessed through a behavioural measure called the make- ence with the estimated moderator effects of BCF in the part of the distribution a-math-worksheet task13. 
Students completed the task towards the end of the where there are the most schools (that is, in the middle of the distribution), because second session, after having completed the intervention or control content. They this is where the BCF algorithm is designed to have confidence in its estimates chose from mathematical problems that were described either as challenging and (Extended Data Fig. 3). offering the chance to learn a lot or as easy and not leading to much learning. Reporting summary. Further information on research design is available in Students were told that they could complete the problems at the end of the session the Nature Research Reporting Summary linked to this paper. if there was time. The school norm was estimated by taking the average num- ber of challenging mathematical problems that adolescents in the control condi- Data availability tion attending a given school chose to work on. Evidence for the validity of the Technical documentation for the National Study of Learning Mindsets is available challenge-seeking norm is presented in the Supplementary Information section 10. from ICPSR at the University of Michigan (https://doi.org/10.3886/ICPSR37353. Norms of self-reported mindset of the schools. A parallel analysis focused on norms v1). Aggregate data are available at https://osf.io/r82dw/. Student-level data are for self-reported mindsets in each school, defined as the average fixed mindset protected by data sharing agreements with the participating districts; de-identified self-reports (described above) of students before random assignment. The private data can be accessed by researchers who agree to terms of data use, including beliefs of peers were thought to be less likely to be visible and therefore less likely required training and approvals from the University of Texas Institutional Review to induce conformity and moderate treatment effects, relative to peer behaviours49; Board and analysis on a secure server. 
To request access to data, researchers should hence self-reported beliefs were not expected to be significant moderators. Self- contact [email protected]. The pre-registered analysis plan can be found at reported mindset norms did not yield significant moderation (see Extended Data https://osf.io/tn6g4. The intervention module will not be commercialized and will Table 2). be available at no cost to all secondary schools in the United States or Canada that Course enrolment to advanced mathematics. We analysed data from 41 schools wish to use it via https://www.perts.net/. Selections from the intervention materials who provided data that allowed us to calculate rates at which students took an are included in the Supplementary Information. Researchers wishing to access full advanced mathematics course (that is, algebra II or higher) in tenth grade, the intervention materials should contact [email protected] and must agree to school year after the intervention. Six additional schools provided tenth grade terms of use, including non-commercialization of the intervention. course-taking data but did not differentiate among mathematics courses. We expected average effects of the treatment on challenging course taking in tenth Code availability grade to be small because not all students were eligible for advanced mathematics Syntax can be found at https://osf.io/r82dw/ or by contacting [email protected]. and not all schools allow students to change course pathways. However, some edu. students might have made their way into more advanced mathematics classes or remained in an advanced pathway rather than dropping to an easier pathway. 45. Yeager, D. S. The National Study of Learning Mindsets, [United States], These challenge-seeking decisions are potentially relevant to both lower- and 2015–2016 (ICPSR 37353). 10.3886/ICPSR37353.v1 (2019). higher-achieving students, so we explored them in the full sample of students in 46. Tipton, E., Yeager, D. S., Iachan, R. & Schneider, B. 
in Experimental Methods in the 41 included schools. Survey Research: Techniques that Combine Random Sampling with Random Analysis methods. Overview. We used intention-to-treat analyses; this means Assignment (ed. Lavrakas, P. J.) (Wiley, 2019). 47. Tipton, E. How generalizable is your experiment? An index for comparing that data were analysed for all students who were randomized to an experimental experimental samples and populations. J. Educ. Behav. Stat. 39, 478–501 condition and whose outcome data could be linked. A complier average causal (2014). effects analysis yielded the same conclusions but had slightly larger effect sizes 48. Robins, R. W. & Pals, J. L. Implicit self-theories in the academic domain: (see Supplementary Information section 9). Here we report only the more con- implications for goal orientation, attributions, affect, and self-esteem change. servative intention-to-treat effect sizes. Standardized effect sizes reported here Self. Identity 1, 313–336 (2002). 49. Paluck, E. L. Reducing intergroup prejudice and conflict using the media: a field were standardized mean difference effect sizes and were calculated by dividing the experiment in Rwanda. J. Pers. Soc. Psychol. 96, 574–587 (2009). treatment effect coefficients by the raw standard deviation of the control group for the outcome, which is the typical effect size estimate in education evaluation Acknowledgements This manuscript uses data from the National Study experiments. Frequentist P values reported throughout are always from two-tailed of Learning Mindsets (principal investigator, D.Y.; co-investigators: R.C., hypothesis tests. C.S.D., C.M., B.S. and G.M.W.; https://doi.org/10.3886/ICPSR37353.v1). The programme and surveys were administered using systems and processes Model for average treatment effects. 
Analyses to estimate average treatment effects developed by the Project for Education Research That Scales (PERTS (https:// for an individual person used a cluster-robust fixed-effects linear regression model www.perts.net/); principal investigator, D.P.). Data collection was carried out with school as fixed effect that incorporated weights provided by statisticians from by an independent contractor, ICF (project directors, K.F. and A.R.). Planning ICF, with cluster defined as the primary sampling unit. Coefficients were therefore meetings were hosted by the Mindset Scholars Network at the Center for generalizable to the population of inference, which is students attending regular Advanced Study in the Behavioral Sciences (CASBS) with support from a public schools in the United States. For the t distribution, the degrees of freedom grant from Raikes Foundation to CASBS (principal investigator, M. Levi), and the study received assistance or advice from M. Shankar, T. Brock, C. Bryan, is 46, which is equal to the number of clusters (or primary sampling units, which C. Macrander, T. Wilson, E. Konar, E. Horng, J. Axt, T. Rogers, A. Gelman, was 51) minus the number of sampling strata (which was 5)45. H. Bloom and M. Weiss. We are grateful for feedback on a preprint from L. Quay, Model for the heterogeneity of effects. To examine cross-school heterogeneity in the D. Bailey, J. Harackiewicz, R. Dahl, A. Suleiman and M. Greenberg. Funding treatment effect among lower-achieving students, we estimated multilevel mixed was provided by the Raikes Foundation, the William T. 
Grant Foundation, the effects models (level 1, students; level 2, schools) with fixed intercepts for schools Spencer Foundation, the Bezos Family Foundation, the Character Laboratory, and a random slope that varied across schools, following current recommended the Houston Endowment, the Yidan Prize for Education Research, the National Science Foundation under grant number HRD 1761179, a personal gift from practices39. The model included school-centred student-level covariates (prior A. Duckworth and the President and Dean of Humanities and Social Sciences performance and demographics; see the Supplementary Information section 7) at Stanford University. Preparation of the manuscript and the development to make site-level estimates as precise as possible. This analysis controlled for of the analytical approach were supported by National Institute of Child school-level average student racial/ethnic composition and its interaction with Health and Human Development (10.13039/100000071 R01HD084772), the treatment status variable to account for confounding of student body racial/ P2C-HD042849 (to the Population Research Center (PRC) at The University of ethnic composition with school achievement levels. Student body racial/ethnic Texas at Austin). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health composition interactions were never significant at P < 0.05 and so we do not dis- and the National Science Foundation. cuss them further (but they were always included in the models, as pre-registered). Bayesian robustness analysis. A final pre-registered robustness analysis was con- Author contributions D.S.Y. conceived the study and led the design, analysis ducted to reduce the influence of two possible sources of bias: awareness of study and writing; C.S.D. 
was involved in every phase of the study, particularly the conception of the study, the study design, the preparation of intervention materials, the interpretation of analyses and the writing of the manuscript; G.M.W. co-conceived the study, contributed to intervention material design and assisted with the interpretation of analyses and writing of the manuscript. P.H. contributed to the design of the study and the conceptualization of the moderators, co-developed the analysis plan, carried out statistical analyses, developed the Supplementary Information and assisted with the writing of the manuscript; R.C. and C.M. contributed to study design, analysis and writing of the paper, especially with respect to sociological theory about context effects; E.T., R.I., D.S.Y. and B.S. developed the school sampling plan; C.R. co-developed the intervention content; D.P. and PERTS developed the intervention delivery and survey data collection software; K.F., A.R., J.T., and R.I. led data collection and independently cleaned and merged raw data prior to access by the analysts; J.S.M., C.M.C. and P.R.H. executed a blinded analysis of the data and contributed to the writing of the paper; C.S.H., A.L.D. and D.S.Y. co-developed the behavioural norms measure; M.G., P.M., J.B. and S.Y.H. contributed to data analysis; R.F. contributed to study design.

hypotheses when conducting analyses and misspecification of the regression model (see the Supplemental Information, section 13, p. 12). Statisticians who were not involved in the study design and unaware of the moderation hypotheses re-analysed a blinded dataset that masked the identities of the variables. They did so using an algorithm that has emerged as a leading approach for understanding

Competing interests The authors declare no competing interests for this study. Several authors have disseminated growth mindset research to public audiences and have complied with their institutional financial interest disclosure requirements; currently no financial conflicts of interest have been identified. Specifically, D.P. is the co-founder and executive director at PERTS, an institute at Stanford University that offers free growth mindset interventions and measures to schools, and authors D.S.Y., C.S.D., G.W., A.L.D., D.P., and C.H. have disseminated findings from research to K-12 schools, universities, non-profit entities, or private entities via paid or unpaid speaking appearances or consulting. None of the authors has a financial relationship with any entity that sells growth mindset products or services.

Additional information
Supplementary information is available for this paper at https://doi.org/10.1038/s41586-019-1466-y.
Correspondence and requests for materials should be addressed to D.S.Y. or P.H.
Peer review information Nature thanks Eric Grodsky, Luke Miratrix and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Reprints and permissions information is available at http://www.nature.com/reprints.

Extended Data Fig. 1 | The finding that the growth mindset effect on GPA is positive among lower-achieving students is robust to deviations from the pre-registered statistical model. a, b, Each estimate represents an unstandardized treatment effect on GPA (on a 0 to 4.3 scale) estimated in separate fixed-effects regression models with school as a fixed effect. Most of the alternative specifications were known to produce less-valid tests of the hypothesis, but some of them required fewer subjective judgments and so it was informative to show that the main conclusion of a positive treatment effect was supported even with a suboptimal model specification. Examples include revising the core GPA outcome to include non-core classes such as speech, debate or electives (because this does not involve coding of core classes; see ‘Includes Non-Core Courses’), or revising the post-treatment marking period to include pre-treatment data in cases in which schools implemented the intervention in the Spring (because this does not involve coding pre- and post-treatment marking periods; see ‘Includes Some Pre-Treatment GPA’). a, The effects of changing just one or two model specifications at a time while leaving the rest of the pre-registered model specifications the same. Open circles represent the pre-registered definition of lower-achieving students (below the school-specific median), and filled dots represent the alternative definition of lower-achieving students (below the school-specific median and below a 3.0 GPA out of 4.3). b, A histogram of all possible combinations of the alternative model specifications that shows that effects are uniformly positive. Note that the treatment effect estimates on the far left of b are from clearly less-valid models; for example, they insufficiently control for prior achievement, they drop participants with missing data, or they do not use survey weights (so results are not representative and therefore do not answer our research questions). Panels a and b both show that even exercising all of these degrees of freedom in a way that could obscure true treatment effects still yields positive point estimates. Further explanations of why the alternatives were not selected for the pre-registration are included in Supplementary Information section 7.3.

Extended Data Fig. 2 | The growth mindset intervention effect in a given school is almost always positive, although there is significant heterogeneity across schools. a, b, Mindset treatment effects on core course GPAs (a) and mathematics/science GPAs (b). Estimates were generated using the pre-registered linear mixed-effects model (see Supplementary Information section 7, RQ3). Note that the treatment effect at any individual school is likely to have a very wide confidence interval even when there is a true positive effect, owing to small sample sizes for each school on its own. Therefore, as with any multi-site trial, effects of individual schools are not expected to be significantly different from zero even though the average treatment effect is significantly different from zero. The plotted treatment effects were estimated in an unconditional model with no cross-level interactions (that is, without consideration of the potential moderators) and so the points are shrunken towards the sample mean. Thus, these plotted estimates do not correspond to the estimated CATEs reported in the paper or in Extended Data Table 3.

Extended Data Fig. 3 | A BCF analysis reproduces the same pattern of moderation by norms as the pre-registered linear mixed-effects model. The BCF analysis uses a nonparametric Bayesian model designed to shrink effect sizes to see if any effect can update a relatively strong prior centered on null effects and biased toward low degrees of treatment effect moderation. a, b, Data points correspond to school-level treatment effects estimated by the pre-registered linear mixed-effects model (a) or the BCF model (b). Treatment effects refers to the difference between the treatment and control groups in terms of mathematics/science GPAs at the end of ninth grade in a school, adjusting for pre-random-assignment covariates and including survey weights. The models included three school-level moderators of the student-level randomized treatment: the achievement level (categorical, dummies for low and high, medium group as the reference category in the linear model), the behavioural growth mindset norms (continuous) and the percentage of racial or ethnic minority students (continuous) of the schools. School-level treatment effects include the fitted values plus the model-estimated, school-specific random effect. Challenge-seeking behavioural norm refers to the average number of challenging mathematics problems (out of 8) chosen by students in the control group in a given school. N, the number of lower-achieving students in a school. Percent minority, the percentage of students who identify as black, African-American, Hispanic, Latino or indigenous American, split at the school-level population median (26% of the student body of the school). The dashed lines represent the estimated intercept and slope for the linear trend of the estimated treatment effects in norms. b, The coloured lines represent LOESS smoothing curves for the trend in norms of the estimated treatment effects, fitted to the estimated school-level treatment effects within achievement groups and weighted by school sample size. The area between the vertical lines is the interquartile range (IQR) of norms, where neither model is extrapolating. The two models agree broadly about average effects, particularly within the IQR of norms, while BCF estimated somewhat lower degrees of heterogeneity and extrapolates in a fundamentally different fashion at the extremes of norms (since it is a nonlinear model). Recall that BCF is designed to shrink toward an overall effect size of zero, and to shrink CATEs of similar schools towards one another, in order to avoid over-fitting the data. Unlike the preregistered linear models, BCF was specified with no prior hypotheses about the functional form of moderation (nonlinearities and/or interactions between multiple moderators and treatment) so this shrinkage is necessary to obtain stable estimates of treatment effects. However, it does lead to smaller estimates of effect sizes and a lower estimated degree of moderation relative to the preregistered linear mixed effects model.
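The captions for Extended Data Figs. 2 and 3 both rely on the idea that school-level estimates are shrunken toward a common value. The sketch below illustrates that idea with a simple normal-normal partial-pooling formula. It is not the authors' pre-registered mixed-effects model and it is not BCF; the function name `shrink_school_effects`, the between-school variance `tau2` and all numbers are hypothetical and chosen only for illustration.

```python
def shrink_school_effects(estimates, std_errors, tau2):
    """Partially pool per-school effect estimates toward their weighted mean.

    estimates  : raw treatment-effect estimate for each school
    std_errors : standard error of each raw estimate (must be > 0)
    tau2       : assumed between-school variance of true effects (hypothetical input)
    """
    # Precision weights: schools with noisier estimates count for less.
    weights = [1.0 / (se ** 2 + tau2) for se in std_errors]
    grand_mean = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)

    shrunken = []
    for est, se in zip(estimates, std_errors):
        # Fraction of the school's own deviation from the mean that is retained.
        retained = tau2 / (tau2 + se ** 2)
        shrunken.append(grand_mean + retained * (est - grand_mean))
    return grand_mean, shrunken


# Hypothetical numbers for illustration only; they are not from the study.
raw_effects = [0.30, -0.10, 0.05, 0.20]
std_errors = [0.20, 0.25, 0.05, 0.10]
grand_mean, pooled = shrink_school_effects(raw_effects, std_errors, tau2=0.01)

for raw, se, post in zip(raw_effects, std_errors, pooled):
    print(f"raw={raw:+.2f}  se={se:.2f}  shrunken={post:+.3f}")
```

In this sketch, a school with a large standard error keeps only a small share of its deviation from the overall mean, which is why noisy single-school estimates can look less extreme after pooling. In a fitted mixed-effects model the between-school variance would be estimated from the data rather than supplied by hand as it is here.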
Extended Data Table 1 | Growth mindset effects were of a similar magnitude across subject areas
P values are from two-sided hypothesis tests. Confirming the predictions in the pre-analysis plan, higher-achieving students demonstrated no significant treatment effects on core course GPAs (B = 0.01 grade points, 95% confidence interval = −0.03 to 0.06, s.e. = 0.02, n = 6,170, k = 65, t = 0.480, P = 0.634, standardized mean difference effect size = 0.01), resulting in a significant intervention × lower-achiever interaction (B = 0.09 grade points, 95% confidence interval = 0.01 to 0.17, s.e. = 0.04, n = 12,490, k = 65, t = 2.179, P = 0.034). This result replicates previous research and supports the pre-registered decision to examine average GPA effects only among lower-achieving students, as higher-achieving students may have already had habits (for example, turning in work on time) and environments (for example, supportive family, teachers or peer groups) that fostered high GPAs even in the control condition.

Extended Data Table 2 | Moderating effects of school achievement level and school norms
Values are from the pre-registered model (see Supplementary Information section 7, RQ4). Moderator estimates with the same model number and letter combination were obtained from the same regression model. P values are from two-tailed hypothesis tests.

Extended Data Table 3 | CATEs are largest for medium-achieving schools with supportive norms
CATEs are the average differences between the randomly assigned intervention and control groups in terms of GPA or D/F average rates, for a given set of schools. 95% CI, 95% confidence interval. P values are from two-tailed hypothesis tests. Norms refers to behavioural challenge-seeking norms, as measured by the responses of the control group to the make-a-math-worksheet task.
Standardized effect sizes for GPA are essentially identical to the unstandardized effect sizes because the standard deviation of GPA is approximately 1. The estimates were generated from the pre-registered linear mixed-effects regressions (equations provided in Supplementary Information section 7) that used survey weights provided by the research company to make estimates generalizable. The models included three school-level moderators of the student-level randomized treatment: the achievement level (categorical, dummies for low and high, medium group omitted), the behavioural growth mindset norms (continuous) and the percentage of racial or ethnic minority students (continuous) of the school. To define the school achievement levels for presentation of school subgroup effects, we followed the analysis plan. The pre-analysis plan did not include a method for post-estimation summarization of the effects of the continuous norms, so the table uses a prominent default: a split at the population median. The full, continuous norms variable was used to estimate the model, so the choice of the median split cut-off point did not affect the estimation of the regression coefficients. Grey shaded columns indicate the subgroup that was expected to have the largest effects in the pre-registered analysis plan. a, School achievement level subgroups for core course GPAs. b, School achievement level subgroups for reduction in rates of D/F averages in core course GPAs. c, School achievement level subgroups for GPAs of only mathematics and science.

nature research | reporting summary

Corresponding author(s): David S. Yeager ([email protected]), Paul Hanselman ([email protected])

Reporting Summary
Nature Research wishes to improve the reproducibility of the work that we publish. This form provides structure for consistency and transparency in reporting. For further information on Nature Research policies, see Authors & Referees and the Editorial Policy Checklist.
Statistical parameters
When statistical analyses are reported, confirm that the following items are present in the relevant location (e.g. figure legend, table legend, main text, or Methods section).
n/a Confirmed
- The exact sample size (n) for each experimental group/condition, given as a discrete number and unit of measurement
- An indication of whether measurements were taken from distinct samples or whether the same sample was measured repeatedly
- The statistical test(s) used AND whether they are one- or two-sided. Only common tests should be described solely by name; describe more complex techniques in the Methods section.
- A description of all covariates tested
- A description of any assumptions or corrections, such as tests of normality and adjustment for multiple comparisons
- A full description of the statistics including central tendency (e.g. means) or other basic estimates (e.g. regression coefficient) AND variation (e.g. standard deviation) or associated estimates of uncertainty (e.g. confidence intervals)
- For null hypothesis testing, the test statistic (e.g. F, t, r) with confidence intervals, effect sizes, degrees of freedom and P value noted. Give P values as exact values whenever suitable.
- For Bayesian analysis, information on the choice of priors and Markov chain Monte Carlo settings
- For hierarchical and complex designs, identification of the appropriate level for tests and full reporting of outcomes
- Estimates of effect sizes (e.g. Cohen's d, Pearson's r), indicating how they were calculated
- Clearly defined error bars. State explicitly what error bars represent (e.g. SD, SE, CI)
Our web collection on statistics for biologists may be useful.

Software and code
Policy information about availability of computer code
Data collection: Data were collected via the Qualtrics survey platform.
Data analysis: Data were analyzed in R and Stata.
All syntax files are stored on a secure server with the raw data, housed at the University of Texas at Austin Population Research Center.
For manuscripts utilizing custom algorithms or software that are central to the research but not yet described in published literature, software must be made available to editors/reviewers upon request. We strongly encourage code deposition in a community repository (e.g. GitHub). See the Nature Research guidelines for submitting code & software for further information.

Data
Policy information about availability of data
All manuscripts must include a data availability statement. This statement should provide the following information, where applicable:
- Accession codes, unique identifiers, or web links for publicly available datasets
- A list of figures that have associated raw data
- A description of any restrictions on data availability
Data, syntax, and documentation are available to researchers who agree to terms of data use, including analysis on a secure server and prohibitions against any analysis that risks exposing the identity of participating students (i.e., deductive disclosure).

Field-specific reporting
Please select the best fit for your research. If you are not sure, read the appropriate sections before making your selection.
- Life sciences
- Behavioural & social sciences
- Ecological, evolutionary & environmental sciences
For a reference copy of the document with all sections, see nature.com/authors/policies/ReportingSummary-flat.pdf

Behavioural & social sciences study design
All studies must disclose on these points even when the disclosure is negative.
Study description: This study involves secondary data analysis of an intervention evaluation conducted on behalf of schools in the U.S. The program