Strengthening Teacher Support for Math Learning Outcomes (El Salvador 2019)
Document Details
Universidad Francisco Gavidia
2022
Takao Maruyama
Summary
This study examines a structured pedagogy program in El Salvador aimed at improving mathematics learning outcomes in lower secondary education. The two-year experiment evaluated the impact of additional interventions on teacher support and student learning. The program included material distribution and teacher training, and the study used item response theory to measure learning gains. Additional interventions resulted in a statistically significant gain in the first year, but this effect did not accumulate in the second year.
Full Transcript
International Journal of Educational Research 115 (2022) 101977
Journal homepage: www.elsevier.com/locate/ijedures
Strengthening Support of Teachers for Students to Improve Learning Outcomes in Mathematics: Empirical Evidence on a Structured Pedagogy Program in El Salvador
Takao Maruyama *
Graduate School for Humanities and Social Sciences, Hiroshima University, 1-5-1, Kagamiyama, Higashi-Hiroshima, Hiroshima 739-8529 Japan
Keywords: Educational development; Mathematics learning; Lower secondary education, Latin America; Impact evaluation
Abstract
The learning crisis in lower secondary education is profound. Evidence suggests that a structured pedagogy program that combines the distribution of teaching and learning materials with different interventions is an effective approach. Given a variety of possible combinations of interventions in a program, this study conducted a two-year-long experiment on additional interventions in a program for mathematics in El Salvador. The distribution of mathematics tests and workbooks was added to the program to strengthen the support of teachers for students. The average one-year impact on mathematics learning is estimated at 0.17 standard deviations. The impact remained positive but was no longer statistically significant in the second year of research, when the difference in interventions between the treatment and control groups disappeared.
1. Introduction
Approximately 617 million primary and lower secondary school-age children worldwide are not reaching the minimum proficiency levels in reading and mathematics (UNESCO, 2017). Recent debates on educational development have focused on the learning crisis in primary education (Pritchett and Beatty, 2015; World Bank, 2019); however, the crisis at the lower secondary education level is equally profound.
Worldwide, more than half of lower secondary school-age children are not acquiring minimum proficiency, and most are in low- and lower–middle income countries (UNESCO, 2017). The current status is far from achieving the Sustainable Development Goals (SDGs) target of at least minimum proficiency levels in reading and mathematics. In Latin America, although enrollment in lower secondary education has expanded since the 2000s, the quality of education has stagnated over the years (Angrist et al., 2021; UNESCO, 2021). Through cross-country study, Hanushek and Woessman (2012; 2015; 2016) demonstrated that a low level of educational achievement was a source of poor growth performance in Latin America. Mathematics is the foundation for science, engineering, and technology (OECD, 2021), which are key drivers of a country’s socioeconomic development; however, approximately 60 percent of lower secondary school-age children are not mastering the minimum proficiency level of mathematics in Latin America (UNESCO, 2017). Through lesson observations in several Latin American countries, Bruns and Luque (2015) argued that poor student learning results could be directly linked to the failure of teachers to keep students engaged in learning.
In El Salvador, the learning crisis is alarming. The Trends in International Mathematics and Science Study (TIMSS) 2007 revealed that only 20 percent of 8th-grade students (the second grade of lower secondary school) reached the minimum proficiency level of mathematics at the grade level (Mullis et al., 2009). As reported in Section 2.2, this study’s baseline survey of 7th-grade students indicated that the situation remained the same. For example, approximately 60 percent of the 7th-grade students from the sampled schools could not correctly answer a two-digit subtraction problem with borrowing at the beginning of the school year (Koei Research & Consulting Inc., 2018).
A systematic review of evidence in education suggests that a structured pedagogy program that provides teaching and learning materials and other types of interventions such as teacher training is an effective approach to improving student learning (Snilsveit et al., 2016). To address several challenges regarding the improvement of learning (e.g., inadequately trained teachers and a lack of appropriate materials, curricula, and instructional approaches), a structured pedagogy program includes different types of components, such as the distribution of teaching and learning materials, teacher training, and instructional support from school principals (Snilsveit et al., 2016). However, given an extensive variety of possible combinations of interventions, systematic reviews have not clearly demonstrated the effective combination of different interventions for a structured pedagogy program, which would be of interest to policymakers (Piper et al., 2018). It is also necessary to consider possible complementarities among different inputs in a package of interventions (Kerwin and Thornton, 2020). The experiment of this study in El Salvador thus intends to bridge the gap in understanding regarding the combination of different components in a structured pedagogy program at the lower secondary education level.
* Corresponding author at: Hiroshima University: Hiroshima Daigaku, Japan. E-mail address: [email protected]. https://doi.org/10.1016/j.ijer.2022.101977 Received 7 January 2022; Received in revised form 23 February 2022; Accepted 7 April 2022; Available online 6 June 2022. 0883-0355/© 2022 The Author. Published by Elsevier Ltd. This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
To improve student mathematics achievement, the Ministry of Education in El Salvador implemented the “Project for the Improvement of Mathematics Teaching in Primary and Secondary Education” (hereinafter “ESMATE project”), with technical cooperation from the Japan International Cooperation Agency (JICA). The ESMATE project first developed a set of teaching and learning materials, that is, teachers’ guides, textbooks (ESMATE textbooks), and workbooks.1 The Ministry then combined the provision of these materials with other types of interventions, including introductory training for school principals, teachers, and representatives of parent associations and periodic mutual review meetings of teachers, as a structured pedagogy program (hereinafter referred to as the “ESMATE program”).2 Although the Ministry of Education scaled up the ESMATE program nationwide for grades 7 to 9 (the lower secondary education level),3 it could not cover several interventions in the program due to budget constraints. This study focused on the interventions that were not covered in the nationwide scaling-up and investigated how the additional interventions could enhance the impact of the program on mathematics learning of grade 7 students using a randomized controlled trial.4
Additional interventions in the ESMATE program aimed to enhance the support of teachers for students to improve their mathematics learning in two ways. First, the Ministry provided teachers with mathematics tests for students. In mutual review meetings during inter-semester periods, teachers would review and discuss student test results with colleagues so that they would be more conscious of how much students learned. The Ministry also provided initial on-site advice to school principals regarding mathematics lesson observation to increase the frequency of their lesson observations and suggestions (feedback) to teachers. The interventions aimed to improve teachers’ teaching practices in their mathematics lessons.
Second, in addition to the textbook, mathematics workbooks were distributed to students in the treatment group to strengthen their study at home. As the content of the workbook corresponded to the ESMATE textbook, teachers could use it for homework assignments. This study makes three main contributions to the literature on educational development. First, it evaluated the accumulated impact of first-year additional interventions on student mathematics learning in the following year. Most previous studies conducted in the context of developing countries have focused on the impact of interventions on learning outcomes right after the intervention was completed (Evans and Popova, 2016). However, since students continue learning for years, it is critical to see whether they can advance their learning after receiving the intervention. This study thus tracked the same students for two years. In 2019 (year 2 of the study), both groups received the same interventions from the Ministry. The Ministry continuously provided a set of mathematics textbooks and student workbooks. To capture the student learning progress in the two years, this study used item response theory (IRT) to link the mathematics test scores at the end of the 7th grade and the 8th grade. In year 1 of the study, the additional components in the ESMATE program improved student mathematics learning by 0.17 standard deviations of the IRT scores, which was statistically significant at the 1 percent level. Conversely, the accumulated impact in the following year was positive but not statistically significant. The gains in the first year were thus not sufficient to yield an accumulated impact on mathematics learning in the following year. Second, this study examined how the additional interventions strengthened the effectiveness of a structured pedagogy program for lower secondary education by reviewing the impacts on teaching practices and student study at home. Piper et al. 
(2018) investigated the essential components of a structured pedagogy program in literacy and numeracy by comparing different combinations of interventions.
1 The textbooks, workbooks, and teachers’ guidebooks that the ESMATE project developed are listed on the website of the Ministry of Education, Science and Technology in El Salvador: https://www.mined.gob.sv/esmate/. Appendix 1 of Maruyama and Kurosaki (2021) outlines the textbook’s page structure and the relation among the different teaching and learning materials.
2 The ESMATE program was developed for grades 1 to 11. While this study focused on grade 7 (grade 1 in lower secondary education), another study investigated the impact of the ESMATE program on grade 2 students in primary education (Maruyama and Kurosaki, 2021). In this experiment, the entire package of the ESMATE program, including similar additional interventions examined in this paper, such as the distribution of math tests, was provided for the treatment group. The control group in the experiment for grade 2 students did not receive any intervention. The result of the experiment demonstrated that the whole package of the ESMATE program for grade 2 students in primary education improved math learning outcomes by approximately 0.49 standard deviations (Maruyama and Kurosaki, 2021).
3 In El Salvador, primary education (grades 1 through 6) and lower secondary education (grades 7 through 9) are compulsory.
4 The randomized controlled trial was conducted with agreement between the Ministry of Education in El Salvador and JICA in June and October 2017. The survey and database construction were performed by Koei Research & Consulting Inc. under a contract with JICA, in collaboration with the Ministry of Education in El Salvador. Hitotsubashi University Research Ethics Examination Committee reviewed the research plan.
They revealed that it was effective to add distributions of student textbooks and structured lesson plans for teachers to the package of interventions composed of teacher training and coaching. Piper et al. (2015) also demonstrated how strengthening coaching in a structured pedagogy program enhanced student learning in literacy and numeracy. Although evidence for the structured pedagogy program is increasing at the primary education level, studies focusing on effective combinations of interventions in the structured pedagogy program are scarce at the lower secondary-school level.
Third, this study analyzed the heterogeneity of the first-year impact with respect to student household economic status. A disparity in learning outcomes in terms of student household economic status is widely observed in developing countries (Akmal and Pritchett, 2021). In El Salvador, the economic inequality level has historically been high. The Gini index was 54.5 in 1998, and it gradually decreased with the country’s modest economic growth, reaching 38.6 in 2018 (World Bank, 2021). Based on the baseline data for different types of student household assets, a composite index that represents student household economic status was constructed by principal component analysis. This study presents the economic disparity in learning outcomes of mathematics in El Salvador and investigates the heterogeneous impact of the additional interventions with respect to household economic status. Even after controlling for the heterogeneous impact by the baseline score level, the impact of additional components in the ESMATE program was larger for students with lower household economic status.
The remainder of this paper is organized into the following five sections. Section 2 describes the experimentation design and content of the ESMATE program. Section 3 presents the impacts of the additional components of the ESMATE program on student mathematics learning.
In Section 4, the impacts on teaching practices and student study at home are examined. Section 5 discusses the findings, and Section 6 concludes. 2. Experimental design 2.1. Content of the ESMATE program In 2018, the Ministry of Education in El Salvador provided the schools in the control group with a package of interventions to improve students’ mathematics learning. The primary component of the package was the distribution of the ESMATE textbooks and teachers’ guidebooks. The ESMATE textbooks were designed to help students learn mathematics by solving mathematics problems. Each page included four lesson steps: (1) show the lesson topic, (2) pose example problems, (3) explain the general principle, and (4) provide exercises, so that teachers could organize daily mathematics lessons according to these four steps. The package also included other components, namely introductory training for teachers and school principals regarding how to use the textbook, and mutual review meetings among teachers to improve teaching with the ESMATE textbook. Introductory training for representatives of parent associations was also provided to strengthen family support for students’ study at home. For the treatment group, the Ministry provided supplementary interventions to the abovementioned package to strengthen the impact on students’ mathematics learning in two ways. First, the Ministry distributed mathematics tests to teachers to help them check students’ understanding. At the inter-semester mutual review meetings of teachers, the students’ test results were reviewed and discussed with colleagues.5 The Ministry also provided initial on-site advice for school principals regarding mathematics lesson observation to increase the frequency of their lesson observations and suggestions for teachers. 
In the traditional approach to teaching mathematics in El Salvador, teachers explained math problems through examples; however, they did not pay attention to how much the students learned (JICA, 2019). The interventions aimed to improve teaching practices by enhancing teachers’ awareness of how much students learned in mathematics lessons. Second, in addition to the textbook, mathematics workbooks that corresponded to the ESMATE textbook were also distributed to students in the treatment group. Since teachers could use the workbooks to assign homework regularly, the distribution of the workbooks would increase the frequency of homework assignments.
At the beginning of the school year in January 2018, the ESMATE textbooks and teachers’ guidebooks were distributed to the treatment group. The distribution for the control group was delayed by approximately one month due to a delay in the Ministry’s public procurement process. In the control group, the schedule for introductory training for teachers, school principals, and representatives of parent associations was also postponed for approximately one month, corresponding with the delayed textbook distribution. In 2019 (year 2 of this study), the Ministry provided a set of teaching and learning materials, including workbooks for both groups, and organized a mutual review meeting of teachers.6
5 The grade 7 mathematics curriculum included eight units, and the Ministry developed two types of mathematics tests: one for checking student understanding by unit, and the other for checking student understanding by semester. In year 1 of this study, for the control group, the Ministry distributed both tests to schools; however, the number of copies distributed to each school was limited: only one copy of each test by unit in the mathematics curriculum in total, and 10 copies of mathematics tests by semester in total. On the other hand, in the treatment group, the Ministry distributed the necessary number of copies of tests for all students in grade 7 in the selected schools.
6 The ESMATE project was completed at the end of June 2019. While the Ministry distributed math textbooks and workbooks for schools in 2019, it did not distribute teachers’ guides and math tests. Teachers used teacher guides for grade 8 that were stored at schools, or they referred to the guide on the Ministry’s website. As math tests were included in the teachers’ guides, they could copy it.
Fig. 1. Boxplots of the baseline test scores by student household economic status. Note. The data source is baseline survey of this study. The number of students in the low, medium, and high household economic status is 534, 1317, and 472, respectively in the treatment group, and 484, 1273, and 433, respectively in the control group.
2.2. Assessment of learning outcomes in mathematics
The baseline survey was conducted from January to March 2018, the end-line survey from September to October 2018, and a follow-up survey from September to October 2019. To assess learning outcomes in mathematics, students took written tests in all three rounds of surveys. Survey teams administered the tests without the presence of teachers in the classroom. The test items at the baseline survey were derived from the math content taught in primary education. The baseline test results demonstrated that grade 7 students did not master the foundations of mathematics, including the four basic operations. For example, only 42 percent of the students could correctly answer the item “43 – 17.” Regarding the item “2 × 2 × 2”, 34 percent of students confused multiplication with addition, answering “6.” For the division item “612 ÷ 102,” half of the students did not write any response.
To account for progress that could be gained by following the curriculum, the test items differed across surveys. The test items at the end-line survey were derived from the mathematics content taught in grade 7. There were 20 test items at the baseline and end-line surveys respectively, and they included problems that were posed in the texts. The test items in the follow-up survey were derived mainly from mathematics content taught in grade 8. The test for the follow-up survey comprised 25 different items, five of which were the same as in the end-line survey. The written test for each survey round included items to measure learning outcomes in mathematics by cognitive skills (i.e., knowing, applying, and reasoning) and by cognitive domains (i.e., number and operation, function, and geometry). The composition of test items is presented in Tables A.1 to A.3 in Appendix A. Based on the five common mathematics items in the end-line and follow-up surveys, the test scores in the two rounds were linked using IRT. The process of linking test scores is presented in Appendix B.
The baseline survey collected information regarding different types of student household assets through student interviews. A composite index of the different types of household assets was computed using principal component analysis. The first principal component accounted for approximately 25 percent of the overall variance of the variables of different types of household assets, and all their loading coefficients were positive. Therefore, the first principal component was taken as the index of student household economic status. According to the first principal component, the samples of students were divided into one of three levels: low, medium, and high household economic status.7 Details of the principal component analysis are presented in Appendix C. As shown in Fig. 1, student baseline test scores modestly correlated with household economic status.
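The index construction described above can be illustrated with a short Python sketch. It computes a first-principal-component index from a synthetic matrix of binary asset indicators and splits students into three groups. The synthetic data, the function names, and the 25th and 80th percentile cutoffs are assumptions made only for this illustration; the study's own procedure and cutoffs are those documented in its Appendix C, which is not reproduced here.

```python
import numpy as np


def asset_index(assets):
    """First principal component of standardized asset indicators.

    Returns (scores, loadings, share), where share is the fraction of
    total variance explained by the first component.
    """
    # Standardize each asset column so that PCA runs on the correlation matrix.
    z = (assets - assets.mean(axis=0)) / assets.std(axis=0)
    corr = np.corrcoef(z, rowvar=False)
    # eigh returns eigenvalues in ascending order, so the last one is the largest.
    eigvals, eigvecs = np.linalg.eigh(corr)
    loadings = eigvecs[:, -1]
    # The sign of an eigenvector is arbitrary; orient it so more assets mean a higher index.
    if loadings.sum() < 0:
        loadings = -loadings
    scores = z @ loadings
    share = eigvals[-1] / eigvals.sum()
    return scores, loadings, share


def split_levels(scores, low_cut, high_cut):
    """Label each score as low, medium, or high (cutoffs are hypothetical)."""
    return np.where(scores < low_cut, "low",
                    np.where(scores > high_cut, "high", "medium"))


# Synthetic example: 500 students, 9 binary assets driven by one latent factor.
rng = np.random.default_rng(0)
latent = rng.normal(size=500)
thresholds = np.linspace(-1.0, 1.0, 9)
assets = ((latent[:, None] + rng.normal(size=(500, 9))) > thresholds).astype(float)

scores, loadings, share = asset_index(assets)
low_cut, high_cut = np.percentile(scores, [25, 80])  # assumed cutoffs
levels = split_levels(scores, low_cut, high_cut)
print(f"variance share of first component: {share:.2f}")
```

Checking that every entry of `loadings` is positive, as the paper reports for its own data, is a quick way to confirm that the first component can be read as a general wealth dimension.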
The follow-up survey additionally conducted student interviews to collect information regarding parents’ educational backgrounds. Overall, students with lower household economic status received less support from their families for studying at home. The mothers’ educational attainment positively correlated with household economic status (see Fig. G.1 in Appendix G), and mothers with higher educational attainment were shown to support student study at home more frequently in a week (see Fig. G.2).
2.3. Sampling
This study selected the 7th grade from the lower secondary education level and evaluated the impact of the additional interventions of the ESMATE program by a randomized controlled trial. Among the 14 departments, Cabañas, La Union, San Miguel, and San Vicente, situated in the central and eastern parts of El Salvador, were selected. The educational outcomes such as enrollment and dropout rates in those four departments were close to or below the national averages (see Table 1).
7 The average percentage of students from low economic households per school was approximately 25 percent, and the standard deviation was 21 percentage points. The average percentage of students from medium economic households per school was approximately 58 percent, and the standard deviation was 18 percentage points.
Table 1. Educational statistics in four departments targeted in this study.
Indicator | National Average | Cabañas | La Union | San Miguel | San Vicente
Primary net enrollment rate (2015) | 86.2% | 89.0% | 81.2% | 85.7% | 85.7%
Primary repetition rate (2014) | 5.8% | 6.7% | 5.5% | 5.4% | 7.7%
Primary dropout rate (2014) | 6.4% | 9.8% | 8.5% | 6.7% | 7.7%
Secondary net enrollment rate (2015) | 37.9% | 25.4% | 25.9% | 35.5% | 38.5%
Secondary repetition rate (2014) | 4.9% | 3.7% | 4.9% | 4.2% | 4.3%
Secondary dropout rate (2014) | 8.5% | 12.4% | 11.5% | 7.1% | 8.0%
Source: The Ministry of Education in El Salvador.
Table 2. Sampling frame of schools in the four departments.
Category | Cabañas | La Union | San Miguel | San Vicente | Total
(A) N. of public schools (lower secondary) | 105 | 146 | 248 | 113 | 612
(B) Schools with both cycle 1 and cycle 3 in (A) | 104 | 144 | 247 | 111 | 606
(C-1) Schools without difficulty in access or security in (B) | 64 | 68 | 164 | 105 | 401
(C-2) Schools not targeted by the MCC program (Sampling frame) | 64 | 49 | 151 | 105 | 369
(D-1) Sampled schools (Treatment) | 22 | 16 | 51 | 36 | 125
(D-2) Sampled schools (Control) | 21 | 17 | 51 | 36 | 125
Public schools in El Salvador are called “basic education public schools,” and according to local educational needs, they can include preschool, primary, and lower and upper secondary school levels. Within the four selected departments in this study, 612 basic education public schools offered lower secondary education, of which 606 had cycle 1 (grades 1 through 3) and cycle 3 (grades 7 through 9) of basic education.8 The country suffers from security problems due to the presence of gang members inherited from past civil conflicts. Intentional homicides per 100,000 were 61.8 in 2017 (World Bank, 2021), the highest rate in the world. Consequently, schools in El Salvador are also affected by the activities of gang members (USAID, 2017). Schools located in an area severely affected by such activities, and those that were physically difficult to access, were excluded from the sampling frame. Outside this study’s experimental design, the Millennium Challenge Corporation (MCC) planned to distribute the ESMATE textbooks in 2018. Schools targeted by the MCC were also excluded from this study’s evaluation framework. As a result, the sampling frame comprised 369 basic education public schools. Of these, 250 were randomly sampled, with half randomly assigned to the treatment group and the other half to the control group (see Table 2). The stratification variables in the randomization included department and urban status. If several classes of targeted grades were found in the sampled schools, one class was randomly selected.
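The stratified assignment described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually used by the survey team: the school identifiers, the urban flag, the seed, and the coin-flip rule for strata with an odd number of schools are all hypothetical. Only the department totals (43, 33, 102, and 72 sampled schools, from Table 2) come from the paper.

```python
import random
from collections import defaultdict


def stratified_assign(schools, seed=0):
    """Assign schools to treatment or control within (department, urban) strata.

    schools: list of (school_id, department, is_urban) tuples.
    Returns a dict mapping school_id to "treatment" or "control".
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for school_id, department, is_urban in schools:
        strata[(department, is_urban)].append(school_id)

    assignment = {}
    for key in sorted(strata):
        ids = list(strata[key])
        rng.shuffle(ids)
        n_treat = len(ids) // 2
        # Assumed rule: in an odd-sized stratum, a coin flip decides the extra school.
        if len(ids) % 2 == 1 and rng.random() < 0.5:
            n_treat += 1
        for position, school_id in enumerate(ids):
            assignment[school_id] = "treatment" if position < n_treat else "control"
    return assignment


# Sampled schools per department (treatment + control) as reported in Table 2.
departments = {"Cabañas": 43, "La Union": 33, "San Miguel": 102, "San Vicente": 72}
schools = []
for department, n_schools in departments.items():
    for i in range(n_schools):
        # The urban flag below is made up purely for the example.
        schools.append((f"{department}-{i:03d}", department, i % 3 == 0))

assignment = stratified_assign(schools, seed=2018)
n_treatment = sum(1 for arm in assignment.values() if arm == "treatment")
print(n_treatment, len(assignment) - n_treatment)
```

Because odd-sized strata cannot be split evenly, a rule like this balances the two arms within each stratum to within one school but does not by itself guarantee an exact 125/125 split overall.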
In the baseline survey, seven schools in the treatment group and four schools in the control group were excluded due to security reasons (Koei Research & Consulting Inc. 2018).9 In addition to these 11 schools, there were no students enrolled in grade 7 at the two schools in the control group. Based on the educational census data collected by the Ministry, the characteristics of the 612 basic education public schools in the four departments were compared to the sampling frame (see Table D.1 in Appendix D). As data from some schools were not included in the educational census survey data, the number of schools in Column (A) to (C) in Table D.1 does not exactly match the number of schools in Table 2. The sampling frame adequately represents the original 612 basic education public schools in the four departments. The characteristics of the original sample of 250 schools and the remaining 237 schools after attrition of 13 schools were also equivalent between the two groups.
2.4. Comparison of student, teacher, and school characteristics
The student, teacher, and school characteristics of the 237 schools are presented in Tables 3 and 4. The average percentage of teachers who finished teacher training courses in the control group was higher than in the treatment group by 9.5 percentage points (see Table 3). A higher percentage of teachers who taught subjects other than mathematics was found in the treatment group than in the control group (by nine percentage points). In terms of school characteristics, the average number of 7th grade students in the treatment group is larger than the control group by 9 students.10 The differences are statistically significant at the 10 percent level.
8 This research originally targeted both the primary and lower secondary education levels, focusing on the 2nd grade in primary and 7th grade in lower secondary education. Given the differences in content and educational levels, this paper focuses only on the lower secondary educational level.
9 At the end-line survey, three schools in the control group were additionally excluded due to security reasons (Koei Research & Consulting Inc., 2019). In the other two schools, the number of 7th grade students enrolled at the baseline survey was small, and all students were absent at the end-line survey.
10 This difference in the number of 7th grade students is due to the number of 7th grade classes in the sampled schools. The class size of the sampled class was equivalent between the treatment group (average: 24 students) and the control group (average: 23 students).
Table 3. Comparison of teacher and school characteristics (baseline survey).
Content | Treatment (I) | Control (II) | Mean Diff. (III): (I)-(II)
Teacher characteristics
Sex (Male) (%) | 52.5 | 55.5 | -2.9
Age | 39.3 | 39.9 | -0.6
  Standard deviation | (10.1) | (10.1) |
Total teaching period (years) | 15.6 | 16.3 | -0.6
  Standard deviation | (9.3) | (9.6) |
Academic degree
  High school (%) | 2.5 | 1.7 | 0.9
  Professorate (%) | 60.2 | 64.7 | -4.5
  Bachelor (%) | 34.8 | 31.9 | 2.8
  Master (%) | 2.5 | 0.8 | 1.7
  Doctor (%) | 0.0 | 0.0 | 0.0
Teacher qualification (1)
  Pedagogical Bachelor (%) | 3.4 | 3.4 | 0.00
  Professorate (%) | 75.4 | 84.9 | -9.5*
  License in Education (%) | 18.6 | 14.3 | 4.4
  Master’s in Education (%) | 1.7 | 0.8 | 0.9
  Doctorate in Education (%) | 0.0 | 0.0 | 0.0
  Pedagogical Training Course (%) | 8.5 | 4.2 | 4.3
Teacher qualification (2)
  Basic Education Teacher (Cycle I and II) (%) | 15.3 | 15.1 | 0.1
  Math Specialty Teacher (Cycle III and High School) (%) | 60.2 | 58.8 | 1.3
  Teacher specializing in other than math (Cycle III and High School) (%) | 17.8 | 22.7 | -4.9
  Others (%) | 15.3 | 10.9 | 4.3
School characteristics
Number of students
  N. of Students (7th grade) Morning Shift | 39.7 | 30.6 | 9.2*
    Standard deviation | (25.9) | (20.7) |
  N. of Students (7th grade) Afternoon Shift | 20.9 | 20.6 | 0.2
    Standard deviation | (11.4) | (10.2) |
  N. of Students (Total) | 257.6 | 238.8 | 18.8
    Standard deviation | (231.9) | (176.3) |
Number of teachers | 10.6 | 10.2 | 0.4
  Standard deviation | (9.2) | (6.8) |
Repetition and dropout rate
  Repetition rate (morning shift of 7th grade in 2017) (%) | 4.3 | 3.8 | 0.5
  Repetition rate (afternoon shift of 7th grade in 2017) (%) | 4.4 | 5.1 | -0.6
  Dropout rate (morning shift of 7th grade in 2017) (%) | 8.9 | 9.4 | -0.5
  Dropout rate (afternoon shift of 7th grade in 2017) (%) | 10.8 | 9.8 | 1.0
School facility
  Electricity (%) | 100 | 100 | 0.0
  Drinking Water (%) | 79.7 | 86.6 | -6.9
  Computer (%) | 94.9 | 93.3 | 1.6
  Library (%) | 25.4 | 25.2 | 0.2
  Laboratory (%) | 11.9 | 6.7 | 5.1
Donor support within 5 years (except ESMATE) (%) | 91.5 | 94.1 | -2.6
Number of schools | 118 | 119 |
Note. The data source is the baseline survey. Column (III) reports the mean difference between the treatment and control groups, as well as the result of the test for the difference. For binary values, the chi-square test with stratified data (department and urban status) was conducted. For the total teaching period, age, number of students, repetition rate, and dropout rate, the Wilcoxon rank sum test with stratified data (department and urban status) was performed. * p < 0.1.
Although these differences are noted in the teacher and school characteristics, the overall baseline characteristics of treatment and the control groups were well balanced. For logistical reasons, it was not possible to conduct the baseline survey before any component of the intervention package started. The baseline survey began in mid-January 2018, just after the intervention in the treatment group began, and finished on March 1, 2018. The surveys for the treatment and control schools were conducted in parallel. As presented in Table 4, the difference in the baseline scores is nearly zero and not statistically significant (p-value: 0.93), which indicates that the baseline scores are well balanced. The density curves of the baseline scores are shown in Fig. 2.
All test items in the baseline survey were derived from primary education content. Since the 7th grade ESMATE textbook did not include a review section of content that was learned in primary education, and it started with new mathematical content, it is plausible to think that the distribution of ESMATE textbooks at the beginning of the school year did not have an impact on the baseline scores in the treatment group. 6 T. Maruyama International Journal of Educational Research 115 (2022) 101977 Table 4 Comparison of student characteristics (baseline survey). Content Treatment Control Mean Diff. (I) (II) (III): (I)-(II) Morning Shift (%) 58.33 63.65 -5.32 Age 13.03 12.98 0.06 Standard deviation (1.30) (1.28) Sex (Male) (%) 48.82 50.09 -1.28 Repeated (%) 22.85 21.92 0.94 Number of elder siblings 1.76 1.79 -0.03 Standard deviation (2.09) (2.02) Number of younger siblings 1.13 1.11 0.02 Standard deviation (1.15) (1.21) Raw test scores (total points: 20) 6.58 6.64 -0.06 Standard deviation of raw test scores (3.55) (3.58) Standardized test scores -0.017 0.000 -0.016 Asset of study Math textbook last academic year (%) 25.74 26.67 -0.92 Math notebook last academic year (%) 90.40 87.63 2.77 Notebook only for Math last academic year (%) 89.24 85.47 3.76 Study desk at home (%) 95.05 95.39 -0.34 Assets of student household Smartphone (%) 86.48 85.34 1.14 Computer (%) 29.79 29.45 0.34 Refrigerator (%) 85.36 84.25 1.12 Car (%) 30.22 29.86 0.36 TV (%) 93.37 93.79 -0.42 Tap water (%) 77.18 80.09 -2.91 Electricity (%) 96.43 96.21 0.22 Flush Toilet (%) 52.35 55.71 -3.36 Not using wood for cooking (%) 38.92 37.67 1.24 Student household economic status index -0.005 0.006 -0.011 Number of schools 118 119 Number of students 2,323 2,190 Note (1) The data source is baseline survey of this research. (2) Baseline test scores were standardized using the mean and standard deviation of baseline scores of the control group. 
The student household economic status index is the first component of the principal component analysis of different types of household assets (see Appendix C for details). (3) Column (III) reports the mean difference between the treatment and control groups, as well as the result of the test for the difference. The test controls for strata fixed effects constructed using the stratification variables in the random assignment (department and urban status). Robust standard errors were clustered at the school level. None of the differences were statistically significant.

Fig. 2. Density curves of the baseline test scores (standardized scores). Note: Data source is the baseline survey of this research.

2.5. Student attrition

Among the 4,513 students who were present at the baseline survey, 894 were absent in the end-line survey, and 1,371 were absent in the follow-up survey. The attrition rate was 19.4 percent in the end-line survey and 29.1 percent in the follow-up survey for the treatment group. For the control group, the attrition rate was 20.2 percent in the end-line survey and 31.7 percent in the follow-up survey. The major reasons for attrition were that students moved to other schools or dropped out. In the end-line survey, 11 percent of students in the treatment group and nine percent in the control group had moved to another school or dropped out. In the following year, 12 percent in the treatment group and 11 percent in the control group moved to another school or dropped out. To check whether differential attrition occurred between the two groups, the student attrition dummy was regressed on the treatment assignment, student characteristics (baseline), and strata fixed effects constructed by department and urban status (see Table E.1 in Appendix E for the results). Attrition did not differ between the two groups.
Attrition occurred more frequently among male students, students with lower baseline scores, and older students. Further, attrition was higher among students who had younger siblings. Grade 7 students who repeated the same grade in year 2 of this study remained in the sample and were assigned the same mathematics test as grade 8 students. In the follow-up survey data, 1.9 percent of students in the treatment group and 2.3 percent of students in the control group repeated 7th grade.

3. Impacts on student learning outcomes in mathematics

3.1. Estimation strategy

The impacts of the additional components in the ESMATE program on student learning outcomes in mathematics are estimated using the following equation:

$$Y_{ijr} = \alpha_r + \gamma_r \left(Y_{ij0} \times Year_r\right) + \delta_r \left(Treatment_j \times Year_r\right) + C_{ijr}\beta_c + P_{kjr}\beta_p + S_{jr}\beta_s + D_j\beta_D + \varepsilon_{ijr} \qquad (1)$$

In Eq. (1), $Y_{ijr}$ on the left-hand side (r = 1: end-line, r = 2: follow-up) represents the mathematics test score for student i in school j at the end-line or follow-up survey, and $Y_{ij0}$ on the right-hand side is the student’s test score at the baseline survey. The baseline test scores were standardized by the mean and standard deviation of the test scores of students in the control group. Based on the five common mathematics test items in the end-line and follow-up surveys, the test scores on the left-hand side of the equation were linked using IRT (in this study, the linked scores are termed “IRT scores”). The IRT scores were standardized by the mean and standard deviation of the scores in the control group at the end-line survey. $Year_r$ is a vector of dummy variables for year r (r = 1: end-line and r = 2: follow-up); $C_{ijr}$ is a vector of the characteristics of student i at school j (e.g., age, gender, morning or afternoon shift at school, the number of brothers and sisters) and characteristics of the family of student i (e.g., household assets at the baseline).
$C_{ijr}$ includes the shift at school (morning or afternoon) during years 1 and 2 of this study, and whether student i repeated the 7th grade in year 2. $P_{kjr}$ is a vector of the characteristics of teacher k at school j, who teaches mathematics to student i in year r (e.g., age, gender, and educational qualification). $S_{jr}$ is a vector of the characteristics of school j (e.g., school infrastructure and the total number of students attending school j at the baseline). $S_{jr}$ includes the characteristics of school principals in years 1 and 2 of this study. $D_j$ is a vector of the strata fixed effects constructed by department and urban status of school j. Robust standard errors are clustered at the school level. In Eq. (1), $\varepsilon_{ijr}$ is the error term.

To analyze the heterogeneous impacts by the baseline scores and student household economic status, Eq. (1) is expanded by adding the interaction term of the baseline score or student household economic status index and the treatment assignment:

$$Y_{ijr} = \alpha_r + \gamma_r \left(Y_{ij0} \times Year_r\right) + \left(\delta_r^A + \delta_r^B X_{ij0}\right)\left(Treatment_j \times Year_r\right) + C_{ijr}\beta_c + P_{kjr}\beta_p + S_{jr}\beta_s + D_j\beta_D + \varepsilon_{ijr}, \qquad (2)$$

where $X_{ij0}$ represents either the baseline scores or the student household economic status index.

3.2. Impact of the additional components in the ESMATE program

The density curves of the test scores for the end-line and follow-up surveys are presented in Figs. 3 and 4. As the test scores were linked by IRT, they are comparable across the end-line and follow-up surveys. The IRT scores in the treatment group are higher than those in the control group at the end of year 1 of this study (see Fig. 3). Both groups improved their mathematics learning in the following year, and the difference between the two groups almost disappeared by the end of year 2 of this study (see Fig. 4). The impact of the additional interventions in the ESMATE program on student learning outcomes in mathematics is estimated using Eq.
(1), pooling the samples of students who took tests at both the baseline and end-line surveys, and students who took tests at both the baseline and follow-up surveys. As shown in Columns (I)-1 to (I)-3 in Table 5, the average one-year impact of the additional interventions in the ESMATE program is estimated at 0.17 standard deviations of the IRT scores, which is statistically significant at the 1 percent level.11

11 I conducted the cost-effectiveness analysis following the methodology presented by J-PAL (Bhula et al., 2020; Dhaliwal et al., 2014). The cost-effectiveness is measured as the ratio of the aggregated impact of the project (the average impact on student learning per student multiplied by the number of students impacted) to the aggregated cost of implementing the project, and it is presented as the total standard deviations gained across the sample per 100 USD spent. For the additional interventions in the ESMATE program, this value is estimated to be 2.68, which is comparable to the values for other programs cited in Kremer et al. (2013).

Fig. 3. Density curves of the end-line test scores (IRT scores). Note. Data source is the end-line survey of this research. The IRT scores were standardized by the mean and standard deviation of the IRT scores of the control group in the end-line survey.

Fig. 4. Density curves of the follow-up test scores (IRT scores). Note. Data source is the follow-up survey of this research. The IRT scores were standardized by the mean and standard deviation of the IRT scores of the control group in the end-line survey.

Conversely, the estimated value of the accumulated impact of the first-year additional interventions in the following
year is 0.07, which is not statistically significant.12 The impacts are also estimated using the samples of students who took the tests in all three rounds of the survey. The estimated impacts are slightly higher than the estimates with the pooled sample, but they remain at almost the same level (see Column (II)-3 in Table 5). To deepen the analysis of the accumulated impact, I conducted a quantile regression by regressing the IRT scores in the first and second years on the treatment assignment dummy. Figs. F.1 and F.2 in Appendix F plot the estimates of the quantile treatment effect. The estimates of the one-year impact are positive and statistically significant for students in the treatment group from the lowest quantile up to approximately the 90th percentile of the distribution of IRT scores (Fig. F.1). The estimates of the accumulated impact are positive and statistically significant for students in the treatment group from the lowest quantile up to approximately the 30th percentile of the distribution of IRT scores (Fig. F.2). The results indicate that students who improved their math learning in year 1 of this study could not advance their learning well in the following year. The first-year gain was not sufficient to produce an accumulated impact on mathematics learning in the following year.

12 Additionally, I tested the hypothesis that δ1 (the coefficient of the interaction term of the treatment assignment and first-year dummy) was equal to δ2 (the coefficient of the interaction term of the treatment assignment and second-year dummy). The hypothesis was not rejected at the 10 percent level (p-value: 0.243).

Table 5
Average impact of the additional interventions in the ESMATE program on student learning outcomes in mathematics.

Variable | (I)-1: Full sample | (I)-2: Full sample | (I)-3: Full sample | (II)-1: Balanced panel sample | (II)-2: Balanced panel sample | (II)-3: Balanced panel sample
Treatment × 1st year dummy | 0.186** (0.075) | 0.187*** (0.058) | 0.172*** (0.055) | 0.182** (0.080) | 0.194*** (0.061) | 0.189*** (0.060)
Treatment × 2nd year dummy | 0.069 (0.096) | 0.088 (0.078) | 0.073 (0.073) | 0.076 (0.098) | 0.089 (0.080) | 0.079 (0.075)
2nd year dummy | 0.849*** (0.058) | 0.882*** (0.057) | 0.619*** (0.145) | 0.833*** (0.060) | 0.891*** (0.060) | 0.681*** (0.155)
Z baseline scores × 1st year dummy | | 0.516*** (0.022) | 0.471*** (0.022) | | 0.522*** (0.024) | 0.477*** (0.024)
Z baseline scores × 2nd year dummy | | 0.551*** (0.025) | 0.508*** (0.024) | | 0.552*** (0.025) | 0.513*** (0.024)
Student characteristics | No | No | Yes | No | No | Yes
Teacher characteristics | No | No | Yes | No | No | Yes
School characteristics | No | No | Yes | No | No | Yes
Number of observations | 6,761 | 6,761 | 6,761 | 5,716 | 5,716 | 5,716
Number of clusters | 232 | 232 | 232 | 229 | 229 | 229

Note. (1) The data source is the three rounds of survey of this research. (2) Robust standard errors clustered at the school level are in parentheses. Strata fixed effects were controlled in all the regressions but are not shown. (3) The IRT scores (in the end-line and follow-up surveys) were standardized by the mean and standard deviation of the IRT scores of the control group in the end-line survey. (4) Baseline test scores were standardized by the mean and standard deviation of the test scores of the control group in the baseline survey. *** p < 0.01; ** p < 0.05; * p < 0.1.

The test results for the mathematics items illustrate the student learning situation underlying the estimate of the accumulated impact. At the end of grade 7 (year 1 of this study), more than 80 percent of students in both groups had difficulty solving simple equation problems, such as “solve the equation, 2x = 8.”13, 14 In the ESMATE textbook, students learn how to represent objects or situations using symbols such as x or y and then study equations.
However, over 90 percent of students responded incorrectly to a simple item on such representation in the end-line survey.15 Although the ESMATE textbook included examples that introduce students to representing objects or situations using symbols such as x or y, the test result in the end-line survey indicates that this part of the textbook needs to be improved for a better understanding of the concept. Since mathematics content in grade 8 involved equations, the low level of understanding of grade 7 content hampered student mathematics learning in the following year.16

To analyze the impacts of the additional interventions in the ESMATE program from a different angle, the subtotals of test scores according to cognitive domains and skills are used instead of the IRT scores. The test score subtotals are standardized by the mean and standard deviation of the control group in each round of the surveys. The impacts by cognitive domain and skill are estimated using the following equation:

$$Y_{ijr} = \alpha_r + \gamma_r Y_{ij0} + \delta_r Treatment_j + C_{ijr}\beta_c + P_{kjr}\beta_p + S_{jr}\beta_s + D_j\beta_D + \varepsilon_{ijr}, \qquad (3)$$

where the subscript r represents the end-line (r = 1) or follow-up (r = 2). Without IRT, the test scores in the end-line and follow-up surveys cannot be directly compared. Therefore, unlike Eq. (1), which is estimated by pooling both the end-line (r = 1) and follow-up (r = 2) samples, Eq. (3) is estimated separately for the end-line and follow-up samples. The regression results are presented in Table F.1 in Appendix F. In year 1 of this study, although the impacts in the domains of “number and operation” and “function” are positive and statistically significant, the impact in the domain of geometry is not statistically significant. In the former two domains, the correct response rates in the treatment group to items such as simple operations with positive and negative numbers and direct and inverse proportions were higher than those in the control group.
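The standardization applied to the subtotal scores (subtracting the control-group mean and dividing by the control-group standard deviation within a survey round) can be illustrated with a minimal Python sketch. The scores and group flags are hypothetical, the function name `standardize_by_control` is introduced only for illustration, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def standardize_by_control(scores, is_control):
    """Express every score in control-group standard deviation units."""
    control_scores = [s for s, c in zip(scores, is_control) if c]
    mu = mean(control_scores)
    sd = stdev(control_scores)  # sample SD: an assumption, not stated in the text
    return [(s - mu) / sd for s in scores]


# Hypothetical subtotal scores for one survey round (not data from the study).
scores = [4, 6, 8, 5, 7, 9]
is_control = [True, True, True, False, False, False]

z = standardize_by_control(scores, is_control)
print(z)
```

By construction, the standardized control-group scores have mean zero and unit standard deviation, so a coefficient on the treatment dummy in Eq. (3) reads directly in control-group standard deviations.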
Regarding cognitive skills, although the impact on the knowing skill is statistically significant, the impact on applying and reasoning skills is not. Conversely, in year 2 of this study, all the estimated values are nearly zero and not statistically significant (see Table F.1). As noted in Section 2.1, the distribution of ESMATE textbooks and teachers’ guidebooks was delayed by approximately one month in the control group in year 1 of this study. Although the curriculum covered in the 2018 school year was limited in the control group due to the delay, approximately the same percentage of schools completed units 1 through 4 in the curriculum.17 As a robustness check, the impact of the additional interventions in the ESMATE program is estimated with Eq. (3), using the subtotal of the end-line test scores that corresponded to units 1 through 4.18 The estimated one-year impact of the additional components is 0.178 standard deviations (standard error: 0.060), which is statistically significant at the 1 percent level.

13 Original text in the item was written in Spanish. The correct response rate for the simple equation item was 17.8 percent for the treatment group and 14.3 percent for the control group. The difference was not statistically significant.

14 The root cause of this difficulty can be found in the mathematics test results in the baseline and end-line surveys. Whereas approximately 60 percent of the students in the treatment group correctly responded to the item “answer the number in □ of the equation 2×□=8” at the beginning of 7th grade, the correct response rate to the simple equation problem “Solve the equation, 2x = 8” was below 20 percent. These results indicate that students in grade 7 had difficulty bridging their mathematics learning from the primary level to the lower secondary level, in which mathematics content became more theoretical and abstract.

15 The item was “There are 5 men and a certain number of women. If the number of women is expressed with x, express the total number of people with x.” (Original text was written in Spanish.)

16 In the follow-up survey, the correct response rate for simple simultaneous equations was less than 10 percent in both groups.

Table 6
Heterogeneity in the impact of the additional interventions in the ESMATE program.

Variable | (I) | (II) | (III) | (IV)
Treatment × 1st year dummy | 0.173*** (0.055) | 0.172*** (0.055) | 0.173*** (0.055) | 0.187*** (0.054)
Treatment × 2nd year dummy | 0.071 (0.072) | 0.069 (0.073) | 0.069 (0.073) | 0.067 (0.072)
2nd year dummy | 0.621*** (0.143) | 0.619*** (0.144) | 0.594*** (0.140) | 0.627*** (0.143)
Treatment × 1st year dummy × Student economic status index | -0.048** (0.024) | | -0.046* (0.024) | -0.049** (0.024)
Treatment × 2nd year dummy × Student economic status index | -0.022 (0.033) | | -0.019 (0.033) | -0.019 (0.033)
Treatment × 1st year dummy × Z score baseline | | -0.037 (0.043) | -0.029 (0.042) | -0.022 (0.041)
Treatment × 2nd year dummy × Z score baseline | | -0.038 (0.048) | -0.035 (0.047) | -0.035 (0.047)
Treatment × 1st year dummy × Z score baseline × Student economic status index | | | | -0.060*** (0.021)
Treatment × 2nd year dummy × Z score baseline × Student economic status index | | | | -0.001 (0.026)
Student economic status index × 1st year dummy | 0.066*** (0.014) | 0.042*** (0.012) | 0.065*** (0.014) | 0.067*** (0.014)
Student economic status index × 2nd year dummy | 0.043* (0.023) | 0.033** (0.015) | 0.042* (0.023) | 0.043** (0.023)
Z score baseline × 1st year dummy | 0.472*** (0.021) | 0.491*** (0.028) | 0.487*** (0.028) | 0.484*** (0.028)
Z score baseline × 2nd year dummy | 0.511*** (0.024) | 0.530*** (0.035) | 0.529*** (0.035) | 0.528*** (0.035)
Z score baseline × 1st year dummy × Student economic status index | | | | 0.030*** (0.010)
Z score baseline × 2nd year dummy × Student economic status index | | | | 0.012 (0.019)
Num. obs. | 6,761 | 6,761 | 6,761 | 6,761
N. Clusters | 232 | 232 | 232 | 232

Note. (1) Data sources are the baseline, end-line, and follow-up surveys. (2) Robust standard errors clustered at the school level are in parentheses. (3) The IRT scores (in the end-line and follow-up surveys) were standardized by the mean and standard deviation of the IRT scores of the control group in the end-line survey. Baseline test scores were standardized using the mean and standard deviation of the test scores of the control group in the baseline survey. (4) Student, school, and teacher (end-line) characteristics, as well as strata fixed effects constructed by department and urban status, were controlled for but are not shown. The student economic status index is the first principal component of different types of household assets (see Appendix C for details). *** p < 0.01; ** p < 0.05; * p < 0.1.

3.3. Heterogeneity in the impact of the additional interventions

The heterogeneity of the average treatment effect with respect to the baseline scores and household economic status is investigated using Eq. (2) (see Table 6 for the results). The heterogeneous one-year impact by student household economic status is estimated at -0.048, which is statistically significant at the 5 percent level (see Column (I) in Table 6). However, the impact of the additional treatment was not heterogeneous in terms of baseline test scores (see Column (II) in Table 6). As shown in Column (III) in Table 6, the estimated values of the heterogeneous impact by student household economic status remain at almost the same level, even after controlling for the heterogeneous impact by the baseline scores. The triple interaction term of treatment assignment, student household economic status index, and baseline test scores, as well as the interaction term of student household economic status index and baseline test scores, were then added to the equation.
While the coefficient of the interaction term of student household economic status index and baseline test scores is positive in year 1, the coefficient of the triple interaction term in that year is -0.06, which is statistically significant at the 1 percent level (see Column (IV) in Table 6). To better understand the heterogeneous impact by student household economic status, dummy variables of low, medium, or high household economic status were used instead of the status index in Eq. (2).19 The regression results are presented in Table G.1 in Appendix G. The net impact for high economic status is close to zero (see Columns (I) and (III) in Table G.1). Conversely, the impacts for students with low and medium economic status are positive and statistically significant. These results indicate that the additional interventions in the ESMATE program reduced the economic disparity in learning outcomes in mathematics.

17 In the end-line survey, the units in the math curriculum that students had already learned were surveyed through teacher interviews. Among the 8 units in the math curriculum for grade 7, more than 90 percent of schools in both treatment and control groups had completed units 1 to 4.

18 In the end-line survey, there were 20 math items in total. Among them, items 1 to 9 corresponded to the content in units 1 to 4 in the math curriculum.

4. Impacts of additional interventions on teaching practices and student study at home

4.1. Impacts on teaching practices in mathematics lessons

Mathematics lesson observations were conducted in the end-line and follow-up surveys, which measured how long students engaged in learning mathematics in a lesson.20 The surveyors measured the time spent in a lesson when half the students in a class were solving mathematics problems, referring to their notebook or textbook, or consulting with each other (hereinafter referred to as “student engaged time”).
The lesson observation also checked the following instructional routine of teachers: (a) teacher posed math problem; (b) teacher checked student notebooks; (c) teacher walked around the classroom to check notebooks; (d) teacher advised students to consult with each other; (e) teacher told students to check their answers; and (f) teacher instructed students to try incorrectly solved problems again. The surveyors noted these points according to the teachers’ speech and/or behaviors. The impacts of the additional interventions in the ESMATE program on teaching practices are estimated by the following equation:

$$Y_{kjr} = \alpha_r + \lambda_r \left(Treatment_j \times Year_r\right) + P_{kjr}\rho_p + S_{jr}\rho_s + D_j\rho_D + \upsilon_{kjr}, \qquad (4)$$

where $Y_{kjr}$ represents either the percentage of student engaged time in a lesson or the instructional routine of teacher k in school j at the rth survey (end-line: 1; follow-up: 2). The value for the instructional routine is a dummy that takes the value of 1 for a teacher who conducted the corresponding activity in the instructional routine described in the previous paragraph. $Y_{kjr}$ also takes values of the frequency of lesson observations or suggestions that teacher k received from the school principal at school j. The other variables, $Treatment_j$, $Year_r$, $P_{kjr}$, $S_{jr}$, and $D_j$, are the same as in Eq. (1), and the error term is $\upsilon_{kjr}$. The one-year impact of the additional interventions in the ESMATE program on the percentage of student engaged time in a math lesson is estimated at approximately four percentage points, and the accumulated impact is approximately six percentage points (see Column (I)-1 in Table 7). Both estimates are positive and statistically significant. As the standard length of a lesson was 45 minutes, the difference is equivalent to approximately two to three minutes.
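The conversion from percentage points of lesson time to minutes can be checked with a short calculation. The sketch below is illustrative only: the coefficients are the Column (I)-1 estimates reported in Table 7, the 45-minute lesson length is taken from the text, and the function name `engaged_minutes` is introduced here for illustration.

```python
LESSON_MINUTES = 45  # standard lesson length stated in the text


def engaged_minutes(impact_pct_points, lesson_minutes=LESSON_MINUTES):
    """Convert an impact in percentage points of lesson time into minutes."""
    return impact_pct_points / 100.0 * lesson_minutes


one_year = engaged_minutes(4.362)     # Treatment x 1st year dummy, Column (I)-1
accumulated = engaged_minutes(5.908)  # Treatment x 2nd year dummy, Column (I)-1

print(f"one-year impact:    {one_year:.2f} minutes per lesson")
print(f"accumulated impact: {accumulated:.2f} minutes per lesson")
```

Both values fall between roughly two and three minutes per lesson, in line with the range given above.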
Regarding the teachers’ instructional routines in a mathematics lesson, the percentage of teachers who checked students’ answers in the lessons or instructed students to try again if they answered a question incorrectly was higher in the treatment group in year 1 of this study (see Column (I)-2 in Table 7 and Fig. F.3 in Appendix F).21 One of the additional interventions to improve teaching practices for the treatment group was the distribution of mathematics tests. The test results were reviewed and discussed with colleagues in the teachers’ inter-semestrial mutual review meetings. This additional intervention in the ESMATE program increased the percentage of schools that shared and compared their test results with those of other schools in year 1 of this study (see Columns (II)-1 and (II)-2 in Table 7). The objective of the mutual review meetings was to make teachers more conscious of their students’ learning levels. In the treatment group, 85 percent of the surveyed teachers participated in the mutual review meetings and discussed teaching and learning practices based on their students’ test results in year 1. In the end-line survey, the surveyors asked teachers which aspects they paid attention to when organizing their mathematics lessons. The percentage of teachers who, in the interview, raised the importance of instructing students to retry problems they had answered incorrectly was higher in the treatment group (64.4 percent) than in the control group (52.6 percent).22, 23 The difference in instructional routines in mathematics lessons in year 1 of this study disappeared in the following year (see Fig. F.4 in Appendix F), as the schools in the control group also organized mutual review meetings where teachers compared test results with other schools in that year (Columns (II)-1 and (II)-2 in Table 7).24 The additional interventions in the ESMATE program also included initial on-site advice for school principals regarding mathematics lesson observation. The impacts on the frequency of school principals’ lesson observations and suggestions for teachers are estimated using Eq. (4). As shown in Columns (I) and (II) in Table F.2, the additional interventions in the ESMATE program did not have an impact on the frequency of school principals’ lesson observations or suggestions. The topics of suggestions for teachers were also surveyed through interviews with school principals. Between the treatment and control groups, statistically significant differences were not observed regarding the principals’ suggestion topics for lessons (Figs. F.5 and F.6).25 The regression results on the intermediate outcomes suggest that the combination of the distribution of mathematics tests and the organization of mutual review meetings increased the percentage of schools that reviewed the test results with other schools, which helped teachers in the treatment group pay attention to their students’ learning during mathematics lessons in year 1 of this study.

19 Please refer to Appendix C for details regarding the category of low, medium, and high household economic status. The comparison of standardized test scores between the baseline and end-line by student household economic status level is shown in Fig. G.3.

20 Each survey team comprised two surveyors. In the mathematics lesson observation, one surveyor observed the teacher while the other observed the students.

21 The estimated impact on the instructional routine of checking student notebooks in the lesson is 0.097 (standard error: 0.050, p-value: 0.054).

22 The chi-square test with stratified data (department and urban status) was conducted. The difference was statistically significant at the 10 percent level (p-value: 0.069).

23 During the introductory training in year 1 of this study, teachers in the treatment group developed an annual mathematics teaching plan. Teachers in the control group also developed an annual teaching plan in April 2018. The plan was developed using a simple one-page format with a year-long calendar that defined which textbook page would be taught on which day. During the teachers’ inter-semestrial review meetings, they checked their progress and revised their plans, while also considering the remaining number of lessons in the school year. The percentage of teachers who provided additional mathematics classes when the annual teaching plan was delayed was higher in the treatment group than in the control group in the first year of this study (52.5 percent in the treatment group, and 37.7 percent in the control group). The process of developing annual teaching plans at the beginning of the school year could have helped teachers provide additional classes when the plan was delayed.

Table 7
Impacts of the additional interventions in the ESMATE program on intermediate outcomes.
Column groups: (I) Teaching practices of teachers; (II) Mutual review meeting; (III) Student study at home.

Variable | (I)-1: Percentage of student engaged time in lesson | (I)-2: Percentage of teachers who instructed students try again for wrong answers | (II)-1: Percentage of schools that discussed teaching practices of math based on the math test results | (II)-2: Percentage of schools that compared the math test results with other schools | (III)-1: Frequency of homework assignment in a week | (III)-2: Time of study at home for homework
Treatment × 1st year dummy | 4.362** (2.099) | 11.862* (6.517) | 10.232* (5.437) | 13.309** (6.425) | 0.314*** (0.088) | 0.016 (0.052)
Treatment × 2nd year dummy | 5.908** (2.324) | 0.401 (7.925) | -0.519 (3.330) | 7.692 (6.465) | 0.151 (0.093) | -0.092 (0.068)
2nd year dummy | -0.556 (3.523) | 6.924 (12.839) | 23.567*** (4.320) | 33.141*** (6.091) | 0.090 (0.157) | -0.265* (0.156)
Average of the control group (year 1) | 18.942 | 63.158 | 72.034 | 31.356 | 3.404 | 3.704
Num. obs. | 400 | 400 | 472 | 472 | 400 | 6,761

Note. (a) For Columns (I)-1, (I)-2, (II)-1, (II)-2, and (III)-1, the teacher and school characteristics and strata fixed effects constructed by department and urban status were controlled in the regressions, but they are not shown. Robust standard errors are used. (b) For Columns (I)-1 and (III)-1, the data for the dependent variable were collected through interviews with teachers. For Column (I)-2, the data for the dependent variable were collected during the lesson observations of teachers. For Columns (II)-1 and (II)-2, the data for the dependent variables were collected through interviews with school principals. (c) For Column (III)-1, the frequency was categorized into four levels: (1) never, (2) once a week, (3) two or three times a week, and (4) four or more times a week. (d) For Column (III)-2, the data for the dependent variables were collected through interviews with the students. The frequency of studying at home in a week took one of five levels: (1) I never solve homework exercises at home; (2) I spend less than 15 minutes; (3) I spend between 15 and 30 minutes; (4) I spend between 30 minutes and one hour; or (5) I spend more than one hour. Robust standard errors clustered at the school level are in parentheses. *** p < 0.01; ** p < 0.05; * p < 0.1.

4.2. Impact on student study at home

In year 1 of this study, students in the treatment group received math workbooks in addition to textbooks. As the workbook content corresponded to the ESMATE textbook, teachers could use it for homework assignments.
In the end-line and follow-up surveys, teacher interviews were conducted to collect data regarding the frequency of mathematics homework assignment in a week, which was assigned to one of four levels: (1) never, (2) once a week, (3) two or three times a week, or (4) four or more times a week. The impact of the additional interventions in the ESMATE program on the frequency of homework assignment is estimated using Eq. (4). The additional interventions increased the frequency of math homework assignment by 0.31 level points in year 1 of this study, which is statistically significant at the 1 percent level (see Column (III)-1 in Table 7). The end-line and follow-up surveys used student interviews to collect data regarding time spent studying mathematics at home per day, which was categorized according to one of five levels: (1) I never solve homework exercises at home; (2) I spend less than 15 minutes; (3) I spend between 15 and 30 minutes; (4) I spend between 30 minutes and one hour; or (5) I spend more than one hour. The impact on student study at home is estimated using Eq. (2). Although the additional interventions in the ESMATE program increased the frequency of homework assignment in a week in year 1 of this study, no statistically significant difference was found in terms of time spent studying mathematics at home in the two groups (see Column (III)-2 in Table 7).

24 In 2019, the Ministry did not distribute math tests for schools; however, as the tests were included in teachers’ guides, teachers could refer to them.

25 The dummy variables for each topic of suggestion were regressed on the treatment assignment, characteristics of school and school principal, and strata fixed effects constructed by the rural/urban and department dummies. None of the coefficients of the treatment assignment are statistically significant.
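To give a sense of scale for an impact expressed in "level points", the short Python sketch below places the year 1 estimate for homework frequency on the four-level scale. It uses the control-group average and the coefficient reported in Column (III)-1 of Table 7. Adding the coefficient to the control-group average is a simplifying assumption made here for illustration (the estimate comes from a regression with covariates and strata fixed effects), and the variable names are hypothetical.

```python
import math

# Four-level scale for weekly homework frequency, as defined in the text.
LEVELS = {
    1: "never",
    2: "once a week",
    3: "two or three times a week",
    4: "four or more times a week",
}

control_mean_year1 = 3.404  # Table 7, Column (III)-1, control-group average
impact_year1 = 0.314        # Table 7, Column (III)-1, Treatment x 1st year dummy

# Rough implied treatment-group average (illustrative approximation only).
implied_treatment_mean = control_mean_year1 + impact_year1

lower = math.floor(implied_treatment_mean)
upper = math.ceil(implied_treatment_mean)
print(f"implied average level: {implied_treatment_mean:.3f}")
print(f"lies between '{LEVELS[lower]}' and '{LEVELS[upper]}'")
```

Under this approximation, both group averages sit between the two highest categories, with the treatment-group average closer to "four or more times a week".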
Schools in rural areas operated with a smaller number of teachers than those in urban areas.26 While the increase in workload for teachers in schools with fewer teachers might have resulted in lower frequency of homework assignments, the additional interventions in the ESMATE program, including the distribution of workbooks, increased the frequency of homework assignments in rural areas in year 1 of this research.27 As rural status was negatively correlated with student household economic status, this difference in frequency of homework assignment between the two groups in rural areas could be a source of the heterogeneous one-year impact of the additional interventions on learning outcomes by student household economic status. To investigate this, the heterogeneous one-year impact on the frequency of homework assignment is estimated with Eq. (2), replacing the student household economic status index with the level dummies of low, medium, or higher status (see Table G.2 in Appendix G for the results).28 Whereas in the control group, students with lower economic status were assigned homework less frequently than students with a median economic status, the additional interventions in the ESMATE program increased the frequency of mathematics homework assignment for students with lower economic status in the treatment group (see Columns (I) to (III) in Table G.2).

5. Discussions

In year 1 of this study, the Ministry of Education in El Salvador provided additional interventions for the treatment group to strengthen the impact of the ESMATE program on mathematics learning of grade 7 students in two ways: (a) by improving teaching practices in mathematics lessons, and (b) by strengthening student study at home through frequent homework assignment.
The regression results in Section 4.1 suggest that the combination of the distribution of mathematics tests and the organization of mutual review meetings increased the percentage of schools that reviewed the test results with other schools, which helped teachers in the treatment group pay attention to their students’ learning during mathematics lessons in year 1 of this study. Simple diagnostic feedback for teachers does not always improve student learning, as its effect depends on how teachers use the test results to improve their teaching practices (Muralidharan and Sundararaman, 2010; Berry et al., 2020; de Hoyos et al., 2019). In El Salvador, schools traditionally organized inter-semester meetings to review teaching practices. The ESMATE program utilized the occasion to make teachers conscious of how much mathematics their students had learned. The mutual review meetings could function as a place where teachers feel peer pressure from colleagues and raise their awareness of how much students learn. In the context of other developing countries, while it would be possible to organize mutual review meetings where teachers review the test results of their students, the meetings would not be sufficient to improve student math learning, due to the limited capacity of teachers. The ESMATE program includes textbooks and guidebooks that teachers could use to help students learn mathematics. It is necessary to combine the distribution of teaching materials and training as scaffolding for teachers to improve student learning in developing countries.

On the other hand, the additional interventions in the ESMATE program did not impact the school principals’ lesson observations. In the end-line survey, approximately 35 percent of the sampled teachers in the treatment group responded that they did not receive any advice from school principals regarding mathematics lessons.
As previous research has demonstrated the effectiveness of coaching to improve teaching practices and student learning (Bruns et al., 2018; Cilliers et al., 2020; Kotze et al., 2019; Piper and Zuilkowski, 2015), the impact of the ESMATE program could be further enhanced by improving the intervention for school principals. In the experiment of this study, the number of years of experience as a principal was positively correlated with whether the principal advised teachers.29 This suggests that it is necessary to strengthen the support for principals with less experience to encourage their behavioral change.

In terms of student study at home, the additional interventions in the ESMATE program included the distribution of student workbooks. As the content of the workbook corresponded to the ESMATE textbook, students could practice and check their understanding of what they had learned that day. In the follow-up survey, up to three students per school were randomly selected, and their workbooks from the previous school year were checked to see how they had studied with the mathematics workbooks (Koei Research & Consulting Inc., 2020). Taking an example from the first unit in the workbook, sampled students in the treatment group solved at least one mathematics problem on 84 percent of the pages.30 Although students received workbooks as well as answers so that they could check their own answers, approximately 30 percent of the pages that the students tried were not self-marked. Only 35 percent of the pages were checked by both teachers and students.31 Through self-marking, students could identify the math items that they answered incorrectly and improve their math learning; however, the status of self-marking in the workbooks indicates that a large percentage of students did not check what they could not understand well after they tried to solve mathematics problems in the workbook.

26 In the treatment group, the average number of teachers (from grades 1 through 9) in the sampled schools was 7.24 in rural areas and 18.2 in urban areas. In the control group, the average number of teachers (from grades 1 through 9) in the sampled schools was 7.83 in rural areas and 15.76 in urban areas.

27 In the rural area, while the average level point of the treatment group in year 1 was 3.67, the value of the control group was 3.34. The result of the Wilcoxon rank sum test demonstrates that the difference is statistically significant at the 1 percent level. In the urban area, the average level point of the treatment group in year 1 was 3.61, and the value of the control group was 3.54. The difference is not statistically significant.

28 To estimate the heterogeneous one-year impact on the frequency of homework assignment, only the baseline and end-line data were used. The interaction term with year dummy was excluded from Eq. (2) to estimate the heterogeneous one-year impact.

29 In the treatment group in year 1 of the research, among the 72 teachers who worked in the school where the principal had less than or equal to ten years of experience, 42 teachers (around 59 percent) received advice from the principal. Among the 46 teachers who worked in the school where the principal had more than ten years of experience, 35 teachers (around 76 percent) received advice.

30 The percentage of pages that students completed was found to be the highest in the first unit (out of nine).

31 Regarding unit 1 of the workbook for grade 7, the follow-up survey checked the workbooks from 73 out of 117 schools. Out of these 73 schools, the student workbooks were not checked both by students and teachers at 33 schools.

6. Conclusions

Although recent debates on educational development have focused on the learning crisis in primary education, the crisis at the lower secondary education level is equally profound.
One approach to improving student learning is a structured pedagogy program that combines different interventions with the provision of teaching and learning materials (Snilsveit et al., 2016). Given an extensive variety of possible combinations of interventions, systematic reviews have not provided a clear picture of the effective combination of different interventions for a structured pedagogy program, which would be of interest to policymakers (Piper et al., 2018). The experiment of this study in El Salvador intended to fill the gap in understanding regarding the combination of different components in a structured pedagogy program at the lower secondary level.

In El Salvador, the Ministry of Education developed a structured pedagogy program in mathematics, the ESMATE program, with technical cooperation from JICA. The experiment of this study targeted grade 7 students and tracked them for two years. In year 1 of this study, additional interventions were provided for the treatment group to strengthen the impact of the ESMATE program in two ways. First, the Ministry provided teachers with mathematics tests for students. In mutual inter-semester review meetings, the teachers reviewed and discussed their students’ test results with their colleagues. The Ministry also provided initial on-site advice for school principals regarding mathematics lesson observations, with the intention of increasing the frequency of their lesson observations and suggestions for teachers. Second, mathematics workbooks were distributed to students in addition to textbooks. The additional interventions aimed to strengthen the support of teachers for students to enhance their mathematics learning in class and at home. The average one-year impact of the additional interventions in the ESMATE program is estimated at 0.17 standard deviations of the IRT scores, and the one-year impact was larger for students with lower household economic status.
The results indicate that the additional interventions reduced the economic disparity in mathematics learning outcomes. The regression results on the intermediate outcomes suggest that the mutual review meetings that involved discussing test results positively impacted the teachers’ teaching practices in their mathematics lessons in year 1 of this study. The additional interventions in the ESMATE program, including the distribution of workbooks, increased the frequency of homework assignment.

This study also investigated the accumulated impact of the first-year intervention in the following year. As students continue learning for years, it is important to see whether they can advance their learning after receiving the intervention. The average accumulated impact was positive but not statistically significant. The gains in grade 7 were not sufficient to yield an accumulated impact on mathematics learning in grade 8. At the end of grade 7, most students in both groups had difficulty solving simple equation problems. Since mathematics content in grade 8 involved equations, the low-level understanding of linear equations in grade 7 hampered the progress of learning in the following year. The results of students’ mathematics assessment in a series of surveys of this study also revealed that students entered lower secondary education without the foundations of mathematics, and that they struggled with mathematics content as it became more theoretical and advanced. Therefore, it is also essential to improve the ESMATE program so that it can better support students in catching up on foundational math concepts and transitioning smoothly in their learning from the primary to the lower secondary level.
Declaration of Competing Interest

The author declares the following financial interests/personal relationships which may be considered as potential competing interests: Maruyama is a staff member of the Japan International Cooperation Agency (JICA), and he is currently assigned to Hiroshima University as an associate professor. This study was conducted independently of JICA’s policies or organizational views. The findings, interpretations, and conclusions expressed in this paper are those of the author and do not represent the views of JICA.

Acknowledgements

I am grateful to the Ministry of Education in El Salvador for their understanding and intensive support while this research was conducted. I am grateful to Norihiro Nishikata, Kohei Nakayama, Eiichi Kimura, Satsuki Kawasumi, and Yuko Kawanami for their valuable support in the field surveys, as well as Shin-ichi Ishihara, Hiromichi Morishita, Eiji Kozuka, Chie Esaki, Kazuro Shibuya, Miyako Kobayashi, Miki Morita and Michiyo Iwase for their continuous encouragement during this study. I am also grateful to Takashi Kurosaki for his continuous guidance and support; Yutaka Arimoto, Yukichi Mano, Kensuke Teshima and Chiaki Moriguchi for their constructive comments on this study; and Takuya Baba and Kinya Shimizu for their support. I sincerely thank the anonymous reviewer for their valuable comments. I also thank Sugashi Nagai, Koei Research & Consulting Inc., and his team members for their dedicated work on data management.

Funding

All the data used for this study were provided by JICA. This research was conducted as a research proposal project of JICA Ogata Research Institute.

Supplementary materials

The dataset and R script for this study are available at the following Mendeley Data website:
https://data.mendeley.com/datasets/vmf6fbfm2n/1

Appendix A. Composition of mathematics test

Table A.1 Composition of mathematics test (baseline).
| Item No. | Item content | Cognitive domain | Cognitive skill |
|---|---|---|---|
| 1 | To add two-digit numbers with carrying to the position of tens | NO | Knowing |
| 2 | To subtract a two-digit number from a two-digit number with borrowing from the position of tens | NO | Knowing |
| 3 | To find the product of three 2’s | NO | Knowing |
| 4 | To operate division of three-digit number by three-digit number without remainder | NO | Knowing |
| 5 | To operate multiplication and division of one-digit numbers successively | NR | Knowing |
| 6 | To subtract a number with one decimal place from a number of the same type without borrowing | NO | Knowing |
| 7 | To operate division without residue of numbers with one decimal place | NO | Knowing |
| 8 | To add two proper fractions with the same denominator without reduction | NO | Knowing |
| 9 | To find the product of two proper fractions without reduction | NO | Knowing |
| 10 | To find the quotient of two proper fractions without reduction | NO | Knowing |
| 11 | To find the least number which can be divided without remainder both by 4 and by 6 | NO | Knowing |
| 12 | To find the number which gives 3 when subtracting it from 7 (the item is presented in the form of equation representing the unknown number by □) | NR | Applying |
| 13 | To find the number which gives 8 when multiplying 2 by that number (the item is presented in the form of equation representing the unknown number by □) | NR | Applying |
| 14 | To find how many times 4 is 8 (the unknown number is represented by □) | NR | Knowing |
| 15 | To find the unknown number in the table which shows the relation between the quantity of the same goods and their total price | NR | Knowing |

16 To find the unknown number in the table which shows the relation between the quantity of wor