Impacts of attending an inclusive STEM high school: meta-analytic estimates from five studies

Inclusive STEM high schools seek to broaden STEM participation by accepting students on the basis of interest rather than test scores and providing a program sufficient to prepare students for a STEM major in college. Almost nonexistent before the present century, these high schools have proliferated over the last two decades as a strategy for addressing gaps in STEM education and career participation. This study uses a meta-analytic approach to investigate the relationship between attending an inclusive STEM high school and a set of high school outcomes known to predict college entry and declaration of a STEM college major. Combining effect estimates from five separate datasets of students from inclusive STEM high schools and matched comparison schools, the analysis reported here used data from administrative records and survey data for 9719 students in 94 high schools to obtain estimates of the average impact of attending an inclusive STEM high school on STEM-related high school outcomes. Positive effects for inclusive STEM high schools were found for completion of key STEM courses and for likelihood that students would engage in self-selected STEM activities. Students who attended an inclusive STEM high school also identified more strongly with mathematics and science and were more likely as high school seniors to be very interested in one or more STEM careers. Importantly, these positive impacts were found for low-income, under-represented minority, and female students as well as for students overall. Attending an inclusive STEM high school appeared to have a small positive impact on science test scores for students overall and for economically disadvantaged students, but there were no discernible impacts on mathematics test scores. These findings suggest that the inclusive STEM high school model can be implemented broadly with positive impacts for students, including low-income, female, and under-represented minority students. 
Positive impacts on the odds of taking advanced mathematics and science courses in high school and on interest in entering a STEM profession are of particular importance, given the strong association between these variables and entry into a STEM major in college.


Introduction
Historically, secondary education programs to prepare students for the STEM pipeline, such as selective STEM programs and high schools or selective courses like Advanced Placement science, mathematics, and computer science in regular high schools, have targeted students who could demonstrate a high level of prior academic achievement or aptitude. Recently, however, thinking has changed about how to build America's STEM workforce. The National Academies of Sciences, Engineering, and Medicine, for example, have drawn attention to the clash between the growing need for STEM (science, technology, engineering, and mathematics) expertise on the one hand and US demographic trends on the other (National Academies, 2005). Those demographic groups most likely to pursue STEM studies and careers (middle and high socioeconomic status white and Asian males) comprise a dwindling proportion of the country's population. A 2010 report from the President's Council of Advisors on Science and Technology (PCAST) made the case for moving away from the idea that we can fulfill our needs by selecting for STEM talent to the idea that we must develop STEM talent:
[S]tudies suggest that achieving expertise is less a matter of innate talent than of having the opportunity and motivation to dedicate oneself to the study of a subject in a productive, intellectual way, and for sufficient time, to enable the brain development needed to think like a scientist, mathematician, or engineer. This has important implications for STEM education; it underscores the need to motivate students for long-term study of STEM, and points to the potential for many more students to excel in STEM. (PCAST, 2010)
President Obama's White House developed policies based on this line of thinking, including $80 million in the 2017 federal budget for the creation of "next-generation" high schools (White House Office of Science and Technology Policy, 2016).
The concept of an inclusive STEM high school (ISHS) entails (1) accepting interested students without applying admissions test score or other academic achievement criteria and (2) providing a secondary education program sufficient to prepare all of their students for a STEM major in college.
Almost nonexistent before the present century, inclusive STEM high schools (ISHSs) have proliferated over the last two decades. A 2008 survey of specialized STEM high schools identified over 100 public high schools that described their mission as preparing under-represented minority youth to successfully pursue postsecondary STEM studies (Means, Confrey, House, & Bhanot, 2008). By 2011, the Texas High School Project reported having more than 70 inclusive STEM high schools, North Carolina had at least two dozen according to the North Carolina New Schools Project, and the nonprofit research organization Battelle had teamed with partners in the states of Ohio, Tennessee, and Washington to create and support ISHSs in a STEM learning network within each state. More recently, Rogers-Chapman (2014) generated a list of 221 inclusive STEM high schools in the USA.
There is no single model or accrediting body for inclusive STEM high schools. Some arise from state initiatives, some from district-level strategies, and some from charter school networks. Descriptive studies have found considerable variation across schools that consider themselves ISHSs (LaForce, Noble, King, Holt, & Century, 2014; Lynch et al., 2018), but there are some commonalities. Inclusive STEM high schools are typically small in size (600 or fewer students), with the intent of fostering close relationships among students and between students and their teachers (Lynch, 2015). While they are public schools, most ISHSs are "schools of choice" accepting students from across a school district or geographic area. Case studies of ISHSs (LaForce et al., 2014; Lynch et al., 2018; Scott, 2012) have identified key components characterizing many of them: a rigorous STEM-focused college preparatory curriculum taken by all students; use of project- or problem-based pedagogy; an extensive network of supports for students who need assistance mastering the curriculum; incorporation of career, technology, and life skills into school activities and practices; a supportive school climate; and partnerships with external organizations to support out-of-school STEM experiences.
Published studies of the effectiveness of this high school model have used different samples and analytic approaches and have come to conflicting conclusions. Although test scores are not the best predictor of entering and persisting in STEM majors (Wang, 2013), most of the empirical research on the effectiveness of inclusive STEM schools has focused on test score impacts. Young et al. (2011) examined student outcomes for "T-STEM" high schools in Texas and found slightly but significantly higher 9th-grade math and 10th-grade math and science test scores compared to other Texas schools, after controlling for demographic and prior achievement variables. In contrast, a study analyzing achievement test outcomes for students spending 2 years in one of six Ohio ISHSs, compared to conventional high schools drawing from the same middle schools, found that only two of the six ISHSs had a positive impact on students' science achievement, with the other four having negligible or even negative impacts (Gnagey & Lavertu, 2016). A study by Saw (2017) used a statewide sample with five student cohorts, comparing test scores of students from 42 Texas T-STEM Academies with those of students from all other Texas high schools (1580 unique schools) and found a positive impact of T-STEM attendance for grade 11 mathematics achievement but not for achievement in other subject areas.
The present study contributes to research on the effectiveness of inclusive STEM high schools by (1) applying meta-analytic techniques to a large inclusive STEM high school data set with five student cohorts drawn from three different states, (2) looking at a range of high school outcomes known to be predictive of entry into a STEM college major rather than just mathematics and science test scores, and (3) examining outcomes for student subgroups under-represented in STEM (i.e., low-income, under-represented minority, and female students).

Conceptual framework
Non-test high school outcomes that predict entry into STEM in college include completion of advanced mathematics and science courses in high school (Adelman, 2006; Chen & Weko, 2009; Federman, 2007; Trusty, 2002; Wang, 2013), a high level of interest in STEM and involvement in STEM-related activities during high school (Andersen & Ward, 2014; Chang, Eagan, Lin, & Hurtado, 2011; Maltese & Tai, 2011; Regan & DeWitt, 2015), and aspiring to enter a STEM career (Legewie & DiPrete, 2014; Radunzel, Mattern, & Westrick, 2016; Tai, Liu, Maltese, & Fan, 2006). If the rationale behind the ISHS model (that it can increase the likelihood that students will become STEM majors in college) is correct, we would expect ISHSs to have a positive impact on these more proximal indicators that can be measured at the end of high school. These relationships are illustrated in the conceptual framework that guided our data collection, shown in Fig. 1.

Prior work
Our research team has been studying the relationship between attending an inclusive STEM high school and these high school outcomes since 2012. We have sought to address the policy-relevant question of whether inclusive STEM high schools implemented at scale can in fact prepare a diverse student population for STEM college majors. We have conducted parallel analyses employing propensity score weighting for five student samples drawn from North Carolina, Texas, and Ohio, three states that have large numbers of inclusive STEM high schools. Replicating studies in multiple state contexts is important if research findings are to play a role in guiding education policy. Running parallel studies in the three states allows us to observe the generality of ISHS impacts in multiple student and school samples under different conditions. Previously, we have described impacts of ISHSs for the two most mature samples in our program of research (the Class of 2013 in North Carolina and the Class of 2014 in Texas). The analyses reported here combine data from these cohorts with data from three additional student cohorts, for a total of five cohorts from three states, to obtain estimates of the average ISHS impact on the STEM-related high school outcomes in Fig. 1.
Data-sharing agreements with agencies managing state education data systems precluded combining student-level data from the different states into a single data set for analysis. But by employing a meta-analytic approach, we can increase the total sample size and provide more precise impact estimates than were available from any one of the five individual cohort studies. This is particularly useful when looking at ISHS impacts for various student subgroups, such as low-income or under-represented minority students. Tests of statistical significance within individual cohort studies are highly influenced by sample sizes, which were relatively small for subgroups of interest in some cohorts. In addition, a meta-analytic approach allows us to inspect the variability of outcome estimates across different state contexts and student cohorts. It may be that the ISHS model has positive impacts under some circumstances (e.g., strong state supports in terms of professional development for school leaders) but not others (e.g., when many of the local alternatives to STEM high schools are also schools of choice). If we observe significant heterogeneity in impacts across states and cohorts, we need to be cautious about generalizing from findings in these three states to possible initiatives in other states, and we should search for conditions or practices particular to a state or time period that can help us understand the prerequisites for effective implementation of ISHSs at scale.

The present study
The analyses reported here combine data from two student cohorts each in North Carolina and Texas along with data from a single cohort from Ohio. The findings for the two younger student samples in North Carolina and Texas are of particular interest because these students were surveyed first as 9th graders in the fall of their freshman year of high school and then as seniors in the spring before graduation, allowing us to use their grade 9 reports of STEM-related activities and interests during middle school (as well as the grade 8 achievement covariates used in analyses for all cohorts) as covariates in analyzing their high school outcomes. The analyses reported here use the combined data from all five student samples to address two research questions:
RQ 1. When findings from the study's five student cohorts are combined, do ISHSs appear to have positive impacts for the STEM course-taking, out-of-class activity, attitude, achievement, and aspirational outcomes in the inclusive STEM high school conceptual framework?
RQ 2. When findings from the study's five student cohorts are combined, do ISHSs appear to enhance STEM-related high school outcomes for low-income, under-represented minority (African American, Hispanic, and Native American), and female students?

State contexts for inclusive STEM high schools
To contextualize our meta-analysis of five student cohorts from three states, we examined aspects of the state environments and policies that could be expected to influence the way in which ISHSs were implemented. These included the demographic and geographic characteristics of their student populations, financial resources, requirements for high school graduation, strength of the K-12 education accountability system, state financial supports for establishing and supporting ISHSs, teacher union presence, charter school policy, and connections between state education and economic development policies. We addressed these issues during interviews with education policymakers at the state, regional, and local levels within each of the three states in our study.
During the timeframe of our study, North Carolina's 600 public high schools were serving a population of around 460,000 students, of whom over 40% were from an under-represented minority (predominantly African American) and half were designated as economically disadvantaged. Roughly 2 out of 5 high schools in North Carolina had been designated as in need of program improvement. In 2006, all of the state's high schools designated as in need of improvement were invited to compete for one of ten $40,000 grants issued by the State Board of Education to support a planning year for creating a new, autonomous STEM-focused high school. The resulting STEM school could be an entirely new school sharing a campus with a larger, traditional high school or it could be a conversion of the entire preexisting high school. The nonprofit North Carolina New Schools Project was designated as the professional services provider for the ISHS planning process. The New Schools Project, founded in 2003 with funding from the Bill & Melinda Gates Foundation and later known as NC New Schools, offered technical assistance services to support STEM-focused curriculum development and instruction and to connect new STEM high schools to higher education and industry partners. The state Department of Public Instruction had relatively little direct involvement in the creation of North Carolina's ISHSs. These schools emerged instead from a combination of school district initiatives and support from NC New Schools and other nongovernmental education support agencies as well as business and higher education partners. North Carolina ISHSs in our study were all district-run public schools, and eight of them were schools-within-a-school created as part of a conversion effort for a larger school previously identified as in need of improvement. The North Carolina school sample did not include any charter schools.
(At the time we were recruiting North Carolina schools for our study, North Carolina had a cap of 100 on the total number of charter schools in the state.) Another important piece of the North Carolina context was the state's receipt of one of the first Race to the Top education grants in 2010, bringing in roughly $400 million for educational improvement, including funds for learning technology and for establishing four STEM "anchor schools" focused on career areas important to the state's economy. These schools and their associated "affinity networks" were established subsequent to the school recruiting and the first round of survey data collection for our research, but prior to the second student survey for cohort 2 conducted in 2016. Race to the Top spending may have reduced the contrast between ISHS and comparison school experiences for this second cohort of North Carolina students.
During the same period, Texas had a much larger education system with 1450 public high schools serving nearly 1.5 million students. More than half of these students were designated as coming from low-income homes, and 65% were from under-represented minorities (primarily Hispanic). Texas is a charter-friendly state, but one with a strong accountability system. Interest in establishing ISHSs (which are called T-STEM Academies in Texas) arose during conversations between the governor's office and representatives from the Bill & Melinda Gates Foundation. T-STEM was envisioned as a public-private partnership from the beginning, with extensive support from both the Texas Education Agency and the Communities Foundation of Texas. The intended nature of an ISHS was more highly specified in Texas than in North Carolina. Detailed T-STEM design and implementation requirements were set forth in a T-STEM Academy Blueprint, and T-STEM Academies risked loss of their funding if they did not comply with blueprint requirements, which included serving a student body of which at least 50% were low-income and under-represented minority students. To support the effective implementation of inclusive STEM high schools, Texas established seven T-STEM Centers dispersed throughout the state to provide needs assessments and tailored technical assistance.
Ohio had the fewest public high schools of the three states, 506, serving around 520,000 students. These students were 53% low income and 35% minority. Ohio offers another example of ISHSs promoted through a partnership between a state education agency and a private entity: in this case, the Battelle Memorial Institute teamed with the Ohio STEM Learning Network (OSLN). OSLN also received support from the Gates Foundation for establishing ISHSs in Ohio. The state's strategy for supporting ISHSs was to have well-established ISHSs, such as Metropolitan High School in Columbus, Ohio, serve as models for new schools within a regional hub. Each regional hub had higher education and business/industry partners. The OSLN hubs provided technical assistance in the form of collaborations, joint classes, site visits, and educator-to-educator professional learning opportunities. The Ohio Department of Education developed a STEM school designation process, but the designation requirements were less strict than the Texas T-STEM Academy Blueprint. With respect to the student body, for example, a STEM designation in Ohio required that a school have a "racial, ethnic, socio-economic, and gender balance reflective of the region," in contrast to the Texas stipulation of explicit representation targets. During the years of our study, Ohio public schools were experiencing reductions in state funding and increased teacher accountability based on their students' test scores in reading and mathematics. Under Governor Kasich, Ohio was friendly to charter schools, and there was considerable tension between public school districts and charter proponents, with both sides claiming that state funding practices advantaged the other sector (Strauss, 2016).
A more extended treatment of the education environment and policies in the three states can be found in Young et al. (2017).

School sampling and recruiting
In each state, our recruiting process began with identifying high schools within the state that met our definition of an inclusive STEM high school and that would have both a grade 12 and a grade 9 class during the year when we planned our survey data collection. For this purpose, we defined an inclusive STEM high school as a secondary school or self-contained school-within-a-school that (a) enrolls students on the basis of interest rather than aptitude or prior achievement, (b) provides students with more intensive STEM preparation than conventional high schools do, and (c) expresses the goal of giving all its students the preparation to succeed in a STEM major in college. Following school and district or charter management organization requirements for approval of research participation, we enlisted as many inclusive STEM high schools as we could in North Carolina and Ohio. In Texas, where there were many more such schools, we continued recruiting until we had 38 willing ISHSs. Study participation in North Carolina and Texas entailed administering surveys to incoming 9th graders and to graduating seniors in the first year of a school's participation in the research and then re-surveying the first of these groups 3.5 years later when they were about to graduate. In Ohio, resources were available for just one study cohort, and surveys were administered to graduating seniors only.
Next, for each ISHS agreeing to participate in our research, we used publicly available school-level data to identify high schools without a STEM focus that served student bodies as similar as possible in terms of demographics and prior achievement profiles for their entering 9th graders. These non-STEM comparison schools were then recruited using the same research application and incentive offers extended to ISHSs. The monetary compensation for school participation depended on school size, with an honorarium of $500 for a small school (enrollment of 600 students or fewer) and $1000 for a larger school (enrollment greater than 600). Supplementary Figures 1, 2, and 3 in Appendix A-1 provide details on the stages in the recruiting and data collection process and the number of schools remaining in the sample at each stage for each of the three states.

Grade 9 student survey
The main purpose of surveying students entering high school was to obtain reports with as little time lag as possible of students' STEM-related activities and attitudes during middle school. These measures could be used as covariates in analyses of the same students' responses to the Grade 12 Student Survey that they would take in the future. Entering 9th graders were asked to identify the subject of their favorite course in middle school and to indicate whether they had participated in each of eight types of STEM-related activities, such as math and science clubs, competitions, camps, and study groups.

Grade 12 student survey
The survey for graduating seniors was designed to collect data on sociocognitive constructs highlighted in expectancy theory (i.e., science and math identity, interests, academic expectations, and self-efficacy) and on variables shown to predict entry into STEM college majors in prior empirical research. Survey items and scales addressed students' high school experiences and outcomes including STEM courses taken, extracurricular and leisure-time activities related to STEM, overall academic and STEM orientation, academic and personal supports received through their high school and at home, plans for the year following graduation, and interest in STEM careers.
Sources of items and scales for both the Grade 9 and the Grade 12 Student Surveys included the National

Administrative data
For our North Carolina cohorts, student demographic information, grade 8 test scores, high school grade point average, ACT scores, and graduation status were obtained from the North Carolina Education Research Data Center (NCERDC). With the exceptions of high school GPA and ACT scores, the same kinds of administrative data were available for students in our Texas samples from the Education Research Center at the University of Texas, Austin. For the Ohio sample, we worked directly with the Ohio Department of Education, which linked our survey files to student longitudinal records, stripped personally identifying information, and then returned the linked data sets to us for analysis.

Outcome measures
Most of the high school outcome measures in this research, such as STEM course-taking, out-of-class activity, attitude, aspirations, and mathematics and science grades, were derived from the Grade 12 Student Survey, which was essentially identical across the five individual studies. Mathematics and science achievement test measures, on the other hand, were obtained from state data systems and differed across the three states. North Carolina and Ohio had ACT scores for nearly all of the 12th graders in our samples. The Texas student data system does not include ACT or SAT scores, but it did have subject area Texas Assessment of Knowledge and Skills (TAKS) scores for students in the Class of 2014 (cohort 3). Texas students in cohort 4 (Class of 2017) did not take the TAKS test in grade 11, and no science and math test scores were available for a majority of this sample. In addition, some of the covariates in our analytic models, such as special education status or English proficiency, were operationalized slightly differently across the three states.

Analytic approach for within-state impact estimates
For analyses of each of the five cohorts, we applied propensity score weighting to make the comparison school student sample as similar as possible to the ISHS student sample in terms of students' prior achievement (mainly grade 8 achievement test scores) and demographic characteristics (including gender, ethnicity, English proficiency, parents' education, and parent employment in STEM). For cohort 2 and cohort 4 studies, where students had been surveyed previously in 9th grade, we also included two prior STEM experience variables from the 9th-grade survey (namely, whether STEM was a favorite subject and participation in STEM activities in middle school) in the propensity score weighting and as covariates.
Means et al. International Journal of STEM Education (2021)
For propensity score weighting, we first posited a logistic regression model with being an ISHS student as the outcome and included the abovementioned set of student variables as predictors. For each student, we then calculated a propensity score $p_i$, which is the probability of being in an ISHS, based on the estimated propensity score model. We weighted each comparison student by the odds of being in an ISHS, calculated as $p_i/(1 - p_i)$, and assigned a weight value of 1 to each ISHS student. These weights were used in the subsequent analysis.
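The weighting rule just described can be illustrated with a minimal sketch. It assumes the propensity scores have already been estimated; the function names and the four-student data set are hypothetical and are not taken from the study.

```python
def odds_weights(propensity, is_ishs):
    """Return one analysis weight per student.

    ISHS students get weight 1; comparison students get the odds
    p / (1 - p) of being in an ISHS, as described in the text.
    """
    weights = []
    for p, treated in zip(propensity, is_ishs):
        if not 0.0 < p < 1.0:
            raise ValueError("propensity scores must lie strictly between 0 and 1")
        weights.append(1.0 if treated else p / (1.0 - p))
    return weights


def weighted_mean(values, weights):
    """Weighted mean of a covariate, used here to inspect balance."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical data: estimated propensity scores and ISHS membership.
propensity = [0.8, 0.6, 0.5, 0.2]
is_ishs = [True, False, False, False]
weights = odds_weights(propensity, is_ishs)

# Hypothetical grade 8 scores for the three comparison students only.
comparison_scores = [70.0, 60.0, 40.0]
comparison_weights = weights[1:]
balanced_mean = weighted_mean(comparison_scores, comparison_weights)
```

Comparison students who resemble ISHS students (high $p_i$) count more heavily, so the weighted comparison mean moves toward the profile of the ISHS sample.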
Because students are clustered within high schools, our analyses used hierarchical modeling with school and student levels to compare outcomes for 12th graders in ISHSs to those of 12th graders in comparison schools, adjusting for student demographic characteristics and eighth-grade achievement scores through propensity score weighting as described above. We also adjusted for middle school STEM subject interest and STEM activities for cohorts 2 and 4.
For each set of comparisons, we posited a hierarchical model with student and school levels for the same set of outcomes. The ISHS impact was estimated at the school level. The hierarchical model for student-level outcomes took the following form:
$$Y_{ij} = \beta_0 + \beta_1\,\mathrm{ISHS}_j + (\text{student-level covariates})_{ij} + (\text{school-level covariates})_j + r_j + e_{ij}$$
where $i$ indexes students, $j$ indexes schools, $Y_{ij}$ is a student outcome, and $\mathrm{ISHS}_j$ equals 1 for students in an ISHS and 0 for students in a comparison school. $e_{ij}$ and $r_j$ are student and school random effects, $\beta_0$ is the intercept, and $\beta_1$ is the estimated ISHS impact on the student outcome. We included as student-level covariates being female, African American, Hispanic, economically disadvantaged, limited in English proficiency, special education designation, either parent having a bachelor's degree, and grade 8 mathematics, science, and reading achievement scores. We also included the eighth-grade social studies score in cohort 3 and cohort 4 since it was available in the Texas administrative data. We only included students with at least one grade 8 achievement test score in the analysis. For student-level predictors, we used multiple imputation, applying the MIANALYZE procedure in SAS to impute each missing value 5 times. Our model also incorporated school-level covariates, including urbanicity, Title I improvement status (controlling for accountability pressure), percent minority students, percent economically disadvantaged students, and average incoming students' eighth-grade test scores in the school.
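The text does not spell out how estimates from the five imputed data sets were combined. A common approach, and the one we understand SAS's MIANALYZE procedure to implement, is Rubin's rules; the sketch below assumes that approach. The function name and all numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Combine per-imputation estimates with Rubin's rules (assumed here).

    estimates: one coefficient estimate per imputed data set
    variances: the squared standard error of each estimate
    Returns (pooled estimate, total variance).
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled = mean(estimates)            # average of the m estimates
    within = mean(variances)            # average sampling variance
    between = variance(estimates)       # sample variance across imputations
    total = within + (1.0 + 1.0 / m) * between
    return pooled, total


# Hypothetical ISHS coefficients from five imputed data sets.
imputed_estimates = [0.80, 0.84, 0.88, 0.82, 0.86]
imputed_variances = [0.01, 0.01, 0.01, 0.01, 0.01]
pooled_estimate, total_variance = pool_rubin(imputed_estimates, imputed_variances)
```

The total variance exceeds the average within-imputation variance whenever the imputed estimates disagree, which is how uncertainty due to missing data enters the standard error.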

Meta-analytic approach
Because we wanted to test for average effects on outcomes for inclusive STEM high schools as a conceptual model across the three states with a total of five datasets, we performed a fixed-effect meta-analysis that calculates the average effect across the five cohorts, using the metan command in Stata. For each outcome, the weighted mean effect was computed by weighting the effect estimate for each study cohort by the inverse of its variance. For dichotomous outcomes, we used log odds estimates from logit models. In working with achievement test data, we converted ACT scores (in North Carolina and Ohio for cohorts 1, 2, and 5) and TAKS scores (in Texas for cohort 3) into standardized effect sizes using Hedges' g, and conducted a meta-analysis across the studies on the mathematics and science test score outcome constructs.
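The two computations named above, inverse-variance pooling and Hedges' g, can be sketched as follows. This is an illustration of the standard formulas, not the Stata code used in the study; the function names and input values are hypothetical, and the small-sample correction uses the usual approximation $1 - 3/(4\,df - 1)$.

```python
import math


def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance weighted mean of per-cohort effect estimates.

    Returns (pooled estimate, pooled standard error, relative weights).
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    relative = [w / total for w in weights]
    return pooled, pooled_se, relative


def hedges_g(mean_t, mean_c, sd_t, sd_c, n_t, n_c):
    """Standardized mean difference with small-sample correction."""
    df = n_t + n_c - 2
    sd_pooled = math.sqrt(((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / df)
    d = (mean_t - mean_c) / sd_pooled
    correction = 1.0 - 3.0 / (4.0 * df - 1.0)
    return correction * d


# Hypothetical log odds estimates and standard errors for two cohorts.
pooled, pooled_se, relative = fixed_effect_pool([0.5, 1.0], [0.1, 0.2])
ci_low = pooled - 1.96 * pooled_se
ci_high = pooled + 1.96 * pooled_se

# Hypothetical test score summary for an ISHS and a comparison group.
g = hedges_g(mean_t=21.0, mean_c=20.0, sd_t=5.0, sd_c=5.0, n_t=50, n_c=50)
```

The more precise cohort (smaller standard error) dominates the pooled estimate, which is why the relative weights matter when reading a forest plot.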

Results and discussion
Sample characteristics
Table 1 provides basic information on the five student samples (cohorts) in the meta-analysis, identifying their state, graduation year, high school achievement test measure, and the number of schools and students in the ISHS and comparison groups. Characteristics of the high schools in each of the five ISHS samples compared to all high schools in their state in the focus year are shown in Table 2. These school-level descriptive data characterize all the students in each school in the study, not just those students in our analytic samples. These school-level data suggest that the ISHSs in each of the five studies were serving higher proportions of low-income and under-represented minority students than were public high schools in their state as a whole, suggesting that ISHSs are indeed expanding the diversity of students exposed to a STEM-focused curriculum. The other major school-level difference apparent in Table 2 is in average school size. One of the essential components of an ISHS is the creation of a close-knit school community, which many educators believe is possible only in a school of relatively small size. Table 3 summarizes key student characteristics for each ISHS and comparison student analytic sample, both before and after propensity score weighting. Before the propensity score weighting, there were some significant differences in the characteristics of ISHS and comparison school student samples in each of the five cohorts. For example, there were more females in comparison schools than in ISHSs in cohorts 3 and 4. There was a significantly larger proportion of under-represented minority students in ISHSs than in comparison schools in cohorts 1, 2, 4, and 5. There was a higher proportion of economically disadvantaged students in the ISHS sample in cohorts 1 and 5.
Differences in grade 8 test scores were not large (which would be expected since similarity of incoming students' grade 8 test scores was a major criterion in selecting schools to recruit for the comparison sample) and favored the comparison school students and the ISHS students equally often. After propensity weighting, the comparison school student sample did not differ from the ISHS student sample for any student characteristic in any of the five studies with the exception of the percentage of females in the school in the Ohio sample (Cohort 5).

High school outcomes
Figures 2, 3, 4, and 5 display the results of the five individual cohort analyses and the meta-analysis for the set of high school outcomes in the ISHS conceptual model (Fig. 1). In each of these figures, squares to the right of the solid vertical line labeled 0 denote positive impacts on the log odds of obtaining the outcome for the student cohort; the ISHS impact for that cohort was statistically significant if the "whiskers" (demonstrating the confidence interval for the impact estimate) extending out from the square do not cross 0. The shaded diamond at the bottom of each figure shows the overall impact estimate obtained in the meta-analysis for that outcome; a dotted vertical line running through the diamond is shown to facilitate comparing impact estimates for individual cohorts to the overall impact estimate. The values for each impact estimate, its confidence interval, and its weight in calculating the overall impact are shown to the right of the figure.
Tables 4, 5, 6, and 7 provide the meta-analysis estimates of ISHS effects for the same set of outcomes for student subgroups, i.e., under-represented minority, economically disadvantaged, and female students. (Effect estimates for subgroups in the five individual student cohorts are available from the authors upon request.)
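The rule for reading the forest plots (an estimate is statistically significant when its confidence-interval whiskers do not cross 0) can be expressed as a short check. The sketch below assumes a normal approximation and 95% intervals; the standard errors are hypothetical, since the source reports intervals only in the figures.

```python
import math


def confidence_interval(estimate, se, z_crit=1.96):
    """Normal-approximation confidence interval (95% by default)."""
    return estimate - z_crit * se, estimate + z_crit * se


def two_sided_p(estimate, se):
    """Two-sided p-value for H0: effect = 0 under a normal approximation."""
    z = abs(estimate / se)
    return math.erfc(z / math.sqrt(2))


def whiskers_exclude_zero(estimate, se):
    """True when the interval lies entirely on one side of 0."""
    low, high = confidence_interval(estimate, se)
    return low > 0 or high < 0


# Hypothetical (estimate, SE) pairs on the log-odds scale.
for label, est, se in [("outcome A", 0.84, 0.20), ("outcome B", 0.19, 0.20)]:
    low, high = confidence_interval(est, se)
    print(f"{label}: CI = ({low:.2f}, {high:.2f}), "
          f"p = {two_sided_p(est, se):.4f}, "
          f"significant = {whiskers_exclude_zero(est, se)}")
```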

Course-taking
As shown in Fig. 2, ISHSs appear to make a difference in the level of mathematics courses students take in high school. Because students need to be ready for calculus when they enter college if they are to complete a STEM college major within 4 years, completion of precalculus or calculus in high school is an important outcome. The estimated ISHS effect on this outcome was positive with log odds of .84, p < .001, corresponding to an odds ratio (OR) of 2.3. This odds ratio suggests that attending an ISHS more than doubles the odds that a student will complete precalculus or calculus while in high school. In addition, there was a significant impact across the five cohorts on the odds of having taken chemistry in high school (log odds = .94, OR = 2.6, p < .001), but there was no significant impact on the odds of completing physics (log odds = .19, OR = 1.2, p > .05). Attending an ISHS appeared to increase the odds that a student would complete some kind of technology course in high school (log odds = .47, OR = 1.6, p < .01) and to have a very large impact on the odds of taking an engineering course (log odds = 2.29, OR = 9.9, p < .001). Importantly, every ISHS impact estimate for course-taking that was significantly positive for students overall was also significantly positive for under-represented minority, economically disadvantaged, and female students, as shown in Table 4.
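The odds ratios reported above follow directly from exponentiating the log odds. The short sketch below reproduces that conversion for the course-taking estimates and also shows, for a purely hypothetical baseline completion rate, how an odds ratio translates into a change in probability; the baseline value is an assumption for illustration and does not come from the study.

```python
import math


def odds_ratio(log_odds):
    """Convert a log-odds effect estimate into an odds ratio."""
    return math.exp(log_odds)


def shifted_probability(baseline_prob, log_odds):
    """Probability implied by applying a log-odds effect to a baseline rate."""
    baseline_odds = baseline_prob / (1 - baseline_prob)
    new_odds = baseline_odds * math.exp(log_odds)
    return new_odds / (1 + new_odds)


# Log odds as reported in the text for students overall.
reported_log_odds = {
    "precalculus or calculus": 0.84,
    "chemistry": 0.94,
    "physics": 0.19,
    "technology": 0.47,
    "engineering": 2.29,
}
for course, b in reported_log_odds.items():
    print(f"{course}: OR = {odds_ratio(b):.1f}")

# Hypothetical baseline: if half of comparison students completed the course,
# an OR of about 2.3 raises the completion rate to roughly 70%, not to 115%.
print(f"{shifted_probability(0.5, 0.84):.3f}")
```

An odds ratio therefore multiplies the odds, not the probability, so the implied change in completion rates depends on how common the outcome is in the comparison group.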
In summary, ISHS attendance appeared to affect the STEM courses students completed in high school. Specifically, positive ISHS effects were found for mathematics (completing calculus or precalculus), chemistry, technology courses, and engineering. These positive impacts were found for low-income, under-represented minority, and female students as well as for the ISHS student sample as a whole. The enrollment of low-income, under-represented minority, and female students in such advanced mathematics and science classes within ISHSs contrasts with reports of their typically low participation rates in these courses in US high schools (see https://ocrdata.ed.gov/). These findings for 77 ISHS senior classes across three states appear to confirm the conclusions of earlier qualitative research on ISHSs suggesting that they provide a more rigorous STEM curriculum than do regular high schools (Lynch et al., 2018).
Table note: We approximated school-level enrollment for ISHSs that are schools-within-a-school without their own school code in the state data system by extrapolating from the number of students on grade 12 and grade 9 rosters collected as part of student survey administration. For demographic variables, we used state data for the larger schools within which these schools were contained.
Table note: Under-represented minority students are African American, Hispanic/Latino, or Native American.

STEM-related activities and attitudes
Figure 3 presents the analytic results for student reports of their participation in STEM activities outside of courses and their attitudes toward mathematics and science. Students who attended ISHSs reported participating in more STEM extracurricular activities overall (estimated difference = .28 on a 4-point scale, p < .001) and engaging in more self-selected STEM-related activities outside of school, such as visiting a science museum (estimated difference = .13 on a 4-point scale, p < .001). Again, the positive ISHS impact estimates obtained for students overall were seen also for under-represented minority, economically disadvantaged, and female students, as shown in Table 5. The level of engagement in STEM activities for these groups of students contrasts with reports of the lower participation rates of under-represented minority and female students in out-of-school STEM activities nationally (see, for example, the responses of 12th graders on the questionnaire administered with the National Assessment of Educational Progress available at https://www.nationsreportcard.gov/sq_students_views_2015/). However, the ISHS impacts on these outcomes for cohorts 2 and 4, for which we were able to control for middle school engagement in STEM activities, were not statistically significant with the exception of participation in informal STEM activities by the Texas Class of 2017 (cohort 4), suggesting that this differential inclination to engage in voluntary STEM activities may have pre-dated entry into an ISHS.
In terms of students' attitudes toward science and mathematics, the ISHS experience appears to have a positive influence on students' affinity for these subjects but not on their confidence in their ability to do well in them. ISHS students are more likely than comparison school students to report that their favorite high school subject was in a STEM area (log odds = .52, OR = 1.7, p < .001), and the meta-analysis effect estimates were significantly positive also for under-represented minority, low-income, and female students (see Table 5). ISHS seniors also expressed a stronger identity as a science person (estimated difference = .16 on a 4-point scale, p < .001) and as a mathematics person (estimated difference = .11 on a 4-point scale, p < .001). These positive impacts too held for the three student subgroups. In contrast, ISHS students' sense of efficacy in science and mathematics (expectation that they can do well in the subject) was no higher than that of comparison school students in the overall meta-analysis or in any of the five cohorts (not shown in figure), nor was it significant for any of the subgroups in the meta-analysis (see Table 5).
While it seems clear that ISHS students demonstrate a stronger sense of identification with both science and mathematics than do their peers in other high schools, it may be that this heightened interest is something they brought to their high schools. In those analyses where we were able to control for the extent to which students identified with science and mathematics when they began high school (i.e., in cohorts 2 and 4), the ISHS impact estimate was significantly positive in one case (cohort 2) but not in the other (cohort 4). It makes sense that students who identify with mathematics and science in middle school are more likely to choose to attend an ISHS, and the mixed findings for cohorts 2 and 4 leave uncertainty as to whether attending an ISHS deepens that sense of identity.
Our analyses indicate that ISHS attendance did not enhance students' sense of self-efficacy in mathematics or science relative to that of their peers attending other kinds of high schools. It should be remembered, however, that students tend to take more advanced math and science courses within ISHSs, and it may well be that ISHS students have a better understanding than students taking less advanced courses do of what they do not know. While self-efficacy is regarded as an important predictor of high school course-taking and postsecondary engagement and success in STEM studies in several theoretical models (Lent, Brown, & Hackett, 1994; Simpkins, Davis-Kean, & Eccles, 2006), results from some studies and from international assessment programs suggest that achievement and sense of efficacy do not necessarily go hand in hand (Andersen & Ward, 2014; Chiu, 2017; Maltese & Tai, 2011).

STEM achievement and standardized test scores
Estimates of the ISHS impacts on science and mathematics achievement test scores and self-reported grades are shown in Fig. 4. Although impact estimates did not reach statistical significance for individual cohorts, the overall ISHS impact estimate in the meta-analysis was significantly positive for science achievement test scores (g+ = .12, p < .05). The student subgroup impact estimates in Table 6 show that the positive relationship between ISHS attendance and science test scores is found for economically disadvantaged students as well (g+ = .13, p < .05) but fails to attain statistical significance for under-represented minority (g+ = .10) or female students (g+ = .11). The meta-analysis found no ISHS impact on mathematics test scores (g+ = .02). Nor was there a statistically significant relationship between ISHS attendance and math test scores for any of the student subgroups. In summary, ISHS attendance appeared to have a small positive impact on science test scores for students overall and for economically disadvantaged students. There were no discernible impacts on mathematics test scores in any of the meta-analyses.
Achievement can also be measured by course grades, and this was one area where the pattern of statistically significant ISHS effects differed for student subgroups compared to the overall student sample. The meta-analysis found that for students overall the likelihood of earning high grades (all As or As and Bs) in science and mathematics classes was not significantly greater for ISHS students than for students from comparison high schools (log odds = .28, OR = 1.3 for science and log odds = .17, OR = 1.2 for mathematics). But there were significant ISHS advantages for some subgroups, as shown in Table 6. Under-represented minority students were more likely to report earning high grades in science classes if they attended an ISHS (log odds = .37, OR = 1.4, p < .01). Economically disadvantaged students were more likely to report earning high grades in both science (log odds = .40, OR = 1.5, p < .001) and mathematics classes (log odds = .40, OR = 1.5, p < .01) if they attended an ISHS. The meta-analysis findings for the other three combinations of subgroup and grades were null.
Thus, students overall did not report earning higher grades in science and mathematics classes if they attended an ISHS, but again, ISHS students were more likely than their peers in comparison schools to be taking advanced courses in these subjects. Economically disadvantaged students did report earning higher grades in both science and mathematics classes if they attended an ISHS. In addition, under-represented minority students were more likely to report earning high grades in science if they attended an ISHS. One of the tenets of the ISHS model is that all students, including those who are under-represented minority students, can excel in STEM, and this finding is congruent with prior research showing that high expectations enhance academic achievement (Hattie, 2009). It appears that ISHS attendance offers some enhancement of STEM course performance among the kinds of students this educational innovation was intended to benefit.
Table note: Under-represented minority students are African American, Hispanic/Latino, or Native American; ISHS = inclusive STEM high school; SE = standard error. ISHS differs from comparison school sample at *p < .05, **p < .01, and ***p < .001.
Education and career aspirations
Figure 5 displays the impact estimates for three key variables related to likelihood of entering and completing a STEM college major. The first of these is going directly to a 4-year college in the fall after high school graduation. Many low-income students and students of color begin their postsecondary work at 2-year colleges. Unfortunately, statistics show that students who start at a 2-year college, like those who delay college entry altogether, have a lower probability of ever earning a bachelor's degree. Attending an ISHS did not increase the odds of reporting the intent to go directly into a 4-year degree program for students overall or for low-income or under-represented minority students. One exception to this overall pattern was a significant relationship between ISHS attendance and planning to enter a 4-year college the next fall for female students (log odds = .33, OR = 1.4, p < .05).
There was a positive ISHS impact on our other measure of educational aspiration. Students who had attended an ISHS were more likely to report that they expect to earn a master's or higher degree (log odds = .40, OR = 1.5, p < .01), and this significant effect was found for under-represented minority, low-income, and female students as well, as shown in Table 7.
Finally, ISHS students were more likely to report that they were very interested in entering a STEM career (log odds = .40, OR = 1.5, p < .001). This latter positive impact was found not only for the combined meta-analytic sample but also for four of the five cohorts, including the two cohorts for which the statistical model controlled for STEM interest during middle school. The positive ISHS effect on STEM career interest was found also for under-represented minority (log odds = .42, OR = 1.5, p < .001), economically disadvantaged (log odds = .40, OR = 1.5, p < .001), and female students (log odds = .43, OR = 1.5, p < .001). This finding is important because interest in a STEM career at the end of high school is a strong predictor of entering and persisting in a STEM major in college (Radunzel et al., 2016).
In summary, ISHS attendance had a positive impact on several key measures of STEM aspirations. Students who attended an ISHS were more likely to expect to earn a graduate degree and were more likely to be very interested in one or more STEM careers. These positive ISHS effects were found in the three student subgroup meta-analyses as well.

Tests of sensitivity and heterogeneity of effects
Additional analyses were run to examine the sensitivity of our findings to choice of analytic model and to differences in state context or timeframe. For cohorts 2 and 4 (students who had taken a survey in grade 9 as well as grade 12), we conducted a sensitivity analysis by modeling the high school outcomes without using the two middle school STEM experience indicators (prior STEM interest and activity) as covariates. We found that, after controlling for all of the student-level characteristics used in all five studies, adding controls for prior STEM interest and activity did not change any inferences about ISHS impacts on grade 12 outcomes.
To examine the sensitivity of our results to the choice of a fixed-effect model, we also conducted a random-effects meta-analysis and found the results to be quite similar to those of the fixed-effect analysis, with the direction and statistical significance of all the effect estimates remaining the same.
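A random-effects check of this kind can be sketched as follows. The source does not name the estimator it used, so the DerSimonian-Laird method is assumed here as one common choice; the function name and the input numbers are hypothetical.

```python
import math


def dersimonian_laird(estimates, variances):
    """Random-effects pooled estimate using the DerSimonian-Laird tau^2.

    Returns (pooled estimate, standard error, tau^2).
    """
    k = len(estimates)
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * e for wi, e in zip(w, estimates)) / sw
    # Cochran's Q: weighted squared deviations from the fixed-effect mean.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, estimates))
    c = sw - sum(wi ** 2 for wi in w) / sw
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - (k - 1)) / c)
    w_re = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_re, estimates)) / sum(w_re)
    se = math.sqrt(1 / sum(w_re))
    return pooled, se, tau2


# Hypothetical log-odds estimates and variances for five cohorts.
pooled, se, tau2 = dersimonian_laird(
    [0.9, 0.7, 0.6, 1.0, 0.8], [0.04, 0.09, 0.02, 0.10, 0.06]
)
print(f"pooled = {pooled:.3f}, SE = {se:.3f}, tau^2 = {tau2:.3f}")
```

When the estimated between-study variance is zero, the random-effects weights reduce to the fixed-effect weights and the two analyses coincide, which is one way results from the two models can end up very similar.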
Given the lack of a definitive, national model of what an inclusive STEM high school is and the likelihood that both the choices of individual school leaders and communities and the policies of different states will affect ISHS designs and the ways they operate, we wanted to assess the consistency of the ISHS impacts across the five study cohorts. We conducted tests of the heterogeneity of the distribution of the effect estimates using Cochran's Q for each outcome variable, and did not detect statistically significant heterogeneity for any of them. We also looked at I² values for the overall sample, which represent the percentage of total variation across studies that is due to heterogeneity rather than chance. All outcomes had I² values less than 25% except for completed calculus or precalculus (36%), completed chemistry (42%), and science self-efficacy (30%). Heterogeneity in the first two of these variables was likely related to differences in state course-taking requirements, as discussed below.
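The two heterogeneity statistics mentioned above can be computed from the same per-cohort estimates and variances used for pooling. The sketch below uses the standard definitions; the function names and the example inputs are hypothetical and are not the study's data.

```python
def cochran_q(estimates, variances):
    """Cochran's Q and its degrees of freedom (number of studies minus 1)."""
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, estimates)) / sum(w)
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, estimates))
    return q, len(estimates) - 1


def i_squared(q, df):
    """I^2 as a percentage: share of total variation beyond chance."""
    if q <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100


# Hypothetical log-odds estimates and variances for five cohorts.
q, df = cochran_q([0.9, 0.7, 0.6, 1.0, 0.8], [0.04, 0.09, 0.02, 0.10, 0.06])
print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared(q, df):.0f}%")
```

Q is judged against a chi-square distribution with df degrees of freedom. With only five cohorts that test has limited power, which is one reason to report I² alongside it.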

Conclusions
Across the five study cohorts, our data show first that, as intended, ISHSs attract students from groups under-represented in STEM. The proportion of ISHS students who came from low-income homes was 63% and the proportion from under-represented minorities was 69% across the five study samples. Moreover, the proportions of under-represented minority and low-income students in the ISHSs exceeded those in public high schools in their states as a whole in every state and for every study cohort.
Our findings suggest that nonselective STEM-focused high schools may increase the likelihood that students, including those from groups under-represented in STEM, will leave high school with stronger STEM academic experiences and greater interest in STEM careers than they would have had if they had attended secondary programs without a STEM focus. However, we acknowledge the limitations of propensity score modeling as a basis for causal inference and the possibility that our models did not fully control for a greater initial interest in STEM careers on the part of students who self-selected into ISHSs.
While very consistently positive, the size of the ISHS impact estimates varied for different kinds of outcomes. They appear large for STEM course-taking, STEM identity, and interest in pursuing a STEM career; moderate for general education aspirations; small for science achievement test scores; and absent for mathematics achievement scores and STEM self-efficacy.
These findings have important implications for education policy. They suggest that the inclusive STEM high school model can be implemented broadly with positive impacts for students, including low-income, female, and under-represented minority students. These findings underscore the assumption expressed in the PCAST report that a much broader cross-section of students can experience sustained, advanced instruction in STEM if given the opportunity and suitable support structures.
An important question is whether the ISHS impacts are large enough to have practical import. In particular, policymakers would want to know whether attending a STEM high school increases the likelihood of entering and completing a STEM degree program in college. Postsecondary outcome data were not available for most of the five cohorts included in this meta-analysis. But, as noted above, strong interest in a STEM career at the end of high school predicts entry into a STEM major in college. In addition, as reported elsewhere (Means, Wang, Wei, Iwatani, & Peters, 2018), we have analyzed postsecondary data for cohort 3 and found that the odds of being in a STEM bachelor's degree program 2 years after finishing high school were nearly triple for these Texas ISHS graduates compared to matched graduates of comparison high schools.
These findings also suggest that the ISHS model that emerged over the last 15 years is robust enough to yield similar positive outcomes across a wide range of state contexts. The five cohorts in our analyses incorporate 77 ISHS school samples from three states and four graduation years. As described earlier, student demographics, education policies, and the specific strategy for starting and supporting inclusive STEM high schools varied across the three states. Nevertheless, the overall picture presented by the impact estimates in Figs. 2, 3, and 5 is one of consistent impacts across different contexts and graduation cohorts. For most of the 21 high school outcomes, ISHS impact estimates for all five study cohorts were positive in direction. While the size of the positive impact and of the standard error (and therefore the significance level) differed from sample to sample, the consistency in the direction of the effect suggests that inclusive STEM high schools with the characteristics shown in Fig. 1 do typically enhance STEM course-taking and career interest across a range of different state contexts. The consistency of the direction of impacts observed across the five cohorts suggests that despite lack of any national accrediting process or control of the ISHS model, the emergent practice is consistent enough that impacts are equivalent across a range of different state contexts. Several factors may have contributed to the observed consistency across state contexts.
First, all three of the states where we conducted studies received funding from the Bill & Melinda Gates Foundation to support the creation of inclusive STEM high schools. Funding and ideas came from organizations within each state (legislature, governor's office, education department, science and technology organizations, local foundations) as well, but the Gates Foundation investment was certainly an impetus for starting this work at scale and came with a set of core ideas about the need for new designs for small high schools promoting rigor, relevance, and relationships for students from underserved communities (Gates, 2005). Comparing our findings for inclusive STEM high schools within North Carolina, Ohio, and Texas to those in other states as described by other research teams (LaForce et al., 2014; Lynch et al., 2018; Scott, 2012) does not suggest that there are systematic differences between the two, but we need to acknowledge the role of the Gates Foundation and the legitimacy of the question of whether implementation of these schools would have been more variable absent the foundation's involvement in early planning for all three state initiatives.
Another likely contributor to the consistency of ISHS impacts across states was our use of phone interviews to screen potential schools for our ISHS sample to make sure they really were nonselective and had a schoolwide STEM-focused program that all students were expected to complete. Some schools have rebranded themselves as STEM without making any substantive changes in expectations, curriculum, or pedagogy (Eisenhart et al., 2015; Weis et al., 2015), or they involve some but not all of their students in intensive STEM coursework. Our screening of potential study schools was designed to exclude such superficial school reform efforts from our study samples, but it may have screened out some variants of broad-access STEM schools and programs related to different state policies and incentives (e.g., around career technical education pathways).
One high school outcome category that did seem to be sensitive to state context effects was STEM course-taking. By virtue of the way impacts are estimated, the estimated ISHS impact on likelihood of taking a specific STEM course was influenced both by practices in ISHSs and by practices in non-STEM high schools. The latter can change over time as a result of state policy initiatives or other educational trends. For example, the "4 X 4" policy operating in Texas from 2007 to 2014 meant that in those years every high school student had to take 4 years of science and 4 years of math to graduate, and this policy likely reduced the ISHS impact on science and math course-taking for cohort 3. The Texas state legislature repealed this policy during the second year of high school for cohort 4.
In summary, these meta-analysis findings provide a positive example of an equity-oriented educational improvement effort with measurable positive impacts. Cohen and Mehta (2017) argue that the weak central control and loosely coupled nature of American public education make system-wide change in core instruction difficult but do open up possibilities for more limited, niche reforms that deviate from usual practices with respect to teaching and learning. Inclusive STEM high schools appear to be one such niche reform, manipulating curriculum, instructional practices, expectations for nondominant student subgroups, and school size and culture in ways that in combination pay off in terms of high school outcomes. The ISHS data suggest that regardless of their demographic background, students who have an interest in STEM can benefit from a rigorous STEM-focused curriculum if provided with the kinds of instruction and supports emphasized in the ISHS model. The next critical question for policy and practice is whether this kind of educational approach can travel beyond its niche, becoming something that low-income,