Gender differences in the intention to study math increase with math performance

Even though females currently outnumber males in higher education, they remain largely underrepresented in math-related fields of study, with no sign of improvement during the past decades. To better understand which students drive this underrepresentation, we use PISA 2012 data on 251,120 15-year-old students in 61 countries to analyse boys’ and girls’ educational intentions along the ability distribution on math assessment tests. We analyze the percentages of boys and girls intending to pursue math-related studies or careers as a function of math performance. First, we show that for both boys and girls, there is a positive and linear relation between the probability of intending to pursue math and math performance. Second, the positive relation is stronger among boys than among girls. In particular, the gender gap in student intentions to pursue math-related studies or careers is close to zero among the poorest performers in math and increases steadily with math performance. Third, as a consequence, the gender gap in math performance, to the detriment of girls, is larger among students intending to pursue math than in the general student population.


Supplementary Note
First, we verify that differential intentions by themselves lead to girls' lower math performance and lower representation among the highly able, within the population of those (boys and girls) who intend to do math. For this purpose, we start by exhibiting a theoretical condition on girls' and boys' intentions to do math that ensures a gender gap in performance and a lower representation of girls among top performers within the population of those (boys and girls) who intend to do math, whereas this is not the case in the overall population. Then, we confirm empirically the negative effect of differential intentions alone on girls' relative performance. Second, we exploit a dataset from the High School Longitudinal Study (HSLS:09) to shed light on the validity of our conclusions for actual enrolment.

Theoretical properties of boys' and girls' pattern of intentions ensuring a negative impact on girls' relative performance and representation among those who intend to do math
We let Condition C denote the fact that the ratio between boys' and girls' conditional probabilities of intending to do math, given their math performance, is an increasing function of this performance.
Condition C corresponds to the fact that the gender gap in intentions increases with math performance, when the gender gap is measured by the ratio of boys' and girls' probability of strongly intending to do math. We have seen in the main text that this condition is satisfied by our intentions data (see Figure 2).

Proposition If boys' and girls' probabilities of intending to do math satisfy Condition C and if boys and girls have the same initial distribution of performance, then 1. the average math performance of boys among those who intend to do math is higher than that of girls; 2. among those intending to do math, the ratio of girls to boys above a given level of performance decreases with that level.
Proof By Condition C, the ratio of girls' to boys' probability of intending to do math is decreasing in performance. Among those intending to do math, the ratio of girls to boys above the next level of performance is therefore no larger than the ratio of girls to boys at the current level, hence the difference between the ratio above the current level and the ratio above the next level is positive, and the ratio of girls to boys among those intending to do math above a given level of performance decreases with that level. ∎
Note that the proof considers discrete distributions. Straightforward adaptations yield the same results with continuous distributions. Similarly, the same results can be obtained by replacing the performance with the fractile of performance. Indeed, the fractile of performance is just a nondecreasing transformation of the performance (increasing when the subdivision in fractiles is fine enough, i.e. when the number of fractiles is large enough).
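The argument for part 2 can be sketched formally as follows. The notation ($f$, $p_b$, $p_g$, $b$, $g$, $B$, $G$ and the step $h$ between adjacent performance levels) is introduced here for illustration only and is an assumption of this sketch; it covers the discrete case.

```latex
% Notation assumed for this sketch:
%   f(x)            common probability mass of performance level x (same for boys and girls)
%   p_b(x), p_g(x)  boys' and girls' probabilities of intending to do math at level x
%   h               step between two adjacent performance levels
\begin{align*}
b(x) &= p_b(x)\, f(x), & g(x) &= p_g(x)\, f(x), \\
B(x) &= \sum_{y \ge x} b(y), & G(x) &= \sum_{y \ge x} g(y).
\end{align*}
Condition C says that $p_b(x)/p_g(x)$ is increasing in $x$, so that
$g(x)/b(x) = p_g(x)/p_b(x)$ is decreasing in $x$. Hence, for every $y \ge x+h$,
$g(y) \le \frac{g(x)}{b(x)}\, b(y)$, and summing over $y \ge x+h$ gives
\[
\frac{G(x+h)}{B(x+h)} \le \frac{g(x)}{b(x)} .
\]
Using $G(x) = g(x) + G(x+h)$ and $B(x) = b(x) + B(x+h)$,
\[
\frac{G(x)}{B(x)} - \frac{G(x+h)}{B(x+h)}
  = \frac{g(x)\, B(x+h) - b(x)\, G(x+h)}{B(x)\, B(x+h)} \ge 0 ,
\]
so the girls-to-boys ratio $G(x)/B(x)$ among those intending to do math
decreases with the level $x$.
```

For part 1, one possible route (again an assumption of this sketch rather than the argument given above) is to note that an increasing ratio $b(x)/g(x)$ is a monotone likelihood ratio ordering, which implies first-order stochastic dominance of boys' performance over girls' performance among those intending to do math, and therefore a higher mean for boys.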
Empirical confirmation of the negative impact of the differential pattern of intentions alone on girls' relative performance and representation at high levels
As explained in the main text, in order to pin down the impact of the differential pattern of intentions alone, we need to net out the impact of the differences in ability distribution. We do so in panel B of Supplementary Table 7 that provides statistics after reweighting students so that girls and boys have the same ability distributions. The reweighting is done separately for each plausible value (PV) of math ability, so that we construct five sets of weights (one for each PV), compute the relevant statistics for each plausible value and its associated weight, and average the results over all PVs. For each PV, we reweight girls with a standard DFL reweighting procedure so that their ability distribution becomes similar to that of boys. In practice, we do so by applying a weighted linear regression (using PISA students' weights W) of a dummy for girls on one hundred dummies for each percentile of the ability distribution (for the relevant PV). We then take the predicted value p(girl). The final weight is W for boys and W*( (1-p(girl))/p(girl) )* ( (1-s) / s ) for girls where s is the share of girls in the sample (weighting observations by W). This whole weighting procedure is applied separately on each sample on which we provide statistics (all countries, OECD countries and non-OECD countries).
Therefore, these statistics cannot capture differences in ability distributions. Panel A of Supplementary Table 7 provides similar statistics without reweighting to allow for easy comparisons, as in Table 2. Column (a) shows a gender gap in performance, to the advantage of boys, before selection occurs in the general population (panel A), but no longer once reweighting makes boys' and girls' ability distributions alike (panel B). Comparing columns (d) and (f) of Supplementary Table 7, one can see that girls-to-boys ratios in the general population are decreasing with math performance (panel A), but no longer once distributions of math ability have been equated (panel B).
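The reweighting step described above can be illustrated with a short sketch for a single plausible value. This is not the authors' code: the function name `dfl_weights` and its arguments are hypothetical, percentile cut points are computed without weights for simplicity, and the constant rescaling of girls' weights is written as s/(1-s) so that girls' total weight is preserved (an assumption of this sketch; a constant factor does not change the shape of the reweighted distribution). Because the regressors are a full set of percentile dummies, the fitted value of the weighted regression in each percentile is simply the weighted share of girls in that percentile, which is what the code computes.

```python
import numpy as np

def dfl_weights(ability, is_girl, w, n_bins=100):
    """Reweight girls so that their ability distribution matches that of boys."""
    ability = np.asarray(ability, dtype=float)
    is_girl = np.asarray(is_girl, dtype=bool)
    w = np.asarray(w, dtype=float)

    # Assign each student to a percentile bin of the pooled ability distribution.
    edges = np.quantile(ability, np.linspace(0, 1, n_bins + 1)[1:-1])
    bins = np.searchsorted(edges, ability, side="right")

    # Weighted share of girls per bin, i.e. the prediction p(girl) of a
    # weighted regression of the girl dummy on the bin dummies.
    girl_w = np.bincount(bins, weights=w * is_girl, minlength=n_bins)
    all_w = np.bincount(bins, weights=w, minlength=n_bins)
    share = np.divide(girl_w, all_w, out=np.zeros_like(all_w), where=all_w > 0)
    p_girl = share[bins]

    # Overall weighted share of girls.
    s = np.sum(w * is_girl) / np.sum(w)

    # Boys keep W; girls get W * (1 - p) / p, rescaled by a constant
    # (assumption of this sketch: s / (1 - s), which preserves girls' total weight).
    new_w = w.copy()
    new_w[is_girl] = (w[is_girl] * (1 - p_girl[is_girl]) / p_girl[is_girl]
                      * s / (1 - s))
    return new_w
```

In a full analysis the function would be called once per plausible value and per sample, and the statistics computed with each set of weights would then be averaged over the plausible values.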
The comparison of columns (a) and (b) shows the negative impact of intentions on girls' relative math performance. In panel B, this result cannot be attributed to initial differences in abilities. We observe that the negative impact of the differential pattern of intentions itself is more marked in non-OECD countries. When we consider all PISA 2012 participating countries, differential intentions alone (when initial distributions are identical) lead to a gender performance gap among those strongly intending to pursue math studies of 3.7% of a standard deviation. This represents approximately 70% of the total increase in the gender gap in math performance observed after selection in panel A.
Columns (d) and (e) of Supplementary Table 7 then compare girls-to-boys ratios in the whole population and among those strongly intending to pursue math studies and show how the girls-to-boys ratio is strongly reduced among the latter because intentions to pursue math studies are lower among girls (factor (ii)).
Finally, the comparison of columns (f) and (e) shows the impact of selection on girls' representation among high performers. The girls-to-boys ratio among those strongly intending to pursue math studies is decreasing with math ability. In panel B, this result cannot be attributed to initial differences in abilities. We see that the decrease remains limited, so the small girls-to-boys ratio among those strongly intending to pursue math studies above any level of math ability is primarily due to the fact that, on average, intentions to pursue math studies are lower among girls.
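The mechanism at work when ability distributions are identical can be checked numerically on a toy example. The sketch below is purely illustrative: the performance grid, the common distribution and the two intention schedules are made-up numbers chosen so that the boys-to-girls ratio of intention probabilities increases with performance (Condition C); they are not estimates from PISA.

```python
import numpy as np

def selection_effects(x, f, p_boy, p_girl):
    """Mean performance and tail girls-to-boys ratios among intending students,
    when boys and girls share the same performance distribution f."""
    boys = f * p_boy     # mass of intending boys at each performance level
    girls = f * p_girl   # mass of intending girls at each performance level
    mean_boy = np.sum(x * boys) / np.sum(boys)
    mean_girl = np.sum(x * girls) / np.sum(girls)
    # Girls-to-boys ratio among intending students at or above each level.
    tail_ratio = np.cumsum(girls[::-1])[::-1] / np.cumsum(boys[::-1])[::-1]
    return mean_boy, mean_girl, tail_ratio

# Hypothetical inputs: 21 performance levels with a bell-shaped common distribution.
x = np.linspace(-2, 2, 21)
f = np.exp(-0.5 * x**2)
f /= f.sum()
p_boy = 0.10 + 0.04 * (x + 2)    # boys' intentions rise steeply with performance
p_girl = 0.10 + 0.01 * (x + 2)   # girls' intentions rise more slowly

mean_boy, mean_girl, tail_ratio = selection_effects(x, f, p_boy, p_girl)
print(mean_boy - mean_girl)      # performance gap among intending students
print(tail_ratio[0], tail_ratio[-1])
```

With these inputs the gap is zero in the whole population by construction, yet intending boys outperform intending girls on average and the girls-to-boys ratio falls as the performance threshold rises.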

High School Longitudinal Study and actual enrolment of high school students
We also exploit a dataset from the High School Longitudinal Study (HSLS:09), a US nationally representative study of the National Center for Education Statistics (NCES) designed with a focus on STEM. Note that this dataset is analyzed in (19), restricting the sample to 4-year college students.
HSLS:09 is a longitudinal study of approximately 20,000 ninth graders. Students were first surveyed in fall 2009 as ninth graders and were surveyed again 2.5 years later, in 2012, when most were spring-term eleventh graders. In the summer and fall of 2013, students or their parents responded to a survey about the student's high school completion status, postsecondary education and work plans, college application experiences, and work experiences. In addition, school personnel in base-year schools and other schools identified during data collection supplied high school transcripts for HSLS:09 students from all schools that these students had attended.
We consider as outcome the actual enrolment of high school students in a given set of courses in engineering, computer science and math, as declared in 2013. These outcomes are particularly important since the leaky pipeline starts at the high school level and high school choices shape college choices (see e.g., 14). We observe for instance in HSLS data that a much higher percentage of males than females earn any high school credit in engineering and technology (21% versus 8%). More precisely, we consider as outcome the earning of an above average number of credits in math (i.e., 3.5) as well as either more than one credit in computer science or at least one credit in engineering. We say that a student is in MECS (Math, Engineering or Computer Science) if they have earned these credits. We obtain similar results if we restrict our attention to engineering credits only or computer science credits only. As a measure of math performance, we consider the score on a math assessment that students take during 9th grade ("theta math ability score", fall 2009). As a robustness check, we also consider the math score at the first follow-up (11th grade, spring 2012). We use the longitudinal sampling weights provided by HSLS.
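The MECS definition above can be written as a small indicator function. This is a sketch under stated assumptions: the function and argument names are hypothetical (they are not HSLS:09 variable names), and "above average" is read as strictly more than 3.5 math credits.

```python
def in_mecs(math_credits, cs_credits, eng_credits, math_threshold=3.5):
    """Return True if a student counts as being in MECS.

    Assumed reading of the definition: strictly more than `math_threshold`
    math credits, plus either more than one computer science credit or at
    least one engineering credit.
    """
    enough_math = math_credits > math_threshold
    enough_tech = cs_credits > 1 or eng_credits >= 1
    return enough_math and enough_tech

# Hypothetical students: (math, computer science, engineering) credits.
print(in_mecs(4.0, 0.0, 1.0))
print(in_mecs(4.0, 1.0, 0.0))
```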
Our sample consists of 16,303 students (50% females), with available data for math scores and credits. On average, around 17% of males but only 9% of females are in MECS.
Supplementary Figure 6 presents the proportion of males, of females, and of all students in MECS as a function of the deciles of math performance. The features are very similar to those of Figure 1 in the main text. In particular, the relation is steeper for males than for females. The probability of being in MECS varies from 0.07 to 0.1 for females and from 0.07 to 0.27 for males. Moreover, math performance explains MECS better for males than for females. Indeed, the R-squared of a simple regression of MECS on math performance is much larger for males than for females (0.03 vs. 0.001).
As a result of the differential enrolment of females and males into MECS, the difference between the proportion of males and the proportion of females in MECS increases with math performance, as illustrated by Supplementary Figure 7. The sex gap is very limited and not statistically different from zero for the lowest levels of math ability but it increases steadily and linearly with math ability, from close to 0 for the lowest achievers to 15 percentage points for the highest achievers.
We fit linear models of the type:
$$\mathrm{MECS}_i = \alpha + \beta\,\mathrm{Math}_i + \gamma\,\mathrm{Female}_i + \delta\,\mathrm{Female}_i \times \mathrm{Math}_i + \varepsilon_i \qquad (1)$$
Supplementary Equation (1) captures how the probability of enrolling in MECS varies with math ability for males (coefficient $\beta$) and for females ($\beta + \delta$), assuming that the relationship is linear for both sexes. A one standard deviation increase in math performance increases males' probability to enroll in MECS by 6 percentage points, but females' probability by only 1 percentage point.
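A linear probability model of this kind, with a separate slope for each sex, can be estimated by weighted least squares. The sketch below is illustrative only: the function name, the coefficient labels and the simulated data are assumptions, and it does not compute the survey-design standard errors that an actual HSLS:09 analysis would require.

```python
import numpy as np

def fit_interaction_lpm(mecs, perf, female, w):
    """Weighted least squares of MECS on math performance, a female dummy
    and their interaction. Returns the four coefficients by name."""
    mecs = np.asarray(mecs, dtype=float)
    perf = np.asarray(perf, dtype=float)
    female = np.asarray(female, dtype=float)
    sw = np.sqrt(np.asarray(w, dtype=float))

    # Design matrix: intercept, performance, female, female x performance.
    X = np.column_stack([np.ones_like(perf), perf, female, female * perf])

    # Weighted least squares = ordinary least squares on rows scaled by sqrt(w).
    coef, *_ = np.linalg.lstsq(X * sw[:, None], mecs * sw, rcond=None)
    names = ["intercept", "slope_male", "female", "slope_gap"]
    return dict(zip(names, coef))

# Simulated example (made-up parameters, not HSLS:09 data).
rng = np.random.default_rng(0)
n = 10_000
perf = rng.normal(size=n)
female = rng.integers(0, 2, size=n)
prob = np.clip(0.17 + 0.06 * perf - 0.08 * female - 0.05 * female * perf, 0, 1)
mecs = rng.random(n) < prob

est = fit_interaction_lpm(mecs, perf, female, np.ones(n))
print(est["slope_male"], est["slope_male"] + est["slope_gap"])
```

The slope for females is the sum of the male slope and the interaction coefficient, which is why the two printed values correspond to the male and female slopes respectively.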
Supplementary Figure 8 presents the ratio between the proportion of males and the proportion of females in MECS as a function of the deciles of math performance. We observe that this ratio is indeed increasing with math performance, being close to 1 among the poor performers and reaching 2.5 for high performers.
In terms of relative performance, in the whole population and among the students in MECS, we obtain the following result. The sex gap in math performance is null among all high schoolers but it amounts to 0.25 SD to the advantage of males and is highly significant among those who choose these courses. If we consider as a measure of math performance the score at the second assessment (in 2012), the result is even stronger, with a sex gap of 0.35 SD.
These results mirror those we obtain in the main text about intentions to pursue math studies and careers. They suggest that our conclusions remain valid for actual course choices in high school in the US.

Supplementary Figure 2: Girls' lower representation at high levels of math performance among students strongly intending to do math
This figure represents the girls-to-boys ratio of students strongly intending to do math, above a given fractile of performance. See the legend of Figure 1 for more details about the fractiles of performance and the definition of "strongly intending to do math".

Supplementary Figure 3: Gender gap in intentions to do math as a function of math performance not standardized by country (3a: Girls and boys separately) (3b: Gender gap)
This figure is the analogue of Figure 1 when we consider a measure of math performance that is not standardized by country. Here, math performance is standardized to have a weighted mean equal to zero and a weighted standard deviation equal to one in the whole sample of 61 countries. See the legend of Figure 1 for other details. Confidence intervals are based on two-sided t-tests.

Supplementary Figure 4: Gender gap in intentions to do math as a function of math performance fractiles of students' gender-specific performance distributions.
This figure is the analogue of Figure 1 except that the math performance fractiles considered in this figure are gender specific. The Figure therefore compares girls and boys that are at the same rank in their gender-specific math performance distribution rather than at the same rank in the whole math performance distribution. Confidence intervals are based on two-sided t-tests.

Supplementary Figure 5: Sex gap in enrollment in math-related fields as a function of math performance. Results for France based on math performance measured at the end of high school and actual enrollment in higher education.
Performance in math is the students' math grade obtained at the Baccalauréat math test. Math-related fields correspond to university and selective higher education programs in mathematics, physics, engineering or computer sciences. The sample consists of 5,574 students (due to this smaller sample size we split it into 10 groups instead of 20 groups as we do in the main figures). Confidence intervals are based on two-sided t-tests.

Supplementary Figure 6: MECS (Math, Engineering and Computer Science in HSLS:09) for males and for females as a function of math performance
The sample consists of 16,303 students. Performance in math is the students' math score obtained at the 9th grade assessment. The Figure shows the percentage of males and of females earning above a given number of credits in high school in Math, Engineering or Computer Science by decile of math performance.

Supplementary Figure 7: Sex gap in MECS (Math, Engineering and Computer Science in HSLS:09) as a function of math performance
The sample consists of 16,303 students. Performance in math is the students' math score obtained at the 9th grade assessment. The Figure shows the difference between the percentage of males and the percentage of females earning above a given number of credits in high school in Math, Engineering or Computer Science by decile of math performance. Confidence intervals are based on two-sided t-tests.

Supplementary Figure 8: Males-to-females ratio in MECS (Math, Engineering and Computer Science in HSLS:09) as a function of math performance
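The sample consists of 16,303 students. Performance in math is the students' math score obtained at the 9th grade assessment. The Figure shows the ratio between the percentage of males and the percentage of females earning above a given number of credits in high school in Math, Engineering or Computer Science by decile of math performance.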

Supplementary Figure 11: Gender gap in students' declared interest for math as a function of math performance
Declared interest for math is an index computed from several questions. It is standardized to have a mean equal to 0 and a standard deviation equal to 1 on the whole sample of students. Confidence intervals are based on two-sided t-tests.

Supplementary Figure 12: A non-parametric investigation of the relationship between math ability (raw measure rather than fractiles) and math intentions
The Figure shows the proportion of students strongly intending to do math as a function of the math score (first plausible value only). Each dot shows the average score (x-axis) and the average proportion (y-axis) within a score bin of size 10 (e.g. all scores between 500 and 510). The bottom and top 5% of the score distribution have been excluded. The Figure shows that the relationship between the math score and intentions to do math is close to linear.
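The binning behind a figure of this kind can be sketched as follows. The function name and arguments are hypothetical, and whether the figure uses student weights is not stated in the legend, so weights are optional here and default to equal weights (an assumption of this sketch).

```python
import numpy as np

def binned_means(score, intends, bin_width=10, trim=0.05, weights=None):
    """Average score and share of intending students within score bins,
    after dropping the bottom and top `trim` share of the score distribution."""
    score = np.asarray(score, dtype=float)
    intends = np.asarray(intends, dtype=float)
    w = np.ones_like(score) if weights is None else np.asarray(weights, dtype=float)

    # Exclude the tails of the score distribution.
    lo, hi = np.quantile(score, [trim, 1 - trim])
    keep = (score >= lo) & (score <= hi)
    score, intends, w = score[keep], intends[keep], w[keep]

    # Bins of fixed width, e.g. all scores in [500, 510).
    bin_id = np.floor(score / bin_width).astype(int)
    _, inv = np.unique(bin_id, return_inverse=True)
    inv = inv.ravel()

    total_w = np.bincount(inv, weights=w)
    mean_score = np.bincount(inv, weights=w * score) / total_w   # x-axis
    share = np.bincount(inv, weights=w * intends) / total_w      # y-axis
    return mean_score, share
```

Plotting `share` against `mean_score` gives one dot per bin; an approximately straight line of dots is what a close-to-linear relationship looks like in this representation.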

Supplementary Tables
Supplementary Notes: The proportion of students who graduate in STEM comes from the UNESCO database (http://data.uis.unesco.org/#, extracted on September 22nd, 2020) for years 2010-2016. When the information was missing for a country in 2012, the closest available year was considered instead. The proportion of students who strongly intend to do math corresponds to the proportion of students in PISA 2012 that answer yes to five questions measuring their intentions to pursue a career in a math-related field or to invest more in math than in other science subjects or reading/literature.

The table shows that boys rely more on their math performance in forming their intention to study math (higher slopes and R-squared in all models), no matter how math performance and intentions to study math are standardized or measured. Analyses based on a sample of 61 countries. Math performance standardized by country is standardized to have a weighted mean equal to 0 and a weighted standard deviation equal to 1 in each country (and on the full sample of countries). Math performance standardized worldwide is an affine transformation of math ability to force the weighted mean to be equal to 0 and the weighted standard deviation to be equal to 1 on the full sample of countries. All estimates and standard errors are based on plausible values for math ability and account for measurement error in these abilities on top of standard sampling error. 95% confidence intervals in brackets. P-values based on two-sided t-tests.
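The two standardizations mentioned in these notes can be sketched as follows for a single plausible value. Function and argument names are hypothetical. The sketch also makes the parenthetical remark concrete: when every country has a weighted mean of 0 and a weighted standard deviation of 1, the pooled sample does too.

```python
import numpy as np

def weighted_standardize(x, w):
    """Affine transformation giving weighted mean 0 and weighted SD 1
    (the 'standardized worldwide' case when applied to the full sample)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    mean = np.average(x, weights=w)
    sd = np.sqrt(np.average((x - mean) ** 2, weights=w))
    return (x - mean) / sd

def standardize_by_country(x, w, country):
    """Apply the weighted standardization separately within each country."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    country = np.asarray(country)
    z = np.empty_like(x)
    for c in np.unique(country):
        mask = country == c
        z[mask] = weighted_standardize(x[mask], w[mask])
    return z
```

In an analysis based on plausible values, each plausible value would be standardized in this way and the resulting estimates combined across plausible values.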

Supplementary Table 3: Differential intentions to do math of boys and girls (from Equation (1)). Controlling for country fixed effects and using alternative models
Dependent variable is "strongly intending to do math"

Model used: Linear probability, Logit, Probit
Observations: 251,120 in all six columns
Notes: Analyses based on a sample of 61 countries. Math performance is standardized to have a weighted mean equal to 0 and a weighted standard deviation equal to 1 in each country of the sample (and on the full sample of countries). "Strongly intending to do math" corresponds to answering yes to five questions measuring PISA 2012 students' intentions to pursue a career in a math-related field or to invest more in math than in other science subjects or reading/literature. All estimates and standard errors are based on plausible values for math ability and account for measurement error in these abilities on top of standard sampling error. 95% confidence intervals in brackets. P-values based on two-sided t-tests.

Equation (1) without control variables for all countries in the PISA 2012 sample. a: In these countries, intentions to pursue math-related studies or careers do not increase significantly (at the 5% level) with math performance (in a regression that does not distinguish between girls and boys). 95% confidence intervals in brackets. P-values based on two-sided t-tests. Math ability is standardized to have a weighted mean equal to 0 and a weighted standard deviation equal to 1 in each country. Math intention is a dummy equal to one for students answering yes to a series of 5 questions capturing intentions to pursue studies or careers related to math. Estimates and standard errors are based on plausible values for math ability and account for measurement error in these abilities on top of standard sampling error.

Notes: 95% confidence intervals in brackets. P-values based on two-sided t-tests. Math ability is standardized to have a weighted mean equal to 0 and a weighted standard deviation equal to 1 in each country. Gender gaps are therefore expressed as a fraction of the standard deviation of math performance among the whole population. "Strongly intending to do math" corresponds to answering yes to five questions measuring PISA 2012 students' intentions to pursue a career in a math-related field or to invest more in math than in other science subjects or reading/literature. The last two columns provide the ratio between the number of girls and boys in the top decile of the ability distribution both in the whole population and among those strongly intending to pursue math studies. All estimates and standard errors are based on plausible values for math ability and account for measurement error in these abilities on top of standard sampling error.

Columns (1) to (3) are based on a French general survey asking both 10th and 12th graders if they would consider pursuing math, physics or computer science after high school. It also analyzes the intentions of 10th graders to enroll in a scientific track in the following year as well as their actual enrolment in such a track (in the following year, i.e., when they are in 11th grade).
Columns (4) to (7) are based on a rich survey and on administrative data on 12th graders' educational choices. Standard errors in parentheses. 95% confidence intervals in brackets. P-values based on two-sided t-tests.

Notes: Analyses based on a sample of 61 countries. Math performance is standardized to have a weighted mean equal to 0 and a weighted standard deviation equal to 1 in each country of the sample (and on the full sample of countries). "Strongly intending to do math" corresponds to answering yes to five questions measuring PISA 2012 students' intentions to pursue a career in a math-related field or to invest more in math than in other science subjects or reading/literature. The rich set of controls in column 1 includes a third-order polynomial in reading ability, a third-order polynomial in science ability, reading ability interacted with gender, science ability interacted with gender, and interaction terms between math, science and reading ability (three two-term interactions and one three-term interaction). All estimates and standard errors are based on plausible values for math ability and account for measurement error in these abilities on top of standard sampling error. Standard errors in parentheses. 95% confidence intervals in brackets. P-values based on two-sided t-tests.