Investigating spatial skills and math anxiety as mediators in a sequential mediation model: A pilot study

Prior research has shown gender effects on spatial ability, math anxiety, and math achievement. Lacking, however, is a comprehensive study that tests the mediating effects of spatial ability and math anxiety between gender and math achievement in a sequential mediation model. To fill this gap, this pilot study tested two mediation relationships: one with spatial ability as the mediator, gender as the predictor, and math anxiety as the outcome variable; the other with math anxiety as the mediator, spatial ability as the predictor, and math achievement as the outcome variable. In addition, the study used canonical correlation to test the relative strengths of the relationships between specific spatial skills (perspective-taking, spatial imagery, and mental rotation) and collegiate math achievement (trigonometry, calculus, and linear algebra). Lastly, gender differences in spatial skills, math anxiety, and math achievement were investigated. The independent-samples t-tests revealed none of the well-documented gender differences in spatial ability. Canonical correlation analysis showed that a single canonical variate is sufficient to account for the math-spatial relationship. The sequential mediation model, with spatial ability and math anxiety serving as the mediators, fitted reasonably well. However, none of the mediation effects was statistically significant. Implications of these findings and future directions for this research are discussed.


Introduction
While gender differences in spatial ability (Lauer, Yhang, & Lourenco, 2019; Voyer, Voyer, & Saint-Aubin, 2017) and math anxiety (Van Mier, Schleepen, & Van den Berg, 2019; Wigfield & Meece, 1988) are well documented, the mediating mechanisms between gender and math anxiety, and between spatial ability and math achievement, have not been formally tested in a sequential mediation model. Furthermore, prior research has focused on elementary and secondary school students, with little attention paid to the college population. From the perspective of lifespan development, it is theoretically interesting to understand the continuity of the math-spatial relationship from childhood to adulthood.
The present study fills these gaps by testing the mediational relationships among gender, spatial ability, math anxiety, and math achievement in a college population. The study also investigates the relative strengths of perspective-taking, spatial imagery, and mental rotation in predicting trigonometry, calculus, and linear algebra performance, because not all spatial skills are equally implicated in math problem solving. Spatial skills that involve actively transforming mental images have been shown to be most strongly implicated (Cheng & Mix, 2014; Mix & Cheng, 2012; Mix, Levine, Cheng, Stockton, & Bower, 2021). Likewise, certain math problems may depend more heavily on spatial skills than others. By exploring the canonical correlations among subsets of spatial skills and college-level math skills, the present study sheds light on the relative strengths of the relationships between various types of spatial and math skills in a college population. Lastly, the present study compares gender differences in spatial skills, math anxiety, and math skills in a piecemeal fashion in order to pinpoint sources of gender differences in the proposed chain of effects.
http://journals.ums.ac.id/index.php/jramathedu

Spatial skills
Spatial skills include the ability to find one's way in complex environments (Wolblers & Hegarty, 2010), to generate, transform, or imagine spatial relationships, and to perceive line orientations and simple shapes in distracting visual fields (Linn & Peterson, 1985). Prior research distinguishes two main categories of spatial skills: small-scale and large-scale (Wang, Cohen, & Carr, 2014). Large-scale spatial skills involve perspective-taking when navigating in real or virtual environments. Small-scale spatial skills involve mentally rotating objects (often smaller in dimension than the self) along an allocentric frame of reference. There is empirical evidence that large- and small-scale spatial skills are partially dissociated (Hegarty et al., 2006; Wang, Cohen, & Carr, 2014). The two types of spatial skills have distinct neurological underpinnings (Wolblers & Hegarty, 2010; Zacks, 2008).
Gender differences in spatial skills are well documented (see Lauer, Yhang, & Lourenco, 2019). The gender effect is strongest for mental rotation, where it favors males (Linn & Peterson, 1985; Voyer, Voyer, & Bryden, 1995). On average, males also tend to have greater visuospatial working memory (VSWM) capacities, though the magnitude of this gender effect is smaller than that for mental rotation (see Voyer et al., 2017). Overall, gender differences in small-scale spatial ability are well supported. Mental rotation, a type of small-scale spatial ability, shows the strongest gender difference, favoring males. In terms of developmental trajectory, gender differences in mental rotation emerge as early as 5 months of age in some studies (Moore & Johnson, 2008). Individual differences in spatial ability remain relatively stable throughout the early elementary school years (Moretensen, Andresen, Kruuse, Sanders, & Reinisch, 2003) and peak in adolescence (Linn & Peterson, 1985; Voyer, Voyer, & Bryden, 1995). Prior research indicates that low spatial ability may hinder math problem-solving (Geary, 2004; Rotzer, Loennecker, Kucien, Martin, Klaver, & van Aster, 2009).

Math anxiety
Math anxiety has been defined in many ways in the literature. It can be deduced from these definitions that math anxiety is a negative emotional state that some individuals experience when working with numbers (Suárez-Pellicioni, Núñez-Peña, & Colomé, 2016). Research suggests that this negative emotional state can compromise numerical task performance. Math anxiety resembles phobias (Ashcraft & Ridley, 2005). There are physiological markers, e.g., increases in heart rate, which indicate that math anxiety is evoked by math-related situations (Faust, 1996). Although at the behavioral level math anxiety seems to be specific to math-related situations (Stein, Simmons, Feinstein, & Paulus, 2007), at the neural level, brain structures that are involved in other forms of anxiety are also activated. These brain regions include the amygdala (Paulus & Stein, 2006), prefrontal areas (Bishop, 2009), the bilateral inferior frontal junction (Lyons & Beilock, 2012), and the insula (Suárez-Pellicioni, Núñez-Peña, & Colomé, 2016).
On average, females tend to report a higher level of math anxiety than their male counterparts in math-related situations (see a meta-analysis by Else-Quest, ...). Wilder (2012) reported that female first- and second-year undergraduate students who were enrolled in math classes reported a significantly higher level of math anxiety than male students. Wigfield and Meece (1988) assessed math anxiety in children from grades 6 to 12. The results showed that girls had a stronger negative reaction, but no greater worry proneness, than boys when working with numbers.
There is some evidence that spatial ability and math anxiety may be inversely related. Ferguson, Maloney, Fugelsang, and Risko (2015) showed that spatial ability is inversely related to math anxiety. In other words, higher spatial ability is predictive of lower math anxiety. Because males, on average, have higher spatial ability and experience lower math anxiety than females, it is possible that high spatial ability mitigates the gender effect on math anxiety (i.e., the positive association between being female and the level of math anxiety experienced). Maloney, Waechter, Risko, and Fugelsang (2012) showed that spatial ability mediates the relationship between gender and math anxiety. Though correlational by design, the Maloney et al. (2012) study suggests that poor spatial ability may predispose one to heightened math anxiety. Most recently, Sokolowski, Hawes, and Lyons (2019) tested a mediational model among gender, spatial ability, and math anxiety. Spatial ability was found to mediate the relationship between gender and math anxiety at a statistically significant level.

Math achievement
The math content areas investigated in the present study refer to a standard college-level math curriculum that was in place at a regional comprehensive university in the Midwest region of the United States. In the judgment of content experts in college mathematics and math education, three content domains of math achievement that a typical college-level math curriculum covers are trigonometry, calculus, and linear algebra (matrix operations). To our knowledge, the majority of studies investigating gender differences in math achievement have focused on pre-college math content, and few have investigated gender differences in math achievement at the college level. Therefore, the review of literature on gender differences in math achievement is dominated by studies at the early grade levels, even though the present study focuses on college-level math achievement.
A few trends emerged from this area of research: 1) Gender differences in math achievement are greater in more distant studies than in more recent studies; 2) Gender differences in math achievement differ across different content areas of mathematics; 3) Gender differences in math achievement differ at different developmental stages; and 4) Gender differences in math achievement differ across cultural contexts.
Historically, studies investigating gender differences in math achievement indicate a male advantage (Hyde, Fennema, & Lamon, 1990), with more distant studies showing greater gender differences than more recent ones (Devine, Hill, Carey, & Szüc, 2018; Hyde, Lindberg, Linn, Ellis, & Williams, 2008; Lindberg, Hyde, Petersen, & Linn, 2010; Scheiber, Reynolds, Hajovsky, & Kaufman, 2015). For instance, Hyde et al. (1990) meta-analyzed 100 studies (totaling 254 independent effect sizes). The meta-analysis concluded that gender differences in math achievement have declined over the years. The magnitude of this decline can be quantified as a drop in effect size from .31 for studies published before 1973 to .14 for studies published thereafter.
The literature also reveals different patterns of gender differences in math achievement across math content areas. For instance, basic numerical skills (Hutchison, Lyons, & Ansari, 2019) and simple arithmetic problem-solving show few gender differences, whereas complex problem-solving shows greater gender differences (Lindberg, Hyde, Petersen, & Linn, 2010). On occasion, females have been shown to have a slight advantage in solving computation problems (Lindberg et al., 2010).
There is also evidence supporting developmental differences in math achievement. Gender differences in math achievement are more pronounced in the high school years (Lindberg et al., 2010) and are either non-existent or barely noticeable in the elementary and middle-school years (Hutchison et al., 2019; Hyde et al., 2008).
Some have noted that sociocultural factors may also contribute to gender differences in math achievement (Hyde & Mertz, 2009). More labor-intensive countries showed a larger gender gap in math achievement, especially at the top level. More gender-egalitarian countries show fewer gender differences in math achievement than countries with a lower degree of gender egalitarianism (Penner, 2008).
Math anxiety has been shown to be negatively related to math achievement, though the direction of influence is not entirely clear. On the one hand, math anxiety may hinder an individual's ability to solve math problems, possibly through disrupting limited working memory resources (Ashcraft & Kirk, 2001) that could otherwise be dedicated to math problem-solving. On the other hand, poor math performance may trigger math anxiety, thereby reinforcing the negative feedback loop between the two factors.

The present study
The present study tested gender differences on three spatial tests, one measuring perspective-taking ability, a large-scale spatial ability, and two measuring spatial imagery and mental rotation, which are small-scale spatial abilities, in college students majoring in math-intensive fields. Therefore, the first three research questions are whether there are gender differences in performance on the Perspective Taking Ability (PTA) test, the Spatial Imagery Ability (SIA) test, and the Mental Rotation Test (MRT). The next three research questions pertain to whether gender differences exist in performance on the calculus (Math Test 1), trigonometry (Math Test 2), and linear algebra (Math Test 3) tests. Next, the canonical correlation between the spatial ability composite, which comprises PTA, SIA, and MRT, and the math achievement composite, which comprises Math Tests 1-3, was explored. Lastly, as an extension of the Maloney, Waechter, Risko, and Fugelsang (2012) study, which investigated the relationship between gender and math anxiety as mediated by spatial ability, the present study introduced an additional component, math achievement, into the sequential mediation analysis. Specifically, the sequential mediation model can be summarized in the following logic diagram: gender → spatial ability → math anxiety → math achievement.

Participants
Participants were 52 college students (24 females) who majored in actuarial science, applied mathematics, chemistry, computer science, economics, finance, meteorology, math education, mechanical engineering, physics, and statistics. The ethnic composition of the sample mirrors the demographic characteristics of the region: 73.1% Caucasian, 15.4% Hispanic-American, 9.6% African-American, and 1.9% Asian-American. The average age of the participants was 23.37 years (SD = 6.18 years), and the median age was 21 years. Participants' ages ranged from 18 to 46 years.

Instruments
Perspective-Taking Ability test (PTA). We used a computer-based test developed by MM Virtual Design, LLC to measure perspective-taking. On this test, participants were asked to imagine taking the perspective of the miniature cartoon figure at the center of the screen and identify the direction of a target object from the starting position on the screen. To successfully perform the task, participants need to be able to take on the perspective of the cartoon figure and navigate the routes as directed by the screen instructions. Because the task can potentially be solved by applying either a mental rotation strategy or a perspective-taking strategy, a two-step instruction was delivered: participants were first asked to imagine taking on the perspective of the cartoon figure on the screen; then, participants were instructed to indicate the relative position of the final destination from the departing point using a categorical coordinate system that was symbolized by eight arrow direction icons. Response in a "delayed" response time condition was measured. This method of registering response time helps distinguish between the two types of problem-solving strategies. Following the guideline of the test developer, participants were allotted 15 minutes to complete the PTA.
Spatial Imagery Ability test (SIA). We used another computer-based test developed by MM Virtual Design, LLC to measure participants' spatial imagery ability. On this test, participants were asked to reason about the appearance of objects and their locations in space after they had mentally performed a number of spatial transformations. The test features 24 stimuli. Following the guideline of the test developer, participants were allotted 15 minutes to complete the SIA.
Vandenberg-Kuse Mental Rotation test (MRT). As a classic test of mental rotation, the MRT (Vandenberg & Kuse, 1978) has been used in numerous studies investigating individual and gender differences in spatial ability. The test is paper-and-pencil based and contains 20 items. Participants were to match two of the four choice items to the target item, a LEGO-type configuration of twisted blocks displayed on the left. Incorrect choice items could be mirror images of the target item, or they could be configurationally different. The test was administered with a 7-minute time limit.
Math tests 1-3. This set of math achievement tests covers the core college-level math curriculum that all math-intensive majors were required to complete. The three math achievement tests were proofed by college professors who were familiar with the content coverage of the core college-level math curriculum. The recommended time for completing each of the math tests is 15 minutes. Math Test 1 covers calculus and contains 7 items. Math Test 2 covers trigonometry and contains 8 items. Math Test 3 covers linear algebra and contains 6 items.
Abbreviated Mathematics Anxiety Rating Scale (A-MARS) questionnaire. We used an abbreviated (25-item) version (Alexander & Martray, 1989) of the original MARS (Richardson & Suinn, 1972). The internal consistency of the A-MARS is .96, the test-retest reliability is .90, and the validity index is .92 (Alexander & Martray, 1989). These reliability and validity indices are deemed adequate for general research purposes (Nunnally & Bernstein, 1994).
Demographic questionnaire. At the end of the study, a survey was administered to all participants. The survey requested demographic information that is relevant to the study, e.g., gender, age, ethnicity, handedness.

Procedure
This research was approved by the Institutional Review Board (IRB) of the university where data collection took place. Participants were recruited via IRB-approved email announcements and paper flyers and were tested individually. Each test session lasted approximately an hour and a half. Before each session, participants reviewed the consent form, and the session started shortly after the participant completed it. The order in which the math and spatial subtests were administered was counterbalanced in order to minimize any systematic order effect. The A-MARS and the demographic questionnaire were always delivered at the end of the test session, because exposure to the contents of these measures may have priming effects on some participants' math and spatial subtest performance.

Scoring
The pre-programmed scoring algorithms of the PTA and SIA take both accuracy and response time into consideration when computing the overall performance score on these tests. For the PTA, an effort was made to differentiate between people who use a perspective-taking strategy and those who do not. The scoring algorithms for the PTA and SIA also correct for guessing. In scoring the MRT, participants scored a point only if both matching items were correctly selected. No partial credit was given (see the scoring guidelines in Peters et al., 1995; Vandenberg & Kuse, 1978). The total score on the MRT was the sum of the points earned for correctly answered items. This variable was used in subsequent statistical analyses. The scoring of Math Tests 1-3 (calculus, trigonometry, and linear algebra) was based on a detailed scoring rubric developed by a group of content experts in these areas of mathematics, i.e., math professors and area coordinators who regularly taught or administered tests in these areas. Two experienced graders who were graduate assistants in the mathematics department graded Math Tests 1-3 independently, using the abovementioned scoring rubric as a guide. This study used the intraclass correlation (ICC) to compute interrater reliability (IRR). The ICC is a commonly used statistic for assessing IRR for ordinal, interval, and ratio variables, and it is suitable for studies with two or more coders. Unlike Cohen's (1960) kappa, which quantifies IRR based on all-or-nothing agreement, ICCs factor in the magnitude of disagreement when computing the IRR estimates, with higher ICCs indicating lower disagreement among raters and vice versa (Hallgren, 2012).
Because the average of multiple measurements is more reliable than a single measurement, and because average-measure ICCs are more appropriate in studies where all subjects were coded by more than one rater (Hallgren, 2012), this study computed the average-measure ICC for each of Math Tests 1, 2, and 3. SPSS was used to compute the average-measure ICCs for Math Test 1 (calculus), Math Test 2 (trigonometry), and Math Test 3 (linear algebra), which are .989, .797, and .973, respectively. These values are considered excellent based on Cicchetti's (1994) widely cited cutoff values for rater agreement (poor for ICC < .40, fair for ICC from .40 to .59, good for ICC from .60 to .74, and excellent for ICC from .75 to 1.0).
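For readers who wish to see how an average-measure ICC of this kind can be obtained, the following is a minimal sketch rather than the procedure actually run in SPSS. It assumes a two-way layout (papers by graders) and returns both the consistency and the absolute-agreement forms of the average-measure ICC in the Shrout and Fleiss tradition; which form the study used is not stated, so both are shown. The function names and the example scores are hypothetical.

```python
def average_measure_icc(ratings):
    """Average-measure ICCs from a two-way ANOVA decomposition.

    ratings: list of rows, one row per graded paper, one column per grader.
    Returns (consistency, absolute_agreement). Assumes at least two papers,
    at least two graders, and some variation between papers.
    """
    n = len(ratings)          # number of papers (subjects)
    k = len(ratings[0])       # number of graders (raters)
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]

    # Sums of squares for papers, graders, and residual error
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    consistency = (ms_rows - ms_error) / ms_rows
    agreement = (ms_rows - ms_error) / (ms_rows + (ms_cols - ms_error) / n)
    return consistency, agreement


def cicchetti_label(icc):
    """Map an ICC to the Cicchetti (1994) bands quoted in the text."""
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical scores from two graders on five papers (not study data)
scores = [[6, 7], [3, 3], [9, 8], [5, 5], [2, 3]]
consistency, agreement = average_measure_icc(scores)
print(round(consistency, 3), round(agreement, 3), cicchetti_label(agreement))
```

The two forms differ only in whether a systematic offset between graders counts as disagreement: a grader who is uniformly one point more generous lowers the absolute-agreement ICC but leaves the consistency ICC unchanged.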

Data Analysis
Gender differences via independent-samples t-tests and non-parametric statistics. To answer the questions of whether gender differences exist for the three spatial ability tests and the three math achievement tests, pairwise t-tests were performed. Pairwise comparisons are statistical methods that can be used when a researcher is comparing more than two group means, i.e., when the independent variable has more than two levels (Keppel, 1982). In addition, the Mann-Whitney U test was used to compare gender differences in performance on the PTA, SIA, and MRT, and on the tests of calculus, trigonometry, and linear algebra. The Mann-Whitney U test is appropriate when the following assumptions are met: 1) the dependent variable is measured at the ordinal or continuous level; 2) the independent variable consists of two categorical, independent groups; 3) the observations are independent, i.e., no relationship exists between the observations within each group or between the groups; and 4) although the two distributions need not be normal, they have approximately the same shape. SPSS Statistics was used to check these four assumptions. The results of the preliminary tests showed that the assumptions were met.
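To make the non-parametric comparison concrete, the sketch below computes the Mann-Whitney U statistic and a two-sided asymptotic p-value from first principles. It is an illustration under stated assumptions, not the SPSS routine: it uses average ranks for ties, a tie-corrected variance, the normal approximation, and no continuity correction. The function name and the example scores are hypothetical.

```python
import math
from collections import Counter


def mann_whitney_u(x, y):
    """Return (U for group x, two-sided asymptotic p-value).

    Uses average ranks for tied values and a tie-corrected variance.
    No continuity correction is applied (an assumption of this sketch).
    """
    n1, n2 = len(x), len(y)
    pooled = sorted(list(x) + list(y))

    # Assign each distinct value the mean of the ranks it occupies
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        i = j

    rank_sum_x = sum(rank_of[v] for v in x)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2

    # Normal approximation with tie correction
    n = n1 + n2
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    var_u = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u_x - n1 * n2 / 2) / math.sqrt(var_u)
    p_two_sided = math.erfc(abs(z) / math.sqrt(2))
    return u_x, p_two_sided


# Hypothetical MRT totals for two small groups (not study data)
group_a = [12, 15, 9, 14, 11, 13]
group_b = [10, 8, 14, 7, 12, 9]
u, p = mann_whitney_u(group_a, group_b)
print(u, round(p, 3))
```

Because the test works on ranks rather than raw scores, it does not require normally distributed scores, which is why it serves as a useful cross-check on the t-tests in a sample of this size.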
Canonical correlation. To explore the associations among spatial ability-related variables (i.e., "set 1," which includes performance scores on PTA, SIA, and the MRT) and math achievement-related variables (i.e., "set 2," which includes performance scores on calculus, trigonometry, and linear algebra), canonical correlation analysis was performed via SPSS Statistics. Canonical correlation is appropriate in situations where multiple regression can be applied, but where there are multiple inter-correlated outcome variables to consider (Afifi, Clark, & May, 2004). This analysis seeks to explore orthogonal linear combinations of two sets of variables (canonical variates) within each set that best explain variation both within and between the sets.
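For reference, the standard textbook formulation of canonical correlation can be written as follows. The notation is introduced here for illustration and is not taken from the study: x denotes the vector of set 1 scores (PTA, SIA, MRT), y the vector of set 2 scores (calculus, trigonometry, linear algebra), and the Sigma matrices their within-set and between-set covariance matrices.

```latex
% First canonical correlation: the largest correlation attainable between
% a linear combination of set 1 and a linear combination of set 2.
\rho_1 \;=\; \max_{a,\,b}\ \operatorname{corr}\!\left(a^{\top}x,\; b^{\top}y\right)
       \;=\; \max_{a,\,b}\
       \frac{a^{\top}\Sigma_{xy}\,b}
            {\sqrt{a^{\top}\Sigma_{xx}\,a}\ \sqrt{b^{\top}\Sigma_{yy}\,b}}

% The squared canonical correlations are the eigenvalues of
\Sigma_{xx}^{-1}\,\Sigma_{xy}\,\Sigma_{yy}^{-1}\,\Sigma_{yx},
\qquad \rho_1^{2} \ge \rho_2^{2} \ge \rho_3^{2} \ge 0 .
```

With three variables in each set, at most three pairs of canonical variates exist, and each successive pair is uncorrelated with the pairs before it.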
Mediation analysis. To test the mediational relationships among gender, spatial ability, math anxiety, and math achievement, as derived from the reviewed studies, SPSS AMOS was used. In the model, spatial ability and math achievement are latent variables, each with three indicators in the measurement model, whereas gender and math anxiety are measured variables. Initially, gender was also allowed to have indirect paths to math anxiety and math achievement. However, because removing these indirect paths improved model fit and made the model more parsimonious, gender has only a direct effect in the final model tested.
Table 1 presents the descriptive statistics of the observed variables. Table 2 presents the inter-correlations among these observed variables. What is noteworthy in the correlation table is that trigonometry (Math Test 1) is significantly related to spatial imagery (p = .004) and mental rotation (p = .034). These correlations are moderate in magnitude (Cohen, 1988). Similarly, calculus (Math Test 2) is significantly related to perspective taking (p = .011), spatial imagery (p = .001), and mental rotation (p = .007). These correlations are also moderate in magnitude (Cohen, 1988). Within the spatial ability measures themselves, the intercorrelations among perspective taking, spatial imagery, and mental rotation are large in effect size (Cohen, 1988) and statistically significant at p < .001. This indicates a high degree of relatedness among the measures tapping the spatial ability construct. The intercorrelations among the math-related measures, however, are not consistently high: only the correlation between calculus and trigonometry is statistically significant (p < .001) and large in effect size (Cohen, 1988). In other words, linear algebra is not meaningfully related to either trigonometry or calculus.
The test scores for linear algebra approach the floor level. It is possible that some participants were still in the process of attaining mastery of linear algebra operations when the data were collected. The restricted range of linear algebra scores may explain the lack of significant and meaningful correlations between this variable and the other two math-related measures (intercorrelations among variables within a set), and between this variable and the three spatial ability-related measures (intercorrelations between the sets), because a restricted range of scores in a sample reduces the correlation.
Note. Statistically significant correlations at the alpha level of .05 are bolded in this table. *p < .05, **p < .01, ***p < .001.
Due to a high degree of interrater reliability, the average scores produced by graders 1 and 2 were used as indicators of math achievement in trigonometry (MATH1), calculus (MATH2), and linear algebra (MATH3) in the ensuing analyses.

Gender differences: pairwise comparisons and non-parametric analyses
Surprisingly, the pairwise t-tests did not yield any statistically significant differences. Because the independent-samples Mann-Whitney U test does not require the underlying distributions of the comparison groups to be normal, that test was performed to check for any discrepancies in results between the two statistical methods. The results of these asymptotic significance tests are similar to those of the pairwise t-tests. The two-sided asymptotic significance test yielded a p-value of .97 for the PTA, .60 for the SIA, and .28 for the MRT. The p-values of the two-sided asymptotic significance tests for calculus (Math Test 1), trigonometry (Math Test 2), and linear algebra (Math Test 3) were .87, .26, and .74, respectively.

Canonical correlations between spatial ability and math achievement
Canonical correlation analysis is useful for identifying overall relationships between two sets of variables. Because only the first canonical variate is statistically significant (p < .10; see Table 3), Table 4 presents the results based on the first variate. A single canonical variate is therefore considered sufficient for explaining the spatial-math relationship, i.e., the relationship between the set 1 (PTA, SIA, MRT) and set 2 (calculus, trigonometry, linear algebra) variables. This canonical variate indicates that 12.2% of the variance in math achievement can be explained by spatial ability. This finding is consistent with the medium to large Pearson correlations between spatial ability and mathematics achievement that can be deduced from the extant research.

Mediation Analysis
To test the mediational relationship of spatial ability between gender and math anxiety and the mediational relationship of math anxiety between spatial ability and math achievement, a sequential mediation model was tested. Figure 7 is a schematic diagram of these mediational relationships. The standardized path coefficients are also shown in Figure 7. Although none of the structural path coefficients, i.e., the direct and indirect effects, is statistically significant, the overall model fits reasonably well (GFI = .9, CFI = 1.0, AIC = 56.6, BIC = 91.7, RMSEA = .1). This suggests that the model closely represents the mediation relationships among gender, spatial ability, math anxiety, and math achievement and that the indicator variables of spatial ability and math achievement represented the latent variables reasonably well in the model.
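The logic of the sequential mediation chain can be summarized with a simplified set of regression equations. This is a sketch that treats all four constructs as observed variables, whereas the fitted model uses latent variables for spatial ability and math achievement. The symbols G (gender), S (spatial ability), A (math anxiety), and M (math achievement), and the path labels a, b, c, and c', are introduced here for illustration; whether the fitted model includes the direct path c' is an assumption of this sketch.

```latex
% Structural equations for the chain G -> S -> A -> M
S = i_1 + a\,G + e_1
A = i_2 + b\,S + e_2
M = i_3 + c\,A + c'\,S + e_3

% Indirect effects expressed as products of path coefficients
\text{gender} \to \text{math anxiety (via spatial ability)}:\quad a\,b
\text{spatial ability} \to \text{math achievement (via math anxiety)}:\quad b\,c
\text{gender} \to \text{math achievement (via both mediators)}:\quad a\,b\,c
```

Under this formulation, a mediation effect is statistically significant only if the corresponding product of coefficients differs reliably from zero, which is a stricter requirement than acceptable overall model fit.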

Discussion
This paper presents findings from a pilot study that investigated the relationships among gender, spatial ability, math anxiety, and math achievement in a sequential mediation model. Results from U-tests on gender differences in spatial ability, math anxiety, and math achievement were also presented. Furthermore, the canonical correlation between spatial and math factors was explored. Beyond synthesizing prior research by testing the mediational relationships in a sequential mediation model, another contribution of the study is its focus on college-level math when exploring math-spatial relationships.
Unexpectedly, none of the well-documented gender effects concerning spatial ability and math anxiety was found. Nor was there a significant gender difference in math achievement. The non-significant mediators and gender effects on spatial ability, math anxiety, and math achievement may be due to a highly selective sample composed of college students who had received intensive training in mathematics. This sample may not be representative of young adults with a normal range of numerical and spatial abilities. In order to understand whether ability level moderates the mediational relationships tested in this study and whether gender differences vary at different ability levels, future studies should explore the math-spatial relationship in college students in non-math-intensive majors and compare the magnitude of that relationship with that of students in math-intensive majors. Future studies could use common math problems that require high school level mathematics and are accessible to both comparison groups.
The significant Pearson correlations between individual measures of spatial ability and mathematics achievement tests extend a body of research that supports the spatial-math relationship in childhood (e.g., Cheng & Mix, 2014) and in adolescence (e.g., Kyttälä & Lehto, 2008). Tests of canonical correlation indicate that a single canonical variate is sufficient to account for the math-spatial relationship in this college sample. The range of Pearson correlations reported in this study is consistent with what was reported in prior research (r = .4 to .5). The practical implication of this finding is that spatial and math skills are interwoven throughout developmental stages spanning childhood to young adulthood. Therefore, any serious math training program ought not to neglect developing spatial skills alongside teaching content-specific math problem-solving strategies. There is strong evidence that spatial skills are malleable through training (Uttal, Meadow, Tipton, Hand, Alden, Warren, & Newcombe, 2013). The effects of spatial training have also been shown to transfer to the math domain. Training materials developed by some researchers have already been used to supplement standard engineering curricula to improve college-level math skills (see Sorby, 2007; Sorby, 2009; Sorby, Casey, Veurink, & Dulaney, 2013). Therefore, on both theoretical and practical grounds, there is strong evidence for the close relationship between spatial and math skills.

Limitations and future directions
In the model tested, while the indicators of spatial ability had statistically significant loadings, the loadings of the math measures on the math achievement factor were not statistically significant. It is possible that generally low performance on linear algebra (Math Test 3) contributed to the lack of significant relationships between linear algebra and the other math achievement indicators in the model. This, in turn, may have reduced the overall magnitude of the path loadings from individual measures of math achievement to the math achievement factor. Although none of the mediators tested in the sequential mediation model was statistically significant, the model fitted the data reasonably well. While the G*Power program showed that the present sample is sufficiently powered to detect a large gender effect, detecting a small to medium-sized effect requires larger samples (e.g., up to 788 participants). Future studies should aim for a larger sample size, in light of the finding that the magnitude of gender differences in math achievement may differ across cultural contexts, with a general trend of more egalitarian countries showing a smaller gender gap in math achievement (Gevrek, Gevrek, & Neumeier, 2020). Large-scale studies would also help gauge more accurately the magnitude and significance of the two sequential mediation relationships tested in this pilot study, i.e., gender → spatial ability → math anxiety → math achievement. In addition, because of the low performance scores on the linear algebra test (Math Test 3), possibly reflecting that some students were not yet ready to take the test when the data were collected for this study, future studies could either use data from AP tests on linear algebra or use a study sample that has already completed all the required math courses that are representative of a standard college math curriculum.
Furthermore, a sufficiently powered study would enable researchers to investigate the influence of potential covariates, such as socioeconomic status (SES) and general cognitive ability, on the mediation relationships tested in this pilot study.
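The sample-size point can be made concrete with a quick calculation. The sketch below uses the standard normal-approximation formula for a two-tailed independent-samples t test, n per group ≈ 2(z_{1-α/2} + z_{power})² / d². The values α = .05 and power = .80, the conventional benchmarks d = 0.2, 0.5, and 0.8, and the function name `n_per_group` are all assumptions made for illustration; they are not the settings reported for the G*Power analysis in this study. G*Power works from the noncentral t distribution, so its answers run slightly higher than this approximation.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(d, alpha=0.05, power=0.80):
    """Approximate n per group for a two-tailed independent-samples t test.

    Normal approximation: n = 2 * (z_{1-alpha/2} + z_{power})**2 / d**2.
    The alpha and power defaults are assumed values, not taken from the study.
    """
    z = NormalDist()                      # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)    # critical value for the two-tailed test
    z_power = z.inv_cdf(power)            # quantile matching the target power
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Conventional small, medium, and large benchmarks for Cohen's d (assumed).
for label, d in [("small", 0.2), ("medium", 0.5), ("large", 0.8)]:
    n = n_per_group(d)
    print(f"{label:6s} d={d}: {n} per group, {2 * n} total")
```

Under these assumed settings, a small effect calls for 393 participants per group (786 in total), which is close to the figure of up to 788 participants mentioned above, whereas a large effect calls for only 25 per group.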
Despite these limitations, this pilot study is one of the first attempts to investigate the math-spatial relationship among college students majoring in math-intensive fields and to test the mediation effects of spatial ability and math anxiety in a sequential mediation model that also includes gender and math achievement. Furthermore, the instruments used in the present study have either been extensively used in academic research or were carefully reviewed by content experts in mathematics, making the study methodologically rigorous. The lack of statistical significance in the present findings should spur more large-scale studies in the future, rather than being prematurely tucked away in a "file drawer." The "file drawer problem" has long been noted in the social and behavioral sciences (Rosenthal, 1979). It refers to the greater likelihood that statistically significant results are published.

Educational implications
Understanding that the relationship between gender and math achievement is mediated by spatial ability and math anxiety provides useful information to policy makers and instructional designers. This knowledge helps policy makers and math educators allocate attention and resources to areas where targeted interventions are most likely to make a difference in students' STEM learning outcomes. These potential areas of improvement are spatial ability and math anxiety. Recent research indicates that spatial ability and math anxiety jointly mitigate gender's effects on math achievement (Geary, Hoard, Nugent, & Scofield, 2021; Soltanlou, Artemenko, Dresler, Fallgatter, Ehlis, & Nuerk, 2019). Should a large-scale study confirm the sequential mediation model tested in the present study (i.e., gender → spatial ability → math anxiety → math achievement), there is already a body of research indicating that spatial ability is malleable through training programs at various stages of cognitive development (see Cheng & Mix, 2014; Uttal et al., 2013). Likewise, math anxiety intervention studies indicate that even a short-term program can effectively reduce levels of math anxiety in high schoolers (see LaGue, Eakin, & Dykeman, 2019) and first-year college students (see Samuel & Warner, 2021). This pilot study is a first step towards testing a comprehensive model that specifies potential causal influences from gender to spatial ability, from spatial ability to math anxiety, and from math anxiety to math achievement, with spatial ability and math anxiety jointly mediating the effects of gender on math achievement.
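For readers less familiar with sequential mediation, the chain gender → spatial ability → math anxiety → math achievement can be illustrated with three ordinary least-squares regressions, where the serial indirect effect is the product of the three path coefficients along the chain. The sketch below is illustrative only: it uses synthetic data with arbitrary path values, treats every construct as a single observed score rather than a latent factor, and introduces a hypothetical helper `ols`. It is not the model-fitting procedure used in this study, and in practice the indirect effect would be tested with bootstrap confidence intervals.

```python
import random


def ols(y, *predictors):
    """Return [intercept, b1, b2, ...] from the least-squares normal equations."""
    rows = [[1.0, *vals] for vals in zip(*predictors)]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(aug[idx][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(k):
            if row != col:
                factor = aug[row][col] / aug[col][col]
                aug[row] = [a - factor * b for a, b in zip(aug[row], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]


# Synthetic data: every path value below is arbitrary, NOT an estimate from the study.
random.seed(0)
n = 500
gender = [random.choice([0.0, 1.0]) for _ in range(n)]
spatial = [0.3 * g + random.gauss(0, 1) for g in gender]
anxiety = [-0.4 * s + 0.2 * g + random.gauss(0, 1)
           for s, g in zip(spatial, gender)]
achieve = [-0.5 * a + 0.3 * s + random.gauss(0, 1)
           for a, s in zip(anxiety, spatial)]

# Path a1: gender -> spatial ability.
_, a1 = ols(spatial, gender)
# Paths a2 (gender -> anxiety) and d21 (spatial ability -> anxiety).
_, a2, d21 = ols(anxiety, gender, spatial)
# Direct effect c' and paths b1 (spatial -> achievement), b2 (anxiety -> achievement).
_, c_prime, b1, b2 = ols(achieve, gender, spatial, anxiety)
# Total effect of gender on achievement.
_, c_total = ols(achieve, gender)

# Serial indirect effect: gender -> spatial -> anxiety -> achievement.
serial_indirect = a1 * d21 * b2
print(f"serial indirect effect = {serial_indirect:.3f}")
print(f"total effect           = {c_total:.3f}")
```

With least-squares estimates from a single sample, the total effect decomposes exactly into the direct effect plus the three specific indirect effects (a1·b1, a2·b2, and a1·d21·b2), which offers a convenient check on the arithmetic.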

Conclusion
The present study investigated the math-spatial relationship among college students majoring in math-intensive fields. It tested the mediation effects of spatial ability and math anxiety on the relationship between gender and math achievement in a sequential mediation model. While disparate bodies of prior work showed inverse relationships between spatial ability and math anxiety, and between math anxiety and math achievement, this study did not find spatial ability and math anxiety to be significant mediators of gender differences in math achievement. We conclude by suggesting that the study be replicated in a large sample that is sufficiently powered to detect even small mediation effects.