The Influence of Geochemical Variation Among Globigerinoides ruber Individuals on Paleoceanographic Reconstructions

Variation among individuals within species is a biological precondition for co‐existence. Traditional geochemical analysis based on bulk averages facilitates rapid data gathering but necessarily means the loss of large amounts of potentially crucial information into variability within a given sample. As the sensitivity of geochemical analysis improves, it is now feasible to build sufficiently powerful datasets to investigate paleoclimatic variation at the level of individual specimens. Here, we investigate geochemical and morphological variation among the sensu stricto, sensu lato and sensu lato extreme subspecies of the workhorse extant planktic foraminifera Globigerinoides ruber. Our experimental design distinguishes between subspecies and intraspecific variability as well as the repeatability of laser ablation inductively coupled plasma mass spectrometry (LA‐ICP‐MS). We show that geochemical variability in Mg/Ca ratios is driven by differences in subspecies depth habitat and that ontogenetic trends in Mg/Ca ratios are evident in the final whorl, with the final chamber consistently showing depleted Mg/Ca. These ontogenetic trends are not driven by individual chamber or test size. The Mg/Ca value variance among individuals is ∼100 times higher than the variance among repeated laser spot analyses of single chambers, directing laboratory protocols towards the need to sample ecologically and environmentally homogeneous samples. Our results emphasize that we can use LA‐ICP‐MS to quantify how individual variability aggregates to bulk results, and highlights that, with sufficient sample sizes, it is possible to reveal how intraspecific variability alters geochemical inference.

(within-species) variability of laser ablation inductively coupled plasma mass spectrometry (LA-ICP-MS) measurements into meaningful differences among subspecies, size classes, and repeated LA-ICP-MS measurements. Our goal is to ascertain the magnitude of variation among individuals relative to the analytical uncertainty of single-specimen LA-ICP-MS analyses, as well as amongst subspecies within the workhorse extant planktic foraminifera Globigerinoides ruber (G. ruber).
The dismissal of intraspecific variation is a common thread in ecology. The mean of a population is habitually used as a sufficient descriptor of the whole population contrary to mounting evidence that intraspecific variability impacts community stability through resource partitioning (Jung et al., 2010), competition (Bolnick et al., 2011;Hart et al., 2016), and abiotic tolerances (Bolnick et al., 2011). Applying a mean-field approximation to foraminifera might be sufficient for biological factors common among individuals such as seasonality resulting from synchronous reproduction (Bijma et al., 1990). For many other factors, however, the evidence for commonality among individuals is not robust due to the structure of variation among-and within-individuals.
Within-species variation can be apportioned into systemic and non-systemic variation. The latter encompasses random noise; there may be mechanistic reasons why two individuals express a different geochemical signature, but these reasons are often not detectable at the given scale of analysis and are prone to random variation during the measurement process. The systemic reasons could be continuous, such as geographical space (Darling & Wade, 2008) or vertical depth habitat (Norris, 2000) or discrete, such as the presence/absence of symbionts (Ezard et al., 2015;Spero & Lea, 1993). Subspecies, another source of systematic variation, form a supposedly discrete taxonomic rank below the species whose utility in evolutionary biology remains contentious (Phillimore & Owens, 2006). Subspecies' existence could potentially bias paleoceanographic studies in systemic ways if the subspecies are genetically, vertically, seasonally, or geographically distinct (Lazarus, 1983;Mayr, 1942;Norris, 2000;Seears et al., 2012). Co-existence of subspecies at a given location obviates traditional geographical criteria for subspecies delimitation (Mayr & Ashlock, 1991) and makes nuanced taxonomic identification increasingly important. Measuring, understanding, and accounting for systemic subspecies variability should be a priority to reduce errors and reveal potential bias in geochemical proxies (Antonarakou et al., 2015;Sadekov et al., 2008).
Technological advances have popularized high-resolution, single specimen measurements and intratest analysis, previously thought impossible (Emiliani, 1954). Single specimen analysis has revealed size dependent trends in stable isotopes (Kelly et al., 1997), unlocked the ability to circumvent post depositional diagenetic issues for paleotemperature reconstructions (Aze et al., 2014), reconstructed high-frequency climate signals (Ford et al., 2015;Sadekov et al., 2013), further constrained seasonal paleoclimate variability , revealed the presence of intraindividual chamber difference in biomineralization-related Mg-banding (Eggins et al., 2003;Sadekov et al., 2008Sadekov et al., , 2009, and, through an innovative use of culturing experiments, begun to illuminate the physiological drivers of such banding (Fehrenbacher et al., 2017;Spero et al., 2015). Despite instrumental improvements, our understanding of intraspecific variation has not progressed substantially because the few studies that have investigated these issues often display conflicting results, in part due to smaller sample sizes as the new technology came onstream.
Understanding whether variation is structured systemically requires individual-based data. Whilst Groeneveld et al. (2019) studied 451 individuals in 4 species, the largest isotope study of this type to date, their investigations of G. ruber analyzed these individuals in pairs. The largest G. ruber single-specimen LA-ICP-MS investigation to date analyzed 60 individuals (Naik, 2016). Using LA-ICP-MS on 264 G. ruber individuals we generated 1,860 Mg/Ca chamber level measurements with statistical replication at chamber, individual and spot level to tease apart potential ecologically driven variability. We demonstrate that: (a) intraspecific variability in geochemical space is structured around subspecies classification, but (b) within-specimen variation through ontogeny explains more variation in Mg/Ca ratios than differences among subspecies; (c) there is a no detectable impact of specimen size on Mg/Ca ratios within restricted size fractions; and (d) Mg/Ca ratios are a reliable and repeatable measurement when determined using LA-ICP-MS.

Material and Regional Setting
This study uses material collected during the Paleogene GLObal Warming events, "GLOW" cruise offshore Tanzania in the Western Indian Ocean (Kroon and the Shipboard Scientific Party, 2010). All sites on the cruise collected material that sat well above the carbonate compensation depth [CCD; 3,500-4,500 m (Bickert, 2009)] and lysocline [3,330 m (Belyaeva & Burmistrova, 1984;Ivanova, 2009]; and was composed of more than 30% clay leading to excellent well preserved planktic foraminifera (Kroon and the Shipboard Scientific Party, 2010). For this study, we focus on box core material collected at GLOW station 8 (from here on referred to as GLOW 8 (9 21' 25.20" S; 40 35' 27.60" °E) with a box corer with a diameter of 30 cm and height of 55 cm. GLOW 8 was located in the Southern Seagap, a topographic high between the Davis ridge and the Tanzanian coastline at a depth of 2,420 m. The box corer penetrated to a depth of 47 cm; for this study, we use the top 1 cm of the box core sample, which was separated whilst onboard (Kroon and the Shipboard Scientific Party, 2010). The estimated sedimentation rate in this area is ∼2 cm/kyr (Kroon and the Shipboard Scientific Party, 2010), which, alongside the presence of living benthic foraminifera at the sediment water interface in other cores in the area (Birch et al., 2013), implying we have Late Holocene planktic foraminifera.
The Western Indian Ocean is influenced by the Northeast Madagascan Current (NEMC) and movement of the Inter Tropical Convergence Zone (ITCZ). In Northern Tanzania the seasonal shifts of the ITCZ creates two monsoonal periods with heavy, prolonged rains from March to May (south east monsoon) and shorter rains between October and December (McClanahan, 1988). The coastal waters of North East Africa experience different hydrographic changes through the seasons that lead to different ecosystems from North to South (McClanahan, 1988). North of 4°S, ecosystems benefit from cooler, nutrient-rich waters as a result of seasonal reversals in the wind and current direction and subsequent variability in thermocline depth and nutrient availability resulting in high planktic productivity (McClanahan, 1988). In contrast, the GLOW 8 material used in this study sits at ∼9°S where waters are warm and low in nutrients resulting in a predominance of coral reefs and high benthic productivity (McClanahan, 1988). Today the study area consists of warm surface waters with a max seasonal surface temperature (SST) variation of 5°C (August (∼25-26°C); February (30°C) (Birch et al., 2013;Damassa et al., 2006;McClanahan, 1988)) that remains stratified all year round (Birch et al., 2013) with small variations in surface mixed layer (SML) salinity (<0.5PSU) throughout the year (Birch et al., 2013;Damassa et al., 2006;McClanahan, 1988). Full conductivity, temperature, and depth (CTD) profiles collected at GLOW 5 (8 54' 6.01" °S; 41 29' 42.25" °E) ∼111 km southwest and GLOW 2 (10 54' 6.01" °S; 41 29' 42.25" °E) ∼198 km southeast of GLOW 8 show that the thermocline sits at ∼40 m with the turbidity maximum at ∼137 m indicating a maximum depth for symbiont hosting foraminifera (Supplementary material of Birch et al., 2013). In addition, CTD data show that salinity varies between ∼34.8 and ∼35.4 PSU in the upper 200 m of the two GLOW sites.

Morphological Analysis
The material was sieved to 250-355 μm size fraction, and the first 150 individuals of each G. ruber morphotype ( Figure 1a) were picked for analyses and given an individual ID including a number 1-150 and letters referring to 4 of 20 morphotype (sensu stricto: SS, sensu lato: SL and sensu lato extreme: SLE). Due to the observed morphological variation in G. ruber (Kontakiotis et al., 2017;Robbins & Healy-Williams, 1991), we only picked individuals, which were the most morphologically distinct in the final chamber as this is where the most visual difference was evident ( Figure 1a). Any intermediate morphologies were ignored. The specimens were mounted on glass slides in groups of 20 and orientated with the aperture facing upwards (Brombacher et al., 2017). Images of the final whorl of each sample were captured using an Infinity 3 Lumenera camera mounted on an Olympus SZX10 microscope, illuminated from above (Brombacher et al., 2017). Pre-selected traits ( Figure 1b) were measured using automated image analysis macro within Image Pro 9.1 Premier software (Table S1 in Supporting Information S1). Ontogenetic traits (Figure 1c), beyond the scope of the imaging macro, were measured manually in the same software (Table S1 in Supporting Information S1).

LA-ICP-MS Trace Element Analysis
For LA-ICP-MS, 100 individuals from each subspecies already measured morphologically were selected. For each subspecies 100 unique random numbers, relating to the previously assigned morphological ID, were sampled without replacement using a random number function in the R environment (Version 4.0.3; R Core Team, 2020). The corresponding individuals were then removed from the slides and placed into separate vials. Individuals were then cleaned by ultrasonification in methanol for 5-6 s followed by two washes in Milli-Q water (Eggins et al., 2003). The samples were then dried overnight in a 50°C oven before being remounted as described previously.
Trace element (TE)/Ca ratios were analyzed using a New Wave UP193 ArF laser ablation system coupled to a Thermo Fisher Scientific X-Series II ICP mass spectrometer at the University of Southampton. During analysis Pre-selected traits measured automatically: 1a-test area and 1b-test aspect ratio. (c) Ontogenetic traits measured manually: 2a-final chamber area, 2b-final chamber perimeter, 2c-final chamber aspect ratio, 3a-penultimate chamber area, 3b-penultimate chamber perimeter, 3c-penultimate chamber aspect ratio, 4a-antepenultimate chamber area, 4b-antepenultimate chamber perimeter, 4c-antepenultimate aspect ratio, 5a-aperture perimeter, 5b-aperture width, and 5c-aperture height.   Table S2 in Supporting Information S1. Individual chambers in the final whorl were ablated in triplicate with each time resolved analysis set to 60 s; wall penetration was achieved prior to the end of each analysis to ensure that the full wall thickness was sampled.
Prior to an analysis session, the mass spectrometer and laser system was tuned for optimal sensitivity, stability, and low oxide formation. Each laser analysis used a spot diameter of 30 µm, a repetition rate of 5 Hz, and fluence 0.3 J/ cm 2 . Using the laser software, sample and standard locations were mapped and recorded to form a programmed sequence. Ten replicates each of NIST 612 and 610 reference glasses were analyzed in batches throughout each session, before and after the analysis of 9-11 individuals (about 80-100 spots). To assess accuracy of the analytical method, the Japanese Geological Survey's JCp-1 (Porites sp. Coral reference material) was analyzed in a parallel session. Each time resolved analysis was background subtracted using the gas blank measured immediately before each sample or standard. Sample and standard data were internally corrected using 43  . We also processed JCp-1 as an unknown using the same calibration curve as the foraminifera to back calculate its concentration and compare to its published value (Jochum et al., 2019). JCp-1 was measured as 836 ± 124 ppm 24 Mg (vs. 867 ± 23 ppm by Jochum et al. (2019)) and as 38.1 ± 3.8% (vs. 37.5 ± 2.4% by Sekimoto et al. (2019)). Our NIST glass calibrations thus also predict trace elements within carbonate material within specified uncertainty limits. Influential outlier values were detected using Cook's Distance and anomalous values beyond a threshold were removed (0.0001 for standards; 0.2 for foraminiferas). For each foraminifer depth profile, the processing algorithm identifies when the laser fully penetrates the chamber wall. The raw 43 Ca signal is smoothed using a moving average of 5 data points. The rate of change of the smoothed signal with respect to a time step of 20 observations (7.44 s) was calculated and the largest signal drop identified as the point when the laser had ablated through the calcite wall and thus no longer recorded the geochemical signal ( Figure S2 in Supporting Information S1). The function then removes the rows of data that occur between the final laser pulse with high signal intensity and the maximum rate of change ( Figure S2 in Supporting Information S1). The corresponding time stamp for this final laser pulse is returned and any rows of data after this are removed from further consideration. A short laser profile with a low number of data points can be caused by a false positive detection, such as where the laser has encountered a contaminant, that is, clay or something else that is not calcite that causes the 43 Ca signal to decrease rapidly and trigger the end-point detection algorithm. Such profiles were flagged in the data reduction procedure and removed. The mean number of data points per profile was 20 (7.4 s).
Trace element (TE) to calcium ratios, including Mg/Ca ratio (mmol/mol), were calculated for each time slice and a median taken. Three locations per chamber were analyzed. To manually screen for contaminants, we looked at other indicative TE/Ca ratios against Mg/Ca. Nine profiles were removed using Mn/Ca and a cut-off of 1 mmol/mol ( Figure  S3a in Supporting Information S1). For further contamination control, we also remove 29 (∼1%) additional spots with an unrealistic mean Mg/Ca >10 mmol/mol. The mean spot-average across all foram analyses was 4.36 mmol/ mol with a back-transformed 95% confidence interval of (4.19, 4.53). In total, we present results from 1,860 analyses of 263 individuals, which represents 5,323 μm 3 of material ablated, which is the largest study on G. ruber to date.
The exponential relationship between Mg/Ca ratios and temperature Rosenthal et al., 1997) has resulted in many Mg/Ca based temperature calibrations (e.g., Anand et al., 2003;Bolton et al., 2011;Dekens et al., 2002;Elderfield & Ganssen, 2000;Tierney et al., 2019). To allow for comparisons between other studies we use Equation 1 where T is temperature, b = 0.38 and m = 0.09 based on core top calibrations using G. ruber (Dekens et al., 2002;Lea et al., 2000). Mg/Ca data and temperature conversion can be found in and Table S2 in Supporting Information S1.

Stable Isotope Analysis
Stable isotope measurements were obtained from 89 individuals (SS n = 35, SL n = 23, SLE n = 31; Table  S3 in Supporting Information S1), previously measured for trace elements by LA-ICP-MS, using a Thermo Fisher Scientific Kiel IV carbonate device coupled to a MAT253 stable isotope ratio mass spectrometer at the SEAPORT Stable Isotope Laboratory, University of Southampton. Samples were placed into individual vials and measured against the global reference standards NBS19 and NBS18 as well as an in-house quality control standard (GS1). Long-term analytical precision (1σ) was based on the repeat analysis of GS1 and estimated as ±0.09‰ for δ 18 O and ±0.05‰ for δ 13 C. All results were standardized to Vienna Pee Dee Belemnite (VPDB) using a two-point calibration between NBS19 and NBS18. Calcification temperature was calculated using the general calibration of (Erez & Luz, 1983): with a w of 0.47‰ VSMOW (Birch et al., 2013) converted into VPDB using the correction of −0.22‰ (Bemis et al., 1998;Friedman & O'Neil, 1977;Pearson, 2012). Although this conversion was calibrated using Trilobatus sacculifer, we employ it here to allow direct comparisons to other studies based on this material.

Statistical Analysis
The mean spot-averaged Mg/Ca ratios in Section 2.3.1 exaggerate the precision achievable because each analysis is not independent due to nested pseudoreplication at spot, individual and subspecies levels. To remove this pseudoreplication, understand the influence of subspecies and morphological variation on Mg/Ca ratios and ascertain the relative amounts of variance explained by spot, individual and subspecies levels, we used generalized linear mixed effect models implemented in the lme4 package (Bates et al., 2015). Mixed effects models comprise random and fixed effects. Fixed effects are experimentally determined and of direct interest in hypothesis testing (Bolker et al., 2009) for patterns of variation common to all experimental units; random effects are selected from a larger population but thought of as uncontrollable "nuisance" parameters systematically obscuring the signal held by the fixed effects (Bolker et al., 2009;Gillies et al., 2006). The "vital effects" that alter each individual's geochemical composition are examples of random effects, while an overall temperature gradient would be a fixed effect. Variation among individuals means we cannot assume each ablation represents an independent sample: as an example, the Mg/Ca ratio difference from penultimate to final chamber is likely to be more similar in the same individual than from one individual to another. Our experimental design is stratified random sampling because we targeted sufficient numbers of individuals in each subspecies. We do therefore assume that individuals were sampled independently within their subspecies. In all models, individual test ID, repeat laser spot number (spot) and analysis batch (batch) were classified as random effects (Bates, 2005) because, while we know that individuals, batches and spots could differ amongst each other, we do not have a systemic hypothesis for how they differ; chamber and morphotypes were categorized as fixed effects and Mg/Ca ratio as the dependent response variable because we expect certain relationships with increasing chamber size through ontogeny and because we anticipate that subspecies-specific offsets are possible. We consider subspecies a fixed effect following the recommendations of Bolker et al. (2009) given low numbers of categorical levels. Our different statistical models were compared using Analysis of Variance amongst competing models to test model fit assuming a Gamma error distribution for the generalized linear model (inverse link function to transform the mean of the data; variance increases as a quadratic function of the mean). The best fit model was decided by likelihood ratio tests and the Akaike Information Criterion (AIC, Burnham & Anderson, 2002), which summarizes model fit as a compromise between variance explained and parameters used.
All data processing and statistical analysis was carried out in the R environment (version 4.0.3; R Core Team, 2020). Scripts are provided as Supplementary Material.

Morphological Analysis
To test the null hypothesis that the morphotypes of G. ruber are morphologically indistinguishable, a principal component analysis (PCA) was conducted using all traits in Figures 1b and 1c. To reduce the number of components, a Horn's Parallel analysis was conducted (Peres-Neto et al., 2005) using the "paran" package (Dinno, 2018). This analysis indicated the first three principal components explaining ∼71% of cumulative variance should be retained ( Figure 2, Table S4 in Supporting Information S1). When these components are plotted against each other, the SS and SL subspecies occupy a homogenous morphospace (Figure 2). While PC1 does not separate any subspecies effectively (Figure 2, Figures 3a and 3b), SLE shows a degree of offset in principal component 2 ( Figure 2), which is loaded highly by the aspect ratio of the final chamber and the whole test (Figures 3c and 3d, Table S5 in Supporting Information S1).
Following the PCA, a cluster analysis was conducted using the package "mclust" (Scrucca et al., 2016) on the three retained principal components. This package tests a finite number of models to determine the most supported statistical model among those considered and identifies the optimal number, size, shape, and orientation of clusters needed to explain the data using Schwarz's Bayesian information criterion (BIC) (Schwarz, 1978). The BIC considers the statistical fit of a model as well as the number of parameters the model has to determine the posterior model probability (Wintle et al., 2003); in the "mclust" package, a larger BIC value indicates a better model fit (Fraley & Raftery, 1996). To reject our null hypothesis of no sub-species variability, the "mclust" analysis must indicate the data are better represented by more than one cluster and thus different morphotypes should occupy statistically distinct clusters within the morphospace. Clustering analysis shows that the morphospace described by PC1 and PC2 is best represented by two clusters ( Figure S4-S5 in Supporting Information S1), though these do not correspond tightly to our subspecies classification (Table S6 in Supporting Information S1). When the method is forced to identify three groups (Figure 2d) subspecies are split across all three groups (Table  S7 in Supporting Information S1) and do not correspond to morphologically meaningful units. Therefore, we reject the null hypothesis of no morphological variability in G. ruber but conclude the variability identified is not taxonomically aligned to established subspecies definitions. Loadings of principal components can be found in Table S5 in Supporting Information S1. Predicted clusters can be found in Table S7 in Supporting Information S1.

Trace Element Subspecies Variability
The median Mg/Ca ratios (± Median Absolute Deviation, MAD) for G. ruber show substantive interspecific differences (SS = 4.74 ± 1.28 mmol/mol, SL = 4.20 ± 1.37 mmol/mol and SLE = 3.64 mmol/mol ± 1.24 mmol/ mol). To investigate subspecies trace element variability, we conducted a one-way analysis of variance (ANOVA) on the Mg/Ca dataset that detectable differences amongst subspecies (F 2,1857 = 112.9, p < 0.01). A subsequent post-hoc TUKEY HSD test revealed a detectable significant (p < 0.01) differences in the mean Mg/Ca ratios between all subspecies. The SL and SLE morphotypes are depleted by 0.48 mmol/mol and 1.11 mmol/mol, respectively, when compared to SS. In addition, SLE is depleted in Mg/Ca ratios compared to SL by 0.63 mmol/ mol.
To investigate factors driving interindividual Mg/Ca separation in G. ruber, generalized linear mixed effect models were constructed with a gamma link function to account for the positively skewed distribution of the response variable Mg/Ca. The best fitting model assumed that the relationship between test size and Mg/Ca was constant across subspecies whilst the Mg/Ca relationship with chamber position was variable among subspecies. We constructed two sets of models; one set used test size (Figure 1b) to investigate whole test impacts on Mg/Ca ratios whilst the other set used chamber size (chamber area labeled 2a, 3a, 4a in Figure 1c) to investigate ontogenetic impacts on Mg/Ca ratios.

Modeled Drivers of Mg/Ca
Though our morphological analysis showed that morphological traits, such as test area, did not correspond to subspecies (Section 3.1), test size is known to be a controlling variable in stable isotope values but is less well understood as a potential driver of Mg/Ca ratios. To understand whether test size is a driving variable of Mg/Ca ratios we built separate models with test size (test area) and chamber size (chamber area) effects. The best models, chosen through AIC, retained both whole test and chamber size effects. Results are reported here with a coefficient (β), standard error (s.e.), t statistic (t) and p value (p) associated with these coefficients; full model outputs 9 of 20 are available in Supporting Information S1. The generalized linear mixed model found significant evidence that chamber position (final, antepenultimate and penultimate) and subspecies impacted spot-averaged Mg/Ca ratios (Table S8 in Supporting Information S1). Whilst the best fitting models retained both test and chamber size, we find that test size and chamber size are not statistically significant drivers of Mg/Ca ratios over the range in size examined here (Figure 4, Table S8-S9 in Supporting Information S1). Subsequent discussions of results will focus on the model including test area as this was the highest loading morphological trait from principal component analysis (Figure 2).
We detected a strong impact of chamber position where the Mg/Ca ratios differed between the final and penultimate chamber (β = −0.044, s.e. = 0.004, t = −9.995, p < 0.001; the coefficients is on the scale of the inverse link function, hence a negative value means a positive relationship through the untransformed data) and between the final and antepenultimate chamber (β = −0.051, s.e. = 0.004, t = −11.668, p < 0.001), that is, a mean enrichment of 0.044 mmol/mol and 0.051 mmol/mol, respectively, in the penultimate and antepenultimate chamber compared to the final chamber ( Figure 4, Table S8 in Supporting Information S1). This enrichment differs amongst subspecies such that depletion in the penultimate and antepenultimate chamber is stronger in SL than SS (penultimate: β = −0.039, s.e. = 0.007, t = −5.805, p < 0.001, antepenultimate: β = −0.039, s.e. = 0.007, t = −5.571, p < 0.001; in Supporting Information S1). This pattern of depletion is stronger still in SLE (penultimate: β = −0.043, s.e. = 0.007, t = −5.916, p < 0.001, antepenultimate: β = −0.042, s.e. = 0.007, t = −5.953, p < 0.001; in Supporting Information S1). This interaction supports the conclusion that chamber position is differentially influential amongst the subspecies in driving Mg/Ca trends. These trends are not consistent amongst subspecies and are most visible when comparing at chamber resolution rather than the whole specimen ( Figure 4).
These interdependent relationships among chamber position and subspecies are present after controlling for so-called "random effects." We included individual test ID, Batch, and Spot number as random effects to test the repeatability of the LA-ICP-MS method on repeated measurements of the same chamber (Spot) in the same individual (test ID) across 2 days (Batch). Compared to residual standard deviation of 0.049, the variance explained by individual, batch and spot were 0.001, 0.00005, and 0.00001, respectively ( Figure S6 in Supporting Information S1). This means that, on top of the systemic "fixed effect" variation explained in the previous paragraph, 100 times more variation (0.001/0.00001) can be explained by intraspecific variability among individuals than by variability among repeated laser spots and 20 times more variation (0.001/0.00005) can be explained by intraspecific variability among individuals than by batches run on different days ( Figure S6 in Supporting Information S1). These random effect variances are a formal treatment of the variance disaggregation, but the rank orders match raw calculations: Spot MAD is the smallest variation (0.46 mmol/mol), followed by Batch MAD (0.56 mmol/mol) then individual MAD (1.04 mmol/mol). We therefore conclude that LA-ICP-MS is a highly repeatable technique for extracting trace element signatures given that spot-average Mg/Ca ratios do not vary substantially within a given chamber [ Figure S7 in Supporting Information S1; (i.e., Bolton et al., 2011;Sadekov et al., 2008)].

What Is the Optimum Number of Individuals Needed to Separate Subspecies in LA-ICP-MS Mg/ Ca Analysis?
To determine the optimum number of individuals needed to detect subspecies differences in Mg/Ca, we conducted a rarefaction subsampling experiment using the best supported model from the previous subsection, and a simplified model without the subspecies effect. The difference in AIC scores between these two models indicates the improvement in model fit by considering systemic variation amongst subspecies. Each subspecies was subsampled at random in increments of 5 up to a maximum of 80. The process was repeated 50 times to create 800 model comparisons in total. We either record the ΔAIC values for the two models with and without subspecies, or N/A if a model failed to converge due to insufficient sampling coverage. As subsample size increased, so too did the ΔAIC (Figure 5a) while the percentage of models failing to converge decreased (Figure 5b). Once ∼55 individuals from each subspecies were sampled (Figure 3b), the ΔAIC was consistently sufficiently large to infer statistical support for systemic subspecies variation with most models converging. This finding illustrates that a focus for paleoceanography and paleoclimate research must be on increasing sample sizes for LA-ICP-MS work to achieve representative sample sizes as used here. Representative sample sizes will vary between species and system investigated, particularly as a function of the homogeneity of the material being sampled.

Stable Isotope Variability
Stable isotope analysis was conducted on a subset of 89 individual specimens that had previously undergone LA-ICP-MS analysis using the methods described in Section 2.3. In contrast to trace element analysis, no interindividual differences in carbon or oxygen isotopes were detected (ANOVA: Carbon F (2,86) = 1.403, p = 0.252, Oxygen F (2,86) = 1.292, p = 0.280) (Figures 6a and 6b). A conversion of oxygen isotope values to temperature using Equation (2) shows no detectable difference (F(2,86) = 1.305, p = 0.277) between the mean of SS compared to SL and SLE (SS = 24.35°C, SL = 23.25°C, SLE = 23.20°C). The range of calculated temperatures within each subspecies is however large, with all subspecies showing ranges of 9-12.5°C between the highest and lowest calculated temperatures. The relationship between size and stable isotopes differs amongst subspecies (Figures 6c and 6d, Table S10-11 in Supporting Information S1). SS and SLE show large variation with size in both oxygen and carbon; sensu lato shows a positive relationship, clearest in carbon isotopes, between stable isotopes and test area (Figures 6c and 6d).

Stable Isotope Versus Mg/Ca
In Figure 7, we plot δ 18 O and δ 13 C values against median Mg/Ca ratios of all spots per individual which, despite final chamber depletion, is a good representative of past ocean conditions (Fehrenbacher et al., 2020;Rustic et al., 2021). There is no detectable correlation between stable isotopes and Mg/Ca ratios (Figure 7; Table S12-S17 in Supporting Information S1) in all subspecies. In oxygen isotopes, we observe that as δ 18 O values become more negative Mg/Ca ratios increase in SS and SL which would be expected ( Figure 7a); however, this relationship is much weaker in SLE. In δ 13 C values there is no observable pattern between δ 13 C and Mg/Ca ratios, most likely because of a narrow range of δ 13 C values.

Discussion
We show substantial variability in G. ruber Mg/Ca ratios that is partially explained by the presence of subspecies and differential Mg/Ca signatures through the ontogeny of those subspecies. We find no clear relationship between the morphological traits measured in this study and subspecies identification used but do detect geochemical variation aligned to subspecies classification. The final chamber was consistently depleted in Mg/Ca ratios compared to the other two chambers in the final whorl. Intraspecific variation through ontogeny explains more variation in Mg/Ca ratios than differences among subspecies, which implies we need large sample sizes (>55) when performing LA-ICP-MS to have a realistic expectation of detecting such differences. Variation among individuals explains two orders of magnitude more variability in Mg/Ca ratios than variation amongst shots, implying that Mg/Ca ratios are a reliable and repeatable measurement when determined using LA-ICP-MS.

Morphological Variability in G. ruber Subspecies
Inter-and intraspecific morphological variation is abundant amongst the three G. ruber subspecies (Figures 1-3), yet the general taxonomic importance of such sub-species variability is disputed (Phillimore & Owens, 2006). The number of distinct morphological groups identified in G. ruber ranges from three (Parker, 1962) to eight (Robbins & Healy-Williams, 1991) along a morphocline. In this study we found that morphological variation does exist (Figures 1 and 3) but does not correspond to taxonomically informative units (Figure 2, Table S4 in Supporting Information S1), similar to other studies that used a greater range of morphological measurements (Numberger et al., 2009). We observed some separation in the aspect ratio of the final chamber and test aspect ratio that separated a large proportion of the SLE morphotype (PC2: Figure 2, 3c, and 3d). The separation observed due to test and final chamber aspect ratio, which corresponds to test and chamber compression, respectively, are commonly used morphological differences for morphotype identification (Aurahs et al., 2011; Carter  Kontakiotis et al., 2017;Kuroyanagi et al., 2008;Lin et al., 2004;Löwemark et al., 2005;Steinke et al., 2010;Wang, 2000). Based on our analysis and observations we recommend researchers focus on the final chamber, particularly the degree of compression to separate the subspecies of G. ruber, with compression of the final chamber increasing along the morphological cline from SS to SLE (Figures 1a and 3d). We suggest that future morphological studies should focus on final chamber traits as well as investigating morphological differences in the 3D space using μ-CT scanning. Furthermore, we also suggest that future studies measure traits that have a known functional role and can be linked to the environment such as pore size which is linked to gas exchange (Bé, 1968;Burke et al., 2018;Constandache et al., 2013;Kearns et al., 2021).

Geochemical Variability
Geochemical variability in G. ruber does exist at this Indian Ocean site and is the result of interspecific variability. G. ruber SS shows systematically higher mean Mg/Ca ratios compared to SL and SLE of 0.48 ± 0.18 and 1.11 ± 0.17 mmol/mol, respectively, based on test-averaged mean Mg/Ca ratios. When test-averaged mean Mg/ Ca ratios are converted to temperature using Equation 1 this equates to a 1.34 ± 0.47°C difference between SS and SL and a 3.01 ± 0.43°C difference between SS and SLE.
The temperature differences we find between SS and SL are slightly higher than those found in a study of the Indian Ocean and Western Pacific (0.91 ± 0.75°C; Steinke et al. (2005)). Our Mg/Ca derived temperature difference between SS and SL is smaller than previously found in the Gulf of Mexico (∼3°C; Antonarakou et al., 2015), but the SL morphological concept used by Antonarakou et al. (2015) is similar to our SLE here (Figure 1). Based on this observation and published confidence intervals, our 3.01°C difference between SS and SLE in this study is within the range of that between SS and SL in the Gulf of Mexico (Antonarakou et al., 2015). Detailed taxonomic work is clearly fundamental for unbiased geochemical inference.
Despite our trace element results, the δ 18 O and δ 13 C values do not show statistically significant differences amongst subspecies ( Figure 6). The lack of significant isotopic differences among subspecies (Section 3.3) is similar to some studies (Lynch-stieglitz et al., 2015;Mohtadi et al., 2009;Thirumalai et al., 2014) but not others (Antonarakou et al., 2015;Carter et al., 2017;Löwemark et al., 2005;Steinke et al., 2010;Wang, 2000). In those studies that found a significant difference between SS and SL, the smallest mean difference observed over glacial-interglacial cycles is similar to the mean δ 18 O difference of ∼0.2‰ we observe (Carter et al., 2017;Steinke et al., 2010;Wang, 2000). However, despite similar values, we failed to statistically separate subspecies in stable isotope space ( Figure 6). We do however see a large degree of variability resulting in a higher δ 18 O mean (−1.21‰) and lower δ 13 C (0.82‰) mean than has been recorded in other sites from this cruise (Birch et al., 2013) and another regionally similar study (Fallet et al., 2010), though both those studies were based on bulk isotopes and not individual analyses. When converted to temperature using Equation 2, our δ 18 O results produce mean temperatures in all subspecies to be below 25°C (SS = 24.35°C, SL = 23.25°C and SLE = 23.20°C). Based on CTD profiles from the area (Birch et al., 2013; Kroon and the Shipboard Scientific Party, 2010) our δ 18 O values suggest all subspecies lived below the base of the SML at ∼40m (Birch et al., 2013). The CTD profiles of DIC δ 13 C show that there is a weak δ 13 C gradient in the upper 100 m of the water column, potentially explaining the narrow range and lack of statistical significance we observe in our δ 13 C samples. The range in δ 13 C that we do observe probably reflects individual variation in calcification, respiration, and photosynthesis (Henehan et al., 2017;Lombard et al., 2009) and the influence of these processes on the δ 13 C of foraminifera (Zeebe. 1999).

Is Trace Element Variability in G. ruber Ecologically Driven?
We have shown intraspecific variation in G. ruber is systematic. G. ruber is restricted to the upper water column (Anand & Elderfield, 2005;Dekens et al., 2002) and is not thought to migrate vertically during life (Aurahs et al., 2011;Tolderlund & Bé, 1971). Despite being used to indicate SML temperatures, G. ruber does have a broad ecological niche extending below the base of the SML to the base of the deep chlorophyll maximum (DCM) (Lončarić et al., 2006;Peeters et al., 2002).
Separation of subspecies in geochemical space has been inferred previously to result from subspecific specific depth habitats (Antonarakou et al., 2015;Carter et al., 2017;Kawahata, 2005;Kuroyanagi & Kawahata, 2004;Löwemark et al., 2005;Numberger et al., 2009;Steinke et al., 2005;Wang, 2000). Our Mg/Ca results agree with this inference with SLE and SL calcifying in an ∼3°C and ∼1.3°C respectively cooler, deeper part of the upper water column compared to SS. Subspecies calcification depth variability would explain the large ecological niche found in other studies (Lončarić et al., 2006;Peeters et al., 2002). Based on CTD profiles from the area (Birch et al., 2013;Kroon and the Shipboard Scientific Party, 2010), our Mg/Ca temperature calibrations would place SS slightly above the base of the SML at ∼40m (Birch et al., 2013) with SLE inhabiting the upper thermocline below the SML but well above the deep chlorophyll maximum at ∼95m (Birch et al., 2013) and SL in-between SS and SLE.
Another potential influence of subspecies temperature variability is the seasonal preference of subspecies. The annual seasonal SST range in this area is ∼5°C (Birch et al., 2013;Damassa et al., 2006;McClanahan, 1988), which is larger than 1-3°C range in Mg/Ca derived temperature we observe among subspecies. Seasonality would imply that SS preferentially live during the summer months when SST is ∼30°C whilst SLE lives during the winter when SST decreases to around 25°C (Birch et al., 2013). SL calculated temperature sits between these two seasonal temperatures suggesting a deeper depth habitat rather than seasonal preference. Preferential seasonality has been proposed as a potential driver in other ocean basins (Antonarakou et al., 2015;Sadekov et al., 2008) but sediment trap studies in Java (Mohtadi et al., 2009), South China Sea (Lin et al., 2004) and the Gulf of Mexico (Thirumalai et al., 2014) as well as plankton tow studies in the Arabian Sea (Peeters et al., 2002) suggest no seasonal or monsoonal preference of G. ruber subspecies.
We therefore conclude that our subspecies differences are due to depth habitat preferences with SLE and SL inhabiting a cooler, deeper part of the water column compared to SS. Therefore, the intraspecific variability in G. ruber is ecologically meaningful and studies that use a mean-based approach without separating between subspecies are removing ecologically meaningful data whilst potentially influencing paleoclimatic reconstructions. However, rather than being a hindrance, this depth habitat separation of subspecies in G. ruber, if consistent across ocean basins, could be used to reconstruct SML changes through time.

Drivers of Stable Isotopes
Whilst our Mg/Ca ratios are comparable across studies (Section 4.2), our stable isotope results are not and furthermore do not align with our Mg/Ca ratios. Additionally, though a weak linear relationship does exist between δ 18 O values and Mg/Ca ratios (Figure 7a), indicating a similar depth habitat control on both proxies the relationship is far from perfect and not statistically significant, implying further factors are acting on one or both proxies. The imperfect correlations we see between stable isotopes and Mg/Ca ratios (Figure 7) make it difficult to untangle any secondary controls on Mg/Ca, δ 18 O, and δ 13 C. Data from across a wider size range may result in a larger signal-to-noise ratio providing valuable into the various environmental and physiological controls on these proxies and should therefore be the focus of future studies.

Ontogenetic Variability of Mg/Ca Ratios
Subspecies variability (Section 4.2.1) is complicated further by ontogenetic changes (Figure 2). Test heterogeneity of Mg is present in all species of foraminifera and thought to be at least partly decoupled from temperature (Spero et al., 2015). Intratest variability is not unexpected, given the nature of the water column and planktic lifestyle of these foraminifera (Pracht et al., 2019). Ontogenetic variability has been observed in other studies of planktic foraminifera (Anand & Elderfield, 2005;Bolton et al., 2011;Dueñas-Bohórquez et al., 2011;Sadekov et al., 2008). Here, we show for the first time that interchamber variation in G. ruber Mg/Ca is systematic at the subspecies level with different Mg/Ca ratios through the final whorl. We observed an overall pattern of final chamber depletion in all subspecies (Figure 4) with the antepenultimate and penultimate chambers showing similar values (Figure 4). The subspecies similarities of these patterns suggest that they are ecologically or biologically driven geochemical signatures.
Interspecific chamber heterogeneity could be a result of the biomineralization process. In symbiont bearing planktic foraminifera, like G. ruber, diurnal changes in the biological activity of algal symbionts have been hypothesized to contribute to Mg/Ca banding within the chamber walls (Eggins et al., 2004;Fehrenbacher et al., 2017;Sadekov et al., 2005). As foraminifera grow, older chambers are overprinted with the calcite of newly formed chambers (Hemleben et al., 1989) increasing chamber thickness and adding new Mg/Ca bands. Intratest variability in high and low Mg bands may explain the variation we see if such banding was subspecies specific and a geochemical "vital effect." The number of high Mg bands diminishes through the final whorl of G. ruber with the final chamber made up of only low Mg bands (Sadekov et al., 2005). Although this banding is not thought to impact overall test signal (Holland et al., 2020), it could influence chamber specific signals. The observed absence of high Mg bands in the final chamber of G. ruber specimens (Sadekov et al., 2005) may also explain the apparent depletion of the final chamber in this study and others (Anand & Elderfield, 2005;Bolton et al., 2011;Dueñas-Bohórquez et al., 2011;Sadekov et al., 2008). The experimental design of our study meant that an investigation of intratest and interspecific banding differences, while of fundamental interest, is not possible given the nature of our data collection. Chamber heterogeneity comparisons between test averaged LA-ICP-MS and solution ICP-MS have shown that averaging Mg/Ca across chambers is a good indicator of past ocean conditions (Fehrenbacher et al., 2020;Rustic et al., 2021). We therefore recommend that studies using LA-ICP-MS should measure all chambers in the final whorl (Fehrenbacher et al., 2020;Rustic et al., 2021).

Influence of Chamber and Test Size on Mg/Ca Ratios
Test size is recognized as an influential driver of stable isotope variability in foraminifera (Elderfield et al., 2002;Ezard et al., 2015;Friedrich et al., 2012;Spero, 1998;Spero & Lea, 1996). To minimize such effects, stable isotope analyses are typically conducted on narrow size fractions. A similar practice has been applied to trace elements (Cléroux et al., 2008;Elderfield et al., 2002;Friedrich et al., 2012;McConnell & Thunell, 2005;Ni et al., 2007) with the larger size fraction often recommended (Elderfield et al., 2002). In this study, we picked G. ruber individuals from a narrow size fraction (250-355 μm) and found no impact of size on Mg/Ca ratios ( Figure 4; Table S7 in Supporting Information S1). Therefore, the use of narrow size fractions remains a good mitigation technique in trace element analysis to avoid test size effects.
The effects of individual chamber size on Mg/Ca ratios have never been investigated until now. We found no detectable impact of chamber size on chamber Mg/Ca ratios of G. ruber (Section 3.2.1; Table S8 in Supporting Information S1). The final chamber depletion we consistently observe in all subspecies is not therefore the result of smaller final chambers as often found in "kummerform" individuals (Berger, 1969;Olsson, 1973).

Conclusions and Paleoceanographic Implications
Mg/Ca ratios vary systematically due to the presence of three subspecies that live at different depth habitats within the surface mixed year. Using LA-ICP-MS at chamber-by-chamber resolution, we have shown that individual foraminiferal analysis can be used effectively to reveal ecologically meaningful information that is useful for paleoceanographic reconstructions. Although such fine scale measurements do generate statistical noise, meaningful patterns can still be found when sufficient samples of individuals are used. To avoid signal mixing, we recommend (a) paleoceanographic studies should favor G. ruber SS as the subspecies that best represents the SML; (b) the continued selection of the narrowest size fraction for LA-ICP-MS analysis of Mg/Ca ratios until other studies find a qualitatively similar result; (c) the continued sampling from the narrowest size fraction possible for stable isotope analysis; (d) to determine test Mg/Ca final whorl chamber resolution LA-ICP-MS should be used; and (e) the need for larger sample sizes than routinely used to sufficiently represent a wider homogeneous population. Through this analysis we have also demonstrated and discussed how more work is needed to understand the drivers of Mg incorporation in G. ruber (and other foraminifera) and how this signal is influenced by environmental and biological variables. Better understanding of the biological, ecological, and environmental causes of intraspecific variability would unlock the full potential of measurements, integrating single-specimen and time-averaged bulk measurements, to reveal past climate change at all levels of temporal granularity from days to millions of years.

Data Availability Statement
Geochemical and morphological data generated as part of this study as well as code to reproduce our results and supplementary figures are available on Figshare via Kearns et al. (2022a) and Kearns et al. (2022b). All analysis and figure creation were done in the R environment (version 4.0.3; R Core Team, 2020). funded by Fellowship NE/J018163/1; LEK was part-funded by a University of Southampton Institute of Life Sciences' PhD Studentship and Natural Environment Research Council NE/L002531/1. We thank Ryan Glaubke and two anonymous reviewers for their comments on this manuscript, and three anonymous reviewers for their comments on a previous version of this manuscript. We also thank Dr Anieke Brombacher for helping with taxonomic and morphometric measurements and Dr Max Holmström helping with stable isotope measurements.