Quantitative Analysis of Paleomagnetic Sampling Strategies

Sampling strategies used in paleomagnetic studies play a crucial role in dictating the accuracy of our estimates of properties of the ancient geomagnetic field. However, there has been little quantitative analysis of optimal paleomagnetic sampling strategies and the community has instead defaulted to traditional practices that vary between laboratories. In this paper, we quantitatively evaluate the accuracy of alternative paleomagnetic sampling strategies through numerical experiments and an associated analytical framework. Our findings demonstrate a strong correspondence between the accuracy of an estimated paleopole position and the number of sites or independent readings of the time‐varying paleomagnetic field, whereas larger numbers of in‐site samples have a dwindling effect. This remains true even when a large proportion of the sample directions are spurious. This approach can be readily achieved in sedimentary sequences by distributing samples stratigraphically, considering each sample as an individual site. However, where the number of potential independent sites is inherently limited the collection of additional in‐site samples can improve the accuracy of the paleopole estimate (although with diminishing returns with increasing samples per site). Where an estimate of the magnitude of paleosecular variation is sought, multiple in‐site samples should be taken, but the optimal number is dependent on the expected fraction of outliers. The use of filters based on angular distance helps the accuracy of paleopole estimation, but leads to inaccurate estimates of paleosecular variation. We provide both analytical formulas and a series of interactive Jupyter notebooks allowing optimal sampling strategies to be developed from user‐informed expectations.


Introduction
Paleomagnetism is concerned with attempting to estimate properties of the ancient geomagnetic field from magnetic records preserved in rocks.This involves laboratory measurements of magnetization directions recorded by rocks and statistical analyses of those directions.Two geomagnetic properties of particular interest that can be estimated from these paleomagnetic directional data are: • The position of the time-averaged (≳10 4 -10 5 a) ancient geomagnetic pole (also known as a paleopole) that corresponds to the Earth's spin axis according to the geocentric axial dipole hypothesis (Creer et al., 1954).• The paleosecular variation of the field, which is associated with the shorter-term (≲10 4 -10 5 a) time-varying position of the geomagnetic pole.
Despite the importance of these two quantities, there has been little exploration of the best sampling practices with which to derive estimates of them.This has resulted in practices that vary according to the traditions of different laboratories; that is, the community largely relies on conventional wisdom.
Abstract Sampling strategies used in paleomagnetic studies play a crucial role in dictating the accuracy of our estimates of properties of the ancient geomagnetic field.However, there has been little quantitative analysis of optimal paleomagnetic sampling strategies and the community has instead defaulted to traditional practices that vary between laboratories.In this paper, we quantitatively evaluate the accuracy of alternative paleomagnetic sampling strategies through numerical experiments and an associated analytical framework.Our findings demonstrate a strong correspondence between the accuracy of an estimated paleopole position and the number of sites or independent readings of the time-varying paleomagnetic field, whereas larger numbers of in-site samples have a dwindling effect.This remains true even when a large proportion of the sample directions are spurious.This approach can be readily achieved in sedimentary sequences by distributing samples stratigraphically, considering each sample as an individual site.However, where the number of potential independent sites is inherently limited the collection of additional in-site samples can improve the accuracy of the paleopole estimate (although with diminishing returns with increasing samples per site).Where an estimate of the magnitude of paleosecular variation is sought, multiple in-site samples should be taken, but the optimal number is dependent on the expected fraction of outliers.The use of filters based on angular distance helps the accuracy of paleopole estimation, but leads to inaccurate estimates of paleosecular variation.We provide both analytical formulas and a series of interactive Jupyter notebooks allowing optimal sampling strategies to be developed from user-informed expectations. 10.1029/2023JB027211 2 of 16 In the hierarchical framework of paleomagnetic studies, a site should correspond to a unit of rock with a common age and direction of magnetization (McElhinny & McFadden, 2000;Tauxe, 2010).Note that in some contributions a site is defined more loosely as a small area or stratigraphic interval from which samples are collected which is not the definition that we use here.In our preferred definition, each site is interpreted to be a spot recording of the time-varying geomagnetic field.In the case of an igneous rock, a site could be an individual lava flow or intrusion, whereas for a sedimentary rock, a site should ideally comprise a single depositional event.In practice, a sedimentary site typically corresponds to a single stratigraphic horizon that is the height of a standard paleomagnetic sample, usually about 2.5 cm.Notice that when sedimentation rates are low, an individual samples may partially time average the field.To move up the hierarchy, a collection of paleomagnetic samples from a given site are averaged and the site mean is transformed from a direction with an associated declination and inclination to pole space with an associated latitude and longitude, where the mean is referred to as a virtual geomagnetic pole (VGP).Following the definition of a site, each VGP ideally represents an independent estimate of the position of the ancient geomagnetic pole at an instant in time.Estimates of paleosecular variation of the ancient geomagnetic field prior to 10 Ma can be made from populations of VGPs by determining their angular dispersion-most typically applied to collections of igneous sites of a similar age (e.g., model G; McFadden et al., 1988).To determine a mean paleomagnetic pole position, a group of similarly aged VGPs are averaged to a Fisher mean paleopole that is taken as the best estimate of the true position of the ancient geographic pole relative to the observation point.
Regardless of whether we seek to discern the statistical properties of the time-averaged pole position or geomagnetic secular variation, our estimates will include error.Paleomagnetic errors come from a variety of sources which can include orientation errors both in the field and the laboratory; measurement errors; and the imperfect isolation of the magnetization of interest from secondary magnetic overprints.The frequent occurrence of imperfect magnetization acquisition or the inability to isolate primary components often results in a sample collection being contaminated by outliers.Orientation and measurement errors are generally assumed to be randomly unbiased (non-systematic) and so can be mitigated through the collection, measurement and directional averaging of multiple samples within a site.However, given finite resources, the collection of additional samples per site will come at the cost of a lower number of sites in many settings.A relevant question is thus: how should we distribute our sampling to minimize uncertainty on the property we seek to estimate?Is it better to take a few sites with many samples?Or many sites with fewer samples?How might the recommended strategy change depending on the objective (in estimating the location of the paleopole vs. the dispersion of VGPs) or the fidelity of the magnetic record?
Some notions concerning sampling have become entrenched in the paleomagnetic literature.For example, many workers seek to collect six to eight samples per site (Butler, 1992), although the rationale for this range is not entirely clear.Opdyke and Channell (1996) suggest that at least three samples per site be collected where determinations of polarity are important, whereas to reliably estimate the dispersion of sample directions within a site, a minimum of four (Cromwell et al., 2018) or five (Tauxe et al., 2003) samples per site has been deemed necessary.
Having a more significant number of samples within the site provides the benefit of being able to apply data filters based on within-site scatter.However, Gerritsen et al. (2022) have found empirically that collecting and averaging multiple samples per site only results in a modest enhancement of the overall accuracy of the paleopole.Thus, where the objective is to estimate the position of a paleopole, Gerritsen et al. (2022) suggested that it is most beneficial to maximize the number of sites, and so the collection of additional single-sample sites should be preferred over the collection of multiple samples from fewer sites.Nevertheless, a statistical and quantitative evaluation of alternative strategies has not yet been conducted.
Here we explore how the distribution of samples across sites affects the performance in the estimation of the paleopole position and the dispersion of VGPs, and how the varying influence of outliers dictates the optimal strategy to best estimate these parameters.We also derive a set of equations that can enable quantitative sampling strategy recommendations based on specified parameters informed by user expectations.

Mathematical Setup
Consider the problem of estimating a paleomagnetic pole μ 0 for some given interval of time, where μ 0 is a three-dimension vector contained in the unit sphere.Observations consist of a collection of a total of n samples distributed among N sites.Because the geomagnetic field is constantly varying around a mean configuration, SAPIENZA ET AL. 10.1029/2023JB027211 3 of 16 each one of the VGPs per site, denoted by μ i with i = 1, 2, …, N, is going to differ from the time-averaged paleomagnetic pole μ 0 .A fundamental assumption in paleomagnetic research is that this secular variation of the geomagnetic field can be effectively estimated through averaging of a sufficiently high number of independent and temporally distributed VGPs.We now seek to evaluate how our choices of n and N will affect our estimation of μ 0 , as well as how we distribute the n samples among the N sites.

Data Generating Model
We define the following data generating model.First, we consider a set with a total of N VGPs sampled from a statistical model of secular variation.Examples of these models include the Gaussian process type model (Constable & Parker, 1988;Tauxe & Kent, 2004) and model G (McFadden et al., 1988).In this contribution, we use model G which captures latitudinal variation in VGP scatter, and considers a mean geocentric axial and dipolar (GAD) field.Then, given a GAD mean direction μ 0 , we sample a series of VGPs μ 1 , μ 2 , …, μ N according to The sampling procedure depends on the mean direction μ 0 and the precision parameter κ b that will depend on the secular variation model used.In this study, we adopt the mild assumption that VGP distributions are circularly symmetric (Tauxe & Kent, 2004) and can be sampled from a Fisher distribution (Deenen et al., 2011;Fisher, 1953), whose dispersion S b , according to model G (McFadden et al., 1988), depends on the sampling latitude λ through the following formula with a and b two empirical coefficients, recently calculated as Doubrovine et al. (2019).At population level, there is a one-to-one relationship between S b and the value of κ b we use to sample from the Fisher distribution.This relationship can be found numerically with an arbitrary level of precision.Then, VGPs can be sampled according to a Fisher distribution with mean direction μ 0 and dispersion parameter κ b (λ).
In the following, we use the supraindex * to denote variables in directional space (inclination-declination). Thus, μ i refers to any given VGP (geographic coordinates) and   *  refers to its corresponding direction in inclination and declination space according to the dipole formula.Note that this transformation between pole and directional space depends on the latitude and longitude of the site.Now, we assume that the ith-site has n i individual directions that follow a Fisher distribution with probability 1 − outlier and with x ij the jth-direction of the ith-site; κ i the dispersion parameters per site; and Unif represents the uniform distribution on the sphere.The parameter p outlier has been added to quantify the effect of outliers in the sampling process.With probability 1 − p outlier we are going to observe a true sample, while with probability p outlier our sample will be corrupted and instead we will observe a spurious direction, modeled by a uniform distribution on the sphere where no information is provided about the true orientation of the field.For cases where we do not want to consider the effect of outliers in the sampling process, we set p outlier = 0. Also, for cases where the number of samples and dispersion parameter are the same for all the sites, we will use n 0 and κ w to refer to any of the n i and κ i , respectively.The parameters used in the model are summarized in Table 1.

Estimation of the Paleopole Direction
We can estimate the true pole location μ 0 by computing the Fisher mean of the VGPs estimated from each site, that is, where R 0 is the length of the resultant vector with ‖⋅‖ denoting the Euclidean norm; and    is the sample mean per site, which results from transforming to pole space the estimate of the pole in directional space, The overall goal of this estimation procedure is to get a value for   0 as close as possible to the ground truth μ 0 .We assess the accuracy of the pole estimate across simulations by computing the root-mean-square error (RMSE) as where  angle ( μ() is the angular distance in degrees between the true pole μ 0 and each one of the simulated estimations    ()  0 , where M is the total number of simulations.

Estimation of the VGP Scatter
Long-term assessment of the paleomagnetic secular variation of the geomagnetic field relies on the VGPs dispersion S b instead of their mean.The observed global dispersion S is estimated as Cox (1970) The global dispersion S 2 is a combination of the dispersion between VGPs S b and that arising from the dispersion among the samples within the site S w (McFadden et al., 1991).We assume that the latter arises purely from random errors associated with orientation, measurement and analytical errors, whereas the former is an unknown, latitude-dependent parameter of the time-averaged geomagnetic field.In order to estimate S b , we first need to extract the within-site dispersion from the global dispersion of the VGPs, that is where the estimated within-site dispersion  Ŝ is computed in directional space following McFadden et al. (1991) and Doubrovine et al. (2019).p outlier [0,1] Outlier rate where 0 is no outliers and 1 is all samples are outliers drawn from a uniform distribution.( 5 + 18sin 2  + 9sin 4  ) the latitude correction introduced in Cox (1970); and R i the resultant vector length defined in Equation 5. Notice that the within-site dispersion will lead to unrealistic estimates of the between-site dispersion in cases where n i is small, n i = 1 being the extreme case where the within-site dispersion cannot be estimated; that is, we cannot disentangle the contribution of the within-site and between-site dispersion.For cases where n i = 1, we set  Ŝ = 0 , that is, the within site dispersion is zero since it cannot be estimated from these series of equations.

Numerical Results
In this section, we present the results of numerical simulations that explore how different sampling strategies affect the estimation of paleopole position μ 0 and VGP scatter S b .These simulations implement the data generating model described in Section 2.1 to draw samples of site directions and associated directions within a given site.For the different numerical experiments, we apply varied choices for the model parameters (Table 1) and we respectively compute the mean pole position   0 and VGP scatter  Ŝ .These simulations enable us to assess what differences in sampling strategy yield estimates of the parameters of interest that are closer to the true value.
We compare the results of these estimates for different choices of filters and compare them to determine which sampling strategy and method yields the highest accuracy.

Trade-Off Between Number of Sites and Number of Samples per Site
The top panel in Figure 1 shows the accuracy of the mean   0 (Equation 6) as a function of the number of sites N and the number of samples per site n 0 in the absence of outliers (p outlier = 0).As the number of sites increases (moving up the y-axis), the total error reduces.The mean error is also reduced if we increase the number of samples per site while keeping the total number of sites fixed.However, in the latter case we see that the improvement resulting from increasing the number of samples per site is small relative to increasing the number of sites and saturates for small numbers of n 0 (see black contour lines).
In a scenario with unlimited resources to collect and analyze paleomagnetic samples, one could seek to maximize both the number of sites (N) and the number of samples per site (n 0 ).However, in the context of finite resources, it is interesting to consider what happens when we keep fixed the total number of samples n = n 0 N but change how these samples are partitioned between number of sites (N) and number of samples per site (n 0 ).As visualized with the white dotted curves in Figure 1 that follow a fixed total number of samples, we see that smaller errors are associated with sampling strategies that prioritize the acquisition of additional sites over the collection of additional samples per site.The same behavior is exposed when we plot the error as a function of the total number of samples n and for different values of n 0 (Figures 2a and 2b).For all choices of samples per site n 0 , the net error decreases at rate  1∕ √  , with the absolute value of the error being additionally affected by n 0 .We quantify the improvement in accuracy due to an increase in the number of samples for different number of samples per site (Figures 2c and 2d).Even by keeping fixed the number of sites and increasing n 0 (and, consequently, increasing the total number of samples), the improvement in accuracy is minimal once n 0 ≥ 3.
The effect of varied numbers for N and n 0 on the accuracy of estimates of VGP scatter (between-site dispersion S b ) is shown in Figure 1.As with estimating pole position, we observe similar behavior for estimating VGP scatter where, given a fixed total number of samples, there is smaller error when the number of sites is higher.However, the benefit of increasing the number of samples per site on reducing the root mean square error between  Ŝ and the true VGP scatter S b is more pronounced.Notice that for n 0 = 1, this error is large due to the inability to estimate the within-site dispersion.However, for n 0 ≥ 3 the error stabilizes and we observe the same behavior as before: the acquisition of more sites over more samples per site leads to better estimation of the VGP scatter assuming n 0 ≥ 3.

Sampling Strategy in the Presence of Outliers
In the previous section, we concluded that the number of sites N is mostly what determines the accuracy of the estimated position of the paleopole.However, an argument for collecting more samples per site is the ability to detect and filter out spurious sample directions.A more fair comparison then is to compare two p outlier = 0, and κ w = 50.The white dashed lines represent isolines where the total number of samples n is constant, and the black lines represent isolines with constant net mean error angle.Each point-wise estimate of the mean error (i.e., each box) is based on the results of 10,000 simulations.While these simulations represent secular variation using model G, similar results emerge from using the TK03 model (Tauxe & Kent, 2004).
SAPIENZA ET AL. 10.1029/2023JB027211 7 of 16 different strategies for estimating the paleopole while taking the possible occurrence of such outliers into account.When using a small number of samples per site n 0 , outlier detection at the site level may be difficult, or directly impossible where n 0 = 1 given that within site consistency cannot be evaluated.However, it is possible to implement methods to filter VGPs that are statistically significantly apart from the mean (e.g., the paleopole) using an iterative cut-off (Vandamme, 1994).We compare this first strategy (n 0 = 1 with Vandamme's iterative cut-off applied on the estimated population of VGPs) with the optimistic case where we collect more samples per site and are able to identify and filter all the outliers directly at the site level.The latter case provides a lower bound on the most optimistic error when using any outlier detection criteria at site level.For this second strategy, no outliers are included in the calculation of the final estimated pole   0 .This means that the effective number of samples used to estimate μ 0 will be less than n, but since the samples removed are spurious directions, we expect the estimate of the paleopole will be more accurate than if we included all the samples in the calculation.We also show the results of the first method without using any outlier filter whatsoever.3a-3c show the distribution of the angles between μ 0 (true GAD pole) and   0 (estimated pole) for the two sampling strategies and with 10%, 40%, and 60% outlier rate, respectively.Even in the presence of outliers, using n 0 = 1 gives lower angular errors than when using n 0 = 5 until the proportion of outliers p outlier increases by a significant amount.We illustrate this by showing in Figure 3d the mean of these two errors as a function of the outlier rate p outlier .Until the proportion of outliers reaches a critical point of approximately 55%, having n 0 = 1 but being able to sample more sites N still out-performs the case where n 0 = 5 and all outliers are removed.Figure 3e shows this critical value of p outlier for different site latitudes and within-site dispersion, showing that we need to have more than 40% outliers before the second strategy out-performs the n 0 = 1 strategy.Figure 3f further shows this critical value in the case where no filter is used for n 0 = 1.It is noteworthy that despite the small variance, this critical value of p outlier grows as a function of site latitude (increasing S b ) and remains relatively similar as a function of within-site dispersion.

Histograms in Figures
A wider comparison of these methods for a range of samples per site n 0 is provided in Figure 4.Here again we can observe that for a fixed number of total samples the scenario with n 0 = 1 leads to better estimation of the true pole until the proportion of outliers becomes very high.On the right side of the panel we can also observe the improvement in accuracy when we fix the number of sites N and we increase the number of samples per site and thus the total number of samples.In agreement with the results shown in Figure 2, we observe that the improvement due to an increase in the number of samples per site n 0 by keeping N fixed is small compared to a change in the overall sampling strategy.Here κ w = 66 is such that the angular dispersion within site is 10°, and λ = 30°.The gray line denotes the case in which we sample for n 0 = 1 but we do not use any outlier detection method.(d) As we increase the number of outliers p outlier , the error increases differently depending on whether we can detect and filter outliers or not.The intersection of the two errors corresponds to the value of p outlier whereupon there is a crossover in the efficacy of the two methods.The shaded envelopes around the solid lines correspond to the 25 and 75 percentile bands.(e) Value of the intersection between the mean errors for strategies 1 and 2 (panel d) for different values of latitude λ and within-site dispersion k w .(f) Same as in (e) but comparing n 0 = 5 with the scenario of no outlier detection.

Figure 4.
Boxplot of the angular error between estimated and true GAD pole for different sampling strategies (number of samples per site, and total number of sites in parenthesis) for (a), (b) p outlier = 0.10, (c), (d) p outlier = 0.40 and (e), (f) p outlier = 0.60.The left column corresponds to the case where the total number of samples is fixed around n ≈ 100, while the right column is the case with fixed number of sites (N = 100) and a variable total number of samples.Following the convention in Figure 3, the red diagrams correspond to n 0 = 1 using the Vandamme filter; the blue to n 0 = 5 with perfect outlier detection algorithm; and the gray boxes correspond to n 0 = 1 with no outlier detection been applied.For all simulations shown, k w = 50 and λ = 30°.SAPIENZA ET AL.We conducted the same analysis for estimating the VGP scatter S b and its associated error.Figure 5 shows the signed percentage error  100% ⋅ ( Ŝ −  ) ∕ for different choices of n 0 .When n 0 = 1, all methods overestimate the real VGP scatter due to the lack of estimates of the within site dispersion   2  (Equation 9).On the other hand, S b tends to be underestimated when we use the Vandamme (1994) filter, since the cut-off of outliers reduces the total dispersion of the VGPs (Equation 7).As we increase the number of outliers, we observe a significant deterioration of the VGP scatter estimation due to the inability to filter outliers.This behavior is rather different to what we observed for paleopole estimation, where the estimation is more robust to outliers.However, after reaching a minimum required value of samples per site (around n 0 = 3), the accuracy only minimally improves by adding more samples per site.In the case where no outliers are present, we are back to the case in Figure 1 where we observed that, for the same budget of total samples n, a larger value of sites N leads to more accurate estimates as long as n 0 ≥ 3.

Theoretical Results
We can quantify the trade-offs between the different model parameters introduced in the previous section by theoretically deriving approximations for the dispersion parameter of the distribution of the estimated pole   0 .This procedure works by finding the effective precision parameter κ eff of a Fisher distribution that minimizes the Kullback-Leibler divergence with respect to the actual dispersion of   0 (Heslop & Roberts, 2020;Kurz et al., 2016).As derived in Kurz et al. (2016), this approach is equivalent to finding the mean direction and dispersion parameter that matches the resultant vector length of the target distribution.In Appendix A, we have provided the essential definitions and theoretical derivations used in our analysis.Using this method, we can derive the following approximation for the dispersion of the estimated   0 :  ()   . ( The effective dispersion parameter κ eff is a function of all the parameters in the model.Under the assumptions of model G (McFadden et al., 1988), we have κ b = κ b (λ) is a function of the paleolatitude according to Equation 2. However, this result holds for other choices of κ b where the Fisher approximation of the VGP scatter is appropriate.
In the case where no outliers are included (p outlier = 0), based on the approximated relationship between angular dispersion S and κ we can approximate the angular error  Err μ0 introduced in Equation 6as This equation allows us to quantify the amount of error associated with different choices of n 0 .Comparing this theoretical approximation with the simulations (Figures 1 and 2) reveals relative error of around 1% between simulation and theory.
From the theoretical expression for  Err μ0 we can see that as n 0 increases, the improvement in accuracy to the final error becomes rather minimal since the coefficient 1/n 0 κ 1 T(λ) is dominated by 1/κ b .Surprisingly, this limit is reached for very small values of n 0 , which shows the small amount of improvement that increasing n 0 adds to the final error, especially when we compare this with the decay of the error given by the factor  1∕ √  .No matter the choice of n 0 , the error goes to zero as N increases.On the other hand, no matter how large n 0 becomes, the overall error will never be lower than  81 • ∕ √  , N being the quantity that controls the overall error most.
The approximation with outliers is accurate for values of which n 0 (1 − p outlier ) is strictly larger than one.For the case of n 0 = 1, a more accurate approximation is given by where ρ(κ) = 1/tanh(κ) − 1/κ is the expected length of a Fisher distribution with precision parameter κ and ρ −1 its inverse.When using a perfect outlier algorithm with (1 − p outlier )n 0 ≥ 2, the approximation in Equation 13 is still appropriate.Further investigation is needed to estimate the final error when using iterative cut-off methods such as the Vandamme filter (Vandamme, 1994).
Notice that the theoretical expression for the final dispersion can be used to define confidence intervals around the true pole for a specific study case.Effectively, given a sampling procedure with prescribed N and n 0 , we can estimate the dispersion parameters κ w and κ b and then, by plugging these into Equations 12 and 13 obtain a confidence region around the sample estimated pole.This procedure will take into account the hierarchical nature of paleomagnetic samples at the moment of quantifying uncertainty.

Recommendations
When the goal is to estimate the position of a paleopole, our results show that the total number of sites N has a far larger impact on accuracy than the number of samples per site n 0 .We therefore recommend the following rule of thumb for sample collection where the objective is paleopole estimation: the more samples the better, but efforts to maximize the number of independent sites will have a greater effect on improving accuracy than more samples per site.In particular, the benefit of collecting more samples per site is small for n 0 ≥ 3 and diminishes at n 0 ≥ 5. Analyzing more samples than these values per site is inadvisable if it will result in fewer overall sites in a given study.As was concluded in Gerritsen et al. (2022), for the purpose of computing a paleopole and for a fixed total number of samples, it is always better to collect these samples from different sites than to collect more samples per sites.In the context of sedimentary sections, this result strongly supports stratigraphic sampling strategies of one sample per horizon for directional estimation where each sample is its own site, consistent with previous findings by Vaes et al. (2021).Collecting a large number of single-sample sites is also beneficial for the application of the elongation-inclination (E/I) correction for inclination shallowing (Tauxe & Kent, 2004), which requires N ≥ 100 to be robust.In settings of limited sites or where moving between sites is itself resource intensive, as can be the case of igneous intrusions, there is a benefit to more samples per site given that it can improve site level direction estimates and enable within site outlier detection.
A recent approach to synthesize site data into apparent polar wander paths developed by Gallo et al. (2023) enabled the propagation of directional uncertainty using site level precision κ w estimated by multiple samples in a given site.This approach is not possible when applying a n 0 = 1 sampling strategy.However, estimation of the in-site dispersion can be derived using a different estimator such as the maximum angular deviation (MAD) of a directional fit (Khokhlov & Hulot, 2016).
For paleopole estimates, filters based on populations of VGPs can aid in the detection of outliers (e.g., Vandamme, 1994).If there is an appreciable outlier rate, such filtering schemes are necessary when n 0 = 1 given that outliers cannot be detected through within site consistency.When conducting a study with a low number of samples per site, the site consistency test proposed by Gerritsen et al. (2022) can be applied where more samples are analyzed for selected sites.This field test can be used to gain insight into within site reproducibility and precision for a given lithology.Multiple samples per site can also be when the presence of single outliers would have a major impact on interpretations such as in the case of interpreting geomagnetic polarity or transitional directions.We recommend that researchers use of Equation 13to obtain an estimate of the net error as a function of the expected parameters present in the sampling.
An important caveat concerning the use of directional filters is that while the mean may be relatively insensitive to their effects, they can significantly distort the shape of the true directional distribution and should therefore be avoided where the latter is a parameter of interest (e.g., paleosecular variation studies).Indeed, the presence of outliers has a major impact on the estimation of the dispersion, and thus the VGP scatter S b .Increasing the number of samples per site n 0 is beneficial as long as this helps us to detect outliers more accurately.However, this is not always straightforward using conventional data filters and cutoffs, which leads to a reliance on the expert's subjective interpretation (Gerritsen et al., 2022).There is a greater improvement in the accuracy of estimates of VGP scatter through increasing the number of samples per site, even in the absence of outliers, than there is for estimating the mean pole position.However, the improvement in the estimate of the VGP scatter progressively diminishes for increasing samples per site.When outliers can be detected efficiently, and for a minimum of three or four samples per site, the same trade-offs as noted above for paleopole estimation again apply: the preferential collection of more sites over more samples per site leads to more accurate estimates of the VGP scatter.And again, the most optimal sampling scheme given any suite of expected parameters can be determined from the results presented herein.(A10) Finally, since μ i (the mean direction for    ) is also Fisher distributed with mean μ 0 and precision parameter κ b , using Proposition 2 we have that the final pole   0 will have dispersion parameter equal to  1 +    ( 1− outlier )  () ()   . (A11) Now, if n i = n 0 are all the same, we can average all the    to came up with the final pole dispersion parameter (A12) Assuming ρ(κ i ) ≈ 1, we then obtain that the final estimate   0 has a concentration parameter κ* approximately equal to  1 +     (1−outlier)0 ()   , which is the same expression as in Equation 12.In order to derive Equation 13, we rely again in the approximation of the dispersion given in Equation A6.
As we mentioned before, Proposition 3 will fail when the number of samples per site n 0 is small and the number of outliers p outlier is large.For those cases, a better approximation is given by Equation 14.This arises from computing the expected vector length without outliers and then multiply the expected vector length by the factor (1 − p outlier ), which gives an approximated vector length for this case.We then find the corresponding κ for such resultant length by computationally inverting the function ρ(κ).
per site.We will assume n 0 = n 1 = … = n N and denote n = Nn 0 the total number of samples.κw [0, ∞)Precision parameter of the Fisher distribution for a given site, where k w = 0 results in a uniform distribution on a sphere and k w → ∞ is a singular point.κb [0, ∞)Precision parameter of the Fisher distribution between sites.For the model G, this is directly determined by λ.

Figure 1 .
Figure1.Root mean square error (RMSE) in degrees between site mean poles and the true GAD pole (top panel) and between-site VGP dispersion (bottom panel) as a function of different combinations of the total number of sites N and the number of samples per site n 0 .For this diagram, we use a paleolatitude of 30° (κ b ≈ 35), p outlier = 0, and κ w = 50.The white dashed lines represent isolines where the total number of samples n is constant, and the black lines represent isolines with constant net mean error angle.Each point-wise estimate of the mean error (i.e., each box) is based on the results of 10,000 simulations.While these simulations represent secular variation using model G, similar results emerge from using the TK03 model(Tauxe & Kent, 2004).

Figure 2 .
Figure 2. (a) Root mean square error (RMSE) angle of the computed mean pole as a function of the total number of samples n for different values of samples per site n 0 where an increase in samples per site results in a decrease in the number of sites.(b) Displays the same values on a logarithmic scale, making explicit the  1∕ √  decay of the error, independent of the value of n 0 .(c) RMSE as a function of the total number of sites N for different values of n 0 where an increase in n 0 increases the total number of samples, also in (d) logarithmic scale.For all the figures, we set λ = 30°, κ w = 50, and p outlier = 0.The dot-dashed lines in all the plots represent the theoretical approximation (see Section 4).

Figure 3 .
Figure 3.Comparison between two different sampling strategies to determine a mean paleomagnetic pole position in the presence of outliers for a fixed number of total samples (n = 100).The red histograms and curve are strategy 1 where we have one sample per site (n 0 = 1), one hundred sites (N = 100) and we use the Vandamme filter.The blue histograms and curve are strategy 2 where n 0 = 5, (N = 20) and we filter all the outliers (perfect detection algorithm) for (a) p outlier = 0.10 (10% of sample directions are outliers); (b) p outlier = 0.40; and (c) p outlier = 0.60.Here κ w = 66 is such that the angular dispersion within site is 10°, and λ = 30°.The gray line denotes the case in which we sample for n 0 = 1 but we do not use any outlier detection method.(d) As we increase the number of outliers p outlier , the error increases differently depending on whether we can detect and filter outliers or not.The intersection of the two errors corresponds to the value of p outlier whereupon there is a crossover in the efficacy of the two methods.The shaded envelopes around the solid lines correspond to the 25 and 75 percentile bands.(e) Value of the intersection between the mean errors for strategies 1 and 2 (panel d) for different values of latitude λ and within-site dispersion k w .(f) Same as in (e) but comparing n 0 = 5 with the scenario of no outlier detection.

Table 1
Parameters Used for the Sampling of Poles