Computers and Geosciences An optimization method for paleomagnetic Euler pole analysis

Owing to the axial symmetry of the Earth’s magnetic field, paleomagnetic data only directly record the latitudinal and azimuthal positions of crustal blocks in the past, and paleolongitude cannot be constrained. An ability to overcome this obstacle is thus of fundamental importance to paleogeographic reconstruction. Paleomagnetic Euler pole (PEP) analysis presents a unique means to recover such information, but prior implementations of the PEP method have incorporated subjective decisions into its execution, undercutting its fidelity and rigor. Here we present an optimization approach to PEP analysis that addresses some of these deficiencies—namely the objective identification of change-points and small-circle arcs that together approximate an apparent polar wander path. We elaborate on our novel methodology and conduct some experiments with synthetic data to demonstrate its performance. We furthermore present implementations of our methods both as adaptable, stand-alone scripts in Python and as a streamlined interactive workflow that can be operated through a web browser.


Introduction
The continuously changing positions of continents and oceans throughout Earth history have played a fundamental role in shaping an enormous array of Earth processes. An important step toward an understanding of the long-term evolution of Earth is thus the quantification of past lithospheric plate motions. Following from Euler's rotation theorem, we may precisely describe plate motions in the form of finite rotations-commonly called Euler vectors (Cox and Hart, 1986). However, our ability to estimate Euler vectors of past plate motion from geological and geophysical data remains limited for much of Earth history. Improvements and innovations in the methods of Euler vector inference are therefore of broad importance.
An assumption common to all methods of Euler vector inference is that the motion of a given plate with respect to some arbitrary reference can be treated as stable over some nominal interval, such that it can be expressed by a single stage Euler vector (Gordon et al., 1984;Cox and Hart, 1986). From the perspective of the system of reference, this stage Euler vector will be fixed and any point belonging to the plate will move along a plane perpendicular to that axis of rotation, tracing a circle centered on it. If an object passes from the domain of the reference to the plate (becoming a passive occupant of the latter), it too will follow the path of a circle during the stage interval, with a displacement proportional to its residence time on the plate. A series of such objects, passed from the reference domain to the plate progressively in time during the stage interval, would therefore * Corresponding author.
form an arcuate array decorating the trace of a circle about the Euler pole. If a sufficient number of such objects could be identified, their distribution could be inverted to retrieve the location of the rotation axis, and their corresponding residence times on the plate could be used to determine its angular displacement.
At least two kinds of observational records present such age-progressive arcuate segments that can be used to infer stage Euler vectors: hotspot tracks and paleomagnetic apparent polar wander paths (APWPs) (Fig. 1). Hotspot tracks record the motion of a plate with respect to the underlying mantle (neglecting any lateral plume motion), whereas APWPs record the motion of a plate relative to the Earth's magnetic field, which is aligned with the planetary rotation axis. The inversion of paleomagnetic data to retrieve stage Euler vectors -or paleomagnetic Euler pole (PEP) analysis -is of particular interest because it offers the possibility to recover full kinematic descriptions of past plate motion (i.e. including relative paleolongitude change), in contrast to the conventional analysis of paleomagnetic data, which constrains only paleolatitude and paleoazimuth. However, despite being first conceptually introduced more than half a century ago (Francheteau and Sclater, 1969), PEP analysis has seen limited application (e.g. Gordon et al., 1984;Beck and Housen, 2003;Smirnov and Tarduno, 2010;Wu et al., 2017;Swanson-Hysell et al., 2019). This appears to be due, at least in part, to the fact that most prior approaches to PEP analysis have incorporated subjective decisions into its executione.g., the subjective identification of change-points and distinct arcuate  1. Cartoon illustrating the relationship between an apparent polar wander path (APWP; red circles), a hotspot track (red triangles) and the Euler poles (EP; black poles) describing the kinematics of the host plate (gray polygon with static boundaries for simplicity) which gave rise to them over some interval of time ( 0 − 10 ). A sequence of paleomagnetic poles and/or hotspot volcanoes may be inverted to discover the location of the corresponding EP. (For interpretation of the references to color in this figure legend, the reader is referred to the web version of this article.) segments that together approximate an apparent polar wander path. To reduce this source of potential bias, our objective here is to introduce an unsupervised and objective framework with which to conduct PEP analysis-although our methodology is applicable to stage Euler vector inference more generally. We furthermore seek to provide an accessible implementation of this framework for community use via Jupyter notebooks (Kluyver et al., 2016) which are easily deployed using Binder (Jupyter et al., 2018), an open-source tool that enables the creation and sharing of computing environments and code.

Paleomagnetic Euler pole analysis
If a given plate's motion relative to the planetary spin-axis is constant for some interval of time, the paleomagnetic poles acquired by the plate over that interval will trace an arc across the surface of the sphere, which can be described by either a small-circle or great-circle. The fundamental task in PEP analysis is to invert an assemblage of paleomagnetic data to retrieve the average instantaneous kinematics that were acting during such an interval -i.e. the stage Euler vector -which can be concisely expressed by three parameters: the latitude and longitude of a pole of rotation ( , ) and an angular displacement . However, because plate motions intermittently change according to changing torques upon them, APWPs comprise a discrete series of arcuate segments separated by change-points whose number and timing is unknown. Thus, PEP analysis also necessitates the division of paleomagnetic data into a number of temporal subsets -or groups -such that all the paleomagnetic poles of a given group represent the same (unknown) stage Euler vector. We are thus presented with two tasks that we may treat as distinct optimization problems: 1. Grouping paleomagnetic data into subsets that manifest distinct stage Euler vectors. 2. Determination of the best-fit stage Euler vector for each paleomagnetic subset.
Here we develop an unsupervised learning method to solve these problems in the reverse order by first determining the best-fit stage Euler vector to all possible APWP segments (all ordered subsets of the paleomagnetic data), and then identify the least-cost sequence of Euler vectors approximating the total APWP and the respective changepoints. More generally, our methodology provides a tool to fit piecewise minimum circle segments to data on the sphere similarly to piecewise linear segmentation in one-dimensional time series (Jekel and Venter, 2019).

Methodology
We start by considering that we have a series of paleomagnetic poles that we call , with = 1, 2, … , . Each one of these paleomagnetic poles have an associated date and can be represented in Cartesian coordinates ( , , )-with 2 + 2 + 2 = 1-and also by their latitude and longitude, ( , ). Without loss of generality, we assume that the paleomagnetic poles are time ordered, meaning < for < . Given two indices and with < , we denote by − = { , +1 , … , } the subsequence of paleomagnetic points contained in the interval of time between and . We use to refer to Euler poles, where is the instantaneous Euler pole at time . Lastly, we denote with ( , ) the matrix describing a rigid counterclockwise rotation around the axis by the angle , which thus represents the average instantaneous rotation over the time interval -or stage -i.e. a stage Euler vector.

Euler Vector optimization from paleomagnetic data
In this section we consider the problem of fitting an arc (whether belonging to a small-or great-circle) to a subsequence of paleomagnetic poles. Given and with < , we want to find the coordinates of the Euler pole and an angle ∈ [0, ∕2] such that the points in − are approximated by the trace of a circle with angular radius , centered on Euler pole (Fig. 2a).
We infer the parameters of the circle by minimizing the sum of the squares of the geodesic distance from each data point to the circle trace. The geodesic distance can be computed as the absolute difference of the angle between the Euler pole and each paleomagnetic pole and the angle : with ( , , ) the coordinates of the Euler pole (see Fig. 2b). Then, we consider an optimization problem that seeks to minimize the cost function given by where the optimization is performed with respect to the three parameters of the circle that we write as the three-dimensional vector = ( , , ). In the noiseless case where all the points lie along a small-or great-circle and satisfy the equation + + = cos for all ∈ − , the quantity  − ( ) achieves its minimum at zero. Since each ( , , ) is typically affected by random perturbations, there is no exact fit between the set of points and a small or great circle trace. This approach was also considered in Gray et al. (1980) and Fujiki and Akaho (2009), where the authors use a constrained optimization method to find the Cartesian coordinates of the pole.
In contrast to the ordinary least squares method, this optimization problem has no analytical solutions. Although an optimal solution can be found numerically via grid search (e.g. Gordon et al., 1984), such an approach is very inefficient. Here we employ the nonlinear conjugate gradient (CG) method of Polak and Ribiere (1969) to find an optimal solution for this nonlinear optimization problem due to its very low memory requirements and fast convergence rate. We take advantage of the solver provided in the open-source SciPy package (Virtanen et al., 2020) to perform the computations. It is worth noting that even though we use the constraints the periodicity of the cosine and sine functions allow us to treat this problem as an optimization problem without constraints. The final update yields a single Euler pole which best describes the subsequence of paleomagnetic poles (Fig. 2b). Its cost function gives the sum of the residuals between the predicted and observed paleomagnetic poles, and thus, a measure of goodness of fit. Using the spherical law of cosines, from the geographic coordinates of the inverted Euler pole ( , ) we can infer a stage rotation angle , that is, the rotation angle needed to translate the youngest paleomagnetic pole to the oldest one , at an average angular velocity = ∕( −1 − ). To illustrate the performance of our methodology, an example Jupyter notebook with synthetic data is provided via Binder (https://mybinder.org/v2/ gh/LenGallo/PEPy/HEAD).

Identifying the least-cost sequence of Euler vectors approximating an APWP
Equipped with a method to invert a given set of paleomagnetic data to recover the best-fitting Euler vector describing them, we now present a method that seeks to identify the unknown change-points (i.e. platemotion changes) that divide an APWP into discrete segments associated with distinct stage Euler vectors. This is accomplished by seeking that sequence of change-points and Euler vectors which maximizes the goodness of fit between the predicted and observed paleomagnetic poles (Fig. 3).
Our goal is to identify a subsequence of ordered indices (i.e. changepoints) 1 , 2 , … , +1 , with 1 = 1 and +1 = + 1, such that each one of the segments − +1 belongs to the same circle. This implies that ( ) = = ⋯ = +1 −1 and ( ) = = ⋯ = +1 −1 for all = 1, 2, … , , where we define by ( ) and ( ) the Euler pole and radius of the circle in the th segment -or stage -of the APWP. In order to do so, we seek to minimize the following cost function where the optimization is performed over all possible increasing paths ( 1 , 2 , … , +1 ) of unknown length with 1 = 1, +1 = ; and with respect to the Euler pole ( ) and rotation angle ( ) associated to each one of the segments − +1 . The second term in Eq.
(3) was added in order to account for the contribution of the change points to both the previous and consecutive segments in the APWP.
The last term in the cost function (3) corresponds to a regularization added to avoid overfitting, with ≥ 0 being a hyperparameter of the model. Large values of will increase the value of the cost function, such that a good-fitting APWP with fewer segments will be favored. Note that Eq. (3) can be rewritten as with ( ) the parameters of the minimum circle for the segment − +1 . Based on the method introduced in Section 3.1, we can find the Euler pole and amplitude for each one of the possible segments − for each possible value of and , and then compute the minimum inside the parentheses in Eq. (4).
Our approach to minimize the cost function of Eq. (4) proceeds with an exploratory analysis to characterize all possible pathways − , and then employs a graph representation of the APWP to discover the optimal sequence of change-points ( 1 , 2 , … , +1 ) and the number which yields the minimum sum of squared misfits between the set of paleomagnetic poles and the modeled arc segments describing them.
Graphs are compact mathematical structures that amount to a set of connected objects, where objects are referred to as nodes and connections are referred to as arrows (or edges for undirected graphs). Arrows can have an associated weight that indicates the cost of traversing them. An APWP can then be considered as a graph where the nodes correspond to paleomagnetic poles and pairs of nodes are connected by arrows, to which a numerical weight can be assigned (Fig. 4). We may thus model an APWP = { 1 , 2 , … , } where the arrow from to exists for all , such that > and spans at least 3 paleomagnetic poles (at least 3 non-colinear points are required to define a plane). The weight of each arrow is given by the optimal value of the function  − plus . In this context, a candidate approximation of the APWP can be seen as the directed path ( 1 , 2 , … , +1 ) with 1 = 1 and +1 = , where each pair of consecutive nodes in the path ( , +1 ) is connected by an arrow. The total cost of each alternative series of directed paths is given by the sum of the individual weights of the arrows in the path, which coincides with Eqs. (3) and (4). The characterization of all possible pathways − is accomplished by a moving window searching algorithm, where each possible − is analyzed.
In graph theory, Dijkstra's algorithm allows identification of the shortest path between two given nodes, such that the sum of the weight of its constituent edges is minimized (Dijkstra, 1959). We perform shortest path searches through the open-source package SciPy (Virtanen et al., 2020). The hyperparameter is adjusted to avoid overfitting, namely, by preventing an over-estimation of the number of segments, . The statistical literature offers several different approaches to select the optimal value of the hyperparameter . We apply the elbow method to find the value that yields the optimum value of (Bholowalia and Kumar, 2014). This is achieved by plotting the total cost against the number of segments and evaluating their tradeoffs: small values of will add large total costs but at some threshold value of the cost will be observed to decrease sharply. Above this threshold, further increasing will not yield significant further reductions to the total cost. The graphical 'elbow' in the plot thus points to this optimal value of . Other methods to select the optimal number of segments include the Akaike information criterion (AIK) and Bayesian information criterion (BIC) (Friedman, 2017).
We summarize our algorithm to find the optimal sequence of Euler vectors to approximate a given sequence of paleomagnetic poles as follows: 1. For each time-ordered pair of paleomagnetic poles and , find the circle in − = { , … , } that minimizes the value of  − .
2. Construct a directed graph and select through the elbow method to avoid overfitting. 3. Use Dijkstra's algorithm to find the shortest path between the first pole 1 and the last pole . 4. From the optimal path, recover the parameters of the stage Euler vectors for each APWP segment.

Results
To illustrate the application of our methodology, we execute a simple test involving idealized synthetic paleomagnetic data. We start by generating a stochastic model of the drift of a plate over 200 Ma by randomly generating stage Euler vectors -represented in the matrix form as ( , ) -which remain constant during a temporal stage of given duration. The geographic coordinates of each Euler vector ( , ) are randomly selected from a uniform distribution on the unit sphere and their angular velocities are chosen from a uniform distribution constrained between 0.5 • /Ma and 2.5 • /Ma (which yields plate velocities up to 28 cm/yr). The corresponding duration of each Euler vector is randomly selected from a uniform distribution between 10 and 30 Ma, and Euler vectors are drawn until their summed durations reach or exceed 200 Ma. Using this stochastic kinematic model, we assemble a synthetic APWP populated by paleomagnetic poles every 1 Ma (Fig. 5a). To demonstrate the ability to incorporate noise, we perturb the geographic coordinates of the paleomagnetic poles by Gaussian random noise up to 0.1 • . Note that the latter step is only taken to demonstrate the capacity of our method to handle noise; a more comprehensive consideration of the feasibility of PEP analysis given standard paleomagnetic noise has yet to be addressed.
Next, we proceed to invert the synthetic paleomagnetic data according to the methodology outlined above. First, for each candidate APWP segment (all ordered subsets of the paleomagnetic data) we compute the best-fitting Euler vector by minimizing Eq. (4). We then organize the computed costs as a graph in the form of an adjacency matrix, which can be rendered visually as a heatmap (Fig. 5b). Visual inspection of this heatmap reveals strong contrasts in the magnitude of the cost function, which tends to increase sharply with the passage of a change-point. Using the regularization introduced in Eq. (4) and the elbow method ( Fig. 5c and Fig. 5d), we identify the shortest path through this graph following Dijkstra's algorithm, which yields the number of segments and a discrete set of change-points ( 1 , 2 , … , +1 ). From this sequence of change-points we retrieve the parameters of the minimum circle ( ) for each segment − +1 , enabling us to infer the complete kinematic history through the series of stage Euler vectorŝ ( , ). This workflow can be run interactively in a web browser via Binder (https://mybinder.org/v2/gh/LenGallo/PEPy/HEAD) with a synthetic example in the Jupyter Notebook (APWP.ipynb).
To measure the performance of our graph-based PEP methodology in terms of accuracy, we generate 1000 stochastic models of the drift of a plate and invert the paleomagnetic data from each for the series of underlying stage Euler vectorŝ ( , ). Comparisons between the pairs of true ( , ) and estimated̂( , ) stage Euler vector series are then assessed by a variety of metrics. Using the inverted kinematic model, we assemble a synthetic APWP and compute the time-dependent geodesic distance (i.e. residual) between it and the known APWP. In Fig. 6a we show the ensemble median of this residual, along with the 0.25-0.75 and 0.16-0.84 percentile ranges representing standard confidence intervals. As could be expected, we observe that average misfits tend to accumulate backward in time, but we note that they never exceed 3 degrees. In 6b-6c we further dissect the discrepancies in the parameters of the Euler vectors: in panel b we show the ensemble median of the time-dependent geodetic distance between Euler poles, while panel c shows the difference in angular velocities. These results reveals that the average misfit between the true and estimated stage Euler vectors of any given simulation is low. As another way to evaluate differences between the true kinematics and our inferences, we can quantify the discrepancies in the drift of a given point or points after rotation by the series vs.̂. To consider a range of misfits arising from different observation points, we start by generating 5 nodes evenly distributed in latitude (along the central meridian) and then calculate the time-dependant geodesic distance between the inferred drift and the true drift for each point. The mean of these distances is then computed for each time in order to estimate the time-dependant average drift (i.e. total motion) misfit. Fig. 6d shows that the median (of the ensemble of means) of the drift misfit never exceeds 7.5 degrees. As a final metric, we measure the surface velocity misfit associated with our kinematic inferences. Using the same points tracked above, we calculate the surface velocity at each point at each time by computing the cross-product between the point locations and the true and inferred Euler vectors (6e).
Considering all these ensemble medians to be representative of the misfit associated with the workflow in the absence of significant noise, we conclude from these metrics that our methodology can robustly retrieve the kinematics of a plate from its APWP.

Discussion
Conceptually, in providing a means to retrieve the full kinematic histories of tectonic plates from paleomagnetic data, PEP analysis presents an enormously powerful tool. And yet, despite its foundations being laid more than half a century ago (Francheteau and Sclater, 1969), lingering doubts about the rigor and viability of PEP analysis have left it largely stranded in obscurity.
One of the most significant and persistent shortcomings in PEP analysis has been the subjective determination of which APWP segments to invert, and which inversions to retain. As noted by previous studies (e.g. Tarling and Abdeldayem, 1996;Wu and Kravchinsky, 2014), the optimal fits to real paleomagnetic data tend toward short segments associated with proximal circles of small-radii, which often imply improbable rates of plate motion change, unrealistic angular velocities and/or geologically dubious kinematics. As a consequence, the selection of which segments to fit has been done a priori by visual inspection (e.g. Gordon et al., 1984;May and Butler, 1986;Wu and Kravchinsky, 2014), undercutting the rigor of the analysis and making it prone to subjective human bias. In some instances the best-fitting small-circle solutions have also been dismissed in favor of either subjective assessments of more feasible solutions (Tarling and Abdeldayem, 1996) or forced great-circle fits (Smirnov and Tarduno, 2010). Although Smirnov and Tarduno (2010) consider the use of great-circle fits to be a conservative measure, we note that approach confines all stage Euler vectors to the equatorial plane. Rose (2016) presented a significant step forward in introducing a Bayesian approach to PEP analysis (later applied by Swanson-Hysell et al. (2019)), which addressed some of these issues of subjectivity and presented a pathway toward a more comprehensive treatment of uncertainty in such analyses. However, in its present formulation, that approach still regularizes the problem through specification of the number of change-points.
Our methodology addresses several of these deficiencies, and notably the problem of objectively selecting change-points, by identifying the least-cost solution without supervision. The least-cost solution is a full approximation of the APWP, such that the location of the change-points and the parameters of the stage Euler vectors defined between them are fully and automatically determined by the algorithm. Although the method employs a regularization parameter to avoid overfitting, the elbow method allows a more objective selection of this parameter. Other parameters that can be modified by the user (number of inversion seeds, maximum duration of a given stage) -while capable of affecting the outcome if set to too low values -are used to improve efficiency and so can be set to arbitrarily high values. Admittedly, our experiments here have been conducted on idealized synthetic data, and it can be anticipated that experiments with real (noisy) data may introduce circumstances in which the least cost solution can otherwise be shown to be unrealistic (e.g. requiring impossibly high rates of rotation). However, the flexibility of our methodology is such that minor adaptions to our optimization framework (e.g. specifying constraints on rotation rates in the optimization problem) allow such issues to be tackled systematically.
Despite the advancements presented herein, there remain a number of significant challenges associated with the application of PEP analysis to real paleomagnetic data. The first is the feasibility of the methodology in light of the noise accompanying paleomagnetic data, arising from both intrinsic (geomagnetic secular variation) and extrinsic (e.g. measurement errors, tectonic rotations, inclination shallowing, erroneous age assignments) uncertainties. Although the cost function can be further constrained to avoid unrealistic solutions, there exists some level of noise above which the uncertainty in the solution space becomes so vast as to render the optimal solution meaningless. However, a systematic exploration of these noise thresholds, as well as the trade-offs between noise level and the rate of plate motion, has yet to be conducted. Furthermore, we note that current APWPs Torsvik et al. (e.g. 2012) describe plate motion in 10 Ma, yielding only small quantities of data to be fitted by our algorithm. However, advancements in the construction of APWPs from VGP-level data (e.g. Vaes et al., 2022) provide promising data-rich alternatives that can enable PEP analysis to reach higher levels of precision.
A second challenge concerns the phenomenon of true polar wander (TPW). TPW is a rotation of the entire solid Earth that occurs in response to changes in the planetary moment of inertia following changes to Earth's internal mass distribution (Goldreich and Toomre, 1969). Because TPW always acts to restore the largest inertial axis to the planetary rotation axis, TPW always occurs about an equatorial axis. Thus, during a TPW event, the entire tectonic plate system (together with the mantle) moves relative to the spin axis by a common angular displacement. Paleomagnetic data, in recording the movement of the lithosphere relative to the magnetic field (which is not rotated by TPW) therefore represents some composite signal of TPW and differential plate motion. Before PEP analysis can be applied to any given paleomagnetic dataset, the TPW signal needs to be removed from it, if not otherwise assumed to be negligible. Unfortunately, time-dependent estimates of TPW are themselves uncertain and often controversial. More significantly, prior to Cretaceous time, TPW estimates are derived from plate kinematic models (Steinberger and Torsvik, 2008); any results from PEP analysis employing these TPW estimates would thus undermine their own foundation wherever they challenged the underlying kinematic model. Future progress on this front will likely necessitate a joint modeling approach.
We end with a reminder that the methodology and tools presented herein, while motivated by the desire to improve upon existing approaches in PEP analysis, are not restricted to paleomagnetic applications. They could, for example, be equally applied to hotspot tracks. There are also broader applications of the small-circle fitting method (which is intentionally presented as a standalone tool), as arise, for example, in structural geology. In particular, the identification and analysis of conical folds is achieved by the best-fitting small circle to a set of poles to folded layers (e.g. Ramsay, 1964;Cruden and Charlesworth, 1972;Pastor-Galán et al., 2012;Mulchrone et al., 2013).

Conclusions
Owing to the axial symmetry of the Earth's magnetic field, paleomagnetic data only directly record the latitudinal and azimuthal positions of crustal blocks in the past, and paleolongitude is not constrained. However, full paleo-kinematic information can be retrieved from assemblages of paleomagnetic data with use of PEP analysis. Here we introduced an optimization approach to PEP analysis that enables it to be conducted in a fully data-driven way. Our methodology follows the solution of two distinct optimization problems. First, the parameters of the best-fitting circle to a set of paleomagnetic poles are found through a gradient-based Euler vector optimization method. Then, through an exploratory data analysis, all the possible segments at every point of the APWP are assessed in order to transform the data into a weighted, undirected graph. This graph representation of the APWP allows the identification of that sequence of stage Euler vectors which together present the least cost (shortest path of the graph) description of an observed APWP. As demonstrated by a set of simple experiments with synthetic data drawn from stochastic kinematic histories, our algorithm is able to invert paleomagnetic data to recover kinematic parameters that closely approximate the known (true) values. However, while the tools presented herein mark an important step forward, a number of significant challenges remain for the application of PEP analysis to real paleomagnetic data.

Declaration of competing interest
The authors declare the following financial interests/personal relationships which may be considered as potential competing interests: Leandro Cesar Gallo reports financial support was provided by University of Oslo.