Geometric foundations for scaling-rotation statistics on symmetric positive definite matrices: minimal smooth scaling-rotation curves in low dimensions

We investigate a geometric computational framework, called the "scaling-rotation framework", on ${\rm Sym}^+(p)$, the set of $p \times p$ symmetric positive-definite (SPD) matrices. The purpose of our study is to lay geometric foundations for statistical analysis of SPD matrices, in situations in which eigenstructure is of fundamental importance, for example diffusion-tensor imaging (DTI). Eigen-decomposition, upon which the scaling-rotation framework is based, determines both a stratification of ${\rm Sym}^+(p)$, defined by eigenvalue multiplicities, and fibers of the "eigen-composition" map $SO(p)\times{\rm Diag}^+(p)\to{\rm Sym}^+(p)$. This leads to the notion of scaling-rotation distance [Jung et al. (2015)], a measure of the minimal amount of scaling and rotation needed to transform an SPD matrix, $X,$ into another, $Y,$ by a smooth curve in ${\rm Sym}^+(p)$. Our main goal in this paper is the systematic characterization and analysis of minimal smooth scaling-rotation (MSSR) curves, images in ${\rm Sym}^+(p)$ of minimal-length geodesics connecting two fibers in the "upstairs" space $SO(p)\times{\rm Diag}^+(p)$. The length of such a geodesic connecting the fibers over $X$ and $Y$ is what we define to be the scaling-rotation distance from $X$ to $Y.$ For the important low-dimensional case $p = 3$ (the home of DTI), we find new explicit formulas for MSSR curves and for the scaling-rotation distance, and identify the set ${\cal M}(X,Y)$ of MSSR curves from $X$ to $Y$ in all "nontrivial" cases. The quaternionic representation of $SO(3)$ is used in these computations. We also provide closed-form expressions for scaling-rotation distance and MSSR curves for the case $p = 2$.


Introduction
In recent years there has been increased interest in stratified manifolds for statistical applications. For example, stratified manifolds have recently received attention in the study of phylogenetic trees [10,22] and Kendall's 3D shape space [26]. New analytic tools for such manifolds are fast developing [13,8].
Our work contributes to the development of such tools on both a theoretical and practical level, providing a solid geometrical foundation for development of statistical procedures on the stratified manifold Sym + (p), the set of p × p symmetric positive-definite (SPD) matrices.
In this work, we investigate a geometric structure on Sym + (p), resulting from the stratification defined by eigenvalue multiplicities. This stratification is tied inextricably to our main goal in this paper: the systematic characterization and analysis of minimal smooth scaling-rotation curves in low dimensions. Such curves were defined in [25] as smooth curves whose length minimizes the amount of scaling and rotation needed to transform an SPD matrix into another. The techniques developed in this paper, when applied to the case p = 3, allow us to find new explicit formulas for such curves. Our work builds fundamental mathematical and geometric grounds that facilitate developments of statistical procedures for SPD matrices, and is instrumental in understanding general stratified manifolds.
To elaborate how our work here relates to advancing statistical analysis of SPD matrices, we present below a rather long introduction. We first give some background on statistical analysis of SPD matrices, and more generally on analysis of data in stratified manifolds, followed by a brief discussion on the statistical motivation of studying scaling-rotation curves and distances. We then informally introduce the main results of the paper.

Background
Statistical analysis of SPD matrices. The statistical analysis of SPD matrices has several applications, especially in some biological problems, such as diffusion-tensor imaging (DTI). A diffusion tensor may be viewed as an ellipsoid, represented by a $3 \times 3$ SPD matrix. DTI researchers are interested in smoothing a raw noisy diffusion-tensor field [41], registering fibers of tensor fields [2], regression models [43,40] and classification of 'noisy' tensors into strata [44]. Our eigenvalue-multiplicity stratification categorizes the ellipsoids associated with the SPD matrices into distinct shapes, which in the case $p = 3$ are known as spherical, prolate/oblate, and tri-axial (scalene). We believe that the scaling-rotation framework studied in this work and in [25,20] will be highly useful in developing new methodologies of smoothing, registration and regression analysis of diffusion tensors.
A major hurdle in analyzing SPD matrix-valued data is that the data are best viewed as lying in a curved space, making the application of conventional statistical tools inappropriate. To briefly discuss the drawback of using a naive approach (i.e., using the fact that the data lie in the vector space of all $p \times p$ symmetric matrices), take as an example the simplest case of $2 \times 2$ SPD matrices. In order for a $2 \times 2$ symmetric matrix $X$ to be positive-definite, the diagonal elements $x_{11}$ and $x_{22}$ must be positive, and the squared off-diagonal element $x_{12}^2$ must be smaller than their product $x_{11}x_{22}$. This entails the set $\{(x_{11}, x_{22}, x_{12}) : X = (x_{ij}) \in {\rm Sym}^+(2)\}$ being a proper subset of $\mathbb{R}^3$, the set of points inside a convex cone (this is visualized in Fig. 2 in Section 2.8.1). A naive approach to handling data in ${\rm Sym}^+(p)$ is to use the usual metric defined in the ambient space, which gives rise to the Euclidean metric $d_E(X, Y) = \|X - Y\|_F$ (Frobenius norm). There are several disadvantages of using the Euclidean metric: the straight line given by the Euclidean framework has undesirable features such as "swelling" [3] and limited extrapolation. Recently, several different geometric tools have been proposed to handle the data as lying in a curved space with the help of Riemannian geometry and Lie group theory [29,3,35,27,33,36,37] or by borrowing ideas from shape analysis [14,42,41]. Among these, we point out three existing frameworks.
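As a concrete check of the $2 \times 2$ positivity condition and of the "swelling" effect mentioned above, the following minimal sketch may help. The function names and the two example matrices are illustrative assumptions, not taken from the paper; "swelling" is read here as the determinant of a Euclidean interpolant exceeding the determinants of both endpoints.

```python
def is_spd_2x2(x11, x22, x12):
    """Positive-definiteness test for the symmetric matrix [[x11, x12], [x12, x22]]."""
    return x11 > 0 and x22 > 0 and x12 ** 2 < x11 * x22

def det_2x2(x11, x22, x12):
    """Determinant of [[x11, x12], [x12, x22]]."""
    return x11 * x22 - x12 ** 2

def euclidean_midpoint(x, y):
    """Midpoint of the straight line between two (x11, x22, x12) triples."""
    return tuple((a + b) / 2 for a, b in zip(x, y))

# Hypothetical endpoints: diag(10, 1) and diag(1, 10), both with determinant 10.
X = (10.0, 1.0, 0.0)
Y = (1.0, 10.0, 0.0)
mid = euclidean_midpoint(X, Y)

# The midpoint stays inside the cone, but its determinant is larger than
# that of either endpoint.
print(is_spd_2x2(*mid), det_2x2(*X), det_2x2(*Y), det_2x2(*mid))
```

In this example the midpoint is diag(5.5, 5.5), so the interpolant is "fatter" than either endpoint even though both endpoints have the same determinant.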
The log-Euclidean geometric framework [29,3] handles the data in a "log-transformed space", the set of symmetric matrices, ${\rm Sym}(p) = \log({\rm Sym}^+(p))$. This gives rise to the log-Euclidean metric $d_L(X, Y) = \|\log(X) - \log(Y)\|_F$. Effectively, the log-transform provides a "local linearization" of ${\rm Sym}^+(p)$ near the identity matrix; the results it yields are less satisfactory for matrices farther from the identity. A second framework, the "affine-invariant Riemannian framework" [35], provides a local linearization of ${\rm Sym}^+(p)$ in a neighborhood of an arbitrary point $\mu \in {\rm Sym}^+(p)$. This framework makes use of the identification of ${\rm Sym}(p)$ with the tangent space of ${\rm Sym}^+(p)$ at $\mu$ to endow ${\rm Sym}^+(p)$ with a $GL(p, \mathbb{R})$-invariant Riemannian metric. This gives rise to the metric $d_{AI}(X, Y) = \|\log(X^{-1/2} Y X^{-1/2})\|_F$. When $X$ and $Y$ are understood as covariance matrices of random vectors $x$ and $y$, the distance $d_{AI}(X, Y)$ is invariant under "affine" transformations applied to both $X$ and $Y$: for any $p \times p$ invertible matrix $G$, $d_{AI}(X, Y) = d_{AI}(GXG^T, GYG^T)$. From a third standpoint, the Procrustes size-and-shape framework of [14] turns the problem of analyzing SPD matrices into a problem of analyzing reflection size-and-shapes of $(p + 1)$-landmark configurations in $p$ dimensions. Specifically, an SPD matrix $X$ is represented by an equivalence class $\{LR : R \in O(p)\}$, where the lower triangular matrix $L$ satisfies $X = LL^T$. The size-and-shape metric is defined as $d_S(X_1, X_2) = \inf_{R\in O(p)} \|L_1 - L_2 R\|_F$, where $X_i = L_i L_i^T$. The size-and-shape framework can also be applied to symmetric non-negative definite matrices.
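The log-Euclidean and affine-invariant distances above can be evaluated by hand in the $2 \times 2$ case. The sketch below is only an illustration under stated assumptions: it uses a closed-form eigen-decomposition of a symmetric $2 \times 2$ matrix instead of a linear-algebra library, and the helper names and the matrices `X`, `Y`, `G` are hypothetical. It checks numerically that $d_{AI}$ is unchanged when both arguments are transformed by the same invertible $G$.

```python
import math

def sym_fun(X, f):
    """Apply f to the eigenvalues of a symmetric 2x2 matrix X (list of lists)."""
    a, b, c = X[0][0], X[0][1], X[1][1]
    m, r = (a + c) / 2, math.hypot((a - c) / 2, b)
    t = 0.5 * math.atan2(2 * b, a - c)      # angle of the leading eigenvector
    cs, sn = math.cos(t), math.sin(t)
    fp, fm = f(m + r), f(m - r)             # f at the larger / smaller eigenvalue
    off = (fp - fm) * cs * sn
    return [[fp * cs * cs + fm * sn * sn, off],
            [off, fp * sn * sn + fm * cs * cs]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(A):
    return [[A[j][i] for j in range(2)] for i in range(2)]

def frob(A):
    return math.sqrt(sum(v * v for row in A for v in row))

def d_log_euclidean(X, Y):
    LX, LY = sym_fun(X, math.log), sym_fun(Y, math.log)
    return frob([[LX[i][j] - LY[i][j] for j in range(2)] for i in range(2)])

def d_affine_invariant(X, Y):
    R = sym_fun(X, lambda v: v ** -0.5)     # X^{-1/2}
    return frob(sym_fun(matmul(matmul(R, Y), R), math.log))

# Hypothetical SPD matrices and an invertible (non-orthogonal) G.
X = [[2.0, 0.5], [0.5, 1.0]]
Y = [[1.0, -0.2], [-0.2, 3.0]]
G = [[1.0, 2.0], [0.0, 1.0]]
GX = matmul(matmul(G, X), transpose(G))
GY = matmul(matmul(G, Y), transpose(G))

print(d_affine_invariant(X, Y), d_affine_invariant(GX, GY))  # agree up to rounding
print(d_log_euclidean(X, Y), d_log_euclidean(GX, GY))        # printed for comparison
```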
These three different measures of "distance" dictate the method of interpolation of two or more SPD matrices, and lead to different definitions of the population and sample mean. The results of smoothing a tensor field and registration of fiber tracts will also depend on the choice of geometric framework for computation. These frameworks also provide methods for local linearization of data, methods that are useful for e.g. dimension-reduction, regression modeling, approximate multivariate-normal-based inference and large-sample asymptotic distributions. The log-transformation-based geometric frameworks, log-Euclidean and affine-invariant Riemannian frameworks, have been heavily used in statistical modeling and estimations [cf. 37,44], partly due to their simple geometric structures. In previous work [25], we introduced a fourth framework, the "scaling-rotation framework", that is the subject of this paper. In [25,Section 5], we presented evidence of advantages of this framework over the popular log-transformation-based frameworks for tensor interpolations. In Section 1.2 of the present paper, we briefly discuss some other advantages of the scaling-rotation framework in statistical analysis.
Statistical analysis of data on stratified spaces As we shall see in this paper, the scaling-rotation framework leads us to treat Sym + (p) as a stratified space. Many statistical analyses now deal with data that naturally lie in non-Euclidean spaces. In particular, stratified spaces have recently received attention in the study of, e.g., phylogenetic trees [10] and Kendall's 3D shape space [26]. A stratified space is a union of "nice" topological subspaces called strata, with certain restrictions on the way the strata join. A simple example is a spider (half-lines joined by a point) or an open book (half-planes joined by a line) [22]. Another example is the phylogenetic tree space of Billera, Holmes and Vogtmann [10], the union of Euclidean positive orthants, each representing different topology of phylogenetic trees (see also [30]). The space of SPD matrices is naturally stratified by eigenvalue multiplicities. For example if p = 2, there are two strata, one consisting of SPD matrices with distinct eigenvalues and the other consisting of matrices with equal eigenvalues.
For statistical analysis on stratified spaces, it is crucial to devise appropriate notions of distance and shortest path(s) between two points, together with associated computational algorithms. These tasks, in general, are challenging. For example, it is known that computing a graph-edit distance between two geometric tree-like shapes is NP-complete [9]. To overcome these computational burdens, Feragen and her colleagues [15,17] have proposed and studied a quotient Euclidean distance on the space of tree-like shapes, which is a stratified space. Wang and Marron [39] defined a notion of "average tree" as well as a principal-component analysis of trees, and an efficient algorithm [4] was needed to compute the principal components. For the phylogenetic-tree spaces, there has been an ongoing effort to advance efficient computations for distances [34], mean and median [5,28], clustering [12], and estimating principal components [30]. For stratified shape-spaces, Huckemann et al. [23] have also developed a form of principal component analysis.
New analytic tools for these stratified spaces are fast developing. Hotz et al. [22] established a central limit theorem for the open-book space, and showed that the sample Fréchet mean can be "sticky" to the one-dimensional stratum. For a special phylogenetic-tree space, central limit theorems were derived in [7] for each of three cases: when the population Fréchet mean is in the top stratum, a co-dimension-one stratum, or the bottom stratum (a point). See [6] for an extension. Nye has defined diffusion processes for some simple stratified spaces [32] and for the phylogenetic-tree space [31]. See [16] and references therein for other recent developments.
In analogy to the literature on tree spaces, in this paper we develop the concepts of shortest paths and scaling-rotation distance, and provide closed-form formulas, as first steps toward developing eigenstructure-based statistics on ${\rm Sym}^+(p)$. In the future, new concepts and analytical tools such as mean, principal component analysis, regression analysis, and inference procedures may be developed within the scaling-rotation framework. Our work contributes to the development of such tools on both a theoretical and a practical level, providing a solid geometrical foundation for development of eigenstructure-based statistical procedures on the stratified manifold ${\rm Sym}^+(p)$.

Scaling-rotation geometric framework and its statistical importance
Recall that every $X \in {\rm Sym}^+(p)$ can be diagonalized by a rotation matrix: $X = UDU^{-1} = UDU^T$ for some $U \in SO(p)$, $D \in {\rm Diag}^+(p)$. Here, ${\rm Diag}^+(p)$ denotes the set of $p \times p$ diagonal matrices all of whose diagonal entries are positive. We refer to $(U, D)$ as an eigen-decomposition of $X$. Conversely, for all $U \in SO(p)$, $D \in {\rm Diag}^+(p)$, the matrix $UDU^T$ lies in ${\rm Sym}^+(p)$. Thus the space of eigen-decompositions of $p \times p$ SPD matrices is the manifold $M := SO(p) \times {\rm Diag}^+(p)$, and the eigen-composition map is $F : M \to {\rm Sym}^+(p)$, $F(U, D) = UDU^T$. To name the set of eigen-decompositions corresponding to a single SPD matrix, for each $X \in {\rm Sym}^+(p)$, we define the fiber over $X$ to be the set $E_X := F^{-1}(X) = \{(U, D) \in M : UDU^T = X\}$. The relation $\sim$ on $M$ defined by lying in the same fiber, i.e. $(U, D) \sim (V, \Lambda)$ if and only if $F(U, D) = F(V, \Lambda)$, is an equivalence relation. The quotient space $M/\!\sim$ (the set of equivalence classes, endowed with the quotient topology) is canonically identified with ${\rm Sym}^+(p)$. It should be noted that $F$ is not a submersion (cf. [1,24]), and that $M$ is not a fiber bundle over ${\rm Sym}^+(p)$; as we will see explicitly later, the fibers are not all mutually diffeomorphic (or even of the same dimension). The different structures of fibers naturally lead to a stratification of ${\rm Sym}^+(p)$ and $M$. The stratum to which an $X \in {\rm Sym}^+(p)$ belongs depends on the diffeomorphism type of $E_X$. As we shall see in Section 2.6, this stratification based on "fiber types" is equivalent to stratifications by orbit-type and by eigenvalue-multiplicity type.
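To make the non-uniqueness of eigen-decompositions concrete, here is a minimal sketch for $p = 2$. The helper names (`rot`, `eigen_compose`, `close`) and the numerical values are illustrative assumptions; the only fact used from the text is that the eigen-composition sends $(U, D)$ to $UDU^T$. Starting from one pair $(U, D)$ with distinct eigenvalues, three other pairs are exhibited that are mapped to the same SPD matrix.

```python
import math

def rot(theta):
    """Rotation matrix in SO(2) with angle theta."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def eigen_compose(U, d):
    """U D U^T for a 2x2 rotation U and D = diag(d[0], d[1])."""
    return [[sum(U[i][k] * d[k] * U[j][k] for k in range(2)) for j in range(2)]
            for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

theta, d = 0.3, (4.0, 1.0)          # hypothetical eigen-decomposition
X = eigen_compose(rot(theta), d)

same_fiber = [
    (rot(theta + math.pi), d),                   # both eigenvectors change sign
    (rot(theta + math.pi / 2), (d[1], d[0])),    # eigenvalue order swapped
    (rot(theta - math.pi / 2), (d[1], d[0])),
]
print(all(close(eigen_compose(U, dd), X) for U, dd in same_fiber))
```

Together with the original pair this gives four eigen-decompositions of the same matrix.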
The strata of Sym + (p) and M are determined by patterns of eigenvalue multiplicities, and are labeled by partitions of the integer p and the set {1, . . . , p}. We will always assume p > 1, the case p = 1 being uninteresting. For each p, one can obtain the numbers of strata (of Sym + (p) and M ), the dimension of each stratum, and the diffeomorphism type of fibers belonging to each stratum. Several group-actions are involved, and the deepest understanding comes from identifying the relevant groups and the various actions.
In [36], Schwartzman introduced scaling-rotation curves as a way of interpolating between SPD matrices in such a way that eigenvectors and eigenvalues both change at uniform speed. To provide a geometric framework for these curves, Section 2 is devoted to a systematic characterization of fibers and their connection to the stratification of ${\rm Sym}^+(p)$. This allows us to build upon the scaling-rotation framework for SPD matrices proposed in [25], which provided a geometric interpretation for the scaling-rotation curves in [36]. In particular, our characterization of fibers is essential in understanding the differential topology and geometry of this framework.
In the scaling-rotation framework for SPD matrices, the "distance" $d_{SR}(X, Y)$ between any two matrices $X, Y \in {\rm Sym}^+(p)$ is defined to be the distance between the fibers $E_X$ and $E_Y$ in $M$, as determined by a suitable Riemannian structure on $M$. We choose the Riemannian metric on $M = SO(p) \times {\rm Diag}^+(p)$ to be a product metric determined by bi-invariant Riemannian metrics $g_{SO}$, $g_{D^+}$ on the two factors (each of which is a Lie group). The corresponding squared distance function $d_M^2$ is a sum of squares. The minimal-length geodesics connecting two fibers $E_X$ and $E_Y$ give rise to minimal smooth scaling-rotation (MSSR) curves, "efficient" scaling-rotation curves that join $X$ and $Y$.
The scaling-rotation framework has the potential to improve statistical analysis of SPD matrices in situations in which eigenstructure is fundamental. Take, for example, a regression analysis of SPD-matrix-valued data. Using scaling-rotation curves, one can explicitly model the changes of SPD matrices separately in terms of eigenvalues or eigenvectors. In the setting of DTI, this means that diffusion intensities and diffusion directions can be modeled individually or jointly. Thus the changes of diffusion tensor (either along the fibers of tensors, or as a function of time or covariates) may be interpreted more meaningfully than is the case with some alternative frameworks. In particular, we found in [25] that MSSR curves oftentimes exhibit deformations of ellipsoids (representing SPD matrices) that are more natural to the human eye than are the deformations determined by the interpolation methods of [3,35]; the summary measures of diffusion tensors ($3 \times 3$ SPD matrices) such as fractional anisotropy and mean diffusivity evolve in a regular fashion. Moreover, in the scaling-rotation framework, exploratory statistics such as mean, median, and principal components may carry high interpretability, again due to separability of eigenvalues and eigenvectors. The scaling-rotation framework carries over to SPD-matrix-valued data of higher dimensions, such as in dynamic-factor models concerning covariance matrices varying over time [18]. Our computational algorithms for low dimensions are still applicable through dimension reduction; we leave such developments for future work.

Overview of main results
We carefully characterize the eigenvalue-based stratification of ${\rm Sym}^+(p)$ in Section 2. We begin by identifying all the fibers of the eigen-composition map $F$ systematically in terms of partitions of the integer $p$ and the set $\{1, 2, \ldots, p\}$. This culminates in Section 2.4 with a very explicit description of all the fibers. In Sections 2.5-2.7 we show how these ideas lead to stratifications of ${\rm Sym}^+(p)$. In Section 2.8, we explicitly describe all the strata and all the fiber-types for the cases $p = 2$ and $p = 3$.
Understanding the stratification enables us to analyze some non-trivial features of the scaling-rotation framework. For example, $d_{SR}$ is a metric on the top stratum of ${\rm Sym}^+(p)$, but is not a metric on all of ${\rm Sym}^+(p)$. For any $p$, the analysis of $d_{SR}(X, Y)$ and of MSSR curves from $X$ to $Y$ depends on the strata to which $X$ and $Y$ belong, because fibers are topologically and geometrically different for different strata. In Section 3 we review the geometry of the scaling-rotation framework. In Section 3.1, we first introduce our choice of Riemannian metric $g_M$ on $M$, and define scaling-rotation curves in ${\rm Sym}^+(p)$ as images of geodesics in $(M, g_M)$. While the geometry of the "upstairs" Riemannian manifold $(M, g_M)$ is relatively simple, the problem of determining MSSR curves between arbitrary $X, Y$ in the quotient space ${\rm Sym}^+(p)$ is highly nontrivial, as is determining how the set of all such curves depends on $X$ and $Y$. In Section 3.2, we define scaling-rotation distance and MSSR curves, and in Section 3.3 we summarize results from [20] on general tools used in computing these objects. These results are applied to the important $p = 3$ case in Sections 5 and 6.
As we shall see, for any X, Y ∈ Sym + (p), an MSSR curve from X to Y always exists, but need not be unique. This paper also characterizes when such a curve is unique, very explicitly for the cases p = 2 and p = 3. Precisely describing the conditions of uniqueness is vital in any probability statement on random objects on Sym + (p). For example, for any two random objects X and Y drawn from continuous distributions defined on Sym + (p), with probability 1 there exists a unique MSSR curve between them.
Because all strata of Sym + (p) other than the top stratum have positive codimension, any random object X drawn from a continuous distribution defined on Sym + (p) will lie in the top stratum with probability 1. Nonetheless, we cannot assume that a population-mean or parameter μ ∈ Sym + (p) for a continuous distribution lies in the top stratum. Therefore, with the possibility that for μ ∈ Sym + (p), μ does not have distinct eigenvalues, a closed-form expression for d SR (μ, X), and a systematic characterization and analysis of MSSR curves from μ to X, are desirable. In this paper, we focus on the cases p = 2 and p = 3.
In Section 4, for $p = 2$, we provide closed-form expressions for the scaling-rotation distance, provide conditions on $X, Y \in {\rm Sym}^+(2)$ under which MSSR curves between $X$ and $Y$ are unique, and illustrate the cases of uniqueness and non-uniqueness. (When there is not a unique MSSR curve from $X$ to $Y$, there are several possibilities for the number of MSSR curves from $X$ to $Y$.) Sections 5-7 are devoted to the case $p = 3$. In Section 5, we use the quaternionic parametrization of $SO(3)$ to help us characterize scaling-rotation distances, to evaluate closed-form expressions for the distances, and to identify and parameterize MSSR curves between $X, Y \in {\rm Sym}^+(3)$. In this section we also reduce the combinatorial complexity of these problems depending on the strata to which $X$ and $Y$ belong. A catalog of the "nontrivial" unique and non-unique cases of MSSR curves is given in Section 6.1, and a detailed algorithm for computing scaling-rotation distance and the set of MSSR curves is given in Section 6.2. In Section 7, we schematically illustrate the conditions on $X, Y \in {\rm Sym}^+(3)$ in the catalog of Section 6, and provide some pictorial examples of unique and non-unique MSSR curves (including cases in which both $X$ and $Y$ lie in the top stratum; these cases are omitted from the catalog in Section 6).
Some of the material in Sections 2 and 3 summarizes [25], and especially, [20]. However, particularly in Section 2, for some topics we greatly expand upon [20], including giving detailed descriptions and illustrations of fibers and strata.
Frequently used notations and symbols are listed in Table 1.

Partitions of
We will consider several stratified spaces in this paper. The strata we define will be labeled by two different types of partitions. For the sake of efficiency we first review these partitions and fix some related notation.
The sets ${\rm Part}(\{1, \ldots, p\})$ and ${\rm Part}(p)$ are partially ordered by the refinement relation. For $J, K \in {\rm Part}(\{1, \ldots, p\})$, we say that $K$ is a refinement of $J$, or that $K$ refines $J$, if every element of $K$ is a subset of an element of $J$ (remember that an element of $K$ or $J$ is a subset of $\{1, \ldots, p\}$); equivalently, if $K$ can be obtained by partitioning the elements of $J$. We write $J \le K$ if $K$ refines $J$; "$\le$" is then a partial ordering on ${\rm Part}(\{1, \ldots, p\})$.
Note that if $D_1, D_2 \in {\rm Diag}(p)$ each have distinct diagonal entries, then $G_{D_1} = G_{D_2}$. In general, the stabilizer group $G_D$ does not depend on the absolute or relative sizes of the diagonal entries of $D$, but only on which entries are equal to which others. The stabilizer group is closely related to eigenstructure: if $(U, D) \in M$ is an eigen-decomposition of $X \in {\rm Sym}^+(p)$, then for any $R \in G_D$, $(UR, D)$ is also an eigen-decomposition of $X$. But $G_D$ is precisely the group $G_{J_D}$ defined using Definitions 2.2 and 2.3, and the identity components are related similarly.
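The refinement relation on set partitions described above can be sketched in a few lines. The representation of a partition as a list of Python sets and the function names are assumptions made for illustration; `block_sizes` is meant to mirror the passage from a partition of $\{1, \ldots, p\}$ to a partition of the integer $p$ by recording block sizes only.

```python
def refines(K, J):
    """True if K refines J: every block of K is contained in some block of J."""
    return all(any(set(k) <= set(j) for j in J) for k in K)

def block_sizes(J):
    """Block sizes of a set partition, in non-increasing order."""
    return tuple(sorted((len(b) for b in J), reverse=True))

# Two hypothetical partitions of {1, 2, 3}.
J = [{1, 2}, {3}]
K = [{1}, {2}, {3}]

# K refines J (so J <= K in the ordering of the text), but not conversely.
print(refines(K, J), refines(J, K), block_sizes(J), block_sizes(K))
```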

The groups of signed-permutation matrices
In this subsection we define two groups, $\tilde{S}_p$ and $\tilde{S}^+_p$, related to the stabilizer group of $D \in {\rm Diag}(p)$. Both extend the symmetric group $S_p$, and we interpret these groups in terms of matrices.

Notation 2.5.
1. We write I p for the group (Z 2 ) p = Z 2 × Z 2 × · · · × Z 2 (p copies). Each Z 2 is the group of signs with elements ±1. We write typical elements of I p by σ = (σ 1 , . . . , σ p ). We call I p the group of sign-changes, and write 1 for its identity element.
For later use, we record the orders (cardinalities) of the groups $\tilde{S}_p$ and $\tilde{S}^+_p$:
Result 2.7. The orders (cardinalities) of the groups $\tilde{S}_p$ and $\tilde{S}^+_p$ are as follows: $|\tilde{S}_p| = 2^p\, p!$ and $|\tilde{S}^+_p| = 2^{p-1}\, p!$.
Proof: Immediate from (2.8) and the fact that $\tilde{S}^+_p$ has index 2 in $\tilde{S}_p$.
Remark 2.8. For $\sigma \in \mathbb{Z}_2 = \{\pm 1\}$, let $O_\sigma(k) \subset O(k)$ denote the set of orthogonal transformations with determinant $\sigma$. In the setting of (2.4), the connected components of $G_J$ are $O_{\sigma_1}(k_1) \times O_{\sigma_2}(k_2) \times \cdots \times O_{\sigma_r}(k_r)$, subject to the restriction $\prod_i \sigma_i = 1$. Thus for each partition $J$ with $r$ blocks, there is a 1-1 correspondence between the set of connected components of $G_J$ and $I^+_r$ (in which $(\sigma_1, \ldots, \sigma_r)$ lies). It follows that the number of connected components is $2^{r-1}$, a fact that is used in describing the fibers of $F$; see Proposition 2.14.
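The group orders in Result 2.7 can be confirmed by brute-force enumeration for small $p$. The sketch below is illustrative only: it represents an element of the signed-permutation group as a pair (sign vector, permutation) and takes its sign to be the product of the entries of the sign vector times the sign of the permutation, which is the determinant of the corresponding signed-permutation matrix. Function names are hypothetical.

```python
from itertools import permutations, product
from math import factorial

def perm_sign(pi):
    """Sign of a permutation given as a tuple, computed by counting inversions."""
    inv = sum(1 for i in range(len(pi)) for j in range(i + 1, len(pi))
              if pi[i] > pi[j])
    return -1 if inv % 2 else 1

def signed_perm_counts(p):
    """Return (number of signed permutations, number of even ones)."""
    total = even = 0
    for sigma in product((1, -1), repeat=p):
        for pi in permutations(range(p)):
            total += 1
            det = perm_sign(pi)
            for s in sigma:
                det *= s
            even += det == 1
    return total, even

for p in (2, 3, 4):
    expected = (2 ** p * factorial(p), 2 ** (p - 1) * factorial(p))
    print(p, signed_perm_counts(p), expected)
```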
The group $\tilde{S}_p$ has a natural representation on $\mathbb{R}^p$, the map ${\rm mat} : \tilde{S}_p \to O(p)$ defined by ${\rm mat}(\sigma, \pi) = I_\sigma P_\pi$, where $I_\sigma = {\rm Diag}(\sigma_1, \ldots, \sigma_p)$ and $P_\pi$ is the matrix of the linear map "$\pi\,\cdot$" $: \mathbb{R}^p \to \mathbb{R}^p$ in (2.2). The entries of the permutation matrix $P_\pi$ are $(P_\pi)_{ij} = \delta_{i,\pi(j)}$. (We will see shortly that mat is a homomorphism, justifying the term "representation on $\mathbb{R}^p$".) It is easily seen that mat is injective.
Definition 2.9. We call a $p \times p$ matrix $P$ a signed-permutation matrix if for some (necessarily unique) $\pi \in S_p$ the entries of $P$ satisfy $P_{ij} = \pm\delta_{i,\pi(j)}$. We call such $P$ even if $\det(P) = 1$ and odd if $\det(P) = -1$. (Note that evenness of $P$ is not the same as evenness of the associated permutation $\pi$.) The set of signed $p \times p$ permutation matrices is exactly ${\rm mat}(\tilde{S}_p) \subset O(p)$; the subset of even elements is exactly ${\rm mat}(\tilde{S}^+_p) \subset SO(p)$.
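It is easy to see that ${\rm mat}(\tilde{S}_p)$ is actually a subgroup of $O(p)$. (This also follows from the fact, shown below, that mat is a homomorphism $\tilde{S}_p \to O(p)$.) Furthermore, at the level of matrices, the sign-homomorphism $\tilde{S}_p \to \mathbb{Z}_2$ is simply determinant: ${\rm sgn}(\sigma, \pi) = \det({\rm mat}(\sigma, \pi)) = \det(I_\sigma P_\pi)$. (2.11) It follows that ${\rm mat}(\tilde{S}^+_p)$ is a subgroup of $SO(p)$. Identifying ${\rm Diag}^+(p)$ with $(\mathbb{R}^+)^p \subset \mathbb{R}^p$, the action (2.2) yields an action of $S_p$ on ${\rm Diag}^+(p)$, given by $\pi \cdot {\rm Diag}(d_1, \ldots, d_p) = {\rm Diag}(d_{\pi^{-1}(1)}, \ldots, d_{\pi^{-1}(p)})$. (2.12) One may easily check that for any $\pi \in S_p$, $D \in {\rm Diag}(p)$, we have $\pi \cdot D = P_\pi D (P_\pi)^T = P_\pi D (P_\pi)^{-1}$, (2.13) and that the restrictions of the map mat to the subgroups $I_p \times \{{\rm id.}\} \cong I_p$ and $\{1\} \times S_p \cong S_p$ are homomorphisms. It follows easily that the map ${\rm mat} : \tilde{S}_p \to O(p) \subset GL(p, \mathbb{R})$ is a homomorphism (hence a representation on $\mathbb{R}^p$, as asserted earlier): ${\rm mat}(g_1 g_2) = {\rm mat}(g_1)\,{\rm mat}(g_2)$ for all $g_1, g_2 \in \tilde{S}_p$. Since mat is an injective homomorphism, it is an isomorphism onto its image, the subgroup ${\rm mat}(\tilde{S}_p) \subset O(p)$.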
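The identity (2.13) and the multiplicativity of $\pi \mapsto P_\pi$ can be checked directly from the entry formula $(P_\pi)_{ij} = \delta_{i,\pi(j)}$. The following sketch uses 0-indexed permutations stored as tuples and hypothetical helper names; the convention that the $i$-th diagonal entry of $\pi \cdot D$ is $d_{\pi^{-1}(i)}$ is the one forced by (2.13) and the entry formula, and is stated here as an assumption about the action (2.2).

```python
def perm_matrix(pi):
    """P_pi with entries (P_pi)_{ij} = delta_{i, pi(j)}; pi is a 0-indexed tuple."""
    p = len(pi)
    return [[1 if i == pi[j] else 0 for j in range(p)] for i in range(p)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def act(pi, d):
    """Entry i of the result is d[pi^{-1}(i)], i.e. result[pi[j]] = d[j]."""
    out = [None] * len(d)
    for j, i in enumerate(pi):
        out[i] = d[j]
    return out

pi, rho = (1, 2, 0), (0, 2, 1)          # two hypothetical permutations of {0, 1, 2}
d = [5.0, 3.0, 2.0]
D = [[d[i] if i == j else 0.0 for j in range(3)] for i in range(3)]

P = perm_matrix(pi)
conj = matmul(matmul(P, D), transpose(P))           # P_pi D P_pi^T
print([conj[i][i] for i in range(3)], act(pi, d))   # the two lists coincide

compose = tuple(pi[rho[j]] for j in range(3))       # the composition pi after rho
print(matmul(perm_matrix(pi), perm_matrix(rho)) == perm_matrix(compose))
```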
As in [25], we call the elements of mat(I p ) sign-change matrices, even or odd according to their determinants.
For any subgroup $H$ of $\tilde{S}_p$, mat restricts to an isomorphism $H \to {\rm mat}(H)$. Therefore to simplify notation, henceforth in most expressions we will not write the map mat explicitly; rather, we will use (for example) the notation $\tilde{S}_p$ for both $\tilde{S}_p$ and ${\rm mat}(\tilde{S}_p)$. It should always be clear from context whether our notation refers to an element (or subgroup) of $\tilde{S}_p$, or the corresponding matrix (or finite group of matrices) under the map mat. However, to avoid some odd-looking formulas we will use the following notation:
Notation 2.11. We write typical elements of the (abstract) signed-permutation group $\tilde{S}_p$ as $g$, and define the matrix $P_g = {\rm mat}(g) \in \tilde{S}_p$. Thus $P_{(\sigma,\pi)} = I_\sigma P_\pi$. The image of $g$ under the projection ${\rm Proj}_2 : \tilde{S}_p \to S_p$ will be denoted $\pi_g$.
We remark that if $\tilde{S}_p$ is interpreted as ${\rm mat}(\tilde{S}_p)$, ${\rm Proj}_2$ is the map $I_\sigma P_\pi \mapsto \pi$ (well-defined, since every element of ${\rm mat}(\tilde{S}_p)$ can be written uniquely in the form $I_\sigma P_\pi$).
Note that the action of $S_p$ on ${\rm Diag}^+(p)$ lifts to an action of $\tilde{S}_p$ on ${\rm Diag}^+(p)$: $g \cdot D := \pi_g \cdot D$. (2.14) In terms of matrices, this is just the conjugation action: $g \cdot D = P_g D P_g^{-1} = P_{\pi_g} D P_{\pi_g}^{-1}$, the latter equality holding since sign-change matrices are diagonal (and therefore commute with diagonal matrices).

Structure of the fibers
We are now ready to provide a systematic description of the fibers of $F$. We start with a result from [20]: Then
The fiber $E_X$ generally has more than one connected component. The "shape" of the fiber $E_X$ depends on the partition $[J_D]$.
3. For any Lie group $G$ and closed subgroup $K$, we write $G/K$ and $K\backslash G$ for the spaces of left- and right-cosets, respectively, of $K$ in $G$. (In particular, we use this notation when $G$ is a finite group.)
The group $\tilde{S}^+_p$ acts on $M = SO(p) \times {\rm Diag}^+(p)$ via setting $g \cdot (U, D) = (U P_g^{-1}, \pi_g \cdot D)$. This action preserves every fiber of $F$. Thus for each $X \in {\rm Sym}^+(p)$ there is an induced action of $\tilde{S}^+_p$ on ${\rm Comp}(E_X)$, given by $g \cdot [(U, D)] = [g \cdot (U, D)]$. (2.18) Each $g \in \tilde{S}^+_p$, acting as above, permutes the connected components of $E_X$; the subgroup $\Gamma^0_{J_D}$ is the stabilizer of $[(U, D)] \in {\rm Comp}(E_X)$ under this action.
(i) Then every $(U, D) \in E_X$ determines a bijection between ${\rm Comp}(E_X)$ and the set $\tilde{S}^+_p/\Gamma^0_{J_D}$.
The proposition above is proved in [20]. An important special case of Proposition 2.14 is the case in which all eigenvalues of $X$ are distinct. In this case, the action of $\tilde{S}^+_p$ on $E_X$ is free as well as transitive.
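For the distinct-eigenvalue case with $p = 3$, the fiber can be enumerated explicitly. The sketch below rests on an assumption that should be read as such: it takes the action of an even signed permutation $g$ on an eigen-decomposition to be $(U, D) \mapsto (U P_g^{-1}, P_g D P_g^{-1})$. The code itself verifies that this map preserves $UDU^T$; the rotation `U`, the diagonal matrix `D`, and all helper names are hypothetical choices.

```python
from itertools import permutations, product
import math

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def even_signed_perm_matrices():
    """All 3x3 matrices I_sigma P_pi with determinant +1."""
    mats = []
    for sigma in product((1, -1), repeat=3):
        for pi in permutations(range(3)):
            P = [[sigma[i] if i == pi[j] else 0 for j in range(3)] for i in range(3)]
            if det3(P) == 1:
                mats.append(P)
    return mats

def compose(U, D):
    return matmul(matmul(U, D), transpose(U))

a, b = 0.4, 0.7   # hypothetical rotation angles
Rz = [[math.cos(a), -math.sin(a), 0.0], [math.sin(a), math.cos(a), 0.0], [0.0, 0.0, 1.0]]
Rx = [[1.0, 0.0, 0.0], [0.0, math.cos(b), -math.sin(b)], [0.0, math.sin(b), math.cos(b)]]
U = matmul(Rz, Rx)
D = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]   # distinct eigenvalues
X = compose(U, D)

fiber = set()
preserved = True
for P in even_signed_perm_matrices():
    V = matmul(U, transpose(P))                 # U P_g^{-1} (P_g is orthogonal)
    L = matmul(matmul(P, D), transpose(P))      # P_g D P_g^{-1}
    Y = compose(V, L)
    preserved &= all(abs(Y[i][j] - X[i][j]) < 1e-12
                     for i in range(3) for j in range(3))
    fiber.add(tuple(round(v, 9) for M in (V, L) for row in M for v in row))

print(preserved, len(fiber))
```

The number of distinct pairs found equals the number of even signed permutations, which is what a free and transitive action on the fiber would give.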
Examples of the fibers for p = 2, 3 can be found in Section 2.8.

Orbit-type stratification of Sym + (p)
The compact Lie group $G = SO(p)$ acts from the left on the manifold ${\rm Sym}^+(p)$ via $U \cdot X = UXU^T$. (2.19) For each $X \in {\rm Sym}^+(p)$, the orbit $G \cdot X$ of $X$ is diffeomorphic to $G/G_X$, where $G_X \subset G$ is the stabilizer subgroup of $X$: $G_X = \{U \in G : UXU^T = X\}$. For any $Y$ in the orbit of $X$, the stabilizer $G_Y$ is conjugate to $G_X$. More generally, whether or not $X, Y \in {\rm Sym}^+(p)$ lie in the same orbit, we say that $X$ and $Y$ have the same orbit type if the stabilizers $G_X$ and $G_Y$ are conjugate subgroups of $G$ (an equivalence relation we will write as $G_X \sim_c G_Y$). If $X$ and $Y$ have the same orbit type then the orbits $G \cdot X$, $G \cdot Y$ are diffeomorphic. Define the orbit-type stratum of ${\rm Sym}^+(p)$ associated with a given orbit-type to be the union of all orbits of that type; we refer to the collection $\mathcal{S}$ of these strata as the orbit-type stratification of ${\rm Sym}^+(p)$. The pair $({\rm Sym}^+(p), \mathcal{S})$ is an example of a Whitney stratified manifold, one of several notions of "stratified space" in the literature. In all such notions, a stratification of a topological space $Z$ is a collection $\mathcal{S}$ of pairwise disjoint subsets of $Z$, called strata, whose union is $Z$ and which are required to satisfy certain conditions that depend on which notion of "stratified space" is being used. The "nicest" type of stratification of a manifold is a Whitney stratification [19, Section 1.1]. It is known that, for any compact Lie group acting on a smooth manifold, the orbit-type stratification is a Whitney stratification ([11, p. 21]).
However, not all the criteria for a Whitney stratification are relevant to this paper. Slightly modifying the terminology of [19], the notion of greatest relevance here is that of a $\mathcal{P}$-decomposed space, where $(\mathcal{P}, \le)$ is a partially ordered set. A $\mathcal{P}$-decomposition of a closed subset $Z$ of a manifold $N$ is a locally finite collection $\mathcal{S} = \{S_i\}_{i \in \mathcal{P}}$ of pairwise disjoint submanifolds of $N$ whose union is $Z$ and for which $S_i \cap \overline{S_j} \neq \emptyset \iff S_i \subset \overline{S_j} \iff i \le j$, where "overbar" denotes closure. For the purposes of this paper, we allow "stratified space" to mean simply a $\mathcal{P}$-decomposition $\mathcal{S}$ of a closed subset $Z$ of some manifold, where $\mathcal{P}$ is any partially ordered set; the submanifolds $S_i$ are called the strata of this stratification. We will make pervasive use of the "$\mathcal{P}$-decomposition" notion. Our $\mathcal{P}$ will always be either ${\rm Part}(p)$ or ${\rm Part}(\{1, \ldots, p\})$, and we will refer to it as a label set.

Three equivalent stratifications of Sym + (p)
There are three "types" that we will associate to each $X \in {\rm Sym}^+(p)$. The first, already defined, is the orbit type of $X$ under the action (2.19). The other two types, fiber type and eigenvalue-multiplicity type, will be defined below. For any of these types, "$X$ has the same type as $Y$" is an equivalence relation. We will see that all three relations are identical. Thus the orbit-type stratification may be thought of just as well as a fiber-type stratification or as an eigenvalue-multiplicity-type stratification.
In this case, the fibers E X , E Y are diffeomorphic (cf. Proposition 2.14).
2. For X ∈ Sym + (p), we define the eigenvalue-multiplicity type of X, which we will denote ET(X), to be the multi-set of multiplicities of eigenvalues of X (the collection of eigenvalues of X, enumerated with their multiplicities), an element of Part(p).
For example, if $p = 3$, then for any $R_1, R_2, R_3 \in SO(3)$, the matrices in ${\rm Sym}^+(3)$ all have the same eigenvalue-multiplicity type, the partition $2 + 1$ of 3. The relative sizes of the eigenvalues of $X \in {\rm Sym}^+(p)$ have no bearing on the eigenvalue-multiplicity type of $X$; all that matters are the eigenvalue multiplicities. As we shall see later, in our stratification of ${\rm Diag}^+(p)$ the three diagonal matrices in (2.21) represent two different strata. The three "types" we have defined are conceptually different: For $X \in {\rm Sym}^+(p)$ and $D \in {\rm Diag}^+(p)$ for which $(U, D) \in E_X$ for some $U \in SO(p)$, (i) the concept of orbit-type is based on (though not necessarily equivalent to) the diffeomorphism type of the orbit $G \cdot X$, a submanifold of ${\rm Sym}^+(p)$ diffeomorphic to $SO(p)/G_D$; (ii) the concept of fiber-type of $X$ is based on the diffeomorphism type of the fiber $E_X$, a possibly non-connected submanifold of $SO(p) \times {\rm Diag}^+(p)$ diffeomorphic to finitely many copies of $G_D$ (the number of copies being the multinomial coefficient $\frac{p!}{k_1!\,k_2! \cdots k_r!}$ appearing in Proposition 2.14); and (iii) the concept of eigenvalue-multiplicity type is based directly on discrete information: the partition $[J_D]$ of $p$ determined by the eigenvalues of $X$.
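The discrete information in (iii) is easy to compute from a list of eigenvalues. The sketch below is illustrative: the tolerance `tol` used to decide when two floating-point eigenvalues count as equal is a hypothetical choice, as are the function names. The second function evaluates the multinomial coefficient $p!/(k_1! \cdots k_r!)$ mentioned in (ii).

```python
from math import factorial

def multiplicity_type(eigs, tol=1e-9):
    """Multiplicities of (numerically) equal eigenvalues, in non-increasing order."""
    counts, last = [], None
    for lam in sorted(eigs):
        if last is not None and abs(lam - last) <= tol:
            counts[-1] += 1          # same eigenvalue as the previous one
        else:
            counts.append(1)         # a new distinct eigenvalue
        last = lam
    return tuple(sorted(counts, reverse=True))

def copies_of_stabilizer(ks):
    """Multinomial coefficient p! / (k_1! ... k_r!) with p = k_1 + ... + k_r."""
    n = factorial(sum(ks))
    for k in ks:
        n //= factorial(k)
    return n

for eigs in ([2.0, 1.0, 2.0], [1.0, 2.0, 3.0], [5.0, 5.0, 5.0]):
    ks = multiplicity_type(eigs)
    print(eigs, ks, copies_of_stabilizer(ks))
```

Note that `[2.0, 1.0, 2.0]` and, say, `[1.0, 1.0, 7.0]` receive the same type $(2, 1)$, in line with the remark that relative sizes of the eigenvalues play no role.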
Even though the three kinds of "types" are conceptually different, they are equivalent.
same orbit-type = same fiber-type = same eigenvalue-multiplicity type.
Because of Proposition 2.16, we are free to view the orbit-type stratification as an eigenvalue-multiplicity-type stratification, and to label strata accordingly. We will do this in Section 2.7.
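As a small illustration of the eigenvalue-multiplicity type, the following Python sketch computes the partition of p from a list of eigenvalues, together with a multinomial coefficient of the form p!/(k_1! ... k_r!). The function names, the tolerance `tol` used to decide when two eigenvalues coincide, and the reading of the multinomial coefficient are assumptions made for this example; they are not taken from the text.

```python
from math import factorial

def multiplicity_type(eigenvalues, tol=1e-9):
    """Return the partition of p given by the eigenvalue multiplicities.

    Eigenvalues within `tol` of a group's first value are treated as equal
    (a hypothetical numerical convention, not part of the text).
    """
    groups = []  # each entry is [representative value, multiplicity]
    for lam in sorted(eigenvalues):
        if groups and abs(lam - groups[-1][0]) <= tol:
            groups[-1][1] += 1
        else:
            groups.append([lam, 1])
    # A partition is unordered; store it with the largest part first.
    return tuple(sorted((count for _, count in groups), reverse=True))

def multinomial(parts):
    """p! / (k_1! ... k_r!) for a partition (k_1, ..., k_r) of p."""
    result = factorial(sum(parts))
    for k in parts:
        result //= factorial(k)
    return result
```

Note that `multiplicity_type` returns the same partition whether the repeated eigenvalue is the larger or the smaller one, which reflects the remark that relative sizes of eigenvalues have no bearing on the type.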

Four stratified spaces
As it is clear that the stratifications of Sym + (p) and M = SO(p)×Diag + (p) only depend on the eigenvalues, we also define stratifications of the spaces of eigenvalues: Diag + (p) and Diag + (p)/S p . Typical elements of Diag + (p)/S p will be denoted by [D] = {π · D : π ∈ S p }. Strata of Sym + (p) (thus Diag + (p)/S p ) will be labeled by Part(p); strata of M and Diag + (p) will be labeled by Part({1, . . . , p}). The commutative diagram in Figure 1, with notation as defined in Definition 2.17, indicates the relationships among these spaces and label-sets.
(v) quo 1 and quo 2 are the quotient maps Diag The diagram suggests a natural definition of strata of the four spaces.

Example 2.19
The matrices X 1 , X 2 , X 3 in (2.21) all lie in the same stratum S [J] of Sym + (p), the one labeled by the partition [J] = 2 + 1 of 3. The diagonal matrices D 1 , D 2 , D 3 appearing in the formulas in (2.21) for X 1 , X 2 , X 3 , respectively, lie in two different strata of M : the first two lie in D J1 while the third lies Note that strata need not be connected. For example, the stratum S 2+1 in Sym + (3) has two connected components, one in which the double-eigenvalue is the larger of the two distinct eigenvalues, and one in which it is the smaller. The matrix X 1 in (2.21) lies in the first of these components, while X 2 and X 3 lie in the second. The diagonal matrices is therefore a submanifold of M . The quotient Diag + (p)/S p is simply the p-fold symmetric product of R + , which can be identified homeomorphically with This homeomorphism identifies the stratum D [J] of Diag + (p)/S p with a submanifold of (R + ) p (diffeomorphic to a connected component of D J ). Thus our collections of strata of Diag + (p), Diag + (p)/S p , and M meet our definition of "stratified space". As noted earlier, our stratification of Sym + (p) is an orbit-type stratification, hence automatically a Whitney stratification. Thus, each of Diag + (p), M, Diag + (p)/S p , and Sym + (p), equipped with the strata defined above, is a stratified space. Note also that for any J ∈ Part({1, . . . , p}), If J, K ∈ Part({1, . . . , p}) and K is a strict refinement of J (i.e. K refines J but K ≠ J; equivalently, J < K), it is easy to see that every element of the stratum S J in M can be obtained as a limit of a sequence lying in S K , but that no element of S K can be obtained as a limit of a sequence lying in S J (in the limit of a sequence of matrices, distinct eigenvalues can coalesce but equal eigenvalues cannot separate). Thus where S denotes the closure of a stratum S.
A similar comment applies to strata D J , D K in Diag + (p); to strata S [J] , S [K] in Sym + (p); and to strata D [J] , D [K] in Diag + (p)/S p . For any of the stratified spaces defined in Definition 2.18, the set of strata has a natural partial ordering, given by (2.25) In each of the stratified spaces above, there is a highest stratum, corresponding to J top and [J top ], and a lowest stratum, labeled by J bot and [J bot ]. Note that for , but the converse is false for p > 2. In view of (2.24)-(2.25), a similar comment applies to M and Sym + (p): , but the converse is false for p > 2. As a counterexample,  (1) can easily be worked out; we will simply state the answers. If J ∈ Part({1, . . . , p}) and

Examples
Using Proposition 2.14, Definition 2.18 and Remarks 2.21 and 2.20, for any given p we can, in principle, describe all the fibers of F and all the strata of M and Sym + (p) very explicitly. As p grows, the number of strata and the number of diffeomorphism-types of fibers grows rapidly, so below we do this exercise only for the cases p = 2 and p = 3. (a) The two-dimensional stratum D Jtop consists of two connected components, In the top panels of Fig. 2, S J bot (respectively, D J bot ) is schematically depicted as the green plane (resp., line), which separates the two connected components of S Jtop (resp., D Jtop ). Stratification of Sym + (2) and Diag + (2)/S 2 . There are two strata of Sym + (2) (and of Diag + (2)/S 2 ), corresponding to the two partitions of 2: [J top ] = 1 + 1, and [J bot ] = 2. It is easily checked that for any J ∈ Part({1, . . . , p}), containing exactly one point of each orbit, and such that Fibers of X ∈ Sym + (2). The fibers are characterized by Corollary 2.14.
circle. An example of this circle is depicted schematically as the red line segment in the top left panel of Fig. 2.  Table 2. The features of the stratum D J we discuss below apply also to the corre-

Example: Sym
In the top panel of Fig. 3, D J bot corresponds to the green line.
The superscripts "pro" and "ob" stand for "prolate" and "oblate", respectively; see below.) The closures of these two connected components intersect in D J bot . In Fig. 3, D J1 corresponds to one of the three shaded planes except the green line. The stratum S J1 also consists of two connected components:   Fibers of X ∈ Sym + (3). In Figure 3, examples of the three types of fibers are provided. To help visualize, we show the projected fibers,

Smooth scaling-rotation curves
The space of eigen-decompositions M = SO(p) × Diag + (p) is a Riemannian manifold. We define the Riemannian metric g M as a product Riemannian metric determined by metrics on SO(p) and Diag + as follows.
The Lie algebra so(p) = T I (SO(p)) is the space of p × p antisymmetric matrices. In this paper, for U ∈ SO(p) we identify the tangent space T U (SO(p)) with the right-translate of so(p) by U : (3.1) The space Diag + (p) is also a Lie group, but since it is an open subset of a vector space, namely Diag(p), we make the identification Using (3.1), the standard bi-invariant Riemannian metric g SO on SO(p) is defined by for D ∈ Diag + (p) and L 1 , L 2 ∈ T D (Diag + (p)). Up to a constant factor, g D + is the only bi-invariant Riemannian metric on Diag + (p) that is also invariant under the action of the symmetric group S p . The product Riemannian metric is determined by the metrics above. Specifically, for (U, where k > 0 is an arbitrary parameter that can be adjusted as desired for applications. Since the metrics g SO and g D + are bi-invariant, the geodesics in M can be obtained as either left-translates or right-translates of geodesics through the identity (I, I). In this paper, the right-translates are more convenient, which is why we have chosen the identification (3.1) of the tangent spaces of SO(p).

Scaling-rotation distance and MSSR curves
Recall that in any group G, an element g is called an involution if g 2 is the identity element e but g ≠ e. Thus R ∈ SO(p) is an involution if R 2 = I ≠ R. That is, involutions in SO(p) are reflections. The cut-locus of the identity in SO(p) is precisely the set of all involutions. For every non-involution R ∈ SO(p), there is a unique A ∈ so(p) of smallest norm such that exp(A) = R; we define log(R) = A. If R is an involution, there is more than one smallest-norm A ∈ so(p) such that exp(A) = R, and we allow log(R) to denote the set of all such A's. However, all elements A in this set have the same norm, which we write as ‖log(R)‖, where ‖·‖ denotes the Frobenius norm on matrices: $\|A\| = \sqrt{{\rm tr}(A^T A)}$. Thus ‖log(R)‖ is a well-defined real number for all R ∈ SO(p), even when log(R) is not a unique element of so(p). The geodesic-distance function d M on M is then

Definition 3.3 ([25, Definition 3.10])
For X, Y ∈ Sym + (p), the scaling-rotation distance d SR (X, Y ) between X and Y is defined by
$$d_{SR}(X,Y) := \inf_{(U,D)\in E_X,\ (V,\Lambda)\in E_Y} d_M\big((U,D),(V,\Lambda)\big). \qquad (3.8)$$
In [25], d SR (X, Y ) is interpreted as "the minimum amount of rotation and scaling needed to deform X into Y ." In the following, we provide an equivalent definition of d SR (X, Y ) as the minimum length of SSR curves from X to Y . However, we have not defined a Riemannian metric on Sym + (p), so there is no "automatic" meaning attached to the phrase length of a smooth curve in Sym + (p).

Definition 3.4.
Let γ be a piecewise-smooth curve in M and let ℓ(γ) denote the length of γ.
We say that the MSSR curve χ = F ∘ γ corresponds to the minimal pair formed by the endpoints of γ. (iv) The set of (not necessarily unique) MSSR curves from X to Y is denoted by M(X, Y ). (v) For an SSR curve χ in Sym + (p) we define the length of χ to be ℓ(χ) := inf{ℓ(γ) : γ is a geodesic in M and F ∘ γ = χ}.
Definition 3.4(i) also suggests the obvious fact that an (E X , E Y )-minimal geodesic is a minimal geodesic in the usual sense: it is a curve of shortest length among all piecewise-smooth curves with the same endpoints. From the general theory of geodesics (see e.g. [24]), any such curve γ is actually smooth, and, when parametrized at constant speed, satisfies the geodesic equation $\nabla_{\gamma'}\gamma' \equiv 0$. Thus (3.8) is equivalent to (3.9) Now with Definition 3.4(v), (3.9) becomes (3.10)
Remark 3.5. As noted in [25], the "scaling-rotation distance" d SR is not a metric on Sym + (p); it does not satisfy the triangle inequality. (However, its restriction to the top stratum of Sym + (p) is a metric; see [25, Theorem 3.12].)
Computing d SR (X, Y ) amounts to optimizing over the fibers of X and Y . Choosing (U, D) ∈ E X , (V, Λ) ∈ E Y , it first appears from (2.16) that this requires optimizing over However, there is quite a bit of redundancy; clearly it suffices to do a continuous optimization over each pair of connected components (an element of Comp(E X ) × Comp(E Y )) and then a combinatorial optimization over the finite set Comp(E X ) × Comp(E Y ). When both X and Y are in the top stratum, the optimization (3.8) is purely combinatorial. More generally, Proposition 2.14(i) implies that |Comp(
Proposition 3.6. Let X, Y ∈ Sym + (p) and let (U, D) ∈ E X , (V, Λ) ∈ E Y . Let Z be any set of representatives of Γ 0 J D \S + p /Γ 0 JΛ . Then the scaling-rotation distance from X to Y is given by Every minimal smooth scaling-rotation curve from X to Y corresponds to some minimal pair whose first element lies in the connected component [(U, D)] of E X .
To illustrate how the number (3.12) of required continuous optimizations is reduced in the computation of d SR (X, Y ), take for example p = 3 and X, Y ∈ S [J mid ] , the middle stratum of Sym + (3), defined in Section 2.8.2. In this case we have |Comp(E X )| = |Comp(E Y )| = 6, so there are 6 × 6 = 36 pairs of connected components, but, as we shall see in Section 5.2.1, the set Z in (3.11) has cardinality 3. Thus Proposition 3.6 reduces the number of continuous optimizations needed to 3.

Existence and uniqueness of MSSR curves
From Proposition 2.14, every fiber of F is compact, so the infimum in (3.8) is always achieved. Hence for all X, Y ∈ Sym + (p), there always exists an (E X , E Y )-minimal geodesic, a minimal pair in E X × E Y , and an MSSR curve from X to Y .
Such an MSSR curve may not be unique. In [25], a sufficient condition for uniqueness is given, and an example for p = 2 is provided. With statistical analysis in mind, it is natural to ask: For which X and Y is there a unique MSSR curve from X to Y ? We address this question more generally by characterizing M(X, Y ) for all X, Y ∈ Sym + (p). In Sections 4, 6 and 7, we do this explicitly for low-dimensional cases: p = 2 and 3. As preparation for this work, we briefly discuss here how non-uniqueness can occur and introduce a tool used to characterize M(X, Y ) in low dimensions. A general treatment of this topic can be found in [20].
Different (E X , E Y )-minimal geodesics may or may not project to the same MSSR curve. For given X, Y , for uniqueness of an MSSR curve from X to Y to fail, there must be distinct (E X , E Y )-minimal geodesics γ i : [0, 1] → M , whose endpoints are minimal pairs ( There are two possible ways in which this failure can occur: There exist such γ i whose endpoints are distinct minimal pairs ((U i , D i ), (V i , Λ i )) ("Type I non-uniqueness"), or the same minimal pair ((U, D), (V, Λ)) ("Type II non-uniqueness").
Since for any D, Λ ∈ Diag + (p) the minimal geodesic from D to Λ is unique, Type II non-uniqueness with minimal pair ((U, D), (V, Λ)) is equivalent to the existence of two or more minimal geodesics from U to V , which is equivalent to U −1 V being an involution. For p = 2, 3 it is shown in [25] that Type II non-uniqueness never occurs. This is because, for p ≤ 3, for any pair U, V ∈ SO(p) such that U −1 V is an involution, there exists a σ ∈ I + p such that d SO (U I σ , V ) < d SO (U, V ). In [20], it is further shown that for small enough values of p, Type II non-uniqueness never occurs; for large enough p, it always occurs. In particular, for p ≤ 4, for all X, Y ∈ Sym + (p) for which |M(X, Y )| > 1, the non-uniqueness is purely of Type I.
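For p = 3, the involution condition in the discussion of Type II non-uniqueness can be checked numerically. The sketch below relies on two standard facts about SO(3): the rotation angle θ ∈ [0, π] of R satisfies tr(R) = 1 + 2 cos θ, and a smallest-norm logarithm of R has Frobenius norm √2·θ. The function names and the tolerance `tol` are assumptions of this example.

```python
import math

def rotation_angle(R):
    """Rotation angle in [0, pi] of a 3x3 rotation matrix (nested lists)."""
    trace = R[0][0] + R[1][1] + R[2][2]
    # Clamp to [-1, 1] to guard against round-off before calling acos.
    c = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.acos(c)

def log_norm(R):
    """Frobenius norm of a smallest-norm logarithm of R in so(3)."""
    return math.sqrt(2.0) * rotation_angle(R)

def is_involution(R, tol=1e-9):
    """True if R^2 = I but R != I, i.e. the rotation angle equals pi."""
    return abs(rotation_angle(R) - math.pi) <= tol
```

To test a pair U, V one would apply `is_involution` to the product U −1 V. Because acos is ill-conditioned near −1, a looser tolerance may be needed for matrices that carry round-off error.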
Our strategy for understanding M(X, Y ) for p = 2, 3 and all X, Y ∈ Sym + (p) is to list all MSSR curves from X to Y . Proposition 3.6 assures us that, for any (U, D) ∈ E X , every MSSR curve from X to Y corresponds to some minimal pair whose first element lies in the connected component [ (U, D)] of E X . We need a way to tell whether MSSR curves corresponding to two minimal pairs with first point in [(U, D)] are the same. The following proposition, a special case of Proposition 4.19 of [20], provides such a tool. We apply this result to the p = 3 case in Section 6.

Case (i) (a > b, c > d).
Let (V i , Λ i ), i = 1, . . . , 4 be the four eigendecompositions of Y . Specifically, these four eigen-decompositions are e c )). 2 2 , and equality holds if and only if θ = π/2. On the other hand, if π/2 < θ < π, 2 4 , and equality holds if and only if θ = 0. Furthermore, we have These inequalities will be used later in the characterization of all MSSR curves for Case (i). Case (i) has seven subcases: three in which there is a unique MSSR curve, three in which there are non-unique MSSR curves with multiplicity 2, and one in which there are non-unique MSSR curves with multiplicity 3. We denote these subcases "d i ", "d i = d j ", and "d 1 = d 2 = d 3 " respectively. In the subcase denoted by "d i ", the MSSR curve from X to Y is unique, has length d SR (X, Y ) = d i , and corresponds to the minimal pair ((U, D), (V i , Λ i )) using (4.2). In the subcase denoted "d i = d j ", there are exactly two MSSR curves from X to Y , of length d SR (X, Y ) = d i = d j , and corresponding to the minimal pairs ((U, D), (V i , Λ i )) and ((U, D), (V j , Λ j )). The notation for the last subcase with three MSSR curves is similarly understood.
The seven subcases are distinguished by the relationship of the quantity m :=
For each given X and Y , if one takes k small enough that m > π/4, then MSSR curves from X to Y are always of type "d 1 " or "d 2 ". In other words, if k is small enough (for fixed X, Y ), the MSSR curve(s) are completely contained in the distinct-eigenvalue subset. 2 . If in addition c = d, then $\chi(t) = e^{(1-t)a+tc} I$.

Scaling-rotation distances on Sym + (3)
For the case p = 3, we will obtain explicit formulas for all MSSR curves and scaling-rotation distances by using the quaternionic parametrization of SO(3). In Section 6, we use this to give explicit descriptions of the set M(X, Y ) of MSSR curves between two points X, Y ∈ Sym + (3) in all "nontrivial" cases (as defined later in this section).

Relation of quaternions to SO(3)
The space H of quaternions, with its usual real basis {1, i, j, k} identified with the standard basis of R 4 , and with {i, j, k} identified with the standard basis of R 3 , provides a convenient parametrization of SO(3). Specifically, writing $S^3_{\mathbb H} = \{q \in \mathbb{H} : \|q\| = 1\}$ for the group of unit quaternions, there is a natural two-to-one Lie-group homomorphism φ : S 3 H → SO(3), defined as follows. Using the basis {i, j, k} to identify R 3 with Im(H), the space of purely imaginary quaternions, for q ∈ S 3 H and x ∈ Im(H) we set $\phi(q)(x) = q x \bar{q}$, which lies in Im(H). For q 1 , q 2 ∈ S 3 H we have φ(q 2 ) = φ(q 1 ) if and only if q 2 = ±q 1 . Thus, for any Let S 2 Im(H) = {ã ∈ Im(H) : ‖ã‖ = 1}. For ã ∈ S 2 Im(H) and θ ∈ [0, π] let R θ,ã denote counterclockwise rotation by angle θ about the axis ã ("counterclockwise" as determined by ã using the right-hand rule). Let

the set of non-involutions in SO(3). The map s : SO(3) <π → S 3 H defined by
is a smooth right-inverse to φ on SO(3) <π (i.e., φ • s is the identity map on this domain), but s is not a homomorphism and cannot be extended continuously to all of SO(3).
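The two-to-one map from unit quaternions to rotations can be made concrete with a short Python sketch. It implements conjugation x ↦ q x q̄ on the imaginary quaternions and reads off the 3 × 3 matrix in the basis {i, j, k}. Quaternions are stored as tuples (w, x, y, z); this storage convention and all function names are assumptions of the example. The relation "rotation angle = 2 arccos |Re q|" used in the last function is the standard one for unit quaternions and is not quoted from the text.

```python
import math

def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def qconj(q):
    """Quaternion conjugate."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def phi(q):
    """3x3 matrix of x -> q x conj(q) on Im(H), in the basis {i, j, k}."""
    basis = [(0.0, 1.0, 0.0, 0.0), (0.0, 0.0, 1.0, 0.0), (0.0, 0.0, 0.0, 1.0)]
    # cols[c] holds the imaginary part of the image of the c-th basis vector.
    cols = [qmul(qmul(q, e), qconj(q))[1:] for e in basis]
    return [[cols[c][r] for c in range(3)] for r in range(3)]

def axis_angle_quaternion(theta, axis):
    """Unit quaternion cos(theta/2) + sin(theta/2) * axis, for a unit axis."""
    s = math.sin(theta / 2.0)
    return (math.cos(theta / 2.0), s * axis[0], s * axis[1], s * axis[2])

def rotation_angle_from_quaternion(q):
    """Rotation angle in [0, pi] of phi(q); identical for q and -q."""
    return 2.0 * math.acos(min(1.0, abs(q[0])))
```

Since `phi` is quadratic in q, it returns the same matrix for q and −q, which is the two-to-one property stated above.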

Distances between elements of SO(3) are related very simply to geodesic distances in S 3
H with respect to the standard Riemannian metric on where each of q U , q V , q U −1 V is either of the two elements in S 3 mapped by φ to U, V, and U −1 V respectively.

Parameters corresponding to different strata
For any subgroups H 1 , H 2 of SO(3), the map φ induces a bijection " D), and (V, Λ) be as in Proposition 3.6, and let Z be a set of representatives of " From equation (5.3) and the fact that φ is a homomorphism, it follows that in the setting of equation (3.11) From Proposition 3.6, we therefore have As equations (5.8)-(5.9) suggest, computing d SR (X, Y ) is a minimization problem that breaks into two parts, one over the discrete parameter-set Z and the other over the (potentially) "continuous" parameter-set G 0 D × " G 0 Λ . Both parameter-sets depend on X and Y .
If X or Y lies in the bottom stratum of Sym + (3) (i.e., has only one distinct eigenvalue), then the set Z has only one element ζ, which we can take to be 1 ∈ H, and at least one of the groups G 0 , the inner minimum d(ζ) in (5.8) is 0, and we immediately obtain d SR (X, Y ) = ‖log(ΛD −1 )‖. We do not need to use quaternions to obtain this result; it follows just as quickly from (3.11).
At the other extreme, if X and Y both lie in the top stratum of Sym + (3) (i.e. both have three distinct eigenvalues) then $\tilde G^0_D = \tilde G^0_\Lambda = \{\pm 1\}$, and the inner minimum is trivial to compute (d(ζ) = cos −1 |Re(ζq U −1 V )|), so we are reduced immediately to a single minimization over Z. As in the previous case, we do not need the quaternionic reframing of the distance formula at all: already in (3.11) we have G 0 D = G 0 Λ = {I}, so the distance can be found simply by minimizing over the discrete variable $g \in Z = \tilde S^+_3$. We need only have a computer calculate $d_M((U,D),(VP_g^{-1}, g\cdot\Lambda))$ for each of the 24 g's and return the corresponding minimal pairs and MSSR curves. For combinatorial reasons, a complete algebraic classification of the pairs (X, Y ) (with X, Y both in the top stratum of Sym + (3)) for which M(X, Y ) has a given cardinality would be very complex, and we do not attempt this.
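The brute-force computation over the 24 group elements described for the top stratum might be sketched as follows. This is only an illustration under stated assumptions: the group is realized as the signed 3 × 3 permutation matrices of determinant +1, and the distance on M is taken to be d_M² = k·θ² + ‖log Λ − log D‖², where θ is the rotation angle of U⁻¹W. That formula is an assumed normalization, since the displayed definition of d M is not legible in this copy. All function names are hypothetical.

```python
import math
from itertools import permutations, product

def matmul(A, B):
    return [[sum(A[i][m] * B[m][j] for m in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def det3(A):
    return (A[0][0] * (A[1][1] * A[2][2] - A[1][2] * A[2][1])
            - A[0][1] * (A[1][0] * A[2][2] - A[1][2] * A[2][0])
            + A[0][2] * (A[1][0] * A[2][1] - A[1][1] * A[2][0]))

def signed_permutations():
    """The 24 signed 3x3 permutation matrices with determinant +1."""
    mats = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            P = [[signs[r] if perm[r] == c else 0 for c in range(3)]
                 for r in range(3)]
            if det3(P) == 1:
                mats.append(P)
    return mats

def angle_between(U, W):
    """Rotation angle of U^T W (assumed rotation part of the distance)."""
    R = matmul(transpose(U), W)
    c = (R[0][0] + R[1][1] + R[2][2] - 1.0) / 2.0
    return math.acos(max(-1.0, min(1.0, c)))

def d_sr_top(U, d, V, lam, k=1.0):
    """Minimize the assumed d_M over the 24 eigen-decompositions of Y.

    d and lam list the three distinct positive eigenvalues of X and Y.
    Returns (distance, W, permuted eigenvalues) for one minimizer.
    """
    best = None
    for P in signed_permutations():
        W = matmul(V, transpose(P))  # V P^{-1}
        # Diagonal of P diag(lam) P^{-1}: the eigenvalues permuted by P.
        lam_g = [sum(abs(P[r][c]) * lam[c] for c in range(3)) for r in range(3)]
        rot = angle_between(U, W)
        scale_sq = sum(math.log(lam_g[i] / d[i]) ** 2 for i in range(3))
        dist = math.sqrt(k * rot ** 2 + scale_sq)
        if best is None or dist < best[0]:
            best = (dist, W, lam_g)
    return best
```

The sketch returns a single minimizer; to list all minimal pairs, as the text suggests, one would instead collect every candidate whose distance ties the minimum up to a tolerance.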
For the above reasons, for the remainder of this section we focus on the cases in which X and Y do not both lie in the top stratum of Sym + (3), and neither lies in the bottom stratum. Thus we restrict attention to the cases in which one of the matrices X, Y has exactly two distinct eigenvalues, and the other has either two or three. We refer to these cases as the "nontrivial" cases (because the set of distances between elements of E X and elements of E Y is not a finite set). To analyze them we introduce the following notation:

Scaling-rotation distances for Sym + (3) in the nontrivial cases
From now through Section 6 we assume that X ∈ S mid and that either Y ∈ S top or Y ∈ S mid . Then X has an eigen-decomposition (U, D) with D ∈ D J1 , and if Y ∈ S mid then Y has an eigen-decomposition (V, Λ) with Λ ∈ D J1 . We will always assume that our pairs (U, D), (V, Λ) have been chosen this way. Then we have It is not hard to check that The inner minimum in (5.8) is then Obviously, minimizing the arc-cosines above is equivalent to maximizing where the maximum is taken over (r U , r V ) ∈ S 1 C × S 1 C in (5.14), and over just r U ∈ S 1 C in (5.13).

The discrete parameter-sets Z in the nontrivial cases
To compute the outer minimum (over ζ) in (5.8), we will need to select sets Z of representatives of " . Thus the cardinality of Γ 1 \ Γ is 48/8 = 6. One can check that the following set Z 1, * contains a representative of each of the six right Γ 1 -cosets:
Table 3. Representatives of the double-coset space "
For case (ii), the double-coset space can be viewed as the set of orbits under the action of Γ 1 on Γ 1 \ Γ (the coset-space in case (i)) by right-multiplication. Thus a set of representatives can be found by imposing the double-coset equivalence relation on the set Z 1, * above. The four elements $(1 \pm j)/\sqrt{2}$, $(1 \pm k)/\sqrt{2}$ all lie in the same It is easily checked that no two of 1, j, and $(1+j)/\sqrt{2}$ lie in the same ( Γ 1 , Γ 1 ) double-coset. Hence $Z_{1,1} := \{1, j, (1+j)/\sqrt{2}\}$ is a set of representatives of Γ 1 \ Γ/ Γ 1 . The elements ζ of Z 1, * are listed in Table 3, along with the images φ(ζ) ∈ SO(3) and π ζ ∈ S 3 . Since Z 1,1 ⊂ Z 1, * , a separate listing for Z 1,1 is not needed. In Table 3 and henceforth, we write π id for the identity permutation, and, for distinct a, b ∈ {1, 2, 3}, we write π ab for the transposition (ab), the permutation that just interchanges a and b.
Remark 5.1. As seen in Section 2.8.2, E X has six connected components, each diffeomorphic to the circle G 0 J1 . In Proposition 2.14, for general p and X we exhibited a bijection between Comp(E X ) and the left-coset space S + p /Γ 0 J D . For any group G and subgroup H, the inversion map G → G induces a 1-1 correspondence between left H-cosets and right H-cosets, so (for general p and X), In our current p = 3, X ∈ S mid setting, the set Z 1, * := {φ(ζ) : ζ ∈ Z 1, * } is a set of representatives of The fact that right Γ 1 -cosets appear here instead of left cosets is an artifact of our having chosen X, rather than Y , to lie in S mid .

Hypercomplex reformulation of the continuous-parameter minimization
We now have To allow us to refer efficiently to the minimization-parameters in (5.16) without too much separate notation for the two cases Y ∈ S top , Y ∈ S mid , for both cases we will refer to the triple (ζ, r U , r V ), with the understanding that we always take r V = 1 when Y ∈ S top . Recall that quaternions can be written in "hypercomplex" form: we regard the complex numbers C as the subset {a+bi} ⊂ H, and write

. This gives us a natural identification
To perform the maximization of (5.13) and (5.14) (in order to minimize the arc-cosines in (5.12)), we will write q U −1 V in hypercomplex form: Henceforth whenever we refer to the quantities z and w, they are regarded as functions of the pair (U, V ), satisfying (5.18), and with the pair (z, w) determined only up to an overall sign. Because the parameters r U , r V in (5.13) and (5.14) run over the unit circle in C, it is easy to maximize (5.13) and (5.14) explicitly for each ζ in Z 1, * and Z 1,1 , respectively, and then to maximize over ζ. To express some of our answers, we define the following quantities, which we may regard as functions of the pair (U, V ): (iv) the ellipsoids of revolution corresponding to the matrices X and Y have the same axis of symmetry. The latter condition is obviously intrinsic to the pair (X, Y ), independent of any choices of eigen-decompositions. Note also that when we want to find all MSSR curves from X to Y , we do not need to express these in terms of arbitrary eigen-decompositions with D, Λ ∈ D J1 ; it suffices to use any that we find convenient. Thus, given the eigen-decomposition (U, D) of X, if (5.21) (hence (5.22)) holds we are free to replace V with U , in which case U −1 V = I and (z, w) = (±1, 0). We will adopt the following convention: and U, V are such that w = 0 or z = 0, we replace V with U , and replace (z, w) with (1, 0). We do not change U . (13) , (12) ,  , d 2 , d 2 ) and Λ as diag(λ 1 , λ 2 , λ 3 ), we also have the following comparisons of id , (13) , and (12) :
(ii) We use (5.16) to compute d SR (X, Y ). We proceed by determining the "inner" minimum for each ζ ∈ Z 1, * , and comparing the answers for the different ζ's. For a given ζ, minimizing the arc-cosine in (5.16) is equivalent to maximizing expression (5.13). Below, we use the notation (5.18), and for any nonzero ξ ∈ C we set $\hat\xi := \xi/|\xi|$. Facts used repeatedly in these calculations are that for all ξ ∈ C, (i) ξj and ξk are linear combinations of j and k with real coefficients, hence are purely imaginary; and (ii) $j\xi = \bar\xi j$ and $k\xi = \bar\xi k$. We then compute Hence for each ζ ∈ Z 1, * , the value of $\max_{r_U \in S^1_{\mathbb C}} {\rm Re}\big(\zeta\, r_U (z + jw)\big)$ is the entry in the last column of the corresponding line of Table 4; let us denote this as |f 2 (ζ)|, where f 2 (ζ) ∈ C. The set of elements r U ∈ S 1 C at which the maximum is attained is $\{\pm\widehat{f_2(\zeta)}\}$ if f 2 (ζ) ≠ 0, and all of S 1 C if f 2 (ζ) = 0. Since $|z|^2 + |w|^2 = 1$, we have $|\bar z \pm w|^2 = 1 \pm 2{\rm Re}(zw)$, $|\bar z \pm iw|^2 = 1 \mp 2{\rm Im}(zw)$. Thus, grouping together the elements ζ ∈ Z 1, * corresponding to the same permutation π ζ , we have the following: from which (5.27) follows. The derivations of (5.28) and (5.29) are similar.
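The two algebraic facts used in this part of the proof can be checked numerically. The sketch below verifies the commutation rule jξ = ξ̄j for complex ξ embedded in the quaternions, and the norm identities in the form |z̄ ± w|² = 1 ± 2Re(zw) and |z̄ ± iw|² = 1 ∓ 2Im(zw) for |z|² + |w|² = 1. Placing the conjugate on z is an assumption about notation that may have been lost in this copy; it is the reading under which the right-hand sides as printed are correct. The function names and the tuple storage (w, x, y, z) are also assumptions of the example.

```python
import cmath

def qmul(a, b):
    """Hamilton product of quaternions stored as (w, x, y, z) tuples."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def embed(xi):
    """Embed the complex number a + bi as the quaternion a + b*i."""
    return (xi.real, xi.imag, 0.0, 0.0)

J = (0.0, 0.0, 1.0, 0.0)  # the quaternion j

def j_commutation_holds(xi, tol=1e-12):
    """Check j * xi == conj(xi) * j for a complex number xi."""
    left = qmul(J, embed(xi))
    right = qmul(embed(xi.conjugate()), J)
    return all(abs(l - r) < tol for l, r in zip(left, right))

def norm_identities_hold(z, w, tol=1e-12):
    """Check both norm identities, assuming |z|^2 + |w|^2 = 1."""
    zw = z * w
    zb = z.conjugate()
    for s in (1, -1):
        if abs(abs(zb + s * w) ** 2 - (1 + s * 2 * zw.real)) > tol:
            return False
        if abs(abs(zb + s * 1j * w) ** 2 - (1 - s * 2 * zw.imag)) > tol:
            return False
    return True
```

A pair such as z = 0.6·e^{0.3i}, w = 0.8·e^{1.1i} satisfies the normalization and can be used as input.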
(iii) We use the same strategy as in part (ii), but now with ζ ranging only over the set Z 1,1 = {1, j, ζ j,+ }, and with r U , r V both allowed to vary over S 1 C . This time we find and that the pairs (r U , r V ) at which the maximum is achieved are all those for which r V r U =ẑ if ζ = 1, and for which r V r U = ±ŵ if ζ = j. Thus, for these two ζ's, the set of maximizing pairs (r U , r V ) is the two circles' worth of pairs appearing in the triples (ζ, r U , r V ) in the lines for classes A 1 and A 2 in Table 4, and the last entry of each line is the corresponding maximum value (5.37). Now consider $\zeta = \zeta_{j,+} = (1+j)/\sqrt{2}$. Since r U , r V are unit complex numbers, it is clear from (5.36) that for all r U , r V , First assume that z ≠ 0 ≠ w. Then the upper bound on |f 3 (ζ j,+ , r U , r V )| in (5.38) will be achieved by a pair (r U , r V ) if and only if (r V r U , r V r U ) = ±(ẑ,ŵ) . (5.39) But (5.39) is easily solved; the solution-set is exactly the set of four pairs (±(ŵẑ) 1/2 , ±(ŵẑ) 1/2 ) appearing in Table 4 for Class B . Thus the upper bound in (5.38) is actually the maximum value of |f 3 (ζ j,+ , ·, ·)|. Now assume that w = 0 or z = 0; we define the corresponding set of pairs in E X × E Y (i.e. those pairs ((Uφ(r U ), D), (V φ(r V )φ(ζ j,+ ) T , Λ π ζ j,+ )) for which (r U , r V ) maximizes |f 3 (ζ j,+ , ·, ·)|) to be Class C . If w = 0 then |z| = 1, and we need only maximize (1/√2) |Re (r V r U z) |. Since |z| = 1, the maximum value is 1/√2, and is achieved at all pairs (r U , r V ) for which r V r U = ±z. Similarly, if z = 0 then |w| = 1, and we need only maximize (1/√2) |Re (r V r Uw ) |. The maximum is again 1/√2, now achieved at all pairs (r U , r V ) for which r V r U = ±w. Hence, for ζ = ζ j,+ , whether or not z and w are both nonzero, the right-hand side of (5.38) is the maximum value of |f 3 (ζ j,+ , ·, ·)|. But if z ≠ 0 ≠ w there are only four maximizing pairs (r U , r V ), while if w = 0 or z = 0 there are infinitely many.
As noted in Remark 5.2, in the latter case we may replace V with U , in which case (z, w) = (1, 0) (Convention 5.3) and the maximizing pairs (r U , r V ) are exactly those listed for Class C in Table 4.
(iv) In this case G 0 D = SO(3), Γ 0 J D = S + 3 , and the set Z in Proposition 3.6 has only one element g, which we can take to be the identity. The set (3), the inner minimum in (3.11) is 0, and d SR (X, Y ) = ‖log(ΛD −1 )‖.

Remark 5.5 (Insensitivity to choice of eigen-decompositions). By definition,
that we have used to write down the formulas in Theorem 5.4. However, the assumption that D ∈ D J1 in parts (ii) and (iii) of the theorem limits (U, D) to a particular pair of connected components of E X out of the possible six. A similar comment applies in part (iii) to the choice of (V, Λ). So the individual numbers id , (13) , (12) on the right-hand sides of (5.23) and (5.30), which represent distances between the connected component [(U, D)] and the various connected components of E Y , may depend on the choice of (U, D), but changing (U, D) to a different pre-image of X (not necessarily with D ∈ D J1 ) must give us the same set of component-distances, and cannot change any of the numbers id , (13) , (12) at all if the new pre-image is in the same connected component as the old. The latter a priori truth is reflected in the formulas given in , changes (z, w) to (ξz,ξw) for some ξ ∈ S 1 C .) Thus when X ∈ S mid , and Y ∈ S mid or Y ∈ S top , in (5.18)-(5.20) we can regard |z|, |w|, and zw as functions of a pair ([(U, D)], [(V, Λ)]) of connected components of fibers. Therefore the same is true of the quantities id , (13) , (12) (j), D) has the effect of replacing (z, w) by ±(w, −z), which leaves the quantities ϕ, β, β in (5.19)-(5.20) unchanged, and hence leaves each of the numbers id , (13) , (12) in (5.24)-(5.26) and (5.31)-(5.32) unchanged. The fact that replacing (U, D) by other pre-images of X cannot change the set { id , (13) , (12) } is also reflected, later in Theorem 6.3, by the symmetry of the last column of Table 5 under permutations of id , (13) , (12) .

MSSR curves for Sym + (3) in the nontrivial cases
Recall that for X, Y ∈ Sym + (p), M(X, Y ) denotes the set of all MSSR curves from X to Y . In this section, for p = 3 we determine the set M(X, Y ) for all X ∈ S mid , Y ∈ S mid ∪ S top (what we are calling the "nontrivial cases").

Explicit characterization of all MSSR curves in the nontrivial cases
For any X, Y ∈ Sym + (3) and (U, D) ∈ E X , Proposition 3.6 assures us that every MSSR curve from X to Y corresponds to some minimal pair whose first element lies in the connected component [(U, D)]. When X ∈ S mid , by keeping track of the triples (ζ, r U , r V ) at which the minimum values in (5.16) are achieved, we can find all the minimal pairs in E X × E Y whose first point lies in [(U, D)] of E X .
The classification we will give of MSSR curves involves six classes of scaling-rotation curves when Y ∈ S top , and four classes when Y ∈ S mid . Not all of these classes occur for a given X and Y , and when they do occur they are not necessarily minimal. The (potentially) minimal pairs giving rise to the various classes of scaling-rotation curves can be described in terms of the data z, w and the triple (ζ, r U , r V ). Our names for these classes of pairs and curves, and the data (ζ, r U , r V ) corresponding to each class, are listed in Table 4. For the ζ appearing in each line of the table, the accompanying values of (r U , r V ) are all those that minimize the arc-cosine term in the corresponding line of (5.16), provided that any unit complex number $\hat\xi = \xi/|\xi|$ appearing in that line's indicated formula for (r U , r V ) is defined (i.e. provided ξ ≠ 0); see the proof of Theorem
Table 4. Names and data for classes of pairs in E X × E Y that, for some X in S mid and some Y in S top or S mid , determine at least one MSSR curve from X to Y . For any nonzero ξ ∈ C, $\hat\xi$ is the unit complex number ξ/|ξ|, and ξ 1/2 is an arbitrary choice of one of the two square roots of ξ. Wherever a number of the form $\hat\xi$ appears in this For Y ∈ S top : class defined if w = 0 or z = 0 but left undefined otherwise.
, reflected by restrictions on z and w that depend only on these connected components; e.g. for Class B 1 to be minimal we need Re(zw) ≥ 0, and for Class A 1 to be minimal we need |z| ≥ |w|. The full set of restrictions can be read off from Tables 5 and 6, which are part of Theorem 6.3 below.
Remark 6.2. In our application of Corollary 6.1 to the proof of Theorem 6.3 below, we will have D ∈ D J1 , and hence the quaternions r U,i , r V,i in (6.1) and (6.4) will lie in S 1 C . Note also that the only permutations π for which π · D = D are the identity and the transposition π 23 . Thus the only ζ's that can satisfy (6.2) are those that lie in the group However, in general the ζ i in (6.1) need not lie in C. (ii) For any data-triple (ζ, r U , r V ) as in Table 4, let is a minimal pair in each case listed in Tables 5 and 6 Table 4 and either Table 5 or Table 6.
(iii) For Y ∈ S top , depending on the value of Y the set M(X, Y ) can consist of one, two, three, or four curves, as detailed in Table 5. In Tables 5 and 6, note that "|M(X, Y )| = 1" means precisely that there is a unique MSSR curve from X to Y .
(iv) For Y ∈ S mid , depending on the value of Y the set M(X, Y ) can consist of one, two, three, or infinitely many curves, as detailed in Table 6. When |z| ≥ |w| (respectively, |z| ≤ |w|), all minimal pairs in Class A 1 (resp. A 2 ) determine the same MSSR curve, so to write down this curve it suffices to take r = 1 in the data-triple for this class in Table 4.
Thus when X ∈ S mid we can always choose our pair of pre-images (U, D), (V, Λ) (with D ∈ D J1 ) to satisfy |z| ≥ |w|, or Re(zw) ≥ 0, or Im(zw) ≥ 0 (though not necessarily more than one of these inequalities at the same time). This explains the "symmetry" in Tables 5 and 6
Table 6 (namely, id ≥ (12) and either w = 0 or z = 0) can be described more explicitly and geometrically in terms of the ellipsoids of revolution corresponding to X and Y . Recall from Remark 5.2 that "w = 0 or z = 0" is equivalent to the condition that these ellipsoids have the same axis of symmetry, and to the condition ϕ = 0. But (5.33) shows that when ϕ = 0, the condition id ≥ (13) is equivalent to (6.6). In particular, log ä must have opposite signs for (6.6) to hold, so one of the ellipsoids must be prolate and the other oblate. Conversely, given two ellipsoids of revolution with the same axis of symmetry, one prolate and the other oblate, if their "prolateness-oblateness product" is sufficiently large (i.e. if (6.6) holds), then the set of MSSR curves from X to Y will include the 1-parameter family M C . The proof below of Theorem 6.3(i) shows that for such X and Y , a choice of orientation of the common axis of symmetry naturally determines a continuous one-to-one correspondence between the "equator" of X (or Y ) and the family M C . For a graphical example illustrating several members of the family M C as evolutions of the X-ellipsoid to the Y -ellipsoid, see Fig. 17 in Section 7.
Table 5. The set M(X, Y ) of minimal smooth scaling-rotation curves from X to Y when X has exactly two distinct eigenvalues and Y has three. Data-combinations that are mutually exclusive are not shown (e.g. if id = (13) < (12) , it is impossible to have |z| − |w| = 0 = Re(zw)). In the subcase of id = (13) = (12) in which |z| = |w|, the hypothesis Re(zw) ≠ 0 ≠ Im(zw) is redundant; it is already implied by the case/subcase hypotheses. (This follows from Theorem 5.4; see the proof of Theorem 6.3.)
Table 6. The set M(X, Y ) of minimal smooth scaling-rotation curves from X to Y when each of X and Y has exactly two distinct eigenvalues. See text for notation.
Proof of Theorem 6.3. (i) By definition, each curve-class M l is a set of (not necessarily minimal) smooth scaling-rotation curves corresponding to pairs for which r U maximizes the function |f 1 (ζ, ·)| given by (5.34) if Y ∈ S top , or for which (r U , r V ) maximizes the function |f 3 (ζ, ·)| given by (5.36) if Y ∈ S mid . In the proof of Theorem 5.4 we established that Table 4 lists all the corresponding triples (ζ, r U , r V ), with the exception that for class C we followed Convention 5.3 and listed the corresponding triples only for the case (z, w) = (1, 0) (see Remark 5.2). For i ∈ {1, 2} let (ζ, r U,i , r V,i ) be two such triples listed in Table 4 corresponding to the same class M l , and let χ i be the MSSR curves they determine.
First assume that l ≠ C . Then r U,2 = ±r U,1 and r V,1 = ±r V,2 , so φ(r U,2 ) = φ(r U,1 ) and φ(r V,1 ) = φ(r V,2 ), since the sign of a unit quaternion does not affect its image under φ. Hence the minimal pair in (SO × Diag + )(3) determined by (ζ, r U,i , r V,i ) is the same for both values of i, so χ 2 = χ 1 . Thus M l consists of a single curve.
Now assume that the (ζ, r U,i , r V,i ) are associated with class C and that (z, w) = (1, 0). Then (ζ, Thus by Corollary 6.1, a necessary condition to have χ 2 = χ 1 is But (r i ) 2 j ∈ span{j, k}, which holds only if r 2 = ±r 1 . Thus if r 2 ≠ ±r 1 , then χ 2 ≠ χ 1 . Conversely, suppose that r 2 = εr 1 , where ε = ±1. Then (6.7) holds, r U,1 r U,2 = ε, and (6.2)-(6.4) are satisfied with ζ = 1 and r = ε. Corollary 6.1 then implies that χ 2 = χ 1 . Thus for triples (ζ j,+ , r U,i , r V,i ) associated with class C , a necessary and sufficient condition for χ 1 , χ 2 to coincide (following Convention 5.3) is r U,2 = ±r U,1 , which is equivalent to R U,1 = R U,2 in SO(3). Since R U,i ∈ G 0 D , the preceding sets up a one-to-one correspondence between M C and the circle G 0 D :  (3)) is Hausdorff), which is M C . Therefore, in this natural topology, M C is homeomorphic to a circle.
While the above map F 1 explicitly parametrizes M C by the circle G 0 D , this parametrization is not canonical: it depends on several non-unique choices, such as a particular matrix U ∈ SO(p) among all those that satisfy UDU T = X, and our choice of representative ζ j,+ of the ( " Γ 0 D , " Γ 0 Λ ) double-coset (in Γ) in which ζ j,+ lies. There is a more directly geometric parametrization of M C = M C (X, Y ), which we exhibit next, by a circle in R 3 determined by the ellipsoids Σ X , Σ Y to which X, Y correspond.
Recall that under the Class C hypotheses (w = 0 or z = 0), X and Y have the same, unique, axis of circular symmetry L (see Remark 6.5), and hence also have a common "equatorial plane" L ⊥ . For t ∈ [0, 1] and R ∈ G 0 D , let Σ R t be the ellipsoid in R 3 corresponding to χ R C (t); note that Σ R 0 = Σ X and Σ R 1 = Σ Y for all R.
Hence γ R (1) lies in the unit circle C in the equatorial plane L ⊥ . It is easily checked that the continuous map F 2 := F 2,v0 : G 0 D → C given by F 2 (R) = γ R (1) is a bijection, hence a homeomorphism. Thus the map F v0 : One can easily check that there is at most one t ∈ (0, 1) for which the eigenvalues of χ R C (t) are not all distinct. Thus γ R is the unique continuous map This characterization shows that the parametrization F v0 : C → M C (X, Y ) is canonical up to the choice v 0 of one of the two unit vectors in L. Given v 0 and a vector w ∈ C, there is a unique χ = χ w,v0 ∈ M C (X, Y ) such that the curve γ χ defined above has γ χ (0) = v 0 and γ χ (1) = w. Moreover, γ χ,−v0 (1) = −γ χ,v0 (1), so the two parametrizations are simply related to each other (
(ii) Our proof of Theorem 5.4 established that all the MSSR curves from X to Y are accounted for by the curves coming from minimal pairs in the classes listed in Table 4. The first element (UR U , D) of each such pair lies in [(U, D)], since R U ∈ G 0 D . It remains only to establish that all MSSR curves are accounted for by one of the (sub)cases listed in Table 5 or Table 6, and that necessary and sufficient conditions for the curve(s) in a given class M l to be minimal are the conditions that can be read off from Table 5 if Y ∈ S top , or Table 6 if Y ∈ S mid . (For example, if Y ∈ S top , to read off from Table 5 the conditions for the (unique) curve χ A1 in M A1 to be minimal, we simply take the union of all the cases for which χ A1 is an element of M(X, Y ), as indicated by the third column of the table. These conditions reduce to: |z| ≥ |w| and id ≤ min{ (13) , (12) }.)
First assume that Y ∈ S top . Equations (5.27)-(5.29) show that no nonvacuous subcases have been omitted in Table 5.
(For example, if |z| − |w| = 0 = Re(zw) then ϕ = π/4 = β, so (5.27) shows that 2 id − 2 (13) ≠ 0, since, by hypothesis, d 1 ≠ d 2 and λ 1 ≠ λ 3 ; thus for the two cases in Table 5 in which id = (13) , there are no "|z| − |w| = 0 = Re(zw)" subcases. In the last case in the table, equations (5.27)-(5.29) show that no two of the three angles ϕ, β, β can be equal, and hence that if |z| = |w| [equivalently, ϕ = π/4], then automatically Re(zw) ≠ 0 ≠ Im(zw); else we would have β = π/4 or β = π/4. Thus the hypothesis Re(zw) ≠ 0 ≠ Im(zw) in the |z| = |w| subcase of id = (13) = (12) is redundant, as asserted in the table's caption.) Hence every MSSR curve from X to Y occurs in one of the subcases listed in column 2 of Table 5.
Thus, for Y ∈ S top , all MSSR curves from X to Y are accounted for in Table 5, and in each case listed in the table, a curve χ l is minimal (where where l ∈ {A 1 , A 2 , B 1 , B 2 , C 1 , C 2 }, ζ l is the element of Z 1, * that appears in the triples to the right of class-name l in Table 4, and f 4 is as in the proof of Theorem 5.4 (again cf. (5.16)). Then id = min{ A 1 , A 2 }. A set of necessary and sufficient conditions to have (13) }. Noting that exactly one of the classes B and C is defined for a given Y , a necessary and sufficient condition to have d SR (X, Y ) = B (respectively C ) is (13) The same reasoning used in the case Y ∈ S top shows now that A 1 ≤ A 2 if and only if |z| ≥ |w|, and that if d SR (X, Y ) = l , then the curve-class M l associated with the data listed in Table 4 is defined. It follows that for Y ∈ S mid , all MSSR curves from X to Y are accounted for in Table 6, and in each case listed in the table, the curve(s) χ in the class M l (where l ∈ {A 1 , A 2 , B , C }) is/are minimal if and only if the conditions indicated in the table are satisfied (modulo Convention 5.3 in the case of class C ).
(iii) Since we have now established that M(X, Y ) consists of precisely those curves listed in Table 5 for the subcase corresponding to the given data ((U, D), (V, Λ)), it suffices to show that if χ 1 , χ 2 are MSSR curves in distinct classes Given such l 1 , l 2 , for i ∈ {1, 2} let (ζ i , r U,i , r V,i ) be a triple from Table 4 corresponding to class l i . Since r V,1 = 1 = r V,2 for all such triples in all classes corresponding to Y ∈ S top , we may rewrite (6.1) as ζ 1 ζ 2 = ±r U,1 r U,2 . But r U,1 r U,2 ∈ C, so by Corollary 6.1 a necessary condition to have χ 1 = χ 2 is (6.12) We compute the following: Since ζ 1 ζ 2 ∉ C in every case, it follows that χ 1 ≠ χ 2 .
(iv) Analogously to part (iii), since we have now established that M(X, Y ) consists of precisely those curves listed in Table 6 for the subcase corresponding to the given data ((U, D), (V, Λ)), and that M C (when defined) contains infinitely many curves, it suffices to show that if χ 1 , χ 2 are MSSR curves in distinct classes l 1 , l 2 ∈ {A 1 , A 2 , B , C }, then χ 1 ≠ χ 2 . Since exactly one of the classes M B , M C is nonempty, we do not need to consider the case {l 1 , l 2 } = {B , C }. Because of Convention 5.3, we also do not need to consider the case {l 1 , l 2 } = {A 2 , C }. Thus we need only consider the case-pairs (l 1 , Given such (l 1 , l 2 ), for i ∈ {1, 2} let (ζ i , r U,i , r V,i ) again be a data-triple from Table 4 corresponding to class l i . By Corollary 6.1, to show χ 1 ≠ χ 2 it suffices to show that (6.1) is not satisfied. If l 1 = A 1 then ζ 1 = 1, and (6.1) cannot be satisfied unless ζ 2 ∈ C, which does not hold since l 2 ≠ A 1 . For the case (l 1 , l 2 ) = (A 2 , B ), if (6.1) were satisfied we would have ζ j,+ = ξ 1 jξ 2 for some ξ 1 , ξ 2 ∈ C, an impossibility since ξ 1 jξ 2 = ξ 1 ξ̄ 2 j ∈ Cj = span{j, k}.
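The final step of part (iv) rests on the commutation rule between j and complex numbers. As a brief check (a sketch only, writing ξ 2 = a + bi with a, b real, an auxiliary notation introduced here, and using just the quaternion relations ij = k = −ji):

```latex
\begin{align*}
j\,\xi_2 &= j\,(a + b\,i) = a\,j + b\,ji = a\,j - b\,k = (a - b\,i)\,j = \bar{\xi}_2\, j, \\
\xi_1\, j\, \xi_2 &= \xi_1 \bar{\xi}_2\, j \;\in\; \mathbb{C}\,j = \operatorname{span}\{j, k\}.
\end{align*}
```

In particular, every product of the form ξ 1 jξ 2 with ξ 1 , ξ 2 ∈ C has vanishing real part and vanishing i-component.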

Algorithm for computing MSSR curves for p = 3 in the nontrivial cases
Let X, Y ∈ Sym + (3) be as in Theorem 6.3. Starting with eigen-decompositions (U, D) of X, (V, Λ) of Y , an algorithm to compute all the MSSR curve(s) from X to Y is as follows. This algorithm applies only when p = 3, and only to the nontrivial cases.
Step 1. If U −1 V is not an involution, proceed to Step 2. If U −1 V is an involution, find an even sign-change matrix The pair (V I σ , Λ) is still a pre-image of Y since the action of sign-change matrices on diagonal matrices is trivial. Replace V by V I σ , renamed to V . Proceed to Step 2.
Note that for any class other than C , all the (r U , r V ) pairs in Table 4 determine the same scaling-rotation curve, so just choose one pair from this line of the table. If the data are in class C , there will be one MSSR curve for each r ∈ S 1 C , but the ± sign in the table can be ignored (treated as +), since the sign does not affect the image under φ.
Step 4. For the chosen (r U , r V ) in each minimal class (there will only be one in each class except for Class C ), compute the rotations R U = φ(r U ), R V = φ(r V ) from the unit complex numbers r U , r V using the general formula

$$\phi(e^{ti}) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & \cos 2t & -\sin 2t \\ 0 & \sin 2t & \cos 2t \end{bmatrix} \quad \text{for } t \in \mathbb{R}. \tag{6.16}$$

(Note that if we identify the x 2 x 3 plane with C via (x 2 , x 3 ) ↔ x 2 + x 3 i, then for ξ ∈ S 1 C the lower right 2 × 2 submatrix of φ(ξ) corresponds simply to multiplication by ξ 2 . In Case B this conveniently "undoes" the square roots in Table 4; for example, if r U = ±(ẑŵ) 1/2 , then φ(r U ) is the rotation about the x 1 axis that corresponds to multiplying x 2 + x 3 i by ẑŵ.)
Step 5. Read off the value of φ(ζ) from Table 3. Then plug this and the values of (R U , R V ) computed in Step 4 into (6.5), yielding (for each of these pairs) the endpoints of a geodesic from E X to E Y whose projection to Sym + (3) is an MSSR curve.
Step 6. For each of the endpoint-pairs computed in Step 5, writing the endpoints as (U 1 , D) ∈ E X , (V 1 , Λ 1 ) ∈ E Y , set A = log(U 1 −1 V 1 ), L = log(D −1 Λ 1 ). Then use formulas (3.5) and (3.6) (with U 1 playing the role of U in these formulas) to compute the formula for the corresponding MSSR curve χ : [0, 1] → Sym + (p).
Remark 6.6. For the case in which each of X and Y has exactly two distinct eigenvalues, this algorithm for computing closed-form expressions for MSSR curves in the p = 3 nontrivial cases replaces the numerical algorithm in [25] described therein after Theorem 4.3.
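Steps 4 and 6 can be sketched numerically as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: all function names are hypothetical, the matrix logarithm is taken to be the principal logarithm (which requires that U 1 −1 V 1 not be an involution), and formulas (3.5)-(3.6) are assumed to have the form χ(t) = R(t) D(t) R(t) T with R(t) = U 1 exp(tA) and D(t) = D exp(tL).

```python
import numpy as np

def phi_unit_complex(t):
    """phi(e^{ti}) as in (6.16): rotation about the x1-axis through angle 2t."""
    c, s = np.cos(2.0 * t), np.sin(2.0 * t)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0, c, -s],
                     [0.0, s, c]])

def rot_log(R):
    """Principal logarithm of R in SO(3); not defined for rotations by pi."""
    cos_theta = np.clip((np.trace(R) - 1.0) / 2.0, -1.0, 1.0)
    theta = np.arccos(cos_theta)
    if theta < 1e-12:
        return np.zeros((3, 3))
    if np.pi - theta < 1e-8:
        raise ValueError("R is (numerically) an involution; principal log is not unique")
    return theta / (2.0 * np.sin(theta)) * (R - R.T)

def rot_exp(A):
    """Rodrigues formula for the exponential of a 3x3 skew-symmetric matrix."""
    theta = np.linalg.norm(A) / np.sqrt(2.0)  # rotation angle encoded by A
    if theta < 1e-12:
        return np.eye(3) + A
    return (np.eye(3) + np.sin(theta) / theta * A
            + (1.0 - np.cos(theta)) / theta**2 * (A @ A))

def sr_curve(U1, D, V1, Lam1, t):
    """Assumed form of (3.5)-(3.6): chi(t) = R(t) D(t) R(t)^T."""
    A = rot_log(U1.T @ V1)                     # A = log(U1^{-1} V1)
    L = np.log(np.diag(Lam1) / np.diag(D))     # diagonal of log(D^{-1} Lam1)
    R_t = U1 @ rot_exp(t * A)
    D_t = np.diag(np.diag(D) * np.exp(t * L))
    return R_t @ D_t @ R_t.T

# Hypothetical endpoint pair, chosen only for illustration.
U1 = np.eye(3)
D = np.diag([3.0, 1.0, 1.0])
V1 = phi_unit_complex(0.3)
Lam1 = np.diag([4.0, 2.0, 1.0])
midpoint = sr_curve(U1, D, V1, Lam1, 0.5)
```

By construction the curve starts at U 1 D U 1 T and ends at V 1 Λ 1 V 1 T ; whether it is minimal depends entirely on the endpoint pair having been selected as in Steps 1 to 5.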
if a > b = c, and tri-axial ellipsoid (or tri-axial) if a > b > c. We assume below that D and Λ have been chosen to lie in the same connected component of D Jtop . and that
Re(zw) = 0 ⇐⇒ cos α = 0 ⇐⇒ α = π/2, (7.8)
Im(zw) = 0 ⇐⇒ cos α = 1 ⇐⇒ α = 0. (7.9)
For any α < π/2, Re(zw) can have either sign. For α > 0, Im(zw) can have either sign. We also make use of the following parameters, concerning the eigenvalues of X and Y , scaled by k (where k > 0 is as in (3.4)): Each of m 1 and m 2 can be either positive or negative, but they must both have the same sign. To simplify the analysis, we assume m 1 = m 2 := m . if ϕ = 0. We have not found a similarly simple inequality equivalent to (7.14).
Figure 9, generated numerically, indicates the regions of (ϕ, α) corresponding to different size-orders of id , 13 , 12 , for the fixed value m = −0.1. The seven cases of unique and non-unique MSSR curves summarized in Table 5 are graphically represented in Figs. 10 and 11. As |m | increases, either A l or B m becomes the only case of MSSR curves.
In the following we provide several examples of the unique and non-unique cases of MSSR curves for the case X ∈ S mid and Y ∈ S top . We first discuss the shape-classification changes of the scaling-rotation curves χ A l (t), χ Bm (t) and χ Cn (t) (l, m, n = 1, 2). These depend on the sign of m . 2. If m < 0 (that is, X is oblate, and Y is tri-axial), then the shape-classification changes of χ A l (t) are (oblate → tri-axial → prolate → tri-axial → oblate → tri-axial); for χ Bm (t) they are (oblate → tri-axial); for χ Cn (t) they are (oblate → tri-axial → prolate → tri-axial).
Two scaling-rotation curves A 1 and A 2 (or B 1 and B 2 , C 1 and C 2 ) share the same scaling parameters, and also share the same rotation axis, but with different orientations (one clockwise, the other counterclockwise) and possibly different angles. The two angles differ by π.
To illustrate the examples in Figs. 12 to 14 visually, a scaling-rotation curve χ := χ U,D,A,L in the 6-dimensional space Sym + (3) is depicted by both a sequence of ellipsoids (representing the discretized scaling-rotation curve χ) and the combination of the rotation parameter A and changes of eigenvalues  Fig. 3 for the definition of shaded planes.) Since the rotational degrees of freedom have been projected out in the bottom panel, the relative lengths of the straight-line-segments do not accurately reflect the relative lengths of the curves.

The case in which both X and Y have exactly two distinct eigenvalues
For this special case, we parameterize X and Y with a, b, c, d ∈ R (not ordered), and a unit quaternion q = z + wj ∈ S 3 H , where z, w ∈ C, |z| 2 + |w| 2 = 1 as where U ∈ SO(3). Without loss of generality, we assume Re(q) = Re(z) > 0 so that φ(q) is not an involution. There are four different cases of MSSR curves arising from the parametrization of (7.15), as summarized in Table 4. These cases are denoted A 1 , A 2 , B and C . Our goal here is to further investigate the seven subcases of M(X, Y ) in Table 6, by partially rewriting the conditions in Table 6. Note that 0 < |z| ≤ 1, . Unlike in the Sym + (2) case (cf. Section 4), this m can be either positive or negative, but not zero. If X and Y are both prolate (or both oblate), then m > 0. If X is oblate and Y is prolate (or vice versa), then m < 0. By Theorem 5.4, ).  For each instance of case C , the rotation axis (depicted as the black line segment) is orthogonal to the (red) major semi-axis of X. It is shown in the proof of Theorem 6.3(i) that there is a one-to-one correspondence between the "equator" of X and the family M C ; see also Remark 6.5.
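The role of the assumption Re(q) = Re(z) > 0 can be checked numerically. The sketch below assumes that φ is the standard two-to-one covering of SO(3) by unit quaternions, q ↦ (v ↦ q v q̄), a convention that agrees with (6.16) on unit complex numbers; the function name `phi` and the sample quaternion are hypothetical. Under that assumption the trace of φ(q) equals 4 Re(q)² − 1, so φ(q) is a rotation by π (an involution) exactly when Re(q) = 0, and q and −q have the same image.

```python
import numpy as np

def phi(z, w):
    """Rotation matrix of the unit quaternion q = z + w j, with z, w complex.

    Writing z = a + b i and w = c + d i gives q = a + b i + c j + d k (since i j = k).
    The matrix below represents v -> q v conj(q) on span{i, j, k}.
    """
    z, w = complex(z), complex(w)
    a, b, c, d = z.real, z.imag, w.real, w.imag
    return np.array([
        [a*a + b*b - c*c - d*d, 2.0*(b*c - a*d),       2.0*(b*d + a*c)],
        [2.0*(b*c + a*d),       a*a - b*b + c*c - d*d, 2.0*(c*d - a*b)],
        [2.0*(b*d - a*c),       2.0*(c*d + a*b),       a*a - b*b - c*c + d*d],
    ])

# Hypothetical unit quaternion with Re(q) > 0.
q = np.array([1.0, 2.0, 0.5, -1.0])
q /= np.linalg.norm(q)
z, w = complex(q[0], q[1]), complex(q[2], q[3])
R = phi(z, w)

# A unit quaternion with Re(q) = 0 gives a rotation by pi, i.e. an involution.
R_inv = phi(0.6j, 0.8)
```

Here `R` has trace strictly greater than −1, so it is not an involution, whereas `R_inv @ R_inv` is the identity matrix.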