The physical foundations of geometric mechanics

The principles of geometric mechanics are extended to the physical elements of mechanics, including space and time, rigid bodies, constraints, forces, and dynamics. The result is a comprehensive and rigorous presentation of basic mechanics, starting with precise formulations of the physical axioms. A few components of the presentation are novel. One is a mathematical presentation of force and torque, providing certain well-known, but seldom clearly exposited, fundamental theorems about force and torque. The classical principles of Virtual Work and Lagrange–d’Alembert are also given clear mathematical statements in various guises and contexts. Another novel facet of the presentation is its derivation of the Euler–Lagrange equations. Standard derivations of the Euler–Lagrange equations from the equations of motion for Newtonian mechanics are typically done for interconnections of particles. Here this is carried out in a coordinate-free manner for rigid bodies, giving for the first time a direct geometric path from the Newton–Euler equations to the Euler–Lagrange equations in the rigid body setting.

Geometric mechanics typically begins with a mathematical framework, e.g., Riemannian geometry, symplectic or Poisson geometry, Dirac structures, etc., and develops this in various directions. Good examples of this are the books [Abraham and Marsden 1978, Bloch 2003, Libermann and Marle 1987]. With such treatments, the level of mathematical rigour and elegance is very appealing, and indeed is doubtless instrumental in "geometric mechanics" becoming a field of mathematics in its own right. However, the entry point in the modelling process of such treatments is quite far along from what one might call "first principles." On the other hand, the derivations of physical models from these first principles have been developed primarily in the engineering, applied mechanics, or physics literature. As such, they are typically presented in a language and style that is often difficult for a mathematician to understand [e.g., Papastavridis 1999, Pars 1965, Whittaker 1937]. There have been various attempts to intermingle mathematical rigour with classical mechanics [e.g., Arnol'd 1989, Bullo and Lewis 2004, Murray, Li, and Sastry 1994, Spivak 2010]. Nonetheless, until now the project has not been fully undertaken to extend the principles of sound mathematics to all aspects of the basic physical modelling of mechanical systems.

1.1. Features of the presentation. We shall outline the existing work, while at the same time presenting what is novel about our approach.
1. As concerns essential mechanical ideas, the paper is completely self-contained. Mechanical elements are given mathematical definitions that we believe connect directly to physically meaningful constructions. That is to say, we do not skip the steps from the physical idea to its mathematical representation. In particular, our definitions of velocity constraints, and forces and torques give precise mathematical meaning to the way in which one models these physical notions in practice. We cannot point to a treatment in the existing literature where mathematical precision and a sound physical basis appear side-by-side in this way. Typically, for example, mathematical treatments of classical mechanics neglect any comprehensive treatment of force and torque.
2. Our physical axioms are uncontroversial for the subject, as indeed they are those that are derived from centuries of empirical verification and widespread use (outside the relativistic and quantum domains). Therefore, axioms differing in any material way from the ones we give, e.g., giving possibly different formulations of mechanics, will themselves be controversial. Moreover, we are not interested in philosophical discriminations between mathematically equivalent theories.
foundations of mechanics, we hope that one can develop mechanics in physical settings less restrictive than the usual one.
To this end, we list a few places where we think there may be progress to be made along the lines of what is presented in the paper.
1. In most developments of mechanics, including ours, one assumes a nice structure to the space of configurations, namely that it has a smooth manifold structure. However, this need not always be the case. One can have physical environments with boundaries, and different regimes in which the set of configurations undergoes an abrupt change. Such configuration spaces are "nonsmooth," and our careful and precise construction of "physical configurations" (in Definition 5.2) should allow a physically natural, and mathematically precise, development of the analysis of nonsmooth configuration spaces.
2. The equations governing the dynamics of systems with velocity constraints are differential-algebraic equations. Such equations normally require further analysis to establish results concerning existence and uniqueness of solutions. Most work on the subject, including ours here, circumvents these problems by assuming that the constraint distribution has locally constant rank; in this case one can prove the usual sorts of existence and uniqueness results for physical motions, cf. Theorem 8.11. However, this is not always a physically valid assumption. For example, in "N-trailer" problems, if two or more of the trailers have their wheels aligned in a 90°-turning position, the constraint force bundle will drop rank. This will lead to the constraint distribution itself not being smooth. As far as we know, this is a problem that is simply not understood, despite the fact that it is not uncommon in applications. A different sort of problem is when the constraint distribution is smooth but not of locally constant rank. Such problems are considered by Cortés, de León, Martín de Diego, and Martínez [2001]. A key ingredient in dealing with such problems is a set of physical assumptions that enable one to "close the loop" and arrive at consistent equations of motion. In general, the phrasing of physically meaningful and mathematically precise axioms governing a model is an important step in the modelling process. For example, our formulation of the Principle of Virtual Work in Definition 6.23 in a way that covers interconnection and constraint forces is just such a physically meaningful and mathematically precise axiom. This is, we believe, an improvement of the physically and mathematically nebulous form usually taken by the Principle of Virtual Work (but we refer to [Spivak 2010] for a detailed and, mathematically, reasonably clear description of topics surrounding the Principle of Virtual Work).
Moreover, perhaps it can serve as a basis for the physical assumptions needed to correctly model systems where the constraint distribution is not smooth and of locally constant rank.
3. Appropriate physical principles can also be useful when dealing with generalising the sorts of forces one considers. For example, one can consider forces that are differential inclusions [Glocker 2001, Monteiro Marques 1993]. Our unusually careful development of forces and torques may be helpful in developing principles for handling general classes of forces.
4. The extension of our methodology to models leading to partial differential equations is appealing. As we mention in the footnote above, this is carried out for continuum mechanics by Truesdell and Noll [1965] and others. We believe that, nonetheless, a mathematical refinement of this work would serve to provide a useful and interesting foundation for the physical developments of these fields along geometric lines.
The first three of the preceding possible extensions based on our modelling all fall into the broad category of "nonsmooth mechanics." This discipline was initiated in the main by the work of Jean Jacques Moreau in the 1960s, and his work continued into the 1980s. There has been a lot of work in this area subsequently, and we refer to the books [Brogliato 1999, Kanno 2011] and references contained therein for a survey of this work. However, there seems to be almost no work done on nonsmooth mechanics in the geometric setting. This seems to be a direction in which the field of geometric mechanics can be profitably extended.
1.3. Organisation of paper. The organisation of the paper is as follows. We describe our notion of spacetime, and transformations and motions of and through space in Section 2. In Section 3 we review what we mean by a rigid body, following, in spirit, the presentation of Bullo and Lewis [2004, §4.2]. Special consideration is given in Section 4 to rigid bodies with degenerate inertia tensor. Here many of the new technical results in the paper are initiated. Single rigid bodies are generalised to interconnections of multiple rigid bodies in Section 5, where we define what we call an "interconnected rigid body system." The emphasis in this section is on "kinematic" constructions, including velocity constraints. In Section 6 we provide a systematic presentation of forces on rigid bodies and in the context of interconnected systems with constraints. In Section 7 we present the Newton–Euler equations for the motion of an interconnection of rigid bodies subject to external forces and velocity constraints. The Euler–Lagrange equations, and a proof of their equivalence with the Newton–Euler equations, are the topic of Section 8. Included as an essential part of this equivalence are various versions of the "Lagrange–d'Alembert Principle."

1.4. Notation and background. Our set theoretic conventions are standard, except that we denote set inclusion by "⊆," with "⊂" standing for strict set inclusion. By $\mathrm{id}_X$ we denote the identity map on a set $X$. Sometimes, when the sets have cumbersome notation attached to them, we shall just denote the identity map by $\mathrm{id}$, the set being understood from context. By $2^X$ we denote the set of subsets of a set $X$.
By $\mathbb{Z}$ we denote the set of integers, with $\mathbb{Z}_{>0}$ denoting the positive integers and $\mathbb{Z}_{\ge 0}$ denoting the nonnegative integers. By $\mathbb{R}$ we denote the set of real numbers, with $\mathbb{R}_{>0}$ denoting the set of positive real numbers.
If $X$ is a topological space and if $A \subseteq X$, by $\mathrm{int}(A)$ we denote the interior of $A$. If $A \subseteq B \subseteq X$, by $\mathrm{int}_B(A)$ we denote the interior of $A$ in $B$, i.e., the interior of $A$ in the relative topology of $B$ induced by $X$.
We suppose the reader to be familiar with standard linear algebra. Our vector spaces will typically be over the field of real numbers. By $\mathrm{Hom}_{\mathbb{R}}(U; V)$ we denote the space of $\mathbb{R}$-linear maps between $\mathbb{R}$-vector spaces $U$ and $V$. We abbreviate $V^* = \mathrm{Hom}_{\mathbb{R}}(V; \mathbb{R})$ and $\mathrm{End}_{\mathbb{R}}(V) = \mathrm{Hom}_{\mathbb{R}}(V; V)$. If $\alpha \in V^*$ and $v \in V$, we will use either $\langle \alpha; v \rangle$ or $\alpha(v)$ to denote the evaluation of $\alpha$ on $v$. If $S \subseteq V$, then we denote by $\mathrm{ann}(S) = \{\alpha \in V^* \mid \alpha(v) = 0 \text{ for all } v \in S\}$ the annihilator of $S$. If $S \subseteq V$, by $\mathrm{span}(S)$ we denote the subspace generated by $S$. For $A \in \mathrm{Hom}_{\mathbb{R}}(U; V)$, we denote the dual of $A$ by $A^* \colon V^* \to U^*$, and recall that it is defined to satisfy $\langle A^*(\beta); u \rangle = \langle \beta; A(u) \rangle$, $u \in U$, $\beta \in V^*$.
By $\bigwedge^k(V^*)$ we denote the space of alternating $k$-forms on $V$, which we think of as being the alternating multilinear mappings from $V^k$ into $\mathbb{R}$. That is, we identify $\bigwedge^k(V^*)$ with the mappings $A \colon \underbrace{V \times \dots \times V}_{k\ \text{times}} \to \mathbb{R}$ that are linear in each entry and which satisfy $A(v_{\sigma(1)}, \dots, v_{\sigma(k)}) = \mathrm{sign}(\sigma) A(v_1, \dots, v_k)$ for every permutation $\sigma$ of the set $\{1, \dots, k\}$, where $\mathrm{sign}(\sigma)$ denotes the parity of the permutation.
We shall rely heavily on constructions special to vector spaces with inner products and orientations, and here outline our notation for these. We let $V$ be an $n$-dimensional $\mathbb{R}$-vector space. An orientation of $V$ is an equivalence class in $\bigwedge^n(V^*) \setminus \{0\}$, where nonzero $n$-forms are equivalent if they agree up to a positive scalar multiple (keeping in mind that $\dim_{\mathbb{R}}(\bigwedge^n(V^*)) = 1$). We shall designate the choice of orientation by a single nonzero $n$-form $\theta$, taking for granted that it is the equivalence class of $\theta$ that we have in mind. A basis $(e_1, \dots, e_n)$ of $V$ is positively-oriented if $\theta(e_1, \dots, e_n) \in \mathbb{R}_{>0}$. An invertible linear map $A \in \mathrm{End}_{\mathbb{R}}(V)$ is orientation-preserving if $A^*\theta$ defines the same orientation as $\theta$, where $A^*\theta(v_1, \dots, v_n) = \theta(A(v_1), \dots, A(v_n))$.

Now let $V$ be a finite-dimensional $\mathbb{R}$-vector space with inner product $g$ and orientation $\theta$. By $\mathrm{SO}(V, g, \theta)$ we denote the group of orientation-preserving linear isometries of $V$. We refer to [Berger 1987, §8.2, 8.11] for basic properties of $\mathrm{SO}(V, g, \theta)$. We denote by $\mathfrak{so}(V, g, \theta)$ the subspace of $\mathrm{End}_{\mathbb{R}}(V)$ comprised of $g$-skew-symmetric linear mappings, i.e., those $A$ satisfying
$$g(A(u), v) = -g(u, A(v)), \qquad u, v \in V.$$
We denote by $g^\flat \colon V \to V^*$ the natural isomorphism defined by $\langle g^\flat(u); v \rangle = g(v, u)$, $u, v \in V$. The inverse of $g^\flat$ we denote by $g^\sharp$. If $U \subseteq V$ is a subspace, then $U^{\perp_g}$ denotes the $g$-orthogonal complement of $U$. We denote by $g^{-1}$ the inner product on $V^*$ induced by $g$: $g^{-1}(\alpha, \beta) = g(g^\sharp(\alpha), g^\sharp(\beta))$, $\alpha, \beta \in V^*$.

Our principal interest is in 3-dimensional oriented vector spaces with an inner product $g$. We aggregate the data in a triple $(V, g, \theta)$. In this case, we define a bilinear mapping $(v_1, v_2) \mapsto v_1 \times v_2 \in V$ on $V \times V$ as follows. First of all, for $u, v \in V$, define $n_{u,v} \in V$ by asking that $n_{u,v} \in (\mathrm{span}(u, v))^{\perp_g}$ have unit length and that $(u, v, n_{u,v})$ be a positively-oriented basis if $u$ and $v$ are not collinear, and $n_{u,v} = 0$ if $u$ and $v$ are collinear. Then, writing $\|v\| = \sqrt{g(v, v)}$, define
$$\theta_{u,v} = \cos^{-1}\Bigl(\frac{g(u, v)}{\|u\|\,\|v\|}\Bigr) \quad\text{and}\quad u \times v = \|u\|\,\|v\| \sin(\theta_{u,v})\, n_{u,v},$$
with the convention that $u \times v = 0$ when $u$ and $v$ are collinear. We refer to [Berger 1987, §8.11] for details and generalisations. We can directly verify the identity
$$g(u \times v, w) = g(w \times u, v), \qquad u, v, w \in V, \tag{1.1}$$
that we shall frequently use. We define an injective homomorphism $v \mapsto \hat{v}$ from $V$ into $\mathrm{End}_{\mathbb{R}}(V)$ by requiring that $\hat{v}(u) = v \times u$ for $u \in V$. One can use (1.1) to prove that $\hat{v} \in \mathfrak{so}(V, g, \theta)$. Moreover, a dimension count shows that $\hat{\cdot}$ is surjective onto $\mathfrak{so}(V, g, \theta)$, and so is an isomorphism onto this latter space. The inverse map from $\mathfrak{so}(V, g, \theta)$ to $V$ we denote by $A \mapsto \check{A}$.
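The cross product, the identity (1.1), and the correspondence between vectors and skew-symmetric maps can be checked numerically. The following is a minimal sketch, assuming that $(V, g, \theta)$ is $\mathbb{R}^3$ with its standard inner product and orientation; the function names `dot`, `cross`, `hat`, `vee`, and `matvec` are illustrative and do not come from the paper.

```python
from typing import List, Sequence

Vec = Sequence[float]


def dot(u: Vec, v: Vec) -> float:
    """Standard inner product on R^3, standing in for g."""
    return sum(a * b for a, b in zip(u, v))


def cross(u: Vec, v: Vec) -> List[float]:
    """Cross product for the standard inner product and orientation of R^3."""
    return [
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    ]


def hat(v: Vec) -> List[List[float]]:
    """Matrix of the linear map u -> v x u."""
    return [
        [0.0, -v[2], v[1]],
        [v[2], 0.0, -v[0]],
        [-v[1], v[0], 0.0],
    ]


def vee(A: Sequence[Sequence[float]]) -> List[float]:
    """Inverse of hat on skew-symmetric matrices."""
    return [A[2][1], A[0][2], A[1][0]]


def matvec(A: Sequence[Sequence[float]], u: Vec) -> List[float]:
    """Apply the matrix A to the vector u."""
    return [dot(row, u) for row in A]


# Arbitrary test vectors (hypothetical data).
u, v, w = [1.0, 2.0, 3.0], [-1.0, 0.5, 2.0], [0.0, 4.0, -2.0]

# Identity (1.1): g(u x v, w) = g(w x u, v).
lhs = dot(cross(u, v), w)
rhs = dot(cross(w, u), v)

# hat(v) acts as u -> v x u, is skew-symmetric, and vee undoes hat.
hat_v = hat(v)
acts_as_cross = matvec(hat_v, u) == cross(v, u)
is_skew = all(hat_v[i][j] == -hat_v[j][i] for i in range(3) for j in range(3))
round_trip = vee(hat_v) == v
```

In this concrete setting `hat` and `vee` play the roles of $v \mapsto \hat{v}$ and $A \mapsto \check{A}$, and the skew-symmetry check reflects the fact that $\hat{v} \in \mathfrak{so}(V, g, \theta)$.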
While we shall mostly work with abstract vector spaces, when we work with $\mathbb{R}^n$ we denote by $x = (x_1, \dots, x_n)$ a typical point in $\mathbb{R}^n$. By $\mathbb{R}^{m \times n}$ we denote the set of $m \times n$ matrices with entries in $\mathbb{R}$, which we think of as being members of $\mathrm{Hom}_{\mathbb{R}}(\mathbb{R}^n; \mathbb{R}^m)$. The standard basis for $\mathbb{R}^n$ we denote by $(e_1, \dots, e_n)$.
We also work with affine spaces, following [Berger 1987, Chapter 2]. We recall that an affine space $A$ modelled on a vector space $V$ is defined by an effective, transitive action of the Abelian group $V$ on $A$, which we denote by $(x, v) \mapsto x + v$. (Note that this is the definition of "+"!) Note that effectiveness and transitivity of the action ensure that, given $x_1, x_2 \in A$, there exists a unique $v \in V$ such that $x_2 = x_1 + v$. We denote this unique $v$ by $x_2 - x_1$. That is, we can (1) add elements of $V$ to elements of $A$, which we think of as "translating" the element of $A$ by the element of $V$, and (2) subtract elements of $A$, which we think of as returning the amount we must translate to get from one point to the other. If the model vector space $V$ has an inner product $g$, we define a metric on $A$ by
$$d(x_1, x_2) = \sqrt{g(x_2 - x_1, x_2 - x_1)}. \tag{1.2}$$
Other constructions concerning affine spaces will be introduced in the text and notation provided for these as required. If $S \subseteq A$ is a subset of an affine space, by $\mathrm{conv}(S) = \{x_1 + s(x_2 - x_1) \mid s \in [0, 1],\ x_1, x_2 \in S\}$ we denote the convex hull of $S$, i.e., the union of all line segments between points of $S$. By $\mathrm{aff}(S) = \{x_1 + s(x_2 - x_1) \mid s \in \mathbb{R},\ x_1, x_2 \in S\}$ we denote the affine hull of $S$, i.e., the union of all lines through points in $S$.

It is convenient in two places in our presentation to use measures, one when discussing mass distributions for rigid bodies, and the other for discussing forces and torques. We refer to [Cohn 2013] as a basic reference for measure theory. By $\mathrm{supp}(\mu)$ we denote the support of a measure $\mu$. One non-basic element of measure theory we shall use is that of a vector measure. Let $(M, \mathscr{A})$ be a measurable space and let $T$ be a topological vector space. A vector measure with values in $T$ is a mapping $\mu \colon \mathscr{A} \to T$ such that, if $(A_j)_{j \in \mathbb{Z}_{>0}}$ is a sequence of pairwise disjoint measurable sets, then
$$\mu\Bigl(\bigcup_{j \in \mathbb{Z}_{>0}} A_j\Bigr) = \sum_{j=1}^{\infty} \mu(A_j),$$
the sum converging in the topology of $T$. We note that, if $\lambda$ is an element of the continuous dual of $T$, then we define a signed measure $\mu_\lambda$ on $(M, \mathscr{A})$ by $\mu_\lambda(A) = \langle \lambda; \mu(A) \rangle$.
We shall assume the reader is familiar with basic differential geometry, and use [Abraham, Marsden, and Ratiu 1988] as a reference whose notational conventions we mainly adopt. We shall not often work with general manifolds, but when we do we shall suppose them to be smooth, Hausdorff, and second countable. By $\pi_{TM} \colon TM \to M$ we denote the tangent bundle of a manifold $M$ and by $\pi_{T^*M} \colon T^*M \to M$ the cotangent bundle. The fibres of these bundles at $x \in M$ we denote by $T_xM$ and $T^*_xM$, respectively. We shall sometimes denote the zero vector in a tangent or cotangent space at $x$ by $0_x$. If $\Phi \colon M \to N$ is a differentiable mapping of manifolds, by $T\Phi \colon TM \to TN$ we denote the derivative. We also denote $T_x\Phi = T\Phi|T_xM$. If $I \subseteq \mathbb{R}$ is an interval and if $\gamma \colon I \to M$ is a differentiable curve, then we define $\gamma' \colon I \to TM$ by $\gamma'(t) = T_t\gamma \cdot 1$. If $X$ is a vector field and $f$ is a function, by $\mathscr{L}_X f$ we denote the Lie derivative of $f$ with respect to $X$.
We shall make some reference to elementary Riemannian and affine differential geometry and we refer to [Bullo and Lewis 2004, §3.8] for a treatment at the level we shall use here, and for notation. For a Riemannian metric $G$ on a manifold $M$, we denote the Levi-Civita affine connection by $\overset{G}{\nabla}$. An arbitrary affine connection we denote by $\nabla$. Thus $\nabla_X Y$ denotes the covariant derivative of $Y$ with respect to $X$. For a curve $\gamma \colon I \to M$ and a vector field $Y \colon I \to TM$ along $\gamma$, i.e., $Y(t) \in T_{\gamma(t)}M$, we denote by $\nabla_{\gamma'(t)} Y(t)$ the covariant derivative of $Y$ along $\gamma$. Thus, in particular, geodesics are curves $\gamma$ that satisfy $\nabla_{\gamma'(t)} \gamma'(t) = 0$.
2.1. Space models and body reference spaces. In order to prevent a proliferation of $\mathbb{R}^3$'s, all having different physical meanings, we shall use an abstract model for space.
2.1 Definition: (Newtonian space model) A Newtonian space model is a quadruple S = (S, V, g, θ), where (V, g) is a three-dimensional inner product space, θ is a nonzero member of $\bigwedge^3(V^*)$ (defining an orientation on V), and S is an affine space modelled on V. •

The affine space S serves as our model for points in space. We can make the abstraction of the preceding space model more concrete by use of the following device.

2.2 Definition: (Spatial frame) A spatial frame for a Newtonian space model S = (S, V, g, θ) is a pair $(O_{\text{spatial}}, (s_1, s_2, s_3))$, where $O_{\text{spatial}} \in S$ and $(s_1, s_2, s_3)$ is a positively-oriented orthonormal basis for V. •

Given a spatial frame, for each x ∈ S we can write
$$x = O_{\text{spatial}} + x_1 s_1 + x_2 s_2 + x_3 s_3$$
for some unique $x_1, x_2, x_3 \in \mathbb{R}$. This, then, establishes a bijection between S and $\mathbb{R}^3$ that can be useful in making some of our abstract constructions concrete. We shall, however, not explicitly pursue making this translation from abstract to concrete in this paper.

In the preceding discussion, we described our model of physical space. We will posit a similar model for the reference space in which a rigid body resides.

2.3 Definition: (Body reference space) A body reference space is a quadruple B = (B, U, G, Θ), where (U, G) is a three-dimensional inner product space, Θ is a nonzero member of $\bigwedge^3(U^*)$, and B is an affine space modelled on U. •

We also have the corresponding notion of a reference frame for a body reference space.

2.4 Definition: (Body frame) A body frame for a body reference space B = (B, U, G, Θ) is a pair $(O_{\text{body}}, (b_1, b_2, b_3))$, where $O_{\text{body}} \in B$ and $(b_1, b_2, b_3)$ is a positively-oriented orthonormal basis for U. •

2.2. Rigid transformations. We now present the manner in which we will describe the motion of a rigid body. We suppose that we have a Newtonian space model S = (S, V, g, θ) and a body reference space B = (B, U, G, Θ). The notion of a rigid motion is described by relating these spaces, so the body is not actually required for this; it merely "goes along for the ride" after the fact. Thus we only introduce bodies in Section 3.

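2.5 Definition: (Rigid transformation) For a Newtonian space model S = (S, V, g, θ) and a body reference space B = (B, U, G, Θ), a rigid transformation of B in S is a map $\Phi \colon B \to S$ having the following properties:
(i) $\Phi$ is an affine map, i.e., there exists $R_\Phi \in \mathrm{Hom}_{\mathbb{R}}(U; V)$ such that
$$\Phi(X) = \Phi(X_0) + R_\Phi(X - X_0), \qquad X \in B, \tag{2.1}$$
for any $X_0 \in B$;
(ii) $R_\Phi$ is an isometry, i.e., $g(R_\Phi(u_1), R_\Phi(u_2)) = G(u_1, u_2)$ for $u_1, u_2 \in U$;
(iii) $R_\Phi$ is orientation-preserving.
By Rgd(B; S) we denote the set of rigid transformations of B in S. •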
We note that, for $\Phi$ to be affine, it suffices to check that (2.1) holds for some $X_0 \in B$ [Berger 1987, Proposition 2.3.1]. We also note that this uniquely defines $R_\Phi \in \mathrm{Hom}_{\mathbb{R}}(U; V)$ [Berger 1987, Proposition 2.3.1]. The last two conditions mean that $R_\Phi$ is an orientation-preserving isometry. With this in mind, let us denote by $\mathrm{Isom}^+(B; S)$ the set of linear orientation-preserving isometries from U to V. We will also have occasion to refer to the groups of orientation-preserving isometries of B and S, which we denote by Rgd(B) and Rgd(S), respectively. Note that these are indeed groups (with the operation of composition), but that Rgd(B; S) is not naturally a group. As previously, SO(U, G, Θ) and SO(V, g, θ) denote the groups of orientation-preserving linear isometries of U and V, respectively.
To understand the structure of the space of rigid transformations, the following result provides a useful insight.
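2.6 Proposition: (Group actions on the space of rigid transformations) Let S = (S, V, g, θ) be a Newtonian space model and let B = (B, U, G, Θ) be a body reference space. Then the mappings
$$A_{\text{spatial}} \colon \mathrm{Rgd}(S) \times \mathrm{Rgd}(B; S) \to \mathrm{Rgd}(B; S), \quad (\Psi, \Phi) \mapsto \Psi \circ \Phi,$$
$$A_{\text{body}} \colon \mathrm{Rgd}(B; S) \times \mathrm{Rgd}(B) \to \mathrm{Rgd}(B; S), \quad (\Phi, \Psi) \mapsto \Phi \circ \Psi,$$
define left- and right-actions of Rgd(S) and Rgd(B), respectively, on Rgd(B; S). Moreover, both actions are free and transitive.

Proof: That $A_{\text{spatial}}$ and $A_{\text{body}}$ are actions is easily verified. Let us verify the freeness and transitivity of $A_{\text{spatial}}$, the similar assertions for $A_{\text{body}}$ following in a similar manner. Let $\Psi \in \mathrm{Rgd}(S)$ have the property that $\Psi \circ \Phi = \Phi$ for some $\Phi \in \mathrm{Rgd}(B; S)$. Let $x \in S$ and note that
$$\Psi(x) = \Psi \circ \Phi \circ \Phi^{-1}(x) = \Phi \circ \Phi^{-1}(x) = x,$$
giving $\Psi = \mathrm{id}_S$. This proves freeness of $A_{\text{spatial}}$. Next let $\Phi_1, \Phi_2 \in \mathrm{Rgd}(B; S)$ and note that $\Phi_2 \circ \Phi_1^{-1} \in \mathrm{Rgd}(S)$ since compositions of orientation-preserving isometries are orientation-preserving isometries. Moreover,
$$A_{\text{spatial}}(\Phi_2 \circ \Phi_1^{-1}, \Phi_1) = \Phi_2,$$
giving transitivity of $A_{\text{spatial}}$.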
We now give a series of lemmata that give further structure to the set of rigid transformations. We begin by showing that the set of rigid transformations can be given the structure of a group by fixing a reference rigid transformation with respect to which others are measured.

2.7 Lemma: (Rigid transformations with respect to a reference transformation) Let S = (S, V, g, θ) be a Newtonian space model and let B = (B, U, G, Θ) be a body reference space. Let $\Phi_0 \in \mathrm{Rgd}(B; S)$. Then the binary operation

Proof: The group operation is associative: The identity element is $\Phi_0$: giving the first part of the result.
For the second assertion, we note that compositions of isometries (resp. orientation-preserving maps) are again isometries (resp. orientation-preserving maps).

The group described in the preceding lemma we denote by $\mathrm{Rgd}_{\Phi_0}(B; S)$. The lemma establishes isomorphisms of this group with both Rgd(S) and Rgd(B). In this paper we shall focus on the former representation, although it is also possible to work with the latter.
In order to understand better the structure of a rigid transformation, we have the following result.
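2.8 Lemma: (Rigid transformations with respect to origins) Let S = (S, V, g, θ) be a Newtonian space model and let B = (B, U, G, Θ) be a body reference space. Let $\Phi \in \mathrm{Rgd}(B; S)$, and let $x_0 \in S$ and $X_0 \in B$. Then there exist unique $R_\Phi \in \mathrm{Isom}^+(B; S)$ and $r_\Phi \in V$ such that
$$\Phi(X) = x_0 + (r_\Phi + R_\Phi(X - X_0)), \qquad X \in B.$$
Moreover, $R_\Phi$ is uniquely determined by $\Phi$ and does not depend on $x_0$ or $X_0$.

Proof: First note that, for any $r \in V$ and $R \in \mathrm{Hom}_{\mathbb{R}}(U; V)$, the mapping
$$B \ni X \mapsto x_0 + (r + R(X - X_0)) \in S$$
is affine. Let us take $r_\Phi = \Phi(X_0) - x_0$ and $R_\Phi$ to be the linear mapping in Definition 2.5, and define $\Psi(X) = x_0 + (r_\Phi + R_\Phi(X - X_0))$. Note that $\Psi(X_0) = x_0 + r_\Phi = \Phi(X_0)$ and so, since $\Phi$ is affine with linear part $R_\Phi$,
$$\Psi(X) = \Phi(X_0) + R_\Phi(X - X_0) = \Phi(X), \qquad X \in B.$$

The result says that, upon a choice of body and spatial origin, one can think of a rigid transformation as a rotation followed by a translation.

Whenever we specify a body origin, we shall use this to assign $R_\Phi \in \mathrm{Isom}^+(B; S)$ to $\Phi \in \mathrm{Rgd}(B; S)$, and we shall do so without explicitly mentioning that this is done via the preceding lemma. If we additionally choose a spatial origin $x_0$, we shall assign $r_\Phi \in V$ to $\Phi \in \mathrm{Rgd}(B; S)$, again without making explicit mention that this is done by the lemma.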
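The dependence of the decomposition on the chosen origins can be illustrated concretely. The following is a minimal sketch, assuming that both B and S are identified with $\mathbb{R}^3$; the class `RigidTransformation`, its fields, and the numerical data are hypothetical and serve only as an illustration. It checks that the translation part $r_\Phi = \Phi(X_0) - x_0$ changes with the origins, while the reconstruction $x_0 + (r_\Phi + R_\Phi(X - X_0))$ always returns $\Phi(X)$.

```python
from dataclasses import dataclass
from typing import List, Sequence

Vec = Sequence[float]
Mat = Sequence[Sequence[float]]


def add(u: Vec, v: Vec) -> List[float]:
    return [a + b for a, b in zip(u, v)]


def sub(u: Vec, v: Vec) -> List[float]:
    return [a - b for a, b in zip(u, v)]


def matvec(R: Mat, u: Vec) -> List[float]:
    return [sum(a * b for a, b in zip(row, u)) for row in R]


@dataclass(frozen=True)
class RigidTransformation:
    """Affine map Phi(X) = Phi(X_ref) + R (X - X_ref) with rotation part R."""

    R: Mat              # linear part R_Phi (assumed to be a rotation matrix)
    X_ref: Vec          # any chosen point of the body reference space
    image_of_ref: Vec   # Phi(X_ref), a point of space

    def __call__(self, X: Vec) -> List[float]:
        return add(self.image_of_ref, matvec(self.R, sub(X, self.X_ref)))

    def translation(self, x0: Vec, X0: Vec) -> List[float]:
        """r_Phi = Phi(X0) - x0 for the chosen spatial and body origins."""
        return sub(self(X0), x0)


# Hypothetical data: a quarter turn about the third axis.
R = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
Phi = RigidTransformation(R, X_ref=[0.0, 0.0, 0.0], image_of_ref=[1.0, 2.0, 3.0])
X = [2.0, 0.0, 1.0]

# Two different choices of (spatial origin x0, body origin X0).
origin_choices = [
    ([0.0, 0.0, 0.0], [0.0, 0.0, 0.0]),
    ([5.0, -1.0, 0.0], [1.0, 1.0, 1.0]),
]
translations = []
reconstructions = []
for x0, X0 in origin_choices:
    r = Phi.translation(x0, X0)
    translations.append(r)
    # x0 + (r_Phi + R_Phi (X - X0)) should reproduce Phi(X) for every choice.
    reconstructions.append(add(x0, add(r, matvec(R, sub(X, X0)))))
```

The rotation part `R` is the same for both choices of origin, in agreement with the final assertion of the lemma.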
The preceding two lemmata can be combined to give the following description of the space of rigid transformations.
2.9 Lemma: (A concrete group representation for rigid transformations) Let S = (S, V, g, θ) be a Newtonian space model and let B = (B, U, G, Θ) be a body reference space.

Then the mapping
if the latter has the group operation
$$(R, r) \cdot (S, s) = (R \circ S, r + R(s)). \tag{2.3}$$
From this one directly computes that from which the lemma follows.
The preceding lemma says that, upon a choice of spatial and body origin, and of a reference configuration, the set of rigid transformations becomes the well-known semidirect product of rotations and translations.
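The semidirect product operation $(R, r) \cdot (S, s) = (R \circ S, r + R(s))$ can be checked against composition of maps. The following is a minimal sketch, assuming that space and body are both identified with $\mathbb{R}^3$ with origins at zero, so that a pair $(R, r)$ acts by $X \mapsto r + R X$; the names `compose`, `inverse`, `act`, and the sample elements are hypothetical.

```python
from typing import List, Tuple

Vec = List[float]
Mat = List[List[float]]
Element = Tuple[Mat, Vec]  # (R, r): rotation part and translation part


def matvec(A: Mat, u: Vec) -> Vec:
    return [sum(a * b for a, b in zip(row, u)) for row in A]


def matmul(A: Mat, B: Mat) -> Mat:
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(A: Mat) -> Mat:
    return [list(col) for col in zip(*A)]


def compose(g: Element, h: Element) -> Element:
    """Group operation (R, r) . (S, s) = (R S, r + R s)."""
    (R, r), (S, s) = g, h
    return matmul(R, S), [a + b for a, b in zip(r, matvec(R, s))]


def inverse(g: Element) -> Element:
    """Inverse (R, r)^{-1} = (R^T, -R^T r), using R^{-1} = R^T for rotations."""
    R, r = g
    Rt = transpose(R)
    return Rt, [-a for a in matvec(Rt, r)]


def act(g: Element, X: Vec) -> Vec:
    """Action of (R, r) on a point: X -> r + R X."""
    R, r = g
    return [a + b for a, b in zip(r, matvec(R, X))]


IDENTITY: Element = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]],
                     [0.0, 0.0, 0.0])

# Quarter turns about the third and first axes (entries are exact in floats).
Rz = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
Rx = [[1.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, 0.0]]
g: Element = (Rz, [1.0, 0.0, 0.0])
h: Element = (Rx, [0.0, 2.0, 0.0])
X = [1.0, 2.0, 3.0]

gh = compose(g, h)
applied_at_once = act(gh, X)           # act with the product
applied_in_turn = act(g, act(h, X))    # act with h, then with g
```

The translation parts do not simply add: the second translation is first rotated by `R`, which is what makes the product semidirect rather than direct and, in particular, noncommutative.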
Let us make the representation of rigid transformations even more concrete by the use of spatial and body frames.
2.10 Lemma: (Rigid transformations with respect to frames) Let S = (S, V, g, θ) be a Newtonian space model with spatial frame $(O_{\text{spatial}}, (s_1, s_2, s_3))$, let B = (B, U, G, Θ) be a body reference space with body frame $(O_{\text{body}}, (b_1, b_2, b_3))$, and let $\Phi_0 \in \mathrm{Rgd}(B; S)$ be defined by $r_0 = 0$ and $R_0(b_j) = s_j$, $j \in \{1, 2, 3\}$, where $r_0$ and $R_0$ are as in Lemma 2.9. Let $r_\Phi$ and $R_\Phi$ be as in Lemma 2.8, taking $x_0 = O_{\text{spatial}}$ and $X_0 = O_{\text{body}}$. Let $\boldsymbol{r}_\Phi \in \mathbb{R}^3$ be the vector of components of $r_\Phi$ with respect to the basis $(s_1, s_2, s_3)$ and let $\boldsymbol{R}_\Phi \in \mathrm{SO}(3)$ be the matrix of $R_\Phi$ with respect to the bases $(b_1, b_2, b_3)$ and $(s_1, s_2, s_3)$. Then the following statements hold:
(iii) the components of the vector $\Phi(X) - O_{\text{spatial}} \in V$ with respect to the basis $(s_1, s_2, s_3)$ are the components of the vector $\boldsymbol{r}_\Phi + \boldsymbol{R}_\Phi \boldsymbol{X}$, where $\boldsymbol{X}$ are the components of the vector $X - O_{\text{body}}$ with respect to the basis $(b_1, b_2, b_3)$.

Proof: (i) In the proof of Lemma 2.8 we showed that $r_\Phi = \Phi(O_{\text{body}}) - O_{\text{spatial}}$, and this immediately gives this part of the lemma.
and this part of the result follows directly from this.
(iii) As in Lemma 2.8 we have

We now have at hand three groups connected with the set of rigid transformations:
1. $\mathrm{Rgd}_{\Phi_0}(B; S)$ with the group operation of Lemma 2.7 that depends on a choice of $\Phi_0 \in \mathrm{Rgd}(B; S)$;
2. $\mathrm{SO}(V, g, \theta) \times V$ with the group operation (2.3);
3. $\mathrm{SO}(3) \times \mathbb{R}^3$ with the analogous group operation.

The preceding lemmata establish identifications of each of these groups with Rgd(B; S) by various choices, namely
1. a choice of $\Phi_0 \in \mathrm{Rgd}(B; S)$,
2. a choice of $x_0 \in S$, $X_0 \in B$, and $\Phi_0 \in \mathrm{Rgd}(B; S)$, or
3. the data from 2 plus a choice $(O_{\text{spatial}}, (s_1, s_2, s_3))$ of spatial frame and $(O_{\text{body}}, (b_1, b_2, b_3))$ of body frame,
respectively. The final, most concrete, representation of a rigid transformation is that which is most commonly used [e.g., Bullo and Lewis 2004, Murray, Li, and Sastry 1994]. However, there is substantial clarification in working with the more abstract representations, especially when we come to talk about degenerate rigid bodies in Section 4.1 below. Indeed, we shall not for the remainder of the paper make any essential reference to spatial or body frames.

2.3. Rigid motions. If we put things in motion, this requires time.

Definition
for some t 0 ∈ T and for some interval I ⊆ R.
•

Suppose that we have an affine space $A$ modelled on a finite-dimensional $\mathbb{R}$-vector space $V$ and a curve φ : T → A from a time interval T ⊆ T. The curve is differentiable at $t_0$ ∈ T if the limit
$$\lim_{t \to t_0} \frac{\phi(t) - \phi(t_0)}{t - t_0}$$
exists. Note that the limit, when it exists, is necessarily in $V$, and we denote this limit by $\dot{\phi}(t_0)$. Note that $\dot{\phi}$ and $\phi'$ are not the same thing, and the reader will want to keep this in mind at some points. If $\dot{\phi}$ : T → V is differentiable, then we denote
$$\ddot{\phi}(t_0) = \lim_{t \to t_0} \frac{\dot{\phi}(t) - \dot{\phi}(t_0)}{t - t_0}$$
when this limit exists. Again this limit is in $V$ when it exists, and when it exists we say that φ is twice differentiable at $t_0$.

Next we consider the situation when a rigid body is in motion.
2.12 Definition: (Rigid motion) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, and let T be a time axis. A rigid motion of B in S is a curve φ : T → Rgd(B; S) defined on a time interval T ⊆ T. •

As we saw in the preceding section, by a choice of spatial and body frame, we establish an identification $\mathrm{Rgd}(B; S) \simeq \mathrm{SO}(3) \times \mathbb{R}^3$. Using this identification, we can assign regularity to a rigid motion φ by considering its regularity when represented by a curve $\boldsymbol{\phi}$ in $\mathrm{SO}(3) \times \mathbb{R}^3$. Since two representations of Rgd(B; S) by $\mathrm{SO}(3) \times \mathbb{R}^3$ are related by a smooth diffeomorphism (as is easily seen), any notion of regularity up to infinitely differentiable will be independent of choices of spatial and body frames. Thus we shall freely speak of things like "differentiable rigid motions." The upshot of the preceding, of course, is that Rgd(B; S) is a smooth manifold that is diffeomorphic to $\mathrm{SO}(3) \times \mathbb{R}^3$ via any choice of spatial and body frames.
Let us make some observations about velocities associated with rigid motions.
2.13 Lemma: (Velocities associated with a rigid motion) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, and let T be a time axis with T ⊆ T a time interval. For a differentiable rigid motion φ : T → Rgd(B; S), and for $x_0 \in S$ and $X_0 \in B$, let $R_\phi$ : T → $\mathrm{Isom}^+(B; S)$ and $r_\phi$ : T → V be defined by $R_\phi(t) = R_{\phi(t)}$ and $r_\phi(t) = r_{\phi(t)}$, with $R_{\phi(t)}$ and $r_{\phi(t)}$ as in Lemma 2.8. Then the following statements hold:
(ii) $\dot{r}_\phi$ depends only on φ and $X_0$, and not on $x_0$.
Differentiating these relations with respect to t gives this part of the result.
(ii) Now suppose that $\tilde{x}_0$ and $\tilde{X}_0 = X_0$ are alternate origins for S and B. As we saw in the proof of Lemma 2.8, we have

The lemma has the following corollary that will be useful to us when representing velocities associated to rigid motions of a rigid body.
2.14 Corollary: (A convenient representation of the velocity of a rigid motion) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, and let T be a time axis with T ⊆ T a time interval. Let $X_0 \in B$ and $x_0 \in S$. Then there is an injective vector bundle mapping depending only on $X_0$ and satisfying $\tau_{X_0}(\dot{\phi}(t)) = (\phi(t), (\dot{R}_\phi(t), \dot{r}_\phi(t)))$ for every differentiable rigid motion φ : T → Rgd(B; S).
Proof: That the mapping $\tau_{X_0}$ as defined depends only on $X_0$ follows since $R_\phi$ depends only on φ (by Lemma 2.8) and since $\dot{r}_\phi$ depends only on $X_0$ (by Lemma 2.13). To see that $\tau_{X_0}$ is injective, let $\Gamma \colon \mathrm{Rgd}(B; S) \times B \to S$ be given by $\Gamma(\Phi, X) = \Phi(X)$ and, for $X \in B$, define $\Gamma_X \colon \mathrm{Rgd}(B; S) \to S$ by $\Gamma_X(\Phi) = \Gamma(\Phi, X)$. In like manner define so that $\dot{\phi}(t_0) \in \ker(T_{\phi(t_0)} \Gamma_X)$ for every $X \in B$. Now consider the identification $\Phi \mapsto \phi(t_0)^{-1} \circ \Phi$ of Rgd(B; S) with the group Rgd(B), along with the following commutative diagram: This identifies S with B, and Rgd(B; S) with Rgd(B) in such a way that the group action $\Gamma$ is identified with the group action $\Gamma_{\text{body}}$. This group action is effective, and so $\bigcap_{X \in B} \ker(T_{\mathrm{id}} \Gamma^{\text{body}}_X) = \{0\}$. This gives $\dot{\phi}(t_0) = 0$, giving injectivity of $\tau_{X_0}$.

As an application of the preceding result, let us indicate the nature of the lift to T(Rgd(B; S)) of the natural left- and right-actions of Rgd(S) and Rgd(B).
2.15 Lemma: (The lifts of the natural group actions of the space of rigid transformations) Let S = (S, V, g, θ) be a Newtonian space model and let B = (B, U, G, Θ) be a body reference space. Let $x_0 \in S$ and $X_0 \in B$. Consider the actions $A_{\text{spatial}}$ and $A_{\text{body}}$ of Rgd(S) and Rgd(B), respectively, on Rgd(B; S) (see Proposition 2.6). Then we have and

Proof: We first prove the spatial formula. Given the choices of origin, we represent $\Phi \in \mathrm{Rgd}(B; S)$ with $(R_\Phi, r_\Phi) \in \mathrm{Isom}^+(B; S) \times V$ and $\Psi \in \mathrm{Rgd}(S)$ with $(R_\Psi, r_\Psi) \in \mathrm{SO}(V, g, \theta) \times V$. We then have that $\Psi \circ \Phi \in \mathrm{Rgd}(B; S)$ is represented by $(R_\Psi \circ R_\Phi, r_\Psi + R_\Psi(r_\Phi))$, cf. equation (2.3). Differentiating with respect to $(R_\Phi, r_\Phi)$ gives the desired formula. Similarly, for the body formula, we represent ). Differentiating with respect to $(R_\Phi, r_\Phi)$ again gives the desired result.
We can now sensibly make the following definition.
2.16 Definition: (Rigid body velocities) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, let T be a time axis with T ⊆ T a time interval, and let x 0 ∈ S and X 0 ∈ B. For a rigid motion φ : We then make the following definitions.
(i) The spatial angular velocity for the motion is (ii) The body angular velocity for the motion is (iii) The spatial translational velocity for the motion is (iv) The body translational velocity for the motion is It is convenient to package these representations of velocity into vector bundle mappings and for a differentiable mapping φ : T → Rgd(B; S). We note that the definition of τ spatial requires a choice of both x 0 ∈ S and X 0 ∈ B, whereas the definition of τ body requires only a choice of X 0 ∈ B.
With the possible exception of the spatial translational velocity (for a discussion of which we refer to [Murray, Li, and Sastry 1994, page 55]), these velocity representations have simple physical characterisations. The body translational velocity is the velocity of the image of the body reference point X 0 relative to the spatial reference point x 0 , viewed from the body coordinate system. The direction of the spatial angular velocity ω(t) is the axis about which the body instantaneously rotates at time t, seen in the spatial frame, while the body angular velocity is the same thing, seen in the body frame as it is mapped by the motion into the spatial frame. A thoughtful discussion of angular velocity is undertaken by Crampin [1986].
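The relationship between the spatial and body angular velocities can be illustrated numerically. The sketch below assumes the common conventions ω̂ = Ṙ ∘ R T and Ω̂ = R T ∘ Ṙ (the defining formulae are not reproduced here, so these are assumptions), with orthonormal frames identifying both U and V with R 3 . The sample motion R(t) = A ∘ R z (t) and all helper names are hypothetical.

```python
import math

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(A):
    return [[A[j][i] for j in range(3)] for i in range(3)]

def mat_vec(A, v):
    return [sum(A[i][j] * v[j] for j in range(3)) for i in range(3)]

def rot_x(a):
    c, s = math.cos(a), math.sin(a)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_z(t):
    c, s = math.cos(t), math.sin(t)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def rot_z_dot(t):
    """Entrywise time derivative of rot_z(t)."""
    c, s = math.cos(t), math.sin(t)
    return [[-s, -c, 0.0], [c, -s, 0.0], [0.0, 0.0, 0.0]]

def vee(W):
    """Vector associated with a skew-symmetric 3x3 matrix."""
    return [W[2][1], W[0][2], W[1][0]]

# Sample motion (assumed): a fixed rotation A after a rotation about the z-axis.
A = rot_x(0.4)
t = 1.3
R = mat_mul(A, rot_z(t))
R_dot = mat_mul(A, rot_z_dot(t))

omega_hat = mat_mul(R_dot, transpose(R))   # spatial angular velocity (assumed convention)
Omega_hat = mat_mul(transpose(R), R_dot)   # body angular velocity (assumed convention)
omega = vee(omega_hat)
Omega = vee(Omega_hat)

# Check omega_hat = R Omega_hat R^T and omega = R Omega.
conjugated = mat_mul(mat_mul(R, Omega_hat), transpose(R))
hat_error = max(abs(omega_hat[i][j] - conjugated[i][j])
                for i in range(3) for j in range(3))
vec_error = max(abs(a - b) for a, b in zip(omega, mat_vec(R, Omega)))
print(Omega, omega, hat_error, vec_error)
```

For this motion the body angular velocity is the third basis vector, while the spatial angular velocity is its image under R, in keeping with the physical description above.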
To close this section, we prove a couple of useful identities that relate to angular velocity.
(ii) For u ∈ U and v ∈ V, using part (i), which gives From the definition of·, the result follows.

Rigid bodies
In this section we present a rigorous treatment of rigid bodies, and define the three attributes of a rigid body that are required to describe its dynamics.
3.1. Rigid bodies, mass, and centre of mass. Suppose that we have a body reference space B = (B, U, G, Θ). Note that a body frame gives us a representation of X ∈ B by for unique X 1 , X 2 , X 3 ∈ R. Thus we have, as with a spatial frame, an identification of B with R 3 , and so we have a topology for B induced by that for R 3 . Moreover, rotation and translation invariance of the topology for R 3 ensures that the topology for B is independent of the choice of body frame. This is also the metric topology for B associated with the metric defined by G, cf. (1.2). The point is that the "Borel σ-algebra" associated with B is meaningful, and we denote it by B(B).
The generality of the preceding definition allows us to consider particles as being special cases of rigid bodies. Indeed, a particle of mass m at X 0 ∈ B is, as a rigid body, for any B ∈ B(B). The generality of the definition also allows for other idealisations of rigid bodies, such as bodies occupying a line segment in B, i.e., infinitely thin rods. These sorts of constructions will lead us to a detailed consideration of degenerate rigid bodies in Section 4. For a rigid body, we define the following two notions.
3.2 Definition: (Mass, centre of mass) Let B = (B, U, G, Θ) be a body reference space and let (B, µ) be a rigid body in B.
(ii) The centre of mass of (B, µ) is X c (B, µ) = X 0 + (1/µ(B)) ∫_B (X − X 0 ) dµ(X) for X 0 ∈ B.
3.3 Remark: (Definition of centre of mass) The adding and subtracting of X 0 in the formula for the centre of mass is required since the integral is defined for real-valued functions, and so for finite-dimensional vector space-valued functions by choosing a basis. Thus the expression (1/µ(B)) ∫_B X dµ(X) is not defined, since the integrand is in an affine space. This is made sense of by subtracting, then adding, X 0 . Moreover, the integral so defined is independent of the choice of X 0 , as we shall shortly see. •
When no ambiguity can arise, we shall denote the mass and centre of mass by m and X c , respectively.
The introduction of a rigid body in a body reference space gives us an immediate physically meaningful origin, namely the centre of mass. We shall often make this choice without mention, particularly by using Corollary 2.14 to identify T(Rgd(B; S)) with a subset of Rgd(B; S) × (Hom R (U; V) ⊕ V).
The following lemma records some useful properties of the centre of mass.
3.4 Lemma: (Properties of centre of mass) Let B = (B, U, G, Θ) be a body reference space with (B, µ) a rigid body. The following statements hold: (i) the expression
Proof: (i) Let X̃ 0 ∈ B and compute (ii) By definition of X c and part (i), we have Now suppose that X̃ c ∈ B has the property that ∫_B (X − X̃ c ) dµ(X) = 0.
Then, by (i), from which we conclude that X̃ c = X c . (iii) If X c is on the relative boundary of conv(B) or not in B, then there exists a hyperplane P in B passing through X c such that there are points in B which lie on one side of P, but there are no points in B on the opposite side. In other words, there exists λ ∈ U * such that the set is empty. But this would imply that ∫_B λ(X − X c ) dµ(X) > 0, contradicting (ii).
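The independence of the centre of mass from the auxiliary point X 0 , and the vanishing of the first moment about X c , can be seen concretely for a body made of finitely many point masses. The following is a minimal sketch, assuming a body frame has been chosen so that points of B are triples of reals and µ is a finite sum of weighted Dirac measures; the function name `centre_of_mass` and the sample data are illustrative.

```python
def centre_of_mass(points, masses, X0):
    """X0 + (1/m) * sum_i m_i (X_i - X0) for a finite sum of point masses."""
    m = sum(masses)
    offset = [sum(mi * (Xi[k] - X0[k]) for Xi, mi in zip(points, masses)) / m
              for k in range(3)]
    return [X0[k] + offset[k] for k in range(3)]

# Illustrative body (assumed): four unit point masses.
points = [[0.0, 0.0, 0.0], [2.0, 0.0, 0.0], [0.0, 4.0, 0.0], [0.0, 0.0, 8.0]]
masses = [1.0, 1.0, 1.0, 1.0]

# Two different choices of the auxiliary point X0.
Xc_a = centre_of_mass(points, masses, [0.0, 0.0, 0.0])
Xc_b = centre_of_mass(points, masses, [5.0, -3.0, 2.0])

# First moment about the centre of mass: sum_i m_i (X_i - X_c).
first_moment = [sum(mi * (Xi[k] - Xc_a[k]) for Xi, mi in zip(points, masses))
                for k in range(3)]
print(Xc_a, Xc_b, first_moment)
```

Both choices of X 0 give the same point, and the first moment about that point is zero, matching parts (i) and (ii) of the lemma in this discrete setting.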

3.2. Inertia tensor.
In the preceding definition we introduced two of the three components of a rigid body required to determine its dynamical behaviour. The third is the following.
3.5 Definition: (Inertia tensor) Let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, and let X 0 ∈ B. The inertia tensor of (B, µ) about X 0 is the linear map I X 0 (B, µ) ∈ Hom R (U; U) defined by for u ∈ U. By I c (B, µ) we denote the inertia tensor about the centre of mass of (B, µ). •
When no ambiguity can arise, we shall denote the inertia tensor about X 0 by I X 0 and the inertia tensor about the centre of mass by I c .
An essential physical property of the inertia tensor is the following.
3.6 Lemma: (Symmetry and definiteness of the inertia tensor) For a body reference space B = (B, U, G, Θ), for a rigid body (B, µ) in B, and for X 0 ∈ B, I X 0 is symmetric and positive-semidefinite with respect to the inner product G.
Proof: We prove the symmetry of I X 0 as follows: using (1.1). That I X 0 is positive-semidefinite follows directly from (3.1), taking v 1 = v 2 = v.
From the lemma, we immediately deduce that I X 0 has real, nonnegative eigenvalues, which we call principal inertias, and a basis of eigenvectors, which we call principal axes. In Section 4 we shall carefully study the situation when some or all of the principal inertias are zero.
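Symmetry and positive-semidefiniteness can be checked directly for a body consisting of point masses. The sketch below assumes the standard formula I X 0 (u) = ∫_B (X − X 0 ) × (u × (X − X 0 )) dµ(X), which is an assumption here since the displayed definition is not reproduced; it also assumes an orthonormal body frame, so that G is the dot product. The names `inertia_apply` and `inertia_matrix` are hypothetical.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def inertia_apply(points, masses, X0, u):
    """Sum over point masses of m_i (X_i - X0) x (u x (X_i - X0)) (assumed formula)."""
    out = [0.0, 0.0, 0.0]
    for Xi, mi in zip(points, masses):
        r = [Xi[k] - X0[k] for k in range(3)]
        term = cross(r, cross(u, r))
        out = [out[k] + mi * term[k] for k in range(3)]
    return out

def inertia_matrix(points, masses, X0):
    """Matrix of the inertia tensor in the chosen orthonormal body frame."""
    basis = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    cols = [inertia_apply(points, masses, X0, e) for e in basis]
    return [[cols[j][i] for j in range(3)] for i in range(3)]

# Illustrative data (assumed).
points = [[1.0, 2.0, 0.0], [-1.0, 0.5, 3.0], [0.0, -2.0, 1.0]]
masses = [2.0, 1.0, 0.5]
X0 = [1.0, 0.0, 0.0]

I = inertia_matrix(points, masses, X0)
asymmetry = max(abs(I[i][j] - I[j][i]) for i in range(3) for j in range(3))

# G(I u, u) should be nonnegative for every u; we sample a few directions.
samples = [[1.0, 0.0, 0.0], [0.3, -1.2, 0.7], [-2.0, 1.0, 1.0]]
quadratic_values = [dot(inertia_apply(points, masses, X0, u), u) for u in samples]
print(I, asymmetry, quadratic_values)
```

Each sampled value of G(I X 0 (u), u) is a weighted sum of squared norms of u × (X − X 0 ), which is why it cannot be negative.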
3.3. Fibre semimetrics associated with a rigid body. The physical data of a rigid body can be conglomerated into one geometric object, which will also have spatial and body representations. In this section we introduce these geometric constructions.
Let us begin with the definition, noting that a Riemannian semimetric has all of the attributes of a Riemannian metric, except it is only positive-semidefinite rather than positive-definite.

•
The motivation for calling this the kinetic energy semimetric is given by Lemma 7.8 below.
Let us give a basic property of the kinetic energy semimetric.
Proof: Recall that the left action in question is denoted by A spatial and is defined by A spatial (Ψ, Φ) = Ψ ∘ Φ. Let Ψ ∈ Rgd(S) and note that by Lemma 2.15. We then directly compute the latter since R Ψ is g-orthogonal. These two formulae together show that G B is Rgd(S)-invariant.
Let us give the representation of G B in spatial and body velocity representations. Thus we define fibre semimetrics G spatial respectively. Let us give the explicit expression for these semimetrics.
Proof: Note that ω̂ = R Φ ∘ Ω̂ ∘ R T Φ . The result follows from applying this fact, along with Lemma 2.17, for the first equation and the fact that R Φ ∈ Isom + (B; S) for the second equation.

Rigid bodies with degenerate inertia tensors
One of the contributions of our approach to rigid body dynamics is that we carefully consider the situation of rigid bodies with degenerate inertia tensor, i.e., particles and infinitely thin rods. We shall see that, in such cases, the inertia tensor possesses an internal symmetry that we will use to perform a reduction of the space Rgd(B; S) of configurations of the rigid body. This, in turn, will be essential in our construction of the usual "kinetic energy" of a mechanical system that is essential in the formulation of the Euler-Lagrange equations.
4.1. Degeneracies of the inertia tensor and internal symmetry. Let us first consider the possible ways in which the inertia tensor can be degenerate.
4.1 Lemma: (Degenerate inertia tensors) Let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, and let X 0 ∈ B. Let I X 0 denote the inertia tensor of (B, µ) about X 0 . The following statements hold: (i) if I X 0 has a zero eigenvalue, then the other two eigenvalues are equal; (ii) if I X 0 has two zero eigenvalues, then I X 0 = 0.
Proof: (i) Let u be a unit eigenvector for the zero eigenvalue. We claim that the support of the measure µ must be contained in the line To see that this must be so, suppose that the support of µ is not contained in u . Then there exists a Borel set B ⊆ B \ u so that µ(B) > 0. This would imply that for all X ∈ B, and this would imply that I X 0 (u, u) > 0. But this contradicts u being an eigenvector with zero eigenvalue, and so the support of µ must be contained in the line u . To see that this implies that the other eigenvalues are equal, we shall show that any vector that is G-orthogonal to u is an eigenvector for I X 0 . First let (u 1 , u 2 ) be an orthonormal basis for the G-orthogonal complement to span(u) and write for all v ∈ U. Now recall the property of the cross-product that u × (v × u) = v provided v is orthogonal to u and that u has unit length. Therefore, we see that, for any v that is orthogonal to u, we have meaning that all such vectors v are eigenvectors with the same eigenvalue, which is what we wished to show.
(ii) It follows from our above arguments that, if the zero eigenvalue has multiplicity 2, then the support of µ must lie in the intersection of lines u 1 and u 2 for orthogonal eigenvectors for the zero eigenvalue. This intersection is a single point that must, therefore, be X 0 . From this and the definition of I X 0 it follows that I X 0 = 0.
Note that, in proving the result, we have proven the following corollary.
4.2 Corollary: (The "shape" of rigid bodies with degenerate inertia tensors) Let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, and let X 0 ∈ B.
Let I X 0 denote the inertia tensor of (B, µ) about X 0 . The following statements hold: (i) I X 0 has a zero eigenvalue if and only if B is contained in a line through X 0 ; (ii) if I X 0 has two zero eigenvalues then B = {X 0 }, i.e., B is a particle located at X 0 ; (iii) if there is no line through X 0 that contains the support of µ, then the inertia tensor is nondegenerate.
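The degenerate case of an infinitely thin rod can be illustrated numerically. The sketch below places point masses on a line through X 0 with unit direction u and checks that u is in the kernel of the inertia tensor, while vectors orthogonal to u are eigenvectors with a common eigenvalue. It assumes the standard formula I X 0 (v) = ∫_B (X − X 0 ) × (v × (X − X 0 )) dµ(X) and an orthonormal body frame; the function names and data are illustrative.

```python
def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def inertia_apply(points, masses, X0, v):
    """Sum over point masses of m_i (X_i - X0) x (v x (X_i - X0)) (assumed formula)."""
    out = [0.0, 0.0, 0.0]
    for Xi, mi in zip(points, masses):
        r = [Xi[k] - X0[k] for k in range(3)]
        term = cross(r, cross(v, r))
        out = [out[k] + mi * term[k] for k in range(3)]
    return out

# Illustrative rod (assumed): point masses at X0 + s*u for a unit vector u.
X0 = [1.0, -1.0, 0.5]
u = [1.0 / 3.0, 2.0 / 3.0, 2.0 / 3.0]
params = [-1.0, 0.5, 2.0]
masses = [2.0, 1.0, 3.0]
points = [[X0[k] + s * u[k] for k in range(3)] for s in params]

# Common eigenvalue expected on the orthogonal complement of u: sum_i m_i s_i^2.
lam = sum(m * s * s for m, s in zip(masses, params))

# Two unit vectors orthogonal to u and to each other.
v1 = [2.0 / 3.0, -2.0 / 3.0, 1.0 / 3.0]
v2 = [2.0 / 3.0, 1.0 / 3.0, -2.0 / 3.0]

Iu = inertia_apply(points, masses, X0, u)
Iv1 = inertia_apply(points, masses, X0, v1)
Iv2 = inertia_apply(points, masses, X0, v2)

kernel_error = max(abs(c) for c in Iu)
eig_error_1 = max(abs(Iv1[k] - lam * v1[k]) for k in range(3))
eig_error_2 = max(abs(Iv2[k] - lam * v2[k]) for k in range(3))
print(lam, kernel_error, eig_error_1, eig_error_2)
```

The common eigenvalue is the sum of m s² over the point masses, which follows from the identity u × (v × u) = v used in the proof of Lemma 4.1.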
With the preceding discussion undertaken, we make the following essential definition.

Examples: (Internal symmetry groups)
1. Let us consider a particle of mass m located at X 0 ∈ B as a rigid body, i.e., (B, µ) = Thus Φ must be a rotation about X 0 .
2. Next let us consider the case where (B, µ) is a rigid body, and not a particle, for which the support of µ is contained in a line for some nonzero u ∈ U. By Lemma 3.4(iii) we may take X 0 = X c (B, µ). We claim that i.e., I(B) consists of rotations about the u-axis fixing X c (B, µ). First suppose that Φ is such a mapping. Let (B′, µ′) have the same mass, centre of mass, and inertia tensor as (B, µ). Then the support of µ′ is contained in u by Corollary 4.2.
i.e., Φ is a rotation about u fixing X 0 .
3. Next we consider a rigid body (B, µ) with nondegenerate inertia tensor. Here we claim that , it is only the other inclusion that requires proof. Let I 1 , I 2 , I 3 ∈ R >0 be the principal inertias for (B, µ) and let {u 1 , u 2 , u 3 } be principal axes. We abbreviate m = µ(B) and so that r 1 , r 2 , and r 3 are distinct. Denote We define a measure µ′ with support equal to B′ by asking that µ′ be a sum of six Dirac measures: One readily verifies that (B′, µ′) has the same mass, centre of mass, and inertia tensor as and, since Φ(B′) = B′ and r 1 , r 2 , and r 3 are distinct, That is, Φ is a rotation fixing X c about the axes u 1 , u 2 , and u 3 . The only such rotation is the identity.
Next we consider the invariance of the inertia tensor under the internal symmetry group. To do this, we note that elements of the internal symmetry group are rotations about X c , and so we can regard I(B) as a Lie subgroup of SO(U, G, Θ). For Φ ∈ I(B) we shall denote by R Φ ∈ SO(U, G, Θ) the corresponding rotation. With this notation, we have the following result.
4.6 Lemma: (Invariance of inertia tensor under internal symmetries) Let B = (B, U, G, Θ) be a body reference space and let (B, µ) be a rigid body in B. For each Ψ ∈ I(B), . We then compute, using Lemma 4.5, as desired.
We then have the following more or less immediate consequence for the kinetic energy semimetric.
using Lemma 2.17, the fact that R Ψ ∈ SO(U, G, Θ), and Lemma 4.6. This is, by Lemma 2.15, exactly the assertion that G B is invariant under I(B).
The Lie subalgebra associated to the Lie subgroup I(B) we denote by i(B, µ). Just as we identify I(B) with a Lie subgroup of SO(U, G, Θ), we identify i(B, µ) with a Lie subalgebra of so(U, G, Θ). For ξ ∈ i(B, µ), we denote by Ω̂ ξ ∈ so(U, G, Θ) the associated skew-symmetric linear map, with Ω ξ ∈ U the associated vector. The following result gives a characterisation of the Lie algebra i(B, µ) in terms of the inertia tensor.
Proof: We prove this case-by-case, following Example 4.4. In the case when dim(i(B, µ)) = 0, i.e., when the inertia tensor is nondegenerate, then we have In the case when dim(i(B, µ)) = 3, then the body is a particle and the latter equality because a particle has the property that Finally, we consider the case dim(i(B, µ)) = 1. In this case, the measure µ has support contained in a line {X c + su | s ∈ R} for some unit vector u ∈ U, and I(B) consists of the rotations about this line. Thus However, as we saw in the proof of Lemma 4.1, we have ker(I c ) = span(u), giving the result.

4.2. Reduced configuration space.
In the development of our correspondence of the Newton-Euler equations with the Euler-Lagrange equations, we will be careful with degenerate inertia tensors. To do this, we will quotient by the action of the internal symmetry group, and in this section we consider this process for a single rigid body.
We consider a Newtonian space model S = (S, V, g, θ) and a body reference space B = (B, U, G, Θ), with (B, µ) a rigid body in B. As usual, I(B) denotes the internal symmetry group of the body, and this acts freely and properly on the right on Rgd(B; S). If we denote by S(V, g) the unit sphere in V, then we have the following result describing the orbit space of this group action.
4.9 Proposition: (Reduced configurations of a rigid body) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, and let (B, µ) be a rigid body in B. There is a one-to-one correspondence between Rgd(B; S)/I(B) and Moreover, the canonical projection π B : Rgd(B; S) → Rgd(B; S)/I(B) is a smooth surjective submersion.
Proof: As we have seen, if Ψ ∈ I(B) then Ψ(X c ) = X c . Thus Ψ is an orientation-preserving rotation about X c . Thus we can write for some r Φ ∈ V and R Φ ∈ Isom + (B; S). We now consider the three cases.
The first case we consider is when the inertia tensor is nondegenerate. In this case, I(B) is the identity subgroup, and the result follows by assigning to the orbit through Φ the Next we consider the case when I c is degenerate, but nonzero. Here, as we saw in Example 4.4-2, I(B) consists of rotations about some u ∈ U \ {0} fixing X c . We suppose that G(u, u) = 1. To Φ ∈ Rgd(B; S) we assign the element ( To show that this mapping is well-defined, we note that R Φ (u) ∈ S(V, g) since R Φ is a linear isometry. We also claim that the assignment establishes a bijective correspondence between Rgd(B; S)/I(B) and • Φ 1 fixes X c + au for every a ∈ R. This means that Φ −1 2 • Φ 1 ∈ I(B) and so Φ 1 and Φ 2 are in the same orbit of the action of I(B). To prove surjectivity of the correspondence, let (v, r) ∈ S(V, g) × V and let we see that the orbit of Φ is associated to (v, r) by our correspondence.
Finally, if I c is zero, then I(B) consists of all orientation preserving rotations about X c . In this case, to Φ ∈ Rgd(B; S) we assign r Φ ∈ V by asking that Φ(X c ) = x 0 + r Φ . First we show that the correspondence between Rgd(B; S)/I(B) and V is injective. Suppose that is a rotation about X c , giving that Φ 1 and Φ 2 are in the same orbit. For surjectivity of the correspondence, for r ∈ V we note that, if Φ ∈ Rgd(B; S) satisfies for some R ∈ Isom + (B; S), then the orbit of Φ is associated with r ∈ V.
The final assertion of the proposition follows from the following observation. By fixing x 0 ∈ S and X c ∈ B and by fixing orthonormal spatial and body frames, we have a smooth diffeomorphism of Rgd(B; S) with SO(3) × R 3 . The three cases of the internal symmetry groups then correspond to the subgroups and their standard right-actions. The assertion then follows from [Abraham and Marsden 1978, Corollary 4.1.21].
We remark that the correspondence of the proposition depends on the choice of x 0 ∈ S, as well as the natural fixing of the point X c ∈ B.
Let us make a couple of observations about the various degenerate rigid bodies.
4.10 Remarks: (Reduced configuration spaces for degenerate rigid bodies)
1. Let Trans(S) be the affine translation group of S. Thus an element of Trans(S) is a mapping of S of the form It is a simple matter to deduce that this action is well-defined and free. Moreover, in the case that I(B) ≃ SO(U, G, Θ), we have that this action is transitive. This implies that the reduced configuration space in this case is identified with the set of spatial affine translations. This corresponds to what we expect for the configurations of a particle. Moreover, this space is then realised in two ways, one as a quotient by the internal symmetry group of the body and the other as an inclusion of the spatial affine translation group. That this quotient and inclusion agree is a consequence of the fact that the rotations about a point in the body reference space are a normal subgroup of Rgd(B).
2. For rigid bodies with degenerate but nonzero inertia tensors, the situation is quite different from that for particles, in the sense that the reduced configuration space cannot be embedded as a submanifold of Rgd(B; S). To make this statement sensible (and correct), let us think for a moment about what reasonable properties such an embedding should have. Thus we assume that we have an embedding ι : Rgd(B; S)/I(B) → Rgd(B; S).
For such an embedding to be physically meaningful, it needs to interact with the fact that Rgd(B; S) corresponds to actual configurations of the body, as does the projection π B onto the reduced configuration space. Therefore, for the embedding ι to respect the physically meaningful data, the following diagram should commute: This, of course, means that ι is a section of π B . However, no continuous section of π B exists. This we argue as follows. First, let us make everything concrete by choosing spatial and body frames so that what we have is a continuous section of By restriction, this implies that we have a continuous section of SO(3) → S 2 . By [Husemoller 1994, Corollary 4.8.3] this implies that the bundle SO(3) → S 2 is trivial. This would imply that SO(3) is homeomorphic to S 2 × S 1 . This, however, is not the case, since the fundamental group of SO (3)  .14]. This shows that the reduction process we use in the paper for degenerate bodies cannot be replaced in a rational and general way by using submanifolds of Rgd(B; S).
•
According to the previous discussion, we make the following definition.
4.12 Lemma: (Vertical bundle associated to degenerate inertia tensor)
Proof: First we note that Next let Φ ∈ Rgd(B; S), let ξ ∈ i(B, µ), and denote (with abuse of notation) by e^{Ω̂ ξ t} the element of Rgd(B) consisting of a rotation by e^{Ω̂ ξ t} about X c , i.e., the map Then compute noting that the derivative of an affine map is its linear part. Combining the preceding two calculations gives the lemma.
Next we note that mere differentiation of the surjective submersion gives the surjective vector bundle mapping T π B : T(Rgd(B; S)) → T(Rgd(B; S)/I(B)).
To give an alternative characterisation of this mapping, we first note that I(B)-invariance of G B (by Corollary 4.7) means that V(Rgd(B; S)) is I(B)-invariant by Lemma 4.12. Thus we have an action of I(B) on the quotient of T(Rgd(B; S)) by V(Rgd(B; S)) by is an isomorphism of smooth vector bundles over the identity on Rgd(B; S)/I(B).
Proof: First let us show that the mapping is well-defined. Suppose that (Φ 1 , (A 1 , v 1 )) and By the definition (4.1) of the action of I(B) on the quotient vector bundle, it follows that and this gives the well-definedness of β B . It is clear from the definition that β B preserves the fibres of the two vector bundles and is linear on each fibre. We can conclude that β B is smooth by the following argument. The mapping is smooth, being the composition of the quotient by a smooth subbundle followed by the quotient by a free and proper smooth group action. Now, to show that β B is smooth, it suffices to show that β * B f is smooth for every smooth function f on (T(Rgd(B; S))/V(Rgd(B; S)))/I(B). Let f be such a function and note that, since T π B is a surjective submersion, one can easily show that , and from this we conclude that β B is indeed smooth. Now, by [Abraham, Marsden, and Ratiu 1988, Proposition 3.4.12], it follows that β B is a smooth vector bundle map over the identity on Rgd(B; S)/I(B). To show that β B is injective, suppose that Then [(Φ, (A, v)) + V(Rgd(B; S))] = 0 which means that gives injectivity of β B . Finally, to show that β B is surjective, we need only note that every point in (T(Rgd(B; S))/V(Rgd(B; S)))/I(B), by definition, has the form [(Φ, (A, v)) + V(Rgd(B; S))] for some (Φ, (A, v)) ∈ T(Rgd(B; S)).
Now we consider how the reduction of the set of configurations manifests itself in our velocity representations by the mappings τ spatial and τ body . We first give the lifted action of I(B) on T(Rgd(B; S)) under the spatial and body representations of velocity. To do so, we fix a point x 0 ∈ S and X c ∈ B (which we need to do to define τ spatial in any case) so that we have a diffeomorphism under this diffeomorphism, the lifted action to the tangent bundle, under this representation, then has the form by Lemma 2.15. We then compute using Lemma 2.17. We denote these actions on by α T spatial and α T body , respectively.
We now introduce a rigid body (B, µ) and define At (R, r) ∈ Isom + (B; S) × V, the fibres of these subbundles are the degenerate spatial velocities and body velocities, respectively, for the body. These subbundles are I(B)-invariant.
4.14 Lemma: (Invariance of subbundles of zero inertia) The subbundles Z spatial and Z body are invariant under the right-actions α T spatial and α T body of I(B). Proof: This follows from the following two computations: twice using Lemma 4.6.
Note, then, that the quotient bundles inherit the right-actions α T spatial and α T body of I(B) by (with an abuse of notation) (Ω, V )), S) + Z body . We then have the following result that describes the spatial and body representations of velocity when the rigid body has degenerate inertia tensor.
4.15 Proposition: (Reduced rigid body velocities with degenerate inertia) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, and let (B, µ) be a rigid body in B. Then the mappings and τ red body : are vector bundle isomorphisms.
Proof: We start with the following data:
1. the vector bundles T(Isom + (B; S) × V), Isom + (B; S) × V × (V ⊕ V), and Isom + (B; S) × V × (U ⊕ U), along with the vector bundle isomorphisms τ spatial and τ body ;
2. the right-actions α T B , α T spatial , and α T body of I(B);
3. the Riemannian semimetric G B on Isom + (B; S) × V of Definition 3.7;
4. the fibre semimetric G spatial B given in (3.2);
5. the fibre semimetric G body B given in (3.3).
We then make the following observations:
6. the vector bundle isomorphisms τ spatial and τ body are equivariant (by the constructions (4.2) and (4.3));
7. the vector bundle isomorphisms τ spatial and τ body (and compositions of these) map the semimetrics above onto one another (by construction);
8. V(Isom + (B; S) × V) = ker(G B ) (by Lemma 4.12);
9. Z spatial = ker((G spatial B ) ) and Z body = ker((G body and the projection T π B is just the natural projection of these two quotient operations (by Lemma 4.13).
An absorption of the preceding facts leads one to conclude that the three vector bundles are obtained by starting with the three isomorphic vector bundles and then applying the construction of first quotienting by the kernel of a fibre semimetric and then quotienting by an action of I(B). Moreover, the three semimetrics are obtained from one another by transferring by the corresponding vector bundle isomorphisms, and the vector bundle isomorphisms are equivariant with respect to the group actions.
We comment that there is a fundamental difference in the mappings τ spatial and τ body , and τ red spatial and τ red body , respectively. 4.16 Remark: (The nature of reduced rigid body velocities) If a rigid body has a nondegenerate inertia tensor, then τ red spatial = τ spatial and τ red body = τ body . In this case, these vector bundle isomorphisms provide trivialisations of the tangent bundle of Rgd(B; S)/I(B): If the rigid body has zero inertia tensor, then, while the reduced velocity mappings τ red spatial and τ red body are no longer equal to their unreduced counterparts, they are still trivialising as, upon quotienting by the internal symmetry group, they essentially take the form However, when the inertia tensor is nonzero but degenerate, then the reduced velocity mappings are no longer trivialising, as indeed the tangent bundle to the reduced configuration space is no longer trivialisable by Proposition 4.9(ii) and the Hairy Ball Theorem [Abraham, Marsden, and Ratiu 1988, Theorem 7.5.13]. This means that we do not have a convenient notational device in this case for separating out the spatial and body angular and translational velocities.
The Riemannian metric G 0,B we call the kinetic energy metric for the body. Of course, one also has spatial and body versions of G 0,B as fibre metrics on the vector bundles that we denote by G spatial 0,B and G body 0,B , respectively. Explicitly, and

Interconnected rigid body systems: kinematics
Next we turn our attention to a general class of mechanical systems derived from interconnections of rigid bodies. What we do, essentially, is give a multibody version of the development of Sections 3 and 4, along with an introduction of constraints on configurations that arise from interconnections. In this section we also consider constraints on the velocities of the system. We consider throughout the effects of degenerate inertia tensors for the rigid bodies of the system, and much of the development is concerned with the technicalities required to do this.

5.1. Configuration manifold.
We consider a Newtonian space model S = (S, V, g, θ) and a finite collection B a = (B a , U a , G a , Θ a ), a ∈ {1, . . . , m}, of body reference spaces. In each of the body reference spaces B a , a ∈ {1, . . . , m}, we have a body (B a , µ a ). The masses, centres of mass, and inertia tensors about the centre of mass of these bodies we denote by m a , X c,a , and I c,a , a ∈ {1, . . . , m}.
We first consider the artificial situation where all bodies may move freely in space, even allowing that multiple bodies occupy the same physical space.
5.1 Definition: (Free configuration manifold) Let S be a Newtonian space model, let B a , a ∈ {1, . . . , m}, be body reference spaces, and let (B a , µ a ), a ∈ {1, . . . , m}, be rigid bodies. The free configuration manifold for the system is Q free = ∏_{a=1}^{m} Rgd(B a ; S). •
We next wish to consider sets of admissible interconnections of the bodies (B 1 , . . . , B m ) in physical space.
(ii) A physical configuration space is a subset P ⊆ ∏_{a=1}^{m} 2^S of physical configurations. A point in P is called an admissible physical configuration.
(iii) The configuration space associated with a physical configuration space P is the subset Q ⊆ Q free defined by A point in Q is called an admissible configuration. •
In general, the set of admissible configurations may be quite complicated. However, in this paper we shall consider the following situation, which is the assumption commonly (if tacitly) made in treatments at the level we are considering.

Definition: (Interconnected rigid body system) An interconnected rigid body system consists of the following data:
(i) a Newtonian space model S = (S, V, g, θ);
(ii) body reference spaces B a = (B a , U a , G a , Θ a ) with rigid bodies (B a , µ a ), a ∈ {1, . . . , m};
(iii) a physical configuration space P for which the associated configuration space Q is a smooth immersed submanifold of Q free , called the configuration manifold, i.e., Q is the image of a smooth immersion i : Q → Q free , and has the differentiable structure induced by this immersion. •
The preceding definition certainly captures the setting we wish to develop in this paper, by which we mean the "usual" framework for geometric mechanics. Indeed, it is quite general in this regard, since we allow for configuration manifolds that are merely immersed: generally one supposes that Q is embedded, but an example where it is immersed is given by Bullo and Lewis [2004, Exercise 4.2]. However, one can certainly do mechanics in frameworks more general than this, including what is normally referred to as "nonsmooth mechanics" where the set Q of admissible configurations may have boundaries or other nonsmooth phenomena [Brogliato 1999, Kanno 2011].

Notation: (Configuration manifold)
1. We shall at times simply use words like "an interconnected rigid body system with configuration manifold Q," with the understanding that all of the other data needed to give meaning to these words is tacitly present.
2. Note that a point in Q is an element of Q free . We shall thus denote a point in Q by Φ = (Φ 1 , . . . , Φ m ). and we note that I(B) is a subgroup of ∏_{a=1}^{m} Rgd(B a ). This subgroup acts freely and properly on the right on Q free with an action that we denote by α B and define by We call I(B) the internal symmetry group of the bodies B.
With the preceding constructions, we make the following definition.

Definition: (Reduced free configuration manifold) Let S be a Newtonian space model, let B a , a ∈ {1, . . . , m}, be body reference spaces, and let (B a , µ a ), a ∈ {1, . . . , m}, be rigid bodies. The reduced free configuration manifold for the system is Q 0,free = Q free /I(B). By π 0 : Q free → Q 0,free we denote the canonical projection. •
Of course, the reduced configuration manifold will be a product of manifolds as prescribed by Proposition 4.9, i.e., a product of components, with the ath component being diffeomorphic to one of (1) Isom + (B; S) × V (when I c,a is nondegenerate), (2) S(V, g) × V (when I c,a is degenerate but nonzero), or (3) V (when I c,a is zero).
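The product structure just described makes it easy to count degrees of freedom. The sketch below tallies the dimension of the reduced free configuration manifold from the type of each body, using dim(Isom + (B; S) × V) = 6, dim(S(V, g) × V) = 5, and dim(V) = 3; the labels `"nondegenerate"`, `"degenerate_nonzero"`, and `"zero"`, and the function name, are hypothetical.

```python
# Dimension of each factor of the reduced free configuration manifold,
# keyed by the type of the inertia tensor about the centre of mass.
COMPONENT_DIMENSION = {
    "nondegenerate": 6,        # Isom+(B; S) x V: 3 rotational + 3 translational
    "degenerate_nonzero": 5,   # S(V, g) x V: a direction on the sphere + 3 translational
    "zero": 3,                 # V: a particle has only translational freedom
}

# Dimension of the internal symmetry group of each type of body.
SYMMETRY_DIMENSION = {"nondegenerate": 0, "degenerate_nonzero": 1, "zero": 3}

def reduced_free_dimension(kinds):
    """Dimension of the product of the reduced components, one per body."""
    return sum(COMPONENT_DIMENSION[kind] for kind in kinds)

# Illustrative system (assumed): one generic body, one thin rod, two particles.
kinds = ["nondegenerate", "degenerate_nonzero", "zero", "zero"]
free_dim = 6 * len(kinds)
symmetry_dim = sum(SYMMETRY_DIMENSION[kind] for kind in kinds)
reduced_dim = reduced_free_dimension(kinds)
print(free_dim, symmetry_dim, reduced_dim)
```

The count is consistent with the quotient construction: the dimension of Q free minus the dimension of the internal symmetry group equals the dimension of the reduced space.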
Let us make a fundamental observation about the rôle of the internal symmetry group in the most general setting for configurations of rigid bodies. Proof: This follows since, for any a ∈ {1, . . . , m}, if Ψ a ∈ I(B a ) then Ψ a (B a ) = B a , this by Lemma 4.5.
By the lemma, we can make the following definition. 5.7 Definition: (Reduced configuration space) Let S be a Newtonian space model, let B = (B 1 , . . . , B m ) be rigid bodies in body reference spaces B 1 , . . . , B m , and let P be a physical configuration space with Q the corresponding configuration space. The reduced configuration space is Q 0 = Q/I(B). • The following result relates this subgroup to our framework of configuration spaces. Motivated by the lemma, we make the following definition.

5.9
Definition: (Reduced configuration manifold) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and with configuration manifold Q. The reduced configuration manifold is Q 0 = Q/I(B). By π 0 : Q → Q 0 we denote the canonical projection (accepting a slight abuse of notation). •

Note that the obvious equivariance of the inclusion of Q in Q free gives the following commuting diagram:

$$\begin{array}{ccc} Q & \hookrightarrow & Q_{\mathrm{free}} \\ {\scriptstyle \pi_0} \downarrow & & \downarrow {\scriptstyle \pi_0} \\ Q_0 & \hookrightarrow & Q_{0,\mathrm{free}} \end{array} \tag{5.1}$$

That is to say, there is a natural inclusion of Q 0 in Q 0,free . This will be important to us in our development of the Euler–Lagrange equations from the Newton–Euler equations in Section 8.

Motion of interconnected rigid body systems.
We next consider the velocity representations from Section 2.3 adapted to interconnected rigid body systems. We thus consider an interconnected rigid body system comprised of m rigid bodies and with configuration manifold Q.
First we consider the free motions of the system, i.e., without imposing the requirement that motions remain in Q. By taking the centre of mass for each body as an origin for its body reference space we establish, by a repeated application of the constructions of Corollary 2.14, a vector bundle monomorphism defined by requiring that the restriction to T(Rgd(B a ; S)) be the map τ Xc,a of Corollary 2.14 for each a ∈ {1, . . . , m}.
In like manner, a repeated application of Lemma 2.13 and Definition 2.16 gives natural vector bundle isomorphisms defined by requiring that the restriction to T(Rgd(B a ; S)) be the maps τ spatial,a and τ body,a for each a ∈ {1, . . . , m}. Note that the definition of τ spatial requires, not just a prescription of the centres of mass for all bodies as body origin, but also the choice of a spatial origin x 0 ∈ S. However, this choice of spatial origin is not required for the construction of τ body .
Finally, still considering free motions of our rigid bodies, we have reduced velocity mappings, which are vector bundle isomorphisms where VQ free = ker(T π 0 ) is the vertical bundle for the free configuration space as a principal bundle over the reduced free configuration space.

Now it is an easy matter to include interconnections into the preceding development. Indeed, since Q ⊆ Q free , we have TQ ⊆ TQ free . We can thus restrict the vector bundle monomorphism τ c to a vector bundle monomorphism with domain TQ and codomain V c . We can also restrict the vector bundle isomorphisms τ spatial and τ body to vector bundle monomorphisms with domain TQ, the codomains being the restricted vector bundles V spatial |Q and V body |Q, respectively. If the bodies (B a , µ a ), a ∈ {1, . . . , m}, possess the internal symmetry group I(B), then we have the inclusion TQ 0 ⊆ TQ 0,free . Thus the vector bundle isomorphisms τ red spatial and τ red body restrict to vector bundle monomorphisms with domain TQ 0 , the codomains being the restricted vector bundles V red spatial |Q 0 and V red body |Q 0 , respectively. Note also that the following diagram commutes:

While for TQ free we have convenient trivialisations τ spatial and τ body , we do not generally have such trivialisations for the reduced velocities, cf. Remark 4.16.
When additional brevity is needed, we may sometimes write such a point simply as (Φ, (A, v)), accepting a mild abuse of notation in this case. (i) a motion for the system is a curve φ : T → Q whose domain is a time interval in a time axis T; (ii) the reduced motion associated to a motion φ : For a differentiable motion φ : T → Q, we can define corresponding spatial velocities ω φa : T → V and v φa : T → V, a ∈ {1, . . . , m}, by requiring that and body velocities Ω φa : T → U a and V φa : T → U a , a ∈ {1, . . . , m}, by requiring that , . . . , (Ω φm (t), V φm (t)))), t ∈ T , using the notation of Definition 2.16. One also has velocities for the reduced motion, and spatial and body representations of these. These are defined by Since these have no particularly convenient representation in general, we do not attempt to give expressions for these.

Riemannian semimetrics and metrics for interconnected rigid body systems.
It is a relatively straightforward matter to adapt the constructions of Sections 3.3 and 4.4 to the situation where we have interconnections. In this section we perform the more or less obvious constructions and give the resulting notation we shall use. We shall make a few statements in this multibody setting that we shall not prove, since in all cases these follow easily from single body analogues.

We let S = (S, V, g, θ) be a Newtonian space model, let B a = (B a , U a , G a , Θ a ), a ∈ {1, . . . , m}, be body reference spaces with bodies (B a , µ a ), a ∈ {1, . . . , m}. By applying the construction of Definition 3.7 to each component of the product Q free , we obtain a Riemannian semimetric on Q free that we denote by G B . Explicitly,

$$G_B((A_1, v_1), (A_2, v_2)) = \sum_{a=1}^{m} G_{B_a}((A_{1,a}, v_{1,a}), (A_{2,a}, v_{2,a})), \tag{5.2}$$

for (A 1 , v 1 ), (A 2 , v 2 ) ∈ T Φ Q free and Φ ∈ Q free . This has spatial and body representations, obtained by applying the constructions of (3.2) and (3.3) componentwise, as fibre semimetrics on the vector bundles V spatial and V body . These fibre semimetrics we denote by G spatial B and G body B , respectively. Explicitly, for (A 1 , v 1 ), (A 2 , v 2 ) ∈ T Φ Q free and Φ ∈ Q free .

By Corollary 4.7, G B is invariant under the right action of I(B) on Q free . Thus, just as in Lemma 4.17, the Riemannian semimetric G B descends to a Riemannian metric on Q 0,free . We denote this Riemannian metric by G 0,B . Explicitly, This has spatial and body representations as a fibre metric on the vector bundles V red spatial and V red body that we denote by G spatial 0,B and G body 0,B , respectively. Explicitly, and for (A 1 , v 1 ), (A 2 , v 2 ) ∈ T Φ (Q free ) and Φ ∈ Q free .

Now we consider the effects of interconnections. Thus we suppose that, along with the above data, we have a set of admissible configurations defining an interconnected rigid body system with configuration manifold Q. Simply by restriction, the Riemannian semimetric G B induces a Riemannian semimetric on Q.
We shall denote this Riemannian semimetric by G. Since G B and Q are I(B)-invariant, the Riemannian semimetric G on Q descends to a Riemannian metric G 0 on Q 0 . This Riemannian metric agrees with the restriction of the Riemannian metric on Q 0,free to Q 0 by virtue of the inclusion defined by the commuting diagram (5.1).
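To see numerically how a form assembled body by body can fail to be positive-definite, the following sketch sums over the bodies the familiar kinetic-energy expression g(I ω 1 , ω 2 ) + m g(v 1 , v 2 ). This expression is assumed here as a stand-in for Definition 3.7, which is not reproduced in this section, and the function names and sample bodies are hypothetical. The second body is a thin rod, so spinning it about its own axis is a nonzero velocity of zero length.

```python
import numpy as np

def body_form(inertia, mass, w1, v1, w2, v2):
    """Assumed single-body form <I w1, w2> + m <v1, v2> (spatial representation)."""
    return float(w1 @ inertia @ w2 + mass * (v1 @ v2))

def system_form(bodies, vel1, vel2):
    """Sum the single-body forms over all bodies, mirroring the product structure."""
    return sum(
        body_form(I, m, w1, v1, w2, v2)
        for (I, m), (w1, v1), (w2, v2) in zip(bodies, vel1, vel2)
    )

# Body 1: nondegenerate inertia. Body 2: thin rod along e3 (degenerate, nonzero inertia).
bodies = [
    (np.diag([2.0, 3.0, 4.0]), 1.5),
    (np.diag([1.0, 1.0, 0.0]), 0.5),
]
e1 = np.array([1.0, 0.0, 0.0])
e2 = np.array([0.0, 1.0, 0.0])
e3 = np.array([0.0, 0.0, 1.0])
zero = np.zeros(3)

# Inertialess motion: body 1 at rest, the rod spinning about its own axis.
spin = [(zero, zero), (e3, zero)]
spin_energy = system_form(bodies, spin, spin)

# A generic velocity has strictly positive length.
generic = [(e1, e2), (e3, e1)]
generic_energy = system_form(bodies, generic, generic)
```

The vanishing of `spin_energy` for a nonzero velocity is the reason one speaks of a semimetric on the unreduced space, and only of a metric after quotienting by the internal symmetry.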

5.5.
Velocity constraints for interconnected rigid body systems. We will consider velocity constraints for systems of interconnected rigid bodies that model physical phenomena like rolling contact. We do so by first introducing the notion of a constraint on the motion of a single rigid body. This will allow us to give a physically meaningful construction of velocity constraints by amalgamating such constraints over the various bodies of an interconnected rigid body system.
Thus we start with the notion of a velocity constraint for a rigid body.

5.12
Definition: (Primary velocity constraint, constraint subspace) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, and let (B, µ) be a rigid body in B. Let Φ ∈ Rgd(B; S) and X 0 ∈ B.
The constraint subspace associated to a primary velocity constraint C Φ,X 0 is the subspace D Φ,X 0 ⊆ T Φ (Rgd(B; S)) given by using the identification of Corollary 2.14 with respect to X c . •

Let us give the physical meaning of the preceding mathematical definitions. First, let us consider a primary velocity constraint C Φ,X 0 and let (ω, v) ∈ C Φ,X 0 . We should think of v as being a possible spatial translational velocity of the point Φ(X 0 ) and ω as being a possible spatial angular velocity of the body Φ(B) about the point Φ(X 0 ). Thus a primary velocity constraint is a constraint on the velocities of the body about the reference point Φ(X 0 ). The condition (5.4) means that we should not constrain the inertialess motions of the body.

Now let us understand the meaning of the constraint subspace associated with a primary velocity constraint. Let us suppose that we have a differentiable rigid motion φ : T → Rgd(B; S) such that φ(t 0 ) = Φ. Then the spatial motion of the point X 0 in the body is given by Differentiating this expression at t 0 gives If we require that using the identification of T Φ (Rgd(B; S)) from Corollary 2.14. The preceding definition gives us a precise way to transfer the physical data of how one might typically prescribe a primary velocity constraint into a subspace tangent to the configuration space of the rigid body.

Now we present the way in which one makes these definitions for interconnections of rigid bodies.
(i) A physical velocity constraint is an assignment, to each Φ = (Φ 1 , . . . , Φ m ) ∈ Q, of the following data:
(a) for each a ∈ {1, . . . , m}, k Φ,a ∈ Z ≥0 satisfying k α B,Ψ (Φ),a = k Φ,a for every Ψ ∈ I(B);
(b) for each a ∈ {1, . . . , m} and j ∈ {1, . . . , k Φ,a }, a point X Φ,a,j ∈ B a satisfying X α B,Ψ (Φ),a,j = X Φ,a,j for every Ψ ∈ I(B);
(c) for each a ∈ {1, . . . , m} and j ∈ {1, . . . , k Φ,a }, a primary velocity constraint C Φ,a,j for B a at (Φ a , X Φ,a,j ) satisfying C α B,Ψ (Φ),a,j = C Φ,a,j for every Ψ ∈ I(B).
(ii) For Φ ∈ Q, a ∈ {1, . . . , m}, and j ∈ {1, . . . , k Φ,a }, the constraint subspace is the subspace D Φ,a,j ⊆ T Φ Q free given by The constraint distribution associated to a physical velocity constraint is the assignment, for each Φ ∈ Q, of the subspace D Φ ⊆ T Φ Q given by The physical meaning of this mathematical definition is the natural adaptation of that above for single rigid bodies, allowing for the facts that (1) each body may have primary velocity constraints applied at multiple points and (2) the allowable velocities for the ath body might depend on the configurations of the other bodies. The constraint distribution is the set of velocities satisfying all constraints, as well as the constraint of remaining in Q.
The following basic attribute of constraints will allow us to reduce these by the internal symmetry group.

Lemma: (Invariance of constraint distributions under internal symmetry)
Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and with configuration manifold Q. If we are given physical velocity constraints giving rise to a constraint distribution D ⊆ TQ, then D is invariant under the right action of I(B) in the sense that T Φ α B,Ψ (D Φ ) = D α B,Ψ (Φ) for every Φ ∈ Q and Ψ ∈ I(B), where α B denotes the right action of I(B).

Proof: Since I(B) acts by rotations about the centres of mass of the bodies, we have
by Lemma 2.15. Thus the result follows from (1) the following computations: a ∈ {1, . . . , m}, j ∈ {1, . . . , k Φ,a }, whenever along with (2) the fact that C α B,Ψ (Φ),a,j = C Φ,a,j . In the second of the above computations we have used Lemma 4.5.
Generally speaking, of course, a constraint distribution as prescribed above will have no nice properties; indeed, one must ask that all subspaces "fit together" in a nice way. This is done according to the following definition. (i) The velocity constraint is cosmooth if, for every Φ 0 ∈ Q, there exists a neighbourhood U of Φ 0 and smooth one-forms (α a ) a∈A on U such that for every Φ ∈ U.
(ii) The velocity constraint has locally constant rank if it is cosmooth and the function It is valid to speculate on why we ask that the annihilating codistribution for a constraint distribution be smooth, and not the constraint distribution itself. The reason for this is that, in practice, it is the constraint forces that vary smoothly, not the spaces of admissible velocities. We shall look at constraint forces in Section 6.7. In this paper we shall only consider locally constant rank velocity constraints, and in this case cosmoothness of the velocity constraint is equivalent to smoothness of the constraint distribution itself. As far as we are aware, there does not exist a satisfactory physical theory for velocity constraints that do not have locally constant rank, although there do exist physical systems with constraints of this type.
Note that there are many interesting attributes of constraint distributions that can be discussed using differential geometric techniques, e.g., integrability in the sense of Frobenius and the Chow–Rashevsky Theorem. These, however, belong more properly to the subject of geometric mechanics itself, and not as much to the physical foundations such as interest us here. We refer to books such as [Bloch 2003] and [Bullo and Lewis 2004, §4.5] for details on these and other topics concerning "nonholonomic constraints" in geometric mechanics.
Finally, let us indicate how constraint distributions descend to distributions on the reduced configuration space.

Lemma: (Constraint distributions descend to the reduced configuration space)
Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and with configuration manifold Q. Suppose that we have physical velocity constraints giving rise to the constraint distribution D. If we denote D 0,π 0 (Φ) = T Φ π 0 (D Φ ), then the following statements hold: (i) D 0 is well-defined, i.e., it depends only on π 0 (Φ) and not on Φ; (ii) if D is cosmooth, then so is D 0 ; (iii) if D has locally constant rank, then so does D 0 .
Proof: (i) This follows at once from I(B)-invariance of D, along with the fact that π 0 = π 0 ∘ α B,Ψ for every Ψ ∈ I(B).
(ii) Since π 0 : Q → Q 0 is a principal I(B)-bundle, we may place on it a smooth connection [Kobayashi and Nomizu 1963, Theorem II.2.1]. This gives a decomposition TQ = VQ ⊕ HQ for some complement HQ of VQ = ker(T π 0 ). We then have a corresponding decomposition T * Q = V * Q ⊕ H * Q of the cotangent bundle. For Φ ∈ Q and Φ 0 = π 0 (Φ), we have isomorphisms Let us abbreviate the second of these maps by i Φ . Now let Φ 0 ∈ Q 0 , let U ⊆ Q 0 be a neighbourhood of Φ 0 , and let σ : U → Q be a local section of π 0 : Q → Q 0 . Let Φ = σ(Φ 0 ), let V ⊆ Q be a neighbourhood of Φ, and let (α a ) a∈A be one-forms on V such that for each Φ ∈ V. Suppose that U is sufficiently small that σ(U) ⊆ V. For a ∈ A, let α h a be the projection of α a onto H * Q, noting that α h a is smooth. We then define a smooth one-form β a on U by A bit of linear algebra then shows that which gives this part of the lemma.

(iii) This follows from the following observation about dimensions: dim(D Φ ) = dim(I(B)) + dim(D 0,π 0 (Φ) ), this because of Lemma 5.14 and the fact that VQ free |Q ⊆ D by definition of primary velocity constraint.
The preceding lemma allows us to make the following definition.

Definition: (Reduced constraint distribution)
Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and with configuration manifold Q. Suppose that we have physical velocity constraints giving rise to the constraint distribution D. The distribution D 0 ⊆ TQ 0 is the reduced constraint distribution. • Since the preceding constructions may not be entirely transparent, we illustrate them with a simple example.

5.18
Example: (Sliding rod) The physical system we consider is a rod, one tip of which slides on a plane. We shall be very concrete, and so we let S = B = R 3 , which have their standard structure as affine spaces modelled on the vector spaces V = U = R 3 . We use the standard inner products and orientations on both U and V. We take the spatial origin to be x 0 = 0 and use the standard orthonormal basis (e 1 , e 2 , e 3 ) for R 3 . The body in this case we define as $B = \{(0, 0, z) \in \mathbb{R}^3 \mid z \in [a, b]\}$, i.e., the body is an infinitely thin rod of length b − a. By Lemma 3.4 we have X c = (0, 0, z c ) for some z c ∈ (a, b). We assume, without loss of generality, that z c = 0. In S we define a plane The set of physical configurations we take to be i.e., the tip of the rod at (0, 0, a) is required to reside in the plane P as in Figure 2.

[Figure 2. Rod with tip constrained to move in a plane]

The velocity constraint is that the motion of the rod be such that the velocity of the point of contact of the rod with P be tangent to P. Thus, according to Definition 5.12, we take X 0 = (0, 0, a) and let Φ ∈ Rgd(B; S) be an admissible configuration, i.e., one for which Φ(X 0 ) ∈ P. The translational velocities of the point Φ(X 0 ) allowed by the velocity constraint are then of the form (v x , v y , 0) ∈ V. A motion of the system has the form t → φ(t) = (R(t), r(t)) ∈ SO(3) × R 3 , where r(t) = φ(t)(X c ). Thus Thus the primary velocity constraint for the system at (Φ, X 0 ) is To see that this is indeed a primary velocity constraint, note that ker(I c ) = span(e 3 ). Therefore, for some a ∈ R. Thus, if ω ∈ ker(R Φ ∘ I c ∘ R T Φ ), then g(e 3 , 0 + ω × (R Φ (X 0 ))) = 0 since ω and R Φ (X 0 ) are both collinear with R Φ (e 3 ). The corresponding constraint subspace is We note that, because of the simplicity of this example, the tangent bundle of Q at Φ is exactly the constraint subspace D Φ,X 0 . Thus the constraint distribution is defined by
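The verification that inertialess motions are unconstrained can be checked numerically. The sketch below assumes the standard rigid-body formula v + ω × R Φ (X 0 ) for the velocity of the tip (with X c = 0 and v the velocity of the centre of mass) and takes the plane to have normal e 3 , in keeping with the allowed velocities (v x , v y , 0). The helper names and numerical values are hypothetical.

```python
import numpy as np

def rotation_about_x(angle):
    """Rotation matrix about the e1 axis, used here as a sample attitude R_Phi."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])

def tip_velocity(R, omega, v, X0):
    """Spatial velocity of the body point X0: v + omega x (R X0), assuming X_c = 0."""
    return v + np.cross(omega, R @ X0)

def satisfies_constraint(R, omega, v, X0, tol=1e-12):
    """True when the tip velocity has no e3 component, i.e. is tangent to the plane."""
    e3 = np.array([0.0, 0.0, 1.0])
    return bool(abs(e3 @ tip_velocity(R, omega, v, X0)) < tol)

a = -0.4                                  # body coordinate of the sliding tip
X0 = np.array([0.0, 0.0, a])
R = rotation_about_x(0.7)

# Inertialess motion: spin about the rod's own axis R e3 with no translation.
spin_axis = R @ np.array([0.0, 0.0, 1.0])
spin_ok = satisfies_constraint(R, 2.5 * spin_axis, np.zeros(3), X0)

# Lifting the rod straight up moves the tip out of the plane.
lift_ok = satisfies_constraint(R, np.zeros(3), np.array([0.0, 0.0, 1.0]), X0)

# Tumbling about e1 is admissible once the vertical translation compensates.
omega = np.array([1.0, 0.0, 0.0])
raw = tip_velocity(R, omega, np.zeros(3), X0)
v_fix = np.array([0.3, -0.2, -raw[2]])
tumble_ok = satisfies_constraint(R, omega, v_fix, X0)
```

The spin about the rod axis passes the check because ω and R Φ (X 0 ) are collinear, so their cross product vanishes, exactly as in the argument of the example.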

Interconnected rigid body systems: forces and torques
The last mechanical element in our formulation is that of forces and torques. As usual, we present these starting from an elemental physical formulation, giving the mathematical meaning of all basic notions. Part of the formulation necessarily includes a discussion of forces and torques required to maintain interconnections, and forces and torques required to maintain velocity constraints.
6.1. Forces and torques distributed on a body. To see how forces and torques of a general character arise, in this section we consider forces and torques as vector measures on a body, and indicate how forces of this type are reduced to point forces and torques. First we give the notion of force and torque as a measure.
6.1 Definition: (Distributed force, distributed torque) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, and let Φ ∈ Rgd(B; S).
(i) A distributed force on B at Φ is a vector measure σ : B(B) → V from the Borel •

The condition on a distributed torque means that a torque does no work on the inertialess motions of the system (we will discuss work in Section 6.4). Note that, with distributed forces and torques, we are intermingling body and spatial points of view, since the points where forces are applied are given in body coordinates (reflected by the fact that a force distribution has as its domain the Borel subsets of B), but the forces take values in the spatial vector space V.
Let us give some examples of distributed forces and torques before we consider further properties and related constructions.

Examples: (Distributed forces and torques)
1. Let X 0 ∈ B and let f ∈ V. By δ X 0 denote the Dirac scalar measure on B(B). The point force f at X 0 is the vector measure δ X 0 · f ; thus for B ∈ B(B).
2. In like manner, a point torque at X 0 is of the form δ X 0 · τ for some τ ∈ (ker(R Φ ∘ I c ∘ R T Φ )) ⊥g .
3. Let us indicate how gravitational forces are distributed forces in our setting. We let a g ∈ V be a nonzero vector in the direction of gravity whose length is the gravitational acceleration in appropriate units. The gravitational force on a body (B, µ) is then the vector measure F g = µ · a g .
Thus we define a force distribution f φ (t 0 ) and a torque distribution τ φ (t 0 ) by for B ∈ B(B). We call these the inertial force distribution and the inertial torque distribution. The relevance of these will be apparent in our statement of the Lagrange–d'Alembert Principle in Proposition 7.6. •

With the notion of a point force at hand, we can define some additional constructions with distributed forces.
6.3 Definition: (Resultant force, resultant torque) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, and let (B, µ) be a rigid body in B. For a distributed force σ and a distributed torque ρ at Φ, and for x 0 ∈ S: • 6.4 Remark: (Computation of resultant torque) Let us clarify the manner in which the resultant torque of a distributed force is computed. Thus we consider a vector measure For v ∈ V we define a signed measure ν v on B by ν v = p v • ν. With this notation, given an oriented orthonormal basis (e 1 , e 2 , e 3 ) for V, we then define

understanding, as in Remark 3.3, how we integrate functions taking values in finite-dimensional vector spaces. It is a simple matter to show that this definition is independent of the choice of basis. •
With these constructions, we have the following notion. (ii) if f σ = 0, then there exists τ σ ∈ V such that τ σ,x 0 = τ σ for every x 0 ∈ S, and we have the following cases: (a) if τ σ ≠ 0 then there is no centre of force; (b) if τ σ = 0 then the set of all centres of force is S.
Proof: Let x 0 ∈ S, let x ∈ S, and note that x is a centre of force if and only if We note that this is a linear equation We examine this equation in various cases.
First we suppose that f σ ≠ 0. We claim that ker(A σ ) = image(A σ ) ⊥g . Indeed, using (1.1). Since ker(A σ ) = span(f σ ), the linear equation for x − x 0 has solutions if and only if b σ,x 0 is g-orthogonal to f σ . To this end, we let (e 1 , e 2 , e 3 ) be an oriented orthonormal basis for V and compute using (1.1) for the third line and Fubini's Theorem for the fourth line. This shows that b σ,x 0 is indeed g-orthogonal to f σ , showing that the linear equation for x − x 0 has solutions when f σ ≠ 0. Moreover, any two such solutions will differ by an element of ker(A σ ), and this proves (i).

Now suppose that f σ = 0 and let x 1 , x 2 ∈ S. We compute showing that the resultant torque τ σ,x 0 is indeed independent of x 0 , which gives the first assertion in part (ii). Let us denote the resultant torque, then, by τ σ . First suppose that τ σ ≠ 0. Then, in our linear equation above for x − x 0 , A σ = 0 and b σ,x 0 ≠ 0. Thus the linear equation has no solutions. When τ σ = 0, the linear equation simply reads "0 = 0," and so the set of solutions is all of V.
We are not aware of this sort of result having been stated or proved in this generality, so let us make a few comments on it.

6.7 Remarks: (Centre of force)
1. The fact that, when f σ ≠ 0, the centre of force is not uniquely defined is a reflection of the fact that a force can be moved along its line of application and the resultant torque, which is orthogonal to this line, remains unchanged.
2. When f σ = 0 but τ σ ≠ 0, then the force is a "pure torque." The fact that, in this case, a centre of force does not exist is a reflection of the oft-cited principle that torque is not applied about any particular point.
3. When both f σ and τ σ vanish, then the force is an equilibrium force. That the centre of force is arbitrary is a reflection of the oft-cited principle that, at equilibrium, one can take moments of forces about any point one wishes. •

Let us illustrate the preceding notions for the force distributions we introduced in Example 6.2.
6.8 Examples: (Resultant force, resultant torque, centre of force)
1. For the point force f at X 0 , the resultant force is obviously f and the set of centres of force is
2. It is an easy matter to show that the centre of mass is a centre of force for the gravitational force distribution of Example 6.2-3. The set of centres of force is The resultant force is, of course, ma g .
3. In Example 6.2-4 we introduced the inertial force for a body (B, µ) along a motion φ : T → Rgd(B; S) at t 0 ∈ T . The set of centres of force for this force distribution is easily seen to be The resultant force is mv φ (t 0 ), as we shall see in Lemma 7.1. The resultant torque is computed to be d as we shall see in Lemma 7.1. •

We can now reduce a distributed force to a specific, equivalent point force and a torque.
The fact that the central force is a force distribution, whereas the central torques are merely vectors in V, reflects the fact that forces require points of application, whereas torques do not.
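The contrast between forces and torques can be illustrated with finitely many point forces. The sketch below assumes the standard moment formula, the sum of (x − x 0 ) × f over the points of application, as the resultant torque about x 0 ; the function names and numerical values are hypothetical. It checks that sliding a force along its line of application leaves the torque unchanged, and that a couple (zero resultant force) has the same torque about every reference point.

```python
import numpy as np

def resultant_force(point_forces):
    """Sum of the force vectors of a list of (point, force) pairs."""
    return sum(f for _, f in point_forces)

def resultant_torque(point_forces, x0):
    """Assumed standard moment about x0: sum of (x - x0) x f."""
    return sum(np.cross(x - x0, f) for x, f in point_forces)

f = np.array([1.0, 2.0, -1.0])
x = np.array([0.5, -1.0, 2.0])
x0 = np.array([3.0, 0.0, 1.0])

# A single force moved along its own line of application.
torque_here = resultant_torque([(x, f)], x0)
torque_slid = resultant_torque([(x + 4.0 * f, f)], x0)

# A couple: equal and opposite forces at different points.
couple = [(x, f), (x + np.array([0.0, 1.0, 0.0]), -f)]
couple_force = resultant_force(couple)
torque_a = resultant_torque(couple, x0)
torque_b = resultant_torque(couple, np.array([-7.0, 2.0, 5.0]))
```

For the couple, the reference point drops out of the computation entirely, which is the discrete counterpart of the statement that a pure torque is not applied about any particular point.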
We should verify that central torques do no work on inertialess motions.
6.10 Lemma: (Central torques and inertialess motions) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, and let Φ ∈ Rgd(B; S). If σ is a force distribution and ρ is a torque distribution on B at Φ, then We consider three cases. First, if the inertia tensor is nondegenerate, the assertion follows trivially since (ker( If the inertia tensor is degenerate but nonzero, then all points on the body lie on a line through X c in the direction of ker(I c ) (combining Example 4.4-2 and the proof of Lemma 4.8). Let (e 1 , e 2 , e 3 ) be an oriented orthogonal basis for V. If X ∈ B, then Thus, for any X ∈ B and v ∈ V, we have

Then we have, for every
giving the assertion in this case. Finally, if the inertia tensor is zero, then the body is a particle, and so X − X c = 0 for every X ∈ B. Since (ker(R Φ ∘ I c ∘ R T Φ )) ⊥g = {0}, our assertion follows immediately in this case.
As we shall see in Theorem 7.4, and as is well known, a distributed force can be replaced with its central force and torque and the dynamics will be unchanged. For this reason, we shall often use a point force and torque in place of a distributed force. When the effects of forces arising from multiple physical effects are to be accumulated, this can simply be accounted for by adding the force distributions, by the following reasoning: (1) the set of distributed forces is a vector space under pointwise addition and scalar multiplication (with points being elements of the Borel σ-algebra); (2) the maps assigning a resultant force and torque to a distributed force are R-linear.
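Both points of this reasoning can be checked in a discretised setting. In the sketch below the mass distribution µ is replaced by finitely many point masses (an assumption made purely for illustration), the resultant torque is taken to be the standard moment about a reference point, and all names and values are hypothetical. Gravity then reduces to the force ma g with vanishing torque about the centre of mass, and the resultant torque is linear in the force distribution.

```python
import numpy as np

def resultant_force(forces):
    """Total force of a discretised distribution (one row per point)."""
    return forces.sum(axis=0)

def resultant_torque(points, forces, x0):
    """Assumed standard moment about x0 of a discretised distribution."""
    return np.cross(points - x0, forces).sum(axis=0)

# Spatial positions of four point masses standing in for a rigid body.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [1.0, 1.0, 3.0]])
masses = np.array([1.0, 2.0, 0.5, 1.5])
a_g = np.array([0.0, 0.0, -9.81])

total_mass = masses.sum()
centre_of_mass = (masses[:, None] * points).sum(axis=0) / total_mass

# Discrete analogue of the gravitational force distribution mu . a_g.
gravity = masses[:, None] * a_g
gravity_force = resultant_force(gravity)
gravity_torque_at_com = resultant_torque(points, gravity, centre_of_mass)

# A second distribution: a single point force on the last particle.
thrust = np.zeros_like(points)
thrust[3] = [2.0, 0.0, 1.0]

# Linearity: the torque of a linear combination is the combination of torques.
x0 = np.array([5.0, 0.0, 0.0])
combined_torque = resultant_torque(points, gravity + 0.5 * thrust, x0)
sum_of_torques = resultant_torque(points, gravity, x0) + 0.5 * resultant_torque(points, thrust, x0)
```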
6.2. Primary torque-force. With the above construction of forces on physical grounds, and the reduction of these to central forces and torques, we next give the elementary mathematical definition of forces as we shall use them to incorporate into our simple equations of motion for a rigid body.
6.11 Definition: (Primary torque-force (single rigid body)) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, and let Φ ∈ Rgd(B; S). A primary torque-force on B at Φ is a pair Mathematically, of course, these definitions merely say that central force and torque are elements of V. The condition on a primary torque indicates that a torque should do no work on the inertialess motions of the system. In accord with our discussions regarding distributed forces, we think always of a primary force as being a point force at the centre of mass, or more precisely at the image of the centre of mass under the configuration Φ. We think of the direction of a primary torque τ as being the axis about which the torque is applied and the magnitude of the torque is the length of τ . In Figure 3 we depict how one should physically think of a primary force and torque.

[Figure 3. Central torque-force on a rigid body in a configuration]
It will be convenient to extend the previous single body construction to systems of rigid bodies.
6.12 Definition: (Primary torque-force (multiple rigid bodies)) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and with configuration manifold Q. Let Φ ∈ Q.
Note that a primary torque-force can arise from an amalgamation of force and torque distributions because of the additivity of distributed forces and torques. Thus a primary torque-force is to be thought of as a simple mathematical representation of what may well be complicated physical data.
6.3. Torque-force fields. In practice, external (rather than internal, which we shall consider shortly) forces and torques are not just prescribed at a point and in a given configuration, but are given as functions of, possibly, time, position, and velocity. Other dependencies are possible and can be worked into the constructions as needed.
We first consider the case of force and torque distributions on single rigid bodies. To facilitate this, we denote by M(B; V) the set of vector measures from the Borel sets B(B) into V.
6.13 Definition: (Force distribution field, torque distribution field) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, let T be a time axis, and let T ⊆ T be a time interval. for which ρ(t, (Φ, (A, v))) is a torque distribution for B at Φ for each (t, (Φ, (A, v))) ∈ T × T(Rgd(B; S)).

•
Of course, given a force and torque distribution field and (t, (Φ, (A, v))) ∈ T × T(Rgd(B; S)), there are central force and torque distributions associated to the force and torque distributions σ(t, (Φ, (A, v))) and ρ(t, (Φ, (A, v))). Thus, as usual, the effects of force distribution and torque distribution fields can be reduced to the consideration of primary force and torque fields, a notion that we define next.
6.14 Definition: (Torque-force field) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , configuration manifold Q, time axis T, and time interval T ⊆ T. A torque-force field is a mapping for all Ψ ∈ I(B) and such that, for each t ∈ T , (Φ, (A, v)) ∈ TQ, and a ∈ {1, . . . , m}, is a primary torque-force on B a at Φ a . •

The preceding definition of torque-force fields is very general, and will not have any useful properties, e.g., one may not have a motion associated to the physical laws governing rigid body motion, i.e., the Newton–Euler equations that we will consider in Section 7. Since such motions arise, of course, as solutions to differential equations, the regularity with which torque-force fields depend on time and position/velocity can be as general as required for existence and uniqueness of solutions to differential equations. This is considered in a unified and general way, for many sorts of regularity, in the book of Jafarpour and Lewis [2014]. Here we will not burden our presentation with such generality, and shall merely say that a torque-force field is smooth if it is a smooth function on T × TQ in the usual sense. If we have force distribution and torque distribution fields defined on the bodies of our system, we shall say they are weakly smooth if the resulting torque-force field, defined using the associated central force and torque distribution fields, is smooth.
Let us now show that, in the presence of internal symmetry, torque-force fields descend to the reduced configuration manifold. First we denote which we think of as being the vector bundle where torque-force fields take their values. This vector bundle has an action of I(B), this being given by cf. equation (4.2). Note that, by Lemma 4.6, the fibre of T-F over Φ ∈ Q depends only on the I(B)-orbit π 0 (Φ). Thus T-F/I(B) is a well-defined vector bundle over Q 0 and the projection to the quotient has the form with appropriate restrictions on τ .
(ii) This follows since π 0 and T π 0 are smooth surjective submersions by Proposition 4.9.
The final notion we define is that of a torque-force field along a motion of a system.
6.16 Definition: (Torque-force along a motion) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. Let T be a time axis with T ⊆ T a time interval. For a differentiable motion φ : T → Q, a torque-force field along φ is a mapping τ ⊕ f : T → V m ⊕ V m such that (τ (t), f (t)) is a primary torque-force on B at φ(t). •

6.4. Work and power. An important physical notion that can be used to assign attributes to forces and torques is that of work, and its infinitesimal variant, power. In this section we indicate how these concepts are defined in our setting.
, . . . , m}, exist and are finite; Work and power can be dropped to the reduced configuration space. Note that we have the following natural bilinear pairing on the fibres of vector bundles: That this pairing is well-defined follows immediately from the definitions of T-F and V red spatial and because the fibres of T-F/I(B) are "the same" as the fibres of T-F, cf. (6.1).
6.18 Lemma: (Reduction of work and power) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. Let T be a time axis, let T ⊆ T be a time interval, let φ : T → Q be a differentiable curve, and let τ ⊕ f : T → V m ⊕ V m be a continuous torque-force field along φ. Then Proof: Clearly it suffices to prove the assertion for power, but this follows directly from the definition of Σ g .
6.5. Virtual displacements, virtual power, and virtual work. The Principle of Virtual Work, and the closely related Lagrange-d'Alembert Principle, are devices for characterising torque-forces and for understanding the rôles of torque-force in variational principles. Behind these principles are the notions of virtual displacement and virtual work. The presentation of these ideas in the applied literature is typically very opaque and, in the mathematical literature, they are typically not covered at all, or at best as a mathematical triviality. 4 Therefore, here we provide a clear characterisation of the tools used in the Principle of Virtual Work, including the physical ideas behind these. We consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. We let T be a time axis with T ⊆ T a time interval, and let φ : T → Q be a differentiable motion for the system. We then consider a variation of this motion, i.e., a differentiable map σ : [−r, r] × T → Q free such that σ(0, t) = φ(t). Let us denote $\sigma_s(t) = \sigma(s, t)$ and define $\sigma'(s, t) = \sigma_s'(t)$. Note that σ s is a curve on Q for all s ∈ [−r, r]. Denote By choosing a spatial origin x 0 we can define, for a ∈ {1, . . . , m}, and so $\delta_{\sigma_a} R_{\sigma_a}(t) = \frac{d}{ds}\big|_{s=0} R_{\sigma_a}(s, t)$, $\delta_{\sigma_a} r_{\sigma_a}(t) = \frac{d}{ds}\big|_{s=0} r_{\sigma_a}(s, t)$.
We then denote With these constructions at hand, let us make the basic definitions we will use.
6.19 Definition: (Virtual displacement, virtual power, virtual work) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. Suppose that we have velocity constraints defining a constraint distribution D. We let T be a time axis with T ⊆ T a time interval, and let φ : T → Q be a differentiable motion for the system for which φ (t) ∈ D φ(t) for each t ∈ T . Let τ ⊕ f be a torque-force field along φ. With the above constructions, we have the following definitions.
We note that a virtual displacement is a vector field with values in TQ free along the curve φ. An important question is just which vector fields along φ arise as virtual displacements.
The following result provides some answers to this question.
6.20 Proposition: (The character of virtual displacements) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. Suppose that we have velocity constraints defining a constraint distribution D. We let φ : [t 0 , t 1 ] → Q be a differentiable motion for the system for which φ (t) ∈ D φ(t) for each t ∈ [t 0 , t 1 ]. The following statements hold: exists a Q-variation σ of φ such that δσ = V ; (iii) if σ is a Q-variation, then it is an infinitesimal Q-variation; (iv) there exist a constraint distribution D and a C 1 -mapping V : [t 0 , t 1 ] → TQ free such that V (t) ∈ D φ(t) and such that there is no D-variation σ of class C 2 for which δσ = V ; (v) there exist a constraint distribution D and a D-variation that is not an infinitesimal D-variation.
Proof: (i) We shall prove this in a more general setting in Lemma 8.3 below. Note that, while the vector field in Lemma 8.3 vanishes at the endpoints of the interval, the proof does not rely on this fact as it only ensures that the resulting variation is a fixed endpoint variation, which we are not requiring here.
(ii) This also follows using the arguments from the proof of Lemma 8.3. (iii) This is obvious since the curve s → σ(s, t) is in Q if σ is a Q-variation.
(iv) We will build an example. We will work with a single body and take U = V = R 3 with S = B = R 3 affine spaces in the canonical way. We use coordinates (x, y, z) for R 3 . We thus have Q free = SO(3) × R 3 and we define a distribution on Q free by defining a distribution on R 3 , then asking that it have no component in the SO(3)-direction. The distribution D that we define on R 3 is that generated by the vector fields We take a curve φ : We define a vector field along φ, noting that V (t) ∈ D φ(t) for all t ∈ [0, 1]. Now consider a variation σ : [−r, r] × [0, 1] → R 3 of φ and write σ(s, t) = (σ x (s, t), σ y (s, t), σ z (s, t)).
One readily sees that σ is a D-variation if and only if , with "dot" representing time-derivative. Differentiating this expression with respect to s at s = 0, and using equality of mixed partials, we have we conclude that there is no D-variation σ of class C 2 for which V = δσ. Now let us see how we can conclude the same thing for variations, not just in R 3 , but in SO(3) × R 3 . This follows since, by the reasoning that gives the conclusion of part (ii) of the proposition, the following two statements are equivalent: 1. there exists a D-variation, in SO(3) × R 3 , of φ of class C 2 such that δσ = V ; 2. there exists a D-variation, in {id} × R 3 , of φ of class C 2 such that δσ = V .
(v) We do as in the previous part of the proof and work with Q free = SO(3) × R 3 and consider the same distribution D. We consider the same curve φ as in the previous part of the proof. We then consider the D-variation We then compute which we readily determine is not D-valued. In this case, the extension of the conclusion to SO(3) × R 3 is immediate, since one can consider the variation As we shall see in Section 6.7, the problems of parts (iv) and (v) of the preceding result have real consequences in mechanics, consequences that have caused significant confusion in the literature dealing with velocity constraints.
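The phenomenon of part (v) can be checked numerically in a small sketch. The vector fields generating D are not reproduced above, so the code below assumes a stand-in constraint z' = y x' on R 3 (i.e., D spanned by ∂/∂x + y ∂/∂z and ∂/∂y); this choice, the curve φ(t) = (t, 0, 0), the family `sigma`, and all function names are assumptions made for illustration and are not claimed to be the data used in the proof.

```python
# Assumed (hypothetical) constraint on R^3: z' = y * x', i.e. D is spanned by
# d/dx + y d/dz and d/dy. This is an illustrative stand-in, not the paper's D.

def sigma(s, t):
    """A family of curves with sigma(0, t) = phi(t) = (t, 0, 0)."""
    return (t, s, s * t)

def time_derivative(curve, t, h=1e-6):
    """Central-difference derivative of a curve t -> R^3."""
    a, b = curve(t - h), curve(t + h)
    return tuple((bi - ai) / (2 * h) for ai, bi in zip(a, b))

def constraint_residual(point, velocity):
    """Vanishes exactly when `velocity` lies in D at `point`."""
    _, y, _ = point
    vx, _, vz = velocity
    return vz - y * vx

def delta_sigma(t, h=1e-6):
    """Infinitesimal variation d/ds sigma(s, t) at s = 0."""
    a, b = sigma(-h, t), sigma(h, t)
    return tuple((bi - ai) / (2 * h) for ai, bi in zip(a, b))

times = [0.0, 0.25, 0.5, 0.75, 1.0]

# Each curve t -> sigma(s, t) satisfies the constraint: sigma is a D-variation.
max_curve_residual = max(
    abs(constraint_residual(sigma(s, t), time_derivative(lambda u: sigma(s, u), t)))
    for s in (-0.5, 0.0, 0.5)
    for t in times
)

# The infinitesimal variation (0, 1, t) is not D-valued along phi for t != 0.
variation_residuals = [
    constraint_residual(sigma(0.0, t), delta_sigma(t)) for t in times
]
```

In this sketch every curve of the family satisfies the constraint, while the residual of the infinitesimal variation along φ equals t, so it is D-valued only at t = 0.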
6.6. Interconnection torque-forces. Interconnected rigid body systems require interconnection forces and torques between the bodies and with the external surroundings to maintain membership in the configuration manifold. To this end, we make the following definition.
6.21 Definition: (Interconnection torque-force) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. Let Φ ∈ Q. A primary torque-force (ι, i) ∈ V m ⊕ V m is an interconnection torque-force at Φ if P (ι ⊕ i, ψ)(0) = 0 for every differentiable motion ψ : [−r, r] → Q free of the system satisfying ψ′(0) ∈ T Φ Q. We denote which is the interconnection torque-force bundle for the system. • Let us explore the relationship between our characterisation of interconnection torque-forces and virtual displacements. By the very definition, interconnection torque-forces are those that have zero virtual power for virtual displacements associated to infinitesimal Q-variations, i.e., virtual displacements that are tangent to Q. By parts (i) and (ii) of Proposition 6.20, we can work with either infinitesimal Q-variations or with Q-variations. By taking integrals along motions, we can also say that we have zero virtual work for these virtual displacements along motions of the system. The significance of using virtual work in place of virtual power is not revealed here, but only when we discuss variational principles in Section 8. 6.7. Constraint torque-forces. We next consider forces and torques required to maintain velocity constraints. As with interconnection forces and torques, the crucial ingredient is the power of the forces and torques along motions of the system.

6.22 Definition: (Constraint forces and torques)
Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. Suppose that we have physical velocity constraints giving rise to a smooth locally constant rank constraint distribution D. Let Φ ∈ Q. A primary torque-force (λ, ) ∈ V m ⊕ V m , is a D-constraint torque-force at Φ if P (λ ⊕ , ψ)(0) = 0 for every differentiable motion ψ : [−r, r] → Q free of the system satisfying ψ′(0) ∈ D Φ . We denote which is the D-constraint force bundle for the system. • Let us explore the connection between our characterisation of constraint torque-forces and virtual displacements. There are some important differences between this and what we saw for interconnection torque-forces at the end of Section 6.6. First of all, a torque-force is a constraint torque-force exactly when the virtual power is zero for virtual displacements corresponding to infinitesimal D-variations. By parts (i) and (ii) of Proposition 6.20, such variations can be chosen to be Q-variations or not. However, by parts (iv) and (v) of Proposition 6.20, it is neither necessary nor sufficient to use virtual displacements corresponding to D-variations to characterise constraint torque-forces. Matters such as this are explored by Gràcia, Marin-Solano, and Muñoz-Lecanda [2003].
There is a large and sometimes vituperative literature around the subject of the preceding paragraph, and we shall overview a little of it, just for fun. First of all, we point out that, if one uses D-variations to characterise constraint torque-forces-and the variational principles associated to this-this gives the empirically incorrect governing equations. Nonetheless, this technique is often stated as being physically correct [e.g., Goldstein, Poole, Jr, and Safko 2001, §2.4], probably from some ill-placed belief in the primacy of variational principles. Indeed, this is the argument of Kozlov [1992] in support of this strategy of variations. This is argued against by Kharlomov [1992] in the same journal issue. Lewis and Murray [1995] show that the use of D-variations gives the physically correct equations of motion when the constraints are holonomic, i.e., when the constraints are actually on configurations and not on velocities (this is well-known and has been commented on by many authors). As well, Lewis and Murray give a specific example illustrating that there are physical motions that cannot arise from the use of D-variations. A somewhat recent overview of this subject is given by Flannery [2005]. There are many other "contributions" to this problem of variations, and it is probably one that will never really die. However, our view on this is that it is born partly from (1) the unnecessarily complicated and imprecise language in which the Principle of Virtual Work is couched and (2) an unjustified belief that Nature seeks to optimise in a certain restricted sense. 5 The first of these problems is easily rectified by the following definition. 6 6.23 Definition: (Principles of Virtual Power and Virtual Work) Consider an interconnected rigid body system with S a Newtonian space model, B = (B 1 , . . . , B m ) rigid bodies in body reference spaces B 1 , . . . , B m , and configuration manifold Q. 
Suppose that we have physical velocity constraints giving rise to the constraint distribution D. Let T be a time axis and let T ⊆ T be a time interval.
(i) The Principle of Virtual Power is: for every differentiable motion φ : T → Q, (a) a continuous torque-force field τ ⊕ f along φ is an interconnection torque-force if and only if the virtual power of τ ⊕ f for every virtual displacement associated to an infinitesimal Q-variation of φ is zero; (b) a continuous torque-force field τ ⊕ f along φ is a D-constraint torque-force if and only if the virtual power of τ ⊕ f for every virtual displacement associated to an infinitesimal D-variation of φ is zero.
(ii) The Principle of Virtual Work is: for every differentiable motion φ : T → Q, (a) a continuous torque-force field τ ⊕ f along φ for which (τ ⊕ f , φ) is integrable is an interconnection torque-force if and only if the virtual work of τ ⊕ f for every virtual displacement associated to an infinitesimal Q-variation of φ is zero; (b) a continuous torque-force field τ ⊕ f along φ for which (τ ⊕ f , φ) is integrable is a D-constraint torque-force if and only if the virtual work of τ ⊕ f for every virtual displacement associated to an infinitesimal D-variation of φ is zero. • Note that, in Definitions 6.21 and 6.22, we essentially use the Principle of Virtual Power along a trivial curve, i.e., a stationary curve defined on a time interval that is a point. This makes sense in this context since virtual power is an instantaneous notion.
We believe that our definitions give simple and physically meaningful statements that give the correct equations of motion when they are translated into the problems of dynamics in Section 7. We know, however, that this will not be satisfying to some. . .

Interconnected rigid body systems: dynamics
In this section we present a clear account of the Newton-Euler equations for rigid body motion of single rigid bodies, interconnected systems of rigid bodies, and interconnected systems of rigid bodies subject to velocity constraints. We shall formulate the equations first for single rigid bodies, and then extend to multiple bodies with interconnections.
7.1. Momenta. The dynamical equations of mechanics are balance equations, and what is balanced is momentum in its appropriate forms. In this section we give the definitions of momentum, starting from the following basic physical constructions.
7.1 Lemma: (Translational and angular momentum) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ), and let (B, µ) be a rigid body in B. Let x 0 ∈ S. Let T be a time axis, let T ⊆ T be a time interval, and let φ : T → Rgd(B; S) be a rigid motion. Then we have the following formulae: .
Then the first formula follows immediately since $\int_B (X - X_c)\,\mathrm{d}\mu(X) = 0$ by Lemma 3.4(ii). Now note that, by Lemma 2.17, The second assertion of the lemma follows immediately from the preceding two facts and the computation For the final assertion we have as desired.
Let us name the quantities in the lemma.
7.2 Definition: (Translational and angular momentum) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ), and let (B, µ) be a rigid body in B. Let x 0 ∈ S. Let T be a time axis, let T ⊆ T be a time interval, and let φ : T → Rgd(B; S) be a rigid motion.
(i) The spatial translational momentum for the motion is (ii) The spatial angular momentum about x 0 for the motion is . (iii) The spatial angular momentum for the motion is • 7.2. Newton-Euler equations for a single rigid body. Now we formulate the basic equations of rigid body dynamics, the "Newton-Euler equations." In order to fully understand the origins of these equations, it is useful to start with a framework of distributed forces and torques.
for every t ∈ T . • By stating these equations in integral form, we see clearly how the mass distribution µ, and the torque distribution and force distribution fields ρ and σ allow one to state the equations of motion as infinitesimal (in space) momentum balance equations. However, these equations are clearly more conveniently expressed by doing the integrals and providing "macroscopic" versions of the equations. In order to do this, associated with ρ and σ we define torque and force fields τ ρ , τ σ,x 0 , and f σ in the obvious way: With these constructions we can state the following result which lists various equivalent forms of the Newton-Euler equations. be weakly smooth torque distribution and force distribution fields. For a rigid motion φ : T → Rgd(B; S), the following statements are equivalent: (i) φ satisfies the integral form of the Newton-Euler equations about x 0 for the torque distribution and force distribution fields ρ and σ; (ii) φ satisfies the equations for every t ∈ T ; (iii) φ satisfies the equations for every t ∈ T ; (iv) φ satisfies the equations for every t ∈ T ; (v) φ satisfies the equations for every t ∈ T .
Proof: The equivalence of parts (i) and (ii) follows immediately just by the definition of the symbols and Definitions 7.2 and 7.3.
(ii) ⇐⇒ (iii) If either of the equations of parts (ii) or (iii) holds, then $m\ddot{r}_\phi(t) = f_\sigma(t, \phi'(t))$. Given this, we then calculate We also calculate Combining the preceding two computations gives the desired equivalence.
Let us now prove the equivalence of the final three statements of the theorem. For brevity, let us denote We will use this computation a few times below.
(iii) =⇒ (iv) This follows immediately from the preceding calculation.
The equations of part (ii) are the spatial Newton-Euler equations about x 0 and the equations of part (iii) are the spatial Newton-Euler equations. The equivalence of these two forms of the Newton-Euler equations is a reflection of the well-known principle that one can obtain physically correct equations of motion by either (1) balancing moments about a spatially fixed point or (2) balancing moments about the centre of mass of a body. The equations of part (iv) are simply a re-expression of the spatial Newton-Euler equations with velocity derivatives instead of momentum derivatives. Thus these equations are to be regarded as spatial, as with the first three parts of the theorem. The only "body" equations are those of part (v), which we call the body Newton-Euler equations.
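As a numerical illustration of the body form, the following sketch integrates the classical torque-free Euler equations I Ω' = (IΩ) × Ω for a body with diagonal inertia tensor, and checks that the kinetic energy and the norm of the body angular momentum are preserved. The equations of part (v) are not reproduced above, so we assume this standard form; the principal moments, initial condition, step size, and function names are illustrative choices of ours.

```python
def cross(a, b):
    """Cross product in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

INERTIA = (1.0, 2.0, 3.0)  # assumed principal moments of inertia

def omega_dot(omega, torque=(0.0, 0.0, 0.0)):
    """Right-hand side of I * Omega' = (I Omega) x Omega + torque (assumed form)."""
    momentum = tuple(i * w for i, w in zip(INERTIA, omega))
    rhs = cross(momentum, omega)
    return tuple((r + tq) / i for r, tq, i in zip(rhs, torque, INERTIA))

def rk4_step(omega, dt):
    """One classical Runge-Kutta step for the torque-free equations."""
    def shifted(k, scale):
        return tuple(w + scale * ki for w, ki in zip(omega, k))
    k1 = omega_dot(omega)
    k2 = omega_dot(shifted(k1, dt / 2))
    k3 = omega_dot(shifted(k2, dt / 2))
    k4 = omega_dot(shifted(k3, dt))
    return tuple(w + dt / 6 * (a + 2 * b + 2 * c + d)
                 for w, a, b, c, d in zip(omega, k1, k2, k3, k4))

def energy(omega):
    """Rotational kinetic energy (1/2) Omega . I Omega."""
    return 0.5 * sum(i * w * w for i, w in zip(INERTIA, omega))

def momentum_norm(omega):
    """Euclidean norm of the body angular momentum I Omega."""
    return sum((i * w) ** 2 for i, w in zip(INERTIA, omega)) ** 0.5

omega_initial = (1.0, 0.1, 0.5)
omega_final = omega_initial
for _ in range(1000):
    omega_final = rk4_step(omega_final, 1e-3)
```

Both conserved quantities agree at the initial and final times up to the integration error, which is consistent with the spatial angular momentum being constant in the absence of torques.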
7.3. Newton-Euler equations for interconnected rigid body systems with velocity constraints. Now we turn to the Newton-Euler equations for multiple rigid bodies, allowing for interconnections and velocity constraints. By virtue of Theorem 7.4, we can consider central forces and torques by taking suitable resultant torques for non-central force distributions. Therefore, without loss of generality for the dynamics, we can work only with force and torque fields in the usual sense.
The following definition gives the required construction. Let τ ⊕ f be a torque-force field and suppose that we have velocity constraints giving rise to a smooth locally constant rank constraint distribution D. A C 2 -motion φ : T → Q for which φ (t) ∈ D φ(t) satisfies the Newton-Euler equations if there exist (i) an IT-F-valued torque-force field ι ⊕ i : for each a ∈ {1, . . . , m} and t ∈ T . • The Newton-Euler equations are, in essence, differential-algebraic equations. As such, the matter of existence and uniqueness of solutions is not immediately clear. However, we shall see in Theorem 8.18 that, when the data are sufficiently smooth, these equations are actually equivalent to well-posed second-order differential equations (the constrained Euler-Lagrange equations, of course), and so the matter of existence and uniqueness of solutions becomes that for standard ordinary differential equations.

7.4. Lagrange-d'Alembert Principle. The Newton-Euler equations can alternatively be framed using the Lagrange-d'Alembert Principle. For our purposes, this will provide a convenient device for the transition from the balance framework of the Newton-Euler equations to the variational framework of the Euler-Lagrange equations.
In our setting, the following result gives what we want.
This immediately implies that φ satisfies the instantaneous Lagrange-d'Alembert Principle.
(ii) =⇒ (i) Suppose that φ satisfies the instantaneous Lagrange-d'Alembert Principle and define a torque-force field T ⊕ F along φ by T a (t) =μ φa (t) − τ a (t, φ (t))), F a (t) =ṗ φa (t) − f a (t, φ (t)), t ∈ T , a ∈ {1, . . . , m}. Note that T ⊕ F is already CT-F-valued, and so we can conclude the proof by taking ι ⊕ i = 0 and λ ⊕ = T ⊕ F . However, many other choices are possible. For example, we can require that the constraint force satisfy λ ⊕ (t) ∈ τ spatial (TQ), and then take which conditions uniquely prescribe the interconnection and constraint force-torques, and as well has the property that m a=1 (g(ι a (t), λ a (t)) + g(i a (t), a (t))) = 0, i.e., the interconnection and constraint torque-forces are orthogonal in an appropriate sense.

Remarks: (On the Lagrange-d'Alembert Principle)
1. The proof of the preceding proposition bears out the fact that, while there is no ambiguity in the assignment of (ι + λ) ⊕ (i + ), neither ι ⊕ i nor λ ⊕ is uniquely defined by the requirement that they determine a solution to the Newton-Euler equations.
2. Note that, for a C 2 -motion φ : T → Q, we have a torque-force field τ φ ⊕ f φ along φ defined by $\tau_{\phi,a}(t) = \dot{\mu}_{\phi_a}(t)$, $f_{\phi,a}(t) = \dot{p}_{\phi_a}(t)$, t ∈ T , a ∈ {1, . . . , m}. This we can call the inertial torque-force field along the motion. Note that, for the ath body and for t ∈ T , τ φ,a (t) ⊕ f φ,a (t) is the central torque-force associated with the inertial force and inertial torque distribution of Example 6.2-4 and Example 6.8-3. The instantaneous Lagrange-d'Alembert Principle is then exactly the Principle of Virtual Power applied to the difference between the inertial and applied torque-force fields along the motion. When T is compact (to ensure integrals exist), this is equivalent to , v σa (t))) dt = 0 for every infinitesimal D-variation σ of φ (we refer to Section 6.5 for notation). This is the typical form in which one sees the Lagrange-d'Alembert Principle presented. However, the advantage of using the integral formulation is not presently clear, and will only be seen in Section 8 when we connect dynamics to variational principles.
In this section we show how kinetic energy is connected to the Riemannian (semi)metrics of Section 5.4. We start by considering a single rigid body.
7.8 Lemma: (Kinetic energy of a rigid body) Let S = (S, V, g, θ) be a Newtonian space model, let B = (B, U, G, Θ) be a body reference space, let (B, µ) be a rigid body in B, let T be a time axis, let T ⊆ T be a time interval, and let φ : T → Rgd(B; S) be a rigid motion. Then , Ω φ (t)).
Proof: Let x 0 ∈ S; as we shall see, the argument does not depend on the selection of such a spatial origin. We have using (1.1). By definition of the inertia tensor, the result follows.
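The content of the lemma can be illustrated for a body consisting of finitely many point masses, where the integral over the body becomes a sum. The sketch below compares the summed particle kinetic energies with the familiar split into a translational part and a rotational part about the centre of mass. It is written with spatial quantities (spatial angular velocity and the inertia tensor about the centre of mass in the current configuration), which we assume gives the same value as the body expression of the lemma; the masses, positions, and velocities are arbitrary illustrative data.

```python
def cross(a, b):
    """Cross product in R^3."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean inner product in R^3."""
    return sum(x * y for x, y in zip(a, b))

# Illustrative data: four point masses at their current spatial positions.
masses = [1.0, 2.0, 0.5, 1.5]
points = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 2.0), (-1.0, -1.0, 1.0)]

total_mass = sum(masses)
centre = tuple(sum(m * p[k] for m, p in zip(masses, points)) / total_mass
               for k in range(3))
relative = [tuple(p[k] - centre[k] for k in range(3)) for p in points]

v_centre = (0.3, -0.2, 0.1)   # velocity of the centre of mass
omega = (0.5, 1.0, -0.7)      # spatial angular velocity

# Rigid motion: each particle moves with v_centre + omega x (x - x_c).
velocities = [tuple(vc + w for vc, w in zip(v_centre, cross(omega, r)))
              for r in relative]

# Kinetic energy as the sum of (1/2) m |v|^2 over the particles.
ke_sum = sum(0.5 * m * dot(v, v) for m, v in zip(masses, velocities))

# Inertia tensor about the centre of mass applied to omega:
# I_c(omega) = sum_i m_i r_i x (omega x r_i).
inertia_omega = (0.0, 0.0, 0.0)
for m, r in zip(masses, relative):
    term = cross(r, cross(omega, r))
    inertia_omega = tuple(acc + m * t for acc, t in zip(inertia_omega, term))

ke_split = 0.5 * total_mass * dot(v_centre, v_centre) + 0.5 * dot(omega, inertia_omega)
```

The two values agree up to rounding because the cross term between translation and rotation vanishes when moments are taken about the centre of mass.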
Correspondingly with our integral characterisations of momentum and the Newton-Euler equations, the preceding lemma gives the characterisation of kinetic energy as being an infinitesimal version of " 1 2 mv 2 ," integrated over the body. Next we provide the multi-body version of the preceding lemma, incorporating interconnections and degenerate rigid bodies. In (5.2) we defined a Riemannian semimetric G B on Q free . This drops to the Riemannian metric G 0,B on the reduced free configuration manifold Q 0,free defined by (5.3). The following result connects these constructions with the kinetic energy of the interconnected system.
Proof: The first equality is an immediate consequence of the definition of G B in equation (5.2) and from Definition 3.7, and noting that G is just the restriction of G B to Q. The second equality is an immediate consequence of Lemma 4.17, again along with the fact that G 0 is just the restriction of G 0,B to Q 0 .

Interconnected rigid body systems: Euler-Lagrange equations
In this section we develop the Euler-Lagrange equations from the Newton-Euler equations of Section 7. Let us give the "quick and dirty" version of our transition from Newton-Euler to Euler-Lagrange in the case when all rigid bodies of the system have nondegenerate inertia tensors. First we establish, using variational arguments following Marsden and Scheurle [1993], the equivalence of the unforced Newton-Euler equations with geodesic equations on Rgd(B; S) (we refer to the introduction of [Cendra, Holm, Marsden, and Ratiu 1998] for an historical account of the development of this approach). If one has multiple bodies, then their free, i.e., non-interconnected, motion is simply a product of these geodesic equations. In order to consider external forces, interconnections, and velocity constraints, we develop how a torque-force is translated into a cotangent vector on the configuration manifold of the system. We then use the Lagrange-d'Alembert Principle in various guises to prove the equivalence of the general Newton-Euler equations to the forced Euler-Lagrange equations. A lot of the bulk in our presentation is a result of the fact that we do consider degenerate rigid bodies. It is worth pointing out, however, that all of our work with degenerate rigid bodies comes together in this section, as all of the work we have done for such bodies over Sections 4-7 gets used here.
The tools for the approach are developed in Sections 8.1, 8.2, and 8.3.
8.1. Some constructions in variational calculus. In this section we review the elements of variational calculus needed for our arc from Newton-Euler to Euler-Lagrange. Variational calculus is set up as a means of tackling certain sorts of optimisation problems, typically optimisation problems over spaces of curves in its simplest guise. We, however, are not concerned with the details of this connection to optimisation problems, but are only concerned with the first-order conditions for optimisation, i.e., first derivative conditions. These first-order conditions, as is well-known, give rise to the Euler-Lagrange equations for an optimisation problem, and solutions to these equations are called "extremals." Thus our discussion of variational calculus is limited to the definition of extremals. Moreover, since we are only interested in differential equations, which by nature are local, we can make simplifying assumptions to the setup that will keep us from having to fiddle with unnecessary technicalities, while maintaining mathematical rigour. Our presentation in this section is general, so we work with an arbitrary smooth paracompact Hausdorff manifold M. First let us consider the sort of curves we will work with.

Definition: (Arc)
Now we can vary arcs.
The infinitesimal variation of a fixed endpoint variation σ is the mapping Of course we have seen these ideas before when discussing virtual displacements in Section 6.5. Here we are working in a non-physical setting, and so prefer to use non-physical language.
Let us show that sufficiently rich classes of variations exist.
8.3 Lemma: (Existence of variations with prescribed infinitesimal variations) Let γ : [t 0 , t 1 ] → M be a C 2 -arc and let V : [t 0 , t 1 ] → TM be a C 1 -map with the following properties: Then there exists r ∈ R >0 and a fixed endpoint variation σ : Proof: Let G be a Riemannian metric on M and, for x ∈ M, let exp x be the exponential map which is defined on a neighbourhood of 0 x ∈ T x M taking values in M. By a standard compactness argument, there exists r ∈ R >0 such that exp x (sV (t)) is defined for each (s, t) ∈ [−r, r] × [t 0 , t 1 ]. We claim that the map is of class C 2 . Indeed, we note that σ is the composition of the mappings and The first mapping is of class C 2 . The second is smooth since it is the time 1 flow of the vector field ∂ ∂s ⊕ Z on [−r, r] × TM, Z being the geodesic spray for the Levi-Civita affine connection associated with G. Moreover, since s → exp x (sv x ) is the geodesic with initial velocity v x , we have δσ(t) = V (t) for every t ∈ [t 0 , t 1 ]. Now we consider extremal arcs associated to a Lagrangian.
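The construction in the proof can be tried out on the unit sphere in R 3 with its round metric, where the exponential map has the closed form exp x (sv) = cos(s|v|) x + sin(s|v|) v/|v|. The sketch below builds σ(s, t) = exp γ(t) (sV (t)) for an arc of the equator and a vector field V along it vanishing at both endpoints, and checks by finite differences that δσ = V. The sphere, the arc, the field V, and the function names are our illustrative assumptions, not part of the lemma.

```python
import math

def exp_sphere(x, v, s):
    """Exponential map of the unit sphere: exp_x(s v) for v tangent at x."""
    speed = math.sqrt(sum(c * c for c in v))
    if speed == 0.0:
        return x  # the geodesic with zero initial velocity is constant
    return tuple(math.cos(s * speed) * xc + math.sin(s * speed) * vc / speed
                 for xc, vc in zip(x, v))

def gamma(t):
    """An arc of the equator, t in [0, 1]."""
    return (math.cos(t), math.sin(t), 0.0)

def V(t):
    """A vector field along gamma, tangent to the sphere, zero at t = 0 and t = 1."""
    return (0.0, 0.0, t * (1.0 - t))

def sigma(s, t):
    """Variation obtained by following geodesics: sigma(s, t) = exp_{gamma(t)}(s V(t))."""
    return exp_sphere(gamma(t), V(t), s)

def delta_sigma(t, h=1e-5):
    """Central-difference approximation of d/ds sigma(s, t) at s = 0."""
    a, b = sigma(-h, t), sigma(h, t)
    return tuple((bi - ai) / (2 * h) for ai, bi in zip(a, b))

# Largest componentwise gap between delta sigma and V on a few sample times.
max_gap = max(
    abs(d - v)
    for t in (0.1, 0.5, 0.9)
    for d, v in zip(delta_sigma(t), V(t))
)
```

Since V vanishes at the endpoints, `sigma(s, 0.0)` and `sigma(s, 1.0)` return the endpoints of γ for every s, so the variation has fixed endpoints, as in the lemma.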

Definition: (Lagrangian, extremal) For a manifold M:
(i) a Lagrangian is a smooth function L on TM; (ii) an L-extremal is a C 2 -arc γ : [t 0 , t 1 ] → M such that for every fixed endpoint variation σ of γ. • As mentioned in the introductory remarks to this section, while L-extremals are connected to the minimisation of the function we shall not make use of this fact. We shall also not write the familiar coordinate expression for the Euler-Lagrange equations [Goldstein, Poole, Jr, and Safko 2001, Equation (2.18)] for extremals as this would violate our "no coordinates" rule for the paper. We shall be able to write intrinsic forms of these equations in the cases of interest. Nonetheless, in order to connect our language with standard terminology, we can say that a curve γ is a solution to the Euler-Lagrange equations for the Lagrangian L if it is an L-extremal.
8.2. Affine connections, distributions, and submanifolds. In the next section we will consider an abstract framework for mechanics, and in doing so we will get some benefit from seeing how affine connections, particularly the Levi-Civita connection associated with a Riemannian metric, interact with distributions and submanifolds. We give here a selfcontained treatment of the material we require from the paper of Lewis [1998].
We first work with a Riemannian manifold (M, G) with a smooth, locally constant rank distribution D ⊆ TM. We have well-defined orthogonal projections P D : TM → TM, P ⊥ D : TM → TM onto D and D ⊥ G , respectively. We then have the affine connection D ∇ on M given by thinking of G ∇ X P ⊥ D as a (1, 1)-tensor field on M, i.e., a section of the endomorphism bundle of TM. The following lemma provides some attributes of these constructions.

Lemma: (Affine connections and distributions)
The affine connection D ∇ has the following properties: Proof: (i) For vector fields X and Y we have For a D-valued vector field Y and any vector field X we have Combining the preceding two calculations gives P ⊥ D ( This gives the result.
We next apply this construction in a special case. We shall make such extensions without mention.
We must show that D ∇ is torsion-free and preserves the metric G 0 . We first show that D ∇ is torsion-free. Let X and Y be vector fields on M 0 . We have is D-valued if X and Y are D-valued. Matching the D- and D ⊥ G -components of (8.1), we obtain Therefore, which, when restricted to M 0 , shows that D ∇ preserves G 0 .
8.3. Mechanics on abstract Riemannian manifolds. In this section we consider a Riemannian manifold (M, G) and make mechanics-like constructions on it that we will subsequently use to connect the Newton-Euler equations to the Euler-Lagrange equations. The first construction we make provides a characterisation of geodesics as extremals of a certain Lagrangian, namely the "kinetic energy Lagrangian": The following theorem is, of course, well-known, but we provide a proof for the purposes of making the paper self-contained.
8.7 Theorem: (Geodesics as extremals) Let (M, G) be a Riemannian manifold. For a C 2 -arc γ, the following are equivalent: (i) γ is an L G -extremal; (ii) γ is a geodesic of the Levi-Civita connection G ∇.
Proof: Suppose that γ : [t 0 , t 1 ] → M is a C 2 -arc and let σ : [−r, r] × [t 0 , t 1 ] → M be a fixed endpoint variation of γ. We define Thus t → S σ (s, t) should be thought of as a vector field along the curve σ s , and t → T σ (s, t) should be thought of as the tangent vector field of the curve σ s . The following lemma records a useful property of these two tangent vector fields along the curve σ s .
Using the fact that mixed partial derivatives agree for C 2 -functions, the result now follows.
Note that, since vanishes for every fixed endpoint variation σ if and only if G ∇ γ (t) γ (t) = 0, by virtue of Lemma 8.3.
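A spot check of the theorem can be made on the unit sphere with coordinates (θ, φ) and metric dθ² + sin²θ dφ². The sketch below evaluates the kinetic energy action along the one-parameter family σ(s, t) = (θ 0 + s t(1 − t), t), t ∈ [0, 1], and differentiates it numerically at s = 0. For the equator (θ 0 = π/2, a geodesic) the derivative vanishes, while for the latitude circle θ 0 = π/4 (not a geodesic) a direct computation gives sin θ 0 cos θ 0 ∫ t(1 − t) dt = 1/12. The sphere, the particular variation, and the quadrature are our assumptions; a single variation only tests a necessary condition for being an extremal.

```python
import math

def action(theta0, s, n=2000):
    """Midpoint-rule value of the integral over [0, 1] of (1/2) G(sigma', sigma')
    for sigma(s, t) = (theta0 + s t (1 - t), t) in (theta, phi) coordinates."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        theta = theta0 + s * t * (1.0 - t)
        theta_dot = s * (1.0 - 2.0 * t)
        phi_dot = 1.0
        total += 0.5 * (theta_dot ** 2 + math.sin(theta) ** 2 * phi_dot ** 2)
    return total / n

def first_variation(theta0, h=1e-4):
    """Central-difference derivative of the action with respect to s at s = 0."""
    return (action(theta0, h) - action(theta0, -h)) / (2 * h)

equator = first_variation(math.pi / 2)   # geodesic: derivative should vanish
latitude = first_variation(math.pi / 4)  # not a geodesic: derivative near 1/12
```

The variation t(1 − t) vanishes at t = 0 and t = 1, so the family has fixed endpoints as required in the definition of an extremal.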
Next we work with an abstract setting for mechanics on a Riemannian manifold (M, G), which we think of as playing the rôle of (Q 0,free , G 0,B ) for an interconnected rigid body system. As well as this manifold, we assume the following data:
1. a manifold M 0 with an injective immersion κ : M 0 → M (playing the rôle of Q 0 );
2. the Riemannian metric G 0 = κ * G on M 0 (playing the rôle of G 0 );
3. the orthogonal projection P 0 : TM|M 0 → TM 0 ;
4. a time axis T and a time interval T ⊆ T;
5. a smooth mapping κ(x) M (playing the rôle of an external force);
6. the smooth mapping
The meaning of "playing the rôle of" will be made precise when we describe the Euler-Lagrange equations for an interconnected rigid body system. For now, we will just work with the above entities as abstractions of mechanical objects. We shall call the data (M, M 0 , G, T , F, D) an abstract mechanical system. Note that all other data above is determinable from this data. The associated restricted abstract mechanical system is (M 0 , G 0 , T , F 0 , D). Given the preceding abstract data, we make the following definitions.
is satisfied for $t \in T'$, and (ii) satisfies the restricted forced and constrained geodesic equations if $\gamma'(t) \in D_{\gamma(t)}$ for each $t \in T'$ and if there exists a smooth mapping $\lambda_0 : T' \to TM_0$ satisfying the following conditions: is satisfied for $t \in T'$. •
Next we explore two alternative characterisations of the forced and constrained geodesic equations.
Proof: (i) $\Longrightarrow$ (ii) If $\gamma$ satisfies the forced and constrained geodesic equations, let $\iota$ and $\lambda$ satisfying be the corresponding mappings. Since the forced and constrained geodesic equations hold from which it immediately follows that (ii) holds.
(ii) $\Longrightarrow$ (i) Suppose that $\gamma$ satisfies the abstract instantaneous Lagrange-d'Alembert Principle. For $t \in [t_0, t_1]$ we can write . We then define vector fields $\iota$ and $\lambda$ along $\kappa \circ \gamma$ by requiring $\iota(t)$ and $\lambda(t)$ to be the projections Taking the $G$-orthogonal component of this equation tangent to $TM_0$ gives where $P_I$ is the $G$-orthogonal projection onto $I$. Thus this part of the result follows after we note that
(ii) $\Longrightarrow$ (i) Let $\gamma$ satisfy the restricted forced and constrained geodesic equations, and let $\lambda : [t_0, t_1] \to TM_0$ be the corresponding mapping. We thus have Keeping in mind (8.4), this part of the result follows by taking $\iota(t) = -(\overset{G}{\nabla}_{\gamma'(t)} P_I^{\perp})(\gamma'(t))$.
We have been rather pedantic in the preceding development about the rôle of the inclusion κ, carefully not identifying M 0 with its image under κ. This is really necessary since κ(M 0 ) is only an immersed submanifold.
This turns out to be a convenient setting in which to discuss the existence and uniqueness of solutions to forced and constrained mechanics. Let us do this so as to resolve this issue in the cases where it has been resolved in the literature. In doing this, we see that we must assume that $D$ is smooth and of locally constant rank. We have the following result, essentially following from [Lewis 1998, Theorem 5.4].
8.11 Theorem: (An affine connection characterisation of the forced and constrained geodesic equations) Let $(M, M_0, G, T', F, D)$ be an abstract mechanical system. For a $C^2$-curve $\gamma : T' \to M_0$, the following statements are equivalent: (i) $\gamma$ satisfies the forced and constrained geodesic equations; (ii) $\gamma'(t_0) \in D_{\gamma(t_0)}$ for some $t_0 \in T'$ and $\gamma$ satisfies the equation (8.5). In particular, the forced and constrained geodesic equations possess uniquely determined solutions locally in time.
Proof: By Proposition 8.10 we shall work with the restricted forced and constrained geodesic equations.
It remains to show that $\gamma'(t) \in D_{\gamma(t)}$ for all $t \in T'$. That is to say, thought of as a curve in $TM_0$, we show that $\gamma'(t) \in D \subseteq TM_0$ if $\gamma'(t_0) \in D_{\gamma(t_0)}$ for some $t_0 \in T'$. To do this, let $Z$ denote the geodesic spray associated with the affine connection $\overset{D}{\nabla}$. Thus $Z$ is the second-order vector field on $TM_0$ for which the projections of integral curves to $M_0$ are geodesics of $\overset{D}{\nabla}$. We also denote $\mathrm{vlft}(P_D \circ G_0^{\sharp} \circ F_0(t, \gamma'(t))) = \frac{d}{ds}\big|_{s=0} (\gamma'(t) + s P_D \circ G_0^{\sharp} \circ F_0(t, \gamma'(t))) \in T_{\gamma'(t)} TM_0$.
We then note that we can write (8.5) in the form $\Upsilon'(t) = Z(\Upsilon(t)) - \mathrm{vlft}(P_D \circ G_0^{\sharp} \circ F_0(t, \Upsilon(t)))$ (8.6) for a curve $\Upsilon : T' \to TM_0$ [Bullo and Lewis 2004, Equation (4.25)]. We abbreviate $\gamma = \pi_{TM} \circ \Upsilon$. First note that, since it follows from the definition that $\mathrm{vlft}(P_D \circ G_0^{\sharp} \circ F_0(t, \Upsilon(t)))$ is tangent to $D$. By [Lewis 1998, Theorem 5.4] the tangency of $Z$ to $D$ will follow if we can show that $\overset{D}{\nabla}_Y Y$ is $D$-valued for every $D$-valued vector field $Y$. This, however, follows from Lemma 8.5(i). We thus conclude that any solution to (8.6) for which $\Upsilon(t_0) \in D$ for some $t_0 \in T'$ will have the property that $\Upsilon(t) \in D$ for all $t \in T'$. Therefore, solutions of (8.5) with $\gamma'(t_0) \in D_{\gamma(t_0)}$ for some $t_0 \in T'$ will satisfy $\gamma'(t) \in D_{\gamma(t)}$ for all $t \in T'$.
The final assertion of the theorem follows from the fact that the equation (8.6), whose solutions with initial conditions in D are exactly the solutions of the restricted forced and constrained geodesic equations, is just that for integral curves of a smooth time-varying vector field.
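To make the final step concrete, here is a hedged coordinate sketch of (8.6). It assumes natural coordinates $(x^i, v^i)$ on $TM_0$, writes $\Gamma^i_{jk}$ for the connection coefficients of the constrained connection, and abbreviates by $W(t, x, v)$ the tangent-vector-valued map appearing inside $\mathrm{vlft}$ in (8.6). The symbols $\Gamma^i_{jk}$ and $W$ are introduced here for illustration, the summation convention is used, and the sign of the forcing term is copied from (8.6) as printed.

```latex
\begin{align*}
Z(x,v) &= v^i \frac{\partial}{\partial x^i}
          - \Gamma^i_{jk}(x)\, v^j v^k \frac{\partial}{\partial v^i}, &
\mathrm{vlft}(W)(x,v) &= W^i \frac{\partial}{\partial v^i}, \\
\dot{x}^i &= v^i, &
\dot{v}^i &= -\Gamma^i_{jk}(x)\, v^j v^k - W^i(t, x, v).
\end{align*}
```

Under the smoothness assumptions on the data, the right-hand side is a smooth function of $(x, v)$ depending on $t$, so the standard existence and uniqueness theorem for ordinary differential equations applies locally in time.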
As mentioned in the introduction, when $D$ is not smooth or not of locally constant rank, the matter of existence and uniqueness of solutions to the forced and constrained geodesic equations that arise from mechanics (see Theorem 8.18 below) is a subject that requires further research.
8.4. Euler-Lagrange equations for a single, free, unforced rigid body. A key pivot point for our transition from Newton-Euler to Euler-Lagrange is the means by which this is done in the simplest possible case, that of a single rigid body, with no constraints or forces imposed upon it.
First let us characterise extremals for a particular Lagrangian on $T(\mathrm{Rgd}(B; S))$. We define $L_B : T(\mathrm{Rgd}(B; S)) \to \mathbb{R}$, $(\Phi, (A, v)) \mapsto \tfrac{1}{2} G_B((A, v), (A, v))$, i.e., the kinetic energy for the body. We then have the following result.
8.12 Lemma: (Solutions of the Newton-Euler equations as extremals) Let $S = (S, V, g, \theta)$ be a Newtonian space model, let $B = (B, U, G, \Theta)$ be a body reference space, and let $(B, \mu)$ be a rigid body in $B$. For a $C^2$-arc $\phi : [t_0, t_1] \to \mathrm{Rgd}(B; S)$, the following statements are equivalent: (i) $\phi$ is an $L_B$-extremal; (ii) $\phi$ satisfies the Newton-Euler equations.
We will adopt the notational convention that evaluation at $(s, t) = (0, t)$ will simply be abbreviated with evaluation at $t$. We also freely interchange $s$- and $t$-derivatives since all functions are of class $C^2$ and take values in finite-dimensional vector spaces. Let us make some preliminary computations. We define $\Xi_\sigma(t) = R_\sigma^T(t) \circ \delta R_\sigma(t)$, $X_\sigma(t) = R_\sigma^T(t)(\delta r_\sigma(t))$.
With these preliminary computations out of the way, we may now proceed with the proof.
Apart from brevity, this notation will show that certain of our arguments are rather general regarding Lie group actions on manifolds.
The right-action of $\mathrm{Rgd}(B)$ on $\mathrm{Rgd}(B; S)$ is then by matrix multiplication on the right. We shall write $\sigma(s, t) = \begin{pmatrix} R(s, t) & r(s, t) \\ 0 & 1 \end{pmatrix}$.
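As a small illustration of why this block form is convenient, the following sketch multiplies two such matrices and applies one to a point. It assumes that elements of the acting group are written in the same block form. The symbols $R_0$, $r_0$, and $x$ are placeholders introduced here and do not come from the text.

```latex
\[
\begin{pmatrix} R & r \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} R_0 & r_0 \\ 0 & 1 \end{pmatrix}
=
\begin{pmatrix} R R_0 & R r_0 + r \\ 0 & 1 \end{pmatrix},
\qquad
\begin{pmatrix} R & r \\ 0 & 1 \end{pmatrix}
\begin{pmatrix} x \\ 1 \end{pmatrix}
=
\begin{pmatrix} R x + r \\ 1 \end{pmatrix}.
\]
```

Multiplication on the right thus corresponds to first applying the motion $x \mapsto R_0 x + r_0$ and then the motion $x \mapsto R x + r$.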
Then, just as in the first part of the proof, $\frac{d}{ds}\big|_{s=0} \int_{t_0}^{t_1} L_{G_{0,B}}(\sigma_0(s, t))\,dt$, and the theorem follows from Theorem 8.7 and Lemma 8.12.
8.5. Lagrangian representations of torque-forces. A crucial element in our development of the Euler-Lagrange equations from the Newton-Euler equations is the precise understanding of how forces and torques in the Newton-Euler setting, i.e., those from Definition 6.11, are translated to a geometrically meaningful object on the configuration manifold of the system.
The key is the following result.
Proof: The existence and uniqueness of $F$ follow since the maps $(g(\tau_a, \omega_a) + g(f_a, v_a))$ are linear, and thus so is their composition. Thus the right-hand side of the expression defining $F$ in the statement of the lemma does indeed define a linear function on $T_\Phi Q_{\mathrm{free}}$.
A reference to Section 6.5 will lead the reader to note that our characterisation of the cotangent vector F is made by using the device of virtual power.
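A hedged sketch of what the virtual power characterisation looks like may help here. It assumes, as the proof above suggests, that the pairing of $F$ with a tangent vector $w$ is the sum over the $m$ bodies of the power of the torques and forces. The pairing bracket and the explicit sum are reconstructions for illustration, and $(\omega_a, v_a)$ denotes the angular and translational velocities of body $a$ associated with $w$.

```latex
\[
\langle F; w \rangle
  = \sum_{a=1}^{m} \big( g(\tau_a, \omega_a) + g(f_a, v_a) \big),
\qquad w \in T_\Phi Q_{\mathrm{free}}.
\]
```

Because $w \mapsto (\omega_a, v_a)$ is linear and $g$ is bilinear, the right-hand side is linear in $w$, which is what allows it to define a cotangent vector.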
Let us give names to the objects constructed in the lemma.
8.15 Definition: (Lagrangian force) Consider an interconnected rigid body system with $S$ a Newtonian space model, $B = (B_1, \dots, B_m)$ rigid bodies in body reference spaces $B_1, \dots, B_m$, and configuration manifold $Q$. Let $\tau \oplus f$ be a primary torque-force on $B$ at $\Phi \in Q$.
(i) The element $F \in T^*_\Phi Q_{\mathrm{free}}$ characterised in Lemma 8.14 is the Lagrangian force associated to $\tau \oplus f$. (ii) The element $F_0 \in T^*_{\pi_0(\Phi)} Q_{0,\mathrm{free}}$ characterised in Lemma 8.14 is the reduced Lagrangian force associated to $\tau \oplus f$. •
Note that a Lagrangian force and a reduced Lagrangian force reside in the cotangent space. This is geometrically consistent with the physical notion that a force is an object that does work on motions of a system.
We can also characterise interconnection and constraint torque-forces.
8.16 Lemma: (Lagrangian representations of interconnection and constraint forces) Consider an interconnected rigid body system with $S$ a Newtonian space model, $B = (B_1, \dots, B_m)$ rigid bodies in body reference spaces $B_1, \dots, B_m$, and configuration manifold $Q$. Suppose that we have velocity constraints giving rise to a smooth locally constant rank constraint distribution $D$. Let $\tau \oplus f$ be a primary torque-force on $B$ at $\Phi \in Q$ with $F$ and $F_0$ the associated Lagrangian force and reduced Lagrangian force, respectively. Then the following statements hold: (i) $\tau \oplus f$ is an interconnection torque-force if and only if $F \in \mathrm{ann}(T_\Phi Q)$; (ii) $\tau \oplus f$ is an interconnection torque-force if and only if $F_0 \in \mathrm{ann}(T_{\pi_0(\Phi)} Q_0)$; (iii) $\tau \oplus f$ is a $D$-constraint torque-force if and only if $F \in \mathrm{ann}(D_\Phi)$; (iv) $\tau \oplus f$ is a $D$-constraint torque-force if and only if $F_0 \in \mathrm{ann}(D_{0,\pi_0(\Phi)})$.
$(g(\tau_a, \omega_{\psi_a}(0)) + g(f_a, v_{\psi_a}(0))) = 0$, giving the result.
Of course, the preceding constructions are immediately generalised to the case when $\tau \oplus f$ is a torque-force field, and not just a primary torque-force. Thus, if $\tau \oplus f : T \times TQ \to V^m \oplus V^m$ is a torque-force field, then we define $F : T \times TQ \to T^*Q$ and $F_0 : T \times TQ_0 \to T^*Q_0$ by asking that $F(t, (\Phi, (A, v)))$ be the Lagrangian force and $F_0(t, (\Phi, (A, v)))$ be the reduced Lagrangian force associated to $\tau \oplus f(t, (\Phi, (A, v)))$, respectively. We call $F$ and $F_0$ the Lagrangian force field and the reduced Lagrangian force field, respectively.
8.6. Equivalence of Newton-Euler and Euler-Lagrange equations. We are now in a position to state the equivalence of the Newton-Euler and Euler-Lagrange equations, completing the path from the physical world of mechanics to the mathematical world of Riemannian geometry, and so to many of the mathematical representations of mechanics that comprise "geometric mechanics." We include variational characterisations in our list of equivalent characterisations of the solutions to the Newton-Euler equations. To this end, for a collection $B = (B_1, \dots, B_m)$ of rigid bodies, we define i.e., the kinetic energy of the free system. With this notation, we have the following result.