The geometric discretisation of the Suslov problem: a case study of consistency for nonholonomic integrators

Geometric integrators for nonholonomic systems were introduced by Cort\'es and Mart\'inez in [Nonholonomic integrators, Nonlinearity, 14, 2001] by proposing a discrete Lagrange-D'Alembert principle. Their approach is based on the definition of a discrete Lagrangian $L_d$ and a discrete constraint space $D_d$. There is no recipe to construct these objects and the performance of the integrator is sensitive to their choice. Cort\'es and Mart\'inez claim that choosing $L_d$ and $D_d$ in a consistent manner with respect to a finite difference map is necessary to guarantee an approximation of the continuous flow within a desired order of accuracy. Although this statement is given without proof, similar versions of it have appeared recently in the literature. We evaluate the importance of the consistency condition by comparing the performance of two different geometric integrators for the nonholonomic Suslov problem, only one of which corresponds to a consistent choice of $L_d$ and $D_d$. We prove that both integrators produce approximations of the same order, and, moreover, that the non-consistent discretisation outperforms the other in numerical experiments and in terms of energy preservation. Our results indicate that the consistency of a discretisation might not be the most relevant feature to consider in the construction of nonholonomic geometric integrators.


Introduction
An extension of the theory of variational integrators [19,17,15] to nonholonomic systems was proposed by Cortés and Martínez [4] by introducing a discrete version of the Lagrange-D'Alembert principle (DLA). Their approach requires the definition of two objects. The first of them is the discrete Lagrangian L d which is a real valued function on the Cartesian product Q × Q, where Q is the configuration space. The second object involved in their discretisation is the discrete constraint space D d which is a submanifold of Q × Q that is the discrete counterpart of the non-integrable distribution D ⊂ T Q defined by the nonholonomic constraints.
The discrete dynamics defined by the DLA algorithm are sensitive to the choice of the discrete Lagrangian L d and the discrete constraint space D d . While there is no recipe to construct these objects, Cortés and Martínez suggest that they should be constructed in a consistent manner with respect to a finite difference map (rigorous definitions for these concepts are given in §2. 3). More precisely, in Remark 3.1 of their article they state without proof: "To guarantee that the DLA algorithm approximates the continuous flow within a desired order of accuracy, one should select the discrete Lagrangian L d and the discrete constraint space D d in a consistent way". Similar statements appear in other sources like [6], [7]. The use of consistent discretisations is also suggested in [16], [12], [13].
The goal of this paper is to examine the importance of the consistency condition by considering in detail the performance of two different discretisations of the classical nonholonomic Suslov problem, only one of which is consistent. Recall that the Suslov problem [18] is a simple example of a nonholonomic system that can be realised physically and that exhibits some of the main features that distinguish nonholonomic from Hamiltonian systems, like the non-existence of a smooth invariant measure, and the presence of attracting and repelling periodic orbits on the energy level sets of the system.
Our analysis of the performance of the two integrators focuses on the calculation of their local truncation errors, on a discussion of their energy-preservation properties, and on the execution of numerical experiments.
Our results, explained in more detail below, indicate that the consistent discretisation does not perform better than the other in any of the aspects described above. Our research suggests that consistency is not the most relevant property to consider in order to construct geometric integrators for nonholonomic systems with an enhanced behaviour.
1.1. Two different geometric discretisations of the Suslov problem. The configuration space for the Suslov problem is the Lie group Q = G = SO (3), and both the Lagrangian and the constraints are invariant with respect to left multiplication on G, making the problem into a classical example of an LL-system. The reduced dynamics takes place on the reduced velocity phase space, which is the two-dimensional subspace d of the Lie algebra g = so(3) that defines the constraint distribution at the group identity, and is governed by the Euler-Poincaré-Suslov equations.
Both of the discretisations that we consider fall into the scheme proposed by Fedorov and Zenkov [6] in which the discrete objects L d and D d are chosen to be invariant with respect to the diagonal left multiplication of G on G × G. As a consequence, the discrete constraint space D d ⊂ G × G is determined by a discrete displacement subvariety S ⊂ G which is the discrete version of the reduced velocity phase space d ⊂ g. Similarly, the discrete Lagrangian L d is determined by a reduced discrete Lagrangian d : G → R. The discrete dynamics on D d drop to S and define discrete Euler-Poincaré-Suslov equations.
In our analysis we define the discrete displacement subvariety S as the image of d under the Cayley transform. This definition of S is reminiscent of the work in [6], where S is chosen as the image of d under the exponential map. In fact, as we explain in §4.2, both approaches to define S are equivalent since the images of d under the Cayley and the exponential map coincide on an open dense subset of SO(3) that contains the identity.
The advantage to consider the Cayley transform over the exponential map is that the inverse transformation can be explicitly written down as a rational map from SO(3) to so (3). Moreover, in our work, we interpret this inverse map as a (reduced) difference map ψ which we use to define a reduced discrete Lagrangian that we denote by (∞,ε) d which can be explicitly computed (here and in what follows ε > 0 is the time step). The discretisation of the Suslov problem resulting from the choice of S and (∞,ε) d is consistent with respect to ψ. An alternative choice of the (reduced) discrete Lagrangian is the one considered by Moser and Veselov in their celebrated discretisation of the free rigid body [17]. We denote this discrete Lagrangian by (1,ε) d . The discretisation of the Suslov problem resulting from the choice of S and (1,ε) d is a reparametrisation of the one considered by Fedorov and Zenkov in [6] and, as we show in Proposition 4.1, it is not consistent.

1.2.
Local truncation error of the approximation of the continuous flow. In order to compare the discrete and the continuous (reduced) flows on a common space, it is necessary to pass to the momentum phase space so(3) * . At the continuous level, this passage is defined by the inertia tensor of the body I which is interpreted as a linear isomorphism between so(3) and so(3) * . The image d * := I(d) is a two-dimensional subspace of so(3) * that is the reduced momentum phase space for the continuous Euler-Poincaré-Suslov equations.
At the discrete level one defines the discrete Legendre transformation F d : G → so(3) * as the right trivialisation of the derivative of d . The discrete Euler-Poincaré-Suslov dynamics in momentum variables takes place in the momentum locus u := F d (S) which is a nonlinear, possibly non-smooth, subvariety of so(3) * . Since the definition of the momentum locus involves the discrete reduced Lagrangian, the momentum loci u (1) ε and u (∞) ε , defined respectively by The two discretisations of the Suslov problem introduced above, give rise to discrete evolution maps B * (1,ε) and B * (∞,ε) defined respectively on u (1) ε and u (∞) ε that approximate the continuous Euler-Poincaré-Suslov equations on d * , provided that the time step ε is sufficiently small. We obtain asymptotic expansions for B * (1,ε) and B * (∞,ε) as ε → 0, that allow us to conclude that the local truncation error in both cases is of second order (Theorems 5.1 and 6.1).
1.3. Energy preservation. As reported in [6], the non-consistent discretisation, whose reduced constrained Lagrangian is , defines a multi-valued map that exactly preserves the energy of the system.
On the other hand, the consistent discretisation corresponding to (∞,ε) d defines a multivalued map that only preserves the energy of the system if a certain, non-generic, condition on the inertia tensor of the body holds.

Numerical experiments.
In §8 we present a series of work-precision diagrams that illustrate the convergence of both discrete systems to the continuous one as the time step ε → 0. The only advantage that results by working with the consistent discretisation is that the resulting algorithm is well-defined for larger values of ε. However, for a sufficiently small time step, the non-consistent discretisation appears to give a better approximation of the continuous flow.
1.5. Organisation of the paper. The paper is structured as follows: in §2 we quickly recall the necessary ingredients for our developments and we give the definitions of finite difference maps and of consistency of a discretisation. We also consider equivariant finite difference maps for systems on Lie groups and we give a characterisation of consistency in this special case. In §3 we review the main aspects of the Suslov problem and we give a Taylor expansion of its solutions that is later used to compute the local truncation error of the discretisations. In §4 we define the two different discretisations of the Suslov problem and show that only one of them is consistent. In §5 (respecively, §6) we give working formulae for the discrete algorithm of the non-compatible (respectively compatible) discretisation and show that the local truncation error in the approximation of the continuous flow is second order. In §7 we discuss the preservation of energy for both discretisations and in §8 we present some numerical results. The main conclusions of the paper are given in §9. Finally, we present an appendix that shows that the momentum loci defined by both discretisations are contained in the zero level sets of certain polynomials.

Preliminaries
2.1. Nonholonomic systems. A nonholonomic system on an n dimensional configuration manifold Q is determined by the triple (Q, D, L). Here D is a non-integrable constraint distribution over Q of constant rank that at each q ∈ Q defines an s-dimensional subspace D q ⊂ T q Q. A curve q(t) on Q satisfies the nonholonomic constraints ifq(t) ∈ D q(t) at all time t. The Lagrangian function L : T Q → R is of the form kinetic energy minus potential energy, L = T − V , where T defines a Riemannian metric on Q and V : Q → R.
The equations of motion are obtained via the Lagrange-D'Alembert principle of ideal constraints. In local coordinates, it leads to 1 d dt In the above equations µ α = µ α i dq i are any set of n − s independent one-forms on Q whose joint annihilator is D, and the scalars λ α , α = 1, ..., n − s, that specify the reaction forces, are sometimes called Lagrange multipliers and are uniquely determined by the constraints (1b). Since the constraints are ideal, the energy E = T + V is conserved along the flow. For more details we refer to [1,2].

Discrete nonholonomic systems.
Discretisations of the Lagrange-d'Alembert principle for nonholonomic systems have been introduced in [4,16] as an extension of variational integrators (see [15,17] and references therein for more details in this last topic). According to these references, the discretisation of the nonholonomic system (Q, D, L) requires the construction of a discrete Lagrangian and a discrete constraint space D d ⊂ Q×Q. In the following definition we emphasise the role of the time step which is important for our purposes.
and, moreover, for all q ∈ Q where the curve q(t) satisfies q(t k ) = q k , q(t k + ε) = q k+1 , see e.g. [15]. The dependence of L (ε) d on ε is not made explicit in [4], [6], [16] and [7]. 1 The sum convention over repeated indices is in use. (2) is required so that the discrete constraint space D d is a consistent approximation of the constraint distribution D. It is tacitly assumed in [6] and a particular case of it is mentioned in [16].

Remark 2.2. Condition
The discrete Lagrange-D'Alembert principle (DLA) defined by Cortés and Martínez in [4] yields the set of discrete nonholonomic equations Here, just as in (1), the independent one-forms µ α are an arbitrary basis of the annihilator of D. At each step, the multipliers λ (k) α appearing in (3a) are determined by the condition (3b). Under certain technical conditions, the above is a well defined algorithm for the discrete approximation of the solutions of (1); see [4] for details. For generalisations of discrete nonholonomic systems, see [10].

2.3.
Finite difference maps and consistency. The performance of a DLA nonholonomic integrator will depend on the choice of the pair (D d , L (ε) d ). A possibility to construct them is to use finite difference maps [16].
is a neighbourhood of the diagonal I d in Q × Q and T 0 Q denotes a neighbourhood of the zero section of T Q which satisfies the following (1) Ψ(I d ) is the zero section of T Q; (2) τ Q • Ψ(N 0 (I d )) = Q; and (3) along the diagonal I d we have where τ Q : T Q → Q is the bundle projection and π 1 , π 2 are the projections from Q × Q to Q.
A simple example of a finite difference map with time step ε if Q = R n is given by With a finite difference map at hand, one can define the discrete Lagrangian L that is the simplest quadrature approximation of the action t k +ε t k L(q(t),q(t)) dt. Following the discussion in [4], we make the following definition. In Remark 3.1 of [4] the authors claim that consistency is a necessary condition to guarantee approximation of the continuous flow to a desired level of accuracy. The main contribution of this paper is to examine the validity of this statement by treating in detail a concrete example. We will examine two different discretisations of the Suslov problem. Only one of them is consistent, but they both lead to discrete algorithms having local truncation errors of the same order. Our results indicate that the statement made in [4] is not entirely correct.
2.4. Nonholonomic LL systems. If the configuration space is a Lie group Q = G, and both the constraint distribution D and the Lagrangian L are invariant under left multiplication on G, then we speak of an LL system. In this case there is a subspace d of the Lie algebra g = T e G, where e is the group identity, such that D g = gd for all g ∈ G. The non-integrability of the constraints means that d is not a subalgebra.
By left invariance, the Lagrangian L : T G → R is of pure kinetic energy and is determined by its value at the identity. We have L(g,ġ) = L(e, g −1ġ ) := (ξ), where ξ = g −1ġ ∈ g. We call : g → R the reduced Lagrangian. It is defined by the inertia tensor I : g → g * that specifies the left invariant kinetic energy metric at the identity. We will label such LL system by the triple (G, d, ).
The reduction of the system by the left action of G leads to the Euler-Poincaré-Suslov (5b) Here M = I(ξ) and a α ∈ g * are independent vectors whose joint annihilator is d. As before, the multipliers λ α appearing in (5a) are uniquely determined by (5b). The above equations are consistent with the Lagrange-D'Alembert principle of ideal constraints and therefore preserve the energy of the system.

2.5.
Discrete nonholonomic LL systems. The discretisation of LL systems in accordance with the DLA algorithm was thoroughly considered by Fedorov and Zenkov [6]. The authors proposed a discretisation scheme under the natural assumptions that both the discrete Lagrangian L (ε) d : G × G → R and the discrete constraint space D d ⊂ G × G are invariant under the diagonal action of G on G × G by left multiplication. We briefly recall some of their results.
By invariance of L (ε) d , one can define a reduced discrete Lagrangian Similarly, by left invariance of D d there exists a discrete displacement subvariety S ⊂ G determined by the condition if and only if W k = g −1 k g k+1 ∈ S. Given that (g, g) ∈ D d for all g ∈ G, it follows that the identity element e ∈ S. Moreover, one can easily show, using (2), that T e S = d.
Therefore, under the above invariance conditions on (D d , L  (1) The discrete displacement subvariety S is a submanifold of G that contains the identity and T e S = d. In particular, dim(S) = dim(d).
Remark 2.3. In accordance with Remark 2.1, the reduced discrete Lagrangian is an approximation of the action: For W k ∈ S we define the associated discrete momentum 2 . One should understand F d as an approximation of the inertia tensor I : g → g * . Note that numerous complications arise in the discrete setting. Firstly, F (ε) d is locally invertible but in general will fail to be globally invertible (see [6] and the discussion in §5.3). Secondly, F (ε) d is a nonlinear map, in fact its domain is not even a linear space.
The discrete momentum locus is defined in [6] as The discrete momentum locus is a nonlinear subvariety of g * that approximates the linear constraint space d is a consistent approximation of the continuous reduced action, it will contain 0 ∈ g * and will be tangent to d * .
Fedorov and Zenkov [6] prove that the discrete nonholonomic equations (3) reduce to the discrete Euler-Poincaré-Suslov equations where W k ∈ S. As usual, the multipliers λ (k) α are determined by (7b). Since the discrete Legendre transform is only locally invertible, this scheme will generally be multivalued and some care should be taken in the choice of the branch to adequately approximate the solutions of (5) (see the discussion in [6]).
2.6. Left invariant finite difference maps. Given that the discretisation of LL systems proposed by Fedorov, Zenkov in [6] considers a discrete Lagrangian L d and a discrete constraint space D d that are invariant with respect to the diagonal action of G on G × G by left multiplication, it is natural to consider the construction of these objects using a finite difference map Ψ that is equivariant. Namely, one that satisfies for some g k (and hence all g k ∈ G) then Ψ induces a reduced difference map defined by where, as before, W k = g −1 k g k+1 . Note that ψ is defined for W k on a neighbourhood of the of the identity e ∈ G and maps into the Lie algebra. Moreover, we have ψ(e) = 0. We now state the following proposition whose proof is a direct consequence of the definitions.
Let Ψ be an equivariant difference map that satisfies (8). Then, the discretisation of the In practice, to construct an invariant consistent discretisation of an LL system one can start with a retraction map, that is a local diffeomorphism τ : g → G, with the property that τ (0) = e and T 0 τ = id g . The reduced finite difference map ψ can be taken as the local inverse of τ (defined in a neighbourhood N e of e ∈ G), and the discrete displacement subvariety as S := τ (d) ∩ N e . Examples of retraction maps are the exponential and the Cayley maps.

The Suslov problem
The Suslov problem, first introduced in [18], is a prototype example of a nonholonomic LL system in which the Lie group G = SO(3). Physically it models the motion of a rigid body under its own inertia subject to the nonholonomic constraint that one of the components of the angular velocity as seen in the body frame vanishes. Here we recall a series of known facts of the problem and we give a Taylor expression of its reduced solutions in order to later examine the performance of geometric integrators for the system.

3.1.
Euler-Poincaré-Suslov equations. Without loss of generality, we assume that the body frame has been chosen in such way that the nonholonomic constraint is ω 3 = 0, where ω ∈ R 3 denotes the angular velocity vector written the body frame. As usual, we interpret ω as an element of the Lie algebra g = R 3 equipped with the vector product ×. We have ω = g −1ġ , where g ∈ SO(3) is the attitude matrix of the body and ω is the skew-symmetric matrix that represents ω ∈ R 3 via the hat map, see (19) below.
The nonholonomic constraint defines a left-invariant non-integrable, rank 2 distribution D ⊂ T SO(3) determined at the identity by the linear subspace It is clear that d is not a subalgebra of (R 3 , ×). The reduced Lagrangian of the system is (ω) = 1 2 Iω, ω where I is the inertia tensor of the body and ·, · is the euclidean product in R 3 . Rotating the body frame about its third axis if necessary, we can assume without loss of generality that The Euler-Poincaré-Suslov equations (5) for the angular momentum vector M = Iω arė where λ is a multiplier that is uniquely determined by the condition that M ∈ d * := I(d), and e 3 = (0, 0, 1). Explicitly we have Therefore, the first two components of (12) becomė The above equations preserve the restriction of the energy 1 2 I −1 M, M to the constraint space d * . Explicitly, the first integral is Using this first integral, equations (14) can be integrated explicitly in terms of hyperbolic functions.

3.2.
Taylor expansion of the solutions. In order to evaluate the order of local truncation error of the geometric integrators for the Suslov problem to be defined ahead, we perform a Taylor expansion of the solutions of (14). By repeated differentiation of these equations we find , and 3.3. Expressions in so (3). For our treatment ahead it is useful to write the angular velocity ω, the angular momentum M and the reduced Lagrangian , in terms of skew-symmetric 3×3 matrices.
Recall (see e.g. [14]) that the hat map is the Lie algebra isomorphism : The hat map is also an isometry between R 3 with the standard Euclidean metric and so (3) equipped with the scalar product A short calculation shows that where the symmetric matrix J is given by Using the properties of the trace and the skew-symmetry of ω we can write In the following sections we will make an indistinctive treatment of vectors in R 3 and skewsymmetric matrices in so (3). There should be no risk of confusion and any kind ambiguities are resolved by the hat map.

Two different discretisations of the Suslov problem
We now define two different invariant discretisations of the Suslov problem that follow the prescription of Fedorov-Zenkov [6] described in §2.5. Recall that such discretisations are determined by a choice of discrete displacement subvariety S ⊂ SO(3) and a reduced discrete Lagrangian The choice of S for both discretisations is given in §4.1. The two different choices of the reduced discrete Lagrangian are respectively given in §4.2 and §4.3. We show that only the second discretisation is consistent in the sense of Definition 2.3.
Our construction will use the Cayley map (see e.g. [11]). For a skew-symmetric matrix The Cayley map is injective and for W ∈ range(Cay ε ) ⊂ SO(3) , we have By Euler's theorem (see e.g. [14]) we know that, except for the identity, any matrix in SO (3) is a rotation through an angle θ about a certain axis. Such matrix will have eigenvalues 1, e iθ , e −iθ . On the other hand, from the above formula we see that the range of Cay ε consists of matrices in SO(3) that do not have eigenvalue −1. Therefore, the range of Cay ε consists of those matrices whose angle of rotation θ lies strictly between −π and π.
4.1. The discrete constraint displacement subvariety. We define S ⊂ SO(3) by where d is given by (11).
Using (25) one can check that the condition T e S = d in Definition 2.4 is satisfied.

4.2.
Definition of the non-consistent discretisation. As first considered by Moser and Veselov in [17], we define the reduced discrete Lagrangian (1,ε) : Up to the addition of an irrelevant constant, it is obtained by formally putting ω = 1 ε (W − e) in (21). The superscript 1 in the notation indicates that 1 ε (W − e) is the first term in the expansion of (23) as a power series in W − e. Proof. Up to an irrelevant constant term we have In view of Proposition 2.1 and equation (10), the discretisation is consistent if there exists an open submanifold S of S = Cay ε (d) containing the identity, such that σ 1 (S ) ⊂ d. But this is not possible since for W ∈ S, In [6] Fedorov and Zenkov considered a geometric discretisation of the Suslov problem having discrete reduced Lagrangian given by (26) and with constraint displacement subvariety S := exp(d). Such choice of S consists of all matrices in SO(3) having axis of rotation perpendicular to e 3 . Therefore, S given by (24) is an open and dense subset of S that contains the identity. The complement S \ S consists of matrices in SO(3) that are a rotation by π about an axis that is perpendicular to e 3 . In particular, given that S and S coincide on a neighbourhood of the identity, we conclude that the geometric discretisation that we have introduced above is equivalent to the one considered by Fedorov and Zenkov.
Corollary 4.1. The geometric discretisation of the Suslov problem considered by Fedorov and Zenkov in [6] is not consistent in the sense of Definition 2.3.
It is remarkable that Fedorov and Zenkov [6] do not seem to notice that their discretisation is not consistent given that they explicitly mention the consistency condition at the end of Section 2 in their paper. Having chosen S = exp(d), a way to construct a consistent discretisation is to define where log is a local inverse of the exponential map in a neighbourhood of 0 ∈ so(3). The inverse Cayley map allows us to make an alternative consistent construction without dealing with the matrix logarithm. This is explained next.

4.3.
Definition of the consistent discretisation. Consider the reduced constrained Lagrangian where is given by (21). Explicitly, for W in the range of It is immediate to check that the geometric discretisation of the Suslov problem defined by S and (∞,ε) d is consistent. The role of ψ in (10) is played by Cay −1 ε . The ∞ superscript in the notation indicates that we use the exact value of Cay −1 ε in the definition of (∞,ε) d instead of a partial sum approximation in its expansion as a power series in W − e (compare with the definition of (1,ε) d ).
5. Analysis of the non-consistent discretisation (S, As mentioned before, this is a reparametrisation of the discretisation of the Suslov problem considered by Fedorov and Zenkov in [6]. We will determine the order of local truncation error of the approximation of the continuous flow by the resulting discrete Euler-Poincaré-Suslov equations. To avoid double subscripts, throughout this section we write ω 1 = u, ω 2 = v.

The momentum locus u
(1) ε . We identify so(3) * with so(3) via the inner product (20). The discrete Legendre transformation (6) associated to the discrete Lagrangian (1,ε) given by (26) is computed to be Using (25), we find that for W = Cay (u, v, 0) we have M = F (1,ε) d (W ) given by where we are using the hat map to identify so(3) * ∼ = so(3) with R 3 . By putting (u, v) = (0, 0) in (28) it is clear that 0 ∈ u (1) ε . Moreover, a direct calculation using this parametrisation shows that a normal vector to u (1) ε at the origin is (I 22 I 13 , I 11 I 23 , −I 11 I 22 ). This vector is readily seen to be the normal vector to the plane d * defined in (13). Therefore we have shown the following.   The figure illustrates that u (1) ε is bounded, which can be directly proved from (28). It is a complicated surface with pinch points and self intersections. In the Appendix we show that u (1) ε is contained in the zero locus of a degree 4 polynomial in M 1 , M 2 , M 3 . 5.2. The discrete Euler-Poincaré-Suslov equations. Now, we turn our attention towards the discrete Euler-Poincaré-Suslov equations (7), which in our case read where W k ∈ S and the multiplier λ k is determined by the condition that M k+1 ∈ u Using (25) and the hat map, the above identity between skew-symmetric 3 × 3 matrices is written in vector form as Taking into account (28), the first two components in the above equation yield Given (u k , v k ), in order to obtain (u k+1 , v k+1 ) one needs to solve the polynomial equations for u and v where p 2 , q 2 are degree two polynomials given by Here A and B depend on (u k , v k ) as .
In order to understand the number of solutions that (31) has, we compute the resultant of the polynomials p 2 , q 2 with respect to the variable u. This gives a polynomial of degree four in v whose leading coefficient is Under the assumption that the inertia tensor is not diagonal, for generic (u k , v k ), this is non-zero. Hence (31) generically admits 4 different (possibly complex) solutions for (u, v). Experiments show that if u k , v k are real, and ε > 0 is small enough, then two solutions are real and two are complex. This is consistent with the results reported by Fedorov-Zenkov in [6], and with the degree of the polynomial p 4 given in Proposition 9.1 in the Appendix.

5.3.
Approximation of the continuous flow. From the above discussion it follows that (30) generically defines a 4-valued map from C 2 to itself. We shall now explain how to select a branch to construct a (locally defined) single valued discrete time map 4 B (1,ε) on S that approximates the flow of the continuous Suslov problem on SO(3). A matrix W k ∈ S specifies a point (u k , v k ) ∈ R 2 through the inverse Cayley map (W k is represented by (25) replacing (ω 1 , ω 2 ) by (u k , v k )). Applying the implicit function theorem to equations (30) one can show that, for ε small enough, there exists a unique solution for (u k+1 , v k+1 ) that depends smoothly on u k , v k and ε and converges to (u k , v k ) when ε → 0. The point (u k+1 , v k+1 ) defines the matrix W k+1 := B (1,ε) (W k ) ∈ S via the Cayley map (25). The asymptotic expansion for (u k+1 , v k+1 ) in terms of (u k , v k ) is given by where It can be shown that the momentum locus u (1) ε ⊂ so(3) * has pinch points and selfintersections. This is suggested by its graph in Figure 1 and agrees with the discussion in Fedorov and Zenkov [6]. Away from these points, the inverse of the discrete Legendre transformation is defined and is smooth, so we can locally define the discrete time momentum mapping on u We will compute the order of local truncation error of the approximation of the continuous flow of (14) by the projection of B * (1,ε) onto the M 1 -M 2 plane. By the implicit function theorem, the equations (28a), (28b), can be inverted in a vicinity of (M 1 , M 2 ) = (0, 0) where we consider ε as a parameter. The asymptotic expansion for the inversion as ε → 0 is computed to be given by An asymptotic expansion for the projection of B * (1,ε) onto the M 1 -M 2 plane can now be obtained using (32), (35) and (37).
6.1. The momentum locus u (∞) ε . Upon the same considerations of §5.1, the discrete Legendre transformation (6) associated to the discrete Lagrangian (∞,ε) d given by (27) is computed to be where Using (40) and (25), we obtain the following expression for the components of M = Also in analogy with the discussion presented in §5, one can check that the momentum locus u (∞) ε contains the origin, where it is tangent to the plane d * defined in (13).  Notice that u (∞) ε appears to be unbounded. This can be directly proved using the parametrisation (41) and contrasts with the situation encountered in the study of the momentum locus defined by 6.2. Discrete Euler-Poincaré-Suslov equations. Using (25), (41a) and (41b), the components of the Euler-Poincaré-Suslov equations (29) that do not involve the multiplier λ k may be rewritten as a set of implicit equations to determine (u k+1 , v k+1 ) in terms of (u k , v k ). These equations are where p ± 3 and q ± 3 are the degree three polynomials In order to understand the number of solutions that (42)  . The asymptotic expansion for such (u k+1 , v k+1 ) in terms of (u k , v k ) as ε → 0 is given by Analogous to (34), we can locally define the discrete time momentum mapping on u We will now compute the order of local truncation error of the approximation of the continuous flow of (14) by the projection of B * (∞,ε) onto the M 1 -M 2 plane as we did with B * (1,ε) in §5.3. The inversion of equations (41a) and (41b) in a neighbourhood of (M 1 , M 2 ) = (0, 0) yields where F 1 , G 1 are given by (36), and where µ 1 , µ 2 , ν 1 , ν 2 , are given by (17) Expansions (45) are valid in a vicinity of (M 1 , M 2 ) = (0, 0) and for a small ε > 0.
Comparing (16) and (18), with (45) and (46), we conclude the following. Theorems 5.1 and 6.1 show the local truncation error for both discretisations is of the same order. This indicates that consistency of the discretisation of a nonholonomic geometric integrator is not the essential feature to guarantee an approximation of the continuous flow within with a desired order of accuracy. This seems to contradict the Remark 3.1 in [4] quoted in the introduction.

Discrete evolution of the energy
Here we discuss the energy preservation properties of both discretisations. 7.1. Non-consistent discretisation. It is shown by  that the discretisation of the Suslov problem defined by S and (1,ε) d exactly preserves the restriction of the constrained energy E c defined by (15) to the momentum locus u (1) ε . At the group level, this conservation law is formulated in the following.
Proposition 7.1. The discrete system (30) evolves on the level sets of the rational function Proof. A direct calculation shows that So, using (30), we get but the expression on the right hand side equals R(u k , v k ).
It is important to notice that all branches of the multi-valued discrete map defined by (30) possess this invariant.
The relationship between the rational function R and E c is established by noticing that, in view of (28a), (28b), the relation R(u k+1 , v k+1 ) = R(u k , v k ) can be rewritten as But, up to multiplication by a constant factor, I 22 M 2 1 + I 11 M 2 2 coincides with E c . The proposition above shows that the discrete map defined by (30) evolves on the algebraic curve where h = R(u 0 , v 0 ). The MAPLE package "algcurves" indicates that the compactification of C has genus 3 and is not hyperelliptic. 7.2. Consistent discretisation. The consistent discretisation defined by S and (∞,ε) d only preserves the constrained energy E c defined by (15) if I 11 = I 22 . One can show that this condition implies that the axis of forbidden rotations of the body e 3 is contained in an inertia eigen-plane. 5 At the group level, we have the following. (1) If I 11 = I 22 then the discrete system (42) evolves on the level sets of Q.
(2) If I 11 = I 22 then for successive points defined by (42), we have Proof. A direct calculation shows that , v k+1 ), and also It follows from (42) that Note that this proposition applies to all branches of the multi-valued discrete map defined by (42). In particular, if I 11 = I 22 , all 5 branches preserve Q. Given that in this case e 3 lies on an inertia eigen-plane, by performing a rotation of the body frame about e 3 , we may assume that I 13 vanishes and Q takes the simplified form Hence, the multi-valued map (42) evolves on the algebraic curve where the constant h = Q(u 0 , v 0 ). According to the MAPLE package "algcurves" the compactification of C has genus 4 and is not hyperelliptic. The relationship between the polynomial Q and the constrained energy E c given by (15) is established by noting that in view of (41a) and (41b) we have which, up to multiplication by a constant factor, coincides with E c .
Remark 7.1. If I 11 = I 22 , the proposition states that the approximation of E c by the discrete flow is O(ε 3 ) which is considerably good. One could imagine that in this case there still exists an energy-like first integral of the discrete system, for instance an O(ε)-perturbation of E c . We were unable to prove or disprove this possibility that was brought to our attention by one of the anonymous referees of the paper.
Remark 7.2. Note that in the energy-preserving cases mentioned above, the discrete flow at a fixed energy value defines a multi-valued map on an algebraic curve. Such curve has symmetries and is a covering of another curve with lower genus. As future work, it is interesting to investigate if an exact analytical expression for the n th iterate of these maps can be obtained in any of these curves.

Numerical simulations
In this section we compare the performance of the integrators B * (1,ε) and B * (∞,ε) by conducting numerical experiments. We consider two different choices of the inertia tensor. The first one is generic while the second one satisfies the condition I 11 = I 22 that ensures that B * (∞,ε) exactly preserves E c . In both cases we observe that the approximation by the non-consistent integrator B * (1,ε) is slightly better. 8.1. A generic inertia tensor. Throughout this section we suppose that the inertia tensor has I 11 = 3, I 22 = 4, I 33 = 5, I 13 = 1, I 23 = 1 2 .
Our numerical experiments show that the momentum integrator B * (1,ε) corresponding to the non-compatible discretisation defined by S and (1,ε) d approximates (M 1 (t), M 2 (t)) for 0 ≤ t ≤ 1 if the the time step ε ≤ 0.3. For ε ≥ 0.4 we cannot generate sufficient iterations for the approximation. The problem arises since the map (F (1,ε) d ) −1 that appears in (34) only exists locally. We did not find this type of restriction on the time step for the momentum integrator B * (∞,ε) corresponding to the consistent discretisation. In Figure 3 below we show approximations of (M 1 (t), M 2 (t)) by both integrators for ε = 0.015, 0.030. On the insets we show respectively the approximation of the constrained energy E c and the signed distance ρ to the constraint subspace d * given by (13) at each time step. For the latter graph it is necessary to compute the iterates of all components of the vector M . The graph on the left shows the approximation of M 1 and the evolution of the energy E c (inset). The graph on the right shows the approximation of M 2 and the evolution of the signed distance ρ to the constraint subspace d * (inset).  (14) having initial condition M 1 (0) = 179.9836568, M 2 (0) = 2.4255507998, Once again, the choice of initial condition is such that the solution will evolve from one asymptotic state to another over the time interval 0 ≤ t ≤ 1.
This time the non-consistent momentum integrator B * (1,ε) has problems if ε ≥ 0.018. Figure  4 below is analogous to Figure 3 in the previous section for ε = 0.007, 0.014. Remark 8.1. Note that the numerical experiments show that the evolution of the nonconsistent discrete map B * (1,ε) is slower than the continuous dynamics while the evolution of the consistent discrete map B * (∞,ε) is faster. We have no explanation for this phenomenon.

Conclusions
To our knowledge, this work is the first in which the usefulness of the consistency condition of a discretisation of a nonholonomic system is explored. Our results indicate that consistency may not be the essential property to consider in order to construct nonholonomic integrators. Indeed, for the specific discretisations of the Suslov problem that we considered, we have shown that (1) Both the consistent and the non-consistent discretisations approximate the continuous flow of the system with local truncation errors of the same order. (2) The non-consistent discretisation preserves the energy of the continuous system for an arbitrary inertia tensor. On the other hand, the consistent discretisation only preserves the energy of the continuous system for a family of rigid bodies whose inertia tensor has a specific type of symmetry, and it is unclear if an energy-like first integral exists in the general case. (3) Our numerical experiments indicate that the non-consistent discretisation gives rise to an integrator that performs better than the consistent one for the same value of the time step.