The Toda lattice, old and new

Originally a model for wave propagation on the line, the Toda lattice is a wonderful case study in mechanics and symplectic geometry. In Flaschka's variables, it becomes an evolution given by a Lax pair on the vector space of real, symmetric, tridiagonal matrices. Its very special asymptotic behavior was studied by Moser by introducing norming constants, which play the role of discrete inverse variables in analogy to the solution by inverse scattering of KdV. It is a completely integrable system on the coadjoint orbit of the upper triangular group. Recently, bidiagonal coordinates, which parameterize also non-Jacobi tridiagonal matrices, were used to reduce asymptotic questions to local theory. Larger phase spaces for the Toda lattice lead to the study of isospectral manifolds and different coadjoint orbits. Additionally, the time one map of the associated flow is computed by a familiar algorithm in numerical linear algebra. The text is mostly expositive and self contained, presenting alternative formulations of familiar results and applications to numerical analysis.


Introduction
The Toda lattice is a beautiful mathematical object: mathematical miracles and serendipity are abundant. Somehow, everybody has something to say about it, and this extends way beyond mathematicians: the connections with physics and numerical analysis are fruitful and clarifying. Moreover, the formalism is versatile, accomodating a number of interesting dynamical systems.
In this text, we present some of these many aspects of the Toda lattice. After more than thirty years of intriguing discoveries, the subject is not exhausted even in its most elementary formulation. We aim at concrete examples, and provide occasional pointers to more abstract approaches.
In Section 2, the differential equation is presented as the physical system introduced by Toda and converted into an evolution on Jacobi matrices by Flaschka's remarkable change of variables. The two basic properties are discussed: the conservation of eigenvalues and the very simple asymptotic behavior. A comparison with the archetypical Lax pair, the Korteweg-de Vries equation, motivates the introduction of Moser's inverse variables on Jacobi matrices, given by eigenvalues and (discrete) norming constants.
The change of variables taking the original mechanical system in R 2n to the set of Jacobi matrices with a fixed trace J 0 leads to a brief description of the Hamiltonian formalism. After a brief review of completely integrable systems and coadjoint orbits as symplectic spaces, we interpret the Toda equation as a Hamiltonian system on coadjoint orbits (in particular J 0 ), as done originally by Adler and Kostant. We then consider larger phase spaces: generic coadjoint orbits and isospectral manifolds of tridiagonal matrices. Section 4 is another geometric approach to the Toda lattice, which might have anticipated its discovery by decades. It leads very naturally to the solutions by factorization by Symes. Section 5 is dedicated to an interpolation theorem which connects the subject to numerical analysis: one Hamiltonian in the completely integrable collection associated to the Toda equation gives rise to a flow which, at integer times, consists of matrices obtained by the so called QR algorithm, extensively used in algorithms to compute eigenvalues.
Norming constants have a drawback: they do not extend to the boundary of the set of Jacobi matrices, which is where the fine asymptotic properties of the flows occur. Section 6 describes bidiagonal coordinates, which provide charts for the full isospectral manifold of tridiagonal matrices. A brief interlude on QR algorithms with shifts is provided as an example of the versatility of this new instrument.
As shown above, the content of this review is biased. Indeed, a short text cannot provide coverage for all the ramifications of the subject. As a minimal list of alternatives, the curious reader is invited to consider [35] and [38] for algebraic aspects related to integrability, [31] for uses of symplectic geometry to the analysis of (variations of) the original Toda system, [23] for an essentially orthogonal overview and [37] for a relativistic mutation, one of many interesting physical directions from our starting point.

Physical origins and Flaschka's variables
In the mid sixties, the Japanese physicist M. Toda [42] proposed a model for wave propagation along n particles in a line by the Hamiltonian exp(x k − x k+1 ).
As usual, positions x k and velocities y k vary aṡ Here, x 0 = −∞, x n+1 = +∞ -this is the non-periodic case of the Toda lattice. The particles are labeled and from a mathematical viewpoint, there is nothing wrong with occasional collisions: they simply pass each other. Clearly, the energy H(x(t), y(t)) is a conserved quantity. The invariance under translation in configuration space (i.e., the fact that H(x, y) = H(x + c, y), for any fixed c ∈ R n ) implies that the center of mass of the system moves uniformly (i.e., the linear momentum of the system is preserved).
Following Flaschka [16], shift to center of mass coordinates (i.e., work with differences x k − x k+1 ), get rid of exponentials and adjust constants, a k = − y k 2 , k = 1, . . . , n, b k = 1 2 e (x k −x k+1 )/2 , k = 1, . . . , n − 1 and the evolutions for x(t) and y(t) become which, surprisingly, can be cast as the matrix differential equatioṅ Here, J is a Jacobi matrix with diagonal entries a k and principal off-diagonal entries b k . A Jacobi matrix J is a real n × n symmetric matrix which is tridiagonal (i.e., J ij = 0 if |i − j| > 1) and such that the principal off-diagonal entries (those for which |i − j| = 1) is a positive number. The matrix Π sk M is skew-symmetric, with the same lower triangular part as M .
There is nothing wrong in considering equation (J) for arbitrary real symmetric matrices (and even non-symmetric matrices). Explicitely, consider the differential equation on real symmetric matrices S(t) given by

Lax pairs, asymptotic behavior
We present two fundamental facts. The first one is a question about differential equations. What kind of differential equation on matrices gives rise to an evolution M (t) which preserves the eigenvalues? More precisely, what kind of vector field gives rise to a flow of the form M (t) = (P (t)) −1 M (0)P (t)? Simply take derivatives, In words, equations given by Lax pairs M ′ (t) = [M (t), X(t)] are solved by conjugating the initial condition. In particular, if the initial condition is a real, symmetric matrix S(0), and we require P to be an orthogonal matrix (so that symmetry, together with spectrum, is preserved along the orbit S(t)), one should consider a Lax pair of the form S ′ (t) = [S(t), A(t)], where A(t) is a real, skew symmetric matrix (this follows from the standard computation -take the derivative at 0 of a curve of orthogonal matrices Q(t) with Q(0) = I).
Proposition 1 The solution S(t) of (S) starting from a real, symmetric matrix S(0) is of the form S(t) = Q(t) * S(0)Q(t): it is well defined for all t ∈ R.
An explicit form of Q(t) is given in Section 4.2.
Proof: We are left with showing global existence: the matrix norm ||S|| 2 = tr S T S is conserved along an orbit.
We now consider the asymptotic behavior for S(t) and J(t), the orbits starting from arbitrary symmetric and Jacobi matrices. In the Jacobi case, we may get information from the physical interpretation: informally, minimizing the potential energy (the second term in the Hamiltonian) leads to the spreading of the particles, so that one might expect x k −x k+1 → −∞. Thus, for t → ∞, particles should be essentially independent from each other and undertake uniform motion. In Flaschka's variables, the orbit J(t) should converge to a diagonal matrix.

Proposition 2
The limits for t → ±∞ of S(t) are diagonal matrices. When t → ∞ (resp. t → −∞), J(t) converges to a diagonal matrix with strictly decreasing (resp. increasing) entries along the diagonal.
Proof: The differential equations for the diagonal entries of S(t) are given by , and, in general, S ′ kk = − j<k S 2 kj + j>k S 2 kj . The equation for S ′ 11 indicates that it is nondecreasing in time. The same is true not for S ′ 22 but for S ′ 11 + S ′ 22 , and in general the partial traces k j=1 S jj are nondecreasing. Since the norm of S(t) is constant, all entries S ij and the partial traces (and their derivatives) are uniformly bounded. In particular, the derivatives of the partial traces are integrable, Lipschitz functions on R and hence all entries S ij , i = j must go to zero. This implies the diagonal convergence of S(t) (and hence of J(t)) at ±∞.
are always positive (indeed, if b k = 0 at some time, it is always zero). When t → ∞, we must then have a k+1 − a k < 0, so that the diagonal limit matrix has entries in strictly decreasing order (the fact that the eigenvalues of a Jacobi matrix are all distinct is proved in Lemma 2.1). A similar argument obtains the result for t → −∞.
As a side remark, notice that the above proposition implies the spectral theorem (only in finite dimensions: the presence of continuous spectrum brings up new issues to the asymptotics of the Toda flow [11]). The interested reader might enjoy a proof of the Wielandt-Hoffman theorem along similar lines ([13]).

Scattering and inverse variables
is equivalent to the operator evolution where Lax then reproved the seminal discovery of Gardner, Greene, Kruskal and Miura ( [19]) that the the evolution of KdV varies the operator L preserving its eigenvalues. Analogous computations replacing L by higher order operators lead to the Gelfand-Dickey flows, which include the Boussinesq equation and more ( [5]).
The connection between differential equations like KdV and the inverse scattering method discovered in the late sixties ( [19], [3]) led to intense research on integrable systems. Originally as a formalism, inverse data provided (infinite dimensional) action-angle variables for the KdV equation.
Inverse variables for Jacobi matrices were pervasive in the early approaches to the Toda flows, starting with Moser ( [32]). The theorem below, that he attributed to Stieltjes, provides inverse variables for the Toda equation (J) ( [36], [21]).
More precisely, the diffeomorphism takes J to its ordered eigenvalues and to the first coordinates of its associated eigenvectors, normalized so as to be positive.
Parlett made the intriguing observation that this algorithm is used by numerical analysts to obtain a tridiagonal matrix from a full symmetric matrix, not from a diagonal matrix.
Proof: We sketch a procedure to invert this map presented in [12]. Define the matrix Λ and the vector v as and consider the sequence of vectors v, Λv, . . . , Λ n−1 v. The vectors q 1 = v, q 2 , . . . , q n obtained from this sequence by applying the Gram-Schmidt method are the columns of an orthogonal matrix Q for which J = Q T Λ Q.
The entries of the vector v are the norming constants associated to J. Still in analogy with KdV, as J(t) solves equation (J), the eigenvalues stay put, as Flaschka knew, and Moser showed that the norming constants v(t) varied in a simple fashion: simply normalize (under the Euclidean norm) the vector exp(tΛ) v(0).
From Proposition 2, at t = ±∞ Jacobi orbits J(t) converge to diagonal matrices with ordered eigenvalues along the diagonal entries. Using norming constants, Moser computed the scattering map of the Toda flow. Particles group in pairs with the same asymptotic velocity at extreme times: the quantity of interest is the shift, the distance between the two straight lines tangent to the asymptotic motion at ±∞ of particles in the same pair.
Norming constants, alas, do not parameterize the limit matrices of the flows. Actually, they degenerate on a large part of the boundary of the set of Jacobi matrices. This will be circumvented by the bidiagonal coordinates in section 6.
There is a basic issue which has not been handled carefully so far. Equation (J), given in terms of skew symmetric matrices A(t), was shown to imply that the solution is an orthogonal conjugation of the initial condition J(0) -thus, in particular, J(t) is always real, symmetric, and eigenvalues are preserved. But why should the evolution preserve the tridiagonal form? This has to happen, if Flaschka's change of variable preserves the physical meaning of the variables. Also, this fits with the solution by inverse variables of (J). A more conceptual argument showing that J(t) is always a Jacobi matrix will be presented in the next section. Yet another argument will come up in Section 4.2.

Some symplectic geometry
From its physical description in terms of positions x i and velocities y i , it is clear the Toda lattice admits a Hamiltonian formulation in R 2n . By Flaschka's change of variables, after removal of the (trivial) evolution of the center of mass, the phase space for the differential equation becomes the set of Jacobi matrices with trace equal to zero, a cone of dimension 2n − 2, from Theorem 2.2. Somehow, one should be able to transfer the original Hamiltonian formulation to the new variables, and still proceed with the study of the Toda lattice as a problem in mechanics within the new phase. This indicates a more general context, the starting point of a vast field, symplectic geometry. In the next subsection, we outline some basic requisites for this project.

Complete integrability
We start with a brief description of a more general definition of Hamiltonian formulation of a vector field. Take a differential equation with phase space M , or more precisely, consider the associated vector field Z defined on the tangent bundle T M . First equip M with a closed, nondegenerate, 2-form ω: the pair (M, ω) is a symplectic manifold, necessarily of even dimension, say 2n (excellent sources for symplectic geometry are [4], [20]).
Each Hamiltonian H : M → R gives rise to a vector field X H as follows: the contraction of ω with X H should obtain the 1-form dH. Said differently, for every The vector field Z admits a Hamiltonian formulation if Z = X H for an appropriate choice of ω and H.
The simplest example is the standard 2 and, as is well known, When is a Hamiltonian vector field X H completely integrable? Complete integrability requires n commuting Hamiltonians H k : M → R (i.e., their induced vector fields X k = X H k commute, or equivalently, such that {H i , H j } = 0 for the Poisson bracket induced by the 2-form ω, {H i , H j } = ω(X Hi , X Hj ) ) among themselves and with H itself. Finally, the Hamiltonians H k should be functionally independent (i.e., their gradients should be linearly independent on a dense set of M ).
Very few dynamical systems are completely integrable, but these are especially important for being situations in which explicit computations may be performed. Indeed, one can make precise the idea that a generic Hamiltonian vector field does not have a second conserved quantity (recall that H itself is conserved along orbits). But this is just the opposite of what we expect from certain iterations in numerical analysis: to compute eigenvalues, for example, we expect to change something (an original matrix, an approximation of an eigenvector) without varying the objects being computed (the eigenvalues themselves).
The Liouville-Arnold-Jost theorem states that, under appropriate hypothesis, the phase space of completely integrable equations foliates into invariant tori (i.e., products of lines and circles), given by levels of the conserved quantities H k (frequently called the action variables in this context). In each torus, in angle variables, the evolution is just straight line motion. The angles vary smoothly at neighboring tori, and the global dynamics is mostly dependent on the arithmetic properties of the angular velocities.

Toda flows in coadjoint orbits
Using Flaschka's change of variables, one might push forward the standard 2-form in R 2n to J 0 , the cone of Jacobi matrices with zero trace, thus converting it into a symplectic manifold. The surprising fact is that the resulting 2-form comes up from another construction of great interest, which we now describe briefly.
A large class of symplectic manifolds is given by coadjoint orbits, equipped with the Lie-Kirillov-Poisson 2-form. Let G be a (finite dimensional) Lie group, G its Lie algebra, and identify G * , the dual of the Lie algebra, by means of a nondegenerate coupling The coupling is bilinear and nondegeneracy means that the restrictions α, . and ., A , for α, A = 0, give rise to nonzero functionals respectively on G * and G. Thus, all linear functionals in G are of the form A → α, A for some α ∈ G * .
For the Toda flow, start with G = U + , the group of n × n upper triangular real matrices with positive diagonal entries. Then G = U is the vector space of real upper triangular matrices and we may identify G * with S, the vector space of real, symmetric matrices, through the pairing S, A = tr SA.
The group G acts on itself by conjugation and on its Lie algebra G by its derivative at the origin, the adjoint action. For G = U + , the adjoint action is given by Back to the Toda context, The bad news is that gαg −1 is not a symmetric matrix. Denote by sU the vector space of real, strictly upper triangular matrices. Clearly, for E ∈ sU and A ∈ U we have tr EA = 0. Consider the (unique) splitting We clearly have from which we finally obtain Ad * g α = Π S (gαg −1 ). By definition, M and Π S M have the same lower triangular part (and this includes the diagonal). In particular, as observed by Adler [1] and Kostant [22], Jacobi matrices with fixed trace form a coadjoint orbit.
We now recall the celebrated Lie-Kirillov-Poisson 2-form on a coadjoint orbit. Let O α be the coadjoint orbit through α. Any vector a in the tangent space of O α at α is the derivative at zero of Ad * exp(tA) α fro some A in the Lie algebra G. Let A and B in G give rise to tangent vectors a and b. Set The minus sign is innocuous, but there is so much to prove here. First, it is not clear that ω is well defined: other elements in G might give rise to the same tangent vectors at α. More, ω has to be proven nondegenerate and closed. We will take all those issues for granted.
Instead, we continue with the computations related to the Toda flow. In this case, the curve Ad * exp(tA) α = Π S e tA αe −tA has the tangent vector Π S [A, S] at α. The 2-form is given by A simple computation shows that the initial Hamiltonian H(x, y) for the Toda lattice in physical variables converts to H(S) = tr S 2 /2 in Flaschka' s variables.
Proposition 3 The Toda lattice is the vector field X H associated to the Hamiltonian H(S) = tr S 2 /2 defined on a coadjoint orbit O S . Proof: We search for a vector field X H = Π S [U, S] for which, given any vector field Y = Π S [V, S], we must have Let A be the vector space (Lie algebra!) of real, skew symmetric matrices. We consider another splitting, and make use of the orthogonalities A ⊥ S and U ⊥ sU: We are thus led to a very geometric explanation for the fact that the solution of the equation A similar computation obtains the commutativity of the Hamiltonians H k (J) = tr J k , k = 1, . . . , n). Indeed, the computation above generalizes to yield the Adler- From Theorem 2.2, the related invariant Liouville-Arnold tori, given by sets of n × n Jacobi matrices with fixed spectrum, is diffeomorphic to the set of possible choices for norming constants -the positive octant of the unit sphere in R n−1 .

Toda flows in larger phase spaces
From the computations above, the same Hamiltonian H = tr S 2 /2 induces equation (S) on larger coadjoint orbits and the proof of complete integrability on these phase spaces requires many more conserved quantities. The problem was considered in two cases. In [8], the authors consider the orbits of maximal dimension given by 2[n 2 /4] inS. The new commuting Hamiltonians H(S) are obtained by chopping: they are the (symmetric functions of the) roots of the determinants of the matrices obtained by removing the first k rows and last k columns of the matrix S − λI. In analogy to Moser's computations in the Jacobi orbit, angle variables are essentially the first components of (generalized) eigenvectors.
The generic phase space for real, nonsymmetric matrices is handled in [9]. First, one needs to interpret the Toda equation as a Hamiltonian on an appropriate coadjoint orbit of dimension n 2 − n of a Lie group given by a semidirect product. At a matrix M , the new commuting Hamiltonians are the coefficients of the polynomial p(z, λ) = det(M − zM T − λ), so a Riemann surface p(z, λ) = 0 is associated to M and is invariant under the Toda flow. The existence of such additional structure has been known since the first studies of the periodic Toda flow, where the particles move on a circle instead of in the line ( [45], [24], [25]). For nonsymmetric matrices, the additional angle variables are obtained from an extension of the Abel-Jacobi map, by integrating a set of explicit meromorphic 1-forms on the surface along specific divisors, related again by generalized eigenvectors.

The isospectral manifold of tridiagonal matrices
The set J Λ of Jacobi matrices with a given simple spectrum There is a natural enlargement of this set -define T Λ , the set of real, symmetric tridiagonal matrices with spectrum Λ. This set is actually a compact manifold ( [43]). The first step in the understanding of T Λ is taking the closure (within the space of real,symmetric matrices) of J Λ . It turns out thatJ Λ has an interesting combinatorial structure, which we now describe.
For matrix dimensions n = 3 and n = 4 (and any choices of different eigenvalues) this set is homeomorphic to the polytopes in the figure. The vertices correspond to the n! diagonal matrices with the same spectrum as Λ. For n = 3, the diagram consists of matrices with eigenvalues 2, 4 and 8. Vertices are diagonal matrices, in a self-explanatory notation. Edges correspond to matrices having a single zero in the main off-diagonal entries. Thus the top edge consists of matrices with (1, 2) entry equal to zero; the eigenvalue 8 is trapped at entry (1, 1) and the bottom 2 × 2 block consists of Jacobi matrices with eigenvalues 4 and 2, whose closure is homeomorphic to the whole edge. Edges are invariant under the Toda flow, and arrows indicate the sense of the flow. All interior points have the same α and ω limits.
For n = 4 the boundary still consists of points with zero off-diagonal entries, which split the matrix in two kinds of sets, eight of which are homeomorphic to the 3 × 3 case (hexagons), and six which are homeomorphic to the product of two 2 × 2 blocks with fixed spectrum, the quadrilaterals. Again, the curves represent some orbits of the Toda flow. For the general case ( [43]), [6]), define the permutohedron P Λ = conv π∈Sn {(λ π(1) , . . . , λ π(n) )}.
The existence of this homeomorphism in [43]   the identification of the homeomorphism as a momentum map, given by the formula above. A simple proof was later presented in [31].
FromJ Λ to the full isospectral manifold T Λ , it's a game of mirrors (or, more precisely, the construction of an appropriate Coxeter group ([43])). In a nutshell, dropping the signs of the off-diagonal entries of a real, symmetric tridiagonal matrix (preserving its symmetry!) does not change the eigenvalues -this is something that numerical analysts use systematically: to compute eigenvalues of a matrix in T Λ , it suffices to handle Jacobi matrices. In particular, the sets of matrices in T Λ with nonzero entries split into 2 n−1 connected components, all isomorphic to J Λ . To get T Λ , one takes the closure of these components and glues them along faces which are naturally identified. For n = 3 and Λ = {7, 5, 4}, Figure 2 shows some edges already glued. Edges along the boundary have to be identified: boundary vertices must be the same, together with a sign (which?). The resulting bitorus is drawn in Figure 3 so as to emphasize the boundaries of the hexagons (more about this picture on Section 6). The universal cover of T Λ is R n−1 and the homology groups are torsion free ( [43], [17]). The sum of the partial traces, F (S) = i (n − i + 1)s ii is a perfect Morse function: the equilibria are the diagonal matrices, and the Betti numbers are obtained by counting how many such matrices there are with a given signature.

A missed opportunity
We present another trail ( [31]) leading to the Toda equation: it might have anticipated the study of these equations by fifty years. We

Commuting flows in I M
The notation ψ(f, M ) was introduced to suggest that we are close to a flow on I M .
Proof: It suffices to show that Now, from the uniqueness of the QR decomposition, for X with det X > 0, Using that f (P XP −1 ) = P f (X)P −1 and the first equality just above, where e tf (M(0)) = Q(t) R(t), so that, dropping time dependence, To get a differential equation, eliminate M (0) = Q(t)M (t)(Q(t)) T : Since Q(t) is a curve of orthogonal matrices, A(t) = Q T Q ′ is a curve of skew symmetric matrices. Thus The case f (x) = x is the Toda lattice after Flaschka's change of variables. Equation ( * ) is Symes's solution by factorization to the differential equation ( [39]). The fact that the Toda flow admits two different formulas by factorization, given by orthogonal and upper triangular conjugation of the initial condition, immediately implies that the evolution preserves the real tridiagonal symmetric form of the initial condition. Indeed, orthogonal conjugations preserve symmetry, and upper triangular conjugations preserve the upper Hessenberg form (i.e., the only nonzero entries below the diagonal lie in the subdiagonal of entries (i + 1, i)).

Toda, QR and other algorithms
In the fifties, Francis [17] came up with the QR algorithm to compute eigenvalues of symmetric matrices. Say S = S 0 is a real, symmetric matrix of positive simple spectrum. Consider the alternation of QR decompositions and reorderings, Clearly, S 1 ∈ O S0 , so S 0 and S 1 are both symmetric with a common spectrum: the QR step is a diffeomorphism from O S0 to itself. The remarkable thing about it is that iteration of this map converges to a diagonal matrix Λ -since a QR step clearly preserves spectrum, the diagonal entries of Λ are the eigenvalues of S 0 ! It was Moser, again, who drew Deift's attention to Symes's beautiful connection between the Toda flow and the QR algorithm [40]: at integer times n, the solution J(t) of the Toda differential equation satisfies exp J(n) = E n , where E n is the n-th term in the QR sequence starting from E 0 = expJ 0 . On the other hand, since the Toda equation is one within a family of flows of the kindJ = [f, Π skew f (J)], one may fudge with the functional parameter ( [12]) and get a simpler relationship.
Thus, S 0 = J(0) = Q(1)R(1) and S(1) = Q * (1)S(0)Q(1) = J 1 , the matrix obtained from S 0 from a QR step. Since a Toda flow interpolates the QR iteration, the convergence properties of Toda flows are also satisfied by the iteration. The vocabulary of dynamical systems clarifies certain eigenvalue computations. The right side of Figure 1 representsJ Λ , the closure of the set of 4 × 4 Jacobi matrices with eigenvalues 1, 2, 3 and 4. The vertices correspond to diagonal matrices, which are equilibria for the Toda vector field. The vertex I is a source, A is a sink and the remaining vertices are saddles with different signatures (i.e., dimensions of the unstable manifold). The presence of saddles explains why orbits bifurcate close to some vertices (say, H, E and F ) in the neighborhood of which an orbit spends a long time (i.e., many QR iterations), a fact that was known in the numerical literature as root disorder.
Numerical analysts might have realized a long time ago that the QR algorithm is the integer evaluation of a flow. Indeed, it has been known for decades that one can obtain directly the matrix S n of the QR iteration starting with a symmetric matrix S 0 is given by S n = Q * n S 0 Q n = R n S 0 R −1 n where Q n and R n are obtained from the QR factorization S n 0 = Q n R n . Morally (and indeed correctly), the step S 1/n = Q * 1/n S 0 Q 1/n = R 1/n S 0 R −1 1/n , S 1/n 0 = Q 1/n R 1/n is an n-th root of the usual QR step (in the sense that n such steps yield the usual QR step). Now, to obtain the interpolating flow (which belongs to the Toda family, as we saw) simply compute  The factorization can be performed (uniquely) for matrices in G + , having upper principal minors with positive determinant. Now ( [9]), on G + define the product g * h = h L gh U . For an appropriate coupling, the induced dual Lie algebra is the phase space which accommodates the Cholesky iteration, M n = L n U n , M n+1 = U n L n , n = 0, 1, . . . and its continuous interpolation (notice that blowups may happen). The Lie bracket associated to this group structure is an example of the so called R-matrix formalism applied to the standard matrix Lie bracket, but this is another story.

Choleski and SVD
Given a real matrix M , its singular values are the (nonnegative) lengths of the semi-axis of the ellipsoid obtained by applying M to the unit (Euclidean) sphere. A singular value decomposition of M is a product M = QΣU , where Q and U are orthogonal matrices and Σ is a diagonal matrix having the singular values as diagonal entries.
There is an efficient algorithm, similar to QR, to compute singular values of tridiagonal matrices, which was shown by Demmel and Kahan to have remarkable stability properties with respect to relative errors ( [15]). In [7], these properties were studied from a symplectic setup: the appropriate phase space is chosen taking into account the specific concern with relative errors. The Jacobian M (i, j) of the map sending a matrix at step i to its (discrete) evolution at time j is analyzed using Krein's perturbation theory for symplectic matrices. A number of properties arise, which are responsible for the good performance of the algorithm: for large i, the spectrum of M (i, j) is simple, lies in the unit circle and M (i, j) converges to a limit, explicitly computed. The agreement between theoretical estimates and experiments is remarkable: the rate between computed and estimated error was never larger than 8, independent of dimension.

Bidiagonal coordinates
Norming constants in Section 2.2 have a drawback: they do not cover the limit points of algorithms which converge to reduced matrices (i.e., tridiagonal matrices with some main off-diagonal entries equal to zero), like the Toda flows and QR type algorithms. This problem has been circumvented by the introduction of bidiagonal coordinates ( [28]). As an extra bonus, bidiagonal coordinates provide an atlas for T Λ . The construction goes as follows.
Let L 1 (n) denote the group of lower triangular matrices with unit diagonal entries. For M ∈ M(n), the LU positive factorization, when it exists, is M = LU , where L ∈ L 1 (n) and U ∈ U + (n). Clearly, this happens if and only if the determinants of the upper principal minors of M are positive (more on the appendix).
There is one chart of T Λ for each permutation π. Each chart has for domain the set U π Λ ⊂ T Λ consisting of matrices T = Q * π Λ π Q π for which there exists an orthogonal matrix Q π admitting an LU positive factorization Q π = L π U π . This is not as restrictive as it looks: if T = Q * π Λ π Q π , another spectral decomposition is given by T = Q * π E Λ π EQ π , where E is a diagonal matrix with entries equal to ±1 along the diagonal: one may use E to force the positivity of the determinants of the principal minors of EQ π , provided that the corresponding determinants of Q π are nonzero. Notice that the request that Q π admits an LU positive factorization gives rise to a unique spectral decomposition T = Q * π Λ π Q π . We then have Theorem 6.1 The matrix B π is lower bidiagonal with diagonal Λ π . The principal off-diagonal entries β π k = (B π ) k+1,k define a diffeomorphism ψ π : U π Λ → R n−1 . Each domain U π Λ is an open, dense set U π Λ , containing one diagonal matrix. The charts {ψ π : U π Λ → R n−1 , π ∈ S n } form an atlas for T Λ . The signs of β π k and T k+1,k are equal and their quotient goes to one, when one of them goes to zero.
Proof: From the expressions for B π , it is simultaneously lower triangular and upper Hessenberg. So it is actually lower bidiagonal, with the same spectrum as Λ π .
To show that the chart is a diffeomorphism to R n−1 , consider the construction of its inverse. Build B π out of off-diagonal entries β π j and (distinct) eigenvalues λ π(i) . Diagonalize B π = L −1 π Λ π L π and get Q π out of the QR factorization L π = Q π R π , so that, automatically, Q π admits an LU positive factorization, hence Q π ∈ U π Λ . Finally set T = Q * π Λ π Q π . Since U ∈ U + (n), the equation B π = U π T U −1 π gives that the signs of β π k and T k+1,k are equal. The remaining statements are left to the reader. Figure 2 is an example for n = 3: here Λ π = diag (7,4,5) and U π Λ is the interior of the polygon with boundary given by the unglued edges (the glued edges belong to U π Λ ). Figure 3 was obtained using such charts: the standard norming constants would distort too much the picture (and degenerate completely) at boundaries of (signed) Jacobi matrices. Once eigenvalues are fixed, the bitorus T λ lies in the intersection of a hyperplane (of matrices with the same trace as S) and a sphere (same sum of squared eigenvalues), and the figure is the image of a conformal projection of T λ in R 2 .
Another remarkable property of bidiagonal coordinates is that their evolution under the Toda equations manages to be even simpler that the evolution of the standard inverse variables. We consider the Toda vector fields on real, symmetric, tridiagonal matrices, where the time dependence is explicit, (T ) Proposition 5 Fix π, take T (0) ∈ U π Λ . The chart domain U π Λ is invariant under equation (T). The evolution of the bidiagonal coordinates is Proof: We first prove that (T ) leaves U π Λ invariant. Take T (0) ∈ U π Λ : omitting the permutation π, we have T (0) = Q * 0 Λ Q 0 , where Q 0 has an LU positive decomposition. Solve (T ) as in Section 4.2: for exp(tf (T (0))) = Q(t) R(t), We have to prove thatQ(t) = Q 0 Q(t) admits an LU positive factorization: . The upper principal minors ofQ(t) and Q 0 have the same signs, since exp(tf (Λ)) is a positive diagonal matrix and R −1 (t) ∈ U + (n).
We now consider bidiagonal coordinates. Clearly Now, B(t) = L −1 (t) Λ L(t) and we obtain B ′ (t) = [B(t), L −1 (t) L ′ (t)] by now familiar computations. The matrix L is obtained by the LU positive factorizatioñ Q(t) = L(t) U (t), so that, dropping the time dependence,Q ′ = L ′ U + LU ′ yields Consider the split M = Π sl M + Π u M of a matrix M into strictly lower and upper triangular parts. Since L has a diagonal of ones, L −1 L ′ is strictly lower triangular. The solution by factorization in Section 4.2 givesQ * Q′ = Π sk (f (T )), so that Now, the matrices M and Π sk M have the same strictly lower triangular part and U is upper triangular, so Adding up, The proof requires interpretation in the case f (x) = ln x, which is of relevance for QR interpolation: we need exp(tf (Λ)) to be a positive diagonal matrix -in this case, the diagonal entries must be equal to |λ k |.
Bidiagonal coordinates are especially convenient to study asymptotic behavior of Toda flows. As an application, the reader may find in [28] a rather natural computation of their scattering map, described in Section 2.2, by filling up the following inevitable outline. Recall from Figure 1 that Toda flows starting from Jacobi matrices have for ω and α limits the diagonal matrices associated to π ω (i) = i and π α (i) = n − i + 1. On each chart, the Toda evolution in bidiagonal coordinates is simple. The change of charts required to keep track of both extremes of an orbit is equally simple.
An appropriate extension of this formalism provides charts on isospectral manifolds of real and complex matrices with given profile, a natural extension of the concept of tridiagonality -a text is under preparation ( [41]).

QR steps with shifts
Numerical analysts have ways to speed up the convergence of QR. The original QR step is just the choice f (x) = ln x in the family Bidiagonal coordinates may be used to indicate interesting alternatives: interpret a step as a time 1 map for a differential equation and integrate the trivial flow which describe evolutions in Proposition 5. For f (x) = ln g(x), taking into account the caveat after its proof, the change of the bidiagonal coordinates of a matrix T ∈ U π Λ under a step is (β π 1 , . . . , β π n−1 ) → g(λ π(2) ) g(λ π(1) ) β π 1 , . . . , g(λ π(n) ) g(λ π(n−1) ) β π n−1 .
We are interested in functions g for which some β π k becomes small -in this case, the corresponding matrix essentially decouples in two smaller tridiagonal matrices, for which the computation of eigenvalues is simpler. Taking into account the denominators in the formula, a natural possibility is a function g which equals zero at λ π(n) but does not vanish at the other eigenvalues. Keep in mind that, unfortunately, we do not know the eigenvalues. In particular, it is especially hard to obtain such a g which would reduce drastically a centrally located β π k (i.e., k ∼ n/2). Notice that, from Theorem 6.1, the matrix T is Jacobi if and only if its bidiagonal coordinates β π k are positive, and the sign is preserved under general QR steps. On the other hand, removing the absolute values in the formula shows that one can replace iterations lying within Jacobi matrices to iterations on tridiagonal matrices without changing their asymptotic properties: only the off-diagonal entries eventually have different signs. The new iteration is smooth on the isospectral manifold T Λ , and convergence rates may be obtained using Taylor expansions.
Typically, along an iteration of an algorithm searching for eigenvalues, one has good approximations s for one of them: a natural choice is g(x) = x−s, the iteration with shift s. The bidiagonal coordinates, in this case, change as follows: Numerical analysts frequently do not wait for a good approximation s. There are different shift strategies ( [34]) -we consider the Rayleigh quotient shift for which s = T n,n , and the Wilkinson shift: compute the eigenvalues ofT , the bottom 2 × 2 principal minor of T , and take for s the one which is closer to T n,n .
Under the Rayleigh shift, once iteration approaches convergence, a simple Taylor expansion shows that the bottom entry b = β π n−1 converges cubically to zero, in the sense that b m+1 = O((b m ) 3 ) for bottom entries at consecutive iterations. Once b is small enough, the matrix undergoes deflation: T n,n is declared a good approximation of an eigenvalue and the algorithm proceeds with the top (n − 1) × (n − 1) block. There is one catch however: for some initial conditions, the Rayleigh shift gives rise to periodic orbits.
This does not happen for the Wilkinson shift. The dynamics in this case is richer: we describe the results but may only indicate [29] and [30] for proofs. Theorem 6.2 For a generic initial condition, a Wilkinson iteration leads to cubic convergence of b = β π n−1 ∼ T n,n−1 . If the original matrix T has no three eigenvalues in arithmetic progression, this is always the case. Otherwise, there may be a Cantorlike set of initial conditions for which iteration is quadratic.
In a nutshell, this peculiar behavior is caused by the discontinuities in the definition of s, at points where T n,n is equidistant from the eigenvalues of the block T . The Cantor-like set in the statement of the theorem consists of matrices all of whose iterations lie in this situation. Typically, this never happens or happens just at few steps and convergence is cubic.

Lax pairs beyond Toda
Forty years of contributions from a large community greatly increased our understanding of the Toda flow, which sometimes is undistinguishable from the more general Lax pair evolution. Mutations are abundant: for a beautiful starting point about different representations of the Toda flow (and other geometric issues which may studied through them), the reader should refer to [23].
In this text, we emphasized examples over theory. In order to indicate the versatility of the concepts which have been presented, we close with a final example.
One has to circumvent a technical difficulty: A 0 (0) is not invertible -it is a rank one matrix! Still, as in the finite dimensional cases, the evolution is interpolated by a differential equation. This time, the solution formula by factorization involves a Riemann-Hilbert problem, with a mild singularity at λ = 0.

Appendix 1: The QR and LU factorizations
Let M an invertible, real matrix. Then there is a unique Q ∈ SO(n) and R ∈ U(n) for which M = QR, the QR factorization of M . Indeed, let e i , i = 1, . . . , n be the canonical vectors: the reader should have no difficulty in showing that the subspaces generated by M e i and Qe i , for i = 1, . . . , k (k arbitrary) should be the same, if such a decomposition exists. Since the columns of Q are orthonormal, they must be vectors obtained by applying the Gram-Schmidt orthogonalization procedure to the columns of M sequentially. The entries of R are the coefficients used in the representation of M e i in terms of the columns Qe 1 , . . . , Qe i . Since M is invertible, the process is feasible, and appropriate normalizations give rise to positive diagonal entries of R.
Similarly, the (unique) LU decomposition of M is M = LU , where L ∈ L 1 (n) and U ∈ U + (n). This decomposition exists if and only if the principal diagonal minors of M (i.e., the determinants of the k × k submatrices M k with entries in the intersection of the first k rows and columns, k = 1, . . . , n) are strictly positive. In a nutshell, from M k = L k U k , it is clear that an inductive construction is at hand: the details are left to the reader or in standard texts ( [14], [44]).