A subspace theorem for manifolds

We prove a theorem that generalizes Schmidt's Subspace Theorem in the context of metric diophantine approximation. To do so we reformulate the Subspace Theorem in the framework of homogeneous dynamics by introducing and studying a slope formalism and the corresponding notion of semistability for diagonal flows.


Introduction
In 1972, Wolfgang Schmidt formulated his celebrated subspace theorem [32, Lemma 7], a far-reaching generalization of results of Thue [37], Siegel [35], and Roth [28] on rational approximations to algebraic numbers. Around the same time, in his work on arithmeticity of lattices in Lie groups, Gregory Margulis [26] used the geometry of numbers to establish the recurrence of unipotent flows on the space of lattices $\mathrm{GL}_d(\mathbb{R})/\mathrm{GL}_d(\mathbb{Z})$. More than two decades later, a quantitative refinement of this fact, the so-called quantitative non-divergence estimate, was used by Kleinbock and Margulis [18] in their solution to the Sprindzuk conjecture regarding the extremality of non-degenerate manifolds in metric diophantine approximation. As it turns out, these two remarkable results, the subspace theorem and the Sprindzuk conjecture, are closely related and can be understood together as statements about diagonal orbits in the space of lattices. In this paper we prove a theorem that generalizes both results at the same time. We also provide several applications. This marriage is possible thanks to a better understanding of the geometry lying behind the subspace theorem, in particular the notion of Harder-Narasimhan filtration for one-parameter diagonal actions, which leads both to a dynamical reformulation of the original subspace theorem and to a further geometric understanding of the family of exceptional subspaces arising in Schmidt's theorem. The proof blends the diophantine input of Schmidt's original theorem with the dynamical input arising from the Kleinbock-Margulis approach and refinements recently obtained in [2] and [8].
We now formulate the main theorem. Let $M = \phi(U)$ be a connected analytic submanifold of $\mathrm{GL}_d(\mathbb{R})$ parametrized by an analytic map $\phi : U \to \mathrm{GL}_d(\mathbb{R})$, where $U \subset \mathbb{R}^n$ is a connected open set and $n \in \mathbb{N}$. Let $\mu$ be the push-forward to $M$ of the Lebesgue measure on $U$. The Zariski closure of $M$ is said to be defined over $\overline{\mathbb{Q}}$ if, for $(a_{ij})_{ij} \in \mathrm{GL}_d$, the ideal of polynomial functions in $\mathbb{C}[a_{ij}, \det^{-1}]$ that vanish on $M$ can be generated by polynomials with coefficients in $\overline{\mathbb{Q}}$.
Theorem 1 (Subspace theorem for manifolds). Assume that the Zariski closure of $M$ in $\mathrm{GL}_d$ is defined over $\overline{\mathbb{Q}}$. Then there exists a finite family of proper subspaces $V_1, \dots, V_r$ of $\mathbb{Q}^d$ with $r = r(d)$ such that, for $\mu$-almost every $L$ in $M$ and every $\varepsilon > 0$, the integer solutions $x \in \mathbb{Z}^d$ to the inequality
$$\prod_{i=1}^{d} |L_i(x)| \le \|x\|^{-\varepsilon} \tag{1}$$
all lie in the union $V_1 \cup \dots \cup V_r$, except for finitely many of them.
Here the $L_i$ are the linear forms on $\mathbb{R}^d$ given by the rows of $L \in \mathrm{GL}_d(\mathbb{R})$, and $\|\cdot\|$ is the canonical Euclidean norm on $\mathbb{R}^d$.
We will in fact prove the theorem under a slightly weaker assumption on $M$, requiring only that what we call the Plücker closure of $M$ be defined over $\overline{\mathbb{Q}}$; see §1.8 for the definition of Plücker closure.
Note that we recover Schmidt's original subspace theorem [33,10,5] as the special case when $M$ is a singleton $\{L\}$ (then $n = 0$ and $\mu$ is the Dirac mass at $L$). On the other hand, even in the case where $M$ is defined over $\mathbb{Q}$, the theorem is non-trivial. Indeed, as we shall see in Section 3.1, it recovers the main result of Kleinbock and Margulis regarding the Sprindzuk conjecture.
The exceptional subspaces $V_i$ are independent of $\varepsilon$, as in Vojta's refinement [38,34], and they depend only on the rational Zariski closure of $M$. In fact they are determined by what we call the rational Schubert closure of $M$, that is the intersection of all rational translates $S_\sigma g := \overline{B\sigma B}\,g$ containing $M$, where $g \in \mathrm{GL}_d(\mathbb{Q})$ and $S_\sigma$ is a standard Schubert variety associated to a permutation $\sigma$ and a Borel subgroup $B$ containing the diagonal subgroup. Each $V_i$ contains infinitely many solutions to (1), regardless of $\varepsilon$. The number $r$ of exceptional subspaces can be bounded by a number depending only on $d$ (see Lemma 3 and the remark following it).
The proof of Theorem 1 goes via the proof of a stronger result, a parametric subspace theorem for manifolds, Theorem 3 below. This reformulates the problem in terms of the dynamics of a one-parameter diagonal flow $(a_t)_{t>0}$ on the space of lattices. We may summarize it informally as follows:

Theorem 2 (Parametric version). For $\mu$-almost every $L$ in $M$ the lattice $a_t L\mathbb{Z}^d$ assumes a fixed asymptotic shape as $t$ tends to $+\infty$.
By "fixed asymptotic shape" we mean two things. Firstly, the successive minima are asymptotic to $e^{\Lambda_k t}$ for some real numbers $\Lambda_k$, Lyapunov exponents of sorts, depending only on $M$ and $a = (a_t)_{t>0}$ (the dependence on $a$ is piecewise linear). In particular, as $t$ varies, the successive minima can only oscillate by a subexponential factor. Secondly, the successive minima determine a fixed partial flag in $\mathbb{Z}^d$. In other words there is a fixed partial rational flag $W_1 \le \dots \le W_d$ in $\mathbb{Q}^d$ such that if $\Lambda_k < \Lambda_{k+1}$, then the first $k$ successive minima of $a_t L\mathbb{Z}^d$ are always realized by vectors from $W_k$ when $t$ is large enough. The $\Lambda_k$'s and the flag $\{W_k\}_k$ depend only on $a$ and on the rational Schubert closure of $M$. Grouping together the different $W_i$ obtained by varying the one-parameter subgroup $a$, we obtain the family of exceptional subspaces $V_i$ appearing in Theorem 1.
This rational flag arises naturally as the Harder-Narasimhan filtration associated to a certain submodular function on the rational grassmannian: the maximal expansion rate of the subspace under the flow. We recall in §1.3 that any submodular function on a grassmannian gives rise to a Grayson polygon, a notion of semistability, a Harder-Narasimhan filtration and certain coefficients, the slopes of the polygon, which in our case will correspond to the Lyapunov exponents $\Lambda_k$ mentioned above. This is the so-called "slope formalism", which arises in particular in the study of Euclidean lattices as first described by Stuhler [36] and Grayson [14], and in many other subjects as well [6,27].
Although we have restricted to the current setting for clarity of exposition in this introduction, the result will be proved for more general measures $\mu$ than push-forwards of the Lebesgue measure by analytic maps; the exceptional subspaces then depend only on the Zariski closure of the support of $\mu$. The right technical framework is that of good measures, which are closely related to the friendly measures of [17], see §1.9.
The paper is organized as follows. In Section 1 we begin by formulating the technical version of Theorem 2 and then proceed to describe the slope formalism on the grassmannian associated to a one-parameter flow, and in particular discuss the associated notion of Harder-Narasimhan filtration. The proof of Theorems 1 and 2 is carried out in §1.5 and §1.6 after a discussion of the Kleinbock-Margulis quantitative non-divergence estimates. In Section 2 we formulate and sketch a proof of an extension of Theorem 1 to arbitrary number fields, which is analogous to the classical extension of Schmidt's subspace theorem due to Schlickewei to multiple places and targets [31,5,39]. Finally in Section 3 we prove several applications of the main result.
For the sake of brevity we do not state these applications in the introduction and refer the reader to Section 3 directly instead. Let us only briefly mention that there are five main applications: (i) we explain how to recover the Sprindzuk conjecture (Kleinbock-Margulis theorem) from Theorem 1, (ii) we establish a manifold version of the classical Ridout theorem regarding approximation by rationals whose denominators have prescribed prime factors, (iii) we recover the main results of [2] regarding (weighted) diophantine approximation on submanifolds of matrices, showing that they hold also for submanifolds defined over $\overline{\mathbb{Q}}$ (and not only over $\mathbb{Q}$), (iv) we prove an optimal criterion for strong extremality (Corollary 3), which answers in this case a question from [19,4], (v) we prove a Roth-type theorem for non-commutative diophantine approximation on nilpotent Lie groups, extending to algebraic points what was done for Lebesgue almost every point in our previous work with Aka and Rosenzweig [1,2].
Further applications and an extension of some of the results of this paper to other reductive groups and homogeneous varieties can be found in the second author's forthcoming work [30].
1 The main result

Dynamical formulation
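A lattice, that is a discrete subgroup $\Delta$ of rank $d$ in $\mathbb{R}^d$, can be written
$$\Delta = \mathbb{Z}u_1 \oplus \dots \oplus \mathbb{Z}u_d,$$
where $(u_i)_{1\le i\le d}$ is a basis of $\mathbb{R}^d$. The space $\Omega$ of lattices can be identified with the homogeneous space
$$\Omega \simeq \mathrm{GL}_d(\mathbb{R})/\mathrm{GL}_d(\mathbb{Z}).$$
The position of a lattice $\Delta$ in the space $\Omega$, up to a bounded error, is described by its successive minima
$$\lambda_k(\Delta) := \inf\{\lambda > 0 : \operatorname{rk}(\Delta \cap B(0,\lambda)) \ge k\}, \qquad k = 1, \dots, d,$$
where $\operatorname{rk}(A)$ for $A \subset \Delta$ denotes the rank of the free abelian subgroup of $\Delta$ generated by $A$, and $B(0,\lambda)$ is the Euclidean ball of radius $\lambda$ centered at the origin in $\mathbb{R}^d$. Theorem 1 will be deduced from the following description of the asymptotic behavior of the successive minima along a diagonal orbit of the lattice $L\mathbb{Z}^d$, where $L$ is a $\mu$-generic point of $M$. Here, as in Theorem 1, $M = \phi(U)$ is the image of a connected open set $U \subset \mathbb{R}^n$ in some Euclidean space under an analytic map $\phi : U \to \mathrm{GL}_d(\mathbb{R})$, and $\mu$ is the push-forward under $\phi$ of the Lebesgue measure on $U$.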
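The definition of successive minima can be illustrated numerically. The following is only a brute-force sketch, not a method used in the paper: the function names `rank` and `successive_minima` and the parameter `box` are hypothetical, and the answer is only correct when the search box of integer coefficients is large enough to contain vectors realizing all $d$ minima.

```python
from fractions import Fraction
from itertools import product
import math


def rank(rows):
    """Rank of a list of integer vectors, by exact Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            f = m[r][col] / m[rk][col]
            m[r] = [x - f * y for x, y in zip(m[r], m[rk])]
        rk += 1
    return rk


def successive_minima(basis, box=3):
    """Successive minima of the lattice spanned by the rows of `basis`.

    Assumption: every minimum is realized by a vector whose integer
    coefficients lie in [-box, box]; otherwise the result is only an
    upper bound.
    """
    d = len(basis)
    candidates = []
    for c in product(range(-box, box + 1), repeat=d):
        if any(c):
            v = [sum(c[i] * basis[i][j] for i in range(d)) for j in range(d)]
            candidates.append((math.sqrt(sum(x * x for x in v)), c))
    candidates.sort()
    chosen, minima = [], []
    # Greedy choice: take the shortest vector that increases the rank.
    # Independence is tested on coefficient vectors, which is equivalent
    # because the basis is invertible.
    for norm, c in candidates:
        if rank(chosen + [list(c)]) > len(chosen):
            chosen.append(list(c))
            minima.append(norm)
            if len(minima) == d:
                break
    return minima
```

For instance, applying this to the rows of $a_t = \mathrm{diag}(e^{t}, e^{-t})$ acting on $\mathbb{Z}^2$ gives minima $e^{-t}$ and $e^{t}$, whose product stays equal to 1, in line with Minkowski's theorem for unimodular flows.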
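Theorem 3 (Strong parametric subspace theorem for manifolds). Assume that the Zariski closure of $M$ is defined over $\overline{\mathbb{Q}}$. Let $(a_t)_{t\ge 0}$ be a diagonal one-parameter semigroup in $\mathrm{GL}_d(\mathbb{R})$. Then there exist real numbers $\Lambda_1 \le \dots \le \Lambda_d$ such that for $\mu$-almost every $L \in M$ and each $k = 1, \dots, d$,
$$\lim_{t\to+\infty} \frac{1}{t}\log \lambda_k(a_t L\mathbb{Z}^d) = \Lambda_k. \tag{2}$$
Furthermore, if $0 = d_0 < d_1 < \dots < d_h = d$ in $\{0, \dots, d\}$ are chosen so that
$$\Lambda_1 = \dots = \Lambda_{d_1} < \Lambda_{d_1+1} = \dots = \Lambda_{d_2} < \dots < \Lambda_{d_{h-1}+1} = \dots = \Lambda_{d_h},$$
then there exist rational subspaces $V_\ell$, $\ell = 0, \dots, h$, in $\mathbb{Q}^d$ such that
• $\dim V_\ell = d_\ell$,
• for $\mu$-almost every $L \in M$ the first $d_\ell$ successive minima of $a_t L\mathbb{Z}^d$ are attained in $V_\ell$ provided $t$ is large enough.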
In other words: for all ε > 0, for µ-almost every L ∈ M , there is t L,ε > 0 such that for t > t L,ε , = 1, . . ., h, and x ∈ Z d , When M is reduced to a singleton, the above theorem is a refinement of the parametric subspace theorem, often attributed to Faltings and Wüstholz [13,Theorem 9.1].As shown below, it can also be obtained directly from Schmidt's subspace theorem in its original form.The two results are really equivalent.
An important point to make is that (2) is a limit and not only a liminf or a limsup. This can be understood as saying that the diagonal orbit of a lattice defined over $\overline{\mathbb{Q}}$ has an asymptotic shape at infinity. The $\lambda_k$ can fluctuate, but only up to a small exponential error. This is in sharp contrast with what happens for certain specific values of $L$. Indeed it is possible to construct matrices $L$ for which the successive minima have an almost arbitrary behavior along a given diagonal orbit, see [29, Theorem 1.3] and [7, Theorem 2.2].
Corollary 1 (Parametric subspace theorem for manifolds).Keep the same assumptions as in Theorem 3. Let a = (a t ) t≥0 be a one-parameter diagonal semigroup in SL d (R).There exists a proper subspace V (a) of Q d such that given ε > 0, for µ-almost every L ∈ M , there is t L,ε > 0 such that if t > t L,ε and Proof.Here we have assumed that a t is unimodular.By Minkowski's theorem the product of all d successive minima of a t LZ d is bounded above and below independently of t.In view of (2), this implies that The rational subspaces V i appearing in Theorem 3 depend only on (a t ) t≥0 and on the rational Zariski closure of M , namely the intersection of the closed algebraic subsets of GL d defined over Q and containing M .This will be clear from the proof of Theorem 3 given below, where a more precise description of V (a) and the V i will be given.As we will see, the filtration {V i } i is the Harder-Narasimhan filtration associated to M and (a t ) t≥0 , and the Λ i are the slopes of the Grayson polygon.The next few paragraphs contain preparations towards the proof of Theorem 3 given at the end of this section.

Expansion rate and submodularity
In this subsection M is an arbitrary subset of GL d (R) and a = (a t ) t≥0 a oneparameter diagonal semigroup.We write a t = diag(e A1t , . . ., e A d t ) for some real numbers A 1 , . . ., A d .For a non-zero subspace V ≤ R d we define its expansion rate with respect to a by where v represents V in an exterior power ∧ k R d .This quantity takes values in the finite set of eigenvalues of log a 1 in exterior powers.More precisely: where By convention we will also set τ ({0}) = 0. We leave it to the reader to check that if where Similarly we see readily that From this formula it is clear that for all subspaces where Zar(M ) is the Zariski closure.
Lemma 1 (Submodularity of expansion rate).Let M ⊂ GL d (R) and assume that its Zariski closure is irreducible.Then the map V → τ M (V ) is submodular on the grassmannian, i.e. satisfies, for every pair of subspaces Proof.Given a subspace It is therefore enough to prove the lemma in the case M = {L}.Now let u be a vector representing U = W 1 ∩ W 2 in some exterior power of R d .Let also w 1 and w 2 be such that u ∧ w 1 and u ∧ w 2 represent W 1 and W 2 , respectively.The subspace W 1 + W 2 is then represented by u ∧ w 1 ∧ w 2 , and moreover, for every t, Together with formula (4), this shows that τ L is submodular.Note that this is compatible with the convention τ M ({0}) = 0.
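The expansion rate of a single subspace can be computed by hand in small examples. The sketch below assumes the standard description of the growth of $\|a_t v\|$ for $v \in \wedge^k \mathbb{R}^d$: the rate is the largest weight $I(a) = \sum_{i \in I} A_i$ over index sets $I$ of size $k$ whose Plücker coordinate $v_I$ is non-zero. The function names `det` and `expansion_rate` are hypothetical, and only the case of a single subspace is treated (for $\tau_L(V)$ one would first replace the basis of $V$ by its image under $L$).

```python
from fractions import Fraction
from itertools import combinations


def det(matrix):
    """Exact determinant of a square matrix via Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result
        result *= m[c][c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= f * m[c][j]
    return result


def expansion_rate(basis, A):
    """Expansion rate of the span of `basis` under diag(e^{A_1 t}, ..., e^{A_d t}).

    `basis` is a list of k linearly independent vectors of length d with
    integer or rational entries; `A` is the list of exponents A_1, ..., A_d.
    """
    k, d = len(basis), len(A)
    best = None
    for I in combinations(range(d), k):
        minor = [[row[i] for i in I] for row in basis]
        if det(minor) != 0:  # Pluecker coordinate v_I is non-zero
            weight = sum(A[i] for i in I)
            best = weight if best is None else max(best, weight)
    return best
```

With $A = (1, 0, -1)$, the line spanned by $e_3$ has rate $-1$ while the line spanned by $e_1 + e_3$ has rate $1$. Taking $W_1 = \langle e_1, e_2\rangle$ and $W_2 = \langle e_2, e_3\rangle$, the rates of $W_1 \cap W_2$, $W_1 + W_2$, $W_1$, $W_2$ are $0, 0, 1, -1$, so the submodularity inequality holds with equality in this example.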

Harder-Narasimhan filtration
Submodular functions on partially ordered sets give rise to a "slope formalism" as in [36,14,13,6,27]. This is well known. In this paragraph we recall the main facts we need and for the reader's convenience we give short proofs. The key to them is the following submodularity lemma, which in implicit form goes back at least to Stuhler [36] and Grayson [14] in the context of Euclidean sublattices and their covolume, and which we rediscovered in [2] in the present context (subspaces and their expansion rate). Let $k$ be a field and $\mathrm{Grass}(k^d)$ the Grassmannian of non-zero subspaces of $k^d$. Let $\phi : \mathrm{Grass}(k^d) \to \mathbb{R}$ be a submodular function, that is, a function satisfying (9) with $\phi$ in place of $\tau_M$.
Lemma 2 (Submodularity lemma).There is a subspace and containing all other such subspaces.
Proof.Let I be that infimum.Without loss of generality, up to changing φ into φ−φ(0), we may assume that φ(0) = 0. We begin by observing that φ is bounded below: if (V n ) n is a sequence of distinct subspaces of maximal dimension with φ(V n ) → −∞, pick a fixed line L with L ⊂ V n for infinitely many V n ; by submodularity inf n φ(V n + L) = −∞, contradicting the maximality of dim V n .So I is finite.For k ≥ 1, we set I k to be the same infimum restricted to those subspaces V with dim V ≥ k.There is a maximal k 0 such that I = I k0 .If k 0 = d, then we can take V φ = k d and there is nothing to prove.Otherwise let ε > 0 so that I k0+1 > I + 2ε.If a subspace W satisfies then dim W ≤ k 0 .By definition there is such a subspace with dim V φ ≥ k 0 , call it V φ .If Z is another subspace with ( * ), then which forces dim(Z + V φ ) ≤ k 0 , and hence Z ≤ V φ , as desired.
Definition 1 (Semistability).We say that k d is semistable with respect to φ if Definition 2 (Grayson polygon).Let P φ : [0, d] → R be the convex piecewise linear function that is the supremum of all linear functions whose graph in Let (d i , f i ), i = 0, . . ., h be the vertices of the Grayson polygon with d 0 = 0 and d h = d, that is the angular points, where the slope changes, i.e. for i = 1, . . .h − 1, where The main result is the following: Proposition 1 (Harder-Narasimhan filtration).For each i = 0, . . ., h, there is forming the so-called Harder-Narasimhan filtration of φ.
In particular, we see that k d is semistable if and only if its Harder-Narasimhan filtration is the trivial one {0} < k d .
Remark.Note that given any k-subspace V ≤ k d the function φ V (W ) := φ(W )−φ(V ) defined on the quotient k d /V , where W = W/V for any k-subspace W containing V is submodular on Grass(k d /V ).It is clear from the proposition that V i /V i−1 is semistable with respect to φ Vi−1 for i = 1, . . ., h, and that Proof.The existence and uniqueness of V 1 is exactly what the submodularity lemma tells us.Suppose V i has been defined.We may apply the submodularity lemma again to φ Vi on the quotient k d /V i and thus obtain a subspace V i+1 containing V i strictly and such that the function We now need to show that the Grayson polygon coincides with the polygon P drawn out of the points (dim V i , φ(V i )).In other words we have to prove that if V is a subspace of k d and i is such that dim Moreover, again by definition of V i+1 we have On the other hand φ is submodular, so φ( Combining this with (11) and (12) we obtain: But i+1 ≥ i .So (10) follows.This shows the existence of the V i and the fact that they are nested.To see the uniqueness note that if dim V = dim V i and φ(V (13), which is a contradiction.
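For a finite amount of data the Grayson polygon is simply a lower convex hull, which the following sketch computes. It assumes that for each dimension the minimum of $\phi$ over subspaces of that dimension has already been found (this input, called `min_phi` here, is hypothetical), with integer or rational values and with the point $(0, \phi(0))$ included. The function names are likewise hypothetical.

```python
from fractions import Fraction


def _cross(o, a, b):
    """Cross product of the vectors o->a and o->b."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def grayson_polygon(min_phi):
    """Vertices of the lower convex hull of the points (dim, min_phi[dim]).

    Points lying on or above a segment of the hull are discarded, so the
    returned vertices are exactly the points where the slope changes,
    together with the two endpoints.
    """
    hull = []
    for p in sorted(min_phi.items()):
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) <= 0:
            hull.pop()
        hull.append(p)
    return hull


def slopes(vertices):
    """Slopes of the successive segments of the polygon (exact fractions)."""
    return [
        Fraction(y1 - y0, x1 - x0)
        for (x0, y0), (x1, y1) in zip(vertices, vertices[1:])
    ]
```

The first coordinates of the returned vertices play the role of the dimensions $d_i$ of the Harder-Narasimhan filtration, and the space is semistable exactly when the polygon reduces to a single segment, that is when only the two endpoints are returned.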
In the sequel, we apply this general theory by taking k = Q and φ(V ) the expansion rate τ M (V ) defined in (7) on the grassmannian of rational subspaces.The above definition of semistability reads: Definition 3. A non-zero rational subspace V in R d will be called M -semistable with respect to a = (a t ) t≥0 if for every rational subspace W ≤ V , Similarly this yields the notions of Grayson polygon and Harder-Narasimhan filtration of M with respect to a.
Remark (unstable subspace). In the case when $\det a = 1$, a subspace $V$ with $\tau(V) < 0$ corresponds to a point $v$ in some $\wedge^k \mathbb{R}^d$ which is unstable with respect to $a$ in the terminology of geometric invariant theory, i.e. its $a$-orbit contains $0$ in its closure. So $\mathbb{R}^d$ is $M$-semistable if and only if there are no unstable subspaces in the full grassmannian.
Next we make a remark about the dependence of the Harder-Narasimhan filtration with respect to the choice of one-parameter semigroup.It is easy to see that the Grayson polygon depends continuously on a.This is not so for the filtration, because new nodes can appear under small deformations, but the following lemma shows that the subspaces involved remain confined to a fixed finite family.Let b(n) be the ordered Bell number, that is the number of weak orderings (i.e.orderings with ties) on a set with n elements.To see the claim note that every slope for some I (because I W ⊆ I V as follows from (6) and τ M can be replaced by τ L for some fixed L as in the proof of Lemma 1), and Proposition 1 tells us that V i (a) is defined as the unique solution to an extremal problem involving the comparison of slopes.So only their order matters.Since there are at most b(2 d ) possible orders, we are done.
We also see from this proof that the slopes $\Lambda_i$ are continuous and piecewise linear in $\log a$ and actually linear on each one of the cells cut out by the hyperplanes

Remark. The ordered Bell number $b(n)$ grows super-exponentially with $n$. This gives a rather poor bound on the number of exceptional subspaces in Theorem 1, especially in view of Schmidt's bound $d^{2d^2}$ from [34]. A more refined argument, which we do not include here and which is based on the study of the set of permutations arising from the Schubert closure of $M$ (see §1.8), allows one to improve this to $(2d)^d$.
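To get a feeling for the size of the bound $b(2^d)$, the ordered Bell numbers can be computed from the classical recurrence $b(n) = \sum_{k=1}^{n} \binom{n}{k}\, b(n-k)$ with $b(0) = 1$ (choose the $k$ elements tied for first place, then weakly order the rest). The function name below is hypothetical.

```python
from math import comb


def ordered_bell(n):
    """Number of weak orderings (orderings with ties) of an n-element set."""
    b = [1]  # b(0) = 1: the empty set has one weak ordering
    for m in range(1, n + 1):
        # k = size of the block of elements tied for first place
        b.append(sum(comb(m, k) * b[m - k] for k in range(1, m + 1)))
    return b[n]
```

Already for $d = 3$ this gives $b(2^3) = b(8) = 545835$, to be compared with $(2d)^d = 216$.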

Dynamics of diagonal flows
We now describe the dynamical ingredient of the proof.Using the quantitative non-divergence estimates (see Theorem 5 below), Kleinbock showed in [16] the existence of a well-defined almost sure diophantine exponent for analytic manifolds.As described in our previous work [2] with Menny Aka and Lior Rosenzweig this holds for more general measures and the exponent actually depends only on the Zariski closure of the support of the measure, a property called inheritance in this paper, because the measure inherits its exponent from the Zariski closure of its support.We will need the following version of these results: Proof.A lemma of Mahler [25,Theorem 3], which is a simple consequence of Minkowski's second theorem in the geometry of numbers, asserts that an embedding of algebraic varieties, it maps Zar(M ) onto Zar(ρ k (M )).These observations allow to reduce the proof to the case where k = 1, which we now assume.To this end we first recall the quantitative non-divergence estimates in a form established in [15]: and µ be as in Theorem 4.
There are C, α > 0 such that the following holds.Let ρ ∈ (0, 1] and t > 0, and let B := B(z, r) be an open ball such that B(z, Then for every ε ∈ (0, ρ], we have: And we set: Note that by construction, if S ⊂ S , then β(S) ≤ β(S ).Proof of claim: If β > β(S), then (C β ) fails.This implies that there exists t arbitrarily large such that sup g∈S a t gw ≤ e kβt for some w = 0. However by Minkowski's theorem applied to the sublattice represented by a t gw, this means that sup g∈S λ 1 (a t gZ d ) ≤ e βt .Hence µ 1 (L) ≤ β.
The opposite inequality for S = φ(B) will follow from the quantitative nondivergence estimate combined with Borel-Cantelli.Let β < β(φ(B)).Then (C β ) holds and, given δ > 0, Theorem 5 applies with ρ := ce βt and ε = e Summing this over all t ∈ N, we obtain by Borel-Cantelli that for almost ev- Since δ > 0 is arbitrary, this proves the first claim.Now we make the following key observation.For every bounded set S ⊂ GL d (R) and compact set K ⊃ S, Here H(S) is the preimage in GL d (R) under ρ of the linear span H S of all ρ(g), g ∈ S, where ρ is the linear representation with total space E = ⊕ d k=1 ∧ k R d .This follows immediately from the following claim: 2nd claim: There is C = C(S) > 0 such that for all w and t we have: Proof of claim: We note that H S = H Zar(S) = H Zar(S)∩K , because S and Zar(S) ∩ K have the same Zariski closure.Now consider the space L(H S , E) of linear maps from Therefore for any two such sets X, X there is a constant C X ,X > 0 such that for all This applies in particular to the sets X := ρ(S) and X := ρ(Zar(S) ∩ K).Now (16) follows by setting L : A → a t Aw, an element of L(H S , E).This ends the proof of the second claim.
We may now finish the proof of Theorem 4. Since M is connected, it follows from the first claim that the µ-almost sure value of µ 1 (L) for L ∈ M is unique and well-defined and equals β(φ(B)) for any ball B as in Theorem 5.It is also equal to sup L∈M µ 1 (L) by the first part of the claim.However, since φ is analytic, the Zariski closure of φ(B) coincides with Zar(M ).So (15) entails The right-hand side is the µ-almost sure value of µ 1 (L), so this inequality is an equality.This ends the proof.

Proof of Theorem 3
Without loss of generality we may assume that We need to show that for each = 1, . . ., h and for µ-almost every L the limit (2) holds when d −1 < k ≤ d and that for t large enough the first d −1 successive minima of a t LZ d are attained in V −1 .Suppose this has been proved for all < i and let us prove it for = i.By Minkowski's second theorem, for every L ∈ M lim sup On the other hand we already know that for µ-almost every L lim Hence lim sup Therefore to prove (2) it suffices to show that for µ-almost every L lim inf This will also prove (3) for = i as we now explain.By Minkowski's theorem In view of (17) this quantity is actually equal to τ M (V i−1 ).Therefore the d i−1 first minima of a t LZ d are attained in V i−1 , and (3) follows from (18).We now establish (18) separating two cases.
First case: Consider the linear forms L 1 , . . ., L d on R d given by the rows of L. They are linearly independent and have coefficients in Q. Fix ε > 0 and consider the integer solutions v ∈ Z d to the inequality This implies ∀k ∈ {1, . . ., d}, e Recall the subset of indices I defined in (6).
We define By construction M k vanishes on V i−1 and induces a linear form M k on the quotient R d /V i−1 .Also by construction, the linear forms where because by definition of V i and Λ di we know that (d We are thus in a position to apply Schmidt's subspace theorem [33,Theorem 1F] and conclude that all integer solutions to this inequality, except finitely many, lie in a finite union of proper rational subspaces of R d /V i−1 of minimal dimension.Let V /V i−1 be one of them.By definition of the Harder-Narasimhan filtration, And since Since V is rational and the linear forms {M k } k∈I V \I are linearly independent in restriction to V /V i−1 , we may apply Schmidt's subspace theorem once again and conclude that, apart from finitely many of them, the integer solutions are contained in a finite subset of proper rational subspaces.This would however contradict the minimality of V .In turn this implies that if t is large enough all integer solutions to (19) are contained in V i−1 .This shows (18) as desired.
General case: Note that Zar(M ) is an irreducible algebraic variety, so τ M is submodular.In view of (5), given any rational subspace is a proper closed subvariety of Zar(M ) that is defined over Q and of bounded degree.By Lemma 4 below (applied to for all rational V .Then the Harder-Narasimhan filtrations and the Grayson polygons of M and {L 0 } coincide.By the first part of the proof, Now we may invoke Theorem 4. In view of (25) and (17) this implies (18).
Lemma 4 (countable unions of proper subvarieties). Let $X$ be an irreducible algebraic variety defined over a number field $K$. Let $F$ be an algebraic extension of $K$ of infinite degree. Let $k \ge 1$ and suppose $(X_j)_{j\ge 1}$ is a countable family of proper closed subvarieties of $X$ of degree at most $k$, each defined over a field $K_j$ of degree at most $k$ over $K$. Then $X(F)$ is not contained in the union of the $X_j(\overline{\mathbb{Q}})$, $j \ge 1$.
Proof.Looking at a finite cover by affine varieties, we may assume that X is affine.Then by Noether's normalization theorem, there is a finite morphism f : X → A d defined over Q, where d = dim X.So again without loss of generality, we may assume that X = A d , the d-dimensional affine space.But we can of course find elements for all j, so (x 1 , . . ., x d ) will not belong to any X j .

The subspace theorem
We are now ready for the proof of Theorem 1.
Proof of Theorem 1.Without loss of generality, we may assume that M is a bounded subset of GL d (R).In particular there is We are going to show that there is a finite set P , with |P | ≤ (2Cd 2 ε −1 ) d of one-parameter unimodular diagonal semigroups a = (a t ) t≥0 with the following property.If L ∈ M and x ∈ Z d is a solution to (1) such that the integer part t of ε 4d log x is at least 1, then there is a ∈ P such that a t Lx ≤ de −t .
On the other hand Changing some n i to the next or previous integer if needed, we may ensure that Then we set a t = diag(e n1t , . . ., e n d t ) and let P be the finite set of such diagonal semigroups.Note that |P | ≤ (6D) d .Clearly i + n i t ≤ −t and (26) follows.Now we may apply Corollary 1 and conclude that for µ-almost every L ∈ M , if x is a large enough solution of (1), it must lie inside V (a) for some a in P .This shows that the number of exceptional rational subspaces is finite.However by Lemma 3 above the subspace V (a) can take at most b(2 d ) possible values as a varies among all unimodular diagonal semigroups.This ends the proof.
Remark. Note that conversely each $V(a)$, and hence each $V_i$ in Theorem 1, contains infinitely many solutions to (1) for every $\varepsilon > 0$.
Remark. Furthermore the rational subspaces $V_1, \dots, V_r$ depend only on the rational Zariski closure of $M$ and not on the choice of $L$. And because they are defined by a simple slope condition, their height is effectively bounded in terms of the height of $\mathrm{Zar}(M)$. On the other hand, the finite set of exceptional solutions lying outside the $V_i$ depends on $L$ and $\varepsilon$, and there is no known bound on their height or number, see [12, Prop. 5.1]. When $M$ is a single point it is however possible to group together the finitely many exceptional solutions into another set of proper subspaces whose number, but not height, can be effectively bounded, see [11].

Varieties defined over R
What happens if we remove the assumption that the Zariski closure of $M$ is defined over $\overline{\mathbb{Q}}$ in Theorem 3? Without this assumption, diagonal flow trajectories may not behave as nicely and typically no limit shape is to be expected. However we may give a simple upper and lower bound on the almost sure value of $\mu_k(L)$ for $k = 1, \dots, d$, which exists by Theorem 4, in terms of the rational and real Grayson polygons, as we now discuss. So far we have only considered the rational Harder-Narasimhan filtration and its rational polygon $G_{\mathbb{Q}}$ with slopes $s_i^{\mathbb{Q}}$, because we have restricted ourselves to considering the grassmannian of rational subspaces. But we may also take $k = \mathbb{R}$ in §1.
Figure 2: $G_\mu$ lies between the rational and the real polygons

Proof. The upper bound follows from Minkowski's theorem: the first $d_i$ successive minima in $a_t L\mathbb{Z}^d$ are smaller than those attained in Since this holds for each $i = 0, \dots, h_{\mathbb{Q}}$, and the polygons are convex, we get that $G_\mu^{\sup}$ lies below $G_{\mathbb{Q}}$. The lower bound follows from a modified version of the proof of Theorem 4 that uses instead the refined quantitative non-divergence estimates for successive minima already mentioned and established in [30, Chapter 6] or [24, Theorem 5.3]. We now explain this briefly and refer to [30, Theorem 7.3.1] for the details of a more general statement. Let $B$ be a small ball around some $x \in U$. Note that $\phi(B)$ is Zariski dense in $M$ and thus $\tau_M = \tau_{\phi(B)}$. Let $\beta_k$ be the infimum of all $\tau_M(W)$, where $W$ ranges among real subspaces with dimension $k$. In view of (8) this implies that there is $I$ with $I(a) \ge \beta_k$ such that $(Lw)_I \ne 0$ for some $L \in M$. By compactness of the grassmannian of $k$-dimensional subspaces there is $c > 0$ such that $\max_{I,\, I(a) \ge \beta_k} \sup_{L \in M} |(Lw)_I| \ge c\|w\|$ for all $W$. It follows in particular that $\sup_{L\in M} \|a_t L w\| \gg e^{t\beta_k}$ uniformly in $t > 0$ and in non-zero integer valued $w$. Now let $\gamma_k \le \beta_k$ be the largest such function of $k$ that is convex in $k$. Then the exact same Borel-Cantelli argument used in the proof of Claim 1 in Theorem 4, using instead the refined quantitative non-divergence in the form of [24, Theorem 5.3], shows that for $\mu$-almost every we have shown that $G_\mu$ lies above $G_{\mathbb{R}}$. This proposition also holds for measures $\mu$ on $\mathrm{GL}_d(\mathbb{R})$ which are good in the sense of §1.9 below, see [30, Theorem 7.3.1].
If $\mathrm{Zar}(M)$ is defined over $\mathbb{Q}$, then by uniqueness of the Harder-Narasimhan filtration, we see that the real and rational filtrations coincide. In particular, in this case all three polygons coincide. If $\mathrm{Zar}(M)$ is only defined over $\overline{\mathbb{Q}}$, then the filtration over $\mathbb{R}$ is in fact defined over $\overline{\mathbb{Q}}$, and Theorem 3 asserts that $G_\mu$ coincides with $G_{\mathbb{Q}}$. However $G_{\mathbb{R}}$ may be different.

Similarly:
Corollary 2. Let M be as in Theorem 1, except we no longer assume that Zar(M ) is defined over Q.Let a = (a t = exp(tA)) t be a diagonal flow.Assume that R d is M -semistable with respect to a (see Def. 3).Then for µ-almost every

Plücker closure
In this paragraph we define the notion of Plücker closure of a subset $M \subset \mathrm{GL}_d(\mathbb{R})$ and we explain why in Theorems 1 and 3, instead of assuming that $\mathrm{Zar}(M)$ is defined over $\overline{\mathbb{Q}}$, it is enough to assume that the Plücker closure of $M$ is defined over $\overline{\mathbb{Q}}$.
Let $M \subset \mathrm{GL}_d(\mathbb{R})$ be a subset and $(E, \rho)$ be the direct sum of the exterior power representations of $\mathrm{GL}_d$, namely
$$E = \bigoplus_{k=1}^{d} \wedge^k \mathbb{R}^d.$$
We denote by $H_M$ the $\mathbb{R}$-linear span in $\mathrm{End}(E)$ of all $\rho(g)$, $g \in M$. We further define the Plücker closure $H(M)$ of $M$ as the inverse image of $H_M$ under $\rho$ in $\mathrm{GL}_d(\mathbb{R})$. Note that $H(M)$ contains the Zariski closure $\mathrm{Zar}(M)$ of $M$. We say that $H(M)$ is defined over $\overline{\mathbb{Q}}$ if $H_M$ has a basis with coefficients in $\overline{\mathbb{Q}}$ in the canonical basis of $E$. An obvious sufficient condition for this to hold is to ask for $\mathrm{Zar}(M)$ to be defined over $\overline{\mathbb{Q}}$.
We also say that $M$ is Plücker irreducible if $\rho(M)$ is not contained in a finite union of proper subspaces of $H_M$ in $\mathrm{End}(E)$. Clearly Zariski irreducibility implies Plücker irreducibility. By the argument of Lemma 1, Plücker irreducibility of $M$ is enough to guarantee that $\tau_M$ is submodular. It is also clear that $\tau_M = \tau_{H(M)}$.
For simplicity of exposition in this paper, we have chosen to state the assumptions in our main theorems in terms of the Zariski closure of M , but in fact Theorems 1 and 3 hold assuming only that the Plücker closure is defined over Q.The proof is verbatim the same as the one we have given, except that in Theorem 4, we get the following slightly stronger statement: for µ-almost every L in M and each k, Again the proof of this equality is exactly the one given for Theorem 4 when k = 1.However the reduction to the case k = 1 via Mahler's lemma no longer works here, because H(M ) may differ from H(ρ(M )).Instead one may use the enhanced version of the quantitative non-divergence estimates already mentioned in the proof of Proposition 2, namely [30,Chapter 6] or [24,Theorem 5.3] in place of Theorem 5, which enables one to run the argument simultaneously for all k.This shows that the almost sure value of each µ k (L), and thus the polygon G µ defined in the previous paragraph, depend only on the Plücker closure of M .
There is another notion of envelope of M that is also natural to consider, namely the intersection Sch(M ) of all translates S σ g of Schubert varieties S σ = BσB containing M .Here B is one of the d! Borel subgroups containing the diagonal subgroup and the closure is the Zariski closure.This Schubert closure contains the Plücker closure.It is easy to see from (6) that the submodular function τ M depends only on Sch(M ).Restricting to rational translates S σ g with g ∈ GL d (Q) one obtains the rational Schubert closure Sch Q (M ).The asymptotic shape and the exceptional subspaces appearing in Theorems 1 and 3 depend only on Sch Q (M ).A natural question we could not answer is whether or not the main theorem remains valid under the (weaker) assumption that Sch(M ) is defined over Q.The answer is clearly yes when Sch(M ) is defined over Q as follows readily from Proposition 2.

More general measures
In this paragraph, we define a class of measures called good measures, which is wider than the family considered so far of push-forwards of the Lebesgue measure under analytic maps, and for which Theorems 1, 3 and 4 continue to hold. This class is very closely related to the so-called friendly measures of [17,22]: instead of being expressed in terms of an affine span, the non-degeneracy condition is defined in terms of local Plücker closures.
First we need to recall some terminology. We fix a metric on GL d (R), say the one induced from the Euclidean metric on the matrices M d (R). Given two positive parameters C and α, a real-valued function f on the support of µ is called (C, α)-good with respect to µ if for any ball B in GL d (R) and all ε > 0,
$$\mu(\{x \in B : |f(x)| < \varepsilon\}) \le C \left(\frac{\varepsilon}{\|f\|_{\mu,B}}\right)^{\alpha} \mu(B),$$
where ‖f ‖ µ,B = sup x∈B∩Supp µ |f (x)|. The measure µ is doubling on a subset X ⊂ GL d (R) if there exists a constant C such that for every ball B(x, r) ⊂ X, µ(B(x, 2r)) ≤ C µ(B(x, r)).
Then we say that a Borel measure µ is locally good at L ∈ GL d (R) if there exists a ball B around L and positive constants C, α such that
(i) the measure µ is doubling on B;
(ii) for every k ∈ {1, . . ., d}, for every pure k-vector w and every a ∈ GL d (R), the map y → ‖ay • w‖ is (C, α)-good on B with respect to µ.
Recall next that given a subset S ⊂ GL d (R) and a point x ∈ S, the local Plücker closure H x (S) of S at x is the intersection over r > 0 of the Plücker closures of S ∩ B(x, r): $H_x(S) = \bigcap_{r>0} H(S \cap B(x,r))$. We may now define the class of good measures.
Definition 4. A locally finite Borel measure µ on GL d (R) will be called a good measure if Supp µ is Plücker irreducible and if it satisfies the following assumptions for µ-almost every L:
1. The measure µ is locally good at L;
2. The local Plücker closure of Supp µ at L is equal to the Plücker closure of Supp µ.
Of course, the most important example of a good measure is that of the push-forward of the Lebesgue measure under an analytic map.
Proposition 3 (Analytic measures are good). Let n ∈ N, let U be a connected open set in R n , and let ϕ : U → GL d (R) be a real-analytic map. Then the push-forward under ϕ of the Lebesgue measure on U is a good measure on the Zariski closure M of ϕ(U ).
Proof. Note that the maps of the form u → ‖aϕ(u) • w‖² are linear combinations of products of matrix coefficients of ϕ(u) and hence belong to a finite dimensional linear subspace of analytic functions on U independent of the choice of a ∈ GL d (R) and w ∈ ∧ * R d . So [16, Proposition 2.1] applies.
Theorem 4 holds for all good measures µ on GL d (R) with the same proof (suitably modified via the enhanced quantitative non-divergence estimates as mentioned in the previous paragraph). In fact the definition of good measures has been tailored precisely for Theorem 4 to hold. Thus the subspace theorem for manifolds and the parametric subspace theorem, Theorems 1 and 3, hold for good measures µ such that the Plücker closure of Supp(µ) is defined over Q.

A generalization to number fields
Schmidt himself observed in [32] that his theorem for the field Q of rationals implied a more general version for any number field K, and this was generalized shortly after by Schlickewei [31], who gave a statement allowing also finite places. In this section, we formulate a similar generalization of Theorems 1 and 3.
For a place v of a number field K, the completion of K at v is denoted by K v . As in Bombieri-Gubler [5, § 1.3-1.4] we use the following normalization for the absolute value: $|x|_v = |N_{K_v/\mathbb{Q}_v}(x)|_{\mathbb{Q}_v}^{1/[K:\mathbb{Q}]}$, where Q v is the completion of Q at the place v restricted to Q and N Kv/Qv (x) is the norm of x in the extension K v /Q v . The product formula then reads ∏ v |x| v = 1 for all nonzero x ∈ K. If S is a finite set of places of K containing all archimedean ones, the ring O K,S ⊂ K of S-integers is the set of x ∈ K such that |x| v ≤ 1 for all places v lying outside S. Elements of its group of units are called S-units.
Let d be a positive integer. The d-dimensional space K d v will be endowed with the supremum norm ‖ • ‖ v , i.e. for an element x = (x More generally, we let K S = ∏ v∈S K v be the product of all completions of K at the places of S, and if x = (x (v) ) v∈S is an element of K d S , we define its norm ‖x‖, its content c(x) and height H(x) by It is clear that c(x) ≤ H(x) and ‖x‖ ≤ H(x) ≤ max{1, ‖x‖ |S| }. It follows from the product formula that ‖x‖ ≥ 1 if 0 ≠ x ∈ O K,S . The image of O K,S in K S under the diagonal embedding is discrete and cocompact in K S [23, Chapter VII], and closed balls in K S for the norm are compact. Furthermore, it is easily seen from Dirichlet's unit theorem that there is a constant C = C(K, S) > 0 such that for all x ∈ K d S with c(x) ≠ 0, there is an S-unit α ∈ O K,S such that For each v ∈ S, we denote by GL d (K v ) the group of invertible d × d matrices with coefficients in K v , and we set A product measure µ = ⊗ v∈S µ v on GL d (K S ) will be called a good measure if each µ v is a good measure on GL d (K v ). The definition of a good measure given in §1.9 for K v = R extends verbatim to other local fields K v .
Examples of good measures are provided by push-forwards of Haar measure under strictly analytic maps. Indeed Proposition 3 continues to hold for analytic maps φ : U → GL d (K v ) whose coordinates are defined by convergent power series on a ball U. This is because, on the one hand, the push-forward of Haar measure on K v under φ will be locally good everywhere by [22, Prop. 4.2], and on the other hand, the Plücker closure of the image of a ball B ⊂ U of positive radius is independent of the ball, because convergent power series that vanish on an open ball must vanish everywhere. Let We will say that the Plücker closure of µ is defined over Q if for each v ∈ S, the subspace H Supp µv of End(E v ) (see §1.8) is defined over Q, i.e. is the zero set of a family of linear forms on End(E v ) with coefficients in Q∩K v . Clearly this is the case if for each v ∈ S, the Zariski closure of Supp µ v in GL d (K v ) is defined over Q. We will also denote by H(µ) the cartesian product of all H(µ v ), where We are now ready to state:
Theorem 6 (Subspace theorem for manifolds, S-arithmetic version). Let K be a number field, S a finite set of places including all archimedean ones, O K,S its ring of S-integers, and d in N. Let µ be a good measure on GL d (K S ) whose Plücker closure is defined over Q. Then there are proper subspaces V 1 , . . ., V r of K d such that for µ-almost every L and for every ε > 0, the inequality has only finitely many solutions up to multiplication by an S-unit.
Note that the left and right-hand sides of (28) are unchanged if x is changed into αx for some S-unit α, so we will focus on the equivalence classes of solutions. The bound on the number r of exceptional subspaces depends only on d and |S|, and each subspace contains infinitely many (classes of) solutions to (28).
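For readers who want to experiment with the product formula and with S-integers, the following is a minimal sketch in the simplest case K = Q, where the places are the usual absolute value and the p-adic absolute values. The function names are hypothetical and chosen for illustration only; nothing here is taken from [5].

```python
from fractions import Fraction


def prime_factors(n):
    """Set of primes dividing the positive integer n (trial division)."""
    factors, p = set(), 2
    while p * p <= n:
        while n % p == 0:
            factors.add(p)
            n //= p
        p += 1
    if n > 1:
        factors.add(n)
    return factors


def p_adic_abs(x, p):
    """p-adic absolute value |x|_p = p^(-ord_p(x)) of a nonzero rational x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("x must be nonzero")
    num, den, order = abs(x.numerator), x.denominator, 0
    while num % p == 0:
        num //= p
        order += 1
    while den % p == 0:
        den //= p
        order -= 1
    return Fraction(1, p) ** order


def product_over_all_places(x):
    """Product of |x|_v over the archimedean place and all primes dividing x."""
    x = Fraction(x)
    if x == 0:
        raise ValueError("the product formula concerns nonzero x")
    primes = prime_factors(abs(x.numerator)) | prime_factors(x.denominator)
    result = abs(x)
    for p in primes:
        result *= p_adic_abs(x, p)
    return result


def is_S_integer(x, S):
    """x is an S-integer of Q when |x|_p <= 1 for every prime p outside S."""
    return prime_factors(Fraction(x).denominator) <= set(S)
```

For instance, for x = -12/35 the archimedean factor is 12/35 and the factors at 2, 3, 5, 7 are 1/4, 1/3, 5, 7, so the product is 1.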
When µ is a Dirac mass at a point , then the theorem is exactly the S-arithmetic Schmidt subspace theorem as stated in [5, 7.2.5].

Parametric version
Theorem 6 is deduced from Theorem 7 below, which is a parametric version analogous to Theorem 3. To formulate it, we need to define the S-arithmetic analogues of a lattice and its successive minima. A family of vectors x 1 , . . ., x k in K d S is said to be linearly independent if it spans a free K S -submodule of rank k. Equivalently the vectors x In other words, it is a subgroup ∆ ≤ K d S that can be written for some linearly independent elements Conversely they are all of this form. Note that if x 1 , . . ., x k are vectors from a lattice ∆, they are linearly independent if and only if x k are linearly independent for some place v.
is its k-th successive minimum, where c(x j ) is the content defined earlier.
It follows from (27) that we could have defined the successive minima using ‖x j ‖ |S| in place of c(x j ) without much difference. In particular either definition will suit the theorem below.
The analogue of Theorem 3 now reads as follows. Fix a diagonal element a = diag(a , and consider the flow a t = a t , for every t ∈ N.
Theorem 7 (S-arithmetic strong parametric subspace theorem). Let K be a number field and S a finite set of places containing all archimedean ones. Let (a t ) t∈N be a diagonal flow in GL d (K S ) and µ a good measure on GL d (K S ) whose Plücker closure is defined over Q. Then there are K-subspaces and real numbers s 1 < . . . < s h such that for each i = 1, . . ., h and for µ-almost every L, In other words for every ε > 0 and µ-almost every L, there is
The K-subspaces V i , i = 1, . . ., h, appearing in Theorem 7 are the terms of the Harder-Narasimhan filtration associated to M := Supp µ ⊂ GL d (K S ) and the quantities s i are its slopes. We describe this filtration in the next paragraph.

Harder-Narasimhan filtrations for K d S
In this paragraph we define a submodular function on the Grassmannian Grass(K d ) that generalizes the expansion rate τ M defined earlier in §1.2. Given a K S -submodule V in K d S we define its expansion rate as follows is the content as defined earlier. Here we identify ∧ k K v with K N v , N = d k and use the standard basis e i1 ∧ . . . ∧ e i k to define the norm, and note that As in (8) above, we see that if where i |, and w I is defined by the expression w = Σ |I|=k w I e I , where e I = e i1 ∧ . . . ∧ e i k when I = {i 1 , . . ., i k }.
We will say that M is Plücker irreducible if its projection to GL d (K v ) is Plücker irreducible for each v ∈ S. Under this assumption we see by the same argument as in Lemma 1 that τ M is submodular on the set of all K S -submodules of K d S . If we restrict τ M to the set of K-subspaces of K d , i.e. the Grassmannian Grass(K d ), then we obtain a well-defined notion of Harder-Narasimhan filtration, Grayson polygon and slopes. This gives what we will call the rational Grayson polygon G K . A K-linear space V will be called M -semistable for the semigroup (a t ) if for every K-subspace W < V , But we may also consider τ M as a function on the set of all K S -submodules of K d S . Since the dimension of each projection to K v may not be the same for all v, we use the following definition for the dimension of a submodule Then dim V is a modular function on the "full Grassmannian", i.e. the set of all K S -submodules of K d S . Thus Proposition 1 and its proof are still valid and we obtain a "full" Harder-Narasimhan filtration and a "full" Grayson polygon G K S , whose nodes now have x-coordinates in (1/|S|) N.
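To make the role of the fractional x-coordinates concrete, here is one normalization of the dimension that is consistent with the nodes of the full polygon lying in (1/|S|) N. It is only a sketch under the assumption that the dimension of a submodule is the average of its local dimensions; the notation V_v for the projection of V to K_v^d is introduced here for illustration.

```latex
\[
  \dim V \;=\; \frac{1}{|S|}\sum_{v\in S}\dim_{K_v} V_v ,
  \qquad V=\prod_{v\in S} V_v \subset K_S^d .
\]
% Modularity then follows place by place from
\[
  \dim_{K_v}(V_v+W_v)+\dim_{K_v}(V_v\cap W_v)
  \;=\;\dim_{K_v}V_v+\dim_{K_v}W_v .
\]
```

With this choice, the K_S-submodule generated by a K-subspace of dimension k has dimension exactly k, so the two notions agree on Grass(K^d).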

Inheritance principle and proofs
In this section we discuss the proof of Theorems 6 and 7. A basic ingredient is the S-arithmetic version of Minkowski's second theorem, which, without paying attention to numerical constants, takes the following form:
Theorem 8 (Minkowski's second theorem). Let ∆ be a sublattice in K d S as in (5). Then where the constant involved in the Vinogradov notation depends only on K, S and d, not on ∆.
The content c(x 1 ∧ . . .∧ x k ) is proportional to the covolume of ∆ in its K S -span.See [5,Theorem C.2.11,page 611] for a proof when S has no nonarchimedean places and [21] in the general case with the caveat that the normalizations used in the latter paper differ from ours especially at the complex place, leading to a slightly different definition of the successive minima.
The derivation of Theorem 6 from Theorem 7 is verbatim the same as that of Theorem 1 from Theorem 3 given in §1.6. One replaces d with d|S| and treats all linear forms L (v) i on an equal footing. The constant D needs to be increased appropriately and at non-archimedean places v the exponential used in the definition of the flow will be replaced by a power of a uniformizer π v of K v , namely a ) for integers n i,v . One needs also to recall that in view of (27), for each T > 0 there are only finitely many classes of x ∈ O d K,S with c(x) ≤ T , so we may assume that c(x) is large. The number r of distinct V i obtained is similarly bounded by b(2 d|S| ) as in Lemma 3. Now the proof of Theorem 7 is again verbatim the same as that of Theorem 3, treating all L (v) i on an equal footing and keeping the argument unchanged. One needs to invoke the S-arithmetic subspace theorem (in the form of Theorem 6 for a single point, or as [5, 7.2.5]) in place of the ordinary subspace theorem. For the dynamical ingredient at the end, one applies instead the following generalization of Theorem 4.
Theorem 9 (Inheritance principle). Let K be a number field, S a finite set of places containing all archimedean places, and d in N. Let (a t ) be a diagonal one-parameter semigroup in GL d (K S ), and µ a good measure on GL d (K S ) with for each L ∈ GL d (K S ). Then, for µ-almost every L,
Sketch of proof. When k = 1 the proof of this result is identical to the proof we gave of Theorem 4 with the following adjustments. Theorem 5 was extended to this context by Kleinbock-Tomanov in [22, §8.4] (technically speaking only for K = Q, but the general case is entirely analogous). The norm there and in the definition (C β ) of β must be replaced by the content of the corresponding vector. Theorem 8 must be used in place of the original Minkowski theorem to prove the first claim. Also (16) continues to hold with the content in place of the norm for subsets of GL d (K S ) that are cartesian products of subsets of GL d (K v ) for v ∈ S, which is the case for Supp(µ) by definition of a good measure. So the second claim also holds. The case k > 1 is analogous, but as in the proof of Prop. 2, one needs to use the refined quantitative non-divergence estimate proved in [30, Chapter 6] and [24, Theorem 5.3] instead of Theorem 5.
We end this section by stating the analogue of Proposition 2 in the S-arithmetic context. The following describes what is left of Theorem 7 if we remove the assumption that the Plücker closure of the good measure µ is defined over Q. In §2.2 we have defined two Grayson polygons: the rational one G K coming from Grass(K d ) and the full one G K S coming from the full Grassmannian of all K S -submodules. And of course we have as before the polygon G µ with nodes (k, µ k ), k ∈ [1, d], where µ k is the µ-almost sure value of µ k (L) provided by Theorem 9, and the polygon G sup µ with nodes (k, µ sup k ), where µ sup k is the supremum of µ sup k (L) over L ∈ M and µ sup k (L) is defined as µ k (L) with a limsup in place of liminf.
Proposition 4 (S-arithmetic sandwich theorem).The polygons G µ , G sup µ lie in between G K and G K S , i.e.
Again the proof is mutatis mutandis that of Proposition 2.

Examples and Applications
In this last section we present a number of applications of the main theorem.

Sprindzuk conjecture
We begin by demonstrating how the main result of Kleinbock and Margulis [18, Conjectures H1, H2], namely the Sprindzuk conjecture, can easily be deduced from Theorem 1. Let us recall this result. For q ∈ Z d we define $\Pi_+(q) := \prod_{i=1}^{d} |q_i|_+$, where |x| + := max{1, |x|} for all x ∈ R. A point y ∈ R d is said to be very well multiplicatively approximable (or VWMA for short) if for some ε > 0 there are infinitely many q ∈ Z d such that A manifold M ⊂ R d is said to be strongly extremal if Lebesgue almost every point on M is not VWMA.
As often with applications of the subspace theorem, proofs proceed by induction on dimension. The induction hypothesis will be as follows: if g = (g 0 , . . ., g d ) is a tuple of linearly independent analytic functions on U and b 1 , . . ., b d ∈ Z d are linearly independent, then for almost every x ∈ U and every ε > 0, there are only finitely many solutions v ∈ Z d+1 to the inequalities It is straightforward that this statement implies Theorem 10, by letting v = (p, q), g = (1, f ) and b 1 , . . ., b d the standard basis of Z d .
Let φ(x) be the matrix whose rows are L 0 /g 0 , L 1 , . . ., L d . The linear independence assumption implies that φ(x) ∈ GL d+1 (R) on an open subset U′ ⊂ U , and without loss of generality we may assume that U′ = U . The equations defining the Plücker closure of φ(U ) are linear combinations of k × k minors of φ(x). Since those are linear combinations of the g i /g 0 , the Plücker closure of φ(U ) is the Plücker closure of the set of all matrices in GL d+1 (R) whose first row is (1, y) with y ∈ R d arbitrary and whose other rows are L 1 , . . ., L d . So it is defined over Q.
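The quantity Π_+(q), the product of the |q_i|_+, is easy to compute by hand but a small sketch may help fix the convention that zero coordinates contribute a factor 1. The function names below are hypothetical and serve only as an illustration.

```python
from math import prod


def abs_plus(x):
    """|x|_+ = max(1, |x|), so that zero coordinates contribute a factor 1."""
    return max(1, abs(x))


def pi_plus(q):
    """Pi_+(q): the product of |q_i|_+ over the coordinates of the integer vector q."""
    return prod(abs_plus(qi) for qi in q)
```

For q = (0, -3, 2) the three factors are 1, 3 and 2, so Π_+(q) = 6, whereas the plain product of the coordinates would vanish.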
We may thus apply Theorem 1 and conclude that there is a finite number of proper rational hyperplanes V , such that for almost every x, the large enough solutions v to (31) are contained in some V .
Now pick one such V , and consider the restriction of The L i are linearly independent, so the (L i ) d 0 have rank at least d on R d , and thus (c 1 , . . ., c d ) has rank at least d − 1. Up to reordering, we may assume that c 1 , . . ., c d−1 are linearly independent. Similarly we see that h 1 , . . ., h d are linearly independent.
Finally observe that a solution v ∈ V to (31) yields a k ∈ Z d such that k ε and thus, by induction hypothesis, k belongs to a finite set of points. Hence so does v. This ends the proof.

Ridout's theorem for manifolds
Ridout's theorem [5,39] is an extension of Roth's theorem where p-adic places are allowed. This improves the exponent in Roth's theorem from 2 to 1 in case the rational approximations have denominators with prime factorization in a fixed subset. In this paragraph, we present one possible similar variant of the Kleinbock-Margulis theorem (Theorem 10). This will rely on the S-arithmetic subspace theorem for manifolds (Theorem 6).
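To make the change of exponent explicit, the two classical statements can be placed side by side. This is a sketch of the simplest special case of Ridout's theorem (only the denominator is constrained), written for a real algebraic irrational number α, a finite set of primes S and any ε > 0.

```latex
% Roth: no restriction on the denominator
\[
  \left|\alpha-\frac{p}{q}\right| < q^{-2-\varepsilon}
  \quad\text{has finitely many solutions } (p,q)\in\mathbb{Z}\times\mathbb{N}.
\]
% Ridout (special case): all prime factors of q lie in S
\[
  \left|\alpha-\frac{p}{q}\right| < q^{-1-\varepsilon}
  \quad\text{has finitely many solutions with all prime factors of } q \text{ in } S.
\]
```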
Theorem 11.Let S be a finite set of primes.Let U ⊂ R n be a connected open subset and f 1 , . . ., f d : U → R be real analytic functions, which together with 1 are linearly independent over R. Write f = (f 1 , . . ., f d ).Then for every ε > 0 and for Lebesgue almost every u ∈ U , we have for all (p 1 , . . ., p d , q) ∈ Z d+1 with all prime factors of q in S, except for finitely many exceptions.
This is in contrast with the same result [18] without the restriction on the denominator q, where the right-hand side of (32) needs to be replaced by the weaker bound $q^{-1/d-\varepsilon}$. We have chosen simultaneous approximation for a change, but a similar statement with a similar proof holds also for linear forms. Besides, the theorem holds under the weaker assumption that the subspace of R d+1 spanned by all vectors (1, f 1 (x), . . ., f d (x)), x ∈ U , is defined over Q.
Proof sketch. The proof is a straightforward modification of the one we gave of Theorem 10 in §3.1. We define linear forms on R d+1 , L (∞) u,i is constant equal to x i . And we note that if x := (q, p 1 , . . ., p d ) ∈ Z d+1 contradicts (32), then So Theorem 6 applies and if q is large enough, x belongs to a finite family of proper rational subspaces. This allows us to use induction after restricting the linear forms to one of these subspaces, as in the proof of Theorem 10. The details are left to the reader.

Submanifolds of matrices: extremality and inheritance
In this paragraph, we describe the extension of the Kleinbock-Margulis theorem to the case of submanifolds of matrices and we show how to recover from the subspace theorem for manifolds one of the main results of [2], which is a criterion for extremality in terms of so-called constraining pencils and an explicit computation of the exponent.
In what follows, E and V are two finite-dimensional real vector spaces with a Q-structure. We fix a lattice ∆ in V , which defines the rational structure. For x in Hom(V, E), we define Note that β(x) only depends on the subspace ker x in V .
Given a subspace W in V and an integer r, we define the pencil of endomorphisms P W,r by We say that P W,r is constraining if dim W/r < dim V / dim E and rational if W is so. In the case of algebraic sets defined over Q, the following theorem was proved in joint work with Menny Aka and Lior Rosenzweig [2, Theorem 1.2]. The approach taken here yields a different proof of that result, and allows one to generalize it to subsets defined over Q.
Remark. For any sublattice ∆′ ≤ ∆, one may define Then, the formula in the above theorem is simply, for almost every x in M ,
Proof of Theorem 12. We prove the theorem by induction on d = dim V , using the subspace theorem. Given x in M , we denote by x i , 1 ≤ i ≤ d, its columns, which are vectors in E. The rank of x is almost everywhere constant, and taking a coordinate projection if necessary, we assume that this rank is m ≤ d and that for almost every x in M , the vector space E is spanned by the first m columns x i , 1 ≤ i ≤ m. We want to show that, for almost every x in M , for all ε > 0, the set of inequalities has only finitely many solutions v = (v 1 , . . ., v d ) in Z d . For 1 ≤ i ≤ m, define a linear form on V ≃ R d by and for m < i ≤ d, L i (v) = v i . Since (x i ) 1≤i≤m spans E, the family of linear forms (L i ) 1≤i≤d is linearly independent. Moreover, as We may therefore apply Theorem 1: there exists a finite family V 1 , . . ., V h of hyperplanes in Q d such that, for almost every x in M , the integer solutions to (33) all lie in the union V 1 ∪ • • • ∪ V h except a finite number of them. It now suffices to check that in each V i , there can be only finitely many solutions. This follows from the induction hypothesis applied to V′ = V i , ∆′ = V i ∩ ∆ and to the manifold M′ , the image of M under restriction to V′ . The converse inequality β(x) ≥ β is true for all x in M , as is easily seen using the classical Dirichlet argument.
It is also worth observing the following relation between the notions of extremality and semistability. In [18,19] an analytic submanifold M of M m,n (R) is said to be extremal if for almost every Y ∈ M and every ε > 0 there are only finitely many vectors q ∈ Z n and p ∈ Z m such that ‖Y q − p‖ ≤ ‖q‖ −(n/m+ε) . Now consider the matrix The image M′ of M under Y → L Y defines a submanifold of GL n+m (R). Using Proposition 2 and Theorem 3, it is then easy to see that:
Proposition 5 (extremality vs. semistability). If R n+m is M′-semistable with respect to the unimodular flow a t = (e tn , . . ., e tn , e −tm , . . ., e −tm ), then M is extremal. If the Zariski (or Plücker) closure of M is defined over Q, then the almost sure diophantine exponent β from Theorem 12 can be read off the rational Grayson polygon of M′ by the formula where γ is the smallest slope of the rational Grayson polygon. In particular Q n+m is M′-semistable if and only if M is extremal.
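The unimodularity of the flow in Proposition 5 amounts to the exponents summing to zero, which forces exactly m entries equal to e^{tn} and n entries equal to e^{-tm}. The following short sketch records this bookkeeping; the function names are hypothetical and the split into m and n entries is an assumption read off from the unimodularity requirement.

```python
def extremality_flow_exponents(m, n):
    """Exponents A_i of a_t = diag(e^{t A_1}, ..., e^{t A_{m+n}}).

    Assumed layout: m entries equal to n, followed by n entries equal to -m.
    """
    return [n] * m + [-m] * n


def is_unimodular(exponents):
    """The diagonal flow has determinant 1 for all t iff the exponents sum to 0."""
    return sum(exponents) == 0
```

For example, with m = 2 and n = 3 the exponents are (3, 3, -2, -2, -2), whose sum is 2·3 - 3·2 = 0.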

Multiplicative approximation and strong extremality
Following the suggestion of Baker [3, page 96] to study the multiplicative diophantine properties of the Mahler curve, Kleinbock and Margulis introduced the notion of strong extremality for manifolds in R n . This was later generalized in [19,4] to the context of diophantine approximation on matrices, but in that generalized setting the optimal criterion for strong extremality remained to be found [19,4]. The method of the present paper can be used to answer this problem. Below we apply Theorem 1 and give a complete solution when the Zariski closure of the manifold is defined over Q.
Let m and n be two positive integers, and M m,n (R) the space of m × n matrices with real entries. Following Kleinbock and Margulis [18], we say that a matrix Y ∈ M m,n (R) is very well multiplicatively approximable (VWMA) if there exists ε > 0 such that the inequality has infinitely many solutions (p, q) ∈ Z m × Z n . In the above inequality, Y i denotes the i-th row of Y , for i = 1, . . ., m, and |q| + = max(|q|, 1). More generally, we define as in [8, §1.4] and denote by L i , i = 1, . . ., m + n the linear forms on R m+n given by the rows of the matrix L Y : If W is a linear subspace of R n+m , and I a non-empty subset of {1, . . ., m + n}, we let s I,W = rk(L i | W ) i∈I .
Definition 7 (Multiplicative pencils). Let I, J be proper subsets of {1, . . ., m+ n} such that I ⊂ {1, . . ., m} ⊂ J and r, s non-negative integers. Given a subspace W ≤ R n+m , we define a subvariety of endomorphisms P I,J,r,s,W ⊂ M m,n (R) by To justify the relevance of this definition to our problem, we start with an easy proposition, which is a consequence of Minkowski's first theorem or Dirichlet's pigeonhole principle. Then,
Remark. By convention, if r = 0, the ratio is equal to +∞. Note that one can always take W = R m+n , I = J = {1, . . ., m}, and r = s = m, in which case the ratio is equal to 1.
Proof. If r = 0, then one must have Y i q − p i = 0 for some integer vector (p, q) in Z m+n . It is then clear that ω × (Y ) = ∞. So we may assume that r ≠ 0.
Since by definition ω × (Y ) ≥ 0, we may also assume that s < k, otherwise there is nothing to prove.
Assuming that Y ∈ P I,J,r,s,W , we shall prove that there exists a constant C > 0 depending only on Y and W such that the inequality has infinitely many solutions v in W . This will yield the desired lower bound on ω × (Y ). Let Q > 0 be a large parameter. Pick i 1 , . . ., i r in I such that L i1 | W , . . ., L ir | W are linearly independent, pick i r+1 , . . ., i s in J such that L i1 | W , . . ., L is | W are linearly independent, and complete with i s+1 , . . ., i k such that L i1 | W , . . ., L i k | W are linearly independent. The symmetric convex body in W defined by has volume 1 and therefore, by Minkowski's first theorem, it contains a nonzero point v in W ∩ Z n+m . By our choice of the indices i ,
Now, as in the introduction, let M = φ(U ) be a connected analytic submanifold of M m,n (R) endowed with the measure µ equal to the push-forward of the Lebesgue measure under the analytic map φ : U → M m,n (R). When the Zariski closure of M is defined over Q (for example when φ is given by a polynomial map with coefficients in Q), the above proposition actually provides a formula for ω × (Y ), when Y is a µ-generic point of M . In other words:
Theorem 13 (Formula for the multiplicative exponent). Assume that the Zariski closure of M is defined over Q.
We stated the result for analytic submanifolds for convenience, but it holds with the same proof for all good measures µ in the sense of §1.9 provided the Zariski closure of the support of µ is defined over Q. We shall say that a multiplicative pencil P I,J,r,s,W is constraining if it satisfies Our criterion for strong extremality immediately follows from the above formula.
Corollary 3 (Criterion for strong extremality). Let M be an analytic submanifold of M m,n (R) whose Zariski closure is defined over Q. Then M is strongly extremal if and only if it is not contained in any rational constraining pencil.
The proof of Theorem 13 is inspired by Schmidt's proof of an analogous result on products of linear forms [33, §12, page 242].
Proof of Theorem 13. Let ω > 0 be such that for Y in a set of positive measure in M , the inequality has infinitely many solutions (p, q) in Z m+n . We want to show that M is included in a pencil P I,J,r,s,W such that Let W be a rational subspace of minimal dimension k containing infinitely many solutions to (34). If k = 1, then there must exist i ∈ {1, . . ., m} and (p, q) ∈ Z m+n such that Y i q−p i = 0, and one can take I = {i}, J = {1, . . ., m}, r = 0, and s = 1. So we assume k ≥ 2. Reordering the indices if necessary, we may assume that W contains infinitely many solutions to (34) satisfying and By (35), one has c 1 ≥ • • • ≥ c f ≥ 0. By (36) and the fact that one always has Moreover, by minimality of W , the subspace theorem applied in W to the set of linear forms In conclusion, one sees that the k-tuple (c 1 , . . ., c f , d 1 , . . ., d g ) belongs to the convex polytope (P ) defined by ] is non-negative at some point of the convex polytope (P ), so it must be non-negative at one of its vertices. The polytope (P ) has gf vertices, given by
Remark. Strong extremality also relates to semistability in a similar way as in Proposition 5. We leave it to the reader to check that in the setting of Theorem 13, M is strongly extremal if and only if Q m+n is semistable for {L Y , Y ∈ M } with respect to all unimodular flows a t = (e tA1 , . . ., e tAn+m ) with A 1 ≥ . . . ≥ A m ≥ 0 ≥ A m+1 ≥ . . . ≥ A m+n . See [19] where the relevance of this family of flows for strong extremality was uncovered.
The case when k = 2 and G = (R, +) is exactly Roth's theorem.In this case β 2 = 1 and U is the family of lines in R 2 with rational slopes.
The basic idea for the proof of Theorem 15, developed in [2], is to reduce the problem to a question of diophantine approximation on submanifolds. For that, we introduce the free Lie algebra F k over k generators x 1 , . . ., x k . Inside F k , the ideal of laws L k,g on the Lie algebra g of G is the set of elements r in F k such that r(X 1 , . . ., X k ) = 0 for every X 1 , . . ., X k in g, and the ideal of rational laws L k,g,Q is the real span of the intersection of L k,g with F k (Q), the natural Q-structure on F k . The Lie algebra F k,g,Q = F k /L k,g,Q has a graded structure where k,g,Q is the homogeneous part of F k,g,Q consisting of brackets of degree i. For r = r i with r i ∈ F where ‖ • ‖ is a fixed norm on F k,g,Q . Endowed with this quasi-norm, the Lie algebra F k,g,Q is quasi-isometric to the group of word maps on G, endowed with the word metric [2, Proposition 7.2]. This yields the following characterization for the above diophantine exponent β(Γ g ) proved in [2, Proposition 7.3]. We say that Γ g is relatively free in G if the only relations satisfied by g are the laws of G. This holds for all g outside a countable union of proper algebraic subvarieties of bounded degree defined over Q, and in particular for Lebesgue almost every g ∈ G k .
Proposition 7. Let G be a simply connected nilpotent Lie group, with Lie algebra g. Let g = (e X1 , . . ., e X k ) be a k-tuple in G such that Γ g is relatively free in G. Then the exponent β(Γ g ) defined above is also the infimum of all β > 0 such that $\|r(X_1,\dots,X_k)\| \ge |r|^{-\beta}$ (38) holds for all but finitely many r ∈ F k,g,Q (Z).
In the next paragraph, we show how Theorem 1 yields a formula for diophantine exponents defined with quasi-norms, as in the above proposition.

Weighted diophantine approximation
This paragraph generalizes the results of §3.3 to diophantine approximation with quasi-norms, also called weighted diophantine approximation (see e.g. [20, §1.5] and references therein). Again, E and V are two finite-dimensional real vector spaces, and ∆ is a lattice in V , defining a rational structure.
We fix a norm ‖ • ‖ on E. On V , we measure the size of vectors using a quasi-norm | • | given by the formula where α = (α 1 , . . ., α d ) is a d-tuple of positive real numbers and (u * i ) 1≤i≤d a basis of V * . Given x in Hom(V, E), define the diophantine exponent of x by
Remark. It is not hard to see that this limit exists and equals Σ i∈I W α i for a certain subset I W ⊂ [1, d]. Indeed the restriction of | • | to W is itself comparable up to multiplicative constants to a quasi-norm with exponents α i , i ∈ I W , where I W = {i 1 , . . ., i k } is defined as follows. Choose i 1 minimal such that the restriction of u * i1 to W is non-zero, then inductively choose i j minimal such that the linear forms u * i1 | W , . . ., u * ij | W are linearly independent.
This gives us a lower bound for the diophantine exponent, using a standard Dirichlet type argument: It turns out that this lower bound is in fact attained almost everywhere on analytic submanifolds of Hom(V, E) whose Zariski closure is defined over Q. This is the content of the next theorem. As earlier, we endow M := φ(U ) with the push-forward µ of the Lebesgue measure on the connected open subset U ⊂ R d via the analytic map φ : U → Hom(V, E). (40)
This equality is also true for every Q-point of Zar(M ) outside a union of proper algebraic subsets of M defined over Q and of bounded degree.
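The greedy choice of the index set I_W described in the remark can be sketched as follows. This is an illustration under stated assumptions, not the authors' procedure: W is given by a basis written in the coordinates dual to (u*_i), so that the restriction of u*_i to W is the i-th column of the basis matrix, and all entries are rational. The helper names are hypothetical.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of equal-length rational vectors, by Gaussian elimination."""
    m = [[Fraction(x) for x in r] for r in rows]
    rk = 0
    ncols = len(m[0]) if m else 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(m)) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(rk + 1, len(m)):
            f = m[i][c] / m[rk][c]
            m[i] = [a - f * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk


def greedy_index_set(basis_of_W):
    """Indices i_1 < ... < i_k (1-based): scan i upward and keep i whenever
    u*_i restricted to W is independent of the restrictions already kept."""
    d = len(basis_of_W[0])
    chosen = []
    for i in range(d):
        columns = [[w[j] for w in basis_of_W] for j in chosen + [i]]
        if rank(columns) == len(chosen) + 1:
            chosen.append(i)
    return [i + 1 for i in chosen]


def weight_of_subspace(basis_of_W, alpha):
    """Sum of alpha_i over i in I_W, the value attached to W in the remark."""
    return sum(alpha[i - 1] for i in greedy_index_set(basis_of_W))
```

For W spanned by (0, 1, 1, 0) and (0, 0, 0, 1) in a 4-dimensional V, the form u*_1 vanishes on W, u*_2 is kept, u*_3 agrees with u*_2 on W and is skipped, and u*_4 is kept, so I_W = {2, 4}.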
Remark. Note that $\alpha$ can take only finitely many values, so the maximum and minimum in the above formula are indeed attained.
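The selection rule for $I_W$ described in the remark on $\alpha(W)$ is a greedy procedure, and it can be sketched in a few lines of Python. This is only an illustration under stated assumptions: $W$ is given by a list of basis vectors with rational entries, written in the basis of $V$ dual to $(u_i^*)$, so that $u_i^*(w)$ is simply the $i$-th coordinate of $w$. The function names `rank`, `index_set` and `alpha_of` are hypothetical and do not come from the paper.

```python
from fractions import Fraction


def rank(rows):
    """Rank of a list of row vectors, by exact Gaussian elimination over Q."""
    m = [[Fraction(v) for v in row] for row in rows]
    ncols = len(m[0]) if m else 0
    rk = 0
    for col in range(ncols):
        pivot = next((r for r in range(rk, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for r in range(rk + 1, len(m)):
            factor = m[r][col] / m[rk][col]
            m[r] = [a - factor * b for a, b in zip(m[r], m[rk])]
        rk += 1
    return rk


def index_set(basis_of_W):
    """Greedy index set I_W (1-based indices), following the rule of the remark.

    Assumption: each vector of basis_of_W is written in the basis dual to
    (u_1^*, ..., u_d^*), so the restriction of u_i^* to W is represented by
    the tuple of i-th coordinates of the basis vectors.
    """
    d = len(basis_of_W[0])
    chosen, forms = [], []
    for i in range(d):
        restricted = [w[i] for w in basis_of_W]
        # Keep index i only if u_i^*|_W is independent of the forms kept so far.
        if rank(forms + [restricted]) > len(forms):
            chosen.append(i + 1)
            forms.append(restricted)
    return chosen


def alpha_of(basis_of_W, alpha):
    """Sum of the weights alpha_i over i in I_W."""
    return sum(alpha[i - 1] for i in index_set(basis_of_W))
```

For instance, with $d = 3$ and $W$ spanned by $(1,1,0)$ and $(0,0,1)$, the forms $u_1^*|_W$ and $u_2^*|_W$ coincide, so the rule keeps the indices 1 and 3.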
We can now easily derive our theorem about nilpotent groups.
The map $G^k \to \mathrm{Hom}(V, E)$, $g \mapsto x_g$ is a polynomial map with coefficients in $K$. In particular the Zariski closure of its image is defined over $\mathbb Q$. When $\Gamma_g$ is relatively free, Proposition 7 shows that $\beta(\Gamma_g) = \beta(x_g)$, where $\beta(x_g)$ is the diophantine exponent with respect to the quasi-norm $|\cdot|$ on $V$ defined in (37). By Theorem 16,
$$\beta(x_g) = \max\left\{ \min_{h \in G^k} \frac{\alpha(W \cap \ker x_h)}{\dim W - \dim W \cap \ker x_h} \;;\; W \le V \text{ rational subspace} \right\}$$
for Lebesgue almost every $g \in G^k$. This shows that $\beta_k$ is well defined, and since $\alpha$ takes rational values, this formula shows that $\beta_k \in \mathbb Q$. For each rational $W \le V$, the set of all $h \in G^k$ such that $\alpha(W \cap \ker x_h) > \beta_k \dim(W/(W \cap \ker x_h))$ is a proper algebraic subset of $G^k$ defined by equations of bounded degree with coefficients in $K$. Their union forms a proper subset of $G^k$ by Lemma 4, and Theorem 16 implies that $\beta(\Gamma_g) = \beta_k$ for every $g$ outside this union.

Lemma 3. Let $M \subset \mathrm{GL}_d(\mathbb R)$ be a subset with irreducible Zariski closure. There is a finite set $S_M$ of rational subspaces of $\mathbb R^d$ with $|S_M| \le b(2^d)$ such that, as $a = (a_t)_{t \ge 0}$ varies among all one-parameter diagonal semigroups of $\mathrm{GL}_d(\mathbb R)$, the subspaces $V_i(a)$ arising in the Harder-Narasimhan filtration all belong to $S_M$.

Proof. For $I \subset [d]$, let $I(a) = \frac{1}{|I|} \sum_{i \in I} A_i$, where $a_t = \mathrm{diag}(e^{A_1 t}, \dots, e^{A_d t})$. We claim that the entire Harder-Narasimhan filtration of $M$ depends on $a$ only via the ordering of the various $I(a)$ for $I \subset [d]$. Namely, if $I(a)$ and $I(a')$ define the same weak ordering on the family of subsets of $[d]$, then the filtrations coincide.
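The weak ordering used in the proof of Lemma 3 can be made concrete with a small computation. The following Python sketch, given the exponents $(A_1, \dots, A_d)$ of a diagonal flow, computes the averages $I(a)$ over all non-empty subsets $I \subset [d]$ and records the sign of every pairwise comparison. It assumes integer or rational exponents so that the arithmetic is exact, and the names `subset_averages` and `weak_ordering` are hypothetical helpers, not notation from the paper.

```python
from fractions import Fraction
from itertools import combinations


def subset_averages(A):
    """Map each non-empty subset I of {1, ..., d} (as a sorted tuple) to I(a).

    A is the list (A_1, ..., A_d) of exponents of a_t = diag(e^{A_i t});
    entries are assumed to be integers or Fractions.
    """
    d = len(A)
    averages = {}
    for size in range(1, d + 1):
        for I in combinations(range(1, d + 1), size):
            averages[I] = Fraction(sum(A[i - 1] for i in I), size)
    return averages


def sign(x):
    """Return -1, 0 or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def weak_ordering(A):
    """Signature of the weak ordering induced by the I(a).

    Two exponent vectors give equal signatures exactly when all comparisons
    I(a) < J(a), I(a) = J(a), I(a) > J(a) agree for every pair (I, J).
    """
    avg = subset_averages(A)
    keys = sorted(avg)
    return tuple(sign(avg[I] - avg[J]) for I in keys for J in keys)
```

As a usage example, the exponent vectors $(0, 1, 3)$ and $(0, 2, 6)$ induce the same weak ordering, whereas $(0, 1, 3)$ and $(0, 2, 3)$ do not, since they compare $\{2\}$ and $\{1, 3\}$ differently.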
an analytic map, and $\mu$ the image of the Lebesgue measure under $\phi$. Let $M := \phi(U)$ and $\mathrm{Zar}(M)$ its Zariski closure in $\mathrm{GL}_d(\mathbb R)$. Then for $\mu$-almost every $L$ in $M$ and each $k = 1, \dots, d$,

1st claim: For all $L \in S$, $\beta(S) \ge \mu_1(L)$. If $S = \phi(B)$, where $\phi$ and $B$ are as in Theorem 5, equality holds for $\mu$-almost every $L \in \phi(B)$.
3 with the same submodular function $\tau_M$. This yields a new Harder-Narasimhan filtration $\{V_i^{\mathbb R}\}_{i=0}^{h_{\mathbb R}}$ for the real field and a new Grayson polygon $G_{\mathbb R}$ with slopes $s_i^{\mathbb R}$, that obviously lies below the rational polygon.

As before, let $M = \phi(U)$ be the image of a connected open set $U \subset \mathbb R^n$ under an analytic map $\phi : U \to \mathrm{GL}_d(\mathbb R)$ and $\mu$ the measure on $M$ that is the image of the Lebesgue measure on $U$. In this paragraph we no longer assume that $\mathrm{Zar}(M)$ is defined over $\mathbb Q$. Let $\mu_k$ be the $\mu$-almost sure value of $\mu_k(L)$ as given by Theorem 4 and $\mu_k^{\sup}$ the supremum over all $L \in M$ of $\mu_k^{\sup}(L)$, where $\mu_k^{\sup}(L)$ is defined by the same formula as $\mu_k(L)$ with a limsup in place of the liminf. Consider the points $(k, \mu_k)$ for $k = 1, \dots, d$ and interpolate linearly between them, so as to form a polygon $G_\mu$ as in Fig. 1. Similarly form $G_\mu^{\sup}$ with $(k, \mu_k^{\sup})$. Note that $G_\mu^{\sup}$ is convex (being a supremum of convex polygons), but $G_\mu$ may not be.

Proposition 2 ("Sandwich theorem"). The polygons $G_\mu$, $G_\mu^{\sup}$ lie in between the rational Grayson polygon $G_{\mathbb Q}$ and the real Grayson polygon $G_{\mathbb R}$. In other words, for each $k = 1, \ldots$

k
are linearly independent over $K_v$ for each place $v$.

Definition 5 (Lattice in a number field). For any positive integers $k \le d$, we define a sublattice in $K_S^d$ of rank $k$ to be a discrete free $\mathcal O_{K,S}$-submodule of rank $k$ in $K_S^d$.

Theorem 12 (Diophantine exponent for submanifolds of matrices). Let $U \subset \mathbb R^n$ be a connected open set and $\phi : U \to \mathrm{Hom}(V, E)$ an analytic map. Assume that the Zariski closure of $M = \phi(U)$ is defined over $\mathbb Q$. Then, for Lebesgue almost every $u$ in $U$, setting $x = \phi(u)$,
$$\beta(x) = \max\left\{ \frac{\dim W}{r} - 1 \;;\; W \text{ rational subspace such that } M \subset P_{W,r} \right\}.$$

$d = 1$. The result is clear because, for every $x$, the subgroup $x(\Delta)$ is a discrete subgroup of $E$.

$d - 1 \to d$. Suppose the result has been proven for $d - 1 \ge 1$. Let $m = \dim E$ and $d = \dim V$. Fixing bases for $E$ and $\Delta$, we identify $\mathrm{Hom}(V, E)$ with $m \times d$ matrices.

Proposition 6 (Dirichlet's principle). Fix $Y \in M_{m,n}(\mathbb R)$, and denote by $L_i$, $i = 1, \dots, m + n$, the rows of the matrix $L_Y$. Assume that $W$ is a $k$-dimensional rational subspace of $\mathbb R^{m+n}$ and that $\emptyset \ne I \subset \{1, \dots, m\} \subset J \subsetneq \{1, \dots, m + n\}$ are such that $r = \mathrm{rk}(L_i|_W)_{i \in I}$ and $s = \mathrm{rk}(L_i|_W)_{i \in J}$.

Lemma 5. Let $V$ and $|\cdot|$ be as above. For any $x$ in $\mathrm{Hom}(V, E)$,
$$\beta_\alpha(x) \ge \max\left\{ \frac{\alpha(W \cap \ker x)}{\dim W - \dim W \cap \ker x} \;;\; W \le V \text{ rational subspace} \right\}.$$

Proof. Let $R > 0$ be some large parameter. The number of points $v$ in $W \cap \Delta$ such that $|v| \le R$ is roughly $R^{\alpha(W)}$, and their images in $x(W) \simeq W/(W \cap \ker x)$ lie in a distorted ball of volume $O(R^{\alpha(W) - \alpha(W \cap \ker x)})$. Comparing volumes, we find that balls of radius $\varepsilon$ around those points cannot be disjoint if $\varepsilon \gg R^{-\frac{\alpha(W \cap \ker x)}{\dim W - \dim W \cap \ker x}}$.
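To see what the quantity in Lemma 5 looks like on an example, here is a minimal Python sketch that evaluates the ratio $\alpha(W \cap \ker x) / (\dim W - \dim W \cap \ker x)$ for a single subspace $W$. It is an illustration only, under the following assumptions: $x$ is given as a matrix with rational entries, $W$ by a list of rational basis vectors in the basis of $V$ dual to $(u_i^*)$, and $\alpha$ of a subspace is computed with the greedy rule for $I_W$ exactly as stated in the remark of this section. All function names are hypothetical.

```python
from fractions import Fraction


def rref(rows, ncols):
    """Reduced row echelon form over Q; returns (matrix, pivot columns)."""
    m = [[Fraction(v) for v in row] for row in rows]
    pivots = []
    for c in range(ncols):
        r = len(pivots)
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue
        m[r], m[p] = m[p], m[r]
        lead = m[r][c]
        m[r] = [v / lead for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                f = m[i][c]
                m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
    return m, pivots


def nullspace(rows, ncols):
    """Basis of the space of vectors c in Q^ncols with rows * c = 0."""
    m, pivots = rref(rows, ncols)
    basis = []
    for f in (c for c in range(ncols) if c not in pivots):
        v = [Fraction(0)] * ncols
        v[f] = Fraction(1)
        for r, p in enumerate(pivots):
            v[p] = -m[r][f]
        basis.append(v)
    return basis


def alpha_of(basis, alpha):
    """alpha of the span of `basis`, via the greedy index set of the remark."""
    forms, total = [], 0
    for i, weight in enumerate(alpha):
        restricted = [w[i] for w in basis]  # u_i^* restricted to the subspace
        candidate = forms + [restricted]
        if len(rref(candidate, len(basis))[1]) > len(forms):
            forms, total = candidate, total + weight
    return total


def lemma5_ratio(x, basis_W, alpha):
    """alpha(W ∩ ker x) / (dim W - dim W ∩ ker x), or None if W ⊂ ker x."""
    d, k = len(alpha), len(basis_W)
    # Matrix of x restricted to W: column j is x(w_j).
    xW = [[sum(Fraction(row[i]) * w[i] for i in range(d)) for w in basis_W]
          for row in x]
    # Coordinates of W ∩ ker x in the basis of W, mapped back to V.
    kernel = [[sum(c[j] * basis_W[j][i] for j in range(k)) for i in range(d)]
              for c in nullspace(xW, k)]
    codim = k - len(kernel)
    if codim == 0:
        return None
    return Fraction(alpha_of(kernel, alpha)) / codim
```

With $W = V = \mathbb Q^3$, unit weights, and $x$ a $2 \times 3$ matrix of the form $(I_2 \mid y)$ with $y$ having non-zero first entry, the kernel is a line and the ratio is $1/2$, which is the familiar Dirichlet exponent for simultaneous approximation of two numbers.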

Theorem 16 (Diophantine exponent for quasi-norms). Assume $V$ and $|\cdot|$ are as above, and that the Zariski closure $\mathrm{Zar}(M)$ of $M$ is defined over $\mathbb Q$. Then, for almost every $x$ in $M$,
$$\beta_\alpha(x) = \max\left\{ \min_{y \in M} \frac{\alpha(W \cap \ker y)}{\dim W - \dim W \cap \ker y} \;;\; W \le V \text{ rational subspace} \right\}. \tag{40}$$
With this definition, we see that a matrix $Y$ is VWMA if and only if $\omega^\times(Y) > 1$. Using the Borel-Cantelli lemma, it is not difficult to check that for the Lebesgue measure, almost every $Y$ in $M_{m,n}(\mathbb R)$ satisfies $\omega^\times(Y) = 1$. It is therefore natural to ask what other measures $\mu$ on $M_{m,n}(\mathbb R)$ satisfy this property.

We now set up some notation to formulate our criterion for strong extremality. Given a matrix $Y$ in $M_{m,n}(\mathbb R)$, with rows $Y_1, \dots, Y_m$, we let