Intrinsic Diophantine approximation on quadric hypersurfaces

We consider the question of how well points in a quadric hypersurface $M\subset\mathbb R^d$ can be approximated by rational points of $\mathbb Q^d\cap M$. This contrasts with the more common setup of approximating points in a manifold by all rational points in $\mathbb Q^d$. We provide complete answers to major questions of Diophantine approximation in this context. Of particular interest are the impact of the real and rational ranks of the defining quadratic form, quantities whose roles in Diophantine approximation have never been previously elucidated. Our methods include a correspondence between the intrinsic Diophantine approximation theory on a rational quadric hypersurface and the dynamics of the group of projective transformations which preserve that hypersurface, similar to earlier results in the non-intrinsic setting due to Dani ('86) and Kleinbock--Margulis ('99).


Introduction and motivation
Classical theorems in Diophantine approximation theory address questions regarding the way points $x \in \mathbb R^d$ are approximated by rational points, considering the trade-off between the height of the rational point (the size of its denominator) and its distance to $x$; see [20,62] for a general introduction. Often $x$ is assumed to lie on a certain subset of $\mathbb R^d$, for example a smooth manifold $M$, leading to Diophantine approximation on manifolds. This area of research has experienced rapid progress during the last two decades, much of it owing to methods coming from flows on homogeneous spaces; see e.g. [44] for a proof of long-standing conjectures of A. Baker and V. G. Sprindžuk. See also [3,4] for more recent developments.
It was observed in [18,26,27] that all sufficiently good rational approximants to points on certain rational varieties must in fact be intrinsic; that is, they are rational points lying on the variety itself. These results, in part, have motivated a new field of intrinsic approximation, which examines the quality to which points on
1.2. Diophantine approximation in $\mathbb R^d$. In the classical Diophantine approximation setup one has $X = M = \mathbb R^d$, $Q = \mathbb Q^d$, and $H = H_{\mathrm{std}}$ (recall (1.1)).
Dirichlet's theorem asserts that for all $x \in \mathbb R^d$ and $T \geq 1$ there exists $p/q \in \mathbb Q^d$ with $q \leq T$ satisfying
(1.4) $\|x - p/q\| \leq \dfrac{C}{q\,T^{1/d}}$,
where $C > 0$ is a constant depending on the choice of the norm on $\mathbb R^d$. A corollary is that
(1.5) $\psi_{1+1/d}$ is uniformly Dirichlet for $(\mathbb R^d, \mathbb Q^d, H_{\mathrm{std}})$
(see Convention 2). Note that when the distance is given by the supremum norm on $\mathbb R^d$ one can take $C = 1$ in (1.4), and thus $C_x \equiv 1$ in (1.2)
generalizing a result of V. Jarník [39], who proved the case $d = 1$ of (1.6). Moreover, the set $\mathrm{BA}_d$ is hyperplane winning (see Section 4), which implies certain intersection properties (see Proposition 4.2). We shall refer to (1.6) as the Jarník-Schmidt theorem. Note that together, Dirichlet's theorem and the Jarník-Schmidt theorem solve problems 1 and 2 above for the case of the Diophantine triple $(\mathbb R^d, \mathbb Q^d, H_{\mathrm{std}})$.
2 The converse is also true assuming the $\sigma$-compactness of $M$, see [31, Proposition 2.7]. We remark that in the case $Q \subseteq M$, uniqueness considerations for optimal Dirichlet functions were discussed at length in [31, §2].
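As a concrete illustration of Dirichlet's theorem in the simplest case $d = 1$ with $C = 1$, the following Python sketch searches by brute force for a fraction $p/q$ with $q \leq T$ and $|x - p/q| \leq 1/(qT)$. The function name `dirichlet_approximant` and the restriction to integer $T$ and exact rational inputs are choices made here for illustration only; they are not taken from the paper.

```python
from fractions import Fraction


def dirichlet_approximant(x, T):
    """Return (p, q) with 1 <= q <= T and |x - p/q| <= 1/(q*T).

    Hypothetical helper for the case d = 1; T is assumed to be a
    positive integer and x is converted to an exact Fraction.
    """
    x = Fraction(x)
    for q in range(1, T + 1):
        # The best numerator for a given denominator q is the nearest integer to q*x.
        p = round(q * x)
        if abs(x - Fraction(p, q)) <= Fraction(1, q * T):
            return p, q
    # Dirichlet's theorem guarantees that the loop above always succeeds.
    raise AssertionError("unreachable for integer T >= 1")


# A rational stand-in for sqrt(2), so that all arithmetic is exact.
x = Fraction(14142135623, 10**10)
p, q = dirichlet_approximant(x, 10)
print(p, q)  # 7 5
```

Since $q \leq T$, the bound $1/(qT)$ is at most $q^{-2}$, which is the statement that $\psi_2 = \psi_{1+1/d}$ is uniformly Dirichlet when $d = 1$.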
Resolving problem 3 gives rise to theorems of Khintchine and of Jarník-Besicovitch. For convenience let us denote $A(\psi, \mathbb R^d, \mathbb Q^d, H_{\mathrm{std}})$ by $A_d(\psi)$. If $\lambda$ is Lebesgue measure on $\mathbb R^d$, it was proven by A. Y. Khintchine [41] that, if $\psi$ is non-increasing (which will be our standing assumption whenever approximating functions $\psi$ are considered), $A_d(\psi)$ is either null or conull depending on whether the series
(1.7) $\sum_{q=1}^{\infty} q^d\,\psi(q)^d$
converges or diverges. An important special case is given by letting $\psi = \psi_c$ for some $c > 1 + 1/d$ (cf. Convention 2). This makes the series (1.7) converge; thus the sets $A_d(\psi_c)$ have Lebesgue measure zero; the Jarník-Besicovitch theorem shows that their Hausdorff dimension is given by $\dim A_d(\psi_c) = (d + 1)/c$. In particular, if
(1.9) $\mathrm{VWA}_d := \bigcup_{c > 1 + 1/d} A_d(\psi_c)$
If M is nondegenerate, then M is not contained in an affine hyperplane; the converse holds if M is connected and real-analytic.
Whether or not ψ 1+1/d is an optimal Dirichlet function for (M, Q d , H), where M is an arbitrary nondegenerate manifold, was not known until 2013, when V. V. Beresnevich [4] proved 5 that for such M , the set BA d ∩ M = BA(ψ 1+1/d , M, Q d , H) has full Hausdorff dimension; the optimality of ψ 1+1/d for the ambient approximation on M follows.
Analogues of the Khintchine and Jarník-Besicovitch theorems for Diophantine approximation on nondegenerate manifolds have a long history. The theory was initiated by K. Mahler, who in the 1930s considered the Veronese curve
[3] which established the divergence case of a Khintchine-type theorem on these manifolds.

Main results
Convention 3. In this section, propositions which are proven later in the paper will be numbered according to the section they are proven in. Propositions numbered as 2.# are either straightforward or quoted from the literature.

2.1. Intrinsic Diophantine approximation: what to expect. We now consider the main setup of the paper, namely that of intrinsic approximation. One way to do this is to take $X = \mathbb R^d$, choose a $k$-dimensional submanifold $M$ of $\mathbb R^d$, and let $Q = \mathbb Q^d \cap M$ and $H = H_{\mathrm{std}}$ as in (1.1). However, we have chosen a different approach: we state and prove the main results of the paper for submanifolds of projective spaces. In most cases this makes the statements of results and their proofs more natural and transparent; see Remark 2.1 below.
Let $\mathbb P^d_{\mathbb R}$ denote the $d$-dimensional real projective space, and let $\pi : \mathbb R^{d+1} \setminus \{0\} \to \mathbb P^d_{\mathbb R}$ be the quotient map $\pi(x) := [x]$. For a subset $S$ of $\mathbb R^{d+1}$, we let $[S] = \pi(S \setminus \{0\})$. With some abuse of notation, let us define the standard height function $H_{\mathrm{std}} : \mathbb P^d_{\mathbb Q} \to \mathbb N$ by the formula $H_{\mathrm{std}}([\mathbf p]) = \|\mathbf p\|$ for primitive $\mathbf p \in \mathbb Z^{d+1}$. Here and elsewhere $\|\cdot\|$ represents the max norm.
In particular, if $M$ is a nondegenerate submanifold of $\mathbb R^d$, then $\iota_d(M)$ is a nondegenerate$^6$ submanifold of $\mathbb P^d_{\mathbb R}$, and the Diophantine triples $T_{\mathrm{aff}} := (M, \mathbb Q^d \cap M, H_{\mathrm{std}})$ and $T_{\mathrm{proj}} := (\iota_d(M), \mathbb P^d_{\mathbb Q} \cap \iota_d(M), H_{\mathrm{std}})$ are "locally isomorphic". However, both the bi-Lipschitz constant and the implied constant of (2.2) depend on the chosen bounded set $B$. Thus concepts which are robust under point-dependent multiplicative constants will not be affected by the transformation. For example, whether or not a function is Dirichlet will be the same for the triples $T_{\mathrm{aff}}$ and $T_{\mathrm{proj}}$, but it is conceivable that a function could be uniformly Dirichlet for the triple $T_{\mathrm{proj}}$ but not for the triple $T_{\mathrm{aff}}$. Because of this difference, it is perhaps worthwhile to justify why we state our results in projective space rather than affinely. The simplest answer is that the projective statements are closest to how the results are actually proven. Moreover, in those cases where projective statements cannot be reformulated as affine statements, we feel it is important to keep the full strength of the projective theorem. To give a simple example, consider the classical Dirichlet's theorem. By examining its proof, we can deduce that
(2.3) $\psi_{1+1/d}$ is uniformly Dirichlet for $(\mathbb P^d_{\mathbb R}, \mathbb P^d_{\mathbb Q}, H_{\mathrm{std}})$.
This result is stronger than the classical (1.5), in the sense that simply translating (1.5) to projective space along the lines indicated above does not yield (2.3), while translating (2.3) to affine space yields (1.5) at least on the unit cube $[0, 1]^d$, and applying translations recovers the full force of (1.5).
5 In [4] it is assumed that $M$ is real-analytic, but the method seems to work in the smooth category as well. Also, earlier this was proven by D. A. Badziahin and S. L. Velani [1] for $C^2$ nondegenerate planar curves.
6 Cf. Definition 3.1.
To guide the reader, we have included Affine Corollaries after most of the main results. Each Affine Corollary can be deduced from its corresponding result together with Remark 2.1. We omit those Affine Corollaries which would merely be restatements of the theorems with $\mathbb P^d_{\mathbb R}$ replaced by $\mathbb R^d$. In what follows we will be considering Diophantine triples (
Clearly in the intrinsic setup much more depends on the manifold itself than in the theory of ambient approximation considered in the previous subsection. To take an extreme case, it is easy to find manifolds $M$ for which $\mathbb P^d_{\mathbb Q} \cap M = \emptyset$. For such manifolds, intrinsic approximation is impossible. Even if $\mathbb P^d_{\mathbb Q} \cap M$ is dense in $M$, the "quantitative denseness" which determines Diophantine properties will depend on $M$, as shown by the following trivial example:
Example 2.2. Fix $n \geq 2$,$^7$ and let $\Phi_n : \mathbb R \to \mathbb P^2_{\mathbb R}$ be defined by $\Phi_n(x) = [1 : x : x^n]$. Let $C_n = \Phi_n(\mathbb R)$. Then it is easy to see that $\mathbb P^2_{\mathbb Q} \cap C_n = \Phi_n(\mathbb Q)$ and $H_{\mathrm{std}} \circ \Phi_n(p/q) = q^n$. Together with (1.5), this implies that $\psi_{2/n}$ is an optimal uniformly Dirichlet function for intrinsic approximation on $C_n$.
Affine Corollary. For n ≥ 2, the function ψ 2/n is an optimal uniformly Dirichlet function for intrinsic approximation on any compact subset of the set {(x, x n ) : x ∈ R} which has nonempty relative interior.
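The height computation in Example 2.2 can be checked numerically. The sketch below clears denominators in $\Phi_n(p/q) = [1 : p/q : (p/q)^n]$ to obtain the integer representative $[q^n : pq^{n-1} : p^n]$, reduces it to a primitive vector, and takes the max norm. The helper name `height_on_curve` is hypothetical, and the identity $H_{\mathrm{std}} \circ \Phi_n(p/q) = q^n$ is only checked here under the assumptions $\gcd(p, q) = 1$ and $|p| \leq q$.

```python
from math import gcd


def height_on_curve(p, q, n):
    """Standard height of Phi_n(p/q) = [1 : p/q : (p/q)^n], with q >= 1, n >= 1.

    Hypothetical helper: builds the integer representative
    [q^n : p*q^(n-1) : p^n], makes it primitive, and returns its max norm.
    """
    coords = (q**n, p * q**(n - 1), p**n)
    # Divide out the common factor so that the representative is primitive.
    g = gcd(gcd(coords[0], coords[1]), coords[2])
    return max(abs(c) // g for c in coords)


print(height_on_curve(1, 2, 3))   # 8  = 2**3
print(height_on_curve(3, 7, 2))   # 49 = 7**2
print(height_on_curve(-2, 5, 4))  # 625 = 5**4
```

For $|p| > q$ the maximum is attained by $|p|^n$ rather than $q^n$; the two are comparable on any bounded set of values of $p/q$, which is the setting of the Affine Corollary.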
So it makes sense to consider two classes of theorems concerning intrinsic Diophantine approximation:
• the class of theorems which apply to all manifolds indiscriminately (except for some basic assumptions like nondegeneracy, analyticity, and/or algebraicity), and
• the class of theorems which apply only to specific manifolds.
Because of examples like Example 2.2, theorems in the former class must always be "negative" theorems, i.e. those which say that points cannot be approximated too well by intrinsic rationals. In Diophantine approximation, one typically knows that a "negative" theorem is the right one only when there is a corresponding "positive" theorem which matches it exactly (perhaps up to a constant).$^8$ In our case, since we have no general positive theorems, we propose to demonstrate the optimality of our negative theorems by specific positive theorems. Thus the two classes of theorems mentioned above will be complementary: every general result applies to specific cases, while specific results demonstrate the optimality of general results.
7 We exclude $n = 1$ because the resulting curve is degenerate.
8 For example, Jarník's result on the full dimension of the set of badly approximable numbers on the real line is significant mainly because of Dirichlet's theorem, which can be interpreted to say that "no number can be less well approximable than a badly approximable number".
In view of the interplay between general results and specific results described above, in the case of specific results we have chosen to focus our attention on the case of nonsingular rational quadratic hypersurfaces. One reason for this is that these manifolds are fortunate enough to possess positive theorems which are complementary to our general negative results. The other reason is that the group of projective transformations preserving a specific nonsingular rational quadratic hypersurface is reasonably large (it is a noncompact Lie group), and so we may apply the dynamical methods of Margulis and the second-named author [45] to prove our results.
2.2. General manifolds. Example 2.2 shows first of all that no function can be an optimal Dirichlet function for every curve in P 2 R . Moreover, as n → ∞ the functions ψ 2/n decay more and more slowly. However, none of the functions ψ 2/n decay faster than ψ 2/2 = ψ 1 . The question arises: is it possible for any nondegenerate curve in P 2 R to have an optimal Dirichlet function which decays faster than ψ 1 ? The answer is no. We need to introduce certain notation to state our first theorem: For any 1 ≤ k ≤ d, let n k,d ∈ N be maximal such that for some m k,d ≥ 0, and let m k,d be the unique integer satisfying (2.4). Let (No changes needed for the Affine Corollary.) In particular it is easy to see that in view of the last equality, Theorem 4.3 is a generalization of the Jarník-Schmidt theorem (1.6). Specializing to k = 1 and d = 2 gives the result stated in the beginning of the subsection, namely that no nondegenerate curve in P 2 R has a Dirichlet function decaying faster than ψ 1 . The proof of Theorem 4.3 uses a variant of Schmidt's game known as the hyperplane game (see Section 4). The essence of the proof is to demonstrate an analogue of the Simplex Lemma (Lemma 5.1). In the classical setting, the Simplex Lemma says that all rationals in R d lying inside a certain ball with denominators in a certain range must lie on a hyperplane; see e.g. [50,Lemma 4]. We prove the analogous assertion in this setting, which motivates the term N k,d above: we will see, cf. Claim 5.2 and its proof, that the exponent N k,d shows up naturally in a volume computation.
One may ask whether Theorem 4.3 is the best of its kind. This question may be made rigorous by means of the following is Dirichlet (and automatically optimal) for intrinsic approximation on M .
For such manifolds, Theorem 4.3 proves not only the optimality of ψ c(k,d) , but also the abundance of badly approximable points. Example 2.2 describes such a manifold in the case where k = 1 and d = 2, namely the standard parabola. This example can be generalized in two different ways to yield maximally approximable manifolds in higher dimensions: one is the class of Veronese varieties (Example 3.3; the Veronese curve (1.10) is a special case), and another is given by nonsingular rational quadratic hypersurfaces, discussed in the second half of the paper. Whether or not for every 1 ≤ k ≤ d, there exists a maximally approximable submanifold of R d of dimension k, is not clear, see Section 3 for a detailed discussion.
We now turn to intrinsic analogues of the Khintchine and Jarník-Besicovitch theorems for general nondegenerate manifolds. As explained earlier, one can only hope to have a convergence case of those theorems. Recall that in the classical setting, the convergence case follows directly from the Borel-Cantelli and Hausdorff-Cantelli lemmas and from estimates for the number of rational points whose height is less than a fixed number T . So in the case of intrinsic approximation one must find upper bounds on expressions of the form  [13,15,16,36,38]). Let M ⊆ P d R be a nonsingular algebraic hypersurface of dimension k = d − 1. Then for every ε > 0, Using the Hausdorff-Cantelli lemma [7, Lemma 3.10], one can immediately deduce the following corollary: Corollary 2.6. Let M be as in Theorem 2.5. Then for all c > 1, Affine Corollary. Let M ⊆ R d be an algebraic hypersurface whose projectivization is nonsingular. Then for all c > 1, (2.6) holds.
In fact, if M is a quadratic hypersurface, then equality holds in (2.6) (see Theorem 2.13). Thus Corollary 2.6 is the optimal result in terms of giving a bound for the Hausdorff dimension of the set of intrinsically ψ-approximable points on nonsingular algebraic hypersurfaces (without any information about degree).
Note that Theorem 2.5 cannot be used to give a corresponding upper bound for $\lambda_M(A_M(\psi_c))$. This is because of the presence of $\varepsilon$ in the exponent of (2.5). One might conjecture that (2.5) holds with $\varepsilon = 0$. However, this is false. A notable counterexample is given in [37, Theorem 7], where it is computed that $N_M(T) \asymp_\times T^2 \log(T)$ for a certain quadratic hypersurface of $\mathbb P^3_{\mathbb R}$. We will call this manifold the exceptional quadratic hypersurface and consider it in detail in Section 11.
As a consequence of Corollary 2.6, we see that the sets A M (ψ c ) are Lebesgue nullsets whenever M is a nonsingular algebraic hypersurface and c > c(k, d). In other words, the set is a Lebesgue nullset. In the following theorem, we show that the hypothesis that M is a nonsingular algebraic hypersurface can be dropped from this assertion: The following stronger theorem is also true: Let Ψ : U → M be a local parameterization of M , let µ be an absolutely friendly (see Definition 5.4) measure on U , and let ν = Ψ[µ]. If ν-almost every point of M is D-nondegenerate, then VWA M is a ν-nullset.
(No changes needed for the Affine Corollary.) Remark 2.7. An interesting question which is not addressed in this paper is whether one could improve this result by estimating the Hausdorff dimension of those sets for a fixed ε.

2.3. Quadratic hypersurfaces.
In Part 2 of the paper we provide a complete theory of intrinsic approximation on quadratic hypersurfaces. Our main tool is a correspondence between the Diophantine properties of points in our hypersurface and dynamics in the space of lattices, à la Margulis and the second-named author [45]; see Section 7. In particular, our version of Khintchine's theorem (Theorem 9.1) will be proven using [45, Theorem 1.7] and the reduction theory of algebraic groups [52, Proposition 2.2].
In the following theorems, fix d ≥ 2, and let Q be a nonsingular (see Definition 6.3) quadratic form on R d+1 with integral coefficients. (Cf. Remark 2.15 for a discussion of the singular case.) Denote by Proposition 6.15). Thus the results of §2.2 apply to the theory of intrinsic approximation on M Q . Manifolds M Q which take the above form are called nonsingular rational quadratic hypersurfaces.
To avoid trivialities, in our theorems we will make the standing assumption that For the affine corollaries to our theorems, we consider a quadratic polynomial Q aff : R d → R with integral coefficients, and we let Q : R d+1 → R be the projectivization of Q aff . Then M Q aff , the zero set of Q aff , is equal to ι −1 d (M Q ). We call M Q aff a nonsingular rational quadratic hypersurface if M Q is. Note that it may be the case that M Q has singularities at infinity; in this case, M Q aff is also considered singular.
The problem of intrinsic approximation on $M_Q$ was implicitly considered by C. Druţu in [27], where the Hausdorff dimension of the sets $A_{M_Q}(\psi)$ was computed. (Druţu actually studied ambient approximation on $M_Q$, and, generalizing an earlier result of D. Dickinson and M. M. Dodson [26, Lemma 1], showed that it reduces to intrinsic approximation if $\psi$ is assumed to decay fast enough.) The case $Q(x) = x_1^2 + \cdots + x_d^2 - x_0^2$ was recently considered in [46].$^{10}$ One of the theorems from the latter paper asserts that there exists $C > 0$ (possibly depending on $d$)$^{11}$
In particular, it follows that $\psi_1$ is uniformly Dirichlet for intrinsic approximation on $M_Q$. It was also shown in [46] that:
(i) $\psi_1$ is optimal; moreover, $\mathrm{BA}_{M_Q}(\psi_1)$ has full Hausdorff dimension (this also follows from Theorem 4.3);
(ii) for any $\psi : \mathbb N \to (0, \infty)$ such that
(2.10) the function $q \mapsto q\psi(q)$ is nonincreasing,
the Lebesgue measure of $A_{M_Q}(\psi)$ is full (resp. zero) iff the sum $\sum_{q=1}^{\infty} q^{k-1} \psi(q)^k$ diverges (resp. converges).
The last statement was also shown to imply, via the Mass Transference Principle of Beresnevich and Velani [6, Theorem 2], a similar statement for Hausdorff measures.
10 The paper [46] is written in the affine setup; specifically, the manifold $S^k \subseteq \mathbb R^d$ is discussed. Since this set is bounded, Remark 2.1 gives an exact correspondence for Diophantine results in $S^k$ and those in $\iota_d(S^k) = M_Q$.
11 N. G. Moshchevitin [58] has recently provided an elementary proof of this assertion for the case $M_{Q_{\mathrm{aff}}} = S^2$. His proof gives an explicit value for the constant $C$ appearing in (2.9).
In the present paper we generalize all the aforementioned results to the case of arbitrary quadratic hypersurfaces. In particular, we prove the following:
(i) $\psi_1$ is Dirichlet for intrinsic approximation on $M_Q$.
(ii) $\psi_1$ is uniformly Dirichlet if and only if $p_{\mathbb Q} = p_{\mathbb R}$.
(iii) The following are equivalent:
Affine Corollary. Let $M_{Q_{\mathrm{aff}}} \subseteq \mathbb R^d$ be a nonsingular rational quadratic hypersurface satisfying (2.8).
The following are equivalent: has positive $\lambda_{M_{Q_{\mathrm{aff}}}}$-measure.
In particular, it follows that nonsingular rational quadratic hypersurfaces are maximally approximable, as mentioned in the paragraph following Definition 2.4. On the other hand, applying Theorem 4.3 immediately yields the following: Theorem 2.8 (Jarník-Schmidt for quadratic hypersurfaces). Let M Q ⊆ P d R be a nonsingular rational quadratic hypersurface. Then dim BA MQ (ψ 1 ) = dim(M Q ). In particular, the Dirichlet function ψ 1 is optimal.
(No changes needed for the Affine Corollary.) We remark that the optimal Dirichlet function ψ 1 is independent of the dimension of M Q . This is in contrast to the classical situation, where the optimal Dirichlet function for R d , namely ψ 1+1/d , depends on the dimension d.
An alternate proof of Theorem 2.8 can be given by applying the main result of [43] to a dynamical interpretation of BA MQ (ψ 1 ); cf. Corollary 7.3.
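For orientation, the smallest instance of the form $Q(x) = x_1^2 + \cdots + x_d^2 - x_0^2$ mentioned above is the unit circle ($d = 2$). The following sketch lists its intrinsic rational points $[c : a : b]$ with $a^2 + b^2 = c^2$ and height at most $T$ by brute force. The function name `circle_points` is hypothetical, and the enumeration is only meant to make the objects concrete; it plays no role in the proofs.

```python
from math import gcd, isqrt


def circle_points(T):
    """Primitive integer triples (c, a, b) with a^2 + b^2 = c^2 and 1 <= c <= T.

    Each triple represents the rational point [c : a : b] on the quadric
    x_1^2 + x_2^2 - x_0^2 = 0; its standard height is c, because
    |a| <= c and |b| <= c. Taking c > 0 fixes the sign of the representative.
    """
    points = []
    for c in range(1, T + 1):
        for a in range(-c, c + 1):
            b_squared = c * c - a * a
            b = isqrt(b_squared)
            if b * b != b_squared:
                continue  # c^2 - a^2 is not a perfect square
            for signed_b in {b, -b}:  # a one-element set when b == 0
                if gcd(gcd(a, signed_b), c) == 1:
                    points.append((c, a, signed_b))
    return points


print(len(circle_points(5)))   # 12: four points of height 1, eight of height 5
print(len(circle_points(13)))  # 20: eight further points of height 13
```

The affine points are $(a/c, b/c)$, for example $(3/5, 4/5)$ at height 5.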
Before stating the analogues of Khintchine's theorem and the Jarník-Besicovitch theorem for intrinsic approximation on quadratic hypersurfaces, let us introduce the following definition, which will be used throughout the paper in Sections 9-11: Definition 2.9. The exceptional quadratic hypersurface is the hypersurface M Q0 ⊆ P 3 R defined by the quadratic form If a quadratic form Q : R 4 → R is conjugate over Q to Q 0 , we will write Q ∼ Q 0 . We remark that Q ∼ Q 0 if and only if p Q (Q) = p R (Q) = 2 (cf. Proposition 6.10 for the backwards direction).
The exceptional quadratic hypersurface M Q0 has very interesting properties for intrinsic Diophantine approximation. We study it in detail in Section 11. Note that if Q ∼ Q 0 , then the intrinsic Diophantine theory on M Q will be more or less the same as the intrinsic Diophantine theory on M Q0 . Specifically, the rational equivalence between Q and Q 0 defines a diffeomorphism between M Q and M Q0 which sends rational points to rational points and preserves heights up to a multiplicative constant.
Recall that $N_M(T)$ denotes the number of points in $\mathbb P^d_{\mathbb Q} \cap M$ whose height is bounded above by $T$, and that $k := d - 1$ is the dimension of $M_Q$.
Theorem 2.10 ([37, Theorems 5, 6, 7, and 8] and [37, p.12], cf. Appendix A). Let $M_Q \subseteq \mathbb P^d_{\mathbb R}$ be a nonsingular rational quadratic hypersurface satisfying (2.8). Then
Let $2^{\mathbb N}$ denote the set of all integer powers of 2. Then for any nonincreasing function $\psi : \mathbb N \to (0, \infty)$, we may write
Combining with (2.13) and using the Hausdorff-Cantelli lemma [7, Lemma 3.10], one can immediately deduce the following corollary:
Corollary 2.11. Let $M_Q \subseteq \mathbb P^d_{\mathbb R}$ be a nonsingular rational quadratic hypersurface satisfying (2.8). Fix $0 \leq s \leq k$, and let $\psi : \mathbb N \to (0, \infty)$ be nonincreasing. If the series
The case $s = k$ corresponds to Lebesgue measure. Based on the above, one would expect that Khintchine's theorem for quadratic hypersurfaces would state that the converse of Corollary 2.11 holds when $s = k$ (possibly with some additional assumptions on $\psi$). However, we instead have the following:
Theorem 9.1 (Khintchine's theorem for quadratic hypersurfaces). Let $M_Q \subseteq \mathbb P^d_{\mathbb R}$ be a nonsingular rational quadratic hypersurface satisfying (2.8). Fix $\psi : \mathbb N \to (0, \infty)$, and suppose that
(I) $q \mapsto q\psi(q)$ is nonincreasing and tends to zero, and
(II) $\log \circ \psi \circ \exp$ is uniformly continuous.
Then $A_{M_Q}(\psi)$ has full measure with respect to $\lambda_{M_Q}$ if and only if the series
(No changes needed for the Affine Corollary.)
Remark 2.12. Hypothesis (II) can be weakened slightly. Call a function $\psi$ regular if for every $C_1 > 1$, there exists $C_2 > 1$ such that for all $q_1, q_2$, if $1/C_1 \leq q_2/q_1 \leq C_1$, then $1/C_2 \leq \psi(q_2)/\psi(q_1) \leq C_2$. This may be stated succinctly as follows: $q_1 \asymp_\times q_2$ implies $\psi(q_1) \asymp_\times \psi(q_2)$. Then hypothesis (II) can be weakened to the hypothesis that $\psi$ is regular.
When $Q \not\sim Q_0$, the series (9.1) is simply the series (2.14) evaluated at $s = k$; thus in this case, the convergence case of Theorem 9.1 follows directly from Corollary 2.11. However, when $Q \sim Q_0$, this is not true; for certain $\psi$, the series (9.1) may converge even if the series (2.14) diverges at $s = k$. This means that the converse to the Borel-Cantelli lemma does not hold for the collection of sets defining $A_{M_Q}(\psi)$, and so philosophically, there is some nontrivial relation between these sets. A description of this relation is given in Section 11 (see in particular Remark 11.3), where an elementary proof of the convergence case of Theorem 9.1 for the manifold $M_{Q_0}$ is given. Specifically, it will turn out that the convergence case can be proven by a more sophisticated application of the Borel-Cantelli lemma than the one outlined above.
Note that Theorem 9.1 is analogous to the main result of [35], the difference being that we are considering intrinsic approximation and the authors of [35] are considering a specific type of extrinsic approximation. Also, it is likely that the techniques of C. Druţu [27] can be used to prove Theorem 9.1 in the case $Q \not\sim Q_0$ via the use of ubiquitous systems as considered in [5]. Such a proof would weaken the regularity requirements on $\psi$ to the hypothesis that $\psi$ is nonincreasing. On the other hand, Druţu's methods do not apply to the exceptional quadratic hypersurface $M_{Q_0}$, since Druţu makes a standing assumption that the lattice $\Gamma$ is irreducible (cf. [27, §2.5, §4.5]), which does not hold for that hypersurface (see p.41). Moreover, the authors take the point of view that the regularity hypotheses for Khintchine theorems are not too important as long as most "interesting borderline" cases satisfy the hypotheses. Thus, we instead use the machinery of Kleinbock
Theorem 2.13 (The Jarník-Besicovitch theorem for quadratic hypersurfaces). Fix $0 < s < k$. Let $\psi : \mathbb N \to (0, \infty)$ be regular, and suppose that $q \mapsto q^k \psi^s(q)$ is nonincreasing and tends to zero. If the series
In particular, combining with Corollary 2.11, for all $c > 1$ we have
We remark that if $Q \sim Q_0$, then (2.15) may converge while (2.14) diverges. In this case, we do not know the value of $\mathcal H^s(A_{M_Q}(\psi))$. However, the coarser dimension result (2.16) holds regardless. For reasons explained in Remark 11.3, the authors conjecture that Theorem 2.13 remains true if (2.15) is replaced by (2.14). If so, then the value of $\mathcal H^s(A_{M_Q}(\psi))$ would be known in every case.
Note also that if $q^2 \psi(q) \to 0$, then all $\psi$-good rational approximations of points in $M_Q$ are intrinsic,
Remark 2.14. Let $\mathbb H^d$ denote the $d$-dimensional hyperbolic space. Given a quadratic hypersurface $M_Q \subseteq \mathbb P^d_{\mathbb R}$ satisfying $p_{\mathbb Q} = p_{\mathbb R} = 1$, there exists a lattice $\Gamma \leq \mathrm{Isom}(\mathbb H^d)$ and a diffeomorphism $\Phi : \partial\mathbb H^d \to M_Q$ such that if $P_\Gamma \subseteq \partial\mathbb H^d$ is the set of parabolic fixed points of $\Gamma$, then $\Phi(P_\Gamma) = \mathbb P^d_{\mathbb Q} \cap M_Q$. This correspondence allows one to deduce the case $p_{\mathbb Q} = p_{\mathbb R} = 1$ of all the results of this subsection as consequences of known theorems about Diophantine approximation of lattices in $\mathrm{Isom}(\mathbb H^d)$; see §6.5 for more detail.
Remark 2.15. In the above theorems, the form Q is always assumed to be nonsingular with integral coefficients. The latter assumption may be made without loss of generality, since if Q is a quadratic form which is not a scalar multiple of any quadratic form with integral coefficients, then P d Q ∩ M Q is not dense in M Q ; cf. Remark 8.9. On the other hand, the nonsingularity assumption does involve a loss of generality. In Theorem 8.1, the singular case can be deduced from the nonsingular case; cf. Remark 8.7. However, this is not the case for Theorem 9.1. The use of the nonsingularity assumption appears unavoidable in Theorem 9.1 since if Q is singular, then the associated algebraic group O(Q) is not semisimple.
2.4. The structure of the paper. The first part (General Theory) will be divided as follows. In Section 3 we introduce some more notation and terminology, and test the notions discussed above on a special class of manifolds (Veronese varieties), thereby familiarizing the readers with some of the tools used in the paper. In Section 4 we introduce the hyperplane game, our main technical tool for proving Theorem 4.3 (full dimension of BA), and state a strengthening of Theorem 4.3 which is phrased in terms of the hyperplane game (Theorem 5.3). In Section 5 we state and prove a version of the Simplex Lemma (Lemma 5.1), which we then use to prove both Theorem 5.3 (and thus Theorem 4.3) and Theorem 5.5.
In the second part we consider quadratic hypersurfaces. In Section 6 we recall the necessary preliminaries from the theory of quadratic forms. In Section 7 we state and prove the Correspondence Principle, which relates intrinsic Diophantine approximation on a nonsingular rational quadratic hypersurface M Q with dynamics on a certain space of arithmetic lattices. This correspondence is similar to the one developed for ambient approximation by Davenport-Schmidt and Dani, see [24,25,21,45,44] and generalizes the one used in [46]. In particular, we prove (Corollary 7.3) that [x] ∈ BA MQ (ψ 1 ) if and only if a certain trajectory on the corresponding homogeneous space is bounded.
In Section 8 we prove Theorem 8.1 (Dirichlet for quadratic hypersurfaces). In Section 9 we use [45, Theorem 1.7] to reduce Theorem 9.1 (Khintchine for quadratic hypersurfaces) to a statement about Haar measure on the space of Q-arithmetic lattices (Proposition 10.9). In Section 10 we use the generalized Iwasawa decomposition [49, Proposition 8.44] and the reduction theory for algebraic groups [52, Proposition 2.2] to prove Proposition 10.9, thus completing the proof of Theorem 9.1. Finally, in Section 11 we analyze in detail the exceptional quadratic hypersurface $M_{Q_0}$, and we explain intuitively why the converse to (the naive application of) Borel-Cantelli does not hold for intrinsic approximation on this hypersurface.

Part 1. General theory

A discussion of maximal approximability in special cases
Given a submanifold M ⊆ P d R , let us say that a coordinate lift of M is a map Φ : U → π −1 (M ) such that the map π • Φ : U → M is a local parameterization of M .
We start by giving a more detailed definition of a nondegenerate submanifold of the projective space.
: α ∈ N k , |α| ≤ j Here the power ∂ α is taken using multi-index notation: if α = (α 1 , . . . , α k ), then ∂ α = ∂ α1 1 · · · ∂ α k k . We will call T  Connected manifolds which are degenerate at every point but are not contained in a hyperplane exist but are very pathological; we refer to [66] for a detailed account, stated in somewhat different language.
The next example describes an important family of nondegenerate submanifolds of $\mathbb P^d_{\mathbb R}$:
3), and consider the Veronese embedding $\Phi_{k,n}$ : where the power $t^\alpha$ is taken using multi-index notation. Then it can be straightforwardly verified that the
The one-dimensional special case ($k = 1$, $d = n$) is usually called the Veronese curve or rational normal curve; its affine analogue is given by (1.10). Letting $k = 1$ and $n = 2$ yields $V_{1,2} = C_2$, as in Example 2.2. Recall that the latter curve was our first example of a maximally approximable manifold (see Definition 2.4).
The following lemma shows that the map Φ k,n is an "isomorphism" between the Diophantine triples Lemma 3.4. The map Φ k,n is a diffeomorphism between P k R and V k,n ; moreover The proof is a straightforward computation which is left to the reader.
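The computation behind Lemma 3.4 can be illustrated numerically. The sketch below assumes that $\Phi_{k,n}$ is the standard projective Veronese map, sending $[t_0 : \cdots : t_k]$ to the tuple of all monomials of degree $n$ in $t_0, \ldots, t_k$, and checks the relation $H_{\mathrm{std}} \circ \Phi_{k,n} = H_{\mathrm{std}}^n$ on a few rational points; this relation is the one suggested by the appearance of $H_{\mathrm{std}}^n$ in the proof of Lemma 3.6. The helper names `veronese` and `height` are hypothetical.

```python
from functools import reduce
from itertools import combinations_with_replacement
from math import gcd, prod


def veronese(p, n):
    """All degree-n monomials in the coordinates of p (assumed form of Phi_{k,n})."""
    return tuple(prod(c) for c in combinations_with_replacement(p, n))


def height(v):
    """Standard height of [v]: max norm of the primitive integer representative."""
    g = reduce(gcd, v)  # positive, since v is a nonzero integer vector
    return max(abs(x) // g for x in v)


p = (2, -3, 1)                 # a point of P^2(Q)
print(veronese(p, 2))          # (4, -6, 2, 9, -3, 1)
print(height(veronese(p, 2)))  # 9 = height(p) ** 2
```

The image of a primitive vector is again primitive, because it contains the pure powers $t_i^n$, and its largest entry in absolute value is $\max_i |t_i|^n$; this is the whole content of the check.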
Corollary 3.5. For any k, n ∈ N, V k,n is a maximally approximable submanifold of P d R .
Proof. We begin by proving the following more general assertion: Lemma 3.6. Fix $d, n \in \mathbb N$, and let $M$ be a maximally approximable submanifold of $\mathbb P^d_{\mathbb R}$
is an optimal Dirichlet function for the Diophantine triple $(M, M \cap \mathbb P^d_{\mathbb Q}, H_{\mathrm{std}})$. It follows that the function $\psi_{c(k,d)/n}$ is an optimal Dirichlet function for the Diophantine triple $(M, M \cap \mathbb P^d_{\mathbb Q}, H_{\mathrm{std}}^n)$, and thus (by Lemma 3.4) also for the Diophantine triple
Since $\psi_{c_1}$ and $\psi_{c_2}$ cannot be optimal Dirichlet functions for the same Diophantine triple unless $c_1 = c_2$, the lemma follows. ⊳
Since $\mathbb P^k_{\mathbb R}$ is a maximally approximable submanifold of itself, to complete the proof it suffices to show that (3.2) holds when $k = d$. Indeed, 16 which implies the desired result.
13 Even stronger, every point of $V_{k,n}$ is $n$-nondegenerate. Moreover, by
for any manifold $M$. Thus the ambient dimension of $V_{k,n}$ is maximal among all $n$-nondegenerate $k$-dimensional manifolds.
14 The map $\Phi_{k,n}$ is not the only embedding which is an isomorphism in this sense; more generally, if $\Phi : \mathbb R \to \mathbb P^d_{\mathbb R}$ is an embedding defined by polynomials with integer coefficients, then a relation between $H_{\mathrm{std}} \circ \Phi$ and $H_{\mathrm{std}}$ was found in [18, Proof of Lemma 2]. Similarly to Corollary 3.5, this relation can be used to discover an optimal Dirichlet function on the corresponding curve. However, in most cases the resulting curve is not maximally approximable.
15 This follows, for example, if $M$ is connected, real-analytic, and Zariski dense in $\mathbb P^d_{\mathbb R}$.
It will be observed that Corollary 3.5 is simply the end result of transferring Dirichlet's theorem in P k R into V k,n via the map Φ k,n . Similarly, using Lemma 3.4 it is possible to transfer the theorems of Khintchine and of Jarník-Besicovitch from P k R to V k,n ; we omit the statements for brevity. Thus, intrinsic Diophantine approximation on V k,n is essentially the same as Diophantine approximation on P k R , and does not introduce any new phenomena. By contrast, we will see new phenomena when we study nonsingular rational quadratic hypersurfaces in Part 2, demonstrating that Diophantine approximation on these hypersurfaces cannot be reduced to Diophantine approximation on P k R in the same way.
We end this section with a discussion of the following question: for which pairs (k, d) does there exist a maximally approximable submanifold of P d R of dimension k? Let M = {(k, d) : there exists a maximally approximable submanifold of P d R of dimension k}. Trivially (k, k) ∈ M for all k ∈ N. Moreover, since every nonsingular rational quadratic hypersurface is maximally approximable (Theorem 8.1 below), we have (k, k + 1) ∈ M for all k ∈ N. On the other hand, by Corollary 3.5, we have (k, [k, n] − 1) ∈ M for all k, n ∈ N. Taking the special case k = 1, we have (1, d) ∈ M for all d ∈ N. Thus in every dimension, there exist both a maximally approximable curve and a maximally approximable hypersurface.
It is theoretically possible to get more pairs in M by using Lemma 3.6. Namely, if (k, d) ∈ M and if (3.2) holds for some n ∈ N, then (k, [d, n] − 1) ∈ M. However, we do not have any examples of pairs (k, d) which we can prove to be in M this way but which were not proven to be in M in the above paragraph.
Although the list of pairs known to be in M is so far quite meager, the elegance of the calculation which produces the number c(k, d) (cf. Lemma 5.1 and its proof) leads the authors to believe that there could be many more examples. It is even conceivable that all dimension pairs are in M. We therefore ask the following question: Open Question 3.7. Is it true that for every 1 ≤ k ≤ d, there exists a maximally approximable submanifold of P d R of dimension k? The smallest pair (k, d) for which we do not know the answer to this question is the pair (2, 4), which satisfies c(2, 4) = 5/6.
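The value c(2, 4) = 5/6 can be checked numerically. The sketch below assumes that N k,d is the smallest possible total order of d + 1 distinct multi-indices in N^k (the combinatorial description given in the proof of Lemma 5.1), and that [k, n] equals the binomial coefficient C(n + k, k), the number of monomials of degree n in k + 1 variables. Both readings are assumptions, since Notation 2.3 and the definition of [k, n] are not reproduced here; the function names are illustrative only.

```python
from fractions import Fraction
from math import comb


def min_total_order(k: int, d: int) -> int:
    """Smallest sum of orders of d + 1 distinct multi-indices in N^k.

    Assumption: this equals the constant N_{k,d} of Notation 2.3.
    """
    remaining = d + 1
    total = 0
    order = 0
    while remaining > 0:
        # Number of multi-indices alpha in N^k with |alpha| == order.
        available = comb(order + k - 1, k - 1)
        used = min(available, remaining)
        total += order * used
        remaining -= used
        order += 1
    return total


def c(k: int, d: int) -> Fraction:
    """Exponent c(k, d) = (d + 1) / N_{k,d}, valid for 1 <= k <= d."""
    return Fraction(d + 1, min_total_order(k, d))


def veronese_ambient_dimension(k: int, n: int) -> int:
    """Assumed value of [k, n] - 1, with [k, n] = C(n + k, k)."""
    return comb(n + k, k) - 1
```

Under these assumptions, `c(2, 4)` evaluates to `Fraction(5, 6)`, and `c(k, veronese_ambient_dimension(k, n)) == c(k, k) / n` holds for small k and n, which is the relation behind Corollary 3.5.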

The hyperplane game and two variants
In [61], W. M. Schmidt introduced the game which is now known as Schmidt's game. A variant of this game was defined by C. T. McMullen [55], and in turn a variant of McMullen's game was defined in [11]. For the purposes of this paper, we will be interested only in this last variant, called the hyperplane absolute game, 17 and not in Schmidt's game or McMullen's game. However, we note that every hyperplane winning set is winning for Schmidt's game [11, Proposition 2.3(a)].
Given β > 0 and k ∈ N, the β-hyperplane game is played on R k by two players Alice and Bob as follows:

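Bob chooses an initial ball $B_0 = B(x_0, \rho_0) \subseteq \mathbb R^k$.
After Bob's nth move $B_n$, Alice chooses an affine hyperplane $A_n \subseteq \mathbb R^k$. We say that Alice "deletes the neighborhood of $A_n$".

After Alice's nth move $A_n$, Bob chooses a ball $B_{n+1} = B(x_{n+1}, \rho_{n+1})$ satisfying $B_{n+1} \subseteq B_n \setminus A_n^{(\beta\rho_n)}$ and $\rho_{n+1} \ge \beta\rho_n$.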
Here and elsewhere $S^{(\varepsilon)}$ denotes the ε-thickening of a set S. If Bob is unable to choose such a ball, he loses. 18 A set S ⊆ R k is said to be β-hyperplane winning if Alice has a strategy which guarantees that $\bigcap_n B_n \cap S \neq \emptyset$. The set S is hyperplane winning if it is β-hyperplane winning for all β > 0.
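To make the rules concrete, here is a minimal sketch of a legality check for one of Bob's moves. It assumes the reading of the rules in which Bob's new ball must lie inside his previous ball, avoid the βρ n -thickening of Alice's hyperplane, and have radius at least βρ n . The function name, the representation of the hyperplane as {y : ⟨normal, y⟩ = offset}, and the treatment of boundary cases as legal are illustrative choices, not taken from the source.

```python
import math
from typing import Sequence


def is_legal_bob_move(
    prev_center: Sequence[float],
    prev_radius: float,
    normal: Sequence[float],
    offset: float,
    next_center: Sequence[float],
    next_radius: float,
    beta: float,
) -> bool:
    """Check Bob's move B(next_center, next_radius) after Alice deletes the
    hyperplane {y : <normal, y> = offset} from B(prev_center, prev_radius)."""
    norm = math.sqrt(sum(a * a for a in normal))
    if norm == 0:
        raise ValueError("normal vector must be nonzero")
    # Radius condition: rho_{n+1} >= beta * rho_n.
    if next_radius < beta * prev_radius:
        return False
    # Containment: the new ball lies inside the previous ball.
    if math.dist(prev_center, next_center) + next_radius > prev_radius:
        return False
    # Avoidance: every point of the new ball is at distance at least
    # beta * rho_n from Alice's hyperplane.
    dist_to_plane = abs(sum(a * x for a, x in zip(normal, next_center)) - offset) / norm
    return dist_to_plane - next_radius >= beta * prev_radius
```

For example, with the unit disc as Bob's previous ball, Alice's hyperplane the vertical axis and β = 0.1, the ball of radius 0.3 centered at (0.5, 0) is a legal reply, while the ball of radius 0.15 centered at (0.2, 0) is not.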
Remark 4.1. By modifying slightly the proof of [30, Proposition 4.4], one can show that if Bob's balls are required to satisfy $\rho_{n+1} = \beta\rho_n$ rather than $\rho_{n+1} \ge \beta\rho_n$, then the class of sets which are hyperplane winning remains unchanged. Thus we can assume that $\rho_n \to 0$, in which case the intersection $\bigcap_{n=1}^{\infty} B_n$ is a singleton.
We list here three important results regarding hyperplane winning sets, the proofs of which can be found in [ In fact, in (iii) more is true: the intersection of a hyperplane winning set with a sufficiently nondegenerate fractal (a hyperplane diffuse set) is winning for Schmidt's game on that fractal [11, Propositions 4.7 and 4.9] and therefore has Hausdorff dimension equal to at least the lower pointwise dimension of any measure whose support is equal to that fractal [48, Proposition 5.1]. In particular, if the fractal is Ahlfors regular then the intersection has full dimension relative to the fractal.
In [47, §3], the notion of hyperplane winning was generalized from subsets of Euclidean space to subsets of arbitrary manifolds. Namely, a subset S of a manifold M is hyperplane winning relative to M if whenever Ψ : U → M is a local parameterization of M and K ⊆ U is compact, the set $\Psi^{-1}(S) \cup (\mathbb R^k \setminus K)$ is hyperplane winning. 19
We now state our main result concerning the abundance of badly intrinsically approximable points:
Theorem 5.3. Let M ⊆ P d R be a submanifold of dimension k, and let c(k, d) be as in Notation 2.3. Suppose that for some D ∈ N, every point of M is D-nondegenerate. Then BA M (ψ c(k,d) ) is hyperplane winning relative to M .
Using Theorem 5.3, we deduce as a corollary the following result which was stated in the introduction: In order to prove Theorem 5.3, we will introduce two variants of the hyperplane game. The first allows Alice to delete neighborhoods of algebraic sets rather than just hyperplanes, and the second allows her to delete neighborhoods of levelsets of smooth functions. It will turn out that each of these variants is equivalent to the hyperplane game, meaning that any set which is winning for one of the games is winning for all three games.
Definition 4.4. Fix β > 0 and D ∈ N. The rules of the (β, D) algebraic-set game are the same as the rules of the β-hyperplane game, except that A n is allowed to be the zero set of any polynomial of degree at most D. A set is algebraic-set winning if there exists D ∈ N so that it is (β, D) algebraic-set winning for all β > 0.
where the derivative is taken using multi-index notation. Let Definition 4.5. The rules of the (β, D, C 1 )-levelset game are the same as the rules of the β-hyperplane game, except that A n is allowed to be the zero set of any C D+1 function f : B n → R satisfying The condition (4.1) should be interpreted heuristically as meaning that "f is close to being a polynomial of degree D".
Clearly, any hyperplane winning set is algebraic-set winning and any algebraic-set winning set is levelset winning. The remainder of this section is devoted to the proof of the following theorem: Theorem 4.6. Any levelset winning set is hyperplane winning.
We begin by introducing some notation.
• For f : U → R, Z f will denote the zero set of f , i.e. $Z_f = f^{-1}(0)$.
• For D ∈ N, P D will denote the set of all polynomials of degree at most D whose largest coefficient has magnitude 1. Note that P D is a compact topological space; moreover, every nonzero polynomial of degree at most D is a scalar multiple of an element of P D .
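The last remark in the notation above (every nonzero polynomial is a scalar multiple of an element of P D ) amounts to dividing by the largest coefficient magnitude. A minimal sketch follows, assuming a polynomial is stored as a dictionary from exponent vectors to coefficients; this representation and the function name are hypothetical, chosen only for illustration.

```python
from typing import Dict, Tuple

Monomial = Tuple[int, ...]  # exponent vector alpha of the monomial x^alpha


def normalize_to_PD(coeffs: Dict[Monomial, float]) -> Tuple[float, Dict[Monomial, float]]:
    """Write a nonzero polynomial f as scale * g, where the largest
    coefficient of g has magnitude 1 (so g lies in P_D for D >= deg f)."""
    scale = max((abs(c) for c in coeffs.values()), default=0.0)
    if scale == 0:
        raise ValueError("the zero polynomial is not a multiple of an element of P_D")
    return scale, {alpha: c / scale for alpha, c in coeffs.items()}
```

Since the scale factor is positive, f and g have the same zero set, which is why statements about Z f may be checked on P D only.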
Lemma 4.8. Fix k ∈ N and 0 < β ≤ 1, and let f : R k → R be a nonzero polynomial. Suppose that Bob and Alice are playing the β-hyperplane game, and suppose that Bob's first move is B 0 = B(0, 1). Then there exists γ > 0 so that Alice has a strategy to guarantee that Bob's first ball of radius less than γ (assuming that such a ball exists) is disjoint from $Z_f^{(\gamma)}$.
Proof. The proof is by induction on the degree of f . If deg(f ) = 0, then f is a nonzero constant and $Z_f = \emptyset$, so the lemma is trivially satisfied. Next, suppose that the lemma is true for all polynomials of degree strictly less than deg(f ). In particular, it is true for $\tilde f := (\partial/\partial x_1)[f]$; let $\tilde\gamma > 0$ be given by the lemma. Since is a compact subset of a nonsingular hypersurface in R k , there exists δ > 0 with the following property: Alice's strategy is now as follows: Use the strategy from the induction hypothesis to guarantee that Bob's first ball of radius less than $\tilde\gamma$ is disjoint from $Z_{\tilde f}^{(\tilde\gamma)}$. If the radius of this ball is greater than δ, make further moves arbitrarily until Bob chooses a ball of radius less than δ. Either way, let B = B(x, ρ) denote Bob's first ball satisfying $\rho \le \min(\tilde\gamma, \delta)$, and note that $\rho \ge \beta \min(\tilde\gamma, \delta) =: 2\gamma/\beta$. In particular ρ > γ, so Bob has not yet chosen a ball of radius less than γ. Let L be a hyperplane such that $K \cap B(x, 2\rho) \subseteq L^{(\beta\rho/2)}$, guaranteed to exist by (4.2). Alice's next move will be to delete the neighborhood of the hyperplane L. Following that, she will make arbitrary moves until Bob chooses a ball B of radius less than γ.
We claim that B is disjoint from Z f .
We next show that the constant γ can be made to depend only on the degree of f and not on f itself.
Lemma 4.9. Fix k, D ∈ N and 0 < β ≤ 1. There exists γ > 0 such that for any nonzero polynomial f : R k → R of degree at most D, if Bob and Alice play the β-hyperplane game and if Bob's first move is B 0 = B(0, 1), then Alice has a strategy to guarantee that Bob's first ball of radius less than γ (assuming that such a ball exists) is disjoint from $Z_f^{(\gamma)}$.
Proof. The map $P_D \ni f \mapsto Z_f$ is upper semicontinuous in the Hausdorff topology, meaning that for any f ∈ P D , γ > 0, and K ⊆ R k compact, there exists a neighborhood of f in P D such that all g in the neighborhood satisfy $Z_g \cap K \subseteq Z_f^{(\gamma)}$. In particular, for each f ∈ P D , let γ f be as in Lemma 4.8, and let be a finite subcover and let $\gamma = \min_{i=1}^{n} \gamma_{f_i}/2$. Then for all g ∈ P D , $g \in U_{f_i}$ for some i, and so Since Alice has a strategy to avoid Z by the time Bob's radius is less than $\gamma_{f_i}$, she has a strategy to avoid $Z_g^{(\gamma)}$ by the time Bob's radius is less than γ.
Let k, D, β, and γ be as above. Fix x ∈ R k and ρ > 0, and let . Bob's first ball of radius less than γρ will still have radius ≥ βγρ by the rules of the β-hyperplane game, so it can be interpreted as Bob's next move in the (βγ, D) algebraic-set game. Summarizing, we have the following:
Corollary 4.10. Any algebraic-set winning set is hyperplane winning.
Proof. For each k, D ∈ N and 0 < β ≤ 1, if γ > 0 is as in Lemma 4.9, then every (βγ, D) algebraic-set winning subset of R k is β-hyperplane winning.
To complete the proof of Theorem 4.6, we must show that every levelset winning set is algebraic-set winning. For this, we will need three more lemmas:
Lemma 4.11. Fix k, D ∈ N and β > 0. Then there exists γ > 0 such that for any f ∈ P D , there exists g ∈ P D such that
Proof. For each g ∈ P D , |g| is bounded uniformly away from 0 on $B(0,1) \setminus Z_g^{(\beta)}$. Let γ g > 0 be strictly less than this uniform bound, and let U g be the set of all polynomials f ∈ P D such that $\min_{B(0,1) \setminus Z_g^{(\beta)}} |f| > \gamma_g$. Then U g is an open set containing g. Letting $(U_{g_i})_{i=1}^{n}$ be a finite subcover, the lemma holds with $\gamma = \min_{i=1}^{n} \gamma_{g_i}$.
Lemma 4.12. Fix k, D ∈ N and β > 0, and let B = B(0, 1). There exists In particular, Z Proof. Fix δ > 0 small to be determined, and let f : B → R be as above. For convenience of notation, we without loss of generality assume that f C D ,B = 1. By the definition of f C D ,B , there exists a point z ∈ B such that f C D ,z ≥ 1/2. Let h z denote the Dth order Taylor polynomial for f centered at z. Then Write h z = cj for some c > 0 and j ∈ P D ; then j C D ,B ≍ × 1 since P D is compact. Combining with (4.4), we see that c × 1, and thus Let γ > 0 be as in Lemma 4.11, and let δ be γ divided by the implied constant of (4.5). Then Moreover, by Lemma 4.11 there exists g ∈ P D such that g . This completes the proof.
Lemma 4.13. Fix k, D ∈ N and β, C 1 > 0. Then there exists ε > 0 such that for any ball B = B(x, ρ) ⊆ R k satisfying ρ ≤ ε and for any C D+1 function f : B → R satisfying there exists a polynomial g : Proof. Fix 0 < ε ≤ 1 small to be determined, and let B = B(x, ρ) and f : B → R be as above. Let T x,ρ be given by (4. 3), and let f = f • T x,ρ . Then for all α ∈ N k with |α| = D + 1, and on the other hand Combining, we have sup So for ε sufficiently small, f satisfies the hypotheses of Lemma 4.12. Let g be the polynomial given by Lemma 4.12, and let g = g • T −1 x,ρ , so that Z g = T x,ρ (Z g ). This completes the proof. Let k, D, β, C 1 , and ε be as above. Lemma 4.13 gives us a way of translating a winning strategy for Alice in the (β, D, C 1 )-levelset game into a winning strategy for Alice in the (2β, D) algebraic-set game. Indeed, without loss of generality suppose that Bob's first move in the (β, D, C 1 )-levelset game has radius ≤ ε. (Otherwise Alice makes dummy moves until this is true.) Now if Alice responds to Bob's move B(x, ρ) in the (β, D, C 1 )-levelset game by deleting the set Z (βρ) f , then in the (2β, D) algebraic-set game, she will simply delete the set Z (2βρ) g , where g is given by Lemma 4.13. Summarizing, we have the following: Corollary 4.14. Any levelset winning set is algebraic-set winning.
Combining Corollaries 4.10 and 4.14 completes the proof of Theorem 4.6.

The simplex lemma and its consequences
The paradigmatic example of a hyperplane winning set is the set BA d of badly approximable vectors in R d , which was proven to be hyperplane winning in [11, Theorem 2.5], as a consequence of the so-called simplex lemma [11, Lemma 3.1]. Essentially, the simplex lemma states that for each ball B(x, ρ) ⊆ R d , the set of rational points in B(x, ρ) whose denominators are less than $\varepsilon\rho^{-d/(d+1)}$ is contained in an affine hyperplane, where ε > 0 is small and depends only on d. As a result, when playing the hyperplane game Alice can simply delete the neighborhood of the hyperplane given by the simplex lemma, and it turns out that this strategy is winning for BA d .
In this section we prove an analogue of the simplex lemma for rational points in a fixed manifold M . We then use the simplex lemma to prove two general negative results about intrinsic approximation on manifolds: that BA M (ψ c(k,d) ) is hyperplane winning, and that λ M (VWA M ) = 0.
Recall that for 1 ≤ k ≤ d, the constants N k,d and c(k, d) = (d + 1)/N k,d were defined in Notation 2.3.
Lemma 5.1 (Simplex lemma for manifolds). Let M ⊆ P d R be a submanifold of dimension k, let Ψ : U → M be a local parameterization of M , and let V ⊆ U be compact. Then there exists κ > 0 such that for all s ∈ U and 0 < ρ ≤ 1, the set Then f vanishes along the diagonal In fact, the first several derivatives of f vanish along the diagonal, due to repeated columns: The smallest order derivative of f which does not vanish along the diagonal is no less than N k,d .
Proof. Suppose that In particular, the rows $(\partial^{\alpha_i}\Phi(t))_{i=1}^{d+1}$ are all distinct, so the multi-indices $\alpha_1, \ldots, \alpha_{d+1}$ must be distinct. Thus for each j ∈ N, and on the other hand, The order of the derivative so computing the smallest order derivative of f which potentially does not vanish along the diagonal becomes a combinatorial problem of minimizing $\sum_{j=1}^{\infty} j n_j$ subject to (5.2) and (5.3). The reader will verify that the minimum is attained at the value N k,d described in Notation 2.3. ⊳
Thus by Taylor's theorem, we have Let On the other hand, if we write It follows that On the other hand, r i ≍ × 1 since r i ∈ Φ(V ), and so For κ > 0 sufficiently small, this contradicts (5.5).
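The combinatorial minimization in the proof can be checked by brute force for small parameters. The sketch below assumes that the problem is to choose d + 1 distinct multi-indices in N^k so as to minimize the sum of their orders, which is how the constraints (5.2) and (5.3) are read here; the constraints themselves are not reproduced in the text above, so this reading is an assumption, and the function names are illustrative.

```python
from itertools import combinations, product
from typing import List, Tuple


def multi_indices(k: int, max_order: int) -> List[Tuple[int, ...]]:
    """All alpha in N^k with |alpha| <= max_order."""
    return [a for a in product(range(max_order + 1), repeat=k) if sum(a) <= max_order]


def brute_force_minimum(k: int, d: int) -> int:
    """Minimize the total order over all sets of d + 1 distinct multi-indices in N^k."""
    # Orders above d never help: there are at least d + 1 multi-indices of
    # order <= d, so a higher-order choice can always be swapped for a lower one.
    candidates = multi_indices(k, d)
    return min(sum(sum(a) for a in choice) for choice in combinations(candidates, d + 1))
```

For instance, `brute_force_minimum(2, 4)` returns 6, consistent with c(2, 4) = (4 + 1)/6 = 5/6. The search is exponential in d and is meant only as a sanity check for small k and d.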
Using the simplex lemma, we proceed to prove two results about intrinsic approximation on M . The first is the following:
Theorem 5.3. Let M ⊆ P d R be a submanifold of dimension k, and let c(k, d) be as in Notation 2.3. Suppose that for some D ∈ N, every point of M is D-nondegenerate. Then BA M (ψ c(k,d) ) is hyperplane winning relative to M .
Proof. Let Ψ : U → M be a local parameterization of M , and let K ⊆ U be compact. We need to show that the set is hyperplane winning. Fix C 1 > 0 large to be determined, and let β > 0. We will show that the set (5.6) is (β, D, C 1 )-levelset winning, where D is as in the statement of Theorem 5.3. Let λ = β −1/c(k,d) (so that λ > 1). Denote Bob's first move by B 0 = B(s 0 , ρ 0 ) ⊆ R k . Fix an open set V ⊇ K which is relatively compact in U ; without loss of generality we may assume that B(s 0 , 2ρ 0 ) ⊆ V , since Alice may make dummy moves until either this is true or Bob's ball is disjoint from K. Now Alice's strategy is as follows: If Bob has just made his nth move B n = B(s n , ρ n ) ⊆ V , then Alice will delete the βρ n -neighborhood of the set Ψ −1 (L n ), where L n is the hyperplane containing the set S sn,2ρn . To complete the proof we need to show (i) that this is legal (given C 1 > 0 large enough), and (ii) that the strategy guarantees that and by continuity, this quantity is bounded from below uniformly for s ∈ V and w ∈ S d . Now consider Alice's nth move.
so that Z f = Ψ −1 (L n ). Then by the first paragraph, f C D ,Bn ≥ f C D ,sn is bounded from below. On the other hand, In To state our last theorem regarding general manifolds, we need a definition: Definition 5.4. A measure µ on R k is absolutely decaying if there exist C, α > 0 such that for all x ∈ Supp(µ), for all 0 < ρ ≤ 1, for all ε > 0, and for every affine hyperplane L ⊆ R k , we have for all x ∈ Supp(µ) and 0 < ρ ≤ 1. If µ is both absolutely decaying and doubling, then µ is called absolutely friendly.
Recall that the set VWA M is defined by the equation A M (ψ c ).
Fix 0 < γ < ε/c(k, d). If n is sufficiently large, then we have On the other hand, since s ∈ K, we have s ∈ B(s i , 2ρ n ) , and so by Corollary 5.6 we have [r] ∈ L n,i . Thus Since this argument holds for all [r] satisfying (5.7), it follows that for infinitely many n ∈ N. ⊳ Claim 5.8. For each γ > 0, there exists α > 0 such that for all n ∈ N and i = 1, . . . , N n , Proof. Without loss of generality, suppose that Ψ(U ) ⊆ Φ(R d × {1}), and let Φ : nondegenerate embedding in the sense of [42]. Let B = B(s (n) i , ρ n ). By [42,Proposition 7.3], there exist C, α > 0 such that for any linear map P : On the other hand, if P is the linear functional whose zero set is the hyperplane L n,i , then So to complete the proof, we must show that ,ρn be as in (4.3), and let f = f • T . Translating (5.10) via T gives n . So to complete the proof, we must show that To demonstrate (5.12), let β > 0 be small enough so that for every polynomial g of degree at most D, . Such a β exists e.g. by a compactness argument. Let δ > 0 be given by Lemma 4.12. For n sufficiently large, the argument of Lemma 4.13 shows that the hypotheses of Lemma 4.12 are satisfied for f , and thus that Thus there exists s ∈ B(0, 1) for which | f (s)| ≥ δ f C D ,B(0,2) , demonstrating (5.12). ⊳ Fix γ, α as in Claim 5.8. From (5.8), we see that Remark 5.9. The conclusion of Theorem 5.5 holds for any measure µ satisfying (5.9). In particular, in [23], a class of measures will be considered which is vastly larger than the class of absolutely friendly measures, and these measures will be proven to satisfy (5.9). Thus these measures will also satisfy µ(VWA M ) = 0.
Part 2. Quadratic hypersurfaces

6. Preliminaries on quadratic forms
6.1. Orthogonality and nonsingularity. Let V be a vector space over R and let Q : V → R be a quadratic form. We denote by B Q the unique symmetric bilinear form on V satisfying We remark that B Q may be written explicitly in terms of Q via the formula B Q (x,
Notation 6.2. The set of all vectors which are Q-orthogonal to a given vector x will be denoted $x^\perp$, and for any S ⊆ V we let $S^\perp = \bigcap_{x \in S} x^\perp$. A form Q is nonsingular if and only if its corresponding hypersurface M Q is nonsingular (Proposition 6.15).
Observation 6.4. Q is nonsingular if and only if the map x → B Q (x, ·) is an isomorphism between V and V * . 6.2. Totally isotropic subspaces; rank and renormalization. Throughout this subsection, fix K ∈ {R, Q} and d ≥ 1, and let Q : R d+1 → R be a nonsingular quadratic form whose coefficients lie in K.
Definition 6.6. A subspace E ≤ R d+1 will be called a K-subspace if E has a basis consisting of elements of K d+1 , or equivalently, if E is defined by equations whose coefficients lie in K. (In the literature, it is sometimes said that E is defined over K.)
Proposition 6.7. Any two maximal totally isotropic K-subspaces of R d+1 have the same dimension. 20
Proof. Let E 1 , E 2 ≤ R d+1 be two maximal totally isotropic K-subspaces, and let A := {y ∈ E 2 : B Q (x, y) = 0 ∀x ∈ E 1 }. Since A + E 1 is a totally isotropic K-subspace of R d+1 , by the maximality of E 1 we have
Definition 6.8. The common dimension of Proposition 6.7 is called the K-rank of Q. It will be denoted p K .
It turns out to be convenient to conjugate totally isotropic subspaces to canonical subspaces, namely to subspaces of the form By choosing the right conjugation map φ, we may also guarantee that the conjugated quadratic form R = Q • φ has a particularly nice form. We make this rigorous as follows: 20 This proposition may be well-known, although we have not been able to find a reference. We include the proof for completeness. Definition 6.9. For m ≤ (d + 1)/2, a quadratic form R is m-normalized if there exists a quadratic form R on R d+1−2m such that The quadratic form R will be called the remainder of R.
Proposition 6.10. Let E ≤ R d+1 be a totally isotropic K-subspace of dimension m. Then m ≤ (d + 1)/2, and there exists φ ∈ GL d+1 (K) such that Proof. Since Q is nonsingular, we may identify E * with R d+1 /E ⊥ via the map i=m be a K-basis for E 3 , and let φ be the (d + 1) × (d + 1) matrix whose columns are given by f 0 , . . . , f d , so that φ(e i ) = f i for i = 0, . . . , d. Then φ ∈ GL d+1 (K) by the above-mentioned decomposition R d+1 = E ⊕ E 2 ⊕ E 3 . (i) and (ii) follow immediately. Corollary 6.11. p R ≤ (d + 1)/2. Corollary 6.12. If Q has coefficients in Q then p Q ≥ (d − 3)/2 unless p Q = p R . In particular, p Q ≥ p R − 2 in all cases.
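Proof. Without loss of generality suppose that Q is p Q -normalized, and let $\widetilde Q$ be the remainder of Q. If p Q ≠ p R , then $\widetilde Q$ represents zero over R; if moreover d + 1 − 2p Q ≥ 5, then Meyer's theorem implies that $\widetilde Q$ represents zero over Q, contradicting the definition of p Q . Hence d + 1 − 2p Q ≤ 4, i.e. p Q ≥ (d − 3)/2. The final assertion follows by combining this with Corollary 6.11.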
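The bookkeeping behind Corollaries 6.11 and 6.12 can be tabulated as follows. This is a sketch under stated assumptions: it lists the pairs (p Q , p R ) compatible with the two corollaries together with the elementary inequality p Q ≤ p R (a totally isotropic Q-subspace is in particular a totally isotropic R-subspace). These are necessary conditions only; the code does not claim that every listed pair is realized by some quadratic form, and the function name is hypothetical.

```python
from typing import List, Tuple


def admissible_rank_pairs(d: int) -> List[Tuple[int, int]]:
    """Pairs (p_Q, p_R) allowed by Corollaries 6.11 and 6.12 for a given d.

    Conditions used: p_Q <= p_R, 2 * p_R <= d + 1, and either
    p_Q == p_R or 2 * p_Q >= d - 3.
    """
    pairs = []
    for p_R in range(0, (d + 1) // 2 + 1):
        for p_Q in range(0, p_R + 1):
            if p_Q == p_R or 2 * p_Q >= d - 3:
                pairs.append((p_Q, p_R))
    return pairs
```

In every listed pair the gap p R − p Q is at most 2, which is the final assertion of Corollary 6.12.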
A convenient fact about m-normalized quadratic forms is that any element of GL m (R) extends to an element of SL d+1 (R) which preserves every m-normalized quadratic form. Specifically, given a quadratic form R : Then direct computation yields the following: Observation 6.13. Fix m ≤ (d + 1)/2 and φ ∈ GL m (R). Define the reverse of the matrix φ to be the matrix whose (i, j)th entry is equal to the (m − j, m − i)th entry of φ, and denote this matrix by φ R . Visually, φ R is φ flipped along the northeast-southwest diagonal. Let Then g φ ∈ O(R) for every m-normalized quadratic form R.
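The "reverse" operation of Observation 6.13 is the flip of a square matrix along its northeast-southwest diagonal. The sketch below uses 0-based indices, so the entry (i, j) of the reverse is the entry (m − 1 − j, m − 1 − i) of φ; this index convention is an assumption made here for the code, and the helper names are illustrative.

```python
from typing import List

Matrix = List[List[float]]


def reverse(phi: Matrix) -> Matrix:
    """Flip a square matrix along its northeast-southwest diagonal (0-based indices)."""
    m = len(phi)
    return [[phi[m - 1 - j][m - 1 - i] for j in range(m)] for i in range(m)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two square matrices of the same size."""
    m = len(a)
    return [[sum(a[i][l] * b[l][j] for l in range(m)) for j in range(m)] for i in range(m)]
```

Two properties are easy to confirm with these helpers: the reverse is an involution, and it reverses the order of products, i.e. the reverse of ab equals the reverse of b times the reverse of a. The second property is the same algebra as for the ordinary transpose.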
Next, for each m ≤ (d + 1)/2 and t ∈ R m , denote the diagonal matrix whose entries are $e^{-t_0}, \ldots, e^{-t_{m-1}}$ by φ t , and let Of particular importance will be the case m = 1, in which case A simple computation immediately yields the following observation, which will turn out to be quite useful:
Observation 6.14. For t ≥ 0 and x ∈ R d+1 , (6.6) $\operatorname{dist}(x, L_1) \le \|g_t(x)\|$.
6.4. The space of lattices; Mahler's compactness criterion. As stated in the introduction, our main tool for proving theorems concerning intrinsic approximation on M Q is a correspondence principle between approximations of a point in M Q and dynamics in the space of lattices. We will describe this correspondence principle in Section 7 below, while here we introduce the space of lattices which we are interested in, namely the space of Q-arithmetic lattices.
21 Strictly speaking, it is not necessary to prove nonsingularity of an algebraic variety in order to apply Theorem 4.3, since one may restrict one's attention to the smooth locus of that variety. By contrast, the nondegeneracy assertion is necessary.
Definition 6.16. Fix a quadratic form Q : (Symmetrically, we may also say that Q is Λ-arithmetic.) The set of Q-arithmetic lattices will be denoted Ω Q , while the set of all unimodular lattices in R d+1 will be denoted Ω d .
Observation 6.17. A quadratic form is Z d+1 -arithmetic if and only if its coefficients are integral.
Clearly, Ω Q is preserved by the action of O(Q). If Λ * ∈ Ω Q is fixed, we denote its stabilizer by O(Q; Λ * ) and its orbit by Ω Q,Λ * . We will implicitly identify Ω Q,Λ * with the symmetric space O(Q)/O(Q; Λ * ) via the map gO(Q; Λ * ) → gΛ * . This automatically endows Ω Q,Λ * with a topological structure and a Haar measure, which we will denote by µ Q,Λ * .
Viewing Ω Q,Λ * as a symmetric space could conceivably give it a different topology than viewing it as a subset of Ω d , which has its own topology from its identification with SL d+1 (R)/ SL d+1 (Z) coming from the map g SL d+1 (Z) → g(Z d+1 ). Fortunately, it turns out that these topologies are identical: Proposition 6.18. The inclusion map Ω Q,Λ * → Ω d is proper and continuous, when both spaces are endowed with the topologies coming from the identification with their corresponding symmetric spaces. Consequently, the topology on Ω Q,Λ * is unambiguous.
Proof. The continuity of the inclusion map follows directly from the continuity of the inclusion map from O(Q) to SL d+1 (R). Let us show that the inclusion map is proper. Let $(\Lambda_n)_{n=1}^{\infty}$ be a sequence in Ω Q,Λ * converging to a point Λ 0 ∈ Ω d . Then there exist SL d+1 (R) ∋ g n → g 0 ∈ SL d+1 (R) such that Λ n = g n (Z d+1 ) for all n ≥ 0. Then for all n ≥ 1, Q n := Q ∘ g n is a Z d+1 -arithmetic quadratic form, and Q n → Q 0 := Q ∘ g 0 . Since the space of Z d+1 -arithmetic quadratic forms is discrete (being identical to the space of quadratic forms with coefficients in Z), we have Q n = Q 0 for all sufficiently large n. (Thus a posteriori Q 0 is Z d+1 -arithmetic.) For n satisfying Q n = Q 0 , we have $h_n := g_n g_0^{-1} \in O(Q)$; in particular $\Lambda_0 = h_n^{-1}(\Lambda_n) \in \Omega_{Q,\Lambda_*}$. On the other hand Λ n = h n Λ 0 and h n → h 0 = id; this implies that Λ n → Λ 0 in the topology on Ω Q,Λ * coming from its identification with the symmetric space O(Q)/O(Q; Λ * ).
Observation 6.20. If we let Q = max Proof. For p ∈ Λ \ L Q , p ≥ |Q(p)|/ Q ≥ 1/ Q . Proof. By Observation 6.20, ρ Q is bounded from below on S if and only if ρ is bounded from below on S. But by Theorem 6.19, ρ is bounded from below if and only if S is precompact in the topology on Ω d . But by Proposition 6.18, this occurs if and only if S is precompact in the topology on Ω Q,Λ * . (Here we use not only the fact that the topology on Ω Q,Λ * is the one induced from Ω d , but also the fact that the inclusion map is proper and consequently Ω Q,Λ * is closed in Ω d .) 6.5. Relation to Kleinian lattices. In this subsection, we describe the relation between the intrinsic Diophantine approximation of a quadratic hypersurface M Q satisfying p Q = p R = 1 and the approximation of points in the boundary of d-dimensional hyperbolic space H d by parabolic fixed points in a lattice Γ ≤ Isom(H d ) which depends on the quadratic hypersurface M Q . Since the latter situation is well-studied, this correspondence can be used to immediately prove the theorems of §2.3 in the case p Q = p R = 1.
(However, our proofs of the theorems of §2.3 in the general case are not dependent on assuming p R > 1, so this subsection can be skipped without any loss of generality.) Let Q : R d+1 → R be a quadratic form with integral coefficients satisfying p Q = p R = 1. Then the signature of Q is either (d, 1) or (1, d). Without loss of generality, we will suppose that its signature is (d, 1). The hyperboloid model of hyperbolic geometry is the set with the Riemannian metric Q ↿ H d (its positive-definiteness is guaranteed by the fact that the signature of Q is (d, 1)). For the equivalence of the hyperboloid model with other standard models of hyperbolic geometry, see e.g. [19].  [30] and the references therein for subsequent generalizations) into the context of quadratic forms, yielding the results of §2.3 in the case p Q = p R = 1. Details are left to the reader.
Proof of (i). Fix ε > 0, and for each [ where r is the unique primitive integral representative of [r]. The fact that H [r] is a horoball centered at [r] follows from the following well-known formula for the Busemann function in the hyperboloid model: , and apply g ∈ O(Q) such that g(x) = w, where w ∈ H d is fixed. Then |B Q w, g(r i ) | < ε, where r i is the primitive integral representative of [r i ]. On the other hand, since Q has signature (d, 1) and Q(w) = −1, we have (6.9) |B Q (w, r)| ≍ × r for all r ∈ L Q .
Proof of (ii

The correspondence principle
In this section we introduce the correspondence principle alluded to in the introduction. It is an intrinsic approximation analogue of the so-called Dani Correspondence for ambient approximation [24,25,21,45,44]. A special case can be found in [46,Theorem 1.5].
Fix d ≥ 2, and let Q : R d+1 → R be a nonsingular quadratic form with integral coefficients. Suppose that $\mathbb P^d_{\mathbb Q} \cap M_Q \neq \emptyset$, or equivalently that p Q ≥ 1. By Proposition 6.10, there exists a matrix φ ∈ SL d+1 (Q) such that R := Q ∘ φ is p Q -normalized. Let Λ * = φ −1 (Z d+1 ). Note that Λ * is commensurable with Z d+1 and that Λ * ∈ Ω R . Moreover, the Q- and R-ranks of Q and R are identical, so denoting these ranks by p Q and p R will not cause ambiguity.
The first version of the correspondence principle gives a relation between the following entities: . (B) Points in Λ pr ∩ L R which are close to L 1 . Here Λ pr denotes the set of primitive vectors of Λ.
In particular, if ψ : (0, ∞) → (0, ∞) is a regular function (cf. Remark 2.12), then In each case, the implied constant can be made independent of g if g is constrained to lie in a bounded subset of O(R).
(iii) Fix p ∈ Λ ∩ L R \ {0}. such that |p 0 | = p (i.e. |p 0 | ≥ dist(p, L 1 )). For t ≥ 0, In particular, letting t(p) = log p / dist(p, L 1 ) we have Proof. Parts (i) and (ii) are straightforward and are left to the reader; the regularity of ψ is used in the deduction of (7.4) from (7.3). The first inequality of (7.5) is an immediate consequence of the definition of g t . To demonstrate the second inequality of (7.5), let q = g t (p), and write q = (q 0 , . . . , q d ). Then To bound |q d |, we use the fact that q ∈ L R , which can be written where R is the remainder of R. Rearranging, we have The second version of the correspondence principle depends on a function ψ : (0, ∞) → (0, ∞), and may be stated as follows: Lemma 7.2 (Correspondence principle, form 2). Let g, [x], and Λ be as in (7.2), and assume that [x] is irrational (equiv. that Λ ∩ L 1 = {0}). Let ψ : (0, ∞) → (0, ∞) be a regular function such that the map q → qψ(q) is nonincreasing and tends to zero. Then Proof. The first asymptotic follows directly from (i) and (ii) of Lemma 7.1. The second asymptotic can be rewritten in a more convenient form using the function Ψ(q) := qψ(q): To demonstrate the direction of (7.8), for each t ≥ 0 choose p t ∈ Λ pr ∩ L R such that ρ R (g t Λ) = g t (p t ) . Then by (7.5), we have dist(p t , L 1 ) ≤ ρ R (g t Λ) and p t ≤ e t ρ R (g t Λ) and thus Here we have used the fact that the function Ψ is nonincreasing. Next, suppose we have a sequence t k → ∞ such that lim k→∞ e −t k ψ(e t k ρR(gtΛ)) < ∞. Since Ψ(q) → 0 as q → ∞, it follows that ρ R (g t k Λ) → 0. In particular dist(p t k , L 1 ) → 0.
To demonstrate the direction of (7.8), suppose that p k ∈ Λ pr ∩ L R is a sequence such that [p k ] → [e 0 ]. For each k, let t k = t(p k ) be defined as in (iii) of Lemma 7.1. Since [p k ] → [e 0 ], we have t k → ∞. On the other hand, by (7.6) we have Letting k → ∞ finishes the proof.
Although the following corollary is not used in our proofs, it is worth pointing out as a direct analogue of Dani's correspondence between bounded orbits and badly approximable vectors/matrices [22, Proposition 2.20] Corollary 7.3. Let g, [x], and Λ be as in (7.2). Then the following are equivalent: is badly intrinsically approximable, i.e.
Proof. Clearly all the above statements are false if [x] is rational. Otherwise, let C be the class of all regular functions ψ such that the map q → qψ(q) is nonincreasing and tends to zero. Then (A) is equivalent to the assertion that the left hand side of (7.7) is positive for all ψ ∈ C, (B) is equivalent to the assertion that the middle of (7.7) is positive for all ψ ∈ C, and (C) is equivalent (by Theorem 6.21) to the assertion that the right hand side of (7.7) is positive for all ψ ∈ C.
Remark 7.4. It is somewhat annoying that Lemma 7.2 requires the assumption that qψ(q) → 0 as q → ∞, so that the Dirichlet function ψ = ψ 1 is ruled out. (If we were allowed to use ψ = ψ 1 , then the proof of Corollary 7.3 could be made even simpler: just consider ψ = ψ 1 rather than all functions ψ ∈ C.) However, this assumption is necessary, as can be seen as follows. It follows from Theorem 6.21 that there exists C > 0 such that ρ R (Λ) ≤ C for all Λ ∈ Ω R,Λ * . This C is a uniform upper bound on the right hand side of (7.7) when ψ = ψ 1 . However, we know that when p Q ≠ p R , there is no uniform upper bound on the left hand side of (7.7); this follows from Theorem 8.1(ii) below. Thus the left and right hand sides cannot be asymptotic. 22
Under the assumption that qψ(q) → 0 as q → ∞, Lemma 7.2 can be used to dynamically describe the sets A MQ (ψ) and WA MQ (ψ):
Corollary 7.5. Let ψ : (0, ∞) → (0, ∞) be a regular continuous function such that the map q → qψ(q) is nonincreasing and tends to zero, let (this is well defined for large enough t), and let Then for every compact set K ⊆ O(R), there exists C > 0 (depending on ψ and K) such that
Proof. Given g ∈ O(R) and [x], Λ as in (7.2), write C([x]) for the left hand side of (7.7) and write C(Λ) for the right hand side of (7.7). Then The conclusion follows. The "consequently" part follows from the regularity of ψ and the elementary computation r εψ (t) = e −t ψ −1 (e −t /ε).
In applying the correspondence principle, the following observations happen to be useful: Observation 7.6. There exists a compact set K ⊆ O(R) such that π 1 (K) = M Q .
Proof. This follows from the facts that M Q is compact, O(R) is locally compact, and π 1 is open and surjective.
We remark that the corresponding assertion is not true for π 2 , since Ω R,Λ * is not compact by Theorem 6.21. Now let µ R and µ R,Λ * denote the Haar measures on O(R) and Ω R,Λ * , respectively.
We remark that Corollary 7.3, the ergodicity of the g t -action on Ω R,Λ * , and the above observation allow one to conclude that the set BA MQ (ψ 1 ) is λ MQ -null. This is a special case of a more general Khintchine-type result -namely Theorem 9.1.

Proof of Theorem 8.1 (Dirichlet's theorem for quadratic hypersurfaces)
In this section we prove the following: Except for the forward direction of (ii) (i.e. uniformly Dirichlet implies p Q = p R ), which we will prove separately (see p.38), all of these results are consequences of the following theorem together with the correspondence principle 24 , namely Lemma 7.2(i,ii) and Observations 7.6 and 7.7. Details are left to the reader.
Theorem 8.2. Fix d ≥ 2, let R be a nonsingular p Q -normalized quadratic form on R d+1 satisfying p Q ≥ 1, and fix Λ * ∈ Ω R commensurable to Z d+1 . Then:
(iii) The following are equivalent: There exist C, T 0 > 0 such that for all Λ ∈ Ω R,Λ * and for all T ≥ T 0 there exists p ∈ Λ ∩ L R \ {0} with ‖p‖ ≤ T such that has positive µ R,Λ * -measure.
23 Note that the measures π 1 [µ R ] and π 2 [µ R ] are not σ-finite; in fact, they are {0, ∞}-valued.
24 However, the correspondence principle cannot be used to deduce Theorem 8.2 from Theorem 8.1 (or similarly, Theorem 9.2 from Theorem 9.1), due to the lack of an analogue of Observation 7.6 for π 2 . Similar considerations prevent the forwards direction of Theorem 8.1(ii) from being deduced from an appropriate analogue in the space of lattices.
Proof of (i). We require the following preliminary result: Then Proof.
The assumption Λ ∩ L Q \ {0} ≠ ∅ implies that p Q (Q; Λ) ≥ 1; thus we may without loss of generality assume that Λ = Z d+1 and that Q is 1-normalized (cf. Proposition 6.10). Then clearly e 0 , e d ∈ Λ ∩ L Q . On the other hand, for each i = 1, . . . , d − 1, we have e i + Q(e i )e 0 − e d ∈ Λ ∩ L Q by direct calculation. Since e 0 , e d ∈ Λ ∩ L Q , this implies e i ∈ Span(Λ ∩ L Q ).
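The direct calculation can be checked mechanically. The following Python sketch assumes that a 1-normalized form has the shape Q(x) = x_0 x_d + Q̃(x_1, ..., x_{d−1}) with integer coefficients; the particular remainder used below and all function names are hypothetical and serve only as an illustration, not as the paper's definitions.

```python
def make_one_normalized(remainder):
    """Build Q(x) = x[0]*x[d] + remainder(x[1:d]) (assumed shape of a 1-normalized form)."""
    def Q(x):
        return x[0] * x[-1] + remainder(x[1:-1])
    return Q


def basis_vector(i, n):
    """Standard basis vector e_i of Z^n."""
    return [1 if j == i else 0 for j in range(n)]


def isotropic_witness(Q, i, n):
    """Return e_i + Q(e_i)*e_0 - e_d, the vector used in the argument above."""
    v = basis_vector(i, n)
    c = Q(basis_vector(i, n))
    v[0] += c       # add Q(e_i) * e_0
    v[-1] -= 1      # subtract e_d
    return v


# Hypothetical integral remainder on Z^2, so d = 3 and the ambient space is Z^4.
def example_remainder(y):
    return 2 * y[0] ** 2 - 3 * y[1] ** 2 + y[0] * y[1]


Q = make_one_normalized(example_remainder)
for i in (1, 2):
    v = isotropic_witness(Q, i, 4)
    print(i, v, Q(v))   # Q(v) vanishes for each i
```

The cancellation is visible in the formula: the cross term contributes Q(e_i)·(−1), which cancels the value Q̃(e_i) = Q(e_i) of the remainder.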
For t ≥ 0, let g t ∈ O(R) be as in (6.5). Applying Mahler's compactness theorem to the lattices (g t Λ) t≥0 , we see that one of the following two cases holds:
Case 1: There exists a sequence t n → ∞ and a sequence g tn Λ ∋ g tn (p n ) → 0. In this case, for all sufficiently large n, (6.6) implies that p n satisfies (8.2). If the set {p n : n ∈ N} is infinite, this completes the proof. Otherwise, there exists p ∈ Λ such that p n = p for arbitrarily large n. In particular, $g_{t_{n_k}}(p) \to 0$ for some increasing sequence $(n_k)_1^\infty$. Comparing with (6.5), we see that p ∈ L 1 . This contradicts our hypothesis that L 1 ∩ Λ = {0}.
Case 2: There exists a sequence t n → ∞ such that $g_{t_n}\Lambda \to \tilde\Lambda \in \Omega_{R,\Lambda_*}$. In this case, by Lemma 8.3 we have $\tilde\Lambda \cap L_R \not\subseteq L_1^\perp$, so we may fix $\tilde p \in \tilde\Lambda \cap L_R \setminus L_1^\perp$. Since $g_{t_n}\Lambda \to \tilde\Lambda$, there is a sequence $g_{t_n}\Lambda \ni g_{t_n}(p_n) \to \tilde p$. Let $C_\Lambda = 2\|\tilde p\|$; then for all sufficiently large n, (6.6) implies that p n satisfies (8.2). If the set {p n : n ∈ N} is infinite, this completes the proof. Otherwise, there exists p ∈ Λ such that p n = p for arbitrarily large n. In particular, $e^{t_{n_k}} \operatorname{dist}(p, L_1^\perp) \to \operatorname{dist}(\tilde p, L_1^\perp) \neq 0$ for some increasing sequence $(n_k)_1^\infty$. This is clearly a contradiction.
Proof of (ii). We require the following lemma:
Lemma 10.11. There exists C 1 > 0 such that for every Λ ∈ Ω R,Λ * , there exists a totally isotropic Λ-rational subspace V ⊆ R d+1 of dimension p Q satisfying Codiam(V ∩ Λ) ≤ C 1 .
The proof of Lemma 10.11 requires reduction theory, so we delay its proof until Section 10. Let C 1 be as in Lemma 10.11. Fix Λ ∈ Ω R,Λ * . For each t ≥ 0, applying Lemma 10.11 to the lattice g t Λ ∈ Ω R,Λ * yields a totally isotropic g t Λ-rational subspace V t ⊆ R d+1 of dimension p Q satisfying Codiam(V t ∩ g t Λ) ≤ C 1 . At this point we divide the proof into two cases:
Case 1: L 1 ⊆ V t for some t ≥ 0. In this case, since the set L (C1) 1 has infinite volume in the vector space V t , by Minkowski's theorem it contains infinitely many lattice points p ∈ V t ∩ Λ ∩ L (C1) 1 . Note that each such p is in L R since V t is totally isotropic. On the other hand, (8.2) is clearly satisfied (with C Λ = C 1 independent of Λ). This completes the proof.
6) implies that p n satisfies (8.2), with C Λ = 3C 1 independent of Λ. If the set {p t : t ≥ 0} is infinite, this completes the proof. Otherwise, there exists p ∈ Λ such that p t = p for arbitrarily large t. However, for all t we have g t (p t ) ∈ V t \ (V t ∩ L ⊥ 1 ) = V t \ L ⊥ 1 , and thus p ∉ L ⊥ 1 . This implies that ‖g t (p)‖ → ∞, a contradiction.
Proof of (iii). For the purpose of this proof, we introduce a new system of coordinates on R d+1 . For We will think of the letters H, W , and L as being short for "height", "width", and "length", respectively. Note that for t ∈ R, In other words, for t ≥ 0, applying g t decreases height and increases length while leaving width fixed. Moreover, where $\tilde R$ is the remainder of R.
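The qualitative behaviour of g t in these coordinates can be illustrated numerically. The sketch below rests on assumptions not confirmed by the text: it takes the hypothetical 1-normalized form x_0 x_3 + x_1^2 + x_2^2 on R^4, models g t as the diagonal map scaling x_0 by e^{−t} and x_3 by e^{t}, and identifies height with |x_0|, length with |x_3|, and width with the Euclidean norm of the middle coordinates.

```python
import math


def form(x):
    """Hypothetical 1-normalized quadratic form on R^4: x0*x3 + x1^2 + x2^2."""
    return x[0] * x[3] + x[1] ** 2 + x[2] ** 2


def flow(t, x):
    """Assumed model of g_t: contract the first coordinate, expand the last."""
    return [math.exp(-t) * x[0], x[1], x[2], math.exp(t) * x[3]]


def height(x):
    return abs(x[0])


def width(x):
    return math.hypot(x[1], x[2])


def length(x):
    return abs(x[3])


x = [2.0, 1.0, -3.0, 0.5]
y = flow(1.5, x)
# The form is preserved, height shrinks, length grows, width is unchanged.
print(form(x), form(y))
print(height(x), height(y), width(x), width(y), length(x), length(y))
```

In this model the product x_0 x_3 is invariant under the flow, which is why the map lies in the orthogonal group of the form.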
We will now rephrase the Diophantine condition on a lattice Λ ∈ Ω R,Λ * described in (B ′ ) and (C ′ ) of Theorem 8.2(iii) as a dynamical condition on the same lattice Λ. Precisely, (C) For all t ≥ log(T 0 ), there exists q ∈ g t Λ ∩ L R \ {0} satisfying q ≤ C 2 max(1, R ) and W (q) ≤ C H(q).
(A) ⇒ (B). Fix T ≥ T 0 , and let t = log(T /C) ≥ (1/2) log(T 0 ). Let q ∈ g t Λ ∩ L R \ {0} be as in (A), and let To demonstrate (8.3), we bound W (p) and L(p). First of all, On the other hand, we have which implies Combining with (8.6) demonstrates (8. In particular, For each C > 0 consider the set Then (B ′ ) and (C ′ ) of Theorem 8.2(iii) are equivalent to the following conditions, respectively:
(B ′′ ) There exists C > 0 such that for all Λ ∈ Ω R,Λ * and for all t ≥ C, g t Λ ∈ F C .
(C ′′ ) The set has positive µ R,Λ * -measure.
Now (B ′′ ) is clearly equivalent to the following:
(B ′′′ ) There exists C > 0 such that F C = Ω R,Λ * .
To complete the proof, we must show that (B ′′′ ) is equivalent to (A).
Proof of (A) ⇒ (B ′′′ ). Since p R = 1, the remainder $\tilde R$ does not represent zero over $\mathbb R$, i.e. it is either positive definite or negative definite. Without loss of generality suppose that it is positive definite. Then $\sqrt{\tilde R}$ is a norm on R d−1 , so there exists K > 0 such that Then for all x ∈ L R , providing an asymptotic converse to (8.5).
Proof. For each i = 0, . . . , p Q − 1, we have e i ∈ Z d+1 ∩ L pQ ⊆ Λ * ∩ L R , and thus g t (e i ) = e −t e i ∈ Λ t ∩ L R . Since Λ t is R-arithmetic, we have On the other hand, Combining with (8.8), we see that
Note that the fact that the group (g t ) t∈R is totally noncompact in G follows from the inequality (π i ) ′ (z) ≠ 0 proven on p.41 of the present paper.
27 Here we abandon the assumption that Λ * is commensurable to Z d+1 .
It follows that the Λ t -rational subspace L pQ + Rp is totally isotropic, and so by the maximality of p Q , we have p ∈ L pQ . ⊳
where E pR is the Euclidean metric on R pR . Such a choice is possible since by assumption p R > 1. Let g φt be given by (6.3) We claim that for all C > 0, there exists t ≥ 0 such that Λ ′ t ∉ γ(F C ); in particular F C ⊊ Ω R,Λ * . Indeed, fix C and t, and suppose we have . If t is large enough (depending on C), then by Claim 8.6 we have p ∈ Γ t and thus q ∈ L pR \ L 1 . In particular, L(q) = 0 but W (q) > 0. This is a contradiction. Thus F C ⊊ Ω R,Λ * for all C > 0, so (B ′′′ ) fails. ⊳
We complete the proof of Theorem 8.1 by demonstrating the forwards direction of (ii).
Proof of Theorem 8.1, forwards direction of (ii). Let V Q be a maximal isotropic Q-subspace of R d+1 , and By contradiction, suppose that ψ 1 is uniformly Dirichlet. This is equivalent to the existence of a constant C > 0 such that for all [x] ∈ M Q , there exist infinitely many r ∈ Z d+1 ∩ L Q satisfying , only finitely many r ∈ V Q ∩ Z d+1 can satisfy (8.9), so there exists r ∈ Z d+1 ∩ L Q \ V Q satisfying (8.9). Let x be the projection of r onto L [x] , so that
‖x − r‖ = dist(r, L [x] ) ≤ C. (8.10)
Let b 1 , . . . , b pQ be a basis of V Q ∩ Z d+1 . Since V R is totally isotropic and x ∈ V R , we have B Q (x, b i ) = 0 for all i = 1, . . . , p Q . Thus and so since Q is Z d+1 -arithmetic, On the other hand, since r ∉ V Q , the maximality of V Q implies that V Q + Rr is not isotropic (it is clearly a Q-subspace). Thus B Q (r, b i ) ≠ 0 for some i = 1, . . . , p Q , i.e. z ≠ 0.
Choose real numbers c 1 , . . . , c pQ linearly independent over Q, and let s = Dividing by t m we have
Remark 8.7. The hypothesis of nonsingularity can be dropped from parts (i) and (ii) of Theorem 8.1, if the hypothesis that P d Q ∩ M Q ≠ ∅ is replaced by the stronger hypothesis that Z d+1 intersects L Q \ (R d+1 ) ⊥ .
Proof. Any singular quadratic form is conjugate to a quadratic form Q : R d+1 → R of the form $Q(x^{(1)}, x^{(2)}) = \tilde Q(x^{(1)})$, where $\tilde Q$ is a nonsingular quadratic form on R m+1 for some m < d. In particular, $L_Q = L_{\tilde Q} \times \mathbb R^{d-m}$. Note that the hypothesis on Q guarantees that $\mathbb P^m_{\mathbb Q} \cap M_{\tilde Q} \neq \emptyset$. Fix [x] ∈ M Q and a representative $x = (x^{(1)}, x^{(2)}) \in L_Q$. Suppose first that $x^{(1)} \neq 0$, and let $r^{(1)} \in \mathbb Z^{m+1} \cap L_{\tilde Q}$ be such that
$\operatorname{dist}(r^{(1)}, \mathbb R x^{(1)}) \le C_{[x^{(1)}]}$. (8.11)
Finally, if p Q = p R , then by using Theorem 8.1(ii) in place of Theorem 8.1(i), the above argument shows that the implied constant is independent of x.
Remark 8.8. The same technique cannot be used to remove the nonsingularity hypothesis from Theorem 9.1 below. Indeed, if we suppose that [x (1) ] ∈ A ψ,M Q for some ψ, then C [x (1) ] will be replaced by εH std ([r])ψ• H std ([r]) in (8.12), but the second term (namely 1) will not be changed. Thus the bound is no better than if we did not know that [x (1) ] ∈ A ψ,M Q .
Remark 8.9. The hypothesis that M Q is rational certainly cannot be dropped from Theorem 8.1. Indeed, Theorem 8.1(i) implies that the set P d Q ∩ M Q is dense in M Q whenever M Q is a nonsingular rational quadratic hypersurface in P d R satisfying P d Q ∩ M Q ≠ ∅. By contrast, if Q is a quadratic form which is not a scalar multiple of any quadratic form with integral coefficients, then P d
Proof. Let π : R → Q be a Q-linear map, and let Q 0 : R d+1 → R be the unique quadratic form so that , and so Q is a scalar multiple of Q 0 . But Q 0 has rational coefficients, and is therefore a scalar multiple of a quadratic form with integral coefficients.

9. Proof of Theorem 9.1 (Khintchine's theorem for quadratic hypersurfaces)
In this section we reduce Theorem 9.1 to an assertion concerning the Haar measure of certain subsets of Ω R,Λ * (Proposition 10.9). Recall that Q 0 denotes the exceptional quadratic form (2.12).
Theorem 9.1 (Khintchine's theorem for quadratic hypersurfaces). Fix d ≥ 2, and let M Q ⊆ P d R be a nonsingular rational quadratic hypersurface satisfying P d Q ∩ M Q ≠ ∅. Fix ψ : N → (0, ∞), and suppose that (I) q → qψ(q) is nonincreasing and tends to zero, and (II) ψ is regular (see Remark 2.12). Then A MQ (ψ) has full measure with respect to λ MQ if and only if the series
Theorem 9.1 can be deduced directly from the following theorem together with the correspondence principle (Corollary 7.5 and Observation 7.7). As before, details are left to the reader. 28
Theorem 9.2. Fix d ≥ 2, let R be a nonsingular p Q -normalized quadratic form on R d+1 , and fix Λ * ∈ Ω R commensurable to Z d+1 . Let ψ : (0, ∞) → (0, ∞) be a continuous function, and suppose that q → qψ(q) is nonincreasing and tends to zero. Let r ψ : (0, ∞) → (0, ∞) and A R (ψ) = A(r ψ , Ω R,Λ * ) be defined as in Corollary 7.5, see (7.10). Then A R (ψ) has full measure with respect to µ R,Λ * if and only if (9.1) diverges; otherwise, A R (ψ) is null with respect to µ R,Λ * .
Proof. The proof of Theorem 9.2 will occupy Sections 9 and 10. In the current section, we reduce Theorem 9.1 to a statement about the asymptotic behavior of the measure µ R,Λ * . Namely, we will deduce Theorem 9.2 as a corollary of one of the main results of [45], which we now recall: Definition 9.3. Let (X, dist X ) be a metric space, let µ be a (finite Borel) measure on X, and let ∆ : X → R be a continuous function. For each z ∈ R let S ∆,z = {x ∈ X : ∆(x) ≥ z} and Φ ∆ (z) = µ(S ∆,z ). Φ ∆ is called the tail distribution function of ∆. We say that ∆ is distance-like if (I) ∆ is uniformly continuous, and (II) Φ ∆ is regular (see Remark 2.12).
Let G be a connected semisimple center-free Lie group without compact factors, and let Γ ≤ G be a lattice. By [60,Theorem 5.22], one can find connected normal subgroups G 1 , . . . , G ℓ ≤ G such that G is the direct product of G 1 , . . . , G ℓ , Γ i := G i ∩ Γ is an irreducible lattice in G i for each i = 1, . . . , ℓ, and $\prod_{i=1}^{\ell} \Gamma_i$ has finite index in Γ. Of course, if Γ is irreducible, then we have ℓ = 1, G 1 = G, and Γ 1 = Γ. Let π 1 , . . . , π ℓ denote the projections from G to the factors G i .
28 It is helpful to notice that the convergence/divergence of the series (9.1) is unaffected by the substitution ψ → Cψ, where C > 0 is a constant.
Theorem 9.4 ([45, Theorem 1.7(a)]). Fix G, Γ, G 1 , . . . , G ℓ as above. Let g denote the Lie algebra of G, and let z ∈ g be an element of a Cartan subalgebra of g. Suppose that (π i ) ′ (z) ≠ 0 for all i = 1, . . . , ℓ. (If G is simple, this just amounts to saying that z ≠ 0.) Let X = G/Γ, let µ X be normalized Haar measure on X, let dist G be a right-invariant Riemannian metric on G, let dist X be the quotient of dist G by Γ, and let ∆ : X → R be a distance-like function. 29 If $(z_t)_1^\infty$ is a sequence in R, then
Remark 9.5. In [45, Theorem 1.7(a)], Γ is assumed to be irreducible, and z is simply assumed to be a nonzero vector in a. However, in [45, § 10.3], the authors of [45] describe how to modify their proof to include the case where Γ is reducible. Incorporating those modifications leads to the above theorem.
For the purposes of this paper, it will be more convenient to deal with the following "continuous" version of Theorem 9.4: Theorem 9.6. Let G, Γ, a, z, X, µ X , ∆ be as in Theorem 9.4. If z : (0, ∞) → (0, ∞) is nondecreasing, then Proof of Theorem 9.6 using Theorem 9.4. Let z (1) t = z(t), and let z (2) t = z(t) − C for some C > 0. To complete the proof it suffices to demonstrate the following: for infinitely many t ∈ N implies e tz (x) ∈ S ∆,z(t) for arbitrarily large t > 0, and (iii) If C is large enough, e tz (x) ∈ S ∆,z(t) for arbitrarily large t > 0 implies e tz (x) ∈ S ∆,z (2) t for infinitely many t ∈ N. Indeed, (i) follows from the fact that Φ ∆ is regular (since ∆ is assumed distance-like), and (ii) is obvious, so we turn to (iii). Suppose that e tz (x) ∈ S ∆,z(t) for some t, and let t ′ = ⌊t⌋. Then dist X e t ′ z (x), e tz (x) ≤ C 1 for some constant C 1 > 0; since ∆ is uniformly continuous, there exists C = C 2 > 0 independent of t so that ∆ e t ′ z (x) − ∆ e tz (x) ≤ C 2 . On the other hand, since z is nondecreasing, z Let O(R) 0 denote the identity component of O(R). We claim that Theorem 9.2 follows from applying Theorem 9.6 with (9.4) Obviously, the verification of this claim consists of two parts: showing that the hypotheses of Theorem 9.6 are satisfied, and showing that Theorem 9.2 follows from the conclusion of Theorem 9.6.
As log ‖g‖, log ‖g −1 ‖ ≤ dist G (I, g) for all g, it follows that ∆ is 1-Lipschitz.
3. Φ ∆ is regular. This will be a consequence of the following asymptotic formula for Φ ∆ (z), whose proof will occupy Section 10:
Proposition 10.9. For z large enough,
Completion of the proof. First, we rewrite (9.3) using (9.4): So to complete the proof, it suffices to show that the series is asymptotic to (9.1). First of all, by Proposition 10.9, we have If Γ is irreducible, then there will actually be only one factor, namely G, and so as before there is nothing to prove. and let k = d − 1. Then we can write both (9.1) and (9.5) in a uniform manner: Since ψ is regular, each of these series is asymptotic to its corresponding integral, that is, Let $\Psi(T) = T^k \log^n \log(T)$. In the following integrals, we omit the finite limit of integration since it is irrelevant for determining whether or not the integral converges. The reader should think of the finite limit of integration as being some arbitrarily large number.
We shall now resort to the following lemma: Proof. The regions whose areas are represented by these integrals are congruent to each other via the map (x, y) → (y, x). ⊳ Applying this lemma with f = ψ k • Ψ −1 , we continue our calculation: Comparing with (9.5), we see that we have proven Theorem 9.2 in the case n = 0, and also for all functions ψ satisfying (9.7) log log ψ −1 (e −t ) ≍ +,× − log r ψ (t).
Remark 9.8. For the remainder of the proof, we could require n = 1 and thus k = 2 to simplify notation somewhat. However, we prefer to keep the original notation.
For each c > 0, let ψ 1,c be defined by the equation i.e.
We now proceed to prove the general case of Theorem 9.2, using Claim 9.9. Fix c 1 > 1/k > c 2 > c 3 > 0.
On the other hand, we have Since the latter set has measure zero, the measures of A R (ψ ′ ) and A R (ψ) are equal. ⊳ So from now on, we assume ψ ≥ ψ 1,c1 . If ψ ≤ ψ 1,c3 , then this completes the proof. So we will assume that ψ(q) > ψ 1,c3 (q) for arbitrarily large q.
Proof. Since q → qψ(q) is assumed to be nonincreasing, we have On the other hand, Since the right hand side of (9.8) tends to infinity as T 2 → ∞, the existence of infinitely large values of T 2 for which the hypotheses of the claim are satisfied implies that i.e. (9.1) diverges at ψ = min(ψ, ψ 1,c2 ). Thus by Claim 9.9, we have µ R,Λ * (A R (min(ψ, ψ 1,c2 ))) = 1.
But since A R (ψ) ⊇ A R (min(ψ, ψ 1,c2 )), this completes the proof.

10. Estimating the measure µ R,Λ *
In this section we estimate $\int \varphi \, d\mu_{R,\Lambda_*}$ for any function φ : Ω R,Λ * → [0, ∞). Our main tools will be the generalized Iwasawa decomposition (Theorem 10.1) and the reduction theory of algebraic groups (Theorem 10.4). We first prove a theorem for general algebraic groups, and then specialize to the case G = O(R) 0 .
Let G be a semisimple algebraic group. Let P ≤ G be a parabolic subgroup, and let P = M AN be a Langlands decomposition of P . Let g, p, m, a, and n denote the corresponding Lie algebras. Let K ≤ G be a maximal compact subgroup whose Lie algebra k is orthogonal to a with respect to the Killing form.
Theorem 10.1 (Generalized Iwasawa decomposition, [49,Proposition 8.44]). Let ρ P be the modular function of P . Then given any Haar measures µ K , µ M , µ A , µ N on K, M , A, N respectively, the measure m, a, n) is a Haar measure on G. Now suppose that G is Q-algebraic and that P ≤ G is a minimal parabolic Q-subgroup. Let Γ ≤ G be a lattice commensurable to G Z . Here Ad a denotes the adjoint action of a. Let dist G denote a right-invariant Riemannian metric on G. Let X = G/Γ, and consider the metric dist X (x, x ′ ) = min gΓ=x,g ′ Γ=x ′ dist G (g, g ′ ). We note that dist X is a Riemannian metric on X. Let µ X denote normalized Haar measure on X.
Theorem 10.4. There exist C > 0 and a finite set F ⊆ G Q such that for any function φ : X → [0, ∞), we have Proof. Let M 0 ⊆ M , N 0 ⊆ N , and F ⊆ G Q be as in Theorem 10.3, and let F be given by (10.2).
Now by Theorem 10.1, Since N is contracted by the adjoint action of A + , the set {ana −1 : a ∈ A + , n ∈ N 0 } is precompact and thus C < ∞. For k ∈ K, m ∈ M 0 , a ∈ A + , and n ∈ N 0 fixed, we have Thus by (10.5), m, a, n).
Now since K, M 0 , and N 0 are open and precompact we have
Theorem 10.6. Let R : R d+1 → R be a p Q -normalized quadratic form, and suppose that Λ * ∈ Ω R is commensurable with Z d+1 . Let There exists C > 0 such that for any monotonic function φ : Ω R,Λ * → [0, ∞), we have
Proof. Let G = O(R) 0 and let Γ = O(R; Λ * ) ∩ O(R) 0 . Then G is a semisimple Q-algebraic group, and Γ is commensurable with G Z . For t ∈ R pQ , let Φ(t) = g t be as in (6.4), so that Φ : R pQ → G is a homomorphism. Let A = Φ(R pQ ). Then the Lie algebra a of A is isomorphic to R pQ via the map Φ ′ (0). In our notation, we will not distinguish between a and R pQ . Let
a + = {t ∈ R pQ : t 0 > t 1 > . . . > t pQ−1 > 0} ⊆ a, (10.9)
and let A + = exp(a + ). Then A is a maximal Q-split torus, and A + is a Weyl chamber in A. Fix a ∈ A + , and let N ≤ G and P ≤ G be the groups
N := {g ∈ G : a n ga −n → 0 as n → ∞}, P := {g ∈ G : (a n ga −n ) ∞ 1 is bounded},
i.e. N is the group contracted by A + , and P is the group stabilized by A + . Then P is a minimal parabolic Q-subgroup of G whose Langlands decomposition is P = M AN for some reductive group M ≤ P . Moreover, A + is given by the formula (10.1). So by Theorem 10.4, there exist C > 0 and a finite set F ⊆ G Q such that for any φ : Ω R,Λ * → [0, ∞), we have (10.10)
Claim 10.7. For some C ′ > 0,
Proof. For f ∈ F ⊆ G Q fixed, f Λ * is commensurable with Λ * , and thus 1 In particular, since φ is monotonic Thus (10.11) Thus (10.10) becomes (10.12)
Proof. It is well-known (e.g. [49, (8.38)] 32 ) that ρ P (g t ) = e −ρ(t) , where ρ is the sum of the positive roots of A, counting multiplicity. So to demonstrate the claim, we must show that ρ = s T . One verifies that the positive roots of A are of the form with corresponding root spaces : (x pQ , . . . , x d−pQ ) ∈ R d+1−2pQ }. In particular, the multiplicity of the root λ i,j,± is 1, and the multiplicity of the root λ i is (d + 1 − 2p Q ).
Thus The sign difference between [49, (8.38)] and the present formula is due to Knapp's convention of assuming that n is the union of the positive root spaces, while we assume that n is the union of the negative root spaces (cf. (10.1)).
Claim 10.10. For x ≥ 1, Here s · t denotes $\sum_{i=1}^{p_Q-1} s_i t_i$.
Proof. If p Q = 1, then the domain of integration is zero-dimensional, making the statement trivial. Thus suppose p Q ≥ 2. If d = 3, then Proposition 6.10 implies that R ∼ Q 0 . So if R ≁ Q 0 , then d ≥ 4 and in particular s 1 = d − 3 > 0. Since s i ≥ 0 for all i, we have ⊳
Let n be given by (9.6), so that Integrating over t 0 > z gives $f(z) \asymp_\times \int_{t_0 > z} e^{-s_0 t_0} t_0^n \, dt_0 \asymp_\times e^{-s_0 z} z^n = e^{-(d-1)z} z^n$, demonstrating (10.13).
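The last asymptotic, that the integral of e^{−s t} t^n over t > z is comparable to e^{−s z} z^n, can be made concrete for integer n. The sketch below uses the closed form obtained by repeated integration by parts (a standard identity, not taken from the text); the function names are hypothetical. It shows that for z ≥ 1 the ratio of the integral to e^{−s z} z^n stays between 1/s and a constant depending only on s and n.

```python
import math


def tail_integral(s, n, z):
    """Exact value of the integral of exp(-s*t) * t**n over t > z, for s > 0, z > 0
    and a nonnegative integer n, via repeated integration by parts:
    exp(-s*z) * sum_{k=0}^{n} n!/(n-k)! * z**(n-k) / s**(k+1)."""
    total = 0.0
    for k in range(n + 1):
        coeff = math.factorial(n) // math.factorial(n - k)
        total += coeff * z ** (n - k) / s ** (k + 1)
    return math.exp(-s * z) * total


def ratio(s, n, z):
    """Ratio of the tail integral to the comparison function exp(-s*z) * z**n."""
    return tail_integral(s, n, z) / (math.exp(-s * z) * z ** n)


# With s = 2 and n = 1 the ratio equals 1/2 + 1/(4 z), which decreases to 1/s.
for z in (1.0, 10.0, 100.0):
    print(z, ratio(2.0, 1, z))
```

In the setting above one takes s = s_0 = d − 1, so the comparison function is e^{−(d−1)z} z^n.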
We end this section by proving a lemma which was needed in the proof of Theorem 8.1(ii,iii):
Lemma 10.11. There exists C 1 > 0 such that for every Λ ∈ Ω R,Λ * , there exists a totally isotropic Λ-rational subspace V ⊆ R d+1 of dimension p Q satisfying Codiam(V ∩ Λ) ≤ C 1 .
Let V 0 = L pQ , and let V = h(V 0 ). We observe that V 0 is a totally isotropic af Λ * -rational subspace of R d+1 of dimension p Q , and thus V is a totally isotropic Λ-rational subspace of R d+1 of dimension p Q .

The exceptional quadratic hypersurface
Recall that the exceptional quadratic hypersurface is the hypersurface M Q0 defined by the quadratic form Q 0 (x) = x 0 x 3 − x 1 x 2 . This hypersurface occupies an interesting place in the theory of intrinsic Diophantine approximation on quadratic hypersurfaces developed in this paper. To begin with, it has "more rational points than expected". Specifically,
N MQ 0 (T ) ≍ × T 2 log(T ), (11.1)
rather than N MQ (T ) ≍ × T 2 , which holds when Q is a quadratic form on R 4 which is not equivalent to Q 0 . Nevertheless, these "extra points" do not appear to affect either the Dirichlet or Khintchine theories of these manifolds in quite the way one would expect. With regards to the Dirichlet theory, the extra points have no effect at all, and the optimal Dirichlet function for M Q is always ψ 1 , independent of whether or not Q ∼ Q 0 . On the other hand, the extra points do affect the Khintchine theory, but not as expected: they introduce a factor of log log(T ) into the series (9.1), rather than a factor of log(T ) as a naive application of the Borel-Cantelli lemma would predict. It is natural to ask whether these extraordinary properties of the exceptional quadratic hypersurface are due to special algebraic properties. This turns out to be the case; in this section we make this special structure explicit, and use this explicitness to derive elementary proofs both of (11.1) and of the convergence case of Theorem 9.1 for the manifold M Q0 .
We begin by describing the special algebraic property which leads to the results outlined above: The manifold M Q0 is isomorphic to P 1 R ×P 1 R , with the isomorphism given by the Segre embedding Φ : P 1 R ×P 1 R → P 3 R defined by the formula Φ([x 0 : x 1 ], [y 0 : y 1 ]) = [x 0 y 0 : x 0 y 1 : x 1 y 0 : x 1 y 1 ].
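That the Segre embedding lands in M Q0 is a one-line identity: (x_0 y_0)(x_1 y_1) − (x_0 y_1)(x_1 y_0) = 0. The following Python sketch checks it on integer representatives; the function names are illustrative only.

```python
def segre(x, y):
    """Segre embedding on representatives: ([x0:x1], [y0:y1]) -> [x0y0 : x0y1 : x1y0 : x1y1]."""
    return (x[0] * y[0], x[0] * y[1], x[1] * y[0], x[1] * y[1])


def Q0(z):
    """The exceptional quadratic form Q0(z) = z0*z3 - z1*z2."""
    return z[0] * z[3] - z[1] * z[2]


z = segre((2, 3), (5, -7))
print(z, Q0(z))   # the image is (10, -14, 15, -21) and Q0 vanishes on it
```

Rescaling either factor rescales the image, so the map is well defined on projective points.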
Thus M Q0 has a "product structure". This explains why the lattice O(Q 0 ; Z) ∩ O(Q 0 ) 0 factors as SL 2 (Z) × SL 2 (Z); each factor of SL 2 (Z) acts on a different copy of P 1 R . Note that the product structure of M Q0 is consistent with its Diophantine structure. More precisely, the set of intrinsic rationals P 3 Q ∩ M Q0 factors as P We are now ready to begin proving statements about the manifold M Q0 by using the decomposition M Q0 ≡ P 1 R × P 1 R . We begin by computing the number of rationals up to a given height: An elementary proof of (11.1). It is well-known that (2 N ) 2 (by (11.3)) = (2 N ) 2 (N + 1) ≍ × (2 N ) 2 log(2 N ), demonstrating (11.1) in the case T ∈ 2 . The general case follows from a standard approximation argument.
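The count (11.1) can be explored numerically through the product structure. The sketch below assumes the max-norm height on primitive integer representatives, under which the height of a Segre image is the product of the heights of its two factors; the paper's H std may use a different norm, which would change the counts only by bounded factors. All function names are hypothetical.

```python
from math import gcd, log


def p1_points_of_height(h):
    """Number of points [a : b] in P^1(Q) whose primitive representative has max(|a|, |b|) == h."""
    count = 0
    for a in range(-h, h + 1):
        for b in range(-h, h + 1):
            if max(abs(a), abs(b)) == h and gcd(a, b) == 1:
                count += 1
    return count // 2   # (a, b) and (-a, -b) represent the same point


def segre_points_up_to(T):
    """Number of pairs in P^1(Q) x P^1(Q) with product of heights at most T,
    i.e. rational points of the Segre quadric of (max-norm) height at most T."""
    c = [0] + [p1_points_of_height(h) for h in range(1, T + 1)]
    return sum(c[h1] * c[h2]
               for h1 in range(1, T + 1)
               for h2 in range(1, T // h1 + 1))


# Compare the count with T^2 log(T) for a few values of T.
for T in (8, 16, 32, 64):
    N = segre_points_up_to(T)
    print(T, N, N / (T * T * log(T)))
```

The logarithmic factor arises from summing over all ways of splitting the height T between the two factors, which is the same mechanism as in the dyadic computation above.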
Next, we give an elementary proof of the convergence case of Theorem 9.1 for the manifold M Q0 . This proof will give insight as to why in this case Theorem 9.1 does not simply state the converse of the (naive) Borel-Cantelli lemma; cf. Remark 11.3.
Remark 11.2. In the following proof, we will assume that ψ is regular and that qψ(q) → 0, but we do not need to assume that q → qψ(q) is nonincreasing, as was assumed in the proof of Theorem 9.1.
Thus, for any function ψ satisfying (11.9) log 1 qψ(q) × log log q , we have (11.5) × (11.4), and thus the conclusion of Theorem 9.1 holds in the convergence case for such ψ.
Remark 11.3. There are two important points to be made about the above proof. The first point is that the calculation (11.8) indicates what the nontrivial relation is which causes the series (11.4) to differ from (2.14). Indeed, (11.8) shows that if n ≤ N + log 2 ψ(2 N ) or n ≥ − log 2 ψ(2 N ), then we are better off computing λ × λ(A n,N ) not by simply adding the measures of the squares which define A n,N , but by estimating the measure of A n,N in terms of the rectangles B Z n , ψ(2 N ) × P 1 R or P 1 R × B Z N −n , ψ(2 N ) , respectively. Inside each rectangle are many overlapping squares, and this overlap is what causes the difference in the series.
The second point is that we should not expect there to be a difference in series for the Jarník-Besicovitch theorem if s < k. Indeed, the same argument would work up until the point where the inequality (11.9) is required. But when s < k, then the ψ which we "expect to see" (i.e. those which are near the boundary of convergence/divergence) will satisfy log 1 qψ(q) ≍ × log q rather than (11.9). Thus the "refined argument" for the convergence case produces in this case the same series (2.14).