Diophantine Approximation and applications in Interference Alignment

This paper is motivated by recent applications of Diophantine approximation in electronics, in particular, in the rapidly developing area of Interference Alignment. Some remarkable advances in this area give substantial credit to the fundamental Khintchine-Groshev Theorem and, in particular, to its far reaching generalisation for submanifolds of a Euclidean space. With a view towards the aforementioned applications, here we introduce and prove quantitative explicit generalisations of the Khintchine-Groshev Theorem for non-degenerate submanifolds of $\mathbb{R}^n$. The importance of such quantitative statements is explicitly discussed in Section 4.7.1 of Jafar's monograph `Interference Alignment - A New Look at Signal Dimensions in a Communication Network', Foundations and Trends in Communications and Information Theory, Vol. 7, no. 1, 2010.


Introduction
The present paper is motivated by a recent series of publications, including [11,13,15,16,17,18,22,23,24], which utilize the theory of metric Diophantine approximation to develop new approaches in interference alignment, a concept within the field of wireless communication networks. This new link is both surprising and striking.
The key ingredient from the number theoretic side is the fundamental Khintchine-Groshev Theorem and its variations. In this paper we seek to address certain problems in Diophantine approximation which crop up in, or impinge upon, the applications to interference alignment. The results obtained represent quantitative refinements of the Khintchine-Groshev Theorem that are relevant to the applications mentioned above. Indeed, the desirability of such quantitative statements is explicitly alluded to in Jafar's monograph [13, §4.7.1].
Although the main emphasis will be on the Khintchine-Groshev Theorem for submanifolds of $\mathbb{R}^n$, we begin by considering the classical theory for systems of linear forms of independent variables. This approach has two benefits. Firstly, we are able to introduce the key ideas without too much technical machinery obscuring the picture. Secondly, the refinements of the classical theory produce effective results with much better constants.
In order to recall Khintchine's theorem we first define the set $W(\psi)$ of $\psi$-well approximable numbers. To this end, denote by $\mathbb{R}^+$ the set of non-negative real numbers. Given a real positive function $\psi : \mathbb{R}^+ \to \mathbb{R}^+$ with $\psi(r) \to 0$ as $r \to \infty$, let
$$W(\psi) := \{x \in \mathbb{R} : |qx - p| < \psi(q) \text{ for i.m. } (q, p) \in \mathbb{N} \times \mathbb{Z}\},$$
where 'i.m.' reads 'infinitely many'. For obvious reasons the function $\psi$ is often referred to as an approximating function. The points $x$ in $W(\psi)$ are characterized by the property that they admit approximation by rational points $p/q$ with error less than $\psi(q)/q$.
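The defining inequality of $W(\psi)$ can be explored numerically. The following is a minimal sketch, not part of the paper: the function name `approximation_witnesses`, the choice $\psi(q) = 1/q$ and the test point $\sqrt{2}$ are illustrative assumptions. For each denominator $q$ only the nearest integer $p$ to $qx$ is tested, since it minimises $|qx - p|$; a finite search of this kind can of course only suggest, never prove, membership in $W(\psi)$.

```python
import math

def approximation_witnesses(x, psi, q_max):
    """For each 1 <= q <= q_max, record (q, p) with p the nearest integer
    to q*x, whenever |q*x - p| < psi(q)."""
    witnesses = []
    for q in range(1, q_max + 1):
        p = round(q * x)  # the nearest integer minimises |q*x - p|
        if abs(q * x - p) < psi(q):
            witnesses.append((q, p))
    return witnesses

def psi(q):
    # Hypothetical approximating function chosen for illustration only.
    return 1.0 / q

hits = approximation_witnesses(math.sqrt(2), psi, 1000)
print(len(hits), hits[:5])
```

The convergents 1/1, 3/2, 7/5, 17/12, ... of the continued fraction of $\sqrt{2}$ appear among the recorded pairs, as expected from Dirichlet's theorem.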
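A simple 'volume' argument together with the Borel-Cantelli Lemma from probability theory implies that
$$|W(\psi)| = 0 \quad \text{if} \quad \sum_{q=1}^{\infty} \psi(q) < \infty,$$
where $|X|$ stands for the Lebesgue measure of $X \subset \mathbb{R}$. The above convergence statement represents the easier part of the following beautiful result due to Khintchine, which gives a criterion for the size of the set $W(\psi)$ in terms of Lebesgue measure. In what follows, we say that $X \subset \mathbb{R}$ is full in $\mathbb{R}$ and write $|X| = \mathrm{FULL}$ if $|\mathbb{R} \setminus X| = 0$; that is, the complement of $X$ in $\mathbb{R}$ is of Lebesgue measure zero. The following is a slightly more general version of Khintchine's theorem, see [4].
Theorem A. Let $\psi$ be a monotonic approximating function. Then $|W(\psi)| = 0$ if $\sum_{q=1}^{\infty} \psi(q) < \infty$, and $|W(\psi)| = \mathrm{FULL}$ if $\sum_{q=1}^{\infty} \psi(q) = \infty$.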
Thus, given any monotonic approximating function $\psi$, for almost all $x \in \mathbb{R}$ the inequality $|x - p/q| < \psi(q)/q$ holds for infinitely many rational numbers $p/q$ if and only if the sum $\sum_{q=1}^{\infty} \psi(q)$ diverges. There are various generalisations of Khintchine's theorem to higher dimensions; see [3] for an overview. Here we shall consider the case of systems of linear forms, which originates from a paper by Groshev in 1938. In what follows, $m$ and $n$ will denote positive integers and $M_{m,n}$ will stand for the set of $m \times n$ matrices over $\mathbb{R}$. Given a function $\Psi : \mathbb{Z}^n \to \mathbb{R}^+$, let
$$W_{m,n}(\Psi) := \{X = (x_{i,j}) \in M_{m,n} : \|Xa\| < \Psi(a) \text{ for i.m. } a \in \mathbb{Z}^n \setminus \{0\}\},$$
where $a = (a_1, \dots, a_n)$, $\|Xa\| := \max_{1 \le i \le m} \|x_{i,1}a_1 + \dots + x_{i,n}a_n\|$ and $\|x\| := \min\{|x - k| : k \in \mathbb{Z}\}$ is the distance of $x \in \mathbb{R}$ from the nearest integer. Given a subset $X$ in $M_{m,n}$, we will write $|X|_{mn}$ for its ambient (i.e. $mn$-dimensional) Lebesgue measure. It is easily seen that $W_{1,1}(\Psi)$ coincides with $W(\psi)$ when $\Psi(q) = \psi(|q|)$. Therefore the following result is the natural extension of Theorem A to higher dimensions. Notice that there is no monotonicity assumption on the approximating function.
Theorem B. Let $m, n \in \mathbb{N}$ with $nm > 1$, let $\psi : \mathbb{N} \to \mathbb{R}^+$ be an approximating function and let $\Psi : \mathbb{Z}^n \to \mathbb{R}^+$ be given by $\Psi(a) := \psi(|a|)$ for $a = (a_1, \dots, a_n) \in \mathbb{Z}^n \setminus \{0\}$, where $|a| := \max\{|a_1|, \dots, |a_n|\}$. Then $|W_{m,n}(\Psi)|_{mn} = 0$ if $\sum_{q=1}^{\infty} q^{n-1}\psi(q)^m < \infty$, and $|W_{m,n}(\Psi)|_{mn} = \mathrm{FULL}$ if $\sum_{q=1}^{\infty} q^{n-1}\psi(q)^m = \infty$.
Theorem B was first obtained by Groshev under the assumption that $q^n\psi(q)^m$ is monotonic in the case of divergence. The redundancy of the monotonicity condition for $n \ge 3$ follows from Schmidt's paper [19, Theorem 2] and for $n = 1$ from Gallagher's paper [10]. Theorem B as stated was eventually proved in [6], where the remaining case of $n = 2$ was addressed. The convergence case of Theorem B is a relatively simple application of the Borel-Cantelli Lemma and it holds for arbitrary functions $\Psi$. Thus, together with Theorem A, we have the following extremely general statement in the case of convergence.
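Theorem C. Let $m, n \in \mathbb{N}$ and $\Psi : \mathbb{Z}^n \to \mathbb{R}^+$ be any function such that the sum
$$\sum_{a \in \mathbb{Z}^n \setminus \{0\}} \Psi(a)^m \qquad (2)$$
converges. Then $|W_{m,n}(\Psi)|_{mn} = 0$.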
An immediate consequence of Theorem C is the following statement.
Corollary 1. Let $\Psi$ be as in Theorem C. Then, for almost every $X \in M_{m,n}$ there exists a constant $\kappa(X) > 0$ such that
$$\|Xa\| > \kappa(X)\Psi(a) \quad \text{for all } a \in \mathbb{Z}^n \setminus \{0\}.$$
In recent years estimates of this kind have become an important ingredient in the study of the achievable number of degrees of freedom in various schemes of Interference Alignment from electronics communication; see, e.g., [15]. The applications typically require that $\kappa(X)$ is independent of $X$. Unfortunately, this is impossible to guarantee with probability 1, that is, on a set of full Lebesgue measure. To demonstrate this claim, let us define the following set:
$$B_{m,n}(\Psi, \kappa) := \{X \in M_{m,n} : \|Xa\| > \kappa\Psi(a) \text{ for all } a \in \mathbb{Z}^n \setminus \{0\}\}.$$
Then, for any $\kappa$ and $\Psi$, the set $B_{1,n}(\Psi, \kappa)$ will not contain $[-\kappa\Psi(a), \kappa\Psi(a)] \times \mathbb{R}^{n-1}$ with $a = (1, 0, \dots, 0)$. This set is of positive probability. In the light of this example it becomes highly desirable to address the following problem.
Problem. Investigate the dependence between $\kappa$ and the probability of $B_{m,n}(\Psi, \kappa)$.
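To get a feeling for how the constant $\kappa(X)$ of Corollary 1 varies with $X$, one can compute an upper bound for it over a finite range of integer vectors. The sketch below is an illustration under stated assumptions and is not taken from the paper: the helper names `dist_to_nearest_int`, `norm_Xa` and `kappa_upper_bound`, the cut-off `height` and the sample function $\Psi(a) = 1/|a|$ are all hypothetical. If a lower bound of the form $\|Xa\| \ge \kappa\Psi(a)$ holds for all non-zero $a$, then $\kappa$ cannot exceed the minimum of $\|Xa\|/\Psi(a)$ over any finite set of $a$ with $\Psi(a) > 0$, which is what the code returns.

```python
import math
from itertools import product

def dist_to_nearest_int(x):
    """||x||: distance from the real number x to the nearest integer."""
    return abs(x - round(x))

def norm_Xa(X, a):
    """||Xa||: maximum over the rows of X of the distance of the
    linear form x_{i,1} a_1 + ... + x_{i,n} a_n to the nearest integer."""
    return max(
        dist_to_nearest_int(sum(x * c for x, c in zip(row, a)))
        for row in X
    )

def kappa_upper_bound(X, Psi, height):
    """Minimum of ||Xa|| / Psi(a) over non-zero integer vectors a with
    sup-norm at most `height` (an illustrative cut-off, not from the paper)."""
    n = len(X[0])
    best = float("inf")
    for a in product(range(-height, height + 1), repeat=n):
        if any(a) and Psi(a) > 0:
            best = min(best, norm_Xa(X, a) / Psi(a))
    return best

def Psi(a):
    # Hypothetical approximating function: Psi(a) = 1 / max|a_i|.
    return 1.0 / max(abs(c) for c in a)

print(kappa_upper_bound([[math.sqrt(2)]], Psi, 50))  # bounded away from 0
print(kappa_upper_bound([[0.5]], Psi, 3))            # rational entry forces 0
```

The second call illustrates why no uniform $\kappa$ can work on a set of full measure: matrices close to rational ones force the bound towards zero.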
As the first step to understanding this problem we obtain the following straightforward consequence of Theorem C.
Theorem 1. Let $m, n \in \mathbb{N}$ and $\mu$ be a probability measure on $M_{m,n}$ that is absolutely continuous with respect to Lebesgue measure on $M_{m,n}$. Let $\Psi : \mathbb{Z}^n \to \mathbb{R}^+$ be any function such that (2) converges. Then for any $\delta \in (0, 1)$ there is a constant $\kappa > 0$ depending only on $\mu$, $\Psi$ and $\delta$ such that
$$\mu(B_{m,n}(\Psi, \kappa)) \ge 1 - \delta. \qquad (5)$$
Prior to giving a proof of this theorem recall that a measure $\mu$ on $M_{m,n}$ is absolutely continuous with respect to Lebesgue measure if there exists a Lebesgue integrable function $f : M_{m,n} \to \mathbb{R}^+$ such that for every Lebesgue measurable subset $A$ of $M_{m,n}$, one has that
$$\mu(A) = \int_A f,$$
where $\int_A f$ is the Lebesgue integral of $f$ over $A$. The function $f$ is often referred to as the distribution (or density) of $\mu$.
Proof. Since $\mu$ is absolutely continuous with respect to Lebesgue measure, Theorem C implies that $\mu(W_{m,n}(\Psi)) = 0$. Theorem 1 now follows on using the continuity of measures.
In view of our previous discussion we have that $\kappa \to 0$ as $\delta \to 0$. Then, the above problem specialises to the explicit understanding of the dependence of $\kappa$ on $\delta$. This will be the main content of the next section. Subsequent sections will be devoted to obtaining a similar effective version of the convergence Khintchine-Groshev Theorem for non-degenerate submanifolds of $\mathbb{R}^n$. This constitutes the main substance of the paper. The results are obtained by exploiting the techniques of Bernik, Kleinbock and Margulis [8] originating from the seminal work of Kleinbock and Margulis [14] on the Baker-Sprindžuk conjecture.

The theory for independent variables
To begin with we give an alternative proof of Theorem 1 which introduces an explicit construction that will be utilized for quantifying the dependence of $\kappa$ on $\delta$. Indeed, in the case that $\mu$ is a uniform distribution on a unit cube the proof already identifies the required dependence.
2.1.Theorem 1 revisited.By a unit cube in M m,n we will mean a subset of M m,n given by ) It is easily seen that W(a, ε) is invariant under additive translations by an integer matrix; that is, W(a, ε) + B = W(a, ε) for any B ∈ M m,n (Z), where M m,n (Z) denotes the set of m × n matrices with integer entries.Furthermore, we have that for any 0 ≤ ε ≤ 1 2 and any unit cube P in M m,n .This follows, for example, from [20, Chapter 1, Lemma 8].Then, since we must have that In what follows we will assume that 2κM Ψ ≤ 1 . ( This condition ensures that we can apply (8) with ε = κΨ(a).
Fix a unit cube P 0 in M m,n and for each ∆ ∈ M m,n (Z), let denote the additive translation of P 0 by ∆.Clearly, P ∆ itself is a unit cube.Furthermore, Note that the union is disjoint.Using (8) and the fact that we obtain that for each ∆ ∈ M m,n (Z), Since µ is a probability measure, it follows from ( 12) that there exists a finite subset Let N = #A be the number of elements in A. Since µ is absolutely continuous with respect to Lebesgue measure, for every ∆ ∈ A and any ε 1 > 0, there exists ε 2 such that for any measurable subset X of P ∆ , In view of (13), applying (15) to In particular, the second inequality in ( 16) holds if Since A is finite, there exists κ satisfying (11) and Clearly, for such a choice of κ the first inequality in ( 16) holds for any ∆ ∈ A. Hence, by (14) and the additivity of µ we obtain that The upshot of this is that which completes the proof of Theorem 1.
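The 'volume' argument behind the proof can be checked by hand in the simplest case $m = n = 1$ with the uniform distribution on $[0, 1]$. The following sketch is an illustration only: the function names, the sample function $\Psi(q) = 1/q^2$, the value $\kappa = 1/10$ and the cut-off $q \le 20$ are assumptions made for the example. It computes, in exact rational arithmetic, the Lebesgue measure of the set of $x \in [0, 1]$ with $\|qx\| \le \varepsilon$ (which is $2\varepsilon$ when $\varepsilon \le 1/2$) and compares the measure of the union over $q$ with the union bound $2\kappa \sum_q \Psi(q)$.

```python
from fractions import Fraction

def bad_intervals(q, eps):
    """Closed intervals covering {x in [0, 1] : ||q x|| <= eps}, q >= 1."""
    eps = Fraction(eps)
    pieces = []
    for p in range(q + 1):
        lo = max(Fraction(0), (p - eps) / q)
        hi = min(Fraction(1), (p + eps) / q)
        pieces.append((lo, hi))
    return pieces

def union_length(intervals):
    """Total length of a union of closed intervals (overlaps counted once)."""
    total = Fraction(0)
    cur_lo = cur_hi = None
    for lo, hi in sorted(intervals):
        if cur_hi is None or lo > cur_hi:
            if cur_hi is not None:
                total += cur_hi - cur_lo
            cur_lo, cur_hi = lo, hi
        else:
            cur_hi = max(cur_hi, hi)
    if cur_hi is not None:
        total += cur_hi - cur_lo
    return total

def measure_of_bad_set(psi, kappa, q_max):
    """Measure of {x in [0, 1] : ||q x|| <= kappa * psi(q) for some q <= q_max}."""
    pieces = []
    for q in range(1, q_max + 1):
        pieces.extend(bad_intervals(q, kappa * psi(q)))
    return union_length(pieces)

def psi(q):
    # Hypothetical approximating function used for illustration only.
    return Fraction(1, q * q)

kappa = Fraction(1, 10)
single = union_length(bad_intervals(7, Fraction(1, 10)))  # equals 2 * eps = 1/5
exact = measure_of_bad_set(psi, kappa, 20)
bound = 2 * kappa * sum(psi(q) for q in range(1, 21))
print(single, float(exact), float(bound))
```

The exact measure is strictly smaller than the union bound because the intervals attached to different denominators overlap; the proofs in this section only use the bound.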

2.2. Quantifying the dependence of $\kappa$ on $\delta$. We now turn our attention to quantifying the dependence of $\kappa$ on $\delta$ within the context of Theorem 1. To this end, we will make use of the $L^p$ norm. Given a Lebesgue measurable function $f : M_{m,n} \to \mathbb{R}^+$, a measurable subset $X$ of $M_{m,n}$ and $p \ge 1$, we write $f \in L^p(X)$ if the Lebesgue integral exists and is finite. Here $\chi_X$ is the characteristic function of $X$. For $f \in L^p(X)$, the $L^p$ norm of $f$ on $X$ is defined by The following lemma gathers together two well known facts regarding the $L^p$ norm.
Lemma 2. Let p > 1 and µ be a probability measure on M m,n with density f .Let X be a Lebesgue measurable subset of Proof.By definition, we have that Define q by the equation 1 p + 1 q = 1.Then by Hölder's inequality, we have that We are now in the position to provide an effective version of Theorem 1.Let P 0 and A be the same as in §2.1.In particular, assume that (14) holds.Furthermore, assume that there exists some p > 1 such that for every ∆ ∈ A, the density f of µ has finite L p norm on P ∆ .
Let κ be such that (11) is satisfied.In this case, (13) holds for every P ∆ with ∆ ∈ A. By Lemmas 1 and 2, Using (13), we obtain that where Σ Ψ is given by (9).It follows that Since A is finite, the quantity Σ f is also finite.The upshot of the above discussion is the following statement.
Theorem 2 (Effective version of Theorem 1).Let m, n ∈ N, µ, Ψ be as in Theorem 1, let M Ψ be given by (10) and let f denote the density of µ.Furthermore, let P 0 be any unit cube in M m,n and A be any finite subset of M m,n (Z) satisfying (14).Assume there exists p > 1 such that f ∈ L p (P ∆ ) for any ∆ ∈ A and also assume that the quantity Σ f is given by (19).Then, for any δ ∈ (0, 1), inequality (5) holds with In this formula, the quotient p/(p − 1) should be taken as equal to 1 when p = ∞.
Remark 1. In the case when $\Psi$ is even, that is $\Psi(-a) = \Psi(a)$ for all $a \in \mathbb{Z}^n \setminus \{0\}$, one can improve formula (20) for $\kappa$ by replacing $\Sigma_\Psi$ with $\frac{1}{2}\Sigma_\Psi$. This is an obvious consequence of the fact that in this case the sets $W(a, \kappa\Psi(a))$ and $W(-a, \kappa\Psi(-a))$ coincide and therefore do not have to be counted twice within the proof.
There are various simplifications and specialisations of Theorem 2 when we have extra information regarding the measure µ.The following is a natural corollary which is particularly relevant for probability measures µ with bounded distribution f and mean value about the origin.
Consider now Corollary 2 in the case when $m = n = 1$ and when $\mu$ follows the standard Gaussian distribution $N(0, 1)$. It can then be verified that Corollary 2 implies that inequality (5) holds with where Here $\lceil x \rceil$ is the "ceiling" of $x$, that is, the smallest integer that is bigger than or equal to $x \in \mathbb{R}$.
We now consider explicit approximating functions. First, let $\Psi$ be the function given by $\Psi(q) = 0$ if $q \le 0$, Then $\Sigma_\Psi < 1.555$ and on making use of (23) we obtain the following table for values of $N$ and $\kappa$: It follows for instance from this set of data that for 99% of the values of the random variable $x$ with normal distribution $N(0, 1)$, one has that $\|qx\| > \frac{1}{2000} \cdot \Psi(q)$ for all $q \in \mathbb{N}$.
In the next example, we fix a $Q \in \mathbb{N}$ and consider the approximating function $\Psi$ given by $\Psi(q) :=$ Then $\Sigma_\Psi = 1$ and one can readily verify that (i) for at least 75% of the values of the random variable $x$ with normal distribution $N(0, 1)$, one has that for all $q \in [-Q, Q]$, $q \ne 0$, (ii) for at least 90% of the values of the random variable $x$ with normal distribution $N(0, 1)$, one has that for all $q \in [-Q, Q]$, $q \ne 0$.

Diophantine approximation on manifolds
The aim is to establish an analogue of Theorem 2 for submanifolds M of R n .More precisely, we consider the set B n (Ψ, κ) ∩ M, where The fact that the points of interest are of dependent variables, reflecting the fact that they lie on M, introduces major difficulties in attempting to describe the measure theoretic structure of B n (Ψ, κ) ∩ M.
Non-degenerate manifolds. In order to make any reasonable progress with the above problems it is not unreasonable to assume that the manifolds $M$ under consideration are non-degenerate. Essentially, these are smooth submanifolds of $\mathbb{R}^n$ which are sufficiently curved so as to deviate from any hyperplane. Formally, a manifold $M$ of dimension $d$ embedded in $\mathbb{R}^n$ is said to be non-degenerate if it arises from a non-degenerate map $f : U \to \mathbb{R}^n$, where $U$ is an open subset of $\mathbb{R}^d$ and $M := f(U)$. The map $f : U \to \mathbb{R}^n$, $x \mapsto f(x) = (f_1(x), \dots, f_n(x))$, is said to be $l$-non-degenerate at $x \in U$, where $l \in \mathbb{N}$, if $f$ is $l$ times continuously differentiable on some sufficiently small ball centred at $x$ and the partial derivatives of $f$ at $x$ of orders up to $l$ span $\mathbb{R}^n$. The map $f$ is non-degenerate at $x$ if it is $l$-non-degenerate at $x$ for some $l \in \mathbb{N}$. As is well known, any real connected analytic manifold not contained in any hyperplane of $\mathbb{R}^n$ is non-degenerate at every point [14].
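The spanning condition in the definition of $l$-non-degeneracy can be tested directly for polynomial curves. The sketch below is illustrative and rests on assumptions not made in the paper: it treats only curves ($d = 1$) with polynomial coordinates given as coefficient lists, and the helper names are hypothetical. It evaluates the derivatives of orders $1, \dots, l$ at a point and checks whether they span $\mathbb{R}^n$ using exact Gaussian elimination. The Veronese curve $t \mapsto (t, t^2, t^3)$ is a standard non-degenerate example, while a straight line is degenerate.

```python
from fractions import Fraction

def poly_derivative(coeffs):
    """Derivative of c0 + c1*t + c2*t^2 + ... given as [c0, c1, c2, ...]."""
    return [k * c for k, c in enumerate(coeffs)][1:]

def poly_eval(coeffs, t):
    return sum((c * t**k for k, c in enumerate(coeffs)), Fraction(0))

def derivative_vectors(curve, t, order):
    """The vectors f'(t), f''(t), ..., f^(order)(t) for a polynomial curve f."""
    vectors = []
    current = [list(p) for p in curve]
    for _ in range(order):
        current = [poly_derivative(p) for p in current]
        vectors.append([poly_eval(p, t) for p in current])
    return vectors

def rank(rows):
    """Rank of a matrix via Gaussian elimination over the rationals."""
    rows = [[Fraction(v) for v in row] for row in rows]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(r + 1, len(rows)):
            factor = rows[i][c] / rows[r][c]
            rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def is_nondegenerate_at(curve, t, order):
    """True if the derivatives of orders 1..order at t span R^n."""
    return rank(derivative_vectors(curve, t, order)) == len(curve)

veronese = [[0, 1], [0, 0, 1], [0, 0, 0, 1]]  # t -> (t, t^2, t^3)
line = [[0, 1], [0, 2], [0, 3]]               # t -> (t, 2t, 3t)
print(is_nondegenerate_at(veronese, Fraction(1, 2), 3))
print(is_nondegenerate_at(line, Fraction(1, 2), 3))
```

For the Veronese curve in $\mathbb{R}^3$ derivatives up to order 3 are needed; with order 2 only a two-dimensional subspace is spanned.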
Observe that if the dimension of the manifold $M$ is strictly less than $n$ then we have that $|B_n(\Psi, \kappa) \cap M|_n = 0$ irrespective of the approximating function $\Psi$ and $\kappa$. Thus, when referring to the Lebesgue measure of the set $B_n(\Psi, \kappa) \cap M$ it is always with reference to the induced Lebesgue measure on $M$. More generally, given a subset $S$ of $M$ we shall write $|S|_M$ for the measure of $S$ with respect to the induced Lebesgue measure on $M$. Without loss of generality, we will assume that $|M|_M = 1$, as otherwise the measure can be re-normalized accordingly.
The following statement is a straightforward consequence of the main result of Bernik, Kleinbock and Margulis in [8].
Theorem BKM. Let $M$ be a non-degenerate submanifold of $\mathbb{R}^n$. Let $\Psi : \mathbb{Z}^n \to \mathbb{R}^+$ be monotonically decreasing in each variable and such that
$$\Sigma_\Psi := \sum_{q \in \mathbb{Z}^n \setminus \{0\}} \Psi(q) < \infty. \qquad (24)$$
Then, for any $\delta \in (0, 1)$, there is a constant $\kappa > 0$ depending on $M$, $\Sigma_\Psi$ and $\delta$ only such that
$$|B_n(\Psi, \kappa) \cap M|_M \ge 1 - \delta.$$
Remark 2. Theorem BKM holds for arbitrary probability measures supported on $M$ that are absolutely continuous with respect to the induced Lebesgue measure on $M$, thus giving an analogue of Theorem 2 for manifolds. As in the case of Theorem 1, the more general result follows from the Lebesgue statement.
It is worth pointing out that the main result in [8] actually implies that the union $\bigcup_{\kappa > 0} B_n(\Psi, \kappa) \cap M$ has full measure on $M$. Theorem BKM as stated above follows from [8, Theorem 1.1] on using the continuity of measures. Our main goal is to quantify the dependence of $\kappa$ on $\delta$. Theorem 6 of §6 below explicitly quantifies this dependence. However, the statement is rather technical and we prefer to state for now a cleaner result that shows that the dependence between $\kappa$ and $\delta$ is polynomial.
Theorem 3. Let $l \in \mathbb{N}$ and let $M$ be a compact $d$-dimensional $C^{l+1}$ submanifold of $\mathbb{R}^n$ that is $l$-non-degenerate at every point. Let $\mu$ be a probability measure supported on $M$ absolutely continuous with respect to $|\cdot|_M$. Let $\Psi : \mathbb{Z}^n \to \mathbb{R}^+$ be a monotonically decreasing function in each variable satisfying (24). Then there exist positive constants $\kappa_0$, $C_0$, $C_1$ depending on $\Psi$ and $M$ only such that for any $0 < \delta < 1$, the inequality

Diophantine approximation on manifolds and wireless technology
In short, interference alignment is a linear precoding technique that attempts to align signals in time, frequency, or space. The following exposition is an attempt to illustrate at a basic level the role of Diophantine approximation in implementing this technique. We stress that this section is not meant for the "electronics" experts. We consider two examples. The first basic example brings into play the theory of Diophantine approximation, while the second, slightly more complicated, example also brings into play the manifold theory.
EXAMPLE 1. There are two people (users) $S_1$ and $S_2$ who wish to send (transmit) a message (signal) $u \in \{0, 1\}$ and $v \in \{0, 1\}$ respectively along a single communication channel (which could be a cable or radio channel) to a person (receiver) $R$. Suppose there is a certain degree of fading (channel coefficients) associated with the messages during transmission along the channel. This could, for instance, depend on the distance of the users to the receiver and, in the case of a radio channel, on the reflection caused by obstacles such as buildings in the path of the signal. It is worth stressing that this aspect of "fading" associated with a signal should not be confused with the more familiar aspect of a signal being corrupted by "noise", which will be discussed a little later. Let $h_1$ and $h_2$ denote the fading factors associated with the messages being sent by $S_1$ and $S_2$ respectively. These are strictly positive numbers and we assume that their sum is one. Also, assume that the channel is additive. That is to say that $R$ receives the message
$$y = h_1 x_1 + h_2 x_2, \quad \text{where } x_1 = u \text{ and } x_2 = v. \qquad (28)$$
Specifically, the outcomes of $y$ are
$$y = 0 \ (u = v = 0), \quad y = h_2 \ (u = 0,\ v = 1), \quad y = h_1 \ (u = 1,\ v = 0), \quad y = h_1 + h_2 \ (u = v = 1), \qquad (29)$$
and if $h_1 \ne h_2$, the receiver is obviously able to recover the messages $u$ and $v$. Moreover, the greater the mutual separation of the above four outcomes in the unit interval $I = [0, 1]$, the better the tolerance for error (noise) during the transmission of the signal.
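The separation of the four outcomes is easy to compute for given fading factors. The following sketch is an illustration only; the function names and the grid of candidate values $h_1 = k/60$ are assumptions made for the example. It lists the outcomes $h_1 u + h_2 v$ for $u, v \in \{0, 1\}$, computes the minimum gap between them, and searches the grid (with $h_2 = 1 - h_1$) for the factors giving the largest minimum gap.

```python
from fractions import Fraction
from itertools import product

def outcomes(h1, h2):
    """Sorted received values y = h1*u + h2*v for u, v in {0, 1} (no noise)."""
    return sorted(h1 * u + h2 * v for u, v in product((0, 1), repeat=2))

def min_separation(h1, h2):
    """Minimum distance d between distinct positions in the sorted outcome list."""
    ys = outcomes(h1, h2)
    return min(b - a for a, b in zip(ys, ys[1:]))

# Grid search over h1 = k/60 with h2 = 1 - h1 (grid size chosen for illustration).
grid = [Fraction(k, 60) for k in range(1, 60)]
best_d = max(min_separation(h, 1 - h) for h in grid)
best_h = [h for h in grid if min_separation(h, 1 - h) == best_d]

print(outcomes(Fraction(1, 3), Fraction(2, 3)))
print(best_d, best_h)
```

On this grid the largest minimum gap is attained at $h_1 = 1/3$ and, by symmetry, at $h_1 = 2/3$; equal fading factors give gap zero, since two of the outcomes then coincide.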
The noise can be a combination of various factors, but often the largest contributing factor is the interference caused by other communication channels. If $z$ denotes the noise, then instead of (28), in practice $R$ receives the message
$$y = h_1 x_1 + h_2 x_2 + z.$$
Now let $d$ denote the minimum distance between the four outcomes of $y \in I$ which are explicitly given by (29). Then as long as the absolute value $|z|$ of the noise is strictly less than $d/2$, the receiver is able to recover the messages $u$ and $v$. This is simply due to the fact that intervals of radius $d/2$ centered at the four outcomes of $y$ are disjoint. In this basic example, it is easy to see that the maximum separation between the four outcomes is attained when $h_1 = 1/3$ and $h_2 = 2/3$. In this case $d = 1/3$, and we are able to recover the messages $u$ and $v$ as long as $|z| < 1/6$. The upshot is that the closer the real numbers $h_1$ and $h_2$ are to $1/3$ and $2/3$, the better the tolerance for noise. Hence, at the most fundamental level we are interested in the simultaneous approximation property of real numbers by rational numbers. In practice, it is the probabilistic aspect of the approximation property that is important: knowing that the numbers $h_1$ and $h_2$ lie within a 'desirable' neighbourhood of the points $1/3$ and $2/3$ with reasonably high probability is key. This naturally brings into play the theory of metric Diophantine approximation.
Note that from a probabilistic point of view, the probability that $h_1 = h_2$ is zero, so this case is insignificant. Furthermore, within the context of this basic example, by weighting (precoding) the messages $u$ and $v$ appropriately before the transmission stage it is possible to ensure optimal separation ($d = 1/3$) at the receiver regardless of the values of $h_1$ and $h_2$. Indeed, suppose $x_1 = \frac{1}{3} h_1^{-1} u$ and $x_2 = \frac{2}{3} h_2^{-1} v$ are transmitted instead of $u$ and $v$. Then, without taking noise into consideration, $R$ receives the message
$$y = h_1 x_1 + h_2 x_2 = \tfrac{1}{3} u + \tfrac{2}{3} v$$
and so the specific outcomes are
$$y = 0 \ (u = v = 0), \quad y = \tfrac{2}{3} \ (u = 0,\ v = 1), \quad y = \tfrac{1}{3} \ (u = 1,\ v = 0), \quad y = 1 \ (u = v = 1).$$
EXAMPLE 2.
There are two users S 1 and S 2 as before but this time there are also two receivers R 1 and R 2 .Suppose S 1 wishes to simultaneously transmit independent signals u 1 and v 1 as a single signal, say Similarly, suppose S 2 wishes to simultaneously transmit independent signals u 2 and v 2 as a single signal, say x 2 = u 2 + v 2 where u 2 is intended for R 1 and v 2 for R 2 .As in the first example, for the sake of simplicity, we can assume that the signals Assume that the channel is additive and let y 1 (respectively y 2 ) denote the signal at receiver R 1 (respectively R 2 ).Thus, where . Recall, that R 1 (respectively R 2 ) only cares about recovering the signals u 1 and u 2 (respectively v 1 and v 2 ) from y 1 (respectively y 2 ).For the moment, let us just concentrate on the signal received by R 1 ; namely It is easily seen that this corresponds to a received signal in Example 1 modified to incorporate four users and one receiver.This time there are potentially 16 different outcomes.In short, the more users, the more outcomes and therefore the smaller the mutual separation between them and in turn the smaller the tolerance for noise.Now there is one aspect of the setup in this example that we have not yet exploited.The receiver R 1 is not interested in the signals v 1 and v 2 .So if they could be deliberately aligned via precoding into a single component v 1 + v 2 , then y 1 would look like a received signal associated with just 3 users rather than 4. 
With this in mind, suppose instead of transmitting x 1 and x 2 given by (35), S 1 and S 2 transmit the signals respectively.Then, it can be verified that the received signals given by ( 33) and (34) can be written as In other words, the unwanted, interfering signals at either receiver are aligned to a one dimensional subspace of four dimensional space.Notice that in the above equations the six coefficients are only of four variables, namely h i,j , i, j = 1, 2, and thus represent dependent quantities.This, together with our findings from Example 1, naturally brings into play the manifold theory of metric Diophantine approximation.
Example 2 is a simplified version of Example 3 appearing in [15,§III].For a deeper and more practical understanding of the link between interference alignment and metric Diophantine approximation on manifolds the reader is urged to look at [15] and [13, §4.7].

Preliminaries for Theorem 3
5.1. Localisation and parameterisation. Since $M$ is non-degenerate everywhere, we can restrict ourselves to considering a sufficiently small neighbourhood of an arbitrary point on $M$. By compactness, $M$ can then be covered with a finite subcollection of such neighbourhoods. Therefore, in view of the finiteness of the cover, the existence of $\kappa_0$, $C_0$ and $C_1$ satisfying Theorem 3 globally will follow from the existence of these parameters for every neighbourhood in the finite cover: $\kappa_0$, $C_0$ and $C_1$ should be taken to be the minimum of their local values. Now as we can work with $M$ locally, we can parameterize it with some map $f : U \to \mathbb{R}^n$ defined on a ball $U$ in $\mathbb{R}^d$, where $d = \dim M$. Note that $f$ must be at least $C^2$ in order to ensure that $M$ is non-degenerate. Without loss of generality we assume that Furthermore, using the Implicit Function Theorem if necessary, we can take $f$ to be a Monge parametrisation, that is $f(x) = (x_1, \dots, x_d, f_{d+1}(x), \dots, f_n(x))$, where $x = (x_1, \dots, x_d)$. Note that $f$ can be assumed to be bi-Lipschitz on $U$. This readily follows from the fact that $f$ is $C^1$ but possibly requires a further shrinking of $U$.
Let $B_n(\Psi, \kappa, M)$ denote the orthogonal projection of $B_n(\Psi, \kappa) \cap M$ onto the set of parameters $U$. Thus, The set $B_n(\Psi, \kappa) \cap M$ and its projection $B_n(\Psi, \kappa, M)$ are related by the bi-Lipschitz map $f$. Since bi-Lipschitz maps only affect the Lebesgue measure of a set by a multiplicative constant (in this case the constant will depend on $f$ only), it suffices to prove Theorem 3 for the projected set. More precisely, Theorem 3 is equivalent to showing that there exist positive constants $\kappa_0$, $C_0$ and $C_1$ depending on $\Psi$ and $f$ only such that for any 0 holds with $\kappa$ given by (27).
5.2. Auxiliary statements. We will denote the standard $L^1$ (resp. Euclidean, infinity) norm on $\mathbb{R}^d$ by $\|\cdot\|_1$ (resp. $\|\cdot\|_2$, $\|\cdot\|_\infty$). Also as before, given an $x \in \mathbb{R}$, $\|x\|$ will denote the distance of $x$ from the nearest integer. The notation $B(x, r)$ will refer to the Euclidean open ball of radius $r > 0$ centered at $x$, and $S^{d-1}$ will denote the unit sphere in dimension $d \ge 1$ (with respect to the Euclidean norm). Furthermore, throughout is the volume of the $d$-dimensional unit ball and $N_d$ denotes the Besicovitch covering constant.
Remark 3. For further details on the Besicovitch covering constant, cf. [9]. We will only need in what follows the inequality $N_d \le 5^d$ satisfied by this constant.
The proof of Theorem 3 involves two separate cases that take into consideration the relative size of the gradient of $f(x) \cdot q$, where $q = (q_1, \dots, q_n)$ and $f(x) \cdot q$ is the standard inner product of $f(x)$ and $q$. The first case of 'big gradient' is considered within the next result and is an adaptation of [8, Theorem 1.3].
In what follows, ∂ β will denote partial derivation with respect to a multi-index β = (β 1 , . . ., β d ) ∈ N d 0 , where N 0 will stand for the set of non-negative integers, that is N 0 := {0, 1, 2, . . .}. Furthermore, |β| we will mean the order of derivation, that is i will denote the differential operator corresponding to the k th derivative with respect to the i th variable, that is, Theorem 4. Let U ⊂ R d be a ball of radius r and f ∈ C 2 (2U), where 2U is the ball with the same centre as U and radius 2r.Let Then, for every δ ′ > 0 and every q ∈ Z n \ {0}, the set of x ∈ U such that has measure at most K d δ ′ |U| d , where ∇f(x)q is the gradient of f(x) • q and is a constant depending on d only.
Proof.The proof of Theorem 4 follows on appropriately applying [8, Lemma 2.2].For convenience we refer to this lemma as L2.2.We take M in L2.2 to be equal to the quantity ndL, where L is defined by (39).We set δ in L2.2 to be equal to δ ′ appearing in Theorem 4.Then, in view of (39) and the fact that n, d, q ∞ ≥ 1, it follows that the hypotheses of L2.
and then Next in Theorem 5 below we consider the case of 'small gradient'.This is an explicit version of [8,Theorem 1.4].First we introduce auxiliary constants.
Given a $C^l$ map $f : U \to \mathbb{R}^n$ defined on a ball $U$ in $\mathbb{R}^d$, the supremum of $s \in \mathbb{R}$ such that for any $x \in U$ and any $v \in S^{n-1}$ there exists an integer $k$, $0 < k \le l$, and a unit vector $u \in S^{d-1}$ satisfying
$$\left| v \cdot \frac{\partial^k f(x)}{\partial u^k} \right| \ge s \qquad (42)$$
will be called the measure of $l$-non-degeneracy of $(f, U)$ and will be denoted by $s(l; f, U)$. Here and elsewhere for a unit vector $u \in S^{d-1}$, $\partial^k/\partial u^k$ will denote the derivative in direction $u$ of order $k$.
3 There are two typos in the proof of L2.2 that one should be aware of when verifying the values of the constants given here. On page 6 line -2, the inclusion regarding $U(x)$ is the wrong way round, it should read
As in Theorem 4, the radius of the ball U will be denoted by r.Throughout, we let x 0 denote the centre of U. Also, given a real number λ > 0, we let λU denote the scaled ball of radius λ r and with the same centre x 0 as that of U.With this in mind, consider the balls U + := 3 d+1 U, Ũ := 3 n+1 U, For technical reasons, that will soon become apparent, in order to deal with the 'small gradient' case we make the following assumption on the map f : U → R n .

Assumption 1.
The map f = (f 1 , . . ., f n ) is an n-tuple of C l+1 functions defined on the closure of Ũ+ which is l-non-degenerate everywhere on the closure of Ũ+ .
Remark.In view of the discussion of §5.1, there is no loss of generality in imposing Assumption 1 within the context of Theorem 3. We denote by s 0 the measure of non-degeneracy of f on Ũ+ .Note that Assumption 1 ensures that s 0 := s(l; f, Ũ+ ) > 0.
(44) Also, notice that it ensures the existence of a constant M ≥ 1 such that for all k ≤ l + 1 and all u 1 , . . ., where ∂u i means differentiation in direction u i .Note that the left-hand side of (42) is the length of the projection of ∂ k f(x)/∂u k on the line passing through v and hence it is no bigger than M.This implies that Without loss of generality, we will assume that the radius r of the ball U satisfies where and where with the quantity φ(ω, B, k) defined as for any integer k ≥ 1 and any real numbers ω, B > 0.
Furthermore, define the following constants determined by f and U: τ := r l s 0 4l l (l + 1)! , and Theorem 5. Let U ⊂ R d be a ball and f = (f 1 , . . ., f n ) be an n-tuple of C l+1 functions satisfying Assumption 1.Then, for any 0 < δ ′ ≤ 1, any n-tuple T = (T 1 , . . ., T n ) of real numbers ≥ 1 and any K > 0 satisfying define the set A(δ ′ , K, T) to be where and in which ρ is given by (52) and C is the constant explicitly given by (71) below.
At first glance the statement of Theorem 5 looks very similar to [8,Theorem 1.4].We stress that the key difference is that in our statement the constants are made fully explicit.The proof of Theorem 5 is rather involved and will be the subject of §7.

A strengthening and proof of Theorem 3
In view of the discussion of §5.1, Theorem 3 will follow immediately on establishing a stronger result (Theorem 6 below), which explicitly characterizes the dependence on Ψ and M of the constants κ 0 , C 0 and C 1 appearing within the statement of Theorem 3. In the case that the function f defining the manifold under consideration is explicitly given, the values of these constants may be improved by following the methodology of the proof of Theorem 6 as many computations will then be made simpler. Let It is a well known fact that, under the assumption that Ψ is monotonically decreasing in each variable, relation (24) implies that 0 < C Ψ < ∞.Also define the constant which is clearly finite and positive as the sum converges.
Clearly the above is an explicit version of Theorem 3 in the case when $\mu$ is Lebesgue measure. The arguments given in the proof of Theorem 2 are easily adapted to deal with the general situation.
6.1. Proof of Theorem 6 modulo Theorem 5. For $\kappa > 0$ and any $q \in \mathbb{Z}^n$, define $A(\kappa; q) := \{x \in U : \|f(x) \cdot q\| < \kappa\Psi(q)\ \&\ \text{(40) holds}\}$ and $A^c(\kappa; q) := \{x \in U : \|f(x) \cdot q\| < \kappa\Psi(q)\ \&\ \text{(40) does not hold}\}$.
Clearly it suffices to prove that By Theorem 4 with δ ′ = κΨ(q), we immediately have that |A(κ; q)| d ≤ K d κΨ(q)|U| d .Then, summing over all q ∈ Z n \ {0} gives Now to establish the second inequality in (59), given an n-tuple t = (t 1 , . . ., t n ) ∈ N n 0 , define the set A c t := q=(q 1 ,...,qn)∈Z n \{0} where q + i = max{1, |q i |}.Observe that By (57) and the monotonicity of Ψ in each variable, for every q = (q 1 , . . ., q n ) ∈ Z n \{0} satisfying the inequalities 2 t i ≤ q + i < 2 t i +1 , we have that Then, A c t is easily seen to be contained in the set A(δ ′ , K, T) defined within Theorem 5. Clearly T 1 , . . ., T n ≥ 1 and K > 0. Since κ < C −1 Ψ , we have that 0 < δ ′ < 1.Finally, (53) is satisfied, since where the last inequality follows from the definition of κ.Therefore, Theorem 5 is applicable and it follows that where E is given by ( 56) and where, from (63), the definition of δ ′ and the fact that κC Ψ < 1, .
Then, using (62) and summing over all $t \in \mathbb{N}_0^n$, we find that where the latter inequality follows from the definition of $\kappa$. This establishes the second inequality in (59) and thus completes the proof of Theorem 6 modulo Theorem 5.
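The dyadic decomposition used in the proof, in which each non-zero $q \in \mathbb{Z}^n$ is assigned to the block indexed by $t = (t_1, \dots, t_n) \in \mathbb{N}_0^n$ with $2^{t_i} \le q_i^+ < 2^{t_i + 1}$ and $q_i^+ = \max\{1, |q_i|\}$, can be sketched as follows. The function names are hypothetical and the code is only meant to illustrate the bookkeeping: every $q$ lies in exactly one block, so summing over $t$ accounts for every $q$ once.

```python
def dyadic_index(q):
    """Return t with 2**t_i <= max(1, |q_i|) < 2**(t_i + 1) for every i."""
    # For an integer m >= 1, m.bit_length() - 1 equals floor(log2(m)).
    return tuple(max(1, abs(qi)).bit_length() - 1 for qi in q)

def in_block(q, t):
    """Check the defining inequalities of the dyadic block indexed by t."""
    return all(
        2**ti <= max(1, abs(qi)) < 2**(ti + 1)
        for qi, ti in zip(q, t)
    )

q = (0, 1, -5, 8)
t = dyadic_index(q)
print(t, in_block(q, t))
```

Note that the coordinates $q_i \in \{-1, 0, 1\}$ all fall into the block $t_i = 0$, which is why $q_i^+$ rather than $|q_i|$ is used.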

Proof of Theorem 5
To establish Theorem 5, we will follow the basic strategy set out in the proof of [8, Theorem 1.4]. We stress that non-trivial modifications and additions are required to make the constants explicit. To begin with, we state a simplified form of [8, Theorem 6.2] and, to this end, various notions are now introduced.
Given a finite dimensional real vector space $W$, $\nu$ will denote a submultiplicative function on the exterior algebra $\bigwedge W$; that is, $\nu$ is a continuous function from $\bigwedge W$ to $\mathbb{R}_+$ such that $\nu(tw) = |t|\,\nu(w)$ and $\nu(u \wedge w) \le \nu(u)\,\nu(w)$ (64), where $v_1,\dots,v_k$ is a basis of $\Lambda$ (this definition makes sense from the first equation in (64)). Also, $L(\Lambda)$ will denote the set of all non-zero primitive subgroups of $\Lambda$. Furthermore, given $C, \alpha > 0$ and $V \subset \mathbb{R}^d$, a function Then, for any positive $\varepsilon \le \rho$, one has

Theorem 5 is now deduced from Theorem 7 in the following manner. With respect to the parameters appearing in Theorem 7, we let There is nothing to gain in formally recalling the definition of $\nu^*$. All we need to know is that $\nu^*$ as given in [8] has the property that and that its restriction to $W$ coincides with the Euclidean norm. Next, the discrete subgroup $\Lambda$ appearing in Theorem 7 is defined as Note that it has rank $k = n+1$; therefore the ball $B$ appearing in the statement of Theorem 7 coincides with the ball $\tilde U$ defined by (43). Finally, we let the map $H$ send $x \in \tilde U$ to the product of matrices where and $D$ is the diagonal matrix defined via the constants $\delta'$, $K$, $T_1,\dots,T_n$, and $\varepsilon_1$ appearing in Theorem 5.
With the above choice of parameters, on using (66), it is easily verified that the set $A(\delta', K, T)$ defined by (54) within the context of Theorem 5 is contained in the set on the left-hand side of (65) with The upshot is that Theorem 5 follows from Theorem 7 on verifying conditions (i), (ii) and (iii) therein with appropriate constants $C$, $\alpha$ and $\rho$. With this in mind, we note that condition (iii) is already established in [8, §7] for any $\rho \le 1$. In §8 below, we will verify the remaining conditions (i) & (ii) with the following explicit constants: where (here, $\sigma(l, d)$ is the quantity defined in (48)), and $\rho = \rho$ as defined by (52) (note that $\rho < 1$). This will establish Theorem 5.
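For orientation, recall the notion of a $(C, \alpha)$-good function in its usual form, which we assume here to be the one used in [8]: $f$ is $(C, \alpha)$-good on $V$ if $|\{x \in B : |f(x)| < \varepsilon \sup_B |f|\}| \le C\varepsilon^\alpha |B|$ for every open ball $B \subset V$ and every $\varepsilon > 0$. The sketch below estimates the left-hand side, relative to $|B|$, on a single interval for a monomial. The grid-based estimator and all names are illustrative assumptions, and checking one interval does not by itself establish the property.

```python
def sublevel_fraction(func, a, b, eps, samples=20000):
    """Estimate |{x in (a, b) : |func(x)| < eps * sup|func|}| / (b - a)
    by sampling func at the midpoints of a uniform grid."""
    step = (b - a) / samples
    values = [abs(func(a + (i + 0.5) * step)) for i in range(samples)]
    sup = max(values)  # grid approximation of the supremum on (a, b)
    return sum(1 for v in values if v < eps * sup) / samples


def monomial(k):
    """Return the function x -> x**k."""
    return lambda x: x ** k
```

For $f(x) = x^k$ on $(-1, 1)$ and $0 < \varepsilon \le 1$, the sublevel set is $\{|x| < \varepsilon^{1/k}\}$, so the fraction equals $\varepsilon^{1/k}$; this is consistent with an exponent $\alpha = 1/k$ on this particular interval.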

Verifying conditions (i) & (ii) of Theorem 7
Unless stated otherwise, throughout this section, $\Lambda$ will be the discrete subgroup given by (67) and $\Gamma \in L(\Lambda)$ will be a primitive subgroup of $\Lambda$. The verification of condition (i) of Theorem 7 splits into two cases: the case when the rank of $\Gamma$ is one, and the case when the rank is at least 2.

8.1. Rank one case of condition (i).
The key to verifying condition (i) in the case that $\Gamma$ is of rank one is the following explicit version of [8, Proposition 3.4]. Notice that it and its corollary are themselves independent of the rank and indeed of $\Gamma$.

Proposition 1. Let $U \subset \mathbb{R}^d$ be a ball, $F \subset C^{l+1}(\tilde U^+)$ be a family of real valued functions and $\lambda$ and $\gamma$ be positive real numbers such that: , where $r$ is the radius of $U$ as defined in (46) and $\sigma(l, d)$ is defined in (48).
Then, for any $f \in F$, we have that where

Remark. Hypothesis (2) is additional to those made in [8, Proposition 3.4]. In short, it is this "extra" hypothesis that yields an explicit formula for the constant $C_1$. Note that by the definition of $C_1^*$ as given by (72), we have that

Using the explicit constant $C_1$ appearing in Proposition 1, it is possible to adapt the proof of [8, Corollary 3.5] to give the following statement. Then, for any linear combination

Corollary 3 allows us to verify condition (i) of Theorem 7 in the case that $\Gamma$ is a primitive subgroup of $\Lambda$ of rank 1. Indeed, in view of (68) and of the discussions following equations (64) and (66), $\nu^*(H(x)\Gamma)$ is the Euclidean norm of $H(x)w = D\,U_x\,w$, where $w$ is a basis vector of $\Gamma$. It is readily seen that the coordinate functions of $H(x)w$ are either constants, or $f(x)$, or $\partial f(x)/\partial x_i$ for some $f = c_0 + \sum_{i=1}^{n} c_i f_i$ with $c_0,\dots,c_n \in \mathbb{R}$. Hence, by Corollary 3 and [8, Lemma 3.1(b,d)] we obtain that the function $\|H(\cdot)\Gamma\|_\infty$ is $(C_1^*, \alpha)$-good on $\tilde U$, where $\alpha$ is given by (74). In turn, on using [8, Lemma 3.1(c)] and the fact that

Proof of Corollary 3. In view of [8, Lemma 3.1(a)], it suffices to prove the corollary under the assumption that $\|(c_1,\dots,c_n)\|_2 = 1$. Thus, with reference to Proposition 1, define The corollary will follow on verifying the four hypotheses of Proposition 1. Hypothesis (1) is easily seen to be satisfied. Hypothesis (2) is a consequence of (45) and of the Cauchy-Schwarz inequality, while hypothesis (3) follows straightforwardly from the definition of the measure of non-degeneracy $s_0$ in (42) and (44). Finally, hypothesis (4) is guaranteed by (46) and the choices of $\gamma$ and $\lambda$.

8.2. Proof of Proposition 1. The proof of Proposition 1 relies on the following lemma.

Lemma 3. Let $f$ be a real-valued function of class $C^k$ ($k \ge 1$) defined in a neighbourhood of $x \in \mathbb{R}^d$ ($d \ge 1$). Assume that there exists an index $1 \le i_0 \le d$ and a real number $\mu > 0$ such that Then there exists a rotation $S : \mathbb{R}^d \to \mathbb{R}^d$ such that

As the proof of Lemma 3 is lengthy, before giving it, we show how to deduce Proposition 1 from it.
Deduction of Proposition 1 from Lemma 3. Let $x_0 = (v_1, v_2,\dots,v_d)$ denote the centre of $U$. Hypothesis (3) of Proposition 1 implies that for any $f \in F$, there exists a unit vector $u \in S^{d-1}$ and an index $1 \le k \le l$ such that After applying, if necessary, a first rotation to the coordinate system that brings the $x_1$ axis onto the line spanned by the vector $u$, it may be assumed, without loss of generality, that the above inequality reads as From Lemma 3, up to another rotation of the coordinate system, one can guarantee that for a fixed index $i$, it follows from a Taylor expansion at $x_0$ that, for any $x = (x_1,\dots,x_d) \in \tilde U^+$, where, by hypothesis (2), $R_j(x; x_0)$ satisfies the inequality In view of hypothesis (4), we have furthermore that Next, observe that any cube circumscribed about $\tilde U$ lies inside $\tilde U^+$. It then follows on applying [8, Lemma 3.3] with $A_1 = \lambda$ and $A_2 = C_2/2$ that the function $f$ is $(C', \frac{1}{dk})$-good on $\tilde U$, where A computation then shows that

We now proceed with the proof of Lemma 3, which requires several intermediate results. The first one is rather intuitive.

Lemma 4. Let $C > 0$ be a real number and $p \ge 1$ be an integer. Then every section of the cube $[0, C]^p$ with a $(p-1)$-dimensional subspace of $\mathbb{R}^p$ has a volume at most $\sqrt{2}\,C^{p-1}$.

Lemma 5. Let $k \ge 1$ denote an integer and let $w := (w_0,\dots,w_k) \in \mathbb{R}^{k+1}$. Let $\omega, B > 0$ be real numbers. Furthermore, assume that the $k+1$ real numbers $0 < t_0 < \dots < t_k$ satisfy the following two assumptions: (1) $\min_{0 \le i \ne j \le k}$ Then, there exists an index $0 \le j \le k$ such that $\sum_{i=0}^{k}$ where $\phi(\omega, B; k)$ is the quantity defined in (49).
The following notation will be used in the course of the proof of Lemma 5: given a point $x \in \mathbb{R}^n$ and a set $A \subset \mathbb{R}^n$, $\operatorname{dist}(x, A)$ will denote the quantity $\operatorname{dist}(x, A) := \inf\{\|x - a\|_2 : a \in A\}$. (77)

denote the matrix defined by the following $k+1$ column vectors in $\mathbb{R}^{k+1}$: $x_1 := (1, t_0,\dots,t_0^k)^T$, . . .
Together with the origin, these points form a simplex $S(X)$ in $\mathbb{R}^{k+1}$ whose volume $|S(X)|_{k+1}$ satisfies the well-known equation The formula for the determinant of a Vandermonde matrix together with hypothesis (1) then yields the inequality Note that hypothesis (2) implies that all the vectors $x_1,\dots,x_{k+1}$ lie in the hypercube $\mathcal{B} := [0, B]^{k+1}$. As a consequence, the volume of the section of the simplex $S(X)$ with any hyperplane does not exceed the volume of the section of $\mathcal{B}$ with the same hyperplane which, from Lemma 4, is at most $\sqrt{2}\,B^k$. Also, given a hyperplane $P$, it should be clear that The upshot of this discussion is that the following inequality holds: Consider now the hyperplane $P = w^\perp$ and let $j$ be one of the indices realizing the maximum in (78). The conclusion of the lemma is then a direct consequence of the equation

The next result contains the main substance of the proof of Lemma 3.
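Two classical facts underlie the proof of Lemma 5: the simplex spanned by the origin and the columns of a $(k+1) \times (k+1)$ matrix $X$ has volume $|\det X|/(k+1)!$, and a Vandermonde determinant equals the product of the pairwise differences of its nodes. Both can be checked on a small example. The sketch below uses exact rational arithmetic; the function names and the sample nodes are illustrative assumptions and do not come from the paper.

```python
from fractions import Fraction
from math import factorial


def vandermonde_columns(ts):
    """Return the vectors (1, t, ..., t**k) for each node t, with k = len(ts) - 1."""
    k = len(ts) - 1
    return [[Fraction(t) ** e for e in range(k + 1)] for t in ts]


def det(rows):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [row[:] for row in rows]
    n = len(m)
    result = Fraction(1)
    for c in range(n):
        pivot = next((r for r in range(c, n) if m[r][c] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != c:
            m[c], m[pivot] = m[pivot], m[c]
            result = -result  # a row swap flips the sign
        result *= m[c][c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for j in range(c, n):
                m[r][j] -= factor * m[c][j]
    return result


def vandermonde_product(ts):
    """Product of (t_j - t_i) over all pairs i < j."""
    prod = Fraction(1)
    for i in range(len(ts)):
        for j in range(i + 1, len(ts)):
            prod *= Fraction(ts[j]) - Fraction(ts[i])
    return prod


def simplex_volume(ts):
    """Volume of the simplex spanned by the origin and the Vandermonde vectors."""
    return abs(det(vandermonde_columns(ts))) / factorial(len(ts))
```

The determinant is unchanged under transposition, so it does not matter that the vectors are stored as rows here.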
Lemma 6. Let $f$ be a real valued function of class $C^k$ ($k \ge 1$) defined in a neighbourhood of $(x_0, y_0) \in \mathbb{R}^2$. Let $c > 0$ be a real number such that Then, there exist two orthonormal vectors $u, v \in S^1$ such that It readily follows from the assumptions of the lemma that Let $\lambda > 0$ be a real number such that, for all indices $0 \le j \le k$, $\left|\frac{\partial^k f}{\partial x^{k-j}\partial y^j}(x_0, y_0)\right| \le \lambda$.
With the choices of the parameters $\omega := 1/(2k)$ and $B = 1$, Lemma 5 applied to the vector $w$ and to the system of points $(t_i)_{0 \le i \le k}$ yields the existence of a point Let denote a constant and This implies that for all $t \in [t_j - \epsilon, t_j + \epsilon]$, where $t_j$ is the constant appearing in (80), the following inequalities hold: With the choices of the parameters and $B = 2$, apply once more Lemma 5, this time to the vector $((-1)^i w_i)_{0 \le i \le k}$ and to the set of points This yields the existence of tj The upshot of this is that, when considering the point $s := 1/$ tj , the following two inequalities hold simultaneously: , it is easily seen that one can find a unit vector (u . Let $u \in S^1$ and $v \in S^1$ denote the two orthonormal vectors defined as $u := (u_1, u_2)$ and $v := (u_2, -u_1)$.

Note then that
, 1, k and, similarly, Since, from the definition of the vector w, this completes the proof of the lemma from the definition of φ in (49).
We now have all the ingredients at our disposal to prove Lemma 3.
Proof of Lemma 3. Denote the coordinates of the vector $x \in \mathbb{R}^d$ as $x = (x_1,\dots,x_d)$.
After relabeling the axes if necessary, assume furthermore without loss of generality that $i_0 = 1$ in the statement of the lemma. The proof then goes by induction on $d \ge 1$, the conclusion being trivial when $d = 1$. When $d = 2$, Lemma 3 reduces to Lemma 6. Assume therefore that $d \ge 3$. It then readily follows from the induction hypothesis applied to the function $(x_1, \dots,$ The lemma follows upon setting $S = S_1 \circ S_2$.

8.3. Higher rank case of condition (i).
The key to verifying condition (i) of Theorem 7 in the case when $\Gamma$ is of rank greater than one is Proposition 2 below. In short, it is an explicit version of [8, Proposition 4.1] in the particular case when the set $G$ appearing therein is given by The statement is concerned with the skew gradient of a map as defined in [8, §4]. We recall the definition. Let $g = (g_1, g_2) : \tilde U^+ \to \mathbb{R}^2$ be a differentiable function. The skew gradient $\widetilde{\nabla} g : \tilde U^+ \to \mathbb{R}^d$ is defined by $\widetilde{\nabla} g(x) := g_1(x)\nabla g_2(x) - g_2(x)\nabla g_1(x)$.
If we write $g(x)$ in terms of polar coordinates, i.e. via the usual functions $\rho(x)$ and $\theta(x)$, it is then readily verified that $\widetilde{\nabla} g(x) = \rho^2(x)\nabla\theta(x)$. Essentially, the skew gradient measures how far the pair of functions $g_1$ and $g_2$ is from being proportional to each other.
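The definition of the skew gradient, together with the standard polar-coordinate identity $g_1\nabla g_2 - g_2\nabla g_1 = \rho^2\nabla\theta$, can be illustrated numerically. The following is only a sketch: the finite-difference gradient, the function names and the sample pair $(g_1, g_2)$ are assumptions made for the purpose of the example.

```python
import math


def gradient(func, x, h=1e-6):
    """Central finite-difference gradient of a scalar function at the point x."""
    grad = []
    for i in range(len(x)):
        forward = list(x)
        forward[i] += h
        backward = list(x)
        backward[i] -= h
        grad.append((func(forward) - func(backward)) / (2 * h))
    return grad


def skew_gradient(g1, g2, x):
    """g1(x) * grad g2(x) - g2(x) * grad g1(x)."""
    d1, d2 = gradient(g1, x), gradient(g2, x)
    return [g1(x) * b - g2(x) * a for a, b in zip(d1, d2)]


def rho_squared_grad_theta(g1, g2, x):
    """rho(x)**2 * grad theta(x) for the polar coordinates of (g1(x), g2(x))."""
    theta = lambda y: math.atan2(g2(y), g1(y))
    rho_sq = g1(x) ** 2 + g2(x) ** 2
    return [rho_sq * t for t in gradient(theta, x)]


# Illustrative pair of functions on R^2 (an assumption made for this example).
g1 = lambda x: 1.0 + x[0]
g2 = lambda x: 2.0 + x[0] * x[1]
point = [0.3, 0.5]
```

When $g_2$ is a constant multiple of $g_1$, the two terms in the definition cancel and the skew gradient vanishes, in line with the interpretation given above.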
Proposition 2. Let $U \subset \mathbb{R}^d$ be a ball and $f = (f_1,\dots,f_n)$ be an $n$-tuple of $C^{l+1}$ functions satisfying Assumption 1. Let $\rho_2$, $C_{d,l}$ and $G$ be given by (51), (73) and (82) respectively. Then, (a) for all $g \in G$,

This proposition together with Corollary 3 and the basic properties of $(C, \alpha)$-good functions given in [8, Lemma 3.1] enables us to deduce the following statement, which establishes condition (i) in the higher rank case.
Corollary 4. Let $U \subset \mathbb{R}^d$ be a ball and $f = (f_1,\dots,f_n)$ be an $n$-tuple of $C^{l+1}$ functions satisfying Assumption 1. Let $\Lambda$ be the discrete subgroup given by (67) and $\Gamma \in L(\Lambda)$ be a primitive subgroup of $\Lambda$. Furthermore, let $H$ be the map given by (68). Then, the function $x \mapsto \nu^*(H(x)\Gamma)$ (85) is $(C, \alpha)$-good on the ball $\tilde U$ with constants $C$ and $\alpha$ given by (71) and (74) respectively.
Proof. Let $k$ denote the rank of $\Gamma$. The case $k = 1$ has already been established as a consequence of Corollary 3 in §8.1. Assume therefore that $k \ge 2$. It is shown in [8, §7, Eq. (7.3)] that there exist real numbers $a, b, \mu \in \mathbb{R}$ such that, for all $x \in \tilde U$, $\nu^*(H(x)\Gamma)$ given by (85) can be expressed as the Euclidean norm of a vector $w(x)$. Furthermore, there exists an orthonormal system of vectors of the form $S = \{e_0, e_1^*,\dots,e_d^*, v_1,\dots,v_{k-1}\}$ when $k \le n$ or of the form $S = \{e_0, e_1^*,\dots,e_d^*, v_0,\dots,v_{k-1}\}$ when $k = n+1$ such that $w(x)$ is a linear combination of skew products of elements of $S$ whose coefficients are of any of the following forms: where
• It follows from part (a) of Corollary 3 and [8, Lemma 3.1(a,d)] that the coordinate functions given by (86), (87) and (88) are $(C', \alpha)$-good, where
• It follows from part (b) of Corollary 3 and [8, Lemma 3.1(a,d)] that, when the index $i$ is fixed, the maximum over $s$ of the coordinate functions given by (89), that is, the quantity b µ ∇(f • v i ) ∞ , is $(C', \alpha)$-good.
• It follows from Proposition 2 and [8, Lemma 3.1(a,d)] that, for fixed indices $i$ and $j$, the Euclidean norm over $s$ of the coordinate functions given by (90) and (91), that is, the quantities µ ∇(f respectively, are $(C', \alpha)$-good. On using the relation
The upshot of the above together with [8, Lemma 3.1(b)] is that the maximum of the coordinate functions (86)-(91) is $(d^{\alpha/2} C', \alpha)$-good. In turn, on using the relation 1 k) and [8, Lemma 3.1(c)], we have that As $k \le n+1$, the desired statement follows.
Modulo the proof of Proposition 2, we have completed the task of verifying condition (i) of Theorem 7. The proof of the proposition is rather lengthy and therefore is postponed till after we have verified condition (ii) of Theorem 7.
8.4. Verifying condition (ii) of Theorem 7 modulo Proposition 2. The following lemma, although not explicitly stated, is essentially proved in [8, §7]; see [8, Eq. (7.5)] and onwards. The key difference is that we make use of Proposition 2 in place of [8, Proposition 4.1] and so are able to give explicit values of $\rho_1$ and $\rho$.

Lemma 7. Let $U \subset \mathbb{R}^d$ be a ball and $f = (f_1,\dots,f_n)$ be an $n$-tuple of $C^{l+1}$ functions satisfying Assumption 1. Let $\rho_1, \rho > 0$ be given by (50) and (52) respectively and assume that for any $v \in S^{n-1}$ and $p \in \mathbb{R}$ we have that Furthermore, let $\Lambda$ be the discrete subgroup given by (67), $\Gamma \in L(\Lambda)$ be a primitive subgroup of $\Lambda$ and $H$ be the map given by (68). Then

The following statement immediately verifies condition (ii) of Theorem 7. It is the above lemma without the assumptions made in (92).
Corollary 5. Let $U \subset \mathbb{R}^d$ be a ball and $f = (f_1,\dots,f_n)$ be an $n$-tuple of $C^{l+1}$ functions satisfying Assumption 1. Let $\Lambda$ be the discrete subgroup given by (67) and $\Gamma \in L(\Lambda)$ be a primitive subgroup of $\Lambda$. Furthermore, let $H$ be the map given by (68). Then where $\rho$ is given by (52).
Proof of Corollary 5. The desired statement follows directly from Lemma 7 on verifying the inequalities associated with (92). Let $v \in S^{n-1}$. By the definition of $s_0 := s(l; f, \tilde U^+)$, there exists a $u \in S^{d-1}$ and $1 \le k \le l$ such that Recall that $x_0$ is the centre of $U$. It follows that for any $x \in U$, we have that: Let $s'$ denote the unit vector By Lagrange's Theorem, there exists $x'$ between $x_0$ and $x$ such that It then follows from (95) and the definition of $M$ in (45) that This together with the fact that $r < s_0/(2M)$ (a direct consequence of (46)) implies that The upshot is that the hypotheses of [ for any $p \in \mathbb{R}$. Thus the first inequality appearing in (92) is established.
It remains to prove the second inequality in (92); that is, that for any Recall from above that for any $v \in S^{n-1}$ we can find a vector $u \in S^{d-1}$ such that (94) and (96) hold. Furthermore, observe that We proceed by considering two cases, depending on whether or not $k = 1$ in (94).
• Suppose $k = 1$ in (94). Then it follows from (96) and (98) that On applying the Cauchy-Schwarz inequality, we obtain that This together with the fact that $\|\cdot\|_2 \le \sqrt{d}\,\|\cdot\|_\infty$ implies the second inequality appearing in (92).
• Suppose $k \ge 2$ in (94). Consider the function $g(x) := \frac{\partial (f \cdot v)}{\partial u}(x)$ defined on $U$. Then by (96), we have that Thus, the hypotheses of [8, Lemma 3.6] are satisfied for the function $g(x)$ and a straightforward application of that lemma together with (50) implies that Now the Cauchy-Schwarz inequality and (98) imply that This together with (99) and the fact that .
The upshot of §8 is that we have verified conditions (i) & (ii) of Theorem 7 as desired modulo Proposition 2.

Proof of Proposition 2
In order to prove Proposition 2, we first establish an explicit version of [8, Lemma 4.3]. Throughout this section, the notation introduced in (77) will be used.
It is easily inferred from (101) that there exists $x_1 \in \overline{B}$, the closure of $B$, such that $\|p(x_1)\|_2 \ge 1/8$. Working in polar coordinates and choosing the straight line $L_1$ joining the origin to $p(x_1)$ to be the polar axis, let $(\rho(x), \theta(x))$ denote the polar coordinates of a vector $x \in \mathbb{R}^2$. Thus, $\rho(p(x_1)) \ge 1/8$. Furthermore, from (101), there exists $x_2 \in B$ such that $\operatorname{dist}(L_1, p(x_2)) \ge 1/8$ and therefore, together with (104), we have that Now let $\Delta$ be the straight line joining $p(x_1)$ and $p(x_2)$. Furthermore, let $L_2$ denote the $x$-coordinate axis, $(x_1, y_1)$ the Cartesian coordinates of $p($ Therefore, the distance from the origin $O$ to $\Delta$ satisfies the inequality Let $J$ denote the straight line segment $[x_1, x_2]$ and let $u$ be the unit vector Restricting $p$ to $J$, Lagrange's Theorem guarantees the existence of $y \in (x$ This completes the proof of (102). We now turn our attention to (103).
Let $i \in \{1, 2\}$. It may be assumed without loss of generality that $p_i(0,\dots,0) = 0$ and that the ball $B$ is centered at the origin. Then, for given $x_2,\dots,x_d$ in $\mathbb{R}$, consider the polynomial in one variable $p(x) := p_i(x, x_2,\dots,x_d)$, which is of degree at most $l$. It follows from (100) that sup This together with the fact that .

Regarding (108), let $g = (u_1 \cdot f, u_2 \cdot f + u_0) \in G$. Also, let $u := (u_1, u_2)$ with $u_1, u_2 \in S^{n-1}$ and let $v := (v_1, v_2) \in S^1$. Furthermore, let $w$ denote the vector $w := v_1 u_1 + v_2 u_2$. Since by the definition of $G$, the vectors $u_1$ and $u_2$ are orthogonal, it follows that $w \in S^{n-1}$. Now observe that for any multi-index $\beta$ such that $|\beta| \le l$, By the definition of $s_0 = s(l; f, \tilde U^+)$, there exists $s \in S^{d-1}$ and 1 As usual, $x_0$ denotes here the centre of $U$. It follows that for any $x \in \tilde U^+$, we have that The same arguments as those used to prove (96) can be employed to show that This proves (108) with $c := s_0/2$.

We now turn our attention to (109). With $g$ and $u$ as above, first note that for any $x \in \tilde U^+$ and for any multi-index $\beta$ such that $|\beta| = l$, we have that Next, note that from the Cauchy-Schwarz inequality, we have that, for $i = 1, 2$, On combining (112) and (113), we find that Now, since $f$ satisfies Assumption 1, in view of (45), we obtain that Hence, for any $x, y \in \tilde U^+$ we have that $\|\partial_\beta g(x) - \partial_\beta g(y)\|_2 \le 2M\left(\|x - x_0\|_2 + \|y - x_0\|_2\right)$.
In view of (46), we also have that To see that this is so, take $v := (1, 0) \in S^1$. In view of (108), there exists a vector $u \in S^{d-1}$ and $1 \le k \le l$ such that Now, observe that, in view of (46), of the definition of $\tau$ and of the fact that $s_0 \le M$, we have that $\tau \le rM$. Consider the ball $B' \subset B = U$ with radius $\tau/(2M) \le r/2$ centred at $y$, where $y$ satisfies (115). Take a vector $v \in S^1$ orthogonal to $g(y)$. In view of (108), there exists a vector $u \in S^{d-1}$ and 1 (117) On the other hand, the upper bound (116) implies that sup The upshot of (117) and (118) is that we are able to apply [8, Lemma 4.2] to the map $g : B' \to \mathbb{R}^2$ to yield (84) and thereby complete the proof of part (b) of Proposition 2.
For ease of comparison, we point out that the quantities $a$, $\delta$ and $w$ appearing in the statement of [8, Lemma 4.2] correspond to $\tau$, $\tau/\sqrt{2}$ and the right-hand side of (117) respectively.

Corollary 3. Let $U \subset \mathbb{R}^d$ be a ball and $f = (f_1,\dots,f_n)$ be an $n$-tuple of $C^{l+1}$ functions satisfying Assumption 1. With reference to Proposition 1, let $\gamma := s_0$ and $\lambda := M$.

Part (a) of Proposition 1 is now a consequence of [8, Lemma 3.1(d)]. Regarding part (b), the proof is essentially the same as that of [8, Proposition 3.4(b)] with the constant $C$ replaced with the explicit constant $C_1$ given by (75).
1). Consider now the function $(x_1, x_d) \in \mathbb{R}^2 \mapsto f(x_1,\dots,x_{d-1}, x_d)$. Applying Lemma 6 to this function with $c = \mu \cdot \sigma(k, d-1)$ therein provides the existence of a rotation $S_2 : \mathbb{R}^d \to \mathbb{R}^d$ acting on the plane $(x_1, x_d)$ and leaving its orthogonal complement unchanged such that min

$\sup_{x \in B}\|\nabla p_1(x)\|_2,\ \sup_{x \in B}\|\nabla p_2(x)\|_2 \le 2l^2\sqrt{d}$ and therefore completes the proof of the lemma.

We now have all the ingredients in place to prove Proposition 2.

Proof of Proposition 2. The proposition is an explicit version of [8, Proposition 4.1]. Within our setup in which $G$ is given by (82), the starting point for the proof of part (a) of [8, Proposition 4.1] corresponds to the existence of positive constants $\delta$, $c$, and $\alpha$ with $0 < \delta < 1/8$ and $2C_{d,l}N_d\,\delta^{1/(d(2l-1)(2l-2))} \le 1$ (107) such that for every $g \in G$ one has $\forall\, v \in S^1\ \exists\, u \in S^{d-1}\ \exists\, k \le l : \inf_{x \in \tilde U}\left|v \cdot \frac{\partial^k g}{\partial u^k}(x)\right| \ge c$ (108) and $\sup_{x, y \in \tilde U}\|\partial_\beta g(x) - \partial_\beta g(y)\| \le \frac{\delta c \alpha}{8\xi\, l^l (l+1)!} = \frac{\delta c \alpha}{16\, l^{l+2} (l+1)!\sqrt{d}}$ (109) for all multi-indices $\beta$ with $|\beta| = l$. Here, $\xi = 2l^2\sqrt{d}$ is the quantity in the right-hand side of (103) and the real number $\alpha$ is required to be less than the constant appearing in the right-hand side of (102), that is, (statements (108) and (109) correspond exactly to [8, Eq. (4.5a) & Eq. (4.5b)]) with $V$ replaced by $\tilde U$. The proof of part (a) of Proposition 2 follows from the existence of the constants $\delta$, $c$, and $\alpha$ as established in the proof of part (a) of [8, Proposition 4.1]. It remains for us to show that, given the definition of $r$ in (46), it is indeed possible to choose such constants in such a way that the relations (107)-(110) hold. With this in mind, set $\delta := \eta$, where $\eta$ is defined by (47). It follows from the definition of $\eta$ and the well-known bound $N_d \le 5^d$ for the Besicovitch constant (cf. Remark 3, p. 16) that (107) is satisfied with $\delta = \eta$. We proceed with verifying (108) and (109).

Thus, on applying [8, Lemma 3.6] to the function $g_1$ and the ball $B$, we obtain that $\sup_{x, y \in B}|g_1(x) - g_1(y)| \ge \frac{r^l c}{l^l (l+1)!} = \frac{r^l s_0}{2\, l^l (l+1)!}$. This implies the existence of a point $y \in B$ such that $\|g(y)\|_2 \ge |g_1(y)| \ge \tau$ as claimed. Next, observe that for any $w \in S^{d-1}$ and any $x \in U$, using the Cauchy-Schwarz inequality, we obtain via (45) that

Thus, on applying [8, Lemma 3.6] to the function $x \mapsto v \cdot g(x)$ and the ball $B'$

Lemma 8. Let $B \subset \mathbb{R}^d$ be a ball of radius 1 and let $B_\infty$ denote the hypercube circumscribed around $B$ with edges parallel to the coordinate axes. Assume further that $p = (p_1, p_2) : B \to \mathbb{R}^2$ is a polynomial map of degree at most $l \ge 1$ such that