Deviation inequality for Banach-valued orthomartingales

We show a deviation inequality for multi-indexed martingales. We then provide applications to kernel regression for random fields and to convergence rates in the law of large numbers for orthomartingale difference random fields.

1 Deviation inequalities for orthomartingale difference random fields

Introduction, motivations and summary of the contribution of the paper
Bounding the tail of a random variable is a fundamental tool for measuring the rate of convergence of a collection of random elements, for example in the context of the strong law of large numbers. Such inequalities can also be used to check tightness criteria. Therefore, a lot of attention has been devoted to probability inequalities: in the independent case [Nag82, Hoe63], the mixing case (Chapter 6 in [Rio00]), for functions of an i.i.d. sequence [LXW13] or for martingales [Nag03, FGL17, FGL15]. In this paper, we focus on orthomartingale difference random fields, a special class of multi-indexed martingale difference random fields which allows one to exploit one-dimensional martingale difference properties and to use arguments based on induction on the dimension. When the increments form a strictly stationary random field, most of the main limit theorems have been investigated: the central limit theorem [Vol15, Vol19], quenched versions of the functional central limit theorem [PV20] and the law of the iterated logarithm [Gir21b]. However, the law of large numbers has not been investigated as much as the other limit theorems.
We bring the following contributions to the study of orthomartingale difference random fields.
1. We establish a deviation inequality for orthomartingale difference random fields taking values in a separable Banach space satisfying some smoothness assumptions. Note that we do not assume the random field to be identically distributed.
2. Since the random field under consideration can have any marginal distribution, provided that it possesses the orthomartingale structure, we can consider weighted sums of such random fields. This makes it possible to provide applications to regression models.
3. We also give an optimal sufficient condition for the law of large numbers of an identically distributed orthomartingale difference random field.

A deviation inequality for orthomartingale difference random fields
Given a real-valued martingale difference sequence $(D_j, \mathcal F_j)$, it is possible to control the tail of $\max_{1\le n\le N} \bigl|\sum_{j=1}^n D_j\bigr|$ by a functional of the tails of $\max_{1\le j\le N} |D_j|$ and those of the predictable quadratic variation $\sum_{j=1}^N \mathbb E\bigl[D_j^2 \mid \mathcal F_{j-1}\bigr]$, see [Nag03]. An analogous result has been obtained in [Gir19] for martingales with values in a smooth Banach space which may not have a finite moment of order two.
In order to define orthomartingales, we first need an order relation on $\mathbb Z^d$. It turns out that the most convenient one is the coordinatewise order. For $i = (i_\ell)_{\ell=1}^d$, $j = (j_\ell)_{\ell=1}^d \in \mathbb Z^d$, we say that $i \preccurlyeq j$ if for each $\ell \in \{1, \dots, d\}$, $i_\ell \le j_\ell$. Once this order is defined, we can introduce the concept of completely commuting filtrations.
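The coordinatewise order and coordinatewise minimum used above are easy to state in code; the following minimal Python sketch is our own illustration (the names `leq` and `cmin` are ours, not the paper's).

```python
# Coordinatewise partial order on Z^d and coordinatewise minimum,
# mirroring the definitions in the text.  Illustrative sketch only.

def leq(i, j):
    """i precedes j coordinatewise: i_l <= j_l for every l."""
    return all(a <= b for a, b in zip(i, j))

def cmin(i, j):
    """Coordinatewise minimum min{i, j} = (min{i_l, j_l})_l."""
    return tuple(min(a, b) for a, b in zip(i, j))

# (1, 2) precedes (2, 2), but (1, 3) and (2, 2) are incomparable,
# which is why min{i, j} need not be one of i, j.
print(leq((1, 2), (2, 2)), cmin((1, 3), (2, 2)))   # True (1, 2)
```

Note that the order is only partial: two indices can be incomparable, yet their coordinatewise minimum is always well defined.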
Definition 1.1. We say that a collection of σ-algebras $(\mathcal F_i)_{i\in\mathbb Z^d}$ is a completely commuting filtration if
1. for each $i, j \in \mathbb Z^d$ such that $i \preccurlyeq j$, $\mathcal F_i \subset \mathcal F_j$, and
2. for each $Y \in \mathbb L^1$ and each $i, j \in \mathbb Z^d$,
$$\mathbb E\bigl[\mathbb E[Y \mid \mathcal F_i] \mid \mathcal F_j\bigr] = \mathbb E\bigl[Y \mid \mathcal F_{\min\{i,j\}}\bigr], \quad (1.1)$$
where $\min\{i,j\}$ is the element of $\mathbb Z^d$ defined as the coordinatewise minimum of $i$ and $j$, that is, $\min\{i,j\} = (\min\{i_\ell, j_\ell\})_{\ell=1}^d$.
Let us give two examples of completely commuting filtrations.
Suppose that $(\mathcal F^{(\ell)}_{i^{(\ell)}})_{i^{(\ell)}\in\mathbb Z^{d_\ell}}$, $1 \le \ell \le L$, are completely commuting filtrations on a probability space $(\Omega, \mathcal F, \mathbb P)$, and that for each $i^{(1)} \in \mathbb Z^{d_1}, \dots, i^{(L)} \in \mathbb Z^{d_L}$, the σ-algebras $\mathcal F^{(1)}_{i^{(1)}}, \dots, \mathcal F^{(L)}_{i^{(L)}}$ are independent. Both examples were introduced in Section 1 of [CW75], but without proof. The first item is a direct consequence of Proposition 2, p. 1693, of [WW13].
We are now in a position to define orthomartingale difference random fields, which allow one to exploit the martingale property in every direction. To formalize this, we denote by $e_\ell$, $\ell \in \{1, \dots, d\}$, the element of $\mathbb Z^d$ whose $\ell$-th coordinate is 1 and whose other coordinates are zero.
Definition 1.3. Let $(X_i)_{i\in\mathbb Z^d}$ be a random field taking values in a separable Banach space $(B, \|\cdot\|_B)$. We say that $(X_i)_{i\in\mathbb Z^d}$ is an orthomartingale difference random field with respect to the completely commuting filtration $(\mathcal F_i)_{i\in\mathbb Z^d}$ if for each $i \in \mathbb Z^d$, $X_i$ is $\mathcal F_i$-measurable and for each $\ell \in \{1, \dots, d\}$, $\mathbb E[X_i \mid \mathcal F_{i - e_\ell}] = 0$.
Such a definition is very convenient because summation over a rectangular region of $\mathbb Z^d$ can be treated with martingale properties when summing along a fixed coordinate.
This allows one to use induction arguments. For example, it can be shown, using arguments similar to those of the proof of Lemma 2.2 in [KVW16] and combining an induction argument with Theorem 2.1 in [Rio09], that a maximal inequality holds. Moreover, it has been shown in [Faz05] that for each $p \ge 1$ and each real-valued orthomartingale difference random field,
$$c_1(p,d)\, \mathbb E\Bigl[\Bigl(\sum_{1 \preccurlyeq i \preccurlyeq N} X_i^2\Bigr)^{p/2}\Bigr] \le \mathbb E\Bigl[\max_{1 \preccurlyeq n \preccurlyeq N} \Bigl|\sum_{1 \preccurlyeq i \preccurlyeq n} X_i\Bigr|^p\Bigr] \le c_2(p,d)\, \mathbb E\Bigl[\Bigl(\sum_{1 \preccurlyeq i \preccurlyeq N} X_i^2\Bigr)^{p/2}\Bigr] \quad (1.3)$$
for some constants $c_1(p,d)$ and $c_2(p,d)$ depending only on $p$ and $d$; this extends the two-dimensional result obtained in [M 78]. This suggests that the tails of the partial sums of an orthomartingale difference random field can be controlled by a functional of the tail of the sum of squares. We plan to formulate such an inequality for Banach-valued orthomartingale difference random fields. Some assumptions on the geometry of the Banach space are required.
Definition 1.4. Let $(B, \|\cdot\|_B)$ be a separable Banach space. We say that $B$ is $r$-smooth for $1 < r \le 2$ if there exists an equivalent norm $\|\cdot\|$ and a constant $K$ such that $\|x+y\|^r + \|x-y\|^r \le 2\|x\|^r + K\|y\|^r$ for all $x, y \in B$.
For example, if $\mu$ is σ-finite on the Borel σ-algebra of $\mathbb R$, then $L^p(\mathbb R, \mu)$ is $\min\{p,2\}$-smooth. Moreover, a separable Hilbert space is 2-smooth.
Definition 1.5 (Martingales in Banach spaces). Let $(B, \|\cdot\|_B)$ be a separable Banach space. We say that a sequence $(D_i)_{i\ge1}$ is a martingale difference sequence with respect to a filtration $(\mathcal F_i)_{i\ge0}$ if each $D_i$ is $\mathcal F_i$-measurable and $\mathbb E[D_i \mid \mathcal F_{i-1}] = 0$.
By [Ass75], we know that if $B$ is a separable $r$-smooth Banach space, then there exists a constant $C$ such that for each martingale difference sequence $(D_i)_{i\ge1}$ with values in $B$ and each $n$,
$$\mathbb E\Bigl\|\sum_{i=1}^n D_i\Bigr\|_B^r \le C \sum_{i=1}^n \mathbb E\|D_i\|_B^r. \quad (1.5)$$
By definition, an $r$-smooth Banach space is also $p$-smooth for $1 < p \le r$; hence it is possible to define
$$C_{p,B} = \sup \frac{\mathbb E\bigl\|\sum_{i=1}^n D_i\bigr\|_B^p}{\sum_{i=1}^n \mathbb E\|D_i\|_B^p}, \quad (1.6)$$
where the supremum runs over $n \ge 1$ and over the set $\Delta_n$ of $B$-valued martingale difference sequences of length $n$ that are not identically 0.
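As an illustration of (1.5) in the simplest smooth setting, the following Python sketch (ours, not from the paper) checks numerically that for martingale differences in the Hilbert space $\mathbb R^2$ built from independent Rademacher signs, the inequality holds with equality and $C = 1$: the increments are orthogonal in $L^2$, and the expectation can be computed exactly by enumerating all sign patterns.

```python
# Numerical illustration (ours) of (1.5) in the 2-smooth (Hilbert) case:
# for martingale differences D_i = eps_i * v_i with independent Rademacher
# signs eps_i and fixed vectors v_i in R^2,
#     E||sum D_i||^2 = sum E||D_i||^2
# by orthogonality of the increments.  We average exactly over all 2^3
# equally likely sign patterns, so no random sampling is needed.

from itertools import product

def sq_norm(x):
    return sum(t * t for t in x)

v = [(1.0, 0.0), (2.0, 1.0), (0.5, -3.0)]   # arbitrary fixed vectors

lhs = 0.0
for signs in product((-1.0, 1.0), repeat=len(v)):   # all sign patterns
    s = [sum(e * vi[k] for e, vi in zip(signs, v)) for k in range(2)]
    lhs += sq_norm(s)
lhs /= 2 ** len(v)                                  # exact expectation

rhs = sum(sq_norm(vi) for vi in v)                  # sum of E||D_i||^2
print(lhs, rhs)   # both equal 15.25: cross terms cancel over the signs
```

In a genuinely $r$-smooth space with $r < 2$ the equality becomes the inequality (1.5) with a constant $C$ depending on the space.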
A key tool for proving deviation and moment inequalities for martingale difference sequences is a so-called "good-λ inequality", that is, an inequality of the form $\mathbb P(Y > \beta\lambda,\ Z \le \delta\lambda) \le \varepsilon(\beta, \delta)\, \mathbb P(Y > \lambda)$. Such an approach was used to derive Burkholder's inequality (see [Bur73, JS88] in the real-valued case and [DLP13, Gir19] in the Banach-valued case). Usually, a way to obtain a good-λ inequality is to introduce a martingale transform of the original martingale based on stopping times, where the increments are controlled, as well as the indices where the maximum lies between λ and βλ. Unfortunately, such a method does not seem to be applicable in the context of multi-indexed martingales, essentially because there is no proper generalization of stopping times and martingale transforms.
To overcome this problem, we propose an approach by induction on the dimension. In the one-dimensional case, we control the tail of the maximum of partial sums by the sum of the $p$-th powers of the increments (see Proposition A.6). This will also be used for the induction step; the resulting sum of powers of norms will be viewed as the norm of an orthomartingale difference random field indexed by $\mathbb Z^{d-1}$, taking values in a modified version of the original Banach space which shares its smoothness properties.
We thus get a control of the tail of the maximum of the partial sums over rectangles by a function of the tail of the sum of p-th powers of norms of the increments, which is our first result.
Theorem 1.6. Let $1 < r \le 2$ and let $(B, \|\cdot\|_B)$ be a separable $r$-smooth Banach space. For each $p \in (1, r]$, $q > p$ and $d \ge 1$, there exists a function $f_{p,q,d} : \mathbb R_+ \to \mathbb R_+$ such that if $(X_i)_{i\in\mathbb Z^d}$ is an orthomartingale difference random field with respect to a completely commuting filtration $(\mathcal F_i)_{i\in\mathbb Z^d}$, taking values in $B$, then for each $x > 0$, the inequality (1.8) holds.
Let us make some comments about this result. First, observe that the right-hand side of (1.8) is finite if and only if a suitable integrability condition holds; in that case, one can replace the right-hand side of (1.8) by (1.9). For $s > p$, multiplying (1.8) by $s t^{s-1}$ (with $q = s + 1$) and integrating over the positive real line gives the corresponding moment inequality. This gives a partial generalization of the result of [Faz05], since we provide an analogue of the second inequality in (1.3) for Banach-valued orthomartingale difference random fields. One can generalize Theorem 1.13 of [Gir19] to stochastically dominated orthomartingale difference random fields. To state it, we need the following order on random variables: we say that $X \preceq_{\mathrm{conv}} Y$ for two real-valued random variables $X$ and $Y$ if for each convex increasing function $\varphi$, $\mathbb E[\varphi(X)] \le \mathbb E[\varphi(Y)]$.
Corollary 1.7. Let $1 < r \le 2$ and let $(B, \|\cdot\|_B)$ be a separable $r$-smooth Banach space. For each $p \in (1, r]$, $q > p$ and $d \ge 1$, there exists a function $f_{p,q,d} : \mathbb R_+ \to \mathbb R_+$ such that if $(X_i)_{i\in\mathbb Z^d}$ is an orthomartingale difference random field with respect to a completely commuting filtration $(\mathcal F_i)_{i\in\mathbb Z^d}$, taking values in $B$, and such that there exists a real-valued random variable $V$ dominating the increments in the above convex order, then for each $q > 0$ and $x > 0$, the inequality (1.12) holds.
The functions $f_{p,q,d}$ involved in (1.12) are bigger than those in (1.8). Moreover, the second term of the right-hand side contains a power $d$ instead of $d - 1$, which is due to the combination with another tail inequality under convex ordering (see (A.45)). In some applications, this will play a role.
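The passage from the tail bound to the moment inequality rests on the classical identity $\mathbb E[Y^s] = \int_0^\infty s\, t^{s-1}\, \mathbb P(Y > t)\, dt$ for a non-negative random variable $Y$. The following Python sketch (ours) checks this identity numerically for an exponential random variable, where both sides are known in closed form.

```python
# Sanity check (ours) of the tail-to-moment identity used above:
#     E[Y^s] = int_0^inf s * t^(s-1) * P(Y > t) dt,   Y >= 0.
# For Y ~ Exp(1), P(Y > t) = exp(-t) and E[Y^s] = Gamma(s + 1).

import math

def moment_from_tail(tail, s, upper=60.0, steps=200_000):
    """Midpoint-rule integration of s * t^(s-1) * tail(t) on (0, upper)."""
    h = upper / steps
    total = 0.0
    for k in range(1, steps + 1):
        t = (k - 0.5) * h
        total += s * t ** (s - 1) * tail(t)
    return total * h

s = 2.5
approx = moment_from_tail(lambda t: math.exp(-t), s)
exact = math.gamma(s + 1)
print(approx, exact)   # both close to Gamma(3.5)
```

The truncation at `upper=60` is harmless here because the exponential tail beyond that point is negligible.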

Application to regression models
We consider the following fixed-design regression model:
$$Y_i = g(i/n) + X_i, \quad i \in \Lambda_n,$$
where $g : [0,1]^d \to \mathbb R$ is an unknown smooth function and $(X_i)_{i\in\mathbb Z^d}$ is an orthomartingale difference random field. Let $K$ be a probability kernel defined on $\mathbb R^d$ and let $(h_n)_{n\ge1}$ be a sequence of positive numbers which converges to zero and which satisfies $\lim_{n\to+\infty} n h_n = +\infty$. We estimate the function $g$ by the kernel estimator $g_n$ defined in (2.3), where $\Lambda_n = \{1, \dots, n\}^d$ (2.4). We make the following assumptions on the regression function $g$ and the probability kernel $K$:
(A2) there exist positive constants $c$ and $C$ such that the stated two-sided bound holds;
(A3) there exists a positive constant $C$ such that the absolute values of all the first-order derivatives of $g$ are bounded by $C$.
Assumption (A3) will not be used in the following result. However, by Proposition 1 in [EM07], it guarantees a uniform control of the bias, where $\mathrm{Lip}(K)$ denotes the collection of all $K$-Lipschitz functions on $\mathbb R^d$.
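To fix ideas, here is a one-dimensional, fixed-design kernel estimator sketched in Python. The exact form of (2.3) is not reproduced; the Priestley–Chao-type normalization and the triangular kernel below are our own illustrative choices.

```python
# One-dimensional sketch (ours) of a fixed-design kernel estimator in the
# spirit of g_n: design points i/n, bandwidth h, Priestley-Chao weights.

def kernel_estimator(y, x, h):
    """Estimate g(x) from observations y[i] = g((i+1)/n) + noise."""
    n = len(y)
    K = lambda u: max(0.0, 1.0 - abs(u))   # triangular kernel on [-1, 1]
    return sum(y[i] * K((x - (i + 1) / n) / h) for i in range(n)) / (n * h)

# Noise-free check: with g(x) = x, the estimate at an interior point
# should be close to g(x) once n*h is large and h is small.
n, h = 2000, 0.05
y = [(i + 1) / n for i in range(n)]        # g(x) = x, no noise
est = kernel_estimator(y, 0.5, h)
print(est)   # close to 0.5
```

With noise given by (ortho)martingale differences instead of zeros, the deviation of $g_n(x)$ around its mean is exactly the kind of weighted sum controlled by Corollary 1.7.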
Theorem 2.1. Let $p > 1$, let $(X_i)_{i\in\mathbb Z^d}$ be an identically distributed real-valued orthomartingale difference random field and let $g_n : [0,1]^d \to \mathbb R$ be given by (2.3). Assume that (A1) and (A3) hold. For each positive $t$, the following inequalities take place:
• for $1 < p \le 2$, the first stated bound;
• for $p > 2$, the second stated bound.
Note that in the case $1 < p \le 2$, the assumptions that $n h_n \to \infty$ and that the corresponding moment is finite suffice to guarantee that the right-hand side of (2.6) goes to 0 as $n$ goes to infinity. However, in the case $p > 2$, the extra term $n^{d\frac{2-p}{2p}}$ imposes a restriction on the choice of the bandwidth.

Law of large numbers for sums of Banach-valued orthomartingale difference random fields over rectangles
In this subsection, we deal with convergence rates for orthomartingale difference random fields.
Although our first result is not a consequence of Theorem 1.6, it gives a necessary and sufficient condition for the Marcinkiewicz strong law of large numbers to take place in a smooth Banach space.
In order to state it, we need to introduce the following norm. For $p \ge 1$ and $q \ge 0$, denote by $\|\cdot\|_{p,q}$ the Orlicz norm associated with the Young function $\varphi_{p,q} : t \in (0,\infty) \mapsto t^p (1 + |\log t|)^q$, that is,
$$\|Y\|_{p,q} = \inf\bigl\{c > 0 : \mathbb E\bigl[\varphi_{p,q}(|Y|/c)\bigr] \le 1\bigr\}. \quad (2.8)$$
For $q = 0$, $\|\cdot\|_{p,0}$ reduces to the classical $\mathbb L^p$-norm and will simply be denoted by $\|\cdot\|_p$.
Theorem 2.2. Let $(B, \|\cdot\|_B)$ be a separable $r$-smooth Banach space for some $r \in (1, 2]$, $1 < p < r$ and $d \in \mathbb N$. There exists a constant $K_{p,d,B}$ such that the following holds: if $(X_i)_{i\in\mathbb Z^d}$ is an identically distributed orthomartingale difference random field such that $\|X_1\|_B \in \mathbb L^{p,d-1}$, then for all positive $x$, the inequality (2.9) holds, where $S_j = \sum_{1 \preccurlyeq i \preccurlyeq j} X_i$. In particular, (2.10) holds for some constant $C_{p,d}$ depending only on $p$ and $d$, and the following convergence takes place:
$$\lim \frac{\|S_n\|_B}{|n|^{1/p}} = 0 \quad \text{almost surely.} \quad (2.11)$$
Note that the condition $\|X_1\|_B \in \mathbb L^{p,d-1}$ cannot be removed, not even in the independent identically distributed case (see the theorem on p. 165 in [Smy73] for the normalization by $|n|$ instead of $|n|^{1/p}$, and [Gut78] for the latter).
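The Orlicz norm (2.8) can be computed numerically by bisection, since $c \mapsto \mathbb E[\varphi_{p,q}(|Y|/c)]$ is non-increasing in $c$. The following Python sketch (ours, for a random variable uniform on a finite list of values) illustrates this; for $q = 0$ it recovers the usual $\mathbb L^p$-norm.

```python
# Numerical sketch (ours) of the Orlicz norm (2.8) for the Young function
# phi_{p,q}(t) = t^p (1 + |log t|)^q and a finitely supported random
# variable with equally likely values:
#     ||Y||_{p,q} = inf{ c > 0 : E[ phi_{p,q}(|Y|/c) ] <= 1 }.

import math

def phi(t, p, q):
    return 0.0 if t == 0 else t ** p * (1 + abs(math.log(t))) ** q

def orlicz_norm(values, p, q, tol=1e-9):
    expect = lambda c: sum(phi(abs(v) / c, p, q) for v in values) / len(values)
    lo, hi = tol, 1.0
    while expect(hi) > 1:          # bracket the infimum from above
        hi *= 2
    for _ in range(200):           # bisection on the bracket [lo, hi]
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if expect(mid) > 1 else (lo, mid)
    return hi

# A constant |Y| = 2 has norm 2 for every q, since phi_{p,q}(1) = 1.
print(orlicz_norm([2.0, 2.0], p=3, q=0))   # ~ 2.0
```

Observe that $\varphi_{p,q}(1) = 1$ for every $p$ and $q$, so a constant random variable has Orlicz norm equal to its absolute value, which gives a convenient sanity check.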
Note that the convergence in (2.11) holds as soon as only one of the coordinates of $n$ goes to infinity, uniformly with respect to the other coordinates of $n$; for example, if $d = 2$, we have (2.12). We now complete this section by giving results in the spirit of those obtained in [Gir19]. Let $(X_i)_{i\in\mathbb Z^d}$ be an i.i.d. real-valued random field. Theorem 4.1 in [Gut78] gives the equivalence between the following two assertions for $\alpha > 1/2$ and $p \ge \max\{1/\alpha, 1\}$:
1. $X_1$ belongs to $L^p \log^{d-1} L$;
2. for each positive $\varepsilon$, (2.13) holds.
Deviation inequalities have been used in [KL11, Lag16] for the question of complete convergence of orthomartingale difference random fields.
Theorem 2.3. Let $B$ be a separable $r$-smooth Banach space. For each identically distributed $B$-valued orthomartingale difference random field $(X_i)_{i\in\mathbb Z^d}$, each positive $\varepsilon$ and each $\alpha \in (1/r, 1]$, the inequality (2.14) takes place.
Remark 2.4. One could also formulate the corresponding result where $r$ is replaced in (2.14) by $1 < p < r$. But this could be established in a more general context than ours, namely, that of stochastically dominated orthomartingale difference random fields, by using truncation arguments as in [DM07].
Theorem 2.5. Let $B$ be a separable $r$-smooth Banach space and $s > r$. For each identically distributed $B$-valued orthomartingale difference random field $(X_i)_{i\in\mathbb Z^d}$, each positive $\varepsilon$ and each $\alpha \in (1/r, 1]$, the inequality (2.15) takes place.
Remark 2.6. On the one hand, the results in [Lag16] require boundedness of the conditional moments, whereas we do not. On the other hand, their result does not require that $(|X_i|)_{i\in\mathbb Z^d}$ is identically distributed; hence the results are not directly comparable.

Law of large numbers for weighted sums of orthomartingale difference random fields
In this subsection, we study the convergence rates in the law of large numbers for identically distributed orthomartingale difference random fields. As we work with Banach space-valued random variables, we may consider sums involving bounded linear operators from a Banach space $B$ to itself.
Theorem 2.7. Let $(B, \|\cdot\|_B)$ be a separable $r$-smooth Banach space for some $r \in (1, 2]$. For $i \in \mathbb Z^d$ and $n \ge 1$, let $A_{n,i} : B \to B$ be a bounded linear operator and denote its operator norm by $\|A_{n,i}\|_{\mathcal B(B)}$. Let $(X_i)_{i\in\mathbb Z^d}$ be an identically distributed $B$-valued orthomartingale difference random field and assume that $X_1 \in \mathbb L^s$ for some $s > p$. Then for each positive $\varepsilon$ and each increasing sequence of positive numbers $(R_n)_{n\ge1}$, the inequality (2.16) holds.
Let us give an example where Theorem 2.7 can be used. Suppose that $(\Lambda_n)_{n\ge1}$ is a sequence of subsets of $\mathbb Z^d$ such that $(\mathrm{Card}(\Lambda_n))_{n\ge1}$ forms an increasing sequence (note that we do not assume the sequence $(\Lambda_n)_{n\ge1}$ to be increasing). Let $A_{n,i}$ be the identity operator if $i$ belongs to $\Lambda_n$ and $A_{n,i} = 0$ otherwise. For a positive $\gamma$, define $R_n = \mathrm{Card}(\Lambda_n)^\gamma$. Then (2.16) reads (2.17).

3 Proofs

Proof of Proposition 1.2
As pointed out before, it suffices to prove the second item. We will use the following lemma.
Lemma 3.1. Let $\mathcal G_\ell$, $1 \le \ell \le L$, be independent sub-σ-algebras of $\mathcal F$, where $(\Omega, \mathcal F, \mathbb P)$ is a probability space. For each $\ell \in \{1, \dots, L\}$, let $\mathcal G'_\ell$ be a sub-σ-algebra of $\mathcal G_\ell$ and let $A_\ell \in \mathcal G_\ell$. Then
$$\mathbb E\Bigl[\prod_{\ell=1}^L \mathbf 1_{A_\ell} \,\Big|\, \bigvee_{\ell=1}^L \mathcal G'_\ell\Bigr] = \prod_{\ell=1}^L \mathbb E\bigl[\mathbf 1_{A_\ell} \mid \mathcal G'_\ell\bigr]. \quad (3.1)$$
Proof. Note that the random variable on the right-hand side of (3.1) is measurable with respect to $\bigvee_{\ell=1}^L \mathcal G'_\ell$; hence it suffices to check (3.3) for each element of a generating family. The left-hand side of (3.3) is handled by the independence of the σ-algebras, and another use of this independence gives (3.3) and finishes the proof of Lemma 3.1.
Let $i, j \in \mathbb Z^{d_1 + \dots + d_L}$ be of the form $i = (i^{(\ell)})_{1\le\ell\le L}$ and $j = (j^{(\ell)})_{1\le\ell\le L}$ with $i^{(\ell)}, j^{(\ell)} \in \mathbb Z^{d_\ell}$. Since item 1 of Definition 1.1 is clear, it remains to check (1.1). To do so, it suffices to check it when $Y$ is $\mathcal F_i$-measurable. By a standard approximation argument and Dynkin's theorem, it suffices to do it when $Y = \prod_{\ell=1}^L \mathbf 1_{A_\ell}$. Considering $\max\{i, j\}$, where the maximum is taken coordinatewise, we get the claimed identity, where the second equality uses the commutativity of the filtrations and the third one follows from another use of Lemma 3.1, this time with $\mathcal G'_\ell = \mathcal F_{\min\{i^{(\ell)}, j^{(\ell)}\}}$. This ends the proof of Proposition 1.2.

Proof of Theorem 1.6
We will proceed by induction over the dimension $d$. We will actually show the following assertion $\mathcal A(d)$ by induction: "for each $1 < p \le 2$ and each $q > p$, there exists a function $f_{p,q,d} : (0,\infty) \to (0,\infty)$ such that if $(B, \|\cdot\|_B)$ is a separable Banach space for which $C_{p,B}$ defined as in (1.6) is finite and $(X_i)_{i\in\mathbb Z^d}$ is an orthomartingale difference random field taking values in $B$, then the announced deviation inequality holds." For an $r$-smooth Banach space, the constant $C_{p,B}$ is finite for all $p \in (1, r]$; hence $\mathcal A(d)$ contains the statement of Theorem 1.6.

The case d = 1
The statement of $\mathcal A(1)$ is exactly that of Proposition A.6, whose proof is given right after its statement.

Induction step
We will denote by $i$ the elements of $\mathbb Z^d$ and by $(i, i_{d+1})$ (and similarly for other letters) the elements of $\mathbb Z^{d+1}$. We assume that $\mathcal A(d)$ holds. This means that for each $1 < p \le 2$, $q > p$ and each separable Banach space $(B, \|\cdot\|_B)$ for which $C_{p,B}$ defined as in (1.6) is finite, there exists a function $f_{p,q,d} : (0,\infty) \to (0,\infty)$ such that if $(X_i)_{i\in\mathbb Z^d}$ is a $B$-valued orthomartingale difference random field with respect to a completely commuting filtration $(\mathcal F_i)_{i\in\mathbb Z^d}$, then for each $x > 0$, (3.9) holds.
Let $B$ be such a Banach space and let $1 < p \le 2$ and $q > p$. In order to complete the induction step, we have to find a function $f_{p,q,d+1}$ such that if $(X_{i,i_{d+1}})_{i\in\mathbb Z^d, i_{d+1}\in\mathbb Z}$ is an orthomartingale difference random field with respect to the completely commuting filtration $(\mathcal F_{i,i_{d+1}})_{i\in\mathbb Z^d, i_{d+1}\in\mathbb Z}$, then (3.10) holds. Let $(X_{i,i_{d+1}})_{i\in\mathbb Z^d, i_{d+1}\in\mathbb Z}$ and $(\mathcal F_{i,i_{d+1}})_{i\in\mathbb Z^d, i_{d+1}\in\mathbb Z}$ be such a random field and such a filtration. Let $N \in \mathbb N^d$ be fixed with $N \succcurlyeq 1$. The orthomartingale difference property gives that $(Y_{n_{d+1}}, \mathcal F_{n_{d+1}})$ is a non-negative sub-martingale. Therefore, by Lemma A.4, we derive a bound on the tail of the corresponding maximum. Then we use the induction assumption $\mathcal A(d)$ in this setting. The control of the remaining tail will be done by another use of the martingale property, summing over the $(d+1)$-th coordinate. More precisely, define the separable Banach space $(\tilde B, \|\cdot\|_{\tilde B})$ accordingly. Observe that if $(D_k)_{k=1}^n$ is a martingale difference sequence taking its values in $\tilde B$, then, denoting by $D_{k,i}$, $1 \preccurlyeq i \preccurlyeq N$, the coordinates of $D_k$, we obtain the corresponding bound. Combining (3.16) with (3.24) gives the announced estimate. Using Lemma A.3 twice ends the proof of Theorem 1.6.

Proof of Corollary 1.7
We have to bound the tail of $\sum_{1 \preccurlyeq i \preccurlyeq N} \|X_i\|_B^p$ in terms of the tail of $V$. To do so, we apply successively Theorem 1.6, (A.17), (A.46) and Lemma A.3 in order to derive the corresponding tail bound, and then we apply (A.17) once more. This concludes the proof of Corollary 1.7.
Proof of Theorem 2.1
Let $n \ge 1$ be fixed and define, for $i \in \Lambda_n$, a $B$-valued random variable $X_i$. We will now show that, for $p' = \min\{p, 2\}$, a bound of the announced form holds, where $\kappa(K, p)$ depends only on $K$ and $p$. Define the coefficients $\alpha_{i,p}$ as in (3.32) and observe the corresponding identity. Let $\varphi : \mathbb R \to \mathbb R$ be a convex increasing function. Then, denoting $A_p = \sum_{i\in\Lambda_n} \alpha_{i,p}$, convexity and the identical distribution of the $X_i$ imply the required comparison. It remains to bound the term $A_p = \sum_{i\in\Lambda_n} \alpha_{i,p}$. First assume that $p \le 2$; then $p' = p$ and, since $K$ is supported on $[-1, 1]^d$, by assumptions (A1) and (A2) we derive the desired estimate. An application of Corollary 1.7 with $p$ replaced by $p' = \min\{p, 2\}$ then allows us to conclude.

Proof of the results of Subsection 2.2
Proof of Theorem 2.2. Let us prove (2.9) for $x = 1$; the general case follows by applying this case to $X_i / x$. We define for all $n \ge 1$ the event $A_n$, and for $1 \preccurlyeq i \preccurlyeq 2^n$ the truncated random variables (3.44). We denote by $S'_n$ and $S''_n$ the respective partial sums. Since $(X_i)_{i\in\mathbb Z^d}$ is an orthomartingale difference random field with respect to the filtration $(\mathcal F_i)_{i\in\mathbb Z^d}$, the equality (3.46) holds. By Chebyshev's inequality, and since $(X'_i)_{i\in\mathbb Z^d}$ is also an orthomartingale difference random field with respect to the filtration $(\mathcal F_i)_{i\in\mathbb Z^d}$, Doob's inequality and Proposition A.1 give the first bound. Moreover, since the random variable $X_i$ has the same distribution as $X_1$, we arrive at the bound (3.52). In order to bound $\mathbb P(A''_n)$, we use Markov's inequality together with a bound on the maximum. Another use of the fact that $X_i$ has the same distribution as $X_1$ leads to (3.54). Combining (3.52) with (3.54), we obtain (3.55). The number of elements of $\mathbb N^d$ whose sum is $k$ does not exceed $(k+1)^{d-1}$, hence (3.56) holds. Now, (2.9) follows from (A.47) and (A.48).
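The counting bound invoked just before (3.56), namely that the number of elements of $\mathbb N^d$ with coordinate sum $k$ is at most $(k+1)^{d-1}$, can be checked by direct enumeration: the exact count is the binomial coefficient $\binom{k+d-1}{d-1}$. A small Python verification (ours):

```python
# Verification (ours) of the counting bound: the number of d-tuples of
# non-negative integers with coordinate sum k equals C(k+d-1, d-1),
# which is at most (k+1)^(d-1).

from itertools import product
from math import comb

def count_sum(d, k):
    """Brute-force count of tuples in N^d (N containing 0) summing to k."""
    return sum(1 for t in product(range(k + 1), repeat=d) if sum(t) == k)

for d in (1, 2, 3, 4):
    for k in range(6):
        c = count_sum(d, k)
        assert c == comb(k + d - 1, d - 1)
        assert c <= (k + 1) ** (d - 1)
print("bound verified for d <= 4, k <= 5")
```

For $d = 2$ the bound is sharp (there are exactly $k+1$ pairs summing to $k$), which is why the exponent $d-1$ cannot be improved.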
In order to prove (2.10), observe that (2.9) entails that for any positive t, hence for all t and all positive R, In particular, for all R > X p,d−1 , sup t>0 which gives (2.10).
In order to prove (2.11), we define for 1 j d the random variable Then the combination of (2.9) with the Borel-Cantelli lemma gives that M j,p,N → 0 almost surely.This ends the proof of Theorem 2.2.
Proof of Theorem 2.3. In what follows, $C(r, d, B)$ will denote a constant that depends only on $r$, $d$ and $B$ and that may change from line to line. Observe that, partitioning $(\mathbb N \setminus \{0\})^d$ into rectangles of the form $\{n \in \mathbb N^d : \text{for each } \ell \in \{1, \dots, d\},\ 2^{N_\ell} \le n_\ell \le 2^{N_\ell + 1} - 1\}$, it suffices to prove the corresponding dyadic estimate.

Proof of Theorem 2.7
For a fixed $n$, let $\tilde X_i = A_{n,i}(X_i)$. By linearity of $A_{n,i}$, $(\tilde X_i)_{i\in\mathbb Z^d}$ is an orthomartingale difference random field. Moreover, the inequality (3.69) takes place. Indeed, let $\varphi : \mathbb R \to \mathbb R$ be a convex non-decreasing function. Using the fact that $\varphi$ is non-decreasing and the elementary bound $\|Ax\|_B \le \|A\|_{\mathcal B(B)} \|x\|_B$, we derive the pointwise comparison; then convexity of $\varphi$ and the fact that the random variables $\|X_i\|_B^p$, $i \in \mathbb Z^d$, have the same distribution give (3.69).
We are now in a position to apply Corollary 1.7 with $q = s + 1$. From elementary (in)equalities and since $s > p$, the last integral is convergent. This ends the proof of Theorem 2.7.
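The dyadic partition used in the proof of Theorem 2.3 relies on the fact that every positive integer $n$ lies in exactly one block $[2^N, 2^{N+1} - 1]$, with $N = \lfloor \log_2 n \rfloor$; in dimension $d$, the rectangles are products of such blocks. A quick Python sketch (ours):

```python
# Illustration (ours) of the dyadic blocks behind the partition of
# (N \ {0})^d: each positive integer n belongs to exactly one interval
# [2^N, 2^(N+1) - 1], where N = floor(log2(n)).

from collections import Counter

def dyadic_block(n):
    """Return N such that 2^N <= n <= 2^(N+1) - 1."""
    return n.bit_length() - 1

# Block N is hit by exactly 2^N integers, and the blocks tile 1..63.
hits = Counter(dyadic_block(n) for n in range(1, 1 << 6))
print(hits)   # block N contains 2^N integers n
```

Summing a geometric quantity block by block is what turns the estimate over dyadic rectangles into the estimate over all of $(\mathbb N \setminus \{0\})^d$.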

A.3 Tail inequalities
In order to state the tail inequalities, we will make use of the operators $T_{p,q,d}$ given in Definition A.2. To ease notation, we write $\mathrm{Tail}(Y)$ for the tail function of a non-negative random variable $Y$, that is, $\mathrm{Tail}(Y) : t \mapsto \mathbb P(Y > t)$ (A.16).
Proposition A.6. Let $(X_i)_{i\ge1}$ be a $B$-valued martingale difference sequence with respect to the filtration $(\mathcal F_i)_{i\in\mathbb Z}$. Then for each $1 < p \le r$, $q > p$ and $x > 0$, the stated one-dimensional deviation inequality holds.
Proof of Proposition A.6. Let $(B, \|\cdot\|_B)$ be a separable $r$-smooth Banach space, where $1 < r \le 2$.
According to Theorem 1.3 in [Gir19], for each $1 < p \le r$, each $q > 0$, $t > 0$ and each $B$-valued martingale difference sequence $(X_i, \mathcal F_i)_{i\ge1}$, an inequality of the form (A.43) holds, with constants $\frac{2^q}{2^q - 1} q 2^{-p} T_{-\infty,q,0}$ and $\frac{2^{q-p}}{2^q - 1} q T_{0,q,0}$ applied to the relevant tails. Therefore, we are reduced to bounding the last term of (A.43) by another one involving the tails of $\sum_{i=1}^n \|X_i\|_B^p$. This is done with the help of Lemma A.5, used in the following setting: $Y_i = \|X_i\|_B$, $q$ replaced by $q - p$ and $t$ replaced by $2^{-p-q} C_{p,B}^{-1} t^p u^p$. This allows us to bound $A_2$ by $K(p, q, C_{p,B})\, T_{0,q-p,0} \circ T_{1,q,0}$ applied to the tail, where the equality comes from (A.17).