A LOCAL LIMIT THEOREM FOR SUMS OF INDEPENDENT RANDOM VECTORS

We prove a local limit theorem for sums of independent random vectors satisfying appropriate tightness assumptions. In particular, the local limit theorem holds in dimension 1 if the summands are uniformly bounded.


The main result
A classical Local Limit Theorem says that the distribution of a sum of i.i.d. random variables, considered at a small scale, is approximately invariant with respect to translations by a large subgroup of $\mathbb{R}^d$. Several authors have addressed a generalization of this result to non-identically distributed terms (see e.g. [1,2,4,5,6,7,8,9,11] and references therein). Here we show that a reasonable theory can be obtained if we impose appropriate tightness assumptions on the individual summands. Consider a sum $S_N = \sum_{j=1}^{N} X_j$, where the $X_j$ are independent $\mathbb{R}^d$-valued random variables such that $E(X_j) = 0$ and there exists a constant $\varepsilon_0 > 0$ such that for each $s \in \mathbb{R}^d$, $E(\langle X_j, s \rangle^2) \ge \varepsilon_0 |s|^2$.
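For zero-mean $X_j$ we have $E(\langle X_j, s \rangle^2) = s^T \mathrm{Cov}(X_j) s$, so the non-degeneracy assumption says that the smallest eigenvalue of each covariance matrix is at least $\varepsilon_0$. A minimal sketch for $d = 2$ follows; the function names and the sample matrices are illustrative assumptions, not part of the paper.

```python
import math

def min_eigenvalue_2x2(a: float, b: float, c: float) -> float:
    """Smallest eigenvalue of the symmetric matrix [[a, b], [b, c]]."""
    return (a + c) / 2.0 - math.sqrt(((a - c) / 2.0) ** 2 + b * b)

def satisfies_lower_bound(cov, eps0: float) -> bool:
    """Check E(<X, s>^2) >= eps0 * |s|^2 for all s, given cov = (a, b, c)."""
    a, b, c = cov
    return min_eigenvalue_2x2(a, b, c) >= eps0

# Hypothetical covariance matrices, written as (Var X1, Cov(X1, X2), Var X2).
nondegenerate = (1.0, 0.5, 1.0)  # eigenvalues 0.5 and 1.5
degenerate = (1.0, 1.0, 1.0)     # eigenvalues 0 and 2: X lives on a line

print(satisfies_lower_bound(nondegenerate, 0.25))
print(satisfies_lower_bound(degenerate, 0.25))
```

The degenerate matrix fails the bound for every $\varepsilon_0 > 0$, which is the situation the assumption is designed to exclude.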
We denote by $C(\mathbb{R}^d)$ (respectively $C^r(\mathbb{R}^d)$) the space of continuous (respectively $r$ times differentiable) functions on $\mathbb{R}^d$. The subscript 0 indicates that we consider only functions of compact support in the corresponding space.
where $\lambda_H$ is the Haar measure on $H$ and $u_N(z)$ is the density of the normal random variable with zero mean and covariance $V_N$. In particular, in the non-arithmetic case for each sequence $z_N$ = O(
The Haar measure in the above theorem is defined as follows. $H$ is isomorphic to the product $\mathbb{Z}^{d_1} \times \mathbb{R}^{d - d_1}$, and $\lambda_H$ is the product of the counting measure on the first factor and the Lebesgue measure on the second factor, normalized as follows. Choose a set $D$ so that each $x \in \mathbb{R}^d$ can be uniquely written as $x = h + \theta$ where $h \in H$, $\theta \in D$. $\lambda_H$ is normalized so that (1.5) where $\lambda_D$ is the Lebesgue measure on $D$ normalized to have total volume 1.
Footnote 2: Sometimes in the literature the term arithmetic is reserved for the case where $H$ is a discrete subgroup of $\mathbb{R}^d$, while the case where it has both discrete and continuous parts is called mixed; in our presentation we will not distinguish between these two cases.

One dimensional case
If $d = 1$ there are several simplifications. Namely, $V_N$ is a scalar and $H$ is either $\mathbb{R}$ or $h\mathbb{Z}$ for some $h \in \mathbb{R}$. So Theorem 1.2 can be restated as follows.
(1.6) or (ii) there exists $h > 0$ and a bounded sequence $a_N$ such that $S_N - a_N \bmod h$ converges almost surely to a random variable $S$ and for each $g \in C_0(\mathbb{R})$ and each sequence $z_N$ such
In Section 8 we deduce the following consequence of this result.
Corollary 1.4. Let $X_j$ be independent random variables of zero mean which are uniformly bounded (that is, there is $K$ such that $|X_j| \le K$ with probability one). Then either $S_N$ converges almost surely to some random variable $S$, in which case (1.7) holds, or $S_N$ satisfies the conclusions of Corollary 1.3.

Examples
Here we provide several examples of computing the minimal subgroup, the normalizing sequence $a_N$ and the shape of the local distribution $S$. They provide a good illustration of the versatility of Corollary 1.4, even though the computations in each individual example presented below could be done by hand. Namely, all cases where $H \neq \mathbb{R}$ follow immediately from Kolmogorov's Three Series Theorem. The cases where $H = \mathbb{R}$ seem a little more tricky and could be most easily analyzed with the help of Lemma 3.2.
Footnote 3: The reader should keep in mind that the choices of $a_N$ and $S$ are not unique. Namely, we can replace $(a_N, S)$ by $(a_N + \tilde a_N + c, S - c)$ where $c$ is an arbitrary constant and $\tilde a_N$ is a sequence converging to 0. In Examples 1.5-1.8 we give one possible choice.
Example 1.5. $X_1$ has a continuous distribution and $X_n$ for $n \ge 2$ are i.i.d. with $P(X_n \in a + h\mathbb{Z}) = 1$, where $h$ is the maximal number with this property. Then $H = h\mathbb{Z}$, $a_N = Na \bmod h$, $S = X_1$.
Example 1.6. $X_n$ are integer valued and $|X_n| \le M$ with probability 1. According to Corollary 1.4 there are two cases.
(I)
(II) The minimal subgroup is $h\mathbb{Z}$ for some $h \le 2M$. Note that the same argument as in (b1) shows that $h\mathbb{Z}$ is sufficient iff
We now distinguish two further subcases:
(IIa) The series (1.8) converges only for $h = 1$. In this case $S = 0$ and we obtain the classical arithmetic local limit theorem.
(IIb) The maximal $h$ for which the series (1.8) converges is larger than 1. In this case $H = h\mathbb{Z}$ with $h$ as above, $a_N = \sum_{n=1}^{N} k_n \bmod h$, where $k_n = \arg\max_k P(X_n \equiv k \bmod h)$, and $S = \sum_{n=1}^{\infty} (X_n - k_n)$ (note that due to the Borel-Cantelli Lemma this sum has only finitely many non-zero terms with probability 1).
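To illustrate the arithmetic local limit theorem in the simplest setting, the following sketch takes i.i.d. steps $X_n = \pm 1$ with probability 1/2 each (an illustrative choice, not one of the examples above). Then $S_N \in N + 2\mathbb{Z}$, and the classical lattice local limit theorem predicts $P(S_N = k) \approx 2 u_N(k)$ for $k \equiv N \bmod 2$, where $u_N$ is the density of the normal law with variance $N$. The function names are hypothetical.

```python
import math

def exact_prob(N: int, k: int) -> float:
    """P(S_N = k) for a sum of N i.i.d. steps equal to +1 or -1 with probability 1/2."""
    if (N + k) % 2 != 0 or abs(k) > N:
        return 0.0  # S_N always has the same parity as N
    return math.comb(N, (N + k) // 2) / 2 ** N

def lattice_llt(N: int, k: int, h: int = 2) -> float:
    """Gaussian approximation h * u_N(k), with u_N the N(0, N) density."""
    return h * math.exp(-k * k / (2.0 * N)) / math.sqrt(2.0 * math.pi * N)

N = 400
for k in (0, 10, 20, 40):
    print(k, exact_prob(N, k), lattice_llt(N, k))
```

The factor $h = 2$ is the mass that the Haar (counting) measure on $2\mathbb{Z}$ assigns per lattice point relative to Lebesgue measure; off the coset $N + 2\mathbb{Z}$ the exact probability vanishes.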
The LLT in Example 1.6 is proven in [10] (except that our results are slightly more precise in case (IIb)). The fact that (1.2) and (1.3) are sufficient for the LLT is noted in [12], which obtains the LLT under slightly weaker conditions than (1.2) and (1.3) (under the assumption that $X_N$ are integer valued!).
$P(X_n = -1) = \frac{1}{2} + p_n$, $P(X_n = 1 + \varepsilon_n) = \frac{1}{2} - p_n$, where $\varepsilon_n = \frac{4 p_n}{1 - 2 p_n}$ (so that $E(X_n) = 0$). We assume that $p_n \to 0$. Then either (I) $\sum_n \varepsilon_n^2$ converges (which is equivalent to the convergence of $\sum_n p_n^2$). Then
or (II) $\sum_n \varepsilon_n^2$ diverges, in which case $H = \mathbb{R}$ and we are in the non-arithmetic situation.
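The centering condition for the two-point law with $P(X_n = -1) = \frac{1}{2} + p_n$ and $P(X_n = 1 + \varepsilon_n) = \frac{1}{2} - p_n$ can be checked in exact arithmetic. The sketch below verifies that $\varepsilon = 4p/(1 - 2p)$ gives $E(X) = 0$, and prints the ratio $\varepsilon/p$, which tends to 4 as $p \to 0$; this is why the convergence of $\sum_n \varepsilon_n^2$ and of $\sum_n p_n^2$ are equivalent. The function names and sample values of $p$ are illustrative assumptions.

```python
from fractions import Fraction

def epsilon(p: Fraction) -> Fraction:
    """eps = 4p / (1 - 2p), the offset that centers the two-point law."""
    return 4 * p / (1 - 2 * p)

def mean(p: Fraction) -> Fraction:
    """E(X) for P(X = -1) = 1/2 + p and P(X = 1 + eps) = 1/2 - p."""
    half = Fraction(1, 2)
    return (half + p) * (-1) + (half - p) * (1 + epsilon(p))

# Illustrative values of p (any 0 <= p < 1/2 is admissible).
for p in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000)):
    print(p, epsilon(p), mean(p), epsilon(p) / p)
```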

Plan of the paper
In Section 2 we prove Proposition 1.1. In Section 3 we show that the non-arithmetic case is characterized by the condition that the characteristic function of $S_N$ tends to 0 everywhere except for the origin. In Section 4 we show that if the characteristic function is large at some point then it decays rapidly nearby. This estimate is used in Section 5 to prove the Local Limit Theorem for test functions whose Fourier transform is compactly supported. In Section 6 we use an approximation argument to prove the Local Limit Theorem for continuous functions of compact support. The proof relies on an auxiliary estimate saying that the probability of visiting a cube of unit size is $O(\det(V_N^{-1/2}))$. That estimate is established in Section 7. Finally, in Section 8 we prove Corollary 1.4.
EJP 21 (2016), paper 39.
Throughout the paper, $\hat g$ denotes the Fourier transform of a function $g$, $U_\varepsilon(A)$ denotes the $\varepsilon$-neighborhood of a set $A \subset \mathbb{R}^d$, and $B_R$ is the ball of radius $R$ centered at the origin.

Minimal subgroup
We need the following deterministic fact.
where $H = \tilde H \cap \bar H$. Let $s_N$ be a sequence such that both $s_N \bmod \tilde H$ and $s_N \bmod \bar H$ converge. Then $s_N \bmod H$ converges.
Now note that if $\mathbb{R}^d / H$ were not compact, there would be a proper subspace $L \supset H$, and so (2.1) and (2.2) would contradict (1.4) with $\Pi = b_N + L$.
Our next claim is that $H$ is sufficient. Indeed, pick $\omega$ so that both $S_N(\omega) - \tilde a_N \bmod \tilde H$
is an integer greater than 1. On the other hand, the proof of part (a) shows that if $R$ is large enough then $H_k$ has a basis in $B_R$ for each $k$. Thus the chain cannot be continued indefinitely and ends at some finite $r$. Then $H_r$ is minimal and it is sufficient by construction.

Distinguishing between the arithmetic and non-arithmetic cases
We start with an auxiliary estimate.
Lemma 3.1. Each random variable $X$ can be decomposed as
We will refer to the decomposition of Lemma 3.1 as the useful decomposition of $X$.
The next result will help us to distinguish between the arithmetic and non-arithmetic cases.
Lemma 3.2. Let $X_n$ be independent random variables with zero mean. Let $S_N = \sum_{n=1}^{N} X_n$. The following are equivalent:
(a) There is a sequence $a_N$ such that $S_N - a_N \bmod 2\pi$ converges;
Therefore (a) implies (c).
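The dichotomy between the two cases can be observed numerically through the characteristic function, which by independence factorizes as $|E e^{itS_N}| = \prod_{n \le N} |E e^{itX_n}|$. The sketch below uses the two-point law $P(X_n = -1) = \frac{1}{2} + p_n$, $P(X_n = 1 + \varepsilon_n) = \frac{1}{2} - p_n$ with $\varepsilon_n = 4p_n/(1 - 2p_n)$ from Section 1. The frequency $t = \pi$ (the character that is trivial on $2\mathbb{Z}$), the two sequences $p_n$, and the function names are assumptions made for this illustration only.

```python
import cmath
import math

def char_fn_modulus(p: float, t: float) -> float:
    """|E exp(i t X)| for P(X = -1) = 1/2 + p, P(X = 1 + eps) = 1/2 - p."""
    eps = 4.0 * p / (1.0 - 2.0 * p)
    value = (0.5 + p) * cmath.exp(-1j * t) + (0.5 - p) * cmath.exp(1j * t * (1.0 + eps))
    return abs(value)

def sum_char_fn_modulus(p_of_n, N: int, t: float) -> float:
    """|E exp(i t S_N)|, computed as a product of moduli via a sum of logs."""
    log_total = 0.0
    for n in range(1, N + 1):
        log_total += math.log(char_fn_modulus(p_of_n(n), t))
    return math.exp(log_total)

N = 20000
# sum of p_n^2 converges: the modulus at t = pi stays bounded away from 0
summable = sum_char_fn_modulus(lambda n: 0.05 / n, N, math.pi)
# sum of p_n^2 diverges: the modulus at t = pi decays toward 0
non_summable = sum_char_fn_modulus(lambda n: 0.2 / math.sqrt(n), N, math.pi)
print(summable, non_summable)
```

Since each factor satisfies $|E e^{i\pi X_n}|^2 = 1 - (\frac{1}{2} - 2p_n^2)(1 - \cos(\pi \varepsilon_n))$, the product stays positive exactly when $\sum_n \varepsilon_n^2$ converges, in line with the characterization of the non-arithmetic case by the decay of the characteristic function away from the origin.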

A local estimate
One of the standard proofs of the Central Limit Theorem relies on the following bound (see e.g. [3, Section XVI.6]).
In this section we extend this result to a neighborhood of an arbitrary point (rather than 0). So fix an arbitrary $\bar s \in \mathbb{R}^d$.

Lemma 4.2. (a) Suppose that
Next, Denoting p j = − 1 2 E(X 2 j ) + 2E(X j Y j ) and writing the remainder term as P j +iQ j where where the last step uses that p 2 j = O(∆ 3 + E(Y 2 j )).
Next, the inequality gives $\ln E e^{i(Y_j + X_j)} \le -$ and using the Cauchy-Schwarz inequality and the fact that
Since for each $R$
To prove part (a) we use (4.5), where $Y_N$ is from (4.1) and $X_N$ is given by (4.4). The fact that $Y_N$ was a part of a useful decomposition was used in part (b) only to get (4.8).
Here we have a stronger bound (4.2) by the assumptions of part (a). In particular, (4.2) implies that $E(Y_j^2) \le \varepsilon$, so all terms in (4.5) are small. Accordingly we can use the Taylor expansion of $\ln(1 + x)$ to conclude that
Using (4.2) to estimate the third term, Cauchy-Schwarz to estimate the first term and the fact that $|\Delta|^2 N = O(V_N)$ to estimate the second term we get
Proof. Given $\varepsilon > 0$ let $\bar N$ be such that
On the other hand Lemma 4.2(a) (applied to
Since $\varepsilon$ can be chosen arbitrarily small the result follows.

Observables with compact Fourier transform
Here we prove that the formulas of Theorem 1.2 are valid if $\hat g$ is continuous and has compact support. So we suppose that $\operatorname{supp}(\hat g) \subset [-K, K]^d$ for some $K$.

Non-arithmetic case
Assume first that lim
We claim that the main contribution comes from
Since this holds for all $L$ we can let $L \to \infty$ to conclude that $\lim_{N \to \infty}$
It remains to show that the contributions of $I_j$ with $j \neq 0$ are smaller.
On the other hand, by Lemma 4.2(b)
Combining the estimates for $\tilde J_N$ and $\bar J_N$ we obtain the lemma.
Lemma 5.1 shows that the main contribution to $E(g(S_N))$ comes from $I_0$, so that

Arithmetic case
Next, we consider the arithmetic case. Let $H$ be the minimal subgroup. After a linear change of variables we can assume that
Since this holds for all $L$ we can let $L \to \infty$ to conclude that
Here the first equality holds since we have identified $m \in \mathbb{Z}^{d_1}$ with $(m, 0) \in \mathbb{R}^d$, the second equality follows by the Poisson Summation Formula and the third equality follows by (5.3) and (1.5). This proves Theorem 1.2 for functions with compactly supported Fourier transform.
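The Poisson Summation Formula used in the second equality can be checked numerically in dimension one. The sketch below assumes the convention $\hat g(\xi) = \int g(x) e^{-i\xi x}\,dx$, under which $\sum_{m \in \mathbb{Z}} g(m) = \sum_{k \in \mathbb{Z}} \hat g(2\pi k)$, and takes the Gaussian $g(x) = e^{-x^2/(2\sigma^2)}$ with $\hat g(\xi) = \sigma\sqrt{2\pi}\, e^{-\sigma^2 \xi^2/2}$. The normalization convention, the test function and the function names are assumptions of this illustration and need not match the normalization used in the proof.

```python
import math

def lattice_sum(sigma: float, terms: int = 200) -> float:
    """Sum of g(m) over |m| <= terms, with g(x) = exp(-x^2 / (2 sigma^2))."""
    return sum(math.exp(-m * m / (2.0 * sigma ** 2)) for m in range(-terms, terms + 1))

def dual_sum(sigma: float, terms: int = 200) -> float:
    """Sum of g_hat(2 pi k) over |k| <= terms, with g_hat(xi) = sigma sqrt(2 pi) exp(-sigma^2 xi^2 / 2)."""
    prefactor = sigma * math.sqrt(2.0 * math.pi)
    return sum(
        prefactor * math.exp(-0.5 * (sigma * 2.0 * math.pi * k) ** 2)
        for k in range(-terms, terms + 1)
    )

for sigma in (0.5, 1.0, 3.0):
    print(sigma, lattice_sum(sigma), dual_sum(sigma))
```

For large $\sigma$ the dual sum is dominated by its $k = 0$ term $\sigma\sqrt{2\pi} = \int g(x)\,dx$, which is the mechanism by which a sum over the lattice is replaced by an integral.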

Proof of the Local Limit Theorem
Here we finish the proof of Theorem 1.2.
We need the following a priori estimate proven in Section 7.
Lemma 6.1. There is a constant $D$ such that for any cube $Q$ of unit size
To fix notation we consider the non-arithmetic case; the argument in the arithmetic case is similar.
We note that it is sufficient to prove Theorem 1.2 for g ∈ C d+1 the theorem holds for all continuous functions.
due to the results of Section 5, Theorem 1.2 holds on $C^{d+1}_0(\mathbb{R}^d)$ and, hence, on $C_0(\mathbb{R}^d)$.

Concentration inequality
The proof of Lemma 6.1 in arbitrary dimension is the same as the proof for d = 1 given in [9, Section III.1] but we reproduce the proof here for completeness.
Proof of Lemma 6.1. It is enough to prove the claim for cubes of any fixed size $\rho$ since the unit cube can be covered by a finite number of cubes of size $\rho$. Let
imply that there is a constant $\tilde D$ such that $E(g(S_N - a)) \le \frac{\tilde D}{N^{d/2}}$. On the other hand $g(0) = \frac{1}{2^d}$, so there is a constant $\rho$ such that $g(x) > \frac{1}{4^d}$ on the cube of size $\rho$ centered at 0. Hence if $Q$ is a cube of size $\rho$ centered at $a$ then $E(g(S_N - a)) \ge \frac{P(S_N \in Q)}{4^d}$.
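The choice of $\rho$ can be made concrete for one standard test function. As an assumption of this sketch (the text above only uses $g(0) = 2^{-d}$ and positivity near 0), take the product kernel $g(x) = \prod_{i=1}^{d} (1 - \cos x_i)/x_i^2$, whose Fourier transform is compactly supported and which satisfies $g(0) = 2^{-d}$. The code checks on a grid that $g > 4^{-d}$ on the cube of side $\rho = 2$ centered at 0; the grid resolution and function names are hypothetical.

```python
import itertools
import math

def kernel_1d(x: float) -> float:
    """(1 - cos x) / x^2, extended by its limit 1/2 at x = 0."""
    if abs(x) < 1e-4:
        return 0.5 - x * x / 24.0  # Taylor expansion avoids cancellation near 0
    return (1.0 - math.cos(x)) / (x * x)

def g(point) -> float:
    """Product kernel on R^d (an assumed choice of test function)."""
    return math.prod(kernel_1d(x) for x in point)

def min_on_cube(d: int, rho: float, grid: int = 21) -> float:
    """Minimum of g over a uniform grid on the cube of side rho centered at 0."""
    ticks = [-rho / 2.0 + rho * i / (grid - 1) for i in range(grid)]
    return min(g(p) for p in itertools.product(ticks, repeat=d))

for d in (1, 2, 3):
    print(d, g((0.0,) * d), min_on_cube(d, 2.0), 4.0 ** -d)
```

Each one-dimensional factor is decreasing in $|x|$ on $[0, 1]$, so the minimum over the cube is attained at the corners, where every factor equals $1 - \cos 1 \approx 0.46 > \frac{1}{4}$.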
Combining the last two displays we obtain the result.

Bounded random variables
Proof of Corollary 1.4. If $\sum_j V(X_j)$ converges then $S_N$ converges almost surely by Kolmogorov's Three Series Theorem and so (1.7) holds.
Therefore we assume that $\sum_j V(X_j)$ diverges. Fix a large $A$ and let $k_n$ be a sequence such that, denoting $\bar X_n = \sum_{j = k_{n-1} + 1}^{k_n} X_j$, we have
Since $E(\bar X_n^4) = (E(\bar X_n^2))^2 +$
for some $A$ and all $n$. We claim that, in fact, the conclusions of Corollary 1.3 are satisfied for our original sum $S_N$. Indeed, take an arbitrary sequence satisfying (8.1). Suppose, to fix our notation, that $S_{k_n}$ satisfies a non-arithmetic Local Limit Theorem; the arithmetic case is similar. We claim that (1.6) holds. Otherwise there exist sequences $\{N_l\}$, $\{z_l\}$ such that $z_l / \sqrt{V_{N_l}} \to z$ and a continuous function $g$ of compact support such that $\sqrt{V_{N_l}}\, E(g(S_{N_l} - z_l))$ does not converge to $\frac{e^{-z^2/2}}{\sqrt{2\pi}} \int_{-\infty}^{\infty} g(x)\,dx$ as $l \to \infty$. By taking a subsequence we can assume that
Let $n_l$ be such that $k_{n_l} \le N_l < k_{n_l + 1}$. Replacing $k_{n_l}$ by $N_l$ we obtain a new sequence $\tilde k_n$ satisfying (8.1) with $A$ replaced by $2A$. Also, let $\tilde z_n = z_l$ if $\tilde k_n = N_l$ for some $l$ and $\tilde z_n = z \sqrt{V_{\tilde k_n}}$ otherwise. Then $\lim_{n \to \infty} \sqrt{V_{\tilde k_n}}\, E(g(S_{\tilde k_n} - \tilde z_n))$ fails to exist, contradicting the Local Limit Theorem for the sequence $\tilde k_n$. Hence (1.6) holds as claimed.