The Asymptotic Distribution of Randomly Weighted Sums and Self-normalized Sums

We consider the self-normalized sums $T_{n}=\sum_{i=1}^{n}X_{i}Y_{i}/\sum_{i=1}^{n}Y_{i}$, where $\{Y_{i} : i\geq 1\}$ are non-negative i.i.d. random variables and $\{X_{i} : i\geq 1\}$ are i.i.d. random variables, independent of $\{Y_{i} : i \geq 1\}$. The main result of the paper is that every subsequential limit law of $T_n$ is continuous for any non-degenerate $X_1$ with finite expectation if and only if $Y_1$ is in the centered Feller class.


Introduction
Let $\{Y, Y_i : i \geq 1\}$ denote a sequence of i.i.d. random variables, where $Y$ is non-negative and non-degenerate with cumulative distribution function (cdf) $G$. Now let $\{X, X_i : i \geq 1\}$ be a sequence of i.i.d. random variables, independent of $\{Y, Y_i : i \geq 1\}$, where $X$ is in the class $\mathcal{X}$ of non-degenerate random variables satisfying $E|X| < \infty$. Consider the randomly weighted sums and self-normalized sums
$$W_n = \sum_{i=1}^{n} X_i Y_i \quad \text{and} \quad T_n = \frac{\sum_{i=1}^{n} X_i Y_i}{\sum_{i=1}^{n} Y_i},$$
where we define $0/0 := 0$.
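These definitions can be illustrated with a small numerical sketch (our illustration, not part of the paper; the sample values are arbitrary):

```python
# A minimal numeric sketch (ours) of the randomly weighted sum
# W_n = sum X_i Y_i and the self-normalized sum T_n = W_n / sum Y_i,
# with the paper's convention 0/0 := 0.

def weighted_sums(xs, ys):
    """Return (W_n, T_n) for samples X_1,...,X_n and non-negative Y_1,...,Y_n."""
    w = sum(x * y for x, y in zip(xs, ys))
    s = sum(ys)
    t = w / s if s > 0 else 0.0  # convention 0/0 := 0
    return w, t

# W_3 = 1*1 + 2*1 + 3*2 = 9 and T_3 = 9/4 = 2.25
w3, t3 = weighted_sums([1.0, 2.0, 3.0], [1.0, 1.0, 2.0])
```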
In statistics T n has uses as a version of the weighted bootstrap, where typically more assumptions are imposed on X and Y . See Mason and Newton [17] for details. We shall see that T n is an interesting random variable, which is worthy of study in its own right.
Notice that $E|X| < \infty$ implies that $T_n$ is stochastically bounded, and thus every subsequence of $\{n\}$ contains a further subsequence $\{n'\}$ such that for some random variable $T$, $T_{n'} \xrightarrow{D} T$.

1 Research supported by the TAMOP-4.2.1/B-09/1/KONV-2010-0005 project.
2 Research partially supported by NSF Grant DMS-0503908.
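The stochastic boundedness claim can be seen from a short conditioning argument; the following is a sketch of the standard bound (our computation, not the paper's wording):

```latex
% A sketch (ours) of why E|X| < \infty makes T_n stochastically bounded.
% On the event \{\sum_{i=1}^n Y_i > 0\}, conditioning on Y_1,\dots,Y_n gives
E\bigl[\,|T_n| \,\big|\, Y_1,\dots,Y_n \bigr]
  \le \frac{\sum_{i=1}^n E|X_i|\, Y_i}{\sum_{i=1}^n Y_i} = E|X| ,
% so E|T_n| \le E|X| and, by Markov's inequality,
\sup_{n \ge 1} P\{ |T_n| > K \} \le E|X| / K \longrightarrow 0
  \quad \text{as } K \to \infty .
```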
Theorem 4 of Breiman [1] says that $T_n$ converges in distribution along the full sequence $\{n\}$ for every $X \in \mathcal{X}$, with at least one limit law being non-degenerate, if and only if
$$Y \in D(\beta), \quad \text{with } 0 \leq \beta < 1. \tag{1}$$
In this paper, $Y \in D(\beta)$ means that for some function $L$ slowly varying at infinity and $\beta \geq 0$,
$$P\{Y > y\} = y^{-\beta} L(y), \quad y > 0.$$
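To illustrate the defining property of $D(\beta)$ numerically, one can check that a tail of the form $y^{-\beta}L(y)$ with $L$ slowly varying satisfies $P\{Y > ty\}/P\{Y > y\} \to t^{-\beta}$; the tail below is a hypothetical example chosen by us:

```python
# Numerical illustration (ours, not from the paper) of a regularly varying
# tail: G_bar(y) = y**(-beta) * log(e*y) is of the form y**(-beta) * L(y)
# with L slowly varying, so G_bar(t*y) / G_bar(y) -> t**(-beta) as y grows.
import math

def tail(y, beta=0.5):
    """A hypothetical tail function P{Y > y} of index beta."""
    return y ** (-beta) * math.log(math.e * y)

t = 2.0
ratios = [tail(t * y) / tail(y) for y in (1e2, 1e4, 1e8)]
# the ratios approach t**(-0.5), i.e. roughly 0.707, as y increases
```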
In the case 0 < β < 1 this is equivalent to Y ≥ 0 being in the domain of attraction of a positive stable law of index β. Breiman [1] has shown in his Theorem 3 that in this case T has a distribution related to the arcsine law. We give a natural extension of his result in Theorem 6 below.
At the end of his paper Breiman conjectured that T n converges in distribution to a nondegenerate law for some X ∈ X if and only if Y ∈ D (β) , with 0 ≤ β < 1. Mason and Zinn [18] partially verified his conjecture. They established the following: Whenever X is non-degenerate and satisfies E|X| p < ∞ for some p > 2, then T n converges in distribution to a non-degenerate random variable if and only if (1) holds.
We shall not solve Breiman's full conjecture in this paper. Our interest is to investigate the asymptotic distributional behavior of the weighted sums $W_n$ and $T_n$ along subsequences $\{n'\}$ of $\{n\}$. An important role in our study is played by those $Y$ that are in the centered Feller class. A random variable $Y$ (not necessarily non-negative) is said to be in the Feller class if there exist sequences of norming and centering constants $\{a_n\}_{n \geq 1}$ and $\{b_n\}_{n \geq 1}$ such that if $Y_1, Y_2, \dots$ are i.i.d. copies of $Y$, then for every subsequence of $\{n\}$ there exists a further subsequence $\{n'\}$ such that
$$\frac{\sum_{i=1}^{n'} Y_i - b_{n'}}{a_{n'}} \xrightarrow{D} W,$$
where $W$ is a non-degenerate random variable. We shall denote this by $Y \in \mathcal{F}$. Furthermore, $Y$ is in the centered Feller class if $Y$ is in the Feller class and one can choose $b_n = 0$ for all $n \geq 1$. This we shall denote as $Y \in \mathcal{F}_c$. In this paper the norming sequence $\{a_n\}$ is always assumed to be strictly positive and to tend to infinity.
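For orientation, here is a concrete example (ours) of a random variable in the centered Feller class:

```latex
% Example (ours): a Pareto variable lies in the centered Feller class.
% If P\{Y > y\} = y^{-\beta} for y \ge 1, with 0 < \beta < 1, then with
% a_n = n^{1/\beta} and b_n = 0,
\frac{1}{n^{1/\beta}} \sum_{i=1}^{n} Y_i \xrightarrow{D} W(\beta) ,
% a positive stable law of index \beta, along the full sequence \{n\};
% hence one may take b_n = 0, so that Y \in \mathcal{F}_c.
```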
Our most unexpected finding is the following theorem, which connects Y ∈ F c with the continuity of all of the subsequential limit laws of T n . It is an immediate consequence of the results that we shall establish.

Theorem 1. All subsequential distributional limits of $T_n$ are continuous for every $X \in \mathcal{X}$ if and only if $Y \in \mathcal{F}_c$.
Our result agrees with both Theorem 4 of [1] as cited above and Theorem 3 of [1], which implies that if Y ∈ D (β), with 0 < β < 1, then T n D −→ T , where T has a continuous distribution with a Lebesgue density. Note that all such Y are in the centered Feller class. It turns out that whenever Y ∈ F c and X ∈ X every subsequential limit law of T n has a Lebesgue density. Refer to Theorem 3 below.
Breiman [1] also studied the randomly weighted sums $W_n$. From his Proposition 3 it can be readily inferred that if $Y \geq 0$ and $Y \in D(\beta)$, with $0 < \beta < 1$, and $X$ is independent of $Y$ satisfying $E|X| < \infty$, then $XY \in D(\beta)$ as well. This implies that if for a sequence of norming constants $a_n > 0$
$$\frac{1}{a_n} \sum_{i=1}^{n} Y_i \xrightarrow{D} W(\beta),$$
where $W(\beta)$ is a non-degenerate stable law of index $\beta$, then for the randomly weighted sums we have
$$\frac{1}{a_n} \sum_{i=1}^{n} X_i Y_i \xrightarrow{D} W'(\beta),$$
where $W'(\beta)$ is also a non-degenerate stable law of index $\beta$.
Along the way towards establishing the results needed to prove Theorem 1 we shall need to generalize this result. Our Theorem 2 implies that if along a subsequence $\{n'\}$ the normed sum $a_{n'}^{-1} \sum_{i=1}^{n'} Y_i$ converges in distribution, then so does $a_{n'}^{-1} \sum_{i=1}^{n'} X_i Y_i$. It also identifies their limit laws.
Here is a brief outline of our paper. Some necessary notation is introduced in subsection 1.1, and our main results are stated in subsection 1.2, where we fill out the picture of the asymptotic distribution of the self-normalized sums $T_n$ along subsequences under a nearly exhaustive set of regularity conditions. The proofs are detailed in section 2, and some additional information is provided in an appendix. We shall soon see that the innocuous-looking sequence of random variables $\{T_n\}$ displays quite a variety of subsequential distributional limit behavior.

Some necessary notation
Before we can state our results we must first fix some notation. Let $id(a, b, \nu)$ denote an infinitely divisible distribution on $\mathbb{R}^d$ with characteristic exponent
$$\Psi(u) = i u' b - \tfrac{1}{2} u' a u + \int_{\mathbb{R}^d} \left( e^{i u' x} - 1 - i u' x \mathbf{1}\{|x| \leq 1\} \right) \nu(dx),$$
where $b \in \mathbb{R}^d$, $a \in \mathbb{R}^{d \times d}$ is a positive semidefinite matrix, $\nu$ is a Lévy measure on $\mathbb{R}^d$, and $u'$ stands for the transpose of $u$. In our case $d$ is 1 or 2. For any $h > 0$ put For $d = 1$, $id(\alpha, \Lambda)$, with Lévy measure $\Lambda$ on $(0, \infty)$ such that
$$\int_0^\infty \min(y, 1)\, \Lambda(dy) < \infty \tag{4}$$
holds, and $\alpha \geq 0$, denotes a non-negative infinitely divisible distribution with characteristic exponent
$$\Psi(u) = i u \alpha + \int_0^\infty \left( e^{i u y} - 1 \right) \Lambda(dy). \tag{5}$$
Moreover, an infinitely divisible random variable is non-negative if and only if the representation above holds. We will use both representations, so note that $id(\alpha, \Lambda) = id(0, b, \Lambda)$ if and only if $b = \alpha + \int_{0 < y \leq 1} y\, \Lambda(dy)$. Here $\Lambda$ is the Lévy measure of $W_2$, concentrated on $(0, \infty)$ and satisfying (4).
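As a concrete instance of the representation $id(\alpha, \Lambda)$, the following example (ours, for orientation only) recovers the Poisson law:

```latex
% Example (ours): taking \alpha = 0 and \Lambda = \lambda \delta_1
% (a point mass of size \lambda at 1) in the exponent of id(\alpha, \Lambda)
\Psi(u) = i u \alpha + \int_0^\infty \bigl( e^{iuy} - 1 \bigr) \Lambda(dy)
% gives \Psi(u) = \lambda ( e^{iu} - 1 ), the characteristic exponent of the
% Poisson(\lambda) distribution, a non-negative infinitely divisible law.
```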
We write $\Lambda(v) := \Lambda((v, \infty))$ for $v > 0$. Note that $\lim_{v_2 \searrow v_1} \Lambda((v_1, v_2]) = 0$, and thus $\Lambda(v)$ is right continuous on $(0, \infty)$. Let $F$ be the cdf of a random variable $X$ satisfying $0 < E|X| < \infty$. We denote $\overline{F} = 1 - F$. For $u \geq 0$ and $v > 0$ set and In order to define a bivariate Lévy measure we need to verify that the functions above are meaningful when $u > 0$ and $v = 0$. First we shall check that which is equivalent to the finiteness of The finiteness of (8) with $u > 0$ and $v = 0$ can be shown in the same way. Using the functions $\Pi(u, v)$ and $\Pi(-u, v)$ we define the Lévy measure $\Pi$ on $(-\infty, \infty) \times (0, \infty)$ by for $-\infty < a < b < \infty$ and $0 < c < d < \infty$.

Our results
In this subsection we state our results on the asymptotic distributional behavior of $W_n$ and $T_n$ along subsequences $\{n'\}$. Our first theorem is a generalization of the convergence in distribution fact stated in (2) and (3) above. In the following, $X$ and $Y$ are independent, $X$ has cdf $F$ and $Y$ has cdf $G$, with $0 < P\{Y > 0\} \leq P\{Y \geq 0\} = 1$.
Theorem 2. Assume that $E|X| < \infty$. If along a subsequence $\{n'\}$, for a sequence of norming constants $a_{n'} > 0$,
$$\frac{1}{a_{n'}} \sum_{i=1}^{n'} Y_i \xrightarrow{D} W_2,$$
where $W_2$ has $id(\alpha, \Lambda) = id(0, b, \Lambda)$ distribution as in (5), then along the same subsequence
$$\frac{1}{a_{n'}} \left( \sum_{i=1}^{n'} X_i Y_i, \sum_{i=1}^{n'} Y_i \right) \xrightarrow{D} (W_1, W_2),$$
where $(W_1, W_2)$ has the characteristic function given in (14) below.
Remark 1. In general, Theorem 2 is no longer valid if $E|X| = \infty$. For example, let $X$ and $Y$ be non-negative, non-degenerate random variables such that $X \in D(\beta_1)$ and $Y \in D(\beta_2)$, with $0 < \beta_1 < \beta_2 < 1$. We have $EX = \infty$. From Lemma 1 below we can conclude that $XY$ is in the domain of attraction of a positive stable law of index $\beta_1$. In this example, for sequences of norming constants $a_{n,i} = L_i(n)\, n^{1/\beta_i}$, $i = 1, 2$, where the $L_i(x)$ are slowly varying functions at infinity, we have
$$\frac{1}{a_{n,1}} \sum_{i=1}^{n} X_i Y_i \xrightarrow{D} W_1 \quad \text{and} \quad \frac{1}{a_{n,2}} \sum_{i=1}^{n} Y_i \xrightarrow{D} W_2,$$
where the $W_i$ are non-degenerate stable random variables of index $\beta_i$, $i = 1, 2$. Since $a_{n,1}/a_{n,2} \to \infty$, (12) cannot hold. It is clear in this example that the self-normalized sum $T_n \xrightarrow{P} \infty$, which says that $T_n$ is not stochastically bounded.
where $(a_1, a_2) = (\alpha EX, \alpha)$. Furthermore, under the assumptions of Theorem 2, we have that the convergence takes place in the Skorohod space

is the bivariate Lévy process with characteristic function
This immediately follows from Theorem 2 combined with Skorohod's theorem (Theorem 16.14 in [11]).
In a separate paper we shall characterize when under regularity conditions the ratio U t /V t converges in distribution to a non-degenerate random variable T as t → ∞ or t ց 0.

Remark 3. A result closely related to Theorem 2 is the fact that the Feller class $\mathcal{F}$ is closed under independent multiplication: it is established in Proposition 5 in the Appendix that if $X$ and $Y$ are independent random variables in the Feller class, then so is $XY$.
By applying Theorem 2 we see then that which in combination with (18) implies that Notice that (18) holds for the entire sequence $\{n\}$ with $c_n = nEY$ when $EY < \infty$. It is also satisfied whenever along a subsequence $\{n'\}$, for some sequence $b_{n'} \to \infty$, where $W$ is non-degenerate and $b_{n'}/a_{n'} \to \infty$ as $n' \to \infty$. A random variable $Y$ that is in the Feller class but not in the centered Feller class has this property. In this case (18) holds with The following theorem describes what happens when $Y$ is in the centered Feller class.
Theorem 3. Assume $X \in \mathcal{X}$ and $Y \in \mathcal{F}_c$. Then for a suitable sequence of norming constants $a_n > 0$, any subsequence of $\{n\}$ contains a further subsequence $\{n'\}$ such that
$$\frac{1}{a_{n'}} \left( \sum_{i=1}^{n'} X_i Y_i, \sum_{i=1}^{n'} Y_i \right)$$
converges in distribution to a non-degenerate random vector, say $(W_1, W_2)$, having a $C^\infty$ Lebesgue density $f$ on $\mathbb{R}^2$. This implies that the asymptotic distribution of the corresponding ratio along the subsequence $\{n'\}$ satisfies $T_{n'} \xrightarrow{D} T = W_1 / W_2$, which has a Lebesgue density $f_T$ on $\mathbb{R}$.
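The passage from the joint density $f$ of $(W_1, W_2)$ to the density of the ratio is the standard change of variables; as a sketch (our computation, assuming $W_2 > 0$):

```latex
% Sketch (ours) of the passage from f to f_T, assuming W_2 > 0:
% with T = W_1 / W_2, the substitution (w_1, w_2) = (t v, v) has Jacobian v,
% so integrating out v gives the density of T,
f_T(t) = \int_0^\infty v\, f(t v, v)\, dv .
```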
Corollary 1 below is a kind of converse of this fact. It is known (and an easy calculation shows) that if $Y \in D(\beta)$, $\beta \in (0, 1)$, then the non-negative constant $\alpha$ appearing in the representation of the stable limit law $id(\alpha, \Lambda)$ is necessarily 0. (Breiman tacitly uses this fact in the course of his proof of Theorem 3 in [1].) It turns out that this is true in a far more general setup.
In order to state our next theorem we shall need the following notation. Let $Y_{n,n} = \max_{1 \leq i \leq n} Y_i$, where, to be specific, $m(n)$ is the smallest $1 \leq m \leq n$ such that $Y_{n,n} = Y_{m(n)}$. For any $0 < \varepsilon < 1$ put
Theorem 4. Assume that $E|X| < \infty$ and there exists a subsequence $\{n'\}$ such that then $\lim_{\varepsilon \to 0} \liminf$ In Proposition 1 in [16] Mason proves that whenever $Y$ is not in the Feller class, that is, and, in addition, Consult Griffin [9] for more details. Theorem 4 leads to the following corollary.
By the stochastic boundedness of $T_n$ this implies that there is a subsequence $\{n'\}$ such that It is well-known (cf. Theorem 3.2 of Darling [4]) that if $Y$ has a slowly varying upper tail, which by an application of Theorem 1.2.1 of de Haan [5] is seen to be equivalent to then (24) holds along the full sequence $\{n\}$ with $\delta = 1$. In this case (27) holds, since (30) implies This leads immediately to Proposition 2 in [1]: we conclude by the law of large numbers that $R_n \to 0$ a.s., and in this case it is trivial to see that $T_n \to EX$ a.s., as $n \to \infty$.
Finally, let us consider an illustrative case when E|X| is not necessarily finite. We shall need the following lemma, which is a simple extension of Breiman's Proposition 3 [1]. Since the proof is nearly the same, we omit it.
Lemma 1. Assume that $Y \in D(\beta)$ for some $\beta > 0$, and there exists $\varepsilon > 0$ such that $E|X|^{\beta + \varepsilon} < \infty$. Then $XY \in D(\beta)$.
A more general result is given in Proposition II in Cline [3]. For recent results along this line consult Jessen and Mikosch [10] and Denisov and Zwart [6].
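The Breiman-type tail relation behind Lemma 1 can be checked by hand in a toy case (our example, chosen so that the computation is exact):

```python
# Deterministic check (our toy example) of the Breiman-type tail relation
# P{XY > y} ~ E[X^beta] * P{Y > y}: take X in {1, 2} with probability 1/2
# each, independent of a Pareto Y with P{Y > y} = y**(-beta) for y >= 1.
# For y >= 2 the relation is exact, since (y/2)**(-beta) = 2**beta * y**(-beta).
beta = 0.5

def tail_Y(y):
    return min(1.0, y ** (-beta))

def tail_XY(y):
    # P{XY > y} = 0.5 * P{Y > y} + 0.5 * P{Y > y/2}
    return 0.5 * tail_Y(y) + 0.5 * tail_Y(y / 2.0)

EX_beta = 0.5 * 1.0 ** beta + 0.5 * 2.0 ** beta
ratio = tail_XY(100.0) / tail_Y(100.0)
# ratio equals EX_beta = 0.5 + 0.5 * sqrt(2) up to rounding
```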
By substituting the use of Breiman's Proposition 3 in the proof of his Theorem 3 in [1] by the above Lemma 1, we obtain the following extension of his Theorem 3, which implies that his asymptotic distribution result for T n holds in cases when E|X| = ∞.
Theorem 6. Assume that $Y \in D(\beta)$ for some $\beta \in (0, 1)$, and there exists $\varepsilon > 0$ such that $E|X|^{\beta + \varepsilon} < \infty$. It is interesting that even in the latter case the tail behavior of the limit distribution is determined by the distribution of $X$. Note that Using that as $y \to 0$ we then obtain that, as $x \to \infty$.
Without any further assumptions on F we have the simple bounds Moreover, assuming that 1 − F is regularly varying with index −α, with α > β it is easy to show that Clearly analogous results are true for the negative tail.
The tail behavior that we just pointed out is in sharp contrast to the classical self-normalized sum setup, where it is shown by Giné, Götze and Mason (Theorem 2.5 in [7]) that if the ratio $\sum_{i=1}^{n} X_i / \sqrt{\sum_{i=1}^{n} X_i^2}$ is stochastically bounded, then all of the subsequential limit laws are subgaussian.
Summary picture
To summarize, we have developed the following picture. Let $X$ and $Y$ be independent such that $0 < P\{Y > 0\} \leq P\{Y \geq 0\} = 1$.
(i) If $X$ is non-degenerate, $0 < E|X| < \infty$, and $Y \in \mathcal{F}_c$, then $T_n$ is stochastically bounded and every subsequential limit random variable $T$ has a Lebesgue density.
(ii) If $E|X| < \infty$ and $Y \in \mathcal{F}$ but $Y \notin \mathcal{F}_c$, then there exists a subsequence $\{n'\}$ such that The last result is a special case of the fact that if $E|X| < \infty$ and along a subsequence $\{n'\}$ and some sequence $c_{n'} \to \infty$ we have $c_{n'}^{-1}$ (27) holds, then there exists a subsequence $\{n'\}$ such that for some $\delta > 0$, $\lim_{\varepsilon \to 0} \liminf$

Proofs
We shall need the following additional notation. Write, for $v > 0$,
$$\Lambda_n(v) = nP\{Y > a_n v\} = n\overline{G}(a_n v),$$
and, for $u > 0$ and $v > 0$,
$$\Pi_n(u, v) = nP\{XY > a_n u,\, Y > a_n v\},$$
and The following lemma is well-known (see Corollary ), where $a_h$ and $b_h$ are defined above (4).
The following lemma determines the continuity points of the two-dimensional Lévy measure. Proof. We see that $\lim_{u_1 \uparrow u,\, v_1 \uparrow v}$ which is zero only if $\overline{F}(u/s)$ and $\Lambda(s)$ are not discontinuous at the same points in $(v, \infty)$ and $\overline{F}(u/v-)\, \Lambda(\{v\}) = 0$. The second part of the lemma is proved in the same way.
Next we deal with the convergence of the Lévy measures.

Proposition 2. Assume that at every continuity point
and assume that for every (some) continuity point holds, where $\alpha_h < \infty$. Then at every continuity point and at every continuity point Proof. First choose any continuity point $(u, v) \in [0, \infty) \times (0, \infty)$ of $\Pi$ and let $\gamma > v$ be a continuity point of $\Lambda$. By (36), lim sup By Lemma 3, $\overline{F}(u/s)$ and $\Lambda(s)$ are not discontinuous at the same points in $(v, \gamma]$, and since the set of discontinuities of $\overline{F}(u/s)$ on $(v, \gamma]$ is countable and these have $\Lambda$ measure zero, assumption (36) allows us to conclude that (see the proof of Proposition 8.12 on page 163 of [2]). Since $\Lambda(\gamma)$ can be made arbitrarily small by choosing $\gamma$ arbitrarily large, we readily infer (38) from (41) and (40).
To prove the convergence in (38) when $u > 0$ and $v = 0$ we shall need assumption (37). We have to show that for any continuity point $\gamma > 0$ Using that the convergence (41) holds for any continuity point $0 < v < \gamma$ of $\Lambda$, it is enough to prove the convergence lim sup Since $s^{-1} \overline{F}(u/s) \to 0$ as $s \to 0$, (37) implies the statement, keeping in mind that $\alpha_h \searrow \alpha$ as $h \searrow 0$ for some finite $\alpha \geq 0$. Statement (39) is proved in the same way.

Proof. Select any 0 < v < h, then for all
and by (9) Π where $\varphi(\cdot)$ is defined in (42). This says that We also obtain that with $v = h$, and the proof is complete. □ Proof. Let $1 \geq h > \gamma > 0$ be continuity points of $\Lambda$. By assumptions (36) and (37) Remark 7. Applying Lemma 2, we see that assumption (10) implies that (36) and (37) hold with where, in accordance with the notation in Theorem 2. This shows that (11) holds.
Notice that Define the functions of $v \in (0, h]$ where $\varphi(\cdot)$ is defined in (42). Observe that and Now we can prove the convergence of the truncated expectations.
Proof. Observe that Choose any $0 < \gamma < h$ such that $\gamma$ is a continuity point of $\Lambda$. Notice that since $\Pi(C_h) = 0$, we can infer from Remark 6 that for any such $\gamma$ the functions of $v$ defined on $(\gamma, h]$ by $\phi(v) v$ and $\psi(v)$ do not share the same discontinuity points as $\Lambda$. Thus, since these functions are also bounded on $(\gamma, h]$, assumption (36) implies, as in the argument that gives (41), that and Next, using the monotonicity of $\phi$, we see that Therefore, by (37), lim sup Similarly lim sup As $\gamma \to 0$ the statements follow from the definition of $\alpha$ given in (45), (46) and (47).
Observe that

Proposition 4. Assume (36) and (37). Then for every
Moreover, $\lim_{h \searrow 0} \limsup$ and $\lim_{h \searrow 0} \limsup$ Proof. In the proof of (55) and (56) we can assume without loss of generality that $\Pi(C_h) = 0$ for all $h > 0$ sufficiently small, since we only need it to be true for a countable number of $h \searrow 0$, and this holds trivially. We see that Statement (55) is a consequence of (48), and a slight modification of the argument giving (49) yields from which (56) follows. The proof of the first three limit results can now be carried out in the same way as in the previous proposition.
Now we are ready to prove Theorem 2.
Proof of Theorem 2. We have to check the three conditions in Lemma 2 for the array First of all, assumption (10) permits us to apply Lemma 2 to the array to get that (e.i) and (e.ii) hold in the form (36), and that $b$ must have the form
$$b = \begin{pmatrix} \alpha EX + \int_{0 < u^2 + v^2 \leq 1} u\, \Pi(du, dv) \\[4pt] \alpha + \int_{0 < u^2 + v^2 \leq 1} v\, \Pi(du, dv) \end{pmatrix}.$$
Finally, Proposition 4 shows that the covariance matrix a has to be 0, so that (e.iii) holds for (57) with a = 0.
Proof of Theorem 3. The proof will be derived from results in Griffin [8]. Note that since $X$ and $Y$ are independent and both non-degenerate, the random vector $(XY, Y)$ is full, which in this case means that its distribution is not concentrated on a line. Since $Y \in \mathcal{F}_c$ there exists a sequence of positive constants $a_n$ such that for every subsequence of $\{n\}$ there is a further subsequence $\{n'\}$ such that $W_{2,n'}/a_{n'}$ converges in distribution to a non-degenerate random variable. Set Clearly, we can now apply Theorem 2 to conclude that for every subsequence of $\{n\}$ there is a further subsequence $\{n'\}$ such that converges in distribution along $\{n'\}$ to a random vector which is non-degenerate and full. Fullness follows by an examination of the structure of the characteristic function of $(W_1, W_2)$ given in (14). Thus we see that condition (C) of Griffin [8] holds. Next, Theorem 4.5 of Griffin [8] says that conditions (A) and (C) of [8] are equivalent. Now since condition (A) of [8] is satisfied, we can use the proof of Griffin's Theorem 4.1 to show that there exist sequences of linear transformations $A_n : \mathbb{R}^2 \to \mathbb{R}^2$ and vectors $\delta_n \in \mathbb{R}^2$ such that is stochastically compact and all of its subsequential distributional limit random vectors, say (61), are non-degenerate and full. Moreover, Griffin proves that any such random vector (61) has a $C^\infty$ density. This fact, combined with an argument based on the convergence of types theorem, implies that each subsequential limit random vector (60) has a $C^\infty$ density, say $f(u, v)$. (See the convergence of types theorem given in Theorem 2.3.17 on page 35 in [19].) Thus, since every subsequential limit (59) is full with density $f(u, v)$, the distributional limit $T$ of the corresponding self-normalized sum (23) has density $f_T(t) = \int_0^\infty v\, f(tv, v)\, dv$.
Proof of Proposition 1.
It can be inferred from classical theory (or from the proof of Theorem 2) that every subsequential limit law $W$ of $a_n^{-1} \sum_{i=1}^{n} Y_i$ has the $id(\alpha, \Lambda)$ distribution with characteristic function where $\Lambda$ satisfies (4) and $\alpha \geq 0$. Clearly $W \stackrel{D}{=} \alpha + V$, and the Lévy process associated with $W$ is $\alpha t + V_t$, $t \geq 0$. By an application of Corollary 1 of Maller and Mason [14] this implies that the process $\alpha t + V_t$, $t \geq 0$, is in the centered Feller class both at zero and at infinity. Using the notation of [14] and [15] we have where $\gamma_\alpha = \alpha + \int_0^1 y\, \Lambda(dy)$. We get by Theorem 2.3 in Maller and Mason [15] (equation (2.11)) that for some $C > 0$ and all $x > 0$ small enough
Proof of Theorem 4. Choose any $0 < \varepsilon < 1$; then on the set $A_n(\varepsilon)$, for any $k > 1$, by the conditional version of Chebyshev's inequality, Let $\varepsilon = 1/k$ and set We get by (62) that
$$P\left\{ B_{k,n} \,\middle|\, A_n(k^{-1}) \right\} \geq 1 - k^{-1/2}.$$
On the set $A_n(k^{-1}) \cap B_{k,n}$ we have Now for any $0 < \eta < 1$ there exists a $K_\eta > 0$ such that $P\{|X_{m(n)}| \leq K_\eta\} \geq 1 - \eta$. Observe that Therefore we have, with $\varepsilon_k(\eta) := K_\eta k^{-1} + k^{-1/2} E|X|$, Notice that for each fixed $\eta > 0$ and $\delta' < \delta$, for all large enough $k$ and all large enough $n'$ along the subsequence $\{n'\}$ as in (24), Clearly we can choose $\delta' < \delta$ sufficiently close to $\delta$ and $\eta > 0$ small enough so that $\delta' - \eta$ is as close to $\delta$ as desired. Since for each fixed $\eta > 0$, $\varepsilon_k(\eta) \to 0$ as $k \to \infty$, we see that statement (25) holds along the subsequence $\{n'\}$ as in (24).
Proof of Theorem 5. First we introduce some notation. Set, for any $C > 0$ and random variable $Z$, $Z_C = Z \mathbf{1}\{|Z| \leq C\}$ and $Z^C = Z - Z_C$. Define the random variables, for $n \geq 1$, As we noted before, by the results of Griffin [9] our assumption that (27) does not hold is equivalent to so there exist a $\delta > 0$ and a subsequence $\{n_k\}$ of $\{n\}$ such that $n_k \to \infty$ and Now for any $\eta > 0$, $C > 0$ and $K > 0$ Note that by Markov's inequality which by Chebyshev's inequality is Thus for each $\eta > 0$, $C > 0$ and $K > 0$ Next note that for large enough $K$, $1/K^2 + 2/K < \delta/4$.
To complete the proof, notice that $S_{n_k} = O_P(1)$, which implies by tightness that there exist a subsequence $\{n'\}$ of $\{n_k\}$ and a random variable $S$ such that $S_{n'} \xrightarrow{D} S$, which by (66) satisfies $P\{S = 0\} \geq \delta/4$.
We are now ready to prove Theorem 1.
Proof of Theorem 1. Theorem 3 implies that if $Y \in \mathcal{F}_c$ then every subsequential limit law of $T_n$ has a Lebesgue density. Now suppose that $Y \notin \mathcal{F}_c$. Applying a characterization of Maller [12] we know that $Y$ is in the centered Feller class if and only if $\limsup_{x \to \infty}$ Note that if $Y \notin \mathcal{F}_c$ and (27) holds, which means that there is a $K > 0$ and an $x_0 > 0$ such that for all $x \geq x_0$ We show that (67) holds for $XY$. We have that Since $x \leq t/x_0$, we have $t/x \geq x_0$, so we can use the estimate above to obtain
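While the exact form of Maller's characterization is not reproduced above, the classical Feller-type ratio $x^2 P\{Y > x\} / E[Y^2 \mathbf{1}\{Y \leq x\}]$ can be computed in closed form for a Pareto tail; the following numerical sketch (ours, for orientation) shows the ratio staying bounded, consistent with such $Y$ lying in the centered Feller class:

```python
# Numerical sketch (ours): for a Pareto variable with P{Y > x} = x**(-beta)
# on [1, inf) and 0 < beta < 1, the classical Feller-type ratio
#     x**2 * P{Y > x} / E[Y**2 * 1{Y <= x}]
# stays bounded (it tends to (2 - beta)/beta), consistent with such Y
# lying in the centered Feller class. The truncated second moment is
# computed in closed form from the Pareto density beta * y**(-beta - 1).
beta = 0.5

def tail(x):
    return x ** (-beta)

def truncated_second_moment(x):
    # E[Y^2 1{Y <= x}] = integral_1^x y^2 * beta * y**(-beta - 1) dy
    return beta / (2.0 - beta) * (x ** (2.0 - beta) - 1.0)

ratios = [x * x * tail(x) / truncated_second_moment(x) for x in (1e1, 1e3, 1e6)]
# the ratios decrease toward (2 - beta)/beta = 3.0
```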

Now, using that we only have to show that the lim sup of the last term is finite. In order to do this, notice that From this we have and the finiteness of the lim sup of the last factor is exactly the condition $X \in \mathcal{F}$. The proof is finished. □