Multifractal Analysis of a Class of Additive Processes with Correlated Non-Stationary Increments

We consider a family of stochastic processes built from infinite sums of independent positive random functions on R + . Each of these functions increases linearly between two consecutive negative jumps, with the jump points following a Poisson point process on R + . The motivation for studying these processes stems from the fact that they constitute simplified models for TCP traffic on the Internet. Such processes bear some analogy with Lévy processes, but they are more complex in the sense that their increments are neither stationary nor independent. Nevertheless, we show that their multifractal behavior is very much the same as that of certain Lévy processes. More precisely, we compute the Hausdorff multifractal spectrum of our processes, and find that it shares the shape of the spectrum of a typical Lévy process. This result yields a theoretical basis for the empirical discovery of the multifractal nature of TCP traffic.


Background and Motivations
We study in this work a family of stochastic processes built from infinite sums of independent positive random functions on R + . Each of these functions increases linearly between two consecutive negative jumps, with the jump points following a Poisson point process on R + . The interest of this class of processes is twofold. The first is theoretical: It will be seen that the infinite sums of independent random positive functions that we study, though they have non-stationary and correlated increments, have connections with Lévy processes. The multifractal nature of Lévy processes has been demonstrated in [18]. A natural question is to enquire how the multifractal features of Lévy processes are modified when correlation and non-stationarity of the increments are present. It turns out that, at least in the frame we consider here, neither correlations nor non-stationarity modify the shape of the multifractal spectrum. More precisely, we compute the Hausdorff multifractal spectrum of our processes, and we show that it is the same as that of a typical Lévy process. Though the strategy developed in [17,18] to study the multifractal nature of functions with a dense countable set of jump points applies partly here, our more complex setting requires different and/or refined arguments at key points of the study. In particular, we will need a refined version of Shepp's covering theorem for certain coverings of R + by Poisson intervals.
The second interest stems from applications: The motivation for studying the processes considered here is that they constitute simplified but realistic models for TCP traffic on the Internet. Recent empirical studies, beginning with [23,29], have shown that traffic on the Internet generated by the Transmission Control Protocol (TCP) is, under wide conditions, multifractal. This property has important consequences in practice. For instance, one may show that the queuing behavior of multifractal traffic is significantly worse than that of non-fractal traffic (see [13] for details). It is therefore desirable to understand which features of TCP are responsible for multifractality, in order to try and reduce their negative impact on, e.g., the queuing behavior.
"Explaining" the multifractality of traffic traces from basic features of the Internet is a difficult task. Models investigated so far have been based on the paradigm of multiplicative cascades ( [13], [24]). Indeed, with few exceptions (most notably [1,15,17,18,19]), multifractal analysis has mainly been applied to multiplicative processes. An obvious reason is that a multiplicative structure often leads naturally to multifractal properties ( [25,26,8]).
However, there exists a number of real-world processes for which there is convincing experimental evidence of multifractality, but which do not display a naturally associated multiplicative structure. Among these, a major example is Internet traffic: Multiplicative models for TCP are not really convincing because there is no physical evidence that genuine traffic actually behaves as a cascading or multiplicative process. As a matter of fact, TCP traffic is rather an additive process, where the contributions of individual sources of traffic are merged in a controlled way.
The analysis developed below shows that merely adding sources managed by TCP does lead to a multifractal behavior. This result provides a theoretical confirmation of the empirical finding that TCP traffic is multifractal. Furthermore, it sheds light on the possible causes of this multifractality: Indeed, it indicates that it may be explained from the very nature of the protocol, with no need to invoke a hypothetical multiplicative structure. More precisely, multifractality in TCP already arises from the interplay between the additive increase multiplicative decrease (AIMD) mechanism and the variable synchronization of the sources. Finally, our computations allow one to trace back, in a quantitative way, the main multifractal features of traces to specific mechanisms of TCP. This may have practical consequences in traffic engineering.

A simplified model of TCP traffic
The exact details of TCP are too intricate to allow for a tractable mathematical analysis. We consider a simplified model that captures the main ingredients of the congestion avoidance and flow control mechanisms of TCP. The reader interested in the precise features of TCP may consult [7,24,31]. Our model goes as follows (more details on the model may be found in [7]):
1. Each "source" of traffic $S_i$ sends "packets" of data at a time-varying rate. At time t, it sends $Z_i(t)$ packets.
2. Between two consecutive time instants t and t + 1, two things may happen: The source i may experience a "loss", i.e. the flow control mechanism of TCP detects that a packet sent by the source did not reach its destination. In this case, TCP tries to avoid congestion by forcing the source to halve the number of packets sent at time t + 1 (multiplicative decrease mechanism). In other words, $Z_i(t+1) = Z_i(t)/2$. If there is no loss, the source is allowed to increase $Z_i(t)$ by 1, i.e. $Z_i(t+1) = Z_i(t) + 1$ (additive increase mechanism).
3. The durations $(\tau^{(i)}_k)_{k\ge 1}$ between the consecutive time instants $t_k$ and $t_{k+1}$ at which a given source i experiences a loss are modeled by a sequence of independent exponential random variables with parameter $\lambda_i$.
4. The total traffic Z is the sum of an infinite number of independent sources with varying rates $\lambda_i$, where $(\lambda_i)_{i\ge 1}$ is a non-decreasing sequence of positive numbers.
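The update rule in item 2 can be illustrated with a minimal sketch. The function name `aimd_trace` and the representation of losses as a list of booleans (one per time step) are assumptions made for illustration only; in the model itself the loss instants are random, with exponential inter-loss durations.

```python
def aimd_trace(z0, losses, mu=2.0):
    """Discrete-time AIMD sketch for a single source.

    z0     : initial number of packets Z(0)
    losses : iterable of booleans, True when a loss occurs during that step
             (hypothetical input format, not part of the original model)
    mu     : reduction factor applied at a loss (2 for TCP)
    """
    trace = [float(z0)]
    for lost in losses:
        z = trace[-1]
        # multiplicative decrease on a loss, additive increase otherwise
        trace.append(z / mu if lost else z + 1.0)
    return trace


print(aimd_trace(4, [False, True, False]))  # [4.0, 5.0, 2.5, 3.5]
```

The resulting sawtooth shape (slow linear growth interrupted by abrupt drops) is the elementary pattern that the additive model superposes over infinitely many sources.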
As compared to the true mechanisms of TCP, our model contains a number of simplifications (see [7]). However, except for one, these simplifications are not essential, at least as far as multifractality is concerned: Of all our assumptions, only the one of independence in (4) is clearly an oversimplification. Indeed, it is obvious that almost all losses are a consequence of congestion, which is caused by the fact that several sources are in competition. This gives rise to a strong correlation in the behavior of the sources. Unfortunately, introducing correlations leads to a significantly more complex analysis. One should remark nevertheless that the competition between sources is implicitly taken into account through the fact that sources indexed by large integers are subject to more frequent losses. We hope to investigate the effect of correlations on the multifractal behavior in a future work. Note also that most other approaches dealing with the fractal analysis of TCP make similar assumptions of independence. This is in particular the case in the models [4,16,22] discussed below.
Our model takes into account the main features of TCP, while allowing at the same time a thorough mathematical analysis: We show in the sequel that Z is multifractal, and we compute its Hausdorff multifractal spectrum. Both the multifractality of Z and the shape of its spectrum corroborate empirical findings ([23,29]).
It is interesting to compare our approach with previous works dealing with the mathematical modeling of Internet traffic in relation with its (multi-)fractal behavior. A large number of studies ([16,22,27]) have given empirical evidence that many types of Internet traffic are "fractal", in the sense that they display self-similarity and/or long range dependence. Most theoretical models that have been developed so far have focused on explaining such behaviors. In that view, a popular class of models is based on the use of "ON/OFF" sources. An ON/OFF source is a source of traffic that is either idle, or sends data at a constant rate. Adequate assumptions on the distribution of the ON and/or OFF periods make it possible to obtain fractal properties. More precisely, the model in [22] considers independent and identically distributed ON/OFF sources, where the lengths of the ON and OFF periods are independent random variables. In addition, the distribution of the ON and/or of the OFF periods is assumed to have a regularly varying tail with exponent β ∈ (1, 2). Then, when the number of sources tends to infinity, and if one rescales time slowly enough, the resulting traffic, properly normalized, tends to a fractional Brownian motion with exponent 3/2 − β/2. In [28], it is shown that the same model leads to a β-stable Lévy motion when the time rescaling is "fast". Finally, in a recent work, Gaigalas and Kaj ([14]) investigated the intermediate regime where time is rescaled proportionally to the number of sources. They found that the limit behavior in this case is neither a stable motion nor a fractional Brownian motion.
Another elegant model, which does not require a double re-normalization, is presented in [16]. It also uses a superposition of independent ON/OFF sources, but this time with a sequence of ratios for Poisson-idle and Poisson-active periods assumed to decay polynomially. Again, the resulting process displays fractal features.
A major feature of the above models is that the sources, in their ON mode, send data at a constant rate. This is obviously a simplification, since one does not take into account the strong and rapid variations induced by the flow control mechanisms of TCP. This seems to be of no consequence for studying long range dependence or self-similarity: These properties are obtained through the slow decay of the probability of observing large busy or idle periods. These slow decays may in turn be traced back to certain large scale features, such as, e.g., the distribution of the file sizes in the Internet ([11]). More generally, it is usually accepted that long memory is a property of the network.
In contrast, the use of ON/OFF sources does not allow a meaningful investigation of the multifractal properties of traffic: Contrary to long range dependence, multifractality is a short-time behavior. An ON/OFF modeling is clearly inadequate in this frame since it washes out all the (intra-source) high frequency content. At small time scales, the role of the protocol, i.e. TCP, becomes predominant ([4]). Incorporating some sort of modeling of TCP is thus necessary if one wants to perform a sensible high-frequency analysis: The local, rapid variations due to TCP are decisive from the multifractal point of view.
In that view, it is interesting to note that the limiting behavior of the ON/OFF model which is usually considered is the one leading to fractional Brownian motion. It is therefore not multifractal. In contrast, the other limiting case gives rise to a stable motion, which is multifractal. A possible cause might be that, in this regime, the intersource high frequency content (i.e. the rapid variations in the total traffic resulting from de-synchronized sources) is large enough to produce multifractality. However, it is not clear which actual mechanisms in the Internet would favor this particular regime. It would also be interesting to investigate whether the critical case studied in [14] is also multifractal.
Another approach that makes it possible to "explain" the multifractal features of TCP is based on the use of "fluid models" ([4]): Rather than representing TCP at the packet level, one uses fluid equations to describe the joint evolution of throughput for sessions sharing a given router. The interest of this approach is that it represents the traffic as simple products of random matrices, while still capturing the AIMD mechanism of TCP. In particular, [4] shows through numerical simulations that this model does lead to a multifractal behavior. In other words, the fluid model indicates that the multifractality is already a consequence of the AIMD mechanism. This numerical result corroborates our theoretical findings. A network extension of the fluid model is studied in [5]. It also points to multifractality of the traces, with additional intriguing fractal features.
Note also that in a series of papers ([2, 3, 10]), F. Baccelli and collaborators have performed a fine analysis of TCP at the packet level. They have in particular shown that TCP is Max-Plus linear. A desirable extension of our work would be to study the multifractal properties of these more precise models.

A class of additive processes with non-stationary and correlated increments
We now describe our model in a formal way. Let (λ i ) i≥1 be a non-decreasing sequence of positive numbers.
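For every i ≥ 1, let $(\tau^{(i)}_k)_{k\ge 1}$ be a sequence of independent exponential random variables with parameter $\lambda_i$. Define $\tau^{(i)}_0 = 0$ and
$$T^{(i)}_k = \sum_{j=0}^{k} \tau^{(i)}_j, \qquad k \ge 0.$$
The σ-algebras $\sigma(\tau^{(i)}_k, k \ge 1)$ are assumed to be mutually independent.
We consider an infinite sequence of sources $(S_i)_{i\ge 1}$. The "traffic" $(Z_i(t))_{t\ge 0}$ generated by the source $S_i$, i ≥ 1, is modeled by the following stochastic process:
$$Z_i(t) = \mu^{-k} Z_i(0) + \sum_{j=1}^{k} \mu^{-(k+1-j)}\,\tau^{(i)}_j + t - T^{(i)}_k \quad \text{for } t \in [T^{(i)}_k, T^{(i)}_{k+1}),\ k \ge 0,$$
where $(Z_i(0))_{i\ge 1}$ is a sequence of non-negative random variables such that the series $\sum_{i\ge 1} Z_i(0)$ converges, and µ is a fixed real number larger than one (typically equal to 2 in the case of TCP). In other words, $Z_i$ increases at unit rate between two consecutive jump points and is divided by µ at each $T^{(i)}_k$, k ≥ 1.
The resulting "global traffic" is the stochastic process
$$Z(t) = \sum_{i\ge 1} Z_i(t), \qquad t \ge 0.$$
Our first task is to give conditions under which Z is almost surely everywhere finite.
Proposition 1 If $\sum_{i\ge 1} 1/\lambda_i < \infty$ then, with probability one, the stochastic process Z is finite everywhere. If $\sum_{i\ge 1} 1/\lambda_i = \infty$ then, with probability one, Z(t) = ∞ almost everywhere with respect to the Lebesgue measure.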
The proof of Proposition 1 is postponed to Section 5.
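As an illustration of the dynamics of a single source, the following sketch evaluates $Z_i(t)$ in two ways, under the assumption (read off from the model description) that the source grows at unit rate between losses and is divided by µ at each loss. The function names and the choice to pass the inter-loss durations as a plain list are hypothetical; after the last supplied duration the sketch simply assumes no further loss.

```python
from itertools import accumulate


def source_rate(t, taus, mu=2.0, z0=0.0):
    """Closed-form value of one source at time t.

    taus : inter-loss durations tau_1, tau_2, ... (hypothetical input format)
    """
    jump_times = list(accumulate(taus))            # T_1, T_2, ...
    k = sum(1 for T in jump_times if T <= t)       # number of losses up to time t
    last_jump = jump_times[k - 1] if k > 0 else 0.0
    # geometric memory of past durations: tau_{j+1} is weighted by mu^-(k-j)
    memory = z0 / mu**k + sum(taus[j] / mu ** (k - j) for j in range(k))
    return memory + (t - last_jump)                # linear growth since last loss


def source_rate_recursive(t, taus, mu=2.0, z0=0.0):
    """Same quantity, obtained by replaying the AIMD dynamics loss by loss."""
    z, now = z0, 0.0
    for tau in taus:
        if now + tau > t:
            break
        z = (z + tau) / mu                         # grow for tau, then divide by mu
        now += tau
    return z + (t - now)


print(source_rate(3.5, [1.0, 2.0]))                # 1.75
print(source_rate_recursive(3.5, [1.0, 2.0]))      # 1.75
```

The agreement of the two functions reflects that the closed form is just the unrolled recursion: each past duration is discounted by one factor µ per subsequent loss.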
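We are interested in the multifractal nature of the sample paths of Z. In order to analyze this matter, it will be useful to decompose each elementary process $Z_i$ in the following way: on $[T^{(i)}_k, T^{(i)}_{k+1})$, k ≥ 0 (with $T^{(i)}_0 = 0$), write $Z_i(t) = X_i(t) + R_i(t)$, where
$$X_i(t) = t - T^{(i)}_k \quad\text{and}\quad R_i(t) = \mu^{-k} Z_i(0) + \sum_{j=1}^{k} \mu^{-(k+1-j)}\,\tau^{(i)}_j.$$
Then, under the assumptions of Proposition 1, Z is the sum of the two non-negative processes $X = \sum_{i\ge 1} X_i$ and $R = \sum_{i\ge 1} R_i$.
It will be shown that the processes Z and X share the multifractal spectrum of a Lévy process without Brownian part and whose characteristic measure is $\Pi = \sum_{i\ge 1} \lambda_i\,\delta_{-1/\lambda_i}$ (see [18] for the multifractal nature of Lévy processes).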
A heuristic explanation of this fact is that the process X "resembles" the Lévy process L defined almost surely as $L(t) = \lim_{N\to\infty} \sum_{i=1}^{N} L_i(t)$, with $L_i(t) = t - \lambda_i^{-1}\,\#\{k \ge 1 : T^{(i)}_k \le t\}$ (see [9]): In particular, both X and L jump at each point $T^{(i)}_k$ (i, k ≥ 1); the jump sizes are mutually independent random variables for both processes; and, finally, at each $T^{(i)}_k$, the jump size of L is the expectation of the jump size of X. A major difference is that the increments of X are both correlated and non-stationary. The same is true for the increments of Z. Moreover, the sizes of the jumps of Z cease to be independent. This has important consequences in performing the multifractal analysis of Z: Even though the approach used by S. Jaffard in studying the multifractal nature of some functions with a countable dense set of jump points ([17], [18]) proves useful here, it is necessary to involve different and refined tools for the study of X and Z. This will be discussed in more detail in Remark 1.
In the present work, the multifractal nature of Z is investigated through the computation of its spectrum of singularities or Hausdorff multifractal spectrum. This spectrum gives geometrical information on the singularity structure of Z. Another approach to multifractal analysis is based on a statistical description of the distribution of the singularities. It leads to the computation of the so-called large deviation spectrum. The large deviation spectrum and related quantities pertaining to the statistical analysis of Z (as, e.g., its Legendre multifractal spectrum) are studied in the companion paper [6]. These quantities are the ones usually considered in applications (see for instance [29,23,24,13]).
The spectrum of singularities. We need the notion of pointwise regularity of a real-valued function on a non-trivial subinterval I of R. If f is such a function, $t_0 \in \mathrm{Int}(I)$ and $s \in \mathbb{R}_+$, then f belongs to $C^s(t_0)$ if there exist C > 0 and a polynomial $P_{t_0}$ of degree at most [s] such that, in a neighborhood of $t_0$,
$$|f(t) - P_{t_0}(t - t_0)| \le C\,|t - t_0|^s.$$
The Hölder exponent of f at $t_0$, denoted $h_f(t_0)$, is defined as
$$h_f(t_0) = \sup\{s \ge 0 : f \in C^s(t_0)\}.$$
The spectrum of singularities or Hausdorff multifractal spectrum of f describes, for every h ≥ 0, the "size" of the set $S_h$ of points in Int(I) where f has Hölder exponent h. More precisely, let dim E denote the Hausdorff dimension of the set E (we adopt the convention dim ∅ = −∞). Then the spectrum of singularities of f is the function
$$d_f : h \ge 0 \mapsto \dim S_h.$$
The spectrum of singularities of the sample paths of Z (here $I = \mathbb{R}_+$) is governed by the following index
$$\beta = \inf\Big\{\gamma \ge 0 : \sum_{i\ge 1} \lambda_i^{1-\gamma} < \infty\Big\},$$
which is also the Blumenthal-Getoor [12] index of the Lévy process L (β ∈ [1, 2] under the assumptions of Proposition 1). Our main result is:
With probability one, X and Z are well defined and they share the following spectrum of singularities:
$$d_\beta(h) = \begin{cases} \beta h & \text{if } h \in [0, 1/\beta],\\ -\infty & \text{otherwise.} \end{cases}$$
Remark 1. The spectra of X and Z are the same as that of the Lévy process L defined above. The condition $\sum_{i\ge 1} 1/\lambda_i < \infty$ is also necessary and sufficient to define L, but [18] assumes slightly more than $\sum_{i\ge 1} 1/\lambda_i < \infty$ to derive the multifractal spectrum of L when β = 2. More precisely, the additional assumption in [18] is (C) : This restriction is due to the use of a certain lemma by Stute in finding the lower bound estimate of the Hölder exponents. In fact this lemma gives an upper bound on the number of jump points of $\sum_{2^j \le \lambda_i < 2^{j+1}} L_i(\cdot)$ in any dyadic interval. In [18], Stute's result is combined with a concentration inequality and the fact that the jump size is of the order of $2^{-j}$ at jump points of $\sum_{2^j \le \lambda_i < 2^{j+1}} L_i(\cdot)$.
Under (C), this approach also yields a lower bound estimate for the Hölder exponents of X (not for those of Z) if, on the one hand, one uses the same truncations of the $X_i$'s as those used in this paper, and on the other hand one interprets the $X_i$'s as the difference between a drift and a pure jump process. Nevertheless, there remain problems with the lower bound estimates of the Hausdorff dimensions of the level sets $S_h$, as well as with the computation of the maximal Hölder exponent of X. This is due to the fact that the jump size δ at jump points of $\sum_{2^j \le \lambda_i < 2^{j+1}} X_i(\cdot)$ ceases to be of the same order as $2^{-j}$ (more precisely, log δ is not of the order of −j). In particular, in [18] the maximal Hölder exponent of L is found using Shepp's Theorem on the covering of the real line by Poisson intervals centered at the jump points of L. Here, we need a refinement of Shepp's result for "economic" coverings by Poisson intervals centered at jump points of the $\sum_{2^j \le \lambda_i < 2^{j+1}} X_i(\cdot)$'s selected to satisfy that the jump size at each of those points is of the same order as $2^{-j}$ (Theorem 3).
Our lower bound estimate of the Hölder exponents of X and Z is not based on Stute's lemma. Rather, we rely on a classical concentration inequality (Bennett inequality, Lemma 3(ii)). As a consequence, we avoid the restriction (C) in the study of X in Theorem 1 when β = 2.
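To make the role of the index β concrete, here is a worked example under an assumed choice of rates that is not taken from the paper: $\lambda_i = i^a$ with a > 1. It takes β to be the Blumenthal-Getoor index of the measure $\Pi = \sum_{i\ge 1}\lambda_i\delta_{-1/\lambda_i}$, that is, the infimum of the γ for which $\int |x|^\gamma\,\Pi(dx) = \sum_{i\ge 1}\lambda_i^{1-\gamma}$ is finite.

```latex
% Assumed example (not from the paper): \lambda_i = i^a with a > 1,
% so that \sum_i 1/\lambda_i < \infty.
\[
  \sum_{i\ge 1} \lambda_i^{\,1-\gamma} \;=\; \sum_{i\ge 1} i^{-a(\gamma-1)} \;<\; \infty
  \quad\Longleftrightarrow\quad a(\gamma-1) > 1
  \quad\Longleftrightarrow\quad \gamma > 1 + \tfrac{1}{a},
\]
\[
  \text{hence}\qquad \beta \;=\; 1 + \frac{1}{a} \;\in\; (1,2),
  \qquad
  d_\beta(h) \;=\; \Big(1+\frac{1}{a}\Big)\,h
  \quad\text{for } h \in \Big[0, \frac{a}{a+1}\Big].
\]
% For instance a = 2 gives \beta = 3/2 and a largest Hölder exponent 1/\beta = 2/3.
```

In this example, loss rates that grow faster (larger a) push β towards 1, so the spectrum becomes less steep and the largest Hölder exponent 1/β moves closer to 1.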
Theorem 1 possesses the following natural extension: and .
With probability one, the process Z is well defined and its spectrum of singularities is d β .
In other words, the multifractal nature of the sum is not affected if µ is replaced by µ i in Z i and if the sequence (µ i ) remains bounded and does not tend "too fast" to 1. Theorem 2 includes many potential or actual variants of TCP. For instance, one could imagine treating in different ways sources with different intensities λ i : As long as the reduction factors are bounded and do not approach 1 too fast, the multifractal spectrum remains unchanged. This suggests that reducing the multifractality of TCP might require more drastic changes.

Proof of Theorem 1
The proof of Theorem 1 is decomposed into several steps. In Section 4.1, we set some definitions useful in the sequel.

Definitions and notations
Due to the last assertion of Lemma 1 (Section 5) and the definition of the R i s, the component involving Z i (0) is too small to play a role in computing the Hölder exponents of Z on (0, ∞). Consequently, we assume without loss of generality that Z i (0) = 0 almost surely for all i ≥ 1.
It is enough to establish that for every integer T > 0, the restrictions of X and Z to (0, T ) have almost surely the spectrum of singularities given in Theorem 1.
Therefore, in the sequel we fix T ∈ N * and study X and Z on (0, T ).
Moreover, we may and will assume that $\inf_{i\ge 1} \lambda_i \ge 2$ without loss of generality, since we work under the assumption $\sum_{i\ge 1} 1/\lambda_i < \infty$.
We need some new definitions.
The following sets will prove to be useful.
For every j ≥ 1 and δ > 0 define Then for every δ > 0 define For every j ≥ 1 define where #G j denotes the cardinal of the set G j , with the convention log(0) = −∞. It follows from the definition of β that For j ≥ 1 define For every ε > 0 and m ≥ 1 define .
For every J ≥ 0, denote by D J the set of dyadic points of the J th generation contained in [0, T ].

A lower bound for
This section is devoted to the proof of the following proposition. It involves intermediate results stated and proved in Section 5.
Proposition 2 Assume the hypotheses of Theorem 1. Fix δ > β. With probability one, for every $t_0 \in (0, T)$ and Y ∈ {X, R, Z}, if $t_0$ is not a jump point of Y then
$$t_0 \notin E_\delta \;\Longrightarrow\; h_Y(t_0) \ge 1/\delta. \qquad (1)$$
Proof. Due to the equality Z = X + R, and the fact that X, R and Z have the same jump points, we only have to deal with Y ∈ {X, R}.
Fix ε > 0 small enough so that: Fix η ∈ (0, T) and then $\Omega' = \Omega'(\eta)$ a subset of Ω of probability 1, such that for every $\omega \in \Omega'$, there exists $m_0(\omega) \ge 1$ such that for every $m \ge m_0(\omega)$, the conclusions of Corollary 3, Lemma 4 and Lemma 6(ii) hold, as well as that of Lemma 1 (with K = 6) for $i \in G_j$ when j ≥ m/δ and $G_j \neq \emptyset$, and also those of Lemmas 5 and 7 if $j \ge (m + r_m)/\beta_j$. Fix such an $m_0(\omega)$ for every $\omega \in \Omega'$.
Now, fix $\omega \in \Omega'$, and then $t_0 \in (\eta, T)$ such that $t_0 \notin E_\delta(\omega)$ and $t_0$ is not a jump point of Y(ω). Since $t_0 \notin E_\delta(\omega)$, we can choose $j_0 \ge m_0(\omega)/\delta$ such that for every $j \ge j_0$, $t_0 \notin E_{j,\delta}$. The Hölder exponent of Y at $t_0$ is the same as that of $\sum_{j\ge j_0}\sum_{i\in G_j} Y_i$. We also choose $j_0$ so that $\beta_j < \beta + \varepsilon < \delta$ and $(j+1)\sqrt{j} \le 2^{\varepsilon j}$ for $j \ge j_0$.
To conclude, we need the following three upper bounds (a), (b), (c): $\sum_{i\in G_j} Y_i$ has no jump between t and $t_0$. Consequently, (b) By Lemma 6(ii), for some constant C = C(ω), for every $m \ge \delta j_0$ and t ∈ (η, T) such by property (ii) for ε. Moreover, due to Lemma 1, we have i∈G j X i (d and if β = 2, since $\beta_j < 2 + \varepsilon$, property (iv) for ε yields j, (m+rm)/β j ≤j≤m(β,ε) On the other hand, since $\beta_j < \beta + \varepsilon$ and $(j+1)\sqrt{j} \le 2^{\varepsilon j}$, property (ii) for ε yields j≥m(β,ε) Finally, we get
From (a), (b) and (c), we deduce that the Hölder exponent of $\sum_{j\ge j_0}\sum_{i\in G_j} Y_i$ and Y at $t_0$ is at least 1/δ. So, for every $\omega \in \Omega'(\eta)$, if $t_0 \in (\eta, T)$ is not a jump point of Y, (1) holds. One concludes by considering $\bigcap_{n\ge 1} \Omega'(1/n)$.
Remark 2. Under the condition $\sum_{i\ge 1} 1/\lambda_i < \infty$, the above computations imply the following property for Y ∈ {X, R}, even without the knowledge of the finiteness of $\sum_{i\ge 1} Y_i$: With probability one, for every η ∈ (0, T) there exists α > 0 such that if t, t′ ∈ (η, T) and |t′ − t| ≤ α then lim J→∞ J j=1 exists. This is a key point in the proof of Proposition 1.
Proof. If Y ∈ {X, Z} and t > 0 is a jump point of Y , |∆ Y (t)| stands for the size of the corresponding jump.
Fix t 0 ∈ E δ,ϕ . Fix a sequence (r jn ) n≥1 of points such that for every n ≥ 1 there exist i ∈ G jn and 1 ≤ k ≤ N By construction we have Moreover, The conclusion follows from Lemma 1 in [17] (also Lemma 4 in [18]).
In the next two subsections, the sets S h are the level sets of the Hölder exponents of Y ∈ {X, Z}.

Upper bound for dim S h
Proposition 4 With probability one, dim S h ≤ βh for all 0 ≤ h ≤ 1/β.
Proof. The set of rational numbers being countable, the conclusion of Proposition 2 holds almost surely simultaneously for all rational δ > β. Consequently, due to this proposition, with probability one, if $t_0 \in S_h$ then $t_0 \in \bigcap_{\delta\in\mathbb{Q},\ \delta<1/h} E_\delta$. Moreover, it is shown in [18] that dim E δ ≤ β/δ. Letting δ increase to 1/h along the rationals yields dim S h ≤ βh.

Lower bound for dim S h
Let ϕ be as in the previous section. For h ≥ 0, define Proof. We saw that the conclusion of Proposition 2 holds almost surely simultaneously for all rational δ > β. This implies that with probability one, for every h ∈ [0, 1/β], if $t_0 \in S_{h,\varphi}$, then, due to Proposition 2, $h_Y(t_0) \ge 1/\delta$ for all rational δ such that h > Let $(j_n)_{n\ge 1}$ be an increasing sequence of integers such that $G_{j_n} \neq \emptyset$ and $\lim_{n\to\infty} \beta_{j_n} = \beta$. Then for b ≥ 1 define For every d ≥ 0, let $H^d$ be the Hausdorff measure defined with the gauge function $x \ge 0 \mapsto (\log x)^2 x^d$.

Proposition 6 With probability one
The proof is postponed to Section 6 (assertion (i) is the only one to be proved; the other one is a consequence of Theorem 2 in [18] or [19]).
Since $S_0 \neq \emptyset$ (it contains at least the jump points), it follows from Propositions 4 and 5, as well as Corollary 1, that with probability one, dim S h = βh for all h ∈ [0, 1/β]. We use the terminology "economic covering" with respect to the analogous property satisfied by the largest sets E δ : (R) with probability one, (0, T ) ⊂ E δ for all δ < β. The property (R) is used in [18] to prove for Lévy processes the result corresponding to Corollary 2 below. Moreover, (R) is a consequence of Shepp's theorem for the covering of the real line by Poisson intervals.

S
The proof of Theorem 3 is postponed to Section 6.

Proofs of basic lemmas and propositions
Recall that we assumed without loss of generality that λ i ≥ 2 for all i ≥ 1. For t > 0 and is a Poisson random variable with intensity λ i t.
Lemma 1 Assume i≥1 1/λ i < ∞. For every K ≥ 2, with probability one, there exists i 0 ≥ 1 such that for every i ≥ i 0 , In particular, for every i ≥ i 0 and t ∈ (K log(λ i )/λ i , T ], one has k (i) The first assertion of the lemma is a consequence of the Borel-Cantelli Lemma. The other one is a consequence of the first assertion and the definition of k (i) t . Proof of Proposition 1. Suppose i≥1 1/λ i < ∞. Assume we have shown the following property (P): with probability one, the processes X and R are finite at every point t of a dense countable subset of R + .
Then, since X and R are respectively the infinite sums of the nonnegative processes X i and R i , the property obtained in Remark 2 after the proof of Proposition 2 and (P) together show that almost surely X and R are finite everywhere, as well as their sum Z.
To see that (P) holds, it is enough to show that for every t > 0 and Y ∈ {X, R}, $\sum_{i\ge 1} Y_i(t) < \infty$ almost surely. Fix t > 0. Due to Lemma 1, $k^{(i)}_t$ goes so fast to infinity that one can assume that $Z_i(0) = 0$ for all i ≥ 1. Then, the computations done in the proofs of Lemma 8(i) and Lemma 10(i) show that $\sum_{i\ge 1} E\big(X_i(t) + R_i(t)\big) < \infty$, hence the conclusion.
Suppose now that $\sum_{i\ge 1} 1/\lambda_i = \infty$. Then we can use Lemma 8(ii) to show that Since the $X_i(t)$'s are independent random variables, Kolmogorov's three series theorem (see [32] p. 106) shows that $\sum_{i\ge 1} X_i(t) = \infty$ almost surely. Moreover, the function f : (t, ω) ∈ R * Consequently, the Fubini theorem applied for every n ≥ 1 with the restriction of f to [0, n] × Ω and the product measure $\ell_n \otimes P$, where $\ell_n$ denotes the restriction of the Lebesgue measure to [0, n], implies that with probability one, X(t) = ∞ almost everywhere. The same holds for Z since Z ≥ X.
For m ≥ 1 define r m = 2 log 2 144 max 1, 1 (iii) There exists m 0 ≥ 1 such that for all m ≥ m 0 , if j ≥ m(β, ε), η < t < t ≤ T and t − t ≤ 2 −m+1 then The proof of Lemma 2 uses the following well-known inequalities, which are essentially, e.g., Lemma 1.5 and Bennett inequality (6.10) in [21]. Lemma 3 Let (V i ) 1≤i≤n be a finite sequence of independent random variables with mean 0. Assume that there exists γ > 0 such that |V i | ≤ γ almost surely for all i.
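For reference, a standard textbook form of Bennett's inequality, together with the Bernstein-type consequence that is typically used in estimates of this kind, reads as follows. This is a sketch of the classical statement, not a reconstruction of the exact wording of Lemma 3, which may state a variant; the symbols $b^2$ and $s$ are notation chosen here.

```latex
% Standard form, with V_1,...,V_n independent, E V_i = 0, |V_i| \le \gamma a.s.,
% and b^2 \ge \sum_{i=1}^n E(V_i^2).  For every s > 0:
\[
  P\Big(\sum_{i=1}^{n} V_i \ge s\Big)
  \;\le\; \exp\Big(-\frac{b^2}{\gamma^2}\, \psi\Big(\frac{\gamma s}{b^2}\Big)\Big),
  \qquad \psi(u) = (1+u)\log(1+u) - u .
\]
% Since \psi(u) \ge \frac{u^2}{2(1+u/3)}, this implies Bernstein's inequality
\[
  P\Big(\sum_{i=1}^{n} V_i \ge s\Big) \;\le\; \exp\Big(-\frac{s^2}{2\,(b^2 + \gamma s/3)}\Big),
\]
% and in the regime \gamma s \le b^2 the exponent is at least 3s^2/(8b^2) \ge s^2/(4b^2):
\[
  \gamma s \le b^2 \;\Longrightarrow\;
  P\Big(\sum_{i=1}^{n} V_i \ge s\Big) \;\le\; \exp\Big(-\frac{s^2}{4 b^2}\Big).
\]
```

The last display is the Gaussian-type regime: when the deviation s is small relative to $b^2/\gamma$, bounded independent summands concentrate like a Gaussian variable with variance of order $b^2$.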
Proof of Lemma 2. (i) The case Y = X: One verifies that by our choice for γ j , as m → ∞, for m/3 ≤ j ≤ m(β, ε) such that G j = ∅ and η < t < t ≤ T such that and λ i (t − t) → 0. It follows from Lemma 9 applied with γ = γ j (γ j < η for j large enough) that as m → ∞ The case Y = R: It follows from a combination of Lemma 10(i) and (ii) that there exists K > 0 such that for m large enough, for m/3 ≤ j ≤ m(β, ε) and By our choice γ j = 6(j + 1)2 −j , on the one hand γ 2 On the other hand, Before applying Lemma 10(iii), notice that for . Also before applying Lemma 10(iv), use Lemma 1 (with K = 6) to get . Then, it follows from Lemma 10(iii)(iv) applied with γ = γ j that as

To find the lower bound for
Then, the Cauchy-Schwarz inequality together with Lemma 10(iii)(iv) and the above estimates show that Consequently, we can apply Lemma 3(ii) to get Now using the right inequality in (i) and again the fact that j ≤ m yields s 2 /4b 2 Y G j (t, t ) ≥ 9m 2 , hence the conclusion.
Lemma 4 Fix ε > 0 and η ∈ (0, T ). There exists C > 0 such that for every m large enough, sup Proof. We saw in the proof of Lemma 2(i) that for every j ≥ 1, $i \in G_j$, and 0 < η < t, t′ ≤ T ,
Lemma 5 Fix η ∈ (0, T ). With probability one, there exists C > 0 such that for Y ∈ {X, R} and j large enough, the jump sizes of $\sum_{i\in G_j} Y_i$ in (η, T ) are bounded by $C j 2^{-j}$.
Proof. Recall that Z i (0) is assumed to be 0 for all i ≥ 1. Due to Lemma 1 and the respective definitions of X i and R i , with probability one, there exists K > 0 such that for i large enough both sup t∈(η,T ) X i (t) and sup t∈(η,T ) R i (t) are less than K log(λ i )/λ i . This is enough to conclude.
Lemma 6 (i) There exists $m_0 \ge 1$ such that for all $m \ge m_0$, for Y ∈ {X, R}, the probability of the event $E_m$ = {there exists $[m/\delta] \le j \le (m + r_m)/\beta_j$ such that $\sum_{i\in G_j} Y_i$ has, on at least one of the $T 2^{m-1}$ dyadic subintervals of length $2^{-m+1}$ of [0, T ], more than $m^8$ jumps} is bounded by $2^{-m}$.
(ii) Fix η ∈ (0, T ). With probability one, there exist $m_0 \ge 1$ and a constant $C_\eta > 0$ such that for $m \ge m_0$, for all t, t′ ∈ (η, T ] such that $2^{-m} \le |t' - t| \le 2^{-m+1}$, Proof. (i) Since the $Y_i$ are independent, the number of jump points of $\sum_{i\in G_j} Y_i$ in a dyadic interval of the (m − 1)th generation is a Poisson variable of parameter Consequently, for m large enough, the probability that $\sum_{i\in G_j} Y_i$ has, on at least one of the $2^{m-1} T$ dyadic subintervals of length $2^{-m+1}$ of [0, T ], more than $m^8$ jumps is bounded by (ii) Due to the first part of the lemma and the Borel-Cantelli lemma, with probability one, there exists $m_0 \ge 1$ such that if $m \ge m_0$ then $E_m$ does not happen, hence for all t, t′ ∈ (η, T ] such that $2^{-m} \le |t' - t| \le 2^{-m+1}$ and for all j such that $[m/\delta] \le j \le (m + r_m)/\beta_j$, $\sum_{i\in G_j} Y_i$ has at most $2m^8$ jump points between t and t′. By Lemma 5, $m_0$ can also be chosen so that these jumps are $O(j 2^{-j})$, so $O(m) 2^{-j}$ here. Consequently, that is the desired result.
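The argument in part (i) rests on the fact that a Poisson variable rarely exceeds a level much larger than its parameter. A small numerical sketch of this fact is given below; the parameter `theta` and the level `n` are generic placeholders (the actual Poisson parameter of the lemma is not reproduced here), and the Chernoff bound shown is the classical one, not necessarily the estimate used by the authors.

```python
import math


def poisson_tail(theta, n):
    """Exact P(N > n) for N ~ Poisson(theta), by summing the probability mass."""
    cdf = sum(math.exp(-theta) * theta**k / math.factorial(k) for k in range(n + 1))
    return max(0.0, 1.0 - cdf)


def chernoff_bound(theta, n):
    """Classical Chernoff bound P(N >= n) <= exp(-theta) * (e*theta/n)**n, for n > theta."""
    return math.exp(-theta) * (math.e * theta / n) ** n


# Generic illustration: a level eight times the mean is already very unlikely.
theta, n = 1.0, 8
print(poisson_tail(theta, n), chernoff_bound(theta, n))
```

Because the bound decays faster than geometrically in n once n exceeds the parameter, it survives a union bound over the polynomially or exponentially many dyadic intervals and indices j involved, which is what makes a Borel-Cantelli argument possible.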
Lemma 7 Fix ε > 0. With probability one, for Y ∈ {X, R} and j large enough, the distance between two consecutive jump points of i∈G j Y i in [0, T ] is at least 2 −2(β+ε)j .
Proof. The jump points of $\sum_{i\in G_j} Y_i$ are also the jump points of a Poisson process of intensity $\Lambda_j = \sum_{i\in G_j} \lambda_i$. Consequently, there exists a sequence $(\theta^{(j)}_k)_{k\ge 1}$ of independent exponential random variables with parameter $\Lambda_j$ such that almost surely the jump points of $\sum_{i\in G_j} Y_i$ in [0, T ] are the points of the form $\sum_{k=1}^{m} \theta^{(j)}_k$, m ≥ 1, that belong to [0, T ]. The problem reduces to estimating the minimum distance between two consecutive such points, i.e. the minimum of the corresponding $\theta^{(j)}_k$. One concludes with the Borel-Cantelli Lemma.
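The estimate behind this last step can be sketched as follows. This is a generic computation for independent exponential variables, given under the assumption that one restricts to the first n gaps (with n a deterministic bound on the number of points in [0, T ], the complementary event being handled by a Poisson tail estimate); it is not the authors' exact calculation, and the symbols n and x are introduced here for illustration.

```latex
% \theta_1, ..., \theta_n i.i.d. exponential with parameter \Lambda_j; x > 0 a gap size.
\[
  P\Big(\min_{1\le k\le n} \theta_k < x\Big)
  \;=\; 1 - e^{-n \Lambda_j x}
  \;\le\; n\,\Lambda_j\, x .
\]
% With x = 2^{-2(\beta+\varepsilon) j}, the bound n \Lambda_j 2^{-2(\beta+\varepsilon) j}
% is summable in j as soon as n \Lambda_j grows no faster than
% 2^{(2\beta+\varepsilon) j}, and the Borel-Cantelli lemma then applies.
```

The first equality uses that the minimum of n independent exponential variables with parameter $\Lambda_j$ is itself exponential with parameter $n\Lambda_j$.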
Recall that for k ≥ 1 the probability distribution of T (i) k is the Gamma(λ i , k) distribution, that is It is then quite easy, and left to the reader, to verify the properties collected in the following lemma.
Lemma 8 Fix i ≥ 1 and γ > 0. For every t > t > 0, (ii) The stochastic process X i (t)1 {X i (t)≤γ} t>γ is stationary. More precisely, for t > γ, one has Now, on the one hand, using Lemma 8(iii) we have On the other hand we have . Consequently, using Lemma 8(iv) we can get Adding the three above estimates yields the conclusion.
Fix γ > 0. Consider the stochastic process R (ii) There exists a constant K > 0 such that if λ i γ ≥ 1 then Then, simple computations show that for every n ≥ 1, there exists K > 0 such that for all i ≥ 1 and t > 0 such that λ i t ≥ 1 one has E (τ (iv) In the following computation, we use the fact that, conditionally on {τ t , both R i (t) and R i (t ) are in [0, γ], and so is |R i (t ) − R i (t)|. We also use the obvious upper bound R i (t) ≤ t.
Case β = 1: fix 0 < δ < 1. In fact, the following stronger result holds: with probability one, there exists i 0 ≥ 1 such that for all i ≥ i 0 To see this, denote by M i the smallest integer larger than λ 1−δ i /(2 log(λ i )), and notice, using Lemma 1 and its proof, that where M i = T λ i + 4 T λ i log(T λ i ). The right hand side of the previous inequality is summable. By the Borel-Cantelli Lemma, this implies that with probability one, for i large enough, the number of consecutive points of the form T is at most M i + 1. Moreover, due to Lemma 1 (applied with K = 2), the distance between two T (i) k 's is at most 2 log(λ i )/λ i for i large enough. Consequently, for i large enough, two consecutive intervals of the form [T are overlapping. This yields the conclusion.
We prove the following result, which is stronger than the desired one because lim n→∞ δ jn = β: with probability one, for all n 0 large enough, The method is inspired from Kahane's approach for Shepp's Theorem in [20].
Let n η > 0 be such that η > ϕ(2 −jn ) + 2 −jnδ jn for n ≥ n η . Then fix n 0 ≥ n η . For N ≥ n 0 set The sequence τ N is non-decreasing, and the conclusion will follow if we show that a.s. τ N → ∞ as N → ∞, or equivalently E(exp(−τ N )) → 0 as N → ∞. We shall prove that there exists C > 0 such that for N large enough By our choice for δ j N , the right hand side in (3) tends to 0 as N → ∞ and the conclusion follows.
Adding P 1 and P 2 yields P i (t) = λ i 2 −jnδ jn exp − λ i (ϕ(2 −jn ) + 2 −jnδ jn ) which does not depend on t. Write P i for P i (t). Equation (4) now yields The integral I N can also be written where, for i ∈ G jn , P i (s, t) is the probability that there is no integer k ≥ 1 such that T (i) k ∈ (max(t, t + s − 2 −jnδ jn ), t + s). The last equality is due to the lack of memory of the exponential law as well as the independence between sources. The probability P i (s, t) is well known to be equal to e −λ i |I| where |I| denotes the length of the interval I = (max(t, t + s − 2 −jnδ jn ), t + s). Since that length does not depend on t, we denote P i (s, t) by P i (s), which is given by where                otherwise.
So there exists A > 0 such that