Regularly Varying Measures on Metric Spaces: Hidden Regular Variation and Hidden Jumps

We develop a framework for regularly varying measures on complete separable metric spaces $\mathbb{S}$ with a closed cone $\mathbb{C}$ removed, extending material in Hult & Lindskog (2006) and Das, Mitra & Resnick (2013). Our framework provides a flexible way to consider hidden regular variation, allows simultaneous regular variation properties to exist at different scales, and offers the potential for more accurate estimation of probabilities of risk regions. We apply the framework to iid random variables in $\mathbb{R}_+^\infty$ with marginal distributions having regularly varying tails and to c\`adl\`ag L\'evy processes whose L\'evy measures have regularly varying tails. In both cases, an infinite number of regular variation properties coexist, distinguished by different scaling functions and state spaces.


Introduction
This paper discusses a framework for regular variation and heavy tails for distributions of metric space valued random elements and applies this framework to regular variation for measures on R^∞_+ and D([0, 1], R). Heavy tails appear in diverse contexts such as risk management; quantitative finance and economics; complex networks of data and telecommunication transmissions; as well as the rapidly expanding field of social networks. Heavy tails are also colloquially called power law tails or Pareto tails, especially in one dimension. The mathematical formalism for discussing heavy tails is the theory of regular variation, originally formulated on R_+ and extended to more general spaces. See, for instance, [7, 15, 16, 18, 19, 23, 36, 39, 41, 44].
One approach to estimating the probability of a remote risk region relies on asymptotic analysis from the theory of extremes or heavy tail phenomena. Asymptotic methods come with the obligation to choose an asymptotic regime among potential competing regimes. This is often tantamount to choosing a state space for the observed random elements as well as a scaling. For example, in R^2_+, for a risk vector X = (X_1, X_2), if we need to estimate P[X > x] = P[X_1 > x_1, X_2 > x_2] for large x, should the state space for asymptotic analysis be [0, ∞]^2 ∖ {(0, 0)} or (0, ∞]^2? Ambiguity in the choice of asymptotic regime led to the idea of the coefficient of tail dependence [9, 10, 26, 27, 34, 43], hidden regular variation (hrv) [20, 28, 32, 33, 38-40] and the conditional extreme value (cev) model [12-14, 21, 35].
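To fix ideas with a toy computation (our own illustration, not taken from the text; it assumes two iid standard Pareto(1) risks, P[X > x] = 1/x for x ≥ 1, so every tail probability below is exact), the two candidate regimes can be compared directly under the single scaling b(t) = t:

```python
# Illustrative only: two iid standard Pareto(1) risks, so tails are exact.

def tail_union(t, x, y):
    """t * P[X1 > t*x or X2 > t*y] by inclusion-exclusion (valid for t*x, t*y >= 1)."""
    return t * (1.0 / (t * x) + 1.0 / (t * y) - 1.0 / (t * x * t * y))

def tail_joint(t, x, y):
    """t * P[X1 > t*x, X2 > t*y], the quantity relevant on the cone (0, inf]^2."""
    return t * (1.0 / (t * x)) * (1.0 / (t * y))

for t in [1e2, 1e4, 1e6]:
    print(tail_union(t, 2.0, 3.0), tail_joint(t, 2.0, 3.0))
# tail_union -> 1/2 + 1/3, a nonzero limit on [0, inf]^2 minus the origin, while
# tail_joint -> 0: the scaling b(t) = t reveals nothing on the smaller cone,
# which is exactly the ambiguity that hidden regular variation addresses.
```

The vanishing limit on the smaller cone is not the end of the story; a weaker scaling recovers a nonzero limit there, as developed in Section 4.5.2.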
Due to the scaling inherent in the definition of regular variation, a natural domain for regularly varying tails is a region closed under scalar multiplication, and usually the domain is a cone centered at the origin. Commonly used cones include R_+, R^d_+, or their two-sided versions allowing negative values, which are natural in finance and economics. However, as argued in [14], there is need for other cones as well, particularly when asymptotic independence or asymptotic full dependence ([41, Chapter 5], [45]) is present. Going beyond finite dimensional spaces, there is a need for a comprehensive theory covering spaces such as R^∞_+ and function spaces. Fortunately a good framework for such a theory of regular variation on metric spaces after removal of a point was created in [23]. The need to remove more than a point, perhaps a closed set and certainly a closed cone, was argued in [14]. These ideas build on w^#-convergence in [11, Section A2.6].
This paper has a number of goals: (1) We follow the lead of [23] and develop a theory of regularly varying measures on complete separable metric spaces S with a closed cone C removed. Section 2 develops a topology on the space of measures on S ∖ C which are finite on regions at positive distance from C. This topology allows creation of mapping theorems (Section 2.1) that encourage continuity arguments, and it is designed to allow simultaneous regular variation properties to exist at different scales, as is considered in hidden regular variation. (2) We apply the general material of Section 2 to two significant applications.
(a) In Section 4 we focus on R^p_+ and R^∞_+, the space of sequences with non-negative components. An iid sequence X = (X_1, X_2, ...) ∈ R^∞_+ such that P[X_1 > x] is regularly varying has a distribution which is regularly varying on R^∞_+ ∖ C_j for any j ≥ 1, where C_j is the set of sequences with at most j positive components. Mapping theorems (Section 2.1) allow extension to the regular variation properties of S = (X_1, X_1 + X_2, X_1 + X_2 + X_3, ...) in R^∞_+ minus the set of nondecreasing sequences which are constant after the jth component. See Section 4.5.2. For reasons of simplicity and taste, we restrict discussion to R^∞_+ but, with modest effort, results could be extended to R^∞. We also discuss regular variation of the distribution of a sequence of Poisson points in R^∞_+ (Section 4.5.4). (b) The R^∞_+ discussion of Poisson points in Section 4.5.4 can be leveraged in a natural way to consider (Section 5) regular variation of the distribution of a Lévy process whose Lévy measure ν is regularly varying: lim_{t→∞} tν(b(t)x, ∞) = x^{-α}, x > 0, for some scaling function b(t) → ∞. We reproduce the result [22, 24] that the limit measure of regular variation with scaling b(t) on D([0, 1], R) ∖ {0} concentrates on càdlàg functions with one positive jump. This raises the natural question of what happened to the rest of the jumps of the Lévy process that seem to be hidden by the scaling b(t). We are able to generalize, for any j ≥ 1, to convergence under the weaker normalization b(t^{1/j}) on a smaller space in which the limit measure concentrates on nondecreasing functions with j positive jumps. Again, as in the study of R^∞_+, we focus for simplicity only on large positive jumps of the Lévy process.
(3) A final goal is to clarify the proper definition of regular variation in metric spaces. For historical reasons, regular variation is usually associated with scalar multiplication, but what does this mean in a general metric space? Traditional definitions are in Cartesian coordinates in finite dimensional spaces, and the form of the definition may not survive a change of coordinates. For example, in R^p_+, a random vector X (in Cartesian coordinates) has a regularly varying distribution if for some scaling function b(t) → ∞ we have tP[X/b(t) ∈ ·] converging to a limit. If we transform to polar coordinates X ↦ (R, Θ) := (‖X‖, X/‖X‖), the limit is taken on tP[(R/b(t), Θ) ∈ ·], which appears to be subject to a different notion of scaling. The two convergences are equivalent but look different unless one allows for a more flexible definition of scalar multiplication. We discuss requirements for scalar multiplication in Section 2 along with some examples; related discussion is in [2, 3, 30].
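To make the equivalence concrete, here is a one-line sketch of the standard argument, using the modified multiplication λ(r, θ) := (λr, θ) on (0, ∞) × ℵ (a sketch, not the paper's formal statement):

```latex
t\,P\!\left[\Big(\tfrac{R}{b(t)},\Theta\Big)\in A\right]
  \;=\; t\,P\!\left[\tfrac{X}{b(t)}\in h^{-1}(A)\right]
  \;\longrightarrow\; \mu\circ h^{-1}(A),
\qquad h(x)=\Big(\|x\|,\tfrac{x}{\|x\|}\Big),
```

since h(x/b) = (‖x‖/b, x/‖x‖) = b^{-1} h(x) under the modified multiplication; thus scaling in Cartesian coordinates corresponds exactly to scaling the radial coordinate alone.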
The existing theory for regular variation on, say, R^d_+, uses the set-up of vague convergence. A troubling consequence is the need to use the one point un-compactification [39, page 170ff], which adds lines through infinity to the state space. When regular variation is defined on the cone [0, ∞]^d ∖ {0}, limit measures cannot charge lines through infinity. However, on proper subcones of [0, ∞]^d ∖ {0} this is no longer true, and this creates some mathematical havoc: convergence to types arguments can fail and limit measures may not be unique. In given examples [14, Example 5.4], under one normalization the limit measure concentrates on lines through infinity and under another it concentrates on finite points. Another difficulty is that the polar coordinate transform x ↦ (‖x‖, x/‖x‖) cannot be defined on lines through infinity. One way to patch things up is to retain the one point un-compactification but demand that all limit measures have no mass on lines through infinity; this does not resolve all difficulties, since the unit sphere {x : ‖x‖ = 1} defined by the norm x ↦ ‖x‖ may not be compact in a subcone such as (0, ∞]^d. Another way forward, which we deem cleaner and more suitable to general spaces where compactification is more involved, is not to compactify and just to define tail regions as subsets of the metric space at positive distance from the deleted closed set. This is the approach given in Section 2.

Convergence of measures in the space M_O
Let (S, d) be a complete separable metric space. The open ball centered at x ∈ S with radius r is written B_{x,r} = {y ∈ S : d(x, y) < r}, and these open sets generate the Borel σ-algebra on S, denoted S. For A ⊂ S, let A° and A⁻ denote the interior and closure of A, respectively, and let ∂A = A⁻ ∖ A° be the boundary of A. Let C_b denote the class of real-valued, non-negative, bounded and continuous functions on S, and let M_b denote the class of finite Borel measures on S. A basic neighborhood of µ ∈ M_b is a set of the form {ν ∈ M_b : |ν(f_i) − µ(f_i)| < ε, i = 1, ..., k}, where ε > 0 and f_i ∈ C_b for i = 1, ..., k. Thus a sub-basis for M_b consists of sets of the form {ν ∈ M_b : ν(f) := ∫ f dν ∈ G} for f ∈ C_b and G open in R_+. This equips M_b with the weak topology, and convergence µ_n → µ in M_b means ∫ f dµ_n → ∫ f dµ for all f ∈ C_b. See e.g. Sections 2 and 6 in [6] for details.
Fix a closed set C ⊂ S and set O = S ∖ C; e.g. one possible choice is O = S ∖ {s_0} for C = {s_0} and some s_0 ∈ S. The subspace O is a metric subspace of S in the relative topology, with σ-algebra S_O = {A : A ⊂ O, A ∈ S}. For r > 0 write C^r = {x ∈ S : d(x, C) ≤ r}, where d(x, C) = inf_{y∈C} d(x, y); similarly, we will write d(A, C) = inf_{x∈A, y∈C} d(x, y) for A ⊂ S. Let C_O denote the real-valued, non-negative, bounded and continuous functions f on O such that for each f there exists r > 0 such that f vanishes on C^r. We say that a set A ∈ S_O is bounded away from C if A ⊂ S ∖ C^r for some r > 0 or, equivalently, d(A, C) > 0. So C_O consists of non-negative continuous functions whose supports are bounded away from C. Let M_O be the class of Borel measures on O whose restriction to S ∖ C^r is finite for each r > 0; when convenient, we also write M(O) or M(S ∖ C). A basic neighborhood of µ ∈ M_O is a set of the form {ν ∈ M_O : |ν(f_i) − µ(f_i)| < ε, i = 1, ..., k}, where ε > 0 and f_i ∈ C_O for i = 1, ..., k. Convergence µ_n → µ in M_O is convergence in the topology defined by this base or sub-base.
For µ ∈ M_O and r > 0, let µ^(r) denote the restriction of µ to S ∖ C^r. Then µ^(r) is finite and µ is uniquely determined by its restrictions µ^(r), r > 0. Moreover, convergence in M_O has a natural characterization in terms of weak convergence of the restrictions to S ∖ C^r.
Theorem 2.1 (Portmanteau theorem). Let µ, µ_n ∈ M_O. The following statements are equivalent.
(i) µ_n → µ in M_O as n → ∞.
(ii) ∫ f dµ_n → ∫ f dµ for each f ∈ C_O which is also uniformly continuous on S.
(iii) lim sup_{n→∞} µ_n(F) ≤ µ(F) and lim inf_{n→∞} µ_n(G) ≥ µ(G) for all closed F ∈ S_O and open G ∈ S_O, where F and G are bounded away from C.
(iv) lim_{n→∞} µ_n(A) = µ(A) for all A ∈ S_O bounded away from C with µ(∂A) = 0.
(v) µ_n^(r) → µ^(r) in M_b(S ∖ C^r) for all but at most countably many r > 0.
(vi) There exists a sequence r_i ↓ 0 such that µ_n^(r_i) → µ^(r_i) in M_b(S ∖ C^{r_i}) for each i.
For proofs, see Section 2.4. Note, the result is true for any general metric space. Weak convergence is metrizable (for instance by the Prohorov metric; see e.g. p. 72 in [6]) and the close relation between weak convergence and convergence in M_O in Theorem 2.1(v)-(vi) indicates that the topology on M_O is metrizable too. With minor modifications of the arguments in [11], pp. 627-628, we may choose the metric
d_{M_O}(µ, ν) = ∫_0^∞ e^{-r} p_r(µ^(r), ν^(r)) [1 + p_r(µ^(r), ν^(r))]^{-1} dr,
where µ^(r), ν^(r) are the finite restrictions of µ, ν to S ∖ C^r, and p_r is the Prohorov metric on M_b(S ∖ C^r).
Theorem 2.2. M_O, equipped with the metric above, is a separable and complete metric space.
This result is illustrated in Examples 3.3 and 3.4 and is also needed for considering the generalized polar coordinate transformation in Section 4.2.3. It is the basis for the approach to regular variation of Lévy processes in Section 5. Theorem 2.3 is formulated so that h is defined on O = S ∖ C, rather than on all of S. If S = R^p_+ and h(x) = (‖x‖, x/‖x‖) is the polar coordinate transform, then h is not defined at 0. This lack of definition is not a problem since h need only be defined away from the removed set. The proof is in Section 2.4.4, but it is instructive to quickly consider the special case where D_h = ∅, so that h is continuous. In this case h induces a continuous mapping ĥ : M_O → M_{O'} given by ĥ(µ) = µ ∘ h^{-1}. To see this, look at the inverse image of a sub-basis set (2.1): for G open in R_+ and f ∈ C_{O'}, the inverse image under ĥ of {µ' : µ'(f) ∈ G} is {µ : µ(f ∘ h) ∈ G}, which is open since h is continuous and f ∘ h ∈ C_O. Here are two variants of the mapping theorem. The first allows application of the operator taking successive partial sums from R^∞_+ → R^∞_+ in Proposition 4.2 and also allows application of the projection map. A criterion for relative compactness in M_O will also be used: for a sequence r_i ↓ 0, it suffices that sup_n µ_n(S ∖ C^{r_i}) < ∞ and that for each η > 0 there exists a compact set K_i ⊂ S ∖ C^{r_i} such that sup_n µ_n((S ∖ C^{r_i}) ∖ K_i) ≤ η.

2.3. M-convergence vs vague convergence. Vague convergence corresponds to the topology on the space of measures which are finite on compacta. Regular variation for measures on a space such as R^p_+ has traditionally been formulated using vague convergence after compactification of the space. In order to make use of existing regular variation theory on R^p_+, it is useful to understand how M-convergence is related to vague convergence.
Let S be a complete separable metric space and suppose C is closed in S. Let M_+(S ∖ C) denote the measures finite on compact subsets of S ∖ C (the setting of vague convergence) and C^+_K(S ∖ C) the continuous non-negative functions on S ∖ C with compact support. Then C^+_K(S ∖ C) ⊂ C(S ∖ C) and M(S ∖ C) ⊂ M_+(S ∖ C). Indeed, if f ∈ C^+_K(S ∖ C), its compact support K ⊂ S ∖ C must be bounded away from C, and hence d(K, C) > 0 and f ∈ C(S ∖ C). If µ ∈ M(S ∖ C) and D satisfies d(D, C) > 0, then µ(D) < ∞. If K ∈ K(S ∖ C), then d(K, C) > 0 and so µ(K) < ∞, showing any µ ∈ M(S ∖ C) is also in M_+(S ∖ C).
Remark: Let S = [0, ∞) and C = {0}, and consider µ_n = n^{-1} Σ_{j=1}^{n²} δ_{j/n}. Here and elsewhere, we use the notation δ_x for the Dirac measure concentrating mass 1 on the point x, so that δ_x(A) = 1 if x ∈ A, and δ_x(A) = 0 if x ∈ A^c. We have µ_n converging to Lebesgue measure in M_+(S ∖ C), but {µ_n} does not converge in M(S ∖ C). If f is 0 on (0, 1), linear on (1, 2) with f(2) = 1, and f is constant on (2, ∞), then f ∈ C(S ∖ C) but µ_n(f) → ∞.

2.4.1. Preliminaries. We begin with two well known preliminary lemmas in topology. The second one is just a version of Urysohn's lemma [17, 42] for metric spaces.
The desired properties of f are easily checked from Lemma 2.2.
Lemma 2.4. If A ∈ S_O is bounded away from C, A = ∪_{i∈I} A_i for an uncountable index set I and disjoint sets A_i ∈ S_O, and µ(A) < ∞, then µ(A_i) > 0 for at most countably many i.
Proof. Suppose there exists a countably infinite set I_n ⊂ I such that µ(A_i) > 1/n for i ∈ I_n. Then µ(A) ≥ Σ_{i∈I_n} µ(A_i) ≥ Σ_{i∈I_n} 1/n = ∞, which contradicts the assumption that µ(A) < ∞. Hence, for each n, the set {i : µ(A_i) > 1/n} is finite, and the conclusion follows from letting n → ∞.
Suppose that (ii) holds. Take any closed F that is bounded away from C. Then there exists r > 0 such that F ⊂ S ∖ C^r, and approximating the indicator of F from above by uniformly continuous functions in C_O (via Lemma 2.2) gives lim sup_{n→∞} µ_n(F) ≤ µ(F); the companion bound lim inf_{n→∞} µ_n(G) ≥ µ(G) for open G bounded away from C follows by approximating the indicator of G from below. This completes the proof of (iii). Suppose that (iii) holds and take A ∈ S_O bounded away from C with µ(∂A) = 0. Since A° ⊂ A ⊂ A⁻ and µ(∂A) = 0, (iii) gives µ(A) = µ(A°) ≤ lim inf_n µ_n(A°) ≤ lim sup_n µ_n(A⁻) ≤ µ(A⁻) = µ(A).
Hence, lim_{n→∞} µ_n(A) = µ(A), so that (iv) holds. Suppose that (iv) holds and take r > 0 such that µ(∂(S ∖ C^r)) = 0. By Lemma 2.5, all but at most countably many r > 0 satisfy this property. As S ∖ C^r is trivially bounded away from C, we have that µ_n(S ∖ C^r) → µ(S ∖ C^r). Now any A ⊂ S ∖ C^r is also bounded away from C and, as S ∖ C^r is closed, ∂_{S∖C^r} A = ∂A, where the first expression denotes the boundary of A when considered as a subset of S ∖ C^r. So for any subset A ⊂ S ∖ C^r with µ(∂_{S∖C^r} A) = 0, we have by (iv) that µ_n(A) → µ(A), and hence µ_n^(r) → µ^(r) in M_b(S ∖ C^r). This completes the proof of (v). Suppose that (v) holds. Since µ_n^(r) → µ^(r) in M_b(S ∖ C^r) for all but at most countably many r > 0, we can always choose a sequence {r_i} with r_i ↓ 0 such that µ_n^(r_i) → µ^(r_i) for each i, so that (vi) holds. Suppose that (vi) holds. Take ε > 0 and a neighborhood N_{ε,f_1,...,f_k}(µ) = {ν : |ν(f_j) − µ(f_j)| < ε, j = 1, ..., k}, where each f_j ∈ C_O vanishes on C^r for some r > 0; choosing i with r_i small enough, µ_n(f_j) = µ_n^(r_i)(f_j) → µ^(r_i)(f_j) = µ(f_j) for each j. Let n_j be an integer such that n ≥ n_j implies |∫ f_j dµ_n − ∫ f_j dµ| < ε; for n ≥ max_j n_j we have µ_n ∈ N_{ε,f_1,...,f_k}(µ), so that (i) holds.

2.4.3. Proof of Theorem 2.2. The proof consists of minor modifications of arguments that can be found in [11], pp. 628-630; here we change from r to 1/r. For the sake of completeness we include a full proof. First, convergence in the metric implies convergence of restrictions: suppose d_{M_O}(µ_n, µ) = ∫_0^∞ e^{-r} g_n(r) dr → 0, where g_n(r) := p_r(µ_n^(r), µ^(r))[1 + p_r(µ_n^(r), µ^(r))]^{-1}, so that for each n, g_n(r) decreases with r and is bounded by 1. Helly's selection theorem (p. 336 in [5]), applied to 1 − g_n, implies that there exists a subsequence {n'} and a nonincreasing function g such that g_{n'}(r) → g(r) for all continuity points of g. By dominated convergence, ∫_0^∞ e^{-r} g(r) dr = 0, and since g is monotone this implies that g(r) = 0 for all finite r > 0. Since this holds for all convergent subsequences {g_{n'}(r)}, it follows that g_n(r) → 0, and hence p_r(µ_n^(r), µ^(r)) → 0, for all but at most countably many r > 0. (ii) Separability: for r > 0 let D_r be a countable dense set in M_b(S ∖ C^r) with the weak topology, and let D be the union of the D_r for rational r > 0; then D is countable and dense in M_O. (iii) Completeness: if {µ_n} is a Cauchy sequence for d_{M_O}, then {µ_n^(r)} is a Cauchy sequence for p_r for all but at most countably many r > 0.
Since S is separable and complete, its closed subspace S ∖ C^r is separable and complete. Therefore, M_b(S ∖ C^r) is complete, which implies that {µ_n^(r)} has a limit µ_r. These limits are consistent in the sense that µ_r coincides with the restriction of µ_{r'} to S ∖ C^r whenever r' < r, and together they determine a set function µ. Then µ is a measure: clearly, µ ≥ 0 and µ(∅) = 0. Moreover, µ is countably additive, since for disjoint A_n ∈ S_O the monotone convergence theorem applies to the restrictions.

2.4.6. Proof of Corollary 2.2. The proof of Corollary 2.1 shows that it suffices if either {x_n} or {y_n} has a limit point. In the former case, if x_{n'} → x along some subsequence n' → ∞, then d(x, y_{n'}) → 0 and y_{n'} → x ∈ C and h(y_{n'}) → h(x), so d'(A', C') = 0, again giving a contradiction. Note that if S is compact then {x_n} has a limit point. On the other hand, if {y_n} has a limit point, then there exist an infinite subsequence {n'} and y ∈ C with y_{n'} → y, and the same argument gives a contradiction.

For relative compactness, suppose {µ_n^(r_i)} is relatively compact in M_b(S ∖ C^{r_i}) for each i, and let {µ_n} be a sequence of elements in M_O. We use a diagonal argument to find a subsequence {µ_{n'}} such that µ_{n'}^(r_i) converges for each i. Hence, we can define µ as the common extension of these limits; thus, µ is finite on sets A ∈ S_O with A ⊂ S ∖ C^r for some r > 0. To show that µ is countably additive, let A_1, A_2, ... be disjoint sets in S_O and 0 ≤ f_{nk} ↑ I_{A_k} for each k. Then Σ_k f_{nk} ↑ I_{∪_k A_k} and, by Fubini's theorem and the monotone convergence theorem, countable additivity follows.

Regularly varying sequences of measures
3.1. Scaling. The usual notion of regular variation involves comparisons along a ray and requires a concept of scaling or multiplication. We approach the scaling idea in a general complete, separable metric space S by postulating what is required for a pleasing theory. Given any real number λ > 0 and any x ∈ S, we assume there exists a mapping (λ, x) ↦ λx from (0, ∞) × S into S satisfying:
(A1) the mapping (λ, x) ↦ λx is continuous,
(A2) 1x = x and λ_1(λ_2 x) = (λ_1 λ_2)x.
Assumptions (A1) and (A2) allow definition of a cone C ⊂ S as a set satisfying x ∈ C implies λx ∈ C for any λ > 0. For this section, fix a closed cone C ⊂ S; then O := S ∖ C is an open cone. We require in addition that
(A3) d(x, C) < d(λx, C) for x ∈ O and λ > 1.

3.1.1. Examples to fix ideas. To emphasize the flexibility allowed by our assumptions, consider the following circumstances, all of which satisfy (A1)-(A3).
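For instance, the nonstandard multiplication λ(x_1, x_2) = (λx_1, λ^{1/2}x_2) satisfies (A2), as a quick numerical check confirms (an illustrative sketch of ours; the exponents 1 and 1/2 are arbitrary choices):

```python
# Check identity and associativity, axioms (A2), for one nonstandard multiplication.

def smul(lam, x):
    """Nonstandard scalar multiplication lambda . (x1, x2) = (lam*x1, lam**0.5 * x2)."""
    return (lam * x[0], lam ** 0.5 * x[1])

x = (2.0, 5.0)
assert smul(1.0, x) == x                              # 1x = x
lhs = smul(3.0, smul(7.0, x))                         # lambda1 (lambda2 x)
rhs = smul(3.0 * 7.0, x)                              # (lambda1 lambda2) x
assert all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs))
print("(A2) holds for this multiplication")
```

(A1) is immediate since smul is a composition of continuous operations.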

3.2. Regular variation.
Recall from e.g. [7] that a positive measurable function c defined on (0, ∞) is regularly varying with index ρ ∈ R if lim_{t→∞} c(λt)/c(t) = λ^ρ for all λ > 0. Similarly, a sequence {c_n}_{n≥1} of positive numbers is regularly varying with index ρ ∈ R if lim_{n→∞} c_{[λn]}/c_n = λ^ρ for all λ > 0, where [λn] denotes the integer part of λn. The choice of terminology is motivated by the fact that if {ν_n} is a regularly varying sequence of measures, then {ν_n(A)}_{n≥1} is a regularly varying sequence for each set A ∈ S_O bounded away from C with µ(∂A) = 0 and µ(A) > 0. We will now define regular variation for a single measure in M_O.
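As a quick illustration (our own toy example, not from the text), the sequence c_n = n^ρ log n is regularly varying with index ρ, and the defining ratio can be checked numerically; convergence is slow because of the slowly varying factor log n:

```python
import math

# c_n = n**rho * log(n) is regularly varying with index rho:
# c_[lambda*n] / c_n -> lambda**rho as n -> infinity.

def c(n, rho=2.0):
    return n ** rho * math.log(n)

lam = 3.0
for n in (10 ** 3, 10 ** 5, 10 ** 7):
    print(n, c(math.floor(lam * n)) / c(n))   # approaches lam**rho = 9.0 slowly
```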
There are many equivalent formulations of regular variation for a measure ν ∈ M_O; some are natural for statistical inference. Consider the following statements.
(i) There exist a nonzero µ ∈ M_O and a regularly varying sequence {c_n}_{n≥1} of positive numbers such that c_n ν(n·) → µ in M_O as n → ∞.
(ii) There exist a nonzero µ ∈ M_O and a regularly varying function c such that c(t)ν(t·) → µ in M_O as t → ∞.
(iii) There exist a nonzero µ ∈ M_O and a set E ∈ S_O bounded away from C with µ(E) > 0 and µ(∂E) = 0 such that ν(t·)/ν(tE) → µ(·)/µ(E) in M_O as t → ∞.
(iv) There exist a nonzero µ ∈ M_O and a nondecreasing sequence {b_n}_{n≥1} of positive numbers with b_n → ∞ such that nν(b_n ·) → µ in M_O as n → ∞.
(v) There exist a nonzero µ ∈ M_O and a regularly varying function b with b(t) → ∞ such that tν(b(t)·) → µ in M_O as t → ∞.
Theorem 3.1. The statements (i)-(v) are equivalent and each statement implies that the limit measure µ has the homogeneity property
µ(λA) = λ^{-α} µ(A) (3.1)
for some α ≥ 0 and all A ∈ S_O and λ > 0.
Notice that a regularly varying measure does not correspond to a single index α unless the multiplication operation by scalars is fixed.
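For orientation, the homogeneity (3.1) follows from formulation (ii) by a one-line computation (a sketch; A is a µ-continuity set bounded away from C and c is regularly varying with index α, so c(t)/c(λt) → λ^{-α}):

```latex
\mu(\lambda A)
 \;=\; \lim_{t\to\infty} c(t)\,\nu(t\lambda A)
 \;=\; \lim_{t\to\infty} \frac{c(t)}{c(\lambda t)}\; c(\lambda t)\,\nu\big((\lambda t)A\big)
 \;=\; \lambda^{-\alpha}\,\mu(A).
```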

3.3. More examples. We amplify the discussion of Section 3.1.1. Consider two independent Pareto random variables: let X_1 be Pa(γ_1) and X_2 be Pa(γ_2), i.e. P[X_i > x] = x^{-γ_i} for x ≥ 1. Define the scalar multiplication (λ, (x_1, x_2)) ↦ (λ^{1/γ_1} x_1, λ^{1/γ_2} x_2). According to our definition, the distribution of (X_1, X_2) is regularly varying on S ∖ C with respect to this multiplication, and for λ > 0 the limit measure has the scaling property (3.1).
In particular, suppose that for a random vector (X, Y) and scaling function b(t) → ∞, the measures tP[(X, Y/b(t)) ∈ ·] converge to µ in M(D). (3.2) This is regular variation of the distribution of (X, Y) on D with the scaling defined as (λ, (x, y)) ↦ (x, λy). The mapping Theorem 2.3, applied to h(x, y) = (xy, y), gives tP[(XY, Y)/b(t) ∈ ·] → µ' := µ ∘ h^{-1}, (3.3) which is regular variation with respect to the traditional scaling (λ, (x, y)) ↦ (λx, λy).
Conversely, define g : D' → D by g(x, y) = (x/y, y). One observes that g is continuous and obeys the bounded-away condition, so Theorem 2.3 applies in the reverse direction as well. The summary is that (3.2) and (3.3) are equivalent.

Example 3.4 (Polar coordinates). Set S = R^p_+, C = {0_p}, O = S ∖ C and S' = (0, ∞) × ℵ with ℵ = {x ∈ R^p_+ : ‖x‖ = 1}; the scaling operation on O' is (λ, (r, a)) ↦ (λr, a). The map h(x) = (‖x‖, x/‖x‖) takes O into O'. Let d and d' be the metrics on S and S'. Suppose X has a regularly varying distribution on O, so that for some b(t) → ∞, tP[X/b(t) ∈ ·] → µ in M_O for some limit measure µ. We show h(X) =: (R, Θ) has a regularly varying distribution on O'. We apply Theorem 2.3.
Given µ ∈ M O , let A µ denote the set of µ-continuity sets A ∈ S O bounded away from C satisfying S(A) = A.
Proof. Let D_µ denote the π-system of finite differences of sets of the form A_1 ∖ A_2 for A_1, A_2 ∈ A_µ with A_2 ⊂ A_1. Take x ∈ O and ε > 0 such that B_{x,ε} is bounded away from C. The sets ∂S(B_{x,r}), for r ∈ (0, ε), are disjoint; similarly, the sets ∂B_{x,r}, for r ∈ (0, ε), are disjoint. Therefore, µ(∂S(B_{x,r})) = µ(∂B_{x,r}) = 0 for all but at most countably many r ∈ (0, ε). Moreover, B_{x,r} = S(B_{x,r}) ∖ (S(B_{x,r}) ∖ B_{x,r}), so B_{x,r} ∈ D_µ for all but at most countably many r ∈ (0, ε). Hence, each point of O has arbitrarily small ball neighborhoods belonging to D_µ. Since O is separable, we find (as in the proof of Theorem 2.3 in [6]) that there is a countable subfamily of such balls covering any open set bounded away from C. The inclusion-exclusion argument in the proof of Theorem 2.2 in [6] then implies that lim inf_n µ_n(G) ≥ µ(G) for all open sets G bounded away from C. Any closed F bounded away from C is a subset of an open µ-continuity set A = O ∖ C^r for some r > 0; notice that A ∈ A_µ. Therefore lim sup_n µ_n(F) ≤ lim_n µ_n(A) − lim inf_n µ_n(A ∖ F) ≤ µ(A) − µ(A ∖ F) = µ(F), i.e. lim sup_n µ_n(F) ≤ µ(F). The conclusion follows from Theorem 2.1.

3.4.2. Proof of Theorem 3.1. The proof is structured as follows. We first prove that (iii) implies the homogeneity property (3.1) of the limit measure µ. Then we prove that the statements (i)-(v) are equivalent and that the limit measures are the same up to a constant factor.
Suppose that (i) holds and set c(t) = c_{[t]}. For each A ∈ A_µ and t ≥ 1 it holds that
(c_{[t]}/c_{[t]+1}) c_{[t]+1} ν(([t]+1)A) ≤ c(t)ν(tA) ≤ c_{[t]} ν([t]A),
since A = S(A) implies that ν(tA) is nonincreasing in t. Since {c_n}_{n≥1} is regularly varying, it holds that lim_{n→∞} c_n/c_{n+1} = 1. Hence, lim_{t→∞} c(t)ν(tA) = µ(A) for all A ∈ A_µ. It follows from Lemma 3.2 that (ii) holds.
Suppose that (ii) holds. Then {c(n)}_{n≥1} is a regularly varying sequence, since c(t) is a regularly varying function, and c(n)ν(n·) → µ along the integers. Therefore, (ii) implies (i).
Suppose that (ii) holds. Take a set E ∈ S_O bounded away from C such that ν(tE) > 0 for all large t, µ(E) > 0 and µ(∂E) = 0. Then ν(t·)/ν(tE) = [c(t)ν(t·)]/[c(t)ν(tE)] → µ(·)/µ(E) as t → ∞. Hence, by Theorem 2.1 (ii), (iii) holds. Suppose that (iii) holds. It was already proved in (a) above that statement (iii) implies that t ↦ ν(tE) is regularly varying with index −α ≤ 0. Setting c(t) = 1/ν(tE) implies that c(t) is regularly varying with index α and that c(t)ν(t·) → µ(·) in M_O. This proves that (iii) implies (ii). Up to this point we have proved that statements (i)-(iii) are equivalent.
Suppose that (iv) holds. Set b(t) = b_{[t]} and take A ∈ A_µ. Then
[t] ν(b_{[t]}A) ≤ tν(b(t)A) ≤ ([t]+1) ν(b_{[t]}A),
and since ([t]+1)/[t] → 1, it follows that lim_{t→∞} tν(b(t)A) = µ(A). It follows from Lemma 3.2 that (v) holds. If (v) holds, then it follows immediately that also (iv) holds. Hence, statements (iv) and (v) are equivalent. Suppose that (iv) holds. Take E such that µ(∂E) = 0 and µ(E) > 0. For t > b_1, let k = k(t) be the largest integer with b_k ≤ t. Then b_k ≤ t < b_{k+1} and k → ∞ as t → ∞. Hence, for A ∈ A_µ,
ν(b_{k+1}A)/ν(b_kE) ≤ ν(tA)/ν(tE) ≤ ν(b_kA)/ν(b_{k+1}E),
and multiplying and dividing by k and k + 1 and using (iv) shows that both bounds converge to µ(A)/µ(E), from which it follows that lim_{t→∞} ν(tA)/ν(tE) = µ(A)/µ(E). It follows from Lemma 3.2 that (iii) holds. Hence, each of the statements (iv) and (v) implies each of the statements (i)-(iii).
Suppose that (iii) holds. Then c(t) := 1/ν(tE) is regularly varying at infinity with index α ≥ 0. If α > 0, then c(c^{-1}(t)) ∼ t as t → ∞ by Proposition B.1.9 (10) in [16], and therefore tν(b(t)A) → µ(A) with b(t) := c^{-1}(t), for all A ∈ S_O bounded away from C with µ(∂A) = 0. If α = 0, then Proposition 1.3.4 in [7] says that there exists a continuous and increasing function ĉ such that ĉ(t) ∼ c(t) as t → ∞. In particular, ĉ(ĉ^{-1}(t)) = t, and the same conclusion holds with b(t) := ĉ^{-1}(t). Hence (iii) implies (v), completing the proof.

Regular variation on R^∞_+ and R^p_+

This section considers regular variation for measures on the metric spaces R^∞_+ and R^p_+ for p ≥ 1 and applies the theory of Sections 2 and 3. We begin in Section 4.1 with notation and specification of metrics, and then address in Section 4.2 continuity properties for maps used repeatedly, such as PROJ_p, CUMSUM and the polar coordinate transforms; the norm ‖·‖ is the Euclidean norm on R^p_+. We also define the generalized polar coordinate transformation in (4.3), which is necessary for estimating the tail measure of regular variation when the Euclidean unit sphere {x ∈ R^p_+ : ‖x‖ = 1} is not bounded away from C in the space R^∞_+ ∖ C. In Section 4.4 we compare M-convergence with vague convergence on the compactified space punctured by {0}, so that existing regular variation theory can be used in our present context rather than proving things from scratch. Section 4 concludes with Section 4.5, a discussion of regular variation of measures on R^∞_+ ∖ C, giving particular attention to hidden regular variation properties of the distribution of X = (X_1, X_2, ...), a sequence of iid non-negative random variables whose marginal distributions have regularly varying tails. This discussion extends naturally to hidden regular variation properties of an infinite sequence of non-negative decreasing Poisson points whose mean measure has a regularly varying tail. Results for the Poisson sequence provide the basis of our approach in Section 5 to regular variation of the distribution of a Lévy process whose Lévy measure is regularly varying.

4.1. Metrics. All metrics inducing the product topology on R^∞_+ are equivalent for our purposes; a convenient choice is
d_∞(x, y) = Σ_{j=1}^∞ 2^{-j} (|x_j − y_j| ∧ 1),
and we also need
d'_∞(x, y) = Σ_{p=1}^∞ 2^{-p} (‖x|_p − y|_p‖_1 ∧ 1),
where ‖·‖_1 is the usual L^1 norm on R^p_+ and x|_p = (x_1, ..., x_p).
Proposition 4.1. The metrics d_∞ and d'_∞ are equivalent on R^∞_+ and d_∞(x, y) ≤ d'_∞(x, y) ≤ 2 d_∞(x, y).
Proof. First of all, d_∞(x, y) = Σ_{j} 2^{-j}(|x_j − y_j| ∧ 1) ≤ Σ_{p} 2^{-p}(‖x|_p − y|_p‖_1 ∧ 1) = d'_∞(x, y). For the other inequality, observe that
d'_∞(x, y) ≤ Σ_{p} 2^{-p} Σ_{j≤p} (|x_j − y_j| ∧ 1) = Σ_{j} (|x_j − y_j| ∧ 1) Σ_{p≥j} 2^{-p} = 2 d_∞(x, y).
Proof. Writing the map in coordinates, we can now apply Corollary 2.1.

4.2. Polar coordinate transformations. The polar coordinate transformation in R^p_+ is heavily relied upon when making inferences about the limit measure of regular variation. Transforming from Cartesian to polar coordinates disintegrates the transformed limit measure into a product measure, one of whose factors concentrates on the unit sphere. This factor is called the angular measure. Estimating the angular measure and then transforming back to Cartesian coordinates provides the most reliable inference technique for tail probability estimation in R^p_+ using heavy tail asymptotics. The transformation is
POLAR(x) = (‖x‖, x/‖x‖).
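A minimal implementation sketch of the transform with the Euclidean norm (the function name is ours):

```python
import numpy as np

# POLAR(x) = (||x||, x / ||x||) on R_+^p minus the origin.

def polar(x):
    x = np.asarray(x, dtype=float)
    r = np.linalg.norm(x)
    if r == 0.0:
        raise ValueError("POLAR is undefined at the origin")
    return r, x / r

r, theta = polar([3.0, 4.0])
print(r, theta)          # 5.0 [0.6 0.8]; theta lies on the unit sphere
assert abs(np.linalg.norm(theta) - 1.0) < 1e-12
```

In inference, the empirical distribution of the angular parts theta of the largest observations estimates the angular measure.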

Compared with the notation of Corollary 2.2, we have S = R^p_+, C = {0_p} and S' = [0, ∞) × ℵ, where ℵ = {x ∈ R^p_+ : ‖x‖ = 1} is the unit sphere. Since POLAR is continuous on this domain, we get from Corollary 2.2 the following. When removing more from the state space than just {0_p}, the conventional polar coordinate transform (4.1) is not useful if ℵ is neither compact nor at least bounded away from what is removed. For example, if S ∖ C = (0, ∞)^p, then ℵ is neither compact nor bounded away from the removed axes. The following generalization [14] sometimes resolves this, provided (4.2) below holds.
Temporarily, we proceed generally and assume S is a complete, separable metric space on which scalar multiplication is defined. If C is a cone, then θC = C for θ > 0. Suppose further that the metric on S satisfies
d(λx, λy) = λ d(x, y), x, y ∈ S, λ > 0. (4.2)
Define
GPOLAR(s) = (d(s, C), s/d(s, C)), s ∈ S ∖ C. (4.3)
Since C is a cone and d(·, ·) has property (4.2), we have for any s ∈ S ∖ C that d(s/d(s, C), C) = 1, so the second coordinate of GPOLAR belongs to ℵ_C := {s ∈ S ∖ C : d(s, C) = 1}. For example, if S = R^2_+ and we remove the cone consisting of the axes through 0_2, then d(s, C) = s_1 ∧ s_2 and ℵ_C = {s ∈ (0, ∞)^2 : s_1 ∧ s_2 = 1}. It is relatively easy to check that if A' ⊂ (0, ∞) × ℵ_C is bounded away from C' := {0} × ℵ_C, then GPOLAR^{-1}(A') is bounded away from C. On (0, ∞) × ℵ_C adopt the metric d'((r_1, a_1), (r_2, a_2)) = |r_1 − r_2| ∨ d(a_1, a_2); the distance from a set D ⊂ (0, ∞) × ℵ_C to C' is then obtained by setting a_2 = a_1, and equals inf_{(r_1,a_1)∈D} r_1. We conclude that (r, a) ∈ A' implies r ≥ δ for some δ > 0. Since GPOLAR^{-1}(A') = {ra : (r, a) ∈ A'}, we have in S ∖ C, remembering that C is assumed to be a cone, d(ra, C) = r d(a, C) = r ≥ δ. Remark on condition (4.4): The condition says: take an infinite sequence z ∈ C, truncate it to z|_p ∈ R^p_+, and then make it infinite again by filling in zeros for all the components beyond the pth; the result must still be in C. Examples: (2) Pick an integer j ≥ 1 and define
C_j = {x ∈ R^∞_+ : Σ_{i=1}^∞ δ_{x_i}((0, ∞)) ≤ j},
where recall δ_x(A) = 1 if x ∈ A, and 0 if x ∈ A^c. So C_j consists of sequences with at most j positive components. Truncation and then insertion of zeros does not increase the number of positive components, so C_j is invariant under the operation implied by (4.4).
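As an illustrative sketch of GPOLAR (function names are ours), take S = R^2_+ with C the union of the two axes and any norm-induced metric, so that (4.2) holds automatically; the nearest point of C to s = (s_1, s_2) is (s_1, 0) or (0, s_2), giving d(s, C) = s_1 ∧ s_2:

```python
import numpy as np

# GPOLAR(s) = (d(s, C), s / d(s, C)) for C = the two axes of R_+^2.

def gpolar(s):
    s = np.asarray(s, dtype=float)
    dist = s.min()                  # d(s, C) = min(s1, s2)
    if dist == 0.0:
        raise ValueError("GPOLAR is undefined on C")
    return dist, s / dist           # second coordinate a satisfies d(a, C) = min(a) = 1

r, a = gpolar([3.0, 6.0])
print(r, a)      # 3.0 [1. 2.]; a lies on the unit "sphere" aleph_C = {d(., C) = 1}
assert a.min() == 1.0
```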
Proof. Suppose C satisfies (4.4) and (4.5) holds. Suppose f ∈ C(R^∞_+ ∖ C) and, without loss of generality, suppose f is uniformly continuous with modulus of continuity ω and support at distance at least δ > 0 from C. Pick any p so large that 2^{-p} < δ/2 and define g(x_1, ..., x_p) := f(x_1, ..., x_p, 0, 0, ...). Then we have: (a) From (4.7), g ∈ C(R^p_+ ∖ PROJ_p(C)) and g is uniformly continuous. To verify that the support of g is a positive distance away from PROJ_p(C), suppose d_p is the L^1 metric on R^p_+ and d_p((x_1, ..., x_p), PROJ_p(C)) < δ/2. Then there is (z_1, ..., z_p) ∈ PROJ_p(C) such that d_p((x_1, ..., x_p), (z_1, ..., z_p)) < δ. But then, if z ∈ C with z|_p = (z_1, ..., z_p), we have, since (z_1, ..., z_p, 0_∞) ∈ C by (4.4), that (x_1, ..., x_p, 0_∞) is within δ of C, and therefore d_p((x_1, ..., x_p), PROJ_p(C)) < δ/2 implies g(x_1, ..., x_p) = f(x_1, ..., x_p, 0_∞) = 0. So the support of g is bounded away from PROJ_p(C) as claimed. Now write
|µ_n(f) − µ_0(f)| ≤ |µ_n(f) − µ_n(g ∘ PROJ_p)| + |µ_n(g ∘ PROJ_p) − µ_0(g ∘ PROJ_p)| + |µ_0(g ∘ PROJ_p) − µ_0(f)| =: A + B + C.
Then we have A ≤ ω(2^{-p}) µ_n(Λ_c), since d_∞(x, (x_1, ..., x_p, 0_∞)) ≤ 2^{-p}, and similarly, for dealing with term C, we would have C ≤ ω(2^{-p}) µ_0(Λ_c). Owing to finite dimensional convergence (4.5) and (4.8), B → 0. Since Λ_c is bounded away from C, µ_0(Λ_c) < ∞, and since the inequality holds for any p sufficiently large that 2^{-p} < δ, we may let p → ∞ to get µ_n(f) → µ_0(f).
Remark: The proof shows that (4.5) only needs to hold for all p ≥ some p_0. For example, if C = C_j then, for p < j, PROJ_p(C_j) = R^p_+ and R^p_+ ∖ PROJ_p(C_j) = ∅; however, it suffices for the result to hold for all p ≥ j.

4.4. Comparing M-convergence on S ∖ C with vague convergence when S is compactified. This continues the discussion of Section 2.3. Conventionally [39], regular variation on [0, ∞)^p has been defined on the punctured compactified space [0, ∞]^p ∖ {0_p}. This solves the problem of how to make tail regions relatively compact. However, as discussed in [14], when deleting more than {0_p}, this approach causes problems with the convergence to types lemma and also because certain natural regions are no longer relatively compact. The issue arises when there is mass on the lines through ∞_p, something that is impossible for regular variation on [0, ∞]^p ∖ {0_p}. The following discussion amplifies what is in [14].
Suppose C is closed in [0, ∞]^p and set Ω = [0, ∞]^p ∖ C and Ω_0 = Ω ∩ [0, ∞)^p, the part of Ω without the lines through ∞_p. Examining the definitions we see that Ω_0 ⊂ Ω and that a function in C^+_K(Ω) restricts to a function in C(Ω_0).
Proposition 4.4. Suppose for every n ≥ 0 that µ_n ∈ M_+(Ω) and µ_n places no mass on the lines through ∞_p: µ_n(Ω ∖ Ω_0) = 0. (4.9) Then µ_n → µ_0 vaguely in M_+(Ω) (4.10) if and only if the restrictions µ_{n0} of µ_n to the space without the lines through ∞_p converge: µ_{n0} → µ_{00} in M(Ω_0). (4.11)
Proof. Given (4.11), let f ∈ C^+_K(Ω). Then the restriction to Ω_0 satisfies f|_{Ω_0} ∈ C(Ω_0), so µ_n(f) = µ_{n0}(f|_{Ω_0}) → µ_{00}(f|_{Ω_0}) = µ_0(f). Conversely, assume (4.10). Suppose B ∈ S(Ω_0) and µ_{00}(∂_{Ω_0} B) = 0, where ∂_{Ω_0} B is the set of boundary points of B in Ω_0. This implies µ_0(∂_Ω B) = 0, since the boundary points added in Ω lie on the lines through ∞_p, which carry no µ_0-mass by (4.9). Therefore µ_n(B) → µ_0(B) and, because of (4.9), µ_{n0}(B) → µ_{00}(B), which proves (4.11).

4.5. Regular variation on R^p_+ and R^∞_+. For this section, either S is R^p_+ or R^∞_+ and C is a closed cone; then S ∖ C is still a cone. Applying Definition 3.2, a random element X of S ∖ C has a regularly varying distribution if for some regularly varying function b(t) → ∞, as t → ∞, tP[X/b(t) ∈ ·] converges in M(S ∖ C) to some limit measure ν ∈ M(S ∖ C). In R^p_+, if C = {0_p} or if (4.9) holds, this definition is the same as the one using vague convergence on the compactified space.

4.5.1. The iid case: remove {0_∞}. Suppose X = (X_1, X_2, ...) is iid with nonnegative components, each of which has a regularly varying distribution on (0, ∞) satisfying (4.12). Equivalently, as t → ∞, tP[X/b(t) ∈ ·] converges in M(R^∞_+ ∖ {0_∞}), (4.13) and the limit measure concentrates on the sequences with exactly one component positive. Note {0_∞} ∪ C_{=1} =: C_1, the sequences with at most one component positive, is closed.
To verify (4.13), note from Theorem 4.1 that it suffices to verify finite dimensional convergence, since {0_∞} satisfies (4.4); so it suffices to prove, as t → ∞, for p ≥ 1, the corresponding convergence of tP[(X_1, ..., X_p)/b(t) ∈ ·] in M(R^p_+ ∖ {0_p}). (4.14) Since neither the prelimit measures nor the limit place mass on the lines through ∞_p, M-convergence and vague convergence are the same, and then (4.14) follows from the binding lemma in [39, pp. 210, 228].
Applying the operator CUMSUM and Corollary 4.1 to (4.13) gives tP[CUMSUM(X)/b(t) ∈ · ] → µ^{(0)} ∘ CUMSUM^{-1}(·), where the limit concentrates on non-decreasing sequences with one jump, and the size of the jump is governed by ν_α. Then applying the operator PROJ_p, we get by Corollary 4.2, tP[(X_1, X_1 + X_2, …, X_1 + ⋯ + X_p)/b(t) ∈ · ] → µ^{(0)} ∘ CUMSUM^{-1} ∘ PROJ_p^{-1}(·), (4.16) giving an elaboration of the one big jump heuristic, which says that summing independent risks which have the same heavy tail results in a tail risk which is the number of summands times the individual tail risk; for example, see [39, p. 230]. In particular, applying the projection T : (x_1, …, x_p) → x_p from R_+^p gives tP[(X_1 + ⋯ + X_p)/b(t) > y] → p y^{-α}, y > 0. (4.17) The projection T is uniformly continuous, but also Theorem 2.3 applies to T since for y > 0, T^{-1}(y, ∞) = {(x_1, …, x_p) : x_p > y} is at positive distance from {0_p}.
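The heuristic can be illustrated numerically. The following Monte Carlo sketch is ours, not the paper's; α, p, the threshold x and sample size n are illustrative choices. It estimates P[X_1 + ⋯ + X_p > x] for iid Pareto summands and compares it with the exact marginal tail x^{-α}; the ratio should be close to p (slightly above it at finite thresholds).

```python
import random

# Monte Carlo sketch of the "one big jump" heuristic: for iid Pareto(alpha)
# summands, P[X_1 + ... + X_p > x] is roughly p * P[X_1 > x] for large x.
# alpha, p, x, n are illustrative choices, not from the text.
random.seed(42)
alpha, p, x, n = 1.5, 2, 30.0, 200_000

def pareto():
    # Standard Pareto on [1, inf): P[X > t] = t**(-alpha), by inverse transform
    return (1.0 - random.random()) ** (-1.0 / alpha)

tail_sum = sum(1 for _ in range(n) if sum(pareto() for _ in range(p)) > x) / n
tail_one = x ** (-alpha)          # exact marginal tail P[X_1 > x]

ratio = tail_sum / tail_one       # should be near p = 2 (a bit above at finite x)
```

With these choices the ratio lands a little above 2, consistent with the second-order correction to the heuristic at finite thresholds.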
The above discussion could have been carried out, with minor modifications, without the iid assumption by assuming (4.12) and asymptotic independence of the components.

4.5.2. The iid case; remove more; hidden regular variation. We now investigate how to get past the one big jump heuristic by using hidden regular variation. For j ≥ 1, set C_j = {x ∈ R_+^∞ : x_i > 0 for at most j values of i}, (4.18) so that C_j is closed. We imagine an infinite sequence of reductions of the state space with scaling adjusted at each step. This is suggested by the previous discussion. On M(R_+^∞ ∖ {0_∞}), the limit measure µ^{(0)} concentrated on C_{=1}, a small part of the potential state space. Remove {0_∞} ∪ C_{=1} = C_1 and on M(R_+^∞ ∖ C_1), with the weaker scaling b(t^{1/2}), there is a limit measure which concentrates on C_{=2}. In general, we find that in M(R_+^∞ ∖ C_j), tP[X/b(t^{1/(j+1)}) ∈ · ] → µ^{(j)}(·), (4.20) which concentrates on C_{=(j+1)}. This is an elaboration of results in [28,31,32]. The result in R_+^∞ can be proven by reducing to R_+^p by means of Theorem 4.1, noting that C_j satisfies (4.4), and then observing that neither µ_t^{(j)} nor µ^{(j)} puts mass on lines through ∞_p. It is enough to show convergences of the following form: for p ≥ j + 1, i_1 < i_2 < ⋯ < i_{j+1} ≤ p and y_l > 0, l = 1, …, j + 1, tP[X_{i_l} > b(t^{1/(j+1)}) y_l, l = 1, …, j + 1] → ∏_{l=1}^{j+1} y_l^{-α}. A formal statement of the result and a proof relying on a convergence determining class are given in Section 4.5.3. Table 1 gives a summary of the results in tabular form.
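To see why b(t^{1/(j+1)}) is the right scaling, consider the toy case of exact Pareto marginals, where the finite-dimensional limit holds with equality for all t. This sketch (α and the test points are our illustrative choices) evaluates t · ∏_l P[X_{i_l} > b(t^{1/(j+1)}) y_l] in closed form:

```python
import math

# Toy model: pure Pareto tails P[X > x] = x**(-alpha) for x >= 1, so that
# tP[X > b(t)] = 1 with b(t) = t**(1/alpha).  Then for j+1 iid components,
#   t * prod_l P[X_{i_l} > b(t**(1/(j+1))) * y_l]  ==  prod_l y_l**(-alpha)
# exactly (as a formula; as a probability it needs b(...)*y_l >= 1).
alpha = 1.7

def b(t):
    return t ** (1.0 / alpha)

def scaled_joint_tail(t, j, ys):
    # ys = (y_1, ..., y_{j+1}); independence gives a product of marginal tails
    assert len(ys) == j + 1
    s = b(t ** (1.0 / (j + 1)))
    return t * math.prod((s * y) ** (-alpha) for y in ys)
```

The j+1 factors each contribute t^{-1/(j+1)}, so the product exactly cancels the leading t, leaving ∏_l y_l^{-α}.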
CUMSUM(R_+^∞) is the set of non-decreasing sequences and CUMSUM(C_j) =: S_j is the set of non-decreasing sequences with at most j positive jumps. Now apply the map PROJ_p to (4.21) to get a p-dimensional result for (X_1, X_1 + X_2, …, X_1 + ⋯ + X_p); the analogue of (4.16) is tP[(X_1, X_1 + X_2, …, X_1 + ⋯ + X_p)/b(t^{1/(j+1)}) ∈ · ] → µ^{(j)} ∘ CUMSUM^{-1} ∘ PROJ_p^{-1}(·). (4.22) When j > 1, unlike the step leading to (4.17), we cannot apply the map T : (x_1, …, x_p) → x_p to (4.22) to get a marginal result for X_1 + ⋯ + X_p. Although T is uniformly continuous, Corollary 2.1 is not applicable, since for y > 0 the set T^{-1}(y, ∞) = {(x_1, …, x_p) : x_p > y} is not at positive distance from the image of S_j.

Theorem 4.2. For every j ≥ 1 there is a nonzero measure µ^{(j)} ∈ M(O_j), where O_j := R_+^∞ ∖ C_j, with support in C_{=(j+1)} such that tP[X/b(t^{1/(j+1)}) ∈ · ] → µ^{(j)}(·) in M(O_j) as t → ∞. The measure µ^{(j)} is given in (4.20), or more formally, µ^{(j)}(·) = Σ_{i_1 < ⋯ < i_{j+1}} ∫_0^∞ ⋯ ∫_0^∞ 1{Σ_{k=1}^{j+1} x_k e_{i_k} ∈ · } ν_α(dx_1) ⋯ ν_α(dx_{j+1}), where the components of e_{i_k} are all zero except component i_k, whose value is 1, and the indices (i_1, …, i_{j+1}) run through the ordered subsets of size j + 1 of {1, 2, …}.
The proof of Theorem 4.2 uses a particular convergence determining class A_j of subsets of O_j. Let A_j denote the set of sets A_{m,i,a} for m ≥ j, where A_{m,i,a} = {x ∈ R_+^∞ : x_{i_k} > a_k for k = 1, …, m}, i_1 < ⋯ < i_m, a_1, …, a_m > 0.

Lemma 4.1. If µ_t, µ ∈ M(O_j) and lim_{t→∞} µ_t(A) = µ(A) for all A ∈ A_j bounded away from C_j with µ(∂A) = 0, then µ_t → µ in M(O_j) as t → ∞.
Proof. Consider the set of finite differences of sets in A_j and note that this set is a π-system. Take x ∈ O_j and ε > 0. Since x ∈ O_j there are i_1 < ⋯ < i_j such that x_{i_k} > 0 for each k. If 2^{-i_j} < ε/2, choose m = i_j. Otherwise, choose m > i_j such that 2^{-m} < ε/2. Take δ < min{ε/2, min{x_k : x_k > 0 and k ≤ m}} and form from the first m coordinates a difference B ∖ B′ of sets in A_j whose boundary points y satisfy y_k = δ or y_k = x_k ± δ for some k ≤ m. In particular, there is an uncountable set of δ-values for which the boundaries ∂(B ∖ B′) are disjoint, satisfying the requirements. Therefore δ can without loss of generality be chosen so that µ(∂(B ∖ B′)) = 0. The separability of R_+^∞ implies (cf. the proof of Theorem 2.3 in [6]) that each open set is a countable union of µ-continuity sets of the form (B ∖ B′)°. The same argument as in the proof of Theorem 2.2 in [6] therefore shows that lim inf_{t→∞} µ_t(G) ≥ µ(G) for all open G ⊂ O_j bounded away from C_j. Any closed set F ⊂ O_j bounded away from C_j is a subset of some A ∈ A_j. By the same argument as above, we may without loss of generality take A such that µ(∂A) = 0. The set A ∖ F is open, and therefore lim inf_{t→∞} µ_t(A ∖ F) ≥ µ(A ∖ F), i.e. lim sup_{t→∞} µ_t(F) ≤ µ(F). The conclusion follows from Theorem 2.1(iii). Therefore, the support of µ is a subset of C_{j+1} ∖ C_j. Notice that for j ≥ 1 it follows that b is regularly varying at infinity with index 1/α.
Let {E_n, n ≥ 1} be iid standard exponentially distributed random variables, so that if {Γ_n, n ≥ 1} := CUMSUM{E_n, n ≥ 1}, we get the points of a homogeneous Poisson process of rate 1. Transforming [39, p. 121], we find that {Q^←(Γ_n), n ≥ 1} are the points of a Poisson process with mean measure ν, written in decreasing order.
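A simulation sketch of this transformation (our illustrative choice ν(x, ∞) = x^{-α}, so Q^←(y) = y^{-1/α}; α, x and the replication count are not from the text): the number of transformed points exceeding a level x should be Poisson with mean ν(x, ∞) = x^{-α}.

```python
import random

# Sketch: cumulative sums Gamma_n of iid standard exponentials are homogeneous
# Poisson points of rate 1; Q_inv(Gamma_n) = Gamma_n**(-1/alpha) maps them to
# points of a Poisson process with mean measure nu(x, inf) = x**(-alpha),
# listed in decreasing order.  alpha, x, reps are illustrative.
random.seed(7)
alpha, x, reps = 1.0, 0.5, 20_000

def count_exceedances():
    # Q_inv(Gamma_n) is decreasing in n, so stop at the first point <= x;
    # the count of points > x is then Poisson with mean x**(-alpha).
    gamma, count = 0.0, 0
    while True:
        gamma += random.expovariate(1.0)
        if gamma ** (-1.0 / alpha) <= x:
            return count
        count += 1

mean_count = sum(count_exceedances() for _ in range(reps)) / reps
# nu(x, inf) = x**(-alpha) = 2.0 here, so mean_count should be near 2
```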
Define the following subspaces of R_+^∞: R_+^{∞↓}, the non-increasing sequences, and, for j ≥ 0, the non-increasing sequences H_j with at most j positive components and H_{=j} with exactly j positive components. (4.23) Analogous to (4.13), we claim that tP[{Q^←(Γ_n), n ≥ 1}/b(t) ∈ · ] converges with a limit concentrating on H_{=1}. To verify this, it suffices to prove finite-dimensional convergence; for the biggest component and x > 0, tP[Q^←(Γ_1) > b(t)x] = t(1 − e^{−ν(b(t)x, ∞)}) → x^{-α}. For the first two components, let PRM(ν) be a Poisson counting function with mean measure ν; for x > 0, y > 0, writing p(t) = ν((b(t)(x ∧ y), ∞)), the joint tail can be computed explicitly. The conclusion now follows from Lemma 4.1 by observing that we have shown convergence for the sets in a convergence determining class. Similarly, we claim the analogous convergence with scaling b(t^{1/2}) and limit concentrating on H_{=2}. Straightforward computations give the joint distribution of (Γ_1, Γ_2). Notice that, for x > y > 0, it is a straightforward exercise in calculus to verify the required limit, and similar computations apply for y > x > 0. Moreover, for x > 0, y > 0, z > 0, writing p(t) = ν((b(t^{1/2})(x ∧ y ∧ z), ∞)), the three-dimensional computation is similar. As in the iid case described by Theorem 4.2 and (4.20), we have an infinite number of regular variation properties co-existing.

Theorem 4.3. For j ≥ 1, tP[{Q^←(Γ_n), n ≥ 1}/b(t^{1/j}) ∈ · ] → µ^{(j)}(·) (4.24) in M(O_{j-1}) as t → ∞, where now O_{j-1} = R_+^{∞↓} ∖ H_{j-1}, and µ^{(j)} is a measure concentrating on H_{=j} given by (4.25).

Proof. The explicit computations above, and similarly for j ≥ 3, together with an application of Lemma 4.1 yield the conclusion.

Finding the hidden jumps of a Lévy process
In this section we consider a real-valued Lévy process X = {X_t, t ≥ 0} as a random element of D := D([0, 1], R), the space of real-valued càdlàg functions on [0, 1]. We metrize D with the usual Skorohod metric d_{sk}(x, y) = inf_{λ ∈ Λ} (‖λ − e‖ ∨ ‖x ∘ λ − y‖), where x, y ∈ D, λ is a non-decreasing homeomorphism of [0, 1] onto itself, Λ is the set of all such homeomorphisms, e(t) = t is the identity, and ‖x‖ = sup_{t ∈ [0,1]} |x(t)| is the sup-norm. The space D is not complete under the metric d_{sk}, but there is an equivalent metric under which D is complete [6, page 125]. Therefore the space D fits into the framework presented in Section 2, and we may use the Skorohod metric to check continuity of mappings.
For simplicity we suppose X has only positive jumps and that its Lévy measure ν concentrates on (0, ∞). Suppose x ↦ ν(x, ∞) is regularly varying at infinity with index −α < 0. Let Q(x) = ν([x, ∞)) and define Q^←(y) = inf{t > 0 : ν([t, ∞)) < y}. Then the function b given by b(t) = Q^←(1/t) satisfies lim_{t→∞} tν(b(t)x, ∞) = x^{-α}, and b is regularly varying at infinity with index 1/α. It is shown in [22,24] that with scaling function b(t), the distribution of X is regularly varying on D ∖ {0} with a limit measure concentrating on functions which are constant except for one jump. Where did the other Lévy process jumps go? Using weaker scaling and biting more out of D than just the zero function 0 allows recovery of the other jumps.
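The normalization can be checked in closed form in a toy model. In the sketch below (the Pareto-type tail ν(x, ∞) = c·x^{-α} and the constants are our illustrative choices, not the paper's ν), the limit t·ν(b(t)x, ∞) → x^{-α} holds with equality for every t, and b(λt)/b(t) = λ^{1/α} exhibits the regular variation of b:

```python
# Toy Levy tail nu(x, inf) = c * x**(-alpha).  Then Q(x) = c * x**(-alpha),
# Q_inv(y) = (c / y)**(1/alpha), and b(t) = Q_inv(1/t) = (c * t)**(1/alpha),
# so t * nu(b(t) * x, inf) = x**(-alpha) exactly for every t > 0.
alpha, c = 0.8, 3.0

def nu_tail(x):
    return c * x ** (-alpha)

def b(t):
    # b(t) = Q_inv(1/t) for this nu
    return (c * t) ** (1.0 / alpha)

def scaled_tail(t, x):
    # equals x**(-alpha) identically in this model
    return t * nu_tail(b(t) * x)
```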
The standard Itô representation [1,4,25] of X is X_t = at + σB_t + ∫_0^t ∫_{0 < x ≤ 1} x (N(ds, dx) − ds ν(dx)) + ∫_0^t ∫_{x > 1} x N(ds, dx), where B is standard Brownian motion independent of the Poisson random measure N on [0, 1] × (0, ∞) with mean measure Leb × ν. Referring to the discussion preceding (4.23), {Q^←(Γ_n), n ≥ 1} are the points, written in decreasing order, of a Poisson random measure on (0, ∞) with mean measure ν, and by augmentation [39, p. 122] we can represent N = Σ_{l ≥ 1} ε_{(U_l, Q^←(Γ_l))}, where (U_l, l ≥ 1) are iid standard uniform random variables independent of {Γ_n}.
The Lévy–Itô decomposition allows X to be decomposed into the sum of two independent Lévy processes, X = J + X̃, (5.1) where J is a compound Poisson process of large jumps bounded from below by 1, and X̃ = X − J is a Lévy process of small jumps that are bounded from above by 1. The compound Poisson process can be represented as the random sum J_t = Σ_{l ≥ 1} Q^←(Γ_l) 1{U_l ≤ t} 1{Q^←(Γ_l) > 1}. Recall the notation in (4.23) for R_+^{∞↓}, H_{=j} and H_j, and the result in Theorem 4.3. We seek to convert a statement like (4.24) into a statement about X. The first step is to augment (4.24) with a sequence of iid standard uniform random variables. The uniform random variables will eventually serve as jump times for the Lévy process. The following result is an immediate consequence of Theorem 4.3.
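The large-jump part J can be simulated directly from the point representation. This is a sketch under illustrative assumptions (a Pareto-type ν with ν(x, ∞) = x^{-α} and an arbitrary α; not the paper's ν): jump sizes Q^←(Γ_l) come out in decreasing order, only sizes exceeding 1 are kept, and each is paired with an independent uniform jump time.

```python
import random

random.seed(11)
alpha = 1.2  # illustrative tail index

def big_jumps():
    # (size, time) pairs for all jumps exceeding 1; sizes are
    # Q_inv(Gamma_l) = Gamma_l**(-1/alpha) in decreasing order,
    # times are iid uniform on [0, 1].
    jumps, gamma = [], 0.0
    while True:
        gamma += random.expovariate(1.0)
        size = gamma ** (-1.0 / alpha)
        if size <= 1.0:
            return jumps
        jumps.append((size, random.random()))

def J(t, jumps):
    # Compound Poisson path: J(t) = sum of sizes of jumps with time <= t,
    # a nondecreasing step function on [0, 1].
    return sum(s for s, u in jumps if u <= t)

jumps = big_jumps()
```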
Proposition 5.1. Under the given assumptions on ν and Q, for j ≥ 1, tP[({Q^←(Γ_n), n ≥ 1}/b(t^{1/j}), {U_n, n ≥ 1}) ∈ · ] → (µ^{(j)} × L)(·), (5.2) where L is Lebesgue measure on [0, 1]^∞ and µ^{(j)} concentrates on H_{=j} and is given by (4.25).
Recall that ν_α is the Pareto measure on (0, ∞) satisfying ν_α(x, ∞) = x^{-α} for x > 0, and denote by ν_α^j the product measure generated by ν_α with j factors. For m ≥ 0, let D_m be the subspace of the Skorohod space D consisting of non-decreasing step functions with at most m jumps, and define A_m = {(u_1, …, u_m) : u_i ∈ (0, 1) for 1 ≤ i ≤ m; u_i ≠ u_j for i ≠ j, 1 ≤ i, j ≤ m}.
Let T_m be the map ((x_1, …, x_m), (u_1, …, u_m)) ↦ Σ_{i=1}^m x_i 1_{[u_i,1]}, and we think of T_m as mapping a jump-size sequence and a sequence of distinct jump times into a step function in D_m ⊂ D. Our approach applies T_m to the convergence in (5.2) to get a sequence of regular variation properties of the distribution of X. Whereas in Section 4.5.2 we could rely on the uniform continuity of CUMSUM, T_m is not uniformly continuous, and hence the mapping Theorem 2.3 must be used and its hypotheses verified. We will prove the following.
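A direct implementation makes the step-function picture concrete. This is a minimal sketch of T_m; representing the image by an evaluation function is our choice for illustration, not the paper's construction.

```python
def T_m(xs, us):
    # Maps jump sizes (x_1, ..., x_m) and distinct jump times (u_1, ..., u_m)
    # to the nondecreasing step function sum_i x_i * 1_{[u_i, 1]} in D_m.
    assert len(xs) == len(us) and len(set(us)) == len(us)  # distinct times
    def step(t):
        # value at time t in [0, 1]: jumps at u_i contribute for t >= u_i
        return sum(x for x, u in zip(xs, us) if u <= t)
    return step

f = T_m([3.0, 1.0], [0.2, 0.7])
# f(0.1) == 0, f(0.5) == 3.0, f(0.9) == 4.0: two jumps, at times 0.2 and 0.7
```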
Theorem 5.1. Under the regular variation assumptions on ν and Q, for j ≥ 1, tP[X/b(t^{1/j}) ∈ · ] → (µ^{(j)} × L) ∘ T_j^{-1}(·) = (1/j!) E[ν_α^j({(x_1, …, x_j) : Σ_{i=1}^j x_i 1_{[U_i,1]} ∈ · })] (5.5) in M(D ∖ D_{j-1}) as t → ∞. The first expression after taking the limit in (5.5) follows from the mapping Theorem 2.3, and the second from applying T_j to (5.2) and then using Fubini to take the integration with respect to Lebesgue measure L outside as an expectation.
Proof. Here is the outline; more detail is given in the next section. We prove convergence using Theorem 2.1(iii). Take F and G, closed and open sets respectively in D, that are bounded away from D_{j-1}. Take δ > 0 small enough so that the closed δ-swelling of F is also bounded away from D_{j-1}. The Lévy process X̃ has all moments finite and does not contribute asymptotically. Application of Lemmas 5.2 and 5.1, and letting δ ↓ 0, gives the upper bound. To deal with the lower bound using open G, take δ > 0 small enough so that the δ-shrinking of G is nonempty and bounded away from D_{j-1}. Then, applying Lemmas 5.2 and 5.1 and letting δ ↓ 0, gives the lower bound.

5.1. Details. We now provide more detail for the proof of Theorem 5.1. In the decomposition (5.1), the process X̃ represents small jumps that should not affect asymptotics. We make this precise with the next lemma. Proof. We rely on Skorohod's inequality for Lévy processes [8]. Proof. We apply Theorem 2.1(iii).
Construction of the lower bound for open sets: Let G ⊂ D be open and bounded away from D_{j-1}. This implies that functions in G have no fewer than j jumps. Recall that Γ_l = E_1 + ⋯ + E_l, where the E_k are iid standard exponentials. Take M ≥ j and note the resulting three-factor lower bound. Let t → ∞ and apply Theorem 2.1(iii) to (5.2), so the lim inf of the first factor above has a lower bound. As t → ∞, the second factor approaches 1. Let M → ∞ and the third factor also approaches 1. Let δ ↓ 0 and we obtain the lower bound for the lim inf. Construction of the upper bound for closed sets: Let F ⊂ D be closed and bounded away from D_{j-1}. Take β ∈ (0, 1) close to 1 and let M_t = Σ_{l=1}^{N_1} 1_{(b(t^{1/j})^β, ∞)}(Q^←(Γ_l)).
Choose δ > 0 small enough so that F^δ := {x ∈ D : d(x, F) ≤ δ} is bounded away from D_{j-1}. Then decompose the first summand of the resulting bound according to whether M_t ≤ j or M_t ≥ j + 1. Notice that M_t < j is incompatible with Σ_{l=1}^{M_t} Q^←(Γ_l) 1_{[U_l,1]} ∈ b(t^{1/j}) F^δ, since F^δ is bounded away from D_{j-1}. Thus we get an upper bound with three terms, and we now show that the second and third of the three terms vanish as t → ∞. Firstly, the definition of M_t implies that Q^←(Γ_l) ≤ b(t^{1/j})^β for M_t + 1 ≤ l ≤ N_1; thus the corresponding term converges to 0 as t → ∞, since the tail probability has a Markov bound of tE[N_1^p]/[b(t^{1/j})^{1-β}δ]^p for any p. Secondly, P[max(E_1, …, E_{j+1}) ≤ ν([b(t^{1/j})^β, ∞))] = P[E_1 ≤ ν([b(t^{1/j})^β, ∞))]^{j+1}.