UvA-DARE (Digital Academic Repository) Extreme Value Analysis for a Markov Additive Process Driven by a Nonirreducible Background Chain



Introduction
The Markov additive process (in this paper abbreviated to MAP) can be seen as the Markov-modulated version of the Lévy process. Indeed, when an independently evolving continuous-time Markov chain on d ∈ N states, usually referred to as the background process, is in state i, the MAP locally behaves as a Lévy process X_i(•). Additionally, a MAP allows for jumps at transition epochs of the background process. As such, MAPs offer a natural modeling framework to study stochastic processes of which the dynamics change over time, with broad applications in, for example, credit and risk theory, queueing, inventory management, and finance; early references on MAPs include Çinlar (1972) and Neveu (1961).
A key object of study concerns the extreme values attained by the MAP over a finite or infinite horizon. With Y(•) denoting the MAP under consideration, the focus is on the analysis of the distribution of its running maximum process Ȳ(t) := sup_{s∈[0,t]} Y(s) (as well as the corresponding running minimum process). Besides being interesting in its own right, the running maximum process can be directly translated in terms of the first-passage process τ(y) := inf{s ≥ 0 : Y(s) > y} because of the known duality relation between the events {Ȳ(t) > y} and {τ(y) < t}. Building upon related results for Lévy processes, a wide range of characterizations has been derived, typically in terms of transforms or so-called scale functions. We refer to Ivanovs (2011), chapter II, for an extensive account of the main results on extremes of MAPs as well as the corresponding first-passage process. Particularly noteworthy are the results obtained by Asmussen and Kella (2000), who use martingale methods to effectively extend the Pollaczek-Khinchine formula for spectrally one-sided Lévy processes to the MAP setting. We, in addition, mention the work by Dieker and Mandjes (2011) as well as D'Auria et al. (2010), the latter being predominantly in terms of the first-passage process.
Then, a MAP is a stochastic process (Y(t))_{t≥0} that evolves as the Lévy process X_i(•) during time intervals in which the background process resides in state i, and that jumps by L^n_{ij} at the nth transition of the background process from state i to state j; here, (L^n_{ij})_{n∈N} is a sequence of independent copies of the random variable L_ij, representing the size of the jump at the time of a transition from background state i to background state j (where i ≠ j). Because jumps at self-transitions, say from background state i to itself, can be incorporated in the Lévy process X_i(•), we assume without loss of generality that there are no such self-transitions. An example of a MAP is shown in Figure 1.
In this paper, we consider a spectrally one-sided MAP Y(•) with the following characteristics. First, assume Y(0) = 0 by convention. The generator matrix of the background process J(•), which evolves independently of the Lévy processes upon which Y(•) is based, is (q_ij)_{i,j=1}^d with q_i := −q_ii. When there is a transition of J(•) from i to j, there is a jump of which the size is distributed as the random variable L_ij; for a given pair (i, j) of background states, these jumps are assumed independent of all of the driving Lévy processes as well as the background process.
Finally, we incorporate (state-dependent) killing, which happens with rate ϑ_i ≥ 0 when the background process is in state i. At the moment the MAP is killed, it remains constant indefinitely, such that the running maximum becomes the all-time maximum of the process. Alternatively, killing can be thought of as reaching an absorbing background state that corresponds to a Lévy process that is identical to zero. Various specific choices of the killing rates ϑ_i are of interest. When choosing ϑ_i = ϑ > 0 for all i, for instance, we consider the running maximum over an exponentially distributed horizon with mean 1/ϑ. In addition, the choice ϑ_i = 0 for all i corresponds to the all-time maximum. We also note that, with a specific choice of the rates ϑ_i, we can analyze the maximum of a Lévy process over a phase-type interval, as argued in Section 5.
Denoting by Δ the killing time of the MAP, its distribution is characterized by the following system of equations: for α ≥ 0,

E(e^{−αΔ} | J(0) = i) = (q_i + ϑ_i)/(q_i + ϑ_i + α) · [ Σ_{j≠i} q_ij/(q_i + ϑ_i) · E(e^{−αΔ} | J(0) = j) + ϑ_i/(q_i + ϑ_i) ].

This system of equations follows by observing that the time till the first event (being either a transition of the background process J(•) or killing) is exponentially distributed with rate q_i + ϑ_i; then, one needs to distinguish between the background state becoming j (for j ≠ i) and killing.
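To illustrate, the system above is linear in the transforms E(e^{−αΔ} | J(0) = i), so it can be solved numerically; the following sketch (with an illustrative generator Q and killing rates of our own choosing, not taken from the paper) uses the equivalent form (αI + diag(ϑ) − Q) z = ϑ.

```python
import numpy as np

# Killing-time transform: clearing denominators, the system reads
#   (q_i + theta_i + alpha) z_i - sum_{j != i} q_ij z_j = theta_i,
# i.e. (alpha*I + diag(theta) - Q) z = theta, with Q the generator matrix.
Q = np.array([[-2.0, 2.0], [1.0, -1.0]])   # illustrative 2-state generator
theta = np.array([0.5, 0.5])               # killing rates theta_i
alpha = 1.0

z = np.linalg.solve(alpha * np.eye(2) + np.diag(theta) - Q, theta)
# Uniform killing theta_i = theta means Delta ~ Exp(theta),
# so z_i = theta/(theta + alpha) for every initial state i.
print(z)  # both entries equal 0.5/1.5 = 1/3
```

With uniform killing the answer is state-independent, which provides a convenient sanity check on the linear system.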
A central role is played by the matrix exponent M(α) := (m_ij(α))_{i,j=1}^d of the MAP, with entries

m_ij(α) := q_ij E(e^{−αL_ij}) + φ_i(α) 1{i = j} − ϑ_i 1{i = j}

(with the convention L_ii ≡ 0); here, all X_i(•) are assumed spectrally positive with Laplace exponents φ_i(•), and the jumps L_ij are assumed nonnegative almost surely (a.s.). Later, we also work with a similar object for the spectrally negative case (in which the jumps L_ij are assumed nonpositive almost surely); this MAP counterpart of the cumulant generating function is introduced at the beginning of Section 4.
Our aim is to analyze the distribution of Z_i, that is, the maximum of the MAP under state-dependent killing, conditional on the initial background state being i:

Z_i := sup{Y(s) : s ∈ [0, Δ]}, conditional on J(0) = i.

As mentioned, we wish to do this without a priori assuming that J(•) corresponds to an irreducible continuous-time Markov chain. A central role is played by the probabilities, for u ≥ 0, p_i(u) := P(Z_i ≥ u).

Preliminaries
Before moving on to the analytic part of the paper, we elaborate on two important results that are essential in our approach. The first of these results, the Wiener-Hopf decomposition, shows that the state of the Lévy process at an exponentially distributed epoch can be written as the difference between two independent nonnegative quantities. In case the Lévy process is spectrally one-sided, these distributions can be characterized explicitly in terms of the model primitives; notably, one of the two quantities is exponentially distributed.
To state this Wiener-Hopf decomposition, denote, for a given Lévy process X(•), its running maximum process by (X̄(t))_{t≥0} and its running minimum process by (X̲(t))_{t≥0}. Let T_ν be an exponentially distributed random variable with mean ν^{−1}, sampled independently of anything else.
Proposition 1. If X(•) is spectrally positive with Laplace exponent φ(•) and corresponding right inverse ψ(•), then X̲(T_ν) is distributed as −T_{ψ(ν)} and

E(e^{−α X̄(T_ν)}) = (ν / (ν − φ(α))) · (ψ(ν) − α)/ψ(ν).

If X(•) is spectrally negative with cumulant generating function Φ(•) and corresponding right inverse Ψ(•), then X̄(T_ν) is distributed as T_{Ψ(ν)} and

E(e^{α X̲(T_ν)}) = (ν / (ν − Φ(α))) · (Ψ(ν) − α)/Ψ(ν).

This decomposition shows that, when X(•) is spectrally one-sided, the (transforms of the) two components can be expressed explicitly in terms of the underlying Laplace exponent (in the spectrally positive case) or cumulant generating function (in the spectrally negative case) and their right inverses. For more background and proofs, we refer to, for example, Kyprianou (2006, chapter VI).
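As an illustration (our own example, not from the paper), consider Brownian motion with drift, X(t) = dt + σB(t), which is spectrally one-sided in both directions, so the objects in the decomposition are fully explicit:

```latex
\varphi(\alpha) \;=\; \log \mathbb{E}\,e^{-\alpha X(1)} \;=\; -d\alpha + \tfrac{1}{2}\sigma^2\alpha^2,
\qquad
\psi(\nu) \;=\; \frac{d + \sqrt{d^2 + 2\sigma^2\nu}}{\sigma^2},
```

where ψ(ν) is the larger root of φ(α) = ν (the right inverse), so that −X̲(T_ν) is exponentially distributed with rate ψ(ν).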
The second result concerns a characterization of the zeroes of the determinant of the matrix exponent M(α) of a spectrally positive MAP. A special role is played by Lévy processes X_i(•) that are monotone a.s. (also referred to as subordinators). Let S↑ (S↓) represent the set of background states corresponding to increasing (decreasing, respectively) subordinators. The result, a slight restatement of Ivanovs et al. (2010), theorem 1 and remark 2.1, is referred to as Proposition 2 in the sequel; it states that det M(α) = 0 has, in the right half of the complex plane, precisely as many zeroes as there are background states that do not correspond to increasing subordinators. Next to these two results, we often exploit a standard relation between two transform types: for α ≥ 0 and Y a nonnegative random variable,

α ∫_0^∞ e^{−αu} P(Y > u) du = 1 − E(e^{−αY}).     (1)
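The transform relation (1) can be sanity-checked numerically; the following sketch (our own example) verifies it for an exponentially distributed Y, for which both sides are available in closed form.

```python
import numpy as np
from scipy.integrate import quad

# Check: alpha * int_0^inf e^{-alpha u} P(Y > u) du = 1 - E[e^{-alpha Y}]
# for Y ~ Exp(lam): P(Y > u) = e^{-lam u} and E[e^{-alpha Y}] = lam/(lam+alpha).
lam, alpha = 2.0, 3.0
lhs = alpha * quad(lambda u: np.exp(-alpha * u) * np.exp(-lam * u), 0, np.inf)[0]
rhs = 1 - lam / (lam + alpha)
print(lhs, rhs)  # both equal 3/5
```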

Approach
Now that we have the essential notation and previous results at our disposal, we proceed by summarizing our approach. In both spectrally one-sided cases, the starting point is to use Proposition 1 to find a relationship between characteristics of the MAP at two successive transition epochs of the background chain. This relationship can be transformed to a system of equations for (transforms related to) p_1(u), …, p_d(u), involving the matrix M(•); here, we recall that we defined p_i(u) := P(Z_i ≥ u). In the spectrally positive case (Section 3), this system contains unknown constants that can be determined exploiting Proposition 2. In the spectrally negative case (Section 4), the solution is directly expressed in terms of the zeroes of det M(•), entailing that, again, Proposition 2 can be used.

Spectrally Positive Case
Throughout this section, we assume that the MAP Y(•) is spectrally positive. As pointed out, this entails that, for each i, j ∈ {1, …, d}, X_i(•) has no downward jumps and the random variable L_ij is nonnegative a.s. We point out how to identify

P_i(γ) := ∫_0^∞ e^{−γu} p_i(u) du,

which is the transform of the tail distribution of Z_i, and the corresponding Laplace-Stieltjes transform ζ_i(γ) := E(e^{−γZ_i}). Note that ζ_i(γ) = 1 − γP_i(γ) so that either of these transforms uniquely characterizes the distribution of Z_i, the random variable of our interest. Once the Laplace-Stieltjes transform ζ_i(γ) is evaluated, a numerical inversion algorithm, for example, the one developed in Abate and Whitt (2006), can be used to obtain the distribution of Z_i.
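For concreteness, a compact sketch of Euler-summation Laplace inversion in the spirit of Abate and Whitt follows; the parameter choices A, N, M are common defaults and the test transform (of the Exp(1) density) is our own illustrative example, not taken from the paper.

```python
import numpy as np
from math import comb, exp, pi

def euler_inversion(fhat, t, M=11, N=15, A=18.4):
    """Abate-Whitt-style Euler-summation inversion of a Laplace transform at t > 0."""
    b = lambda k: np.real(fhat((A + 2j * pi * k) / (2 * t)))
    s = 0.5 * b(0)
    partials = []
    for k in range(1, N + M + 1):
        s += (-1) ** k * b(k)
        partials.append(s)          # partials[n-1] is the nth partial sum
    # Euler (binomial) smoothing of the partial sums s_N, ..., s_{N+M}
    avg = sum(comb(M, j) * 2.0 ** (-M) * partials[N - 1 + j] for j in range(M + 1))
    return exp(A / 2) / t * avg

# Transform of the Exp(1) density: fhat(s) = 1/(1+s), so f(t) = e^{-t}.
val = euler_inversion(lambda s: 1.0 / (1.0 + s), 1.0)
print(val)  # close to exp(-1)
```

In the setting of this section one would apply such a routine to ζ_i(γ)/γ (or directly to P_i(γ)) to recover the distribution of Z_i.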
A few observations can be made.
• Recalling that T_γ denotes an exponentially distributed random variable with rate γ, it holds that γP_i(γ) = P(Z_i > T_γ). In other words, γP_i(γ) can be interpreted as the probability of Y(•) reaching an exponentially distributed level (with mean γ^{−1}) before the process is killed.
• Furthermore, bearing in mind that killing occurs at rate ϑ_i when the background process J(•) is in state i, it is worth noting that, when ϑ_i = ϑ for all i, (numerical) inversion of the resulting double transform with respect to both γ and ϑ yields P(Ȳ(t) > u | J(0) = i), that is, the tail probability of the running maximum of the unkilled MAP at time t.
• Finally, we note that P_i(0) = E(Z_i), the expected maximum that the MAP attains before being killed.
Throughout this section, we analyze the behavior of P_i(γ) for a fixed initial state i. Because the running maximum of a nondecreasing process necessarily equals the current value of the process, the analysis turns out to be slightly different depending on whether the Lévy process X_i(•) is a nondecreasing subordinator. We consider, in Section 3.1, the case in which the fixed state i does not correspond to a nondecreasing subordinator and, in Section 3.2, the case in which it does. In the analysis of these sections, unknown constants appear; Section 3.3 points out how to determine these constants.

Nonsubordinator Case
In this section, we focus on the case in which J(0) = i, where i ∉ S↑, so that the spectrally positive Lévy process X_i(•) may move downward in any interval with positive probability. Recall that ϑ_i + q_i is the rate of the exponentially distributed time until the first event; this first event corresponds to killing with probability π•_i := ϑ_i/(ϑ_i + q_i) and to a transition of the background process to state j with probability π_ij := q_ij/(ϑ_i + q_i). We decompose p_i(u) by distinguishing between the case that the value of the MAP's running maximum X̄_i(T_{ϑ_i+q_i}) at time T_{ϑ_i+q_i} is above or below u. In the former case, we have that Z_i > u so that we are done; in the latter case, with probability π•_i we do not exceed u before killing, whereas with probability π_ij, we are left with the probability of Z_j exceeding level u − X_i(T_{ϑ_i+q_i}) − L_ij before killing.
van Kreveld, Mandjes, and Dorsman: Extreme Value Analysis for a Markov Additive Process. Stochastic Systems, 2022, vol. 12, no. 3, pp. 293-317, © 2022 The Author(s).
Formalizing this reasoning and applying Proposition 1 to decompose X_i(T_{ϑ_i+q_i}) into the nonnegative independent random variables X̄_i(T_{ϑ_i+q_i}) and X̄_i(T_{ϑ_i+q_i}) − X_i(T_{ϑ_i+q_i}), we obtain a two-term decomposition, where we have also used that X̲_i(T_{ϑ_i+q_i}) has the same distribution as X_i(T_{ϑ_i+q_i}) − X̄_i(T_{ϑ_i+q_i}). We continue by evaluating these two terms, which, in the sequel, we refer to by P+_i(γ) and P−_i(γ), separately. Evaluation of the first term is relatively straightforward; an interchange of the order of integration readily yields an explicit expression, by virtue of Proposition 1. We now focus on the evaluation of the second term, which is considerably more involved. As a first step, we interchange the order of the sum and the integrals, which leaves us with quantities P−_ij(γ). The quantities P−_ij(γ) can be evaluated separately as
follows. Realize that, by Proposition 1, −X̲_i(T_{ϑ_i+q_i}) is exponentially distributed with rate μ_i := ψ_i(ϑ_i + q_i). We, thus, obtain a triple integral that, after replacing y by x − u + w, can be rewritten in a more convenient form. Our strategy is to interchange the order of the integrals so as to be able to do the (easy) integration over u first.
By first swapping the integrals over u and w and then those over u and x, we find an expression in which the second equality follows by performing the integration over u and reorganizing the resulting expression. We can rewrite this expression as the difference of two terms, in each of which the double integral factorizes into the product of two single integrals. In particular, some rearranging of terms leads to an expression featuring the quantity η_ij(γ). To separate L_ij and Z_j in the expression for η_ij(γ), we rely on a probabilistic argument for nonnegative and independent random variables A and B. That is, using the memoryless property of the exponential distribution, P(A + B > T_γ) = 1 − E(e^{−γA}) E(e^{−γB}), where we use (1) in the last step. Furthermore, let λ_ij(•) be the Laplace-Stieltjes transform of L_ij, the size of the (nonnegative) jump at a transition by the background chain from state i to state j (in other words, λ_ij(γ) := E(e^{−γL_ij})). Recalling the definition of P_i(γ), we conclude from Identity (3) an expression for η_ij(γ) in terms of λ_ij(γ) and P_j(γ). We now combine all these findings, which enables us to express P_i(γ) in terms of P_j(γ) with j ≠ i. Substituting the obtained expressions for P+_i(γ) and P−_ij(γ), we obtain, for any i ∉ S↑, a relation that, by using (2), recalling that μ_i = ψ_i(ϑ_i + q_i), and introducing appropriate shorthand notation, can be compactly summarized as follows.
Lemma 1. For i ∉ S↑ and any γ ≥ 0, the transform of the tail probability p_i(u) is given by (4).
So far, we have been working with the transform P_i(γ) of the tail probability p_i(u). In the remainder of this section, we rewrite the above lemma in terms of ζ_i(γ) := E(e^{−γZ_i}), which takes a particularly nice form. To this end, first note that, as a consequence of (1), ζ_i(γ) = 1 − γP_i(γ). Substituting this in (4) and rewriting leads to, for γ ≥ 0, Identity (6). Multiplying (6) by ϑ_i + q_i + φ_i(γ) yields Identity (7). We continue by considering the case that none of the states corresponds to a nondecreasing subordinator. Then, the system of Equations (7) that characterizes ζ_1(γ), …, ζ_d(γ) can be written in a considerably more compact form. To this end, recall the matrix M(γ), with entries m_ij(γ) of the form (8). Furthermore, using that m_ii(μ_i) = 0, we define the quantity b_i(γ) in terms of the constants ω_i. Upon combining these, we obtain one equation for each i = 1, …, d. In evident vector/matrix notation, we have, thus, rewritten (7) as follows.
Theorem 1. If no state corresponds to a nondecreasing subordinator (i.e., i ∉ S↑ for all i = 1, …, d), then, for any γ ≥ 0, Identity (11) holds. It is important to note that, throughout the analysis, no assumptions on the chain structure of the background process J(•) are imposed. Also observe that we still need to identify the constants ω_i that appear in (9), which we do in Section 3.3.
As an aside, we mention that Identity (6) can alternatively be derived using a probabilistic argumentation, which is, for completeness, provided in Appendix A.1.

Subordinator Case
The previous section deals with the case in which the initial state i is such that the spectrally positive Lévy process X_i(•) does not correspond to a nondecreasing subordinator (i ∉ S↑, that is). The analysis led to the matrix Equation (11) for the case in which no state corresponds to a nondecreasing subordinator. In the present section, we address the case in which i ∈ S↑ and point out how (11) should be adjusted if some of the background states correspond to nondecreasing subordinators.
To this end, let, for a given i = 1, …, d, the Lévy process X_i(•) be nondecreasing almost surely. It is important to note that, in this case, necessarily, φ_i(γ) ≤ 0 and ψ_i(γ) = ∞ for all γ ≥ 0. Our method for analyzing P_i(γ) is largely the same as in the previous section but is somewhat simpler because of the evident fact that any nondecreasing process attains its maximum at the end of the interval under consideration. Concretely, we could mimic the approach of the previous section while replacing X̄_i(T_{ϑ_i+q_i}) by X_i(T_{ϑ_i+q_i}), but it turns out to be convenient to condition on the value of Y(•) at the minimum of the killing time and the first transition of the background process. This yields Decomposition (12). With P+_i(γ) and P−_i(γ), respectively, representing the two terms in the right-hand side of (12), we use Proposition 1 and (1) to evaluate both terms. After collecting these intermediate results, we obtain the following characterization.
Lemma 2. For i ∈ S↑ and any γ ≥ 0, Identity (13) holds. Observe that (13) can also be obtained by taking the limit μ_i = ψ_i(ϑ_i + q_i) → ∞ in Lemma 1, which is consistent with the fact that ψ_i(•) = ∞ for subordinator processes X_i(•). Similar to the nonsubordinator case, we can again present a vector/matrix version for the Laplace-Stieltjes transforms ζ_i(γ) of (the distributions of) the random variables Z_i. To this end, define the vector b•(γ), with entries b_i(γ) as defined in (9) for i ∉ S↑ and their subordinator counterparts for i ∈ S↑. Then, using similar steps as before, we eventually find the following counterpart of Theorem 1.
Theorem 2. For any γ ≥ 0, the vector z(γ) satisfies the system (14), in which the rows corresponding to i ∈ S↑ are those resulting from Lemma 2. Note that, also in this set of equations, the vector b•(γ) still contains unknowns. These constants ω_i, one for each i ∉ S↑, are identified in the next section.

Evaluation of the Unknowns
So far, we have established that, in the spectrally positive case, the Laplace-Stieltjes transforms of Z_1, …, Z_d are given by the solutions of (14) (which simplifies to (11) in case none of the Lévy processes X_i(•) is a nondecreasing subordinator process). This section settles the complication that (11) contains unknown constants ω_i. As we see, the number of such constants equals the number of states that do not correspond to nondecreasing subordinators, which we denote by d• (i.e., d• := d − |S↑|). To identify these d• unknowns and, ultimately, the solution z(γ) of (14), we subsequently analyze three cases:
• The background chain has no transient classes.
• The background chain has exactly one transient class.
• The background chain has more than one transient class.
We proceed by studying each of these cases separately.

No Transient Classes.
In case the background chain has no transient classes, all classes of the chain are necessarily recurrent. To analyze Z_i, it evidently suffices to restrict ourselves to the recurrent class in which the background state i is. As a consequence, without loss of generality, we may assume that the background process J(•) is irreducible. In this case, which has been studied extensively (see, e.g., the results in D'Auria et al. 2010, Dieker and Mandjes 2011), the following procedure can be used to identify the ω_i. Note that, using the linear equations given in (14), one may express the vector z(γ) by relying on Cramer's rule. More concretely, with the matrix M_{b,i}(γ) denoting the matrix M(γ) in which the ith column is replaced by the vector b•(γ), we have that ζ_i(γ) = det M_{b,i}(γ)/det M(γ). Because ζ_i(γ) is finite, any zero of the denominator should be a zero of the numerator. According to Proposition 2, in case J(•) consists of a single class, det M(γ) = 0 has d• zeroes in the right half of the complex plane. For ease of exposition, we make the assumption that these zeroes have multiplicity one (and we call them, say, γ_1, …, γ_{d•}). In the special case this assumption does not hold, a reasoning similar to the one that follows still applies, but one needs to resort to the concept of Jordan chains. We do not discuss this procedure in detail and instead refer to the in-depth treatment in D'Auria et al. (2010).
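The zero count of Proposition 2 can be illustrated numerically. The sketch below (our own toy example, not from the paper) takes a two-state MAP in which each state drives a Brownian motion with drift, so that S↑ is empty, d• = 2, and det M(γ) is a quartic polynomial whose roots can be enumerated directly.

```python
import numpy as np

# Toy spectrally positive MAP: two Brownian states, no jumps at transitions.
# Laplace exponents phi_i(a) = -d_i*a + s_i^2*a^2/2; matrix-exponent entries
#   m_ii(a) = phi_i(a) - q_i - theta_i,   m_ij(a) = q_ij  (i != j),
# so det M(a) = (phi_1 - q_1 - th_1)(phi_2 - q_2 - th_2) - q12*q21, a quartic.
d1, d2, s1, s2 = 1.0, -0.5, 1.0, 1.0
q12, q21 = 1.0, 1.0
th1, th2 = 0.5, 0.5

p1 = [s1**2 / 2, -d1, -(q12 + th1)]      # phi_1(a) - q_1 - theta_1 (coeffs, high first)
p2 = [s2**2 / 2, -d2, -(q21 + th2)]
det = np.polysub(np.polymul(p1, p2), [q12 * q21])
roots = np.roots(det)
n_right = sum(r.real > 0 for r in roots)
print(n_right)  # 2, matching d_bullet = d - |S_up| = 2
```

These right-half-plane roots play the role of the γ_j above; plugging them into det M_{b,1}(γ_j) = 0 yields the linear equations for the ω_i.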
Having distinct zeroes guarantees that we have d• equations to identify the ω_i. That is, for i = 1, …, d and j = 1, …, d•, det M_{b,i}(γ_j) = 0; in other words, the zeroes of det M (in the right half of the complex plane, that is) are also zeroes of det M_{b,i} for each i = 1, …, d. For any given j = 1, …, d•, this seemingly yields d equations, but it can be seen easily that each of these d equations effectively provides the same information. Indeed, with m_k(γ) denoting the kth column of M(γ), suppose, for any fixed i, that det M(γ) = 0 and det M_{b,i}(γ) = 0 for some γ ≥ 0. This implies that both M(γ) and M_{b,i}(γ) are singular, and as a consequence, there are nontrivial vectors u and v such that M(γ)u = 0 and M_{b,i}(γ)v = 0. As a consequence, for any i′ ≠ i, there is a linear combination of the columns of M_{b,i′}(γ) that equals 0. In other words, M_{b,i′}(γ) is singular, and hence, det M_{b,i′}(γ) = 0 as well. Now that we know that (15), for any given index j = 1, …, d•, provides us with just a single equation, we study this equation in more detail. Let us focus on det M_{b,1}(γ_j) = 0 for j = 1, …, d• (we take i = 1, that is). With M̆_ij(γ) representing the (d − 1) × (d − 1) matrix that results after deleting the ith column and the jth row from M(γ), and recalling the form of b•(γ), expansion of det M_{b,1}(γ_j) along its first column yields an equation that is linear in the ω_i. We, thus, obtain d• equations (one for each γ_j) that are linear in the unknowns ω_1, …, ω_{d•}, which can be dealt with in the standard manner, thus yielding a solution for the ω_i.

A Single Transient Class.
We now consider the case in which the background chain has a single transient class, say T ⊂ {1, …, d}, next to one or more recurrent classes. In this case, note that the ζ_i(γ) for all recurrent states i, that is, i ∉ T, can be computed by the procedure pointed out earlier. Subsequently, for i ∈ T, we rewrite the ith equation of (14) by moving the (now known) terms involving states outside T to the right-hand side. Observe that the right-hand side is known; we denote it by b̃_i, and (b̃_i)_{i∈T} represents the vector of right-hand sides of (16). Using these definitions, (16) can be written as a system with a matrix M̃(γ) that involves the states in T only. Clearly, suppose that we could prove that det M̃(γ) = 0 has d• zeroes in the right half of the complex plane; then, we could identify the constants ω_i by following the same approach as the one we develop for the case of no transient classes. This is why we now verify that the entries of M̃(γ) can be written in the form (8) with transition rates that correspond to a single recurrent class so that we can apply Proposition 2 to establish the desired property for the number of zeroes of det M̃(γ) = 0 in the right half of the complex plane. By rewriting the diagonal elements of M̃(γ) with q̃_ii := −Σ_{j∈T\{i}} q_ij and ϑ̃_i := ϑ_i + Σ_{j∉T} q_ij, we conclude that the row sums of transition rates q̃_ii + Σ_{j∈T\{i}} q_ij equal zero for all i ∈ T. This means that M̃(γ) indeed has the desired form: the entries are of the form (8) with transition rates that correspond to a single recurrent class. Applying Proposition 2, we have that det M̃(γ) = 0 has d• zeroes in the right half of the complex plane so that we can identify the ω_i for i ∈ T \ S↑ (repeating the remark on roots with multiplicity larger than one as made earlier in relation to the case with recurrent states only).
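The diagonal rewrite above amounts to the algebraic identity q̃_ii − ϑ̃_i = q_ii − ϑ_i: transitions out of T are absorbed into an enlarged killing rate. A quick numerical check (hypothetical 4-state generator of our own construction):

```python
import numpy as np

# For a transient class T, rewrite the diagonal with
#   q_tilde_ii := -sum_{j in T, j != i} q_ij,
#   theta_tilde_i := theta_i + sum_{j not in T} q_ij,
# so that q_tilde_ii - theta_tilde_i = q_ii - theta_i.
Q = np.array([[-3.0, 1.0, 1.0, 1.0],
              [2.0, -4.0, 1.0, 1.0],
              [0.0, 0.0, -1.0, 1.0],
              [0.0, 0.0, 1.0, -1.0]])   # states {0,1} transient, {2,3} recurrent
theta = np.array([0.2, 0.3, 0.0, 0.0])
T = [0, 1]

for i in T:
    q_tilde_ii = -sum(Q[i, j] for j in T if j != i)
    theta_tilde_i = theta[i] + sum(Q[i, j] for j in range(4) if j not in T)
    assert np.isclose(q_tilde_ii - theta_tilde_i, Q[i, i] - theta[i])
print("rewrite consistent")
```

The restricted rates (q_ij)_{i,j∈T} with diagonal q̃_ii then have zero row sums, which is exactly what Proposition 2 requires.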
Multiple Transient Classes.
We now consider the case in which there are K > 1 transient classes (say T_1, …, T_K). We let R be the union of all remaining recurrent classes. Furthermore, we write T_k ↝ T_{k′} if there is a direct transition from a state in T_k to a state in T_{k′}, that is, there is a state i ∈ T_k and a state j ∈ T_{k′} such that q_ij > 0.
To handle the case of multiple transient classes, we order the transient classes in "layers" as follows. Let C_0 := R, and, for n = 1, 2, …, let the nth layer set be given by

C_n := C_{n−1} ∪ ⋃ {T_k : q_ij = 0 for all i ∈ T_k and j ∉ T_k ∪ C_{n−1}};

that is, C_n adds those transient classes whose direct transitions (outside the class itself) all lead into C_{n−1}. It is worth noting that, if a background state i is an element of the layer set C_j but not of C_{j−1}, then the background chain can reach a recurrent state in minimally j transitions. In addition, we can observe that the number of nonempty layer sets (including C_0) is at most K + 1. See Figure 2 for a pictorial illustration.
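The layer construction can be sketched as a simple fixed-point iteration; the class structure and transition pairs below are a hypothetical example of our own, not taken from the paper.

```python
# Layers: C_0 is the union of the recurrent classes; C_n adds every transient
# class whose direct transitions (outside itself) all land in C_{n-1}.
def layers(classes, recurrent, q_pos):
    # classes: list of frozensets of states; recurrent: those that are recurrent
    # q_pos: set of pairs (i, j) with q_ij > 0
    C = set().union(*recurrent)
    result = [frozenset(C)]
    remaining = [c for c in classes if c not in recurrent]
    while remaining:
        added = [c for c in remaining
                 if all(j in C for (i, j) in q_pos if i in c and j not in c)]
        if not added:
            break                     # no further class qualifies
        for c in added:
            C |= c
            remaining.remove(c)
        result.append(frozenset(C))
    return result

R = [frozenset({3})]                   # one recurrent class
T1, T2 = frozenset({1}), frozenset({2})  # transient classes with 1 -> 2 -> 3
ls = layers([T1, T2] + R, R, {(1, 2), (2, 3)})
print([sorted(c) for c in ls])  # [[3], [2, 3], [1, 2, 3]]
```

Consistent with the observation above, the number of layer sets here is K + 1 = 3.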
In the previous two cases, we already explained how to compute ζ_i(γ) for i ∈ R and i ∈ C_1, respectively. We now point out how we can evaluate ζ_i(γ) for i ∈ C_n, having ζ_i(γ) for i ∈ R, C_1, …, C_{n−1} at our disposal, so that we can recursively determine all ζ_i(γ). Suppose that T_k ⊆ C_n \ C_{n−1} (where it is noted that there are potentially multiple transient classes in C_n \ C_{n−1}). As states in T_k cannot have direct transitions to classes outside C_{n−1}, the equations for i ∈ T_k only involve ζ_j(γ) with j ∈ T_k besides quantities that have already been computed. From this point, the analysis follows that of the case with a single transient class. More specifically, the number of zeroes of the determinant of the matrix (m_ij(γ))_{i,j∈T_k} in the right half of the complex plane equals the number of states in T_k that do not correspond to nondecreasing subordinators, using the same argument as in the case of a single transient class. This allows us to identify the ω_i for i ∈ T_k \ S↑.

Spectrally Negative Case
The model we analyze in this section can be seen as the spectrally negative counterpart of the one considered in the previous section. This concretely means that now the Lévy processes X_i(•) are assumed to be spectrally negative and the jumps L_ij are nonpositive. In addition, we replace our earlier definition of the entries of the matrix M(ν) by

m_ij(ν) := q_ij E(e^{νL_ij}) + Φ_i(ν) 1{i = j} − ϑ_i 1{i = j},

for ν ≥ 0, to account for the nonpositive jumps; here, Φ_i(•) denotes the cumulant generating function of X_i(•). As in the spectrally positive case, the matrix M(ν) is helpful in establishing the main result of this section. Unlike in the previous section, throughout the present section, we focus directly on p_i(u) = P(Z_i ≥ u) rather than on its Laplace transform P_i(γ); as it turns out, Laplace transforms are not required in the analysis of the spectrally negative case. A convenient feature, made more precise later, is that, in the spectrally negative setting, the form of the distribution of the Z_i is known.
Somewhat comparably to the setup of Section 3, to make the presentation as transparent as possible, we first treat the case in which none of the Lévy processes X_i(•) is a nonincreasing subordinator (Section 4.1), after which we point out how to adapt the analysis to the case in which some of the X_i(•) are (Section 4.2).
The following claim plays a crucial role in this section.
Lemma 3. The equation det M(ν) = 0 has exactly d − |S↓| zeroes with a positive real part.
By Proposition 2, we already know that Lemma 3 holds if the background process is irreducible. In Section 4.3, we provide a proof for the case that the background process has a general chain structure.

Nonsubordinator Case
In this section, we consider the situation in which none of the states corresponds to a nonincreasing subordinator. This means that, for all i = 1, …, d, Proposition 1 implies that the running maximum X̄_i(T_{ϑ_i+q_i}) has an exponential distribution with rate μ_i := Ψ_i(ϑ_i + q_i). Recalling that the time until either the process is killed or the background chain makes a transition is exponentially distributed with rate ϑ_i + q_i, we, thus, obtain the identity (20), with the quantities p−_ij(u) defined in (21). To streamline our analysis, we impose Property (A). By Lemma 3, we know that, in this setting without nonincreasing subordinators, the equation det M(ν) = 0 has d zeroes with a positive real part.
Property (A): the d zeroes of det M(ν) with a positive real part are distinct. Importantly, however, imposing (A) effectively does not impose any restriction: as discussed in Remark 2, the argumentation can be adapted to cover zeroes with higher multiplicities. A crucial fact is that the first-passage process pertaining to Y(•) is a MAP itself, irrespective of whether the background process is irreducible; cf. the discussion in Ivanovs (2011), section 2.6. This implies that the random variables Z_i have phase-type distributions; see, for example, Asmussen (2003), section III.4, for background on this class of distributions. More specifically, one obtains their Laplace transforms by plugging in α = 0 in the expression of the first statement of Ivanovs (2011), corollary 4.21. The result concretely entails that, in our setting without nonincreasing subordinators, the Z_i are of phase type with a d × d transition rate matrix Λ, a vector l := −Λ1 ≥ 0 with at least one positive entry, and initial distributions a_1, …, a_d. Recalling the definition of M(ν) in (19), the zeroes of det M(ν) coincide with those of det(−νI − Λ); cf. Ivanovs (2011), theorem 4.7, and again the first statement of Ivanovs (2011), corollary 4.21. Because of this, the matrix (γI − Λ) is singular at γ = −ν_1, …, −ν_d; hence, E(e^{−γZ_i}) can be written as a linear combination of the terms 1/(γ + ν_1), …, 1/(γ + ν_d). This means that, under (A), we can write, for i = 1, …, d and u ≥ 0,

p_i(u) = Σ_{k=1}^d c_ik e^{−ν_k u},     (22)

where C = (c_ik)_{i,k=1}^d is a matrix of unknown coefficients whose rows add up to one. Remark 1. As mentioned, the first statement of Ivanovs (2011), corollary 4.21, already provides a characterization of the distribution of the random variables Z_i (for i = 1, …, d) under a possibly nonirreducible background chain. It is noted, though, that in Ivanovs (2011), corollary 4.21, the distribution of the Z_i is given in terms of a Laplace-Stieltjes transform, which contains unknown matrices (viz.,
in the terminology of Ivanovs (2011), the matrices Λ(q) and Π(q)), which can be numerically computed, for example, using Ivanovs (2011), theorem 4.14. Our contribution is that we obtain a more explicit result in Theorem 3: our result concerns the probabilities p_i(u), corresponding to the tail of Z_i, rather than their transforms. For each i, we succeed in expressing p_i(u) in terms of the solutions of an eigensystem.
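The phase-type structure behind the mixture-of-exponentials form can be checked directly: with distinct eigenvalues, a exp(Λu) 1 and the corresponding spectral expansion coincide. The parameters below are an illustrative 2-phase example of our own, not from the paper.

```python
import numpy as np
from scipy.linalg import expm

# Phase-type tail: P(Z > u) = a exp(Lambda u) 1. If -Lambda has distinct
# eigenvalues nu_k, this equals a linear combination of e^{-nu_k u}.
Lam = np.array([[-3.0, 1.0], [0.0, -2.0]])   # transition rate matrix Lambda
a = np.array([1.0, 0.0])                     # initial distribution

u = 0.7
tail = a @ expm(Lam * u) @ np.ones(2)

# Same value via the spectral decomposition Lambda = V diag(w) V^{-1}:
w, V = np.linalg.eig(Lam)
coef = (a @ V) * np.linalg.solve(V, np.ones(2))
tail_mix = np.real(sum(c * np.exp(wk * u) for c, wk in zip(coef, w)))
print(tail, tail_mix)  # equal
```

The coefficients coef play the role of the row (c_ik)_k, and indeed sum to one (take u = 0).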
We now exploit the structure as given in (22) to generate equations by which we can determine the coefficients c_ik. To this end, we define a_ik := c_ik ν_k. By conditioning on the value of Z_j in (21) using (22), we, thus, obtain a triple integral. We then substitute v by u − w − x and recall that X̲_i(T_{ϑ_i+q_i}) and L_ij are nonpositive random variables. Pulling the sum in front of the integrals leads to a sum in which the two integrals factorize. Now, we can rewrite the first integral in this expression using (1) and Proposition 1; furthermore, the second integral of (24) can be evaluated in closed form as well. Combining these, we conclude Identity (25). It can be seen that the μ_i differ from the ν_k because, if they were equal for some pair (i, k), then p_i(u) would have a term that is constant in u, thus violating its form given in (22). In Appendix A.2, an alternative, probabilistic proof of (25) is given. We now focus on finding the values of the coefficients c_ik for i, k = 1, …, d. Observe that we have two alternative ways of writing p_i(u): Representation (22) and a representation based on (20) and (25). Note that both are linear combinations of e^{−μ_i u} and e^{−ν_1 u}, …, e^{−ν_d u}. The weights corresponding to each of these d + 1 exponentials should match, thus providing equations that impose constraints on the c_ik.
• Focusing on the terms corresponding to e^{−ν_k u} for k = 1, …, d, we, thus, obtain the equations (26), where, as observed earlier, Σ_{k=1}^d c_ik = 1.
• Regarding the terms corresponding to e^{−μ_i u}, recalling that μ_i differs from all the ν_k, we should have that the corresponding weight vanishes. This equation holds true if (26) applies, which can be seen by recognizing the left-hand side as 1 − Σ_{k=1}^d c_ik, which equals zero (as the obvious consequence of Σ_{k=1}^d c_ik = 1). In other words, this equation does not provide any additional information.
We now observe that (26) is equivalent to, for i, k = 1, …, d, Identity (27). We reassuringly notice from (27) that the matrix M(ν_k) is singular for all k = 1, …, d and, hence, that the ν_k are indeed the solutions to det M(ν) = 0.
As was done in the spectrally positive case, our result can be rewritten in a more compact vector/matrix form. In particular, to find the c_jk, it is enough to solve, for k = 1, …, d, the matrix-vector equation M(ν_k) c_k = 0, where c_k := (c_1k, …, c_dk)⊤, subject to C1 = 1. The following theorem summarizes these findings.
Theorem 3. Under (A), for i = 1, …, d and u ≥ 0, p_i(u) = Σ_{k=1}^d c_ik e^{−ν_k u}, where, for k = 1, …, d, the vector c_k solves M(ν_k) c_k = 0, subject to C1 = 1.
Remark 2. We briefly comment on the case in which some solutions of det M(ν) = 0 have multiplicity larger than one. For instance, in the case of a root with multiplicity two, suppose that, for some k_1 ≠ k_2, ν_{k_1} = ν_{k_2} =: ν, giving rise to terms in (22) proportional to e^{−νu} and u e^{−νu}. Finding the associated weights works effectively as pointed out: use Identity (20) to find two alternative expressions for p_i(u) and then equate the terms proportional to u e^{−νu} so as to obtain linear equations for these coefficients (in addition to equating all terms proportional to e^{−ν_k u}). For an in-depth treatment of these multiplicity issues, we again refer to D'Auria et al. (2010).
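Numerically, the coefficient vectors correspond to null vectors of M(ν) at the zeroes ν_k with positive real part; the sketch below verifies this singularity for a hypothetical two-state example with Brownian inputs and no jumps (all parameters our own, not from the paper).

```python
import numpy as np

# Two-state spectrally negative toy example: cumulants Phi_i(nu) = d_i*nu + nu^2/2,
# no jumps, so m_ii(nu) = Phi_i(nu) - q_i - theta_i and m_ij(nu) = q_ij.
d1, d2, q12, q21, th1, th2 = 1.0, -0.5, 1.0, 1.0, 0.5, 0.5

def M(nu):
    return np.array([[d1 * nu + nu**2 / 2 - q12 - th1, q12],
                     [q21, d2 * nu + nu**2 / 2 - q21 - th2]])

# det M(nu) as a quartic polynomial (highest degree first), then its roots:
det = np.polysub(np.polymul([0.5, d1, -(q12 + th1)], [0.5, d2, -(q21 + th2)]),
                 [q12 * q21])
nus = [r for r in np.roots(det) if r.real > 0]
for nu in nus:
    _, s, Vh = np.linalg.svd(M(nu))
    v = Vh[-1].conj()                 # right null vector of M(nu)
    assert np.allclose(M(nu) @ v, 0, atol=1e-6)
print(len(nus))  # 2 zeroes with positive real part, matching d = 2
```

A subsequent rescaling of the null vectors then enforces the normalization C1 = 1.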

Subordinator Case
We now consider the case in which some of the states of the background process correspond to a nonincreasing subordinator. Let S↓ denote the set of states corresponding to nonincreasing subordinators. For i ∈ S↓, we have Z_i = 0 with positive probability.
The structure of this section is similar to that of the nonsubordinator case, the main difference being that the MAP cannot cross positive levels (in the upward direction, that is) while the background process is in a state i ∈ S↓. Therefore, an analogous decomposition applies for u > 0. Regarding the zeroes of det M(ν), we make a similar claim and assumption as in Section 4.1. Let d• := |S \ S↓| be the number of states that do not correspond to a nonincreasing subordinator. Then, by Lemma 3, we know that det M(ν) has d• zeroes with a positive real part, say (ν_k)_{k∉S↓}. In our analysis, we impose Property (A′).
(A′) The d• zeroes of det M(ν) with a positive real part are distinct.
The case of zeroes with higher multiplicities can be dealt with as discussed in Remark 2, and the case that J(•) is not irreducible is covered by Section 4.3.
Relying on the same reasoning as in Section 4.1, under (A′), we again have, because the first-passage process is a MAP, and by the first statement of Ivanovs (2011), corollary 4.21, that p_i(u) is a linear combination of exponential terms, where the number of such terms now equals d•. Concretely, for u > 0 and i = 1, …, d,

p_i(u) = ∑_{k∉S↓} c_ik e^{−ν_k u}. (28)

To identify the coefficients c_ik, it proves worthwhile to further study p⁻_ij(u). In particular, for any j = 1, …, d, conditioning on the value of Z_j yields (29). Then, we subsequently use Relation (1) and Proposition 1 such that, in combination with (28), we obtain (30). By equating (29) and (30), we thus obtain equations that the coefficients should satisfy. As it turns out, doing so for any i ∈ S↓ and k ∉ S↓, we again obtain (26). Following the same steps as the ones leading to Theorem 3, we obtain the following result; however, the matrix C now consists of the entries c_ik with k ∉ S↓, whereas c_k := (c_1k, …, c_dk)^⊤ as before.
Theorem 4. Under Property (A′), the tail probability p_i(u) satisfies

p_i(u) = ∑_{k∉S↓} c_ik e^{−ν_k u},

for i = 1, …, d. Here, for k ∉ S↓, the vectors c_k solve M(ν_k) c_k = 0 subject to ∑_{k∉S↓} c_ik = 1 for all i ∉ S↓.
We note that, because the rows i of C with i ∈ S↓ do not add up to one, we have P(Z_i = 0) = 1 − ∑_{k∉S↓} c_ik > 0.

Number of Roots with a Positive Real Part
In the previous sections, we provide a recipe to compute the tail probabilities p_i(u) using Lemma 3. The objective of this section is to prove this lemma. To this end, we partition the state space of the background chain into K transient classes (say T_1, …, T_K) and L recurrent classes (say R_1, …, R_L). We label the classes such that, for ℓ ∈ {1, …, K}, class ℓ refers to T_ℓ, and for ℓ ∈ {K + 1, …, K + L}, class ℓ refers to R_{ℓ−K}. We also order the transient classes as is done in Section 3.3: for any ℓ, T_ℓ has no transitions to other classes T_{ℓ′} with ℓ′ ≤ ℓ. Furthermore, we let d•_ℓ be the number of states in class ℓ that do not correspond to nonincreasing subordinators, for ℓ = 1, …, K + L.
With the introduced ordering of the classes, the transition rate matrix of the background chain can be written in the block form (31). The block matrices Q_{K+1}, …, Q_{K+L} correspond to the recurrent classes and can be interpreted as "true" transition rate matrices of Markov chains of lower dimension: they have nonnegative entries except on their diagonals, and their row sums are all zero. This does not hold for the block matrices Q̃_1, …, Q̃_K: because they correspond to transient classes, their off-diagonal entries are still nonnegative, but they have at least one strictly negative row sum. The matrices S_{k,ℓ} with k = 1, …, K and ℓ = K + 1, …, K + L contain nonnegative entries and correspond to transitions from T_k into a different class. The next step is to construct the matrix M(ν) that corresponds to the rearranged transition rate matrix Q. This matrix is block upper triangular, a structure inherited from the matrix Q. Concretely, for appropriately constructed matrices M̃_1(ν), …, M̃_K(ν) and M_{K+1}(ν), …, M_{K+L}(ν) (based on, respectively, Q̃_1, …, Q̃_K and Q_{K+1}, …, Q_{K+L}), the matrices M̃_ℓ(ν) (for ℓ = 1, …, K) correspond to the transient classes, whereas the matrices M_ℓ(ν) (for ℓ = K + 1, …, K + L) correspond to the recurrent classes. It is clear that det M_ℓ(ν), for ℓ = K + 1, …, K + L, has d•_ℓ roots with a positive real part as an immediate consequence of Proposition 2. The same holds for det M̃_ℓ(ν), for ℓ = 1, …, K, which follows by rewriting the diagonal entries as we did in (17) and (18) in such a way that M̃_ℓ(ν) has the desired form to apply Proposition 2. Upon combining these, we conclude that det M(ν) factorizes as in (32) and, consequently, has d• = ∑_ℓ d•_ℓ roots with a positive real part, as claimed in Lemma 3.
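The factorization step rests on the fact that the determinant of a block upper triangular matrix is the product of the determinants of its diagonal blocks; a quick numerical illustration with toy blocks (not an actual M(ν)):

```python
import numpy as np

# Diagonal blocks A, B and an off-diagonal coupling block S; the zero block
# below the diagonal makes M block upper triangular.
A = np.array([[2.0, 1.0], [0.5, 3.0]])
B = np.array([[1.0, -1.0], [2.0, 4.0]])
S = np.array([[5.0, 6.0], [7.0, 8.0]])
M = np.block([[A, S], [np.zeros((2, 2)), B]])

# det M = det(A) * det(B); the coupling block S plays no role.
assert np.isclose(np.linalg.det(M), np.linalg.det(A) * np.linalg.det(B))
```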
Remark 3. As indicated, the computation of the coefficients c_ik amounts to solving an eigensystem; see Theorems 3 and 4. However, using the structure of the background chain, this computation can be simplified considerably in specific cases. Appealing to the factorization of det M(ν) provided in Equation (32), it can be argued that some of the c_ik are necessarily equal to zero. In the first place, let ν_k be a root of det M(ν) such that det M_ℓ(ν_k) = 0 for some ℓ ∈ {K + 1, …, K + L} (i.e., ℓ corresponds to a recurrent class). If the roots of det M(ν) are simple, this means that ν_k solves neither det M_{ℓ′}(ν) = 0 for ℓ′ ∈ {K + 1, …, K + L} with ℓ′ ≠ ℓ, nor det M̃_{ℓ′}(ν) = 0 for ℓ′ ∈ {1, …, K}. By virtue of the structure of the matrix M(ν), which is inherited from the transition rate matrix Q (as given in (31)), we thus conclude that c_ik = 0 for all states i from which the recurrent class ℓ cannot be reached. In the second place, analogously, if ν_k is such that det M̃_ℓ(ν_k) = 0 for some ℓ ∈ {1, …, K} (i.e., ℓ corresponds to a transient class), then c_ik = 0 for all states i from which this transient class cannot be reached. This reduction procedure makes intuitive sense: informally, the distribution of Z_i cannot be affected by properties of the MAP that correspond to states that cannot be reached from state i.
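Deciding which coefficients vanish only requires reachability information from the transition graph of the background chain. A minimal sketch of the depth-first search one could use, on a hypothetical three-state chain with transitions 1 → 2 → 3 and state 3 absorbing:

```python
# succ[i] lists the states directly reachable from state i.
succ = {1: {2}, 2: {3}, 3: set()}

def reachable(start, succ):
    """States reachable from `start` (including itself), by depth-first search."""
    seen, stack = {start}, [start]
    while stack:
        for j in succ[stack.pop()]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return seen

# c_ik can be set to zero whenever the class associated with the root nu_k
# cannot be reached from state i.
assert reachable(1, succ) == {1, 2, 3}
assert reachable(3, succ) == {3}
```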

Maximum of a Spectrally One-Sided Lévy Process over a Phase-Type Period
In Lévy fluctuation theory, the focus is predominantly on the evaluation of the distribution of extreme values over exponentially distributed intervals; see, for instance, Proposition 1 for a key result in this context. In the present section, we use our results on the maximum of a killed MAP to determine the distribution of the maximum of a spectrally one-sided Lévy process over a phase-type distributed time interval.
The practical relevance of working with the class P of phase-type distributions lies in the fact that any distribution on the positive half-line can be approximated arbitrarily closely by a distribution in P (Asmussen 2003, theorem III.4.2). The proof of this property reveals that, actually, any distribution on the positive half-line can be approximated arbitrarily closely by elements from a smaller class, namely, the class of mixtures of Erlang distributions. In particular, a deterministic positive number can be approximated by an Erlang distributed random variable with a large number of phases.
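The Erlang approximation of a deterministic number can be checked empirically: an Erlang random variable with n phases, each of rate n, has mean one and variance 1/n, so it concentrates around one as n grows. A seeded sketch:

```python
import numpy as np

rng = np.random.default_rng(0)
for n in (1, 5, 50):
    # Erlang(n, n): gamma distribution with integer shape n and rate n.
    sample = rng.gamma(shape=n, scale=1.0 / n, size=100_000)
    assert abs(sample.mean() - 1.0) < 0.05       # mean one for every n
    assert abs(sample.var() - 1.0 / n) < 0.05    # variance 1/n shrinks to zero
```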
This section has two main goals. In Section 5.1, we show how our results on the maximum of a killed spectrally one-sided MAP can be applied to derive the distribution of the maximum of a spectrally one-sided Lévy process over a phase-type distributed time interval. Then, in Section 5.2, we obtain more specific results for the practically relevant class of mixtures of Erlang distributions.

Translation into the MAP Framework
We start our exposition by interpreting a phase-type distributed random variable as an absorption time in a continuous-time Markov chain. Each element in the class P is characterized by (i) a finite state space {1, …, d}; (ii) an initial distribution a on this state space; (iii) a matrix T = (t_ij)_{i,j=1}^d with nonpositive diagonal entries, nonnegative off-diagonal entries, and nonpositive row sums; and (iv) a nonnegative exit vector t := −T1. Note that the (d + 1) × (d + 1) matrix

T̄ :=
( T  t )
( 0  0 )

is a genuine transition rate matrix of a (d + 1)-state Markov chain: its diagonal entries are nonpositive, its off-diagonal entries are nonnegative, and its row sums are equal to zero. The (d + 1)st column and row of this matrix correspond to a newly added state d + 1, which we refer to as the absorbing state. Observe that this chain can hit state d + 1 from any other state according to the exit vector t. The phase-type random variable corresponding to the preceding instance is the time it takes the expanded Markov chain (with transition rate matrix T̄) to reach the absorbing state, where the initial state has been sampled according to the distribution a.
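The expanded generator can be assembled mechanically from the pair (a, T); the parameters below are hypothetical, chosen only to satisfy the sign conditions:

```python
import numpy as np

# Hypothetical phase-type parameters with d = 2 phases.
a = np.array([0.6, 0.4])                  # initial distribution
T = np.array([[-3.0, 1.0], [0.5, -2.0]])  # sub-generator: nonpositive row sums
t = -T @ np.ones(2)                       # exit vector t := -T 1

Tbar = np.zeros((3, 3))                   # expanded (d+1) x (d+1) generator
Tbar[:2, :2] = T
Tbar[:2, 2] = t                           # exit rates feed the absorbing state
assert np.allclose(Tbar.sum(axis=1), 0.0) # genuine transition rate matrix
assert np.all(t >= 0) and np.isclose(a.sum(), 1.0)
```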
We now consider the distribution of the maximum of the spectrally one-sided Lévy process X(•) over a phase-type distributed time interval (characterized by the initial distribution a and the transition rate matrix T).
To use the MAP framework that we have been working with in the previous sections, we let X_1(•), …, X_d(•) be independent copies of a common spectrally one-sided Lévy process X(•) such that the resulting MAP evolves as this Lévy process. We write φ(•) for the Laplace exponent of X(•) in case it is spectrally positive, and Φ(•) for the cumulant generating function of X(•) in case it is spectrally negative. In addition, we let X_{d+1}(t) ≡ 0 for all t ≥ 0. Furthermore, we choose Q := T + diag(t) and ϑ := t, such that absorption in state d + 1 corresponds to killing. In addition, we let the jumps of the MAP at transition epochs of the background process, as represented by the random variables L_ij, be equal to zero. Observe that, under this construction, with Z denoting the maximum of the Lévy process X(•) over the phase-type interval,

P(Z ≥ u) = ∑_{i=1}^d a_i P(Z_i ≥ u),

where P(Z_i ≥ u) can be analyzed using the techniques for extremes of MAPs as developed earlier in the paper.
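The choice Q := T + diag(t), ϑ := t can be sanity-checked numerically: adding diag(t) restores zero row sums, so Q is a genuine (conservative) transition rate matrix, with the lost exit rates reappearing as killing rates. A small check with hypothetical phase-type parameters:

```python
import numpy as np

# Hypothetical sub-generator T and its exit vector t = -T 1.
T = np.array([[-3.0, 1.0], [0.5, -2.0]])
t = -T @ np.ones(2)

Q = T + np.diag(t)        # background generator of the constructed MAP
theta = t                 # killing rates: absorption in state d + 1 = killing
assert np.allclose(Q.sum(axis=1), 0.0)          # Q is conservative
assert np.all(Q[~np.eye(2, dtype=bool)] >= 0)   # off-diagonal rates nonnegative
assert np.all(theta >= 0)
```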

Mixtures of Erlang Distributions
Because, with this distribution class, we can approximate any nonnegative random variable arbitrarily closely, we are particularly interested in the case in which the time interval has a mixture-of-Erlangs distribution. This concretely means that, for some k ∈ N and i = 1, …, k, with probability p_i ∈ [0, 1] the length of the interval is sampled from an Erlang distribution with shape parameter d_i ∈ N and scale parameter τ_i > 0 (obviously requiring ∑_{i=1}^k p_i = 1). It takes little thought to conclude that, in order to evaluate the maximum of the Lévy process over such an interval, it suffices to be able to evaluate its maximum over an Erlang distributed time interval (say with parameters d ∈ N and τ > 0). This requires us to extend the result of an example from Asmussen and Ivanovs (2018), which focuses on the maximum of Brownian motion (with a given drift and variance parameter) over an Erlang(d, τ) distributed time interval. Specifically, we generalize this result to any spectrally one-sided Lévy process (in which, to avoid trivial cases, we assume that the underlying Lévy process is not a subordinator). Related results on maxima over an Erlang horizon include Boxma and Mandjes (2021), section 5; Dębicki and Mandjes (2015), section IV.1; and Starreveld et al. (2016). In the remainder of this section, we treat both spectrally one-sided cases separately.
The direct implication of the matrix M(•) being upper triangular is that z(γ), as well as the unknown constants ω_i, can be solved for recursively. A concrete recipe is the following. Defining m(γ) := −τ + φ(γ), we first find ζ_d(γ) = b_d(γ)/m(γ). Note that the numerator contains the constant ω_d (see (9)), which can be identified using the observation that the (single) zero of the denominator, say γ* := ψ(τ), should be a zero of the numerator as well. As a next step, we identify ζ_{d−1}(γ) from the corresponding equation of the linear system. This expression contains the (by now known) constant ω_d as well as the (still unknown) constant ω_{d−1} through the function b_{d−1}(γ). However, ω_{d−1} can again be found by noting that the double zero of the denominator (which is again γ*) is also a double zero of the numerator. Thus, using that m(γ*) = 0, we obtain an equation from which the unknown constant ω_{d−1} follows. We can continue along these lines until we have identified all Laplace-Stieltjes transforms ζ_i(γ) and corresponding constants ω_i for i = 1, …, d. Because ζ_j(ψ(τ)) is not well defined for any j = 1, …, d, we note that lim_{ν→ψ(τ)} ζ_j(ν) can be derived from ζ_j(γ) by L'Hôpital's rule.

van Kreveld, Mandjes, and Dorsman: Extreme Value Analysis for a Markov Additive Process. Stochastic Systems, 2022, vol. 12, no. 3, pp. 293-317, © 2022 The Author(s).

5.2.2. Maximum of a Spectrally Negative Lévy Process over an Erlang-Distributed Time Interval. For the spectrally negative case, we follow the line of reasoning used in Section 4. The first observation is that, in this case, an auxiliary result is needed, in which h_m := ∫_0^∞ x^m e^{−νx} P(A > x) dx for a random variable A. The quantity h_m satisfies the recursion

h_m = (1/ν) (m h_{m−1} − h•_m), m = 1, 2, …,

with h•_m := E(A^m e^{−νA}) and h_0 = (1/ν)(1 − E(e^{−νA})).

Proof. The first claim is verified by observing that P(−A ≥ x) = 1 − P(A > −x). The second claim, the recursive relation for h_m, follows by applying integration by parts for m = 1, 2, …. Finally, the expression for h_0 results from the definition of h_m in combination with (1). This completes the proof. □

In our analysis, we apply this result to the case A = −X(T_τ). At this point, we have obtained two expressions for p_i(u), both in terms of a sum whose summands are proportional to e^{−νu}, u e^{−νu}, …, u^{d−i} e^{−νu}. Equating these gives a linear system from which the coefficients a_ik can be solved. Conveniently, this linear system allows for a recursive solution procedure. To see this, first recall that a_{d1} = ν. Then a_{d−1,1} and a_{d−1,2}, appearing in (34) for i = d − 1, are, by (37), expressed in terms of a_{d1}. Along similar lines, a_{d−2,1}, a_{d−2,2}, and a_{d−2,3} are expressed in terms of a_{d−1,1} and a_{d−1,2}, and so on.
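The recursion for h_m can be verified against a case in which everything is available in closed form: for A exponentially distributed with rate λ (a choice made here purely for validation), one has h_m = m!/(ν + λ)^{m+1} and h•_m = λ m!/(ν + λ)^{m+1}, and the recursion h_m = (m h_{m−1} − h•_m)/ν reproduces these values exactly:

```python
import math

nu, lam = 0.7, 1.3          # illustrative rates; A ~ Exp(lam)

def h_closed(m):            # h_m = ∫_0^∞ x^m e^{-nu x} P(A > x) dx, closed form
    return math.factorial(m) / (nu + lam) ** (m + 1)

def h_star(m):              # h*_m = E(A^m e^{-nu A})
    return lam * math.factorial(m) / (nu + lam) ** (m + 1)

h = (1 - lam / (lam + nu)) / nu            # h_0 = (1 - E e^{-nu A}) / nu
assert abs(h - h_closed(0)) < 1e-12
for m in range(1, 6):
    h = (m * h - h_star(m)) / nu           # the recursion, started from h_0
    assert abs(h - h_closed(m)) < 1e-12
```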

Numerical Experiments
In the previous sections, we develop theory on the distribution of the maximum of a spectrally one-sided MAP. We now discuss some practical issues concerning the implementation of our findings, the role of the model parameters, and the application of our results in a practical context. To cover these three issues, we consider three experiments: the first highlights the impact of the structure of the background process, the second focuses on the maximum of a Lévy process over Erlang-distributed time intervals, and the third is motivated by a problem in risk theory.

Impact of the Chain Structure of the Background Process
In this first experiment, we consider a spectrally positive MAP in which the background chain J(•) has the structure shown in Figure 3 (where q_ij > 0 when there is an arrow from i to j, and q_ij = 0 otherwise). With states 4 and 5 being absorbing, the background process is clearly not irreducible. As a consequence, we cannot use results from the existing literature and have to rely on the results of Section 3. Our goal is to evaluate z(γ), that is, the vector of Laplace-Stieltjes transforms of the Z_i. This vector is the solution of the matrix equation (14), where we follow the procedure developed in Section 3.3 to determine the unknown constants ω_i. We first categorize the communicating classes of J(•) in layers: using the notation of Section 3.3, we have C_0 = R = {4, 5}, C_1 = {2, 3, 4, 5}, and C_2 = {1, 2, 3, 4, 5}. Note that, even though the communicating class {1} has a transition to the recurrent state 5, it belongs only to C_2 because it also has transitions into C_1. Following our procedure, we consecutively evaluate ζ_i(γ) for i ∈ {4, 5}, then for i ∈ {2, 3}, and finally for i = 1.
Using this approach, we now consider an example MAP with the background chain structure given in Figure 3. We let X_1(•), X_2(•), X_3(•), and X_5(•) correspond to standard Brownian motions and X_4(•) to a gamma process (i.e., a Lévy process with independent gamma distributed increments) with jump intensity two and jump-size parameter two, so that

φ_4(α) = 2 log( 2 / (α + 2) ).

Additionally, we let the background process be governed by a fixed transition rate matrix, and we set L_ij = 0 for all i, j, so there are no jumps at transition epochs of the background process. Finally, we consider the setting in which ϑ = (0, 0, 0, 0.5, 0.5)^⊤, meaning that killing only happens in states 4 and 5. For these model parameters, we plot in Figure 4 the density functions f_i(•) of the Z_i. We comment on a few aspects of Figure 4 that illustrate the impact of the chain structure. First, from the given transition rate matrix, it is clear that, if J(0) = 2, then the process likely ends up in state 5. This explains why the densities of Z_2 and Z_5 behave similarly; a similar reasoning applies to Z_3 and Z_4. Also, Z_4 and Z_5 are "closer to being killed" than Z_2 and Z_3 and, therefore, have more probability mass close to zero. Finally, notice that, from initial state 1, absorption in state 4 or 5 is about equally likely, resulting in a density function that roughly behaves as the average of the two pairs mentioned.
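Under the gamma-process parametrization used in this experiment (jump intensity two, jump-size parameter two), X_4(1) has a gamma distribution with shape 2 and rate 2, so E exp(−α X_4(1)) = (2/(α + 2))², which matches exp(φ_4(α)) over a unit time interval. A seeded Monte Carlo check of this Laplace exponent:

```python
import numpy as np

rng = np.random.default_rng(1)
alpha = 1.5

# X_4(1) ~ Gamma(shape = 2, rate = 2), i.e. scale = 0.5 in numpy's convention.
x = rng.gamma(shape=2.0, scale=0.5, size=400_000)
mc = np.exp(-alpha * x).mean()
exact = (2.0 / (2.0 + alpha)) ** 2
assert abs(mc - exact) < 5e-3
```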

Maximum of a Lévy Process in an Erlang-Distributed Time Interval
This example focuses on the distribution of the maximum of a Lévy process over a time interval of Erlang-distributed length, applying the theory of Section 5. We choose the model parameters as pointed out in Section 5.1. That is, we let background state i represent the ith phase of the interval. With n denoting the total number of phases, we let q_{i,i+1} = n for all i = 1, …, n − 1 and ϑ = (0, 0, …, 0, n)^⊤. This way, the mean interval length equals unity, and killing occurs only in the last phase. Because phase transitions should not affect the Lévy process, we take L_{i,i+1} = 0 for all i = 1, …, n − 1. We particularly study the impact of the number of phases on the distribution of the maximum of the Lévy process, bearing in mind the Erlang distribution's capability of approximating a deterministic number. Indeed, our Erlang random variable converges to the deterministic value one as n → ∞, and this experiment serves to gain insight into the maximum of a Lévy process over a deterministic time interval.
We first consider the case of X(•) being a standard Brownian motion, noting that, for this instance, its maximum over a deterministic interval is known to have a half-normal distribution (i.e., the distribution of the absolute value of a normally distributed random variable). Figure 5 shows the corresponding density functions for n = 1, 2, 5 phases as well as the limiting counterpart. The figure confirms that the densities converge to the limit, where the curve for n = 5 already produces a reasonable fit. We proceed with an example in which the distribution of the maximum over a deterministic time horizon is not known. Let X(•) be the independent sum of (i) a standard Brownian motion with positive drift one and (ii) a compound Poisson process with arrival rate one and Erlang(2, 2) distributed jumps in the negative direction (thus rendering the process spectrally negative). Figure 6 illustrates the (fast) convergence of the density functions as n grows, thus providing us with a way to approximate the distribution of the maximum of X(•) over a deterministic interval.
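The Brownian case can be probed by direct simulation: the running maximum of a standard Brownian motion over an Erlang(n, n) horizon should, for large n, have mean close to E|N(0, 1)| = √(2/π) ≈ 0.798, the mean of the half-normal limit (path discretization biases the estimate slightly downward). A seeded sketch, independent of the paper's transform-based method:

```python
import numpy as np

rng = np.random.default_rng(2)
n_paths, n_steps = 10_000, 400

def max_over_erlang(n):
    """Running maxima of discretized standard BM paths over Erlang(n, n) horizons."""
    horizons = rng.gamma(shape=n, scale=1.0 / n, size=n_paths)
    dt = horizons / n_steps                                # per-path step size
    steps = rng.normal(size=(n_paths, n_steps)) * np.sqrt(dt)[:, None]
    paths = np.cumsum(steps, axis=1)
    return np.maximum(paths.max(axis=1), 0.0)              # at least X(0) = 0

m = max_over_erlang(20).mean()
assert 0.70 < m < 0.85    # near sqrt(2/pi) ~ 0.798, biased low by discretization
```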

Risk Model
The last example is a special case of the model discussed in Delsing and Mandjes (2021) and is motivated by applications in credit risk. The process of interest is the capital of an insurance company with a finite number of obligors n. Each obligor independently goes into default after an exponentially distributed time with mean one. When going into default, the obligor makes a claim of exponential size with mean one and immediately ends the contract with the insurance company (i.e., leaves the system). Each obligor that has not gone into default pays premiums at rate r per time unit. Figure 7 shows a possible sample path of the process. We wish to quantify the ruin probability, that is, the probability that the capital of the insurance company eventually hits zero given some initial reserve u ≥ 0. This model can be cast in our framework as follows. Let background state i represent the number of obligors that have not yet gone into default. Then the transition rates of the background chain are q_{i,i−1} = i for i = 1, …, n (all other transition rates are zero). As we are interested in the all-time ruin probability, we let ϑ = 0. Observe that the ruin probability depends on the minimum of the capital process, whereas the results in this paper are in terms of the maximum, but this is easily remedied by flipping the sign. Concretely, we choose X_i(t) = −irt for t ≥ 0 and i = 0, …, n, and we let the MAP have positive jumps L_{i,i−1} of exponentially distributed size with mean one for i = 1, …, n. The ruin probability with initial capital u is then given by P(Z_n ≥ u).
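The risk model is simple enough to simulate directly, which gives an independent check on the monotonicity of the ruin probability in u and r. A seeded Monte Carlo sketch (parameter values are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

def ruin_prob(u, r, n=4, n_runs=20_000):
    """Monte Carlo estimate of the probability that the capital ever hits zero."""
    ruined = 0
    for _ in range(n_runs):
        capital, alive = u, n
        while alive > 0:
            # premiums collected until the next default (total rate `alive`)
            capital += alive * r * rng.exponential(1.0 / alive)
            capital -= rng.exponential(1.0)     # exponential(1) claim at default
            if capital <= 0:
                ruined += 1
                break
            alive -= 1                          # defaulted obligor leaves
        # once all obligors have defaulted, the capital no longer moves
    return ruined / n_runs

p_small_u, p_large_u = ruin_prob(0.5, 1.0), ruin_prob(2.0, 1.0)
p_small_r, p_large_r = ruin_prob(1.0, 0.5), ruin_prob(1.0, 2.0)
assert p_large_u < p_small_u     # more initial capital, less ruin
assert p_large_r < p_small_r     # higher premium rate, less ruin
```

Note that, since capital only decreases at claim epochs, it suffices to inspect the capital right after each claim.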
In an example with (initially) four obligors, Figure 8 shows how the ruin probability depends on the initial capital u and the premium rate r. The ruin probability is decreasing in both u and r, and our techniques can be used to assess the corresponding sensitivities. Clearly, one can trade off u and r: when reducing the premium rate r, a higher initial surplus u is needed to guarantee a given ruin probability. This trade-off is illustrated in Figure 9.

A.1. Derivation of Equation (6)
In terms of Laplace-Stieltjes transforms, this is equivalent to

ζ_i(γ) = ϑ_i/(ϑ_i + q_i) · κ_i(γ) + ∑_{j≠i} q_ij/(ϑ_i + q_i) · κ_i(γ) · E( e^{−γ [X_i(T_{ϑ_i+q_i}) + L_ij + Z_j]^+} ),   (A.1)

where, by Proposition 1, the random variable −X_i(T_{ϑ_i+q_i}) is exponentially distributed with rate μ_i. We now present a lemma enabling us to evaluate the rightmost Laplace-Stieltjes transform.
Lemma A.1. Let A be a nonnegative random variable. Then, for γ, μ ≥ 0 with γ ≠ μ, we have

E( e^{−γ(A − T_μ)^+} ) = ( μ E(e^{−γA}) − γ E(e^{−μA}) ) / (μ − γ).

Proof. Applying the standard identity e^{−x^+} + e^{x^−} = e^{−x} + 1 to x = γ(A − T_μ), and subsequently using the memoryless property of T_μ, the resulting terms can be evaluated explicitly. The result then follows from the fact that P(A < T_μ) = E(e^{−μA}). □

Equation (6) can now immediately be obtained from (A.1) by using the preceding lemma with A = L_ij + Z_j and recalling (2).
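Conditioning on T_μ gives the explicit formula E e^{−γ(A − T_μ)^+} = (μ E e^{−γA} − γ E e^{−μA})/(μ − γ) for γ ≠ μ; a seeded Monte Carlo check, with the convenient (and purely illustrative) choice A ~ Exp(λ):

```python
import numpy as np

rng = np.random.default_rng(4)
gamma_, mu, lam = 0.8, 1.7, 1.2

A = rng.exponential(1.0 / lam, size=400_000)   # nonnegative test variable
T = rng.exponential(1.0 / mu, size=400_000)    # independent exponential clock
mc = np.exp(-gamma_ * np.maximum(A - T, 0.0)).mean()

# closed form: (mu E e^{-gamma A} - gamma E e^{-mu A}) / (mu - gamma),
# with E e^{-s A} = lam / (lam + s) for A ~ Exp(lam)
exact = (mu * lam / (lam + gamma_) - gamma_ * lam / (lam + mu)) / (mu - gamma_)
assert abs(mc - exact) < 5e-3
```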
A.2. Alternative Derivation of Equation (25)
Observe that Definition (21) is equivalent to

p⁻_ij(u) := P( X̄_i(T_{ϑ_i+q_i}) < u, X̄_i(T_{ϑ_i+q_i}) + (X_i − X̄_i)(T_{ϑ_i+q_i}) + L_ij + Z_j ≥ u ).

For k = 1, …, d, let Z_jk be independent exponentially distributed random variables with rate ν_k. An alternative formulation of (23) is that

p⁻_ij(u) = ∑_{k=1}^d c_jk P( X̄_i(T_{ϑ_i+q_i}) < u, X̄_i(T_{ϑ_i+q_i}) + (X_i − X̄_i)(T_{ϑ_i+q_i}) + L_ij + Z_jk ≥ u ).

Because (X_i − X̄_i)(T_{ϑ_i+q_i}) and L_ij are nonpositive, the memoryless property of Z_jk implies that

p⁻_ij(u) = ∑_{k=1}^d c_jk P( X̄_i(T_{ϑ_i+q_i}) < u, Z_jk + X̄_i(T_{ϑ_i+q_i}) ≥ u ) · P( Z_jk > −((X_i − X̄_i)(T_{ϑ_i+q_i}) + L_ij) ).

Because Z_jk has an exponential distribution with rate ν_k, we have the equality P(Z_jk > A) = E(e^{−ν_k A}) for any nonnegative random variable A independent of Z_jk. Combining this observation with the fact that X̄_i(T_{ϑ_i+q_i}) is exponentially distributed with rate μ_i, we then arrive at (25), where the transform of (X_i − X̄_i)(T_{ϑ_i+q_i}) is taken from Proposition 1. This completes the alternative derivation of Equation (25).

A.3. Alternative Derivation of Equation (37)
Let U_k be an Erlang random variable with k phases of rate ν, and, for n ≤ k, let V_n denote the sum of the first n phases of U_k. Equation (36) can be written as

p⁻_{i+1}(u) = P( X̄_i(T_τ) ≤ u, X̄_i(T_τ) + (X_i − X̄_i)(T_τ) + Z_{i+1} > u ),

because Z_{i+1} is a mixture of d − i − 1 Erlang random variables (see (33)). Suppose that we need to add exactly ℓ ∈ {1, …, k} exponential phases of U_k to X̄_i(T_τ) in order to exceed u, that is, u − X̄_i(T_τ) ∈ [V_{ℓ−1}, V_ℓ). The remainder of the ℓth phase is again exponential, so there are k − ℓ + 1 phases left to negate the nonpositive variable (X_i − X̄_i)(T_τ). Therefore, we can write p⁻_{i+1}(u) as the corresponding sum over ℓ.

Figure 1. Example of a MAP with Two Background States and Its Running Maximum Process

Proposition 2. Let γ ≥ 0, and suppose the background chain J(•) consists of a single (hence, recurrent) class. If X_1(•), …, X_d(•) are spectrally positive Lévy processes and the jump sizes at transition epochs L_ij are nonnegative a.s. for all i, j, then the equation det(M(γ)) = 0 has d − |S↑| solutions in C with a positive real part. If X_1(•), …, X_d(•) are spectrally negative Lévy processes and the jump sizes at transition epochs L_ij are nonpositive, then the equation det(M(−γ)) = 0 has d − |S↓| solutions in C with a positive real part.

where β_Y(α) := ∫_0^∞ e^{−αx} P(Y > x) dx and η_Y(α) := ∫_0^∞ e^{−αx} P(Y ∈ dx), the latter representing the Laplace-Stieltjes transform of Y. This relation trivially follows as an application of integration by parts.

Define d := |T| as the number of transient states and d• := |T \ S↑| as the number of transient states that do not correspond to nondecreasing subordinators. In addition, we define the d × d matrix M(γ) := (m_ij(γ))_{i,j∈T}, and we let the d-dimensional vector z(γ) := (ζ_i(γ))_{i∈T} represent the entries of z(γ) that correspond to the states in T. Likewise, b•(γ) := (b•_i(γ))_{i∈T}.

Figure 2. Example of Layer Sets with K = 4 Transient Classes

Theorem 3. Under Property (A), p_i(u) satisfies

p_i(u) = ∑_{k=1}^d c_ik e^{−ν_k u},

for i = 1, …, d. Here, for k = 1, …, d, the vectors c_k solve M(ν_k) c_k = 0 subject to C1 = 1.

5.2.1. Maximum of a Spectrally Positive Lévy Process over an Erlang-Distributed Time Interval. Recall from Theorem 1 that the Laplace-Stieltjes transform of the maximum, z(γ), satisfies the linear system M(γ) z(γ) = b(γ) for an explicitly given (d × d)-dimensional matrix M(•).

Figure 5. Density Functions f(•) of the Maximum of a Standard Brownian Motion over an n-Phase Erlang-Distributed Interval with Mean One (for n = 1, 2, 5), with the Solid Line Representing Their Counterpart for a Deterministic Interval of Length One (as n → ∞)
Figure 6.