Review of statistical actuarial risk modelling

Abstract: In this paper, we review some results for insurance risk theory. We first introduce a variety of the insurance risk models proposed thus far. Then, we show that the expected discounted penalty function (the so-called Gerber–Shiu function) can describe some risk indicators. Next, the dividend problem is discussed; more precisely, the (approximated) optimal dividend barrier is derived and other extended dividend strategies are introduced. In addition, some modified models depending on reinsurance or tax are introduced. Finally, we discuss the statistical estimation of the ruin probability and the Gerber–Shiu function.


Introduction
In the actuarial literature, the classical approach is to represent an insurance risk as a stochastic process with a certain number of control variables. The term "insurance risk" here means the risk for non-life insurance such as motor vehicle insurance, home and contents insurance, and travel insurance, rather than for life or health insurance. Once an insurance policy is taken out against the insurance risk, the insured party pays an appropriate premium to the insurer at the start of the period of insurance cover. Then, the insured party makes a claim under the insurance policy each time he or she has an insured accident. From an insurer's point of view, the surplus is generally defined by "initial surplus + premium income − claim payment".

ABOUT THE AUTHOR
Hiroshi Shiraishi received the BS degree in mathematics in 1998 and the MS and Dr degrees in mathematical science from Waseda University, Japan in 2004 and 2007, respectively. He joined the GE Edison Life Insurance Company, the Prudential Life Insurance Company of Japan and the Hannover-Re Reinsurance Company in 1998, 2000 and 2005, respectively. His research interests are actuarial science, time series analysis, econometric theory and financial engineering. In particular, he investigates the statistical estimation of optimal dividend problems in the field of actuarial science. He is currently an associate professor in the Department of Mathematics, Keio University, Japan. He is a fellow of the Institute of Actuaries of Japan (FIAJ).

PUBLIC INTEREST STATEMENT
In this paper, we review some results for insurance risk theory. Once an insurance policy is taken out against the insurance risk, the insured party pays an appropriate premium to the insurer at the start of the period of insurance cover. Then, the insured party will make a claim under the insurance policy. From an insurer's point of view, the (pure) surplus is defined by "initial surplus + (aggregated) premium income − (aggregated) claim payment". We introduce the insurance risk models proposed thus far. Classical insurance risk analysis is interested in the ruin probability, which is the probability that the surplus falls below zero. We introduce its extension, the so-called Gerber-Shiu function. In addition, we discuss some applications of insurance risk theory such as the dividend, reinsurance and tax problems. Finally, we discuss the statistical estimation of the Gerber-Shiu function.
Here, there are two sources of uncertainty for the insurer:
• How many claims will the insured party make?
• What will the amounts of those claims be?
Various extensions of the Cramér-Lundberg model have been proposed. For instance, Dufresne and Gerber (1991) added a diffusion process to the original model and Huzak and Perman (2004) included a Lévy process with mean 0 and no positive jumps. Andersen (1957) generalized the claim number process to a renewal process, which includes the Poisson process as a special case. Albrecher and Teugels (2006) considered possible dependence between the interclaim time and the claim size.
In classical insurance risk analysis, if the above surplus falls to zero or below, we say that ruin occurs, and the probability of ultimate ruin (i.e. the probability that ruin eventually occurs) is discussed. In particular, by writing the ruin probability as a function of the initial surplus (u), the effect of u on the ruin probability is analysed. Gerber (1987) argued that the ruin probability is a crude stability criterion and discussed the seriousness of the situation when ruin occurs (the so-called "severity of ruin"). Gerber and Shiu introduced the expected discounted penalty function (the so-called "Gerber-Shiu function") to treat both the deficit at ruin (which is an indicator of the severity of ruin) and the time of ruin simultaneously. The Gerber-Shiu function is described as a function of certain control variables such as the initial surplus, dividend barrier and reinsurance parameter.
De Finetti (1957) argued that it is unrealistic to minimize the ruin probability over an infinite length of time under the assumption that companies can grow their surplus without limit. He proposed an alternative formulation that avoids the infinite growth of the surplus, that is, the payment of a part of the surplus to the shareholders as a dividend (i.e. a refund). However, two fundamental questions have to be addressed:
• When should dividends be distributed?
• How much of the surplus should be distributed?
The answer to these questions is given by the dividend strategy. Avanzi (2009) reviewed a variety of dividend strategies such as the constant barrier strategy, band strategy, linear barrier strategy and non-linear barrier strategy. Bühlmann (1970), Gerber and Shiu (2004), Gerber, Shiu, and Smith (2006), and Dickson and Drekic (2006) among others derived optimal dividend barriers, which maximize the expectation of the discounted dividends under the constant barrier strategy. In addition, Gerber et al. (2006) and Dickson and Drekic (2006) discussed the dividend problem in which the expected discounted dividends net of the present value of the deficit at ruin are optimized. Further, because the explicit form of the optimal dividend barrier is obtained only in special cases, Gerber, Shiu, and Smith (2008) proposed approximation formulae for it.
In this study, we introduce related topics such as "reinsurance" and "tax". For the former, we adopt an insurer's point of view to illustrate how utility theory can be applied to determine the optimal reinsurance parameters, which correspond to the optimal retention level under both the proportional and the excess-of-loss reinsurance. For the latter, we explain the "loss-carry forward system" discussed by Albrecher and Hipp (2007) and Albrecher, Badescu, and Landriault (2008).
The remainder of this paper is organized as follows. In Section 2, we introduce a variety of the insurance risk models proposed thus far. Section 3 discusses whether the Gerber-Shiu function can describe some of the risk indicators. The dividend problem is reviewed in Section 4. We first show the expected discounted dividends under some distributions of the claim amount. Then, approximation methods for the optimal dividend barrier are shown. We next consider the effect of the deficit at ruin and, finally, extended dividend strategies are introduced. In Section 5, some modified models depending on reinsurance or tax are introduced. Finally, Section 6 discusses the statistical estimation of the ruin probability and Gerber-Shiu function.

Risk models
A variety of risk models have been introduced thus far. Here, a risk model describes the surplus at time t. The definition is as follows:
Definition 2.1 The surplus at time t is
$$U(t) = u + p(t) - S(t), \qquad (2.1)$$
where u ≥ 0 is the initial surplus of the company, p(t) is the (predictable) premium income with p(0) = 0 and p(t) > 0 for t > 0, and {S(t)} is the aggregate claim process with S(t) ≥ 0 for t > 0.
In most cases, the premium income is assumed to be received continuously at a constant rate c > 0 (i.e. p(t) = ct), which implies that the possibility of gaining or losing customers is ignored. An insurance policy is a financial agreement between the insurance company and the policyholder. The insurance company agrees to pay some benefits and the policyholder agrees to pay premiums to the insurance company to secure these benefits. Suppose that the net premium (in which the insurance company's expenses are excluded from the gross premium) for a single payment per insurance policy is denoted by P. If a constant number of new policies (denoted by L) is continuously added and the same number of existing policies is withdrawn from the portfolio of insurance policies, then the insurance company continuously receives LP (≡ c) as the premium income and the aggregated premium income is LPt (≡ ct). The insurance risk for the portfolio is maintained, because the volume of the insurance portfolio is the same over time.
Desirable properties of premium calculation principles include non-negative loading, additivity, scale invariance, consistency and no ripoff (see Dickson, 2005). From these points of view, several premium principles have been introduced, such as the expected value principle, variance principle, standard deviation principle, principle of zero utility, Esscher principle and Wan principle. In what follows, we suppose that the premium rate (c) is determined based on the expected value principle, that is, $c = (1 + \theta)E\{S(1)\}$ with a loading factor $\theta > 0$.
The initial surplus u becomes a parameter of the function of the risk indicators in many cases. For example, in classical risk theory, the (ultimate) ruin probability is considered to be a function of the initial surplus, which implies that the investor can decide the initial surplus to satisfy a desirable condition.
The aggregated claim (S(t)) expresses the total amount of claims arising from the insurance portfolio. In classical risk theory, the individual claim amounts are assumed to be independent and identically distributed (i.i.d.), while the sequence of the claim amounts ($\{X_i\}$) and the number of claims (N) are assumed to be independent. Moreover, the number of claims (N ≡ N(t)) that occur in the fixed time interval [0, t] is assumed to follow a Poisson distribution with parameter $\lambda t$ (i.e. {N(t)} is a Poisson process with intensity $\lambda > 0$), which expresses that the interclaim times are random but independent.
Example 2.1 [The Cramér-Lundberg model] The Cramér-Lundberg model is the classical model of risk theory introduced by Lundberg (2009) and Cramér (1930, 1955). Suppose that the surplus of an insurance company at time t is
$$U(t) = u + ct - S(t), \qquad (2.2)$$
where the premium income (p(t) = ct) is received continuously at a constant rate c > 0 and the aggregate claims process {S(t)} is a compound Poisson process, $S(t) = \sum_{i=1}^{N(t)} X_i$. According to Dickson (2005), "This model is, of course, a simplification of reality. Some of the more important simplifications are that we assume claims are settled in full as soon as they occur, there is no allowance for interest on the insurer's surplus, and there is no mention of expenses that an insurer would incur. Nevertheless, this is a useful model which can give us some insight into the characteristics of an insurance operation".

This model, proposed by De Finetti (1957), was later reviewed by Seal (1969), Gerber and Shiu (2004), and Avanzi (2009). Dickson and Waters (1991) discussed the compound Poisson process in discrete time. De Vylder and Goovaerts (1988), Li and Garrido (2002), Claramunt, Mármol, and Alegre (2003), and Dickson and Waters (2004) also dealt with this model.
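To make the Cramér-Lundberg model of Example 2.1 concrete, the following sketch simulates the claim arrivals of a compound Poisson process and evaluates the surplus u + ct − S(t). It is only an illustration: the exponential claim-size law, the parameter values and the function names (`simulate_claims`, `surplus`) are assumptions made here and are not part of the model definition.

```python
import random


def simulate_claims(claim_rate, mean_claim, horizon, rng):
    """Draw the claim times of a Poisson process on [0, horizon] together with
    claim sizes. Exponential claim sizes are an assumption of this sketch."""
    times, sizes = [], []
    t = rng.expovariate(claim_rate)          # exponential interclaim time
    while t <= horizon:
        times.append(t)
        sizes.append(rng.expovariate(1.0 / mean_claim))
        t += rng.expovariate(claim_rate)
    return times, sizes


def surplus(t, u, c, times, sizes):
    """U(t) = u + c*t - S(t), where S(t) sums the claims that occurred up to t."""
    aggregate_claims = sum(x for s, x in zip(times, sizes) if s <= t)
    return u + c * t - aggregate_claims


rng = random.Random(0)                       # fixed seed for reproducibility
lam, mu, theta = 2.0, 1.0, 0.2               # hypothetical parameter values
c = (1.0 + theta) * lam * mu                 # expected value principle
times, sizes = simulate_claims(lam, mu, horizon=10.0, rng=rng)
print(surplus(10.0, u=5.0, c=c, times=times, sizes=sizes))
```

Because the path is stored as claim times and sizes, the surplus can be evaluated at any t without discretizing time.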
Example 2.6 [The Sparre Andersen model] The Cramér-Lundberg model assumes that {S(t)} is a compound Poisson process, but in the Sparre Andersen model, {S(t)} is not necessarily a compound Poisson process. The idea is taken from Andersen (1957), who gave his name to the model. Li and Garrido (2004), Albrecher, Claramunt, and Mármol (2005), Albrecher, Hartinger, and Thonhauser (2007), Meng, Zhang, and Wu (2007), Albrecher and Hartinger (2006), and Yang and Zhang (2008) among others dealt with this model. Moreover, Zhang (2014) considered the Sparre Andersen model perturbed by a Brownian motion.
Example 2.7 [The dynamic control model] Højgaard (2002) considered a process with dynamic control of the premium income, in which the premium rate ($c \equiv c_t$) of the classical Cramér-Lundberg model is calculated by using the expected value principle (i.e. $c_t = (1 + \theta_t)E\{S(1)\}$). The company dynamically controls the relative safety loading ($\theta_t$), with the possibility of gaining or losing customers.
Example 2.8 [The interclaim-dependent claim size model] Albrecher and Teugels (2006), Boudreault, Cossette, Landriault, and Marceau (2006), Landriault (2008a), and Huang and Li (2011) considered an extension of the classical Cramér-Lundberg model in which a particular dependence structure between the interclaim time and the subsequent claim size is introduced. When the claim number process {N(t)} is a (homogeneous) Poisson process with the intensity parameter $\lambda > 0$, the interclaim times $\Delta_i T$ between the (i − 1)th claim and the ith claim independently follow an exponential distribution with the parameter $\lambda$. In this model, we suppose that the bivariate random vectors $(X_i, \Delta_i T)$, i ∈ ℕ, are mutually independent, whereas $X_i$ and $\Delta_i T$ are dependent.
Example 2.9 [The dual model] In the classical Cramér-Lundberg model, the surplus of a company at time t is
$$U(t) = u + ct - S(t),$$
where u is the initial surplus, c is the constant premium rate and S(t) are the aggregate claims at time t. In the dual model, the surplus at time t is
$$U(t) = u - ct + S(t),$$
where u is also the initial surplus, c is the company's expense rate and S(t) are the aggregate gains at time t. Avanzi, Gerber, and Shiu (2007) and Ng (2009), among others, studied optimal barrier strategies in this model. Avanzi and Gerber (2008) and Avanzi, Shen, and Wong (2011) also dealt with the dual model.

In addition, let $\hat{Z}$ denote the dual process of Z (i.e. the process $\{\hat{Z}(t)\}$ has the same law as {−Z(t)}). Then, the Laplace exponent of $\hat{Z}$ is given by

Gerber-Shiu function
Suppose that the surplus of an insurance company U(t) follows the Cramér-Lundberg model:
$$U(t) = u + ct - \sum_{i=1}^{N(t)} X_i,$$
where the claim sizes $X_i$ are i.i.d. random variables with common distribution F(x) and probability density f(x), and {N(t)} is a Poisson process with the intensity parameter $\lambda$.
Let T denote the time of ruin,
$$T = \inf\{t \ge 0 : U(t) < 0\}$$
(T = ∞ if ruin does not occur). Classical ruin theory is interested in the probability of ultimate ruin as a function of the initial surplus u ≥ 0,
$$\psi(u) = P(T < \infty \mid U(0) = u).$$
We impose the following assumption on the premium rate c.
Assumption 3.1 $c > \lambda E(X_i)$, where $\lambda$ is the intensity of the Poisson process {N(t)}.
This assumption ensures that {U(t)} has a positive drift and $\psi(u) < 1$ (see Gerber …). Gerber (1987) discussed not only the probability of ruin, but also the severity of the situation when ruin occurs. Let U(T) ≤ 0 denote the surplus at ruin. They introduced the following quantity:
$$P(T < \infty,\ |U(T)| < y_0 \mid U(0) = u).$$
This quantity is an indicator of the severity of ruin: it is the probability that ruin occurs and that the deficit at the time of ruin is less than $y_0$, for a given initial surplus (u). Dufresne and Gerber (1988) and Dickson (1992) also discussed the severity of ruin.
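For exponentially distributed claims with mean μ, the probability of ultimate ruin has the well-known closed form ψ(u) = (λμ/c) exp{−(1/μ − λ/c)u} when c > λμ, which makes the role of the initial surplus u and of the net profit condition easy to see. The sketch below evaluates this formula; the exponential claim assumption, the function name and the parameter values are illustrative choices made here.

```python
import math


def ruin_probability_exponential(u, c, claim_rate, mean_claim):
    """Ultimate ruin probability psi(u) in the Cramér-Lundberg model,
    assuming exponential claims with the given mean (assumption of this sketch)."""
    if c <= claim_rate * mean_claim:
        return 1.0                       # net profit condition fails: ruin is certain
    exponent = 1.0 / mean_claim - claim_rate / c
    return (claim_rate * mean_claim / c) * math.exp(-exponent * u)


# hypothetical portfolio: lambda = 1, mean claim = 1, loading 20% (c = 1.2)
for u in (0.0, 1.0, 5.0, 10.0):
    print(u, ruin_probability_exponential(u, c=1.2, claim_rate=1.0, mean_claim=1.0))
```

The printed values decrease geometrically in u, which is the sense in which the initial surplus can be chosen to meet a target ruin probability.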
Furthermore, Shiu (1997, 1999) derived the joint distribution of the time of ruin, the surplus immediately before ruin (denoted by U(T−)), and the deficit at ruin by using the Laplace transform, which leads to the introduction of the following generalized function (the Gerber-Shiu function):
$$\phi(u) = E\left[e^{-\delta T}\, w(U(T-), |U(T)|)\, I(T < \infty) \mid U(0) = u\right],$$
where $\delta \ge 0$ is a force of interest, w(x, y) is a non-negative penalty function of the surplus immediately before ruin (x) and the deficit at ruin (y), and I(·) is the indicator function.
In addition, when the above joint distribution has a density (i.e. it satisfies a differentiability condition), the (conditional) joint density function of (U(T−), |U(T)|) on the event {T < ∞} is included in the class of the Gerber-Shiu function under $w(x, y) = I(x = x_0, y = y_0)$ and $\delta = 0$.
Example 3.4 [Single premium for the ruin] If $\delta > 0$ and w is defined in terms of the benefit amount of an insurance (or reinsurance) payable at the time of ruin, then $\phi(u)$ is the single premium of this insurance.
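As an illustration of how the expected discounted penalty combines the time of ruin, the surplus before ruin and the deficit at ruin, the following sketch evaluates e^{−δT} w(U(T−), |U(T)|) on a single path and averages it over simulated paths. The finite simulation horizon, the exponential claim sizes, the convention that ruin means a strictly negative surplus, and all function names are assumptions of this sketch; it is a crude Monte Carlo approximation rather than an estimator proposed in the literature.

```python
import math
import random


def discounted_penalty(u, c, delta, w, times, sizes):
    """Return exp(-delta*T) * w(U(T-), |U(T)|) on one path of the
    Cramér-Lundberg model, or 0.0 if the listed claims do not cause ruin."""
    level, last = u, 0.0
    for t, x in zip(times, sizes):
        before = level + c * (t - last)      # surplus just before the claim
        after = before - x                   # surplus just after the claim
        if after < 0:
            return math.exp(-delta * t) * w(before, -after)
        level, last = after, t
    return 0.0


def gerber_shiu_mc(u, c, lam, mean_claim, delta, w, horizon, n_paths, seed=0):
    """Monte Carlo average over paths truncated at `horizon` (an assumption:
    ruin after the horizon is ignored, so the estimate is biased downwards)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        times, sizes = [], []
        t = rng.expovariate(lam)
        while t <= horizon:
            times.append(t)
            sizes.append(rng.expovariate(1.0 / mean_claim))
            t += rng.expovariate(lam)
        total += discounted_penalty(u, c, delta, w, times, sizes)
    return total / n_paths


# w = 1 and delta = 0 give an approximation of the ruin probability
estimate = gerber_shiu_mc(2.0, 1.2, 1.0, 1.0, 0.0, lambda x, y: 1.0,
                          horizon=50.0, n_paths=200)
print(estimate)
```

Choosing w(x, y) = y with δ > 0 instead gives the expected present value of the deficit at ruin.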

Example 3.5 [Expected present value of a deferred continuous annuity] Define
$$w(x, y) = \frac{1 - e^{-\rho y}}{\delta},$$
where $\rho$ is the positive solution to Lundberg's fundamental equation and $T_0$ (called the time of recovery) is the first time the surplus reaches zero after ruin. Then, it follows that
$$\phi(u) = E\left[\int_T^{T_0} e^{-\delta t}\,dt\; I(T < \infty) \mid U(0) = u\right].$$
In this case, the Gerber-Shiu function corresponds to the expected present value of a continuous annuity at a rate of 1 per unit time, starting at the time of ruin and ending as soon as the surplus rises to zero (see Remark (v) of Section 6 in …).

Let $A(t) = e^{U(t)}$ be the price of an asset such as a stock or a stock index at time t. Gerber and Shiu (1999) considered the pay-off function $\max(K - A(t), 0)$, where K is the exercise price of a perpetual American put option. Let $T \equiv T_b = \inf\{t \mid A(t) < e^b\}$, where b is a number satisfying $e^b \le \min(e^u, K)$. Then, the value of the option-exercise strategy $T_b$ is included in the class of the Gerber-Shiu function. The optimal value of b is then derived by maximizing this value with respect to b.

Dividend problem
Suppose that in the absence of dividends, the surplus of an insurance company at time t is defined by (Equation 2.2) for t ≥ 0 (in this section, we assume the Cramér-Lundberg model). The insurance company pays the part of the surplus that exceeds a constant barrier b (≥ u) to the shareholders as dividends. In this model, let $D_b(t)$ denote the aggregate dividends paid by time t. Then, we can write
$$D_b(t) = c\int_0^t I(U_b(s) = b)\,ds, \qquad (4.1)$$
where $U_b(t)$ is the (modified) surplus with the dividend barrier b given by
$$U_b(t) = U(t) - D_b(t). \qquad (4.2)$$
In addition, the time of ruin with dividends is given by
$$T_b = \inf\{t \ge 0 : U_b(t) < 0\}. \qquad (4.3)$$

Classical dividend problem
De Finetti (1957), Bühlmann (1970), and Gerber (1979), among others, dealt with the problem of the expectation of the discounted dividends, namely,
$$V(u, b) = E\left[\int_0^{T_b} e^{-\delta t}\,dD_b(t) \mid U(0) = u\right],$$
where $\delta > 0$ is a force of interest.
Suppose each claim amount ($X_i$) has the probability density function (f(x)). Then, the following representations are obtained (see Dickson, 2005):
$$V(u, b) = e^{-(\lambda+\delta)(b-u)/c}\, V(b, b) + \int_0^{(b-u)/c} \lambda e^{-(\lambda+\delta)t} \int_0^{u+ct} f(x)\, V(u + ct - x, b)\,dx\,dt, \qquad (4.4)$$
$$V(b, b) = \frac{c}{\lambda+\delta} + \frac{\lambda}{\lambda+\delta} \int_0^{b} f(x)\, V(b - x, b)\,dx. \qquad (4.5)$$
Equation (4.4) comes from the decomposition of the aggregate dividends as
(I) the case that the surplus attains b before the first claim occurs,
(II) the case that the first claim occurs before the surplus attains b.
In case (I), since the "future aggregated dividends immediately before the first claim occurs" correspond to V(b, b), we can obtain the first term on the right-hand side of Equation (4.4). In case (II), since the "future aggregated dividends immediately after the first claim occurs" correspond to V(s − x, b), where s = u + ct is the amount of the "surplus immediately before the first claim occurs" and x ∈ (0, s) is the amount of the first claim, we can obtain the second term on the right-hand side of Equation (4.4).
Similarly, Equation (4.5) comes from the decomposition of the aggregate dividends as
(III) the part before the first claim occurs,
(IV) the part after the first claim occurs.
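By using Equations (4.4) and (4.5), the following integro-differential equation is derived:
$$\frac{\partial}{\partial u} V(u, b) = \frac{\lambda+\delta}{c}\, V(u, b) - \frac{\lambda}{c}\int_0^u f(x)\, V(u - x, b)\,dx. \qquad (4.6)$$
The explicit form of $b_{opt}$, the barrier that maximizes V(u, b), is obtained only in special cases, such as exponentially distributed claim amounts $X_i$ and the diffusion process.

Example 4.1 [Exponential distribution (Bühlmann, 1970)] Suppose that $X_i$ follows an exponential distribution with $E(X_i) = 1/\alpha$ (i.e. $f(x) = \alpha e^{-\alpha x}$). Then, from Equation (4.6), the second-order differential equation
$$c\,\frac{\partial^2}{\partial u^2} V(u, b) + (\alpha c - \lambda - \delta)\,\frac{\partial}{\partial u} V(u, b) - \alpha\delta\, V(u, b) = 0$$
is obtained. By solving this differential equation with the boundary condition $\frac{\partial}{\partial u} V(u, b)\big|_{u=b} = 1$ from Equations (4.4) and (4.5), we have
$$V(u, b) = \frac{h(u)}{h'(b)}, \qquad h(u) = (\alpha + \rho)e^{\rho u} - (\alpha - R)e^{-Ru}, \qquad (4.9)$$
where $\rho > 0$ and $-R < 0$ are the solutions to the following quadratic equation with respect to x:
$$c x^2 + (\alpha c - \lambda - \delta)x - \alpha\delta = 0.$$
Finally, from Equation (4.9), we obtain by solving $h''(b_{opt}) = 0$
$$b_{opt} = \frac{1}{\rho + R}\,\ln\frac{R^2(\alpha - R)}{\rho^2(\alpha + \rho)}. \qquad (4.10)$$

Example 4.2 [Diffusion process (Gerber & Shiu, 2004)] Suppose that the aggregated claim process S(t) is not a compound Poisson process but a diffusion process defined by $S(t) = \sigma W(t)$, where {W(t)} is a standard Wiener process. Then, the second-order differential equation
$$\frac{\sigma^2}{2}\,\frac{\partial^2}{\partial u^2} V(u, b) + c\,\frac{\partial}{\partial u} V(u, b) - \delta\, V(u, b) = 0$$
is obtained. By solving this differential equation with the boundary conditions $V(0, b) = 0$ and $\frac{\partial}{\partial u} V(u, b)\big|_{u=b} = 1$, we have
$$V(u, b) = \frac{h(u)}{h'(b)}, \qquad h(u) = e^{\rho u} - e^{-Ru}, \qquad (4.11)$$
where $\rho > 0$ and $-R < 0$ are the solutions to the following quadratic equation with respect to x:
$$\frac{\sigma^2}{2}x^2 + c x - \delta = 0.$$
Finally, from Equation (4.11), we obtain by solving $h''(b_{opt}) = 0$
$$b_{opt} = \frac{2}{\rho + R}\,\ln\frac{R}{\rho}.$$

The factorization formula $V(u, b) = h(u)/h'(b)$ is well known [e.g. (7.5) of …] and holds for general claim distributions with the p.d.f. f(x), where h(u) is a positive solution to the following integro-differential equation:
$$c\,h'(u) - (\lambda + \delta)h(u) + \lambda\int_0^u h(u - x) f(x)\,dx = 0. \qquad (4.14)$$
Let $f^*(s)$ and $h^*(s)$ denote the Laplace transforms of f(x) and h(x), namely,
$$f^*(s) = \int_0^\infty e^{-sx} f(x)\,dx, \qquad h^*(s) = \int_0^\infty e^{-sx} h(x)\,dx.$$
Then, the Laplace transform of Equation (4.14) is (see e.g. Section 7.8 of Dickson, 2005)
$$h^*(s) = \frac{c\,h(0)}{cs - (\lambda + \delta) + \lambda f^*(s)}. \qquad (4.15)$$

Example 4.3 [Mixture of exponential distributions (Gerber et al., 2006)] Suppose that $X_i$ follows a mixture of exponential distributions, that is, its p.d.f. is a convex combination of h exponential densities. Ross (2014) and others called such a density hyperexponential. In this case, from Equation (4.15) and the Laplace transform of the mixture density, $h^*(s)$ is a rational function whose poles $r_0, r_1, \dots, r_h$ are the solutions to Lundberg's fundamental equation, that is,
$$cs - (\lambda + \delta) + \lambda f^*(s) = 0.$$
By the method of partial fractions, h(u) is then a linear combination of the exponentials $e^{r_k u}$, k = 0, 1, …, h. The optimal dividend barrier ($b_{opt}$) is obtained as the solution to $h''(b) = 0$. However, we cannot obtain the explicit form of $b_{opt}$ when h ≥ 2.

Similarly, the optimal dividend barrier ($b_{opt}$) is obtained as the solution to $h''(b) = 0$. However, we cannot obtain the explicit form of $b_{opt}$ when m ≥ 2.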
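For the exponential case of Example 4.1, the optimal barrier can be computed directly. The sketch below assumes the standard closed-form solution for exponential claims with parameter α (mean 1/α): V(u, b) = h(u)/h′(b) with h(u) = (α + ρ)e^{ρu} − (α − R)e^{−Ru}, where ρ > 0 and −R < 0 solve cx² + (αc − λ − δ)x − αδ = 0, and the barrier solves h″(b) = 0. The symbol names, function names and parameter values are assumptions of this sketch, and δ > 0 is required.

```python
import math


def barrier_roots(c, lam, alpha, delta):
    """Return (rho, R): rho > 0 and -R < 0 are the roots of
    c*x**2 + (alpha*c - lam - delta)*x - alpha*delta = 0."""
    b = alpha * c - lam - delta
    disc = math.sqrt(b * b + 4.0 * c * alpha * delta)
    return (-b + disc) / (2.0 * c), (b + disc) / (2.0 * c)


def h(u, c, lam, alpha, delta, order=0):
    """h(u) = (alpha + rho) e^{rho u} - (alpha - R) e^{-R u},
    or its derivative of the given order."""
    rho, big_r = barrier_roots(c, lam, alpha, delta)
    return ((alpha + rho) * rho ** order * math.exp(rho * u)
            - (alpha - big_r) * (-big_r) ** order * math.exp(-big_r * u))


def expected_dividends(u, b, c, lam, alpha, delta):
    """V(u, b) = h(u) / h'(b) for 0 <= u <= b."""
    return h(u, c, lam, alpha, delta) / h(b, c, lam, alpha, delta, order=1)


def optimal_barrier(c, lam, alpha, delta):
    """Solve h''(b) = 0; a non-positive solution means the barrier is set at 0."""
    rho, big_r = barrier_roots(c, lam, alpha, delta)
    ratio = big_r ** 2 * (alpha - big_r) / (rho ** 2 * (alpha + rho))
    return max(0.0, math.log(ratio) / (rho + big_r))


params = dict(c=1.5, lam=1.0, alpha=1.0, delta=0.05)   # hypothetical values
b_opt = optimal_barrier(**params)
print(b_opt, expected_dividends(1.0, b_opt, **params))
```

Since V(u, b) depends on b only through 1/h′(b), maximizing V over b amounts to minimizing h′(b), which is why the same barrier is optimal for every initial surplus u ≤ b.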
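From the factorization formula, the optimal dividend barrier ($b_{opt}$) is defined as the solution to $h''(b) = 0$; however, the explicit form cannot be obtained except in special cases. From this point of view, some asymptotic formulae for $b_{opt}$ have been considered by Højgaard (2002) and Dickson and Drekic (2006), among others.

Example 4.6 [De Vylder approximation (B)] Another approximation method is not to modify c, but to choose $\tilde\lambda$ and $\tilde\alpha$ (the claim intensity and the parameter of an exponential claim distribution) so as to match the first two moments per unit time of the process {U(t)}. In this case, the conditions are
$$\frac{\tilde\lambda}{\tilde\alpha} = \lambda E(X_i), \qquad \frac{2\tilde\lambda}{\tilde\alpha^2} = \lambda E(X_i^2).$$
Therefore, $\tilde\lambda$ and $\tilde\alpha$ are given by
$$\tilde\alpha = \frac{2E(X_i)}{E(X_i^2)}, \qquad \tilde\lambda = \frac{2\lambda\{E(X_i)\}^2}{E(X_i^2)}.$$
By using this result and Equation (4.10), we get the approximation
$$b_{opt} \approx \frac{1}{\tilde\rho + \tilde R}\,\ln\frac{\tilde R^2(\tilde\alpha - \tilde R)}{\tilde\rho^2(\tilde\alpha + \tilde\rho)},$$
where $\tilde\rho > 0$ and $-\tilde R < 0$ are the solutions to the quadratic equation
$$c x^2 + (\tilde\alpha c - \tilde\lambda - \delta)x - \tilde\alpha\delta = 0.$$

Example 4.7 [Diffusion approximation] This approximation method aims to approximate the original model by using a model with a diffusion process and the formula for the optimal dividend barrier. The idea is to replace the original model by a diffusion model whose first two moments per unit time agree with those of the original model, and then to apply the formula for the optimal dividend barrier in Example 4.2.

The deficit at ruin
Dickson and Waters (2004), Gerber et al. (2006), and Dickson (2005) considered that the expected present value of the deficit at ruin ($E[e^{-\delta T_b}|U_b(T_b)|]$) and the initial surplus (u) should be excluded from the expected present value of net income to shareholders. In that case, the objective function becomes
$$L(u, b) = V(u, b) - Y(u, b) - u, \qquad Y(u, b) = E\left[e^{-\delta T_b}|U_b(T_b)| \mid U(0) = u\right].$$
Similar to Equation (4.6), the expected present value of the deficit at ruin Y(u, b) satisfies the following integro-differential equation (see (8.34) of Dickson, 2005):
$$\frac{\partial}{\partial u} Y(u, b) = \frac{\lambda+\delta}{c}\, Y(u, b) - \frac{\lambda}{c}\int_0^u f(x)\, Y(u - x, b)\,dx - \frac{\lambda}{c}\int_u^\infty (x - u) f(x)\,dx. \qquad (4.28)$$

Definition 4.2 The optimal dividend barrier $b^L_{opt}$ is defined by
$$b^L_{opt} = \arg\max_{b} L(u, b)$$
for a constant value u ≥ 0.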
In this case, the explicit form of b L opt is not obtained even if each claim amount X i follows an exponential distribution.
Example 4.9 [Exponential distribution (Dickson, 2005; Gerber et al., 2006)] Suppose that $X_i$ follows an exponential distribution with $E(X_i) = 1/\alpha$ (i.e. $f(x) = \alpha e^{-\alpha x}$). Then, from Equation (4.28), Y(u, b) satisfies a second-order differential equation in u, whose general solution is a linear combination of $e^{\rho u}$ and $e^{-Ru}$, where $\rho > 0$ and $-R < 0$ are the same as in Example 4.1. The coefficients are determined from this result and the boundary condition $\frac{\partial}{\partial u} Y(u, b)\big|_{u=b} = 0$. Combining this with the expression for V(u, b) in Example 4.1 gives L(u, b) in closed form. The optimal dividend barrier $b^L_{opt}$ is obtained by solving the first-order condition in b, or the equation which comes from another boundary condition, $\frac{\partial^2}{\partial u^2} L(u, b)\big|_{u=b} = 0$. However, they must be solved numerically.
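Example 4.10 [Mixture of exponential distributions (Gerber et al., 2006)] Suppose that $X_i$ follows a mixture of exponential distributions, as in Example 4.3. Then the solution to the integro-differential equation for the deficit at ruin can again be written as a combination of the exponentials $e^{r_k u}$, where $r_0, r_1, \dots, r_h$ are the same as in Example 4.3, with coefficients $C_k(b)$, $D_k(b)$, k = 0, …, h, that are determined by the boundary conditions. Hence, L(u, b) is available in closed form up to these coefficients. The optimal dividend barrier $b^L_{opt}$ is obtained by solving the first-order condition in b or the equivalent boundary condition at u = b. However, they must be solved numerically.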

Other dividend strategies
Thus far, we have considered that when the surplus attains a constant level b, the premium income is paid to shareholders as dividends. By contrast, Avanzi (2009) introduced other dividend strategies as follows.
Example 4.11 [Band/(Multiple) Threshold/Constant Barrier Strategies] For a "band strategy", dividends are paid according to the region in which the surplus $U_A(t)$ is located. Given a region $A \subset \mathbb{R}_+$, while $U_A(t)$ is located in the region A, no dividends are paid. On the contrary, while $U_A(t)$ is located in the region $\mathbb{R}_+ \setminus A$, dividends are paid. The aggregate dividend under a band strategy on the region A is denoted by $D_A(t)$.
A "threshold strategy" is a special case of the band strategy. When we set h = 1, $b_0 = b$ under the band strategy, the aggregate dividend $D_b(t) \equiv D_A(t)$ is paid at a constant rate $\tilde{c}$ with $0 \le \tilde{c} \le c$ while the surplus exceeds the threshold b. Frostig (2005), Gerber et al. (2006), Lin and Pavlova (2006), Fang and Wu (2007), Wan (2007), Frostig (2008), Cheung, Dickson, and Drekic (2008), Yuen et al. (2009), and Avanzi (2009) discussed the threshold strategy.
A "constant barrier strategy" is a special case of the threshold strategy, in which the definition of $D_b(t)$ corresponds to Equation (4.1) when we set $\tilde{c} = c$ under the threshold strategy. A "multiple threshold strategy" or "multilayer strategy" is a combination of threshold strategies. When we set $a_i = b_i$ and $0 \le \tilde{c}_i \le c$ for i = 1, …, h under the band strategy, dividends are paid at the rate $\tilde{c}_i$ attached to the layer in which the surplus is located. Badescu and Landriault (2008) and Lin and Sendova (2008), among others, discussed the multilayer strategy.
Example 4.12 [Linear barrier strategies] Gerber (1974) argued that a (constant) barrier strategy does not sufficiently take into account the safety of the company, since the probability of ruin is 1 (i.e. $P(T_b < \infty) = 1$). In a linear barrier strategy with the parameters (a, b), the barrier at time t is the straight line b + at, and dividends are paid whenever the surplus reaches it. Gerber (1981), Siegl and Tichy (1999) and Albrecher, Hartinger, and Tichy (2005) also discussed the linear strategy.
Example 4.13 [Non-linear barrier strategies] These are a general family of strategies allowing for a (reflecting) barrier b(t) as a function of time. In this strategy, the aggregate dividend $D_b(t)$ is generated by reflecting the surplus at the barrier b(t). Gerber (1974), Alegre, Claramunt, and Marmol (1999), Albrecher and Kainhofer (2002), and Albrecher, Kainhofer, and Tichy (2003) discussed the non-linear strategy.

Reinsurance
Suppose that the surplus of an insurance company at time t follows the Cramér-Lundberg model defined by (Equation 2.2) for t ≥ 0. If the insurance company (i.e. insurer) effects reinsurance by paying a reinsurance premium continuously at a constant rate, then the process becomes the net-of-reinsurance surplus process $\{U^{(a)}(t)\}_{t\ge 0}$ given by
$$U^{(a)}(t) = u + c^{(a)}t - \sum_{i=1}^{N(t)} X^{(a)}_i,$$
where u is the initial surplus, $c^{(a)}$ is the insurer's net premium income per unit time, {N(t)} is a Poisson process with the intensity parameter $\lambda > 0$ and $X^{(a)}_i$ is the amount the insurer pays on the i-th claim.
Note that the insurer's net premium rate c (a) and payable amount X (a) i depend on the reinsurance scheme and reinsurance parameter a ≥ 0.
Example 5.1 [Proportional Reinsurance] Under a proportional reinsurance arrangement, the insurer pays a fixed proportion a ∈ [0, 1] of each claim that occurs during the period of the reinsurance arrangement. The remaining proportion, 1 − a, of each claim is paid by the reinsurer, which implies that $X^{(a)}_i = aX_i$. In return, the insurer has to pay the reinsurance premium calculated by the reinsurer for the ceded proportion. If the premium rates are calculated by the expected value principle (see 3.3.2 of Dickson, 2005), the premium rates of the insurer (c) and reinsurer ($c_R$) are given by
$$c = (1 + \theta)E\{S(1)\}, \qquad c_R = (1 + \theta_R)(1 - a)E\{S(1)\},$$
where E{S(1)} is the expected aggregate claim per unit time under the risk, given by $E\{S(1)\} = \lambda\mu$ with $\lambda = E\{N(1)\}$ and $\mu = E(X_i)$; $\theta$ and $\theta_R$ are the premium loading factors defined by the insurer ($\theta$) and reinsurer ($\theta_R$) with $\theta_R \ge \theta > 0$. As a consequence, the insurer's net premium rate $c^{(a)}$ is
$$c^{(a)} = c - c_R = \{(1 + \theta) - (1 + \theta_R)(1 - a)\}\lambda\mu.$$
Note that if $0 < \theta_R < \theta$, the insurance company can earn a profit without any risk by choosing a = 0 (i.e. by ceding every claim in full).
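The net premium rate under proportional reinsurance in Example 5.1 can be sketched as follows, assuming that both premiums follow the expected value principle and that the reinsurer charges its loading on the ceded share (1 − a) of the expected claims. The function name and the parameter values are illustrative assumptions.

```python
def net_premium_proportional(a, lam, mu, theta, theta_r):
    """Insurer's net premium rate when a proportion a of each claim is retained:
    (1 + theta)*lam*mu - (1 + theta_r)*(1 - a)*lam*mu."""
    if not 0.0 <= a <= 1.0:
        raise ValueError("the retained proportion a must lie in [0, 1]")
    expected_claims = lam * mu                                   # E{S(1)}
    insurer_premium = (1.0 + theta) * expected_claims
    reinsurance_premium = (1.0 + theta_r) * (1.0 - a) * expected_claims
    return insurer_premium - reinsurance_premium


# hypothetical loadings: insurer 20%, reinsurer 30%
for a in (0.0, 0.5, 1.0):
    print(a, net_premium_proportional(a, lam=1.0, mu=1.0, theta=0.2, theta_r=0.3))
```

With the reinsurer's loading above the insurer's, the net premium rate is negative at a = 0, so full cession is not viable and the retained proportion has to be large enough to keep a positive drift.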

Example 5.2 [Excess-of-loss Reinsurance]
Under an excess-of-loss reinsurance arrangement, a claim is shared between the insurer and reinsurer only if it exceeds a fixed amount called the retention level a ≥ 0. If the claim amount is smaller than a, the insurer has to pay the entire claim amount. Hence, the amount the insurer pays on the i-th claim is X (a) i = min(a, X i ) and the remaining X i − min(a, X i ) = max(X i − a, 0) is paid by the reinsurer.
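Let f(x) denote the p.d.f. of the claim amount $X_i$. Then, the reinsurance premium for the reinsurance portion $\max(X_i - a, 0)$ is written as
$$c_R = (1 + \theta_R)\lambda\int_a^\infty (x - a) f(x)\,dx.$$
As a consequence, the insurer's net premium rate $c^{(a)}$ under the expected value principle is given by
$$c^{(a)} = (1 + \theta)\lambda\mu - (1 + \theta_R)\lambda\int_a^\infty (x - a) f(x)\,dx,$$
where $\lambda$ is the claim intensity, $\mu = E(X_i)$, and $\theta$ and $\theta_R$ are the premium loading factors of the insurer and reinsurer. Note that if $0 < \theta_R < \theta$, the insurance company can earn a profit without any risk by choosing a = 0.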
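For a concrete instance of the excess-of-loss premium, the sketch below assumes exponential claims with mean μ, for which the expected ceded amount E max(X − a, 0) equals μe^{−a/μ}. The exponential assumption, the function name and the parameter values are illustrative choices, not part of the general formula.

```python
import math


def net_premium_excess_of_loss(a, lam, mu, theta, theta_r):
    """Insurer's net premium rate for retention level a under the expected
    value principle, assuming exponential claims with mean mu."""
    if a < 0.0:
        raise ValueError("the retention level a must be non-negative")
    expected_ceded = mu * math.exp(-a / mu)      # E max(X - a, 0) for exponential X
    return (1.0 + theta) * lam * mu - (1.0 + theta_r) * lam * expected_ceded


# hypothetical loadings: insurer 20%, reinsurer 30%
for a in (0.0, 1.0, 2.0, 5.0):
    print(a, net_premium_excess_of_loss(a, lam=1.0, mu=1.0, theta=0.2, theta_r=0.3))
```

As the retention level grows, the ceded premium vanishes and the net premium rate approaches the full premium rate (1 + θ)λμ.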
The optimal reinsurance parameter ($a_{opt} \equiv a_{opt}(u)$) is defined as the maximizer of an objective function Q(a):
$$a_{opt} = \arg\max_{a \ge 0} Q(a),$$
where Q(a) satisfies $Q(a) \le Q(a_{opt}) < \infty$ for any a ≥ 0.

Example 5.3 [Adjustment Coefficient] Let $M^{(a)}_X(r)$ denote the moment-generating function of the insurer's payable amount $X^{(a)}_i$, that is, $M^{(a)}_X(r) = E(e^{rX^{(a)}_i})$ for r ∈ ℝ. Then, the reinsurance adjustment coefficient (R(a)) is defined by the positive solution to
$$\lambda + c^{(a)} r = \lambda M^{(a)}_X(r),$$
i.e. R(a) satisfies
$$\lambda + c^{(a)} R(a) = \lambda M^{(a)}_X(R(a)).$$
Waters (1983) and Centeno (1986) among others discussed the optimality of the reinsurance parameter under Q(a) ≡ R(a).
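To illustrate the criterion Q(a) ≡ R(a), the sketch below assumes proportional reinsurance with exponential claims of mean μ, so that the moment-generating function of aX is 1/(1 − aμr) and the defining equation λ + c(a)r = λM(r) has the explicit positive solution R(a) = 1/(aμ) − λ/c(a) whenever c(a) > λaμ. The net premium formula, the grid search and all names are assumptions of this sketch.

```python
def adjustment_coefficient(a, lam, mu, theta, theta_r):
    """R(a) for proportional reinsurance with exponential claims (assumed),
    where c(a) = ((1 + theta) - (1 + theta_r)*(1 - a)) * lam * mu."""
    if not 0.0 < a <= 1.0:
        raise ValueError("the retained proportion a must lie in (0, 1]")
    c_a = ((1.0 + theta) - (1.0 + theta_r) * (1.0 - a)) * lam * mu
    if c_a <= lam * a * mu:
        return 0.0                   # no positive solution: net premium too low
    return 1.0 / (a * mu) - lam / c_a


def best_retention(lam, mu, theta, theta_r, steps=1000):
    """Grid search for the retained proportion that maximizes R(a)."""
    grid = [k / steps for k in range(1, steps + 1)]
    return max(grid, key=lambda a: adjustment_coefficient(a, lam, mu, theta, theta_r))


a_star = best_retention(lam=1.0, mu=1.0, theta=0.2, theta_r=0.3)   # hypothetical values
print(a_star, adjustment_coefficient(a_star, 1.0, 1.0, 0.2, 0.3))
```

A grid search is used only for transparency; any one-dimensional optimizer would serve equally well.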

Example 5.4 [Ruin Probability] Define
$$Q(a) = -\psi^{(a)}(u),$$
where $\psi^{(a)}(u)$ is the ruin probability of the net-of-reinsurance surplus process $\{U^{(a)}(t)\}$, which implies that the optimal reinsurance parameter $a_{opt}$ is the minimizer of the ruin probability given u ≥ 0. Dickson and Waters (1996) discussed the optimal proportion retained $a_{opt}$ based on the ruin probability. According to Dickson (2005), $a_{opt}$ should be the maximizer of the adjustment coefficient rather than the minimizer of the ruin probability, at least for large values. In addition, the excess-of-loss reinsurance maximizes the adjustment coefficient (and thus minimizes the probability of ruin) (see Bowers et al., 1997).
Recently, many researchers have considered the optimal reinsurance and dividend control problem of an insurance company. As for proportional reinsurance, Højgaard and Taksar (1998), Taksar (2000), David (2005), and Cao and Wan (2009) among others considered optimal investment and reinsurance strategies to maximize (or minimize) utility functions (including the ruin probability) for a diffusion risk model or extended models. As for the excess-of-loss reinsurance, Asmussen, Højgaard, and Taksar (2000), Beveridge, Dickson, and Wu (2007), Gu, Guo, Li, and Zeng (2012), Zhao, Rong, and Zhao (2013), and Liu and Hu (2014) among others discussed the optimization (i.e. the maximization or minimization of utility functions) of the excess-of-loss reinsurance and investment problems for an insurer with diffusion or jump-diffusion risk models.

Tax
Albrecher and Hipp (2007) discussed the effect of tax payments under a loss-carry forward system in the Cramér-Lundberg model without dividends. The loss-carry forward system is motivated by its realistic feature of carried forward losses (i.e. occurred losses can be deducted from later income and hence they reduce the taxable profit of the business). Under this system, the classical Cramér-Lundberg model is modified as follows.
Example 5.5 [The Cramér-Lundberg Model under the loss-carry forward system] Suppose that the surplus of an insurance company at time t is defined by (Equation 2.2) for t ≥ 0 (in this section, we assume the Cramér-Lundberg model). We assume that tax is paid at a fixed rate $\gamma \in (0, 1)$ of the insurer's premium income whenever it is in a profitable situation. Suppose that the initial surplus is u. We now consider the risk process $\{U_\gamma(t); t \ge 0\}$, which is the surplus process with tax. Before the first claim occurs, the (aggregated) premium income at time t becomes $c(1 - \gamma)t$ and the (aggregated) tax payment at time t becomes $c\gamma t$. Let $\Delta_1 T\,(\equiv T_1)$ be the time of the first claim. Then, the gains level (the surplus immediately before the first claim occurs) is set to $M_1 := u + c(1 - \gamma)\Delta_1 T$. The surpluses immediately before and after the first claim occurs are written as
$$U_\gamma(T_1-) = M_1, \qquad U_\gamma(T_1) = M_1 - X_1,$$
respectively. If $0 < U_\gamma(T_1)$ and $c\Delta_2 T > X_1$, the insurer does not have to pay any tax during $[T_1, T_1 + X_1/c)$, because the surplus is less than $M_1$. We set the time at which the surplus returns to $M_1$ as $\sigma_1$, i.e. $\sigma_1 := T_1 + X_1/c$. Then, we can see that $U_\gamma(\sigma_1) = M_1$ and, from $t = \sigma_1$ to $t = T_1 + \Delta_2 T = T_2$, the tax is paid. By denoting the terms during which the tax is paid as $W_1, W_2$, we can write $W_1 = T_1$ and $W_2 = T_2 - \sigma_1$. Then, the new gains level is set to $M_2 = M_1 + c(1 - \gamma)W_2$. Let N(t) be the underlying claims arrival process. Albrecher and Hipp (2007) formally defined, by recursion, a sequence of gains levels ($M_n$), waiting times (terms during which tax is paid) ($W_n$) and starting times of periods with profit ($\sigma_n$), with $\sigma_0 = 0$ and $M_0 = u$ for n = 1, 2, …, and expressed the surplus process $U_\gamma(t)$ at time t in terms of these quantities and $\Lambda(t) = \sup\{n \in \mathbb{N}; \sigma_n \le t\}$.

Albrecher, Badescu, and Landriault (2008) introduced the following modified dual risk model, which is reasonable to model the surplus process for pharmaceutical and petroleum companies.
Example 5.6 [The Dual Risk Model under the Loss-carry Forward System] Let $\{Y_j\}$ denote a sequence of i.i.d. random variables, where $Y_j$ corresponds to the revenue associated with the $j$-th innovation, and let $\{\Delta_j T\}$ denote a sequence of the inter-innovation times (i.i.d. random variables). Then, the sequence of innovation times $\{T_j\}$ is defined by $T_0 = 0$ and $T_j = \sum_{i=1}^{j} \Delta_i T$. Suppose that the innovation number process $\{N(t), t \ge 0\}$ is a renewal process with the inter-innovation times $\Delta_1 T, \Delta_2 T, \ldots$. The following dual risk model is thus reasonable for modelling the surplus process (without tax) of pharmaceutical and petroleum companies:
$$U(t) = u - ct + S(t), \qquad S(t) = \sum_{j=1}^{N(t)} Y_j,$$
where $u$ is the company's initial surplus, $c$ is now the constant expense rate and $\{S(t), t \ge 0\}$ is the aggregate revenue process. Let $\gamma \in (0, 1)$ be the tax rate, so that the company pays $\gamma$ times the profit as tax. In this system, the profit is defined as the excess of each new record high of the surplus over the previous one. The value of the $n$-th record high $J_n$ is defined as follows:
$$J_n = J_{n-1} + \sum_{j=k_{n-1}+1}^{k_n} (Y_j - c\Delta_j T), \qquad J_0 = u, \qquad k_n = \min\Big\{k > k_{n-1} : \sum_{j=k_{n-1}+1}^{k} (Y_j - c\Delta_j T) > 0\Big\},$$
where $k_n$, with $k_0 = 0$, is the number of innovations up to the time of the $n$-th record high. The resulting surplus with tax is given by
$$U_\gamma(t) = U(t) - \gamma \sum_{n \ge 1:\ T_{k_n} \le t} (J_n - J_{n-1}).$$
Note that $Y_j - c\Delta_j T$ is the "profit and loss" from the $(j-1)$-th innovation to the $j$-th innovation, and thus $k_n - k_{n-1}$ corresponds to the minimum number of innovations after which the "aggregated profit and loss" is positive. Under the loss-carry forward system, the loss is carried forward, which implies that while the aggregated profit and loss is negative, the company does not have to pay the tax. That is the reason why $k_n$ is calculated. In addition, $\{J_n\}$ corresponds to the surplus at the time that the $n$-th tax payment is made. Table 1 shows an example of the dual risk model under the loss-carry forward system. Albrecher, Renaud, and Zhou (2008) discussed the same problem for the general Lévy risk model case, while Albrecher, Borst, Boxma, and Resing (2009) derived a tax identity valid for arbitrary surplus-dependent tax rates.
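To make the bookkeeping of Example 5.5 concrete, the following Python sketch tracks the taxed Cramér-Lundberg surplus at claim epochs. It is only an illustration under stated assumptions: the function name `surplus_with_tax`, the argument `gamma` (the tax rate) and all numerical inputs are hypothetical, and the claim times and sizes are supplied by hand rather than simulated.

```python
def surplus_with_tax(u, c, gamma, inter_arrivals, claims):
    """Track the taxed surplus of Example 5.5 at claim epochs (illustrative sketch).

    Tax is levied at rate gamma on premium income only while the surplus
    sits at its running maximum (the current gains level).
    Returns a list of (surplus_after_claim, gains_level, cumulative_tax).
    """
    surplus, level, tax = u, u, 0.0
    path = []
    for wait, claim in zip(inter_arrivals, claims):
        # Phase 1: below the gains level, premiums accrue tax-free at rate c.
        recovery = min(wait, (level - surplus) / c)
        surplus += c * recovery
        # Phase 2: at the gains level, only c * (1 - gamma) is retained.
        taxed_time = wait - recovery
        surplus += c * (1.0 - gamma) * taxed_time
        tax += c * gamma * taxed_time
        level = max(level, surplus)
        # The claim is paid; a negative surplus means ruin, so we stop.
        surplus -= claim
        path.append((surplus, level, tax))
        if surplus < 0:
            break
    return path


if __name__ == "__main__":
    # Hypothetical inputs: u = 10, c = 2, tax rate 0.25, two claims.
    for row in surplus_with_tax(10.0, 2.0, 0.25, [1.0, 3.0], [3.0, 1.0]):
        print(row)
```

With these inputs the first gains level is 10 + 2(1 - 0.25)(1) = 11.5. After the first claim of size 3 the surplus needs 3/2 = 1.5 time units to recover, so tax is paid only during the remaining 1.5 time units before the second claim, and the second gains level is 11.5 + 2(1 - 0.25)(1.5) = 13.75.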

Statistical estimation
In this section, we discuss the statistical estimation of the ruin probability and the Gerber-Shiu function. The statistical inference for the ultimate ruin probability has been studied by many authors. Grandell (1978/1979), Csörgő and Teugels (1990), Deheuvels and Steinebach (1990), Csörgő and Steinebach (1991), and Embrechts and Mikosch (1991) discussed the statistical estimation of the adjustment coefficient, because the ruin probability is approximately expressed as a function of the adjustment coefficient through the Cramér-Lundberg approximation. By contrast, Frees (1986), Croux and Veraverbeke (1990), Bening and Korolev (2003), and Mnatsakanov et al. (2008) discussed the non-parametric estimation of the ruin probability. Shimizu (2012) proposed a non-parametric estimator of the Gerber-Shiu function when the surplus process $\{U(t)\}$ is a compound Poisson claim process plus a diffusion perturbation (called the Wiener-Poisson risk model, defined by Equation (2.4)).

Statistical estimation of the adjustment coefficient
Suppose that the surplus of an insurance company at time $t$ follows the Cramér-Lundberg model defined by Equation (2.2), where we assume that the claim size $X_i$ has the probability density $f(x)$ for $x \ge 0$. Let $\psi(u)$ denote the ruin probability defined by Equation (3.1). We introduce the adjustment coefficient $R$, which is the positive solution to
$$\lambda + cr = \lambda M_X(r)$$
with respect to $r$, where $M_X(r) = E(e^{rX}) = \int_0^\infty e^{rx} f(x)\,dx$. Hence, the adjustment coefficient satisfies
$$\lambda\,(M_X(R) - 1) - cR = 0. \qquad (6.1)$$
Here, the left-hand side of Equation (6.1) comes from the Laplace exponent of $U(1) - u$, that is,
$$\log E\big[e^{-r(U(1) - u)}\big] = \lambda\,(M_X(r) - 1) - cr,$$
which corresponds to Lundberg's fundamental equation with $\delta = 0$. Then, the Cramér-Lundberg approximation $\psi(u) \sim C e^{-Ru}$ as $u \to \infty$, for a constant $C > 0$, is derived (see Minkova, 2010). A natural problem is then to consider the estimation of the adjustment coefficient $R$. Grandell (1978/1979) proposed the following estimator for $R$.

Definition 6.1 Suppose that $X_1, \ldots, X_{N(T)}$, $N(T)$ and $T$ are observed, where $X_i$ is the $i$-th claim amount, $T$ is the ruin time and $N(T)$ is the claim frequency up to the ruin. Then, an estimator $\hat R_T$ of $R$ is defined as the solution to
$$\hat M_X(r) - \frac{cr}{\hat\lambda} = 1$$
with respect to $r$, where $\hat M_X(r) = \frac{1}{N(T)} \sum_{i=1}^{N(T)} e^{rX_i}$ and $\hat\lambda = N(T)/T$.
Furthermore, Grandell (1978/1979) showed its asymptotic normality as follows:

Theorem 6.1 If $E(X_i) - c/\lambda < 0$ and $\lim_{r \to \infty} M_X(r) - cr/\lambda = \infty$ are satisfied, then
$$\sqrt{T}\,(\hat R_T - R) \xrightarrow{d} N\!\left(0, \frac{y(2R) - 1}{\lambda\,\{y'(R)\}^2}\right) \quad \text{as } T \to \infty,$$
where $y(r) = M_X(r) - cr/\lambda$.

Csörgő and Teugels (1990) derived an estimator of $R$ based on a sample of size $n$. Deheuvels and Steinebach (1990) and Csörgő and Steinebach (1991) derived an estimator of $R$ by using intermediate order statistics. Embrechts and Mikosch (1991) discussed a bootstrap procedure to estimate the adjustment coefficient $R$. Furthermore, Shimizu (2009) considered the estimation of the adjustment coefficient $R$ when the underlying process consists of a Brownian motion plus negative jumps, where each jump size of the process does not necessarily correspond to a claim size.
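The estimating equation behind Definition 6.1 can be explored numerically. The Python sketch below is a minimal illustration, not Grandell's own procedure: it assumes the estimating equation takes the form $\hat M_X(r) - cr/\hat\lambda = 1$ with $\hat\lambda = N(T)/T$ (equivalently $\frac{1}{T}\sum_i (e^{rX_i} - 1) = cr$), treats $T$ simply as the length of an observation window, uses exponential claims, and solves for the positive root by bisection. The function names and parameter values are hypothetical.

```python
import math
import random


def simulate_claims(lam, mean_claim, horizon, rng):
    """Exponential claim sizes arriving as a Poisson(lam) stream on [0, horizon]."""
    claims = []
    t = rng.expovariate(lam)
    while t <= horizon:
        claims.append(rng.expovariate(1.0 / mean_claim))
        t += rng.expovariate(lam)
    return claims


def estimate_adjustment_coefficient(claims, horizon, c, iterations=100):
    """Positive root of g(r) = (1/T) * sum(exp(r * X_i) - 1) - c * r (assumed form)."""
    if not claims or sum(claims) / horizon >= c:
        raise ValueError("empirical net profit condition fails; no positive root")

    def g(r):
        return sum(math.expm1(r * x) for x in claims) / horizon - c * r

    # g is convex with g(0) = 0 and g'(0) < 0: negative on (0, root), positive beyond.
    hi = 1.0 / max(claims)
    while g(hi) <= 0.0:
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):  # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(12345)
    lam, mean_claim, c, horizon = 1.0, 1.0, 1.5, 5000.0  # hypothetical values
    claims = simulate_claims(lam, mean_claim, horizon, rng)
    print("estimate:", estimate_adjustment_coefficient(claims, horizon, c))
    # For exponential claims the true value is 1/mu - lam/c.
    print("true R  :", 1.0 / mean_claim - lam / c)
```

For exponential claims with mean $\mu$ the adjustment coefficient is available in closed form, $R = 1/\mu - \lambda/c$, which gives a convenient reference value for the estimate.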

Non-parametric estimation of the ruin probability
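Suppose that the surplus $U(t)$ is defined by Equation (2.2), where the claim size $X_i$ has the mean $E(X_i) = \mu$, the distribution function $F(x)$, and the p.d.f. $f(x)$. Let $\psi(u)$ denote the ruin probability defined by Equation (3.1). Define $\bar\psi(u) = 1 - \psi(u)$ as the non-ruin probability. Then, $\bar\psi$ satisfies the following integro-differential equation (see (7.6) of Dickson, 2005):
$$\frac{d}{du}\bar\psi(u) = \frac{\lambda}{c}\,\bar\psi(u) - \frac{\lambda}{c}\int_0^u f(x)\,\bar\psi(u - x)\,dx. \qquad (6.2)$$
Integrating Equation (6.2) from 0 to $t$ leads to (see e.g. Minkova, 2010)
$$\bar\psi(t) = \bar\psi(0) + \frac{\lambda}{c}\int_0^t \bar\psi(t - x)\,(1 - F(x))\,dx. \qquad (6.3)$$
Let $\bar\psi^*(s)$, $\bar\psi^{**}(s)$ denote the Laplace transforms
$$\bar\psi^*(s) = \int_0^\infty e^{-su}\,\bar\psi(u)\,du, \qquad \bar\psi^{**}(s) = \int_{[0,\infty)} e^{-su}\,d\bar\psi(u).$$
Taking the Laplace transform of both sides of Equation (6.3) gives
$$\bar\psi^*(s) = \frac{\bar\psi(0)}{s} + \frac{\lambda}{cs}\,(1 - f^*(s))\,\bar\psi^*(s),$$
where $f^*(s)$ is the Laplace transform of $f(x)$. Since it follows that $\bar\psi^{**}(s) = s\bar\psi^*(s)$, we have
$$\bar\psi^{**}(s) = \frac{\bar\psi(0)}{1 - \frac{\lambda}{cs}\,(1 - f^*(s))}. \qquad (6.4)$$
Using $\bar\psi(0) = 1 - \lambda\mu/c$, the inverse Laplace transform of Equation (6.4) leads to the following formula (the so-called "Pollaczek-Khinchin formula"):
$$\bar\psi(u) = \Big(1 - \frac{\lambda\mu}{c}\Big)\sum_{r=0}^{\infty}\Big(\frac{\lambda\mu}{c}\Big)^r G^{(r)}(u), \qquad (6.5)$$
where $G(u) = \frac{1}{\mu}\int_0^u (1 - F(y))\,dy$ and $G^{(r)}$ denotes the $r$-fold convolution of the distribution function $G$ with itself (with $G^{(0)}(u) = 1$ for $u \ge 0$). The following formula is equivalent to Equation (6.5):
$$\psi(u) = \Big(1 - \frac{\lambda\mu}{c}\Big)\sum_{r=1}^{\infty}\Big(\frac{\lambda\mu}{c}\Big)^r \big(1 - G^{(r)}(u)\big).$$
Assuming that $\lambda$ and $\mu$ are known, Croux and Veraverbeke (1990) proposed its estimator as
$$\hat\psi_n(u) = \Big(1 - \frac{\lambda\mu}{c}\Big)\sum_{r=1}^{m_n}\Big(\frac{\lambda\mu}{c}\Big)^r \big(1 - U_{nr}(u)\big), \qquad (6.6)$$
where $\{m_n\}$ is a sequence of natural numbers that depends on the sample size $n$ and satisfies $m_n \le n$ and $\lim_{n \to \infty} m_n = +\infty$, and $U_{nr}(u)$ is a U-statistic given as follows.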
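Definition 6.2 Suppose that $X_1, \ldots, X_n$ are observed, where $X_i$ is the $i$-th claim amount. Then, the estimator $U_{nr}(u)$ of $G^{(r)}(u)$ for fixed $r$, $u$ is defined by
$$U_{nr}(u) = \binom{n}{r}^{-1} \sum_{1 \le i_1 < \cdots < i_r \le n} h_r(X_{i_1}, \ldots, X_{i_r}; u),$$
where $\sum_{1 \le i_1 < \cdots < i_r \le n}$ is the sum over all $\binom{n}{r}$ distinct combinations of $\{i_1, i_2, \ldots, i_r\}$ satisfying $1 \le i_1 < \cdots < i_r \le n$, and $h_r(x_1, \ldots, x_r; u)$ is the symmetric kernel
$$h_r(x_1, \ldots, x_r; u) = \frac{1}{\mu^r} \int \cdots \int_{y_1 + \cdots + y_r \le u,\ y_j \ge 0} \prod_{j=1}^{r} I\{x_j > y_j\}\,dy_1 \cdots dy_r.$$

Note that for $r = 1, 2, \ldots$, we can interpret $G^{(r)}(u) = P(Y_1 + \cdots + Y_r \le u)$, where $Y_1, \ldots, Y_r$ are independent with a common density function $(1/\mu)(1 - F(y))$ for $y \ge 0$. Hence, we can write
$$G^{(r)}(u) = \frac{1}{\mu^r} \int \cdots \int_{y_1 + \cdots + y_r \le u,\ y_j \ge 0} \prod_{j=1}^{r} (1 - F(y_j))\,dy_1 \cdots dy_r.$$
Therefore, we can confirm that $U_{nr}(u)$ is an unbiased estimator of $G^{(r)}(u)$, since
$$E[U_{nr}(u)] = E[h_r(X_1, \ldots, X_r; u)] = \frac{1}{\mu^r} \int \cdots \int_{y_1 + \cdots + y_r \le u,\ y_j \ge 0} \prod_{j=1}^{r} P(X_j > y_j)\,dy_1 \cdots dy_r = G^{(r)}(u).$$
Frees (1986) showed the consistency of linear combinations of U-statistics of this kind. Furthermore, Croux and Veraverbeke (1990) showed the asymptotic normality of $\hat\psi_n(u)$ as follows.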
Theorem 6.2 If $\log n = o(m_n)$ as $n \to \infty$, then for each $u > 0$,
$$\sqrt{n}\,\big(\hat\psi_n(u) - \psi(u)\big) \xrightarrow{d} N\big(0, \sigma^2(u)\big),$$
where the asymptotic variance $\sigma^2(u)$ depends on the claim size distribution $F$.

Bening and Korolev (2003) mentioned that the above results can rarely be applied in practice since
• The Croux-Veraverbeke estimator (Equation (6.6)) is constructed from a sample with a non-random size.
• The asymptotic normality of the proposed estimator cannot be directly used to construct (asymptotic) confidence intervals, since the limit distribution of the estimator (more precisely, its variance) depends on the unknown distribution of claims.
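As a numerical aside on the Pollaczek-Khinchin formula underlying these estimators, the formula can be read as $\psi(u) = P(Y_1 + \cdots + Y_K > u)$, where $K$ is geometric with $P(K = r) = (1 - \rho)\rho^r$ and the $Y_i$ have the integrated-tail distribution $G$. The Python sketch below checks this by Monte Carlo. It is a minimal sketch under stated assumptions: claims are exponential (so that $G$ is again exponential and a closed form exists), `rho` is shorthand introduced only here for $\lambda\mu/c$, and the function names and parameter values are hypothetical.

```python
import math
import random


def ruin_probability_pk(u, lam, c, mean_claim, n_paths, rng):
    """Monte Carlo value of psi(u) for exponential claims via the geometric-sum form."""
    rho = lam * mean_claim / c  # rho = lambda * mu / c, shorthand of this sketch
    if not 0.0 < rho < 1.0:
        raise ValueError("need 0 < lambda * mu / c < 1")
    ruined = 0
    for _ in range(n_paths):
        total = 0.0
        # K is geometric: another term is added with probability rho.
        while rng.random() < rho:
            # For exponential claims the integrated-tail law G is again exponential.
            total += rng.expovariate(1.0 / mean_claim)
            if total > u:  # the partial sums only increase, so we can stop early
                ruined += 1
                break
    return ruined / n_paths


def ruin_probability_exact(u, lam, c, mean_claim):
    """Closed form for exponential claims: rho * exp(-(1/mu - lam/c) * u)."""
    rho = lam * mean_claim / c
    return rho * math.exp(-(1.0 / mean_claim - lam / c) * u)


if __name__ == "__main__":
    rng = random.Random(2024)
    print(ruin_probability_pk(2.0, 1.0, 1.5, 1.0, 20000, rng))
    print(ruin_probability_exact(2.0, 1.0, 1.5, 1.0))
```

The non-parametric estimators discussed in this subsection replace the unknown convolutions $G^{(r)}$ by U-statistics computed from the observed claims; the simulation above instead samples from a known $G$, so it only illustrates the structure of the formula.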
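To address the two problems raised by Bening and Korolev (2003), they considered the following model. Let $N_1(t)$, $t \ge 0$, be a standard Poisson process, and let $\Lambda(t)$, $t \ge 0$, be a random process independent of $N_1(t)$ and possessing non-decreasing, almost surely finite, right-continuous sample paths starting from the origin. A Cox process, also called a doubly stochastic Poisson process, guided by the process $\Lambda(t)$ is defined as
$$N(t) = N_1(\Lambda(t)), \qquad t \ge 0.$$

Consider the (new) risk process in which the claims arrive according to this Cox process $N(t)$.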
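Under this model, we suppose that the claims $X_1, \ldots, X_{N(t)}$ are observed up to time $t$. Then, an estimator $\hat\psi_{N(t)}(u)$ for $\psi(u)$ is introduced. It has the same form as the Croux-Veraverbeke estimator, with the sample size $n$ replaced by the random number of claims $N(t)$ and with the truncation point given by $m(N(t))$, where $m(n)$ is an integer with $m(n) \le n$ and $\lim_{n \to \infty} m(n) = +\infty$, and $U_{N(t),r}(u)$ is the same as $U_{nr}(u)$ when $N(t) = n$. Moreover, a consistent estimator of the asymptotic variance is introduced; it involves a further sequence $k(n)$ of integers with $k(n) \le n$ and $\lim_{n \to \infty} k(n) = +\infty$. Then, Bening and Korolev (2003) showed the asymptotic normality of $\hat\psi_{N(t)}(u)$ as follows.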
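Theorem 6.3 Suppose that $\log n = o(m(n))$, together with a companion condition on the truncation sequences, is satisfied as $n \to \infty$. Assume $\Lambda(t) \xrightarrow{p} \infty$ as $t \to \infty$. Then, the suitably centred and normalized estimator $\hat\psi_{N(t)}(u)$ is asymptotically normal as $t \to \infty$.

By contrast, Mnatsakanov et al. (2008) discussed the estimation of the ruin probability by using the inverse of the Laplace transform. In the same way as for Equation (4.15), the Laplace transform of Equation (6.2) is
$$s\,\bar\psi^*(s) - \bar\psi(0) = \frac{\lambda}{c}\,(1 - f^*(s))\,\bar\psi^*(s),$$
where $\bar\psi^*(s)$, $f^*(s)$ are the Laplace transforms of $\bar\psi$, $f$, respectively. By noting that $\bar\psi(0) = 1 - \lambda\mu/c \le \bar\psi(u) \le 1$ for $u \ge 0$, we can write
$$\bar\psi^*(s) = \frac{1 - \lambda\mu/c}{D(s)}, \qquad D(s) = s - \frac{\lambda}{c}\,(1 - f^*(s)), \qquad \psi^*(s) = \frac{1}{s} - \bar\psi^*(s),$$
where $\psi^*(s)$ is the Laplace transform of $\psi$. Assuming that $\lambda\mu/c$ is known, Mnatsakanov et al. (2008) proposed its estimator as follows.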
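Definition 6.3 Suppose that $X_1, \ldots, X_n$ are observed, where $X_i$ is the $i$-th claim amount. Then, the estimator $\hat\psi^*(s)$ of $\psi^*(s)$ for a fixed $s$ is defined by a plug-in rule, in which the unknown quantities (in particular the Laplace transform $f^*(s)$) are replaced by their empirical counterparts computed from $X_1, \ldots, X_n$.

Although $\hat\psi^*$ is close to $\psi^*$, there is no guarantee that the inverse Laplace transform $\mathcal{L}^{-1}\hat\psi^*$ will be close to $\mathcal{L}^{-1}\psi^* \equiv \psi$, because $\mathcal{L}^{-1}$ is not continuous. Hence, Mnatsakanov et al. (2008) considered slightly modifying the functions $\hat\psi^*$ and $\psi^*$: for an arbitrary fixed $\varepsilon > 0$, modified functions $\hat\psi^*_\varepsilon$ and $\psi^*_\varepsilon$ are defined. Then, $\hat\psi^*_\varepsilon(s), \psi^*_\varepsilon(s) \in L^2(0, \infty)$ a.s. are satisfied, because the non-continuity of $\mathcal{L}^{-1}\hat\psi^*$ and $\mathcal{L}^{-1}\psi^*$ is due to the behaviour of $D(s)$ and $\hat D(s)$ near 0. For $\hat\psi^*_\varepsilon(s)$ and $\psi^*_\varepsilon(s)$, an inverse of the Laplace transform depending on a fixed positive parameter is introduced. As a result, the following estimators are introduced.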

Definition 6.4
Then, consistency in the ISE (integrated squared error) sense is shown: the error between the estimators and $\psi$, measured in the norm $\|\cdot\|_B$, vanishes in the limit. Here, $\|\cdot\|_B$ is the norm in $L^2(0, B)$ for an arbitrary fixed $0 < B < \infty$, which implies that the comparison between $\hat\psi$ ($\tilde\psi$) and $\psi$ is carried out in $L^2(0, B)$ by restricting the functions to $[0, B]$. Shimizu (2012) proposed an estimator of the Gerber-Shiu function for the Wiener-Poisson risk model (corresponding to the model in Example (2.5)) in a similar way to Mnatsakanov et al. (2008), and showed the consistency in the ISE sense.
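The plug-in step of the Laplace-transform approach can be sketched numerically. The Python example below is only an illustration in the spirit of this approach and is not claimed to reproduce the exact estimator of Mnatsakanov et al. (2008): it assumes the Cramér-Lundberg relation $\psi^*(s) = 1/s - (1 - \rho)/D(s)$ with $D(s) = s - (\rho/\mu)(1 - f^*(s))$, takes $\rho = \lambda\mu/c$ as the known constant, replaces $f^*$ by the empirical Laplace transform and $\mu$ by the sample mean, and omits the inversion step entirely. All function names and parameter values are hypothetical.

```python
import math
import random


def empirical_laplace(claims, s):
    """Empirical Laplace transform (1/n) * sum(exp(-s * X_i))."""
    return sum(math.exp(-s * x) for x in claims) / len(claims)


def ruin_laplace_plugin(claims, s, rho):
    """Plug-in value of psi*(s) = 1/s - (1 - rho) / D(s) (assumed form, s > 0)."""
    mean_hat = sum(claims) / len(claims)  # sample mean replaces mu
    # D(s) = s - (rho / mu) * (1 - f*(s)), with f* replaced by its empirical version.
    d_hat = s - (rho / mean_hat) * (1.0 - empirical_laplace(claims, s))
    return 1.0 / s - (1.0 - rho) / d_hat


if __name__ == "__main__":
    rng = random.Random(7)
    mean_claim, rho, s = 1.0, 0.5, 1.0  # hypothetical values
    claims = [rng.expovariate(1.0 / mean_claim) for _ in range(20000)]
    print("plug-in:", ruin_laplace_plugin(claims, s, rho))
    # For exponential claims psi(u) = rho * exp(-R u) with R = (1 - rho) / mu,
    # so the exact transform is rho / (s + R).
    print("exact  :", rho / (s + (1.0 - rho) / mean_claim))
```

The comparison is made at the level of Laplace transforms only. Recovering $\psi$ itself requires inverting the transform, which is the delicate step addressed by the modification described above.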

Statistical estimation of the Gerber-Shiu function
The Laplace transform of the Gerber-Shiu function for the Wiener-Poisson risk process (Equation 2.4) was obtained by Tsai and Willmot (2002), Morales (2007), and Biffis and Morales (2010). The following result was given by Morales (2007).
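Theorem 6.5 Suppose that the penalty function $w$ is uniformly bounded, or that it satisfies $|w(x, y) - w_0| \le k(x + y)^{\alpha}$ for universal constants $k > 0$ and $\alpha > 1$. Then, the Laplace transform $\phi^*(s)$ of the Gerber-Shiu function is expressed explicitly in terms of $f^*$ and $K^*$, where $f$ is a p.d.f. of $X_i$, $f^*$ and $K^*$ are the Laplace transforms of $f$ and $K$, $D = \sigma^2/2$, the function $K$ is as given in Morales (2007), and $\rho$ is the unique non-negative solution to the following Lundberg's fundamental equation with respect to $s$:
$$cs - \lambda\,(1 - f^*(s)) + Ds^2 - \delta = 0.$$
In particular, $\rho = 0$ when $\delta = 0$.

Shimizu (2012) proposed its estimator as follows:

Definition 6.5 Suppose that $\{U(t) \mid t \equiv t_i^n = i\Delta_n,\ i = 0, 1, 2, \ldots, n\}$ for some $\Delta_n > 0$ and the claim data $X_1, \ldots, X_{N(T_n)}$, where $T_n = n\Delta_n$, are observed. Then, the estimator $\hat\phi^*(s)$ of $\phi^*(s)$ for a fixed $s$ is defined by substituting estimators $\hat\lambda$, $\hat f^*$, $\hat D$ and $\hat\rho$ for the unknown quantities in Theorem 6.5, where $\hat\lambda$ and $\hat f^*$ are estimators of $\lambda$ and $f^*$ based on the claim data and
$$\hat D = \frac{1}{2T_n} \sum_{i=1}^{n} \big(U(t_i^n) - U(t_{i-1}^n)\big)^2\, I\{\tau_j \notin (t_{i-1}^n, t_i^n];\ j = 1, \ldots, N(T_n)\}.$$
Here, $\tau_j$ is the $j$-th claim occurring time and $\hat\rho$ is defined as the following M-estimator:
$$\hat\rho = \arg\min_{s \in I} |\hat\ell(s)|, \qquad \hat\ell(s) = cs - \hat\lambda\,(1 - \hat f^*(s)) + \hat D s^2 - \delta,$$
where $I \subset (0, \infty)$ is a compact set in which $\rho$ exists, and we set $\hat\rho = 0$ for $\delta = 0$.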
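The root of Lundberg's fundamental equation that enters the estimator can be computed numerically. The Python sketch below is a stand-in rather than the M-estimator itself: instead of minimizing the absolute value over a compact set, it uses plain bisection on the function $cs - \lambda(1 - f^*(s)) + Ds^2 - \delta$, which is increasing on $s \ge 0$ under the net profit condition $c > \lambda\mu$. The function name `lundberg_root`, the argument `laplace_f` (any callable returning $f^*(s)$, for instance an empirical Laplace transform) and the parameter values are hypothetical.

```python
import math


def lundberg_root(c, lam, diffusion, delta, laplace_f, iterations=200):
    """Non-negative root of l(s) = c*s - lam*(1 - f*(s)) + D*s**2 - delta.

    Assumes c > lam * (mean claim), so that l is increasing on s >= 0.
    """
    if delta < 0.0:
        raise ValueError("delta must be non-negative")
    if delta == 0.0:
        return 0.0  # convention stated in the text: the root is 0 when delta = 0

    def ell(s):
        return c * s - lam * (1.0 - laplace_f(s)) + diffusion * s * s - delta

    # l(0) = -delta < 0 and l(s) >= c*s - lam - delta, so doubling finds a sign change.
    hi = 1.0
    while ell(hi) < 0.0:
        hi *= 2.0
    lo = 0.0
    for _ in range(iterations):  # plain bisection on [lo, hi]
        mid = 0.5 * (lo + hi)
        if ell(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Hypothetical example: exponential claims with mean 1, so f*(s) = 1 / (1 + s).
    def exp_lt(s):
        return 1.0 / (1.0 + s)

    print(lundberg_root(1.5, 1.0, 0.5, 0.1, exp_lt))
    # Without diffusion the equation reduces to 1.5 s^2 + 0.4 s - 0.1 = 0.
    print(lundberg_root(1.5, 1.0, 0.0, 0.1, exp_lt), (-0.4 + math.sqrt(0.76)) / 3.0)
```

In the estimation setting, `laplace_f`, `lam` and `diffusion` would be replaced by the corresponding estimates computed from the observed data.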