Modeling and Performance of Bonus-Malus Systems: Stationarity versus Age-Correction

In a bonus-malus system in car insurance, the bonus class of a customer is updated from one year to the next as a function of the current class and the number of claims in the year (assumed Poisson). Thus the sequence of classes of a customer in consecutive years forms a Markov chain, and most of the literature measures performance of the system in terms of the stationary characteristics of this Markov chain. However, the rate of convergence to stationarity may be slow in comparison to the typical sojourn time of a customer in the portfolio. We suggest an age-correction to the stationary distribution and present an extensive numerical study of its effects. An important feature of the modeling is a Bayesian view, where the Poisson rate according to which claims are generated for a customer is the outcome of a random variable specific to the customer.


Introduction
In the classical actuarial model for bonus-malus systems in automobile insurance (Denuit et al. [8] or Lemaire [14], for example), there is a finite set of bonus classes $\ell = 1, \ldots, K$. A customer having $n$ claims and bonus class $\ell$ in a given year has bonus class $b(\ell, n)$ in the next, for some deterministic function $b$ (the bonus rule; claim sizes are ignored and only claim numbers counted). The customer has a risk parameter $\lambda$ such that the numbers of claims $N_0, N_1, \ldots$ in consecutive years are i.i.d. Poisson($\lambda$), and so the sequence $L_0, L_1, \ldots$ of bonus classes is a time-homogeneous Markov chain with transition matrix $P(\lambda) = (p_{k\ell}(\lambda))_{k,\ell=1,\ldots,K}$, where
$$p_{k\ell}(\lambda) = \sum_{n=0}^{\infty} e^{-\lambda}\,\frac{\lambda^n}{n!}\, I\bigl(b(k, n) = \ell\bigr)$$
(we call such a customer a λ-customer). Customers pay premium $r_\ell$ when in class $\ell$ and enter the system in some fixed class $\ell_0$. The $r_\ell$ and $\ell_0$ may be chosen according to certain optimality and/or financial equilibrium principles (see below) or arbitrarily.
For a simple example of a bonus rule, consider the −1/+2 rule. Here each claim causes the bonus level to increase by 2, whereas it decreases by 1 for each claim-free year (obvious boundary modifications apply to levels $1$, $K − 1$, $K$). The systems in use are often substantially more detailed, with $K$ of order 15-25.
Much of the discussion in the literature employs stationarity modeling, measuring characteristics of the system via the stationary distribution $\pi(\lambda) = (\pi_\ell(\lambda))_{\ell=1,\ldots,K}$ (existing under the weak assumptions of irreducibility and aperiodicity). In particular, the average premium $r(\lambda)$ of a customer with risk parameter $\lambda$ is defined by
$$r(\lambda) = \sum_{\ell=1}^{K} \pi_\ell(\lambda)\, r_\ell = \lim_{n\to\infty} \sum_{\ell=1}^{K} p^n_{\ell_0\ell}(\lambda)\, r_\ell, \tag{1.1}$$
where the $p^n_{k\ell}(\lambda)$ are the $n$-step transition probabilities. From the $r(\lambda)$, one often proceeds to calculate the Loimaranta efficiency $e(\lambda) = \mathrm{d}\log r(\lambda)/\mathrm{d}\log\lambda$ at $\lambda$ (called elasticity outside the actuarial sciences); it measures to what extent $r(\lambda)$ is linear at $\lambda$ (as should ideally be the case), with $e(\lambda) = 1$ expressing 'local linearity'.
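The construction of $P(\lambda)$ and $\pi(\lambda)$ can be sketched numerically for the −1/+2 rule. The following is a minimal illustration and not code from the paper: the function names, the choice $K = 10$, the truncation of the Poisson sum at 60 terms, the plain power iteration, and the reading of the "obvious boundary modifications" as clipping to the range $1, \ldots, K$ are all assumptions made here.

```python
import math

def next_class(k, n, K):
    """-1/+2 rule on classes 1..K; clipping at the boundaries is an assumption."""
    return max(1, k - 1) if n == 0 else min(K, k + 2 * n)

def transition_matrix(lam, K, n_max=60):
    """P[k-1][l-1] = sum_n e^{-lam} lam^n / n! * 1{b(k, n) = l}."""
    P = [[0.0] * K for _ in range(K)]
    for k in range(1, K + 1):
        for n in range(n_max):
            p_n = math.exp(-lam) * lam**n / math.factorial(n)
            P[k - 1][next_class(k, n, K) - 1] += p_n
        # Leftover Poisson tail mass (n >= n_max) leads to the top class K.
        P[k - 1][K - 1] += 1.0 - sum(P[k - 1])
    return P

def stationary(P, iters=5000):
    """Approximate pi = pi P by power iteration from the uniform distribution."""
    K = len(P)
    pi = [1.0 / K] * K
    for _ in range(iters):
        pi = [sum(pi[k] * P[k][l] for k in range(K)) for l in range(K)]
    return pi

if __name__ == "__main__":
    P = transition_matrix(lam=0.1, K=10)
    pi = stationary(P)
    print([round(x, 4) for x in pi])
```

Power iteration is used only for brevity; for larger systems one would more likely solve the linear system $\pi = \pi P$ directly.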
Such stationary performance measures are only meaningful if the Markov chain $L$ attains (approximate) stationarity within the typical time a customer spends in the portfolio. For this reason much attention has been given to studying the approach of the $p^n_{\ell_0\ell}(\lambda)$ to $\pi_\ell(\lambda)$. The rate of convergence is known to be geometric, with decay parameter the second largest eigenvalue of the transition matrix $P(\lambda)$. However, this is an asymptotic result, and so the studies are most often numerical, depicting for example the mean annual premium $E_\lambda r_{L_n}$ or the total variation (t.v.) distance
$$d_{TV}\bigl(p^n_{\ell_0\cdot}(\lambda), \pi(\lambda)\bigr) = \sum_{\ell=1}^{K} \bigl|p^n_{\ell_0\ell}(\lambda) - \pi_\ell(\lambda)\bigr| \tag{1.2}$$
as a function of $n$ (see, for example, Denuit et al. [8, p. 183ff]). The results are sometimes encouraging: for some bonus systems, $d_{TV}\bigl(p^n_{\ell_0\cdot}(\lambda), \pi(\lambda)\bigr) = 0$ already for $n$ = 4-6. However, these are typically simple-minded systems, and for the more realistic ones, one often sees a substantial value of the t.v. distance for, say, $n = 30$, a value of $n$ exceeding the time span a customer can be expected to stay in the portfolio. Nevertheless, studies of the effects of the sojourn time $A$ in the portfolio being finite are remarkably few, with Borgan et al. [5] being the main exception. One purpose of this paper is to go deeper in this direction and to formulate an alternative (which we call age-correction) to the stationarity point of view.
Bonus-malus systems may be seen as an example of experience rating, which aims to calculate the premium on an individual basis by using the information available to the company. In the automobile insurance setting, we ignore in this paper profit, administration costs etc., and take the average claim size equal to 1, so that for a given year $m$, the company would want to compute its net premium $c_m$ as its best guess of the customer's $\lambda$, as a function of the numbers $N_0, N_1, \ldots, N_{m-1}$ of claims filed in years $0, 1, \ldots, m − 1$.
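The kind of numerical convergence study described above can be sketched as follows. This is a toy illustration under stated assumptions, not a reproduction of any cited study: it uses the −1/+2 rule with boundary clipping, $K = 10$, initial class $\ell_0 = K$, and a long power iteration as a proxy for $\pi(\lambda)$; all function names are hypothetical.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 1..K (boundary clipping is an assumption)."""
    P = [[0.0] * K for _ in range(K)]
    p0 = math.exp(-lam)
    for k in range(1, K + 1):
        P[k - 1][max(1, k - 1) - 1] += p0        # claim-free year: down one class
        left, n = 1.0 - p0, 1
        while k + 2 * n < K:                      # n claims: up 2n classes
            p_n = p0 * lam**n / math.factorial(n)
            P[k - 1][k + 2 * n - 1] += p_n
            left -= p_n
            n += 1
        P[k - 1][K - 1] += left                   # remaining mass ends in class K
    return P

def step(dist, P):
    """One year of the chain: row vector times transition matrix."""
    K = len(P)
    return [sum(dist[k] * P[k][l] for k in range(K)) for l in range(K)]

def tv_path(P, start, n_max, burn_in=5000):
    """Distances (1.2) between the n-step distribution from `start` and pi, n = 0..n_max."""
    K = len(P)
    pi = [1.0 / K] * K
    for _ in range(burn_in):                      # long run as a proxy for pi
        pi = step(pi, P)
    dist = [1.0 if l == start - 1 else 0.0 for l in range(K)]
    path = []
    for _ in range(n_max + 1):
        path.append(sum(abs(d - p) for d, p in zip(dist, pi)))
        dist = step(dist, P)
    return path

if __name__ == "__main__":
    for lam in (0.04, 0.08, 0.16, 0.32):
        path = tv_path(transition_matrix(lam, K=10), start=10, n_max=30)
        print(lam, [round(path[n], 3) for n in (0, 5, 10, 20, 30)])
```

Note that (1.2) as written is the $\ell_1$ distance without a factor 1/2, so its maximal value is 2.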
The naive guess is of course the average $\hat\lambda_m = (N_0 + N_1 + \cdots + N_{m-1})/m$. However, a high value of $\hat\lambda_m$ could be due to bad luck of an otherwise good driver, and a low value to good luck of an otherwise bad driver. An estimate which is more fair to the customer is therefore obtained by a Bayesian view, where one involves information on the population of customers in the form of a prior distribution, say $U$, of $\lambda$, views the particular customer's $\lambda$ as the outcome of a r.v. $\Lambda$ with distribution $U$, calculates the posterior distribution $U_m$, and takes $c_m$ as the mean of $U_m$, the Bayes premium.
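For example, if $U$ is a Gamma distribution with shape parameter $b$ and rate parameter $\mu$, the posterior $U_m$ is again a Gamma distribution, with shape parameter $b + N_0 + \cdots + N_{m-1}$ and rate parameter $\mu + m$, and one gets the Bayes premium as the posterior mean
$$c_m = \frac{b + N_0 + \cdots + N_{m-1}}{\mu + m} = \frac{\mu}{\mu + m}\cdot\frac{b}{\mu} + \frac{m}{\mu + m}\,\hat\lambda_m .$$
This has a neat interpretation as a weighted average of the population mean $b/\mu$ (the premium the company will charge without access to claims statistics) and the mean $\hat\lambda_m$ of the claims, with the weight $m/(\mu + m)$ of $\hat\lambda_m$ increasing to 1 as years go by and the information on the customer accumulates.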
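The weighted-average interpretation can be checked with a few lines of code. The sketch below assumes a Gamma prior with shape $b$ and rate $\mu$ (which matches the population mean $b/\mu$ and the weight $m/(\mu + m)$ quoted above); the function names and the claim history are hypothetical.

```python
def bayes_premium(claims, b, mu):
    """Posterior mean of lambda under a Gamma(shape=b, rate=mu) prior and Poisson claims."""
    return (b + sum(claims)) / (mu + len(claims))

def credibility_form(claims, b, mu):
    """Same premium written as a weighted average of prior mean and claim average."""
    m = len(claims)
    if m == 0:
        return b / mu
    weight = m / (mu + m)
    return (1.0 - weight) * (b / mu) + weight * (sum(claims) / m)

if __name__ == "__main__":
    history = [0, 1, 0, 0]          # hypothetical claim counts N_0, ..., N_3
    print(bayes_premium(history, b=1.0, mu=10.0))
    print(credibility_form(history, b=1.0, mu=10.0))
```

With $b = 1$ and $\mu = 10$ the prior is exponential with mean 0.1, and one claim in four years moves the premium from 0.1 to $2/14 \approx 0.143$.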
The Bayes premium enjoys the optimality property of minimizing the quadratic loss $E(c_m − \Lambda)^2$ in the class of all functions $c_m$ of $N_0, N_1, \ldots, N_{m-1}$, the natural class of predictors of $\Lambda$ using information on $N_0, N_1, \ldots, N_{m-1}$. In fact, it is standard that the solution to this minimization problem is $c_m = E[\Lambda \mid N_0, N_1, \ldots, N_{m-1}]$. For these facts and the general theory of Bayes premiums, see Bühlmann [6], Bühlmann & Gisler [7] and Denuit et al. [8, Ch. 3].¹
¹ These references have as their main theme not the Bayes premium but rather the credibility premium, also called the linear Bayes premium, computed as the minimizer of $E(a_m − \Lambda)^2$ in the class of all linear functions $a_m$ of $N_0, N_1, \ldots, N_{m-1}$; for the Gamma example and many others, the Bayes premium and the linear Bayes premium coincide. The motivation for considering linear predictors only is computational ease.
Given the above optimality property, the Bayes premium can be viewed as the optimal fair choice of the insurer's premium. Presumably, the reason that bonus-malus systems are used instead in practice is that they are easier to understand for the average customer, who would not know about prior and posterior distributions. Nevertheless, as noted by Norberg [19], the Bayesian view is highly relevant also in bonus-malus systems but for a different purpose, to compute the premiums $r_1, \ldots, r_K$ in the different bonus levels. To this end, the idea of basing the premium on the bonus level means that one chooses the minimizer of $E(c − \Lambda)^2$ in the class of all functions $c$ of the bonus level. One then needs to specify what is meant by this level, and to avoid the dependency on $m$, the choice of [19] and much subsequent literature is a r.v. $L_\infty$ distributed as $\pi(\Lambda)$ (recall that $\pi(\lambda)$ is the stationary distribution of the Markov chain $L_0, L_1, \ldots$ when the customer's Poisson parameter is $\lambda$). Using the general $L^2$-theory quoted above, the minimizer is $c = E[\Lambda \mid L_\infty]$, so that the optimal premium $r_\ell$ in class $\ell$ is $E[\Lambda \mid L_\infty = \ell]$. Evaluating the conditional expectation, we get
$$r_\ell = \frac{\int_0^\infty \lambda\, \pi_\ell(\lambda)\, U(\mathrm{d}\lambda)}{\int_0^\infty \pi_\ell(\lambda)\, U(\mathrm{d}\lambda)}, \tag{1.3}$$
which gives the premium rule of the bonus system that is optimal from the point of view of minimizing the error in predicting a customer's $\Lambda$.
The rule (1.3) enjoys in a certain sense the principle of financial equilibrium (for the company), which asserts that on average, premium incomes and payments of claims should balance. Namely, the expected claims in a year of a typical customer are $E\Lambda$ and his expected premiums are $E r_{L_\infty}$, assuming that a typical customer's bonus class is distributed as the stationary r.v. $L_\infty$. By the tower property of conditional expectations, these expressions coincide. However, the point we take in this paper is that this assumption is questionable and needs further discussion.
[16]). For simplicity, we will assume that the monetary scale is chosen such that the mean claim size is one.
Remark 1.2. Car insurance in the real world involves several features not included in the above discussion. Examples are discounts for female drivers, penalties for young drivers, and premiums adjusted according to further covariates such as mileage, engine size of the car (or even color!), geographical region etc.
Remark 1.3. Regulations vary greatly from country to country. At one extreme, all insurance companies are obliged to use the same bonus-malus system; at the other, they have complete freedom. The general tendency has been towards deregulation. A detailed survey of the situation in Europe as of year 2000 concerning such rules (as well as the use of covariates as in Remark 1.2) is in Meyer [18]. Of course, much has changed since then but still, [18] will serve to give an impression of many practical issues connected with motor insurance.
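The rule (1.3) and its financial equilibrium property can be illustrated numerically. The sketch below is a toy calculation, not the paper's computation: it assumes the −1/+2 rule with boundary clipping and $K = 6$, replaces the exponential prior with mean 0.1 by a midpoint grid on $[0, 0.8]$, and approximates $\pi(\lambda)$ by power iteration; all names are hypothetical.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 1..K (boundary clipping is an assumption)."""
    P = [[0.0] * K for _ in range(K)]
    p0 = math.exp(-lam)
    for k in range(1, K + 1):
        P[k - 1][max(1, k - 1) - 1] += p0
        left, n = 1.0 - p0, 1
        while k + 2 * n < K:
            p_n = p0 * lam**n / math.factorial(n)
            P[k - 1][k + 2 * n - 1] += p_n
            left -= p_n
            n += 1
        P[k - 1][K - 1] += left
    return P

def stationary(P, iters=2000):
    """Approximate stationary distribution by power iteration."""
    K = len(P)
    pi = [1.0 / K] * K
    for _ in range(iters):
        pi = [sum(pi[k] * P[k][l] for k in range(K)) for l in range(K)]
    return pi

def bayes_relativities(K, grid, weights):
    """Discretised (1.3): r_l = sum_i w_i lam_i pi_l(lam_i) / sum_i w_i pi_l(lam_i).

    Also returns q_l = sum_i w_i pi_l(lam_i), the distribution of L_infinity.
    """
    num, q = [0.0] * K, [0.0] * K
    for lam, w in zip(grid, weights):
        pi = stationary(transition_matrix(lam, K))
        for l in range(K):
            num[l] += w * lam * pi[l]
            q[l] += w * pi[l]
    return [a / b for a, b in zip(num, q)], q

if __name__ == "__main__":
    h = 0.02
    grid = [h * (i + 0.5) for i in range(40)]            # midpoint grid (assumption)
    raw = [math.exp(-10.0 * lam) for lam in grid]        # exponential prior, mean 0.1
    weights = [x / sum(raw) for x in raw]
    r, q = bayes_relativities(6, grid, weights)
    print("relativities:", [round(x, 4) for x in r])
    print("E r_L:", sum(a * b for a, b in zip(r, q)),
          "E Lambda:", sum(w * lam for w, lam in zip(weights, grid)))
```

The two printed expectations agree up to rounding, which is the tower property argument in discretised form.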
The paper is organized as follows. In Section 2, we introduce our age-correction approach and give some of its simple properties. Sections 3-6 then contain an extensive numerical study of its behavior in concrete cases and how it compares to the traditional stationarity-based approach. A concluding discussion, including more careful references to the literature, is in Section 7, and the Appendix contains some complements as well as an outline of a more general modeling approach via Markov chains.
Motivated by the criticism of the traditional use of stationarity, we now assume that a customer stays in the portfolio only for a finite number of years $A$. I.e., he is in the portfolio in years $0, 1, \ldots, A − 1$ after entering. We further assume that $A$ is independent of his $\Lambda$ and his claim sequence $N_0, N_1, \ldots$ and thereby his sequence of bonus levels, and that $\mu_A = EA < \infty$. The distribution of $A$ is denoted by $F$, the point probabilities by $f_a = P(A = a)$, and we write $f^e_a = (f_{a+1} + f_{a+2} + \cdots)/\mu_A$. Much of the discussion of Section 1 remains relevant, except that for each value $\lambda$ of the Poisson parameter we need to replace the stationary distribution $\pi(\lambda)$ of the bonus level by the distribution $\pi^*(\lambda)$ of the typical bonus level $L^*$.
We then need to specify what is meant by the 'typical bonus level' of a λ-customer, and our suggestion is to define this as a r.v. $L^*$ with distribution
$$\pi^*_\ell(\lambda) = P_\lambda(L^* = \ell) = \frac{1}{\mu_A}\, E_\lambda \sum_{n=0}^{A-1} I(L_n = \ell) = \sum_{n=0}^{\infty} f^e_n\, p^n_{\ell_0\ell}(\lambda), \tag{2.1}$$
denoted the age-corrected distribution in the rest of the paper. Expression (2.1) is fundamental for the paper and may be approached in various ways. We choose here the set-up in the following Theorem 2.1, where the interpretation is as a limiting long-term average of bonus classes of λ-customers seen by the company (for an alternative, see Appendix B). Before stating the Theorem, we need some notation and assumptions. Let $M_y$ denote the number of λ-customers in the portfolio in year $y = 0, \pm 1, \pm 2, \ldots$, $E_y$ the number of λ-customers entering the portfolio in year $y$, $L_{y,c}$, $c = 1, \ldots, M_y$, the bonus classes of the λ-customers, and $T_{y,c} \in \{0, 1, \ldots\}$ the time since they entered the portfolio. Assume that the $E_y$ are i.i.d. with finite mean $\mu_E$ and independent of the $M_{y'}, L_{y',1}, \ldots, L_{y',M_{y'}}$ with $y' < y$, and that the $A, \Lambda, N_0, N_1, \ldots$ for different customers are i.i.d. and independent of the $E_y$.
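Theorem 2.1. Under the above assumptions,
$$\frac{\sum_{y=0}^{Y} \sum_{c=1}^{M_y} I(T_{y,c} = a)}{\sum_{y=0}^{Y} M_y} \to f^e_a, \tag{2.2}$$
$$\frac{\sum_{y=0}^{Y} \sum_{c=1}^{M_y} I(L_{y,c} = \ell)}{\sum_{y=0}^{Y} M_y} \to \pi^*_\ell(\lambda) \tag{2.3}$$
a.s. as $Y \to \infty$ for all $a$ and $\ell$, no matter initial conditions.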
Proof. We can write
$$\sum_{y=0}^{Y} M_y = C' + C'' + C''',$$
where $C'$ is the time-in-portfolio-before-$Y$ of λ-customers that first entered the portfolio at some time $y < 0$, $C''$ the similar time of those that entered at some time $y = 0, \ldots, Y$ and left before $Y + 1$, and $C'''$ the time of those that entered at some time $y = 0, \ldots, Y$ but still remain in the portfolio at time $Y + 1$ (and possibly after).
Here we can bound $C'''$ by $\widetilde C'''$, the total-time-in-portfolio (not necessarily before $Y$) of λ-customers that first entered at some time $y = 0, \ldots, Y$ but remained in the portfolio after $Y$. By the law of large numbers, the total time in portfolio of all λ-customers entering in years $0, \ldots, Y$ is of order $Y \mu_E \mu_A$, whereas $C'$ and $\widetilde C'''$ are of smaller order. Combining these facts gives
$$\frac{1}{Y} \sum_{y=0}^{Y} M_y \to \mu_E\, \mu_A \quad \text{a.s.} \tag{2.4}$$
A similar argument shows that the total time the λ-customers counted in $C''$ or $C'''$ ever spend in bonus class $\ell$ (not necessarily before $Y$) is of order $Y \mu_E \mu_A \pi^*_\ell(\lambda)$, and that the $C''$ contribution to the l.h.s. of (2.3) dominates the $C'$ and $C'''$ contributions. Combining with (2.4) gives (2.3). The proof of (2.2) is similar, though slightly easier.
Remark 2.1. The analysis in the proof of Theorem 2.1 is similar to that of a discrete time G/G/∞ queue. $M_y$ then plays the role of the queue length at time $y$, and $T_{y,c}$ that of the elapsed service time of customer $(y, c)$. We will not use this connection and hence leave out further details.
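It may be noted that expression (2.1) may be evaluated in closed analytical form. To this end, we need the fundamental matrix $Z(\lambda)$ of the Markov chain, given by
$$Z(\lambda) = \bigl(I − P(\lambda) + \mathbf{1}\pi(\lambda)\bigr)^{-1}.$$
We further let $f[z] = \sum_{n=0}^{\infty} f_n z^n$ denote the generating function of $F$ and use the same notation $f[C] = \sum_{n=0}^{\infty} f_n C^n$ for a square matrix $C$. Then, with $e_\ell$ the $\ell$th (row) unit vector:
Theorem 2.2.
$$\pi^*(\lambda) = \pi(\lambda) + \frac{1}{\mu_A}\, e_{\ell_0}\, Z(\lambda)\bigl(I − f[P(\lambda)]\bigr).$$
For the proof, see the Appendix.

Numerical set-up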
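As a numerical sanity check, the age-corrected distribution can be computed both as the weighted sum $\sum_n f^e_n e_{\ell_0} P(\lambda)^n$ and through the fundamental matrix $Z = (I − P + \mathbf{1}\pi)^{-1}$, using the identity $I + \cdots + P^{n-1} = Z(I − P^n) + n\mathbf{1}\pi$ from the Appendix. The sketch below is illustrative only: the −1/+2 toy rule with clipping, $K = 8$, the uniform sojourn time on $\{1, \ldots, 12\}$, the small Gauss-Jordan solver and all function names are assumptions, not the authors' implementation.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 1..K (boundary clipping is an assumption)."""
    P = [[0.0] * K for _ in range(K)]
    p0 = math.exp(-lam)
    for k in range(1, K + 1):
        P[k - 1][max(1, k - 1) - 1] += p0
        left, n = 1.0 - p0, 1
        while k + 2 * n < K:
            p_n = p0 * lam**n / math.factorial(n)
            P[k - 1][k + 2 * n - 1] += p_n
            left -= p_n
            n += 1
        P[k - 1][K - 1] += left
    return P

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[piv] = M[piv], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * p for a, p in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def left_solve(M, r):
    """Solve x M = r for the row vector x (via the transposed system)."""
    K = len(M)
    return solve([[M[i][j] for i in range(K)] for j in range(K)], r)

def vec_mat(v, P):
    return [sum(v[k] * P[k][l] for k in range(len(P))) for l in range(len(P))]

def age_corrected_direct(P, start, f):
    """sum_n f^e_n e_start P^n with f^e_n = P(A > n) / E[A]; f[a] = P(A = a), f[0] = 0."""
    mu_A = sum(a * p for a, p in enumerate(f))
    v = [1.0 if l == start - 1 else 0.0 for l in range(len(P))]
    out, tail = [0.0] * len(P), 1.0
    for n in range(len(f)):
        tail -= f[n]                              # tail = P(A > n)
        out = [o + tail / mu_A * x for o, x in zip(out, v)]
        v = vec_mat(v, P)
    return out

def age_corrected_closed(P, start, f):
    """pi + (1 / E[A]) e_start Z (I - f[P]) with Z = (I - P + 1 pi)^{-1}."""
    K = len(P)
    mu_A = sum(a * p for a, p in enumerate(f))
    eye = [[1.0 if i == j else 0.0 for j in range(K)] for i in range(K)]
    # pi solves pi (I - P + J) = (1, ..., 1), J being the all-ones matrix.
    pi = left_solve([[eye[i][j] - P[i][j] + 1.0 for j in range(K)] for i in range(K)],
                    [1.0] * K)
    # x = e_start Z solves x (I - P + 1 pi) = e_start.
    x = left_solve([[eye[i][j] - P[i][j] + pi[j] for j in range(K)] for i in range(K)],
                   eye[start - 1])
    xf, v = [0.0] * K, x[:]
    for n in range(len(f)):                       # x f[P] = sum_n f_n x P^n
        xf = [a + f[n] * b for a, b in zip(xf, v)]
        v = vec_mat(v, P)
    return [pi[l] + (x[l] - xf[l]) / mu_A for l in range(K)]

if __name__ == "__main__":
    f = [0.0] + [1.0 / 12] * 12                   # uniform on {1, ..., 12}
    P = transition_matrix(0.1, 8)
    print([round(p, 4) for p in age_corrected_direct(P, 8, f)])
    print([round(p, 4) for p in age_corrected_closed(P, 8, f)])
```

For a sojourn time with finite support the two evaluations agree up to rounding error; for distributions with infinite support the direct sum needs truncation, whereas the closed form only needs $f[P(\lambda)]$.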

The bonus systems
We have selected three rather different systems for our numerical studies. Our source has been the survey in Meyer [18] treating the situation in most European countries (as well as Japan and the US) around 1999. Important characteristics of a bonus system are the number $K$ of classes and the spread factor, defined as the ratio between the highest premium $r_K$ and the lowest $r_1$. We chose systems from three countries, Ireland, Italy and Germany. Ireland has a small number $K = 6$ of classes and a low spread factor of 2, Germany has a high number $K = 29$ of classes and a high spread factor of 8.2, whereas Italy is intermediate with $K = 18$ classes and spread factor 4. More details on the various systems are given below. It should be noted that each system may only have been one among several in the particular country when [18] was published and that much may have changed since then. However, our point is not to analyze systems that are necessarily in current use but rather that our examples both show diversity and are typical of many other systems. The premiums $r_\ell$, often called relativities, are traditionally given in percent of the premium in the initial class $\ell_0$ or some other reference class, and when presenting the three systems, we follow that tradition (as does Meyer [18]). However, later we shall renormalize to get financial equilibrium.

Trial sojourn time distributions
We selected four trial distributions for the distribution of the sojourn time $A$ of a customer in the portfolio. Two are negative binomial with point probabilities
$$f_n = \binom{n+1}{2} (1 − \rho)^3 \rho^{\,n-1}, \quad n = 1, 2, \ldots \tag{3.1}$$
Here $\rho$ was chosen to make the means $1 + 3\rho/(1 − \rho)$ equal to 7 and 13, and the two distributions are denoted $NB(7)$, resp. $NB(13)$. The two others, denoted $U(6.5)$ and $U(12.5)$, were taken as uniform distributions on $\{1, 2, \ldots, 12\}$, resp. $\{1, 2, \ldots, 24\}$, i.e. with roughly the same means. The distributions and their equilibrium distributions given by (2.2) are illustrated in Figure 1.
In the numerical calculations, the distributions were truncated at n = 50, except for Figure 14 where the truncation point was n = 2000.
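The trial distributions and their equilibrium distributions $f^e_a = (f_{a+1} + f_{a+2} + \cdots)/\mu_A$ are easy to tabulate. In the sketch below, the negative binomial is taken as shifted to start at 1 with three "successes", which is an assumption inferred from the stated mean $1 + 3\rho/(1 − \rho)$; renormalising after truncation is a further assumption, and the function names are hypothetical.

```python
import math

def uniform_pmf(n_max):
    """f[a] = P(A = a) for A uniform on {1, ..., n_max}; f[0] = 0."""
    return [0.0] + [1.0 / n_max] * n_max

def neg_binomial_pmf(mean, trunc=50):
    """Shifted negative binomial (assumed form), truncated at `trunc` and renormalised."""
    rho = (mean - 1.0) / (mean + 2.0)             # solves 1 + 3 rho / (1 - rho) = mean
    f = [0.0] + [math.comb(n + 1, 2) * (1 - rho) ** 3 * rho ** (n - 1)
                 for n in range(1, trunc + 1)]
    total = sum(f)
    return [x / total for x in f]

def mean(f):
    return sum(a * p for a, p in enumerate(f))

def equilibrium(f):
    """f^e_a = P(A > a) / E[A] for a = 0, 1, ..., len(f) - 2."""
    mu_A = mean(f)
    tail, out = sum(f), []
    for a in range(len(f) - 1):
        tail -= f[a]                              # tail = P(A > a)
        out.append(tail / mu_A)
    return out

if __name__ == "__main__":
    for name, f in [("U(6.5)", uniform_pmf(12)), ("U(12.5)", uniform_pmf(24)),
                    ("NB(7)", neg_binomial_pmf(7.0)), ("NB(13)", neg_binomial_pmf(13.0))]:
        fe = equilibrium(f)
        print(name, "mean", round(mean(f), 3), "f^e_0", round(fe[0], 4))
```

With truncation at $n = 50$ the negative binomial means come out slightly below 7 and 13; a longer truncation removes the discrepancy.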

Bayesian assumptions
We have taken the distribution U of the customers' λ parameter to be exponential with mean 0.1. The value 0.1 is from Lemaire & Zi [13, p. 288] who argue this to be typical in many countries. The exponential assumption is from Bichsel [3], who fitted a gamma distribution to data and found the shape parameter to be close to 1.
Motivated by this assumption, we have in many of the illustrations selected four values of $\lambda$, 0.04, 0.08, 0.16 and 0.32, i.e. two below the population mean and two above.

Ireland
The Irish system is very simple with K = 6 classes and transition rules as in Table 1.
The initial class is $\ell_0 = 6$. The convergence speed to the stationary distribution is illustrated in two figures. The first, Figure 2, shows the shape of the transient λ-distributions of $L_n$ for four selected values $n = 5, 10, 15, \infty$ of $n$ ($n = \infty$ corresponds to the stationary distribution) and the four selected values 0.04, 0.08, 0.16, 0.32 of $\lambda$, and the next, Figure 3, plots the t.v. distance (1.2) to the stationary distribution. The shape of these figures may be understood from the transition rules. Consider for example a customer with $\lambda = 0.04$. Here most of the mass of $\pi$ is concentrated in class 1, but class 1 can at the earliest be reached in year 5. This explains the steep drop in Figure 3 in the t.v. distance between years 4 and 5. When looking at the bar plots in Figure 2 for the distribution of his class in different years, consider for example year 5 and note that w.p. $e^{-0.2} = 0.82$ he will have no claims in the first 5 years, so 0.82 is precisely the mass at class 1. W.p. $0.2e^{-0.2} = 0.16$ he will have exactly one claim. If this happens in year 0, his sequence of states in years 1, 2, 3, 4, 5 is 6, 5, 4, 3, 2. Similar sequences arise for a claim in year 1, 2, 3, resp. 4. Since any of the years 0, 1, 2, 3, 4 is equally likely for the claim, this explains that class 4 is more likely than classes 2, 3, which is of course not the case for a good customer in stationarity ($n = \infty$). The probability of two or more claims, giving mass in states 5, 6, is just 0.02 and hence negligible. Similar remarks apply to other values of $\lambda$ and $n$ as well as the parallel figures for Italy and Germany to follow, but we shall not give the details.
The figures show the fastest convergence rate among our three selected systems, and also that the rate does not depend crucially on the value of $\lambda$. The explanation could be related to the simplicity of the Irish system. Our next figure, Figure 4, plots the age-corrected distribution $\pi^*(\lambda)$ for our four selected values of $\lambda$ (one in each column) and our four trial distributions, together with the stationary distribution $\pi(\lambda)$. It is seen that the agreement within columns is relatively good, with the most marked differences for small values of $\lambda$. The explanation could be the relatively fast convergence rate in the Irish system.
Finally, the age-corrected distributions $\pi^*(\lambda)$ are plotted in Figure 7 together with the stationary distribution $\pi(\lambda)$. One sees a markedly worse agreement within columns than for Ireland. The most marked differences occur for small values of $\lambda$, with one feature being a considerable concentration of the $\pi^*(\lambda)$ (but not of $\pi(\lambda)$) close to the initial class 14. Again, the most natural explanation is the slow convergence rate.
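The probabilities quoted for the customer with $\lambda = 0.04$ follow from the number of claims over five years being Poisson with mean $5 \times 0.04 = 0.2$. A short check (variable names are illustrative):

```python
import math

lam, years = 0.04, 5
total_rate = lam * years                          # claims over 5 years are Poisson(0.2)
p_none = math.exp(-total_rate)                    # no claim: class 1 in year 5
p_one = total_rate * math.exp(-total_rate)        # exactly one claim
p_two_plus = 1.0 - p_none - p_one                 # two or more claims
print(round(p_none, 2), round(p_one, 2), round(p_two_plus, 2))   # 0.82 0.16 0.02
```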

Germany
The German system is rather elaborate. It has a large number of classes, $K = 29$, initial class 26, and quite detailed rules for the new class after one or more claims. For example, after one claim the customer moves up 14 classes when in class 1, always to class 17 when in classes 6-11, and up 3 classes when in classes 19-22. The rules for some selected cases are given in Table 3; for full details, see Meyer [18] or Mahmoudvand et al. [17].
Table 3 (selected classes; premium in percent, new class by number of claims in the year)
Class   Premium   0 claims   1 claim   2 claims   3 claims   ≥4 claims
29      245       25         29        29         29         29
25      100       24         26        29         29         29
20      55        19         23        26         27         29
15      40        14         21        25         27         29
10      35        9          17        24         26         29
5       30        4          16        22         24         29
1       30        1          15        22         24         29
A quite special feature of the German system is the very high initial class, 26, meaning that a customer can at the earliest reach the lowest premium level in class 1 after 25 years! This clearly shows up in the following Figures 8, 9, 10, for example in the $\lambda = 0.04$ row in Figure 9, where the t.v. distance from the stationary distribution is substantial up to time $n = 25$, and in the comparisons of age-corrected distributions in Figure 10, which shows the same phenomenon as for Italy, a strong concentration of the $\pi^*(\lambda)$ (but not of $\pi(\lambda)$) close to the initial class 26.

Relativities
We now turn to the influence of finite customer sojourn times on the Bayes premium, proceeding as follows. For each of the three selected bonus systems and of the four trial sojourn time distributions, we first compute our age-corrected alternatives $\pi^*_\ell(\lambda)$ to the stationary distribution $\pi_\ell(\lambda)$ by means of (2.1) and next the Bayes premium $r^*_\ell$ in bonus class $\ell$ by means of the analogue
$$r^*_\ell = \frac{\int_0^\infty \lambda\, \pi^*_\ell(\lambda)\, U(\mathrm{d}\lambda)}{\int_0^\infty \pi^*_\ell(\lambda)\, U(\mathrm{d}\lambda)}$$
of (1.3). The results are in the following three Figures 11, 12, 13. The legends are solid red for distribution $NB(7)$, dotted red for $NB(13)$, solid blue for $U(6.5)$, and dotted blue for $U(12.5)$. As a supplement we also compute the Bayes premium corresponding to the stationary distribution $\pi_\ell(\lambda)$ (dotted black) and the premium corresponding to the given relativities for the bonus system (e.g. 50, 60, 70, 80, 90, 100 for Ireland) in solid black; whereas the Bayes premium automatically yields financial equilibrium, cf. the discussion following (1.3), we here need to normalize to satisfy this requirement.
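The procedure just described can be sketched in a few dozen lines. The code below is a toy version under stated assumptions and does not reproduce Figures 11-13: the Irish transition rules are not used, but instead a −1/+2 rule with clipping on $K = 6$ classes; the exponential prior with mean 0.1 is replaced by a midpoint grid; the given relativities are normalised against the age-corrected class distribution; all names are hypothetical.

```python
import math

def transition_matrix(lam, K):
    """-1/+2 rule on classes 1..K (a stand-in for the real transition rules)."""
    P = [[0.0] * K for _ in range(K)]
    p0 = math.exp(-lam)
    for k in range(1, K + 1):
        P[k - 1][max(1, k - 1) - 1] += p0
        left, n = 1.0 - p0, 1
        while k + 2 * n < K:
            p_n = p0 * lam**n / math.factorial(n)
            P[k - 1][k + 2 * n - 1] += p_n
            left -= p_n
            n += 1
        P[k - 1][K - 1] += left
    return P

def age_corrected(P, start, f):
    """pi*(lam) = sum_n f^e_n e_start P^n, with f[a] = P(A = a) and f[0] = 0."""
    K = len(P)
    mu_A = sum(a * p for a, p in enumerate(f))
    v = [1.0 if l == start - 1 else 0.0 for l in range(K)]
    out, tail = [0.0] * K, 1.0
    for n in range(len(f)):
        tail -= f[n]                              # tail = P(A > n)
        out = [o + tail / mu_A * x for o, x in zip(out, v)]
        v = [sum(v[k] * P[k][l] for k in range(K)) for l in range(K)]
    return out

def age_corrected_relativities(K, start, f, grid, weights):
    """Discretised analogue of (1.3) with pi* in place of pi; also returns P(L* = l)."""
    num, q = [0.0] * K, [0.0] * K
    for lam, w in zip(grid, weights):
        pi_star = age_corrected(transition_matrix(lam, K), start, f)
        for l in range(K):
            num[l] += w * lam * pi_star[l]
            q[l] += w * pi_star[l]
    return [a / b for a, b in zip(num, q)], q

def normalise(given, q, mean_lam):
    """Scale given relativities so that expected premium equals E[Lambda]."""
    c = mean_lam / sum(g * ql for g, ql in zip(given, q))
    return [c * g for g in given]

if __name__ == "__main__":
    f = [0.0] + [1.0 / 12] * 12                   # U(6.5): uniform on {1, ..., 12}
    grid = [0.02 * (i + 0.5) for i in range(40)]
    raw = [math.exp(-10.0 * lam) for lam in grid]
    weights = [x / sum(raw) for x in raw]
    mean_lam = sum(w * lam for w, lam in zip(weights, grid))
    r_star, q_star = age_corrected_relativities(6, 6, f, grid, weights)
    print("age-corrected Bayes premiums:", [round(x, 4) for x in r_star])
    print("normalised given relativities:",
          [round(x, 4) for x in normalise([50, 60, 70, 80, 90, 100], q_star, mean_lam)])
```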
When interpreting the figures, we first note that it does not contradict financial equilibrium that for a given country, one set of relativities is below the other. For example, all relativities corresponding to one of our four trial sojourn time distributions (colored graphs) are below the given relativities (solid black graph). The explanation is simply that the relativities should be weighted with the age-corrected, resp. the stationary distribution, and the age-corrected distributions have a region of importance which is more shifted towards high classes.
Figure 13: Relativities, Germany
We next note that the two distributions $NB(7)$, $U(6.5)$ with the low mean are quite close, in some cases even hard to distinguish. Distributions $NB(13)$, $U(12.5)$ have a roughly doubled mean. As could be expected, this puts them closer to the Bayesian relativity computed w.r.t. $\pi(\lambda)$. The convergence rate appears quite slow, however, and this is further illustrated in the following Figure 14. We took here the Irish system and compared the stationarity-based Bayesian relativity (solid black) to those of four versions of the negative binomial distribution (3.1), one with mean 10 (dotted red), one with mean 100 (dashed red), one with mean 200 (dash-dotted red), and one with mean 400 (solid red). The figure confirms the expectation of convergence, but shows also that (as just noted) it is slow.
The average premiums $r^*(\lambda)$ are plotted in Figures 15, 16, 17, one for each of the three bonus systems, for the same 6 cases as for the relativities in Section 5 and with the same legends. We see a considerable difference between the two stationarity-based average premiums (solid black and dotted black) for Ireland and Italy, whereas they appear almost identical for Germany. The age-corrected average premiums are again quite different, and exhibit somewhat similar behavior to the relativities in Figures 11, 12, 13. The ideal fairness criterion for a Bayesian premium rule is that the premium for a λ-customer should come as close to $\lambda$ as possible. This can never be perfectly achieved: since the premium in the lowest bonus class is non-zero, a customer with a small $\lambda$ will always pay too much, and since the premium in the highest class is finite, a customer with a large $\lambda$ will always pay too little. The figures show that this effect is substantially more marked for the age-corrected average premiums than for the stationarity-based ones.
The explanation is natural: if the customer has a finite sojourn time, the system will have less time to learn about his risk characteristics in the form of λ than if he had been there for ever, as is the false assumption underlying the stationarity-based calculations.

Concluding remarks
In this paper, we have inspected how reasonable it is to view bonus-malus systems via the stationary distribution, as is usually done. The conclusion is that in many cases the transient distributions are quite far from the stationary ones, and that this has considerable consequences for the computation of such quantities as Bayesian relativities and average premiums.
We do not necessarily insist that our trial distributions for the sojourn time in the portfolio have the relevant time span. A motor insurance policy may be terminated, for example, simply because the insured gets a new car. In that case, he will typically continue with a new policy in the same company, but not enter in the same level $\ell_0$ as completely new insureds. Similar remarks apply to a change of company, where usually some information on the present bonus class or general previous claim statistics is passed from the old company to the new.
Numerical studies of special bonus-malus systems are, for example, in Lemaire [12], Lemaire & Zi [13] and Mahmoudvand et al. [17]. These papers differ from the present one by not going into the Bayesian aspect. Here the more closely related literature is Norberg [19] and Borgan et al. [5]. In particular, [5] contains ideas on how to get away from the stationary point of view. As analogue of our $\pi^*(\lambda)$, [5] suggests a distribution of the form $\sum_{n=0}^{\infty} w_n\, e_{\ell_0} P(\lambda)^n$, where the $w_n$ are suitable weights summing to one. It is also briefly mentioned that one interpretation corresponds to sampling a customer at random from the portfolio, but the connection to our $\pi^*(\lambda)$, which is obtained by taking $w_n = f^e_n$, is not given. Also the concept of sampling a customer at random is not explained very clearly, cf., e.g., our Theorem 2.1 and Appendix B below, and in key examples the $w_n$ are taken constant on an interval whereas $f^e_n$ is decreasing. Nevertheless, [5] contains some key ideas related to this paper, and to our mind, the paper has received surprisingly little attention in subsequent literature (but see Denuit et al. [8, Ch. 8]).
Of further classical references in the bonus-malus area not cited elsewhere in the text, we mention in particular (in chronological order) Grenander [10], Loimaranta [

A Proof of Theorem 2.2
For ease of notation, we suppress the dependency on $\lambda$. First note that $Z = (I − P + \mathbf{1}\pi)^{-1}$ satisfies
$$ZP = PZ = Z − I + \mathbf{1}\pi$$
(multiply by $I − P + \mathbf{1}\pi$ on both sides). From this it follows easily by induction that
$$P^n Z = Z P^n = Z − I − \cdots − P^{n-1} + n\mathbf{1}\pi,$$
so
$$I + \cdots + P^{n-1} = Z(I − P^n) + n\mathbf{1}\pi .$$
Rewriting (2.1) in matrix notation and using the independence of $A$ and $L_0, L_1, \ldots$ gives
$$\pi^* = \frac{1}{\mu_A}\, e_{\ell_0}\, E\sum_{n=0}^{A-1} P^n = \frac{1}{\mu_A}\, e_{\ell_0}\, E\bigl[Z(I − P^A) + A\mathbf{1}\pi\bigr] = \pi + \frac{1}{\mu_A}\, e_{\ell_0}\, Z\bigl(I − f[P]\bigr).$$

B A variant of the derivation of the age-corrected distribution
A different way to arrive at distribution (2.1) as the relevant bonus class distribution in a model with finite sojourn times of customers is to 'sieve customers one-by-one through the system'. By this we mean that we consider a sequence of λ-customers such that customer $n$ has sojourn time $A^{(n)}$ and bonus classes $L^{(n)}_0, L^{(n)}_1, \ldots, L^{(n)}_{A^{(n)}-1}$, and let $\mathcal{Z} = (\mathcal{Z}_0, \mathcal{Z}_1, \ldots)$ be the concatenated sequence $L^{(1)}_0, \ldots, L^{(1)}_{A^{(1)}-1}, L^{(2)}_0, \ldots, L^{(2)}_{A^{(2)}-1}, \ldots$ Then the long-run fraction of time $\mathcal{Z}$ spends in class $\ell$ is $\pi^*_\ell(\lambda)$. This follows simply by noting that the instants $A^{(1)}, A^{(1)} + A^{(2)}, \ldots$ where a new customer takes over in the construction of $\mathcal{Z}$ are regeneration points (the process starts afresh as from $n = 0$) and appealing to the general theory of regenerative processes ([2, Ch. 6]).
It should be noted, however, that $\mathcal{Z}$ is not a Markov chain. This follows from an expression (B.2) for the one-step transition probabilities of $\mathcal{Z}$ that involves $\xi_n$, the time elapsed since the last regeneration point (backward recurrence time; $\xi_n = n$ for $0 \le n < A^{(1)}$, $\xi_n = n − A^{(1)}$ for $A^{(1)} \le n < A^{(1)} + A^{(2)}$, and so on). Here the distribution of $\xi_n$ depends on $n$, except for the special case where $A$ is geometric, and so must (B.2) do (the argument excludes time-homogeneity, but also the Markov property can be seen to fail).

C A more general model
We here suggest a model which incorporates several features not covered by the basic bonus-malus model considered in the body of the paper. We assume that a customer is characterized by a random mark $M$ taking values in some set $\mathcal{M}$ and a time-homogeneous Markov chain $X = X_0, X_1, \ldots$ with state space $\mathcal{X} = \mathbb{N} \times \mathcal{L} \times \mathcal{Y}$, where $\mathcal{L}$ is the finite set of bonus classes and $\mathcal{Y}$ is finite or countable. We write $X_k = (N_k, L_k, Y_k)$; $N_k$ is the number of claims in year $k = 0, 1, \ldots$, assumed Poisson with parameter $\lambda(M, L_k, Y_k)$, and $L_k$ is the bonus class. The initial class $L_0$ as well as $Y_0$ depends on $M$.
In addition to potentially influencing the Poisson parameter, the $Y$ component also generates the sojourn time $A$: it is assumed that a customer still in the portfolio in year $k = 0, 1, \ldots$ will no longer be there in year $k + 1$ w.p. $\delta(M, Y_k)$, so that then $A = k + 1$. The further transition rules state that $L_{k+1}$ is calculated as a deterministic function $b(N_k, L_k, Y_k)$ of $X_k = (N_k, L_k, Y_k)$, and given $N_k = n$, $L_k = \ell$, $Y_k = y$, one has $Y_{k+1} = y'$ w.p. $q_{y';n,\ell,y}$.
Example C.1. To cover the classical model with independent sojourn time considered in the rest of the paper, take $M = \Lambda$ as the Poisson parameter of the customer and $\lambda(m, \ell, y) = m$, $L_0 = \ell_0$. If $A$ has some general distribution with point probabilities $f_n$, $n = 1, 2, \ldots$, there are at least two ways to conform to the general framework above. Both are familiar from the theory of discrete phase-type distributions, [2, III.4] or [1, IX.1 and A5].
In the first, we take the state space for the $Y$-chain to be $\mathcal{Y} = \mathbb{N} \cup \{\Delta\}$ for some extra state $\Delta$ and $Y_0 = 0$. From state $y \ne \Delta$ [corresponding to still being in the portfolio in year $y$] one can go only to either $y + 1$ or $\Delta$, w.p. $f_{y+1}/\bar f_{y+1}$ for $\Delta$ and $1 − f_{y+1}/\bar f_{y+1}$ for $y + 1$, where $\bar f_n = f_n + f_{n+1} + \cdots$ (thus $q_{y';n,\ell,y}$ does not depend on $n, \ell$); state $\Delta$ (the coffin state) is absorbing. In the second, we take $\mathcal{Y} = \mathbb{N}$ and $Y_0 = a$ w.p. $f_a$. From $Y_n = m > 0$, one goes always to $Y_{n+1} = m − 1$, whereas state 0 is absorbing (thus $q_{y';n,\ell,y} = \delta_{y',y-1}$ if $y > 0$, $q_{y';n,\ell,y} = \delta_{y',0}$ if $y = 0$).
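Both representations are easy to check numerically. The sketch below is illustrative only, with hypothetical function names and a uniform sojourn time on $\{1, \ldots, 12\}$: it computes the exit probabilities of the first representation from the point probabilities $f_n$, verifies that they reproduce $f_n$, and runs the deterministic countdown of the second representation.

```python
def hazards(f):
    """h[y] = f_{y+1} / (f_{y+1} + f_{y+2} + ...): probability of moving to the
    coffin state after year y, given the customer is in the portfolio in year y.
    f[a] = P(A = a), with f[0] = 0."""
    tail, h = sum(f), []
    for y in range(len(f) - 1):
        tail -= f[y]                              # tail = P(A >= y + 1)
        h.append(f[y + 1] / tail if tail > 0 else 1.0)
    return h

def pmf_from_hazards(h):
    """Recover P(A = y + 1) = h[y] * prod_{j < y} (1 - h[j])."""
    f, survive = [0.0], 1.0
    for hy in h:
        f.append(survive * hy)
        survive *= 1.0 - hy
    return f

def countdown_sojourn(a):
    """Second representation: start in Y_0 = a, step down by one per year, absorb in 0."""
    y, years = a, 0
    while y > 0:
        y -= 1
        years += 1
    return years

if __name__ == "__main__":
    f = [0.0] + [1.0 / 12] * 12                   # uniform on {1, ..., 12}
    h = hazards(f)
    print([round(x, 4) for x in h])               # exit probabilities 1/12, 1/11, ..., 1
    print(max(abs(a - b) for a, b in zip(f, pmf_from_hazards(h))))
    print(countdown_sojourn(5))
```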
Example C.2. Bonus hunger, i.e. the insured's tendency not to file small enough claims in order to avoid an increase in future premiums, has been studied repeatedly in the literature. E.g. Lemaire [11] (see also Lemaire & Zi [13]) calculates, for a given bonus system, each class $\ell$ and each $\lambda$, a retention level $z_\ell$ such that the insured's costs in terms of either covering a claim of size $Z$ himself or expected future premiums precisely balance when $Z = z_\ell$ and he is in class $\ell$. He will then file the claim if $Z > z_\ell$ and not otherwise. The calculation is based on a distribution $G$ of a claim $Z$. We can model this by simply modifying Example C.1 by taking $\lambda(m, \ell, y) = m\bigl(1 − G(z_\ell)\bigr)$ rather than $\lambda(m, \ell, y) = m$.
Example C.3. An insured may be tempted to look for another insurer if he has had many recent claims, therefore a high bonus class, and so (often without reason!) believes that his present insurer's system is unfair. We can model this by modifying the first representation of $A$ in Example C.1 by allowing direct transitions from state $y$ to the coffin state $\Delta$, occurring with a probability $\theta_\ell$ depending on the present bonus class $\ell$ (typically $\theta_\ell$ will be increasing in $\ell$). Thus $q_{y+1;n,\ell,y} = (1 − f_{y+1}/\bar f_{y+1})(1 − \theta_\ell)$, $q_{\Delta;n,\ell,y} = f_{y+1}/\bar f_{y+1} + (1 − f_{y+1}/\bar f_{y+1})\theta_\ell = \theta_\ell + (1 − \theta_\ell) f_{y+1}/\bar f_{y+1}$, and all other $q_{y';n,\ell,y} = 0$.
Example C.4. Young or old drivers are generally considered to have risk characteristics different from the rest of the portfolio. We can model this by letting the mark $M$ be the pair of the Poisson parameter $\Lambda$ and (for a young driver) the year $B = 0, 1, \ldots$ after the driver's license (or the age for an old driver), and letting a state $y$ be of the form $(y', y'')$, with $y'$ determining $A$ as above and $y''$ the updated year after the license. The initial class $L_0$ is then chosen as a function of $B$, and one could have, e.g., that $\lambda\bigl((\lambda, b), \ell, (y', y'')\bigr)$ has the multiplicative form $s(y'')\lambda$.
Further examples, not spelled out in detail, include marks $M$ containing covariates that enter multiplicatively in $\lambda(m, \ell, y)$.