
Including individual customer lifetime value and competing risks in tree-based lapse management strategies

  • Original Research Paper
  • European Actuarial Journal

Abstract

A retention strategy based on an enlightened lapse model is a powerful profitability lever for a life insurer. Some machine learning models are excellent at predicting lapse, but from the insurer’s perspective, predicting which policyholder is likely to lapse is not enough to design a retention strategy. In our paper, we define a lapse management framework with an appropriate validation metric based on Customer Lifetime Value and profitability. We include the risk of death in the study through competing risks considerations in parametric and tree-based models and show that further individualization of the existing approaches leads to increased performance. We show that survival tree-based models outperform parametric approaches and that the actuarial literature can significantly benefit from them. Then, we compare, on real data, how this framework leads to increased predicted gains for a life insurer and discuss the benefits of our model in terms of commercial and strategic decision-making.



Data availability

For privacy reasons, all the data, statistics, product names and perimeters presented in this paper have been either anonymized or modified. All analyses, discussions and conclusions remain unchanged.

Notes

  1. Using XGBoost.

  2. We suppose that T has a continuous distribution.

  3. Because it is derived from the CIF, which is an improper cumulative distribution function.

  4. Unlike the function \(1-\exp \left( -\int _{0}^{t} \lambda _{T, j}(u) d u\right)\).

  5. As it does not tend to 1 as t goes to \(+\infty\).

References

  1. Akaike H (1973) Information theory and an extension of the maximum likelihood principle. In: Second international symposium on information theory, pp 267–281

  2. Ascarza E, Neslin SA, Netzer O, Anderson Z, Fader PS, Gupta S, Hardie B, Lemmens A, Libai B, Neal DT, Provost F, Schrift R (2018) In pursuit of enhanced customer retention management: review, key issues, and future directions. In: Special issue on 2016 choice symposium. Customer needs and solutions, p 5

  3. Azzone M, Barucci E, Moncayo GG, Marazzina D (2022) A machine learning model for lapse prediction in life insurance contracts. Expert Syst Appl 191:116261. https://doi.org/10.1016/j.eswa.2021.116261. (ISSN 0957-4174)


  4. Blum V, Thérond P-E (2019) Discount rates in IFRS: how practitioners depart the IFRS maze. PhD thesis, Autorité des Normes Comptables

  5. Bou-Hamad I, Larocque D, Ben-Ameur H (2011) A review of survival trees. Stat Surv 5:44–71. https://doi.org/10.1214/09-SS047


  6. Breiman L (2001) Random forests. Mach Learn 45(1):5–32. https://doi.org/10.1023/A:1010933404324


  7. Breiman L, Friedman J, Stone CJ, Olshen RA (1984) Classification and regression trees. Taylor & Francis, UK (ISBN 9780412048418)


  8. Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78(1):1–3


  9. Buchardt K (2014) Dependent interest and transition rates in life insurance. Insur Math Econ. https://doi.org/10.1016/j.insmatheco.2014.01.004


  10. Buchardt K, Moller T, Bjerre SK (2015) Cash flows and policyholder behaviour in the semi-Markov life insurance setup. Scand Actuar J 8:660–688. https://doi.org/10.1080/03461238.2013.879919


  11. Burrows R, Lang J (1997) Risk discount rates for actuarial appraisal values of life insurance companies. In: Proceedings of the 7th international AFIR colloquium, pp 283–307

  12. Chen T, Guestrin C (2016) XGBoost: a scalable tree boosting system. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining, KDD ’16. ACM, New York, NY, USA, pp 785–794. https://doi.org/10.1145/2939672.2939785. (ISBN 978-1-4503-4232-2)

  13. Chinchor N (1992) Muc-4 evaluation metrics. In: Proceedings of the 4th conference on message understanding, MUC4 ’92. Association for Computational Linguistics, USA, pp 22–29. https://doi.org/10.3115/1072064.1072067. (ISBN 1558602739)

  14. Cox DR (1972) Regression models and life-tables. J Roy Stat Soc Ser B (Methodol) 34(2):187–220. http://www.medicine.mcgill.ca/epidemiology/hanley/c626/cox_jrssB_1972_hi_res.pdf


  15. Cox SH, Lin Y (2006) Annuity lapse modeling: tobit or not tobit ? Society of Actuaries. https://www.soa.org/globalassets/assets/files/research/projects/cox-linn-paper-11-15-06.pdf

  16. Ćurak M, Podrug D, Poposki K (2015) Policyholder and insurance policy features as determinants of life insurance lapse-evidence from Croatia. Econ Bus Rev 1(15), 58–77. https://doi.org/10.18559/ebr.2015.3.5

  17. Dar AA, Dodds C (1989) Interest rates, the emergency fund hypothesis and saving through endowment policies: some empirical evidence for the UK. J Risk Insur 56:415


  18. Davidson-Pilon C (2019) Lifelines: survival analysis in python. J Open Source Softw 4(40):1317. https://doi.org/10.21105/joss.01317


  19. Donkers B, Verhoef P, Jong M (2007) Modeling clv: a test of competing models in the insurance industry. Quant Market Econ (QME) 5(2):163–190


  20. Duchemin R, Matheus R (2021) Forecasting customer churn: comparing the performance of statistical methods on more than just accuracy. J Supply Chain Manage Sci JSCMS 2(3/4):115–137


  21. Eling M, Kiesenbauer D (2014) What policy features determine life insurance lapse? an analysis of the German market. J Risk Insur 81(2):241–269 (ISSN 00224367)


  22. Eling M, Kochanski M (2013) Research on lapse in life insurance: what has been done and what needs to be done? J Risk Fin 14(4):392–413


  23. Freund Y, Schapire RE (1996) Experiments with a new boosting algorithm. In: International conference on machine learning, pp 148–156. http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.51.6252

  24. Gatzert N, Schmeiser H (2008) Assessing the risk potential of premium payment options in participating life insurance contracts. J Risk Insur 75(3):691–712 (ISSN 00224367)


  25. Gemmo I, Götz M (2016) Life insurance and demographic change: an empirical analysis of surrender decisions based on panel data. ICIR working paper series 24/16, Goethe University Frankfurt, International Center for Insurance Regulation (ICIR). https://ideas.repec.org/p/zbw/icirwp/2416.html

  26. Grinsztajn L, Oyallon E, Varoquaux G (2022) Why do tree-based models still outperform deep learning on tabular data?. Thirty-sixth conference on neural information processing systems datasets and benchmarks track. https://openreview.net/forum?id=Fp7__phQszn

  27. Gupta S (2009) Customer-based valuation. J Interact Market 23(2):169–178. https://doi.org/10.1016/j.intmar.2009.02.006. (ISSN 1094-9968)


  28. Gupta S, Lehmann DR (2006) Customer lifetime value and firm valuation. J Relationship Market 5(2–3):87–110. https://doi.org/10.1300/J366v05n02_06


  29. Gupta S, Hanssens D, Hardie B, Kahn W, Kumar V, Lin N, Ravishanker N, Sriram S (2006) Modeling customer lifetime value. J Serv Res 9(139–155):11. https://doi.org/10.1177/1094670506293810


  30. Hanley JA, McNeil BJ (1982) The meaning and use of the area under a receiver operating characteristic (roc) curve. Radiology 143(1):29–36. https://doi.org/10.1148/radiology.143.1.7063747. (PMID: 7063747)


  31. Harrell FE, Califf RM, Pryor DB, Lee KL, Rosati RA (1982) Evaluating the yield of medical tests. JAMA 247(18):2543–2546


  32. He H, Garcia EA (2009) Learning from imbalanced data. IEEE Trans Knowl Data Eng 21(9):1263–1284. https://doi.org/10.1109/TKDE.2008.239


  33. Hu S, O’Hagan A, Sweeney J, Ghahramani M (2021) A spatial machine learning model for analysing customers’ lapse behaviour in life insurance. Ann Actuar Sci 15(2):367–393. https://doi.org/10.1017/S1748499520000329


  34. Hwang Y, Chan LF-S, Tsai J (2022) On voluntary terminations of life insurance: differentiating surrender propensity from lapse propensity across product types. North Am Actuar J 26(2):252–282. https://doi.org/10.1080/10920277.2021.1973507


  35. Ishwaran H, Kogalur UB (2007) Random survival forests for r. R News 7(2):25–31


  36. Ishwaran H, Kogalur UB, Blackstone EH, Lauer MS (2008) Random survival forests. Ann Appl Stat 2(3):841–860. https://doi.org/10.1214/08-AOAS169


  37. Ishwaran H, Gerds TA, Kogalur UB, Moore RD, Gange SJ, Lau BM (2014) Random survival forests for competing risks. Biostatistics 15(4):757–73. https://doi.org/10.1093/biostatistics/kxu010. (Epub 2014 Apr 11. PMID: 24728979 ; PMCID: PMC4173102)


  38. Kagraoka Y (2005) Modeling insurance surrenders by the negative binomial model. In: JAFEE international conference, 01. https://www.researchgate.net/publication/228481596_Modeling_Insurance_Surrenders_by_the_Negative_Binomial_Model

  39. Kiesenbauer D (2012) Main determinants of lapse in the German life insurance industry. North Am Actuar J 16(1):52–73. https://doi.org/10.1080/10920277.2012.10590632


  40. Kim C (2005) Modeling surrender and lapse rates with economic variables. North Am Actuar J 9(4):56–70. https://doi.org/10.1080/10920277.2005.10596225


  41. KPMG (2020) First impressions: Ifrs 17 insurance contracts (2020 edition), Jul 2020. https://assets.kpmg/content/dam/kpmg/ie/pdf/2020/09/ie-ifrs-17-first-impressions.pdf

  42. Kuo W, Tsai C, Chen W-K (2003) An empirical study on the lapse rate: the cointegration approach. J Risk Insur 70(3):489–508 (ISSN 00224367)


  43. Laurent J-P, Norberg R, Planchet F (eds) (2016) Modelling in life insurance—a management perspective (1st edn). European Actuarial Academy (EAA) series. Springer International Publishing, Cham, Switzerland

  44. Leblanc M, Crowley J (1993) Survival trees by goodness of split. J Am Stat Assoc 88(422):457. https://doi.org/10.2307/2290325. (ISSN 0162-1459)


  45. Lemmens A, Gupta S (2020) Managing churn to maximize profits. Mark Sci 39(5):956–973


  46. Loisel S, Piette P, Jason Tsai C-H (2021) Applying economic measures to lapse risk management with machine learning approaches. ASTIN Bull 51(3):839–871. https://doi.org/10.1017/asb.2021.10


  47. Mantel N (1966) Evaluation of survival data and two new rank order statistics arising in its consideration. Cancer Chemother Rep 1(50):163–170


  48. Milhaud X, Dutang C (2018) Lapse tables for lapse risk management in insurance: a competing risk approach. Eur Actuar J 8(1):97–126


  49. Milhaud X, Loisel S, Maume-Deschamps V (2011) Surrender triggers in life insurance: what main features affect the surrender behavior in a classical economic context ? Bull Fran d’Actuar 11(22):5–48


  50. Nolte S, Schneider JC (2017) Don’t lapse into temptation: a behavioral explanation for policy surrender. J Bank Fin 79:12–27


  51. Oh S, Ouh C, Park S, Siyeol C, Park K (2018) A study on the estimation of the discount rate for the insurance liability under ifrs 17. J Insur Fin 29(3):45–75 (ISSN 2384-3209)


  52. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, Blondel M, Prettenhofer P, Weiss R, Dubourg V et al (2011) Scikit-learn: machine learning in python. J Mach Learn Res 12:2825–2830


  53. Pölsterl S (2020) scikit-survival: a library for time-to-event analysis built on top of scikit-learn. J Mach Learn Res 21(212):1–6


  54. Poufinas T, Michaelide G (2018) Determinants of life insurance policy surrenders. Mod Econ 9:1400–1422. https://doi.org/10.4236/me.2018.98089


  55. Putter H, Schumacher M, van Houwelingen HC (2020) On the relation between the cause-specific hazard and the subdistribution rate for competing risks data: the fine-gray model revisited. Biom J. https://doi.org/10.1002/bimj.201800274


  56. Renshaw AE, Haberman S (1986) Statistical analysis of life assurance lapses. J Inst Actuar 113:459–497. http://www.jstor.org/stable/41140822


  57. Routh P, Roy A, Meyer J (2021) Estimating customer churn under competing risks. J Oper Res Soc 72(5):1138–1155. https://doi.org/10.1080/01605682.2020.1776166


  58. Russell DT, Fier SG, Carson JM, Dumm RE (2013) An empirical analysis of life insurance policy surrender activity. J Insur Issues 36(1):35–57 (ISSN 15316076)


  59. Shamsuddin S, Noriszura I, Roslan N (2022) What we know about research on life insurance lapse: a bibliometric analysis. Risks 10(97):5. https://doi.org/10.3390/risks10050097


  60. Sirak AS (2015) Income and unemployment effects on life insurance lapse. https://www.wiwi.uni-frankfurt.de/fileadmin/user_upload/dateien_abteilungen/abt_fin/Dokumente/PDFs/Allgemeine_Dokumente/Inderst_Downloads/Neuere_Arbeiten_seit2015/SIRAK_-_Income_and_Unemployment_Effects_on_Life_Insurance_Lapse.pdf

  61. Vasudev M, Bajaj R, Escolano AA (2016) On the drivers of lapse rates in life insurance. Sarjana thesis, University of Barcelona, Barcelona, Spain. https://diposit.ub.edu/dspace/handle/2445/115586

  62. von Mutius B, Huchzermeier A (2021) Customized targeting strategies for category coupons to maximize clv and minimize cost. J Retail 97(4):764–779. https://doi.org/10.1016/j.jretai.2021.01.004. (ISSN 0022-4359)


  63. Yu L, Cheng J, Lin T (2019) Life insurance lapse behaviour: evidence from China. Geneva Pap Risk Insur Issues Pract 44(4):653–678


Acknowledgements

Work(s) conducted within the Research Chair DIALog under the aegis of the Risk Foundation, an initiative by CNP Assurances. The authors would like to express their very great gratitude to Marie Hyvernaud and Stéphanie Dosseh for their valuable and constructive suggestions while developing this research work. Special thanks should be given to Marie Hyvernaud for her contribution to code writing.

Author information


Corresponding author

Correspondence to Mathias Valla.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

A Appendix


A.1 Competing risk framework

Two regression approaches are commonly used to estimate the global hazard and the hazard of a single risk when competing risks are present: modeling the cause-specific hazard, and modeling the subdistribution hazard function. They account for competing risks differently, yielding different hazard functions and thus distinct advantages, drawbacks, and interpretations. Here, we introduce the theoretical and practical implications of both approaches and justify which one we use in our modeling.

In cause-specific regression, each cause-specific hazard (in our case, the hazards of lapse and death) is estimated separately by treating all subjects that experienced the competing event as censored. Here, t is the traditional time variable of a survival model, with \(t=0\) being the beginning of a policy. It is not to be confused with the use of t in Sects. 3 and 4. Recall that \(J_{T}=0\) corresponds to an active subject that experienced neither lapse (\(J_{T}=1\)) nor death (\(J_{T}=2\)). The cause-specific hazard rates for the jth risk (\(j \in \{1, \ldots , J\}\)) are defined as

$$\begin{aligned} \lambda _{T, j}(t)=\lim _{d t \rightarrow 0} \frac{P\left( t \le T<t+d t, J_{T}=j \mid T \ge t\right) }{d t}. \end{aligned}$$

We can recover the global hazard rate as \(\lambda _{T, 1}(t)+\cdots +\lambda _{T, J}(t)=\lambda _{T}(t)\), and derive the global survival distribution of T as

$$\begin{aligned} P(T>t)&=1-F_{T}(t)=S_{T}(t)\\&=\exp \left( -\int _{0}^{t}\left( \lambda _{T, 1}(s)+\cdots +\lambda _{T, J}(s)\right) d s\right) . \end{aligned}$$

This approach aims at analysing the cause-specific “distribution” function \(F_{T, j}(t)=P\left( T \le t, J_{T}=j\right)\). In practice, it is called the Cumulative Incidence Function (CIF) for cause j rather than a distribution function, since \(F_{T, j}(t) \rightarrow P\left( J_{T}=j\right) \ne 1\) as \(t \rightarrow +\infty\). By analogy with the classical survival framework, the CIF can be characterised as \(F_{T, j}(t)=\int _{0}^{t} f_{T, j}(s) d s\) (see Note 2), where \(f_{T, j}\) is the improper (see Note 3) density function for cause j. It follows that

$$\begin{aligned} f_{T, j}(t)=\lim _{d t \rightarrow 0} \frac{P\left( t \le T<t+d t, J_{T}=j\right) }{d t}=\lambda _{T, j}(t) S_{T}(t). \end{aligned}$$

The equation above has a direct interpretation: the density of experiencing cause j at time t is the product of the probability of surviving all causes up to time t and the cause-specific hazard at time t. We finally obtain the CIF for cause j as

$$\begin{aligned} F_{T, j}(t)=\int _{0}^{t} \lambda _{T, j}(s) \exp \left( -\int _{0}^{s} \lambda _{T}(u) d u\right) d s. \end{aligned}$$
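As a numerical illustration of the CIF formula, the following minimal sketch integrates \(\lambda _{T, j}(s) S_{T}(s)\) with a midpoint rule. The function name `cumulative_incidence` and the constant hazard values are hypothetical and chosen only for illustration; they are not taken from our data.

```python
import math

def cumulative_incidence(hazards, j, t, n_steps=10_000):
    """Approximate F_j(t) = int_0^t lambda_j(s) * exp(-int_0^s lambda(u) du) ds.

    `hazards` is a list of cause-specific hazard functions; `j` is the
    zero-based position of the cause of interest in that list.
    """
    ds = t / n_steps
    cum_global = 0.0  # running approximation of int_0^s lambda(u) du
    cif = 0.0
    for k in range(n_steps):
        s_mid = (k + 0.5) * ds
        total = sum(h(s_mid) for h in hazards)  # global hazard at the midpoint
        surv_mid = math.exp(-(cum_global + 0.5 * total * ds))  # S(s_mid)
        cif += hazards[j](s_mid) * surv_mid * ds
        cum_global += total * ds
    return cif

# Hypothetical constant hazards per year: lapse = 0.08, death = 0.01.
hazards = [lambda s: 0.08, lambda s: 0.01]
t = 10.0
cif_lapse = cumulative_incidence(hazards, 0, t)
cif_death = cumulative_incidence(hazards, 1, t)
global_survival = math.exp(-(0.08 + 0.01) * t)

# The two CIFs add up to the probability of any failure before t.
print(cif_lapse + cif_death, 1.0 - global_survival)
```

With constant hazards the result can be checked against the closed form \(\frac{\lambda _{T, j}}{\lambda _{T}}\left( 1-e^{-\lambda _{T} t}\right)\), which is what makes this toy setting convenient.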

There are several advantages to that approach. First of all, cause-specific hazard models can easily be fit with any classical implementation of the Cox proportional hazards (CPH) model by simply considering as censored any subject that experienced the competing event. Moreover, the CIF is clearly interpretable and summable: \(P(T \le t)=F_{T, 1}(t)+\cdots +F_{T, J}(t)\) (see Note 4). On the other hand, the CIF estimate for one given cause depends on all other causes: the study of a specific cause requires estimating the global hazard rate, and interpreting the effects of covariates on this cause is difficult. Indeed, part of the effect on a specific cause comes from the competing causes. In our setting, however, we are only interested in predicting the survival probabilities, not in interpreting them as such.

As mentioned at the beginning of this section, another approach is often considered to analyze competing risks and derive a cause-specific CIF. This approach, based on the subdistribution hazard function and known as Fine and Gray regression, works by considering a new competing risk process \(\tau\). Without loss of generality, let us consider death as our cause of interest,
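The censoring step used for cause-specific fitting can be sketched as follows. The helper `cause_specific_indicator` and the toy data are hypothetical; only the event coding (0 = active, 1 = lapse, 2 = death) follows the text.

```python
def cause_specific_indicator(event_codes, cause):
    """Return 1 where the event of interest occurred and 0 otherwise.

    Subjects with a competing event get 0, i.e. they are treated as
    censored at their own event time.
    """
    return [1 if code == cause else 0 for code in event_codes]

# Toy data: observed durations in years and event codes.
durations = [2.5, 4.0, 1.2, 6.3, 3.1]
event_codes = [1, 0, 2, 1, 0]

lapse_events = cause_specific_indicator(event_codes, cause=1)
death_events = cause_specific_indicator(event_codes, cause=2)
print(lapse_events)
print(death_events)
```

Each pair (durations, indicator) can then be passed to a standard right-censored survival fitter, for instance through the `duration_col` and `event_col` arguments of `CoxPHFitter.fit` in Lifelines.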

$$\begin{aligned} \tau =T \times \mathbbm {1}_{J_{T}=2}+\infty \times \mathbbm {1}_{J_{T} \ne 2}. \end{aligned}$$

The process \(\tau\) has the same distribution as T regarding the risk of death, \(P(\tau \le t)=F_{T, 2}(t)\), and a mass point at infinity of size \(1-F_{T, 2}(\infty )\), the probability of observing another cause \(\left( J_{T} \ne 2\right)\) or no failure at all. In other words, whereas the previous approach treats every subject that experienced a competing event as censored, this approach considers a new and artificial at-risk population. This is made clear when deriving the hazard rate of \(\tau\),

$$\begin{aligned} \lambda _{\tau }(t)=\lim _{d t \rightarrow 0} \frac{P\left( t \le T<t+d t, J_{T}=2 \mid \{T \ge t\} \cup \left\{ T \le t, J_{T} \ne 2\right\} \right) }{d t}. \end{aligned}$$

Finally, we obtain the CIF for the risk of death as

$$\begin{aligned} F_{T, 2}(t)=1-\exp \left( -\int _{0}^{t} \lambda _{\tau }(s) d s\right) . \end{aligned}$$

This subdistribution approach resolves the most important drawback of cause-specific regression, as the resulting coefficients have a direct relationship with the cumulative incidence: estimating the CIF for a specific cause does not depend on the other causes, which makes the CIF easier to interpret. Subdistribution hazard models can be fit in R using the crr function of the cmprsk package or the timereg package. Still, to our knowledge, there is no implementation of a Fine and Gray model in Lifelines or, more generally, in Python. We can also note that these two approaches are linked [55], and the link between \(\lambda _{\tau }(t)\) and \(\lambda _{T, j}(t)\) is given by

$$\begin{aligned} \lambda _{\tau }(t)=r_{j}(t) \lambda _{T, j}(t), \text { with } r_{j}(t)=\frac{P(J_{T}=0)}{\sum ^{J}_{p \ne j} P(J_{T}=p)}. \end{aligned}$$

In other words, if the probability of any competing risk is low, the two approaches give very close results.
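The following sketch illustrates this closeness numerically. It assumes constant cause-specific hazards and reads the ratio as \(r_{j}(t)=S_{T}(t)/\left( 1-F_{T, j}(t)\right)\), the probability of being active at t divided by the probability of not yet having failed from cause j; this reading, the function name and the hazard values are assumptions made for illustration only.

```python
import math

def reduction_factor(lam_other, lam_j, t):
    """r_j(t) = S(t) / (1 - F_j(t)) under constant cause-specific hazards.

    lam_j is the hazard of the cause of interest, lam_other the sum of
    the hazards of all competing causes.
    """
    lam = lam_other + lam_j
    survival = math.exp(-lam * t)            # S(t)
    cif_j = lam_j / lam * (1.0 - survival)   # closed-form F_j(t)
    return survival / (1.0 - cif_j)

# Rare competing risk: the two hazards nearly coincide (ratio close to 1).
r_rare = reduction_factor(lam_other=0.001, lam_j=0.01, t=10.0)
# Frequent competing risk: the subdistribution hazard is much smaller.
r_common = reduction_factor(lam_other=0.10, lam_j=0.01, t=10.0)
print(r_rare, r_common)
```

When the competing hazard is exactly zero the ratio equals 1, so the cause-specific and subdistribution hazards coincide.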

A.2 Survival analysis results

The quantity \(r^{lapser}_{i,t}\) represents the probability that the policy of subject i is still active at time t, given that it was active at its last observed time. Predicting the overall conditional survival with the competing risks, in that case, can be achieved by creating a combined outcome. The policy ends with death or lapse, whichever comes first, and to compute \(r^{lapser}\), we recode the competing events as a combined event. In terms of statistical guarantees, this approach is compatible with any survival analysis method.
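The conditioning on the last observed time can be sketched as a ratio of survival probabilities, \(S(t)/S(t_{obs})\). The function `conditional_survival`, the grid and the survival values below are hypothetical and only illustrate the computation on a step survival curve for the combined event.

```python
def conditional_survival(times, survival, t_obs, t):
    """P(T > t | T > t_obs) = S(t) / S(t_obs) for a step survival curve.

    `times` is an increasing grid and `survival[k]` is the survival
    probability from `times[k]` onwards (S = 1 before the first grid point).
    """
    if t < t_obs:
        raise ValueError("t must be greater than or equal to t_obs")

    def s_at(u):
        value = 1.0
        for grid_time, grid_surv in zip(times, survival):
            if grid_time <= u:
                value = grid_surv
            else:
                break
        return value

    return s_at(t) / s_at(t_obs)

# Toy survival curve for the combined event (lapse or death), in years.
times = [1, 2, 3, 4, 5]
survival = [0.95, 0.88, 0.80, 0.74, 0.70]
r = conditional_survival(times, survival, t_obs=2, t=5)
print(r)
```

A policy last observed active at year 2 thus has a conditional probability of 0.70 / 0.88 of still being active at year 5 in this toy example.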

In the following sections of this appendix, \(r^{acceptant}_{i,t}\) indicates the probability of survival for subject i at time t given that it will not lapse. In other words, it is the survival probability regarding only the risk of death. As detailed in Sect. 4.1.1, this corresponds to the cause-specific survival probability for death. It is to be noted that the density from which we derive our survival probabilities is improper, as it is itself derived from the CIF, which is not a proper distribution function (see Note 5). Therefore, any conclusion about those probabilities should be drawn with care. As for \(r^{lapser}\), covariate selection and tuning are performed by minimizing AIC.

All graphs representing survival curves below are plotted with the same axes. The x-axes represent time in years, and the y-axes represent the survival probability.

A.2.1 Cox model

We first decide to estimate survival with a Cox Proportional hazard model with a spline baseline hazard from the Python library Lifelines. Covariate selection and tuning are performed by minimizing AIC.

Here is what \(r^{acceptant}\), the vector of cause-specific probabilities, looks like, and we can compare it to \(r^{lapser}\) on some subjects (Figs. 9, 10).

Fig. 9 Survival curves of 10 policyholders for \(r^{acceptant}\) with Cox model

Fig. 10 Survival curves of 10 policyholders for \(r^{lapser}\)

The effect of various covariates on the survival outcome can be found below (Figs. 11, 12, 13, 14, 15, 16, 17, 18, 19).

Fig. 11 Coefficient plot for \(r^{lapser}\)

Fig. 12 \(r^{lapser}\) trajectories for different products

Fig. 13 \(r^{lapser}\) trajectories by gender

Fig. 14 \(r^{lapser}\) trajectories for different ages

Fig. 15 \(r^{lapser}\) trajectories for different face amounts

Fig. 16 Coefficient plot for \(r^{acceptant}\)

Fig. 17 \(r^{acceptant}\) trajectories by gender

Fig. 18 \(r^{acceptant}\) trajectories for different ages

Fig. 19 \(r^{acceptant}\) trajectories for different face amounts

A.2.2 RSF

We obtain better results than Cox in terms of concordance index, at the cost of a very high computation time for a single training run with one set of parameters (5 days without parallelisation, 4 h with), compared to a few seconds for the Cox model (Tables 4, 5).
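For readers unfamiliar with the concordance index used to compare the models, here is a minimal pure-Python sketch of Harrell's C for right-censored data. It is not the implementation used to produce our results; the function name, the simplified handling of tied times (such pairs are skipped) and the toy data are assumptions.

```python
def concordance_index(durations, events, risk_scores):
    """Harrell's C: share of comparable pairs ranked correctly.

    A pair (i, j) is comparable when subject i has an observed event and
    a strictly shorter duration than subject j. It is concordant when the
    subject failing first has the higher risk score; ties in risk count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(durations)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier failure
        for j in range(n):
            if durations[i] < durations[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: the third subject is censored.
durations = [2, 4, 6, 8]
events = [1, 1, 0, 1]
risk_scores = [0.9, 0.4, 0.5, 0.1]
c_index = concordance_index(durations, events, risk_scores)
print(c_index)
```

A value of 0.5 corresponds to random ranking and 1 to perfect ranking; the double loop is quadratic in the number of subjects, so this sketch is only suitable for small samples.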

Some of the results we obtain are displayed below (Figs. 20, 21).

Fig. 20 Survival curves of 10 policyholders for \(r^{acceptant}\) with RSF

Fig. 21 Survival curves of 10 policyholders for \(r^{lapser}\) with RSF

Table 4 Covariates importance for \(r^{acceptant}\) with RSF
Table 5 Covariates importance for \(r^{lapser}\) with RSF

A.2.3 XGSB

We obtain better results than Cox and slightly better results than RSF in terms of concordance index, at the cost of an even higher computation time for a single training run with one set of parameters (10 h with extensive parallelisation), compared to a few seconds for the Cox model (Tables 6, 7).

Some of the results we obtain are displayed below (Figs. 22, 23).

Fig. 22 Survival curves of 10 policyholders for \(r^{acceptant}\) with GBSM

Fig. 23 Survival curves of 10 policyholders for \(r^{lapser}\) with GBSM

Table 6 Covariates importance for \(r^{acceptant}\) with GBSM
Table 7 Covariates importance for \(r^{lapser}\) with GBSM

A.2.4 Final survival model

The final concordance index scores are displayed below (Table 8):

Table 8 Survival models comparison

A.3 Other results

Fig. 24 Correlation between the proportion of non-targeted lapsers and the improvement of a CLV-augmented LMS. (Taking the results of XGBoost and excluding LMS no. B-27, which has a very high improvement ratio.)

A.4 Considering various statistical metrics

The table below contains the results of the LMS listed in Table 3, evaluated on accuracy, recall, F1-score and AUC. For every metric, it displays the results of a classification over \(y_i\), tuned and cross-validated with that metric (respectively \({\mathop {y_i}\limits ^{accuracy}}\), \({\mathop {y_i}\limits ^{recall}}\), \({\mathop {y_i}\limits ^{F1-score}}\) and \({\mathop {y_i}\limits ^{AUC}}\)), or over \(\tilde{y}_i\), which is always tuned and cross-validated with RG (Table 9, Fig. 24).

Table 9 Results of representative LMS with various statistical metrics

It is to be noted that regardless of the evaluation metric used for tuning and validation purposes, the objective function used with XGB to generate those results is always the log-loss function. Using the area under the ROC curve or the area under the Precision-Recall curve as an objective function in this boosting algorithm would surely yield better results when trained on \(y_i\) and even better on the more unbalanced \(\tilde{y}_i\). As stated in Sect. 4.2, this analysis is not within the scope of our article.
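To make the distinction between the training objective (log-loss) and the evaluation metrics concrete, the sketch below computes accuracy, recall, F1-score and log-loss from the same predicted probabilities. The function name, the 0.5 threshold and the toy labels are assumptions for illustration; AUC is omitted for brevity.

```python
import math

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute accuracy, recall, F1-score and log-loss for binary labels."""
    y_pred = [1 if p >= threshold else 0 for p in y_prob]
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for y, yh in pairs if y == 1 and yh == 1)
    tn = sum(1 for y, yh in pairs if y == 0 and yh == 0)
    fp = sum(1 for y, yh in pairs if y == 0 and yh == 1)
    fn = sum(1 for y, yh in pairs if y == 1 and yh == 0)

    accuracy = (tp + tn) / len(y_true)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0

    # Log-loss works on probabilities, clipped away from 0 and 1.
    eps = 1e-15
    clipped = [min(max(p, eps), 1 - eps) for p in y_prob]
    log_loss = -sum(
        y * math.log(p) + (1 - y) * math.log(1 - p)
        for y, p in zip(y_true, clipped)
    ) / len(y_true)

    return {"accuracy": accuracy, "recall": recall, "f1": f1, "log_loss": log_loss}

# Toy unbalanced sample: 3 lapsers out of 8 policyholders.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_prob = [0.9, 0.2, 0.6, 0.4, 0.1, 0.3, 0.2, 0.7]
metrics = classification_metrics(y_true, y_prob)
print(metrics)
```

The thresholded metrics depend only on which side of the threshold each probability falls, whereas log-loss penalises every probability, which is why a model trained on one may not be optimal for the others.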

A.5 Complete LMS numerical results

See Table 10.
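The Improvement column of Table 10 appears consistent with the relative change in retention gain when moving from \(y_i\) to \(\tilde{y}_i\), that is \((RG_{\tilde{y}}-RG_{y})/|RG_{y}|\). This formula is inferred from the reported figures rather than stated in this section, so the sketch below should be read as a plausibility check; the function name `improvement` is hypothetical.

```python
def improvement(rg_y, rg_y_tilde):
    """Relative change in retention gain from y_i to the CLV-augmented target.

    Assumed formula: (RG_tilde - RG) / |RG|. The absolute value keeps the
    sign meaningful when the baseline retention gain is negative.
    """
    if rg_y == 0:
        raise ValueError("baseline retention gain must be non-zero")
    return (rg_y_tilde - rg_y) / abs(rg_y)

# Retention gains reported in Table 10 for the CART rows of A-1, A-2, A-3.
examples = {
    "A-1": improvement(114_661, 219_655),
    "A-2": improvement(7_092_097, 6_142_119),
    "A-3": improvement(-2_187_622, -8_224),
}
for name, value in examples.items():
    print(f"{name}: {100 * value:.2f}%")
```

Under this reading, values above 100% correspond to strategies whose retention gain moves from negative to positive.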

Table 10 More LMS

| No | Time (s) | Model | % Target diff | Accuracy \(y_i\) | Accuracy \(\tilde{y}_i\) | Retention gain \(y_i\) | Retention gain \(\tilde{y}_i\) | RG/target \(y_i\) | RG/target \(\tilde{y}_i\) | Improvement\(^{\text{a}}\) |
|---|---|---|---|---|---|---|---|---|---|---|
| A-1 | 4949 | CART | 62.58% | 92.3% | 85.3% | 114,661 | 219,655 | 4.48 | 38.20 | 91.57% |
| | | RF | | 92.9% | 85.4% | 232,314 | 287,884 | 9.82 | 56.65 | 23.92% |
| | | XGB | | 93.4% | 85.8% | 243,365 | 324,952 | 9.61 | 54.64 | 33.52% |
| A-2 | 6111 | CART | 26.66% | 92.3% | 89.8% | 7,092,097 | 6,142,119 | 277.00 | 353.83 | -13.39% |
| | | RF | | 92.9% | 90.2% | 6,596,374 | 5,696,455 | 278.47 | 351.02 | -13.64% |
| | | XGB | | 93.4% | 90.9% | 7,308,721 | 7,432,688 | 288.92 | 404.84 | 1.70% |
| A-3 | 4603 | CART | 93.50% | 92.3% | 83.3% | -2,187,622 | -8224 | -85.52 | -31.09 | 99.62% |
| | | RF | | 92.9% | 83.4% | -1,900,265 | 45,483 | -80.18 | 194.35 | 102.39% |
| | | XGB | | 93.4% | 83.5% | -2,032,650 | 77,481 | -80.39 | 174.44 | 103.81% |
| A-4 | 5555 | CART | 55.37% | 92.3% | 86.5% | 4,789,814 | 5,117,844 | 187.00 | 577.74 | 6.85% |
| | | RF | | 92.9% | 86.4% | 4,463,796 | 4,255,175 | 188.47 | 566.05 | -4.67% |
| | | XGB | | 93.4% | 86.8% | 5,032,706 | 5,433,366 | 198.92 | 610.26 | 7.96% |
| A-5 | 4753 | CART | 86.72% | 92.3% | 83.6% | -514,477 | -112,372 | -20.08 | -86.48 | 78.16% |
| | | RF | | 92.9% | 83.4% | -323,544 | -3937 | -13.65 | -28.28 | 98.78% |
| | | XGB | | 93.4% | 83.3% | -383,004 | 0 | -15.14 | 0 | 100.00% |
| A-6 | 5803 | CART | 44.27% | 92.3% | 87.9% | 335,810 | 517,224 | 13.17 | 39.91 | 54.02% |
| | | RF | | 92.9% | 87.9% | 655,350 | 661,021 | 27.68 | 61.13 | 0.87% |
| | | XGB | | 93.4% | 88.6% | 654,219 | 729,493 | 25.86 | 58.22 | 11.51% |
| A-7 | 4241 | CART | 99.09% | 92.3% | 83.3% | -2,816,759 | -10,205 | -110.08 | -384.04 | 99.64% |
| | | RF | | 92.9% | 83.3% | -2,456,122 | 1013 | -103.65 | 66.30 | 100.04% |
| | | XGB | | 93.4% | 83.3% | -2,659,020 | 243 | -105.14 | 15.92 | 100.01% |
| A-8 | 5164 | CART | 82.78% | 92.3% | 84.0% | -1,966,473 | -46,323 | -76.83 | -22.31 | 97.64% |
| | | RF | | 92.9% | 84.0% | -1,477,229 | 253,885 | -62.32 | 149.67 | 117.19% |
| | | XGB | | 93.4% | 84.1% | -1,621,796 | 273,243 | -64.14 | 117.83 | 116.85% |
| A-9 | 4781 | CART | 77.60% | 92.3% | 83.7% | -825,372 | -161,100 | -32.19 | -127.87 | 80.48% |
| | | RF | | 92.9% | 83.4% | -384,736 | 8596 | -16.22 | 32.12 | 102.23% |
| | | XGB | | 93.4% | 83.6% | -498,263 | 22,337 | -19.70 | 35.47 | 104.48% |
| A-10 | 6075 | CART | 29.10% | 92.3% | 89.7% | 4,614,513 | 4,483,831 | 180.36 | 266.33 | -2.83% |
| | | RF | | 92.9% | 89.9% | 4,973,929 | 4,328,724 | 210.01 | 280.90 | -12.97% |
| | | XGB | | 93.4% | 90.7% | 5,354,770 | 5,368,917 | 211.69 | 301.57 | 0.26% |
| A-11 | 4506 | CART | 96.56% | 92.3% | 83.2% | -3,127,655 | -118,886 | -122.19 | -2230.39 | 96.20% |
| | | RF | | 92.9% | 83.3% | -2,517,315 | 1340 | -106.22 | 87.71 | 100.05% |
| | | XGB | | 93.4% | 83.3% | -2,774,278 | 736 | -109.70 | 52.00 | 100.03% |
| A-12 | 5534 | CART | 57.93% | 92.3% | 86.2% | 2,312,231 | 3,310,314 | 90.36 | 412.71 | 43.17% |
| | | RF | | 92.9% | 86.1% | 2,841,351 | 3,129,652 | 120.01 | 465.74 | 10.15% |
| | | XGB | | 93.4% | 86.6% | 3,078,755 | 3,825,920 | 121.69 | 475.53 | 24.27% |
| A-13 | 4640 | CART | 92.91% | 92.3% | 83.3% | -1,201,626 | -163,056 | -46.87 | -1838.44 | 86.43% |
| | | RF | | 92.9% | 83.3% | -717,620 | -5339 | -30.28 | -354.24 | 99.26% |
| | | XGB | | 93.4% | 83.3% | -875,378 | 508 | -34.60 | 16.26 | 100.06% |
| A-14 | 5739 | CART | 47.12% | 92.3% | 87.3% | -1,476,651 | -831,019 | -57.49 | -77.99 | 43.72% |
| | | RF | | 92.9% | 86.0% | -380,683 | 126,532 | -16.03 | 21.14 | 133.24% |
| | | XGB | | 93.4% | 85.5% | -644,389 | 29,382 | -25.47 | 7.10 | 104.56% |
| A-15 | 4216 | CART | 99.61% | 92.3% | 83.3% | -3,503,908 | -97,263 | -136.87 | -2354.34 | 97.22% |
| | | RF | | 92.9% | 83.3% | -2,850,198 | 0 | -120.28 | 0 | 100.00% |
| | | XGB | | 93.4% | 83.3% | -3,151,393 | 0 | -124.60 | 0 | 100.00% |
| A-16 | 5096 | CART | 84.46% | 92.3% | 83.8% | -3,778,933 | -734,773 | -147.49 | -418.58 | 80.56% |
| | | RF | | 92.9% | 83.5% | -2,513,261 | 8914 | -106.03 | 20.13 | 100.35% |
| | | XGB | | 93.4% | 83.6% | -2,920,405 | 34,492 | -115.47 | 45.75 | 101.18% |
| A-17 | 5390 | CART | 28.74% | 92.3% | 89.5% | 5,100,456 | 4,899,479 | 199.11 | 279.88 | -3.94% |
| | | RF | | 92.9% | 89.8% | 4,635,482 | 4,226,648 | 195.69 | 276.06 | -8.82% |
| | | XGB | | 93.4% | 90.2% | 5,196,736 | 5,138,253 | 205.40 | 299.27 | -1.13% |
| A-18 | 6452 | CART | 12.12% | 92.3% | 91.3% | 52,090,240 | 47,706,070 | 2034.15 | 2170.64 | -8.42% |
| | | RF | | 92.9% | 91.9% | 46,171,160 | 42,049,900 | 1949.05 | 2082.36 | -8.93% |
| | | XGB | | 93.4% | 92.5% | 51,629,950 | 52,606,740 | 2040.95 | 2339.70 | 1.89% |
| A-19 | 4913 | CART | 64.89% | 92.3% | 85.2% | 2,798,173 | 3,182,143 | 109.11 | 481.60 | 13.72% |
| | | RF | | 92.9% | 85.2% | 2,502,903 | 2,743,070 | 105.69 | 554.76 | 9.60% |
| | | XGB | | 93.4% | 85.6% | 2,920,720 | 3,438,303 | 115.40 | 576.64 | 17.72% |
| A-20 | 6160 | CART | 29.03% | 92.3% | 89.6% | 49,787,960 | 45,366,730 | 1944.15 | 2616.32 | -8.88% |
| | | RF | | 92.9% | 90.0% | 44,038,580 | 39,947,830 | 1859.05 | 2547.89 | -9.29% |
| | | XGB | | 93.4% | 90.6% | 49,353,940 | 49,789,670 | 1950.95 | 2796.17 | 0.88% |
| A-21 | 5079 | CART | 51.69% | 92.3% | 86.8% | 482,682 | 544,887 | 18.85 | 53.99 | 12.89% |
| | | RF | | 92.9% | 86.8% | 557,090 | 554,195 | 23.52 | 65.17 | -0.52% |
| | | XGB | | 93.4% | 87.1% | 607,670 | 624,556 | 24.01 | 64.79 | 2.78% |
| A-22 | 6199 | CART | 23.94% | 92.3% | 90.2% | 9,335,438 | 8,527,444 | 364.60 | 454.78 | -8.66% |
| | | RF | | 92.9% | 90.6% | 8,570,307 | 7,931,029 | 361.80 | 460.42 | -7.46% |
| | | XGB | | 93.4% | 91.2% | 9,518,466 | 9,581,934 | 376.27 | 501.56 | 0.67% |
| A-23 | 4601 | CART | 89.51% | 92.3% | 83.6% | -1,819,600 | 135,305 | -71.15 | 121.80 | 107.44% |
| | | RF | | 92.9% | 83.5% | -1,575,489 | 159,620 | -66.48 | 215.65 | 110.13% |
| | | XGB | | 93.4% | 83.7% | -1,668,346 | 228,226 | -65.99 | 208.69 | 113.68% |
| A-24 | 5650 | CART | 50.83% | 92.3% | 87.0% | 7,033,156 | 7,124,100 | 274.60 | 680.08 | 1.29% |
| | | RF | | 92.9% | 87.0% | 6,437,729 | 6,364,477 | 271.80 | 711.89 | -1.14% |
| | | XGB | | 93.4% | 87.4% | 7,242,450 | 7,840,770 | 286.27 | 771.71 | 8.26% |
| A-25 | 5379 | CART | 30.97% | 92.3% | 89.2% | 4,160,423 | 3,882,623 | 162.44 | 241.06 | -6.68% |
| | | RF | | 92.9% | 89.5% | 4,018,432 | 3,666,219 | 169.65 | 249.54 | -8.76% |
| | | XGB | | 93.4% | 90.0% | 4,455,108 | 4,410,629 | 176.09 | 267.87 | -1.00% |
| A-26 | 6410 | CART | 12.52% | 92.3% | 91.3% | 49,612,660 | 45,948,690 | 1937.51 | 2083.30 | -7.39% |
| | | RF | | 92.9% | 91.9% | 44,548,720 | 40,814,960 | 1880.59 | 2029.68 | -8.38% |
| | | XGB | | 93.4% | 92.5% | 49,676,000 | 50,549,740 | 1963.72 | 2260.20 | 1.76% |
| A-27 | 4887 | CART | 66.67% | 92.3% | 85.1% | 1,858,140 | 2,575,538 | 72.44 | 442.86 | 38.61% |
| | | RF | | 92.9% | 85.0% | 1,885,853 | 2,387,018 | 79.65 | 531.25 | 26.57% |
| | | XGB | | 93.4% | 85.4% | 2,179,093 | 2,879,880 | 86.09 | 544.35 | 32.16% |
| A-28 | 6047 | CART | 29.42% | 92.3% | 89.4% | 47,310,370 | 43,168,880 | 1847.51 | 2519.41 | -8.75% |
| | | RF | | 92.9% | 89.9% | 42,416,140 | 38,573,620 | 1790.59 | 2504.61 | -9.06% |
| | | XGB | | 93.4% | 90.5% | 47,399,990 | 47,812,830 | 1873.72 | 2721.63 | 0.87% |

A-29

5070

CART

53.79%

92.3%

86.5%

\(-\) 204 467

\(-\) 5098

\(-\) 7.95

\(-\) 1.66

97.51%

RF

92.9%

86.1%

163,014

273,435

6.90

40.30

67.74%

XGB

93.4%

86.8%

115,297

248,982

4.55

28.64

115.95%

A-30

6179

CART

24.36%

92.3%

90.3%

7,522,978

7,058,487

293.94

382.06

\(-\)6.17%

RF

92.9%

90.6%

7,534,275

7,068,293

318.08

411.80

\(-\)6.18%

XGB

93.4%

91.2%

8,219,857

8,265,167

324.94

442.88

0.55%

A-31

4627

CART

90.18%

92.3%

83.6%

\(-\) 2 506 749

\(-\) 139 983

\(-\) 97.95

\(-\) 121.44

94.42%

RF

92.9%

83.5%

\(-\)1,969,564

73,101

\(-\) 83.10

111.49

103.71%

XGB

93.4%

83.6%

\(-\)2,160,719

76,641

\(-\) 85.45

93.28

103.55%

A-32

5679

CART

51.25%

92.3%

86.8%

5, 220,695

5,811,833

203.94

583.55

11.32%

RF

92.9%

86.9%

5,401,696

5,269,505

228.08

605.69

\(-\)2.45%

XGB

93.4%

87.4%

5,943,841

6,682,230

234.94

670.03

12.42%

| No | Time (s) | Model | % Target diff | Accuracy \(y_i\) | Accuracy \(\tilde{y}_i\) | Retention gain \(y_i\) | Retention gain \(\tilde{y}_i\) | RG/target \(y_i\) | RG/target \(\tilde{y}_i\) | Improvement\(^{\text{a}}\) |
|---|---|---|---|---|---|---|---|---|---|---|
| B-1 | 4778 | CART | 75.89% | 92.3% | 84.0% | \(-\)627,165 | \(-\)148,913 | \(-\)24.46 | \(-\)65.19 | 76.26% |
| | | RF | | 92.9% | 83.7% | \(-\)280,855 | 11,973 | \(-\)11.84 | 9.57 | 104.26% |
| | | XGB | | 93.4% | 84.1% | \(-\)366,103 | 25,099 | \(-\)14.47 | 12.30 | 106.86% |
| B-2 | 6074 | CART | 29.70% | 92.3% | 89.7% | 3,862,156 | 3,397,247 | 150.95 | 203.11 | \(-\)12.04% |
| | | RF | | 92.9% | 89.9% | 4,127,224 | 3,550,730 | 174.26 | 230.67 | \(-\)13.97% |
| | | XGB | | 93.4% | 90.6% | 4,451,686 | 4,408,819 | 175.99 | 250.17 | \(-\)0.96% |
| B-3 | 4528 | CART | 96.60% | 92.3% | 83.2% | \(-\)2,929,448 | \(-\)85,465 | \(-\)114.46 | \(-\)1482.06 | 97.08% |
| | | RF | | 92.9% | 83.3% | \(-\)2,413,433 | 3724 | \(-\)101.84 | \(-\)108.33 | 100.15% |
| | | XGB | | 93.4% | 83.3% | \(-\)2,642,119 | 9092 | \(-\)104.47 | 93.79 | 100.34% |
| B-4 | 5476 | CART | 60.93% | 92.3% | 85.9% | 1,559,874 | 2,471,262 | 60.95 | 329.63 | 58.43% |
| | | RF | | 92.9% | 85.8% | 1,994,645 | 2,517,111 | 84.26 | 422.45 | 26.19% |
| | | XGB | | 93.4% | 86.3% | 2,175,670 | 3,089,897 | 85.99 | 422.77 | 42.02% |
| B-5 | 4708 | CART | 84.33% | 92.3% | 83.4% | \(-\)857,439 | \(-\)159,856 | \(-\)33.45 | \(-\)218.16 | 81.36% |
| | | RF | | 92.9% | 83.3% | \(-\)484,459 | 40 | \(-\)20.44 | 7.23 | 100.01% |
| | | XGB | | 93.4% | 83.3% | \(-\)596,203 | 897 | \(-\)23.57 | 46.96 | 100.15% |
| B-6 | 5906 | CART | 36.63% | 92.3% | 88.8% | 705,721 | 922,490 | 27.69 | 60.21 | 30.72% |
| | | RF | | 92.9% | 88.9% | 1,352,182 | 1,269,349 | 57.11 | 97.63 | \(-\)6.13% |
| | | XGB | | 93.4% | 89.6% | 1,342,882 | 1,428,722 | 53.09 | 96.76 | 6.39% |
| B-7 | 4400 | CART | 98.49% | 92.3% | 83.2% | \(-\)3,159,722 | \(-\)39,633 | \(-\)123.45 | \(-\)1230.61 | 98.75% |
| | | RF | | 92.9% | 83.3% | \(-\)2,617,037 | 1024 | \(-\)110.44 | 0.56 | 100.04% |
| | | XGB | | 93.4% | 83.3% | \(-\)2,872,219 | 295 | \(-\)113.57 | 19.31 | 100.01% |
| B-8 | 5278 | CART | 73.18% | 92.3% | 84.6% | \(-\)1,596,562 | 169,852 | \(-\)62.31 | 41.78 | 110.64% |
| | | RF | | 92.9% | 84.6% | \(-\)780,396 | 637,625 | \(-\)32.89 | 194.52 | 181.71% |
| | | XGB | | 93.4% | 85.0% | \(-\)933,133 | 780,845 | \(-\)36.91 | 188.79 | 183.68% |
| B-9 | 4601 | CART | 94.12% | 92.3% | 83.3% | \(-\)2,380,789 | \(-\)113,444 | \(-\)92.86 | \(-\)840.25 | 95.24% |
| | | RF | | 92.9% | 83.3% | \(-\)1,403,468 | 317 | \(-\)59.21 | 7.96 | 100.02% |
| | | XGB | | 93.4% | 83.3% | \(-\)1,724,731 | 3980 | \(-\)68.17 | 149.44 | 100.23% |
| B-10 | 5947 | CART | 35.98% | 92.3% | 89.0% | \(-\)760,449 | 429,196 | \(-\)29.35 | 29.80 | 156.44% |
| | | RF | | 92.9% | 88.5% | 1,175,540 | 1,354,131 | 49.71 | 118.11 | 15.19% |
| | | XGB | | 93.4% | 89.8% | 871,455 | 1,456,080 | 34.48 | 96.25 | 67.09% |
| B-11 | 4229 | CART | 99.16% | 92.3% | 83.3% | \(-\)4,683,072 | \(-\)48,985 | \(-\)182.86 | \(-\)1186.22 | 98.95% |
| | | RF | | 92.9% | 83.3% | \(-\)3,536,046 | 0 | \(-\)149.21 | 0 | 100.00% |
| | | XGB | | 93.4% | 83.3% | \(-\)4,000,747 | 0 | \(-\)158.17 | 0 | 100.00% |
| B-12 | 5391 | CART | 66.76% | 92.3% | 85.0% | \(-\)3,062,732 | \(-\)388,289 | \(-\)119.35 | \(-\)80.44 | 87.32% |
| | | RF | | 92.9% | 84.7% | \(-\)957,039 | 710,688 | \(-\)40.29 | 220.55 | 174.26% |
| | | XGB | | 93.4% | 85.3% | \(-\)1,404,561 | 834,198 | \(-\)55.52 | 163.88 | 159.39% |
| B-13 | 4493 | CART | 96.30% | 92.3% | 83.3% | \(-\)2,358,179 | \(-\)159,922 | \(-\)91.98 | \(-\)2793.13 | 93.22% |
| | | RF | | 92.9% | 83.3% | \(-\)1,384,098 | 0 | \(-\)58.40 | 0 | 100.00% |
| | | XGB | | 93.4% | 83.3% | \(-\)1,705,577 | 0 | \(-\)67.42 | 0 | 100.00% |
| B-14 | 5851 | CART | 42.98% | 92.3% | 87.8% | \(-\)3,251,762 | \(-\)1,761,821 | \(-\)126.63 | \(-\)143.20 | 45.82% |
| | | RF | | 92.9% | 86.4% | \(-\)1,013,089 | 79,273 | \(-\)42.69 | 11.90 | 107.82% |
| | | XGB | | 93.4% | 83.3% | \(-\)1,582,006 | 4396 | \(-\)62.52 | 287.68 | 100.28% |
| B-15 | 4040 | CART | 99.67% | 92.3% | 83.3% | \(-\)4,660,462 | \(-\)38,969 | \(-\)181.98 | \(-\)2075.03 | 99.16% |
| | | RF | | 92.9% | 83.3% | \(-\)3,516,676 | 0 | \(-\)148.40 | 0 | 100.00% |
| | | XGB | | 93.4% | 83.3% | \(-\)3,981,592 | 161 | \(-\)157.42 | 10.53 | 100.00% |
| B-16 | 5182 | CART | 77.97% | 92.3% | 84.2% | \(-\)5,554,044 | \(-\)1,491,522 | \(-\)216.63 | \(-\)549.23 | 73.15% |
| | | RF | | 92.9% | 83.6% | \(-\)3,145,668 | 52,475 | \(-\)132.69 | 84.54 | 101.67% |
| | | XGB | | 93.4% | 83.3% | \(-\)3,858,022 | 0 | \(-\)152.52 | 0 | 100.00% |

| No | Time (s) | Model | % Target diff | Accuracy \(y_i\) | Accuracy \(\tilde{y}_i\) | Retention gain \(y_i\) | Retention gain \(\tilde{y}_i\) | RG/target \(y_i\) | RG/target \(\tilde{y}_i\) | Improvement\(^{\text{a}}\) |
|---|---|---|---|---|---|---|---|---|---|---|
| B-17 | 5324 | CART | 32.66% | 92.3% | 88.9% | 3,361,471 | 3,037,200 | 131.25 | 191.31 | \(-\)9.65% |
| | | RF | | 92.9% | 89.3% | 3,241,680 | 2,911,023 | 136.86 | 204.43 | \(-\)10.20% |
| | | XGB | | 93.4% | 89.6% | 3,596,593 | 3,546,671 | 142.15 | 222.04 | \(-\)1.39% |
| B-18 | 6411 | CART | 13.83% | 92.3% | 91.1% | 39,860,670 | 37,695,680 | 1556.66 | 1778.71 | \(-\)5.43% |
| | | RF | | 92.9% | 91.7% | 35,787,050 | 32,345,100 | 1510.72 | 1654.32 | \(-\)9.62% |
| | | XGB | | 93.4% | 92.0% | 39,908,670 | 40,886,810 | 1577.61 | 1848.71 | 2.45% |
| B-19 | 4853 | CART | 70.34% | 92.3% | 84.7% | 1,059,189 | 1,813,631 | 41.25 | 392.14 | 71.23% |
| | | RF | | 92.9% | 84.8% | 1,109,101 | 1,808,616 | 46.86 | 474.33 | 63.07% |
| | | XGB | | 93.4% | 85.0% | 1,320,578 | 2,141,271 | 52.15 | 482.34 | 62.15% |
| B-20 | 5973 | CART | 31.76% | 92.3% | 89.2% | 37,558,390 | 34,068,550 | 1466.66 | 2125.97 | \(-\)9.29% |
| | | RF | | 92.9% | 89.4% | 33,654,470 | 30,032,580 | 1420.72 | 2072.47 | \(-\)10.76% |
| | | XGB | | 93.4% | 90.1% | 37,632,650 | 38,008,480 | 1487.61 | 2277.17 | 1.00% |
| B-21 | 5228 | CART | 41.79% | 92.3% | 87.7% | 1,136,879 | 1,179,837 | 44.40 | 92.50 | 3.78% |
| | | RF | | 92.9% | 88.1% | 1,276,808 | 1,188,256 | 53.91 | 104.81 | \(-\)6.94% |
| | | XGB | | 93.4% | 88.7% | 1,385,145 | 1,356,864 | 54.74 | 104.76 | \(-\)2.04% |
| B-22 | 6296 | CART | 19.52% | 92.3% | 90.7% | 18,704,980 | 17,177,190 | 730.55 | 852.81 | \(-\)8.17% |
| | | RF | | 92.9% | 91.1% | 17,182,100 | 15,732,340 | 725.34 | 859.29 | \(-\)8.44% |
| | | XGB | | 93.4% | 91.5% | 19,071,370 | 19,050,020 | 753.90 | 939.00 | \(-\)0.11% |
| B-23 | 4746 | CART | 81.36% | 92.3% | 84.1% | \(-\)1,165,404 | 458,223 | \(-\)45.60 | 172.83 | 139.32% |
| | | RF | | 92.9% | 84.0% | \(-\)855,770 | 525,335 | \(-\)36.09 | 288.55 | 161.39% |
| | | XGB | | 93.4% | 84.1% | \(-\)890,871 | 645,445 | \(-\)35.26 | 310.86 | 172.45% |
| B-24 | 5845 | CART | 40.47% | 92.3% | 88.2% | 16,402,700 | 15,013,310 | 640.55 | 1093.43 | \(-\)8.47% |
| | | RF | | 92.9% | 88.4% | 15,049,520 | 13,423,040 | 635.34 | 1122.81 | \(-\)10.81% |
| | | XGB | | 93.4% | 88.9% | 16,795,360 | 17,144,260 | 663.90 | 1247.50 | 2.08% |
| B-25 | 5274 | CART | 37.42% | 92.3% | 88.6% | 1,607,847 | 1,839,864 | 62.84 | 126.33 | 14.43% |
| | | RF | | 92.9% | 88.7% | 2,119,067 | 1,923,982 | 89.49 | 152.71 | \(-\)9.21% |
| | | XGB | | 93.4% | 89.2% | 2,237,965 | 2,194,469 | 88.45 | 155.54 | \(-\)1.94% |
| B-26 | 6425 | CART | 14.83% | 92.3% | 91.1% | 35,238,060 | 32,690,970 | 1376.37 | 1558.26 | \(-\)7.23% |
| | | RF | | 92.9% | 91.6% | 32,835,370 | 29,986,540 | 1386.17 | 1543.12 | \(-\)8.68% |
| | | XGB | | 93.4% | 92.0% | 36,328,440 | 36,803,630 | 1436.10 | 1688.53 | 1.31% |
| B-27 | 4811 | CART | 73.92% | 92.3% | 84.3% | \(-\)694,436 | 751,404 | \(-\)27.16 | 226.99 | 208.20% |
| | | RF | | 92.9% | 84.4% | \(-\)13,512 | 1,018,369 | \(-\)0.51 | 356.48 | 7636.98% |
| | | XGB | | 93.4% | 84.7% | \(-\)38,050 | 1,253,252 | \(-\)1.55 | 345.94 | 3393.68% |
| B-28 | 5995 | CART | 32.61% | 92.3% | 89.1% | 32,935,780 | 29,342,930 | 1286.37 | 1847.71 | \(-\)10.91% |
| | | RF | | 92.9% | 89.4% | 30,702,790 | 27,725,620 | 1296.17 | 1933.38 | \(-\)9.70% |
| | | XGB | | 93.4% | 90.0% | 34,052,420 | 34,390,060 | 1346.10 | 2094.90 | 0.99% |
| B-29 | 5143 | CART | 47.03% | 92.3% | 87.3% | \(-\)363,861 | 55,985 | \(-\)14.12 | 3.38 | 115.39% |
| | | RF | | 92.9% | 87.4% | 377,170 | 488,284 | 15.95 | 49.62 | 29.46% |
| | | XGB | | 93.4% | 88.0% | 275,772 | 491,567 | 10.89 | 44.89 | 78.25% |
| B-30 | 6243 | CART | 20.47% | 92.3% | 90.7% | 14,747,500 | 13,838,380 | 576.23 | 690.22 | \(-\)6.16% |
| | | RF | | 92.9% | 91.1% | 14,816,830 | 13,378,460 | 625.54 | 743.34 | \(-\)9.71% |
| | | XGB | | 93.4% | 91.5% | 16,146,490 | 16,169,440 | 638.30 | 814.80 | 0.14% |
| B-31 | 4730 | CART | 83.83% | 92.3% | 83.7% | \(-\)2,666,144 | \(-\)487,716 | \(-\)104.12 | \(-\)267.75 | 81.71% |
| | | RF | | 92.9% | 83.7% | \(-\)1,755,409 | 139,545 | \(-\)74.05 | 102.66 | 107.95% |
| | | XGB | | 93.4% | 83.7% | \(-\)2,000,244 | 134,199 | \(-\)79.11 | 130.13 | 106.71% |
| B-32 | 5865 | CART | 41.41% | 92.3% | 88.0% | 12,445,210 | 11,693,070 | 486.23 | 884.49 | \(-\)6.04% |
| | | RF | | 92.9% | 88.3% | 12,684,250 | 11,381,260 | 535.54 | 971.28 | \(-\)10.27% |
| | | XGB | | 93.4% | 88.8% | 13,870,470 | 14,101,470 | 548.30 | 1048.38 | 1.67% |

  1. \(^{\text{a}}\) To account for negative retention gains, the improvement is computed using the absolute value of the denominator. This yields a rather unintuitive improvement measure whenever one of the models yields a negative RG and the other a positive RG.
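
The footnote's improvement measure can be illustrated with a minimal sketch. It assumes the improvement is the difference between the retention gain obtained with \(\tilde{y}_i\) and the one obtained with \(y_i\), divided by the absolute value of the latter. This formula is inferred from the footnote and agrees with the tabulated rows checked here; the function and argument names are hypothetical.

```python
def improvement(rg_y: float, rg_y_tilde: float) -> float:
    """Relative improvement of the retention gain under y-tilde over the
    retention gain under y, using |rg_y| as the denominator (assumed formula)."""
    if rg_y == 0:
        # The ratio is undefined when the reference retention gain is zero.
        raise ValueError("reference retention gain must be non-zero")
    return (rg_y_tilde - rg_y) / abs(rg_y)


if __name__ == "__main__":
    # (RG with y_i, RG with y-tilde_i), copied from the tables above.
    rows = {
        "A-17 CART (both positive)": (5_100_456, 4_899_479),
        "A-13 CART (both negative)": (-1_201_626, -163_056),
        "B-8 CART (sign change)": (-1_596_562, 169_852),
    }
    for label, (rg_y, rg_y_tilde) in rows.items():
        print(f"{label}: {100 * improvement(rg_y, rg_y_tilde):.2f}%")
```

With the absolute value in the denominator, a less negative gain counts as a positive improvement (86.43% for A-13 CART), while a smaller positive gain counts as a negative one (\(-\)3.94% for A-17 CART). When the reference gain is negative and close to zero, the ratio can become very large, as in the B-27 rows.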

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Valla, M., Milhaud, X. & Olympio, A. Including individual customer lifetime value and competing risks in tree-based lapse management strategies. Eur. Actuar. J. 14, 99–144 (2024). https://doi.org/10.1007/s13385-023-00358-0


  • DOI: https://doi.org/10.1007/s13385-023-00358-0
