Abstract
This paper addresses the prediction of the total damage costs caused by a drought episode under the French “Régime de Catastrophes Naturelles”. Because of the specific features of this natural disaster compensation scheme, an early prediction of the cost of a disaster is needed to inform strategic decisions. Through a partnership with the Mission Risques Naturels, we have access to a database of natural disaster claims fed by the major French insurance companies. We combine the drought claims in this database with meteorological and socioeconomic data to obtain a more comprehensive view of the exposure. Our prediction approach relies on comparing different statistical models and machine learning algorithms. To improve prediction performance, we propose an aggregation of the different models. The main difficulty is imbalanced data, since a large majority of cities are not affected by a drought event; the predictions are therefore assessed with F1-scores and precision-recall curves.
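To illustrate why F1-scores are preferred over accuracy when most cities are unaffected, here is a minimal sketch in plain Python. The data are a made-up toy example (100 cities, 5 affected), and the function name `precision_recall_f1` and both prediction vectors are hypothetical; none of this comes from the authors' database or code.

```python
def precision_recall_f1(y_true, y_pred):
    """Return (precision, recall, F1) for binary labels, with 1 = affected city."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


def accuracy(y_true, y_pred):
    """Share of cities whose label is predicted correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy imbalanced sample (assumption): 5 affected cities out of 100.
y_true = [1] * 5 + [0] * 95

# A trivial model that never predicts a drought claim.
always_negative = [0] * 100

# A hypothetical model that flags 8 cities, 4 of them correctly.
flagging_model = [1, 1, 1, 1, 0] + [1] * 4 + [0] * 91

acc_trivial = accuracy(y_true, always_negative)
p_trivial, r_trivial, f1_trivial = precision_recall_f1(y_true, always_negative)

acc_model = accuracy(y_true, flagging_model)
p_model, r_model, f1_model = precision_recall_f1(y_true, flagging_model)

print(f"trivial model : accuracy={acc_trivial:.2f}, F1={f1_trivial:.2f}")
print(f"flagging model: accuracy={acc_model:.2f}, F1={f1_model:.2f}")
```

In this toy case the trivial model reaches the same accuracy as the flagging model while detecting no affected city, and only the F1-score separates the two.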
Availability of data and material
The database is not publicly available for confidentiality reasons.
Code availability
The code is publicly available at https://github.com/antoine-heranval/Paper_Application-of-machine-learning-methods-for-cost-prediction-of-drought-in-France
Funding
This research was supported by the Mission Risques Naturels.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Heranval, A., Lopez, O. & Thomas, M. Application of machine learning methods to predict drought cost in France. Eur. Actuar. J. 13, 731–753 (2023). https://doi.org/10.1007/s13385-022-00327-z
DOI: https://doi.org/10.1007/s13385-022-00327-z
Keywords
- Natural catastrophe
- Generalized linear models
- Lasso and elastic-net penalties
- Extreme gradient boosting
- Random forests