Use of machine learning tools to predict health risks from climate-sensitive extreme weather events: A scoping review

Shakirah N. Ssebyala; Timothy M. Kintu; David J. Muganzi; Caleb Dresser; Michelle R. Demetres; Yuan Lai; Kobusingye Mercy; Chenyu Li; Fei Wang; Soko Setoguchi; Leo Anthony Celi; Arnab K. Ghosh

doi:10.1371/journal.pclm.0000338

Abstract

Machine learning (ML) algorithms may play a role in predicting the adverse health impacts of climate-sensitive extreme weather events because accurate prediction of such effects can guide proactive clinical and policy decisions. We aimed to systematically review the literature describing ML algorithms that predict health outcomes from climate-sensitive extreme weather events. A comprehensive literature search was performed in the following databases from inception to October 2022: Ovid MEDLINE, Ovid EMBASE, The Cochrane Library, Web of Science, bioRxiv, medRxiv, Institute of Electrical and Electronic Engineers, Google Scholar, and Engineering Village. The retrieved studies were screened for eligibility against predefined inclusion/exclusion criteria and then qualitatively synthesized based on the type of extreme weather event. Gaps in the literature were identified based on this synthesis. Of the 6096 records screened, seven studies met the inclusion criteria. Six of the studies predicted health outcomes from heat waves, and one from flooding. Health outcomes described included 1) all-cause non-age standardized mortality rates, 2) heat-related conditions and 3) post-traumatic stress disorder. Prediction models were developed using six validated ML techniques: non-linear exponential regression, logistic regression, spatiotemporal Integrated Nested Laplace Approximation (INLA), random forest, decision tree (DT) methods, and support vector machines (SVM). Use of ML algorithms to assess adverse health impacts from climate-sensitive extreme weather events is possible. However, to fully utilize these ML techniques, higher-quality data suitable for this use are desirable. Development of data standards for climate change and health may help ensure model robustness and comparison across space and time. Future research should also consider health equity implications.

Citation: Ssebyala SN, Kintu TM, Muganzi DJ, Dresser C, Demetres MR, Lai Y, et al. (2024) Use of machine learning tools to predict health risks from climate-sensitive extreme weather events: A scoping review. PLOS Clim 3(1): e0000338. https://doi.org/10.1371/journal.pclm.0000338

Editor: Olivier Damette, Universite de Lorraine and Climate Economic Chair Paris Dauphine, FRANCE

Received: August 9, 2023; Accepted: December 11, 2023; Published: January 17, 2024

Copyright: © 2024 Ssebyala et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.

Data Availability: This is a scoping review and all data used was publicly available. Details are noted in the supplemental information.

Funding: The authors received no specific funding for this work.

Competing interests: The authors have declared that no competing interests exist.

Introduction

Climate change represents the greatest public health challenge of our time. In the United States, seven in ten Americans experienced extreme weather events in 2022 [1]. While there has been increasing acknowledgement in government [2, 3] and health-related organizations [4, 5] of the impending threat that climate change and associated extreme weather events pose to at-risk populations, there has been a focus on mitigating its severity rather than adapting to anticipated impacts. However, adapting to future climate impacts will be increasingly important [6]. Extreme weather events are likely to increase in frequency and destructiveness [7]; the warming climate is leading to escalating risks related to heat waves, droughts, wildfires, severe weather, tropical cyclones, extreme rainfall, and flooding.

Machine learning (ML) algorithms have the potential to play an important role in predicting individual and population-level health impacts of climate-sensitive extreme weather. By leveraging vast amounts of clinical, socioeconomic, and environmental data, ML-guided tools can be harnessed to identify and quantify the risk to individuals and populations from specific threats [8, 9], similar to their use in predicting breast cancer survival [10], occurrence of coronary artery disease [11], and in classifying and detecting cancerous lesions on images [12, 13]. Additionally, ML-guided tools have the potential to build on existing systems that assess populations at risk from extreme weather events and take advantage of secondary data to do so. By leveraging healthcare system data (e.g., electronic medical records), healthcare organizations and public health authorities may have the opportunity to tailor algorithms to specific populations, making them more context- and disease-specific, and perhaps addressing limitations of existing emergency preparedness systems.

It is unclear to what extent ML techniques are being used to predict health outcomes from extreme weather events, and to what extent they may be useful for doing so. In this scoping review, we aimed to comprehensively describe peer-reviewed manuscripts in the published and grey literature reporting use of ML methods to predict health outcomes from climate-sensitive extreme weather events worldwide.

Methods

We performed a scoping review to examine the use of ML techniques to predict health outcomes during and after extreme weather events. This study was performed following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses extension for scoping reviews (PRISMA-ScR) [14]. In accordance with these standards, a protocol was submitted and preregistered with the International Prospective Register of Systematic Reviews (PROSPERO; CRD42023391186). The PRISMA flow diagram is shown in Fig 1.

Fig 1. PRISMA 2020 flow diagram for new systematic reviews which included searches of databases and registers only.

*Due to the inherently large number of results retrieved from a Google Scholar search, only the first 500 results were screened.

https://doi.org/10.1371/journal.pclm.0000338.g001

Search strategy

A medical librarian (MRD) performed comprehensive searches to identify studies that addressed ML approaches to predicting at-risk populations or health impacts from climate-sensitive extreme weather events. Searches were finalized October 20, 2022, in the following databases: Ovid MEDLINE, Ovid EMBASE, The Cochrane Library (Wiley), Web of Science (Core Collection–Clarivate), bioRxiv, medRxiv, Institute of Electrical and Electronic Engineers (IEEE) database, Google Scholar, and Engineering Village (Elsevier). The search strategy included all appropriate controlled vocabulary and keywords for the concepts of "machine learning" and "natural disasters" or "extreme weather." The full search strategies for all databases are available in Supplement (S1 Table). To limit publication bias, there were no language, publication date, or article type restrictions on the search strategy. For articles selected for inclusion in this study, reference lists and citing articles were pulled from Scopus (Elsevier) and screened. Two additional studies were identified from hand-searching relevant journals.

Study selection

Retrieved studies were screened (AKG, MRD, SNS, SS, FW, LAC, CD, YL, DJM, KM, TMK) for inclusion using Covidence systematic review software. Titles and abstracts were reviewed against predefined inclusion/exclusion criteria by two independent reviewers. Discrepancies were resolved by consensus. For final inclusion, the full text was retrieved and then screened by two independent reviewers.

Inclusion and exclusion criteria

Our inclusion criteria were: (1) Population: participants ≥ 18 years old; (2) Exposure: extreme weather and related events where the onset is acute, defined as within 2 weeks, and that are thought to be exacerbated by climate change (i.e., hurricanes/cyclones/typhoons, wet precipitation leading to flooding, wildfires, heat waves, extreme cold, mudslides); (3) Outcomes studied: health events (death, hospitalization, any presentation to clinic or emergency department) within a defined time-frame (e.g., 30 days, 2 months, 1 year, 5 years); (4) The study employs ML-based modeling methods (e.g., linear regression, decision trees, k-nearest neighbors, random forests, kernel methods, deep neural networks) [15] to predict the health outcomes; (5) The ML methods undergo evaluation by assessment and/or creation of accuracy, precision, recall, specificity, F1 score, receiver operating characteristic curve, or precision-recall curve; (6) The ML models undergo internal validation using either a training and testing split, resubstitution, K-fold cross-validation, bootstrapping, or nested cross-validation, or external validation using a different dataset.
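As an illustration of one of the internal validation schemes named in criterion (6), K-fold cross-validation can be sketched as below. This is a minimal sketch and is not drawn from any included study; the function name `k_fold_indices` and the sample counts are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Split indices 0..n_samples-1 into k (train, test) pairs.

    Each index appears in exactly one test fold, so every observation
    is used for testing once and for training k-1 times.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        folds.append((train, test))
        start += size
    return folds
```

In practice the rows are usually shuffled before folding, unless the data are time-ordered, in which case a temporal split is typically preferred.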

Studies were excluded according to the following criteria: (1) Written in non-English language; (2) Review articles, commentaries, or editorials; and (3) Authors do not include measures for ML model validation.

Data extraction

Data extraction was performed independently by a pair of reviewers with predefined, standardized templates. Each extraction was reviewed independently by a secondary reviewer (SNS) after extraction by the primary reviewer (AKG). Data points defined for extraction were extreme weather event; study design; population; health outcome; time horizon for health outcome; ML techniques; validation methods; and key findings.

Data synthesis

Following data extraction, findings were synthesized qualitatively to describe the geographies covered and study settings, types of extreme weather events, health outcomes, ML techniques used, and validation methods employed. Gaps in the literature were then identified based on this qualitative synthesis. No quantitative assessment of the literature (i.e., meta-analysis) was performed because of the heterogeneity in ML techniques used and outcomes assessed.

Results

Summary of articles

The study selection process is outlined in the PRISMA flow diagram in Fig 1. After removal of duplicates, a total of 6096 records were screened. A total of 7 studies, summarized in Table 1, met criteria for inclusion in the analysis.

Table 1. Summary of study results.

https://doi.org/10.1371/journal.pclm.0000338.t001

1. Geographies covered and study settings.

The seven studies analyzed here report data from the United States of America [16], Europe [17–19], China [20, 21] and South Korea [22]. Data were collected at the country [18, 19], county [16] or division [20, 22], and city level [17, 21]. There were 16 European countries included, namely Austria, Belgium, Croatia, the Czech Republic, Denmark, France, Germany, Italy, Luxembourg, Netherlands, Poland, Portugal, Slovenia, Spain, Switzerland and the United Kingdom (England and Wales only) [18, 19]. The main study design used was retrospective secondary data analysis, utilized in six of the articles [16–19, 21, 22]. The remaining study from Hunan, China used a multistage, stratified cluster sampling design [20].

2. Extreme weather events studied.

Six studies examined the impact of heat waves on population health either daily [16, 18, 19, 21], within 2 days [17], or 1 week after the heat wave [22]. The definition of heat waves varied between regions. Heat-associated outcomes were documented for temperatures >29°C in Lisbon, Portugal [17], >33°C in Korea [22] and >35°C in China [21]. The study from Georgia, USA documented extreme heat events as a temperature above either the 95th percentile of daily maximum temperature, the 98th percentile of daily minimum temperature, or the 99th percentile of apparent temperature [16]. Both Lowe et al. studies from Europe examined a heat wave and cold spells for which the exact temperatures were not defined [18, 19]. One study from the Hunan province in China examined the effect of a 1998 flood approximately two years later [20].
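To make the percentile-based definitions concrete, the sketch below flags days whose temperature exceeds a chosen percentile of the series. It is an illustration under stated assumptions, not the procedure of the Georgia study: the function names, the strict "greater than" rule, the interpolation method, and the synthetic temperatures (stored as integer tenths of a degree) are all hypothetical.

```python
from statistics import quantiles


def percentile_threshold(values, pct):
    """Return the pct-th percentile (integer 1..99) of values."""
    # quantiles(..., n=100) returns the 99 cut points between percentiles.
    cut_points = quantiles(values, n=100, method="inclusive")
    return cut_points[pct - 1]


def flag_exceedance(values, pct):
    """Flag each day whose value is strictly above the pct-th percentile."""
    threshold = percentile_threshold(values, pct)
    return [value > threshold for value in values]


# Synthetic daily maximum temperatures in tenths of a degree C
# (200 means 20.0 C); purely illustrative data.
daily_max_tenths = list(range(200, 301))
extreme_days = flag_exceedance(daily_max_tenths, 95)
```

Because the threshold is relative to the local temperature distribution, the same percentile corresponds to different absolute temperatures in different places, which is one reason fixed and percentile-based definitions are hard to compare directly.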

3. Health outcomes evaluated.

The health outcomes described in the final studies were all-cause, non-age-standardized mortality [17–19], post-traumatic stress disorder (PTSD) diagnosed using validated instruments [20], emergency department presentations [16], and heat-related conditions including heat stroke, cardiovascular and respiratory diseases using relevant International Classification of Disease version 9 codes from a thermal disease monitoring system [21, 22].

4. Description and comparison of machine learning techniques employed.

Among the seven studies, six ML techniques were utilized: non-linear exponential regression [17], logistic regression [16, 20], spatiotemporal Integrated Nested Laplace Approximation (INLA) [18, 19], random forest models [21, 22], decision tree (DT) methods, and support vector machines (SVM) [22]. Only one study [22] compared its proposed model to other ML strategies, comparing a random forest model to DT, SVM, and logistic regression (LR).

The techniques examined in this review are all supervised learning methods, whose goal is to use labelled data to predict outcomes. Of note, these techniques have varied levels of interpretability and performance. In particular, regression methods such as non-linear exponential and logistic regression produce outputs that are more easily interpreted but may have lower performance. On the other hand, the classification models including INLA, DT, and SVM use categorical labels and have lower interpretability and higher performance compared to the regression models [15].

Regression was a common approach. Logistic regression, used in two studies [16, 20], models the log-odds of a binary outcome as a linear function of covariates. Dessai et al. [17] instead employed non-linear exponential regression to model the aggregate, non-linear effects of climate-related mortality. Jiang et al. [16] applied logistic regression to a time-series analysis of emergency department (ED) visit data and extreme heat indicators. Different combinations of the binary covariates allowed the authors to generate better predictive models that capture both lagged and sustained effects of extreme heat. Huang et al. used stepwise forward regression to select predictive risk factors for the binary dependent variable, the presence or absence of PTSD [20].

In contrast, both studies by Lowe et al. employed INLA, a Bayesian technique that efficiently models data structures with spatiotemporal components. They developed models to estimate mortality-apparent temperature relationships in a Bayesian model framework, which allows the simulation of probabilistic predictions of daily mortality in space and time. These models were then fitted with INLA to simulate the mortality predictions for the heat and cold spells [18, 19].

The third type of ML modeling was the random forest model. Random forest models construct hundreds or thousands of deep decision trees, each trained on a bootstrap sample (a resampling technique that involves random sampling with replacement), which makes the forests stronger predictors. The outputs of these trees are then combined, which makes this method less prone to overfitting and multi-collinearity and yields robust predictions that generalize well. Additionally, random forests usually require less parameter tuning, which makes them a good method for smaller datasets [15]. Wang et al. used the Boruta algorithm, a wrapper built around the random forest algorithm that uses a z-score for feature selection, ensuring that the variables in their model were significantly correlated with the outcome variable. They added socioeconomic variables including location, gross domestic product, and population density, which improved prediction and model fit [21]. The Park et al. group also used the Boruta method to consider a range of socioeconomic, meteorological, and demographic variables in their model. Of note, they report that regional economic indicators such as income and insurance had less impact on prediction of the diseases related to heatwaves [22].
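The two ingredients described above, bootstrap resampling and combining tree outputs, can be sketched in a few lines. This is a simplified illustration rather than a full random forest (no trees are trained here); the function names `bootstrap_sample` and `majority_vote` are hypothetical.

```python
import random
from collections import Counter


def bootstrap_sample(n_rows, rng):
    """Draw n_rows row indices with replacement.

    Returns (in_bag, out_of_bag): the sampled indices, which may repeat,
    and the sorted indices that were never drawn.
    """
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    out_of_bag = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, out_of_bag


def majority_vote(tree_predictions):
    """Combine per-tree class predictions into one forest prediction."""
    return Counter(tree_predictions).most_common(1)[0][0]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
in_bag, out_of_bag = bootstrap_sample(50, rng)
```

Each tree in a forest would be trained on its own `in_bag` rows; the rows left out of a given sample are commonly used to estimate that tree's error without a separate test set.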

Park et al. also used other ML techniques, including DT and SVM. DT are an essential building block for many ML techniques (including random forest models) as they cluster data based on classes or probabilities. However, a disadvantage of DT methods is that they are prone to overfitting. SVM methods are used for classification and, in their regression form, for continuous outputs; they are robust to noise and particularly effective in high-dimensional datasets. The goal of SVM is to find the optimal decision boundary that separates the classes. Compared to DT and random forest (RF) methods, the outputs of an SVM may be less interpretable [15]. Park et al. compared their RF model to the DT and SVM models to determine the accuracy of their predictions. Using the same training data set, they found that the RF model was more accurate than the other models [22].

5. Validation methods used.

The ML techniques were validated using three methods: internal validation [17, 20, 21], simulation [16], and quantitative comparison between predicted and observed data [18, 19, 22]. In one study, the authors undertook internal validation of the ML method by splitting the data into two time periods, training the model on one period and testing it on the other, and then comparing the observed and expected values through residual analysis and regression coefficients [17]. In other studies, a 70/30 [20] or 90/10 [21] training/test split was used to calculate the area under the receiver operating characteristic (ROC) curve [20, 21].
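A training/test split followed by calculation of the area under the ROC curve can be sketched as below. This is a minimal sketch under stated assumptions, not the code of any included study: the function names, the random shuffle, and the seed are hypothetical, and the area is computed with the rank-based (Mann-Whitney) formulation, in which it equals the probability that a randomly chosen positive case scores higher than a randomly chosen negative case.

```python
import random


def split_train_test(records, train_fraction, seed=0):
    """Shuffle a copy of records and split it, e.g. 0.7 for a 70/30 split."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = event, 0 = no event)."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))
```

The model would be fitted on the training portion only, and `roc_auc` applied to its predicted scores on the held-out portion.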

There was slight variation in the modes of quantitative comparison between the studies. The Lowe studies also reported the area under the ROC curve and the positive predictive value (PPV) for the heat wave model [18, 19]. Park et al. compared all three modeling strategies using the mean absolute error (MAE), root mean squared error (RMSE), root mean squared logarithmic error (RMSLE), and coefficient of determination R² [22].
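For reference, the four error metrics named above can be computed from paired observed and predicted values as in the sketch below. These are the standard textbook definitions, written out as an illustration; the function name and the example values are hypothetical and do not come from Park et al.

```python
import math


def regression_metrics(observed, predicted):
    """Return MAE, RMSE, RMSLE and R^2 for paired observed/predicted values.

    RMSLE uses log(1 + x), so all values must be non-negative.
    """
    n = len(observed)
    errors = [p - o for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    log_errors = [math.log1p(p) - math.log1p(o)
                  for o, p in zip(observed, predicted)]
    rmsle = math.sqrt(sum(e * e for e in log_errors) / n)
    mean_observed = sum(observed) / n
    ss_residual = sum(e * e for e in errors)
    ss_total = sum((o - mean_observed) ** 2 for o in observed)
    r_squared = 1 - ss_residual / ss_total
    return {"mae": mae, "rmse": rmse, "rmsle": rmsle, "r2": r_squared}


# Hypothetical daily case counts, for illustration only.
metrics = regression_metrics([3, 5, 2, 7], [2, 5, 4, 8])
```

MAE and RMSE are in the units of the outcome, RMSLE penalizes relative rather than absolute errors, and R² is unitless, so the four metrics can rank models differently.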

Qualitative synthesis

1. Heat wave- and temperature-focused papers.

Six of the seven included studies focused on predicting health outcomes associated with heat waves [16–19, 21, 22]; two also evaluated cold spells [18, 19]. These large retrospective studies were conducted at city [17, 21], county [16], division [22], or country [18, 19] level. The definition of heat waves varied, with temperature thresholds ranging from 29°C to 35°C [17, 21, 22]. Most of the studies evaluated health outcomes daily [16, 18, 19, 21], while others examined outcomes within 2 days [17] or 1 week after the heat wave [22]. Outcomes included all-cause, non-age-standardized mortality [17–19], emergency department presentations [16], or specific heat-related health conditions including heat stroke, exhaustion, cramps, fainting and edema [21, 22]. The validated ML techniques used have been described above. Two studies incorporated a range of socioeconomic data such as area-level measures of income, insurance premiums, occupational groups and other environmental factors including internet search index, urban vs rural populations, air conditioner number per hundred houses, and normalized difference in vegetation rates [21, 22].

2. Flooding paper.

Huang et al. examined the effect of a 1998 flood in the Hunan province in China approximately two years after the event [20]. They collected data from a qualitative survey of 29285 individuals impacted by the flood. Seven independent predictive factors (age, gender, education, type of flood, severity of flood, flood experience, and mental status before the flood) were identified and used as key variables in a risk score model using stepwise, forward logistic regression. The area under the ROC curve for the model was 0.853 in the validation data. The sensitivity, specificity, positive predictive value (PPV) and negative predictive value (NPV) of this risk score model were 84.0%, 72.2%, 23.4%, and 97.8%, respectively, at a cutoff value of 67.5 in the validation data.
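The combination of a low PPV with a high NPV is what Bayes' rule predicts when the outcome is uncommon. The sketch below shows the relationship; it is an illustration only. The prevalence of about 9% used here is an assumption back-calculated from the reported sensitivity, specificity, and PPV, not a figure reported above, and the function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence.

    All arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Reported sensitivity and specificity; the prevalence is an assumed value.
ppv, npv = predictive_values(sensitivity=0.840, specificity=0.722,
                             prevalence=0.0918)
```

Under this assumed prevalence the computed values fall close to the reported 23.4% and 97.8%, which suggests the risk score is better suited to ruling out PTSD than to confirming it.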

Discussion

This scoping review identified seven articles predicting the relationship between exposure to climate-sensitive extreme weather events and health outcomes using ML tools that underwent validation. Heat waves were the most studied extreme weather event; two groups also compared cold spells [18, 19]. All but one [16] of the studies were from outside the United States.

Findings from these studies suggest that the use of ML to predict health outcomes in populations at risk of climate-amplified extreme weather events is possible, can be done in a rigorous manner, and has significant public health potential at the local, regional, national, and international levels. Nonetheless, despite ML’s ever increasing role in the healthcare landscape over the last two decades [15], the relatively small number of studies identified in this review highlights the unrealized potential of these tools to identify and safeguard future populations from climate-sensitive extreme weather events.

ML powerfully enables complex analysis of huge datasets that can be leveraged to develop risk prediction tools tailored to specific populations [8]. Additionally, ML-focused projects have the potential to augment existing emergency preparedness systems by making them more adaptable to different contexts and specific diseases. A key question stemming from this scoping review is why, given the growing role of ML across the spectrum of healthcare in general, has ML not been more rigorously utilized in assessing risk from climate-sensitive extreme weather events, particularly in the US setting?

We posit three possible reasons. First, few countries and organizations have access to the data necessary to examine the population-level health effects of climate change. Second, there is a growing need for robust health and environmental data infrastructures including data/metadata standards in climate and health. And third, there are unclear uses for ML-formed risk prediction models in climate change.

To be useful as a public health tool, ML methods need to define populations at risk of climate-sensitive extreme weather events completely. We note that all except one of the reviewed studies were from Europe and Asia, where robust national healthcare data infrastructures exist, thus allowing complex population-level analyses. In the US, national population-level healthcare data are only organized for a few set populations (e.g., Medicare data for those > 65 years, those on hemodialysis). Moreover, health data focused on disease surveillance and healthcare utilization are often collected by different organizations at the federal, state, and local levels, leading to fragmentation across different geographic regions, populations, and levels of specificity. Although methods have been developed to organize healthcare data in the US, there is still work to be done. Consequently, the United States is missing a significant opportunity to harness its extensive expertise in ML and artificial intelligence to examine the impact of climate-sensitive exposures on populations, predict individuals at risk, and implement interventions to safeguard lives and livelihoods in both the short and long term.

Our review highlights the need for more robust health and environmental data infrastructures that incorporate standard definitions and measurements of climate-sensitive exposures. We note the heterogeneity in definitions of heat waves, which ranged from fixed temperature thresholds of 29–35°C to percentile-based thresholds (e.g., above the 95th percentile of daily maximum temperature) to no stated threshold. Health outcomes also varied: some articles evaluated daily mortality, others emergency department visits, and others heat-related conditions, some of which lacked specificity. This lack of standardization makes it difficult to compare proposed models or fit them to different places, and it limits comparisons between geographic regions and across time. As a result, the lack of standards prevents the identification and development of unified measures and guidelines to examine the health effects of climate change broadly and build consensus towards potential interventions. This is important, as climate-amplified extreme weather events are increasing in frequency, intensity, and duration, and novel tools are critical to protecting populations from their consequences.

Last, it is unclear which agency or group might employ these ML-informed risk prediction tools once they are developed. Currently, except for a few notable cases including the Department of Health and Human Services emPOWER program [23], few actors have the breadth, ability, and resources to leverage the knowledge generated from these tools to protect the well-being of at-risk populations, particularly in the US. Lack of proper infrastructure for surveillance, reporting, and integrated evidence-based decision-making has likely made the US more susceptible to impacts from such events, as compared to other high-income countries [24]. Historically, it has been the purview of emergency management services to address the devastation wrought by extreme weather events. However, as these become more frequent, there is concern that these services–typically provided only in the short-term–may not be able to provide the necessary care in a repeated fashion and may miss opportunities for prevention-focused efforts on longer timescales. As a result, countries, states, and cities may seek to reorganize their preparation for and responses to extreme weather hazards, particularly as they become more damaging in terms of health and financial cost. As these changes occur, ML prediction tools may have a role in planning public health approaches and priorities in both pre- and post-event phases.

A final notable feature of our review was the limited focus on factors related to the inequitable impacts of climate-responsive environmental hazards. Addressing health inequities in communities affected by climate-amplified extreme weather events requires nuanced socioeconomic data. ML methods are well-equipped to manage the nested, hierarchical structure of such data forms. However, only two studies incorporated socioeconomic and other environmental factors in their model to capture the complexity of their influence on health outcomes [21, 22]. It is increasingly reported that extreme weather events disproportionately affect the health of populations based on several factors [25] including advanced age [26], gender [26, 27], preexisting conditions [27–29], geographical features [28] especially cities [30], poverty [28, 31], and population density [31]. Incorporation of socioeconomic and spatial parameters in future work may be beneficial.

This study has limitations. Although we performed a comprehensive bibliometric analysis of existing peer-reviewed and grey literature in English, this review included only a small number of final articles. While this limits synthesis of the data to generate strong conclusions on how ML is used as a tool to predict health outcomes, we identify an opportunity to expand use of this apparently underleveraged tool. Moreover, our methodology was robust in that our initial screening was not language specific, and we re-evaluated bibliographies of the final selected manuscripts to ensure the completeness of the research.

In conclusion, this scoping review demonstrated the feasibility of using ML methods to predict the risk of adverse health outcomes from climate-responsive extreme weather events including heat waves, cold spells, and floods. Despite the comprehensive approach, this review yielded only seven articles meeting criteria for inclusion. While there is great opportunity to use ML as a tool to identify and potentially develop strategies to protect vulnerable populations from health harms resulting from extreme weather events, utilization of this tool has been limited. Future efforts may benefit from focusing on utilizing more comprehensive and higher quality data with ML tools, creating data standards for climate change and health datasets to ensure robustness of the models and comparability of results, and expanding the capacity of agencies, health professionals, and other organizations to use these data and translate the findings into actionable public health interventions.

Supporting information

S1 Checklist. Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews (PRISMA-ScR) checklist.

JBI = Joanna Briggs Institute; PRISMA-ScR = Preferred Reporting Items for Systematic reviews and Meta-Analyses extension for Scoping Reviews. * Where sources of evidence (see second footnote) are compiled from, such as bibliographic databases, social media platforms, and Web sites. † A more inclusive/heterogeneous term used to account for the different types of evidence or data sources (e.g., quantitative and/or qualitative research, expert opinion, and policy documents) that may be eligible in a scoping review as opposed to only studies. This is not to be confused with information sources (see first footnote). ‡ The frameworks by Arksey and O’Malley (6) and Levac and colleagues (7) and the JBI guidance (4, 5) refer to the process of data extraction in a scoping review as data charting. § The process of systematically examining research evidence to assess its validity, results, and relevance before using it to inform a decision. This term is used for items 12 and 19 instead of "risk of bias" (which is more applicable to systematic reviews of interventions) to include and acknowledge the various sources of evidence that may be used in a scoping review (e.g., quantitative and/or qualitative research, expert opinion, and policy documents). From: Tricco AC, Lillie E, Zarin W, O’Brien KK, Colquhoun H, Levac D, et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist.

https://doi.org/10.1371/journal.pclm.0000338.s001

(DOCX)

S1 Table. Search strategies for databases.

https://doi.org/10.1371/journal.pclm.0000338.s002