Communicating uncertainty in epidemic models


While mathematical models of disease transmission are widely used to inform public health decision-makers globally, the uncertainty inherent in their results is often poorly communicated. We outline some potential sources of uncertainty in epidemic models, present traditional methods used to illustrate uncertainty and discuss alternative presentation formats used by modelling groups throughout the COVID-19 pandemic. Then, drawing on the experience of our own recent modelling, we seek to contribute to the ongoing discussion of how to improve upon traditional methods used to visualise uncertainty by suggesting how it can be presented in a clear and simple manner.



Highlights
Communicating uncertainty during the COVID-19 pandemic in a clear and understandable manner to decision-makers is vital, but uncertainty is often poorly reflected in the visualisations presented.
Traditional summary statistics (median and 95% confidence intervals) often hide the nuances of individual trajectories as well as important features of the epidemic.
We present the individual trajectories of ICU demand over the simulation period alongside individual metrics of magnitudes, peak timings and ICU capacity.
In an effort to engage the audience more with the uncertainty, the colour of each epidemic trajectory reflects the ranked magnitude of the first peak, allowing the key metrics to be more easily linked to the trajectory's epidemic dynamics.
During a public health crisis, such as an outbreak of a novel pathogen, decision-makers are often faced with making difficult and rapid decisions in the face of uncertainty. 1 Mathematical models of disease transmission are one piece of evidence widely used to inform public health decision-makers globally, but, like all models, their outputs are uncertain. 2,3 Communicating this uncertainty during the COVID-19 pandemic in a clear and understandable manner to decision-makers is vital, but uncertainty has often been poorly reflected in the visualisations presented. 4 The most important considerations when deciding on a data visualisation are knowing who the audiences are and ensuring that key messages can be easily and quickly absorbed. 1 The pandemic has produced numerous examples of statistics being misunderstood and of efforts to inform ultimately confusing audiences. 5 One lesson that must be learnt from the pandemic is the need for more effective ways to communicate quantitative findings and their uncertainty. Drawing on the experience of our own recent COVID-19 modelling, we seek to contribute to the ongoing discussion of how to improve upon traditional methods used to visualise uncertainty by providing suggestions of how this can be presented in a clear and simple manner.
The transmission of infections through a population is intrinsically complex and not directly observable. Transmission is often observed indirectly through recording cases of illness or death, or testing individuals for antibodies indicating a history of previous infection. Mathematical models represent these processes by classifying the population into distinct states of infection and disease progression, with transitions between states governed by various epidemiological parameters. These models rely on multiple parameter estimates, each subject to uncertainty, so many plausible combinations of parameter values exist.
This uncertainty can be explored by producing multiple realisations of the model using different parameter sets. Uncertainty can also be introduced into simulations via the use of stochastic models, which, unlike deterministic models, incorporate the effects of random chance and are inherently "noisy". The choice of model structure and parameterisation depends on the questions under consideration and often on the stage of the epidemic. For example, stochastic models were commonly deployed at the beginning of the COVID-19 pandemic, at which point there was limited understanding of the transmission dynamics of SARS-CoV-2. 6,7 Figure 1 presents an illustration of ten simulation trajectories of cumulative deaths under three adaptations of a previously published model 8 in which uncertainty arises from different sources. Figure 1a displays output from a deterministic model in which the uncertainty is driven by varying model parameters, whereas Figure 1b presents the results of a stochastic model in which the parameters are held constant for each realisation. Finally, Figure 1c presents a stochastic model in which model parameters are also varied. The combination of two sources of uncertainty leads to much greater variation in trajectories than is observed under the models with a single source of uncertainty. Regardless of the source and magnitude of the uncertainty in epidemic models, it is critical that uncertainty is always considered carefully in the presentation of any results. Additionally, depending on the audience and hypotheses under investigation, it could aid the interpretation of results if the uncertainties in model inputs are presented alongside model outputs.
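The three sources of uncertainty described above can be illustrated with a minimal sketch. It is not the previously published model behind Figure 1: it assumes a simple discrete-time SIR model that tracks cumulative infections rather than deaths, and the function name `simulate`, the population size and the parameter ranges are all hypothetical choices made for illustration.

```python
import math
import random


def simulate(beta, gamma, n=500, i0=5, days=120, rng=None):
    """Discrete-time SIR sketch returning cumulative infections per day.

    With rng=None the expected (deterministic) update is applied.
    With an rng, transitions are drawn as binomial counts (chain-binomial).
    """
    s, i = n - i0, i0
    cumulative = [float(i0)]
    for _ in range(days):
        p_inf = 1.0 - math.exp(-beta * i / n)  # infection probability per susceptible
        p_rec = 1.0 - math.exp(-gamma)         # recovery probability per infective
        if rng is None:
            new_inf, new_rec = s * p_inf, i * p_rec
        else:
            # Binomial draws built from Bernoulli trials (standard library only).
            new_inf = sum(rng.random() < p_inf for _ in range(s))
            new_rec = sum(rng.random() < p_rec for _ in range(i))
        s, i = s - new_inf, i + new_inf - new_rec
        cumulative.append(cumulative[-1] + new_inf)
    return cumulative


rng = random.Random(1)  # fixed seed so the sketch is reproducible

# (a) deterministic model, transmission parameter varied between runs
runs_a = [simulate(rng.uniform(0.25, 0.45), 0.2) for _ in range(10)]
# (b) stochastic model, parameters held constant
runs_b = [simulate(0.35, 0.2, rng=rng) for _ in range(10)]
# (c) stochastic model with the transmission parameter also varied
runs_c = [simulate(rng.uniform(0.25, 0.45), 0.2, rng=rng) for _ in range(10)]


def final_size_range(runs):
    """Smallest and largest final cumulative count across runs."""
    finals = [run[-1] for run in runs]
    return min(finals), max(finals)


for label, runs in (("a", runs_a), ("b", runs_b), ("c", runs_c)):
    print(label, final_size_range(runs))
```

Comparing the printed ranges gives a rough sense of how much each source contributes to the spread of outcomes in this toy setting; the relative sizes depend entirely on the assumed parameter ranges and population size.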
Results of epidemic models are most frequently summarised using simple summary statistics, such as the median and interquartile range. Drawing on our recent COVID-19 modelling work investigating intensive care unit (ICU) spare capacity in Europe, 9 we present an example of such an illustration in Figure 2. Data are simulated daily hospital and ICU demand from COVID-19 patients and COVID-19 deaths using a stochastic age-dependent susceptible-exposed-infected-recovered (  (Figure 2c). However, under the "scheduled" strategy there is greater uncertainty in the magnitude than in the timing of the first peak, whereas the alternative "reactive" strategy has an additional layer of uncertainty, with heterogeneity in both the magnitude and timing of peaks (Figure 2c). This allows decision-makers at a national level to consider the possible burden different policies would place on the healthcare system, while also providing information to decision-makers at a hospital level about the potential level of additional resources required to meet the projected surge in demand, conditional on the policy choice.
Journal Pre-proof
However, such aggregation often hides the nuances of individual trajectories as well as important features of the epidemic. For example, the typical peak number of infections across the set of individual trajectories may be drastically underestimated if results for multiple different, asynchronous epidemic trajectories are crudely summarised. We have encountered such issues in our work 9 (Figure 3) and hypothesise that this is also true of other research groups who have chosen similar aggregation methods given the underlying dynamics of compartmental transmission models. 10-12 We have observed that models in which acquisition of immunity has a substantial impact on transmission dynamics or in which interventions are triggered when a threshold burden of infection is reached are particularly prone to producing asynchronous trajectories that are difficult to summarise in ways that do not obscure relevant properties of individual trajectories. These effects are often amplified when using stochastic models (Figure 1).
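A small synthetic example may make the underestimation concrete. The sketch below does not use our simulation output: it assumes seven hypothetical trajectories that share the same bell-shaped peak of height 100 but peak on different days, then compares the peak of the pointwise median with the median of the individual peaks. The helper `bump` and all numerical values are illustrative assumptions.

```python
import math
import statistics


def bump(peak_day, height=100.0, width=5.0, days=200):
    """Synthetic epidemic curve: a bell-shaped peak centred on peak_day."""
    return [height * math.exp(-0.5 * ((t - peak_day) / width) ** 2)
            for t in range(days)]


# Identical peak heights, asynchronous peak timings.
peak_days = [40, 60, 80, 100, 120, 140, 160]
trajectories = [bump(day) for day in peak_days]

# Summary computed at each time point across trajectories.
pointwise_median = [statistics.median(values) for values in zip(*trajectories)]
peak_of_median = max(pointwise_median)

# Summary computed per trajectory, then aggregated.
median_of_peaks = statistics.median(max(traj) for traj in trajectories)

print("peak of pointwise median:", peak_of_median)
print("median of individual peaks:", median_of_peaks)
```

Every trajectory reaches 100 by construction, yet the pointwise median stays close to zero throughout, because on any given day most trajectories are far from their own peak. Real model output is rarely this extreme, but the direction of the bias is the same.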
Alternative means of communicating uncertainty have been explored throughout the COVID-19 pandemic. For example, Juul et al. 4 use central ranking systems which consider entire epidemic curves in relation to one another to produce curve boxplots and compare the likelihood of individual simulations in relation to key metrics of interest (i.e. the daily number of new cases requiring hospitalisation). On the other hand, Davies et al. 7 plot a subset of "representative" curves drawn from each decile corresponding to the total number of simulated new cases. Additionally, Koo et al. 13 show each realisation of their simulations and use darker colours to indicate those which are closer to the median. While these approaches enable variation in epidemic trajectories to be communicated, we believe that they can sometimes hinder the interpretability of the results for non-technical audiences. We argue that focussing directly on the metrics that inform policy decisions and making judicious choices of what quantities are plotted allows results, both in terms of central estimates and uncertainty, to be communicated accurately, whilst ensuring these results are accessible to non-technical audiences.
In Figure 3, we instead present the individual trajectories of ICU demand over the simulation period alongside individual metrics of magnitudes and peak timings. In an effort to engage the audience more with the uncertainty, the colour of each epidemic trajectory reflects the ranked magnitude of the first peak, allowing the key metrics presented in Figures 3c and 3d to be more easily linked to the trajectory's epidemic dynamics in Figures 3a and 3b. Figure 3 paints a different picture of the potential unfolding epidemics than that provided via the simple summary statistics in Figure 2. This is particularly applicable to the "reactive" strategy, which shows a high degree of variability across trajectories that is not clearly conveyed using summary statistics calculated at each time point. The individual metrics of magnitude and peak timing allow for a clearer interpretation of the support that national-level decision-makers ought to provide so that hospital-level decision-makers can prepare for a surge in demand, depending on the NPI strategy they implement. For example, the "scheduled" strategy suggests public health action should focus on providing additional hospital beds (Figure 3c), whereas the "reactive" strategy suggests that officials may also need to consider how quickly they need to procure additional hospital beds (Figure 3d). To further aid public health officials, in Figure 4 we present our simulations in conjunction with measures of ICU capacity. Under the assumption of 1500 available beds, 50% of the simulations from the "scheduled" strategy estimate that capacity will be breached, in comparison to 70% of simulations from the "reactive" strategy (Figures 4a and 4c). In Figures 4b and 4d, the first day and duration of capacity breaches are presented individually for those simulations in which capacity is exceeded, again using colour to link to the individual trajectories as in Figure 3.
In each plot, we have used text to provide brief, key messages to the audience without obscuring the full range of uncertainty underlying each message.
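The ranking that links each trajectory to its metrics could be computed along the following lines. This is a sketch under stated assumptions rather than the code used for Figure 3: the helpers `first_peak` and `rank_by_first_peak` are hypothetical, the first peak is taken to be the first local maximum (which presumes trajectories smooth enough that day-to-day noise does not create spurious peaks), and the toy trajectories are invented for illustration.

```python
def first_peak(traj):
    """Return (day, value) of the first local maximum.

    Falls back to the global maximum if no interior local maximum exists.
    """
    for t in range(1, len(traj) - 1):
        if traj[t] > traj[t - 1] and traj[t] >= traj[t + 1]:
            return t, traj[t]
    t = max(range(len(traj)), key=traj.__getitem__)
    return t, traj[t]


def rank_by_first_peak(trajectories):
    """Rank trajectories by first-peak magnitude (rank 0 = smallest peak).

    'shade' rescales the rank to [0, 1] so it can be passed to any
    sequential colour map when plotting trajectories and metrics together.
    """
    peaks = [first_peak(traj) for traj in trajectories]
    order = sorted(range(len(peaks)), key=lambda k: peaks[k][1])
    n = len(order)
    summary = [None] * n
    for rank, k in enumerate(order):
        summary[k] = {
            "peak_day": peaks[k][0],
            "peak_value": peaks[k][1],
            "rank": rank,
            "shade": rank / (n - 1) if n > 1 else 0.0,
        }
    return summary


# Toy ICU-demand trajectories (hypothetical values).
toy = [
    [0, 5, 20, 12, 6, 9, 4],
    [0, 3, 8, 30, 18, 7, 2],
    [0, 10, 9, 4, 15, 6, 1],  # the later, larger second peak is ignored
]
for row in rank_by_first_peak(toy):
    print(row)
```

Because the same `shade` value is attached to both the trajectory and its peak metrics, a reader can trace a point in the magnitude or timing panel back to the curve that produced it.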
Together, this provides a more detailed insight into the range of scenarios that ICUs may be faced with than could be gained from looking solely at the individual trajectories, and may be particularly useful to national-level decision-makers in deciding which NPI strategy to implement. Similar to the timing of the first peak (Figures 3c and 3d), there is greater variability in the first day of a possible capacity breach under the "reactive" strategy compared to the "scheduled" strategy. However, although more simulations breach capacity under the "reactive" strategy, the duration of the breach is anticipated to be shorter compared to the "scheduled" strategy given the model assumptions. Supplementary Figures 2 and 3 are analogous to Figures 3 and 4 but instead consider general hospital demand.
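The capacity metrics can be derived from individual trajectories in a few lines. The sketch below is illustrative only: the helper names are hypothetical, the trajectories are invented, and the duration of a breach is assumed here to be the total number of days on which demand exceeds capacity, which may differ from the definition used for Figure 4. The capacity of 1500 beds follows the assumption stated in the text.

```python
def breach_metrics(traj, capacity):
    """First day and number of days on which demand exceeds capacity.

    Returns None when capacity is never exceeded.
    """
    over = [day for day, demand in enumerate(traj) if demand > capacity]
    if not over:
        return None
    return {"first_day": over[0], "days_over": len(over)}


def summarise_breaches(trajectories, capacity):
    """Proportion of trajectories breaching capacity, plus per-trajectory metrics."""
    metrics = [breach_metrics(traj, capacity) for traj in trajectories]
    breached = [m for m in metrics if m is not None]
    return len(breached) / len(trajectories), metrics


# Toy daily ICU demand (hypothetical values) against 1500 available beds.
toy_demand = [
    [900, 1400, 1600, 1700, 1450, 1200],
    [800, 1000, 1300, 1490, 1400, 1100],
    [1000, 1550, 1480, 1520, 1510, 1300],
    [700, 900, 1100, 1200, 1250, 1000],
]
proportion, per_trajectory = summarise_breaches(toy_demand, capacity=1500)
print(proportion)
print(per_trajectory)
```

Reporting the proportion alongside the per-trajectory first day and duration keeps the headline message simple while leaving the spread of outcomes visible.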
We believe these simple visualisations allow for the key metrics to be more easily understood while being transparent about the uncertainty in our epidemic trajectories.
Although we have chosen in this instance to focus on the dynamics of ICU demand at the first peak, understanding the audience is still paramount and it is imperative that visualisations are tailored accordingly. For example, decision-makers may be more interested in the outcomes of specific interventions and contrasting alternative scenarios. This is where interactive tools become increasingly valuable and allow users to explore scenarios specific to their needs, such as those surrounding COVID-19 vaccine allocation strategies. 14 However, greater flexibility introduces further complexities in striking the correct balance between informing and not overwhelming audiences in chosen data visualisations. 15 Although the clear communication of uncertainty was important long before the COVID-19 pandemic, the severity and rapidly evolving nature of this emergency have underlined its importance. Brooks-Pollock et al. 16 note that "communication to non-specialists became an overnight skill required of disease modellers". Some scientific advisors in the UK have noted the improvement in both advisors' and decision-makers' understanding of mathematical modelling over the course of the pandemic, but as noted by other former advisors and current members of the UK Scientific Advisory Group for Emergencies (SAGE), the communication of uncertainty inherent in these models requires improvement. 17 The responsibility of communication has traditionally fallen solely on scientists, but it has been noted that developing the general scientific understanding of decision-makers could also ease the intense pressure placed upon scientific advisors when responding to an emergency. 17 Additionally, it may be more appropriate to train intermediaries for specialist science communication and data visualisation roles in order to share the substantial burden that such activities place on both scientists' and decision-makers' time.
As the pandemic continues, mathematical models of transmission to inform policy tend to become increasingly complex to account for vaccination and the emergence of variants of SARS-CoV-2 with different phenotypes while still modelling the impact of population behaviour, for example through the implementation of non-pharmaceutical interventions. Clear communication of uncertainty will remain important now and also in future health emergencies. We hope here to have added to a much-needed discussion about effective visualisation approaches for conveying uncertainty.