Assessing the utility of COVID-19 case reports as a leading indicator for hospitalization forecasting in the United States

Identifying data streams that can consistently improve the accuracy of epidemiological forecasting models is challenging. Using models designed to predict daily state-level hospital admissions due to COVID-19 in California and Massachusetts, we investigated whether incorporating COVID-19 case data systematically improved forecast accuracy. Additionally, we considered whether using case data aggregated by date of test or by date of report from a surveillance system made a difference to the forecast accuracy. Evaluating forecast accuracy in a test period, after first having selected the best-performing methods in a validation period, we found that overall the difference in accuracy between approaches was small, especially at forecast horizons of less than two weeks. However, forecasts from models using cases aggregated by test date showed lower accuracy at longer horizons and at key moments in the pandemic, such as the peak of the Omicron wave in January 2022. Overall, these results highlight the challenge of finding a modeling approach that can generate accurate forecasts of outbreak trends both during periods of relative stability and during periods that show rapid growth or decay of transmission rates. While COVID-19 case counts seem to be a natural choice to help predict COVID-19 hospitalizations, in practice any benefits we observed were small and inconsistent.


EPIFORGE 2020 reporting items
We include here in Table 1 the recommended reporting items for epidemic forecasting and prediction research.

Data revisions
Data for both hospitalizations and cases are sometimes revised by the surveillance system after initially being reported. These revisions can be substantial and can occur weeks or months after the data were first reported. In the main manuscript we used "finalized" versions of the data.
To be clear about our usage of different dates, we will refer to the 'event date' as the date on which a particular event (e.g., a hospital admission or a case report) occurs, and the 'issue date' as the date on which a particular set of observations is released by a public health agency.
To summarize the scale of the data revisions, we computed, for counts of new cases or hospitalizations on every event date, the ratio of the value reported on each issue date to the value as reported on the final issue date of July 26, 2022. Letting y_{t,d} denote the observation of a particular data source associated with event date t that was available on issue date d, we compute a revision ratio r_{t,d} = y_{t,d} / y_t, where y_t is the version of y_{t,d} that was available on July 26, 2022. Values of r_{t,d} less than one indicate that the issue of the data at date d was lower than the eventual final value, so subsequent revisions increased the observed counts; values of r_{t,d} greater than one indicate that the issue of the data at date d was higher than the eventual final value, so revisions brought the observed counts down. Boxplots of the revision ratios for each state and data source are shown in Figure 1.
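As a concrete illustration, the revision ratio can be computed directly from successive issues of the same observation. The counts below are hypothetical and are not taken from the actual surveillance data:

```python
from datetime import date

def revision_ratio(reported, final):
    """Ratio r_{t,d} of the value reported on issue date d to the final value.

    r < 1 means the observation was later revised upward;
    r > 1 means it was later revised downward.
    """
    if final == 0:
        raise ValueError("final value must be nonzero")
    return reported / final

# Hypothetical issue history for a single event date (counts are made up);
# the value reported on the final issue date of July 26, 2022 is taken as final.
issues = {
    date(2022, 1, 11): 900,
    date(2022, 1, 14): 980,
    date(2022, 7, 26): 1000,
}
final_value = issues[date(2022, 7, 26)]
ratios = {d: revision_ratio(v, final_value) for d, v in issues.items()}
# The first issue has ratio 0.9, i.e. it was later revised upward.
```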
Figure 1: Boxplots of revision ratios as a function of time between the issue date for an observation and the event date. The left-hand column of plots shows revision ratios for Massachusetts, whereas the right-hand column shows plots for California. Each row represents a different data source, from top to bottom: test-date cases, report-date cases (from JHU CSSE), and hospitalizations. The patterns of reporting vary by state and data type. For reference, each figure has a horizontal dashed line drawn at y = 1 to illustrate where an observation is equal to its final value. Observations below the dashed line were subsequently revised upward and observations above the line were subsequently revised downward. The x-axis represents the difference between the issue date of the observation in the numerator of the revision ratio and the event date of the observation. For example, test-date cases in both states were typically revised upward, with the entire inter-quartile range of revision ratios showing above 90% reporting at 4 days after the event date in Massachusetts and at 15 days in California. In Massachusetts there were rarely any revisions to report-date cases, whereas in California there were occasional substantial revisions both up and down. For hospitalizations, both states showed that after three days a majority of the observations experienced no revisions, although occasionally large revisions were made up to about two weeks past the event date.

Results with non-smoothed models
In the main manuscript, only results from models that had pre-smoothed case data as inputs were shown, as those were the only models used in the test-phase analysis. Table 2 shows results from models that both did and did not use smoothed case data. The rows with "TRUE" in the column named "smoothed" are identical to the results in the main manuscript table. The results with "FALSE" appear only in this supplemental table. For every model type in each location, pre-smoothing the case data improved the model accuracy. The number of models in the rank column denominator indicates the total number of models, including all variations of (p, d, P, D), the different case data types used, and smoothed and unsmoothed case data (when case data were used).

In a post-hoc analysis, we examined how the relationship between cases and hospitalizations differed between the validation and test phases of the experiments conducted in this paper. Two central conclusions emerged:
1. The degree to which cases lagged (or led) hospitalizations did not change substantially between the two phases.
2. The scale of the association between cases and hospitalizations changed between the two phases, with a higher number of cases being associated with one hospitalization during the test phase on average.
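The pre-smoothing procedure itself is not restated here; as a minimal sketch, a trailing moving average of the daily case counts is one common choice (the exact smoother used for the models in the manuscript may differ):

```python
def trailing_mean(values, window=7):
    """Smooth a daily count series with a trailing moving average.

    Each smoothed value averages the current day and up to window - 1
    preceding days, so no future data are used (important for forecasting).
    """
    smoothed = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Because the window only looks backward, the smoothed series can be computed in real time as new case counts arrive.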
The analysis below shows results from the main manuscript (the rows with the "both" phase are taken from the main text) along with new results by phase (Table 3). Largely, the results are similar between the phases. For example, in Massachusetts, the cross-correlation between smoothed JHU report-date cases and hospitalizations was highest with no lag between the two time series (i.e., "lag at max cor." = 0), both in the combined analysis and in each of the validation and test phases separately. The largest observed temporal shift in maximum cross-correlation was in Massachusetts, where cases had a maximum correlation with hospitalizations at a zero-week lag in the validation phase and a 3-week lag in the test phase. In all cases, correlations were very high for all lags within +/- 7 days. Additionally, the maximum correlation was lower when both phases were combined, due to the changing relationship between cases and hospitalizations described in the next paragraph.

Table 3: For each state, analysis phase, and case data type, we show the results of a correlation analysis between the given case data type and hospitalization data. The "lag at max cor." value is the lag at which the two data streams are most highly correlated with each other. For example, a lag of -3 indicates that the data sources were most highly correlated with cases from day t - 3 and hospitalizations at day t. The "max cor." column shows the correlation at that lag.
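The "lag at max cor." quantity can be computed by sliding one series against the other and recording the Pearson correlation at each offset. The following is an illustrative sketch under the sign convention described in the table caption (negative lag means cases lead hospitalizations); the manuscript's exact procedure may differ:

```python
def lag_at_max_correlation(cases, hosps, max_lag=21):
    """Return (lag, correlation) maximizing the Pearson correlation
    between the two series over lags in [-max_lag, max_lag].

    A lag of -3 aligns cases from day t - 3 with hospitalizations at day t.
    """
    def pearson(x, y):
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        vx = sum((a - mx) ** 2 for a in x) ** 0.5
        vy = sum((b - my) ** 2 for b in y) ** 0.5
        return cov / (vx * vy)

    best_lag, best_cor = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        if lag < 0:
            x, y = cases[:lag], hosps[-lag:]   # cases lead hospitalizations
        elif lag > 0:
            x, y = cases[lag:], hosps[:-lag]   # cases trail hospitalizations
        else:
            x, y = cases, hosps
        cor = pearson(x, y)
        if cor > best_cor:
            best_lag, best_cor = lag, cor
    return best_lag, best_cor
```

For example, a hospitalization series that is an exact 3-day-delayed copy of the case series yields a lag of -3 with correlation 1.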
We observed that the scale at which cases and hospitalizations were associated with each other differed between the validation and test phases of the analysis (Figure 2). While there were periods of both phases with similar cases-to-hospitalization ratios, both states had periods in the test phase where more cases were reported for each hospital admission. As described in the main manuscript, "changes in the circulating variants and in population immunity, due to prior infections or vaccinations, likely impacted the severity of disease, thereby altering the relationship between cases and hospitalizations over time."
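The quantities behind Figure 2 are straightforward to compute: a daily cases-per-admission ratio for panels A and C, and an ordinary least-squares fit of hospitalizations on cases for the regression lines in panels B and D. The sketch below uses closed-form OLS and made-up numbers; the manuscript's plotting code may differ:

```python
def cases_per_admission(cases, hosps):
    """Daily ratio of reported cases to reported hospital admissions
    (undefined on days with no reported admissions)."""
    return [c / h if h > 0 else float("nan") for c, h in zip(cases, hosps)]

def ols_fit(x, y):
    """Closed-form ordinary least-squares fit of y on x.

    Returns (slope, intercept) of the line minimizing squared residuals.
    """
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx
```

Fitting one line per phase, as in panels B and D, then amounts to calling `ols_fit` separately on the validation-phase and test-phase observations.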

Figure 2: The daily ratio of the number of reported cases per reported hospital admission changed over time. Panels A and C show this ratio as a function of time, with blue triangles representing data points from the validation phase and red circles representing ones from the test phase. In both states, the test phase had periods in which this ratio was substantially higher than in the validation phase. Panels B and D show scatterplots of hospitalizations (y-axis) vs. smoothed JHU report-date cases (x-axis), with a linear regression line drawn through the observations from each phase. At higher levels of reported cases in both states, a pattern is visible where there are fewer hospital admissions per reported case in the test phase, consistent with the data shown in panels A and C.

Table 2: Validation period accuracy metrics for forecasts of California and Massachusetts hospital admissions, including results from models that both did and did not smooth the case data inputs. The models shown include the best individual autoregressive models from the validation phase that used test-date data (TestCase), report-date data (ReportCase), and no case data (HospOnly) as inputs. For the models that used case data, the best models that smoothed and that did not smooth that data stream are both shown. The mean weighted interval score (MWIS), mean absolute error (MAE), and 95% prediction interval coverage (PIcov 0.95) scores are shown for each model, with the best scores in the validation period highlighted. Within each state, the models are sorted with the highest accuracy (lowest MWIS) scores at the top. The model parameters for the autoregressive model are also provided in the (p,d,P,D) column.
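The WIS underlying the MWIS column is, in the convention used by the COVID-19 Forecast Hub (Bracher et al., 2021), a weighted combination of the absolute error of the predictive median and interval scores for K central prediction intervals. We assume that convention in the sketch below; the manuscript's scoring code may differ in its details:

```python
def interval_score(lower, upper, alpha, y):
    """Interval score for a central (1 - alpha) prediction interval [lower, upper]."""
    width = upper - lower
    below = (2 / alpha) * max(0, lower - y)  # penalty if y falls below the interval
    above = (2 / alpha) * max(0, y - upper)  # penalty if y falls above the interval
    return width + below + above

def weighted_interval_score(median, intervals, y):
    """Weighted interval score for one forecast and one observation y.

    intervals: list of (alpha, lower, upper) for K central prediction intervals.
    Weights are w_0 = 1/2 on the median's absolute error and w_k = alpha_k / 2
    on each interval score, normalized by K + 1/2.
    """
    total = 0.5 * abs(y - median)
    for alpha, lower, upper in intervals:
        total += (alpha / 2) * interval_score(lower, upper, alpha, y)
    return total / (len(intervals) + 0.5)
```

Averaging this score over all forecast dates and horizons yields a mean WIS like the MWIS reported in the table; lower values indicate a more accurate forecast.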