Prediction model for COVID-19 patient visits in the ambulatory setting

Objective: Healthcare systems globally were shocked by coronavirus disease 2019 (COVID-19). Policies put in place to curb the tide of the pandemic resulted in a decrease of patient volumes throughout the ambulatory system. The future implications of COVID-19 in healthcare are still unknown, specifically the continued impact on the ambulatory landscape. The primary objective of this study is to accurately forecast the number of COVID-19 and non-COVID-19 weekly visits in primary care practices.
Materials and Methods: This retrospective study was conducted in a single health system in Delaware. All patients' records were abstracted from our electronic health record (EHR) system from January 1, 2019 to July 25, 2020. Patient demographics and comorbidities were compared using t-tests, chi-square tests, and Mann-Whitney U tests as appropriate. ARIMA time series models were developed to provide an 8-week future forecast for two ambulatory practices (AmbP) and were compared to a naïve moving average approach.
Results: Among the 271,530 patients considered during this study period, 4,195 patients (1.5%) were identified as COVID-19 patients. The best fitting ARIMA models for the two AmbP are as follows: AmbP1 COVID-19+ ARIMAX(4,0,1), AmbP1 nonCOVID-19 ARIMA(2,0,1), AmbP2 COVID-19+ ARIMAX(1,1,1), and AmbP2 nonCOVID-19 ARIMA(1,0,0).
Discussion and Conclusion: Accurately predicting future patient volumes in the ambulatory setting is essential for resource planning and developing safety guidelines. Our findings show that a time series model that accounts for the number of positive COVID-19 patients delivers better performance than a moving average approach for predicting weekly ambulatory patient volumes in a short-term period.


On March 11, 2020, COVID-19 was declared a global pandemic by the World Health Organization (WHO). 4 By mid-March, transmission of COVID-19 had rapidly accelerated, increasing case counts throughout the United States, and it was found that many patients with severe disease also had common comorbidities such as hypertension, obesity, and diabetes. 5,6 In the state of Delaware, the first presumptive positive case of COVID-19 was reported by the Delaware Division of Public Health on March 11, 2020. 7 In order to mitigate the spread of the virus, the Governor of Delaware declared a state of emergency on March 13, 2020. The weeks that followed included several modifications to the original state of emergency to minimize the spread of the virus.
In response to the growing pandemic, ChristianaCare Health Services, Inc. (ChristianaCare), which serves the majority catchment area of Northern Delaware and the most populous county in the state, followed suit with its own measures to mitigate spread. Effective March 17, 2020, it postponed all elective procedures in hospitals and all ambulatory practices to adhere to state and CDC guidelines. The ambulatory services at ChristianaCare adjusted the delivery of healthcare services by reducing the number of in-person visits to minimize the risk to patients and healthcare providers, redirecting patients to telehealth when appropriate. This resulted in a decrease of patient volumes throughout the ambulatory system.
With the uncertainty that COVID-19 presented then, the Phase 1 reopening that occurred on June 1, 2020, and the exponential rise in cases occurring now, it is essential to understand how the ambulatory setting will continue to be affected in order to develop proper guidelines.
To understand the impact of the novel virus, scientists rely on community spread models to predict possible transmission. The popular susceptible, infected, and recovered (SIR) epidemiologic model and variations of this model have been used to gauge community spread of a variety of infectious diseases such as influenza and dengue fever. [8][9][10][11][12] SIR models have also been applied to inpatient settings to predict hospital capacity regarding admissions, ICU beds, and ventilators. 9,11,13,14 In addition to SIR models, the current literature on predicting patient volume varies from descriptive statistics to advanced time series models, with most of the studies that have used time series forecasting models focusing on emergency department and hospital admissions.
Time series forecasting of ambulatory visits prior to the COVID-19 pandemic has been described in a few reports. [15][16][17][18][19][20][21][22] The most widely used method for time series forecasting is the Box-Jenkins method, otherwise known as the AutoRegressive Integrated Moving Average (ARIMA) model. 23 The ARIMA model has been used for its simplicity and flexibility in capturing linear patterns in a time series. 17,19−22

Significance
The future implications of COVID-19 in healthcare are still unknown, specifically how it will continue to affect the ambulatory landscape. This work aims to inform the allocation of COVID-19 and non-COVID-19 ambulatory resources and to guide the reopening of ambulatory practices for in-person visits, since in-person care might have been delayed. We propose an ARIMA time series model to capture the changes in ambulatory patient volumes as a result of COVID-19.

Objective
The primary objective of this study is to accurately forecast the number of COVID-19 and non-COVID-19 weekly visits in primary care practices. The ability to forecast patient volumes in primary care locations by accurately evaluating the dynamic changes in patient visits and fitting these data to a statistical model is useful for the appropriate allocation of human and material resources for future planning. With the uncertainty that COVID-19 presents, healthcare systems have been adapting their ambulatory practices to adhere to state guidelines and prepare for state reopening phases. Therefore, we developed a time series model that provides an 8-week future forecast for ambulatory practices and compared it to a naïve moving average approach.
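As an illustration of the baseline, a naïve moving average forecast can be sketched as follows. This is a minimal sketch under stated assumptions, not the implementation used in the study: the 4-week window, the recursive reuse of earlier forecasts, the function name, and the weekly counts are all hypothetical, since the text does not specify how the moving average was configured.

```python
from statistics import mean


def moving_average_forecast(history, window=4, horizon=8):
    """Forecast `horizon` future weeks with a recursive moving average.

    Each forecast is the mean of the most recent `window` values; once the
    observed data run out, earlier forecasts are fed back in as inputs.
    The window length is an assumption, not a value taken from the study.
    """
    if window < 1 or len(history) < window:
        raise ValueError("need at least `window` observations")
    values = list(history)
    forecasts = []
    for _ in range(horizon):
        next_value = mean(values[-window:])
        forecasts.append(next_value)
        values.append(next_value)  # recursive step: reuse the forecast
    return forecasts


# Made-up weekly visit counts, for illustration only.
weekly_visits = [410, 395, 402, 388, 120, 135, 150, 171]
forecast = moving_average_forecast(weekly_visits, window=4, horizon=8)
print([round(value, 1) for value in forecast])
```

Because every forecast is an average of recent values, this baseline reacts slowly to abrupt changes in volume, which is one reason to compare it against a model that can use additional information.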

Study Design
This retrospective study was conducted in a single health care system in Delaware (ChristianaCare), serving the primary catchment area of New Castle County. New Castle County is in the northernmost region of Delaware and as of 2019 has an estimated population of 558,753, accounting for nearly two-thirds of the entire state population. 24 We selected the patients' records from the two practices that had the highest historical patient volumes among all clinics affiliated with ChristianaCare. Our study population included (1) COVID-19 patients who had prior family medicine ambulatory services within ChristianaCare in 2019 and either had been hospitalized and discharged or were self-monitoring at home without having been hospitalized for the disease, and (2) any patients who utilized ambulatory services from the same practices during the same time period and were not diagnosed with COVID-19. COVID-19 patients currently hospitalized were excluded from the population.
We extracted all patients' records from our electronic health record (EHR) system from January 1, 2019 to July 25, 2020 and built two datasets. One included patient-level data (e.g. age, gender, race, ethnicity, insurance, marital status, and Elixhauser comorbidities) and the other included ambulatory practice-related data (e.g. encounter location, encounter providers, and weekly patient volumes). 25 Patient-level data were used for characterizing the study population. Ambulatory practice-related data were primarily used for our time series models. For model development, we used one year of data between January 2019 and December 2019; for model validation, we used data from January 2020 through the most recent data available (July 2020).
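The chronological split described above can be sketched in a few lines. This is only an illustration under assumptions: the weekly series is synthetic, the function name is hypothetical, and weeks are assumed to be keyed by their start date. The point it shows is that a time series is split by date rather than at random, so that no validation-period observation leaks into model development.

```python
from datetime import date, timedelta


def split_by_date(series, cutoff):
    """Split (week_start, count) pairs into development and validation sets.

    Weeks starting before `cutoff` are used for model development; weeks
    starting on or after `cutoff` are held out for validation.
    """
    development = [(day, count) for day, count in series if day < cutoff]
    validation = [(day, count) for day, count in series if day >= cutoff]
    return development, validation


# Synthetic weekly series beginning on the first study date (illustrative).
start = date(2019, 1, 1)
series = [(start + timedelta(weeks=i), 400 + i) for i in range(82)]

development, validation = split_by_date(series, cutoff=date(2020, 1, 1))
print(len(development), len(validation))
```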

Statistical and Forecasting Methods
Descriptive statistics
Patient demographics and comorbidities were compared using t-tests, chi-square tests, and Mann-Whitney U tests, as appropriate according to the distribution.
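For categorical characteristics, the chi-square comparison can be sketched as below. This is a hedged, standard-library illustration of the Pearson chi-square statistic without continuity correction; the counts are made up, the function name is hypothetical, and the software actually used for the analysis is not stated in the text.

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic and degrees of freedom for a table of counts.

    `table` is a list of rows, e.g. groups (COVID-19, non-COVID-19) by
    category (comorbidity present, absent). No continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Made-up 2 x 2 counts, for illustration only.
statistic, dof = chi_square_statistic([[30, 70], [50, 50]])
print(round(statistic, 3), dof)
```

The statistic is then compared against a chi-square distribution with `dof` degrees of freedom to obtain a p-value.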

Time-series
A time series is a sequential set of data points, measured typically over successive times. The ARIMA model was created for auto-correlated and non-stationary time series data. 11 The framework for ARIMA is displayed in Table 1.  Tables 2 and 3.
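For reference, the general ARIMA(p, d, q) model can be written in standard textbook notation as below. This is a sketch of the usual formulation rather than a reproduction of Table 1. The symbols are our own choices: B is the backshift operator, the phi and theta terms are the autoregressive and moving average coefficients, and epsilon_t is white noise. The ARIMAX line shows one common formulation (regression with ARIMA errors), where x_t is assumed to be the weekly number of positive COVID-19 cases.

```latex
% ARIMA(p, d, q) with backshift operator B, where B y_t = y_{t-1}
\[
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
\]
% One common ARIMAX form: regression on an exogenous series x_t with ARIMA errors
\[
y_t = \beta x_t + \eta_t, \qquad
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, \eta_t
  = \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t
\]
```

For example, ARIMA(1,0,0) reduces to the first-order autoregression y_t = c + phi_1 y_{t-1} + epsilon_t, with no differencing and no moving average term.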

Discussion
The novelty of the COVID-19 virus created a conundrum, not only for the inpatient world, but for the ambulatory outpatient clinical environment as well. While many people were being hospitalized due to complications from the virus, an overwhelming majority were being evaluated and treated in the outpatient setting, either by urgent care or by their primary care provider. At the height of the pandemic, prediction models existed only for hospitals, to anticipate the need for staffing, personal protective equipment (PPE), equipment, and other resources as cases surged. This made it very challenging to anticipate staffing, equipment, and logistical needs for primary care practices. There were many questions and scenarios to consider based on ambulatory patient volumes: designating specific practices to care for COVID-19 infected patients; estimating the staffing and PPE necessary at each practice site for in-person care versus telehealth care; and redeploying staff to our Ambulatory COVID-19 treatment center (in-person care for non-emergent patients with COVID-19) and to our Virtual COVID-19 primary care practice, which monitored moderately ill patients infected by the disease via video visits and secure texting. Development of the ambulatory COVID-19 model provided the opportunity to identify volume trends and anticipate the need to modify our care delivery models based on the estimates.
We found that the ARIMA/ARIMAX forecasting models considered in this study outperformed more traditional modeling approaches such as the moving average-based approach. Although a mean absolute percentage error (MAPE) of < 10% is considered an accurate forecast, the COVID-19 ARIMA models we generated provided a more dynamic prediction than the moving average forecast because they accounted for the number of weekly positive COVID-19 cases.
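The MAPE referred to above can be computed as in the following sketch. It uses the standard definition (the mean of absolute errors expressed as a percentage of the actual values). The function name and the example numbers are hypothetical and are not results from this study.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Undefined when any actual value is zero, so that case raises an error.
    """
    if len(actual) != len(forecast) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(value == 0 for value in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return 100.0 * sum(errors) / len(errors)


# Made-up weekly volumes and forecasts, for illustration only.
actual_visits = [100, 200, 400]
forecast_visits = [110, 180, 400]
score = mape(actual_visits, forecast_visits)
print(round(score, 2))
```

Because MAPE divides by the actual value, weeks with very low volumes can dominate the score, which is worth keeping in mind for series with small weekly counts.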
Our findings support the use of such models by health care system administrators to forecast patient volumes and make prospective resource adjustments.
Data utilized for the test set prediction covered the period from March 1, 2020 to July 25, 2020 and did not include the months during which flu season typically occurs. The impending flu season brings much uncertainty in relation to COVID-19 for both inpatient and ambulatory settings because the two infections may overlap in the winter season. COVID-19 and the flu share clinical characteristics; therefore, differentiating between the illnesses is paramount. 26 This convergence has the potential to yet again overwhelm healthcare systems, especially the ambulatory environment, which is often the first point of contact during respiratory virus season. It is thought that a possible resurgence of COVID-19 would lead to tighter social distancing measures, which would lessen flu transmission. 27 However, in the absence of that certainty, having an ambulatory prediction model to project patient volumes is crucial.

Limitations
The current study is limited to retrospectively using electronic health records and positive COVID-19 results from only one hospital system. Our results may not be generalizable to other hospital systems, particularly those that serve patients with different characteristics. Other forecasting methods may be appropriate for different hospitals due to differences in organizational structure and resources. Our study period is limited to one year of historical data, ignoring potential factors such as weather and seasonal effects that could possibly improve forecasting accuracy. However, the unpredictable nature of COVID-19 disrupted our regular volumes and trends. Therefore, using volumes from more years might not yield more accurate predictions, since our system was in a transient state, especially during the time of this study. Also, our weekly predictions did not differentiate between in-person and virtual volumes. In future studies, dividing the volumes between in-person and virtual could improve accuracy and provide additional information to healthcare providers for resource planning. Lastly, our models are short-term forecasts. Long-term forecasts can be generated, although the error rate will increase as the prediction period increases.
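The growth of forecast error with the horizon can be made concrete for the simplest model in this family. As an illustrative sketch, consider a stationary first-order autoregression, the form of an ARIMA(1,0,0) model, with known parameters. The symbols below are our own notation, and the derivation ignores the extra uncertainty that comes from estimating the parameters.

```latex
% Stationary AR(1) model with white-noise variance sigma^2
\[
y_t = c + \phi\, y_{t-1} + \varepsilon_t, \qquad |\phi| < 1, \qquad
\operatorname{Var}(\varepsilon_t) = \sigma^{2}
\]
% The h-step-ahead forecast error accumulates the future shocks
\[
e_{t+h} = y_{t+h} - \hat{y}_{t+h \mid t}
        = \sum_{k=0}^{h-1} \phi^{k}\, \varepsilon_{t+h-k}
\]
% Its variance is non-decreasing in the horizon h
\[
\operatorname{Var}(e_{t+h}) = \sigma^{2} \sum_{k=0}^{h-1} \phi^{2k}
  = \sigma^{2}\, \frac{1 - \phi^{2h}}{1 - \phi^{2}}
\]
```

The variance rises with each additional week and approaches the unconditional variance of the series, sigma^2 / (1 - phi^2), so long-horizon forecasts carry little more information than the series mean.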
TC designed and conceptualized the overall study. RCL performed time series model development and evaluation. CKH performed the statistical analysis. KN performed data extraction and cleaning. RCL and CKH led the writing of this manuscript. CTJ, MAP, RK, CT, and TC provided input in the interpretation of the results, reviewed the manuscript, and contributed to revisions. All authors gave their approval for the final version to be submitted and published.