Effect of the timing of stay-at-home orders on COVID-19 infections in the United States of America


 The current COVID-19 pandemic has sparked growing global interest in understanding the spread and outcome of the virus. From previous pandemics, we have learned that several demographic, geographic, and socio-economic factors may play a role in increasing risk of infection. Our objective was to examine the association of timing of mandated stay-at-home orders at the county-level with COVID-19 cases, daily case rate and mortality in the United States of America (USA). Publicly available data were used to perform a cross-sectional study of USA counties with > 100,000 population, and at least 50 confirmed cases per 100,000 people as of May 24, 2020. The three outcome variables were: total cases/100,000, daily case rate (DCR), and total deaths/100,000. Out of 3142 USA counties, 569 met the inclusion criteria. Of all variables, the timing of state-mandated stay-at-home order had the most significant effect on all three outcomes after adjusting for multiple socio-demographic, geographic and health related factors. Additional factors with significant association with increased cases and deaths include population density, housing problem, unemployment, African American race, and age > 65. Policymakers at the local level must take this into consideration while planning for interventions to prevent the spread of COVID-19.


Introduction
The novel coronavirus infection of 2019 (COVID-19) was initially discovered in Wuhan, China in December 2019, and has since spread around the globe 1 . The first patient with COVID-19 in the United States of America (USA) was diagnosed on January 20, 2020 2 . Since then, the infection has progressed into a worldwide pandemic 3 , with the USA having the greatest number of cases and deaths in the world as of May 24, 2020 4 . There is currently significant global interest in understanding how the infection is spreading, containing the spread of the infection, and developing a vaccine against the virus.
Many nations have opted to secure their borders and implement social distancing measures to mitigate the spread of the virus. While state mandated social distancing measures, henceforth referred to as stay-at-home measures, were in place in most states of the USA, the degree of implementation varied between states, and even between counties. Most states allowed their residents to leave their homes to get groceries, to exercise outdoors, and to receive medical care. Concurrently, such states opted to close non-essential services and mandate social distancing, while allowing people associated with essential services to go to work 5 . Several states never formally implemented any social distancing, and at the time of this study, many had already started relaxing these measures. Debate continues about the timing and necessity of state mandated stay-at-home orders, as the economies of the states and the country have slowed down due to these measures. At the time of the preparation of this manuscript, there was significant skepticism regarding withdrawal of state mandated stay-at-home orders.
Social determinants of health may lead to healthcare disparities 6 , and have been observed to be associated with an increased risk of infection during prior viral respiratory pandemics 7 8 . An understanding of these factors may aid the medical and scientific communities in identifying populations at greater risk for acquiring COVID-19. There have recently been several local studies detailing the characteristics of COVID-19 patients in their respective health systems, but there is a lack of nationwide analyses. Furthermore, a state-level analysis may not account for the significant regional variance within a state. Therefore, our aim was to determine the effects of disparities at the county level on the incidence, spread and deaths from the virus. We sought to determine the association of the timing of mandatory social distancing with the number of cases, infection rate and mortality, while adjusting for multiple demographic, geographic, socio-economic and known risk factors.

Methods
This cross-sectional study was conducted using publicly available county-level data. No individual patient protected health information was utilized. A list of all USA counties and county-equivalent entities was obtained from the United States (US) Census Bureau, with each county identified by its unique Federal Information Processing Standards (FIPS) code 9 . For each variable, the FIPS code was matched to ensure all variables were obtained for the same census-designated area. The estimated 2019 population of each county was used, as extrapolated by the US Census Bureau using its own 2010 data 9 . Data on the COVID-19 cases were obtained from various sources as mentioned below. All confirmed cases and confirmed deaths from the first case until May 24, 2020, the cut-off date for the data analysis in this study, were included. The inclusion criteria were then applied to generate the sample.
Inclusion criteria: 1. USA Counties with greater than 100,000 population.
2. USA Counties with confirmed cases of at least 50 per 100,000 people.
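The two inclusion criteria above can be sketched as a simple filter. This is a minimal illustration, not the authors' code: the `County` record, its field names, and the three toy counties (with placeholder FIPS codes) are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class County:
    fips: str             # placeholder FIPS codes, not real counties
    population: int       # estimated 2019 population
    confirmed_cases: int  # cumulative confirmed cases at the cut-off date


def cases_per_100k(county: County) -> float:
    """Cumulative confirmed cases per 100,000 residents."""
    return county.confirmed_cases / county.population * 100_000


def meets_inclusion(county: County) -> bool:
    """Criterion 1: population > 100,000. Criterion 2: >= 50 cases per 100,000."""
    return county.population > 100_000 and cases_per_100k(county) >= 50


# Hypothetical toy data, for illustration only.
counties = [
    County("00001", 250_000, 400),  # 160 per 100,000: included
    County("00002", 80_000, 400),   # population too small: excluded
    County("00003", 500_000, 100),  # 20 per 100,000: excluded
]

included = [c.fips for c in counties if meets_inclusion(c)]
print(included)
```

Both criteria must hold at once, so a small county with a very high case rate is still excluded.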
The independent variables (demographic, geographic, health related factors and the number of days from the stay-at-home order in that county to the day of 50/100,000 cases): Total population, population density, percentages of unemployment, poverty, population greater than 65 years, education level below high school level, female gender, African-American, Hispanic, rural, poor health, smoking, obesity, number of people per primary care physician, received influenza vaccine, social association, housing problem, diabetes mellitus, food insecurity, uninsured, and number of days to reach the target daily case rate from the day of official announcement of stay-at-home order.
Dependent variables (outcome): 1. Daily case rate (DCR): The rate of daily increase in cases recorded from the first case to 50/100,000 cases. Calculation was performed by dividing 50 (numerator) by the number of days (denominator) in that county from the first recorded case to the day of 50/100,000 cases.
2. Confirmed cases per 100,000: The total number of confirmed cases per 100,000 in that county on May 24, 2020, the cut-off date for this study.
3. Deaths per 100,000: The total number of deaths in the county per 100,000 on May 24, 2020, the cut-off date for this study.
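As a worked illustration of the DCR definition above, the calculation reduces to 50 divided by the number of days between the first recorded case and the day the county reached 50 cases per 100,000. The function name and the dates below are hypothetical; how same-day cases were handled in the study is not stated, so the sketch simply rejects a non-positive interval.

```python
from datetime import date


def daily_case_rate(first_case: date, threshold_day: date) -> float:
    """DCR = 50 / days from first recorded case to the day of 50 cases per 100,000."""
    days = (threshold_day - first_case).days
    if days <= 0:
        # Assumption: the source does not say how a zero-day interval was treated.
        raise ValueError("threshold day must fall after the first recorded case")
    return 50 / days


# Hypothetical county: first case on March 10, threshold reached on April 4 (25 days).
dcr = daily_case_rate(date(2020, 3, 10), date(2020, 4, 4))
print(dcr)  # 2.0 cases per 100,000 per day
```

A county that reaches the threshold in fewer days therefore has a higher DCR.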

Definitions and sources:
The daily number of cases of COVID-19 was obtained from USAfacts.org 10 . Population density was computed by dividing the population of each county by the number of square miles per county 11 . Data for high school education, unemployment, and poverty were obtained from the Economic Research Service of the US Department of Agriculture as of 2018 12 . For education, individuals of age 25 or older who had not obtained a high school diploma were included. The threshold for poverty was defined per US Census Bureau, based on family size and income. All remaining data for the following variables were obtained from the County Health Rankings website 13 . Housing problem was measured by presence of overcrowding, high housing costs, and lack of kitchen or plumbing facilities. Food insecurity was measured by the percentage of population that indicated a lack of adequate access to food.
The date of stay-at-home order for each county was obtained from the Institute for Health Metrics and Evaluation (IHME) website and was measured by the number of days from the state mandated stay-at-home order to the day of confirmed cases at 50/100,000 in that county, henceforth called "days from stay-at-home order" 14 .
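The two derived variables defined above can be sketched as follows. The function names and dates are hypothetical. Under this definition a larger value means the order preceded the 50/100,000 threshold by more days; the sketch returns a negative value when the order came after the threshold, which is an assumption, since the source does not state how such counties were handled.

```python
from datetime import date


def days_from_stay_at_home_order(order_date: date, threshold_day: date) -> int:
    """Days from the state mandated order to the day of 50 confirmed cases per 100,000."""
    return (threshold_day - order_date).days


def population_density(population: int, square_miles: float) -> float:
    """Residents per square mile."""
    return population / square_miles


# Two hypothetical counties reaching the threshold on the same day (April 8, 2020).
early_order = days_from_stay_at_home_order(date(2020, 3, 19), date(2020, 4, 8))
late_order = days_from_stay_at_home_order(date(2020, 4, 3), date(2020, 4, 8))
density = population_density(250_000, 500)

print(early_order, late_order, density)
```

With this sign convention, the negative correlations reported for this variable correspond to earlier orders going with lower outcome values.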

Statistical analysis:
Linear regression was used to evaluate the association between the independent and the three dependent variables. All of the independent variables were tested for normality. Analysis was carried out in two stages.
In the first stage, each independent factor (demographic or risk factor) was examined for correlation with each outcome variable using Spearman's rho or Pearson's method, depending on whether the data were non-normally or normally distributed, respectively (Table 1). The independent factors were also assessed for collinearity with each other. If the coefficient of correlation was +0.5 or greater, one or more of the collinear factors were excluded from the analysis in stage two.
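The first-stage screen can be sketched in plain Python. This is an illustration under stated assumptions, not the authors' procedure: the toy values and factor names are invented, Spearman's rho is used for every pair (the study chose between Spearman and Pearson by normality), and the helper names are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def collinear_pairs(factors, threshold=0.5):
    """Factor pairs whose correlation is at or above the +0.5 cut-off."""
    names = list(factors)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if spearman(factors[a], factors[b]) >= threshold
    ]


# Hypothetical county-level percentages, for illustration only.
factors = {
    "poverty": [10, 12, 15, 18, 20],
    "food_insecurity": [9, 11, 14, 16, 21],
    "age_over_65": [14, 13, 18, 12, 15],
}
flagged = collinear_pairs(factors)
print(flagged)
```

For each flagged pair, one member would be dropped before the second stage; which member to keep is a modeling decision the sketch does not make.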
In the second stage, all the factors that were significantly associated (level of significance at p < 0.05) with each of the outcome variables and had a coefficient of correlation of at least ±0.1 were included in multiple linear regression models using the stepwise method. Models with the best R² value from the multiple regression were chosen for each of the three outcome variables for interpretation. Effect size was reported as standardized beta coefficients with a 95% confidence interval for comparison and a p value.
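To illustrate what a standardized beta coefficient expresses, the sketch below fits a two-predictor linear model on z-scored variables using the closed-form solution of the normal equations. It is a simplified stand-in: it does not implement stepwise selection, confidence intervals, or p values, and the data and function names are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Center to mean 0 and scale to standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def correlation(x, y):
    """Pearson correlation computed as the mean product of z-scores."""
    return sum(a * b for a, b in zip(zscore(x), zscore(y))) / len(x)


def standardized_betas(x1, x2, y):
    """Standardized coefficients for the model y ~ x1 + x2."""
    r12 = correlation(x1, x2)
    r1y = correlation(x1, y)
    r2y = correlation(x2, y)
    det = 1 - r12 ** 2  # near 0 when x1 and x2 are strongly collinear
    return (r1y - r12 * r2y) / det, (r2y - r12 * r1y) / det


# Hypothetical data: y falls with x1 and rises with x2; x1 and x2 are uncorrelated.
x1 = [1, 2, 3, 4]
x2 = [1, -1, -1, 1]
y = [-3 * a + 2 * b for a, b in zip(x1, x2)]

beta1, beta2 = standardized_betas(x1, x2, y)
print(round(beta1, 3), round(beta2, 3))
```

Because each coefficient is expressed in standard-deviation units, predictors measured on different scales (days, percentages, people per square mile) can be compared by magnitude. The `det` term also shows why collinear factors were screened out first: as the correlation between predictors approaches 1, the estimates become unstable.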

Results
A total of 569 counties met the inclusion criteria from a list of 3142 counties assessed. All 50 US states were represented. The socio-demographic, geographic, health and other outcome characteristics of the population of the counties are illustrated in Table 1, as well as the three outcome variables in columns 3, 4 and 5. The coefficient of correlation between each factor and the corresponding outcome is shown in the table along with the information about the direction and significance of the correlation. The highest coefficient of correlation was observed for days from the stay-at-home order for all three outcomes (-0.73, -0.60 and -0.58). The population density per square mile had the next best coefficient for confirmed cases and deaths (0.34) followed by percentage of uninsured (-0.21 and -0.26). All the variables showed non-normal distribution except smoking. Collinearity ≥ +0.5 was found to be present between population and population density; female sex with African American race; Hispanic race with housing problems; education below high school with poverty; poor health with poverty, smoking, diabetes mellitus, food insecurity and uninsured; smoking with diabetes, obesity, food insecurity, and poverty; food insecurity with poverty, smoking, uninsured and poor health.
For DCR, the independent variables considered for the second stage were population density, unemployment, female, African American race, Hispanic, uninsured and days from stay-at-home order. The best fit model obtained by stepwise regression in Table 2 shows that only three factors were significantly associated with DCR after adjusting for the rest: days from stay-at-home order, uninsured, and unemployment. The R² value of 0.50 indicates a well-fitted model. For the state mandated stay-at-home order, placing the order a day earlier would have decreased the DCR by 0.04 cases. For each 1% increase in uninsured population, the DCR was predicted to decrease by 0.03 cases and for each 1% increase in unemployment, by 0.1 cases.
In terms of confirmed cases/100,000: population density, education, African American race, influenza vaccination in the Medicare population, housing problems, uninsured and days from stay-at-home order qualified to be fitted into regression models. Percentage of rural population and food insecurity were excluded due to collinearity with population density and uninsured respectively, because the latter qualified for other regression models as well. The best fit model consisted of a negative association with the independent variables days from stay-at-home order and uninsured, while a positive association was found with housing, population density, education, African American race and influenza vaccination. The largest standardized beta coefficients were obtained for days from stay-at-home order and housing problems. About 47% of the variability of the confirmed cases in the counties could be explained by this model.
For the association of deaths/100,000, population density, unemployment, age > 65 years, female sex, African American race, influenza vaccinated, housing problem, uninsured and days from the stay-at-home order were entered in a stepwise manner into the models. Food insecurity was removed due to collinearity with population density. Population density, housing problem, influenza vaccine, unemployment and age > 65 years were positively and significantly associated with deaths/100,000, while days from stay-at-home order and uninsured had a negative but significant association with deaths/100,000. The model was a moderate fit at 0.48, which means that 48% of the variability of deaths/100,000 could be explained by these independent variables. Again, population density (0.37) and days from stay-at-home order (-0.32) had the maximum strength of effect on deaths/100,000 population due to COVID-19.

Discussion
In this study, we demonstrated that the timing of state mandated stay-at-home orders had the most significant effect on all three outcomes: the total cases/100,000, daily case rate, and total deaths/100,000 after adjusting for multiple socio-demographic, geographic and health related factors. A proclamation of this order in the counties that implemented it, even a day earlier, would have reduced the DCR by 0.04, confirmed cases by 12/100,000, and deaths by 0.87/100,000.
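The per-day figures quoted above can be turned into a back-of-the-envelope calculation for a single county. This is only arithmetic on the reported coefficients under a linear-extrapolation assumption; the county size and the number of days are hypothetical, and the result should not be read as a causal forecast.

```python
# Reported reductions associated with issuing the order one day earlier.
EFFECT_PER_DAY = {
    "dcr": 0.04,              # daily case rate
    "cases_per_100k": 12.0,   # confirmed cases per 100,000
    "deaths_per_100k": 0.87,  # deaths per 100,000
}


def projected_reduction(days_earlier: int, population: int) -> dict:
    """Scale the per-day effects linearly by days and by county population."""
    scale = population / 100_000
    return {
        "dcr": EFFECT_PER_DAY["dcr"] * days_earlier,
        "cases": EFFECT_PER_DAY["cases_per_100k"] * days_earlier * scale,
        "deaths": EFFECT_PER_DAY["deaths_per_100k"] * days_earlier * scale,
    }


# Hypothetical county of 500,000 residents with an order issued 3 days earlier.
result = projected_reduction(days_earlier=3, population=500_000)
print({name: round(value, 2) for name, value in result.items()})
```

For this hypothetical county the arithmetic gives roughly 180 fewer confirmed cases and about 13 fewer deaths, assuming the association holds linearly over the three days.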
Health disparities between the counties in the United States have long existed and have been well studied 15 16 17 . We sought to determine the effects of these disparities on the incidence and spread of COVID-19 at the county level. Socio-economic factors including gender, race, age, income and insurance have been shown to be associated with the incidence and spread of COVID-19 18 . Counties with low population in the USA may not have adequate health infrastructure to manage complicated COVID-19 patients, leading to movement of these patients to adjoining counties with more advanced facilities. Patients from smaller counties may be referred to larger health centers in adjacent counties with larger populations, leading to sampling bias in the analysis. Furthermore, individuals with appropriate health insurance coverage may seek out referral centers in adjacent counties with better health facilities. We included counties with a population of > 100,000 and a minimum case rate of 50/100,000 in an attempt to reduce these biases.

DCR:
We considered a variety of factors which may be hypothesized to have an association with COVID-19 incidence and spread, but the factor that emerged with the most significant association was the days from stay-at-home order (coefficient of correlation -0.73 and a standardized beta of -0.69). The differential effects of the timing of social distancing interventions on the spread of the virus have recently been established by Pei (2020) 19 . In this study, we were able to show that after adjusting for the other independent factors considered, enforced social distancing measures (days from the order of stay-at-home) had the largest effect size in preventing the spread of the virus. The percentage of uninsured population had a significant negative correlation with daily case rate, implying that a higher uninsured population in a county would be associated with fewer COVID-19 cases. The differential accessibility of testing for the uninsured population is the likely explanation for this finding. Uninsured people are more likely to experience poverty and have transportation difficulties, and therefore may not be able to reach the few and distant COVID-19 testing sites 20 .
Confirmed cases per 100,000:
Significant positive correlation in the best fit multiple regression model was observed with housing problem, population density, lower education, African American race and number of people that received influenza vaccine. Previous research has shown that housing problems may lead to overcrowding, an important determinant for the spread of the virus 21 . High population density may be associated with increased spread of the virus (Rocklöv and Sjödin, 2020). In our study, higher population density had a large standardized beta coefficient of 0.17 and 0.37 with confirmed cases/100,000 and deaths/100,000 respectively, making this the second most important factor after days from stay-at-home order to affect the outcomes. It is well known that racial and ethnic minority groups are often disproportionately affected by pandemics. Non-Hispanic African Americans not only have a higher case burden, but also are hospitalized more often and have a higher death rate (CDC, 2020) 24 25 . Our study showed a significant association between the number of cases and African American race. Recently, questions were raised about adverse effects of the influenza vaccine on the susceptibility of individuals to coronavirus, but other studies have refuted the possibility 26 27 . The association of a higher rate of influenza vaccination with confirmed cases may be attributed to the fact that, in the Medicare population, those who have received the influenza vaccination are more likely to get tested and treated for COVID-19. We were unable to determine an alternative explanation for this finding. Significant negative association was observed for days from the stay-at-home orders and uninsured population among the counties.
Confirmed deaths per 100,000:
Several factors that were associated with confirmed cases were also associated with deaths, such as days from stay-at-home order, housing problem, population density and the uninsured. The percentage of unemployed and age > 65 years, both showing positive correlation, were the new factors associated with COVID-19 related deaths. Unemployment is an important socioeconomic factor indirectly related to the higher death rate due to the virus. Lack of timely access to testing, housing problems, chronic illnesses and delayed access to medical treatment can be associated with unemployment, as well as with an increased number of deaths from the virus. Age > 65 years has been shown to be strongly associated with deaths 28 .

Limitations:
The major limitation of the study is that cases of COVID-19 in the counties cannot be directly linked to the subjects with the demographic, socioeconomic or geographic factor. This study merely compares the population characteristics of the counties with the outcomes within the same and other counties. In addition, we considered about 20% of the most populated counties of the USA, which may lead to a bias resulting from the factors associated with less populated counties. Population movement due to referrals from satellite hospitals to large health centers may also lead to statistical bias.

Conclusion
In the USA, at the county level, the timing of the state mandated stay-at-home order was the most significant factor affecting all three outcomes: daily case rate, confirmed cases/100,000 and deaths/100,000, after adjusting for a multitude of social, demographic, geographic and health related factors. The earlier the order was put in place, the lower the values of the outcome variables were, suggesting that policymakers at the local level must consider this fact while planning for the pandemic response, and that the public should continue to take the necessary precautions to prevent cases and deaths. The three outcome variables included are daily case rate, confirmed cases, and deaths. All are calculated per 100,000 people, and data were included up to May 24, 2020.