COVID-19 infection dynamics in care homes in the East of England: a retrospective genomic epidemiology study

Background COVID-19 poses a major challenge to infection control in care homes. SARS-CoV-2 is readily transmitted between people in close contact and causes disproportionately severe disease in older people.

Methods Data and SARS-CoV-2 samples were collected from patients in the East of England (EoE) between 26th February and 10th May 2020. Care home residents were identified using address search terms and Care Quality Commission registration information. Samples were sequenced at the University of Cambridge or the Wellcome Sanger Institute and viral clusters defined based on genomic and time differences between cases.

Findings 7,406 SARS-CoV-2 positive samples from 6,600 patients were identified, of which 1,167 (18.2%) were residents from 337 care homes. 30/71 (42.3%) care home residents tested at Cambridge University Hospitals NHS Foundation Trust (CUH) died. Genomes were available for 700/1,167 (60%) residents from 292 care homes, and 409 distinct viral clusters were defined. We identified several probable transmissions between care home residents and healthcare workers (HCW).

Interpretation Care home residents had a significant burden of COVID-19 infections and high mortality. Larger viral clusters were consistent with within-care home transmission, while multiple clusters per care home suggested independent acquisitions.

medRxiv preprint doi: https://doi.org/10.1101/2020.08.26.20182279; this version posted September 1, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY-NC-ND 4.0 International license.

Colorado, USA over a two-week period identified a number of clusters among staff working at the same facility, but did not include sequences from residents. A single sequence reported from a care home resident in Hungary was found to differ from other Hungarian SARS-CoV-2 sequences. Finally, an epidemiological study of COVID-19 in 189 care homes in Scotland did not include any genomic data.

This study includes care home residents tested during the course of the first phase of the COVID-19 pandemic in a large geographical region in the UK. It is more comprehensive and representative than previous studies and includes detailed metadata and genomic data, which are being made openly available as a resource for other researchers. We used a clustering algorithm and demonstrated how genomic and epidemiological data can be integrated to define possible transmission networks.

Detailed combined epidemiological and genomic studies are essential to improve our understanding of the transmission and impact of SARS-CoV-2 in long term nursing and residential facilities. Our study has identified two patterns of transmission (outbreaks within care homes and multiple, distinct clusters among care home residents from the same care homes) that will require tailored infection control measures to prevent and mitigate them.

representing 3.2% of the total population at this age; 82.5% of the care home population was aged 65 years or older. [5] Care homes are known to be high risk settings for infectious diseases, owing to a combination of the underlying vulnerability of residents who are often frail and elderly with multiple comorbidities, the shared living environment with multiple communal spaces, and the high number of interpersonal contacts between residents, staff and visitors in an enclosed space. [6,7] Understanding the transmission dynamics of SARS-CoV-2 within care homes is therefore an urgent public health priority.

Rapid SARS-CoV-2 sequencing combined with detailed epidemiological analysis has been used to trace viral transmission networks in hospital and community-based healthcare settings. [8] Previous epidemiological studies of COVID-19 in care homes have been limited in population size, temporal scale and/or the amount of genomic data included. [9-13] Here, we apply genomic epidemiology to investigate viral transmission dynamics in care home residents across the East of England (EoE). We aimed to address questions of key public health concern: What is the burden of care home associated

with priority for homes caring for people aged 65 years or older. Prior to this, systematic screening of all residents within care homes was much less common; testing primarily occurred where there was a suspicion of an outbreak, and hence there is reduced risk of bias introduced by systematic screening.

During the study period the scope of testing in hospital and community settings, including care homes, changed several times, as eligibility criteria were modified (Appendix p 15).

Patients were initially identified as potential care home residents if search terms including "care home" or "nursing home" were identified in the address fields of their electronic healthcare records.
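The search-term step can be illustrated with a minimal sketch. It assumes free-text addresses and uses only the two terms quoted above; the full term list used in the study is not reproduced here, and the function names and example addresses are hypothetical.

```python
import re

# Assumption: only the two terms quoted in the text are used here.
SEARCH_TERMS = ("care home", "nursing home")


def normalise(address: str) -> str:
    """Lower-case an address and collapse punctuation and whitespace."""
    return re.sub(r"[^a-z0-9]+", " ", address.lower()).strip()


def is_potential_care_home(address: str, terms=SEARCH_TERMS) -> bool:
    """Return True if any search term occurs as whole words in the address."""
    text = f" {normalise(address)} "
    return any(f" {term} " in text for term in terms)


if __name__ == "__main__":
    # Made-up addresses, for illustration only.
    examples = [
        "Room 4, Example Lodge Nursing Home, 1 High St",
        "12 Orchard Close",
        "EXAMPLE HOUSE CARE-HOME, Mill Lane",
    ]
    for address in examples:
        print(address, "->", is_potential_care_home(address))
```

A flag of this kind only marks candidates; free-text matching can miss or mislabel addresses, so any flagged records would still need checking against a registry and by hand.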

Next, the names of care homes registered in the CQC database (which aims to include all care homes in England) were matched with patient addresses to identify further care home residents (Appendix pp 4-7, 16). The resulting dataset was manually inspected, linked to CQC registered care homes, and matching care home addresses were assigned anonymised care home codes. We refer to care homes recorded by the CQC as having nursing care available as "nursing homes" and care homes without

available). Tests for statistical significance were performed in R; non-parametric population comparisons were made using the Wilcoxon rank sum test. The lowest P-values reported are <0.0001.
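For readers unfamiliar with the Wilcoxon rank sum test, the following is a minimal Python sketch of the two-sided test using the normal approximation with a tie correction. It is not the analysis code used in the study, which was run in R; the function names are hypothetical and the example counts are made up. R's `wilcox.test` may use an exact calculation or a continuity correction, so its P-values can differ from this approximation.

```python
import math


def average_ranks(values):
    """Return 1-based ranks (ties share their mean rank) and tie group sizes."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    tie_sizes = []
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        tie_sizes.append(j - i + 1)
        i = j + 1
    return ranks, tie_sizes


def rank_sum_test(x, y):
    """Two-sided rank sum test; returns (U statistic for x, approximate p)."""
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    n = n1 + n2
    ranks, tie_sizes = average_ranks(list(x) + list(y))
    u = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    tie_term = sum(t ** 3 - t for t in tie_sizes) / (n * (n - 1))
    variance = n1 * n2 / 12 * ((n + 1) - tie_term)
    if variance == 0:
        return u, 1.0  # every observation is identical
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


if __name__ == "__main__":
    # Made-up cases-per-home counts, for illustration only.
    nursing = [3, 4, 2, 5, 3, 4]
    residential = [2, 1, 3, 2, 1, 2]
    print(rank_sum_test(nursing, residential))
```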

This study was conducted as part of surveillance for COVID-19 infections under the auspices of Section 251 of the NHS Act 2006. It therefore did not require individual patient consent or ethical approval.

The COG-UK study protocol was approved by the Public Health England Research Ethics Governance Group (reference: R&D NR0195).

There was a slight trend for nursing homes to have more cases per home than residential homes (median 3, IQR 2-4 versus median 2, IQR 1-3, respectively; P=0.03, Wilcoxon rank sum test) (Appendix p 22). The number of cases per care home per week increased over the study period (Appendix p 23), likely reflecting increased testing. While non-care home cases declined during April 2020, care home numbers were initially maintained and then declined more slowly; the proportion of cases coming from care homes relative to non-care homes increased, from <10% in March to >40% in the first week

Figure 2B), of which <7% were admitted to the intensive care unit

using a previously published algorithm adjusted for SARS-CoV-2 (Appendix pp 10-11, 31-32). [20]
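The general idea of defining clusters from genomic and time differences between cases can be sketched with a much simpler rule than the published algorithm the study used. The sketch below is a swapped-in illustration, not that algorithm: it links two cases from the same care home when both their SNP difference and the gap between sample dates fall under fixed thresholds, then takes connected groups as clusters. The thresholds, function name and data layout are all assumptions.

```python
from datetime import date
from itertools import combinations

# Illustrative thresholds only; these are assumptions, not values from the study.
MAX_SNPS = 1
MAX_DAYS = 14


def cluster_cases(cases, snp_distance, max_snps=MAX_SNPS, max_days=MAX_DAYS):
    """Group cases from one care home by single linkage.

    cases: dict mapping case id -> sample date
    snp_distance: dict mapping frozenset({id_a, id_b}) -> SNP difference
    Returns a list of sets of case ids.
    """
    parent = {case: case for case in cases}

    def find(case):
        # Follow parent links to the cluster representative (path halving).
        while parent[case] != case:
            parent[case] = parent[parent[case]]
            case = parent[case]
        return case

    for a, b in combinations(cases, 2):
        close_in_time = abs((cases[a] - cases[b]).days) <= max_days
        close_in_genome = snp_distance[frozenset((a, b))] <= max_snps
        if close_in_time and close_in_genome:
            parent[find(a)] = find(b)  # merge the two groups

    clusters = {}
    for case in cases:
        clusters.setdefault(find(case), set()).add(case)
    return list(clusters.values())


if __name__ == "__main__":
    # Made-up cases and distances, for illustration only.
    cases = {"A": date(2020, 4, 1), "B": date(2020, 4, 5), "C": date(2020, 4, 8)}
    distances = {
        frozenset(("A", "B")): 0,
        frozenset(("A", "C")): 4,
        frozenset(("B", "C")): 5,
    }
    print(cluster_cases(cases, distances))
```

Fixed thresholds are a blunt instrument: they ignore the mutation rate and the probability of unsampled intermediate cases, which is one reason a model-based method may be preferred in practice.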

We investigated transmission networks involving care home residents and healthcare workers (HCW) for people tested at CUH (HCW data were not available outside of CUH) (Appendix pp 10-11). We defined clusters using the same method as for the care home resident analysis but allowed HCW to belong to clusters from multiple care homes, allowing for multiple care home residents to be linked

the transcluster algorithm. However, we accept that without confirmatory epidemiological data this interpretation remains speculative.

Our transmission modelling suggested that some care homes had experienced "outbreaks" with all or a large proportion of cases linked in a single transmission network. We also observed care homes containing multiple distinct transmission clusters. Some of these may represent hospital-acquired infections for care home residents that were admitted to hospital at or shortly before the time of their positive test result, rather than independent transmission events within the care homes. However, we note that only 7/71 (10%) care home residents tested at CUH had suspected or definite hospital-acquired COVID-19 infections, so some of the identified instances of multiple transmission clusters from a single care home may represent independent introductions of the virus into the care home.
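The two patterns described here (a single dominant transmission network versus several distinct clusters in one care home) can be told apart with a simple per-home summary once each case has a cluster label. The sketch below assumes a hypothetical input of (care home code, cluster id) pairs; the function name and the example codes are made up for illustration.

```python
from collections import Counter, defaultdict


def summarise_care_homes(assignments):
    """Summarise cluster structure per care home.

    assignments: iterable of (care_home_code, cluster_id) pairs, one per case.
    Returns {care_home_code: (n_cases, n_clusters, largest_cluster_share)}.
    """
    per_home = defaultdict(Counter)
    for home, cluster in assignments:
        per_home[home][cluster] += 1

    summary = {}
    for home, counts in per_home.items():
        n_cases = sum(counts.values())
        largest = max(counts.values())
        summary[home] = (n_cases, len(counts), largest / n_cases)
    return summary


if __name__ == "__main__":
    # Made-up assignments: HOME_A has one cluster, HOME_B has three.
    example = [("HOME_A", 1)] * 4 + [("HOME_B", 1), ("HOME_B", 2), ("HOME_B", 3)]
    for home, stats in summarise_care_homes(example).items():
        print(home, stats)
```

A largest-cluster share close to 1 is consistent with a single outbreak, while a low share with several clusters is consistent with multiple introductions; where to draw the line is a judgement call that this sketch does not make.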

These findings suggest that preventing the introduction of new infections into care homes should be a key priority to limit outbreaks, alongside infection control efforts to reduce transmission within care homes, including once an outbreak has been identified. We also found transmission networks involving care home residents and HCW such as paramedics, care home workers, and hospital-based healthcare staff, suggesting a potential link between care home infections and healthcare-associated COVID-19 cases. [8]

We acknowledge several limitations to our study. First, defining who is a care home resident from large electronic healthcare records is challenging and, despite our best efforts, we may not have identified all care home residents. However, we linked every care home included in the analysis to CQC registered care homes, so the care homes included should be accurate. Using pre-defined coding such as care home CQC registration numbers when patients are booked into hospital systems, rather than free-text data entry, would help considerably with care home surveillance. Second, we did not have viral sequence data available for 40% of care home residents; this was due to a combination of

gathering epidemiological data on between-care home contacts, such as paramedic calls and carers working across multiple care homes. Without this epidemiological information, the low viral genetic diversity makes it difficult to infer transmission. Even within care homes, it is possible some genetically similar viruses are from distinct introduction events, though incorporating genomic data will be more accurate for excluding linked transmission than if only temporal data were available. Finally, we did not have data available on who was a HCW or had hospital-acquired infection for patients tested outside of CUH. This means that some of the care home clusters we identified could represent transmission events occurring in hospitals and other healthcare settings, rather than in the care homes themselves. However, large clusters involving multiple residents from the same care home, who often have reduced mobility, and in the context of a lockdown, are suggestive of transmission taking place between residents within the care home.

In conclusion, care homes represent a major burden of COVID-19 morbidity and mortality, with

March 2020. In both settings a prolonged right-hand "tail" was observed as case numbers gradually fell. The relative proportion of cases admitted from care homes increased over this period for both sample sets (numbers provided in Appendix), while the contribution of general community cases fell more quickly. Note that if the patient address was missing, and they were not a HCW, then the care home status was undetermined. CAI = Community Acquired Infection; HAI = hospital acquired infection; HCW = healthcare worker; "Other" mainly comprise inpatient transfers from other hospitals to CUH for which metadata was lacking to determine the infection category. CAI was considered

collected December 2019. Colour bar indicates the ten care homes with the largest number of genomes. B. Distributions of pairwise SNP differences for the ten care homes with the largest number of genomes (as shown in panel A). Among the ten care homes with the largest number of genomes, some clustered closely on the phylogenetic tree with low pairwise SNP differences (e.g. CARE0063, CARE0264, CARE0314); in contrast, some care homes were distributed across the tree with higher pairwise SNP differences (e.g. CARE0061, CARE0151, CARE0173, CARE0263).
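A pairwise SNP difference of the kind summarised in panel B can be computed, in principle, by counting mismatched positions between two aligned genomes. The sketch below assumes sequences already aligned to the same length and skips positions where either sequence has `N`, `-` or `?`; that masking rule, the function names and the toy sequences are assumptions, not the study's pipeline. Other ambiguity codes are counted as differences here.

```python
from itertools import combinations

# Assumption: sites with these characters in either sequence are ignored.
AMBIGUOUS = set("N-?")


def snp_difference(seq_a: str, seq_b: str) -> int:
    """Count differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a != b and a not in AMBIGUOUS and b not in AMBIGUOUS
    )


def pairwise_snp_differences(genomes):
    """Return {frozenset({id_a, id_b}): SNP difference} for every pair.

    genomes: dict mapping sample id -> aligned sequence.
    """
    return {
        frozenset((a, b)): snp_difference(genomes[a], genomes[b])
        for a, b in combinations(genomes, 2)
    }


if __name__ == "__main__":
    # Toy sequences, for illustration only.
    toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "ACNTTCGA"}
    for pair, diff in pairwise_snp_differences(toy).items():
        print(sorted(pair), diff)
```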