Changes in symptomatology, reinfection, and transmissibility associated with the SARS-CoV-2 variant B.1.1.7: an ecological study

Summary
Background The SARS-CoV-2 variant B.1.1.7 was first identified in December, 2020, in England. We aimed to investigate whether increases in the proportion of infections with this variant are associated with differences in symptoms or disease course, reinfection rates, or transmissibility.
Methods We did an ecological study to examine the association between the regional proportion of infections with the SARS-CoV-2 B.1.1.7 variant and reported symptoms, disease course, rates of reinfection, and transmissibility. Data on types and duration of symptoms were obtained from longitudinal reports from users of the COVID Symptom Study app who reported a positive test for COVID-19 between Sept 28 and Dec 27, 2020 (during which the prevalence of B.1.1.7 increased most notably in parts of the UK). From this dataset, we also estimated the frequency of possible reinfection, defined as the presence of two reported positive tests separated by more than 90 days with a period of reporting no symptoms for more than 7 days before the second positive test. The proportion of SARS-CoV-2 infections with the B.1.1.7 variant across the UK was estimated with use of genomic data from the COVID-19 Genomics UK Consortium and data from Public Health England on spike-gene target failure (a non-specific indicator of the B.1.1.7 variant) in community cases in England. We used linear regression to examine the association between reported symptoms and proportion of B.1.1.7. We assessed the Spearman correlation between the proportion of B.1.1.7 cases and number of reinfections over time, and between the number of positive tests and reinfections. We estimated incidence for B.1.1.7 and previous variants, and compared the effective reproduction number, Rt, for the two incidence estimates.
Findings From Sept 28 to Dec 27, 2020, positive COVID-19 tests were reported by 36 920 COVID Symptom Study app users whose region was known and who reported as healthy on app sign-up. We found no changes in reported symptoms or disease duration associated with B.1.1.7. For the same period, possible reinfections were identified in 249 (0·7% [95% CI 0·6–0·8]) of 36 509 app users who reported a positive swab test before Oct 1, 2020, but there was no evidence that the frequency of reinfections was higher for the B.1.1.7 variant than for pre-existing variants. Reinfection occurrences were more positively correlated with the overall regional rise in cases (Spearman correlation 0·56–0·69 for South East, London, and East of England) than with the regional increase in the proportion of infections with the B.1.1.7 variant (Spearman correlation 0·38–0·56 in the same regions), suggesting B.1.1.7 does not substantially alter the risk of reinfection. We found a multiplicative increase in the Rt of B.1.1.7 by a factor of 1·35 (95% CI 1·02–1·69) relative to pre-existing variants. However, Rt fell below 1 during regional and national lockdowns, even in regions with high proportions of infections with the B.1.1.7 variant.
Interpretation The lack of change in symptoms identified in this study indicates that existing testing and surveillance infrastructure do not need to change specifically for the B.1.1.7 variant. In addition, given that there was no apparent increase in the reinfection rate, vaccines are likely to remain effective against the B.1.1.7 variant.
Funding Zoe Global, Department of Health (UK), Wellcome Trust, Engineering and Physical Sciences Research Council (UK), National Institute for Health Research (UK), Medical Research Council (UK), Alzheimer's Society.


Introduction
In early December, 2020, a phylogenetically distinct cluster of SARS-CoV-2 was genetically characterised in the southeast of England. Most cases had been detected in November, with a small number detected as early as September, 2020. 1 Genomic surveillance revealed that this new variant, termed B.1.1.7, has several mutations of immunological significance and has been spreading rapidly, with cases increasing in frequency. 2 It is important to understand how these mutations could affect the presentation and spread of COVID-19 so that effective public health responses can be formulated. 3 Preliminary evidence from epidemiological studies suggests that the B.1.1.7 variant is more transmissible than pre-existing variants. Davies and colleagues 4 found the B.1.1.7 variant to be 43-90% (95% CI 38-130) more transmissible than pre-existing variants, and Volz and colleagues have shown that the B.1.1.7 variant increases the effective reproduction number, Rt, by a factor of 1·5-2·0. 5 Evidence suggests that the B.1.1.7 variant increases the risk of admission to hospital and death. 6 However, much is still unknown. From a public health perspective, it is crucial to understand whether the B.1.1.7 variant necessitates changes to existing measures for disease monitoring and containment. For instance, changes to symptomatology could require modifications to symptomatic testing programmes to ensure that new cases are identified, and changes to disease duration could require changes in the duration of isolation required for infected individuals. It is important for modelling and forecasting to understand whether the B.1.1.7 variant alters the rate of reinfection. Early estimates of the transmissibility of the B.1.1.7 variant are uncertain and additional estimates using independent data sources are needed. Furthermore, it is important to understand how these findings will affect measures to control the spread of the pandemic using non-pharmaceutical interventions, such as lockdowns.
We aimed to investigate the symptomatology, disease course, rates of reinfection, and transmissibility of the B.1.1.7 variant in the UK population.

Study design and participants
We did ecological studies to assess the symptoms, disease course, rates of reinfection, and transmissibility associated with increasing proportions of infections with the B.1.1.7 variant in the UK population. We used data from the COVID Symptom Study, 7 a longitudinal dataset providing symptom reports and test results from a population of more than 4 million adults living in the UK, in combination with surveillance data from the COVID-19 Genomics UK (COG-UK) Consortium 8 and a spike-gene target failure (SGTF) correlate in community testing data.
The study was approved by the King's College London Ethics Committee (REMAS ID 18210, review reference LRS-19/20-18210). All participants provided consent through the COVID Symptom Study app.

Research in context
Evidence before this study
To identify existing evidence on the SARS-CoV-2 B.1.1.7 variant, we searched PubMed and Google Scholar for articles published between Dec 1, 2020, and Feb 1, 2021, using the keywords "COVID-19" AND "B.1.1.7" with no language restrictions, finding 281 results. We did not find any studies that investigated B.1.1.7-associated changes in symptoms or their severity and duration, but found one study showing that the B.1.1.7 variant did not change the ratio of symptomatic to asymptomatic infections. We found six articles describing laboratory-based investigations of the responses of the B.1.1.7 variant to vaccine-induced immunity, but no work investigating what this means for natural immunity and the likelihood of reinfection outside the laboratory. We found five articles that showed increased transmissibility of the B.1.1.7 variant. Other identified studies were not relevant.

Added value of this study
To our knowledge, this is the first study to explore changes in symptom type and duration and community reinfection rates associated with the B.1.1.7 variant. We used self-reported symptom logs from 36 920 users of the COVID Symptom Study app who reported positive test results between Sept 28 and Dec 27, 2020. The B.1.1.7 variant was not associated with changes in the COVID-19 symptoms reported, nor their duration. We also did not find evidence for an increase in reinfections in the presence of the B.1.1.7 variant. We found a multiplicative increase in the effective reproduction number, Rt, of the B.1.1.7 variant by a factor of 1·35 (95% CI 1·02-1·69) compared with pre-existing variants. However, we found that Rt fell below 1 during regional and national lockdowns, even in regions with high proportions of infections with the B.1.1.7 variant.

Data sources
Longitudinal data were prospectively collected through the COVID Symptom Study app, developed by Zoe Global with input from King's College London (London, UK), Massachusetts General Hospital (Boston, MA, USA), and Lund and Uppsala Universities (Sweden). The app 7 guides users through a set of enrolment questions, establishing baseline demographic and health information. Users are asked to record each day whether they feel physically normal and, if not, to log any symptoms. After a user reports any symptoms, they are asked "Where are you right now?", with the options "At home", "At hospital with suspected COVID-19 symptoms", or "Back from hospital". Users are also asked to maintain a record of any COVID-19 tests and their date, type, and result in the app. Users are able to record the same data on behalf of others, such as family members, to increase data coverage among those unlikely to use mobile apps, such as older adults. We included users living in the UK who had logged responses through the app at least once in the period between Sept 28 and Dec 27, 2020.
We used data released on Jan 13, 2021, from COG-UK to extract time-series of the percentage of daily cases resulting from the B.1.1.7 variant in Scotland, Wales, and each of the seven National Health Service (NHS) regions in England. Northern Ireland was excluded because of the low number of samples in the COG-UK dataset. These data were produced by sequencing a sample of PCR tests done in the community. Because of a delay of around 2 weeks 2 between PCR tests and genomic sequencing, we used data only from samples taken up to Dec 31, 2020, to avoid censoring effects.
Additionally, we used data from Public Health England on the probable new variant captured in community cases in England according to SGTF. One of the spike gene mutations in the B.1.1.7 variant has been observed to cause an SGTF in the test used in three of England's large laboratories for the analysis of community cases. 1 This failure results in a marker that is sensitive, but not necessarily specific, to the B.1.1.7 variant, as other circulating variants also contain the mutation leading to an SGTF. Comparison with genomic data shows that, from Nov 30, 2020, onwards, more than 96% of cases with the SGTF were from the B.1.1.7 variant. 9 The proportion of SGTF cases is made available in England for each of the 316 lower-tier local authorities. We grouped these data into each NHS region using a population-weighted average to enable integration with other data sources.
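The population-weighted grouping described above can be sketched as follows. This is a minimal illustration, not the study's code: the function name `region_sgtf_proportion`, the tuple layout, and the example populations and proportions are all hypothetical.

```python
from collections import defaultdict


def region_sgtf_proportion(authorities):
    """Population-weighted mean SGTF proportion per region.

    `authorities` is an iterable of (region, population, sgtf_proportion)
    tuples, one per lower-tier local authority (hypothetical layout).
    """
    weighted_sum = defaultdict(float)
    total_population = defaultdict(float)
    for region, population, proportion in authorities:
        weighted_sum[region] += population * proportion
        total_population[region] += population
    return {r: weighted_sum[r] / total_population[r] for r in total_population}


# Invented example values, for illustration only.
example = [
    ("London", 300_000, 0.60),
    ("London", 100_000, 0.20),
    ("South East", 200_000, 0.50),
]
regional = region_sgtf_proportion(example)
```

Weighting by population means a large authority moves the regional figure more than a small one, which a plain average of the authority-level proportions would not do.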

Statistical analysis
To assess whether the symptomatology of infection with the B.1.1.7 variant differed from that of previous variants, we investigated the change in symptom reporting from Sept 28 to Dec 27, 2020, covering 13 complete weeks over the period when the proportion of infections with the B.1.1.7 variant grew most notably in the NHS regions of London, South East, and East of England. For each week, in every region considered, we calculated the proportion of users reporting each symptom. Users were included in a week if they had reported a positive swab result (by PCR or lateral-flow test) in the period 14 days before or after that week. For each region and symptom, we did a linear regression, examining the association between infections with the B.1.1.7 variant as a proportion of total SARS-CoV-2 infections in that region (independent variable) and the proportion of users reporting the symptom (dependent variable) over the 13 weeks considered. We adjusted for the age and sex of users, as well as for two seasonal environmental confounders: regional temperature and humidity. Seasonal confounders were calculated each day from the temperature and relative humidity at 2 m above the surface (obtained from NASA climate data), averaged across each region considered.
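As a rough sketch of the per-region, per-symptom regression, the following fits ordinary least squares with an intercept using only the standard library. It is a simplification under stated assumptions: the 13 weekly values are synthetic, age and sex are omitted for brevity, and no standard errors or p values are computed. The helper `ols` and all variable names are hypothetical.

```python
def ols(rows, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in design) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(design, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        scale = a[col][col]
        a[col] = [v / scale for v in a[col]]
        for r in range(k):
            if r != col:
                factor = a[r][col]
                a[r] = [x - factor * z for x, z in zip(a[r], a[col])]
    return [a[i][k] for i in range(k)]


# Synthetic 13-week series for one region and one symptom (illustrative only).
weeks = range(13)
b117 = [w / 20 for w in weeks]                    # proportion of B.1.1.7
temperature = [15 - w + (w % 3) for w in weeks]   # degrees C
humidity = [70 + (w * w % 7) for w in weeks]      # relative humidity, %
# The symptom proportion here depends on weather only, not on B.1.1.7.
symptom = [0.40 - 0.005 * t + 0.001 * h for t, h in zip(temperature, humidity)]

coefficients = ols(list(zip(b117, temperature, humidity)), symptom)
# coefficients[1] is the adjusted association with the B.1.1.7 proportion.
```

In this synthetic case the coefficient on the B.1.1.7 proportion is approximately zero once temperature and humidity are in the model, which is the pattern an adjusted null association would show.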
We also examined the association between the proportion of infections with the B.1.1.7 variant and the disease burden, measured here as the total number of different symptoms reported over a period of 2 weeks before and 2 weeks after the test, and the relation with asymptomatic infection, defined as users reporting a positive test result but no symptoms in the 2 weeks before or after the test. We also investigated the rate of self-reported hospital visits, including users who reported being in hospital with suspected COVID-19 symptoms or being back from hospital. We also investigated the proportion of individuals reporting a long duration of symptoms, using a previously published definition of continuous symptoms reported for at least 28 days. 10 To avoid censoring effects, the analyses of admission to hospital and long symptom duration included symptom reports to Jan 18, 2021, and the analysis of long symptom duration also considered reports of positive tests up to Dec 21, 2020. All analyses were adjusted for sex, age, temperature, and humidity. We controlled for the false discovery rate to account for multiple comparisons.
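The text does not name the procedure used to control the false discovery rate. As one common choice, and purely as an assumption, the Benjamini-Hochberg step-up procedure can be sketched as below; the function name, the level `q`, and the example p values are hypothetical.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return booleans marking which hypotheses are rejected at FDR level q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= (k / m) * q.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    # Reject every hypothesis whose rank is at or below k_max.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= k_max:
            rejected[idx] = True
    return rejected


# Invented p values, one per symptom-level test.
p_example = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
decisions = benjamini_hochberg(p_example, q=0.05)
```

With many symptoms tested in several regions, some unadjusted p values below 0·05 are expected by chance; a step-up procedure of this kind keeps the expected share of false positives among the rejected tests at or below `q`.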
We defined possible reinfection as the presence of two reported positive tests separated by more than 90 days with a period of reporting no symptoms for more than 7 days before the second positive test. We calculated the proportion of possible reinfections among individuals who reported their first positive test before Oct 1, 2020. To assess whether the risk of reinfection was stronger in the presence of the B.1.1.7 variant, we did ecological studies in every region, examining the Spearman correlation between the proportion of infections with the B.1.1.7 variant and the number of reinfections over time, and between the proportion of positive tests reported through the app and the number of reinfections. We compared these two correlations in each region with use of the Mann-Whitney U test.
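The reinfection definition can be expressed as a small rule. The sketch below is one interpretation, not the study's code: it assumes the symptom-free period is a run of consecutive days, logged as healthy, falling between the two positive tests. The function names and example dates are hypothetical.

```python
from datetime import date, timedelta


def longest_run(days):
    """Length of the longest run of consecutive calendar dates in `days`."""
    best = run = 0
    previous = None
    for d in sorted(days):
        if previous is not None and d - previous == timedelta(days=1):
            run += 1
        else:
            run = 1
        best = max(best, run)
        previous = d
    return best


def is_possible_reinfection(first_positive, second_positive, symptom_free_days,
                            min_gap_days=90, min_symptom_free=7):
    """Apply the rule: more than 90 days between positives and more than
    7 consecutive symptom-free days between the two tests (assumed reading)."""
    if (second_positive - first_positive).days <= min_gap_days:
        return False
    between = {d for d in symptom_free_days if first_positive < d < second_positive}
    return longest_run(between) > min_symptom_free


# Invented example: ten healthy logs in early October before a second positive.
healthy = {date(2020, 10, 1) + timedelta(days=i) for i in range(10)}
flag = is_possible_reinfection(date(2020, 6, 1), date(2020, 10, 15), healthy)
```

Days with no log at all are treated here as breaking a symptom-free run, which is a conservative choice; a different handling of missing reports would change which users are flagged.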
Daily estimates of the incidence of SARS-CoV-2 infection in Scotland, Wales, and each of the seven NHS regions in England during the period from Oct 1 to Dec 27, 2020, were produced using data from the COVID Symptom Study app and previously described methods. 11 Using the COG-UK data to estimate the proportion of infections with the B.1.1.7 variant in each region per day, these incidence estimates were decomposed into two incidence time-series per region, one for pre-existing variants and one for B.1.1.7, with the constraint that the two time-series should sum to match the total incidence. Rt was estimated separately for the pre-existing variants and B.1.1.7 using previously described methods. 11 Briefly, we used the relationship $I_{t+1} = I_t \exp(\mu[R_t - 1])$, where $1/\mu$ is the serial interval and $I_t$ is the incidence on day $t$. We modelled the system as a Poisson process and assumed that the serial interval was drawn from a gamma distribution with α=6·0 and β=1·5, and used Markov chain Monte Carlo methods to estimate Rt.
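To make the decomposition and the growth relationship concrete, the sketch below splits total incidence by the daily B.1.1.7 proportion and inverts the stated relationship day by day to give Rt = 1 + ln(I[t+1] / I[t]) / μ. This is a deterministic point calculation for illustration only, not the Poisson and Markov chain Monte Carlo estimation described above. The function names and the value μ = 0.25 per day are assumptions.

```python
import math


def split_incidence(total, b117_proportion):
    """Split total daily incidence into (pre-existing, B.1.1.7) series that
    sum back to the total on every day."""
    b117 = [i * p for i, p in zip(total, b117_proportion)]
    other = [i - v for i, v in zip(total, b117)]
    return other, b117


def rt_from_incidence(incidence, mu):
    """Invert I[t+1] = I[t] * exp(mu * (R_t - 1)) for each consecutive pair of days."""
    return [1.0 + math.log(nxt / cur) / mu
            for cur, nxt in zip(incidence, incidence[1:])]


MU = 0.25  # illustrative value only (1/MU = 4 days); not taken from the study
# Synthetic incidence growing with a constant reproduction number of 1.2.
total = [1000.0 * math.exp(MU * (1.2 - 1.0) * t) for t in range(10)]
proportion = [0.05 * t for t in range(10)]

other, b117 = split_incidence(total, proportion)
rt_total = rt_from_incidence(total, MU)
```

A day-by-day inversion like this is very sensitive to noise in the incidence series, which is one reason a probabilistic model with a serial interval distribution is preferable in practice.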
For the NASA climate data source see https://power.larc.nasa.gov/
We compared both multiplicative and additive differences of the new and old Rt values for days when the proportion of infections with the B.1.1.7 variant in a region was greater than 3%. Although data were not available for the proportion of infections with the B.1.1.7 variant in January, 2021, we also computed total incidence and Rt from Oct 1, 2020, to Jan 16, 2021, to see how they changed during the national lockdown in England.
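The additive and multiplicative comparisons can be sketched as follows, restricted to days on which the B.1.1.7 proportion exceeds 3%. Averaging the daily differences and ratios is an assumption made for illustration; the function name `compare_rt` and the example values are hypothetical.

```python
from statistics import mean


def compare_rt(rt_old, rt_new, b117_proportion, threshold=0.03):
    """Mean additive (new - old) and multiplicative (new / old) difference in Rt
    over days where the B.1.1.7 proportion is above `threshold`."""
    days = [t for t, p in enumerate(b117_proportion) if p > threshold]
    if not days:
        raise ValueError("no days above the proportion threshold")
    additive = mean(rt_new[t] - rt_old[t] for t in days)
    multiplicative = mean(rt_new[t] / rt_old[t] for t in days)
    return additive, multiplicative


# Invented daily values; the first day is excluded because 1% <= 3%.
rt_old = [1.0, 1.0, 0.8, 1.0]
rt_new = [1.5, 1.3, 1.2, 1.4]
proportion = [0.01, 0.05, 0.10, 0.20]
additive, multiplicative = compare_rt(rt_old, rt_new, proportion)
```

The two summaries answer different questions: the additive difference is in absolute units of Rt, whereas the multiplicative ratio describes a relative advantage that scales with the baseline Rt.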

Role of the funding source
Zoe Global developed the app for data collection. The funders of the study had no role in the study design, data collection, data analysis, data interpretation, or writing of the report.

Results
From March 24 to Dec 27, 2020, 4 327 245 participants from the UK signed up to use the COVID Symptom Study app. We excluded users living in Northern Ireland because of the low number of users who signed up (38 976 users), as well as 383 352 users without information on sex, and 2 175 979 who had not logged responses in the app between Sept 28 and Dec 27, 2020, leaving a total of 1 767 914 users. From Sept 28 to Dec 27, these users collectively recorded 65 613 697 logs in the app. In the same period, 497 989 users reported a swab test, of whom 55 192 reported a positive test, and we investigated the symptom reports of the 36 920 of those with a positive test whose region was known and who reported as healthy on app sign-up. The table shows the demographic data for the cohort studied.
Table notes: Data are n or n (%) unless otherwise specified. Invalid age refers to ages <1 or >100, which were usually caused by incorrect entries (eg, confusing the date of birth field with age). *There could be more than one test per individual as the overall number contains failed tests and unknown results. †Reports logged between Sept 28 and Dec 27, 2020; for some analyses we took further reports from an extended period from Sept 14, 2020, to Jan 18, 2021.
Between Sept 27 and Dec 31, 2020, 98 170 sequences were made available by COG-UK, corresponding to 4·4% of the 2 207 476 cases recorded during this period. 12 Figure 1 shows the increase in the proportion of infections with the B.1.1.7 variant over time in regions of the UK, using the COG-UK and SGTF data. Analysis of the variation in symptom occurrence over time showed no qualitative change in the proportion of users reporting each symptom with an increasing proportion of infections with the B.1.1.7 variant (figure 2; appendix p 2). Linear regression (both unadjusted and adjusted for participant age and sex, and regional temperature and humidity) did not show evidence of an association between the proportion of infections with the B.1.1.7 variant and symptoms reported, after controlling for the false discovery rate (appendix p 6).
Visual inspection of the total number of symptoms reported, asymptomatic infections, self-reported hospital visits, and instances of long symptom duration over time suggested no change in any of these outcomes with an increasing proportion of infections with the B.1.1.7 variant (appendix pp 3-4). When correcting for mean age, sex, and ambient temperature and humidity, there was no evidence of an association between the proportion of infections with the B.1.1.7 variant and the number of symptoms reported over a 4-week period, the number of admissions to hospital, long symptom duration, or the proportion of asymptomatic cases (appendix p 8).
We identified 304 individuals who reported two positive tests separated by an interval of at least 90 days. Among these individuals, symptom reporting allowed us to identify 249 users for whom there was a period of at least 7 symptom-free days in between positive tests, accounting for 0·7% (95% CI 0·6-0·8) of the 36 509 individuals who reported a positive swab test before Oct 1, 2020. Daily reports were available in the periods around both positive tests for 173 of those 249 users. There was no difference in reinfection reporting rates across the different NHS regions (p=0·11; appendix p 9). Figure 3 shows the evolution in the number of possible reinfections along with reported positive cases and the proportion of infections with the B.1.1.7 variant. For all regions except Scotland (which had a low number of app users), reinfection occurrences were more positively correlated with the overall regional rise in cases (rs=0·56 to 0·69 for South East, London, and East of England) than the regional rise in the proportion of infections with the B.1.1.7 variant (rs=0·38 to 0·56 for South East, London, and East of England; appendix p 9). Comparison of bootstrapped median values of these correlations using the Mann-Whitney U test showed the differences in correlation within each region were significant (p<0·001; appendix p 10).
When assessing the incidence and Rt for pre-existing variants and the B.1.1.7 variant […] and Rt. When considering only the period after the end of the second lockdown, we found a mean additive increase in Rt of 0·28 (0·01-0·61) and a mean multiplicative increase of 1·28 (1·02-1·61) for the B.1.1.7 variant. Conducting the same analysis using SGTF data, limited to the period after Dec 1, 2020, when at least 95% of all SGTF cases were attributable to B.1.1.7, we found the Rt of the B.1.1.7 variant to have a mean additive increase of 0·26 (0·15-0·37) and a mean multiplicative increase of 1·25 (1·17-1·34; appendix p 5). These data were provided weekly, and linear interpolation was used to obtain daily estimates, leading to smoother estimates for variant-specific incidence and Rt.
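The weekly-to-daily linear interpolation mentioned for the SGTF data can be sketched as below. This is a minimal illustration assuming evenly spaced weekly values; the function name `interpolate_daily` and the example proportions are hypothetical.

```python
def interpolate_daily(weekly_values, days_per_step=7):
    """Linearly interpolate evenly spaced weekly values to daily values.

    The result starts on the day of the first weekly value and ends on the
    day of the last one.
    """
    daily = []
    for start, end in zip(weekly_values, weekly_values[1:]):
        for d in range(days_per_step):
            daily.append(start + (end - start) * d / days_per_step)
    daily.append(weekly_values[-1])
    return daily


# Invented weekly SGTF proportions for one region.
weekly = [0.0, 0.7, 1.4]
daily = interpolate_daily(weekly)
```

Because each daily value lies on a straight line between two weekly observations, day-to-day noise is absent by construction, which is consistent with the smoother variant-specific estimates noted above.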
On Dec 19, 2020, London and much of the South East and East of England were placed under Tier 4 restrictions, enforcing stricter rules for physical distancing and decreased human-to-human contact that stopped short of nationwide measures. On Jan 5, 2021, the whole of England was placed in national lockdown. In January, the proportion of infections with the B.1.1.7 variant in London, the South East, and the East of England (these three regions had the largest proportion of infections with the B.1.1.7 variant in England) was at least 80%, assuming the proportion had not decreased from the end of December. Rt fell to around 0·8 in all three of these regions during the national lockdown (figure 5).

Discussion
Using data collected through community reporting of symptoms and tests via the COVID Symptom Study app, we did an ecological study to investigate whether the appearance of the B.1.1.7 variant, first detected in a sample from England in September, 2020, was associated with differences in symptoms, disease duration, admission to hospital, asymptomatic infection, risk of reinfection, and transmissibility for users reporting a positive test result between Sept 28 and Dec 27, 2020. We did not find associations between the proportion of infections with the B.1.1.7 variant and the type of symptoms reported by our app users. We also did not find evidence for any change associated with the B.1.1.7 variant in the total number of symptoms reported by individuals, nor in the proportion of individuals with a long disease duration, defined as recording symptoms for more than 28 days without a break of more than 7 days. The proportion of users with asymptomatic disease did not significantly change as the B.1.1.7 variant increased in prevalence, in agreement with other studies on the subject. 13 We also found no changes in admissions to hospital; however, other reports have shown that the B.1.1.7 variant increases rates of admission to hospital. 6 Limitations to the assessments of the proportions of asymptomatic cases and admission to hospital should be noted: most of our users get tested only when they have symptoms, so relatively few asymptomatic infections are recorded, and the self-reported nature of our data on admission to hospital means we are likely to miss more severe hospitalised cases, when the individual is unlikely to self-report. There is also evidence that infection with the B.1.1.7 variant is associated with increased risk of mortality, 6 and our data do not allow us to assess this.
A report from the COVID-19 Infection Survey, conducted by the UK Office for National Statistics, showed that individuals infected with the B.1.1.7 variant were more likely to report a cough, sore throat, fatigue, myalgia, and fever in the 7 days preceding the test, and less likely to report a loss of taste or smell. 14 It is not clear whether this report adjusted for age, sex, and environmental factors, although we found that adjustment for these factors did not affect the results of our analysis (appendix p 6). The discrepancy between our results and those of the […]
We observed a very low prevalence of possible reinfection (249 potential cases; 0·7% [95% CI 0·6-0·8]), consistent with another study of 6614 health-care workers who had previously tested positive for COVID-19, in which 44 (0·66%) possible reinfections were identified. 16 Our reinfection rate did not vary across regions or time, which is consistent with the hypothesis that reinfection is no more likely in the context of the B.1.1.7 variant. This might mean that, if adequate immunity is built during the first infection, it might be sufficient to protect against reinfection in the presence of the B.1.1.7 variant. Ultimately, this is a positive sign that the immunity built through vaccination against pre-existing variants could also be effective against the B.1.1.7 variant. This finding is in line with initial, laboratory-based studies of the efficacy of vaccines designed for pre-existing variants against this newer variant. 17-19
We found an increase in Rt associated with the B.1.1.7 variant. There was a mean multiplicative increase in Rt of 1·35 (95% CI 1·02-1·69), which is similar to estimates from Volz and colleagues, 5 who estimated an increase in Rt of 1·5-2·0; Davies and colleagues, 4 who estimated an increase in transmissibility of 1·43-1·90 (95% CI 1·38-2·30); and Walker and colleagues, 13 who found an increase in growth rate that corresponded to a transmissibility increase of 1·33 (95% CI 1·21-1·53) assuming a generation time of 4·7 days. 20 These increases in transmissibility have worrying implications for the ability of lockdown measures to control spread of the B.1.1.7 variant, given that Rt was estimated to be 0·7-0·9 during the first national lockdown in England. 21 However, we found Rt to be around 0·8 in the three regions in England in which at least 80% of infections were likely to be due to the B.1.1.7 variant during the national lockdown beginning on Jan 5, 2021. There are several potential explanations for this finding. Adherence to this lockdown could have been greater than in previous lockdowns, helping to reduce Rt. The true increase in transmissibility might also be at the lower end of the available estimates, or it is possible that the increase in transmissibility estimated outside lockdown cannot be extrapolated to lockdown, perhaps because of the B.1.1.7 variant responding differently to lockdown measures than pre-existing variants. Another possible explanation is that there is now sufficient community immunity to reduce Rt further than in previous lockdowns. One serology study estimated that, from Dec 21, 2020, to Jan 18, 2021, 15·3% (95% CI 14·7-15·9) of individuals in England would have tested positive for COVID-19 antibodies. 22 Many countries have now detected infections with the B.1.1.7 variant, and work to better understand the factors that helped to suppress its spread in the UK will help other countries to formulate their public health responses. 23
Our study has several strengths. The large, longitudinal nature of the COVID Symptom Study data, with good coverage of the UK population, provides a unique opportunity to study potential changes in symptomatology and disease duration. The ability to match tests and symptom reports over long periods also allows us to measure possible reinfection rates. Our data also offer the ability to provide a valuable complementary measure to existing measurements of the increased transmissibility of the B.1.1.7 variant: we were able to use real-time, representative incidence estimates to measure Rt, whereas other studies have relied on deaths and admissions to hospital, which are lagged, or community case numbers, which do not reflect true infection numbers.
We acknowledge several limitations to this study. First, as we had no information on the variant causing individual positive infections reported through the app, we did an ecological study, assessing the association between the proportion of infections with the B.1.1.7 variant and population-level measures. This design does not allow for causal interpretation of the effect of the B.1.1.7 variant on the measures we investigate. Our work assumes that all non-B.1.1.7 variants in circulation at the time of study give rise to the same range of symptoms and immune response, and have the same transmissibility. Genomic surveillance has detected a very low number of non-B.1.1.7 variants of concern in circulation, 24 supporting the validity of this assumption, but it cannot be ruled out that other variants with different characteristics are circulating undetected.
Second, data obtained from participatory digital platforms have well documented 25 biases in demographics. Although we were able to correct for some of these factors, such as age and sex, in our analysis, others are more difficult to characterise and correct for. For example, respondents signing up to a participatory platform such as the COVID Symptom Study app are likely to be more interested in health and COVID-19 than the wider population, and might exhibit different behaviours. Participatory studies might also suffer from ascertainment or collider bias. 26 Self-report also carries the risk of data input errors, although the design of the app seeks to minimise this; for example, each time a user submits a log in the app, they are shown the full history of their test results and are given the option to amend incorrect entries. Previous publications from our group have found that population-level estimates of disease prevalence from our app triangulate well with those obtained from studies designed to be representative of the population. 11 Another limitation was that we assumed that testing positive for SARS-CoV-2 infection after an interval of 90 days with at least a 7-day period with an absence of symptoms is consistent with reinfection. Repeated positive testing has been reported shortly after hospital discharge, 27 with PCR positivity detected up to 28 days after symptom resolution. Although the chosen cutoff of 90 days between two positive tests is unlikely to be due to prolonged PCR positivity, this cannot be ruled out; however, this would probably only affect a small number of cases. Viral sequencing of the two infections would ideally be used to confirm reinfection.
Finally, despite correcting for changes in temperature and humidity, comparisons in symptoms were made over time, and seasonal effects (including effects on symptoms) might not have been fully taken into account. 28
In summary, after examining the effect of the proportion of infections with the SARS-CoV-2 B.1.1.7 variant on COVID-19 symptoms, disease course, rates of reinfection, and transmissibility in the UK, we found no change in symptoms or their duration. Reinfections were rare (0·7% of app users) and there was no evidence of increased reinfection rates associated with the prevalence of the B.1.1.7 variant. We found an increase in Rt for the B.1.1.7 variant, but Rt fell below 1 during lockdown, even in regions with very high (>80%) proportions of infections with the B.1.1.7 variant.

Contributors
MSG, CHS, ATC, JW, TDS, CJS, and SO contributed to study concept and design. CHS, AM, BM, DAD, LHN, LP, SS, CH, JC, The COG-UK Consortium, ATC, JW, TDS, CJS, and SO contributed to acquisition of data. MSG, CHS, AM, and TV contributed to data analysis and verified the underlying data. MSG and CHS contributed to initial drafting of the manuscript. All authors contributed to interpretation of data and critical revision of the manuscript. ATC, CJS, TDS, and SO contributed to study supervision. All authors had full access to all the data in the study and had final responsibility for the decision to submit for publication.

Declaration of interests
AM, LP, SS, JC, CH, and JW are employees of Zoe Global. TDS is a consultant for Zoe Global. DAD and ATC previously served as investigators on a clinical trial of diet and lifestyle using a separate smartphone app supported by Zoe Global. ATC reports grants from the Massachusetts Consortium on Pathogen Readiness during the conduct of the study; and personal fees from Pfizer, Boehringer Ingelheim, and Bayer Pharma outside the submitted work. DAD reports grants from the US National Institutes of Health, the Massachusetts Consortium on Pathogen Readiness, and the American Gastroenterological Association during the conduct of the study. All other authors declare no competing interests.

Data sharing
Data collected in the COVID Symptom Study smartphone app are being shared with other health researchers through the UK NHS-funded Health Data Research UK and Secure Anonymised Information Linkage consortium, housed in the UK Secure Research Platform (Swansea, UK). Anonymised data are available to be shared with researchers according to their protocols in the public interest (https://web.www.healthdatagateway.org/dataset/fddcb382-3051-4394-8436-b92295f14259). US investigators are encouraged to coordinate data requests through the Coronavirus Pandemic Epidemiology Consortium (https://www.monganinstitute.org/cope-consortium).