Global, regional, and national estimates of the population at increased risk of severe COVID-19 due to underlying health conditions in 2020: a modelling study

Summary
Background The risk of severe COVID-19 if an individual becomes infected is known to be higher in older individuals and those with underlying health conditions. Understanding the number of individuals at increased risk of severe COVID-19 and how this varies between countries should inform the design of possible strategies to shield or vaccinate those at highest risk.
Methods We estimated the number of individuals at increased risk of severe disease (defined as those with at least one condition listed as “at increased risk of severe COVID-19” in current guidelines) by age (5-year age groups), sex, and country for 188 countries using prevalence data from the Global Burden of Diseases, Injuries, and Risk Factors Study (GBD) 2017 and UN population estimates for 2020. The list of underlying conditions relevant to COVID-19 was determined by mapping the conditions listed in GBD 2017 to those listed in guidelines published by WHO and public health agencies in the UK and the USA. We analysed data from two large multimorbidity studies to determine appropriate adjustment factors for clustering and multimorbidity. To help interpretation of the degree of risk among those at increased risk, we also estimated the number of individuals at high risk (defined as those that would require hospital admission if infected) using age-specific infection–hospitalisation ratios for COVID-19 estimated for mainland China and making adjustments to reflect country-specific differences in the prevalence of underlying conditions and frailty. We assumed males were twice as likely as females to be at high risk. We also calculated the number of individuals without an underlying condition that could be considered at increased risk because of their age, using minimum ages from 50 to 70 years. We generated uncertainty intervals (UIs) for our estimates by running low and high scenarios using the lower and upper 95% confidence limits for country population size, disease prevalences, multimorbidity fractions, and infection–hospitalisation ratios, and plausible low and high estimates for the degree of clustering, informed by multimorbidity studies.
Findings We estimated that 1·7 billion (UI 1·0–2·4) people, comprising 22% (UI 15–28) of the global population, have at least one underlying condition that puts them at increased risk of severe COVID-19 if infected (ranging from <5% of those younger than 20 years to >66% of those aged 70 years or older). We estimated that 349 million (186–787) people (4% [3–9] of the global population) are at high risk of severe COVID-19 and would require hospital admission if infected (ranging from <1% of those younger than 20 years to approximately 20% of those aged 70 years or older). We estimated 6% (3–12) of males to be at high risk compared with 3% (2–7) of females. The share of the population at increased risk was highest in countries with older populations, African countries with high HIV/AIDS prevalence, and small island nations with high diabetes prevalence. Estimates of the number of individuals at increased risk were most sensitive to the prevalence of chronic kidney disease, diabetes, cardiovascular disease, and chronic respiratory disease.
Interpretation About one in five individuals worldwide could be at increased risk of severe COVID-19, should they become infected, due to underlying health conditions, but this risk varies considerably by age. Our estimates are uncertain, and focus on underlying conditions rather than other risk factors such as ethnicity, socioeconomic deprivation, and obesity, but provide a starting point for considering the number of individuals that might need to be shielded or vaccinated as the global pandemic unfolds.
Funding UK Department for International Development, Wellcome Trust, Health Data Research UK, Medical Research Council, and National Institute for Health Research.


Methods used to calculate the proportion of individuals with at least one underlying condition
Note: Some of the text from the main paper is repeated here for convenience.

Proportion with at least one underlying condition relevant to severe COVID-19 disease
The GBD study provides prevalence estimates for each disease category separately, but not the quantity we needed: the prevalence of people with a condition in at least one of these categories. Diseases may cluster, for example if they are causally related. To deal with this, we first calculated E, the expected proportion of individuals with at least one condition assuming no clustering, i.e. that the various prevalences are independent (e.g. the fact that someone has diabetes does not affect their risk of getting cancer). E is 1 minus the probability of not having any of the conditions c1, c2, c3, …, i.e. E = 1 − (1 − p_c1) × (1 − p_c2) × (1 − p_c3) × …
We then estimated the proportion who have at least one underlying condition as p = E × r, where r is the ratio between the observed and expected percentage of individuals with at least one condition. We based r on evidence from large cross-sectional multimorbidity studies in Scotland 1 and Southern China. 2 The ratio was broadly consistent by age, sex and study (see Figure 1). For the analysis of both males and females combined, the mean of all age-specific values of r was 0.92 (range 0.86 to 0.99) in Scotland and 0.92 (range 0.75 to 1.15) in China. When extrapolating this value to other countries, we used a ratio of 0.9 for all age groups and, for transparency, varied this between 0.7 and 1.0. The resulting national estimates of p were constrained to be no less than the prevalence of each country's single most prevalent condition. We conducted sensitivity analyses to explore the impact on results of using the observed age-specific values of r rather than the same value for all ages, but this had a very small impact: the share of the population estimated to be at increased risk changed from 22.5% to 22.8%.
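The calculation described above can be sketched in a few lines of Python. This is a minimal illustration rather than the code used for the study: the function names and the example prevalences are hypothetical, and applying the floor (the single most prevalent condition) within one age-sex group is an assumption made for simplicity, since the text describes that constraint for national estimates.

```python
from math import prod


def expected_at_least_one(prevalences):
    """Expected share with at least one condition if conditions were independent."""
    return 1.0 - prod(1.0 - p for p in prevalences)


def adjusted_at_least_one(prevalences, r=0.9):
    """Scale the expected share by the observed/expected ratio r.

    The result is floored at the largest single prevalence, because the share
    with at least one condition cannot be below the most common condition.
    """
    expected = expected_at_least_one(prevalences)
    return max(expected * r, max(prevalences))


# Hypothetical prevalences for three conditions in one age-sex group.
example = [0.10, 0.05, 0.02]
print(round(expected_at_least_one(example), 4))
print(round(adjusted_at_least_one(example, r=0.9), 4))
```

Passing r=0.7 or r=1.0 reproduces the low and high clustering scenarios for the same inputs.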

Adjustment for multimorbidity
In addition to providing estimates for r, the studies in Scotland and Southern China were also used to calculate the multimorbidity fraction, i.e. the proportion of individuals with multiple (two or more) underlying conditions among those with at least one, by age group and sex. All analyses were done using disease categories that matched as closely as possible to the COVID-19-relevant categories defined in our analysis. In both studies this included: CVD (defined as the presence of one or more of coronary heart disease, hypertension, cerebrovascular disease, peripheral arterial disease, heart failure, or atrial fibrillation); chronic neurological disease (defined as one or more of dementia, multiple sclerosis and Parkinson's disease); and CRD (defined as one or both of chronic obstructive pulmonary disease and bronchiectasis). Other COVID-related conditions listed in the main analysis were counted separately. The GBD provides separate estimates for hypertensive heart disease and CKD due to hypertension, but it was not possible to make this distinction in the multimorbidity datasets, so all hypertension was included in the CVD category.
We calculated pooled estimates of the multimorbidity fraction by age and sex and extrapolated these pooled estimates to all countries included in the analysis (see Figure 1).
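As a rough illustration of how a multimorbidity fraction is used, the sketch below splits a count of individuals with at least one condition into those with exactly one and those with two or more. The function name and the input values are hypothetical and are not taken from the study.

```python
def split_by_condition_count(n_at_increased_risk, multimorbidity_fraction):
    """Return (count with one condition, count with two or more conditions)."""
    if not 0.0 <= multimorbidity_fraction <= 1.0:
        raise ValueError("multimorbidity_fraction must lie between 0 and 1")
    multiple = n_at_increased_risk * multimorbidity_fraction
    return n_at_increased_risk - multiple, multiple


# Hypothetical age-sex group: 1,000 people with at least one condition,
# of whom a quarter have two or more.
one_condition, two_or_more = split_by_condition_count(1000, 0.25)
print(one_condition, two_or_more)
```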

Figure 1. Empirical estimates of the ratio r (left panel) and the multimorbidity fraction among those with at least one underlying health condition relevant to COVID-19 (right panel) from cross-sectional studies in Scotland and Southern China
The top row shows results for females and males combined. The middle row shows results for females only and the bottom row shows results for males only.
The left panel/column shows the ratio between the observed and expected % of individuals with at least one condition by age. Expected estimates were calculated by assuming the prevalences of COVID-19 underlying conditions are independent (e.g. the fact that someone has diabetes does not affect their risk of getting cancer), as 1 minus the probability of not having any of the conditions c1, c2, c3, …, i.e. 1 − (1 − p_c1) × (1 − p_c2) × (1 − p_c3) × … This was then compared to the observed value of the % of individuals with at least one condition based on the same dataset (either Scotland or Southern China). The ratios between observed and expected are shown on the left panel/column below. Both studies indicate that the expected value based on the assumption of independence would provide reasonable estimates of the observed value. In our main analysis we assumed the ratio was 0.9 but varied this between 0.7 and 1.0 in uncertainty analysis.
The right panel/column shows the proportion of those with at least one underlying condition relevant to COVID-19 with multimorbidity (two or more conditions). As expected, this percentage increases with age in both studies. The grey lines represent pooled estimates and 95% confidence intervals based on a second-order polynomial model fitted to all data points. Pooled estimates were extrapolated to all countries included in the analysis by age and sex. The lower and upper CI values were used in our low and high estimates in the main paper.

Infection hospitalisation ratios
To estimate the number of individuals at high risk (those that would require hospital admission if infected) we applied country-level UN estimates of the number of individuals alive in each 5-year age group 3 to age-specific infection hospitalisation ratios (IHRs) recently estimated for mainland China by Verity et al. 4 IHRs represent the proportion of people who are infected that would have symptoms severe enough to require hospital admission. The term 'require hospital admission' is consistent with the WHO definition for severe cases. 5 Whether or not these individuals actually receive hospital care will depend on the health system in the country concerned but is beyond the scope of this analysis.
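The core of this step is a population-weighted sum: each age group's population is multiplied by the IHR for that age group. The sketch below shows the arithmetic only. The function name, age bands and numbers are hypothetical placeholders, not values from the UN estimates or from Verity et al.

```python
def number_at_high_risk(population_by_age, ihr_by_age):
    """Sum of population x IHR over age groups (IHRs given as proportions)."""
    missing = set(population_by_age) - set(ihr_by_age)
    if missing:
        raise KeyError(f"no IHR supplied for age groups: {sorted(missing)}")
    return sum(population_by_age[age] * ihr_by_age[age] for age in population_by_age)


# Hypothetical inputs for two 5-year age groups.
population = {"50-54": 1000, "55-59": 2000}
ihr = {"50-54": 0.05, "55-59": 0.10}
print(number_at_high_risk(population, ihr))
```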

Adjustments to IHRs
We made two adjustments to account for differences between IHRs in China and other countries. The first was designed to capture the effect on IHRs of national variations in the prevalence of underlying conditions. The second was to adjust for infections in given age groups being more severe in higher mortality settings.
1. Adjusting for underlying conditions. For each 5-year age group and sex, the prevalence rates for each underlying condition were multiplied by their respective relative risks (RRs) for hospitalisation. RRs of 3.0 were assumed for CKD, diabetes and CVD and of 1.5 for the eight other conditions. The RRs we used were informed by a rapid review of what is currently known about the strength of association between different variables and COVID-19 hospital admission (see table 2 below). The totals were then summed across all 11 conditions and added to the proportion of individuals without underlying conditions, to create a risk score for each 5-year age group. IHRs were then adjusted to account for the ratio of the risk score for the country of interest and China. For example, for males aged 55-59 years the risk score was 2.63 in Afghanistan and 1.95 in China, so the IHR for this age group was multiplied by a ratio of 1.35 (2.63/1.95). This adjustment assumes that for each underlying condition, the RR of hospital admission is the same for every country.

2. Adjusting for age-based frailty. Direct interpretation of the first adjustment relies on constant RRs of admission for each condition across countries. To address the likelihood of more severe infections at given ages in higher mortality settings we went further. For each 5-year age group and sex, we divided UN estimates of age-specific life expectancy 3 for China by the equivalent estimate for the country of interest. This ratio (a proxy for the difference in age-based frailty) was then applied to the IHR for the same age group. For example, the life expectancy for males at age 55 years is 22.8 and 19.1 years in China and Afghanistan, respectively, so the IHR for this age group was multiplied by a ratio of 1.19 (22.8/19.1) to generate the estimate for Afghanistan. Country-specific differences in life expectancy may be more extreme than country-specific differences in infection severity because life expectancy depends not only on general health status, but also on access to effective health care. Also, the extent to which infections will be more severe in higher mortality settings is still very uncertain. We therefore show results with and without this adjustment.
Thus, for males aged 55-59 years in Afghanistan, the separate adjustments for underlying conditions and age-based frailty together had the effect of multiplying the IHR by a factor of 1.61 (1.35 × 1.19), increasing it from 9% to 15%.
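The two adjustments can be sketched as follows, using the figures quoted above for males aged 55-59 years in Afghanistan relative to China. The function and parameter names are hypothetical, and treating the proportion without underlying conditions as a separate input is an assumption about how the risk score is assembled. Because the 9% baseline is itself a rounded figure, the result only approximately reproduces the published 15%.

```python
def risk_score(prevalence, relative_risk, share_without_condition):
    """Sum of prevalence x RR over conditions, plus the share with no condition."""
    weighted = sum(prevalence[c] * relative_risk[c] for c in prevalence)
    return weighted + share_without_condition


def adjusted_ihr(base_ihr, score_country, score_reference,
                 life_exp_country, life_exp_reference, apply_frailty=True):
    """Scale a reference IHR by the risk-score ratio and, optionally, by frailty."""
    conditions_ratio = score_country / score_reference
    # Frailty proxy: reference life expectancy divided by the country's.
    frailty_ratio = life_exp_reference / life_exp_country if apply_frailty else 1.0
    return base_ihr * conditions_ratio * frailty_ratio


# Values quoted in the text: risk scores 2.63 vs 1.95, life expectancy at
# age 55 of 19.1 vs 22.8 years, and a (rounded) reference IHR of 9%.
with_frailty = adjusted_ihr(0.09, 2.63, 1.95, 19.1, 22.8)
without_frailty = adjusted_ihr(0.09, 2.63, 1.95, 19.1, 22.8, apply_frailty=False)
print(round(with_frailty / 0.09, 2), round(without_frailty / 0.09, 2))
```

Setting apply_frailty=False corresponds to presenting results without the second adjustment.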
Several studies are emerging on the risk of mortality among those already hospitalised 6,7 but we restricted our analysis of RRs to studies that allowed comparison of hospitalised and non-hospitalised COVID-19 cases. This was because our analysis focused on the risk of severe disease (requiring hospital admission) rather than the risk of death. Three studies from the USA met these criteria. The first contained descriptive data on 6,637 COVID-19 cases reported to the CDC as of 28th March, 2020. For this study, we derived crude univariable RRs for each condition from the reported case counts. 8 The second was a multivariable analysis of 4,103 COVID-19 cases in New York City, between 1st March and 2nd April, 2020. 9 The third was a multivariable study from Northern California with 1,052 confirmed cases of COVID-19 captured between 1st January and 8th April, 2020. 10 For conditions where evidence was weak or missing, we included studies that did not meet our initial inclusion criteria, either because they were not yet published (i.e. multivariable analyses in Italy 11 and the UK 12 ) or because many of the COVID-19 cases in the less severe group were hospitalised (i.e. one published meta-analysis of 4 studies from China). 13 Using the evidence from these studies, we summarised the strength of the association with hospital admission (low, moderate, high) and graded our confidence in the strength of that association (low, moderate, high).
This rapid review aimed to summarise what is currently known about the strength of association between different variables and COVID-19 hospital admission, but it is unlikely to be exhaustive. We therefore chose to use a very simple stratification of risk for the 11 conditions (either RR = 3.0 or RR = 1.5) based on the range of RRs reported for these conditions. For three conditions (CKD, diabetes and CVD) we had moderate confidence that the strength of the association would be high based on univariable and multivariable analysis, so assumed an RR of 3.0. For the eight other conditions we assumed an RR of 1.5 because there was insufficient evidence of a significant independent association with hospital admission from multivariable analyses.
RR values for each condition were varied one at a time (assuming low and high values of 1 and 10 respectively) to assess the impact of these changes on the total population at high risk (p 11).

Gender
Not included in current guidelines. In one multivariable analysis in New York City (n=4,103), male gender was a significant independent predictor of hospitalisation (OR = 2.80, 95% CI 2.38-3.30, p<0.001). 9 Males were twice as likely to be admitted to hospital as females (OR = 1.9, p=0.001) based on multivariable analysis from Northern California (n=1,052). 10 In a large study of hospitalised patients in the UK (n=16,749), 60% of patients were male and this effect was seen across all ages. 6 Males had an RR of 1.64 (95% CI 1.25-2.14, p<0.001) in a multivariable study in the UK that compared people who did and did not have a test-positive hospital admission (n=428,225). 12 Data from an unpublished multivariable analysis in Italy (n=1,866) found an association between male gender and hospital admission (HR = 1.4, 95% CI 1.2-1.6). 11 Strength of association: high (for simplicity we assumed males were twice as likely as females to be at high risk, i.e. to require hospital admission if infected).

Pregnancy
The prevalence of pregnancy was the same among COVID-19 hospital admissions (6%) and the wider community in a large study in the UK (n=16,749), 6 but pregnancy is likely to remain in current guidelines until more evidence emerges, e.g. on the risks at different stages of pregnancy.

Table 3 summarises the share of the population estimated to be at high risk (those that would require hospital admission if infected) for the base case scenario and for a range of alternative scenarios.

Sensitivity analysis for estimates of the number of individuals at high risk
The following assumptions were varied:
1. Low and high credible intervals of IHRs: scenarios based on the low and high credible interval values of the IHRs reported in Verity et al 16 were influential. The global population at high risk (4.5%) decreased to 2.7% with low IHRs and increased to 9.1% with high IHRs.
2. Adjustments for underlying conditions: removing this adjustment reduced the global population at high risk from 4.5% to 4.0% (and from 3.1% to 2.7% in Africa). In some countries removing this adjustment had a more substantial impact, due to very high prevalence of specific conditions relative to the same conditions in China, e.g. diabetes in Fiji and HIV/AIDS in eSwatini.
3. Adjustments for age-based frailty: as expected, removing the age-based frailty adjustment resulted in a lower population at high risk in Africa (from 3.1% to 2.6%) and a higher population at high risk in Europe and other high-income settings.
4. Altering the maximum proportion of the population infected: our central estimates of the numbers at high risk of severe COVID-19 disease assume there is no theoretical maximum to the proportion of the population that could ever be infected. Thus, the total number of individuals at high risk is calculated by multiplying the total population in each age group by the IHR for the same age group. However, as our understanding of COVID-19 transmission dynamics improves, empirical data may begin to emerge on the scale and duration of immunity acquired from natural infections, and on the proportion of the population that could ever be infected given widespread community transmission. If this is lower than our current assumption of 100%, then fewer individuals would be at high risk using our method. When we varied this value, the global population at high risk decreased from 4.5% (assuming all individuals could be infected) to 4.0%, 3.6%, 3.1% and 2.7% when assuming this value could not exceed 90%, 80%, 70% and 60%, respectively.
5. Altering the RRs for specific conditions: changing the RR values for each condition one at a time (assuming low and high values of 1 and 10 respectively) had a limited impact on the total population at high risk (<5% increase/decrease). Results were most sensitive to CVD, CKD, diabetes and liver disease (due to its higher prevalence in China relative to many other settings). Increasing the RR for HIV/AIDS from 1.5 to 10.0 was influential in Africa and increased the share of the population at high risk from 3.1% (42 million) to 3.7% (49 million).
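The figures reported for assumption 4 are consistent, to within rounding, with scaling the central estimate in proportion to the assumed maximum share of the population that could be infected. The sketch below makes that proportional-scaling assumption explicit. It is an illustration rather than the study's implementation, and the function name is hypothetical.

```python
def share_at_high_risk_with_cap(share_if_all_infectable, max_share_infected):
    """Scale the central share at high risk by the maximum share ever infected."""
    if not 0.0 <= max_share_infected <= 1.0:
        raise ValueError("max_share_infected must lie between 0 and 1")
    return share_if_all_infectable * max_share_infected


# Central global estimate of 4.5% under caps of 100%, 90%, 80%, 70% and 60%.
for cap in (1.0, 0.9, 0.8, 0.7, 0.6):
    print(f"cap {cap:.0%}: {share_at_high_risk_with_cap(4.5, cap):.2f}% at high risk")
```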

Table 4. Number of individuals in millions at increased risk of severe COVID-19 illness by age, number of conditions, region, and age threshold: low scenario estimates
For numbers at increased risk, the low estimates were based on a scenario assuming the lower 95% CI values for the age-sex-specific population estimates, disease prevalence rates, and multimorbidity fraction, and assume an r ratio of 0.7.

Figure 3 in the main paper shows the share of the population at risk in different countries based on real-world differences in population structure and disease prevalence. This information is important when calculating the numbers that might need to be shielded or vaccinated but does not allow direct comparison of risks at equivalent ages in different countries. In this alternative version (see below), circles have been added to show the age-standardised share of the population at high risk (black circles) and increased risk (open circles). These assume each country has the same WHO standard reference population. 17 A low age-adjusted population at risk in countries with older populations (eg, Japan, Europe and Puerto Rico) helps to confirm that older age is the main reason why these countries have a high unadjusted population at risk. Similarly, a high age-adjusted population at risk in African countries with high HIV prevalence (eg, eSwatini, Lesotho) and small island nations with high diabetes prevalence (eg, Fiji, Mauritius) explains why these countries have a high unadjusted population at risk, despite having younger populations. Differences in demography can mask important differences in age-specific risks that may be revealed by age-standardisation. For example, in eSwatini and New Zealand the population at high risk is 5% in both countries, but when risks are compared for equivalent age groups (within the spreadsheet tool) the age-specific risks in eSwatini are more than double those in New Zealand (consistent with eSwatini having a higher age-adjusted population at high risk, ie, 8% vs 3%).
Thus, although younger populations will tend to have a lower share of the population at risk than older populations, their risk at equivalent ages could still be higher.
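Direct age-standardisation of this kind can be sketched as a weighted average of age-specific risks, using either a country's own age structure (the unadjusted share) or a common reference structure (the age-standardised share). The age bands, risks and weights below are hypothetical values chosen to show the masking effect. They are not the WHO standard population or results from the study.

```python
def weighted_share(age_specific_share, weights):
    """Weighted average of age-specific shares, using the given age weights."""
    total = sum(weights.values())
    return sum(age_specific_share[age] * w for age, w in weights.items()) / total


# Hypothetical countries: B has double the age-specific risk of A at every
# age but a much younger population.
risk_a = {"0-49": 0.01, "50+": 0.10}
risk_b = {"0-49": 0.02, "50+": 0.20}
population_a = {"0-49": 50, "50+": 50}
population_b = {"0-49": 80, "50+": 20}
standard = {"0-49": 70, "50+": 30}  # hypothetical reference population

# Unadjusted shares are similar because of the different age structures.
print(weighted_share(risk_a, population_a), weighted_share(risk_b, population_b))
# Age-standardised shares recover the twofold difference in age-specific risk.
print(weighted_share(risk_a, standard), weighted_share(risk_b, standard))
```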