Group B streptococcus infection during pregnancy and infancy: estimates of regional and global burden

Summary Background Group B streptococcus (GBS) colonisation during pregnancy can lead to invasive GBS disease (iGBS) in infants, including meningitis or sepsis, with a high mortality risk. Other outcomes include stillbirths, maternal infections, and prematurity. There are data gaps, notably regarding neurodevelopmental impairment (NDI), especially after iGBS sepsis, which have limited previous global estimates. In this study, we aimed to address this gap using newly available multicountry datasets. Methods We collated and meta-analysed summary data, primarily identified in a series of systematic reviews published in 2017 but also from recent studies on NDI and stillbirths, using Bayesian hierarchical models, and estimated the burden for 183 countries in 2020 regarding: maternal GBS colonisation, iGBS cases and deaths in infants younger than 3 months, children surviving iGBS affected by NDI, and maternal iGBS cases. We analysed the proportion of stillbirths with GBS and applied this to the UN-estimated stillbirth risk per country. Excess preterm births associated with maternal GBS colonisation were calculated using meta-analysis and national preterm birth rates. Findings Data from the seven systematic reviews, published in 2017, that informed the previous burden estimation (a total of 515 data points) were combined with new data (17 data points) from large multicountry studies on neurodevelopmental impairment (two studies) and stillbirths (one study). A posterior median of 19·7 million (95% posterior interval 17·9–21·9) pregnant women were estimated to have rectovaginal colonisation with GBS in 2020. 231 800 (114 100–455 000) early-onset and 162 200 (70 200–394 400) late-onset infant iGBS cases were estimated to have occurred. 
In an analysis assuming a higher case fatality rate in the absence of a skilled birth attendant, 91 900 (44 800–187 800) iGBS infant deaths were estimated; in an analysis without this assumption, 58 300 (26 500–125 800) infant deaths from iGBS were estimated. 37 100 children who recovered from iGBS (14 600–96 200) were predicted to develop moderate or severe NDI. 40 500 (21 500–66 200) maternal iGBS cases and 46 200 (20 300–111 300) GBS stillbirths were predicted in 2020. GBS colonisation was also estimated to be potentially associated with considerable numbers of preterm births. Interpretation Our analysis provides a comprehensive assessment of the pregnancy-related GBS burden. The Bayesian approach enabled coherent propagation of uncertainty, which is considerable, notably regarding GBS-associated preterm births. Our findings on both the acute and long-term consequences of iGBS have public health implications for understanding the value of investment in maternal GBS immunisation and other preventive strategies. Funding Bill & Melinda Gates Foundation.

, Methods and Supplementary Methods. Detailed information on the different studies can be obtained from the supplementary appendices of the different literature reviews referenced in the paper and below.
6 Identify and describe any categories of input data that have potentially important biases (e.g., based on characteristics listed in item 5).
Limitations of data and model assumptions are described in Table S9 and in the Discussion section. For data inputs that contribute to the analysis but were not synthesized as part of the study: 7 Describe and give sources for any other data inputs.
This information is presented in Table 1 and in the Methods and Supplementary Methods. For all data inputs: 8 Provide all data inputs in a file format from which data can be efficiently extracted (e.g., a spreadsheet rather than a PDF), including all relevant meta-data listed in item 5. For any data inputs that cannot be shared because of ethical or legal reasons, such as third-party ownership, provide a contact name or the name of the institution that retains the right to the data.
Published data used in meta-analyses have been uploaded in a data repository. Some of the datasets are also available in data repositories from authors who performed the literature reviews or in the supplementary appendices of the reviews. Access to unpublished data requires direct communication with leading investigators of specific studies (e.g. for new data on NDI after iGBS). Data analysis 9 Provide a conceptual overview of the data analysis method. A diagram may be helpful.

Methods and Supplementary Methods describe the statistical approach in detail. Figure 1 shows the different outcomes being modelled.

10 Provide a detailed description of all steps of the analysis, including mathematical formulae. This description should cover, as relevant, data cleaning, data pre-processing, data adjustments and weighting of data sources, and mathematical or statistical model(s).
This information is in the Methods section and the Supplementary Methods.

11 Describe how candidate models were evaluated and how the final model(s) were selected.
In the Supplementary Methods, we describe model checks, sensitivity analyses, and secondary analyses for some of the outcomes; see also in 1 .

12 Provide the results of an evaluation of model performance, if done, as well as the results of any relevant sensitivity analysis.
Supplementary Methods (including figures) and Table S8.
13 Describe methods for calculating uncertainty of the estimates. State which sources of uncertainty were, and were not, accounted for in the uncertainty analysis.

14 State how analytic or statistical source code used to generate estimates can be accessed.

Results and Discussion
15 Provide published estimates in a file format from which data can be efficiently extracted.
Summary tables (Tables in the main manuscript) are available in a data repository.
16 Report a quantitative measure of the uncertainty of the estimates (e.g. uncertainty intervals).

17 Interpret results in light of existing evidence. If updating a previous set of estimates, describe the reasons for changes in estimates.

18 Discuss limitations of the estimates. Include a discussion of any modelling assumptions or data limitations that affect interpretation of the estimates.

Methods Supplement - Details per parameter
This section is divided into the following subsections: Maternal GBS colonisation; Early-onset invasive GBS disease; Late-onset invasive GBS disease; Mortality during invasive GBS disease; Neurodevelopmental impairment after invasive GBS disease; Stillbirths attributed to GBS; Maternal disease; Preterm birth associated with maternal GBS colonisation. In each subsection, we describe the data used, present the model used in our analysis, and discuss prior assumptions where relevant.

Maternal GBS colonisation
We developed a Bayesian hierarchical model to estimate country-level maternal GBS colonisation prevalence and its association with relevant country-level variables. A total of 325 data points, from 82 countries, directly informed this estimation; these data were reviewed in 2 , where GBS colonisation was defined based on culture results. The model below shows how data from prevalence studies informed national estimates:

At the country level, the logit-prevalence in country m, $\mu_m$, was assumed to depend on the global intercept, $\mu_{g-c}$, and on country-level variables, $X_m$, that were standardised before analysis; $\beta$ represents a vector of regression coefficients:

$$\mu_m \sim \text{Normal}(\mu_{g-c} + X_m \beta,\ \sigma)$$

The following country-level variables were used based on previous analyses 1, 3  . Priors used for this analysis were: for the regression coefficients, $\beta$, weakly informative Normal(0, 1) priors; for $\mu_{g-c}$, Normal(-1, 1); and for the scale parameters, $\sigma$, Uniform(0, 5). In particular, we used priors consistent with our knowledge that generally less than half of the maternal population is colonised by GBS. In Table S2, we present posterior estimates of model coefficients. Posterior distributions of the coefficients were used in the estimation of maternal GBS colonisation prevalence for countries without data; this also incorporates the unexplained between-country variation represented by the scale parameter $\sigma$. Region-specific numbers of colonised mothers, as reported in Figure 2, were calculated by multiplying country-specific numbers of births by the inverse-logit of the corresponding parameter $\mu_m$ for each posterior sample.
Note that predictive checks for the model on country-level GBS colonisation prevalence and for the model described in the following subsection, on early-onset invasive GBS disease risk, are presented in 1 .
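The final step of this subsection (inverse-logit of each posterior draw of the country logit-prevalence, multiplied by births, then summed by region) can be sketched as follows. This is a minimal illustration rather than the code used in the analysis: the function names, the three hypothetical countries and the simulated draws standing in for real posterior samples are all assumptions.

```python
import numpy as np

def inv_logit(x):
    """Map values on the logit scale to probabilities."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def colonised_mothers(mu_samples, births):
    """Colonised mothers per posterior draw and country.

    mu_samples: (n_draws, n_countries) draws of country logit-prevalence.
    births: (n_countries,) numbers of births.
    """
    return inv_logit(mu_samples) * np.asarray(births, dtype=float)

def summarise(draws):
    """Posterior median and 95% posterior interval."""
    lower, median, upper = np.percentile(draws, [2.5, 50.0, 97.5])
    return median, lower, upper

# Hypothetical inputs: simulated draws stand in for real posterior samples.
rng = np.random.default_rng(0)
mu_samples = rng.normal(loc=[-1.7, -1.4, -2.0], scale=0.1, size=(4000, 3))
births = np.array([1_000_000, 500_000, 250_000])

per_country = colonised_mothers(mu_samples, births)
regional_total = per_country.sum(axis=1)  # sum within each draw, then summarise
median, lower, upper = summarise(regional_total)
print(f"{median:,.0f} ({lower:,.0f} - {upper:,.0f})")
```

Summing within each draw before taking percentiles keeps the regional interval coherent with the country-level uncertainty, which is the point of propagating posterior samples rather than point estimates.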

Early-onset invasive GBS disease
Thirty studies with at least 200 GBS-colonised pregnant women were identified in a recent review 4; two studies with only term babies were not included in this analysis, as it is possible that risks in these studies do not reflect risk in the general population 5. Early-onset invasive GBS disease (EOGBS) was defined based on blood or cerebrospinal fluid culture. The model is described below:

$$y_i \sim \text{Binomial}(n_i, p_i), \qquad \text{logit}(p_i) = \alpha_i + \beta\,\text{IAP}_i, \qquad \alpha_i \sim \text{Normal}(\mu_{\text{EO}}, \sigma)$$

Here, $y_i$ represents the number of early-onset GBS cases in study i. This number follows a binomial distribution with parameters $n_i$, the sample size of study i (i.e. number of GBS-colonised mothers), and $p_i$, the risk in study i. $\alpha_i$ are study-specific intercepts, normally distributed with location parameter $\mu_{\text{EO}}$ and scale $\sigma$. $\beta$ is the coefficient of the association between study-level intrapartum antibiotic use, $\text{IAP}_i$, and risk of EOGBS. Priors were: for $\mu_{\text{EO}}$, Normal(-4, 1), which corresponds to the expected low risk of iGBS disease 6; Uniform(0, 5) and Normal(0, 1) were used for $\sigma$ and $\beta$. The prior predictive distribution for this model is presented in 1.
To estimate country-level numbers of EOGBS cases, we combined two quantities. The first was the country-specific number of live births to GBS-colonised mothers, calculated from the total number of births, stillbirth risk and estimated GBS colonisation prevalence. The second was the risk $\text{logit}^{-1}(\mu_{\text{EO}} + \beta\,\text{IAP}_m)$, where $\mu_{\text{EO}}$ is the location parameter of the study-specific intercepts, $\beta$ is the coefficient for intrapartum antibiotic use, and $\text{IAP}_m$ is the country-specific intrapartum antibiotic prophylaxis (IAP) coverage estimated by Le Doare and colleagues 7, which was incorporated in our analysis as a fixed value rather than an uncertain quantity. Of note, for countries for which Le Doare and colleagues did not estimate IAP coverage, we assumed zero coverage of IAP in developing countries (N = 87), as this was the coverage in 56% of developing countries with estimated coverage, and assumed 80% coverage in developed countries (N = 16). Because these values are fixed, our results reflect uncertainty in other parameters but not in this parameter. As a sensitivity analysis, if IAP coverage in countries with above-zero (assumed or estimated) coverage were, for example, (i) 5% higher or (ii) 5% lower than our current assumptions, estimated numbers of EOGBS cases would be 224,000 (110,600–451,500) and 240,000 (111,900–477,500), respectively.
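The calculation described above can be sketched as follows, under stated assumptions. The function and argument names are hypothetical, the parameter values are invented for illustration, IAP coverage is entered as a proportion (the scale used in the actual model is not stated here), and stillbirths are removed with a single multiplicative factor, which simplifies the adjustment described in the stillbirth subsection.

```python
import numpy as np

def inv_logit(x):
    """Map values on the logit scale to probabilities."""
    return 1.0 / (1.0 + np.exp(-np.asarray(x, dtype=float)))

def eogbs_cases(births, stillbirth_risk, colonisation_prev, mu, beta, iap_coverage):
    """Expected early-onset cases for one country and one posterior draw.

    mu: location parameter of the study-specific intercepts (logit scale).
    beta: coefficient for intrapartum antibiotic use (logit scale).
    iap_coverage: fixed country-level IAP coverage, assumed in [0, 1].
    """
    live_births = births * (1.0 - stillbirth_risk)
    live_births_colonised = live_births * colonisation_prev
    risk_given_colonisation = inv_logit(mu + beta * iap_coverage)
    return live_births_colonised * risk_given_colonisation

# Hypothetical country and parameter values, for illustration only.
cases_no_iap = eogbs_cases(1_000_000, 0.02, 0.18, mu=-4.0, beta=-1.5, iap_coverage=0.0)
cases_with_iap = eogbs_cases(1_000_000, 0.02, 0.18, mu=-4.0, beta=-1.5, iap_coverage=0.8)
```

With a negative coefficient, higher coverage lowers the expected number of cases, which is the direction of the sensitivity analysis reported above. In a full analysis the function would be evaluated once per posterior draw of the parameters and of colonisation prevalence.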
We also performed a secondary analysis that combined information on maternal GBS colonisation prevalence, risk of EOGBS in babies born to mothers with GBS colonisation and data from studies with direct incidence estimates in all births. These different study types were linked using a parameter that corresponds to the probability of reporting incident cases, assumed common for all incidence studies, and through functions of parameters estimated in this and the previous section, including the parameter.
Ten incidence studies considered to be less subject to bias and described in 8 were included. The equation used is shown below and described in detail in 1 (Table S3):

$$\lambda_i = \rho \cdot \text{logit}^{-1}(\alpha'_i + \beta\,\text{IAP}_m) \cdot \pi'_i$$

where $\lambda_i$ represents the underlying risk of confirmed EOGBS in all births in incidence study i; $\rho$ is the aforementioned parameter that corresponds to reporting probability; and $\alpha'_i$ and $\pi'_i$ are, respectively, the estimated study-specific intercept for the risk of EOGBS given colonisation in study i, and the estimated study-specific prevalence of maternal GBS colonisation based on the model in the previous section. $\text{IAP}_m$ represents IAP coverage in country m, assumed to correspond to coverage in the incidence study i population.
Whilst this framework allowed inclusion of additional data in the estimation, calculation of country-specific numbers of EOGBS cases was as outlined above. In the different analyses described in this subsection, we do not discriminate between microbiological- and risk factor-based IAP approaches; this might have led to underestimation of EOGBS risk in countries with primarily risk factor-based IAP policy.

Late-onset invasive GBS disease
It has been argued that incidence studies, i.e. studies that directly estimate incidence of iGBS in all births, only capture a fraction of all incident cases. Evidence comes from studies that estimated under-reporting (e.g. an early study by Heath and colleagues in the United Kingdom 9), from a review on GBS incidence studies performed in low- and middle-income countries 10 and from individual studies that discuss under-ascertainment and/or under-reporting 11. Under-estimation is thought to occur, probably for different reasons, for both EOGBS and late-onset invasive GBS disease (LOGBS) incidence. Although it is possible that the degree of under-estimation varies depending on timing of disease onset, here we assumed that incidence under-estimation was of similar degree for both EOGBS and LOGBS. Whilst it was possible to indirectly estimate the number of EOGBS cases by combining maternal GBS colonisation prevalence estimates and estimates of the risk of EOGBS given maternal GBS colonisation, the same approach cannot be used for LOGBS. For this reason, for incidence studies with appropriate follow-up duration that assessed the incidence of both EOGBS and LOGBS cases (N = 20, reviewed in 8), we estimated relative frequencies of these presentations and applied them to our estimates of country-specific EOGBS incidence. The following model was used:

$$y_{ij} \sim \text{Binomial}(N_i, p_{ij}), \qquad \text{logit}(p_{ij}) \sim \text{Normal}(\mu_j, \sigma)$$

The total number of iGBS cases in each study is $N_i$. We assumed the number of LOGBS cases in study i performed in region j, $y_{ij}$, follows a binomial distribution, with the study-specific logit-proportion, $\text{logit}(p_{ij})$, being normally distributed with location parameter corresponding to the region-specific proportion at the logit scale, $\mu_j$. Posterior estimates are shown in Figure S3. We assumed Normal(0, 1) priors for the hyperparameters of this model.
Note that here and throughout this analysis, whenever a Normal distribution was used as prior for a scale parameter, the parameter was constrained to be positive. Region-specific posterior estimates of the relative frequency of LOGBS were then applied to country-specific numbers of EOGBS cases, as estimated in the previous section.
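One way to apply a relative frequency to EOGBS case numbers is sketched below. The conversion is an assumption made for illustration and is not taken from the text: if p is the proportion of all iGBS cases that are late-onset, then the ratio of late-onset to early-onset cases is p / (1 - p). The function name and the numbers are hypothetical.

```python
def logbs_from_eogbs(eogbs_cases, prop_late):
    """Late-onset cases implied by early-onset cases and the late-onset share.

    prop_late: proportion of all iGBS cases that are late-onset, in [0, 1).
    """
    if not 0.0 <= prop_late < 1.0:
        raise ValueError("prop_late must be in [0, 1)")
    # LOGBS / EOGBS = p / (1 - p) when p = LOGBS / (EOGBS + LOGBS).
    return eogbs_cases * prop_late / (1.0 - prop_late)

# Hypothetical example: equal shares imply equal case numbers.
late_cases = logbs_from_eogbs(1000.0, 0.5)
```

In a full analysis this would be evaluated for each posterior draw, pairing a draw of the region-specific proportion with a draw of the country-specific EOGBS case number.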

Mortality during iGBS (CFR)
Data from 47 and 29 studies were used to estimate case fatality rates (CFR) of EOGBS and LOGBS cases, respectively. The number of EOGBS cases in these studies ranged from 1 to 517 and the number of LOGBS cases from 3 to 373. In addition to analyses including all these studies, reviewed in 8, we also performed a sensitivity analysis that only included studies with more than 10 cases and with appropriate follow-up (i.e. 0–6 or 0–7 days for EOGBS data and 7–89 or 7–90 days for LOGBS data). These estimates were similar to the results presented in Table S5, with one exception: EOGBS CFR in the Latin America and Caribbean region was lower than in countries in the developed group (posterior medians ~2% versus ~6%). We estimated region-specific CFR using the model below:

$$d_i \sim \text{Binomial}(n_i, p_i), \qquad \text{logit}(p_i) \sim \text{Normal}(\mu_j, \sigma_{\text{within}}), \qquad \mu_j \sim \text{Normal}(\mu_g, \sigma_{\text{between}})$$

Whilst model structure was similar for EOGBS and LOGBS, two separate models were used since EOGBS and LOGBS have different compositions of clinical syndromes (sepsis and meningitis). The following description is valid for both models: $d_i$ and $n_i$ are the numbers of deaths and cases in study i; $\mu_j$ represents region-specific logit-CFR, and $p_i$, study-specific CFR; $\sigma_{\text{between}}$ and $\sigma_{\text{within}}$ are scale parameters that represent between-region and within-region between-study variations. Posterior estimates are shown in Table S5 and Figure S4.
For both EOGBS and LOGBS CFR models, we assumed Normal(0, 1) priors for the hyperparameters. Estimates using other prior assumptions are presented in Table S8.
The prior predictive distribution for the EOGBS CFR model suggests that most datasets compatible with our prior assumption would have higher-than-observed CFR. However, as shown in Table S8, similar results are obtained with other prior assumptions. Of note, we also fit a model with region-specific scale parameters for EOGBS CFR and obtained similar results.
Region-specific CFRs (the inverse-logit of the region-specific logit-CFR) were applied to estimated numbers of cases (either EOGBS or LOGBS). To estimate the number of deaths in children who developed iGBS in 2020, we also used information on skilled birth attendance coverage. We assumed that children without skilled birth attendance would suffer high mortality, a fixed value of 90%, if they developed EOGBS. This is similar to the average mortality assumed by Seale et al 3, although here we did not introduce uncertainty in this parameter; as stated in the manuscript, uncertainty intervals thus do not reflect our lack of data on this risk. For children who had access to skilled birth attendance and developed EOGBS and for all children who developed LOGBS, we applied the CFRs estimated above. For five countries, skilled birth attendance data were missing and we used the regional median.
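The death calculation described above can be written as a short sketch. The function and argument names and the example numbers are hypothetical; only the fixed 90% value and the rule for who receives which CFR come from the text.

```python
def igbs_deaths(eogbs_cases, logbs_cases, cfr_eogbs, cfr_logbs,
                sba_coverage, cfr_without_sba=0.90):
    """Expected iGBS deaths for one country and one posterior draw.

    sba_coverage: proportion of births with a skilled birth attendant.
    cfr_without_sba: fixed EOGBS fatality assumed without skilled attendance.
    """
    # EOGBS: estimated CFR with skilled attendance, fixed 90% without it.
    eogbs_deaths = eogbs_cases * (
        sba_coverage * cfr_eogbs + (1.0 - sba_coverage) * cfr_without_sba
    )
    # LOGBS: estimated CFR for all children, regardless of attendance.
    logbs_deaths = logbs_cases * cfr_logbs
    return eogbs_deaths + logbs_deaths

# Hypothetical example values.
deaths_full_coverage = igbs_deaths(1000.0, 500.0, 0.10, 0.05, sba_coverage=1.0)
deaths_no_coverage = igbs_deaths(1000.0, 500.0, 0.10, 0.05, sba_coverage=0.0)
```

Setting `sba_coverage=1.0` for every country corresponds to the analysis without the skilled birth attendance assumption, in which all children receive the estimated CFRs.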

Neurodevelopmental impairment after iGBS
Most studies assessing the long-term risk of neurodevelopmental impairment (NDI) after iGBS did not include a comparator group. To use data from the studies reviewed in 12, in addition to data from a large cohort study in Denmark 13, we therefore conducted a meta-analysis of the risk of impairment in children with a history of iGBS, rather than of the association between iGBS and NDI. Unpublished data (Proma Paul, personal communication) collected in five low- and middle-income countries (Argentina, India, Kenya, Mozambique, South Africa) on NDI risk after iGBS were also used in this estimation; these data (henceforth, LMIC-NDI data) were collected in a recent multi-centre study that used multiple direct assessment tools 14 and a multi-domain definition of NDI 15. As mentioned in the manuscript, data from the Argentinian study were not included in this analysis, as the low proportion of identified iGBS survivors who were assessed might have been linked to selection bias. In the study in South Africa, a similarly low proportion of eligible iGBS survivors had NDI assessment; however, the clinical characteristics of these children did not differ significantly from those of a larger cohort of iGBS survivors of which they were part.
As recent data suggest 13 , risk of NDI varies by GBS syndrome; for this reason, we first estimated the distribution of iGBS survivors by clinical presentation (sepsis and meningitis). We included 24 and 14 studies with data on the proportions of EOGBS and LOGBS cases, respectively, diagnosed as meningitis 8 .
These proportions were modelled hierarchically. Because the number of participants with GBS meningitis was limited in the LMIC-NDI data (2 of the 4 sites had fewer than 10 participants with GBS meningitis), we estimated a global risk of NDI after GBS meningitis, rather than region-specific risks. To quantify the risk of NDI after GBS sepsis, data were available from 5 studies performed in high-income countries, four described in the Supplementary Appendix of a recent systematic review 12 and the Danish cohort study mentioned above; a meta-analysis of these studies was performed to estimate NDI risk after GBS sepsis in high-income countries. The LMIC-NDI data, specifically generated to address the data gap on NDI risk related to GBS sepsis in low- and middle-income countries, were used to model risk of NDI in these countries (range 22–31 participants with history of GBS sepsis). In addition to modelling risks of moderate and severe impairment, for each syndrome and also for each country group (high-income and low- and middle-income groups) in the GBS sepsis-specific estimation, we also modelled risk of impairment of any severity, which includes milder forms of impairment. However, results for the latter estimation are only presented below, not in the main text: given the variation in study design, case ascertainment and methods used to diagnose NDI, we believe risk of moderate and severe impairment is the most appropriate outcome to report, as it is more likely to be consistent across studies and settings. The prior used corresponds to the expectation that NDI risk after GBS sepsis is lower than the risk after GBS meningitis.
Below we show the prior predictive distribution for one of the models. As can be seen in Table S8, prior assumptions for the parameter − in the analysis on GBS sepsis in low-and middle-income countries, that involves a small number of studies, have an effect on the uncertainty of estimates.  with GBS sepsis in low-and middle-income countries are predicted to develop NDI (any severity). Applying these risks to numbers of survivors, we estimated 107,000 NDI cases (42,600 -253,900).
In a secondary analysis that combines all the data on NDI risk after GBS sepsis, i.e. combines studies from low- and middle-income countries and high-income countries, risk of moderate and severe NDI was 4.8% (1.7–11.1) and the estimated number of moderate and severe NDI cases was 27,700 (12,100–64,600).

Stillbirths attributed to GBS
To our knowledge, data are not available on the association between stillbirth risk and maternal GBS colonization; for this reason, our approach was to model the proportion of all stillbirths with GBS infection.
Six studies performed after 2000 on the proportion of stillbirths with evidence of iGBS, and reviewed in 17, were included in this analysis; of note, one of the six studies also included data collected before 2000.
The underlying assumption is that detection of GBS in stillbirth tissues, as opposed to just skin, implies invasive GBS infection. We used the following prior assumptions: Normal(-3, 4) for the location parameter, which corresponds to having most of the prior probability distribution between 0.06 and ~25%, consistent with results of studies performed before 2000 and reviewed by Seale and colleagues; and Normal(0, 1) for the between-region scale parameter, which is based on the assumption that between-region variation is limited. Posterior estimates are presented in Figure S2 and Table S7. We also fit a model with region-specific between-study scale parameters; since results were similar, they are not presented.
In analyses described in the preceding subsections, country-specific numbers of stillbirths were subtracted from country-specific numbers of births; in particular, all stillbirths estimated to be linked to GBS were subtracted from numbers of GBS-colonised pregnant women.
Maternal disease
Four studies were identified in a recent review 20 that provided information on the risk of maternal GBS disease; three of these studies used deliveries as the denominator, whilst the other used pregnancies. In addition to these studies, we included a study published by Collin

Preterm birth associated with maternal GBS colonisation
GBS might also indirectly cause morbidity and mortality by increasing the risk of prematurity. Here we used data reviewed by Bianchi-Jassir and colleagues 22 to quantify this association. As in the review, we excluded studies that used urine samples.
Firstly, we performed a Bayesian random-effects meta-analysis of case-control studies. The estimated odds ratio (posterior median and 95% interval) was 1.83 (1.04 -3.07).

Meta-analysis of case-control studies
Posterior mean and standard deviation of the random-effects location parameter in the case-control meta-analysis were used as priors in the analysis of the other study types. Data from 28 studies were used in the meta-analysis of cohort and cross-sectional studies (Figure S7):

$$\text{logit}(p_{it}) = \alpha_i + \beta_i x_t$$

Here, $\beta_i$ represents study-specific coefficients of the association between maternal GBS colonisation and prematurity; $\alpha_i$ are study-specific intercepts; $p_{it}$ are study- and group (indexed by t, the colonisation status)-specific probabilities of prematurity; $x_t$ is 0 for babies born to mothers who are not GBS colonised and 1 for babies of mothers who are.
We used the odds ratio, the exponential of the pooled coefficient, as the measure of association. National estimates of preterm risk 23 were used, together with the odds ratio and country-specific maternal GBS colonisation prevalence, to calculate the excess number of preterm births associated with GBS colonisation. Two approaches were used: in the first, the population attributable fraction formula was used; in the second, we solved the system of equations below for each posterior sample:

$$r_j = c_j\, r_{1j} + (1 - c_j)\, r_{0j}, \qquad \text{OR} = \frac{r_{1j} / (1 - r_{1j})}{r_{0j} / (1 - r_{0j})}$$

where $r_{1j}$ represents risk of prematurity for pregnant women with GBS colonisation (exposed group) in country j; $r_{0j}$ represents risk for pregnant women who are not colonised by GBS; $r_j$ is the country-level risk of prematurity; and $c_j$ represents prevalence of maternal GBS colonisation in country j.
Using the first approach, the excess number of preterm births estimated to be associated with GBS exposure was 596,600 (41,700-1,343,100), whilst using the second approach this quantity was 518,100 (36,900-1,142,300). The latter is presented in the Results section.
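The two approaches can be compared in a small sketch for a single country and a single posterior draw. Several things here are assumptions rather than details from the text: the system is taken to consist of the mixture identity (overall risk as a prevalence-weighted average of the two group risks) and the odds ratio definition; the population attributable fraction uses Levin's formula with the odds ratio standing in for the relative risk; and all names and input values are hypothetical.

```python
import math

def solve_group_risks(overall_risk, prevalence, odds_ratio):
    """Solve for preterm risk in non-colonised (r0) and colonised (r1) women.

    Substituting r1 = OR*r0 / (1 + (OR-1)*r0) into
    overall_risk = prevalence*r1 + (1-prevalence)*r0 gives a quadratic in r0.
    """
    a = (1.0 - prevalence) * (odds_ratio - 1.0)
    b = 1.0 + (odds_ratio - 1.0) * (prevalence - overall_risk)
    root = math.sqrt(b * b + 4.0 * a * overall_risk)
    r0 = 2.0 * overall_risk / (b + root)  # stable form, also valid when OR == 1
    r1 = odds_ratio * r0 / (1.0 + (odds_ratio - 1.0) * r0)
    return r0, r1

def excess_preterm_system(births, overall_risk, prevalence, odds_ratio):
    """Excess preterm births from the solved group-specific risks."""
    r0, r1 = solve_group_risks(overall_risk, prevalence, odds_ratio)
    return births * prevalence * (r1 - r0)

def excess_preterm_paf(births, overall_risk, prevalence, odds_ratio):
    """Excess preterm births from Levin's PAF, using the OR in place of the RR."""
    excess_ratio = prevalence * (odds_ratio - 1.0)
    paf = excess_ratio / (1.0 + excess_ratio)
    return births * overall_risk * paf

# Hypothetical country: 10% preterm risk, 18% colonisation, odds ratio 1.8.
r0, r1 = solve_group_risks(0.10, 0.18, 1.8)
by_system = excess_preterm_system(1_000_000, 0.10, 0.18, 1.8)
by_paf = excess_preterm_paf(1_000_000, 0.10, 0.18, 1.8)
```

Because an odds ratio above 1 overstates the corresponding relative risk, the formula-based figure is larger than the one from the solved system in this sketch, the same ordering as the two estimates reported above.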

Computational methods
Bayesian analyses were performed using PyStan, the interface for Stan libraries in Python 24; code is available upon request. Centered parameterisations were initially used for the hierarchical models; if divergences persisted after modifying the 'adapt_delta' parameter of the Hamiltonian Monte Carlo algorithm, non-centered parameterisations were used, as described by Betancourt and Girolami 25.
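The difference between the two parameterisations can be illustrated outside Stan with a short NumPy sketch. It is not the model code used in the analysis, and the function names and values are hypothetical. In the centered form, study-specific intercepts are drawn directly from Normal(mu, sigma); in the non-centered form, a standard normal variable is drawn and then shifted and scaled, which gives the same distribution for the intercepts.

```python
import numpy as np

def centered_intercepts(mu, sigma, n_studies, rng):
    """Centered form: alpha_i ~ Normal(mu, sigma)."""
    return rng.normal(loc=mu, scale=sigma, size=n_studies)

def non_centered_intercepts(mu, sigma, n_studies, rng):
    """Non-centered form: z_i ~ Normal(0, 1), alpha_i = mu + sigma * z_i."""
    z = rng.standard_normal(n_studies)
    return mu + sigma * z

# Hypothetical values; a large sample makes the two forms easy to compare.
rng = np.random.default_rng(1)
centered = centered_intercepts(-4.0, 0.5, 200_000, rng)
non_centered = non_centered_intercepts(-4.0, 0.5, 200_000, rng)
print(centered.mean(), non_centered.mean())
print(centered.std(), non_centered.std())
```

In a sampler, the practical difference is that the non-centered form samples the standardised variable, which is a priori independent of the scale parameter; this tends to help when few studies inform the between-study scale.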

Results Supplement
Figures
Figure S1. Incidence of early-infancy iGBS by region.

Figure S2. Proportion of stillbirths linked to GBS infection (x-axis). The y-axis indicates countries where studies were performed, or regional estimates (higher values of y-axis coordinates). For study-specific estimates, different shades of red correspond to 2.5–97.5, 25–75, and 40–60 percentile intervals; posterior medians are represented by vertical red lines and the observed proportion in each study by a black circle. Note that estimates for Oceania and Latin America regions were predicted from data available from other regions (green horizontal bar).

Figure S6. Risk of moderate and severe NDI or any severity NDI after GBS sepsis in low- and middle-income countries (LMIC; first two panels) and high-income countries (HIC; third and fourth panels).

Figure S7. Meta-analysis on the association between maternal GBS colonisation and preterm births. Data from case-control studies informed the prior distribution for the meta-analysis of cohort and cross-sectional studies, shown below. Black circles represent odds ratio in each study.
Different shades of red correspond to 2.5–97.5, 25–75, and 40–60 percentile intervals; posterior medians are represented by vertical red lines.

Figure S8. Estimates of the risk of maternal morbidity due to GBS infection.

Tables
Table S1. Estimated numbers, in thousands, of GBS-colonised pregnant women and region-specific prevalences of GBS colonisation. We used country-specific numbers of births as weights to calculate global and region-specific prevalences. The last two digits of each number in the second column were rounded down.

GBS colonisation prevalence (%)
Sub-Saharan Africa 6,000 (5,100–7,100) 16

Table S6. Estimated numbers of deaths in children who developed iGBS in 2020 by region. These estimates differ from estimates shown in Table 2 in that here all children, regardless of access to skilled birth attendance, were assumed to have CFR presented in Table S5.

Maternal GBS colonisation
- Majority of studies used culture methods; PCR methods are likely more sensitive;
- Within-country geographical variation not modelled;

Early-onset iGBS risk (Approach I)
- High heterogeneity in risks in individual studies, possibly due to differences in design, diagnostics and study population;
- Only culture-confirmed EOGBS considered, which might lead to under-estimation depending on culture sensitivity;
- Most studies from high-income countries;
- By not differentiating between microbiology screening-based IAP and risk factor-based IAP, which might be less effective in reducing EOGBS 26, we might have underestimated the number of EOGBS cases where the latter approach is used;

Early-onset iGBS risk (Approach II)
- Although this evidence synthesis model partially accounts for the likely under-reporting/under-ascertainment in studies directly estimating GBS incidence, additional, context-specific information on degree of under-reporting and under-ascertainment would improve estimation;

Late-onset iGBS risk
- An assumption in modelling this outcome is that the under-estimation of EOGBS and LOGBS case numbers is similar; however, it is not implausible that under-reporting is higher in EOGBS (e.g. due to home delivery) or in LOGBS (e.g. limited capacity or hesitation for cerebrospinal fluid sampling);

Case fatality rates (EOGBS and LOGBS)
- Several studies with small number of cases (publication bias?);
- Possible bias due to case ascertainment in some settings, as suggested in 27;

Neurodevelopmental impairment after GBS meningitis
- Limited data outside the USA and Europe;
- Variation in methods of NDI assessment and case capture between studies;
- Absence of an unexposed (children with no history of iGBS) group in many of these studies;

Neurodevelopmental impairment after GBS sepsis
- Variation in risk observed between studies;
- For both sepsis and meningitis calculations, it is difficult to define a counterfactual risk that could be reasonably applied globally when estimating excess number of cases;

Stillbirths
- Limited data from Asia;

Maternal morbidity linked to GBS
- All studies from high-income countries;
- Not all maternal infections caused by GBS are captured by definitions.

Preterm births associated with maternal GBS colonisation
- Considerable variation in study design, including timing of exposure assessment and study population (e.g. exclusion or not of women with recent antibiotic use)