Transmission dynamics and control of multidrug-resistant Klebsiella pneumoniae in neonates in a developing country

Multidrug-resistant Klebsiella pneumoniae is an increasing cause of infant mortality in developing countries. We aimed to develop a quantitative understanding of the drivers of this epidemic by estimating the effects of antibiotics on nosocomial transmission risk, comparing competing hypotheses about mechanisms of spread, and quantifying the impact of potential interventions. Using a sequence of dynamic models, we analysed data from a one-year prospective carriage study in a Cambodian neonatal intensive care unit with hyperendemic third-generation cephalosporin-resistant K. pneumoniae. All widely-used antibiotics except imipenem were associated with an increased daily acquisition risk, with an odds ratio for the most common combination (ampicillin + gentamicin) of 1.96 (95% CrI 1.18, 3.36). Models incorporating genomic data found that colonisation pressure was associated with a higher transmission risk, indicated sequence type heterogeneity in transmissibility, and showed that within-ward transmission was insufficient to maintain endemicity. Simulations indicated that increasing the nurse-patient ratio could be an effective intervention.

Transmission dynamics and control of multidrug-resistant Klebsiella pneumoniae in neonates in a developing country Thomas Crellen 1,2 *, Paul Turner 1,2,3 , Sreymom Pol 1,3 , Stephen Baker 2,4 , To Nguyen Thi Nguyen 4 , Nicole Stoesser 2 , Nicholas PJ Day 1,2 , Claudia Turner 1,2,3 , Ben S Cooper 1,2 * Introduction Infections with multidrug-resistant Enterobacteriaceae constitute a major threat to public health in all regions of the world (Schwaber and Carmeli, 2008;Theuretzbacher, 2017) and extended spectrum b-lactamase (ESBL) producing and carbapenem-resistant Enterobacteriaceae have been prioritised by the World Health Organization as pathogenic bacteria in need of novel therapeutics (World Health Organization, 2014;World Health Organization, 2017). These organisms pose the highest risk to subgroups of patients such as those undergoing surgery, requiring invasive devices, and neonates (Peleg and Hooper, 2010;Goldmann, 1981). Antimicrobial resistance is of particular concern in developing (lower and middle-income) countries where the estimated per capita mortality from drug-resistant bacteraemia is far greater than in high-income countries and where last-line antibiotics may be unavailable or unaffordable (Lim et al., 2016;Zaidi et al., 2005). The high risk of difficult-to-treat nosocomial infections threatens to undermine patient confidence in developing world hospitals and health systems (Dondorp et al., 2018).
In this paper, we focus on Klebsiella pneumoniae, which is amongst the most clinically important multidrug-resistant Gram-negative bacteria in developing country settings (Fox-Lewis et al., 2018;Zellweger et al., 2017;Musicha et al., 2017). Genomic studies characterising the population structure of K. pneumoniae have revealed a complex consisting of three separate species (K. pneumoniae, K. quasipneumoniae, K. variicola) that are indistinguishable by culture or standard biochemical assays (Holt et al., 2015). Where these isolates remain undifferentiated by molecular assay, we refer to them as K. pneumoniae sensu lato (in the broad sense).
While there is currently widespread concern about Gram-negative bacteria as an emerging threat due to high levels of plasmid-borne resistance (Nordmann et al., 2011), pathogens such as K. pneumoniae have been considered a major problem in nosocomial settings for over half a century (Yow, 1955). Research during the 1960s identified a number of drivers of Klebsiella colonisation and infection in hospital settings including invasive devices (Mertz et al., 1967), environmental contamination (Kresky, 1964), introduction from the community (Kessner and Lepper, 1967), person-toperson transmission (Weil et al., 1966), endogenous selection from antibiotic pressure (Selden et al., 1971) and transient carriage on the hands of health-care workers (Adler et al., 1970). In a comprehensive review of this body of work, Montgomerie concluded that 'The likely means of transmission of Klebsiella is via the hands of hospital staff members' (Montgomerie, 1979).
More recently, carriage studies in high-income hospital settings in temperate regions have used whole-genome sequencing to show the critical importance of asymptomatic carriage for understanding the epidemiology of K. pneumoniae, establishing a firm link between gastrointestinal carriage and clinical infection (Martin et al., 2016;Gorrie et al., 2017;Gorrie et al., 2018). In contrast to studies in high-income countries where multidrug-resistant K. pneumoniae is typically rare, prospective carriage studies in hospitalised paediatric populations in developing countries in Africa and Asia have reported hyperendemic levels of ESBL-producing Enterobacteriaceae including K. pneumoniae (Andriatahina et al., 2010;Roberts et al., 2019;Founou et al., 2019;Turner et al., 2016) consistent with the greater burden of disease due to these organisms in lower income settings (Lim et al., 2016;Musicha et al., 2017).
Despite their clinical importance, there are major gaps in our knowledge of the epidemiology of multidrug-resistant K. pneumoniae. First, while a number of recent investigations of localised hospital outbreaks in high-income settings have provided evidence that long-term environmental contamination of sinks and other sites may play a role (Clarivet et al., 2016;Decraene et al., 2018;Mathers et al., 2018), the relative importance of persistently contaminated point sources versus patient-to-patient transmission in endemic settings remains unclear. Second, though it is widely assumed that antibiotic exposures play an important role in selecting for multidrug-resistant K. pneumoniae (Baker et al., 2018), such effects have not previously been quantified at the patient level in a way that disentangles antibiotic effects from general exposure to the hospital environment. Third, attempts to quantify transmissibility and how this varies by sequence type (ST) are lacking. Fourth, the impacts of other factors that might plausibly affect transmission including staffing levels, infant breast feeding and use of probiotics have not been explored.
To address these knowledge gaps, we used data collected from a year-long prospective observational carriage study in a Cambodian neonatal intensive care unit and analysis methods which build upon a previously described data-driven stochastic model (Forrester and Pettitt, 2005). We fit four models with logit link functions to estimate the impact of covariates on the daily risk of acquisition of 3GC-R K. pneumoniae s.l.. As these models are unable to identify the force of infection, and genomic data show the ward 3GC-R K. pneumoniae s.l. to be a highly heterogeneous bacterial community, we then fit five linear transmission models to estimate the force of infection for acquisition of each ST. Finally, we use the estimated parameters to provide model-based assessments of the potential impact of hypothetical control measures. We define "acquisition" as the first detection of the organism within an infant following an initial negative swab on admission, which may indicate transmission from other colonised infants or hospital staff, or endogenous selection as a result of antibiotic pressure.

Descriptive epidemiological data
Over the year-long observation period, there were consistently high rates of patient carriage of third generation cephalosporin-resistant (3GC-R) K. pneumoniae sensu lato. Of 333 infants admitted to the neonatal unit, 121 of 289 (42%) were found to be colonised on the first swab taken within 48 hours of ward admission. A further 21 out of 44 (48%) were positive on the first swab that was taken more than 48 hours after admission. Overall, 109/191 (57%) infants who initially screened negative for 3GC-R K. pneumoniae s.l. became positive during their stay in the neonatal unit. Almost all 3GC-R K. pneumoniae s.l. isolates were ESBL producers (1412/1423; 99%), and only 5/1423 (0.35%) were resistant to imipenem. Co-colonisation with 3GC-R E. coli was observed in 52 infants on their first swab, and a further 102 infants became co-colonised with both resistant organisms during their stay on the neonatal unit. Full details on the study population, including blood-stream infections and mortality have been reported previously (Turner et al., 2016), and a summary is provided in Table 1.
The daily counts of infants known to be colonised with 3GC-R K. pneumoniae s.l. (Figure 1A), shows no clear trend but large stochastic fluctuations, which are expected given the small size of the ward (eight beds) and frequent discharges of patients and introduction of colonised infants (imported carriage). A representation of swabbing interval outcomes, as used in the models, is shown in Figure 1B. While the median length of stay was four days, the distribution is highly skewed with a tail of long-staying patients ( Figure 1C). The most frequently used antibiotic combination was ampicillin with gentamicin which was used empirically to treat suspected sepsis in infants admitted from the community and was taken on one fifth (19%) of patient days on the ward. This was followed by imipenem, which was taken on 12% of patient days. Imipenem was typically used when culture results showed non-susceptibility to first-line antibiotic choices and empirically in infants with suspected hospital-acquired infection. All other antibiotic combinations were used at a much lower frequency ( Figure 1D).

Factors associated with carriage acquisition
Infants were prospectively screened for the organism whilst on the ward by culture of rectal swabs on selective media. For infants with negative cultures for 3GC-R K. pneumoniae s.l. on admission to the ward, the outcome (acquisition of the organism) was therefore interval censored between subsequent rectal swabs that were taken a median of 2 days apart (IQR 1, 3 days). In total there were 400 swab outcomes (either negatives or a first positive swab) over 864 patient days from 191 infants with a negative culture at entry. Four models were fitted to the interval censored swab data to determine factors associated with daily risk of carriage acquisition. The best performing single intercept model (model A; Table 2) with the lowest WAIC (see Methods) considered exposure to antibiotics in the previous 96 hours and did not include a term for colonisation pressure (i.e. daily per-patient acquisition risk did not depend on The total height of the peaks shows the ward occupancy on that day. The results from rectal swabs among the 191 infants uncolonised at entry for 3GC-R K. pneumoniae s.l. are shown in panel B, with the window highlighting the swab outcomes from the first thirty five infants uncolonised at entry. Each row represents a patient and each coloured block represents a swab interval, where the width is the number of days in the interval (i.e. time between swabs). Outcomes are shown up to the first swab positive for 3GC-R K. pneumoniae s.l., after which time the patient is assumed to be colonised until discharge. The length of stay distribution for infants in the neonatal unit is shown as a histogram in panel C, where the bin width is two days. An infant's length of stay is the total time in the neonatal unit during the study period, including re-admissions. The 333 infants were present in the neonatal unit for a total of 3417 study days. The proportion of study days when infants took the six most common antibiotic combinations, or other antibiotics, or none are shown in panel D.
the number of other patients who were colonised on a given day). Models with a 48 hour antibiotic exposure period (model B) and those that included a colonisation pressure term (model C) showed slightly worse fits. The covariates associated with reduced daily risk of acquisition were breast milk feeding (odds ratio [OR] 0.69 [95% CrI 0.35, 1.41]) and increasing the number of nurses, for instance three nurses in the ward was associated with an OR of 0.55 (95% CrI 0.15, 1.77) relative to zero nurses (baseline). Male sex was also associated with a reduced risk of acquisition (OR 0.68 Figure 2A for odds ratio posterior distributions. Using the covariate posterior distributions, we also estimated the probability of colonisation for each of the 864 patient days where patients were at risk for acquiring 3GC-R K. pneumoniae s.l.. The median daily probability of first acquisition of 3GC-R K. pneumoniae s.l. for an infant was estimated from the best fitting model as 0.15. There is considerable variability in the risk of acquisition between patient days and the medians of the posterior probability distribution ranges nearly eightfold from 0.047 to 0.35 ( Figure 2B). Table 2. Comparison of models for the risk of acquiring third generation cephalosporin-resistant Klebsiella pneumoniae sensu lato over 864 patient days in a neonatal intensive care unit in Cambodia. Models vary by explanatory variables (A-C) or by permitting the intercept to vary between study months in a hierarchical model (D). Models were fitted on the log-odds scale with a logit link function, hence prior distributions are shown as log-odds. Posterior parameter distributions have been transformed using the logistic function and are shown as probabilities. Prior distributions are normal distributions, shown in brackets are the mean and standard deviation respectively. The risk of becoming colonised with 3GC-R K. pneumoniae s.l. is cumulative over an infant's length of stay and varies in response to interventions, such as consumption of antibiotics. We show the cumulative risk of first acquisition under two scenarios: 1) where an initially four day old, breastfed, female infant remains in the ward for eight days without taking antibiotics or probiotics; Odds Ratio for Acquisition

Patient days
Daily Probability of Acquisition Note that the 864 patient days have been thinned by a factor of five for visualisation. The cumulative risk for different patient scenarios is explored in panel C; a four day old girl, born full term, without severe conditions, breast milk fed and not taking antibiotics or probiotics over eight days in the neonatal unit is shown in blue. The red line shows the same infant, however ampicillin + gentamicin is taken from day three onwards. The lines and points in both cases show the cumulative probability posterior median, and the shaded area shows the 80% credible interval (CrI). In panel D, we took the probability of colonisation for each of the 400 swab interval and binned them into five quantiles. We then compared the expected number of colonisation events predicted by the model with the observed number of colonisation events (squares) in the swab intervals by quantile. In panels A, B and D points represent posterior medians, thick blue/purple lines represents the 80% CrI and thinner black lines represent the 95% CrI. The online version of this article includes the following figure supplement(s) for figure 2:  Table 2). and 2) where an infant with the same characteristics is prescribed ampicillin with gentamicin from day three onwards. The median cumulative risk of acquiring 3GC-R K. pneumoniae s.l. after eight days for the first scenario is 0.57 (80% CrI 0.42, 0.71) and for the second scenario is 0.75 (80% CrI 0.64, 0.85). Although the median cumulative risk between the two scenarios diverges the longer the infants are in the neonatal unit, the uncertainty also increases with time ( Figure 2C).

Swab sensitivity
We estimated the sensitivity of rectal swabs for detecting 3GC-R K. pneumoniae s.l. by examining the swabs that followed a positive from the same patient. There were 936 such swabs which were taken from a patient after at least one swab positive for 3GC-R K. pneumoniae s.l., and 90 (9.6%) of these were negatives. Under the assumption that all negatives following a positive culture are false negatives, the false negative rate posterior median was 0.096 (95% CrI 0.078, 0.12) and the posterior median swab sensitivity was 0.90 (95% CrI 0.88, 0.92). Under the second assumption that three or more consecutive negative swabs following a positive culture represent a true decolonisation event, there were 72 false negatives, giving a false negative rate of 0.073 (95% CrI 0.058, 0.091) and a swab sensitivity of 0.93 (95% CrI 0.91, 0.94).

Model assessment and comparison
The measures of Markov chain convergence showed high effective sample sizes (>400) andR<1.01, indicating that the chains had run for long enough and had mixed well (see Methods and Figure 2figure supplement 1). Model assessment was performed with a posterior predictive check; we estimated the probability of acquisition for each of the 400 swabbing intervals and binned these probabilities into groups defined by the quintiles. We then calculated the expected number of colonisation events in each of the five groups and compared these with the observed number of acquisitions. Within each of these groups, the posterior median of the predicted number of acquisitions was close to the observed number of events, and the observed values were always within the 80% CrI of the model estimates ( Figure 2D). The results from fitting risk factor model A with alternative prior distributions are shown in Figure 2-figure supplement 2; substantially reducing the variance of the priors for the model intercept and covariates had a negligible effect on the posterior parameter estimates. When the intercept was permitted to vary by study month in a hierarchical model (risk factor model D; Table 2), little variation was observed between months; the median posterior baseline probability ranged from 0.20 to 0.23 with wide credible intervals. As these models did not include a colonisation pressure term, the intercept incorporated time-varying changes in the underlying intensity of transmission. The low variance in the monthly intercepts therefore suggests a relatively constant force of infection over the 12-month study period. In models where we included a colonisation pressure term (risk factor model C; Table 2) this was found to have a slightly negative slope for acquisition of 3GC-R K. pneumoniae s.l. (OR 0.96 [95% CrI 0.86, 1.09]). This is surprising, as if patient-to-patient transmission was occurring, we would usually expect the force of infection to increase with the colonisation pressure (Bonten, 2012). The finding therefore suggested that one of the following three possibilities was true: i) patient-to-patient transmission was not occurring at a high frequency in this ward; ii) patient-to-patient transmission was occurring but, because of the continually high ward-level prevalence, variations in the force of infection could not be identified; iii) patient-to-patient transmission was occurring but exposure to the presence of two or more colonised patients presented a similar risk for acquisition as exposure to one. We therefore used K. pneumoniae s.l. whole-genome sequence data to help determine the most plausible scenario.

Klebsiella whole-genome assemblies
We examined whole-genome assemblies of 317 3GC-R K. pneumoniae s.l. isolates cultured from rectal or environmental swabs in the neonatal unit over a four month period (see Methods). A phylogeny based on k-mer distances between assemblies is shown in Figure 3A. Of note is the highly diverse and structured nature of the pathogen population, in contrast to one dominated by a clonal expansion of a single lineage. Overall 62 distinct sequence types were identified in our collection of isolates. The species identified from culture as K. pneumoniae s.l. consists of three distinct subpopulations that meet the criteria for separate species. We isolated all three species from infants in the   (Table 3). Figure 3 continued on next page cohort (K. pneumoniae n = 219, K. quasipneumoniae subspecies similipneumoniae n = 95, K. variicola n = 3), and found diversity similar to that observed in a global collection of K. pneumoniae isolates (Holt et al., 2015), suggesting that the diversity accumulated within a Cambodian neonatal unit over four months is comparable to the diversity of K. pneumoniae globally. Many STs were characterised by only a single carriage isolate, suggestive of importations that were not subsequently transmitted to other patients ( Figure 3B). The STs with the largest number of carriage isolates were ST334 (K. quasipneumoniae; n = 29), ST101 (K. pneumoniae; n = 21), ST1074 (K. pneumoniae; n = 21) and ST45 (K. pneumoniae; n = 17). The most frequent bla ESBL genes in the whole-genome assemblies were CTX-M-15 (201/317; 63%), followed by CTX-M-14 (43/317; 13%) and CTX-M-63 (38/317; 12%).

Transmission models for sequence types
We fitted mechanistic models representing different transmission processes to the ST swab data.
Within the four month period where sequence data were available, there were 171 events for first acquisition of a 3GC-R K. pneumoniae s.l. ST among 150 infants. Among transmission models 1-3 that were initially tested, the model with the best fit to data by WAIC was transmission model 2 (see Table 3 and Methods), which has an intercept (a), representing a constant risk of acquisition, and a slope (b) which scales the risk of acquisition for each infant colonised with a given ST in the ward (i.e. it accounts for colonisation pressure). The model estimated the values for a as 0.0019 (95% CrI 0.0015, 0.0023) and b as 0.0097 (95% CrI 0.0075, 0.012). We then fitted transmission model 2 with a random effect term, where b was permitted to vary by ST and the underlying distribution of b was assumed to follow a beta distribution, with shape hyper-parameters a and b (transmission model 4).  Table 3).  We also fitted transmission model 2 with a random effect term where a was permitted to vary by ST (transmission model 5).
Transmission model 1 has a constant colonisation pressure by ST which was not linked to the daily number of colonised individuals with that ST; this model showed a substantially worse fit to the data by WAIC (Table 3). Transmission model 3, which included terms for colonisation pressure and contamination in the hospital environment also failed to improve the model fit, and the high estimate for l (4.7 95% CrI [2.0, 7.0]) suggests that contamination left by previously colonised infants decays rapidly to background levels, with an estimated environmental half life of 3.6 hours (95% CrI 2.4, 7.6 hours; Table 4). Varying the a parameter by ST (transmission model 5) resulted in a substantially worse fit to data, suggesting there was not enough information in the model to differentiate ST-specific background rates of colonisation.
The central estimates of the force of infection by ST from transmission model 4 are shown in Figure 3C, with estimates from the four most frequent STs highlighted in colour. The uncertainty around these parameter estimates for the four STs is shown in Figure 3D. The daily interval-censored colonisation incidence for the most frequent STs are clustered in time, generally emerging and reaching extinction in the ward within a matter of weeks, suggestive of importation and subsequent patient-to-patient transmission. The incidence and estimated force of infection for the four most frequent STs over a period in the study where all 3GC-R K. pneumoniae s.l. isolates were sequenced are shown in Figure 3E and F. Parameter estimates for all transmission models are shown in Table 3. Model fitting diagnostics showed that Markov chains had converged satisfactorily (see Methods and

Sequence Type SNP diversity
We mapped reads from isolates in two STs with the highest estimated force of infection (ST45 and ST101; see Figure 3) to ST consensus reference genomes. The 21 ST101 carriage isolates had a mean read depth ranging from 25x to 125x (median 57x), and the 17 ST45 carriage isolates had a mean read depth ranging from 20x to 67x (median 56x). All the carriage isolates in both STs had >90% of the genome covered by >5x coverage. We then called and filtered SNPs (see Methods) to determine if the relatedness of carriage isolates within STs was consistent with recent person-to-person transmission. The pairwise number of variants in ST101 isolates between infants ranged from 15 to 68 SNPs (median 30), which was comparable to the variation seen within infants in ST101 (from 18 to 38 SNPs; median 28). Similarly in ST45 the pairwise SNP differences between infants varied from 13 to 223 (median 125), which was comparable to the within-host ST diversity (from 55 to 212 SNPs; median 124). Therefore the SNP diversity observed within and between-hosts was very similar for both ST45 and ST101.

Simulations with agent-based models
We used the posterior parameter estimates obtained from model fitting for forward simulations using a dynamic agent-based model in order to evaluate the potential impact of interventions (see Methods). We first estimated the ward-level reproduction number (R A ) for 3GC-R K. pneumoniae s.l. by simulating the introduction of a single colonised patient into a ward of eight susceptible patients. For all patients, length of stay was sampled from the empirical length of stay distribution ( Figure 1C) and colonised patients had a transmission potential sampled from the posterior hyperparameter distribution from transmission model 4 ( Table 3). The median of the R A distribution was 0.65% and 95% of values fell between 0.36 and 1.09. The distribution of R A values is shown in Figure 4A.
We then simulated the impact of interventions to reduce the rate of 3GC-R K. pneumoniae s.l. acquisition by combining parameter estimates for colonisation pressure from transmission model 4 with the marginal effect of modifiable covariates from risk factor model A ( Table 2). In the first intervention scenario, we varied the proportion of infants given an oral probiotic, in addition to varying the proportion of infants that were colonised on entry (imported cases; 5% or 40%). We used as an outcome the proportion of infants susceptible to 3GC-R K. pneumoniae s.l. on admission that remained uncolonised on discharge. When the importation rate was high (40% colonised on entry; similar to our study population, see Table 1), setting the proportion of infants taking the probiotic to be 0%, 50% or 100% resulted in the median proportion remaining uncolonised as 0.54 (95% interval 0.34, 0.72), 0.56 (95% interval 0.38, 0.72) and 0.59 (95% interval 0.39, 0.75) respectively. In the lower importation setting (5% of infants colonised on entry), setting the proportion of infants taking the probiotic to be 0%, 50% or 100% resulted in the median proportion remaining uncolonised as 0.80 (95% interval 0.54, 0.93), 0.82 (95% interval 0.60, 0.93) and 0.84 (95% 0.61, 0.94), respectively.
In the third scenario, we simulated either three, four or eight nurses in the ward each day, corresponding to infant:nurse ratios of roughly 3:1, 2:1 and 1:1 respectively. Again, we examined this effect in settings with different importation rates. Varying the number of nurses in the ward between three, four and eight in the high importation setting resulted in median proportions of initially uncolonised infants who remained uncolonised throughout their neonatal unit stay of 0.54 (95% interval 0.36, 0.71), 0.61 (95% interval 0.39, 0.78) and 0.81 (95% interval 0.21, 0.97) respectively. In the lower importation setting, varying the infant:nurse ratio resulted in the median proportions remaining uncolonised of 0.81 (95% interval 0.56, 0.93), 0.86 (95% interval 0.62, 0.95) and 0.96 (95% interval 0.32, 0.99), respectively. Of all the simulated interventions therefore, increasing the number of nurses on the ward had the largest impact on reducing colonisation rates. The distributions of the outcome variables from all simulations are shown in Figure 4B, C and D.

Discussion
In a hyperendemic, developing country hospital setting, we analysed the transmission dynamics of 3GC-R K. pneumoniae s.l. prospectively over one year. We compared the support for competing hypotheses about modes of spread, quantified effects of antibiotic exposures as drivers of the epidemic, evaluated risk factors for transmission, and forward simulated to evaluate the potential benefits of interventions.
We found that carriage strains of 3GC-R K. pneumoniae s.l. among neonates constituted a highly diverse population, with considerable intra-host variation within STs. There were frequent acquisitions or detection of resistant K. pneumoniae s.l. that were closely related to strains carried by other infants on the ward at the time of acquisition ( Figure 3E). Within-host diversity of these lineages was similar to between-host diversity in potentially linked cases. Moreover, once genomic information was considered, models incorporating colonisation pressure as a risk factor for acquisition showed substantially better fits to data than models without colonisation pressure (Table 3). Taken together with the lack of persistent environmental contamination (Smit et al., 2018) and the lack of improvement in model fit when long-term environmental contamination was considered, these findings add support to the view that patient-to-patient transmission (much of which is likely   Figure 4. Simulation results from dynamic agent-based models using parameter estimates on acquisition of third generation cephalosporin-resistant (3GC-R) Klebsiella pneumoniae sensu lato among neonates in a Children's Hospital in Cambodia. The distribution of ward reproduction number (R A ) values shown in panel A was obtained by taking 2000 samples from the force of infection posterior distribution, and for each sample running the agentbased simulation 100 times and taking the mean value. The results from simulating counterfactual scenarios with a dynamic agent-based model are shown in panels B, C and D. In B, the proportion of infants taking a probiotic (Lactobacillus acidophilus) on entry to the ward was varied between 0, -.5 and 1 in setting with a high proportion of imported cases (0.4) and a lower proportion of imported cases (0.05). In panel C, the proportion of infants that were breast milk fed was varied was varied between 0.25, 0.5 and 0.9in settings with a high proportion of imported cases (0.4) and a lower proportion of imported cases (0.05). In panel D, the infant nurse ratio was varied between 3:1, 2:1 and 1:1 in settings with a high proportion of imported cases (0.4) and a lower proportion of imported cases (0.05). The outcome measure in all simulations is the proportion of infants susceptible on entry that remained uncolonised with 3GC-R K. pneumoniae s.l. on discharge. The simulated outcomes are displayed as density plots, with dashed lines showing the median value.
to be mediated by contacts with healthcare workers) is the main driver of resistant K. pneumoniae acquisition, and that colonised patients represent the primary reservoir. Notably, this result was only apparent when we incorporated genomic data into our models to estimate transmission parameters by ST.
We are aware of one other prospective study of ESBL-producing Enterobacteriaceae colonisation dynamics from a developing country (Bonneault et al., 2019). This study considered ESBL-producing Enterobacteriaceae colonisation in a neonatal unit in Madagascar over a period of six months, and found similar rates of acquisition with ESBL-producing Enterobacteriaceae to those seen here. Though sequencing data were not used, fitting transmission models to data provided evidence of patient-to-patient and healthcare worker-patient transmission, particularly for ESBL-producing K. pneumoniae and the estimated daily transmission parameter (0.05; 0.008, 0.14) is similar to our estimates for the most common STs ( Figure 3D).
Two other prospective carriage studies of K. pneumoniae have used whole genome sequencing to identify possible transmission events: while a one year study in an adult intensive care unit in Australia found five epidemiologically plausible intra-hospital transmission chains , a one year carriage study in two geriatric wards (also in Australia) found no evidence of patient-topatient transmission chains (Gorrie et al., 2018). Other studies using genomic epidemiology to study resistant K. pneumoniae transmission have been performed in adult wards in high-income settings (Snitkin et al., 2012;Haller et al., 2015;Snitkin et al., 2017). While such studies have also provided support for patient-to-patient transmission playing an important role, such retrospective investigations made in response to reported outbreaks of multidrug-resistant K. pneumoniae might not reflect typical patterns of transmission. A study using proximity sensors to investigate transmission of ESBL-producing Enterobacteriaceae over four months in a long term care facility in France found stronger support for person-to-person transmission as the main route of acquisition for K. pneumoniae, though the evidence for person-to-person transmission of ESBL-producing E. coli was weaker (Duval et al., 2019). This suggests that separate mechanisms may drive the transmission of different ESBL-producing organisms and that this should be considered when analysing carriage data from multiple species of Enterobacteriaceae.
Under the assumption that patient-to-patient transmission was driving the epidemic, we calculated that the rates of transmission within the ward were insufficient to maintain endemic transmission (i.e. the ward-level reproduction number, R A <1) (Cooper et al., 2004). These findings are comparable to a study which determined the relative force of infection between ESBL-producing E. coli and other ESBL-producing Enterobacteriaceae in 13 European intensive care units, finding that the latter (mainly K. pneumoniae) had a transmission rate almost three times greater than the former, and that the single-admission reproduction number for both classes of organisms were well below one (Gurieva et al., 2018). Our central estimate of R A is, however, substantially higher (0.65 compared with 0.17); this difference in estimated transmission potential may reflect differences in staffto-patient ratios, different standards of hygiene and infection control, different patterns of antibiotic use, and differences in the patient population (Dondorp et al., 2018). The findings from both studies indicate that repeated importation into the unit is needed to maintain endemicity of resistant Klebsiella. Imported cases may be acquired from other wards within the same hospital, other hospitals within the referral network, or community transmission. When rates of importation are high, as observed in this setting where up to 43% of infants may have been colonised on initial ward admission (Table 1), even effective interventions are limited in how many acquisition events they can prevent due to a high underlying colonisation pressure (Figure 4). The public health implications are therefore that control measures should be coordinated regionally and targeted to the wider hospital patient referral network (Donker et al., 2012;Ciccolini et al., 2013).
Amongst the most important findings was the consistent association between a patient's antibiotic exposure and an increased risk of acquiring 3GC-R K. pneumoniae s.l. or detection of the organism due to within-host selection. To our knowledge, the role of antibiotics as drivers of carriage dynamics has not previously been explored with appropriate methods to account for time-dependent antibiotic exposures in Gram-negative bacteria, but there are reasons for believing it is likely to be a key mechanism through which antibiotics select for ESBL K. pneumoniae (Tedijanto et al., 2018). With the exception of imipenem, for which there was no association with the risk of acquisition for these predominantly carbapenem-susceptible bacteria (1418/1423 K. pneumoniae s.l. isolates sensitive to imipemen; 99.6%), effect sizes were similar for different antibiotic combinations (median posterior OR~2; Figure 2A). The narrower credible interval for the ampicillin + gentamicin combination (OR 1.96, 95% CrI 1.18, 3.36) reflects the much higher usage of this antibiotic combination compared to others ( Figure 1D). These effects are consistent with hypotheses about antibiotic therapy leading to reduced microbiome diversity and subsequently increasing the risk of colonisation with drug-resistant bacteria, which face less competition from fitter, sensitive strains (Lipsitch et al., 2000). Microbiome analysis has shown that the greatest dysbiosis following antibiotic therapy in healthy adults is four days after treatment starts (Palleja et al., 2018), lending confidence to our model comparison selecting a 96 hour exposure period over a 48 hour period ( Table 2). While we cannot differentiate the effects of antibiotics on i) increasing the susceptibility of an infant to acquisition from another infant, versus ii) endogenous selection for resistant bacteria within that infant (Lipsitch and Samore, 2002) in the risk factor models, the small estimate for a relative to b in transmission model two suggests that background selection plays a relatively small role in acqusition compared with person-to-person transmission ( Table 3).
Breast milk feeding was associated with a reduced risk of colonisation with 3GC-R K. pneumoniae s.l. though uncertainty was large (Figure 2A). This accords with our understanding of the development of a healthy gut microbiome in the early stages of life, which can be adversely affected by the replacement of breast milk by formula (Bäckhed et al., 2015), and the protective effect of a diverse microbiome that competes against potentially pathogenic bacteria (Langdon et al., 2016). In this population, breastfeeding rates were high (90%) though our simulations showed that dropping the proportion to 50% or 25%, as seen in other developing world populations (Lauer et al., 2004), could increase the proportion becoming colonised during admission by around 5% and 8% respectively.
The finding that the oral probiotic Lactobacillus acidophilus was not strongly protective against acquisition was disappointing in light of earlier results that suggested a possible effect in slowing rates of acquisition of ESBL-producing Enterobacteriaceae (Turner et al., 2016). This negative result was shown by the odds ratio, which was located close to zero in the risk factor models for acquisition ( Figure 2A) and supported by the forward simulations which showed only a 4% median decrease in infant colonisation rates when 100% of infants were prescribed probiotics ( Figure 4B). Evidence is still limited about the value of probiotics for neonates. One large randomised trial in rural India reported a beneficial effect from a symbiotic preparation (combining a probiotic, Lactobacillus plantarum, with fructooligosaccharide) in preventing sepsis in infants (Panigrahi et al., 2017). Another randomised trial in pre-term infants in England found no benefit from the probiotic Bifidobacterium breve BBG-001 in preventing necrotising enterocolitis, blood culture positive sepsis or death before hospital discharge (Costeloe et al., 2016). There is also evidence that some probiotics given after antibiotic consumption can impair and delay the recovery of normal gut flora in humans (Suez et al., 2018).
An increased number of nurses on the ward was negatively associated with the risk of acquiring 3GC-R K. pneumoniae s.l.. Our simulation studies using a dynamic agent-based model showed that increasing the nurse:infant ratio from 1:3 to 1:1 could reduce the number of infants becoming colonised in both high and low importation settings by about a quarter. These findings are consistent with results from large observational studies. For example, (Rogowski et al., 2013), in a retrospective cohort study in 67 neonatal units in the USA found a strong association between neonatal unit understaffing and an increased nosocomial infection rate (where understaffing was defined as a nurse-patient ratio below US guidelines for the patient acuity level). Two distinct mechanisms might account for such an association. First, an imbalance between workload and staffing levels may lead to reduced attention to basic infection control measures such as hand hygiene, as has been reported in several studies (Pittet et al., 2006). Second, as nurse:patient ratios decrease a lower proportion of patient contacts will be cohorted as each nurse will need to contact more patients in a shift, substantially increasing the potential for cross-transmission (Archibald et al., 1997;Austin et al., 1999;Hugonnet et al., 2004).
From our model estimates, we observed considerable variation in the risk of acquisition per patient day, with the median posterior probability varying nearly eight-fold from 0.047 to 0.35, showing that even neonates within a single hospital ward constitute a highly heterogeneous population. This challenges the assumptions of compartmental models for nosocomial transmission that treat patients as broadly homogeneous in their risk of acquiring drug-resistant bacteria (Grundmann and Hellriegel, 2006;van Kleef et al., 2013;Domenech de Cellès et al., 2013).
The effect of colonisation pressure was not identifiable when we considered the total number of 3GC-R K. pneumoniae s.l. isolates on a given day, but became identifiable when we considered transmission by ST. From our analysis of the Klebsiella carriage population, we observed three species (K. pneumoniae, K. quasipneumoniae and K. variicola) along with a considerable number (62) of STs. This indicates that Klebsiella in carriage did not reflect a single population, but rather repeated introductions of diverse isolates which were then either spread around the ward over a number of weeks or were not transmitted and became locally extinct.
Comparison of different transmission models strongly supported the inclusion of a colonisation pressure term to account for patient-to-patient transmission. There was also support for a hierarchical model where transmissibility varied by ST, though the STs we estimated to have the highest transmissibility have not been highlighted as dominant in other settings  suggesting that the ST composition within a region may reflect adaptation to local pressures (Stoesser et al., 2015b). The pairwise SNP diversity in two of the major STs (ST45 and ST101) was greater than expected, but the within-host and between-host diversity was similar in both cases, suggesting that transmission within these STs is biologically plausible. Within-host diversity has the potential to hinder the reconstruction of transmission networks (Worby et al., 2014;Didelot et al., 2016), and our results here highlight the importance of capturing within-host diversity in sequencing studies. While within-host diversity was particularly high in ST45, all acquisitions of this ST occurred within a 19 day window (over a possible four month period when carriage isolates were sequenced) with overlapping colonised patient stays, strongly suggesting that cases were epidemiologically related ( Figure 3E). A previous analysis of a subset of the genomic data identified closely related clusters suggestive of transmission, though with a smaller number of pairwise SNPs than we observed (Smit et al., 2018). This discrepancy is likely due to differences in methodology as we mapped isolates to ST-specific reference genomes, which results in a greater proportion of the genome being callable compared with mapping very diverse isolates to a single reference. We cannot entirely rule out less parsimonious explanations for the temporal clustering of STs, such as a transient increase of certain STs in the water supply, though the person-to-person transmission route, mediated by healthcare workers, is most strongly supported by our models.
An important strength of the study is that we considered asymptomatic carriage, rather than clinical isolates, and collected detailed patient-level data from infants who became colonised as well as from infants who remained uncolonised. This allowed us to quantify the factors driving the epidemic, which would not have been possible if we had considered only clinical isolates. Also, by using a prospective design rather than a reactive exploration of an outbreak or unusual cluster of cases our findings should be more representative of typical patterns of transmission. The use of an inferrential approach that accounted for the interval-censored nature of the data and the use of whole genome sequencing were also key factors in developing a quantitative understanding of the transmission dynamics.
Our study has limitations. We sequenced isolates of 3GC-R K. pneumoniae s.l. from a four month period (rather than the full 12 months of the carriage study) due to resource constraints. Such constraints also prevented us from sampling mothers and other family members of infants, and healthcare workers in close contact with infants. Greater density of within-host sampling (Stoesser et al., 2015a;Wymant et al., 2018;Lees et al., 2019), and inclusion of long-read sequencing to investigate plasmid transmission (Conlan et al., 2014) would have also provided a more complete epidemiological picture. Although our analysis accounts for patient heterogeneity and interval censoring, it assumes that a positive culture taken within 48 hours of admission indicates that the patient was colonised on admission, and that cultures had 100% sensitivity for detecting carriage. Such restrictions could all be relaxed using a data augmentation framework (Cooper et al., 2008), but validated implementations of such an approach allowing inclusion of an arbitrary number of covariates are not currently available. Antibiotics were not prescribed randomly, but according to the clinical judgement of the clinicians and in response to the pathology of the patient. It is therefore possible that the effects of antibiotics in the model estimates may be confounded by the clinical severity of the infant. We did attempt to mitigate this by including a covariate for clinical severity, which links to the use of invasive devices and we note that, with the exception of imipenem, all antibiotics show similar effects to each other on the risk of first acquisition.
In summary, this study provides strong evidence for person-to-person within ward transmission of 3GC-R K. pneumoniae s.l. mediated by healthcare workers, estimates key epidemiological parameters, quantifies the role of different antibiotics in driving the epidemic, and highlights interventions with the potential to contribute to control efforts.

Epidemiological and microbiological data
We used data collected prospectively from a neonatal intensive care unit in a children's hospital in Siem Reap, Cambodia between September 2013 (when the neonatal unit newly opened) and September 2014. The study protocol required rectal swabs to be taken within 48 hours of admission and subsequently every two to three days. We assumed that any patient testing positive on their first swab taken within 48 hours of admission was positive on arrival. Patients who had their first swab taken >48 hours from admission but tested negative were included in the analysis, while those that tested positive were omitted. Infants that were re-admitted to the ward and had been colonised prior to first discharge were assumed to still be colonised and were thus excluded. Details of the microbiological treatment of rectal swabs, including resistance assays, have been published previously (Turner et al., 2016).

Probability of acquisition
We estimated the daily probability of colonisation with 3GC-R K. pneumoniae s.l. using a discrete time model with time steps of one day. Each day in the ward a previously uncolonised patient can become colonised. As rectal swabs are not taken on every day of a patient's stay the outcome is interval censored: we know that a negative swab followed by a positive swab indicates that a patient became colonised on some day between the two swabs, but not on which day. If the probability of becoming colonised on day i for patient j is p ij , given the patient is uncolonised at the start of the day, then the probability of remaining uncolonised is (1-p ij ). In interval k for patient j consisting of N kj days, then the probability of remaining uncolonised is: Therefore the probability of becoming colonised (v kj ) is the complement: The outcome for patient j in interval k, Y kj 2 {0,1}, as the patient either becomes colonised (1) or remains uncolonised (0) with 3GC-R Klebsiella. Therefore the likelihood is given by:

Risk factor models for carriage acquisition
The daily probability of becoming colonised (p ij ) is related by the logit link function to a linear function of covariates: Where x 1 , x 2 , x 3 . . . is a vector of predictors (data) and b 1 , b 2 , b 3 . . . is a vector of slopes (parameters) that are to be estimated. The intercept a can be a single parameter, or permitted to vary over m periods of time in a random effects model. When using such a random effects model a was assumed to be normally distributed, with a mean m and standard deviation s, which are themselves parameters with their own prior distributions. The prior distributions used in the analysis are shown in Table 2. Results from models with alternative prior distributions are shown in Figure 2-figure supplement 2. a m~n ormalð; sÞ Fourteen standard covariates included in all risk factor models were: exposure to the six most common combinations of antibiotics, taken intravenously unless otherwise stated (ampicillin, ampicillin + gentamicin, cloxacillin (oral), ceftriaxone, cloxacillin + gentamicin, and imipenem) within the past 48 or 96 hours; whether the infant was breast milk fed; if the infant recieved an oral probiotic on entry (Lactobacillus acidophilus) for prevention of necrotising enterocolitis; sex; born prematurely (before the 37th week of pregnancy); severity (defined as either i. requiring ventilation, ii. requiring continuous positive airway pressure or iii. requiring inotopes); and if already colonised with 3GC-R E. coli. These explanatory variables were treated as binary (0/1). We also included the age in days on first admission to the NU, and the daily number of nurses on the ward. An additional covariate included in one of the risk factor models (see below) was a term for colonisation pressure, which is an integer value representing the number of individuals known to be colonised with 3GC-R K. pneumoniae s.l. on that day. Covariates were recorded for every day the infant was present in the neonatal unit and data were treated as complete. We considered the following models: A. Single intercept with standard covariates (14). Exposure to antibiotics considered if taken within the past 96 hours. B. Single intercept with standard covariates (14). Exposure to antibiotics considered if taken within the past 48 hours. C. Single intercept with standard covariates (14) plus an additional covariate for colonisation pressure. Exposure to antibiotics considered if taken within the past 96 hours. D. Variable intercept by study month that uses partial pooling (hierarchical model). Standard covariates (14) and exposure to antibiotics considered if taken within the past 96 hours.
Rectal swab/culture sensitivity Swab sensitivity was estimated from the number of negative swabs following a positive swab, i) under the assumption that all negative results following a positive swab were false negatives and ii) under the assumption that three or more consecutive negative swabs following a positive represented a true decolonisation event. Posterior distributions of the false negative rate and swab sensitivity were estimated using a conjugate beta prior; beta(a=1, b=7) (Bolker, 2008).

Pathogen sequencing and bioinformatics
We whole-genome sequenced 317 cultured isolates identified morphologically as 3GC-R K. pneumoniae s.l. from i) rectal swabs from all colonised patients within a four month period of the study and ii) twice weekly swabs from seven environmental surfaces around the ward (6 sinks and one computer keyboard) within the same time frame. Sequencing was performed with the Illumina HiSeq 2500 platform, producing 150 base-pair paired-end reads. The reads were trimmed for adapter sequence using TrimGalore (v0.4.4) before assembly with Unicycler (v0.4.5) , contigs <1 kilobase were discarded. Distances between assemblies were calculated using mash (v1.1) (Ondov et al., 2016) and a phylogeny constructed with mashtree (v0.33). Sequence types (STs) were identified using Kleborate 0.2.0 . Variant calling was performed within STs by mapping reads to ST consensus reference genomes (5.32 Mbp) with SMALT (v.0.7.6) https://www. sanger.ac.uk/science/tools/smalt-0 (parameters -x -y 0.85 r 1 j 100 -i 800), before sorting and removing unpaired mate reads and technical duplicates from binary alignment files with samtools (v.1.8).

Transmission models incorporating sequence type data
We assessed within-ward transmission of 3GC-R K. pneumoniae s.l. STs under the assumption that individuals remain colonised with a given ST for the duration of their stay until discharge (Birgand et al., 2013). We fitted data to five linear transmission models.  (2) with an additional term for the contribution of environmental contamination to transmission (g), whereby the number of cases with ST c on an earlier day i' (n i 0 c ) are assumed to leave some trace in the environment that decays exponentially over time at a rate given by l. 4. A hierarchical version of (2) which permits the transmission parameter b to vary by ST. 5. A hierarchical version of (2) which permits the intercept a to vary by ST.
The probability p of colonisation for individual j on day i with ST c for the respective models are: We opted to fit transmission models on the linear, rather than logistic, scale as we consider a linear increase in the force of infection with the number of colonised infants to be a more realistic assumption than the, initially, exponential increase that results from a logistic transformation.

Statistical model fitting
We fitted the statistical models using Hamiltonian Markov chain Monte Carlo in Stan (version 2.17.3) within the R environment (v. 3.4.3). Prior distributions were selected to be weakly informative normal distributions for the risk-factor models (McElreath, 2018), see Table 2. For the transmission models, beta distributions were used as priors. in the case of the hierarchical transmission models (4 and 5) the scale and location parameters are themselves hyper-parameters with their own priors. Studies have shown the half life of carbapenem-resistant K. pneumoniae inoculated onto hospital surfaces to be around 12 hours (Weber et al., 2015); this corresponds to l=1.39 in our model. We therefore gave l in transmission model 3 a more informative prior, normal(m=1, s=2), to give weight to biologically plausible estimates. Prior distributions for all transmission models are shown in Table 3. Results from a less informative prior distribution for l are shown in Figure 3-figure supplement 2. Chains were run for a varying number of iterations depending on the number of parameters to estimate, though with a minimum of 12,000 iterations over four chains, including burn-in. The Gelman-Rubin statistic (R) was used as a diagnostic, where values <1.01 indicate chains have converged and additionally posterior chains were visually inspected for convergence. Effective sample sizes (ESS) for both the centre of the posterior distribution (bulk) and the ends of the distribution (tail) was ensured to be >400 (Vehtari et al., 2019). Model comparison was performed with widely applicable information criterion (WAIC) (Vehtari et al., 2017). We used 95% credible intervals (CrIs) as a measure of uncertainty around posterior parameter distributions and posterior medians as the central estimate. Uncertainty in parameter estimates is represented by the posterior distributions, and the estimated probability that a particular parameter is within a certain range is given by the area under the curve of that parameter's marginal posterior distribution within that range.

Agent-based forward simulations
We forward simulated the impact of interventions using an agent-based model implemented in Python (v. 2.7.15) and hosted at https://github.com/tc13/ward-infection-ABM (copy archived at https://github.com/elifesciences-publications/ward-infection-ABM). In brief, we model in discrete time steps a ward containing a fixed number of beds and where patients sample a length of stay and colonisation status on entry. Simulations to estimate the ward reproduction number R A introduced a single colonised individual into a full ward of susceptible patients. The probability of any uncolonised patient acquiring resistant Klebsiella from the index patient on day one is p ijc , where p ijc is sampled from a beta distribution with shape hyper-parameters a and b from transmission model 4 ( Table 3). This probability changes on subsequent days based on the number of other infants that become colonised and start transmission, and the lengths of stay. The simulation ends when the index patient is discharged.
For simulations that explored the impact of interventions, the effect of each of the covariates of interest (probiotic consumption, breast milk feeding and number of nurses on the ward) were obtained from risk factor model A ( Table 2 and Figure 2A). The marginal effects of each risk factor were transformed into a log-odds ratio and used to modify the daily risk of acquisition sampled from a beta distribution with shape hyper-parameters a s and b s . For n c colonised patients on the ward on day j, the probability of an uncolonised patient with covariate k acquiring 3GC-R K. pneumoniae s.l. is 1-(1-p k ) n cj . The simulation ends after 365 days. To reduce variability between model runs for estimation of R A and the interventions, simulations were run 100 times with each posterior parameter sample and the mean outcome value obtained. As we used 2000 posterior samples from each parameter estimated by model fitting, this resulted in a total of 200,000 model runs for each simulated scenario.
The following dataset was generated: