Tracking Pseudomonas aeruginosa transmissions due to environmental contamination after discharge in ICUs using mathematical models

Pseudomonas aeruginosa (P. aeruginosa) is an important cause of healthcare-associated infections, particularly in immunocompromised patients. Understanding how this multi-drug resistant pathogen is transmitted within intensive care units (ICUs) is crucial for devising and evaluating successful control strategies. While it is known that moist environments serve as natural reservoirs for P. aeruginosa, there is little quantitative evidence regarding the contribution of environmental contamination to its transmission within ICUs. Previous studies on other nosocomial pathogens rely on deploying specific values for environmental parameters derived from costly and laborious genotyping. Using solely longitudinal surveillance data, we estimated the relative importance of P. aeruginosa transmission routes by exploiting the fact that different routes cause different patterns of fluctuation in the prevalence. We developed a mathematical model including background transmission, cross-transmission and environmental contamination. Patients contribute to a pool of pathogens by shedding bacteria to the environment. Natural decay and cleaning of the environment lead to a reduction of that pool. By assigning the bacterial load shed during an ICU stay to cross-transmission, we were able to disentangle environmental contamination during and after a patient's stay. Using a data-augmented Markov chain Monte Carlo method, we determined the relative importance of the considered acquisition routes for two ICUs of the University hospital in Besançon (France). We used information about the admission and discharge days, screening days and screening results of the ICU patients. Both background and cross-transmission play a significant role in the transmission process in both ICUs. In contrast, only about 1% of the total transmissions were due to environmental contamination after discharge.
Based on longitudinal surveillance data, we conclude that improved cleaning of the environment after discharge might have only a limited impact on the prevention of P. aeruginosa infections in the two considered ICUs of the University hospital in Besançon. Our model was developed for P. aeruginosa but can be easily applied to other pathogens as well.

Hospital-acquired infections are a major cause of morbidity and mortality worldwide [1]. In industrialized countries, about 5–10% of admitted acute-care patients are affected, whereas the risk is even higher in developing countries [2].

Due to its intrinsic resistance to multiple antibiotics, Pseudomonas aeruginosa (in short, P. aeruginosa or P.A.) is an important contributor to nosocomial infections [3][4][5]. The most serious P. aeruginosa infections lead to bacteremia, pneumonia, urosepsis, wound infection as well as secondary infection of burns [6]. In 2018, the World Health Organization recognized P. aeruginosa as a serious health-care threat by including it in the list of antibiotic-resistant highest priority pathogens [7].

Given the severe consequences of P. aeruginosa infections, in particular for critically-ill patients, strategies for preventing infections are seen as a key priority. However, infections are recognized as only the tip of the iceberg, while colonizations represent the true load of pathogens carried by patients in the intensive-care unit (ICU). Understanding the dynamics of P. aeruginosa colonizations is therefore crucial for developing and evaluating infection control policies.

There are several modes of transmission for colonizations. The endogenous route, due to e.g. antibiotic selection pressure, was regarded as the most important route [...] of discharge on P. aeruginosa transmissions in ICUs using solely routine surveillance data.

In this section, we present our framework for modeling the transmission routes of P. aeruginosa including environmental contamination, as well as the method for computing the relative contributions of the routes. We further elaborate on the procedure that we used to estimate the relevant transmission parameters. A brief introduction to the data used for the analysis is given. We describe the model selection as well as model assessment procedures that are used to compare the developed models and to assess the model fit to the data.

Transmission models

The underlying model for our algorithm is a compartmental SI-model (e.g. [20]). All patients are admitted to an ICU and belong either to the susceptible (P. aeruginosa negative) or to the colonized (P. aeruginosa positive) compartment at any given time. The latter includes patients with asymptomatic carriage and those with P. aeruginosa infection. A susceptible patient may become colonized at a certain transmission rate, which depends on the colonization pressure in the ward at the time. [...] hands of health-care workers, is proportional to the fraction of colonized patients in the wards. The probability of colonization due to cross-transmission is high if the number of colonized patients is high and vice versa. Environmental contamination is modeled on a ward-level represented as a general pool of bacteria linked to objects contaminated by colonized patients. Bacterial load may persist in the environment even after the discharge of patients. This leads to higher probabilities of acquiring colonization after outbreaks, even when the number of colonized patients is low.

The force of infection λ(t), i.e. the probability per unit of time t for a susceptible patient to become colonized, is modeled as

\[ \lambda(t) = \alpha + \beta \frac{I(t)}{N(t)} + \varepsilon E(t), \tag{1} \]

where I(t) is the number of colonized patients, N(t) the total number of patients and E(t) the bacterial load in the environment at time t. The parameters α, β and ε scale the endogenous, cross-transmission and environmental terms, respectively. The described model is subject to the following further assumptions:

• Once colonized, patients remain colonized during the rest of the stay. This assumption is appropriate when the average length of stay of patients does not exceed the duration of colonization, as is the case for P. aeruginosa.

• Colonization is assumed to be undetectable until a certain detectable bacterial level is reached. We do not distinguish between several levels of colonization. Furthermore, the detection of carriage in specimens is assumed to be the same for each screening.

• Assuming that HCWs are colonized for a short period of time (typically until the next disinfection) in comparison with the length of carriage for patients, we use a quasi-steady state approximation [20]. This means that contact patterns between patients and HCWs are not explicitly modeled and we assume direct patient-to-patient transmission.

• All strains of P. aeruginosa are assumed to have the same transmission characteristics. We therefore assume that all colonized patients may be a source of transmission and contribute equally to the colonization pressure.

• All susceptible patients are assumed to be equally susceptible.

In order to analyze the impact of environmental contamination after the discharge of colonized patients, we model the underlying mechanism leading to the presence of pathogens in the environment after discharge. Patients contribute to the bacterial load by shedding P. aeruginosa at a rate ν during their stay. Furthermore, natural clearance and cleaning lead to a reduction of P. aeruginosa bacteria in the environment at a rate µ. The change of environmental contamination can be described by

\[ \frac{dE(t)}{dt} = \nu \frac{I(t)}{N(t)} - \mu E(t). \tag{2} \]

The differential equation (2) is solved by assuming that I(t) = I_t and N(t) = N_t are known piece-wise constant functions with steps at times t_0, t_1, ..., t_N. It holds

\[ E(t) = E(\tilde{t})\, e^{-\mu (t - \tilde{t})} + \frac{\nu}{\mu} \frac{I_{\tilde{t}}}{N_{\tilde{t}}} \left(1 - e^{-\mu (t - \tilde{t})}\right) \]

for \(\tilde{t} := \max\{x \in \{t_0, \ldots, t_N\} \mid x \le t\}\) and t ∈ R \ {t_0, t_1, ..., t_N}; the values at the step times t_i ∈ {t_0, ..., t_N} follow by continuity from the preceding interval (full details are given in S1 Text). The initial amount of bacterial load is denoted by E_0 := E(t_0).
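The piece-wise solution of (2) can be illustrated with a short sketch. It is not the authors' C++ implementation; the function name `environmental_load` and its argument layout are assumptions made here for illustration, and the prevalence I/N is assumed constant on each interval between consecutive step times.

```python
import math


def environmental_load(times, prevalence, e0, nu, mu, t):
    """Bacterial load E(t) when the prevalence I/N is piece-wise constant.

    times      -- sorted step times t_0 < t_1 < ... (hypothetical input layout)
    prevalence -- prevalence[k] is I/N on the interval [times[k], times[k+1])
    e0         -- initial load E(t_0)
    nu, mu     -- shedding rate and decay/cleaning rate (mu > 0)
    """
    if mu <= 0:
        raise ValueError("mu must be positive")
    if t < times[0]:
        raise ValueError("t lies before the first step time")
    load = e0
    for k, start in enumerate(times):
        end = times[k + 1] if k + 1 < len(times) else math.inf
        dt = min(t, end) - start
        # On each interval the load relaxes towards its local equilibrium.
        equilibrium = nu * prevalence[k] / mu
        load = equilibrium + (load - equilibrium) * math.exp(-mu * dt)
        if t <= end:
            break
    return load
```

With zero prevalence the load simply decays exponentially at rate µ, and a load that starts at the equilibrium (ν/µ)·(I/N) stays there as long as the prevalence does not change.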

Given the number of colonized patients at a certain time t, the bacterial load E(t) is deterministic. The acquisitions are stochastic based on the force of infection in (1).

Under the assumption of a force of infection λ(x) at time x, the cumulative probability of any given susceptible person becoming colonized in [0, t] is \(1 - e^{-\int_0^t \lambda(x)\,dx}\) (see e.g. [21]).
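As a small numerical illustration of this relation, the sketch below evaluates the cumulative colonization probability for a force of infection that is constant on consecutive intervals. The function name and the piece-wise constant input format are assumptions made for this example only.

```python
import math


def acquisition_probability(rates, durations):
    """Return 1 - exp(-integral of lambda) for a piece-wise constant lambda.

    rates     -- force of infection on each interval (per unit of time)
    durations -- length of each interval (same unit of time)
    """
    cumulative_hazard = sum(rate * length for rate, length in zip(rates, durations))
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-cumulative_hazard)
```

For a small daily force of infection the result is close to the rate itself; for example, a rate of 0.01 per day over one day gives a probability just below 0.01.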

All parameters, namely α, β, ε, µ, ν and E_0, are assumed to be non-negative. By setting certain transmission parameters (α, β or ε) to zero, model variants may be defined. In this paper, we additionally consider a submodel with ε = 0, where environmental contamination is not explicitly modeled and therefore only two transmission routes are considered. The force of infection for this transmission model with two acquisition routes is then given by

\[ \lambda(t) = \alpha + \beta \frac{I(t)}{N(t)}. \]

[...] The previous explanation leads to the following attribution of the terms to the different acquisition routes [...] where i_p indicates a colonized patient that is present at time t and i_d a colonized patient that was colonized prior to t but has already been discharged. The bacterial load produced by patient i at time t is given by [...] where t_i^c is the time of colonization and t_i^d the time of discharge of patient i.

In continuous time, the relative contribution of a specific route to the overall number of acquired colonizations is determined by the ratio of the probability of colonization due to that route and the probability of colonization: [...] (4) where l is the number of colonized patients, t_1^c, ..., t_l^c represent the times of colonization and j can be any of the three considered routes. The relative contributions are then given by: [...]

For the submodel including only endogenous and cross-transmission, the computation of the relative contribution is derived from above by setting ε = 0. [...] \(1 - e^{-\int_0^t \lambda(x)\,dx} \approx \lambda(t)\) for small values of λ(t). Hence, the discrete-time formulas for the relative contributions can be approximated by the continuous-time formulas evaluated at discrete time steps.
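The approximation described here, in which each acquisition is attributed to the routes in proportion to their share of the force of infection at the acquisition time, can be sketched as follows. The function name, the route labels and the dictionary-based input are illustrative assumptions; in particular, how the environmental term is split between load shed by present and by discharged patients is left to the caller.

```python
def relative_contributions(route_rates):
    """Average share of each route over all acquired colonizations.

    route_rates -- one dict per acquisition, mapping a route label to that
                   route's term of the force of infection at the acquisition
                   time (hypothetical input format).
    """
    totals = {}
    for rates in route_rates:
        force = sum(rates.values())
        if force <= 0:
            raise ValueError("force of infection must be positive at an acquisition")
        for route, rate in rates.items():
            totals[route] = totals.get(route, 0.0) + rate / force
    n_acquisitions = len(route_rates)
    return {route: total / n_acquisitions for route, total in totals.items()}
```

By construction the returned shares sum to one whenever every acquisition reports the same set of routes.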

Estimation procedure

We assume that a patient is admitted to the ICU at time t_i^a and discharged at time t_i^d. The probability that a patient is admitted already colonized is described by [...] the set of all screening results is denoted by X = {X_1, ..., X_n} where n is the total number of patients. Since screening tests are typically intermittent and imperfect, we define the test sensitivity φ, i.e. the probability that a colonized patient has a positive result.

The aim is to estimate the model parameters α, β, ε, µ, ν and E_0 as well as the sensitivity of the screening test φ and the importation rate f based on longitudinal data. The relative contributions of the transmission routes can then be estimated following the description in (4). The key idea of the estimation procedure is to fit a stochastic transmission model to the observed data. It is based on certain patterns of fluctuations in the prevalence linked to the different transmission routes (as previously described in section Transmission models).

In the analysis, we use the following input data for each patient: [...] screening, missing admission and discharge swabs and leads to an estimation of the true (rather than the observed) prevalence on admission. Precise details of the analysis can be found in S5 Text. The algorithm was implemented in C++ and was tested using simulated data. Convergence of the MCMC chains was verified by visual inspection. We used uninformative exponential priors Exp(0.001) for the transmission parameters α, β, ε and µ. Parameters for the proposal distribution were tuned in order to ensure rapid convergence. Similar to [26], we estimated the sensitivity φ and importation parameter f using uninformative beta prior distributions Beta(1, 1). The initial bacterial load E_0 was approximated by (ν/µ)·Ī, with Ī being the mean prevalence in the ward.

The MCMC algorithm was run for 500,000 iterations following a burn-in of 30,000 [...] During the estimation process, several assumptions are made.

• Incorporating both sensitivity and specificity parameters in a model may cause identifiability issues. Thus, test specificity was assumed to be 100%, meaning that positive results were assumed to be true positives. Experimental results indicate the specificity of screening tests to be close to 100% [22].

• The initial bacterial load E_0 is assumed to be the environmental contamination at the beginning of the study period. The effect of E_0 diminishes proportionally to exp(−µ) per day. It is therefore sufficient to use an approximation rather than including it as a parameter in the estimation process. We use the equilibrium state of (2) as an approximation, i.e.
\[ E_0 \approx \frac{\nu}{\mu}\, \bar{I}, \]
where Ī represents the mean prevalence in the ward. The environmental contribution to the force of infection at time t is ε · E(t). As the total amount of environmental contamination E(t) is unobserved, it is only possible to estimate the product ε · E(t) and therefore, we assume the shedding parameter ν to be fixed at 0.1. All remaining parameters are to be estimated from the data.

• Colonization was defined as the presence of bacteria at the screening sites as reported in the available data. Admission and screening are assumed to occur at 12:00 pm and discharge at 11:59 am.

• Re-admissions are not accounted for. Instead, every new admission is treated as a new patient. The probability to be positive on admission is therefore identical for all patients, irrespective of whether it is a readmission or not. Since we are interested in the overall prevalence and overall relative contribution of the acquisition routes rather than individual predictions, we do not expect this to have a major influence on our results.

• Since the smallest time unit is one day, colonization events occurring on a particular day are assumed to be independent.

• A negative result on the day of colonization is considered to be a false negative result.

• It is assumed that colonized patients contributed to the total colonized population from the day after colonization onwards, or for importations, from the day of [...] resulted from a negative culture on both swabs taken at the specific day. More than 84% of admitted patients were screened. As HCWs, including physicians, were (with minor exceptions) working only in one of the ICUs during the whole study period, the two ICUs can be treated independently in the analysis.

Since 2000, the hand hygiene procedure recommended in both ICUs has been rubbing with alcohol-based gels or solutions (ABS). Cleaning of the rooms is done daily using the detergent-disinfectant Aniosurf®. The sinks were cleaned daily before pouring the detergent-disinfectant Aniosurf® into the U-bends. Plumbing fittings were descaled weekly.

In our main analysis, data for each ICU and each time period (before and after [...]

• the submodel with only endogenous and exogenous transmission.

Patient data were anonymized and de-identified prior to analysis.

Model selection

To assess the relative performance of a given model, we used a version of the deviance information criterion (DIC) based on [23]. For an estimated parameter set θ and [...]

The DIC is a simple measure that can be used to compare hierarchical models. Furthermore, it allows determining whether two data sets may be concatenated or should be treated separately. The idea is to distinguish two models: one that includes one parameter set for both ICUs (and therefore treats them as concatenated) and one that includes different parameter sets for each ICU (and thus treats them as separate). The first scenario leads to one analysis and one DIC value whereas the second model results in two independent analyses and hence two DIC values. The sum of the DICs of the latter may be compared to the DIC value of the first scenario. A smaller DIC value is preferred. More details can be found in S6 Text.
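The comparison of a pooled analysis against separate analyses can be sketched as below. The exact DIC version used in the paper (based on [23]) is not recoverable from the text, so this example assumes the commonly used form DIC = mean deviance + p_D with p_D = mean deviance − deviance at the posterior mean; the function names are hypothetical.

```python
from statistics import fmean


def dic(deviance_samples, deviance_at_posterior_mean):
    """DIC under the assumed standard definition (mean deviance plus p_D)."""
    mean_deviance = fmean(deviance_samples)
    effective_parameters = mean_deviance - deviance_at_posterior_mean
    return mean_deviance + effective_parameters


def prefer_separate(dic_pooled, dics_separate):
    """True if the summed DIC of the separate analyses is smaller than the pooled DIC."""
    return sum(dics_separate) < dic_pooled
```

Here `deviance_samples` would hold the deviance evaluated at each retained MCMC iteration; a return value of `True` from `prefer_separate` corresponds to treating the data sets separately.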

Model assessment

We chose to check the adequacy of the models using the following approach. The ability of the model to predict the probability of acquisition based on the predicted force of [...] computed to determine the actual proportion of updates for which the interval contains the theoretical probability of acquisition. We set the nominal confidence level to 0.95. A good fit is given when the actual coverage probability is approximately equal to the nominal confidence level. In order to avoid the coverage probability tending to zero when p_acq tends to 0 or 1, Jeffreys confidence intervals are used (as recommended in [24]). When N_acq = 0 the lower limit is set to 0, and when N_acq = N_susc the upper limit is set to 1.
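A Jeffreys interval with the boundary conventions stated above can be computed without external libraries, as in the following sketch. The interval is taken from the quantiles of a Beta(N_acq + 1/2, N_susc − N_acq + 1/2) distribution; the numerical scheme (Simpson's rule after the substitution x = sin²θ, followed by bisection) and all function names are choices made for this illustration, not the authors' implementation.

```python
import math


def _beta_half_integral(k, n, upper, steps=1000):
    """Integrate sin^(2k) * cos^(2(n-k)) over [0, upper] with Simpson's rule.

    With x = sin^2(theta), this is proportional to the CDF of
    Beta(k + 1/2, n - k + 1/2) evaluated at x = sin^2(upper).
    """
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        theta = i * h
        weight = 1 if i in (0, steps) else (4 if i % 2 else 2)
        total += weight * math.sin(theta) ** (2 * k) * math.cos(theta) ** (2 * (n - k))
    return total * h / 3


def _jeffreys_quantile(q, k, n):
    """Quantile of Beta(k + 1/2, n - k + 1/2) found by bisection."""
    norm = _beta_half_integral(k, n, math.pi / 2)
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2
        cdf = _beta_half_integral(k, n, math.asin(math.sqrt(mid))) / norm
        if cdf < q:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def jeffreys_interval(n_acq, n_susc, level=0.95):
    """Jeffreys interval for n_acq acquisitions among n_susc susceptibles."""
    tail = (1 - level) / 2
    lower = 0.0 if n_acq == 0 else _jeffreys_quantile(tail, n_acq, n_susc)
    upper = 1.0 if n_acq == n_susc else _jeffreys_quantile(1 - tail, n_acq, n_susc)
    return lower, upper
```

The coverage check then amounts to counting, over the MCMC updates, how often the theoretical probability of acquisition lies between `lower` and `upper`.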

Descriptive analysis of data

The descriptive statistics of the data sets corresponding to ICU A and B with respect to the number of admissions, lengths of stay and colonization characteristics are shown in [...] The corresponding median length of stay was 8.0 days for both ICUs, before as well as after renovation. Hence, there is hardly any difference between the ICUs, nor between the two time periods regarding the median length of stay.

In both ICUs, the fraction of patients who were positive on admission was slightly higher after renovation. In contrast, the observed fraction of patients who acquired [...] Table 2. Acceptance probabilities for proposed updates to the augmented data ranged from 3.2% (ICU B after renovation) to 11.1% (ICU A before renovation).

Pairwise scatter plots indicated little correlation between parameter values, with the exception of a negative correlation between α and β (see S11). Histogram and trace plots of the posterior estimates are given in S12-S15 Figs and show that the MCMC chains mix rapidly and converge quickly to their stationary distribution. We found our estimates to be robust to the choice of priors for transmission parameters.

The probability of being colonized with P. aeruginosa on admission and the screening test sensitivity varied between the two ICUs and the time periods. For both ICUs, the median estimate of the importation probability f is higher in the data set after renovation than before, i.e. 4.5% and 6.2% for ICU A and 6.0% and 9.9% for ICU B. The difference between the time periods is only significant for ICU B. We estimated the median of the prevalence of P. aeruginosa to be 24.4% and 19.9% for ICU A and 22.3% and 24.4% for ICU B before and after renovation, respectively. Median estimates for the [...] respect to the two ICUs, we can conclude that there is a 95% probability that the test sensitivity is higher in ICU B than in ICU A. A possible explanation is that the ICUs differ in their patient population. As a medical ward, ICU B contains patients with longer lengths of stay and more readmissions. Patients who are exposed to an ICU environment for a longer period of time may have a higher probability of becoming colonized at a detectable level. However, this explanation is only hypothetical and the true reason for the difference is not known.

The relative importance of the two considered transmission routes per ICU and time period is depicted in Fig 5 (a) and (b). For ICU A, the median relative contribution of the endogenous route is 53.6% (95% CrI: 32.8–75.9%) and 89.3% [...] The results suggest that both routes have an important impact on the acquisitions in both ICUs. The median estimates of the relative contribution of the endogenous route are higher after than before renovation in both ICUs. Thus, there is a tendency towards a lower contribution of the cross-transmission route after renovation in both ICUs. Possibly, hygiene was improved after renovating the ICUs. However, since the credibility intervals for the endogenous route overlap before and after renovation, there is no evidence that the relative contributions differ between the time periods. Before renovation, the credibility intervals of the relative contributions for the endogenous route and cross-transmission overlap. Thus, we conclude that no route considerably predominates the transmissions before renovation. On the other hand, the respective credibility intervals do not overlap after renovation. Hence, the endogenous route predominates the transmissions after renovation. Comparing the results across ICUs, we can see that the credibility intervals of the relative contributions overlap, leading to the conclusion that the two ICUs do not seem to differ regarding the relative importance of the transmission routes.

Posterior estimates of the model parameters for each ICU are reported in Table 3. The estimates and interpretations for the importation rate f, the screening test sensitivity φ and the mean prevalence stay roughly the same when adding environmental contamination as an additional route. The same holds for the median relative [...] from 7.2% (ICU B after renovation) to 90% (ICU A before renovation).
Pairwise scatter plots indicated strong correlations between α and β, between β and ε, and between ε and µ (see Fig 24). The correlation coefficient of the latter pair ranged from 0.531 to 0.561.

Furthermore, it can be seen in Table 3 [...] the transmissions (more details can be found in S7 Text). Hence, we can conclude that the role of environmental contamination after discharge within the transmission process of P. aeruginosa in the two ICUs A and B is small before as well as after renovation. [...] In total, 14 analyses were performed. For each ICU, three data sets were created: one for each time period and one combining the data sets before and after renovation.

To our knowledge, our study is the first to use mathematical transmission models to estimate the relative contribution of environmental contamination after discharge for P. aeruginosa using only admission, discharge and screening data. The three different routes, i.e. the endogenous route, cross-transmission and environmental contamination after discharge, are distinguished by the patterns in the prevalence that they induce. We estimated that environmental contamination after discharge accounts for at most 1% of the total P. aeruginosa transmissions in the two ICUs of the University hospital in Besançon before and after renovation. In contrast, endogenous as well as [...] performed environmental studies to determine the extent of environmental contamination with an epidemic strain of P. aeruginosa [25]. They concluded that the transmissibility of the epidemic strain cannot be explained solely on the basis of improved environmental survival. Our results likewise demonstrate that the decay of P. aeruginosa is already rapid enough to limit its survival in the environment.

While our approach is efficient in determining the relative contribution of environmental contamination after discharge, requiring merely longitudinal surveillance data, it has several limitations that may restrict its practical applicability.

Our conclusions on the impact of cleaning apply only to the environment after the discharge of patients. Permanently contaminated reservoirs in ICUs, such as sinks, may still serve as sources for colonization. In our model they are assigned to the endogenous route. Thus, while the effect of cleaning improvement after discharge might be limited for the two considered ICUs, general cleaning improvement of the environment might be important to reduce permanent reservoirs for environmental contamination.

The results of our analysis build on a data-augmented MCMC algorithm [19,26]. [...] combination with a small initial standard deviation for its proposal distribution resulted in large acceptance ratios close to 1. The MCMC chain mixed too slowly and therefore hindered the identifiability of the likelihood. We were able to tune the parameters of the proposal distribution for µ such that rapid convergence to the posterior distribution [...] Moreover, colonization is assumed to remain until discharge. While this assumption is true for P. aeruginosa, it does not hold true for all antibiotic-resistant nosocomial pathogens. However, intermittent carriage may be readily included, allowing the method to be generalized to other pathogens.

We assumed no difference in transmissibility between different strains of P. aeruginosa and that all colonized patients are equally likely to transmit the pathogen. While information on antibiotic resistance or microbial genotyping in combination with epidemiological data may aid in distinguishing different strains and identifying specific transmission events, only the uncertainty of the estimates would be affected. In particular, the widths of the credibility intervals are likely to be reduced, but we do not expect a large effect on the parameter estimates.

Assessing the fit of the model to the data is crucial to model building. The true relative importance of the different routes of colonization in ICUs is generally unknown. Genotyping data that might be used to demonstrate the source of the acquired colonization is generally scarce and was not available for the data used in our analysis. While the posterior predictive p-value is a popular method for assessing model fit, it has been increasingly criticized for its self-fulfilling nature [27]. Furthermore, the choice of the test statistic is crucial in order to adequately summarize discrepancies between datasets. Rather than relying on a suitable summary statistic, we presented a model [...] MCMC update simultaneously. Thus, the true sample size is estimated to be smaller.

Further improvement of the method presented here or development of other methods would be a vital topic for assessing epidemic models.

Model selection was performed using the DIC, which is known to perform poorly (i.e. in identifying the correct model) for complex likelihood functions such as those corresponding to epidemic models. Comparing the plausibility of different models is crucial for selecting the model that describes the dynamics of the observed system best. Nevertheless, model choice for stochastic epidemic models is far from trivial. All known approaches for model selection exhibit advantages as well as disadvantages [27], which makes selecting the most suitable model comparison technique far from straightforward. We selected the well-known DIC-method that was easy to use and [...]

Supporting information

S1 Text. Environmental contamination. The full model includes environmental contamination on a ward-level. The bacterial load at any given time t is based on the differential equation

\[ \frac{dE(t)}{dt} = \nu \frac{I(t)}{N(t)} - \mu E(t). \]

Solving the above differential equation requires discretizing over t, resulting in a finite number of time steps t_0, t_1, ..., t_N. We then assume I(t) = I_t and N(t) = N_t to be constant within a time step and use it as initial conditions. Separating variables leads to [...] Now, two cases have to be distinguished.
Determine B_{t_0} for initial condition E(t_0) = E_{t_0}: [...] and therefore [...] For t_0 ≤ t ≤ t_1 the environmental load can then be computed by

\[ E(t) = E_{t_0}\, e^{-\mu (t - t_0)} + \frac{\nu}{\mu} \frac{I_{t_0}}{N_{t_0}} \left(1 - e^{-\mu (t - t_0)}\right). \]

[...] and therefore, it holds

\[ E(t) = E(\tilde{t})\, e^{-\mu (t - \tilde{t})} + \frac{\nu}{\mu} \frac{I_{\tilde{t}}}{N_{\tilde{t}}} \left(1 - e^{-\mu (t - \tilde{t})}\right) \]

for \(\tilde{t} := \max\{x \in \{t_0, \ldots, t_N\} \mid x \le t\}\) and t ∈ R \ {t_0, t_1, ..., t_N} [...]

S2 Text. Discrete-time transmission model. For the discrete-time transmission model, we assume that the number of colonized patients I(t), the total number of patients N(t) and the bacterial load E(t) are constant during the day. It is assumed that admission and screening occur at 12:00 pm on each day T, determining I_T and N_T. Given all the information (at 12:00 pm), the environmental contamination on day T is determined. The force of infection on day T is then given by

\[ \lambda_T = \alpha + \beta \frac{I_T}{N_T} + \varepsilon E_T. \]

S3 Text. Relative contribution. The computations in section Relative contributions of transmission routes were developed for continuous-time models. In our discrete-time model, we assume that events such as admission, colonization and discharge of patients and screening occur on a daily basis. However, we do assume that the level of environmental contamination changes continuously. Computing the relative contributions of the different transmission routes becomes more laborious in this scenario. Let t_i^c be the acquisition time of patient i ∈ {1, ..., n}. The contribution of a route j is the ratio of the probability that the acquisition was due to route j and the total probability of acquisition: Contribution of endogenous route = R_j = [...] where [...] and γ(·, ·) is the lower incomplete gamma function. Note that the derivations are omitted here but can be requested from the first author.

S4 Text. Approximation of relative contribution in discrete-time. Large values of the force of infection λ(t) are very unlikely. Under the assumption of small λ(t), the following simplifications and approximations can be made: [...] Therefore, the force of infection itself may be a good approximation of the probability of infection and the probability of acquiring colonization due to route j may be approximated by the respective sub-term of the force of infection assigned to route j: P(infection during day T due to route j) ≈ λ_j with j ∈ {end, crossT, env}. As an approximation of the relative contribution we compute the ratio of the transmission rate and the force of infection for each acquired colonization: [...] where t_i^c is the day of colonization of patient i ∈ {1, ..., n} and N_acq the total number of occurred colonizations. Furthermore, i_p indicates a colonized patient that is present at time t_i^c and i_d a colonized patient that was colonized prior to t_i^c but has already been discharged. It holds R_end + R_crossT + R_env = 1.

[...] typically intermittent and imperfect, we define the test sensitivity φ (i.e. the probability that a colonized patient has a positive result). We assume that the specificity (i.e. the probability that an uncolonized patient has a negative result) is 100%.

We implemented an adapted version of the data-augmented MCMC algorithm to analyze the data. The transmission and importation model, as well as the data-augmentation method, are closely based on the approach of [19,26] but adapted for the transmission routes presented in this paper. The algorithm was implemented in C++ and the analysis of the output was performed in R (Version 3.5.1) [28]. The aim of our analysis was to estimate the set of parameters θ = {α, β, ε, µ, f, φ}. The prior distributions were chosen as follows:

\[ \alpha, \beta, \varepsilon, \mu \sim \mathrm{Exp}(\lambda), \qquad f, \varphi \sim \mathrm{Beta}(a, b), \]

where Exp(λ) represents the exponential distribution with rate λ, and Beta(a, b) the beta distribution with shape parameters a and b. Having fixed a = b = 1 and λ = 0.001, we use uninformative priors in our analysis.

The data-augmentation procedure accounts for unobserved colonization times by augmenting the parameter space with A = {t^c, s^a}, a set comprising the unobserved colonization times t^c and admission states s^a of all n patients. An admission state of a patient is 1 if the patient is colonized upon admission and 0 otherwise. If patient j becomes colonized during his/her stay, the colonization time may take an integer value between the time of admission t_j^a and time of discharge t_j^d (inclusive). If a patient does not acquire colonization, the respective value t_j^c takes a dummy value of −1. The augmented posterior density relation can be determined using Bayes' Theorem:

\[ P(A, \theta \mid D) \propto P(D \mid A, \theta)\, P(A \mid \theta)\, P(\theta) = P(D \mid t^c, s^a, \theta)\, P(s^a \mid \theta)\, P(t^c \mid s^a, \theta)\, P(\theta) \tag{8} \]

where P(D | A, θ) is the likelihood of the observed data D, P(A | θ) is the likelihood of the augmented data and P(θ) is the joint prior distribution of the parameter set θ. All terms in (8) can be explicitly calculated. It holds

\[ P(D \mid A, \theta) = \varphi^{TP(X)} (1 - \varphi)^{FN(X, A)}, \]

where TP(X) and FN(X, A) are the total number of true positive and false negative swab results, given the colonization times t^c, respectively. This term represents the imperfect observation of the transmission dynamics. Assuming that lost colonization can be excluded, we consider any negative result after the time of colonization as a false negative. Since false positive results are impossible, TP(X) does not depend on the augmented data and can be determined directly from the observed data. The probability of the set of importations, given the importation probability f, is given by

\[ P(s^a \mid \theta) = f^{\sum_i s_i^a} (1 - f)^{\,n - \sum_i s_i^a}. \]

The transmission model itself is reflected in the probability of the colonization times given the admission states and the parameters [...] To update the importation rate f and the sensitivity φ, we use Gibbs sampling as we [...]

The Metropolis-Hastings algorithm generates a Markov chain θ^(1), ..., θ^(N) which converges to a target distribution π(·) if N is large enough. In each update of the Markov chain, a candidate point θ* is sampled from a proposal density q(θ* | θ^(i)), which gives the probability density of proposing θ* given the current, i-th value. The proposed value is accepted with a certain probability, the so-called acceptance ratio

\[ a(\theta^*, \theta^{(i)}) = \min\left\{1, \frac{\pi(\theta^*)\, q(\theta^{(i)} \mid \theta^*)}{\pi(\theta^{(i)})\, q(\theta^* \mid \theta^{(i)})}\right\}. \]
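The observation model and the Gibbs step for the sensitivity φ can be sketched in a few lines. The sketch assumes the Beta(1, 1) prior stated above, so that the full conditional of φ given the true positive and false negative counts is Beta(1 + TP, 1 + FN) by conjugacy; the function names are hypothetical and the counts are assumed to have been tallied from the screening data and the augmented colonization times beforehand.

```python
import math
import random


def observation_log_likelihood(phi, true_positives, false_negatives):
    """log of phi^TP * (1 - phi)^FN, for 0 < phi < 1."""
    return true_positives * math.log(phi) + false_negatives * math.log1p(-phi)


def gibbs_update_sensitivity(true_positives, false_negatives, rng, a=1.0, b=1.0):
    """Draw phi from its full conditional Beta(a + TP, b + FN)."""
    return rng.betavariate(a + true_positives, b + false_negatives)
```

A call such as `gibbs_update_sensitivity(80, 20, random.Random(1))` returns one draw whose distribution has mean (1 + 80) / (2 + 100); the analogous update for f would use the numbers of colonized and uncolonized admissions in place of TP and FN.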

The Metropolis algorithm is a special case of the Metropolis-Hastings algorithm in which the proposal function is symmetric. Since a symmetric proposal distribution simplifies the acceptance ratio to $a(\theta^*, \theta^{(i)}) = \min\{1, \pi(\theta^*)/\pi(\theta^{(i)})\}$, it is often used for updating parameters. The proposal function has a great influence on the speed of convergence and hence on the efficiency of the algorithm. We suggest a proposal distribution that speeds up the convergence towards the target distribution while [...] mean force of infection that should be approximated by the MCMC algorithm.
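As an illustration of the symmetric case, the following Python sketch implements a generic random-walk Metropolis sampler for a scalar parameter. The Gaussian random-walk proposal, the function name `metropolis`, and its arguments are assumptions chosen for the example; this is not the proposal distribution suggested in the text.

```python
import math
import random


def metropolis(log_target, theta0, n_steps, step_sd, rng):
    """Random-walk Metropolis sampler for a scalar parameter.

    log_target: function returning log pi(theta), up to an additive constant.
    theta0:     starting value with finite log_target(theta0).
    step_sd:    standard deviation of the symmetric Gaussian proposal.
    """
    chain = [theta0]
    current, current_lp = theta0, log_target(theta0)
    for _ in range(n_steps):
        proposal = current + rng.gauss(0.0, step_sd)
        proposal_lp = log_target(proposal)
        # Symmetric proposal: the acceptance ratio is pi(theta*) / pi(theta_i).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain
```

For example, `metropolis(lambda x: -x if x > 0 else float("-inf"), 1.0, 2000, 1.0, random.Random(1))` samples from an exponential target. Proposals outside the support have log density minus infinity and are therefore never accepted.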

Proposing new parameter candidates depending on the mean force of infection reduces the volume that has to be traversed in order to converge to the target distribution. The resulting proposal density is no longer symmetric, and thus the procedure requires an adjustment of the acceptance ratio. The adapted Metropolis-Hastings algorithm to update the transmission parameters runs as follows: [...], accept the proposed value and set $\theta^{(i+1)} = \theta^*$, else set $\theta^{(i+1)} = \theta^{(i)}$.
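To show how an asymmetric proposal changes the acceptance step, the sketch below uses a multiplicative log-normal random walk on a positive parameter. This proposal is a stand-in chosen because its correction term has a simple closed form; it is not the mean-force-of-infection proposal described above, and all names are hypothetical. For this proposal, $q(\theta^{(i)} \mid \theta^*) / q(\theta^* \mid \theta^{(i)}) = \theta^* / \theta^{(i)}$.

```python
import math
import random


def log_proposal_correction(current, proposal):
    """log[q(current | proposal) / q(proposal | current)] for a
    multiplicative log-normal random walk; equals log(proposal / current)."""
    return math.log(proposal) - math.log(current)


def metropolis_hastings(log_target, theta0, n_steps, sigma, rng):
    """Metropolis-Hastings with an asymmetric log-normal proposal.

    theta0 must be positive; sigma is the proposal spread on the log scale.
    """
    chain = [theta0]
    current, current_lp = theta0, log_target(theta0)
    for _ in range(n_steps):
        # Propose theta* = theta * exp(sigma * Z), which stays positive.
        proposal = current * math.exp(rng.gauss(0.0, sigma))
        proposal_lp = log_target(proposal)
        # Target ratio plus the Hastings correction for the asymmetry.
        log_ratio = (proposal_lp - current_lp
                     + log_proposal_correction(current, proposal))
        if math.log(1.0 - rng.random()) < log_ratio:
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain
```

Omitting `log_proposal_correction` would make the chain converge to a distribution other than the target, which is why the adjustment of the acceptance ratio is required.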
Text. Model selection. We would like to assess whether we can combine the Besançon data, e.g. before and after the renovation of the ICUs, into one large data set to increase the power of our method. The idea is to compare the deviance information criteria (DICs) for two different scenarios:

• Consider only one model including one parameter set $\theta = \{\alpha, \beta, f, \phi\}$, where $\alpha$ is the endogenous and $\beta$ the cross-transmission parameter, $f$ the importation rate and $\phi$ the test sensitivity. The analysis is then performed on all the data $X$ of the two ICUs and the two time periods (before and after renovation). The DIC is then given by [...]
• Consider a model including a parameter set consisting of separate parameters for each time period: [...] The parameter set of the model is then: [...] The parameters in $\theta_1$ are updated for the data set before renovation, whereas the parameters in $\theta_2$ are updated for the data set after renovation. The deviance for this model is determined by [...], where $X_1$ is the data set for the time period before and $X_2$ the data set for the time period after renovation.

[...] U(0, 2)) led to more rapid convergence of the MCMC chain.
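The comparison of the two scenarios can be sketched in Python as follows. The sketch assumes the standard definition of the DIC (posterior mean deviance plus the effective number of parameters, the latter being the mean deviance minus the deviance at the posterior mean) and assumes that the deviance of the split model is the sum of the deviances of the two periods. Neither formula is taken from the text, and the function names are hypothetical.

```python
def dic(deviance_samples, deviance_at_posterior_mean):
    """Deviance information criterion from MCMC output.

    deviance_samples: deviance D(theta) evaluated at each MCMC sample.
    deviance_at_posterior_mean: D evaluated at the posterior mean of theta.
    """
    mean_dev = sum(deviance_samples) / len(deviance_samples)
    p_d = mean_dev - deviance_at_posterior_mean  # effective number of parameters
    return mean_dev + p_d


def combined_deviance_samples(dev_before, dev_after):
    """Per-iteration deviance of the split model, assuming it is the sum of
    the deviances for the periods before and after renovation."""
    return [d1 + d2 for d1, d2 in zip(dev_before, dev_after)]


def prefer_split_model(dic_pooled, dic_split):
    """Return True if the model with period-specific parameters has the
    lower (better) DIC."""
    return dic_split < dic_pooled
```

If the pooled model has a DIC that is similar to or lower than that of the split model, combining the data sets is supported. How large a difference counts as meaningful is a judgment call that this sketch does not settle.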

For a medium length of bacterial persistence, the model is able to estimate the simulated parameter values, i.e. the true parameter values and the respective relative contributions lie within the 95% credibility intervals, provided the mean prevalence is large enough (> 15%). We have performed further simulation studies where the relative contribution of [...]

[Figure captions: The results are displayed for transmission parameters α, β, and µ. The results are displayed for importation probability f, sensitivity parameter φ, [...]]