Accounting for cellular-level variation in lysis: implications for virus–host dynamics

ABSTRACT Viral impacts on microbial populations depend on interaction phenotypes—including viral traits spanning the adsorption rate, latent period, and burst size. The latent period is a key viral trait in lytic infections. Deﬁned as the time from viral adsorption to viral progeny release, the latent period of bacteriophage is conventionally inferred via one-step growth curves in which the accumulation of free virus is measured over time in a population of infected cells. Developed more than 80 years ago, one-step growth curves do not account for cellular-level variability in the timing of lysis, potentially biasing inference of viral traits. Here, we use nonlinear dynamical models to understand how individual-level variation of the latent period impacts virus–host dynamics. Our modeling approach shows that inference of the latent period via one-step growth curves is systematically biased—generating estimates of shorter latent periods than the underlying population-level mean. The bias arises because variability in lysis timing at the cellular level leads to a fraction of early burst events, which are interpreted, artefactually, as an earlier mean time of viral release. We develop a computational framework to estimate latent period variability from joint measurements of host and free virus populations. Our computational framework recovers both the mean and variance of the latent period within simulated infections including realistic measurement noise. This work suggests that reframing the latent period as a distribution to account for variability in the population will improve the study of viral traits and their role in shaping microbial populations. IMPORTANCE Quantifying viral traits—including the adsorption rate, burst size, and latent period—is critical to characterize viral infection dynamics and develop predictive models of viral impacts across scales from cells to ecosystems. Here, we revisit the gold standard of viral trait estimation—the one-step growth curve—to assess the extent to which assumptions at the core of viral infection dynamics lead to ongoing and systematic biases in inferences of viral traits. We show that latent period estimates obtained via one-step growth curves systematically underestimate the mean latent period and, in turn, overestimate the rate of viral killing at population scales. By explicitly incorporating trait variability into a dynamical inference framework that leverages both virus and host time series, we provide a practical route to improve estimates of the mean and variance of viral traits across diverse virus–microbe systems.


S1.1 Nonlinear population model of lytic phage infections
We use a coupled system of nonlinear differential equations to model virus-host dynamics, including susceptible cells, S, free viruses, V , exposed cells, E, and actively-infected cells, I.In this model, susceptible host cells, S, are infected by free virus particles, V .We assume that viruses and hosts are well-mixed.Under these assumptions, the incidence, the rate at which susceptible cells are infected, is given by the mass action term: i(t) = ϕ S V , where ϕ(ml/hr) denotes the adsorption rate.
To incorporate variability in latent period, we assume that before entering the actively-infected stage, I, infected cells advance through several exposed E stages: E 1 , . . ., E n , where n is a non-negative integer; infected cells move between stages at a rate of (n + 1) η with exponentially distributed times, where η is the lysis rate.As the cells remain in each stage for a period of T /(n + 1) on average and there are n + 1 stage transitions, the average time from adsorption (i.e., entering the first exposed class, E 1 ) to cell burst (i.e., exiting the actively-infected class, I) is the latent period mean, equal to the inverse of the mean lysis rate, T = 1/η.At the end of the actively-infected stage, I, the cell bursts and free virus, V , increases as a result of viral release of β virions.The system of nonlinear, ordinary differential equations can be written in the form: where µ (1/hr) denotes the maximal cellular growth rate, K (1/ml) denotes the cellular carrying capacity in the absence of viruses, and N = S + I + n k=1 E k is the total cell population.Note that when the number of E stages n equals 0, the model is reduced to the SIV model where İ = i(t) − η I.We assume that infected cells at any stage of infection do not grow and that cell death rates and virus washout rates are negligible compared to other key rate constants of the system.See Table 1 for model parameter descriptions.Hence, this model describes the latent period distribution as an Erlang distribution with shape n + 1, the number of exposed (E) compartments plus the infected (I) compartment, and rate η, the lysis rate.In this form, the mean (T ), variance (σ 2 ), and coefficient of variation (σ/T ) of the latent period are given by: Mean(LP) : Var(LP) : CV(LP) : The number of E compartments modulates the dispersion of the distribution through the coefficient of variation (CV), with larger n resulting in tighter distributions with smaller CV (Figure S1).Note that requiring n to be an integer limits the CV values that can be simulated.For example, n = 0 corresponds to CV = 1, while n = 1 corresponds to CV ≈ 0.7, meaning that CV values between 0.7 and 1 cannot be represented using our model.Latent period distributions with CV lower than 0.5 can be simulated with our model at a tolerance of 0.05.Based on lysis timing variability of induced lysogens [1], we expect latent period CV values in natural systems to be <0.5, consistent with our model's effective coverage of latent period variability.

S1.2 Parameter inference
This section provides extended details for the parameter estimation section in the Methods of the main text.All code is written in Julia v1.8 [2].Our computational framework fits host and viral temporal data to our nonlinear population model of lytic infections to recover the parameters that allow the model to better represent the data.
First, host-only parameters, i.e., host growth rate (µ) and carrying capacity (K) are estimated using temporal data of host without added phage that captures the exponential and stationary phases.The framework is composed of two steps: (1) The log-transformed data is segmented into two lines, one representing the slope of the exponential phase and the other one representing the growth plateau.The slope of the exponential phase is assumed to approximate the host growth rate (µ) and the y-intercept of the stationary phase is assumed to approximate the carrying capacity (K); (2) The point estimates obtained in step (1) are used to inform priors for a Markov Chain Monte Carlo (MCMC) search.Details on the MCMC search implementation are provided below.These host-only predicted parameter values are used in the following steps of the parameter inference.
Second, the computational framework estimates the model microbe-virus parameters, i.e., adsorption rate (ϕ), burst size (β), lysis rate (η), and the number of E compartments (n) using host and free virus temporal data from 'multi-cycle response curves' where free virus is added to a microbial population and both free virus (V ) and total host (N ) densities are measured at multiple time points.The framework's main aim is to recapitulate the latent period distribution described by the latent period mean (T = 1/η) and coefficient of variation (CV = (n + 1) 1/2 ).The framework for microbe-virus trait estimation is comprised of two main steps: (1) a likelihood function to narrow the search space for all microbe-virus parameters; (2) a Markov Chain Monte Carlo (MCMC) search with prior distributions informed by the likelihood function in step (1).In step (1), rough parameter ranges are found using a grid search for the maximum likelihood estimate (MLE) of the parameter combination.In step (2), we implement MCMC using confidence intervals obtained in step (1) to inform prior distributions.As chains can converge to different parameter ranges, we choose the set of chains that converge to the same values for all parameters and that minimize the error chains.The resulting posteriors are then used as priors for a second round of MCMC.We obtain 95% confidence intervals by sampling the resulting posterior distributions.

S1.2.1 MCMC implementation
For step (2) in parameter estimation of both host-only and microbe-viral parameters we use Markov Chain Monte Carlo (MCMC) implemented in the probabilistic inference package Turing [3] in the Julia language v1.8 [2].MCMC is a class of algorithms that aim to obtain the equilibrium probability distribution of the model parameters [4].We use the No-U-Turn Sampler (NUTS) from the Turing package [3] to sample the posterior distribution.For each implementation, we ran MCMC chains for 2000 iterations with a 1000iteration warm-up period.The target acceptance ratio was set to 0.45.We obtained high autocorrelation for the cases where the coefficient of variation is small.In these cases, we achieved chain convergence by thinning chains to obtain 3000 steps.

S1.2.2 Chain convergence analysis
To ensure the convergence of the MCMC chains in step ( 2), we calculated the potential scale reduction, R [5] for the pooled 10000-iteration long chains.For each parameter, R measures the ratio of the average variance in samples of the chain to the variance of the whole chain.When the chains have converged around a posterior distribution, R will be centered around 1. If the chains have not converged to a single distribution, R values will be decimals larger than 1.The R calculated for all our chains is centered below 1.1, suggesting convergence.
We computed the Effective Sample Size (ESS, N ef f ) ratio as N ef f divided by the number of iterations in the pooled chain (N = 3000).A low ratio, an empirical common cutoff of N ef f N << 0.1, suggests sampling issues and unreliability to estimate confidence intervals.The ESS ratio for all our chains is close to 0.1.Both R and N ef f were calculated with the ess_rhat function from the MCMCDiagnosticTools Julia package [6] (Figure S5).

S1.2.3 Parameter ranges and MCMC priors
For step (1) of the microbe-virus parameter inference, we do a Maximum Likelihood grid search with biologically relevant parameter value ranges.The prior distributions for the MCMC search in step (2) are informed by estimations obtained from step (1).The prior distributions are given by a LogN ormal(θ, ν), where θ = ln mode + 2 3 ln mean mode and ν = 2 3 ln mean mode .For the first round of MCMC the mode of the distribution is equal to the Maximum Likelihood point estimates obtained in step (1), and the distribution mean is equal to two times the point estimate.For the second round of MCMC (only used for microbevirus parameters), the mode is equal to the mean of the parameter chain from the previous round, and the distribution mean equals the mode plus two times the chain's standard deviation.CV [1] in [19] in [1] Lysis time ± sd (r) [19] Mean lysis time ± sd (n) [ S2: Lysis time in λ phage lysogens is consistently underestimated when using population-level protocols.The lysis time upon thermal induction of isogenic λ strains that differ only in mutations in the S gene, which encodes a holin-antiholin system that modulates lysis timing, was determined using turbidity assays in [19].In a subsequent study, the lysis time of individual cells was measured for the same strains using microscope tracking [1].The same induction protocol was followed in both studies.Notably, the mean lysis time calculated from cellular-level measurements is consistently longer than the one obtained from population-level protocols.S105 strains have a mutation that abolishes antiholin expression.(r) represents the number of replicates in turbidity assays.(n) represents the number of individual cells observed and may represent the pool of multiple experiments.

Parameter Parameter description
Step (1) Search space Step ( Table S3: Parameter ranges and MCMC priors used for parameter inference.The specific values for θ and ν are described in the text above.Time, hr  : Lysis time estimation bias in λ phage lysogens using population-level protocols is dependent on the coefficient of variation.Top, the population to cellular-level lysis time measurement ratio decreases with larger CV, as a result of early lysis events.Confidence intervals were calculated by multiple resampling using experimental mean and standard deviation assuming normality, as explained in [20].Our model predicts a similar trend where the mean lysis time underestimation increases with CV when using either PFU assays (represented by the time of first visible burst, bottom left panel) or turbidity assays (represented by the start of host population collapse, bottom right panel) as indicators of lysis time.This trend is parameter-free, as we simulate a perfectly synchronized population of infected cells by initializing the simulations with the total host density in the first infection compartment (E 1 ).
Given that infected cells are assumed to not grow, host-only parameters (growth rate, µ and carrying capacity, K) are irrelevant.The time of first burst and start of host population collapse can be characterized independent of burst size (β).Our model also suggest that assays using host dynamics may estimate the mean lysis time more accurately than PFU-based methods (compare purple to orange lines in top panel).

Figure S1 :
Figure S1: Latent period variability incorporated in a lytic infection model.A) We construct a compartmental model of lytic viral infections.Susceptible cells, S, grow in a logistic manner determined by the growth rate (µ) and carrying capacity (K).Susceptible cells are infected by free viral particles, V , dependent on the susceptible cell, viral density, and the adsorption rate (ϕ).Before entering the actively infected stage, I, infected cells advance through several exposed E stages.The number of E stages (n) is a parameter of the model that determines the coefficient of variation (CV) of the latent period distribution.After reaching the actively infected stage I, cells burst yielding viral offspring with burst size β.The average time before cell burst is the inverse of the lysis rate (η).B) The number of E compartments (n) is treated as a model parameter.The CV of the distribution scales like (n + 1) −1/2 .

Figure S2
FigureS2: Lysis time estimation bias in λ phage lysogens using population-level protocols is dependent on the coefficient of variation.Top, the population to cellular-level lysis time measurement ratio decreases with larger CV, as a result of early lysis events.Confidence intervals were calculated by multiple resampling using experimental mean and standard deviation assuming normality, as explained in[20].Our model predicts a similar trend where the mean lysis time underestimation increases with CV when using either PFU assays (represented by the time of first visible burst, bottom left panel) or turbidity assays (represented by the start of host population collapse, bottom right panel) as indicators of lysis time.This trend is parameter-free, as we simulate a perfectly synchronized population of infected cells by initializing the simulations with the total host density in the first infection compartment (E 1 ).Given that infected cells are assumed to not grow, host-only parameters (growth rate, µ and carrying capacity, K) are irrelevant.The time of first burst and start of host population collapse can be characterized independent of burst size (β).Our model also suggest that assays using host dynamics may estimate the mean lysis time more accurately than PFU-based methods (compare purple to orange lines in top panel).

E
Figure S3: In silico multi-cycle response curves of biological systems.Using our virus-microbe dynamical model (see Methods) we simulate the dynamics of three different virus-microbe pairs.Parameter values were obtained from literature estimations of the host and virus life-history traits (Table3).Multi-cycle response curves were simulated for the different systems varying the latent period coefficient of variation to range between 0.05 and 0.5 (distributions are shown on the left).The free virus and host densities are shown in the middle and right panels, respectively.

Table S1 :
Data sources for Figure1Sources for one-step growth curves shown in Figure1including host species with strains and virus used to perform the experiments.The latent period is explicitly reported in the text and/or figure captions of the sources.