Scaling law characterizing the dynamics of the transition of HIV-1 to error catastrophe

Increasing the mutation rate, &mgr; , ?> of viruses above a threshold, &mgr; c , ?> has been predicted to trigger a catastrophic loss of viral genetic information and is being explored as a novel intervention strategy. Here, we examine the dynamics of this transition using stochastic simulations mimicking within-host HIV-1 evolution. We find a scaling law governing the characteristic time of the transition: τ ≈ 0.6 / &mgr; − &mgr; c . ?> The law is robust to variations in underlying evolutionary forces and presents guidelines for treatment of HIV-1 infection with mutagens. We estimate that many years of treatment would be required before HIV-1 can suffer an error catastrophe.


Introduction
The molecular quasispecies theory, developed by Eigen and coworkers in an attempt to understand the origin of life (Eigen 1971, Eigen et al 1989, has yielded key insights into viral evolution and suggested a novel intervention strategy for treating viral infections (Domingo et al 2012). The theory predicts that when the viral mutation rate, , μ is below a threshold, , c μ the viral population in an infected individual is dominated by the fittest, or master, sequence. When , c μ μ ⩾ however, the distribution of sequences delocalizes in sequence space, obliterating the domination of the master sequence and resulting in a catastrophic loss of genetic information. Intervention to increase μ beyond c μ could thus prove lethal to the virus (Eigen 2002, Domingo et al 2012. Supporting evidence comes from in vitro experiments with the poliovirus, where 4-fold increase in μ lowered viral infectivity by 70% (Crotty et al 2001), prompting efforts to develop drugs called mutagens that can increase .
μ For treating HIV-1 infection, for instance, where current antiretroviral drugs are susceptible to failure by viral mutation-driven development of resistance (Wensing et al 2014), this novel strategy of increasing μ may be particularly promising. Indeed, several HIV-1 mutagens are in the pipeline (Loeb et al 1999, Harris et  The success of treatment with mutagens relies on accurate estimates of c μ as well as the timescale of the transition to error catastrophe, . τ c μ determines the minimum desired level of exposure to a mutagen: Typically, μ increases with mutagen dosage (e.g., see Dapp et al 2009) and the dosage employed must ensure that . c μ μ > , τ referred to also as the relaxation time or the characteristic time of the transition, determines the requisite duration of treatment given . c μ μ > In a recent clinical trial involving an HIV-1 mutagen, for instance, over the 124 days of treatment, no significant viral load change was observed in patients, although the viral mutational pattern was altered (Mullins et al 2011). This absence of antiviral activity was attributed to the poor knowledge of the dosage of the mutagen and the duration of treatment necessary to induce an error catastrophe in HIV-1 (Mullins et al 2011).
Application of the quasispecies theory to estimate c μ and τ is precluded by the assumptions the theory makes that often do not hold for specific viruses. For instance, the quasispecies theory considers the asexual evolution of an infinitely large population of haploid organisms on an isolated peak fitness landscape. HIV-1, in contrast, is diploid, undergoes recombination (Onafuwa-Nuga and Telesnitsky 2009), has a finite population in infected individuals (Kouyos et al 2006), and experiences a much broader fitness landscape (Bonhoeffer et al 2004, Hinkley et al 2011, Ferguson et al 2013. Although attempts to relax several of the assumptions of the quasispecies theory have been successful (e.g., see Nilsson and Snoad 2000, Saakian and Hu 2006, Park and Deem 2007, Nagar and Jain 2009, Alonso and Fort 2010, incorporating all of the above features of HIV-1 in the theory has not been possible. Recently, as an alternative, we developed population genetics-based stochastic simulations that incorporate all the above features of HIV-1 and closely mimic patient data of viral genomic evolution over extended durations following infection (Vijay et al 2008, Balagam et al 2011. With these simulations, we found that HIV-1 experiences a sharp transition to error catastrophe and estimated c μ to be 2-6 fold higher than its natural mutation rate (Tripathi et al 2012). A guideline for the desired level of exposure to HIV-1 mutagens thus emerges, and has been applied to the drug KP1212 (Li et al 2014). The required duration of treatment, however, remains unknown.
In the present study, we employ the stochastic simulations above to estimate τ as a function of ( ) . c μ μ > We find that τ obeys a scaling law: We examine the dependence of this scaling on simulation parameters, viz., population size, genome length, cellular superinfection frequency, recombination rate, and the fitness landscape, and find that the scaling is robust to variations in these parameters although c μ is not. With this scaling law, we estimate that treatment with mutagens would have to last several years before the transition of HIV-1 to error catastrophe can be realized.

Methods
Simulation procedure Our simulations proceed as follows. We represent viral genomes as bit-strings of length L. Each virus contains two such genomic strings. We initiate infection with a founder sequence, F, which we assume without loss of generality to be the fittest sequence (see figure S1). We construct F by randomly choosing a sequence of nucleotides, A, G, C, and U. We let all the V virions in the initial viral pool contain F. We infect C uninfected cells synchronously with virions drawn from the viral pool. Infection involves transferring the chosen viral genomes to the cells. Each cell is infected by M virions, where M is drawn randomly from a distribution that mimics experiments (stacks.iop.org/PB/12/054001/ mmedia). Following infection, the two genomes from each infecting virion undergo reverse transcription, which involves mutation and recombination. We commence reverse transcription from one end of one of the genomes chosen with a probability ½. The nucleotide sequence of that string is copied bit by bit until a switch to the other genome is implemented, which then acts as the template. At each position, the switch occurs with a probability , ρ the recombination rate. The resulting string is thus a mosaic of the two parent genomes. This recombined string is mutated with a probability μ per position. The mutated string yields the proviral DNA produced from reverse transcription of the viral genomes. We repeat this procedure for all the viral genomes infecting each cell. Infected cells then produce progeny virions. The proviral DNA within a cell are randomly chosen in pairs with repetition and copied as the genomes of progeny virions. Each cell is assumed to produce P progeny virions. The progeny virions are subjected to fitness selection, where virions are chosen with a probability equal to their relative fitness to yield a new pool of virions for the infection of a fresh set of C uninfected cells. The fitness of a virion is assumed to be the average of the fitness of its genomes. Genomic fitness is determined by an experimentally identified fitness landscape for HIV-1 (see text S1).
In each generation, we compute the distribution of genomes in different Hamming classes. Genomes carrying i mutations relative to F belong to Hamming class i. We also compute the Shannon entropy associated with the distribution, where v j is the frequency (or relative prevalence) of genome j. The time when the mean Shannon entropy reaches 0.9 is taken as , τ the time when the quasispecies has practically completely delocalized in sequence space, indicating error catastrophe. We estimate τ as a function of ( ) c μ μ > and other simulation parameters. We set parameters to values representative of HIV-1 infection in vivo (Tripathi et al 2012, Thangavelu et al 2014) and also examine implications of their variations (text S1). For each parameter setting, we run simulations for up to 10000 generations and average over up to 10 realizations (figure S2). We perform the simulations using a code written in C++.

Results
Dynamics of the transition to error catastrophe We start with a viral population comprising the founder sequence and let the mutation rate, , μ exceed the error threshold, . c μ Initially, the distribution of genomic sequences is localized at Hamming class 0. With time, mutations accumulate and the distribution shifts to higher Hamming classes. When 60 , c μ μ ≈ for instance, the peak of the distribution is at a Hamming class ∼2000 after 50 generations and reaches Hamming class ∼5000 in ∼500 generations, where it subsequently stays ( figure 1(a)). Note that for the full length genomes considered here, i.e., L 10000 = nucleotides, the peak at L/2 5000 = nucleotides marks complete delocalization of the quasispecies in sequence space, indicating completion of the transition to error catastrophe. (When , c μ μ < the peak stabilizes at a Hamming class .) The corresponding evolution of the Shannon entropy, H, shows a rise from zero at the start of the simulations to an asymptotic plateau to unity at the end ( figure 1(b)). H here serves as an order parameter. When all genomes are identical, indicating complete localization, H 0, = whereas when all possible genomes are equally represented, indicating complete delocalization in sequence space, H 1.
= In our previous simulations to determine c μ (Tripathi et al 2012), we assumed that when H crossed 0.9, the quasispecies was sufficiently delocalized to be treated as having suffered an error catastrophe. In accordance, here we define , τ the time required for the transition to error catastrophe to be practically complete, as the time when H 0.9. = (Our key conclusions are not sensitive to the latter value of H (figure S3).) Thus, when 60 ,
τ Indeed, we find upon performing simulations with several values of μ and analyzing the data that τ scales as~1/ ( ) c μ μ − ( figure 1(c)). This scaling is consistent with the expectation that as μ approaches c μ from above, τ diverges, and with the observation that when , c μ μ ≫ an n-fold increase in μ results in an n-fold decrease in . τ We examine next whether the scaling is robust to parameter variations.

Robustness of the scaling law
We vary the population size of cells, C, the genome length, L, the frequency of multiple infections of cells, M, the recombination rate, , ρ and the fitness landscape. For many parameter combinations, estimates of c μ are available from our previous study (Tripathi et al 2012); where unavailable, we estimate c μ here (table S1). We find that c μ varies with the parameters (figure S5). For instance, consistent with our previous findings (Tripathi et al 2012), c μ scales as C 1/ . Yet, intriguingly, τ depends only on , c μ μ − regardless of the parameters. The dependence of τ on parameters is thus subsumed in . c μ Further, not only is the scaling applicable to all parameter variations, data from the various parameter sweeps overlap (figure 2). Thus, the scaling appears remarkably robust to parameter variations. Using a least squares fit to all our simulation data, we obtain the scaling law:

Estimate of the minimum treatment duration with mutagens
We apply the scaling law to estimate the duration of treatment with an HIV-1 mutagen for ensuring that the transition to error catastrophe is practically complete. We estimated c μ for HIV-1 in vivo previously to be 7 10 1 10 ≈ − years. If the mutagen were to drive μ 10-fold above , c μ the corresponding duration would be reduced to 3.5 5 ≈ − years. Thus, the transition appears slow and treatment over several years appears necessary to drive HIV-1 to an error catastrophe.

Discussion
The notion inspired by the quasispecies theory that increasing the viral mutation rate can drive viruses past an error threshold is being explored as a novel treatment strategy for viral infections and holds particular promise against infections such as HIV-1 because it may be less susceptible to failure via viral mutation-driven development of resistance than current antiretroviral treatments (Domingo et al 2012). Indeed, several mutagens are in development for HIV-1 infection. In a previous study, we estimated the error threshold, , c μ of HIV-1 to be 2-6-fold higher than its natural mutation rate, suggesting that modest increases in the mutation rate could drive HIV-1 past its error threshold (Tripathi et al 2012). A recent study, using an empirical fitness landscape of the HIV-1 protein p6, has also argued that HIV-1 survives close to its error threshold (Hart and Ferguson 2015). Here, we identify a scaling law, 0.6/ ( ), c τ μ μ ≈ − that relates the timescale, , τ of the transition to error catastrophe to the mutation rate, , μ presenting guidelines for the duration of treatment with HIV-1 mutagens. We estimate that such treatments would have to last several years before HIV-1 suffers an error catastrophe.
Our findings may explain why the short span of 124 days might not have resulted in a significant viral load decline in patients in a recent clinical trial with an HIV-1 mutagen (Mullins et al 2011). According to the scaling law we identify, >50-fold increase in the mutation rate would be required to realize an error catastrophe in such a short span, which is unlikely to be achieved with the dosage employed. Exposure to the mutagen KP1212 has been observed to increase mutational frequencies by 40-90% (i.e., <2-fold) in the HIV-1 envelope gene in vitro (Harris et al 2005). Its pro-drug, KP1461, employed in the trial, appeared to yield a much smaller, ∼20%, increase in the so-called private mutational frequency in vivo (Mullins et al 2011). Clearly, much longer treatment durations appear to be necessary with the dosages employed.
We find the scaling law to be robust to parameter variations. Parameter variations alter the strengths of underlying evolutionary forces. For instance, increasing the frequency of multiple infections of cells increases the contribution of recombination to viral genomic diversification. Similarly, increasing the population size lowers stochastic effects and hence random genetic drift. The scaling law is thus robust to variations in the underlying evolutionary forces. Parameter variations, however, do affect . c μ Thus, the effects of parameter variations are all subsumed in c μ leaving the scaling law unaffected. The origin of this intriguing scaling behaviour remains difficult to unravel. Campos and Fontanari (1999) found using a Markov processes approach that the time for the master sequence to vanish from the population depended on an unknown function of where Q is the probability of correct copying of an infinitely long sequence, and a is the selective advantage of the master sequence in an isolated peak fitness landscape. Here, in closer agreement with the definition of error catastrophe, we estimate τ as the time for the quasispecies to delocalize in sequence space and not for the master sequence to vanish. While a Markov chain that  figure S5. The black line represents the best-fit to all of the data and the grey lines 95% confidence limits. mimics our simulations in the absence of recombination can be set up, analyzing its properties and in particular estimating its so-called mixing time, which would yield , τ poses an outstanding challenge (Dixit et al 2012, Vishnoi 2015. A further complication is introduced by the lack of reliable estimates of the fitness of strains carrying multiple mutations leaving the global fitness landscape inadequately characterized (Hinkley et al 2011, Ferguson et al 2013. We recognize several limitations of our study. First, our simulations ignore lethal mutations that could be introduced by a mutagen. Lethal mutations lower viral infectivity and/or productivity, which may decrease the viral population size continuously leading to an extinction rather than an error catastrophe (Tejero et al 2010), as also predicted with APOBEC3Ginduced hypermutations (Thangavelu et al 2014). The required treatment duration may then be shorter than estimated by the present scaling law. Using mutagens in combination with other drugs may similarly shorten the required treatment duration. Further, we employ estimates of the viral generation time (∼2 d), obtained by continuum viral dynamics models (Dixit et al 2004), that are larger than those obtained by fits of our simulations, which assume discrete generations with synchronous infections, to patient data (∼1 d) (Balagam et al 2011). Lower generation times would again imply shorter required treatment durations. Our estimate of the required treatment duration is thus a conservative one. Second, our simulations do not incorporate latent reservoirs of HIV-1 (Siliciano and Greene 2011) because the reservoirs do not contribute significantly to active viral diversification and would remain unaffected by mutagens. Eliminating the reservoirs remains a key challenge to HIV-1 eradication (Deng et al 2015). Finally, our simulations also do not explicitly consider the host immune response (Perreau et al 2013). The immune response, to a first approximation, may be viewed as resulting in an altered fitness landscape (Shekhar et al 2013), which we expect not to affect the scaling law we identify (see above). Besides, with the effective population size we employ, our simulations are in agreement with patient data of viral diversification over extended durations (Vijay et al 2008, Balagam et al 2011 so that the effects of the immune response on viral diversification are implicitly accounted for. Nonetheless, the full scope of the interactions between HIV-1 and the host immune system is yet to be unravelled (Perreau et al 2013), leaving its implications for treatment with mutagens the subject of future study.
In summary, we unravel a new scaling law characterizing the dynamics of the transition of HIV-1 to error catastrophe and estimate that treatment of HIV-1 infection with mutagens may have to last several years for the transition to be realized.