ROTATING ANTIBIOTICS SELECTS OPTIMALLY AGAINST ANTIBIOTIC RESISTANCE, IN THEORY

The purpose of this paper is to use mathematical models to investigate the claim made in the medical literature over a decade ago that the routine rotation of antibiotics in an intensive care unit (ICU) will select against the evolution and spread of antibiotic-resistant pathogens. In contrast, previous theoretical studies addressing this question have demonstrated that routinely changing the drug of choice for a given pathogenic infection may in fact lead to a greater incidence of drug resistance in comparison to the random deployment of different drugs. Using mathematical models that do not explicitly incorporate the spatial dynamics of pathogen transmission within the ICU or hospital and assuming the antibiotics are from distinct functional groups, we use a control-theoretic approach to prove that one can relax the medical notion of what constitutes an antibiotic rotation and so obtain protocols that are arbitrarily close to the optimum. Finally, we show that theoretical feedback control measures that rotate between different antibiotics motivated directly by the outcome of clinical studies can be deployed to good effect to reduce the prevalence of antibiotic resistance below what can be achieved with random antibiotic use.


Introduction
Antibiotic rotation was proposed over a decade ago as a way of reducing the incidence of antibiotic-resistant infections. This view, articulated by Niederman in the editorial Is Crop Rotation of Antibiotics the Solution to a Resistant Problem in the ICU? (see [9]), states "The 'crop rotation' theory of antibiotic use has suggested that if we routinely vary our 'go to' antibiotic in the ICU (intensive care unit), we can minimize the emergence of resistance..." In the intervening decade, a number of theoretical studies have espoused a different viewpoint in proposing that the heterogeneous, random deployment of antibiotics in an ICU or hospital can slow the evolution and spread of drug-resistant pathogens [7,2,3]. The purpose of this paper is to interpret the antibiotic deployment problem in the framework of optimal control theory using mathematical models of antibiotic use already developed in [2,3]. Our main finding can be summarised thus: for such mathematical models, the optimal antibiotic usage protocols do indeed rotate between their 'go-to' antibiotics, just not routinely.
There is no discrepancy between the findings of [2,3] and this paper; the apparent difference between the two sets of results rests in the interpretation of what antibiotic rotation means. The citation of Niederman hints at a scheduled and cyclical rotation that exchanges one drug for another periodically, where that period is fixed at the start of a clinical trial, say, just as a crop rotation might only change the crop with each new season. The work in [2,3] shows that this idea need not work for antibiotics. Indeed we believe that there is no theoretical basis to support the optimality of scheduled antibiotic rotation. However, as we show below, it is equally true that the random allocation of drugs to each patient is not optimal in the models of [2,3].
In general, the optimal protocol will exchange one antibiotic for another across the theoretical ICU unit or hospital, not routinely or randomly, but in a manner commensurate with the epidemiological and evolutionary dynamics observed in each context. It is the resultant adaptive rotation of antibiotics based on the observation, or even partial observation, of those dynamics that may lead to the optimal protocol and minimise selection for drug-resistant pathogens. We arrive at this theoretical result by first noticing that rotational protocols as they are modelled in [2,3] switch between the prioritisation of two drugs in such a way that one of them is designated the 'go-to' drug at every moment in time. As we explain later, this form of antibiotic protocol can be written as a bang-bang function which allows us to apply standard control-theoretic results (see [15], for example) and deduce the theoretical optimality, or at least near-optimality, of rotational protocols.
In terms of empirical evidence for and against the cycling of antibiotics, some studies support rotation [11,8] but others either advocate against it or at least indicate indifference [13,14,10]. The authors of [6] go as far as making the claim that antibiotic rotation may be implicated in the cause of an outbreak of resistant Pseudomonas aeruginosa. Empirical studies evaluating the efficacy of antibiotic rotation prior to 2005 have also been criticised for 'multiple methodological flaws and a lack of standardization', a particular criticism being the lack of repetition of cycles within rotational protocols [4].
In order to place our analysis into an empirical context we end the paper by taking the idea that '...prescription patterns balancing the use of different antimicrobials should be promoted to reduce selection pressure' from [12] to create a feedback control strategy that balances the use of different antibiotics. To design the rules for this controller we distill the following observation taken from [1] into a mathematical form: "A non-premeditated change of antibiotics in empirical therapy, on the basis of detected resistance patterns, provided promising results in reducing some antimicrobial resistance rates." We interpret this quotation as a maxim that can be employed to control the spread of resistance in theoretical models of antibiotic use. This maxim states: if the observed level of resistance to an antibiotic is too high, exchange it for a different antibiotic. Later, we show by example that the implementation of this simple rule in pre-existing mathematical models of antibiotic use can outperform the random allocation of drugs.
We end this section with a remark. A crucial biological assumption is used in [2,3] to simplify the modelling problem, namely antibiotic symmetry. This assumption is not benign. It is a mathematical degeneracy and we prove that antibiotic rotation, in the weaker sense defined in this paper, is optimal whenever such a symmetry property is not present in a mathematical, epidemiological model of antibiotic use.
1.1. Notation. The 1-norm of a vector $s = (s_1, ..., s_k)$ is given by $\|s\|_1 = \sum_{i=1}^{k} |s_i|$. $L(\mathbb{R}^k)$ denotes the space of linear maps on $\mathbb{R}^k$ and $\|s\|_2$ denotes the 2-norm of $s$: if $s = (s_1, ..., s_k)$ then $\|s\|_2 = \left(\sum_{i=1}^{k} s_i^2\right)^{1/2}$. For each linear mapping $A \in L(\mathbb{R}^k)$ we define the operator 2-norm $\|A\|_2 = \sup_{\|s\|_2 = 1} \|As\|_2$. A function or vector that is zero everywhere will be denoted, on occasion, by 0, so that $f = 0$ means that $f(t) = 0$ or $f(t) = (0, 0, ..., 0)$ for all $t$.
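As a numerical companion to the norms just defined, the following minimal sketch computes the 1-norm and 2-norm of a vector and estimates the operator 2-norm of a matrix. The function names are illustrative choices, and the power iteration on A^T A is one possible way of approximating the supremum; it is an assumption of this sketch, not a method used in the paper.

```python
import math


def norm1(s):
    """1-norm: sum of the absolute values of the entries."""
    return sum(abs(x) for x in s)


def norm2(s):
    """2-norm: square root of the sum of the squared entries."""
    return math.sqrt(sum(x * x for x in s))


def matvec(A, s):
    """Apply the linear map A (given as a list of rows) to the vector s."""
    return [sum(a * x for a, x in zip(row, s)) for row in A]


def operator_norm2(A, iters=500):
    """Estimate sup over ||s||_2 = 1 of ||A s||_2 by power iteration on A^T A.

    Assumption: the all-ones start vector is not orthogonal to the dominant
    singular direction; otherwise the estimate can be too small.
    """
    At = [list(col) for col in zip(*A)]  # transpose of A
    v = [1.0] * len(A[0])
    for _ in range(iters):
        w = matvec(At, matvec(A, v))
        n = norm2(w)
        if n == 0.0:  # A v vanished, so the estimate is zero
            return 0.0
        v = [x / n for x in w]  # v now has unit 2-norm
    return norm2(matvec(A, v))


s = [3.0, -4.0]
print(norm1(s), norm2(s))
print(operator_norm2([[3.0, 0.0], [0.0, 4.0]]))
```

For a diagonal matrix the operator 2-norm is the largest absolute diagonal entry, which gives a quick sanity check on the estimate.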
We shall use a barcode graphic to denote the deployment of two different antibiotics as part of a rotational protocol, as illustrated in Figure 1. This graphic shows that all patients are treated initially with drug A, before a switch is invoked at time T 1 to drug B. Throughout the paper we shall use boldface s to denote the state variable of a mathematical model, p will denote a vector of fixed parameters used to define the model and t will denote time. The following mathematical model is investigated in [3, Case III]: where the state variable is given by s := (x, y w , y a , y b ) and the set of fixed, epidemiological parameters in this model is given by whose interpretation is contained in Table 1. Here, x denotes the density of uninfected hosts in a hospital or intensive care unit, say, y w is the density of hosts infected by wild-type bacterial strain, y a are hosts infected with A-resistant bacteria and y b are hosts infected with B-resistant strains. There are no multidrug-resistant bacterial strains in this model, although that case is also considered in a different model in [3].
In (1), f a is a variable that may depend on time and denotes the proportion of infected hosts treated with antibiotic A and f b is the proportion of hosts treated with a second antibiotic B; moreover, we shall invoke a must-treat everyone constraint that $f_a(t) + f_b(t) = 1$ for all times t ≥ 0. The optimal control problem for (1) is to determine the protocol f a (t) that minimises the observed prevalence of resistance over a given time period of length T : (1).
Definition 1. The 50-50 mixing protocol for Problem 1 is defined by taking a constant value for the treatment protocol f a , namely f a (t) = 1/2 for all t ≥ 0. The interpretation of this condition is that exactly half of all infected hosts are treated with drug A, half with drug B so that f b (t) = 1/2 too. As the model in Problem 1 does not track individual treatments, this corresponds to the random allocation of the two drugs per infected patient.
Other mathematical models of drug use are given in the literature such as the following developed in [2]. Let S be the fraction of patients in a hospital colonised by antibiotic-susceptible bacteria, let R 1 be the fraction of patients colonised by bacteria resistant to antibiotic 1, let R 2 be the fraction colonised by bacteria resistant to antibiotic 2 and then X denotes the fraction of uncolonised patients. If we use these variables to create a state vector s = (S, R 1 , R 2 , X), the following epidemiological dynamics describing the antibiotic treatment of a patient population in a hospital are given in [2]: The interpretation of the parameter set used in this model p = {µ, σ, m, m 1 , m 2 , γ, β, α, τ max , c 1 , c 2 } is given in Table 2. As done in [2] and in Problem 1 above, we simplify the optimisation problem associated with (2) by imposing the must-treat constraint that τ 1 (t) + τ 2 (t) = τ max for all 0 ≤ t ≤ T , where τ max is a fixed parameter that determines the maximum rate of drug use. The optimal treatment problem for (2) is to minimise the observed prevalence of antibiotic-resistant infections subject to treating at the maximum rate possible. We state this mathematically as follows:

and equation (2).
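The payoff in Problem 2 is described in the text as the time integral of R1 + R2, that is, the weight vector w = (0, 1, 1, 0) acting on s = (S, R1, R2, X). A minimal sketch of how that quantity could be estimated from sampled trajectories follows; the function name, the trapezoid rule and the example data are assumptions made purely for illustration.

```python
def mean_resistant_fraction(times, R1, R2):
    """Trapezoid-rule estimate of (1/T) * integral over [0, T] of (R1 + R2) dt.

    times  : increasing sample times, starting at 0 and ending at T
    R1, R2 : fractions colonised by resistant bacteria at those times
    """
    total = 0.0
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        left = R1[i - 1] + R2[i - 1]
        right = R1[i] + R2[i]
        total += 0.5 * (left + right) * dt
    T = times[-1] - times[0]
    return total / T


# Hypothetical, made-up trajectories on a coarse grid of days.
times = [0.0, 1.0, 2.0]
R1 = [0.10, 0.10, 0.10]
R2 = [0.20, 0.20, 0.20]
print(mean_resistant_fraction(times, R1, R2))
```

Dividing by T does not change which protocol minimises the payoff because T is fixed; it only rescales the value to a per-unit-time mean.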
Problem 2 also has a 50-50 mixing protocol that is defined by taking a constant value for the treatments: τ 1 (t) = τ max /2 for all t.

Table 2. Interpretation of the parameters in (2).
parameter      meaning
τ 1 , τ 2      rate of use of drugs 1 and 2 per unit time (days)
m, m 1 , m 2   patients enter hospital in states S, R 1 and R 2 at rates µm, µm 1 and µm 2 resp.
c 1 , c 2      fitness cost of resistance to bacteria
σ              relative rate of secondary colonization to primary colonization
β              rate constant for colonization of uncolonized individuals
γ              untreated patients colonized by susceptible bacteria remain colonized 1/γ days on average
µ              rate of patient turnover in the hospital
α              represents physician compliance with cycling program

The treatment payoff must be multiplied by the total population size in the hospital (some fixed and unknown constant) in order to represent the total number of patients infected with antibiotic-resistant pathogens over the period observed. So, $\frac{1}{T}\int_0^T \left(R_1(t) + R_2(t)\right) dt$ is the per unit time, mean fraction of patients infected with drug-resistant pathogens; it is unimportant whether or not we divide by T when minimising the treatment payoff as T is a fixed parameter.

This set differs from the parameters given in [3] where s = 1/1000 and r a = 1/10; the parameter b does not appear to have a defined numerical value in [3]. When working with Problem 2 we shall use the numerical parameter set

Throughout the paper the term parameter-initial condition set (PICS) will be used for the set of epidemiological parameters and initial conditions defined within Problems 1 and 2; note that each element of a PICS forms a pair that we shall write throughout as (p, s 0 ). The following important definition makes explicit the term symmetric as it is used in [2].
The pairs (p, s 0 ) so-defined are asymmetric in the sense of Definition 2, contrasting with the values chosen for numerical simulations in [3,2] where symmetric values are used.
Mathematical models that have symmetric parameter values and initial conditions can be thought of as descriptions of antibiotic deployment problems in which the fundamental epidemiological properties of the drugs are identical. This may mean that there are equal fitness costs of antibiotic resistance to the pathogenic bacteria, equal transmission rates of those pathogens or the equal prevalence of resistant phenotypes at the beginning of an observation period. However, while it is natural to support antibiotic symmetry on the grounds of numerical parsimony, we claim it is unlikely that two antibiotics will exert precisely the same selection pressures on bacterial pathogens. As a result we have chosen to use slightly different parameter sets for our illustrative simulations given later in the paper from those found in [3,2] in order to mimic the deployment of two antibiotics from distinct functional groups as defined, for example, in the sense of [17].
We make the claim that both Problem 1 and Problem 2 must reflect this fundamental property on biological grounds too. Consider two antibiotics, rifampin (rif) and sorangicin A (sor), that have the same mode of action and bind to the same residue on their common target protein, inhibiting the synthesis of mRNA by binding to the β subunit of RNA polymerase. Rif causes the bacterial cell to abort transcription at the elongation phase, as does sor, albeit with slightly different abortive transcripts, and the gene rpoB controls resistance mutations to both antibiotics. However, it is known [5] that mutations in rpoB conferring resistance to rif need not confer resistance to sor because of the greater flexibility of the sorangicin A molecule (also see [16]); thus functionally identical antibiotics may be different from an evolutionary perspective. As a result we argue that we should seek to understand the structure of solutions to Problems 1 and 2 for all parameter sets, whether symmetric or asymmetric, but we now explain the mathematical reasons why the symmetric case is so special.
First, note that the differential equations in Problems 1 and 2 can both be written in the abstract form Equation (1) can be written in the form (3) as follows: first set s = (x, y w , y a , y b ) and then For equation (2) we have s = (S, R 1 , R 2 , X) and The fact that (1) and (2) can both be written in the form of (3) allows us to deduce properties of these two specific models by deducing properties from the more general and structural form of (3). Now, equation (3) is a differential equation on a four-dimensional state-space Σ of nonnegative vectors, so s(t) ∈ Σ for all t, where the parameter vector p lies in a space P of positive parameter values and so a PICS, (p, s 0 ) say, is an element of P × Σ. In Problem 1 we have s = (x, y w , y a , y b ) whereas in Problem 2 we write s = (S, R 1 , R 2 , X). The parameter-dependent linear maps g(p) and G(p) describe how the different rates of input of each antibiotic into the system drive the epidemiological dynamics of that system.
The optimality criteria in Problems 1 and 2 can now be written in an abstract form by defining a weight vector, call it w, setting A(t) + B(t) ≡ C, the latter a fixed constant, and then seeking a protocol A(t) ∈ L ∞ (0, T ) that achieves (3); the optimal protocol that solves Problem A will be denoted throughout by A * (t). Note that both Problems 1 and 2 have the same form as Problem A and so any statement made of Problem A regarding the structure of A * (t) has immediate consequences for both Problem 1 and Problem 2.
For each measurable control or deployment function A satisfying 0 ≤ A(t) ≤ C, the corresponding solution s A obtained by solving the differential equation (4) yields a value of the functional that will be denoted R(A) throughout the remainder and called the treatment objective. The function of t, (w, s A (t)), will be called the running objective associated with A. Moreover, for Problems 1 and 2 the weight vectors are w = (0, 1, 1, 1) and w = (0, 1, 1, 0), respectively.
Let us now be precise about the differences between antibiotic cycling, antibiotic rotation and antibiotic mixing protocols and note that the terms alternating protocol and sequential protocol are used synonymously for the term antibiotic rotation in the remainder of the paper.
Definition 3. Any measurable, almost-everywhere (a.e.) periodic function A(t) defines an antibiotic cycling protocol for Problem A if 0 ≤ A(t) ≤ C a.e., whereas if A(t) is constant (a.e.) it defines a mixing protocol. If two functions A(t) and B(t) satisfy for almost all t, we say that the antibiotics A and B are deployed in rotation in Problem A.

Any protocol whereby
will be described using the prefix '50-50'.
The subset M ⊂ P × Σ for which a solution A(t) of Problem A is a mixing protocol is called the mixing PICS; note that M may be empty.
Implementing the must-treat constraint A(t) + B(t) = C in equation (3) yieldṡ and so we define, here and throughout, Thus, if there is any parameter value p for which g(p ) = G(p ) then the set (p , s 0 ) : s 0 ∈ Σ must lie in the mixing PICS because the independence of equation (4) of A in this case renders the treatment objective identical for all deployment protocols. This is a trivial form of degeneracy that causes the mixing PICS M to be non-empty; we discuss less trivial examples below.
The Lagrangian of Problem A is and the Hamiltonian H is finally, the adjoint variable µ satisfies the final-value problem As is well-known, the Hamiltonian associated with (4)(5) is maximised at all times along an optimal solution (s * , µ * , A * ) of Problem A with respect to the control variable A: The solution of the optimal control problem Problem A is therefore a bang-bang function A * (t) taking only the values 0 and C unless t takes values in an interval where the switching function (µ * (t), G(p)s * (t)) is zero: Bang-bang controls correspond precisely to the antibiotic rotational protocols of Problem A and from the form of the optimal control A * given in (6) we deduce that Problem A may only have a solution that is a mixing protocol when for almost all t between 0 and T . Based on this observation, and one that is quite standard within the theory of optimal control, condition (7) will be used below to rule out mathematical models within Problem A for which mixing outperforms antibiotic rotation. Moreover, switching functions such as σ(t) in (7), so-named because it tells us when an exchange of antibiotics should be invoked, will be denoted using the Greek letter σ throughout the paper.
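The pointwise maximisation described above can be illustrated with a minimal sketch: given the value of a switching function at a time t, the control takes the value C where the switching function is positive and 0 where it is negative. The function names and the example switching function are hypothetical, and the value returned on the singular set (where the switching function vanishes and the Hamiltonian does not determine the control) is an arbitrary placeholder.

```python
def bang_bang_control(sigma, C=1.0, singular=None):
    """Pointwise maximiser of A * sigma over 0 <= A <= C.

    sigma    : value of the switching function at one time
    C        : upper bound on the deployment rate of drug A
    singular : value used where sigma == 0; defaults to C / 2, which is
               only an illustrative choice, not a derived quantity
    """
    if sigma > 0:
        return C
    if sigma < 0:
        return 0.0
    return C / 2 if singular is None else singular


def protocol_on_grid(sigma_fn, times, C=1.0):
    """Evaluate the bang-bang rule along a list of times."""
    return [bang_bang_control(sigma_fn(t), C) for t in times]


def sigma_example(t):
    """Hypothetical switching function with a single sign change at t = 4."""
    return 4.0 - t


times = [0.0, 2.0, 4.0, 6.0]
print(protocol_on_grid(sigma_example, times))  # prints [1.0, 1.0, 0.5, 0.0]
```

In this toy example drug A is the only drug used before t = 4 and drug B (at rate C minus the control) is the only drug used afterwards, which is a rotation in the sense of the text.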
Using condition (7) as the starting point, we deduce the following theorem that provides technical conditions on F and G under which there can be no solution of Problem A that represents an antibiotic mixing protocol.
Proof. Begin by defining a new time-scale τ := t/T and re-writing the Euler-Lagrange equations of Problem A, namely (4)-(5), in the form Now set m := µ/T so that To complete the proof we shall need the following auxiliary lemma that is required nowhere else in the paper.
Proof. Let Φ(t) be the smooth, one-parameter family of matrices that satisfies and the result follows.
The proof of Theorem 1 follows immediately below and to reduce notational clutter we assume without loss of generality that the constant C defined in Problem A equals one.
Suppose that a parameter value T , that we label T * , exists for which Problem A has optimal control A * ≡ ω * ∈ (0, 1) with treatment objective R(ω * ) and so we may suppose (s * , m * ) is a solution of the re-scaled Euler-Lagrange equations (10)(11). If we now define we may re-write the Euler-Lagrange equations associated with Problem A as a nonlinear operator equation that we denote E(S, m, T, ω) = 0, where are Banach spaces when endowed with standard C 1 norms and is an everywhere continuously Fréchet differentiable nonlinear mapping. Define the following isomorphism of Banach spaces D : U × V → C 0 ([0, 1], R k ) × C 0 ([0, 1], R k ) given by the differential operator Being an isomorphism, D is a linear operator of Fredholm index 0 but then is also a linear, Fredholm mapping of index-0 because it is a compact perturbation of D.
We shall call the two-parameter function (S(T, ω), m(T, ω)) the mixing surface of Problem A for it contains every possible small-T mixing solution of this optimal control problem. We can extend the domain of this surface, currently Ω, to the entire rectangular domain [0, T * ] × [0, 1] using Lemma 1 and the implicit function theorem, but we shall only sketch the argument as follows.
First fix ω = ω * . If then we can find a sequence (S * n , m * n , T * n , ω * ) ∈ U × V × [0, T * ] to form this infimum. By Lemma 1 this sequence is C 0 -bounded, but from the form of E(S * n , m * n , T * n , ω * ) = 0 we can bootstrap to readily obtain C 2 bounds on this same sequence and so extract C 1 -convergent subsequences that we do not relabel that converge to a solution of E(S, m, T, ω * ) = 0. We can then apply the implicit function theorem using the fact that ∂ S,m E(S, m, T, ω * ) is an isomorphism at this point to further extend the definition of (S(T ), m(T ), T, ω * ) to a lower value of T . This is a contradiction which ensures that Although the mixing surface is now defined on [0, T * ] × [0, 1], because mixing solutions must be totally singular in the construction of the optimal control (6) this surface only contains mixing solutions of Problem A when the switching function σ defined in (7) equals zero as a function in C 0 [0, 1] when evaluated on that surface. In other words, σ(T, ω)(t) := (G(p)(s 0 + S(T, ω)(t)), m(T, ω)(t)) = 0 must be satisfied for all t between 0 and 1 in order for the mixing protocol ω to be a solution of Problem A.
Our goal now is to use the infinite-dimensional version of Taylor's theorem to determine conditions that must be satisfied by F and G under the assumption that mixing is optimal. From this working assumption, the optimal control of Problem A is A * (t) ≡ ω * identically in t, which is a constant and so a smooth function. Accordingly we can apply the infinite-dimensional version of Taylor's theorem and write, for 0 ≤ t ≤ 1 and fixed ω > 0, where the $O(T^2)$ term here is measured in the $C^0$-norm. Solving E(S, m, T, ω) = (0, 0) when T = 0 and ω is arbitrary yields the unique solution Continuing with the application of Taylor's theorem and expanding the solution locus of E(S, m, T, ω) = (0, 0) locally as a Taylor series, we therefore obtain Let us now compute the T -derivative ∂ T S(T, ω)(t) that we denote by S T ∈ U ; for the derivative ∂ T m(T, ω)(t) we shall write m T ∈ V . On differentiating the equation E(S, m, T, ω) = 0 with respect to T we find where m(t) = (1 − t)w. Solving (13a-b) and incorporating boundary conditions we obtain, for 0 ≤ t ≤ 1 and at T = 0, But the following expression for the switching function σ is identically zero in ω, T and t: In order for the mixing constant ω * to be the optimal solution of Problem A, from the O(1) terms in (14) we require (G(p)s 0 , w) = 0, but the O(T ) terms must also be identically zero in t. Hence, the quadratic expression in t must be zero for all t ∈ [0, 1] and all (T, ω) in the domain of σ, concluding the proof.
Theorem 1 is a negative result in the sense that it does not help us find solutions of Problem A, but it can be used to tell us when antibiotic mixing is not a solution of Problem 1 and Problem 2 in concrete cases. In particular, we have the following two corollaries which state that Problem 1 and Problem 2 have optimal controls that are mixing protocols only when their respective sets of parameters and initial conditions (PICSs) are symmetric.
Corollary 1. Suppose that system parameters (given by the vector $p$) and initial conditions (given by the vector $s_0 = (x(0), y_w(0), y_a(0), y_b(0))$) are non-negative in (1) with $h > 0$ and suppose also that Problem 1 has an optimal mixing treatment $f_a^*(t)$ that we denote by the constant $\omega^* \in (0, 1)$. If $y_w(0) > 0$ and $h > 0$, then the PICS $(p, s_0)$ is necessarily symmetric:
$\omega^* = \tfrac{1}{2}$, $r_a = r_b$ and $y_a(0) = y_b(0)$, (15)
and so $y_a(t) = y_b(t)$ for all $t \ge 0$.
The following shows that a similar statement can be made for Problem 2.
Corollaries 1 and 2 represent analogous statements in terms of Problems 1 and 2 that may be summarised as follows: we do not yet know whether antibiotic mixing protocols are optimal for Problems 1 and 2, but if mixing is optimal for one of these models at some parameter value, the parameters and initial conditions within that model must be symmetric in the sense of Definition 2. These two results form the essence of our argument, ensuring as they do that many biologically interesting parameter values exist for which antibiotic mixing is not the optimal protocol. Indeed, these corollaries show that mixing may only be optimal in mathematically rare cases.

Optimal Protocols: Bang-Bang Controls
The results of the previous section are entirely negative and give no clue as to what the optimal deployment protocols might actually be for a given mathematical model. So, we now apply standard control-theoretic results to establish the epidemiological result that alternating protocols are optimal for Problem A, or at least 'ε-suboptimal' in a sense described below.
The set of admissible controls U for Problem A is the set of measurable functions taking values almost everywhere between 0 and C:
$U := \{A \in L^\infty(0, T) : 0 \le A(t) \le C \text{ for almost all } t \in (0, T)\}$;
we are interested in conditions under which a solution of Problem A exists and lies in U. The set of bang-bang functions B is contained within U and is defined by
$B := \{\varphi \in U : \varphi(t) \in \{0, C\} \text{ for almost all } t \in (0, T)\}$.
It is important to note that bang-bang functions B exactly describe the rotational protocols of equation (4) because the range of a function φ ∈ B can only contain the two values 0 and C. In terms of Problem A, if A(t) = φ(t) and B(t) = C − φ(t) then A and B represent a rotational protocol that is completely described by φ.
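To make the correspondence between bang-bang functions and rotational protocols concrete, the sketch below builds a function φ taking only the values 0 and C from a list of switch times, in the spirit of the barcode graphic, and recovers the two deployment functions as A = φ and B = C − φ. The function name, the switch times and the convention that a switch takes effect at the switch time itself are assumptions of this sketch.

```python
from bisect import bisect_right


def rotation_protocol(switch_times, C=1.0, start_with_A=True):
    """Return phi(t), a bang-bang function with values in {0, C}.

    switch_times : increasing list of times at which the drugs are exchanged
    start_with_A : if True, drug A is deployed before the first switch
    """
    def phi(t):
        k = bisect_right(switch_times, t)  # number of switches at or before t
        on_A = (k % 2 == 0) == start_with_A
        return C if on_A else 0.0
    return phi


# Hypothetical protocol: drug A until day 10, drug B until day 25, then A again.
C = 1.0
phi = rotation_protocol([10.0, 25.0], C)
for t in (0.0, 12.0, 30.0):
    A = phi(t)
    B = C - A
    print(t, A, B)
```

At every time one of A and B equals C and the other equals 0, so the must-treat constraint A + B = C holds and exactly one drug is the 'go-to' drug.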
The following basic existence theorem tells us that an optimal control exists for Problem A provided (4) has a natural control-independent, point-dissipative bound. More importantly, it shows that the optimal deployment protocol can be approximated arbitrarily closely in a suitable sense by functions that rotate between the two antibiotics.
Theorem 2. Suppose that there is a finite constant $\mathcal{C}$ depending on $C$, $p$, $T$ and $s_0$ such that for any function $A \in U$, the solution $s$ of (4) with $s(0) = s_0$ satisfies, for any norm $\|\cdot\|$,
$\sup_{0 \le t \le T} \|s(t)\| \le \mathcal{C}(C, p, T, s_0)$. (23)
Then Problem A has at least one solution $A^* \in U$ with corresponding state response $s^*$ which satisfies (4).
Proof. Suppose that the sequence $(s_n, A_n)$ provides the infimum $R^* := \inf\{R(A) : A \in U\}$, then we may assume that there is an $A_{\inf} \in U$ such that $A_n \to A_{\inf}$ in the weak* topology on $L^\infty$, because U is compact with respect to that topology. Without the loss of any generality, let us shift the initial datum to zero in equation (4) by assuming that $s_n$ satisfies
$\dot{s}_n = F(s_0 + s_n, p) + A_n \cdot G(p)(s_0 + s_n)$, $s_n(0) = 0$,
instead of (4). We obtain the bound
$\left\|\frac{d}{dt} s_n\right\|_\infty \le \|F(s_0 + s_n)\|_\infty + C \|G(p)\|_1 \|s_0 + s_n\|_\infty$,
but $\|s_n\| \le \mathcal{C}(C, p, T, s_0)$ and as all finite-dimensional norms are equivalent it follows that the sequence $(s_n) \subset W^{1,\infty}_0((0, T), \mathbb{R}^k)$ is bounded (the space $W^{1,\infty}_0((0, T), \mathbb{R}^k)$ appropriately incorporates the zero boundary condition at $t = 0$). As a result $(s_n)$ has a weak* convergent subsequence that we do not relabel, converging to $s_{\inf} \in W^{1,\infty}_0((0, T), \mathbb{R}^k)$. As the nonlinear mapping $N(s, A) := \dot{s} - F(s_0 + s, p) - A \cdot G(p)(s_0 + s)$ is continuous with respect to weak* convergence in $W^{1,\infty}_0(0, T) \times L^\infty(0, T)$, we see that the limiting pair $(s_0 + s_{\inf}, A_{\inf})$ satisfies (4), that is $N(s_{\inf}, A_{\inf}) = 0$, and the result follows on setting $A^* = A_{\inf}$.
3.1. The optimal mixing protocol. As pointed out in Appendix B3 of [3], the idea of an optimal mixing protocol is meaningful in the context of asymmetric antibiotic deployment problems whereby asymmetric PICS values are used. In such a case, the optimal mixing protocol has to be adjusted from the 50-50 value of ω = 1/2 to account for the different evolutionary and epidemiological properties of the two drugs.
So, let $s_\omega(t)$ be the solution of the differential equation
$\dot{s}_\omega = F(s_\omega, p) + \omega\, G(p) s_\omega$, $s_\omega(0) = s_0$,
which sees the constant deployment of two antibiotics at some rate $\omega \in [0, C]$. The optimal mixing protocol for equation (4) is found by solving a one-dimensional optimisation problem which asks for the single value $\omega$ between 0 and $C$, denoted $\omega^*$, for which the treatment objective is minimal. It is clear that the optimal mixing protocol is suboptimal in the context of (4) because
$R(A^*) \le R(\omega^*)$ (24)
by definition. Note that we have already proven in Corollaries 1 and 2 of the previous section that equality is possible in (24) for Problem 1 and Problem 2 only when the parameters and initial conditions used within those problems are symmetric.
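The one-dimensional search for an optimal mixing constant can be sketched numerically. The example below does not use equations (1) or (2); it assumes a hypothetical linear two-strain toy model in which drug A, deployed at rate ω, clears B-resistant infections and drug B, deployed at rate C − ω, clears A-resistant infections. All parameter values, the explicit Euler integrator and the grid search are illustrative assumptions.

```python
def treatment_objective(omega, s0, T, C=1.0, g_a=0.3, g_b=0.2, h=1.0, steps=5000):
    """Integral over [0, T] of (ya + yb) under constant deployment omega.

    Toy dynamics (hypothetical, not the paper's model):
        dya/dt = (g_a - h * (C - omega)) * ya
        dyb/dt = (g_b - h * omega) * yb
    In the abstract form this is F(s) = ((g_a - h*C)*ya, g_b*yb),
    G = diag(h, -h) and weight vector w = (1, 1).
    """
    ya, yb = s0
    dt = T / steps
    total = 0.0
    for _ in range(steps):
        total += (ya + yb) * dt  # left-rectangle quadrature of (w, s)
        dya = (g_a - h * (C - omega)) * ya
        dyb = (g_b - h * omega) * yb
        ya, yb = ya + dt * dya, yb + dt * dyb  # explicit Euler step
    return total


def optimal_mixing(s0, T, C=1.0, grid=20):
    """Grid search for the constant omega in [0, C] with the smallest objective."""
    candidates = [C * i / grid for i in range(grid + 1)]
    return min(candidates, key=lambda om: treatment_objective(om, s0, T, C))


s0 = (0.2, 0.1)  # asymmetric initial resistance levels (made up)
best = optimal_mixing(s0, T=10.0)
print(best, treatment_objective(best, s0, 10.0), treatment_objective(0.5, s0, 10.0))
```

Because the grid contains ω = C/2, the value found can never be worse than 50-50 mixing on this toy problem; a finer grid or a scalar minimiser could replace the grid search.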
Theorem 2 can be applied to Problems 1 and 2 to provide the main mathematical result of this paper as a corollary.
Corollary 3. Problems 1 and 2 have optimal controls f * a (t) ∈ L ∞ (0, T ) and A * (t) ∈ L ∞ (0, T ) respectively. If their respective PICSs are asymmetric then there are infinitely many antibiotic rotation protocols that outperform antibiotic mixing in terms of the performance measure R(A).
Proof. From Theorem 2 we only have to establish the existence of a dissipative bound of the form (23) for equations (1) and (2), the result then follows from the second part of Theorem 2.
The following theorem illustrates that when condition (8) of Theorem 1 applies, the optimal antibiotic deployment protocol cannot be antibiotic mixing. Indeed, within the optimal protocol there is a time interval over which one of the drugs should not be deployed and the analysis immediately below tells us that this is because condition (8) can be thought of as telling us when the prevalence of resistance to one of the antibiotics is too high. We formalise this idea in the following theorem.
Theorem 3. Suppose that there is a finite constant $\mathcal{C}$ depending on $C$, $s_0$, $T$ and $p$ (but not $A$) such that for any function $A \in U$, the solution $s$ of (4) with $s(0) = s_0$ satisfies $\|s\| \le \mathcal{C}(C, p, T, s_0)$. Also assume that condition (8) holds, $(w, G(p)s_0) \ne 0$, and write $s^*(t)$ for the solution of (4) corresponding to an optimal control $A^*(t)$ of Problem A. As a result, to each $T$ we can associate at least one optimal control $A^*_T$ by Theorem 2. Under these restrictions there exist uncountably many $T > 0$ for which $A^*_T(\cdot)$ takes either the value 0 or $C$ on a non-trivial sub-interval of $[0, T]$ of the form $[0, \tau)$ and so cannot be a mixing protocol.
Proof. Let $(T_n)$ be any positive sequence of times converging to zero and let $A^*_{T_n}$ be an optimal solution of Problem A associated with these times; such a sequence is well-defined from the conditions of the theorem. Now define the switching function $\sigma_n(t) := (m_n(t), G(p)s_n(t))$ where $s_n$ and $m_n$ provide a solution of the re-scaled Euler-Lagrange equations given by the pair (10) and (12) when the function $A(t)$ in those equations is given by the optimal control $A^*_{T_n}$. (The rescaling alluded to changes the time interval of the problem from $[0, T]$ to $[0, 1]$ and so this will be assumed in the remainder of the proof.) As $T_n \to 0$ in the Euler-Lagrange equations (10) and (12), the associated solutions $(s_n, m_n)$ with control $A_n := A^*_{T_n}$ satisfy $s_n \to s_0$ and $m_n \to (1 - t)w$ as $n \to \infty$, where the convergence is strong in $W^{1,\infty}(0, 1)$, as can be seen by bootstrapping on the assumption of the existence of the a-priori bound $\|s_n\| \le \mathcal{C}(p, T, s_0)$. Thus, the corresponding sequence of switching functions (as given in (7) but now with $m(t)/T$ replacing $\mu(t)$) satisfies $\sigma_n(t) \to (1 - t)(w, G(p)s_0)$ strongly in $W^{1,\infty}(0, 1)$ as $n \to \infty$. However, the affine function of $t$, $(1 - t)(w, G(p)s_0)$ defined for $0 \le t \le 1$, is non-zero on $[0, 1)$ by assumption and has a transverse zero at $t = 1$. As a result, by the properties of uniform convergence, there is a sequence $\tau_n$ converging to 1 from below such that for all large enough $n$ the function $\sigma_n(t)$ is non-zero in $[0, \tau_n)$. Now let $A^0_n(t)$ denote any measurable function bounded below by 0 and above by $C$. From (6) the optimal control $A_n$ has the form
$A_n(t) = \begin{cases} C & : \sigma_n(t) > 0, \\ 0 & : \sigma_n(t) < 0, \\ A^0_n(t) & : \sigma_n(t) = 0, \end{cases}$
for each $n$, and it follows for sufficiently large $n$ that $A_n$ has the form
$A_n(t) = \begin{cases} C & : 0 \le t \le \tau_n, \\ A^0_n(t) & : \tau_n < t \le 1, \end{cases}$
if we assume that $(w, G(p)s_0) > 0$. If, on the other hand, $(w, G(p)s_0) < 0$, then
$A_n(t) = \begin{cases} 0 & : 0 \le t \le \tau_n, \\ A^0_n(t) & : \tau_n < t \le 1, \end{cases}$
completing the proof.
Applying Theorem 3 to Problems 1 and 2 gives the following natural condition on the form of the optimal controls. From equation (16), in the case of Problem 1 condition (8) can be written $h(y_a(0) - y_b(0)) \neq 0$, whereas in the case of Problem 2 the corresponding form of this abstract condition follows from equation (20). From an epidemiological perspective, the abstract condition (8) has a very simple and practical interpretation: if resistance to one of the antibiotics is greater than to the other, do not use that antibiotic.
We now ask what happens when we take the idea hinted at in the previous paragraph, of deploying only one antibiotic when the situation demands (for example, use only drug 2 if $R_1(t) > R_2(t)$), and extrapolate it as a deployment rule into the future. While this protocol will not usually produce an optimal policy, in the next section we show that it can produce effective rotational protocols that are superior to antibiotic mixing. The control strategies that we deploy to combat the evolution of resistance in (1) and (2), as motivated by the above analysis, are therefore defined as the following feedback control laws:
Rule 1: in Problem 1 continue with the present antibiotic, but if $y_a(t) > y_b(t)$ then switch to antibiotic B, and if $y_b(t) > y_a(t)$ then switch to A.
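As a minimal sketch, Rule 1 can be written as a decision function. The function name, the string labels "A" and "B", and the reading of y a and y b as the observed resistance levels to antibiotics A and B (as the interpretation of condition (8) suggests) are assumptions for illustration only.

```python
def rule_1(current_drug, y_a, y_b):
    """Return the antibiotic to deploy next under Rule 1.

    current_drug: "A" or "B", the antibiotic presently in use.
    y_a, y_b: observed values of y_a(t) and y_b(t) at the decision time.
    """
    if y_a > y_b:
        return "B"  # y_a dominates, so switch to (or stay on) antibiotic B
    if y_b > y_a:
        return "A"  # y_b dominates, so switch to (or stay on) antibiotic A
    return current_drug  # tie: continue with the present antibiotic


# Hypothetical observations at three decision times.
decisions = [rule_1("A", 0.4, 0.1), rule_1("B", 0.1, 0.4), rule_1("B", 0.2, 0.2)]
```

The tie case keeps the present antibiotic, matching the phrase "continue with the present antibiotic" in the rule.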
One further concept needed to complete the definition of the feedback controls is the idea of a sample time. The variable t in Rule 1 and Rule 2 may refer to all instants of time, or t could be a sample time, whereby the control decision is taken periodically or at some other prescribed instants. In the numerical examples of the next section we take the latter approach because of its practical relevance to managing antibiotic use in hospitals, and ask: how often must the system be sampled so that the feedback rules outperform antibiotic mixing? Put differently, how much information do we need so that a protocol that exploits that information outperforms protocols founded on no information at all, like cycling and mixing?

Rotation Outperforms Mixing: Numerical Examples
The first numerical example, illustrated in Figure 2, provides a comparison of equation (1) for symmetric and asymmetric parameter sets, where optimal mixing is compared with a sequence of cycling protocols. In the symmetric case of Figure 2(a), where 50-50 mixing provides the optimal mixing protocol, the protocols that cycle between the two antibiotics are inferior to optimal mixing; note that the optimal protocol itself is not known for these parameters, so this figure is a comparison of several sub-optimal protocols. In Figure 2(b), where asymmetric parameters are used (the values in $(p^{(2)}, s_0)$) and 50-50 mixing performs poorly as a result, a range of cycling protocols biased to one of the drugs outperform optimal mixing provided each cycle occurs sufficiently quickly.

Figure 2. Two different parameter sets, one symmetric and one asymmetric, are used in Problem 1 to compute the response to the cycling protocols shown in the right-hand column and mixing protocols: in (a) the symmetric parameter values are taken from [2] but in (b) we used the asymmetric set $(p^{(2)}, s^{(2)}_0)$ defined in this paper, taking T = 50 in both cases. The (red) mixing and (black/solid) cycling lines in the two figures illustrate that cycling protocols may be outperformed by the optimal mixing protocol and vice versa (the symmetric case (a) and the asymmetric case (b), respectively). The dashed lines in (b) are a reproduction of the data from (a); the cycling protocols used in (b) are biased towards more frequent use of one of the drugs whereas the cycling protocols in (a) may be described as 50-50. The purpose of this computation is to show that cycling and mixing protocols cannot be compared in any definitive sense: cycling can beat mixing and vice versa, and the precise nature of the comparison depends on the structure of the cycling itself and on the numerical parameters used in the mathematical model.
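The distinction between mixing protocols and 50-50 or biased cycling protocols can be made concrete with a small sketch. It assumes, for illustration only, that a protocol is described by the fraction of treatments using drug A at time t; the names `mixing_protocol`, `cycling_protocol`, `period` and `bias` are hypothetical and are not taken from the paper.

```python
def mixing_protocol(fraction_a):
    """Constant protocol: a fixed fraction of treatments uses drug A at all times."""
    return lambda t: fraction_a


def cycling_protocol(period, bias=0.5):
    """Periodic protocol: drug A only for the first `bias` share of each period,
    drug B only for the remainder. bias=0.5 corresponds to 50-50 cycling."""
    def f_a(t):
        phase = (t % period) / period
        return 1.0 if phase < bias else 0.0
    return f_a


def time_average(protocol, T, steps=10000):
    """Midpoint-rule average of a protocol over [0, T]."""
    dt = T / steps
    return sum(protocol((k + 0.5) * dt) for k in range(steps)) * dt / T


T = 50.0
avg_mixing = time_average(mixing_protocol(0.7), T)
avg_cycling = time_average(cycling_protocol(period=10.0, bias=0.7), T)
```

Both protocols above use drug A for the same share of the time on average, yet one is constant and the other switches between the extremes; how they compare in performance depends on the model and its parameters.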
Figure 3 shows the result of a numerical computation that deploys an optimisation algorithm to determine the best rotational protocols, where the asymmetric parameter set $(p^{(1)}, s_0)$ has been used to parameterise the model (1). While both antibiotic rotation protocols outperform optimal mixing, if only by a relatively small amount (less than 1% difference), the dynamics of antibiotic rotation, shown as black lines, exhibit spikes whereby drug resistance can increase sharply after the introduction of a new antibiotic regime. Nevertheless, it is with rotational protocols, and not with mixing protocols, that we can minimise the performance measure defined in [3]. Figure 4 shows one outcome of applying Rule 1 to Problem 1 using the same asymmetric parameter values as Figure 3, where it is evident that the rule-based control measure is superior to optimal mixing even though the rule only implements seven switches of antibiotic. Figure 5 is an analogous computation that implements Rule 2 on Problem 2 using parameters $(p^{(2)}, s_0)$. Here too, the rule-based controller produces rotational protocols that outperform optimal mixing.

Discussion
This paper demonstrates that antibiotic mixing can optimally reduce the prevalence of drug-resistant pathogens in existing mathematical models of antibiotic use only when symmetries are present in the model; if those symmetries are broken, antibiotic rotation is optimal. While numerical optimisation techniques can be used to determine effective rotational protocols for specific model instances defined in (1) and (2), of greater practical importance are the rule-based feedback controllers that invoke an exchange of antibiotics when resistance to the present one is observed to be high. While such simple rules cannot produce optimal deployment policies, they can reduce the incidence of infection below what is possible with mixing protocols. Moreover, an important robustness property follows from linearity of models (1) and (2) with respect to their control variables, $f_a$ and $A$ respectively. This property ensures that all rotational protocols sufficiently close to the true, and usually unknown, optimal control will perform nearly as well as the optimum, providing a degree of protection against errors in the implementation of the optimal policy. Finally, in Figure 6 both Rules 1 and 2 have been applied to equations (1) and (2) in the search for suboptimal rotational protocols that outperform antibiotic mixing.

Figure. The running treatment objective (the function $R_1(t) + R_2(t)$) obtained using 50-50 mixing (shown in red, with treatment objective equal to 15), optimal mixing (blue, treatment objective close to 11.7) and the rule-based feedback (black, treatment objective close to 11.61).

Figure 6. Comparing the performance of optimal mixing (blue) and 50-50 mixing (red) with Rule 1 and Rule 2 (boxes). The filled boxes illustrate the number of sampling points for which the rule-based controllers outperform optimal mixing. The asymmetric parameter sets used for these simulations are defined in the text and T = 50 for both models (1) and (2).
With a time parameter T of fifty units, no more than N switches of antibiotic were allowed in any given simulation, and the dynamical systems (1) and (2) were sampled T/N time units apart to decide which antibiotic would be used until the next sample. The sampling parameter N is shown along the horizontal axis in Figure 6, where it is labelled as sampling points, and both diagrams in the figure show that the performance of these rule-based controls (as plotted on the vertical axis) improves dramatically with increasing N, although not monotonically. In both cases a value of N is reached above which the feedback rules Rule 1 and Rule 2 outperform optimal mixing. We deduce from this computation that there are infinitely many alternating protocols superior to optimal mixing.
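The sampling scheme described above (N decisions taken T/N time units apart, with the chosen antibiotic held until the next sample) can be sketched as follows. The dynamics here are a hypothetical toy system, not models (1) or (2): resistance to the drug in use is assumed to grow logistically and resistance to the unused drug to decay. All parameter names and values are illustrative assumptions, and no conclusion about the paper's results should be drawn from the numbers this toy produces.

```python
def simulate_sampled_rule(T=50.0, N=10, r0=(0.3, 0.1),
                          growth=0.2, decay=0.1, steps_per_sample=200):
    """Toy sampled-feedback simulation (NOT models (1) or (2) of the paper).

    At each of N sample times, T / N apart, the rule picks drug 2 if r1 > r2,
    drug 1 if r2 > r1, and otherwise keeps the current drug; the choice is
    held until the next sample. Returns (objective, switches, schedule), where
    objective approximates the integral of r1 + r2 over [0, T].
    """
    r1, r2 = r0
    drug = 1  # hypothetical initial antibiotic
    dt = (T / N) / steps_per_sample
    objective, switches, schedule = 0.0, 0, []
    for _ in range(N):
        # Sample the state and decide which drug to hold until the next sample.
        if r1 > r2:
            new_drug = 2
        elif r2 > r1:
            new_drug = 1
        else:
            new_drug = drug
        if new_drug != drug:
            switches += 1
        drug = new_drug
        schedule.append(drug)
        # Forward Euler integration of the toy dynamics between samples.
        for _ in range(steps_per_sample):
            if drug == 1:
                dr1, dr2 = growth * r1 * (1.0 - r1), -decay * r2
            else:
                dr1, dr2 = -decay * r1, growth * r2 * (1.0 - r2)
            objective += (r1 + r2) * dt
            r1 += dr1 * dt
            r2 += dr2 * dt
    return objective, switches, schedule


objective, switches, schedule = simulate_sampled_rule(T=50.0, N=10)
```

Because one decision is taken per sample, the number of switches can never exceed N, which mirrors the cap on switches used in the simulations.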