Ribo-attenuators: novel elements for reliable and modular riboswitch engineering

Riboswitches are structural genetic regulatory elements that directly couple the sensing of small molecules to gene expression. They have considerable potential for applications throughout synthetic biology and bio-manufacturing, as they are able to sense a wide range of small molecules and regulate gene expression in response. Despite over a decade of research, they have yet to reach this potential because they cannot yet be treated as modular components. This is due to several limitations, including sensitivity to changes in genetic context, low tunability, and variability in performance. To overcome these difficulties, we have designed a novel genetic element called a ribo-attenuator and introduced it in bacteria. This genetic element allows for predictable tuning, insulation from contextual changes, and a reduction in expression variation. Ribo-attenuators allow riboswitches to be treated as truly modular and tunable components, thus increasing their reliability for a wide range of applications.


Models
We consider two models: the one-component system (a riboswitch), and the two-component system (a riboswitch with attached ribo-attenuator).

One-Component System
Here the RNA has a single riboswitch covering the RBS for GFP. Ribosomes bind to the RNA at one of two possible rates, depending on the state of the riboswitch covering the RBS. If the riboswitch is in the OFF position, GFP is translated at a rate $\lambda_{\mathrm{OFF}}$. If the riboswitch is in the ON position, GFP is translated at a rate $\lambda_{\mathrm{ON}} > \lambda_{\mathrm{OFF}}$. GFP also degrades at a constant rate $\delta$. The riboswitch randomly flips between OFF and ON, depending on the inducer concentration $[I]$, such that

$$\mathrm{OFF} \; \underset{k_-}{\overset{k_+([I])}{\rightleftharpoons}} \; \mathrm{ON}, \tag{1}$$

where $k_+([I])$ is assumed to be an increasing function of $[I]$. This is a random walk, where increasing inducer concentrations correspond to a bias towards the ON state.
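
As an illustration of the one-component model, the following is a minimal Gillespie-style sketch in Python. It is not the simulation used for the paper. It assumes that each GFP molecule degrades independently at rate δ, and the function name, argument names, and parameter values are hypothetical.

```python
import random


def simulate_one_component(k_plus, k_minus, lam_on, lam_off, delta,
                           t_end, seed=0):
    """Simulate the switch and GFP count up to time t_end.

    Returns (fraction of time spent ON, time-averaged GFP count).
    """
    rng = random.Random(seed)
    t = 0.0
    switch_on = False
    gfp = 0
    time_on = 0.0   # total time spent in the ON state
    gfp_time = 0.0  # integral of the GFP count over time

    while t < t_end:
        flip = k_minus if switch_on else k_plus       # switch flips
        produce = lam_on if switch_on else lam_off    # GFP is translated
        degrade = delta * gfp  # assumption: per-molecule degradation
        total = flip + produce + degrade

        wait = rng.expovariate(total)
        dt = min(wait, t_end - t)
        if switch_on:
            time_on += dt
        gfp_time += gfp * dt
        t += dt
        if wait > dt:
            break  # reached t_end before the next event

        # Choose which event fires, in proportion to its rate.
        u = rng.random() * total
        if u < flip:
            switch_on = not switch_on
        elif u < flip + produce:
            gfp += 1
        else:
            gfp -= 1

    return time_on / t_end, gfp_time / t_end


if __name__ == "__main__":
    frac_on, mean_gfp = simulate_one_component(
        k_plus=2.0, k_minus=1.0, lam_on=10.0, lam_off=1.0,
        delta=0.5, t_end=5000.0, seed=1)
    print(frac_on, mean_gfp)
```

Over a long run, the fraction of time spent ON should approach k+/(k+ + k−), so raising k+ (a higher inducer concentration) biases the switch towards ON.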

Two-Component System
Here there are two switches on the RNA: the original riboswitch, and the attenuator. There are four possible states of the two switches: $(\mathrm{OFF}, \mathrm{OFF})$, $(\mathrm{OFF}, \mathrm{ON})$, $(\mathrm{ON}, \mathrm{OFF})$, and $(\mathrm{ON}, \mathrm{ON})$. In each of these states, the first ON or OFF corresponds to the state of the first switch, while the second ON or OFF corresponds to the state of the second switch. The intuition behind this system is that a ribosome binds to the RBS downstream of the first switch at the rate $\lambda_{\mathrm{OFF}}$ or $\lambda_{\mathrm{ON}}$ defined above, depending on its state. The ribosome then opens up the downstream switch. Ribosomes also bind to another RBS downstream of the first, and translate GFP at two rates $\mu_{\mathrm{OFF}}$ and $\mu_{\mathrm{ON}} > \mu_{\mathrm{OFF}}$, now depending on the state of the second switch. Again, GFP degrades at a rate $\delta$.
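The possible transitions between the four states of the two switches are:

$$
\begin{aligned}
(\mathrm{OFF}, s) \; &\underset{k_-}{\overset{k_+}{\rightleftharpoons}} \; (\mathrm{ON}, s), \qquad s \in \{\mathrm{OFF}, \mathrm{ON}\},\\
(\mathrm{OFF}, \mathrm{OFF}) \; &\underset{m_-(L)}{\overset{\lambda_{\mathrm{OFF}}}{\rightleftharpoons}} \; (\mathrm{OFF}, \mathrm{ON}),\\
(\mathrm{ON}, \mathrm{OFF}) \; &\underset{m_-(L)}{\overset{\lambda_{\mathrm{ON}}}{\rightleftharpoons}} \; (\mathrm{ON}, \mathrm{ON}).
\end{aligned} \tag{2}
$$

Here $k_\pm$ are as in the previous model, describing the dynamics of the upstream switch. There are two rates at which the downstream region switches on: $\lambda_{\mathrm{OFF}}$ and $\lambda_{\mathrm{ON}}$, depending on the upstream switch's state. The downstream switch spontaneously switches off at a rate $m_-(L)$ that we assume is an increasing function of $L$, the length of the attenuator region.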
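To emphasise: the two-component system has production rates of GFP of $\mu_{\mathrm{OFF}}$ in the states $(\mathrm{OFF}, \mathrm{OFF})$ and $(\mathrm{ON}, \mathrm{OFF})$, and $\mu_{\mathrm{ON}} > \mu_{\mathrm{OFF}}$ in the states $(\mathrm{OFF}, \mathrm{ON})$ and $(\mathrm{ON}, \mathrm{ON})$.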
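
The four-state switching process can be written as a transition-rate matrix, and its stationary distribution found by solving a small linear system. The sketch below is illustrative only. It encodes the transitions as we read them from the description above: the upstream switch flips at k+ and k−, the downstream switch turns on at λ_OFF or λ_ON, and it turns off at m−. All function names and parameter values are hypothetical.

```python
# State order: (upstream switch, downstream switch).
STATES = [("OFF", "OFF"), ("OFF", "ON"), ("ON", "OFF"), ("ON", "ON")]


def generator(k_plus, k_minus, lam_on, lam_off, m_minus):
    """Return the 4x4 rate matrix Q, with Q[i][j] the rate from state i to j."""
    idx = {s: i for i, s in enumerate(STATES)}
    Q = [[0.0] * 4 for _ in range(4)]
    for up, down in STATES:
        i = idx[(up, down)]
        # The upstream switch flips independently of the downstream one.
        if up == "OFF":
            Q[i][idx[("ON", down)]] = k_plus
        else:
            Q[i][idx[("OFF", down)]] = k_minus
        # The downstream switch turns on at a rate set by the upstream state,
        # and turns off spontaneously at rate m_minus.
        if down == "OFF":
            Q[i][idx[(up, "ON")]] = lam_on if up == "ON" else lam_off
        else:
            Q[i][idx[(up, "OFF")]] = m_minus
        Q[i][i] = -sum(Q[i])  # diagonal makes each row sum to zero
    return Q


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def stationary(Q):
    """Stationary distribution pi satisfying pi Q = 0 and sum(pi) = 1."""
    n = len(Q)
    A = [[Q[j][i] for j in range(n)] for i in range(n)]  # transpose of Q
    A[-1] = [1.0] * n  # replace one balance equation by the normalisation
    b = [0.0] * (n - 1) + [1.0]
    return solve(A, b)


if __name__ == "__main__":
    pi = stationary(generator(k_plus=2.0, k_minus=1.0,
                              lam_on=5.0, lam_off=0.5, m_minus=3.0))
    for state, p in zip(STATES, pi):
        print(state, p)
```

Because the upstream switch does not depend on the downstream one, the two probabilities with the upstream switch ON should sum to k+/(k+ + k−), which gives a quick consistency check on the matrix.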

Analysis
Our analysis of these random processes focused on a single random variable. We defined $T$ to be the length of time elapsing from an arbitrary time $\tau$ until the next GFP molecule was produced. We assumed that both systems had reached stationarity, so that this was well-defined independently of $\tau$. Importantly, this random variable is not equal to the time elapsed between two consecutive GFP production events: instead, the start time $\tau$ is arbitrary. In the main text, the inverse $1/T$ of this random variable is what we call the expression rate.

One-Component System
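To find the CDF of $T$, we condition on the state of the RNA switch, so that

$$P(T \le t) = P(T \le t \mid \mathrm{ON})\,P(\mathrm{ON}) + P(T \le t \mid \mathrm{OFF})\,P(\mathrm{OFF}).$$

It is simple to see that, at stationarity, we have $P(\mathrm{ON}) = k_+/(k_+ + k_-)$ and $P(\mathrm{OFF}) = k_-/(k_+ + k_-)$, where the dependence on $[I]$ is dropped from the notation for now. We define two conditional CDFs

$$F_{\mathrm{ON}}(t) := P(T \le t \mid \mathrm{ON}), \qquad F_{\mathrm{OFF}}(t) := P(T \le t \mid \mathrm{OFF}).$$

Assume that the switch is ON at the arbitrary start time $\tau$, and consider the auxiliary random variable $X$, which is the waiting time until the switch switches OFF. Conditioning on the value of $X \ge 0$, the conditional CDF $F_{\mathrm{ON}}(t)$ satisfies the equation

$$F_{\mathrm{ON}}(t) = \int_0^t k_- e^{-k_- x}\left[1 - e^{-\lambda_{\mathrm{ON}} x} + e^{-\lambda_{\mathrm{ON}} x} F_{\mathrm{OFF}}(t - x)\right] dx + e^{-k_- t}\left(1 - e^{-\lambda_{\mathrm{ON}} t}\right).$$

We multiply this equation by $e^{(\lambda_{\mathrm{ON}} + k_-)t}$, and take the derivative of both sides with respect to $t$. Re-arranging the result, we find that the PDF $f_{\mathrm{ON}} = F_{\mathrm{ON}}'$ satisfies

$$f_{\mathrm{ON}}(t) = \lambda_{\mathrm{ON}} - (\lambda_{\mathrm{ON}} + k_-)\,F_{\mathrm{ON}}(t) + k_-\,F_{\mathrm{OFF}}(t).$$

By the symmetry of our notation (or a similar argument), the PDF of $T$ conditional on the OFF switch state satisfies

$$f_{\mathrm{OFF}}(t) = \lambda_{\mathrm{OFF}} - (\lambda_{\mathrm{OFF}} + k_+)\,F_{\mathrm{OFF}}(t) + k_+\,F_{\mathrm{ON}}(t).$$

Hence, matrix notation gives a first-order, two-dimensional ODE to solve for $F_{\mathrm{ON}}$ and $F_{\mathrm{OFF}}$:

$$\frac{d}{dt}\begin{pmatrix} F_{\mathrm{ON}} \\ F_{\mathrm{OFF}} \end{pmatrix} = \begin{pmatrix} -(\lambda_{\mathrm{ON}} + k_-) & k_- \\ k_+ & -(\lambda_{\mathrm{OFF}} + k_+) \end{pmatrix} \begin{pmatrix} F_{\mathrm{ON}} \\ F_{\mathrm{OFF}} \end{pmatrix} + \begin{pmatrix} \lambda_{\mathrm{ON}} \\ \lambda_{\mathrm{OFF}} \end{pmatrix}.$$

Denote the two negative eigenvalues of the system matrix as $-\omega_1$ and $-\omega_2$, with corresponding eigenvectors $v_1$ and $v_2$. The solution to this ODE that satisfies the boundary conditions $F_{\mathrm{ON}}(0) = F_{\mathrm{OFF}}(0) = 0$ is

$$\begin{pmatrix} F_{\mathrm{ON}}(t) \\ F_{\mathrm{OFF}}(t) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} - \alpha_1 e^{-\omega_1 t} v_1 - \alpha_2 e^{-\omega_2 t} v_2,$$

where $\alpha_1$ and $\alpha_2$ are determined by the eigenvectors such that

$$\alpha_1 v_1 + \alpha_2 v_2 = \begin{pmatrix} 1 & 1 \end{pmatrix}^T.$$

The unconditional CDF of $T$ is therefore

$$P(T \le t) = 1 - a_1 e^{-\omega_1 t} - a_2 e^{-\omega_2 t}$$

for the scalar coefficients

$$a_i = \alpha_i \begin{pmatrix} P(\mathrm{ON}) & P(\mathrm{OFF}) \end{pmatrix} v_i, \qquad i = 1, 2.$$

From this CDF we can calculate quantiles, and also determine the mean of $T$ as

$$\mathbb{E}[T] = \int_0^\infty \bigl(1 - P(T \le t)\bigr)\,dt = \frac{a_1}{\omega_1} + \frac{a_2}{\omega_2}.$$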
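
The eigen-decomposition described above can be carried out in closed form for the two-state case. The Python sketch below is illustrative rather than the code used for the paper. It assumes the system matrix has rows (−(λ_ON + k−), k−) and (k+, −(λ_OFF + k+)), which is our reading of the derivation, and it requires k+ and k− to be strictly positive. The function and variable names are hypothetical.

```python
import math


def one_component_waiting_time(k_plus, k_minus, lam_on, lam_off):
    """Return (cdf, mean) for the waiting time T in the one-component system.

    cdf is a function t -> P(T <= t); mean is E[T].
    """
    a = lam_on + k_minus   # total exit rate from ON (production or flip)
    d = lam_off + k_plus   # total exit rate from OFF (production or flip)

    # Eigenvalues of the system matrix are -omega_1 and -omega_2.
    root = math.sqrt((a - d) ** 2 + 4.0 * k_plus * k_minus)
    omegas = [(a + d - root) / 2.0, (a + d + root) / 2.0]

    # Eigenvector for -omega, ordered (ON, OFF): (k_minus, a - omega).
    vecs = [(k_minus, a - w) for w in omegas]

    # Choose alphas so that alpha_1 v_1 + alpha_2 v_2 = (1, 1).
    alpha_1 = (1.0 - (a - omegas[1]) / k_minus) / (omegas[1] - omegas[0])
    alpha_2 = 1.0 / k_minus - alpha_1

    # Weight the conditional solutions by the stationary switch probabilities.
    p_on = k_plus / (k_plus + k_minus)
    p_off = 1.0 - p_on
    coeffs = [al * (p_on * v[0] + p_off * v[1])
              for al, v in zip((alpha_1, alpha_2), vecs)]

    def cdf(t):
        return 1.0 - sum(c * math.exp(-w * t) for c, w in zip(coeffs, omegas))

    mean = sum(c / w for c, w in zip(coeffs, omegas))
    return cdf, mean


if __name__ == "__main__":
    cdf, mean = one_component_waiting_time(
        k_plus=2.0, k_minus=1.0, lam_on=10.0, lam_off=1.0)
    print(mean, cdf(0.1), cdf(1.0))
```

As a sanity check, when λ_ON and λ_OFF are equal the switch state no longer matters, and the result reduces to an exponential waiting time with mean 1/λ.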

Two-Component System
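We can now perform a similar analysis on the two-component system, but now need to consider the CDF of $T$ conditional on the four possible states of the RNA. Following the example above, we write

$$P(T \le t) = \sum_{s} P(T \le t \mid s)\,P(s), \qquad s \in \{(\mathrm{OFF},\mathrm{OFF}),\ (\mathrm{OFF},\mathrm{ON}),\ (\mathrm{ON},\mathrm{OFF}),\ (\mathrm{ON},\mathrm{ON})\}.$$

We write the conditional CDFs as $F_{(\mathrm{OFF},\mathrm{OFF})}(t) := P(T \le t \mid (\mathrm{OFF},\mathrm{OFF}))$ and so on. Given that the RNA is in state $(\mathrm{OFF},\mathrm{OFF})$ at the arbitrary start time $\tau$, we now let the auxiliary random variable $X$ denote the time elapsing between $\tau$ and a switch away from $(\mathrm{OFF},\mathrm{OFF})$. The first conditional CDF $F_{(\mathrm{OFF},\mathrm{OFF})}$ then satisfies the equation

$$
\begin{aligned}
F_{(\mathrm{OFF},\mathrm{OFF})}(t) = {} & \int_0^t e^{-(k_+ + \lambda_{\mathrm{OFF}})x}\Bigl[(k_+ + \lambda_{\mathrm{OFF}})\bigl(1 - e^{-\mu_{\mathrm{OFF}} x}\bigr) \\
& \qquad + e^{-\mu_{\mathrm{OFF}} x}\bigl(k_+ F_{(\mathrm{ON},\mathrm{OFF})}(t - x) + \lambda_{\mathrm{OFF}} F_{(\mathrm{OFF},\mathrm{ON})}(t - x)\bigr)\Bigr]\,dx \\
& + e^{-(k_+ + \lambda_{\mathrm{OFF}})t}\bigl(1 - e^{-\mu_{\mathrm{OFF}} t}\bigr).
\end{aligned}
$$

We then multiply both sides by $e^{(\mu_{\mathrm{OFF}} + k_+ + \lambda_{\mathrm{OFF}})t}$ and take derivatives. Re-arranging for the PDF $f_{(\mathrm{OFF},\mathrm{OFF})}$, we find that

$$f_{(\mathrm{OFF},\mathrm{OFF})} = \mu_{\mathrm{OFF}} - (\mu_{\mathrm{OFF}} + k_+ + \lambda_{\mathrm{OFF}})\,F_{(\mathrm{OFF},\mathrm{OFF})} + k_+ F_{(\mathrm{ON},\mathrm{OFF})} + \lambda_{\mathrm{OFF}} F_{(\mathrm{OFF},\mathrm{ON})}.$$

Performing a similar analysis for the other CDFs, we find that

$$
\begin{aligned}
f_{(\mathrm{OFF},\mathrm{ON})} &= \mu_{\mathrm{ON}} - (\mu_{\mathrm{ON}} + k_+ + m_-)\,F_{(\mathrm{OFF},\mathrm{ON})} + k_+ F_{(\mathrm{ON},\mathrm{ON})} + m_- F_{(\mathrm{OFF},\mathrm{OFF})},\\
f_{(\mathrm{ON},\mathrm{OFF})} &= \mu_{\mathrm{OFF}} - (\mu_{\mathrm{OFF}} + k_- + \lambda_{\mathrm{ON}})\,F_{(\mathrm{ON},\mathrm{OFF})} + k_- F_{(\mathrm{OFF},\mathrm{OFF})} + \lambda_{\mathrm{ON}} F_{(\mathrm{ON},\mathrm{ON})},\\
f_{(\mathrm{ON},\mathrm{ON})} &= \mu_{\mathrm{ON}} - (\mu_{\mathrm{ON}} + k_- + m_-)\,F_{(\mathrm{ON},\mathrm{ON})} + k_- F_{(\mathrm{OFF},\mathrm{ON})} + m_- F_{(\mathrm{ON},\mathrm{OFF})}.
\end{aligned}
$$

This is now a four-dimensional first-order ODE, but the analysis mirrors that of the simpler case above. If $-\varphi_i$ and $w_i$, for $i = 1, \dots, 4$, are the negative eigenvalues and corresponding eigenvectors of this system matrix, then the solution of this ODE satisfying the boundary conditions is

$$\begin{pmatrix} F_{(\mathrm{OFF},\mathrm{OFF})}(t) \\ F_{(\mathrm{OFF},\mathrm{ON})}(t) \\ F_{(\mathrm{ON},\mathrm{OFF})}(t) \\ F_{(\mathrm{ON},\mathrm{ON})}(t) \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} - \sum_{i=1}^{4} \beta_i e^{-\varphi_i t} w_i,$$

where the coefficients $\beta_i$ are chosen such that $\sum_{i=1}^{4} \beta_i w_i = \begin{pmatrix} 1 & 1 & 1 & 1 \end{pmatrix}^T$. These conditional CDFs map to the unconditional CDF for $T$ to give

$$P(T \le t) = 1 - \sum_{i=1}^{4} \beta_i \left(\pi^T w_i\right) e^{-\varphi_i t}, \qquad \pi = \begin{pmatrix} P(\mathrm{OFF},\mathrm{OFF}) & P(\mathrm{OFF},\mathrm{ON}) & P(\mathrm{ON},\mathrm{OFF}) & P(\mathrm{ON},\mathrm{ON}) \end{pmatrix}^T,$$

for the probabilities of the stationary distribution of the random walk with transitions given in the model (2) above. Again, $T$ has a mean

$$\mathbb{E}[T] = \sum_{i=1}^{4} \frac{\beta_i\,\pi^T w_i}{\varphi_i}.$$

Supplementary data
Figure S1: Ribo-attenuator secondary structures: Lowest free energy conformations were calculated for each ribo-attenuator using the RNAstructure Fold Web Server [2]. These structures were calculated for the attenuators in isolation, and thus may vary depending on corresponding genetic contexts (Fig. 2), as well as due to limitations inherent in the theoretical prediction of RNA structures. ∆G values are in kcal/mol, and the RBS core is highlighted in blue. For isolated sequences see Table S2.

Figure S2: Additional ribo-attenuator screening data: Response of ribo-attenuators not used in the main paper to varying levels of induction by 2-aminopurine. This experiment was performed using the above ribo-attenuators (which were designed to have hairpins consisting of 9, 12, 21, and 24 repressing bp respectively), as well as Att1 (3 repressing bp), Att2 (4 bp), Att3 (6 bp), Att4 (15 bp), and Att5 (18 bp), for which data are presented in Fig. 4. Att1-5 were selected from the nine candidates to provide a wide range of induction response behaviours. Error bars indicate standard deviation of measurements for biological triplicates.

Figure S3: Repressive effects of ribo-attenuators without riboswitch inclusion: Each attenuator was introduced between the tetracycline promoter and sfGFP to assess its basal expression rate in the absence of an upstream riboswitch. A positive control (RBS only) was created by including only the Common RBS region (Fig. 2, Table S2) without the upstream repressing region. Introduction of the repressing regions resulted in a reduction in expression at the population (A) and single-cell (B) level, though Att1 showed only a minor reduction due to its weak hairpin. Error bars indicate standard deviation of measurements for biological triplicates.

Figure S4: sfGFP size analysis: Immunoblot of sfGFP translated from the Adda riboswitch with the Att1 ribo-attenuator, and as a direct fusion to the first 150 bp of the riboswitch's working ORF (eGFP). The effectiveness of the transcriptionally coupled junction TAATG is observed in the approximately 5 kDa difference between constructs, demonstrating that the attenuator separates the introduced gene of interest from the 150 bp fusion. Immunoblotting was performed as described in the materials and methods using a monoclonal GFP antibody (Clontech).

Figure S5: OmpT targeting analysis: Immunoblot of strep-tagged OmpT expressed from the Adda riboswitch directly, as a fusion with the first 150 bp of the Adda system's working ORF, and with Att2 isolating OmpT from the fusion domain. To assess targeting of OmpT, membranes (Mem) and inclusion bodies (IB) were dissolved separately for the blot, demonstrating that the fusion approach to expression leads to a prevalence of insoluble inclusion bodies, whereas the ribo-attenuator system promotes proper membrane targeting. Whole cell (WC) fractions were included to demonstrate the basal expression of the system.

Table S2: List of ribo-attenuator sequences and predicted structural free energy. For lowest free energy structures see Supplementary Fig. S1.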