Polysomes Bypass a 50-Nucleotide Coding Gap Less Efficiently Than Monosomes Due to Attenuation of a 5′ mRNA Stem–Loop and Enhanced Drop-off

Efficient translational bypassing of a 50-nt non-coding gap in a phage T4 topoisomerase subunit gene (gp60) requires several recoding signals. Here we investigate the function of the mRNA stem–loop 5′ of the take-off codon, as well as the importance of ribosome loading density on the mRNA for efficient bypassing. We show that polysomes are less efficient at mediating bypassing than monosomes, both in vitro and in vivo, because they prevent formation of a stem–loop 5′ of the take-off codon and allow greater peptidyl-tRNA drop-off. A ribosome profiling analysis of phage T4-infected Escherichia coli yielded protected mRNA fragments within the normal size range derived from ribosomes stalled at the take-off codon. However, ribosomes at this position also yielded some 53-nucleotide fragments, 16 nt longer. These were due to protection of the nucleotides that form the 5′ stem–loop. NMR shows that the 5′ stem–loop is highly dynamic. The importance of different nucleotides in the 5′ stem–loop is revealed by mutagenesis studies. These data highlight the significance of the 5′ stem–loop for the 50-nt bypassing and further enhance appreciation of the relevance of the extent of ribosome loading for recoding.


Introduction
The most striking exception known to nonoverlapping sequential triplet decoding is in a phage T4 gene next to that used by Crick et al. [1] to establish the general nature of genetic readout. This nearby gene, gene 60, derives from insertion of a mobile DNA cassette consisting of a homing endonuclease gene and an associated separate 50-nt sequence that provides protection against self-cleavage [2]. The insertion occurred in an ancestral phage T4 topoisomerase-encoding gene. The inserted endonuclease gene split the original gene into two genes. Both genes are functional despite the 3′ gene having a 50-nt insert between codons 46 and 47 of its coding sequence [3]. The insert has stop codons in all frames, suggesting that translation of this sequence would result in a prematurely terminated protein. Studies with plasmid-borne cassettes showed that in Escherichia coli grown on rich media, a substantial proportion of translating ribosomes successfully bypass the 50-nt coding gap to synthesize a single protein from two discontinuous open reading frames (ORFs) [4,5]. Moreover, bypassing can also occur with rare codons in highly expressed genes upon heterologous expression [6], with unassigned codons [7], and even in unstarved cells [8]. Limitation of aminoacyl-tRNA for a "hungry" A-site sense codon induces low-level bypassing [9][10][11]. Interestingly, abundant translational bypassing is productively utilized in mitochondrial decoding in certain yeasts [12,13]. Protein sequence data have shown the absence of amino acids specified by 29 nt within the coding sequence of an adhesion gene of the oral bacterium Prevotella loescheii, and much indirect evidence points to the involvement of translational bypassing [14]. The types of whole (both subunit) ribosome bypassing considered above are very different from that involved in ribosome shunting [15,16].
Gene 60 of bacteriophage T4 is the best-studied example of translational bypassing. The high reading efficiency of these two discontiguous ORFs encoding a single-polypeptide chain is achieved by the presence of a number of recoding signals [17,18]. In WT bypassing, the anticodon of peptidyl-tRNA Gly 2 [19] dissociates from codon 46, GGA, the take-off codon, and re-pairs to mRNA at the matched "landing" codon GGA, 5′ adjacent to the resume codon 47 [4,20]. Nucleotide position numbers counted 3′ from the take-off codon have the prefix "§" (Figure 1). As shown by both single-molecule fluorescence resonance energy transfer (smFRET) studies with zero-mode waveguides, and cryogenic electron microscopy (cryo-EM) studies, when the GGA codon is in the P site, the mRNA in the A site folds into a short stem-loop capped by a UUCG tetraloop (§6-9), termed "A-site SL" (§3-12) [21][22][23]. UUCG has the propensity to form a tetraloop of unusual stability and compactness [24][25][26][27]. Included in the A-site SL is the third nucleotide, G, of the codon, UAG, 3′ adjacent to the take-off GGA, codon 46. Formation of this stem-loop within the A-site prevents access of release factor 1 and tRNAs near-cognate to the UAG [22] and is consistent with earlier results showing WT levels of release factor 1 do not mediate termination at the UAG [4,20].
Formation of the A-site SL and the interactions of the nascent peptide with the walls of the polypeptide exit tunnel (see below) facilitate the formation of the non-canonical rotation state of the ribosome, a hyper-rotated state, that initiates bypassing [21,23]. The A-site SL also serves as a tRNA mimic for the recruitment of EF-G and a pseudotranslocation event that initiates bypassing [23]. Following release of the tRNA anticodon from the take-off codon, the ribosome moves over the 5′ part of the coding gap without anticodon scanning of the mRNA for potential complementarity [21,28]. This initial absence of anticodon scanning explains why peptidyl-tRNA Gly 2 does not recognize the cognate GGG §9-11 triplet in the A-site SL [28]. Subsequently, continued forward progression toward the 3′ end of the coding gap does involve peptidyl-tRNA anticodon scanning of the mRNA for potential complementarity [21]. [Such scanning makes the mechanism distinct from that described for intact 80S ribosomes resuming translation 3′ of stop codons under stress conditions [29].] In addition to the A-site SL, another crucial feature for the bypassing is a nascent peptide sequence encoded 5′ of the take-off site [4,20-22,30,31]. Changes in the nascent peptide starting from the crucial 14 KKYK 17 motif resulted in substantial reduction in bypassing efficiency. In vitro studies that systematically scanned the effect of nascent peptide mutations have shown that mutations of residues 14-30 in the nascent peptide sequence reduce bypassing efficiency by 2- to 20-fold [23,31]. The nascent peptide adopts an α-helical conformation and forms multiple interactions with both rRNA and protein components of the interior of the peptide exit tunnel of the ribosome [22]. Nascent peptide-exit tunnel interactions cause progressive ribosome slowing, as the ribosome approaches the take-off codon, enabling ribosomes to adopt an unusual hyper-rotated conformation prior to bypassing [21,23].
The nascent peptide interaction also helps the ribosome to retain peptidyl-tRNA during bypassing [31,32] and serves to increase the accuracy of peptidyl-tRNA re-pairing to mRNA [20,33].
Toward the end of gene 60 bypassing, peptidyl-tRNA re-pairing at the matched landing codon is influenced by the mini Shine-Dalgarno (SD)-like sequence GAG 6-nt 5′ of the landing codon that can pair to the anti-SD sequence in 16S rRNA [28]. Perhaps significantly, this is flanked by A's [34,35]. Landing is also facilitated by an mRNA structure, here termed a forward slippage barrier, 3′ of the landing codon [31]. Translation resumes at the 3′ adjacent "resume codon" (Figure 1) with binding of aa-tRNA to the ribosome in the rotated state [21,23], after which normal translation continues.
The signals just described conspire to make the initiation of bypassing highly efficient and to overcome the strength of codon:anticodon pairing at the take-off site, which does not affect take-off efficiency [36].
In vivo studies showed that despite the high efficiency of take-off, the overall bypassing efficiency is lower, due to drop-off of peptidyl-tRNA from the ribosome [32]. Quantification in vitro of bypassing efficiency as a function of the gap length indicated that peptidyl-tRNA is progressively lost as the ribosome moves along the non-coding mRNA, but all ribosomes that maintain their P-site tRNA can land [31]. Addition to the 3′ part of the gap of sequence with strong potential to form an SL reduced the amount of product derived from successful bypassing compared to insertion of sequence of the same length without structure potential [31].
A further stem-loop is important for bypassing. It was identified in the course of in vitro studies and, because of its location, named the 5′ SL (Figure 1) [21,31]. The 5′ SL was suggested to form upon exit from the ribosome mRNA tunnel [31], and there is some evidence for its refolding from the cryo-EM structure [22]. However, mutations of the 5′ SL were not noted as being significant in the assays performed to date in E. coli cells. Interestingly, SL elements are found 5′ of the putative recoding sites in several other cases of recoding, for example, in two Streptomyces phages [37] (O'Loughlin S., et al., unpublished). Similar nucleotide features are present in diverse Streptomyces species [17]. Work in E. coli with a model system has shown that a synthetic SL 5′ of a frameshift site can inhibit −1 frameshifting and so "backwards" ribosome movement, but promote +1 frameshifting [38]. The number of nucleotides from the top of the 3′ side of the stem-loop tested to the recode site was much more similar to that of its gene 60 5′ SL counterpart, being only 5 nt less, than the equivalent of the hypothesized Streptomyces phage bypassing 5′ SL mentioned above.
Previous in vitro studies clearly identify the 5′ SL as an important facilitator of bypassing, but the question still remains why its effect has not been evident in in vivo studies performed to date. At one time, it was plausible to consider that progression of a bypassing ribosome through the coding gap involved "pushing" by continued translation of a following ribosome. However, in vitro studies showed efficient bypassing under monosome conditions [21,31], where formation of the 5′ SL may have served to favor movement of the initial part of the coding gap through the ribosome [31]. Nevertheless, the bypassing efficiency may be influenced by level of ribosome loading on mRNA; in in vivo studies to date, forward movement could be generated by continuing translation by a following ribosome possibly substituting for the effect of the 5′ SL. The experiments reported here address the nature of the 5′ SL and its significance as well as the relative efficiency of bypassing during high (polysome) and low ribosome loading conditions.

Results
The major take-off site pause facilitates ribosome profiling detection of 5′ SL sequence
To determine the extent of gene 60 mRNA protected by the ribosome, ribosome profiling of T4 infected E. coli was carried out. Micrococcal nuclease (MNase) was used to cleave mRNA unprotected by the ribosome. Because the A-site SL is formed within bypassing ribosomes [22], an increased length of the relevant ribosome protected mRNA fragment might be expected. Monosomes generated by MNase digestion [39,40] were collected from a sucrose gradient. RNAs recovered from the monosome fraction were size separated by electrophoresis on a polyacrylamide gel. RNAs of 30-40 nt, as well as larger fragments of 45-140 nt, were excised. For only partly understood reasons, in addition to the significant but occasional contribution of interactions involving internal SD sequences [41], protected fragments in bacterial ribosome profiling experiments are often longer and more variable in length than counterparts obtained with larger eukaryotic ribosomes [42].
We performed single-read sequencing for the smaller fragments and paired-end sequencing of the longer fragments. As previously reported [43], the T4 bacteriophage almost completely hijacks the host translational machinery. Of the 30- to 40-nt fragments, over 1.8 million aligned to E. coli genes and 11 million mapped to T4 genes, with 108,000 aligning to gene 60. Around 3.2 million 45- to 145-nt fragments aligned to E. coli genes, 2.4 million of which aligned to tmRNA. One hundred forty-four thousand reads mapped to the genes of T4, with 4311 of these reads aligning to gene 60.
For gene 60, the 30- to 40-nt fragments revealed an extraordinarily high density (200 times greater than the gene average) whose 5′ end was 24-nt 5′ of the start of the coding gap. Based on their 5′ ends, it can be inferred that these protected fragments derive from ribosomes whose P-site was within a few nucleotides 5′ of the take-off site (because of the vagaries of cleavage with the micrococcal nuclease used to generate the protected fragments, P-site location could not be precisely predicted). The great majority of these fragments had 13-nt 3′ of the start of the coding gap (U of the UAG stop codon is designated as 0 and counted as 1) consistent with the P-site being located at, or very near, the take-off site (Figure 2(a), top, and Figure 2(b)). Ribosome queuing was not addressed since only monosome-protected mRNA fragments were analyzed (even with high ribosome numbers, ribosome spacing was apparently not tight enough to prevent cleavage). In contrast, the protected fragments detected at the 3′ end of the coding gap were much less numerous. One interpretation of this is that the subsequent re-pairing and the next decoding event are comparatively quick. However, smFRET studies provided direct contrasting evidence [21], raising the possibility that an alternative event, such as increased ribonuclease (RNase) accessibility to the A-site, may be related to the relative paucity of 3′ protected fragments (see Discussion) (Figure 2(b)).
Approximately 65% of all of the 45- to 145-nt-long fragments aligning to gene 60 mapped to a single position (and no other single position has many reads mapping to it) (Figure 2(a), bottom). While both the 5′ and 3′ ends of the long fragments aligning at this position display slight variability, most are of one length, 53 nt. This fragment consists of 37-nt 5′ and 13-nt 3′ of the (3-nt) take-off site. The 5′ extension incorporates the nucleotide sequence of the 5′ SL (Figure 2(b)). Ribosome profiling of a plasmid-borne gene 60 cassette in E. coli cells uninfected by phage T4 was also performed. The WT gene 60 cassette again yielded just one long protected fragment, similar in size and sequence to that recovered from phage T4 infected cells (−41, +12) (Figure S1).
Figure 2. … Figure 2(b)) is designated as zero (0), and the 5′ and 3′ boundary of the 50-nt coding gap is indicated (blue dashed lines). (b) The position of the ribosome-protected mRNA fragments recovered in the coding gap, and associated regions of gene 60 are indicated. The smaller RPFs (blue overline) extend 13 nt into the coding gap and include 24-nt 5′ of the coding gap. The large RPFs (white text in gray) include 40-nt 5′ of the coding gap. It encompasses nucleotides that form the 5′ stem-loop and also extends 13 nt into the coding gap. If MNase digestion occurs in the A-site, then prior to this cleavage, additional 3′ sequence would have been present in the ribosome (3′ sequence shown without highlighting).
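The fragment lengths quoted in this section can be cross-checked with a few lines of arithmetic. The following is only an illustrative sketch: the variable names are hypothetical, and the numbers are simply those stated in the text (24 nt or 37 nt of 5′ protection, the 3-nt take-off codon, and 13 nt extending into the coding gap).

```python
# Lengths in nucleotides, as quoted in the text; names are illustrative only.
TAKEOFF_CODON = 3             # GGA take-off codon, immediately 5' of the gap
INTO_GAP = 13                 # nt protected 3' of the start of the coding gap
SHORT_5P_OF_GAP = 24          # short RPF: nt 5' of the coding gap
LONG_5P_OF_TAKEOFF = 37       # long RPF: nt 5' of the take-off codon

# The short fragment is measured from the gap boundary, so the take-off
# codon is already included in its 24 nt of 5' sequence.
short_rpf = SHORT_5P_OF_GAP + INTO_GAP
# The long fragment is described relative to the take-off codon itself.
long_rpf = LONG_5P_OF_TAKEOFF + TAKEOFF_CODON + INTO_GAP
extension = long_rpf - short_rpf

print(short_rpf, long_rpf, extension)  # prints: 37 53 16
```

Under these stated lengths, the long fragment carries 16 additional nucleotides on its 5′ side, the region assigned to the 5′ SL.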

5′ SL RNase digestion assays
The secondary structure of the 5′ SL, as deduced from Selective 2′-Hydroxyl Acylation analyzed by Primer Extension (SHAPE), is composed of two helices, a 4-bp bottom helix and 5-bp upper helix, connected by a 4-nt internal loop and capped by a 6-nt terminal loop (Figure 2(b)) [26]. In this study, the structure of the 5′ SL was explored further through direct measurements in order to understand better how this stem-loop could interact with the translational machinery.
RNase digestion can be a powerful tool for probing secondary structure, because RNases are specific for single- or double-stranded RNA. Thus, paired bases and nucleotides in loops are recognized distinctly. The 5′ SLx construct was designed for RNase digestion assays. 5′ SLx extends three bases 3′ and 5′ of the 5′ SL to include the 34-nt sequence from C96 to U129 with a Cy3 fluorescent label on C96 (Figure S2A). After digestion, only those fragments containing C96 would be detectable on a denaturing gel. Two RNases were chosen for these assays: RNase A, which hydrolyzes the phosphodiester bond 3′ of single-stranded pyrimidine nucleotides, and RNase T1, which targets phosphodiester bonds 3′ of single-stranded guanosines. When 1 ng/μl 5′ SLx was incubated with 0.1 ng/μl RNase A, three fragments were resolved by denaturing gel electrophoresis (Figure S2B). These fragments had approximate lengths of 34, 18, and 15 nt, which represented the full-length 5′ SLx, the product of hydrolysis at U113 or U114, and the product of hydrolysis at C110 or U111, respectively. The absence of other fragments indicated that the remaining pyrimidines in the 5′ SL are protected from RNase A and therefore base-paired.
RNase T1 digestion did not result in fragmentation of the 5′ SLx. One major band was present for the 34-nt full-length RNA oligonucleotide, and one minor band at 32 nt represented cleavage after G127 in the 3′ overhang outside of the 5′ SL structure (Figure S2B). Since the 5′ SL SHAPE structure predicted that G121 resides within a loop, digestion 3′ of G121 was expected. Protection from RNase T1 indicated that the 5′ SL adopted a tertiary structure that stacked the internal loop bases and ordered the backbone.
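Because only fragments retaining the Cy3-labeled C96 are visible on the gel, each band length maps directly onto a cleavage position. The sketch below illustrates that bookkeeping under the numbering given in the text (C96 to U129); the function name is hypothetical, and it assumes cleavage 3′ of the named nucleotide, as described for RNase A and RNase T1.

```python
LABEL_POS = 96   # Cy3-labeled 5' nucleotide (C96), from the text
END_POS = 129    # 3' terminal nucleotide of 5' SLx (U129), from the text

def labeled_fragment_length(cleaved_after: int) -> int:
    """Length of the Cy3-bearing 5' fragment when the phosphodiester
    bond 3' of nucleotide `cleaved_after` is hydrolyzed."""
    if not LABEL_POS <= cleaved_after < END_POS:
        raise ValueError("cleavage position must lie within C96..G128")
    return cleaved_after - LABEL_POS + 1

full_length = END_POS - LABEL_POS + 1
print(full_length)  # prints: 34

# Candidate RNase A sites (pyrimidines) and the RNase T1 site (G127).
for name, pos in [("C110", 110), ("U111", 111), ("U113", 113),
                  ("U114", 114), ("G127", 127)]:
    print(name, labeled_fragment_length(pos))
# prints: C110 15, U111 16, U113 18, U114 19, G127 32 (one pair per line)
```

The computed lengths of 15-16 nt and 18-19 nt bracket the approximate 15- and 18-nt bands, which is why each band is attributed to one of two adjacent pyrimidines.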

5′ Stem-loop NMR analysis
To build on the RNase secondary structure probes, we endeavored to solve the solution NMR structure of the 5′ SL. NMR has proven to be well suited for structure determination of oligoribonucleotides [44]. NMR experiments were performed on three RNA constructs: the complete 28-nt 5′ SL sequence, a truncated 5′ SL sequence with the terminal hexaloop and topmost two base pairs replaced by a UUCG tetraloop (5′ SLtrunc), and a 5′ SL sequence with one additional GC base pair at the bottom stem (5′ SLGC) (Figure 3(a)). The 5′ SLtrunc and 5′ SLGC constructs were designed to simplify assignment through signal deconvolution and stabilization of the helical stem, respectively. 5′ SLGC was chosen for isotopic labeling due to its minimal changes from the wild-type sequence and benefits from stabilization of the bottom helix.

Exchangeable ¹H assignment
The secondary structure of the 5′ SL predicted from SHAPE data included three AU pairs, five GC pairs, and one GU pair. Similarly, 1D exchangeable ¹H spectra, which only detected UH3 and GH1 nuclei participating in hydrogen bonds due to base pairing, indicated two AU pairs, five GC pairs, and one GU pair. The missing AU pair corresponding to the terminal AU of the bottom stem was detected from the 5′ SLGC spectra due to addition of the stabilizing terminal GC pair. Assignments of UH3 and GH1 nuclei in the 1D spectra were corroborated by ¹H-¹H nuclear Overhauser enhancement spectroscopy (NOESY) experiments, which served to connect adjacent UH3 and GH1 in the top and bottom stems. Nucleotides not involved in base pairing, such as those in the purine-rich inner loop and the terminal loop, were not detected in these experiments due to rapid imino ¹H exchange in these regions. Thus, the 1D ¹H and NOESY spectra completely supported the SHAPE-predicted secondary structure of the 5′ SL.

Nonexchangeable ¹H assignment
We next assigned nonexchangeable ¹H resonances to model the ribose backbone of the structure. Characteristic A-form RNA helix geometries were leveraged with through-space NOESY experiments to connect ¹H nuclei from adjacent riboses within the top and bottom stems. These connections were rarely found in the flexible internal and terminal loops. NOESY mixing times controlled the maximum internuclear distance that could be detected in the experiment. Mixing times of 50, 150, and 300 ms corresponded to distances of approximately 3, 5, and 6 Å, respectively, so comparisons between NOESY spectra of different mixing times assisted chemical shift assignment. Since the intensity of a NOESY crosspeak is inversely proportional to the sixth power of the internuclear distance at shorter mixing times in the absence of spin diffusion, the internuclear distance was indirectly measured by the NOESY experiments.
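The inverse sixth-power dependence means that a crosspeak intensity can be converted to a distance by comparison with a reference pair of known separation (the structure modeling below calibrates against the pyrimidine H5-H6 crosspeak). A minimal sketch of that conversion follows; the function name is hypothetical, the 2.45 Å reference distance is an assumed typical H5-H6 value rather than a figure from this study, and the relation holds only at short mixing times without spin diffusion.

```python
def noe_distance(intensity: float, ref_intensity: float,
                 ref_distance: float = 2.45) -> float:
    """Estimate an internuclear distance (in angstroms) from a NOESY
    crosspeak intensity, using I proportional to r**-6 and a reference
    crosspeak of known distance.

    ref_distance defaults to 2.45 A, an ASSUMED typical pyrimidine
    H5-H6 separation; it is not a value reported in the text.
    """
    if intensity <= 0 or ref_intensity <= 0:
        raise ValueError("intensities must be positive")
    return ref_distance * (ref_intensity / intensity) ** (1.0 / 6.0)

# Hypothetical intensities, relative to the reference crosspeak (= 1.0).
for rel in (1.0, 0.25, 1.0 / 64.0):
    print(f"relative intensity {rel:.4f} -> {noe_distance(rel, 1.0):.2f} A")
```

Because of the sixth root, a 64-fold drop in intensity only doubles the estimated distance, which is why NOE-derived distances are usually applied as loose ranges rather than exact values.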
Two-dimensional through-bond double quantum filtered (DQF)-correlated spectroscopy (COSY) experiments also delivered conformational information in the form of dihedral angles according to Karplus relationships with ³JHH coupling. Ribose sugar pucker conformation was probed through ³JH1′H2′ couplings. Intense COSY signals corresponded to a C3′-endo sugar pucker common in A-form RNA helices, while weak signals indicated a C2′-endo conformation.
Within the 5′ SL model, the 18 nt of the top and bottom stems had strong ³JH1′H2′ couplings, which were interpreted as C3′-endo ribose conformations.
To aid ¹H resonance assignments, heteronuclear NMR experiments were required, taking advantage of the broader chemical shift dispersion of ¹³C and ¹⁵N resonances. Three-dimensional NOESY-HSQC experiments provided an additional heteronuclear dimension with which to distinguish similar ¹H nuclei. ¹³C chemical shifts distinguished C1′ nuclei from the other ribose carbons, and purine C8 and adenine C2 nuclei were marked by characteristic chemical shifts. ¹⁵N chemical shifts also allowed connections to be made between the ribose and nucleobase by observing through-bond energy transfer across purine N9 and pyrimidine N1 nuclei.

Residual dipolar coupling
Residual dipolar coupling was measured by acquiring ¹H-¹³C HSQC spectra without ¹³C decoupling in the presence and absence of 21 mg/ml Pf1 phage for partial alignment. From a comparison of these spectra, 19 Δ¹JCH residual dipolar couplings were measured.

Structure modeling
Data from the NMR experiments were used in structural modeling in the following ways. One hundred forty-three internuclear distances were calculated from the NOESY crosspeak intensity and calibrated to the pyrimidine H5-H6 crosspeak intensity and known internuclear distance. Eighteen dihedral angle constraints were collected from COSY ³JHH coupling constants. Eight base-planarity constraints were applied from the strong base pairing data of the exchangeable ¹H experiments. The statistics of the modeling constraints are shown in Figure 3(b). The lowest-energy model produced by Xplor-NIH using simulated annealing in the presence of these constraints was refined in a second simulated annealing process that included 19 Δ¹JCH residual dipolar coupling restraints measured from the addition of Pf1 phage magnetic resonance cosolvent. The refinement step generated 100 structural models of the 5′ SL construct, and the lowest-energy structure is shown in Figure 3(c). The lower and upper stems were modeled as right-handed A-form helices with an approximate rotation per base pair of 29°, 12.3 bp per turn, rise per base pair of 3 Å, and C3′-endo ribose sugar pucker. The 4-nt internal loop was modeled as a hinge between the two helices such that the loop bases stack nonuniformly to make the helices blend into one semicontinuous helix. The bend of the hinge varied between structures, as indicated by the poor degree of overlap in the superposition (Figure 3(d)). A non-canonical AG base pair between A104 and G121 was suggested in some structures. The bend followed the right-handedness of the helices, therefore causing the terminal loop to approach the 5′ side of the lower helix. The terminal loop geometry was also extremely heterogeneous, resulting in significant differences between model solutions (Figure 3(d)). Some base stacking was evident for C110, U111, U114, and A115, but nucleobases were often outwardly projected into the solvent.

Comparison of bypassing efficiency with polysomes or monosomes in vivo and in vitro
To test whether the level of ribosomal loading has an effect on bypassing efficiency in vivo, an experiment was performed in E. coli cells with a plasmid-borne cassette that permitted either high or low ribosome loading. The former possessed a strong initiating SD sequence, AGGAGG, while the latter had a weaker SD, AGAUGG, designed to lower ribosome loading. There is a 12-fold difference in product level between the two vectors, each with a gene 60 cassette lacking the 50-nt coding gap (gene 60 Δgap) (Figure S3A). It was previously confirmed from polysome fractions that this strong initiating SD vector has a higher ribosome load per message [45]. With a WT gene 60 cassette in the high ribosome load vector, bypassing efficiency was 26%, whereas in the low ribosome load vector, bypassing efficiency increased to 45% (Figure 4(a) and Figure S3B). Next, a counterpart experiment was performed to see whether one could recapitulate in vitro the ribosome loading effect seen here in vivo. The in vitro translation system [22,23,31] was primed with gene 60 mRNA whose non-bypass product (ORF1 translation product) was 46 amino acids (~5 kDa) and whose full-length bypassing product was 160 amino acids (~17.6 kDa). The set-up for in vitro translation by monosomes involved mixing initiation ribosome complexes (to final concentration 0.016 μM) with the ternary complexes (50 μM). Translation by polysomes was carried out in the same way but in the presence of additional ribosomal subunits, initiation factors and Bpy-Met-tRNA fMet to allow for re-initiation on the same mRNA. Upon translation by a single ribosome, the non-bypass, i.e. ORF1, product accumulates after about 10 s and then decreases, because about 60% of ribosomes bypass and synthesize the bypass product (Figure 4(b) and Figure S3C). A similar accumulation of the ORF1 product is observed for the leading ribosome in the polysome (Figure 4(b) and Figure S3C).
Under conditions of high ribosomal loading (polysome), the bypass product is formed more slowly than in monosome translation (Figure 4(c)), which suggests that even the leading ribosome has difficulty landing. When the leading ribosome leaves the take-off codon, trailing ribosomes in the polysome complete synthesis of the ORF1 product, but cannot complete bypassing, which leads to accumulation of the non-bypass product (Figure 4(b)). Although the level of bypassing product increases in polysomes (Figure 4(c)), the fraction of ribosomes that complete bypassing is smaller than with a monosome (Figure 4(d)). This suggests that fewer ribosomes reach the landing site when they are arranged in polysomes than in monosomes. The directionality of the change seen corresponds to that of the in vivo analysis. Thus, ribosome loading has a clear effect on gene 60 bypassing both in vivo and in vitro.
Next, to determine whether the action of the 5′ SL is attenuated by successive ribosomes in a polysome, we examined mutants of the 5′ SL under the same high/low ribosome conditions in vivo. The four synonymous substitutions in the 5′ side of the stem, used in the initial in vitro work [31] to preclude stem formation, were tested first (Figure 4(e), Fa1). The in vivo bypassing efficiency of the resulting mutant under high- and low-ribosome loading is 25% and 29%, respectively, which is similar to the bypassing efficiency on WT mRNA at high ribosome loading, 28%, but significantly lower than at low ribosome loading, 42% (Figure 4(f)). This experiment led us to conclude that the 5′ SL enhances bypassing efficiency under conditions of low ribosomal density and is consistent with 5′ SL formation being precluded or disrupted by the following ribosome in high ribosome conditions (Figure 4(f)). As the only substantial effect for the 5′ SL was observed with low-ribosome loading (Fa1, 69% of WT), the remaining mutant constructs were assayed under these conditions. Next, secondary-structure disrupting synonymous changes were made separately to the 3′ side of the stem (Figure 4(e), Fa2); bypassing decreased to 49% of WT. However, when compensatory mutations on the 5′ side were made to restore the secondary structure of the 5′ SL (Fa3), bypassing efficiency was only partially restored (64% of WT) (Figure 4(g)).
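The "percent of WT" values used for the mutants are simple ratios of absolute bypassing efficiencies measured under the same loading condition. As an illustrative sketch only (the function and variable names are hypothetical; the efficiencies are those quoted above for WT and Fa1), the 69% figure for Fa1 at low ribosome loading can be reproduced as follows.

```python
def percent_of_reference(value: float, reference: float) -> float:
    """Express one bypassing efficiency as a percentage of another."""
    if reference <= 0:
        raise ValueError("reference efficiency must be positive")
    return 100.0 * value / reference

# Absolute bypassing efficiencies (%) quoted in the text.
wt_low, wt_high = 42.0, 28.0      # WT at low / high ribosome loading
fa1_low, fa1_high = 29.0, 25.0    # Fa1 at low / high ribosome loading

print(f"Fa1 vs WT, low loading:  {percent_of_reference(fa1_low, wt_low):.0f}% of WT")
print(f"Fa1 vs WT, high loading: {percent_of_reference(fa1_high, wt_high):.0f}% of WT")
# First line prints: Fa1 vs WT, low loading:  69% of WT
```

Normalizing within each loading condition makes clear that disrupting the stem costs roughly a third of WT bypassing at low loading, while the same mutant differs much less from WT at high loading.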
To further understand the effect of the 5′ SL in vivo, we introduced mutations in the lower and upper parts of the 5′ SL and measured bypassing under conditions of low ribosome loading (see Materials and Methods). First, non-synonymous mutants designed to preclude alternative Watson-Crick pairing were introduced into the lower and upper sections of both sides of the 5′ SL (Figure 5(a)). The mutant of the 5′ side of the lower part of the stem (Fa4) had an efficiency of 73% of WT, and its 3′ side counterpart (Fa5) had 41% of WT. Combining the 5′ and 3′ side mutants in the lower part of the stem to restore base pairing (Fa6) resulted in a partial rescue (68%). The mutant of the 5′ side of the upper part of the stem (Fa7) caused bypassing to decrease to 57% of WT. A counterpart on the 3′ side of the upper part of the stem (Fa8) reduced bypassing to 42% of WT. A combination of the mutants that restored complementarity (Fa9) did partially restore bypassing levels (62%) (Figure 5(a)). Together with the earlier in vitro study, these in vivo results support the existence and significance of the 5′ SL under conditions of low ribosome loading.
In addition to potential effects of structure formation on bypassing, the encoded amino acid sequence of the 5′ SL may also be relevant. To explore potential amino acid level effects, variants with potential alternative Watson-Crick pairing that might minimize nucleotide pairing level effects were tested. Mutating the 5′ side of the stem, GCG (Ala 34), to CGC (Arg) (100-102 GCG-CGC) (Fa10) reduced bypassing efficiency to 63%. Mutating GAC (Asp 41) and GCA (Ala 42) (123-125 CGC-GCG) in combination on the 3′ side to GAG (Glu) and CGA (Arg) (123-125 CGC-GCG and 100-102 GCG-CGC) (Fa11) reduced bypassing efficiency to 53% of WT (Figure 5(b)). A combination of mutated codons 41 and 42 with mutant codon 34 that restored complementarity at the original location (Fa12) yielded WT levels (Figure 5(b)). A complete restoration in this case clearly indicates that base pairing is important in this region of the 5′ SL, and not the amino acid identity at these positions, consistent with the lack of specific contacts of Ala34, Asp41, and Ala42 with the exit tunnel [22]. Separately mutating GUC (Val 36) to CUG (Leu) (106 U-C, 108 U-G) (Fa13) reduced bypassing efficiency to 61% of WT. Mutating AUG (Met 39) and ACA (Thr 40) to AUC (Ile) and AGA (Arg), respectively (117 G-C, 119 C-G), on the 3′ side of the stem (Fa14) caused a reduction to 56% of WT (Figure 5(b)). A combination of mutated codons 39 and 40 with mutant codon 36 that restored complementarity at the original location (Fa15) yielded 76% of WT, partially rescuing bypassing. A lack of complete restoration could be attributed to the loss of specific interactions of Met39 and Thr40 with the exit tunnel, and in particular the replacement of Thr40 with Arg may be significant [22]. These results are suggestive of a nucleotide effect on bypassing for both the lower and upper segments of the 5′ SL. Disruption of the nucleotide base pairing involved in the formation of the 5′ SL without changing the amino acid sequence (Fa1) leads to a 30% reduction in bypassing efficiency. The other compensatory mutations tested cause an amino acid substitution, but nevertheless restore bypassing efficiency, unless the mutations result in a dramatic amino acid substitution, such as Thr to Arg. We note that the observed 30% reduction in 5′ SL mutants is not equivalent to the 40% reduction of WT bypassing efficiency in polysome conditions, which indicates that 5′ SL unwinding is not the only effect of the trailing ribosome.
Figure 5. Assessment of the nucleotide and amino acid effect of the sequence that forms the 5′ SL. (a) Nonsynonymous changes that prevent alternative Watson-Crick pairing were made to the upper and lower region of the 5′ and 3′ side of the 5′ SL, along with compensatory mutations expected to restore formation of the secondary structure (white text in black). (b) The effect of the encoded amino acids of the 5′ SL was also further explored by testing amino acid variants, which permit alternative Watson-Crick pairing to minimize the nucleotide level effect (black, mutations of the left strand; gray, mutations of the right strand; white text, compensatory mutations).
Separating the upper and lower parts of the 5′SL is a central region consisting of four purines, AA on the 5′ side and GA on the 3′ side (Figure 6(a)). Extrapolating from the NMR data, one model is that the unpaired nucleotides in this central region act as a molecular hinge providing flexibility to the upper stem. Watson-Crick pairing in the central region, connecting the lower and upper parts of the 5′SL into one continuous stem, would impede free movement in the central region. Such pairing was introduced by changing the 5′ side AA to UC, now complementary to the 3′ side (WT GA) (Fa16); the bypassing efficiency remained at WT levels (100%) (Figure 6(a)). On changing the 5′ side AA in the central region to CC and the 3′ side to UU (i.e., all pyrimidines) (Fa17), bypassing was reduced to 60% of WT. The sequence may be important regardless of the relative flexibility of the upper and lower stems, potentially being involved in some kind of interaction with the ribosome. We also tested the sequence significance of the 6-nt terminal loop (CUAUUA) capping the 5′SL. A minimal effect was observed for loop mutants; for example, changing the first C of the loop to an A reduced bypassing to 80% of WT (Fa18, Fa20). For all other loop mutations, no significant reduction in bypassing was observed (Figure 6(b)).

Discussion
In this paper, the role of the 5′ SL was addressed by ribosome profiling, NMR structural analysis, and examining its effect on bypassing under high (polysome) and low (monosome) ribosome loading conditions. An extraordinarily high proportion of the ribosome-protected fragments in ribosome profiling derived from ribosomes stalled at the take-off codon, in agreement with earlier studies that showed ribosome pausing at the take-off site [21,22]. There were essentially two sizes of protected RNA, 37 and 53 nt. The 37-nt fragment is within the upper part of the size range of standard protected fragments found with ribosomes carrying out canonical translation. However, as extrapolated from the protection pattern of ribosomes in canonical rotation states, the 53-nt fragment is unusually long. If 15 nt 5′ of the P site/A site junction is located in the mRNA channel, the 5′ SL top could form outside the ribosome, whereas the 3′ side of the stem would reside within the mRNA exit channel. On this basis, it is surprising that the 5′ nt of the protected 53-nt fragment coincides exactly with the 5′ nt of the 5′ SL. However, formation of the SL involving the 3′ side of the stem is highly unlikely, as it is occluded by the mRNA channel, and given the structure of the channel, a double-stranded SL is too bulky to fit into it. Moreover, even if it were somehow to be accommodated in the mRNA exit channel, then either a triple helix or nuclease penetration deep into the exit channel of the hyper-rotated ribosome would apparently be needed to explain the results. Rather, the observed protection could be explained in two possible ways. One possibility is that the 5′SL top and the mRNA region up to the take-off codon are protected in a canonical way due to formation of the SL and protection by the ribosome, respectively, whereas the 5′ part of the 5′ SL is inaccessible for RNase cleavage due to the hyper-rotated conformation of the ribosome.
Another possibility is that the parts of the 5′SL that are out of the 30S subunit interact with the ribosome or with the upstream elements of the mRNA, thereby preventing the RNase cleavage. This possibility is supported by the cryo-EM analysis of ribosomes stalled at the take-off codon revealing density for the structured mRNA 5′ of the 30S [22]. Earlier SHAPE analysis showed that the 5′ SL can form in the absence of ribosomes [26]. That work showed that the two bases in the 5′SL closing loop and two bases between the 5′ SL and the extended SL were highly accessible to the SHAPE reagent and to Tb3+. These potential cleavage sites in the unstructured regions were not accessible to MNase in the ribosome profiling experiment, possibly owing to the larger size or the slower cleavage rate of MNase compared to the SHAPE reagent or Tb3+. Since the 5′SL is dynamic, as indicated by the NMR, the lifetime of the open conformation accessible for cleavage may be too short for the RNase, but sufficient for the chemical probing reagents.
Paradoxically, although we collected and analyzed fragments of up to 150 nt, we did not detect protection of 23 nt 3′ from the take-off codon. Inclusion of 23 nt would be expected if the A-site SL were present in the great majority of the ribosomes paused at the A-site, and if the ribosome hyper-rotated state at take-off, or dynamic properties of the A-site SL, do not permit MNase to enter the A-site and cleave the A-site SL structure. While the A-site SL was detected both by cryo-EM analysis and smFRET, the former used temperature trapping and the smFRET analysis showed that the A-site SL only briefly formed [21], with recent work providing consistent results [23]. So it remains possible that the protected fragment being considered derives from ribosomes in which the A-site SL had not yet formed. Though the smFRET study showed that landing/coding resumption is a slow process [21], for unknown reasons possibly related to susceptibility of the mRNA in the vacant A-site in the noncanonical rotated ribosome after the peptidyl-tRNA has paired to the landing site codon, we detect no excess of protected fragments containing the landing site. [Of potentially wider interest, we did detect extra-long protected mRNA fragments in a wide variety of other genes but did not investigate these.] The NMR analysis shows that the 5′ SL can adopt a compact structure, but remains highly flexible. The top and bottom helices, although rigid and highly stable individually, are weakened by the existence of a connecting loop. This purine-rich internal loop contributes to flexibility not only through fraying of the two helices but also by acting as a hinge that terminal loop nucleobases into the solvent enables intermolecular base pairing, such as that observed in kissing hairpins. The positioning of the 5′ SL during take-off raises the question of whether the ribosome itself could interact with the terminal loop either through protein-RNA or RNA-RNA interactions.
However, disruption of terminal loop interactions through synonymous codon mutations of the loop sequence did not affect bypassing. Moreover, it is unclear exactly how much of the 5′ SL sequence is free to fold during take-off. The upper helix and the terminal loop alone comprise a modestly stable hairpin. Although mutations designed to restore the 5′ stem-loop do not substantially restore bypassing levels (Figures 4 and 5), one possible contributory explanation for this observation could be altered elongation rates at mutant codons with lower codon usage (Table S4). Given the relative instability of the lower stem due to its shorter sequence and rapid fraying detected by NMR, the primary contribution of the lower 5′ SL structure may be to extend the top aspect of the structure to potential interacting partners. The 5′ SL's highly dynamic nature agrees with previous findings that the density for the 5′SL region is disordered in the cryo-EM study of ribosomes paused at the take-off site [22].
Earlier work suggested that as the mRNA exits the ribosome upon bypassing, formation of the 5′ SL may act as either an initial 'pusher' of forward bypassing, or as a backstop to prevent backward sliding, or both [21,31]. The 5′ top element, which according to the ribosome profiling data can form when the ribosome resides on the take-off codon, is more stable than the lower stem, and would act as an obstacle for ribosome backward movement even in the absence of the lower part of the structure. While we cannot rule out potential significance of its formation for a contrasting "push" function, it is possible that avoidance, or strict limitation of any potential for a "push" function that could influence the likely key timing of formation of the A-site stem-loop, has limited selection for a stronger backstop, which may anyhow be unnecessary.
As a translating ribosome approaches the coding gap, its leading edge may encounter the 5′SL. There is no evidence that a consequence of this affects the ribosome changes that occur after the 5′ part of the specific nascent peptide signal coding sequence has been translated, but the possibility has not been ruled out.
Our finding that monosomes are significantly more efficient than polysomes in mediating gene 60 programmed bypassing, both in vivo and in vitro, contrasts with an initial supposition that following translating ribosomes may selectively enhance forward bypassing by a leading ribosome. Gene 60 bypassing starts at codon 46 and this allows a maximum of three ribosomes trailing 5′ behind a ribosome paused at the take-off site. While the context for gene 60 translation initiation is relatively strong [(U)GAGG-6 nt-AUG], its initiation level is likely influenced by overlapping translation of an upstream coding sequence and the physiological state of the cell at the time of infection [2]. Under polysome conditions, the proximity of the mRNA entrance channel of the closest trailing ribosome behind the leading ribosome has the potential to prevent nucleotides emerging from the leading ribosome from forming the 5′ SL. While this could account for part of the lower bypassing efficiency under polysome than under monosome conditions, it cannot be the sole reason. Under polysome conditions, the number of ribosomes that complete bypassing is lower than in monosome conditions, due to increased drop-off of bypassing peptidyl-tRNA, likely because the 3′SL that facilitates landing [31] does not re-form behind the leading ribosome. It is also possible that the proximity to the bypassing ribosome of its closest trailing ribosome, which is performing continued translation, negatively impacts the pseudotranslocation, e.g., by preventing formation of the hyper-rotated state. The results add to emerging appreciation of the importance of case-specific ribosome loading levels for recoding [45][46][47], as well as elsewhere such as neuronal decoding [48]. This highlights the need, at least in recoding studies, to express cassettes under different loading conditions to avoid missing important stimulatory signals and for ascertaining physiologically relevant efficiencies.
Functional effects of formation of the 5′ SL have formal similarities to one explanation for how a proportion of Cricket Paralysis virus IRES-mediated translation initiation involves a 3′ removed site. Formation of an extended IRES in the ribosomal P site may lead to non-adjacent downstream initiation [49]. Further, a synthetically created upstream SL structure 4 nt 5′ of the SARS coronavirus frameshift site was shown to attenuate frameshifting [50].
It has become common to regard genetic information as not just codon identity dependent on aminoacyl-tRNA and release factor levels and features, but to also recognize relevant mRNA structure and modification as well as specific nascent peptide sequence as important constituents [51]. Tapes to (protein) shapes are no longer considered an accurate descriptor of the process. Nevertheless, the extent to which aspects other than codon identity contribute to T4 gene 60 decoding is still remarkable. It is fortunate for them and us that Crick et al. [1] performed their pioneering work on the adjacent rII gene and not gene 60!

Ribosome profiling
The method for ribosome profiling was described by Ingolia et al. [52] and modified for E. coli [39]. Three 200-ml cultures of E. coli strain MC4100 (OD600 ~0.5) were each infected with bacteriophage T4 at a multiplicity of infection of 10 and harvested at different time points (2.5, 4, or 5 min) post-infection. In parallel, a 200-ml culture (uninfected) was also harvested as a control. Cells were collected by fast filtration and immediately frozen in liquid nitrogen. Cells were lysed by mechanical disruption while frozen to prevent further translation elongation during sample preparation. Clarified lysates, corresponding to 20 OD260 (in 100 mM NH4Cl, 10 mM MgCl2, 20 mM Tris (pH 8.0), 0.4% Triton X-100, 0.1% NP40, 5 mM CaCl2, 100 units/ml RNase-free DNase, and 100 μg/ml chloramphenicol), were treated with micrococcal nuclease (MNase) (60 units/OD260) for 1 h at 25°C. Nuclease digestion was stopped by the addition of EGTA. Lysates were then loaded onto 10%-55% sucrose gradients (gradient buffer: 100 mM NH4Cl, 10 mM MgCl2, 20 mM Tris (pH 8.0), and 100 μg/ml chloramphenicol) and centrifuged at 35,000 rpm for 2.5 h at 4°C. Monosome fractions were collected and ribosome-protected mRNAs were isolated by acid phenol extraction and isopropanol precipitation. Following dephosphorylation, RNAs (20 μg each) were fractionated on 15% TBE-urea gels and the appropriate size ranges, 30-40 or 45-150 nt, were excised. RNAs were eluted from the gel slices, and sequencing libraries were prepared following the standard protocol [52]. In addition, ribosome profiling was also performed on 200-ml cultures of E. coli containing a plasmid-borne cassette with the gp60 sequence. E. coli cells (OD600 ~0.5) were induced for 10 min with IPTG. Following induction, cells were processed and RNA was extracted in the same manner as for the T4-infected E. coli outlined above.

Ribosome profiling data analysis
The adaptor sequence (CTGTAGGCACCATCAATTCGTATGCCGTCTTCTGCTTGAA for the single-end reads and GATCGTCGGACTGTAGAACTCTGAACGTGTAGATCTCGGTGG or TGGAATTCTCGGGTGCCAAGGAACTCCAGTCACTGAC for the paired-end reads) was removed with Cutadapt (DOI:10.14806/ej.17.1.200) with parameters (-n 2 --match-read-wildcards --minimum-length=20). Reads that failed to align to rRNA were aligned to the reference genomes of E. coli sub-strain MG1655 (accession number NC_000913.2) and T4 (NC_000866.4). The single-end reads were mapped with Bowtie [19261174] with parameters (-m 1), i.e., allowing for no ambiguously mapped reads. The paired-end reads associated with the longer fragments were aligned with Bowtie2 [22388286] with parameters (-M 10 -I 0 --no-unal --no-discordant --no-mixed --dovetail). Reads were selected to be within the range of 45 to 150 nt in length; however, the majority of fragments were found to be shorter than this. Reads less than 45 nt were discarded. The plots were produced with the matplotlib library (https://doi.org/10.1109/MCSE.2007.55). Ribosome profiling data are deposited in the NCBI GEO database (GSE146240).
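The length selection described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function names `filter_by_length` and `length_histogram` are hypothetical, and reads are assumed to be plain sequence strings that have already been adaptor-trimmed. Only the 45 and 150 nt bounds come from the text.

```python
from collections import Counter

# Bounds stated in the text for the long-fragment libraries.
MIN_LEN = 45
MAX_LEN = 150


def filter_by_length(reads, min_len=MIN_LEN, max_len=MAX_LEN):
    """Keep reads whose length lies within [min_len, max_len], inclusive.

    `reads` is assumed to be an iterable of trimmed sequence strings
    (hypothetical input format, chosen for illustration only).
    """
    return [r for r in reads if min_len <= len(r) <= max_len]


def length_histogram(reads):
    """Count how many reads occur at each fragment length."""
    return Counter(len(r) for r in reads)


if __name__ == "__main__":
    # Toy reads of 44, 45, 53, 150 and 151 nt.
    toy = ["A" * n for n in (44, 45, 53, 150, 151)]
    kept = filter_by_length(toy)
    print(sorted(length_histogram(kept).items()))
```

A histogram of this kind is one simple way to see whether protected fragments cluster at particular lengths, such as the 53-nt species discussed in the text.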

5′ SL RNase digestion assays
An extended 5′ SL sequence, called 5′ SLx and consisting of 34 nt from C96 through U129 with a Cy3 fluorophore chemically attached to C96, was chemically synthesized by Integrated DNA Technologies. 5′ SLx at 1, 10, and 100 ng/μl was digested by 0.1 ng/μl RNase A (Ambion) or RNase T1 (Ambion) in the presence of 1, 10, and 100 ng/μl yeast RNA, respectively, according to recommended protocols. The reactions were incubated at room temperature for 15 min and resolved on a pre-heated 15% TBE-urea denaturing gel at 30 W for 10 min. Cy3 fluorescence was recorded on a Typhoon imager.

Sample preparation
Natural abundance full-length (5′ SL) and truncated 5′ SL (5′ SLtrunc) RNA constructs chemically synthesized by Integrated DNA Technologies were resuspended individually in RNA Sample Buffer (10 mM sodium phosphate (pH 6.5) with 100 mM NaCl) and purified by size-exclusion chromatography. RNA NMR samples were then exchanged into NMR Sample Buffer (10 mM sodium phosphate (pH 6.5) with 20 mM NaCl) and 10% v/v D2O to final concentrations of 0.8 and 0.7 mM, respectively, for NMR experimentation with exchangeable 1H. Samples were then exchanged into NMR Sample Buffer with 99.99% v/v D2O for nonexchangeable 1H acquisition. Uniformly 13C- and 15N-labeled full-length 5′ SL with added terminal GC pair ([U-13C,15N]-5′ SLGC) was prepared by in vitro transcription with HiScribe T7 High-Yield RNA Synthesis kits (New England Biolabs) using 2 μM DNA oligomer templates and 7.5 mM [U-13C,15N]-rNTPs (Cambridge Isotopes). The RNA transcript was purified by preparative PAGE followed by size-exclusion chromatography into RNA Sample Buffer [44,53]. The sample was exchanged into NMR Sample Buffer with 10% v/v D2O at a concentration of 0.3 mM for exchangeable spectra, and then into NMR Sample Buffer with 99.99% v/v D2O for nonexchangeable spectra. To measure residual dipolar coupling, Pf1 magnetic resonance cosolvent (ASLA Biotech) was added to 5′ SLGC at a final concentration of 21 mg/ml, and the sample was allowed to equilibrate within the magnetic field overnight.

Spectra acquisition and assignment
Unless indicated otherwise, all NMR experiments were performed on an 800-MHz Agilent VNMRS with 5 mm 1H{13C,15N} cryoprobe using established RNAPack experiments [54]. FIDs were processed in VNMRJ 4.2 and exported to MestReNova (Mestrelab Research) and SPARKY [55] programs for analysis and peak assignment. 5′ SLtrunc was analyzed first, and this construct's chemical shifts were used to inform assignment of 5′ SL and 5′ SLGC. For 5′ SL and 5′ SLtrunc, 1D 1H and SSNOESY spectra were acquired on samples in 10% D2O; 1D 1H, TNNOESY, and 1H-13 15N} cryoprobe with and without 13C decoupling and before and after the addition of 21 mg/ml Pf1 to 5′ SLGC. The chemical shifts for the 5′ SLGC construct are deposited (BMRB ID: 28090).

Structure determination
To model the upper helix and tetraloop of the 5′ SLtrunc construct, 65 NOE distance constraints, 10 dihedral angle constraints, and 8 base-planarity constraints from NMR experiments of 5′ SLtrunc were input as parameters into Xplor-NIH 2.44 [58][59][60]. From these parameters, 400 structures were generated from repeated simulated annealing using the RNA-ff1 force field. For the complete 5′ SL sequence model, the same procedure was used incorporating 143 NOE distance constraints, 18 dihedral angle constraints, and 18 base-planarity constraints from NMR experiments of 5′ SL and 5′ SLGC. Two hundred structures were generated, and the 20 structures with the lowest energies were selected. These structures were then refined against 19 Δ1JCH residual dipolar coupling constraints to generate 20 additional structures, from which the two structures with the lowest energies were selected for analysis.

[22,31]. Resulting initiation complexes were purified by centrifugation through a 1.1 M sucrose cushion in the same buffer. The ternary complex EF-Tu-GTP-aminoacyl-tRNA was prepared by incubating EF-Tu (58 μM) with GTP (1 mM), phosphoenolpyruvate (3 mM), and pyruvate kinase (0.1 mg/ml) for 15 min at 37°C, then adding purified total aa-tRNA (about 60 μM) and EF-G (2 μM) and incubating for 1 min at 37°C. In vitro translation by monosomes was started by mixing initiation ribosome complexes (to a final concentration of 0.016 μM) with the ternary complexes (50 μM). Translation was carried out at 37°C for different time intervals from 3 to 1200 s. To form the polysome, translation was carried out in the same way but in the presence of additional 30S ribosomal subunits (0.16 μM; 10-fold over the mRNA); 50S ribosomal subunits (0.24 μM); IF1, IF2, and IF3 (0.24 μM each); and Bpy-[3H]Met-tRNAfMet (0.24 μM) to allow for re-initiation on the same mRNA. Translation products were separated by Tris-Tricine gel electrophoresis.
Fluorescent peptides were detected in gels using Starion IR/FLA-9000 scanner (FujiFilm) and quantified using the Multi Gauge software. Bypassing efficiency was calculated as a ratio of the density corresponding to the bypass (byp) band to the sum of the byp and stop bands.

In vivo bypassing assays
The E. coli strains DH5α and MG1655 were used for plasmid propagation and protein synthesis, respectively. Strains were grown on Luria-Bertani medium (LB) for the gene 60 bypassing assays. Constructs were produced by amplification of complementary oligonucleotides (Integrated DNA Technologies) to produce a full-length sequence containing 5′ XhoI and 3′ BamHI restriction sites. These were cloned into the CRC01 Weak SD vector or the CRC01 Strong SD vector. It was previously confirmed that the CRC01 Strong SD vector displays a higher ribosome load per message from polysome fractions when compared to the CRC01 Weak SD vector [45]. The low ribosome vector used did not restrict loading as completely as another vector whose results are not shown because of artifacts associated with generation of the severe restriction. To distinguish between the stop product and the byp product, the His/Nanoluciferase C-terminal tag is present in the alternative −1 frame relative to the Firefly luciferase/His N-terminal tag. Both the N-terminal and C-terminal tags are in-frame for the Δ gap control. Overnight cultures of strains containing the appropriate plasmid were diluted 1:100 in LB medium. Each culture was grown in triplicate at 37°C at 200 rpm. Once an OD of 0.4-0.5 was reached, the cultures were induced with 0.1 mM IPTG for 1 h at their respective temperatures. After induction, cultures were incubated on ice for 10 min and the OD was noted. Cells were lysed by resuspension in 2× Laemmli sample buffer (based on OD) and were incubated for a further 30 min on ice. Cells were subsequently centrifuged at 4°C for 30 min at 20,000g to remove cell debris. Equivalent amounts of proteins were diluted and boiled for 10 min at 95°C. Proteins were separated by SDS-PAGE and transferred onto nitrocellulose membrane (Protran).
Immunoblots were incubated at 4°C overnight in 5% milk/phosphate-buffered saline-Tween containing a 1:5000 dilution of mouse anti-His conjugated to a fluorescently labeled secondary antibody. Immunoreactive bands were detected on membranes after incubation using a LI-COR Odyssey® Infrared Imaging Scanner (LI-COR Biosciences). The amounts of stop and byp products were quantified by densitometry using Image Studio Lite (LI-COR Biosciences). The bypassing efficiency was determined by taking the amount of byp product as a ratio of the total amount of stop plus byp products. Measurements were tabulated from three technical replicates for each sample, and the means and standard deviations were calculated. Data were represented graphically and statistical analysis was performed (Prism 5).
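The efficiency calculation described above can be sketched in a few lines of Python. This is an illustration only, not the analysis used in this study. The function names and the toy intensities are hypothetical, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def bypassing_efficiency(byp, stop):
    """Return byp / (byp + stop) for one pair of band intensities."""
    total = byp + stop
    if total <= 0:
        raise ValueError("total band intensity must be positive")
    return byp / total


def summarize(replicates):
    """Mean and sample SD of efficiency over (byp, stop) replicate pairs.

    Assumption: sample standard deviation (n - 1 denominator).
    """
    effs = [bypassing_efficiency(b, s) for b, s in replicates]
    return mean(effs), stdev(effs)


def percent_of_wt(mutant_eff, wt_eff):
    """Express a mutant efficiency as a percentage of the WT efficiency."""
    return 100.0 * mutant_eff / wt_eff


if __name__ == "__main__":
    # Toy densitometry values (arbitrary units), three technical replicates.
    wt = [(50.0, 50.0), (52.0, 48.0), (48.0, 52.0)]
    wt_mean, wt_sd = summarize(wt)
    print(f"WT efficiency: {wt_mean:.3f} +/- {wt_sd:.3f}")
```

Expressing mutant values through `percent_of_wt` mirrors how results are reported relative to WT in the text.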