Introduction

Eukaryotic translation is complex, occurring in four phases: initiation, elongation, termination and recycling (Fig. 1). During the initiation phase, the combined effort of more than a dozen eukaryotic initiation factors (eIFs) brings the 40S and 60S ribosomal subunits to the mRNA and positions the ribosome and the initiator methionine tRNA (Met-tRNAiMet) at the start codon1. Initiation is followed by elongation, in which the ribosome actively moves along the mRNA, using tRNAs and eukaryotic elongation factors (eEFs) to synthesize protein2. Termination occurs when the elongating ribosomal complex reaches the stop codon and the stop codon is recognized by eukaryotic release factors (eRFs) eRF1 and eRF3. The peptide, mRNA and tRNA are then released and subunit dissociation is promoted by eRFs and other factors, permitting the released subunits to be recycled and to participate in another round of translation2,3.

Fig. 1: Antiviral responses involving translation and viral RNA-based strategies to manipulate translation.

The translation cycle is generally divided into four phases, depicted here. For clarity, details such as each GTP hydrolysis event, all involved factors and individual steps are not shown; further details can be found in refs1,2,3. Briefly, during canonical cap-dependent eukaryotic translation initiation, mRNA is recognized by the eukaryotic initiation factor (eIF) eIF4F complex, which contains eIF4E, eIF4G and eIF4A. This complex binds the modified nucleotide cap on the 5′ end of the mRNA, resulting in an mRNA activated for translation. A series of intermolecular recognition events leads to recruitment of the 43S complex to this activated mRNA; the 43S complex contains the small (40S) ribosomal subunit, eIFs (eIF3, eIF1, eIF1A and eIF5) and the eIF2–Met-tRNAiMet–GTP ternary complex. Next, the mRNA sequence is scanned in a 5′ to 3′ direction by the ribosomal subunit and associated factors in an ATP hydrolysis-dependent process. During scanning, eIF2-bound GTP is hydrolysed (stimulated by eIF5). The purpose of this scanning is to locate the proper start codon; the most used is an AUG triplet. When a start codon is selected, a codon–anticodon interaction is formed with Met-tRNAiMet in the P site, forming the 48S preinitiation complex. Phosphate is released by eIF2 and conformational changes involving a number of eIFs (eIF2, eIF1A, eIF1 and eIF5B) and a second GTP hydrolysis event on eIF5B lead to the release of most protein factors and the joining of the large (60S) ribosome subunit, creating an elongation-competent 80S ribosome. During elongation, codons are read by aminoacylated tRNAs delivered by the eukaryotic elongation factor (eEF) eEF1A in a GTP hydrolysis-dependent process. As tRNAs decode the message and enter the ribosome, they deliver their cognate amino acid to the growing polypeptide chain. Formation of each peptide bond is followed by GTP hydrolysis-dependent translocation by eEF2 and delivery of the next tRNA. 
Once a peptide chain has been made, the ribosome must terminate protein synthesis, release the protein and allow the ribosome to be used again (recycling). Once a stop codon (UAA, UGA or UAG) enters the A site, it is recognized by eukaryotic release factors (eRFs). The action of the eRFs along with other factors, including ATP-binding cassette sub-family E member 1 (ABCE1) and ligatin, leads to release of the peptide, subunit dissociation, tRNA release and ribosome recycling. During recycling, protein factors needed for the next round of translation are loaded back onto the ribosomal subunits; these include the proteins that make up the multifactor complex (MFC). Most phases of translation can be regulated, but two regulatory steps are noteworthy owing to their effect during viral infection (shown in purple boxes). The first is interruption of mRNA recruitment through the cap, primarily through the inactivation of eIF4E by hypophosphorylation of the factor or by its sequestration by eIF4E-binding proteins 1 and 2. The second is inhibition of initiator tRNA delivery through phosphorylation of the α-subunit of eIF2. This phosphorylation prevents exchange of GDP for GTP on the factor; thus, eIF2 cannot be used to deliver initiator tRNA. Specific kinases phosphorylate eIF2α in response to stresses induced by many viral infections, the most common being sensing of double-stranded RNA viral replication intermediates or endoplasmic reticulum stress caused by viral replication complexes4. Viruses use RNA to interact with and exploit the translation process at many steps; examples that are discussed in this Review are shown in yellow boxes. CITE, cap-independent translation element; IRES, internal ribosome entry site.

The complexity of the eukaryotic translation cycle provides for precise regulation but also allows eukaryotic viruses to exploit or manipulate the process. To counter this, cells have evolved mechanisms for detecting viral infection and then altering the translation capacity of the cell. Specifically, cells can recognize pathogen-associated molecular patterns (PAMPs; for example, double-stranded RNA or 5ʹ triphosphorylated RNAs) or they can be stimulated by outside signals (for example, interferons), resulting in the activation of the innate immune response, which can lead to a programmed decrease in translational capacity aimed at limiting the production of viral proteins4. This decrease in translation is primarily achieved by inhibiting translation initiation, either by interfering with the delivery of Met-tRNAiMet through phosphorylation of eIF2α or the recruitment of ribosomes to mRNAs by sequestering eIF4E. In turn, viruses have evolved ways to overcome or even exploit these antiviral defences to promote viral protein synthesis. Many eukaryotic viruses employ protein-based mechanisms that manipulate the cellular translation machinery or subvert antiviral responses5,6 (Box 1), but many other mechanisms depend on structured viral RNA elements. As the genomes of positive-sense RNA viruses are the templates for translation, these genomes are a rich repository of functional RNA elements that manipulate the cellular translation machinery, even before viral proteins accumulate. However, RNA elements that manipulate the translation machinery are found in diverse eukaryotic viruses and throughout viral genomes, highlighting their importance for viral infection.

The number and variety of viral RNA-based strategies used to manipulate translation (reviewed in ref.7) is too great to comprehensively discuss all of them and their detailed mechanisms in this Review. Thus, we refer the reader to other, more focused reviews where necessary. In this Review, we focus on illustrative examples that conceptually reveal the diversity of RNA structure-based mechanisms for manipulating translation. The Review is organized to follow the phases of translation, starting with initiation at the 5′ end and progressing to termination at the 3′ end of viral RNAs. We first describe examples of how viruses use RNA-based strategies to exploit or mimic canonical translation initiation processes, then present examples of viral RNAs that drive non-canonical modes of initiation. Next, we discuss how elongation and termination processes can be manipulated by viral RNAs, then end with some interesting examples of structures in viral 3′ UTRs that enable viruses to interact with the translation machinery in novel ways.

Canonical translation initiation

Modification of the 5ʹ terminus to mimic host mRNAs

Perhaps the most straightforward way for a viral RNA to co-opt the cellular translation machinery is to directly mimic cellular mRNA translation initiation signals. Viruses enzymatically create a cap-like structure at the 5ʹ end of the viral RNA in several ways (reviewed in refs8,9): by using the host capping machinery in the nucleus, as many double-stranded DNA viruses do; by encoding viral capping enzymes that are used in the cytoplasm; and by ‘cap-snatching’, in which the viral RNA polymerase cleaves the capped 5ʹ end of nascent cellular mRNAs and uses this to prime viral RNA synthesis. Cap-snatching was first discovered in influenza A virus10; novel examples were recently reported for two totiviruses11,12. Some viruses further chemically modify the cap to mask viral RNAs from detection and degradation13,14. Although capping is a common way for viruses to hijack the cellular machinery, many viruses use cap-independent modes of translation initiation.

Exploiting scanning and start codon selection

In canonical eukaryotic translation initiation, recruitment of the 43S ribosomal complex to mRNAs is followed by 5ʹ to 3ʹ scanning to find a start codon; this process is readily exploited by viruses using RNA-based signals. The most used start codon is an AUG within a consensus sequence known as the Kozak sequence (Fig. 2a). A ‘strong’ Kozak consensus sequence is gccRccAUGG, in which the start AUG is surrounded by highly conserved nucleotides (uppercase letters; a purine (R) three nucleotides upstream of the AUG and a G immediately downstream of the AUG) and less conserved nucleotides (lowercase letters); variations on this provide various degrees of translation efficiency, or ‘strength’15. If a start codon within a ‘weak’ Kozak sequence is encountered, scanning ribosomes can sometimes continue scanning, or ‘leak’, past the AUG codon to locate another start codon (Fig. 2b). This leaky scanning allows the production of different proteins in specific quantities from a single mRNA, and by shifting the reading frame to different start codons, different proteins can be synthesized (Fig. 2c). Viruses exploit this by using RNA templates that contain multiple reading frames, each starting at an AUG within a Kozak sequence of a specific strength, thus regulating the degree to which each codon is used for initiation. Notable mammalian viruses that use this include influenza A and B viruses and orthoreoviruses. A recently described example is in Andes virus, in which leaky scanning produces both the nucleocapsid protein and the non-structural protein from a single small mRNA16.
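As a rough illustration of how context ‘strength’ can be read off a sequence, the minimal Python sketch below scores each AUG using only the two highly conserved positions described above (a purine at −3 and a G at +4). The function names, the three-tier strong/adequate/weak labels and the example leader sequence are assumptions made for illustration; they are not taken from ref.15, and real initiation efficiency depends on more than these two positions.

```python
def kozak_strength(rna: str, aug_index: int) -> str:
    """Classify the context of the AUG that starts at aug_index.

    Positions follow the usual convention: the A of the AUG is +1,
    so -3 lies three nucleotides upstream and +4 directly follows the AUG.
    The three labels are an illustrative simplification.
    """
    rna = rna.upper().replace("T", "U")
    if rna[aug_index:aug_index + 3] != "AUG":
        raise ValueError("no AUG at the given index")
    minus3 = rna[aug_index - 3] if aug_index >= 3 else ""
    plus4 = rna[aug_index + 3] if aug_index + 3 < len(rna) else ""
    purine_at_minus3 = minus3 in ("A", "G")
    g_at_plus4 = plus4 == "G"
    if purine_at_minus3 and g_at_plus4:
        return "strong"
    if purine_at_minus3 or g_at_plus4:
        return "adequate"
    return "weak"


def scan_for_augs(rna: str):
    """Yield (index, strength) for every AUG, in 5' to 3' order."""
    rna = rna.upper().replace("T", "U")
    start = rna.find("AUG")
    while start != -1:
        yield start, kozak_strength(rna, start)
        start = rna.find("AUG", start + 1)


# Hypothetical leader: an AUG flanked by pyrimidines, then an AUG
# embedded in the gccRccAUGG consensus (with R = A).
example = "GGCUCUAUGUCCAAGCCACCAUGGCU"
for index, strength in scan_for_augs(example):
    print(index, strength)
```

In this made-up leader, the upstream AUG has pyrimidines at both −3 and +4 and is labelled weak, whereas the downstream AUG sits in the consensus and is labelled strong, which is the arrangement that would favour leaky scanning.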

Fig. 2: Variations on scanning and start codon recognition.

Manipulation of the scanning and start codon recognition process by RNA sequence and structure is a strategy used by viruses to affect translation at the initiation phase and produce different proteins in different amounts. a | A canonical cellular message that is 7-methylguanosine-capped (cap) and has an AUG in a strong Kozak context is shown. Ribosomes scan (grey dashed line) until they reach this AUG, where translation of the ORF (pale yellow) begins, as shown with the thick black arrow. This mechanism results in efficient production of a single protein product (yellow, to the right). b | Leaky scanning can occur when multiple codons exist with different context strengths. In this conceptual diagram, ribosomes scan and then encounter a start codon in a weak context (shown here with pyrimidines (Y) at the −3 and +4 positions). Some ribosomes stop here and initiate, shown with the thin black arrow. Other ribosomes bypass this upstream AUG and continue scanning to reach a downstream AUG in a strong context and then initiate there. The amount of initiation at each AUG is dictated by the context strength. Two AUGs in the same reading frame are shown, leading to a longer and shorter isoform of the same protein, and the resultant population of protein products is shown to the right. c | Non-AUG codons can be used for initiation. An upstream alternative ORF is shown in blue, which is translated if scanning ribosomes initiate at a CUG in a weak context. Ribosomes that do not initiate at this CUG continue scanning to a downstream AUG in a strong context (pale yellow ORF). In the example shown here, the two ORFs overlap but are in different reading frames (green), leading to two entirely different protein products (right). Various combinations of overlapping reading frames in different contexts and start codons of different types and strengths can give rise to diverse outcomes in terms of types and relative amounts of protein, all from a single RNA template.
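The statement in panel b that the amount of initiation at each AUG is dictated by context strength can be made concrete with a toy calculation. The sketch below assumes, purely for illustration, that each start codon captures a fixed fraction of the ribosomes that reach it and that ribosomes scan strictly 5′ to 3′ without reinitiation; the function name and the capture probabilities (0.2 and 0.9) are hypothetical and are not measured values.

```python
def initiation_fractions(capture_probabilities):
    """Return the fraction of scanning ribosomes initiating at each start codon.

    capture_probabilities lists, in 5' to 3' order, the assumed chance that a
    ribosome reaching a given start codon initiates there rather than
    scanning past it. Also returns the fraction that never initiates.
    """
    fractions = []
    still_scanning = 1.0
    for p in capture_probabilities:
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie between 0 and 1")
        fractions.append(still_scanning * p)   # ribosomes captured here
        still_scanning *= 1.0 - p              # ribosomes that leak past
    return fractions, still_scanning


# Hypothetical values: a weak upstream AUG and a strong downstream AUG.
fractions, never_initiated = initiation_fractions([0.2, 0.9])
print(fractions, never_initiated)
```

Under these assumed numbers, roughly a fifth of ribosomes would initiate at the upstream AUG and most of the remainder at the downstream AUG, so changing either context shifts the ratio of the two products.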

Ribosome profiling has uncovered a large number of non-AUG start codons (especially CUG) that are likely used to regulate gene expression17, and viruses can combine non-AUG initiation with start codon context ‘strength’ to provide precise regulation. For example, in panicum mosaic virus (PMV) infection, up to four ORFs are translated from a single subgenomic viral RNA using a combination of leaky scanning and initiation at GUG codons18. Furthermore, some viruses manipulate scanning and start site selection by encoding RNA signals that cause a scanning ribosome to ‘shunt’ over sections of the 5ʹ UTR that possibly contain short ORFs19. Another strategy of manipulating ribosome scanning is used by some alphaviruses when a subgenomic RNA (sgRNA) is translated under conditions of translation inhibition by eIF2α phosphorylation. Specifically, a stable RNA stem-loop element downstream of the start codon (downstream loop) pauses scanning long enough for Met-tRNAiMet to enter the initiation complex without an initiation factor20,21,22. In this case, ribosome stalling occurs when the downstream loop becomes trapped in a section of the 18S ribosomal RNA (rRNA), locking the start codon in the P site23. This latter example shows how a virus can use a fairly small RNA element to overcome part of the host antiviral response.

Strategies to modify or prevent modification of the translation machinery

Some viruses also have the ability to alter the translation machinery in such a way as to influence the selection of an mRNA template in their favour or directly prevent the cell from limiting translation. For example, poxvirus kinase phosphorylates small ribosomal subunit proteins, which enhances the translation of viral mRNAs with conserved adenosine repeats in their 5ʹ UTRs24. In addition, similar to some proteins, specific viral RNAs can prevent phosphorylation of eIF2α; examples include the adenovirus VA25 and Epstein–Barr virus EBER26, which both act by binding and inhibiting the enzyme that phosphorylates eIF2α.

Viral internal ribosome entry sites

As outlined above, some viral RNAs mimic canonical initiation signals found in cellular mRNAs, but many have evolved mechanisms to bypass the need for these signals. In particular, cis-acting RNA elements called internal ribosome entry sites (IRESs) recruit the translation machinery through pathways independent of the 5ʹ end or cap. The advantages of using an IRES vary, but a common theme is that they allow viral protein synthesis when the canonical cap-dependent translation initiation mechanism is repressed. For example, some viruses that use IRESs encode proteases that cleave specific cellular eIFs; this globally inhibits translation of most cellular mRNAs, but the virus can continue translation using its IRES. In other instances, the cellular antiviral response globally depresses cap-dependent translation through eIF2α phosphorylation or eIF4E sequestration, but the IRES allows viral protein synthesis to continue. IRES structures are diverse, but many have substantial secondary and tertiary structure, and they mostly lie upstream of the ORF they control. Many of the IRESs have been divided into several mechanistic classes on the basis of how they recruit the ribosome, the eIFs that they require to function and similarities in their secondary structures27. Briefly, class 1 and 2 IRESs are generally larger and require more eIFs and protein cofactors than other classes in order to function. Class 3 IRESs bind directly to the ribosome but require only a few canonical eIFs for function. Class 4 IRESs form compact, well-defined folds that bind directly to the ribosome and function without using any eIFs. A second classification scheme has been proposed in which hepatitis A virus is the sole member of class 3 (ref.28). An interesting anti-correlation exists between the number of protein factors an IRES requires for translation initiation and the degree to which it forms a single well-defined conformation29. 
Below, we briefly describe key features of these IRES classes, starting with the simplest and best understood and progressing to the most complex and least understood. A more detailed description of the mechanisms and structures of these IRESs is found in ref.27.

Class 4 IRESs

Class 4 IRESs have thus far been identified exclusively in the family Dicistroviridae, of which the single-stranded RNA genomes contain two ORFs separated by an intergenic region (IGR). The IGR contains an IRES that controls translation of the downstream ORF, thus they are often referred to as the ‘IGR IRESs’; they have been extensively functionally and structurally characterized since their discovery, providing insight into some basic aspects of ribosome function (reviewed in refs30,31). There are two types of IGR IRESs, but all are ~150–200 nucleotides in length and are capable of initiating translation in a variety of cell types and organisms. These IRESs can even initiate translation in bacteria, although the mechanism appears to be inefficient and different than in eukaryotes32.

The molecular mechanism of these IRESs depends on their 3D structure, which allows them to initiate translation by co-opting the elongation cycle of translation. Early structural studies revealed that IGR IRESs fold into a compact structure containing three RNA pseudoknots in a two-domain architecture33,34,35. One pseudoknot domain (domain 3) partially mimics tRNA structure and the mRNA–tRNA codon–anticodon interaction36,37,38. To initiate translation, the folded IRES binds directly to the 40S ribosomal subunit with high affinity and assembles an 80S ribosome without using scanning, an AUG start codon, Met-tRNAiMet or any initiation factor (including eIF2)30,31. In fact, eIF2α phosphorylation can increase translation initiation efficiency from these IRESs39, providing an elegant way for the virus to take advantage of this major cellular antiviral response mechanism. Within the assembled 80S ribosome, the IRES is placed between the two subunits with domain 3 in the A site of the ribosome, where it interacts with the ribosome in the same way as a codon–anticodon pair36 (Fig. 3a). One round of eEF2-catalysed ‘pseudotranslocation’ (so-called because it occurs without tRNA) moves this domain to the P site, allowing charged tRNA delivery to the A site by eEF1A. A second round of eEF2-catalysed translocation moves this tRNA to the P site, enabling elongation. More recent studies have revealed details of the molecular movements occurring during this process37,40,41,42; movement of the IRES through the space between the ribosomal subunits has been described as akin to that of an inchworm43.

Fig. 3: Internal ribosome entry sites.

a | Class 4 intergenic region internal ribosome entry sites (IRESs) are found between two viral ORFs. The three secondary structural domains are labelled. The yellow boxed area indicates the portion that interacts with the 40S subunit, and the blue boxed area is the portion that interacts with the 60S subunit. On the right is a cryo-electron microscopy (cryo-EM) reconstruction of the IRES bound to the 40S subunit (PDB 4V92)41, with the location of the domains labelled and the approximate location of the 60S subunit indicated; the 60S subunit itself has been removed to allow the domains of the IRES to be visualized. Domains 1 and 2 are labelled together, as they form a single compact folded entity. b | Secondary structure cartoon of the hepatitis C virus IRES, representing the class 3 IRESs. The IRES is at the 5′ end of the viral genome, which starts with a triphosphate (ppp). Secondary structural domains are labelled, and the 40S and 60S subunit interaction sites are boxed in yellow and blue, respectively. At the right is a cryo-EM reconstruction of the IRES bound to the 40S subunit (PDB 5A2Q)49, with the location of the IRES RNA domains labelled and the approximate location of the 60S subunit shown; it has been removed to allow the full IRES to be seen. c | Class 1 and 2 IRESs are similar in organization and function but are not identical. Secondary structure cartoons of the encephalomyocarditis virus (class 2, top) and poliovirus (class 1, bottom) IRESs are shown, with secondary structure domains labelled. Both viral RNAs have a viral protein genome-linked (VPg) peptide on their 5′ end. Only the secondary structures necessary for IRES function are shown; upstream structures are omitted. The approximate binding sites for various eukaryotic initiation factors and IRES trans-acting factors are shown; additional details for related IRESs can be found in refs27,44.
In the class 2 IRESs (top), there are two closely spaced start codons at the 3′ end of the IRES. For the class 1 IRESs (bottom), an upstream AUG codon (AUG1) is needed for ribosome entry, but then scanning leads to initiation at a downstream codon (AUG2). PCBP2, poly-C-binding protein 2; PTB, polypyrimidine tract-binding protein; Yn, polypyrimidine tract.

Class 3 (hepatitis C virus-like) IRESs

The best-known class 3 IRES is from hepatitis C virus (HCV) (in the family Flaviviridae) (Fig. 3b); others have been found in viruses from the families Flaviviridae and Picornaviridae28,44. Unlike class 4 IRESs, class 3 IRESs have thus far been found only at the 5ʹ end of viral RNAs, and they assume an extended architecture containing a variety of stem-loop structures that are organized around helical junctions and a pseudoknot (Fig. 3b). Interestingly, although the various class 3 IRESs share overall secondary structures with clear common patterns, there is also substantial sequence and structure variation that may provide for additional regulation or variation in function. This observation, and the fact that they appear in multiple virus families, suggests a useful versatility. It may not be immediately obvious why these viruses have evolved to use an IRES. One simple explanation is that class 3 IRES-containing viruses did not evolve the ability to cap the 5ʹ ends of their RNAs and thus were committed to developing a form of cap-independent translation. However, a more compelling explanation is that at least some class 3 IRESs can operate in both eIF2-dependent and eIF2-independent modes45,46,47,48, which could allow viral translation to continue when eIF2α is phosphorylated.

Extensive biochemical, genetic and structural studies have allowed development of mechanistic models for class 3 IRESs. Early structural studies consisted of NMR and X-ray crystallography applied to several isolated HCV IRES domains, which were combined with low-resolution cryo-electron microscopy (cryo-EM) reconstructions29. More recently, high-resolution cryo-EM structures of class 3 IRES–ribosome complexes have revealed the molecular details of the interactions underlying the structure-based mechanism49,50,51. In the HCV IRES, used here as an example, several domains have specific roles during IRES-driven translation initiation52. The IRES RNA directly binds to the 40S ribosomal subunit and eIF3 using multiple contact points in domain III53,54. Binding positions the AUG start codon within the decoding groove; thus, there is no scanning. Interestingly, in the IRES−40S–eIF3 complex, eIF3 is displaced from its normal binding position on the 40S subunit51; this displacement may be necessary for the IRES to access the 40S subunit or it may be a functionally important remodelling step46. Domain II does not increase the affinity between the IRES and the 40S subunit but makes specific contacts to the decoding groove of the ribosome, changes ribosome conformation and directs several mechanistically important events29,48,55,56,57,58. Domain II may change position during the initiation process50, possibly acting as an important regulatory element.

Perhaps the most interesting step in class 3 IRES initiation is the delivery of Met-tRNAiMet to the IRES−40S–eIF3 complex, as this is when the ability to operate in an eIF2-dependent or independent mode comes into play. If eIF2 is available, Met-tRNAiMet presumably is supplied by the factor. However, in the eIF2-independent mode, it has been proposed that Met-tRNAiMet could be delivered by alternative factors eIF2A or eIF2D59,60,61, or through the action of eIF5B alone47, or even through direct ‘factorless’ tRNA binding in the case of some class 3 IRESs62. In addition, eIF1A appears to be particularly important for stabilizing tRNA binding in the eIF2-independent mode46. Regardless of the pathway, once Met-tRNAiMet is bound, 60S subunit recruitment leads to 80S ribosome formation and then elongation52.

Although a strong framework for understanding the class 3 IRESs exists, there are areas that require additional investigation. First, although we have presented the molecular mechanism of the HCV IRES translation initiation as a stepwise progression, there is uncertainty regarding the precise order of these recruitment events and how this might relate to the ability to use different tRNA binding modes. There is evidence that the HCV IRES binds pre-formed 40S subunit-containing preinitiation complexes at a specific step of ribosome recycling, then manipulates them to drive downstream steps of tRNA recruitment and 80S ribosome formation46. Second, although a limited set of canonical eIFs appear to be sufficient for translation on most class 3 IRESs tested, roles for some auxiliary factors have been proposed63. Such factors combined with variable class 3 IRES structure may fine-tune function for certain cell types, for the specific needs of different viruses or for providing additional ways to respond to changing cellular conditions and antiviral responses.

Class 1 and 2 IRESs

Class 1 and 2 IRESs are found exclusively in picornaviruses and are similar to one another in that they are generally ~450 nucleotides long, are found in the 5ʹ UTR of the viral RNAs, are unable to bind directly to the 40S subunit and require almost the entire set of canonical translation initiation factors (excluding eIF4E)27,64. Both classes have complex secondary structures comprising multiple domains containing many stem-loops, internal loops, bulges and junctions (Fig. 3c). Both class 1 and 2 IRESs use IRES trans-acting factors (ITAFs), which are protein factors that are not considered part of the canonical translation machinery but are functionally important to a specific IRES63. Viruses that use class 1 or 2 IRESs gain an advantage by depressing cap-dependent initiation while promoting IRES-driven translation5,44. Specifically, enteroviruses such as poliovirus cleave eIF4G, which decouples cap and factor binding, and the IRES then uses a cleaved fragment of eIF4G for translation initiation. Cardioviruses such as encephalomyocarditis virus (EMCV) cause the relocalization of eIF4E into the nucleus, which depresses cap-dependent translation, but class 2 IRES-dependent translation can continue. Both class 1 and 2 IRESs are at least partially refractory to eIF2α phosphorylation during infection65. For example, poliovirus may be able to operate in an eIF2-independent mode late during infection by using eIF5B66, suggesting that, similar to the class 3 and 4 IRESs, they have evolved strategies to bypass canonical Met-tRNAiMet delivery. Despite these similarities, class 1 and 2 IRESs are not identical in terms of secondary structure and mechanism, and there are variations within each class. To present some overarching mechanistic concepts, we use the poliovirus and EMCV IRESs as prototypes for classes 1 and 2, respectively, and direct the reader to focused reviews for further information on other IRESs27,44,64.

The general model that has emerged from biochemical and genetic experiments is that class 1 and 2 IRESs are flexible scaffolds on which ITAF binding facilitates eIF binding, leading to recruitment of the ribosome. In the case of the class 2 EMCV IRES, polypyrimidine tract-binding protein (PTB) is an important ITAF that simultaneously binds to polypyrimidine tracts (Yn) near the 5ʹ (domain II) and 3ʹ ends of the IRES (just upstream of the translation start site), altering the overall conformation of the IRES67,68 (Fig. 3c). The class 1 poliovirus IRES also uses PTB as an ITAF69, together with poly-C-binding protein 2 (PCBP2), which binds to a cytosine-rich stretch in the large central domain IV of the poliovirus IRES70 (Fig. 3c). PTB and PCBP2 are not the only ITAFs that have been identified, and there is variation across the various members of the class 1 and 2 IRESs44. In addition to ITAF binding, long-range RNA–RNA interactions involving GNRA tetraloops are proposed to further organize the active conformation of both class 1 and 2 IRESs71,72,73.

The ITAF-assisted conformation of the class 1 and 2 IRESs is the platform for recruiting eIFs: in the poliovirus IRES, eIF4G and eIF4A bind to domain V70, whereas in the EMCV IRES they bind to domain IV74. It is worth noting that in viral infections in which eIF4G is cleaved by viral proteases, the amino-terminal fragment of eIF4G that binds eIF4E is lost, whereas the carboxy-terminal fragment that interacts with eIF3 interacts directly with the IRES RNA. EMCV IRES domain V also binds other factors, including eIF4B75. Binding of eIFs to the class 1 and 2 IRESs then leads to the recruitment of the 43S complex. However, the class 1 and 2 IRESs differ in the use of start codons for initiation44. For the class 2 EMCV IRES, translation initiates without scanning at the second of two closely spaced AUG codons76. However, other class 2 IRESs appear to use both AUG start codons, suggesting that scanning occurs77. During translation from the class 1 poliovirus IRES, the ribosome is initially recruited to an upstream ‘cryptic’ AUG codon, and it then scans through domain VI to reach the AUG start codon78,79.

Despite a good mechanistic and biochemical framework for understanding class 1 and 2 IRESs, their large and extended architectures and the complexity of the associated translation preinitiation complexes have made structural studies difficult. Isolated domains of IRESs have been investigated using NMR80,81, and small-angle X-ray scattering has been used to determine the global shape of part of EMCV IRES domain IV bound to the HEAT1 domain of eIF4G81. However, to date, the structures of assembled class 1 or 2 IRES–ribosome–eIF–ITAF complexes are unsolved. This gap in knowledge limits our ability to fully understand how these complex assemblies are constructed, differences when compared with canonical initiation and the dynamic conformational changes that take place. We expect advances in cryo-EM coupled with other structural methods to help address these unknowns.

Other viral IRESs

In addition to those described above, other IRESs have less well-understood mechanisms and structures and therefore cannot be easily classified. For example, the Halastavi árva virus 5ʹ UTR IRES operates with most canonical eIFs but requires retrograde scanning82, precluding easy assignment to any aforementioned class. Other IRESs are more challenging to classify. For example, the leader sequence of HIV-1 RNAs alone can stimulate translation in a cell-type-specific manner83, but different HIV-1 transcripts contain different IRES structures and splice site variations84, and the viral transcripts are capped. In HIV-1 and other retroviruses, the contributions of IRES-dependent and cap-dependent translation during infection remain the subject of debate85. Overall, the diversity of structures and mechanisms used by IRESs is greater than initially anticipated, and future characterization of these RNAs must include focused studies of many different examples to discover both idiosyncratic features of individual IRESs and common principles.

Elongation and termination

Initiation is the most regulated phase of translation, but regulation also occurs during elongation or termination through RNA-based signals within an ORF (Fig. 4). Viruses that depend on RNA elements that are encoded within ORFs must simultaneously maintain coding capacity and functional structures. Thermodynamically stable RNA elements within coding regions can alter translation efficiency. Viral mRNAs use this phenomenon to slow translation; for example, the Epstein–Barr virus EBNA1 mRNA contains regions that can form very stable G-quadruplex structures86. In this section, we discuss three other examples: RNA structure-dependent programmed ribosomal frameshifting (PRF), stop codon readthrough and termination-coupled reinitiation.

Fig. 4: RNA elements within the coding region.

Specific folded RNA elements within a coding region can affect translation and lead to decoding of an alternative or extended frame. a | Programmed ribosomal frameshifting can be induced by structures that stall the ribosome and are coupled to slippery sequences. If encountered, RNA in the decoding groove can then shift into a different frame (in this example, a shift backwards of one nucleotide, in other words, a –1 programmed ribosomal frameshift (–1PRF)). If this occurs, translation continues but in a different frame (purple) than the original frame (yellow). Frameshifting can occur as a result of special pseudoknots (above) or stem-loops (below). NMR structures of a frameshifting pseudoknot from mouse mammary tumour virus (PDB 1RNK)160 or a hairpin from HIV-1 (PDB 1ZC5)92 are shown. b | Stop codon readthrough can occur through a variety of RNA-based mechanisms; when it occurs, the sequence downstream (purple) is translated. This simple case relies only on RNA sequence. Some tobamoviruses contain a stop codon within a UAG-CARYYA motif (R is a purine and Y is a pyrimidine) that can induce readthrough at low frequencies. c | Readthrough can be driven by structured RNA. A simple stem-loop (top). A pseudoknot from murine leukaemia gammaretrovirus and its NMR structure to the right (PDB 2LC8)113 (middle). The carnation Italian ringspot Tombusvirus readthrough signal involves long-range interactions (bottom). d | Termination upstream ribosome-binding sites (TURBSs) exist in some caliciviruses. They contain a stem (motifs 2 and 2*) with an extended loop (motif 1) that binds the ribosome on a terminating message to stimulate reinitiation on a downstream ORF (purple).

Programmed ribosomal frameshifting signals

PRF occurs when a specific RNA element causes an elongating ribosome to pause and the mRNA–tRNA codon–anticodon interactions break and then reform in a different reading frame; often the shift is one nucleotide backwards, leading to ‘–1PRF’87,88. As viruses have limited coding capacities, many viral genomes encode overlapping ORFs and use PRF to double or triple the coding capacity of a single RNA template (Fig. 4a). The fraction of ribosomes that undergo frameshifting is generally small, but it can reach 40% for some viral –1PRF signals89. These RNA signals have also evolved to achieve a specific ratio of protein products.
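The link between frameshifting efficiency and the ratio of protein products can be made concrete with a minimal sketch. It assumes, for illustration only, that every ribosome that does not frameshift makes the zero-frame product, that every ribosome that does frameshift makes the extended product, and that both products are equally stable. The function name `product_ratio` is hypothetical.

```python
def product_ratio(frameshift_efficiency: float) -> float:
    """Ratio of zero-frame product to frameshifted product.

    Assumes (for illustration) that ribosomes either stay in the
    zero frame or shift frame, with no other fates, and that the
    two protein products have equal stability.
    """
    if not 0.0 < frameshift_efficiency < 1.0:
        raise ValueError("efficiency must lie strictly between 0 and 1")
    # Fraction staying in frame divided by fraction that shifts.
    return (1.0 - frameshift_efficiency) / frameshift_efficiency
```

Under these assumptions, a signal with 5% efficiency yields roughly 19 zero-frame products per frameshifted product, whereas a signal operating at 40% yields about 1.5.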

PRF signalling structured RNA elements have been documented in bacterial, yeast, plant and mammalian systems and consist of two types. The first is a pseudoknot located six to eight nucleotides downstream of a ‘slippery sequence’88. These pseudoknots often control the expression of the gag-pro or gag-pol genes of retroviruses, regulating the relative expression of structural and non-structural proteins. Interestingly, PRF in severe acute respiratory syndrome (SARS) coronavirus infection may involve trans RNA–RNA interactions90. The second type of structure is a thermodynamically stable stem-loop. Examples are found in other members of the family Retroviridae such as HIV-1, HIV-2 and simian immunodeficiency virus91,92,93 (Fig. 4a). For stem-loop-directed frameshifting, stability of the stem is of primary importance, but adjacent structures and long-range interactions with other RNA elements can further modulate frameshifting. An interesting example is the pea enation mosaic virus (PEMV) recoding stimulatory element; RNA secondary structure elements upstream and downstream of the frameshift site modulate the frequency of frameshifting, which occurs just upstream of a termination codon94. Interestingly, dynamic rearrangement between a pseudoknot and a tandem stem-loop conformation in a –1PRF signal in West Nile virus RNA seems important for frameshifting95, highlighting the complexity of the molecular motions that govern this process.
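As a rough illustration of how candidate –1PRF sites might be located in a sequence, the sketch below scans an RNA for a slippery heptamer and reports the window six to eight nucleotides downstream where a stimulatory structure would be expected to begin. The heptamer pattern X XXY YYZ (XXX, any three identical nucleotides; YYY, AAA or UUU; Z, A, C or U) is the commonly cited consensus and is an assumption of this sketch rather than a definition given in the text. The function name and output format are hypothetical, and the scan ignores reading frame and does not predict RNA structure.

```python
import re

# Assumed consensus heptamer X XXY YYZ; the lookahead lets
# overlapping candidates be reported as well.
SLIPPERY = re.compile(r"(?=(([ACGU])\2\2(AAA|UUU)[ACU]))")


def find_slippery_sites(rna: str, spacer=(6, 8)):
    """Return candidate slippery heptamers and downstream windows.

    Positions are 0-based. 'structure_window' gives the range of
    positions at which a downstream pseudoknot or stem-loop would
    start if separated from the heptamer by 6 to 8 nucleotides.
    """
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    sites = []
    for match in SLIPPERY.finditer(rna):
        start = match.start()
        end = start + 7  # exclusive end of the heptamer
        sites.append({
            "start": start,
            "heptamer": match.group(1),
            "structure_window": (end + spacer[0], end + spacer[1]),
        })
    return sites
```

A hit from such a scan is only a candidate: whether frameshifting occurs depends on the downstream structure and its context, as discussed here.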

The mechanism by which specific RNA elements stimulate frameshifting at a slippery sequence remains an area of active investigation. Thermostability of the RNA fold alone does not dictate frameshifting efficiency96, and mechanisms may differ between individual frameshifting signals. In some cases, frameshifting correlates with stability of ribosome-adjacent base pairing97 and metastable or non-canonical conformations on an elongating ribosome87,98. The resolution of paused ribosomes into either the 0 or −1 frame is influenced by the dynamics of the PRF structure95 and the inherent helicase activity of ribosomes99. Viral frameshifting can also be stimulated by viral and cellular protein factors that bind viral RNA adjacent to the slippery site and act in place of, or in concert with, higher-order RNA folding. For example, porcine reproductive and respiratory syndrome virus utilizes a complex of its Nsp1β replicase subunit and PCBP that associates with a sequence downstream of the slippery site and directs either –2 or –1 frameshifting100,101,102. Additionally, the cardioviruses EMCV and Theiler’s murine encephalomyelitis virus contain atypically spaced stem-loops proximal to frameshifting sites that seem to cooperate with the viral 2A protein to facilitate high-efficiency frameshifting (~70–82% frameshifting)103,104,105.

Stop codon readthrough

Viruses can increase coding capacity by the ribosomal readthrough of stop codons, resulting in extended protein isoforms. Signals near the stop codon may promote recognition by a suppressor tRNA rather than by a termination factor protein, dependent primarily on the RNA sequence immediately adjacent to the stop codon106,107,108,109. Tobacco mosaic virus and other plant viruses use this method to make extended coat protein isoforms110. Another signal that can cause readthrough is an RNA element just 3ʹ to the stop codon, such as a stem-loop or pseudoknot (Fig. 4c). A stem-loop has been computationally predicted to exist in some alphaviruses and plant viruses within the family Virgaviridae111 and experimentally shown in Colorado tick fever virus112. Pseudoknots are also implicated in translational readthrough in murine leukaemia gammaretrovirus infection113,114. Readthrough can also be promoted by a long-range cis interaction involving RNA near the stop codon (for example, in some members of the Tombusviridae and Luteoviridae families115) (Fig. 4c, bottom). The precise molecular mechanisms of readthrough in viral RNAs remain poorly defined. Recent studies of readthrough in turnip crinkle virus (TCV) infection suggest that these RNA elements can adopt multiple conformations, further illustrating the dynamic and complex nature of viral recoding elements116. In the absence of detailed mechanistic knowledge, it is tempting to hypothesize that these RNA elements slow translation termination long enough to favour the recruitment of low-abundance suppressor tRNAs117.
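The sequence-only readthrough context shown in Fig. 4b, a UAG stop codon within a UAG-CARYYA motif (R, purine; Y, pyrimidine), lends itself to a simple pattern search. The sketch below is illustrative only: the function name is hypothetical, the scan does not check that the UAG lies in the reading frame of an ORF, and a match says nothing about readthrough frequency.

```python
import re

# UAG-CARYYA with R = A or G and Y = C or U. The lookahead allows
# overlapping occurrences to be found.
READTHROUGH_MOTIF = re.compile(r"(?=UAGCA[AG][CU][CU]A)")


def find_readthrough_contexts(rna: str):
    """Return 0-based positions of UAG codons in a UAG-CARYYA context."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    return [match.start() for match in READTHROUGH_MOTIF.finditer(rna)]
```

Structure-dependent signals such as stem-loops, pseudoknots and long-range interactions cannot be found this way and would require structure prediction or experimental probing.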

Termination-dependent reinitiation

Viruses can manipulate translation by interrupting the recycling of a terminated ribosome in favour of reinitiation of an adjacent viral ORF; this differs from readthrough in that the synthesis of the first protein terminates, and then the ribosome reinitiates at a site proximal to the termination site (either upstream or downstream) to produce a second protein. A well-characterized example of reinitiation, used by feline calicivirus, is directed by a structured RNA element called the termination upstream ribosome-binding site (TURBS) that lies within the last 40–80 nucleotides of the upstream ORF (Fig. 4d). This type of reinitiation was originally suspected to be –1PRF and exists within other Caliciviridae family members (including human and bovine noroviruses) and haemorrhagic viruses such as rabbit haemorrhagic disease virus118,119,120. The TURBS contains a sequence that forms complementary interactions with the apical loop of helix 26 of 18S rRNA. Viral RNA–rRNA base pairing is sufficient to retain the 40S subunit for a time after termination121; the 40S subunit can then move to the second ORF start site a short distance away, and subsequent reinitiation appears to be more streamlined than normal initiation122. Notably, the ribosomal binding site of the TURBS is remarkably similar to sequences in the class 3 IRESs that are needed for 40S subunit recruitment, suggesting either an evolutionary link between these RNA elements (and thus between IRES-driven initiation and reinitiation) or a convergent mechanistic solution to the problem of recruiting and manipulating the 40S subunit.

Controlling translation from the 3′ end

Some viral RNAs contain RNA elements near the 3ʹ end that control or manipulate the translation machinery. A variety of such elements are found in plant viruses; we will focus on them as interesting and important examples. In most cases, the mechanism of translational control involves long-range interactions between the 5ʹ and 3ʹ ends of the viral RNA using interactions and signals differing from canonical cap–poly(A) tail 5ʹ to 3ʹ communication.

3′ cap-independent translation elements

3ʹ cap-independent translation elements (3ʹ-CITEs) are structurally diverse RNA elements in the 3ʹ UTRs of many RNA plant viruses that allow translation of the upstream sequences by recruiting translation components (eIFs or ribosomal subunits) to the 3ʹ UTR, but then translation initiates at the 5ʹ end owing to 5ʹ to 3ʹ communication123,124,125,126. Seven 3ʹ-CITE structural classes have been identified125. Although naturally found in the 3ʹ UTR, 3ʹ-CITEs have been shown to function when placed in the 5ʹ UTR127, and 3ʹ-CITEs from one virus have been shown to function in other viruses, demonstrating that they are portable elements. However, unlike IRESs (which can drive translation initiation independent of the 5ʹ end), the 5ʹ end is needed for the function of some 3ʹ-CITEs even if the cap is not128,129,130. 3ʹ-CITEs could provide several advantages to the virus. First, they could allow successful competition with cellular mRNAs for translation components, as the affinity of many 3ʹ-CITEs for their target eIFs is high (dissociation constants in the mid-nanomolar range)131,132. In addition, 3ʹ-CITEs could capture ribosomes terminating at the 3ʹ end and deliver them to the 5ʹ end. This mechanism could prevent any hindrance between the replication and translation machineries if they were operating on the same viral RNA, or it could regulate the rate of translation of different subgenomic viral RNAs with different 5ʹ UTRs133. Several excellent recent reviews describe the diversity of 3ʹ-CITE structure and function123,124,125,126; therefore, we briefly provide representative examples to illustrate key concepts.
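To see why a high affinity for eIFs could help a 3ʹ-CITE compete for translation components, consider a minimal single-site binding sketch. It assumes simple equilibrium binding of one factor to one RNA site with no cooperativity or competitors, which is a simplification of the situation in a cell. The function name and the concentrations in the usage note are hypothetical and are not measured values.

```python
def fraction_bound(free_factor_nM: float, kd_nM: float) -> float:
    """Fraction of RNA sites occupied at equilibrium.

    Single-site binding isotherm: [F] / (Kd + [F]), where [F] is
    the free factor concentration and Kd the dissociation constant,
    both in nM.
    """
    if free_factor_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and Kd must be > 0")
    return free_factor_nM / (kd_nM + free_factor_nM)
```

With a hypothetical free factor concentration of 100 nM, a site with a dissociation constant of 50 nM would be about two-thirds occupied, whereas a site with a dissociation constant of 5,000 nM would be about 2% occupied.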

Most known 3ʹ-CITEs function by directly binding components of the cap-binding eIF4F complex. An example is the highly efficient barley yellow dwarf virus translation element (BTE) found in some members of the family Tombusviridae134,135,136,137 (Fig. 5a). The BTE contains a conserved stem-loop structure that directly binds eIF4G, aided by other factors, in a way that still allows eIF4G to bind poly(A) binding protein (PABP) and eIF4E138,139,140. Base pairing between a sequence in the 3ʹ-CITE and a complementary sequence in the 5ʹ UTR brings the bound factor or factors to the 5ʹ end where they can drive translation initiation128,129,137. Other 3ʹ-CITEs use specific RNA sequences and structures to directly bind eIF4E141,142,143,144 (Fig. 5b) or the intact eIF4F complex126 (Fig. 5c). For example, the panicum mosaic virus-like translation enhancers (PTEs) are a class of 3ʹ-CITEs that bind directly to eIF4E using a folded RNA element coupled with long-range interactions to complementary sequences in either the 5ʹ UTR or within an upstream coding region126,144.

Fig. 5: Translation enhancers in the 3′ UTR or at the 3′ end.

a | A class of 3′ cap-independent translation elements (3′-CITEs) called BTEs (barley yellow dwarf virus translation elements) adopt extended RNA architectures in the 3′ UTR, interacting with eukaryotic initiation factor 4G (eIF4G) and a stem-loop in the 5′ UTR to facilitate translation. b | PTEs (panicum mosaic virus-like translation enhancers) such as the one from saguaro cactus virus contain motifs that may pseudoknot (indicated by pk?) to form structures that recruit eIF4E161 and participate in long-range interactions with the sequence in either the 5′ UTR or the 5′ coding region161. Other PTEs may interact with sequence in the 5′ coding region. c | I-shaped structures or Y-shaped structures (YSSs) recruit some component of the cap-binding complex and interact with sequences in the 5′ end of the mRNA (a YSS is shown). d | T-shaped structure (TSS)-type 3′-CITEs are proposed to bind directly to ribosomal subunits. e | TLSs (tRNA-like structures) are aminoacylated (AA, red), bound by eEF1A and/or interact with the ribosome. Crystal structure of the turnip yellow mosaic virus TLS is shown here (PDB 4P5J)156.

Most types of 3ʹ-CITEs directly bind eIFs; however, others directly bind to the ribosome itself. For example, the T-shaped structures (TSSs) found in TCV and cardamine chlorotic fleck virus RNAs directly interact with 80S ribosomes or 60S subunits145,146 (Fig. 5d), and they have been proposed to form folded structures akin to tRNAs. Unlike most 3ʹ-CITEs, TSSs lack identified sequences that base pair with the 5ʹ end (with the exception of the kissing-loop TSS, so named because it base pairs long distance with a hairpin-loop structure); the ‘bridge’ between the two ends is thought to occur through the 40S subunit association with the 5ʹ end147. Interestingly, some viruses use more than one type of 3ʹ-CITE cooperatively. For example, PEMV2 possesses an eIF-binding PTE that lacks the ability to pair to 5ʹ sequences, a 60S subunit and 80S ribosome-binding TSS (used only by the sgRNA)148 and a kissing-loop TSS that binds both ribosome subunits and forms long-distance kissing-loop interactions with a hairpin in the coding region149,150. In addition, new types of 3ʹ-CITEs and combinations of 3ʹ-CITEs continue to be discovered. The most recently identified type comes from Cucurbit aphid-borne yellows virus (CABYV)-Xinjiang. Only 55 nucleotides long, the CABYV-Xinjiang-like translation element (CXTE) functions in eIF4E-depleted lysate and is enhanced by the viral 5ʹ UTR; however, the detailed molecular mechanisms remain unknown125. Interestingly, the melon necrotic spot virus appears to have acquired a CXTE (from CABYV) in addition to an I-shaped structure (a type of 3ʹ-CITE named for its extended secondary structure)125. This example illustrates how 3ʹ-CITEs can transfer between viruses and how it is likely that additional types of 3ʹ-CITEs, and combinations of 3ʹ-CITEs, are awaiting discovery.

Despite the importance, widespread distribution and diversity of 3ʹ-CITEs, to date there is limited information about their 3D structures. Structural predictions of the PTEs from PMV and PEMV2 in both the apo form and eIF4E-bound form have been constructed144, but there are no experimentally determined structures. Also, molecular models of the TSS from TCV and the kissing-loop TSS from PEMV2 have been constructed that predict structural characteristics akin to tRNAs124, and the structure of the TCV TSS has been studied by NMR and small-angle X-ray scattering151. However, to date there are no high-resolution structures of other diverse 3ʹ-CITEs in isolation or bound to their translation machinery targets. In addition, conformational changes and dynamics within these complexes are important152; recent quantitative biophysical and structural studies of the TCV TSS revealed how the TSS is disassembled through viral RNA-dependent RNA polymerase binding, a requirement for replication153. However, more exploration is needed to understand the conformational dynamics underlying 3ʹ-CITE binding to their targets and their functions.

tRNA-like structures

The extreme 3ʹ terminus of Tymovirus and Tobamovirus plant viruses contains elements that recruit specific host aminoacyl tRNA synthetases to aminoacylate the 3ʹ end of the viral RNA, an event that promotes diverse viral processes including stabilizing the viral RNA and enhancing translation from a capped 5ʹ end by a mechanism that remains unknown154. These ‘tRNA-like structures’ (TLSs) exist in three types with distinct secondary structures and tertiary folds154, charged with either valine, histidine or tyrosine. Aminoacylated TLSs also bind eEF1A155, and the turnip yellow mosaic virus (TYMV) TLS has a high affinity for ribosomes156. Biophysical and biochemical studies have informed the development of models of different TLS types, but only the structure of the unbound TLS RNA from TYMV has been experimentally solved to high resolution156 (Fig. 5e). This structure reveals a global tRNA-like fold but with marked differences compared with tRNAs that help explain its multifunctionality. In the case of TYMV TLS, there is evidence that the tRNA-like fold forms tertiary interactions with an adjacent upstream pseudoknot domain (UPD) and could serve as a structural switch to control events during infection157. However, the molecular details of how aminoacylation and factor binding to the TLS at the 3ʹ end enhance translation from the 5ʹ end are still unclear. One study concluded that the TYMV TLS can deliver a valine as the first amino acid of viral proteins158; other studies have challenged this finding159. It could be that eEF1A-bound TYMV TLS interacts with either the ribosome or other components of the translation machinery and recruits them to the viral RNA, but this has yet to be definitively shown, and there is no clear mechanism of 5ʹ to 3ʹ communication.

Summary and future directions

In this Review, we have presented an overview of the many types of RNA elements that manipulate the eukaryotic translation machinery at all phases of the protein synthesis process: initiation, elongation, termination and recycling. Using illustrative examples, we show how these RNA elements are abundant and structurally and mechanistically diverse and how they provide viruses with sophisticated ways to exploit the translation machinery or overcome antiviral defences. Studying these RNA elements not only provides insight into virus replication mechanisms and new targets for therapeutic intervention or agricultural control but also increases our understanding of the fundamental mechanisms of translation.

As we have explored diverse types of RNAs, we have highlighted some of the major unknowns. For a given functional RNA, we may know the sequence, have a putative secondary structural model, have identified some interacting proteins and know the translation phases involved, but for most of these RNAs, there is still much to be learned. For most, we do not know the 3D structures of the RNA elements, the details of how they interact with their targets or the consequences of that interaction. For example, when a viral RNA binds to an eIF or to the ribosome, where does it bind and how does this alter the structure of the target to alter function in a way that promotes viral replication? Likewise, there is a need to understand the dynamics of these interactions and processes in many ways: how do the interactions between viral RNAs and their targets change as cellular conditions change? Does cellular localization affect these processes? Moreover, translation is intimately connected to other cellular processes such as RNA decay, raising the question of how these viral mechanisms are affected by or function within a global context. We predict that some of the next important advances will be comprehensive descriptions of the detailed structures of viral RNAs in complex with their targets, together with an understanding of how these interactions lead to manipulation of the translational machinery, how conditions change during the course of the viral infection, how these events are coordinated with other viral processes and how this relates to pathogenesis.

Answering these questions will require contributions from many fields, including virology, structural biology, biochemistry and cell biology. Excitingly, new tools are emerging that can help address these questions. For example, advances in structural methods such as cryo-EM will allow the visualization of large complexes that include viral RNAs and the translation machinery, which is particularly useful for studies of IRESs and 3ʹ-CITEs. Likewise, super-resolution microscopy and methods such as time-resolved fluorescence in situ hybridization (FISH) should provide insight into the changing localization and composition of viral RNA–protein complexes in cells. Methods such as ribosome profiling and in-cell chemical probing of RNA will allow the characterization of RNA structures and translational status during the course of viral infection. The next advances will not come from a single approach but from an integration of these emerging technologies with classical virology and biochemistry.