HIV-1 RNA genome packaging: it’s G-rated

ABSTRACT A member of the Retroviridae, human immunodeficiency virus type 1 (HIV-1), uses the RNA genome packaged into nascent virions to transfer genetic information to its progeny. The genome packaging step is a highly regulated and extremely efficient process as a vast majority of virus particles contain two copies of full-length unspliced HIV-1 RNA that form a dimer. Thus, during virus assembly HIV-1 can identify and selectively encapsidate HIV-1 unspliced RNA from an abundant pool of cellular RNAs and various spliced HIV-1 RNAs. Several “G” features facilitate the packaging of a dimeric RNA genome. The viral polyprotein Gag orchestrates virus assembly and mediates RNA genome packaging. During this process, Gag preferentially binds unpaired guanosines within the highly structured 5′ untranslated region (UTR) of HIV-1 RNA. In addition, the HIV-1 unspliced RNA provides a scaffold that promotes Gag:Gag interactions and virus assembly, thereby ensuring its packaging. Intriguingly, recent studies have shown that the use of different guanosines at the junction of U3 and R as transcription start sites results in HIV-1 unspliced RNA species with 99.9% identical sequences but dramatically distinct 5′ UTR conformations. Consequently, one species of unspliced RNA is preferentially packaged over other nearly identical RNAs. These studies reveal how conformations affect the functions of HIV-1 RNA elements and the complex regulation of HIV-1 replication. In this review, we summarize cis- and trans-acting elements critical for HIV-1 RNA packaging, locations of Gag:RNA interactions that mediate genome encapsidation, and the effects of transcription start sites on the structure and packaging of HIV-1 RNA.

genetic information, all retroviruses, including HIV-1, package two copies of RNA into the virus particle (8). This unique genome packaging feature allows recombination to occur during reverse transcription to generate a hybrid DNA containing portions of genetic information from each copackaged RNA (9). Frequent recombination is required for maintaining genome integrity (10) and assorts mutations in the viral population, allowing better-adapted variants to emerge, such as those that escape host immune responses or antiviral treatments (11,12). Therefore, HIV-1 genome packaging has a strong impact on virus replication and evolution. By imaging viral RNA in individual particles, it was shown that the packaging of the HIV-1 RNA genome is highly efficient and regulated. The vast majority of viral particles (>90%) contain two copies of HIV-1 unspliced RNA (13). This review focuses on how HIV-1 efficiently selects and packages its unspliced RNA. For simplicity, HIV-1 unspliced RNA is referred to as HIV-1 RNA hereafter.
It has long been recognized that Gag needs to select HIV-1 RNA from an abundant pool of cellular RNAs and spliced viral RNAs as the virion genome. However, recent studies revealed that genome packaging is more selective than originally appreciated because not all HIV-1 RNA molecules are created equal. Using neighboring sequences to initiate transcription, HIV-1 can generate multiple species of unspliced RNAs that are 99.9% identical, differing by only a few nucleotides at the 5′ end. However, HIV-1 selectively packages one RNA species that has a single guanosine at the 5′ end (1G RNA) over another species that is 2 nt longer with three guanosines at the 5′ end (3G RNA) (14–17). The mechanism used by HIV-1 to distinguish two 9 kb RNAs with a 2-nt difference and the possible selective advantage of such a strategy will also be discussed in this review.

Trans-acting element: Gag
Efficient HIV-1 RNA genome packaging is mediated by trans- and cis-acting elements within Gag and the viral RNA, respectively (2). The Gag polyprotein is composed of several functional domains: matrix (MA), capsid (CA), spacer peptide 1 (SP1), nucleocapsid (NC), spacer peptide 2 (SP2), and p6 (Fig. 1A). These domains act in concert to assemble virus particles at the plasma membrane and to ensure efficient genome packaging (4). MA is a globular protein that contains a highly basic patch able to interact with negatively charged lipids present within the plasma membrane, most notably phosphatidylinositol 4,5-bisphosphate (PIP2) (18–22). Furthermore, the first glycine residue is modified by the addition of a myristic acid moiety that ensures Gag anchoring to the plasma membrane (23). The CA domain consists of N- and C-terminal domains that contain residues essential for Gag multimerization during assembly and, following Gag polyprotein cleavage, for capsid core formation during virion maturation (24). In tandem with CA-driven Gag oligomerization, CA-SP1 forms a stable 6-helix bundle with inositol hexakisphosphate (IP6) to enhance Gag lattice formation during assembly (25,26). NC is an RNA-binding domain and acts as a nucleic acid chaperone (27,28). This domain contains multiple basic residues important for RNA binding and two critical zinc fingers, or zinc knuckles, each with a conserved cysteine-histidine motif, CCHC, that coordinates a Zn2+ ion (29,30). The C-terminal domain, p6, contains critical PTAP and YPXL-type binding motifs necessary for interaction with the host machinery that allows assembled viruses to bud off from host cells (31–33).
Among the Gag domains, NC has RNA-binding activity and was the first examined for its role in specifically packaging viral RNAs. NC serves multiple functions during HIV-1 replication. After polyprotein cleavage and virus maturation, the NC protein with its nucleic acid chaperone activity plays important roles during reverse transcription and integration (27,34–38). Thus, NC mutations often cause pleiotropic effects in viral replication; here, we focus on the role of NC in genome packaging. By replacing the NC domain with that of another retrovirus, it was shown that such hybrid Gag proteins can gain the ability to package RNA from a different retrovirus (39). For example, replacing the HIV-1 NC with that of murine leukemia virus (MLV), or vice versa, allowed the hybrid Gag to package MLV RNA or HIV-1 RNA, respectively (40,41). However, replacing the NC domain is insufficient to alter RNA packaging specificity in some other cases (42); for example, a hybrid HIV-1 Gag with mouse mammary tumor virus (MMTV) NC does not preferentially package MMTV RNA (43). Extensive mutational studies have shown that multiple regions of the HIV-1 NC, including the zinc fingers and the basic residues, are important for selective packaging of HIV-1 RNA (29,34–36,44–52). Abolishing the conserved CCHC motifs causes severe genome packaging defects, and even replacing these motifs with other zinc-chelating motifs such as CCCC or CCHH can cause packaging defects (30,49,52,53). Interestingly, the two zinc fingers of HIV-1 NC are not functionally equivalent (30); the first, or N-terminal, zinc finger is more important for genome packaging, although both motifs are required for optimal genome packaging (30,54). Removing or mutating the zinc fingers has also been reported to cause mislocalization of Gag within cells (55). In addition to the zinc-chelating CCHC, other residues within the zinc fingers, such as the conserved aromatic residues, also play an important role in HIV-1 RNA packaging specificity (30,48,52). Although the two zinc finger motifs are critical for RNA genome packaging, deleting both zinc fingers does not completely abolish RNA packaging, suggesting that other regions of Gag contribute to RNA binding (41). The HIV-1 NC domain contains multiple basic residues flanking and within the two zinc finger motifs. These basic residues are also important in RNA packaging. Abolishing multiple basic residues can cause very severe RNA packaging defects, whereas mutating a single basic residue often causes mild defects (36,45,56,57).
In addition to NC, the MA and CA domains in Gag have also been shown to play critical roles in HIV-1 RNA genome packaging. Mutations in the MA or CA domain often cause assembly defects, making it difficult to study their contributions to RNA packaging (58–63). To probe this question, a complementation system was established to separate the assembly and the RNA-binding functions of Gag, using one type of Gag to assemble particles and a second type of Gag to bind and package the RNA genome (64). It was shown that mutations in the RNA-binding Gag that abolish membrane targeting or Gag-Gag multimerization also cause RNA packaging defects (64). The requirement for Gag multimerization was confirmed by another study that analyzed interactions between CA-NC and HIV-1 RNA and showed that mutations that affect Gag multimerization also impact its ability to bind HIV-1 RNA specifically (65). Interestingly, MA can bind RNA (66); however, the MA domain mainly binds host RNA, including tRNA, in the cytoplasm (67). The RNA-binding activity of MA regulates Gag trafficking by preventing nonspecific membrane binding by Gag and ensuring that Gag is targeted to the plasma membrane (68). Using an in vitro binding assay, the p6 domain was suggested to be important in Gag binding to HIV-1 RNA (69). However, other studies found that the p6 domain is dispensable in Gag:RNA-specific binding (70) and RNA genome packaging (64), leaving the role of the p6 domain in RNA genome packaging unsubstantiated. The importance of membrane targeting by the MA domain and the Gag:Gag multimerization of the CA domain in facilitating RNA packaging suggests that HIV-1 Gag multimerizes on the RNA at the plasma membrane to package the RNA genome (64) (Fig. 2).
The 5′ UTR of the HIV-1 RNA, along with sequences that extend into the 5′ end of the gag gene, is necessary and sufficient to mediate genome packaging (100). This region can adopt different structural conformations that affect RNA functions. The HIV-1 5′ UTR is highly structured and contains multiple elements important for viral replication, including those essential for Gag recognition and packaging of a dimeric RNA (Fig. 1B) (4). Generally, the 5′ UTR secondary structure is composed of multiple stem loop motifs (Fig. 1B): the trans-activation response (TAR) stem loop, the polyA stem loop, a stem-loop region containing the primer-binding site (PBS), and a set of three stem loop structures termed SL1, SL2, and SL3. Numerous studies have attributed several functional roles to these elements, which are briefly discussed here. The TAR stem loop contains binding sites for the viral protein Tat and host protein cyclinT1. Tat recruits the host super elongation complex, which includes cyclinT1, to abrogate the pausing of RNA polymerase II and allow the elongation of RNA transcription to continue (101–108). The polyA stem contains the polyadenylation signal. HIV-1 RNA has two polyadenylation signals located near the 5′ and 3′ ends of the RNA. The 3′ polyadenylation signal is used for cleavage and the addition of the polyA tail to the HIV-1 RNA (2). The PBS stem loop plays an important role in reverse transcription; using sequence complementarity, a host tRNA is annealed to the 18-nt PBS and serves as a primer for the initiation of viral DNA synthesis (109,110). SL1, SL2, and SL3 contain multiple Gag/NC-binding sites, although the locations of these sites may vary in different studies (111–116). In addition, the loop of SL1 contains a GC-rich 6-nt palindromic sequence known as the dimerization initiation signal (DIS), which plays a critical role in the selection of copackaged RNA (117–119). DIS elements in two HIV-1 RNA molecules interact to form a "kissing loop" and initiate RNA dimerization (120–123). SL2 contains the major splice donor site (SD), which is used by HIV-1 to generate partially spliced and completely spliced transcripts (124). Interestingly, although these elements are important in HIV-1 RNA packaging, deleting any one or two stem loops, including TAR, polyA, PBS, SL1, SL2, and SL3, may cause genome packaging defects but does not completely abolish HIV-1 RNA packaging (100,125–131). These studies demonstrate that the cis-acting sequence important for packaging is not located within a single element; instead, multiple elements with redundant functions may exist to facilitate HIV-1 genome packaging.

Export
In addition to the elements described above, the ribosomal frameshift signal was once thought to play a role in RNA packaging (132), but a later study showed that this element does not affect HIV-1 genome encapsidation (133). As in cellular mRNAs, RNA modifications have been identified on HIV-1 RNA, including N6-methyladenosine (m6A) and acetylation of the N4 position of cytidine (ac4C). The reported effects of HIV-1 RNA modifications are mainly on RNA stability and gene expression (134–138), although a possible role of m6A in genome packaging has also been suggested (139). It was reported that the 5′ cap of HIV-1 RNA is trimethylated (71). The hypermethylated cap is reported to allow HIV-1 RNA to be translated through a pathway distinct from that of cellular RNAs (72); whether the cap modification affects packaging is unclear. It was shown that at the plasma membrane, Gag preferentially binds A-rich host mRNAs. However, the exact contribution of nucleotide composition to HIV-1 RNA packaging is difficult to assess. HIV-1 and other viruses have evolved to have low 5′-CG-3′ dinucleotide content in their RNA genomes to mimic host mRNAs and avoid detection by zinc-finger antiviral protein (ZAP) (140). Global alteration of HIV-1 sequences can inadvertently trigger an antiviral response (140) and/or alter intricately regulated HIV-1 splicing (141). Thus, whether the nucleotide composition affects HIV-1 genome packaging remains unclear.

DIS sequence and RNA dimerization
HIV-1 RNA dimerization relies heavily on complementary interactions between the DIS elements of two RNA molecules. In vitro experiments also demonstrated that the presence of DIS in SL1 promotes dimerization of RNA segments (117,142–146). The exact DIS sequence in the SL1 loop region varies, and most HIV-1 strains have a 6-nt palindromic DIS. Most subtype B viruses have a GCGCGC palindrome, whereas many subtype A, C, or G viruses have a GTGCAC palindrome (147,148). Early genetic studies showed that the complementarity of the DIS sequences affects recombination rates. Progeny between two proviruses with the same DIS exhibits high recombination rates, whereas progeny between two proviruses with DIS sequences that cannot form perfect base pairings exhibits low recombination rates (149). These studies suggest that the DIS identity affects the frequency of viral RNA copackaging (150). This was later demonstrated by directly examining viral RNA in particles. HIV-1 RNAs with identical DIS sequences are copackaged at a frequency similar to random assortment, indicating that RNAs generated from different proviruses can dimerize efficiently. However, two RNAs with different DIS sequences that cannot form perfect base-pairing, such as one RNA containing the subtype B DIS and the other RNA containing the subtype C DIS, are copackaged together much less frequently than random distribution. Conversely, two RNAs containing engineered DISs that are complementary to each other, such as one virus containing GGGGGG and the other virus containing CCCCCC, are copackaged together more efficiently than random distribution (13,98,151). These observations demonstrated that the two copackaged RNAs interact via the DIS to dimerize and the dimerization step occurs prior to packaging. Although the initial RNA:RNA interactions are mediated by DIS, HIV-1 RNAs in the virion engage in dimeric contacts far beyond DIS interactions, and Gag/NC plays an important role in the extended dimerization (152–157). Despite the proven role of the DIS in HIV-1 RNA dimerization, it has been reported that HIV-1 with deletions of the DIS, or SL1 including the DIS, can replicate; furthermore, their virion RNAs are dimeric (147,150,158–160). These results suggest that there may be other elements within the RNA that facilitate dimerization or that HIV-1 can employ alternate mechanisms to promote RNA dimerization.
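The base-pairing logic behind these copackaging results can be illustrated with a short sketch. It is a simplified illustration, not a method from the studies cited: it assumes that only perfect Watson-Crick pairing across the 6-nt loop matters (G:U wobble pairs and flanking nucleotides are ignored), and the function names are hypothetical. The last function gives the heterodimer fraction expected if two RNAs pair at random, assuming pairing is independent of RNA identity.

```python
# Simplifying assumption: Watson-Crick pairs only; G:U wobble is ignored.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def to_rna(seq):
    """Normalize a DIS written in DNA or RNA letters to uppercase RNA."""
    return seq.upper().replace("T", "U")


def reverse_complement(seq):
    """Return the reverse complement of a sequence, in RNA letters."""
    return "".join(COMPLEMENT[base] for base in reversed(to_rna(seq)))


def can_pair(dis_a, dis_b):
    """True if the two loops can form a perfect antiparallel duplex."""
    return to_rna(dis_a) == reverse_complement(dis_b)


def is_palindromic(dis):
    """True if a DIS can pair perfectly with an identical copy of itself."""
    return can_pair(dis, dis)


def expected_heterodimer_fraction(p):
    """Heterodimer fraction under random pairing, where p is the fraction
    of the RNA pool contributed by one of the two RNA species."""
    return 2 * p * (1 - p)


if __name__ == "__main__":
    print(is_palindromic("GCGCGC"))            # subtype B DIS
    print(is_palindromic("GTGCAC"))            # subtype A, C, or G DIS
    print(can_pair("GCGCGC", "GTGCAC"))        # mismatched DIS pair
    print(can_pair("GGGGGG", "CCCCCC"))        # engineered complementary pair
    print(expected_heterodimer_fraction(0.5))  # equal expression of both RNAs
```

Under these assumptions, both natural DIS sequences pair with themselves but not with each other, whereas the engineered GGGGGG and CCCCCC loops pair only with each other, which matches the direction of the copackaging biases described above.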

HIV-1 5′ UTR structures and Gag/NC-binding sites
Although the HIV-1 5′ UTR is known to be highly structured, its precise functional conformation(s) has been highly debated. Many predicted 5′ UTR RNA conformations have similar structures with minor variations such as the length of the stems and the locations of the bulges. Most, but not all, structures suggest that to facilitate genome packaging, TAR, polyA, SL1, SL2, and SL3 are formed, with DIS located at the top of SL1, and a region of U5 interacting with the Gag AUG start codon and its flanking sequences. Because HIV-1 RNA serves as the template for translation and the virion genome, it has been suggested that the 5′ UTR can adopt structures to favor translation or packaging (161) (Fig. 1C). In the conformation facilitating packaging, TAR, polyA, PBS, SL1, SL2, and SL3 RNA stem loops are formed with the DIS displayed within the loop of SL1, available for RNA dimerization (Fig. 1C, structure on the left). In the RNA structure facilitating translation, many of the stem loops, including SL1, are not formed; the DIS sequence forms base-pairing with sequences in the polyA stem and is not available for RNA dimerization (Fig. 1C, structure on the right). Mutagenesis experiments have been performed to test the proposed structures, and it was found that elements within the DIS-exposed conformation are important for RNA packaging, but translation is not dependent on the formation of predicted translation-competent structures (162).
Multiple studies have shown that Gag/NC preferentially binds unpaired guanosine (G) residues within the HIV-1 5′ UTR (111,163–165). Using a chemical probe reverse-footprinting assay, the NC binding of HIV-1 RNA within virions has been mapped (164). This study identified two primary binding sites, both located in bulges of SL1; two secondary binding sites, one located immediately upstream of SL1 and one located in SL2; and two tertiary binding sites, located within SL3 (164). The primary NC-binding sites described in this virion chemical probing study were also later identified as the major Gag-binding sites in an in vitro binding assay (116,166). Mutating one or more of these binding sites affects in vitro NC:RNA binding (116,166–168). The contribution of these Gag/NC-binding sites to HIV-1 RNA packaging was examined using a cell-based assay, and mutating the guanosines in the primary binding sites located within SL1 was shown to have the greatest effect. Importantly, combining mutations at different NC-binding sites causes synergistic defects in RNA packaging (111). These results suggest that there is redundancy in Gag-RNA interactions and that multiple Gag polyproteins need to bind to HIV-1 RNA to facilitate packaging.
Based on nuclear magnetic resonance (NMR) spectroscopy studies, a three-way junction structure was recently proposed (115,169). In this structure, SL1 exists with exposed DIS, but SL2 is not formed; instead, its sequences base pair with part of the PBS stem and sequences upstream of SL1, forming the first three-way junction (Fig. 1D). The second three-way junction is composed of SL3 with the downstream AUG region interacting with U5. The three-way junction structure model also predicted 17 exposed guanosines as putative NC-binding sites. It is worth noting that many of the predicted exposed guanosines overlap with the previously described NC-binding sites identified using chemical probing of virion RNA (164). To test the three-way junction model, mutants with multiple substitutions of these guanosines were generated and shown to have RNA packaging defects (111,115,169). Further studies identified a UUUU:GGAG base-pairing region to be critical for RNA packaging, with the two consecutive guanosines as a high-affinity NC-binding site (165). The UUUU and GGAG flank the SL3 shown in Fig. 1B; in addition, the two consecutive guanosines are described as an NC tertiary binding site in virion RNA (164,170). It was shown that the lability of the base-pairing is important for its function: stabilizing the interactions by substituting the uridines with cytosines abolishes high-affinity NC binding (165).
Using in-gel chemical probing, it was shown that RNA containing the HIV-1 5′ UTR forms multiple conformations (16) (Fig. 1E). Hence, it was also proposed that the HIV-1 5′ UTR forms an ensemble of conformations, some suitable for packaging, whereas others are not. RNAs with conformations that exhibit exposed DIS and many Gag-binding sites are likely to be packaged and serve as virion genomes (Fig. 1E, left). By contrast, in RNAs adopting other conformations, the DIS and major Gag-binding sites form intramolecular base-pairing and are sequestered, and are therefore not accessible to interact with another HIV-1 RNA or to allow Gag binding (Fig. 1E, right) (16).
Although each of these studies proposes distinct RNA structures and Gag/NC:RNA interactions, there are at least three points of consensus. First, HIV-1 5′ UTR RNA structures play an important role in genome packaging. Second, the formation of SL1 along with the availability of the DIS and SL3 are two common structural features essential for efficient genome packaging. Third, multiple Gag/NC-binding sites exist in the 5′ UTR and multiple proposed sites are used to mediate RNA packaging.
Despite being efficiently packaged during virus assembly (13), HIV-1 RNA is not required for Gag particle formation. In the absence of HIV-1 RNA, Gag can still assemble into particles and instead packages cellular mRNA (171). The mechanism by which HIV-1 ensures the packaging of viral RNA when it is not required for particle assembly is intriguing. It was shown that when Gag is expressed at levels reflecting those in infected cells, HIV-1 RNA promotes virus assembly and particle production (172). These findings lead to the hypothesis that with many Gag-binding sites, the dimerized HIV-1 RNA allows the binding of multiple Gag proteins and increases the local Gag concentration, which facilitates a switch in Gag conformation to promote Gag-Gag interactions leading to particle assembly (172). By contrast, lacking multiple specific Gag-binding sites, cellular mRNAs are less efficient at increasing the local Gag concentration needed to facilitate the Gag conformation switch and virus assembly. Therefore, compared to Gag binding to cellular mRNA, the Gag-dimeric RNA complex has a significant advantage in initiating virus assembly (172). This property provides a mechanism to ensure high-efficiency RNA genome packaging by HIV-1 (5).

Dynamics of HIV-1 RNA packaging: when, where, and how
The interactions between HIV-1 RNA and Gag are interwoven: HIV-1 RNA is translated to generate Gag/Gag-Pol polyproteins and Gag packages HIV-1 unspliced RNA as the viral genome during assembly. Where the initial Gag-RNA interaction occurs has been a long-standing question. Translation of HIV-1 RNA occurs within the cytoplasm where both Gag and RNA are present, providing ample opportunity for Gag:RNA interactions. It was once proposed that Gag packages the RNA from which it is translated (packaging in cis or cis-packaging) (173). This hypothesis was not supported by later studies, which showed that in cells expressing two proviruses, HIV-1 RNAs that do not encode functional Gag can be packaged as efficiently as those that encode functional Gag (174); thus, Gag does not preferentially package in cis.
Imaging studies reported that most of the cytoplasmic HIV-1 Gag is monomeric and uses diffusion as a main transport mechanism (175,176). By employing biochemical assays, HIV-1 RNA and Gag have been shown to interact in the cytoplasm (177). Using crosslinking-immunoprecipitation sequencing (CLIP-seq) analyses, it was shown that cytoplasmic Gag can bind HIV-1 RNA; however, most of the cytoplasmic RNA-associated Gag appears to be bound to host RNAs (67). By contrast, at the plasma membrane, Gag is predominantly bound to HIV-1 RNA. Thus, compared to the membrane-bound Gag, cytoplasmic Gag has far less selectivity toward binding to HIV-1 RNA (67). Because Gag is known to target and assemble at the plasma membrane, it was often suggested that the cytoplasmic interaction with Gag facilitates HIV-1 RNA trafficking to the plasma membrane. However, imaging studies showed that HIV-1 RNA can travel to the plasma membrane without Gag, indicating that Gag:RNA interactions are not required to localize HIV-1 RNA at the plasma membrane (178,179). Therefore, although Gag and HIV-1 RNA can interact in the cytoplasm, whether RNA packaging is a direct result of such interactions is unknown. Further studies will be needed to clarify the initial point of Gag:HIV-1 RNA interaction that leads to specific genome packaging. It was also reported that Gag can traffic to the nucleus and bind to RNA, leading to the hypothesis that Gag retrieves HIV-1 RNA for genome packaging (180). Deletion of a majority of MA, including the regions proposed to contain a nuclear localization signal (NLS) and nuclear export signal (NES), does not disrupt HIV-1 RNA packaging, suggesting that Gag nuclear trafficking is not essential for RNA packaging (181,182). The biological significance of Gag nuclear trafficking, or how frequently it occurs, awaits further studies.
A major location for Gag:HIV-1 RNA interactions that lead to genome packaging is the plasma membrane (Fig. 2). After exiting the nucleus, HIV-1 RNA mainly travels through the cytoplasm by diffusion; the presence of Gag does not affect the mode of transport or the speed of HIV-1 RNA movement (178). However, Gag does affect the time HIV-1 RNA stays on the plasma membrane (183). In the absence of Gag, HIV-1 RNA can travel to the plasma membrane but only remains there briefly. In the presence of Gag, even mutant Gag containing multimerization defects, HIV-1 RNA extends its stay at the plasma membrane. By contrast, Gag lacking the NC domain cannot retain HIV-1 RNA at the plasma membrane. Thus, the initial Gag:RNA interactions that allow HIV-1 RNA to be retained at the plasma membrane require the ability of Gag to bind RNA, but not Gag multimerization (183). Because the presence of Gag changes the behavior of most HIV-1 RNA, these findings indicate that most of the HIV-1 RNAs at the plasma membrane are associated with Gag.
HIV-1 RNA packaging and virus assembly processes have been studied using live-cell imaging techniques by fluorescently tagging both HIV-1 RNA and a portion of the Gag. In these studies, the RNA signal is often observed first, then Gag signals can be seen to colocalize with RNA with increasing intensity over time, suggesting that Gag assembles on the RNA scaffold (178,179,183,184). In these studies, a mixture of tagged and untagged Gag was used to make morphologically normal viral particles (185). In addition, each tagged Gag only had one fluorescent protein and could not be detected at a single-molecule level, whereas each HIV-1 RNA was tagged with multiple fluorescent proteins and could be detected at a single RNA level. For these technical reasons, the appearance of RNA signal without a Gag signal should not be interpreted as Gag-free HIV-1 RNAs.
Most viral particles contain two HIV-1 RNA molecules; however, the addition of a second, complementary, DIS in the same RNA allows the formation of a self-dimer and leads to the packaging of only one copy of self-dimerized RNA (151). Therefore, HIV-1 regulates its genome encapsidation by packaging a dimeric RNA. It was suggested that during MLV replication, RNA dimerization occurs in the nucleus (186,187). To probe this possibility, a cell-fusion approach was employed using two cell lines each harboring a provirus and conditions that facilitated cytoplasmic but not nuclear fusion (98). These experiments demonstrated that most of the RNA dimerization and partner selection process does not occur in the nucleus, but instead, in the cytoplasm or at the plasma membrane. A live-cell imaging approach was also used to examine the location of HIV-1 RNA dimerization (184). In this approach, two different HIV-1 RNAs were tagged with distinct fluorescent proteins so that dimerized RNA was visualized by colocalization of the two fluorescent RNA signals. In addition, HIV-1 Gag was tagged with a third fluorescent protein, and the process of RNA dimerization and virus assembly on the plasma membrane was studied. These experiments showed that two different RNA signals often merge at the plasma membrane, followed by an increase in Gag signals consistent with particle assembly (184). HIV-1 RNA molecules also interact in the absence of Gag, but these complexes quickly fall apart, indicating that Gag is required to stabilize the RNA dimer (184). These findings indicate that HIV-1 RNAs dimerize not in the cytoplasm but on the plasma membrane, and that stabilization of the RNA dimer requires the Gag protein. Dimerization often occurs at an early stage of the virus assembly process. Furthermore, the dimerization process is likely mediated by the interactions of two RNA-Gag complexes, rather than two RNAs (184).
HIV-1 RNA can undergo two distinct biological processes: translation and packaging. The relationship between translation and packaging was explored using a live-cell imaging assay that can distinguish HIV-1 RNA molecules that are being actively translated from those that are not (188). It was found that both types of HIV-1 RNA can reach the plasma membrane. However, HIV-1 RNA molecules that are being actively translated do not stay near the plasma membrane, whereas nontranslating HIV-1 RNA molecules are retained on the plasma membrane to allow Gag accumulation. These results indicate that the nontranslating RNA is selected as a substrate to facilitate Gag multimerization and virus assembly (188). It should be noted that this study distinguishes whether or not an RNA is being actively translated but does not address whether a previously translated RNA can serve as a virion genome.

The impact of transcription start sites on HIV-1 RNA packaging
HIV-1 has three consecutive guanosines at the junction of U3 and R. Most of the time, host RNA Pol II initiates at one of the three guanosines to generate unspliced RNAs with three, two, or one guanosine, referred to as 3G, 2G, and 1G RNA, respectively (14). It was reported that 3G RNA is more abundant in infected cells; however, 1G RNA is selected over 3G RNA and is enriched in virions (14), which was later confirmed by other groups (15–17,111). These findings revealed that the 5′ context of the HIV-1 RNA can affect its functions. All the RNA elements present in 1G RNA are also contained in 3G RNA; thus, it is intriguing how HIV-1 can distinguish between two 9 kb RNAs that differ by only 2 nt. The functions of many RNA elements in the HIV-1 5′ UTR are structure-dependent. Therefore, the current prevailing hypothesis is that the 5′ UTRs of 1G and 3G RNA fold into different conformations: the 1G RNA, but not 3G RNA, folds into conformation(s) that facilitates packaging.
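To make the scale of this difference concrete, the following minimal sketch builds toy 1G, 2G, and 3G transcripts and computes how similar the 1G and 3G species are. It is illustrative only: the shared body is a placeholder of "N" characters rather than real HIV-1 sequence, the 9,000-nt length approximates the "9 kb" quoted above (the polyA tail is ignored), and the function names are hypothetical.

```python
def transcripts_by_start_site(shared_body):
    """Build toy 1G, 2G, and 3G RNAs that differ only in the number of
    guanosines preceding an otherwise identical body."""
    return {f"{n}G": "G" * n + shared_body for n in (1, 2, 3)}


def percent_identity(shorter, longer):
    """Percent identity when the shorter RNA is entirely contained at the
    3' end of the longer RNA, as for 1G RNA within 3G RNA."""
    if not longer.endswith(shorter):
        raise ValueError("shorter RNA must be a 3' suffix of the longer RNA")
    return 100.0 * len(shorter) / len(longer)


# Placeholder body (assumption): 8,999 nt, so that 1G RNA is 9,000 nt long.
SHARED_BODY = "N" * 8999

rnas = transcripts_by_start_site(SHARED_BODY)
identity = percent_identity(rnas["1G"], rnas["3G"])

if __name__ == "__main__":
    for name, rna in rnas.items():
        print(name, len(rna), "nt")
    print(f"1G vs 3G identity: {identity:.2f}%")
```

Under these assumptions, the two RNAs share all but 2 of roughly 9,000 positions, consistent with the description of these species as 99.9% identical.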
The precise conformations of the 5′ UTR that render 3G and 1G RNA non-packageable and packageable, respectively, are still being debated.The guanosine(s) at the very 5′ end of the HIV-1 RNA, along with the 5′ cap, can form base-pairing with downstream sequences and become part of the TAR stem (Fig. 1D and E).The U5:DIS alternate base-pairing model proposes that the DIS can form base-pairing with U5 sequences, making it unavailable for RNA dimerization and preventing genome packaging (Fig. 1D).This alternative structure is more stable in 3G RNA than in 1G RNA (15).In addition, binding studies using in vitro transcribed RNA containing the HIV-1 5′ UTR showed that the host factor eIF4E binds 3G-like short RNA better than 1G-like short RNA (189).Thus, this model also proposes that the cap of 1G RNA is sequestered, preventing RNA from being translated.Therefore, 1G RNA is used as the viral genome, whereas 3G RNA is used as the mRNA (189).
By contrast, the "zip and pack" model proposes that the polyA stem is a keystone structural element and that its formation promotes folding of the downstream 5′ UTR sequences into conformations with exposed DIS and major Gag-binding sites so that the RNA can be packaged (16). In the 1G RNA, the cap and the first guanosine base pair with two cytosines at the end of TAR, allowing stable polyA stem formation and the subsequent folding of the 5′ UTR into packageable conformers. Conversely, the additional guanosines present in 3G RNA can extend beyond base-pairing with TAR and invade the polyA stem, destabilizing the structure. Thus, 3G RNA tends to have a destabilized polyA stem; consequently, the downstream 5′ UTR folds into structures in which the DIS and the major Gag-binding sites are sequestered, disfavoring RNA packaging (Fig. 2). This model is supported by mutational analyses: mutants with a 2-nt insertion between the TAR and polyA stems can package 3G RNA efficiently because the insertion prevents the very 5′ guanosines in the 3G RNA from destabilizing the polyA stem (16).
Additional studies demonstrated that these two features (multiple transcription start sites and preferential packaging of one RNA species) are not unique to HIV-1 but are also present in other primate lentiviruses (190). Isolates of simian immunodeficiency virus (SIV), closely related to the ancestral viruses that caused zoonotic infection of humans, exhibit similar utilization of multiple transcription start sites and selective packaging, indicating that HIV-1 retains these conserved characteristics from ancestral viruses (190). In addition, HIV-2 also uses multiple transcription start sites and preferentially packages a specific RNA. Features in viral replication often provide advantages in replication fitness. To probe whether the transcription start site usage provides HIV-1 with a replication advantage, HIV-1 mutants were generated that either predominantly expressed 3G RNA with very little 1G RNA or predominantly expressed 1G RNA with no 3G RNA (17). It was found that both mutants can undergo multiple rounds of replication, indicating that neither 3G nor 1G RNA, the two major HIV-1 transcripts, is strictly required for spreading infection (17). Another study with similar mutants also confirmed these observations (191). However, neither mutant virus can replicate as well as the wild-type virus (17). Thus, HIV-1 expresses different unspliced transcripts with specified functions to optimize replication fitness (17).

SUMMARY AND PROSPECTIVES
HIV-1 must package its RNA genome to generate infectious viruses to spread to new hosts. How HIV-1 identifies and packages two copies of unspliced viral RNA from a vast pool of other cellular RNAs has been a long-standing question in the field. Studies suggest that the cis-acting elements important for genome encapsidation reside in the HIV-1 5′ UTR and that their functions are structure-dependent. The HIV-1 5′ UTR forms complex structures, some of which have exposed DIS and multiple Gag-binding sites that facilitate genome packaging. Gag interacts with HIV-1 RNA at the plasma membrane and Gag:RNA complexes initiate dimerization using the DIS sequences, followed by accumulation of Gag on the RNA dimer complex to assemble HIV-1 particles. The selection of the HIV-1 RNA dimer is based on Gag-binding sites on the 5′ UTR and the ability of the dimeric HIV-1 RNA to serve as a preferred scaffold to promote Gag assembly. However, many aspects of this process are not well defined or understood, such as the HIV-1 RNA structure(s) that promote packaging, the properties of HIV-1 RNA that make it a suitable scaffold to promote virus assembly, how two RNA:Gag complexes merge and dimerize, and whether RNA modifications can affect the packaging of HIV-1 RNA. Lastly, because genome packaging is a critical step of viral replication, whether strategies can be developed to interfere with it, thereby abolishing the production of infectious particles, remains an open question. Future studies on these topics and others will advance our understanding of HIV-1 RNA genome packaging and may inform the development of novel antiviral strategies.

FIG 1 Cis- and trans-acting elements important for selective HIV-1 RNA packaging. (A) Schematic of the HIV-1 Gag polyprotein containing matrix (MA), capsid (CA), spacer peptide 1 (SP1), nucleocapsid (NC), spacer peptide 2 (SP2), and p6 domains. Gag is also N-terminally myristoylated, as denoted at the N-terminus of MA (squiggle). (B) Structural elements within the 5′ UTR of HIV-1 RNA (left) and the element necessary for nuclear export (right). Conserved stem loops within the 5′ UTR are highlighted from left to right as the trans-activation response (TAR) element, polyA, primer-binding site (PBS), SL1, SL2, and SL3. Additional features highlighted are a portion of U5 sequences (blue), dimerization initiation signal (DIS; pink), splice donor (SD; orange), exposed guanosines in SL3 (purple), and start codon of Gag (AUG; green). The highly structured Rev-response element (RRE) located within the env gene is depicted on the right. Structures of the 5′ UTR conformational switch models proposed by (C) the Berkhout group and (D) the Summers/Telesnitsky groups, and the zip and pack model proposed by (E) the Musier-Forsyth/Hu groups. The 5′ UTR structures predicted to favor and inhibit packaging are shown on the left and right, respectively. Highlighted features correspond to labeling in (B).

FIG 2 Steps of HIV-1 RNA packaging and virus assembly. Unspliced HIV-1 RNA nuclear export is facilitated by Rev:RRE interactions with host machinery. These RNA species can form distinct 5′ UTR structures that lead to preferential packaging or translation. HIV-1 RNA is held at the plasma membrane by a small number of Gag molecules, anchored by the myristic acid moiety and the PIP2-binding site on MA. Dimerization of HIV-1 RNA occurs on the plasma membrane and facilitates the rapid accumulation of Gag proteins.