Solution Structure of the HIV-1 Intron Splicing Silencer and Its Interactions with the UP1 Domain of Heterogeneous Nuclear Ribonucleoprotein (hnRNP) A1

Splicing patterns in human immunodeficiency virus type 1 (HIV-1) are maintained through cis regulatory elements that recruit antagonistic host RNA-binding proteins. The activity of the 3′ acceptor site A7 is tightly regulated through a complex network of an intronic splicing silencer (ISS), a bipartite exonic splicing silencer (ESS3a/b), and an exonic splicing enhancer (ESE3). Because HIV-1 splicing depends on protein-RNA interactions, it is important to know the tertiary structures surrounding the splice sites. Herein, we present the NMR solution structure of the phylogenetically conserved ISS stem loop. ISS adopts a stable structure consisting of conserved UG wobble pairs, a folded 2X2 (GU/UA) internal loop, a UU bulge, and a flexible AGUGA apical loop. Calorimetric and biochemical titrations indicate that the UP1 domain of heterogeneous nuclear ribonucleoprotein A1 binds the ISS apical loop site-specifically and with nanomolar affinity. Collectively, this work provides additional insights into how HIV-1 uses a conserved RNA structure to commandeer a host RNA-binding protein.

Human immunodeficiency virus type 1 (HIV-1) requires controlled synthesis of its protein complement for persistent infection and successful virion production. Genome expression is tightly regulated at the levels of transcription, splicing, mRNA nuclear export, and translation (1). RNA polymerase II-dependent transcription yields a 9-kilobase (kb) polycistronic transcript that undergoes multiple rounds of alternative splicing to produce upward of 100 different viral mRNAs that are classified by their intron composition: unspliced, incompletely spliced (4 kb), and completely spliced (2 kb) (2,3). During early phase HIV-1 replication, cytoplasmic levels of the 2-kb transcripts predominate to encode Tat, Rev, and Nef. Accumulation of unspliced and 4-kb transcripts coincides with the transition to late phase replication and expression of Gag, Gag/Pol, Vpr, Vif, and Env/Vpu. Thus, HIV-1 splicing pathways are essential components of the viral replication cycle and represent new targets for therapeutic intervention (3,4).
The identities of HIV-1 transcripts are determined by the combinatorial use of several 5′ donor and 3′ acceptor splice sites. Efforts to understand mechanisms of HIV-1 splicing reveal that all of the acceptor sites and splice donors D2-D4 are suboptimal due to non-consensus core splicing signals (4,5). Proper spliced ratios are therefore established via auxiliary cis RNA features, collectively known as splicing regulatory elements. Splicing regulatory elements function either as enhancers or silencers of splicing by recruiting host proteins belonging to the serine-arginine-rich (SR) and hnRNP families, respectively. Splicing regulatory element location (intronic or exonic) influences how frequently a given splice site is utilized, thereby making it difficult to determine general rules of the mechanisms that regulate alternative splicing (6).
Splicing from site D4 to A7 removes the Rev-responsive element, which leads to the accumulation of 2-kb transcripts that exit the nucleus in a Rev-independent manner. The extent of splicing to site A7 (ssA7) is tightly regulated through a complex network of three splicing regulatory elements: an intronic splicing silencer (ISS), a bipartite exonic splicing silencer (ESS3a/b), and an exonic splicing enhancer (ESE3) (7-11). RNA secondary structure probing revealed that the isolated ssA7 locus folds into a conserved domain consisting of three stem loops where the ISS, ESE3, and ESS3 elements reside within single strand loop regions (Fig. 1). Cellular proteins hnRNP A1 and SRSF1 (ASF/SF2) were identified as the trans repressor and activator of ssA7 by forming complexes with their respective silencer and enhancer elements (12,13). Interactions between hnRNP A1 and ssA7 effectively block early spliceosome assembly, whereas SRSF1 counteracts the block by stabilizing U2AF loading at the 3′ splice site (5,7,8).
To better understand the structural basis of ssA7 regulation, we previously determined the solution NMR structure of the ESS3 stem loop (SL3 ESS3) and more recently elucidated its mode of interaction with the UP1 domain of hnRNP A1 (14-16). UP1 binds site-specifically to an AG dinucleotide motif of the SL3 ESS3 apical loop (16). Secondary structure probing of the isolated ssA7 locus revealed that the ISS element folds into a stem loop structure with an exposed AGUGA apical loop (Fig. 1); however, a recent SHAPE model of the entire HIV-1 genome showed that the AGUGA sequence is involved in base pairing to form a short helix (17). Because the accurate position of the AGUGA silencer element is key to understanding its function on ssA7, we determined the NMR solution structure of the isolated ISS stem loop (SL1 ISS) and characterized its interactions with UP1. Our structural studies reveal that SL1 ISS folds into a stable structure with an exposed AGUGA apical loop that binds UP1 specifically and with high affinity.

Experimental Procedures
Nucleotide sequences (excluding recombinants) comprising the ISS element were derived from the Los Alamos HIV sequence database (www.hiv.lanl.gov/content/sequence/HIV/mainpage.html) for HIV-1 group M subtypes (21 total) for which at least 10 sequences had been submitted. Sequences for individual subtypes were aligned in Geneious using the ClustalW multiple sequence alignment algorithm (18). Consensus sequences and logos for individual subtypes were generated also using Geneious based on the majority nucleotide present at each site. This alignment was then used for the calculation of the mean evolutionary distance and inference of a maximum likelihood phylogenetic tree using 1,000 bootstrap replicates in Mega v5.2 (19). The maximum composite likelihood method (20) was used for distance calculation, whereas the Hasegawa-Kishino-Yano model (21) with γ-distributed rate variation among sites was chosen for tree reconstruction based on the likelihood ratio test of log likelihood values for each nested model provided using the model test also implemented in Mega.
Preparation of ISS Constructs-Three ISS constructs of different sizes were used in this study: a 53-nt SL1 ISS (7904-7955) construct, a 33-nt SL1 ISS-33 (7915-7945) construct, and a 25-nt SL1 ISS-25 (7920-7940) construct where numbering corresponds to native residues of HIV-1 Bru. The RNA samples were in vitro transcribed from synthetic DNA templates (Integrated DNA Technologies) as described previously (15). For NMR studies, 13C-, 15N-, and 2H-labeled rNTPs were purchased from Cambridge Isotope Laboratories. Following synthesis, the RNA samples were purified by denaturing PAGE (8-16% depending on size), electroeluted, and loaded onto a HiPrep 16/60 Sephacryl S-100 column (GE Healthcare) to desalt and remove soluble acrylamide. RNA samples for NMR analysis were exchanged into 5 mM K2HPO4 (pH 6.5) and 90% H2O, 10% D2O or 100% D2O using a Millipore Amicon Ultra-15 centrifugal device. RNA samples prepared for calorimetry and analytical size exclusion chromatography were exchanged into 10 mM K2HPO4 (pH 6.5), 120 mM KCl, 10 mM NaCl, 0.5 mM EDTA, 1 mM tris(2-carboxyethyl)phosphine.
FIGURE 1. A, genetic landscape of the HIV-1 genome highlighting the nine open reading frames along with donor and acceptor splice sites. B, the experimentally determined RNA secondary structure (HIV-1 Bru) surrounding site A7 reveals that its cis regulatory elements localize to three stem loops: SL1, intron splicing silencer (red); SL2, exonic splicing enhancer 3 (green); and SL3, exonic splicing silencer 3a/b (red). The polypyrimidine tract is colored blue, and the arrow points to the location of splice site A7. LTR, long terminal repeat.
Folding studies were carried out by heating the samples under dilute conditions to 95°C for 3 min followed by snap cooling on ice. Native polyacrylamide gels were run to check the conformational homogeneity. Independent of the length of the RNA fragment, the predominant species (>95%) migrated as a monomer under dilute RNA concentrations. However, under high RNA (excess of 150 µM) and high salt concentrations, we observed slower migrating species for SL1 ISS and SL1  . Therefore, all NMR experiments were carried out using RNA concentrations that ranged from 80 to 120 µM and in low ionic strength buffer. Single conformers were resolved under very dilute RNA concentrations, and high ionic strength conditions were used for calorimetric and size exclusion titrations. Theoretical molar extinction coefficients were calculated using the Nanodrop 2000 calculation tool.
NMR Data Acquisition, Processing, and Analysis-Two-dimensional NMR experiments were carried out using Bruker Avance (800- and 900-MHz) high field NMR spectrometers equipped with cryogenically cooled, HCN triple resonance probes and a z axis pulsed field gradient accessory. To check RNA conformational homogeneity and purity, one-dimensional 1H NMR experiments were performed using a Bruker Avance III HD (500 MHz) equipped with the Prodigy broadband cryoprobe. All NMR data were processed by NMRPipe/NMRDraw (22) and analyzed using the software NMRViewJ (23). Exchangeable 1H spectra were measured at 288 K with the Watergate NOESY (τm = 250 ms) pulse sequence. The Watergate NOESY experiment was collected on an [2H]AC-selectively labeled SL1 ISS sample. Non-exchangeable protons were assigned following well established procedures. 1H-1H NOESY (τm = 250 ms) and 1H-1H TOCSY (τm = 75 ms) spectra were recorded in 100% D2O at 298 K on SL1 ISS, SL1 ISS-33, and SL1 ISS-25 samples labeled with different combinations of deuterated rNTPs including equimolar rNTPs, rRTP (3′,4′,5′,5″) and rYTP (5,3′,4′,5′,5″). 1H-13C heteronuclear multiple quantum coherence spectra were recorded for the different ISS constructs to verify NOE assignments. Chemical shift assignments were determined for all aromatic protons and ribose positions H1′-H2′.
Structure Calculations-Distance restraints used in structure calculations were extracted from 1H-1H NOESY (τm = 250 ms) spectra collected on full-length SL1 ISS constructs prepared with different combinations of 2H-labeled rNTPs. The smaller constructs (SL1 ISS-33 and SL1 ISS-25) were used only to aid in chemical shift assignments. Restraint boundaries were defined qualitatively by the intensity of the NOE and grouped into strong (1.8-3.0 Å), medium (2.5-4.5 Å), weak (3.5-6.0 Å), and very weak (3.5-7.5 Å) bins. Very weak distance restraints were only applied for residues from the apical and bulge loops. Sugar pucker restraints were extracted from analysis of H1′-H2′ TOCSY cross-peak intensities. Medium to strong TOCSY cross-peaks were observed for three of the five nucleotides (7930-7932) in the apical loop. Additionally, the bulge loop uridines (7923-7924) showed moderate H1′-H2′ TOCSY peaks. These residues were therefore allowed to sample both C2′-endo and C3′-endo conformations. All other residues have C3′-endo sugar puckers. A-form distance restraints involving H8/H6 to H3′-H4′ were included based on observed NOE interactions between H8/H6 and H1′-H2′ and knowledge of the sugar puckers.
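The qualitative binning described above can be written down as a simple lookup. The following is a minimal sketch, not the restraint-generation script used in this work; the function name `noe_bounds` and the `in_flexible_loop` flag are hypothetical, and only the distance ranges are taken from the text.

```python
# Distance bounds (lower, upper) in angstroms for each qualitative NOE class,
# copied from the bins listed in the text.
NOE_BINS = {
    "strong": (1.8, 3.0),
    "medium": (2.5, 4.5),
    "weak": (3.5, 6.0),
    "very weak": (3.5, 7.5),
}


def noe_bounds(intensity_class, in_flexible_loop=False):
    """Return (lower, upper) distance bounds for a qualitative NOE class.

    `in_flexible_loop` is a hypothetical flag marking restraints that involve
    apical-loop or bulge-loop residues; the text states that very weak
    restraints were applied only to those residues.
    """
    if intensity_class not in NOE_BINS:
        raise KeyError(f"unknown NOE class: {intensity_class!r}")
    if intensity_class == "very weak" and not in_flexible_loop:
        raise ValueError("very weak restraints are limited to loop residues")
    return NOE_BINS[intensity_class]
```

Calling `noe_bounds("medium")` returns `(2.5, 4.5)`, whereas requesting a very weak restraint for a helical residue raises an error, mirroring the rule stated above.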
Standard A-form backbone dihedral angle restraints (±20°) were applied to all residues except 7923, 7924, and 7930-7932. Finally, the glycosidic angle was restrained to be anti (180 ± 90°) for all residues except G7930 and G7932, which were restrained in a syn (0 ± 90°) conformation as evident by a downfield C8 chemical shift (see below).
Hydrogen bond and planarity (Xplor-NIH only; 20 kcal/mol Å²) restraints were applied for all WC and UG base pairs that were consistent with one-dimensional 1H chemical shifts, 15N chemical shifts (1H-15N HSQC), and NOE cross-peak patterns derived from 1H-1H NOESY spectra collected in H2O and D2O. Loose hydrogen bond restraints were applied to residues involved in the 2X2 internal loop based on NOE cross-peak patterns (see below).
Initial SL1 ISS structures (256) were calculated in Xplor-NIH 2.34 (25) as described previously (15) with the exception that the second stage gentle folding was not used (26); instead, the 10 lowest energy structures from Xplor-NIH were further refined in AMBER12 using the ff99bsc0 OL3 force field and Particle Mesh Ewald Molecular Dynamics. The structures were prepared for simulation in AMBER using xLeap. In both minimization and production runs, the SL1 ISS constructs were simulated in implicit solvent using the pairwise generalized Born model (27,28) with a 10 mM salt concentration along with a 24-Å cutoff for non-bonded interactions and a 10-Å cutoff for calculation of the Born radii.
The structures were first minimized over 4,000 steps using 2,000 steps of steepest descent followed by 2,000 steps of conjugate gradient. After minimization, 20 production runs were performed with 1 ns of simulation time (500,000 steps of 2 fs each). During this refinement step, the temperature was increased from 0 to 300 K over 100 ps, held at 300 K for 800 ps, and decreased from 300 to 0 K over 100 ps. NMR restraints were included at a constant weight (20 kcal mol⁻¹ Å⁻¹) for inter- and intranucleotide distances along with hydrogen bonding restraints. RDCs were implemented in the calculations as single value restraints containing weak weighting coefficients (0.1) (29). The alignment tensor was determined in AMBER by first fitting the tensor to the Xplor-NIH input structure. The initial tensor components were then simultaneously optimized along with the atomic coordinates. Langevin dynamics with a collision frequency of 1 ps⁻¹ was used for temperature control, and SHAKE was used to constrain bonds involving hydrogen. The final 10 lowest energy structures were visualized in PyMOL (30), and quality assessments were performed using MolProbity (31).
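The refinement schedule above is internally consistent: 500,000 steps of 2 fs give 1 ns, which matches the 100 ps heating, 800 ps hold, and 100 ps cooling segments. The sketch below checks that arithmetic and expresses the target temperature as a function of time. It assumes linear ramps, which the text does not specify, and the function name `target_temperature` is hypothetical; it is not an AMBER input file.

```python
TIMESTEP_FS = 2        # integration time step, from the text
N_STEPS = 500_000      # steps per production run, from the text

HEAT_PS, HOLD_PS, COOL_PS = 100.0, 800.0, 100.0  # schedule segments, from the text
T_MAX_K = 300.0

# Total simulated time in picoseconds (1 ps = 1000 fs).
total_ps = N_STEPS * TIMESTEP_FS / 1000.0


def target_temperature(t_ps):
    """Target temperature (K) at time t_ps, assuming linear heating and cooling ramps."""
    if not 0.0 <= t_ps <= HEAT_PS + HOLD_PS + COOL_PS:
        raise ValueError("time lies outside the 1-ns schedule")
    if t_ps < HEAT_PS:
        return T_MAX_K * t_ps / HEAT_PS                      # heating ramp
    if t_ps <= HEAT_PS + HOLD_PS:
        return T_MAX_K                                       # plateau
    return T_MAX_K * (HEAT_PS + HOLD_PS + COOL_PS - t_ps) / COOL_PS  # cooling ramp
```

Under the linear-ramp assumption, the target is 150 K halfway through either ramp and 300 K throughout the plateau.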

Size Exclusion Chromatography in Line with Small Angle X-ray Scattering (SEC-SAXS) Analysis of SL1 ISS - SEC-SAXS experiments were performed on SL1 ISS (5 mM MES, 10 mM KCl (pH 6.5)) at BioCAT (beamline 18-ID, Advanced Photon Source). Data were collected and processed as described previously (16) with the exception of an extended detector q range (q ≈ 0.005-0.38 Å⁻¹). Initial molecular reconstructions were performed in Primus (32) from the ATSAS (33) program suite. Guinier fitting was used to test for aggregation and estimate the radius of gyration (Rg) (Rg × q < 1.3), and the curve was fit to generate the pair distance distribution P(r) plot using GNOM (34). Initial ab initio models were created using DAMMIF (35) where 20 independent models were generated. The models were aligned and averaged using DAMAVER (36), and the most probable model from the average was generated using damfilt.
The final molecular envelope was calculated using DAMMIN (37) in slow mode. To reduce degeneracy among the models, a method similar to that used by Fang et al. (38) was implemented where a parallelepiped starting volume in DAMMIN was utilized with starting dimensions ~20 Å greater than the initial damfilt model. Using the same GNOM output for initial reconstruction, 20 models were generated followed by alignment and averaging by DAMAVER. The damstart model generated by DAMAVER underwent a final refinement using DAMMIN in expert mode, producing a final molecular envelope with a χ² value of 0.97 to the experimental data.
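For readers unfamiliar with Guinier analysis, the estimate of Rg rests on the low-q approximation ln I(q) ≈ ln I(0) − (Rg²/3) q², fit only over points satisfying Rg × q < 1.3. The following is a minimal pure-Python sketch of that idea on synthetic data. It is not the Primus implementation; the function names, the iterative window selection, and the synthetic curve (Rg = 24.9 Å, I(0) = 100) are assumptions made for illustration.

```python
import math


def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def guinier_rg(q, intensity, qrg_max=1.3, n_start=5, max_iter=50):
    """Estimate (Rg, I0) from ln I versus q^2, keeping only points with q * Rg < qrg_max.

    Assumes q is sorted in ascending order. The fit window is re-derived from
    the current Rg estimate until it stops changing.
    """
    n = n_start
    rg = i0 = None
    for _ in range(max_iter):
        slope, intercept = linear_fit(
            [qi ** 2 for qi in q[:n]],
            [math.log(ii) for ii in intensity[:n]],
        )
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; Rg is undefined")
        rg, i0 = math.sqrt(-3.0 * slope), math.exp(intercept)
        n_new = max(n_start, sum(1 for qi in q if qi * rg < qrg_max))
        if n_new == n:
            break
        n = n_new
    return rg, i0


# Synthetic, noise-free Guinier curve used only to exercise the function.
q_demo = [0.005 + 0.001 * k for k in range(100)]
i_demo = [100.0 * math.exp(-((qi * 24.9) ** 2) / 3.0) for qi in q_demo]
rg_fit, i0_fit = guinier_rg(q_demo, i_demo)
```

On real data, upward curvature of ln I versus q² at the lowest angles would indicate aggregation, which is the diagnostic use mentioned in the text.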
Incorporation of SAXS Molecular Reconstruction during AMBER Refinement-The resulting SAXS molecular envelope was used as a shape constraint in AMBER 12 using the EMAP module (39). The molecular envelope was fit to the lowest energy NMR-derived structure using SUPCOMB (40), allowing for an enantiomer search with the SAXS reconstruction. A 5-Å volumetric map was written from the SAXS ab initio bead model using the EMAP module. The simulation protocol was identical to the method described above with the addition that the SAXS molecular reconstruction was incorporated in the simulation (alongside NOE and RDC restraints) with a constraining constant (c map ) of 0.01 kcal/g. The SAXS refined structure and lowest energy NMR structure were compared with the experimental scattering curve in CRYSOL (41) and visualized in PyMOL (25).
UP1 Expression and Purification-The C-terminal His6-tagged UP1 protein was prepared as described previously (15).
Calorimetric Titrations of UP1-Calorimetric titration studies were carried out at 25°C using a VP-ITC calorimeter (MicroCal, LLC) with the SL1 ISS-33 and SL1 ISS-25 RNA oligonucleotides. Each RNA sample was prepared by diluting to a concentration of 2.5 µM in binding buffer (10 mM K2HPO4, 120 mM KCl, 10 mM NaCl, 0.5 mM EDTA, 1 mM tris(2-carboxyethyl)phosphine (pH 6.5)). C-terminal His6-tagged UP1 protein was prepared for the titration studies by exchanging it in the same binding buffer as used for RNA sample preparation using the Amicon Ultra-4 centrifugal filter devices. The UP1 protein (~50 µM) was titrated into ~1.4 ml of 2.5 µM RNA over 36 injections of 8 µl each. Prior to fitting to a 1:1 binding isotherm in Origin v7.0, the raw data were corrected for heats of dilution by subtracting the average of the last eight values of the titration curve.
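As a quick sanity check on the titration design, the quantities above imply that the protein ends in roughly fourfold molar excess over the RNA. The sketch below works through that arithmetic. It assumes the nominal concentrations and volumes quoted in the text and ignores the volume displaced from the calorimeter cell during injection; the function name is hypothetical.

```python
def final_molar_ratio(titrant_um, injection_ul, n_injections, cell_um, cell_ml):
    """Approximate titrant:cell molar ratio after the last injection.

    Units: micromolar for concentrations, microliters for injections,
    milliliters for the cell. Cell overflow is ignored (an approximation).
    """
    # uM x ul = pmol; divide by 1000 to express in nmol.
    titrant_nmol = titrant_um * injection_ul * n_injections / 1000.0
    # uM x ml = nmol.
    cell_nmol = cell_um * cell_ml
    return titrant_nmol / cell_nmol


# Nominal values from the text: ~50 uM UP1, 36 injections of 8 ul,
# ~1.4 ml of 2.5 uM RNA.
ratio = final_molar_ratio(50.0, 8.0, 36, 2.5, 1.4)
```

With these inputs, 14.4 nmol of UP1 is delivered to 3.5 nmol of RNA, so the final injections fall well past the 1:1 equivalence point, consistent with using the last eight injections to estimate the heat of dilution.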

The ISS Element Folds into a Phylogenetically Conserved Stem Loop-To assess the conservation of the RNA region surrounding ISS, we performed phylogenetic alignments of HIV-1 group M subtype reference strains for which there were at least 10 sequences in the Los Alamos HIV-1 database. We also included sequences from SIVcpz. In total, we analyzed 10,353 HIV-1 and 17 SIVcpz sequences. Maximum likelihood analysis for group M subtype-specific and SIVcpz consensus sequences revealed relatively low evolutionary diversity for the ISS element with a mean genetic distance of 0.06 ± 0.02 substitution per site in contrast to ~0.45 substitution per site (42) reported for the surrounding HIV-1 env region as a whole. Fig. 2A shows subtype-specific consensus logos determined from the alignments and the corresponding maximum likelihood phylogenetic tree. As expected from the maximum likelihood analysis, the group M consensus sequences exhibit a very high degree of conservation with a conspicuous 5-nt uridine tract and a complementary 5′-AGGGA-3′ motif centered ~36 nt downstream. The 5′-AGGGA-3′ motif is also conserved in SIVcpz; however, the uridine tract is variable. Secondary structure predictions (group M sequences) using the most frequently occurring nucleotides from the global consensus sequence reveal that these subregions base pair to fold a stem loop structure that consists of a 5-nt apical loop, a 2X2 internal loop, a UU bulge, and an unpaired G (Fig. 2B). The consensus secondary structure determined here agrees with earlier phylogenetic work based on a much smaller set of group M sequences (12,13). Folding of consensus sequences from individual subtypes further supports that the ability to form a stem loop structure is a phylogenetic feature of this region of env wherein a perfectly base-paired lower helix and purine-rich apical loop are the most conserved secondary structural elements (Fig. 3).
The variations in the size of the apical and internal loops observed might augment the strength of the ISS element in a clade-specific manner. Despite the predominance of the ISS stem loop structure across group M viruses, compensatory base pair substitutions are not observed. The lack of covariation indicates that this region of env is under tight pressure to maintain amino acid identity. Nevertheless, the gross features of the ISS stem loop determined here are observed in the most recent SHAPE map of the HIV-1 NL43 genome, which was recalibrated based on phylogenetic comparisons with SIVcpz and SIVmac (17). Therefore, we conclude that the ISS element exists in a genomic region capable of forming intramolecular base pairs.
NMR Analysis of the ISS Stem Loop-To determine the three-dimensional structure of the RNA region surrounding ISS, we in vitro transcribed a 53-nt construct (SL1 ISS; corresponding to residues 7904-7955 of HIV-1 Bru), which included a non-native terminal GC base pair needed for efficient transcription. We also prepared two shorter constructs, SL1 ISS-33 (7915-7944) and SL1 ISS-25 (7920-7942), that were used to facilitate chemical shift assignments. G7909 was removed from the SL1 ISS construct because the 1H imino spectra were consistently poor under various solution conditions. The low spectral quality of constructs containing G7909 likely results from local dynamics induced by the bulge, which is consistent with its high chemoenzymatic reactivity observed during RNA structure probing (12,13).
The secondary structure of SL1 ISS was verified by 1H-1H NOESY spectra collected in water (288 K) where base-specific imino assignments were further confirmed on a [15N]GU-selectively labeled SL1 ISS construct. Fig. 4 shows the upfield imino region of the 1H-1H NOESY along with the corresponding 1H-15N HSQC. Signature NOE cross-peaks and chemical shifts are observed for the three consecutive UG wobble pairs located within the center portion of the lower stem. U7917 H3 gives a strong NOE to G7943 H1, which is consistent with a standard UG wobble base pair adjacent to the 2X2 internal loop. A weak NOE is also observed between resonances at 11.20 and 10.56 ppm; however, the corresponding 1H-15N correlation peaks are not seen due to rapid solvent exchange. Rapid solvent exchange during the mixing time also precluded assignments of the imino shifts of WC base pairs from the upper helix; however, one-dimensional 1H spectra collected on SL1 ISS-25 and SL1 ISS-33 provided clear evidence that the upper helix folds with the expected number and type of base pairs (not shown).
Chemical shift assignments of the non-exchangeable protons were made using multiple ISS constructs that varied in size and type of nucleotide selective labeling scheme. Fig. 5 shows a representative 1H-1H NOESY spectrum collected in 100% D2O (298 K and pH 6.5) on an SL1 ISS construct where all ribose 3′-5″ positions and pyrimidine position 5 were selectively deuterated. In general, sharp and well resolved NOE cross-peaks were obtained with this labeling pattern. Sequential H8/H6 (i) to H1′ (i − 1) NOE cross-peaks are readily traced for residues 7904-7917 and 7943-7955 of the lower stem. In addition, adenines involved in AU base pairs give A-form NOE cross-peaks involving H2 to i + 1 H1′ on the same strand and i + 1 H1′ on the opposite strand (Fig. 5B).
Within the 2X2 internal loop, a continuous NOE walk pattern is traced along the 5′ side from U7917 to A7920 and along the 3′ side from U7940 to G7943. A7920 H2 gives a long range NOE to U7941 H1′, confirming that a standard A7920:U7940 base pair closes the 2X2 internal loop (Fig. 5B). Additional stacking interactions involving H8/H6 protons were also detected in the aromatic region of the 1H-1H NOESY (not shown). Interestingly, G7943 H1′ is shifted upfield to ~5.0 ppm and gives a strong NOE to A7942 H2 and a medium NOE to G7944 H8. The chemical shift and NOE patterns of G7943 H1′ are consistent with it stacking below A7942 where the Hoogsteen edge of A7942 faces the center of the internal loop.
FIGURE 2. Phylogenetic analysis reveals highly conserved regions within ISS. A, results of phylogenetic analysis for HIV-1 group M subtypes and SIVcpz sequences are represented using a maximum likelihood reconstructed cladogram (left) and subtype-specific consensus logos (middle). The height of the individual nucleotides within the logos is proportional to the observed frequency of the nucleotide at the particular position within the alignment. The percentage of sequences within a given strain that exactly match the consensus logo is given to the right. The region shown corresponds to residues 7904-7955 using the Bru numbering system. B, predicted secondary structure using the most frequently occurring nucleotides of the global consensus logo. As illustrated, ISS folds into a phylogenetically conserved stem loop structure that exposes an AGUGA apical loop. Pred., predicted.
The NOE cross-peaks of the upper helix and apical loop were in general less pronounced than those observed within and below the 2X2 internal loop (Fig. 5) due to line broadening likely brought on by local millisecond dynamics. Nevertheless, NMR measurements for this region of SL1 ISS were made using full-length SL1 ISS, SL1 ISS-33, and SL1 ISS-25. First, strong H8/H6 (i) to H2′ (i − 1) NOE cross-peaks were detected for residues A7920-U7922, consistent with A-form geometry. Second, NOE cross-peaks from A7938 H2 to C7925 H1′ and G7939 H1′ reveal that U7923-U7924 stack outside the helix and that the A7938-G7939 step is A-form. Third, downfield C8 chemical shifts of G7930 and G7932 imply that these bases predominantly adopt syn conformations about the glycosidic bond. Lastly, NOE interactions of moderate intensity were observed between A7933 and A7934, confirming their stacking interaction. The collective NOE cross-peak patterns and chemical shifts are consistent with ISS folding into a stem loop structure composed of consecutive UG wobble pairs, a 2X2 internal loop, and a conformationally flexible UU bulge and 5-nt apical loop.
Description of the SL1 ISS Structure-NMR structures were determined using 659 NOE distance restraints for an average of ~12 restraints per nucleotide along with 44 C-H RDC restraints (Table 1). Fig. 6A shows the ensemble of 10 low energy structures refined using a generalized Born implicit solvent model in AMBER (43). Heavy atom superimposition of base-paired residues from the lower (7904-7917 and 7943-7955) and upper (7920-7928 and 7934-7940) helices provides an r.m.s.d. value of 0.75 Å, indicating that the structures are well defined by the NMR data. Alignments of residues from just the lower or upper helices give r.m.s.d. values of 0.28 Å. In addition, comparison of measured and back-calculated RDC values for the lowest energy structure shows an excellent correlation (Fig. 6B).
The SL1 ISS structure reveals that residues 7904-7917 and 7943-7955 of the lower stem form a perfectly base-paired helix. Indeed, structural parameters determined using 3DNA (44) are consistent with A-form geometry (Table 2). The central portion of the lower helix consists of three consecutive UG wobble pairs where U7911-U7913 stack along the 5′ side and G7947-G7949 stack along the 3′ side. Stacking of three consecutive UG pairs leads to a slightly wider major groove (~15.0 Å compared with 12.4 Å for WC bps) where a unique pattern of exocyclic carbonyl functional groups align to create an electronegative surface (Fig. 6C). Such arrangements of consecutive UG wobble pairs in RNA helices are known to modulate thermodynamic stability and provide sites of interactions for divalent cations, small molecules, and proteins (45).
... interactions are indicated for every other nucleotide. Numbers are colored according to secondary structural elements shown in A. 7,900 should be added to the assignment labels to obtain the native HIV-1 Bru numbering. Note that G* corresponds to the non-native guanosine added to increase transcription yields. Lines denote the sequential walk pathway, and the arrow indicates that the G7943 H1′ chemical shift is upfield (~5 ppm) relative to all other H1′ chemical shifts.
As expected from the NOE cross-peaks, the 2X2 internal loop adopts a stable architecture that is closed by a UG wobble and AU base pair (Fig. 7D). The Watson-Crick edge of G7918 faces the interior of the loop and is positioned across from the Hoogsteen edge of A7942. Structures refined using loose hydrogen bond restraints consistent with trans Hoogsteen/sugar edge sheared AG and cis Watson-Crick/Watson-Crick UU geometries produced local stacking interactions that are consistent with the observed NOE patterns. Moreover, molecular dynamic simulations conducted without hydrogen bonds reveal that the 2X2 internal loop is stable (not shown). Incorporation of the 2X2 internal loop slightly overwinds the helix as determined by the average C1′-C1′ distance of 9.1 Å across the AG and UU pairs compared with ~10.5 Å for Watson-Crick pairs. The folded 2X2 internal loop determined here agrees with the low chemoenzymatic reactivity observed for these bases during RNA secondary structure probing of the isolated ssA7 locus (12,13).
The upper helix consists of seven Watson-Crick base pairs where U7923 and U7924 bulge out to allow partial stacking of U7922:A7938 and C7925:G7937 base pairs. Consistent with the sparse NOE interactions, the 5-nt apical loop samples multiple conformations, although some residual structural features exist.
Small Angle X-ray Scattering Provides Complementary Insight into the Global Shape of SL1 ISS - We further assessed the global shape of SL1 ISS by SEC-SAXS. Fig. 7A shows the Kratky profile (I(q) × q² versus q where q is the scattering angle and I(q) is the scattering intensity) of the scattering data collected on SL1 ISS (~7 mg/ml) in 10 mM MES, 10 mM KCl (pH 6.5). The inverted parabolic shape of the Kratky plot reveals that SL1 ISS folds into a stable structure, consistent with our NMR results. The Rg calculated from the linear region of the Guinier plot is 24.9 Å, which coincides with the approximate width of an A-form RNA helix. The shape of the pair distance distribution function P(r) indicates that SL1 ISS adopts a kinked cylindrical structure with a maximum dimension of ~90 Å (Fig. 7B).
Twenty ab initio molecular reconstructions of SL1 ISS were used to determine the final DAMMIN-refined model. The theoretical molecular mass determined from the excluded volume is 18.9 kDa, which agrees favorably with the expected molecular mass (17.2 kDa) of SL1 ISS. The overall size of the SAXS reconstruction easily accommodates the NMR-determined structure (Fig. 7C); however, back-calculation of the scattering curve from the NMR structure gave a χ² value of 4.7. The slightly higher χ² value suggests that the global NMR structure of SL1 ISS differs slightly from the SAXS data. To test whether incorporation of SAXS-derived shape information can improve agreement between experimental and back-calculated scattering data, we developed a simulation schedule in AMBER wherein the structure was jointly refined against NOE distance restraints, RDCs, and the SAXS molecular reconstruction. Restraint information and energy terms for the SAXS reconstruction were implemented using the EMAP force field in AMBER (39). Fig. 7C shows the comparison between the starting low energy NMR structure and the structure refined against the SAXS reconstruction. The SAXS-refined structure reduced the χ² value to 3.4 and led to a slight reorientation of the upper helix relative to the starting NMR structure. The upper helix reorients about the 2-nt bulge loop, which acts as a flexible hinge to allow interhelical motions. The higher χ² value of the NMR-only structure is likely a result of poor conformational sampling of the upper helix when only NOE restraints and sparse RDCs are used. Comparison of the back-calculated RDC values with and without the EMAP force field parameters reveals excellent agreement (Fig. 7D). Thus, inclusion of SAXS molecular reconstructions during RNA structure refinement represents a new approach to determine global conformational properties of large RNAs. Efforts are underway to further benchmark and validate this approach.
Calorimetric and NMR Titrations Reveal That the UP1 Domain of hnRNP A1 Binds the ISS Apical Loop Specifically and with High Affinity-Biochemical footprinting studies showed that hnRNP A1 interacts with ISS where discrete protections were observed primarily within the apical loop and along the 3Ј side of the upper helix, including residues of the 2X2 internal loop (12,13). The protection pattern suggests that either multiple hnRNP A1 molecules load onto the ISS stem loop, binding of hnRNP A1 induces changes in the RNA structural dynamics, or the C-terminal domain makes nonspecific contacts with the RNA. To gain insights into the mode of specific interactions, calorimetric titrations were performed using the UP1 domain of hnRNP A1 with SL1 ISS-33 and SL1 ISS-25 . Titrations using fulllength hnRNP A1 were not feasible due to protein aggregation under our solution conditions. SL1 ISS-33 contains the bulge, apical, and internal loops, whereas the internal loop is removed in SL1 ISS-25 . Comparisons of thermodynamic parameters between the two constructs offer insights into the influence of RNA structural elements toward binding free energies. Fig. 8 shows titration curves and thermodynamic parameters of UP1 with SL1 ISS-33 and SL1 ISS-25 . UP1 binds both stem loops with nM affinities (K d(SL1ISS-33) ϭ 100.3 Ϯ 19.0 nM and K d(SL1ISS-25) ϭ 106.3 Ϯ 16.2 nM) wherein the thermodynamic driving force is a large favorable change in total binding enthalpy (Table 3). Given the similarities in the thermodynamic profiles, the calo- rimetric data suggest that UP1 interacts with SL1 ISS-33 and SL1 ISS-25 through a common surface.
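The similarity of the two dissociation constants can be made concrete by converting them to standard binding free energies with ΔG° = RT ln Kd. The sketch below does this at 25°C, the temperature of the calorimetric titrations, assuming a 1 M standard state. It is an illustrative conversion only and is not a substitute for the fitted values reported in Table 3; the function name is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal mol^-1 K^-1


def binding_free_energy(kd_molar, temp_k=298.15):
    """Standard binding free energy (kcal/mol) from a dissociation constant.

    Uses dG = RT ln(Kd) with Kd expressed in molar units (1 M standard state).
    """
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temp_k * math.log(kd_molar)


# Kd values quoted in the text, converted from nM to M.
dg_iss33 = binding_free_energy(100.3e-9)
dg_iss25 = binding_free_energy(106.3e-9)
ddg = dg_iss25 - dg_iss33  # difference between the two constructs
```

Both values come out near −9.5 kcal/mol, and the difference between them is a few hundredths of a kcal/mol, far smaller than the uncertainty implied by the reported errors on Kd. This is consistent with the conclusion that removing the internal loop does not measurably change the binding free energy.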
To gain residue-specific insights into the UP1 binding surface on SL1 ISS, ¹H-¹³C TROSY HSQC titrations (310 K and 900 MHz) were carried out using an [¹³C]A-selectively labeled SL1 ISS-25 construct. Fig. 9 shows spectral plots of the C8-H8 region following stepwise addition of unlabeled UP1. At a 0.5:1.0 molar ratio of UP1 to SL1 ISS-25, correlation peaks for free and bound A7933 are observed. As the molar ratio increases to 1:1, only the bound form of A7933 is observed. A7934 also exhibits similar spectral properties, although the chemical shifts of the free and bound forms are less resolved. Minor perturbations are also seen at positions 7927 and 7938. A7920, A7929, and A7936, which are all involved in stable WC base pairs, do not experience significant chemical shift perturbations. Collectively, the NMR titrations reveal that UP1 binds site-specifically to SL1 ISS-25 and in a manner that preserves its overall secondary structure.
We further assessed the stoichiometry and homogeneity of the UP1-SL1 ISS-33 complex by performing stepwise titrations using analytical size exclusion chromatography (not shown). At a 0.5:1 (UP1:SL1 ISS-33) molar ratio, two well resolved and symmetrical peaks are observed corresponding to free SL1 ISS-33 and the UP1-SL1 ISS-33 complex. Increasing the molar ratio to 1:1 leads to a stoichiometric shift of the free RNA to its bound form. At excess UP1, a single well resolved and symmetrical peak corresponding to the complex is observed. Taken together, the calorimetric, NMR, and SEC data reveal that UP1 binds the ISS stem loop element as a 1:1 complex specifically and with high affinity.
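The titration behavior described above (about half of the RNA bound at a 0.5:1 ratio and essentially complete binding at or above 1:1) is what a simple 1:1 binding model predicts whenever the RNA concentration is far above K d. The sketch below illustrates this with the exact quadratic solution for a single-site equilibrium. The RNA concentration of 100 µM is a hypothetical value chosen only for illustration (sample concentrations are not given in this section), and the function and variable names are likewise hypothetical.

```python
import math


def fraction_rna_bound(rna_total, protein_total, kd):
    """Fraction of RNA in complex for a 1:1 equilibrium (same units throughout).

    Solves [C]^2 - (R + P + Kd)[C] + R*P = 0 for the complex concentration [C]
    and returns [C] / R, taking the physically meaningful (smaller) root.
    """
    if rna_total <= 0 or protein_total < 0 or kd <= 0:
        raise ValueError("concentrations and Kd must be positive")
    b = rna_total + protein_total + kd
    complex_conc = (b - math.sqrt(b * b - 4.0 * rna_total * protein_total)) / 2.0
    return complex_conc / rna_total


RNA_UM = 100.0   # ASSUMED total RNA concentration in micromolar (hypothetical)
KD_UM = 0.1003   # reported Kd for SL1 ISS-33 (100.3 nM), in micromolar

for ratio in (0.5, 1.0, 2.0):
    bound = fraction_rna_bound(RNA_UM, ratio * RNA_UM, KD_UM)
    print(f"UP1:RNA = {ratio:.1f}:1 -> fraction of RNA bound = {bound:.3f}")
```

Under these assumptions, the model gives close to 50% bound RNA at a 0.5:1 ratio and more than 95% at 1:1, consistent with the stoichiometric shift seen in the NMR and size exclusion titrations.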

Discussion
Balanced expression of the HIV-1 genome is controlled in part through alternative splicing of the 9-kb polycistronic transcript. The pool of spliced mRNAs consists of many different isoforms that vary with the viral life cycle. The virus has developed strategies to maintain transcript homeostasis by using cis RNA sequence elements that recruit host RNA-binding proteins to modulate the activities of donor and acceptor sites. The series of events that control HIV-1 splicing is not well understood; however, there is growing evidence that RNA structure modulates HIV-1 splicing (46,47). Therefore, understanding the mechanisms that determine how frequently a splice site is used requires accurate knowledge of the RNA interactions surrounding it.
FIGURE 7. Small angle x-ray scattering of SL1 ISS. A, Kratky profile of SL1 ISS demonstrates that the RNA adopts a stable fold in solution. B, pair distance distribution function of SL1 ISS. C, refinement of SL1 ISS NMR structure against the SAXS reconstruction using AMBER. The SAXS molecular envelope is shown as a transparent surface with the lowest energy SL1 ISS NMR structure (red) fit to the density. The NMR structure was jointly refined against NOE distance restraints, RDCs, and the molecular reconstruction (blue) using the EMAP force field term in AMBER. The two structures superimpose over all base pairs from the lower helix with an r.m.s.d. of 1.3 Å. D, correlation plot between back-calculated RDC values with (x axis) and without (y axis) inclusion of the molecular reconstruction during refinement.
Early chemoenzymatic probing of the RNA region near splice site A7 revealed that the ISS element folds into a stem loop structure consistent with our phylogenetic model (12,13); however, this feature was not observed in the original SHAPE map of the entire HIV-1 genome (48).
In a more recent SHAPE model restrained with nucleotide covariation, the lower SL1 ISS helix, including the consecutive UG wobble pairs, is detected (17), but differences in base pair arrangements of the upper helix lead to a 6-nt UUCUAU apical loop instead of a 5-nt AGUGA loop (Fig. 10A). In that model, the AGUGA sequence is restrained to base pair with a complementary UCAC sequence to form a 4-bp upper helix. Because human hnRNP A1 binds AGUGA to suppress splicing at site A7 (12,13), its location (base-paired or exposed) is key to understanding the mechanism of A7 activity.
In the SHAPE model, A7938 is far away from C7925, yet we observe a medium intensity NOE between A7938 H2 and C7925 H1′ (Fig. 10B). The presence of this NOE indicates that C7925 stacks near A7938. The H2 of A7938 also gives a strong NOE to G7939 H1′. These collective NOE patterns indicate that this region of the upper helix adopts local A-form geometry. Interestingly, A7938 and C7925 are unreactive toward the SHAPE reagent, which is more consistent with both being involved in stable secondary structure as opposed to unstructured loops. Indeed, satisfying the A7938 H2 to C7925 H1′ NOE would lead to a local refolding of the SHAPE model to agree with the phylogenetic consensus structure. In doing so, ~87% (13 of 15) of the residues that show low to medium SHAPE reactivity are accommodated in either WC or non-WC base pairs (Fig. 10C).
The low reactivity of AGUGA is not explained by our NMR data, however. The three-dimensional structure of SL1 ISS revealed that the apical loop is flexible, although some residual structure exists, as indicated by the NOE interactions. The residual structure might limit reactivity with the SHAPE reagent but not to the point of rendering the loop unreactive. A possible explanation is that the AGUGA loop is involved in RNA-RNA tertiary interactions within the context of genomes extracted from virions (48). Our calorimetric titrations with SL1 ISS-33 and SL1 ISS-25 revealed that both stem loops interact with UP1 as a 1:1 complex and with nearly identical affinities (Table 3). The smaller SL1 ISS-25 construct lacks residues involved in the 2X2 internal loop, and thus it is unlikely to fold with an exposed 6-nt UUCUAU apical loop as seen in the SHAPE model. To that point, analogous NOE cross-peak patterns involving A7938 H2 are observed in spectra collected on SL1 ISS-25. Therefore, our results are consistent with earlier footprinting data that showed that the primary hnRNP A1 binding site is the 5-nt AGUGA apical loop.
In closing, there are now four different secondary structural models of the HIV-1 genome (17, 48-50). The original SHAPE-derived model has been reinterpreted once by incorporating mutational profiling (49) and more recently by including phylogenetic restraints (17). During the preparation of this manuscript, Kjems and co-workers (50) published yet another model of HIV-1, concluding that only 31% of the base pairs they identified are common to the original SHAPE model. The multiplicity of secondary structural models of the HIV-1 genome presents real challenges for structural biochemists who seek to better interpret the molecular functions of as yet uncharacterized HIV-1 RNA elements. Perhaps when global chemical reactivity profiles and phylogenetic covariation disagree, structural studies on isolated genomic regions can provide details of the "foldability" of RNA structure. These structures can then guide targeted mutations to test the biological implications of disrupting or deleting elements of secondary structure. Toward that end, we solved the three-dimensional structure of a conserved genomic region that surrounds the intronic splicing silencer acting on acceptor site A7. Our structure reveals that this region folds into a stable stem loop that exposes an apical AGUGA loop. Using calorimetric and NMR titrations, we further revealed that the UP1 domain of hnRNP A1 binds site-specifically through the apical loop while the body of the stem loop structure is preserved. Thus, this work reveals that HIV-1 uses a phylogenetically conserved RNA structure to recruit the host hnRNP A1 protein upstream of acceptor site A7. When combined with our previous studies on ESS3 (14,16), the mechanism that begins to emerge is one wherein the RNA structure functions as a scaffold to direct the assembly of a functional protein-RNA complex that occludes site A7. This work puts us one step closer to determining the architecture of the intact complex.
FIGURE 10. A, … (17). Right, alternative model of SL1 ISS determined by phylogenetic comparisons and chemoenzymatic probing of the isolated ssA7 locus (12,13). B, NOE evidence that the upper helix of SL1 ISS folds with a structure consistent with the phylogenetic model. C, mapping of the SHAPE reactivity data onto the three-dimensional structure of SL1 ISS shows that ~87% of the low and medium reactive nucleotides can be accommodated in regions of stable structure.