Representative cancer-associated U2AF2 mutations alter RNA interactions and splicing

High-throughput sequencing of hematologic malignancies and other cancers has revealed recurrent mis-sense mutations of genes encoding pre-mRNA splicing factors. The essential splicing factor U2AF2 recognizes a polypyrimidine-tract splice-site signal and initiates spliceosome assembly. Here, we investigate representative, acquired U2AF2 mutations, namely N196K or G301D amino acid substitutions associated with leukemia or solid tumors, respectively. We determined crystal structures of the wild-type (WT) compared with N196K- or G301D-substituted U2AF2 proteins, each bound to a prototypical AdML polypyrimidine tract, at 1.5, 1.4, or 1.7 Å resolutions. The N196K residue appears to stabilize the open conformation of U2AF2 with an inter-RNA recognition motif hydrogen bond, in agreement with an increased apparent RNA-binding affinity of the N196K-substituted protein. The G301D residue remains in a similar position as the WT residue, where unfavorable proximity to the RNA phosphodiester could explain the decreased RNA-binding affinity of the G301D-substituted protein. We found that expression of the G301D-substituted U2AF2 protein reduces splicing of a minigene transcript carrying prototypical splice sites. We further show that expression of either N196K- or G301D-substituted U2AF2 can subtly alter splicing of representative

Large-scale sequencing projects, together with an emerging plethora of protein structures, have revealed statistically significant clustering of disease-associated mutations at protein-ligand interfaces (1)(2)(3)(4). This revelation explains how mis-sense mutations of the same gene can cause different diseases by affecting distinct functional interfaces of an encoded protein product. Conversely, trans-acting mutations that modify mutually interacting surfaces of a multisubunit complex often produce similar disease symptoms. For example, mis-sense mutations in different domains of the WAS (Wiskott-Aldrich syndrome) protein cause clinically distinct disorders such as Wiskott-Aldrich syndrome or X-linked neutropenia (1). On the other hand, mutations clustered at the mutual interfaces of either complement factor H or component C3 proteins result in hemolytic uremic syndrome (1). Indeed, understanding the 3D aspects of the gene-to-disease process is of great interest to pharmaceutical and medical industries seeking to identify druggable targets for precision medicine. For example, a Phe 508 deletion remodels the interactome of the cystic fibrosis transmembrane conductance regulator (ΔF508). Reduced levels of specific ΔF508 cystic fibrosis transmembrane conductance regulator interactors can partially restore channel function (5). As a second example, oncogenic mutations of the p53 tumor suppressor protein can stimulate interactions with the transcription factor Nrf2, which in turn increases expression of proteasome genes and confers proteasome inhibitor resistance to cancer cells (6). The premise that disease-relevant mutations tend to cluster at 3D protein interfaces has been incorporated in several recent computational approaches for distinguishing neutral passenger mutations from candidate drivers of human disease (7)(8)(9).
Among hematologic malignancies and certain types of cancers, acquired mis-sense mutations frequently affect pre-mRNA splicing factors involved in the early stages of 3´ splice-site selection (10). Most often, mutational hot spots affect clusters of residues at the protein-RNA interfaces of SF3B1, SRSF2, and U2AF1 (11). The cancer-associated mutations of SF3B1 further are believed to modify the toroidal structure of the protein and alter its recruitment of RNA helicases (12,13). The recurrent mutations of SF3B1, SRSF2, and U2AF1 in turn dysregulate the functions of the encoded proteins for gene expression and are thought to be drivers of cancer progression. Lower-frequency mutations of other splicing factors also may represent potential drug targets that alter proto-oncogenic functional events. Precedents for clinical consequences from such "long-tail" mutations in the frequency distributions of somatically mutated genes already have been established outside the field of pre-mRNA splicing, as exemplified for the paralogous mutations of Ras superfamily members (14).
We previously documented cancer-associated mutations for U2AF2, the heterodimeric partner of the U2AF1 pre-mRNA splicing factor (15). In most cases, U2AF2 mutations affect residues that are located in discrete domains of the protein, most prominently the two central RNA recognition motifs (RRM1 and RRM2), as well as an N-terminal region for heterodimerization with U2AF1 and a C-terminal protein-interaction motif (Fig. 1). The U2AF2 RRMs recognize a polypyrimidine (Py) tract signal preceding the major class of 3´ splice sites (16,17). Structure determinations by NMR and X-ray crystallography demonstrate that the two RRMs recognize a continuous nine-uridine Py tract in an open, side-by-side configuration (18)(19)(20)(21). In the absence of RNA, the inter-RRM conformation is dynamic and can adopt a range of RRM1/RRM2 proximities (20,22). A closed U2AF2 conformation, in which the RNA-binding surface of RRM1 is masked by RRM2 (18), is stabilized in the heterodimer with the U2AF1 subunit (23). Notably, many of the cancer-associated U2AF2 mutations are predicted to cluster near the RNA or RRM1/RRM2 interface of the respective open and closed conformations (15). This observation suggests that cancer-associated mutations of the U2AF2 RRMs could modulate binding to the Py tract and dysregulate pre-mRNA splicing or other U2AF2-RNA-dependent processes. However, the structural and functional consequences of such long-tail U2AF2 mutations have yet to be investigated empirically.
Here, we use X-ray crystallography, RNA-binding assays, and pre-mRNA splicing assays of minigene reporters and endogenous transcripts to investigate the consequences of two representative cancer-associated mutations of U2AF2. We focused on an N196K substitution of the N-terminal RRM1 that recurs among patients with acute myeloid leukemia (AML) and a distinct, G301D substitution of the C-terminal U2AF2 RRM2 observed in cases of colon adenocarcinoma and castration-resistant prostate carcinoma. We compared the respective 1.4 and 1.7 Å resolution structures of the N196K- and G301D-substituted proteins bound to a prototypical Py tract with a baseline, 1.5 Å resolution structure of the wild-type (WT) complex. The U2AF2 loop containing the N196K mutation shifts to form an inter-RRM hydrogen bond that appears to stabilize the open conformation. The G301D conformation remains unchanged but may disfavor binding to the nearby phosphodiester backbone of the RNA. Accordingly, the N196K mutation increases the apparent RNA-binding affinity of U2AF2, whereas the G301D mutation has the converse effect. The mutations differently affect splicing of representative transcripts, with the G301D mutation showing effects similar to siRNA-mediated reductions in U2AF2 levels. These different structural and functional consequences affirm that the N196K and G301D mutations of U2AF2 have the potential to drive dysregulated gene expression in leukemias and cancers and, in addition, resolve how mutations in the same U2AF2 gene can result in clinically distinct disorders.

Results
Structure of human U2AF2 12L bound to the AdML Py tract
As a baseline to distinguish the structural influence of the N196K and G301D mutations on a bona fide splice-site complex, we first determined the crystal structure of the U2AF2 RRM-containing region (U2AF2 12L; Fig. 1B) bound to a modified, Py tract oligonucleotide (5´-UUUU(dU)U(5BrdU)CC-3´) (Table 1 and Fig. 2A). The sequence of the co-crystallized oligonucleotide matched the prototypical adenovirus major late promoter transcript (AdML), which also is identical to the preferred U2AF2 binding site as determined by in vitro selection (16). The two terminal cytidines differed from the primarily uridine sequences of our prior U2AF2 12L structures (20). As for prior structures, including a deoxyuridine (dU) and 5-bromo-dU in the oligonucleotide marked the sequence register and facilitated high-quality crystals. The U2AF2 12L-AdML Py tract structure was determined at 1.5 Å resolution by molecular replacement. Based on similar crystallization conditions (succinate, pH 7.0), crystal packing, and resolution limits, the prior structure of U2AF2 12L bound to an all-uridine oligonucleotide (PDB code 5EV3) was an appropriate starting model for refinement and comparison of structural changes to bind the terminal cytidines. Other high-resolution U2AF2 12L structures bound to uridines (PDB codes 5EV1 and 5EV2) were available; however, the nonphysiological low pH of the crystallization conditions (pH 4.0) altered relevant terminal nucleotide interactions.
The overall structure of the U2AF2 12L-AdML complex is nearly identical to the uridine counterpart (RMSD 0.1 Å for 185 matching Cα atoms of PDB code 5EV3), apart from a large, ~3 Å shift in the positions of the terminal cytidines (Fig. 2B). These tandem cytidines are well-ordered in an unbiased, feature-enhanced electron density map (24) (Fig. 2C), where they are engaged by a combination of direct and water-mediated hydrogen bonds with U2AF2 Arg 146, Arg 150, and Asp 231 (Fig. 2E). The well-defined contacts with the Asp 231 side chain are consistent with the ability of a D231V-variant U2AF2 to alter specificity for the terminal nucleotides of the Py tract (25). Despite comparable resolutions and similar crystallization conditions, the terminal uridines of the prior structure are poorly defined in the electron density (Fig. 2D). A Gln 147 side chain that mediates hydrogen bonds with the uracil base edges (Fig. 2F) has been displaced by arginine side chains in the current cytidine-bound structure. Although a prior deoxyribose substitution may contribute structural differences at the terminal nucleotides, the base-specific interactions and absence of 2´-hydroxyl contacts support the conclusion that U2AF2 preferentially secures the tandem cytidines of the AdML Py tract compared with the uridine counterpart.
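The RMSD values quoted for matching Cα atoms can be illustrated with a minimal sketch. It assumes the two coordinate sets are already superposed and paired atom by atom; the superposition step itself is not shown, and the toy coordinates below are hypothetical rather than taken from any PDB entry.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two paired, pre-superposed
    lists of (x, y, z) positions, in the units of the input (e.g. angstroms)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Hypothetical toy coordinates: three atoms, each displaced by 0.1 along y.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
shifted = [(0.0, 0.1, 0.0), (1.5, 0.1, 0.0), (3.0, 0.1, 0.0)]
print(round(rmsd(reference, shifted), 3))
```

Because every atom contributes equally, a localized shift such as the ~3 Å movement of two terminal nucleotides has little influence on an RMSD computed over roughly 185 protein Cα atoms.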

Structure of N196K U2AF2 12L bound to the AdML Py tract
To view the structural changes caused by a representative cancer-relevant substitution of the U2AF2 RRM1, we determined the crystal structure of an N196K-substituted U2AF2 12L bound to the AdML Py tract at 1.7 Å resolution (Table 1). This N196K substitution is among the most common U2AF2 mutations, resulting from A → T transversions in four AML patients (COSMIC code COSU544). The precipitant (malonate, pH 7.0) and crystal packing environment are similar to the WT AdML counterpart described above, such that structural changes can be attributed to the amino acid substitution. The Asn 196 residue is located in a loop of the N-terminal RRM1 near the bound oligonucleotide and at the RRM1/RRM2 interface (Fig. 2A). Apart from local movement of this loop region, the overall structure of the N196K-substituted protein remains similar to the WT complex (RMSD 0.6 Å between 197 matching Cα atoms). In the WT complex, the Asn 196 side chain is poorly ordered and modeled as two alternative conformations, one of which mediates a hydrogen bond with the uracil-O2 atom (Fig. 3, A and C). By contrast, the mutant lysine side chain is well-defined in the electron density (Fig. 3, B and D). While remaining well-defined, the Ser 294 residue shows evidence of two alternative conformations in the N196K-substituted structure. Rather than directly interacting with the nucleotide, the position of the Lys 196-containing loop has shifted to achieve a hydrogen bond with one conformation of the Ser 294 backbone carbonyl in the opposite RRM. This interaction could potentially stabilize the open U2AF2 conformation for association with uridine-rich Py tracts.

Structure of G301D U2AF2 12L bound to the AdML Py tract
We next investigated the structural changes caused by a representative cancer-relevant substitution of the U2AF2 RRM2 by determining the crystal structure of a G301D-substituted U2AF2 12L bound to the AdML Py tract at 1.4 Å resolution (Table 1). This G301D substitution results from a G → A transition identified in colon adenocarcinoma and castration-resistant prostate carcinoma patients (26,27). A related G301S substitution also occurs in papillary renal cell carcinoma (International Cancer Genome Consortium code DO48476). The crystallization conditions and packing environment of the G301D structure remained similar to the N196K and unmodified counterparts. The Gly 301 residue is located preceding the third β-strand of RRM2 and adjacent to the first uridine of the bound oligonucleotide (Fig. 2A). There, the Gly 301 residue appears to serve a structural role and lacks direct contacts with the bound oligonucleotide (Fig. 4, A and C). Overall, the protein backbone remains unchanged by the G301D substitution (RMSD 0.2 Å between 188 matching Cα atoms). The mutant Asp 301 side chain is anchored by hydrogen bonds to the Lys 328 and Asn 268 side chains of neighboring RRM2 loops, from which it displaces an ordered water molecule of the Gly 301 structure (Fig. 4, B and D). The partial negative charge of the acidic Asp 301 side chain is located within van der Waals packing distance of the electronegative terminal phosphate of the oligonucleotide (3.7 Å oxygen-oxygen). This close proximity is expected to disfavor U2AF2-RNA association.

Table 1 footnotes. $I_i$ is the intensity $I$ for the $i$th measurement of a reflection with indices $hkl$, and $\langle I \rangle$ is the weighted mean of all measurements of $I$; $n$ is the number of observations of the intensity $I_i$. d, CC 1/2, correlation coefficient between intensities of random half data sets (42). e, $R_{work} = \sum_{hkl} \big| |F_{obs}(hkl)| - |F_{calc}(hkl)| \big| / \sum_{hkl} |F_{obs}(hkl)|$ for the working set of reflections. R free is R work for ~7% of the reflections excluded from the refinement. All data were used in the refinement. f, Calculated using the program MolProbity (43).

N196K and G301D substitutions alter U2AF2 12L affinity for the AdML Py tract
Based on the structures of N196K and G301D U2AF2 12L, we predicted that these amino acid substitutions would influence the RNA-binding affinities of the mutated proteins. To test this prediction, we measured the fluorescence anisotropy changes during titration of the WT and mutant proteins into a consensus 3´ splice site labeled with 5´-fluorescein and fit the apparent binding affinities (Fig. 5). The N196K substitution increased the apparent binding affinity of U2AF2 12L for this splice-site RNA by approximately 4-fold. Conversely, the G301D substitution decreased the RNA-binding affinity of U2AF2 12L by nearly 12-fold. The magnitudes of the mutation-induced effects on U2AF2 12L-RNA association (~1-1.5 kcal mol⁻¹) agree with the positive charge and inter-RRM contacts of the Lys 196 residue, as well as with the Asp 301 negative charge introduced near the phosphodiester backbone.

Figure 4. U2AF2 12L G301D interactions with bound oligonucleotide. A, a feature-enhanced electron density map contoured at 1 σ shows the WT Gly 301 side chain interacting with an alternative conformation (a/b) of Asn 268 and two ordered water molecules. B, the Asp 301 variant displaces one water molecule and instead forms direct hydrogen bonds with a single Asn 268 conformation and the Lys 328 side chain. The terminal phosphates have been omitted for clarity. C and D, view of U2AF2 12L interactions at these sites. The electron density map is colored to indicate the following: oligonucleotide, purple; Gly 301 or Asp 301 residues, cyan; other residues, marine blue; waters, red spheres.
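The free-energy magnitudes quoted for the affinity changes can be checked with the standard relation ΔΔG = RT ln(fold change). The sketch below is illustrative only: it assumes the 23°C assay temperature given in the binding-assay methods and uses the rounded 4-fold and 12-fold values from the text rather than the fitted constants. The function name is hypothetical.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def ddg_from_fold_change(fold_change, temp_celsius=23.0):
    """Magnitude of the binding free-energy difference (kcal/mol)
    corresponding to a fold-change in equilibrium binding affinity."""
    if fold_change <= 0:
        raise ValueError("fold_change must be positive")
    temp_kelvin = temp_celsius + 273.15
    return R_KCAL * temp_kelvin * math.log(fold_change)


# Rounded fold-changes quoted in the text (assumed inputs, not fitted values).
for label, fold in (("N196K, ~4-fold gain", 4.0), ("G301D, ~12-fold loss", 12.0)):
    print(f"{label}: {ddg_from_fold_change(fold):.2f} kcal/mol")
```

With these inputs the two values come out near 0.8 and 1.5 kcal/mol, which is roughly the range stated in the text.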
Cancer-associated mutations alter U2AF2 structure-function

N196K and G301D substitutions alter splicing of a U2AF2-responsive minigene model
We hypothesized that the altered RNA binding caused by the N196K and G301D substitutions would in turn affect splice-site selection. We first tested this hypothesis by RT-PCR and quantitative real-time RT-PCR of a well-characterized pyPY minigene (Fig. 6 and Fig. S1A), which comprises IgM M1 and partial M2 exons fused to an intron and exon from AdML (28) (Fig. 6A). These alternative 3´ splice sites are marked by uridine-poor (py) or uridine-rich (PY) Py tracts. Because the PY sequence of the minigene corresponds to the AdML prototype used for our U2AF2 12L co-crystal structures, this reporter is well-suited for evaluating U2AF2 structure-function relationships. In our cell line (HEK 293T) and culture conditions ("Experimental procedures"), the pyPY transcript was primarily unspliced (Fig. 6B). A small amount of splicing was detected at the strong, consensus PY site. A minor product of a cryptic splice site corresponds in size to a proximal AG, which lacks a distinguishable Py tract and is not expected to respond to U2AF2 levels. Transfection of a plasmid expressing WT U2AF2 increased splicing, particularly at the weak py site as noted previously (20). Expression of the N196K-substituted U2AF2 also increased use of the py splice site, consistent with a gain in RNA-binding affinity. Conversely, expression of the G301D-substituted U2AF2 reduced py splicing nearly to background levels. These results agree with the differences in structure and RNA-binding affinities of the two mutant proteins, although downstream consequences for other interactions (e.g. regulation of U2AF1) may play roles in the altered splicing.

Figure 6. The sequences of the intron preceding each 3´ splice site are shown above for py and below for PY. B, representative RT-PCR of pyPY transcripts from HEK 293T cells stably expressing the pyPY minigene and transfected with either WT or the indicated mutant U2AF2. A cryptic splice site resulting in an ~330-bp band represents an "AG" consensus closest to the 5´ splice-site donor. This site lacks a detectable Py tract and remains unchanged by U2AF2 expression, unlike the U2AF2-sensitive py and PY 3´ splice sites. STD, molecular size standards. C and D, quantitative real-time PCR analysis of the relative expression levels of the py (C) and PY (D) isoforms. Two-tailed unpaired t tests with Welch's correction of the average values from three experiments were calculated for the mutants compared with WT in GraphPad Prism: n.s., not significant (p > 0.05); *, p < 0.05; **, p < 0.005. Immunoblots of the transfected samples are shown in Fig. S1A.
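The unpaired t test with Welch's correction named in the figure legend can be sketched as follows. This is a minimal illustration of the statistic and its Welch-Satterthwaite degrees of freedom, not a reproduction of the GraphPad Prism analysis; the replicate values are hypothetical, and the two-tailed p value (which requires the t distribution) is not computed here.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)
    for two independent samples with possibly unequal variances."""
    na, nb = len(sample_a), len(sample_b)
    if na < 2 or nb < 2:
        raise ValueError("each sample needs at least two values")
    # Squared standard errors of the two sample means.
    se2_a = variance(sample_a) / na
    se2_b = variance(sample_b) / nb
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t_stat = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    dof = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (na - 1) + se2_b ** 2 / (nb - 1))
    return t_stat, dof


# Hypothetical relative-expression values from three experiments each.
wt_values = [1.0, 1.1, 0.9]
mutant_values = [0.4, 0.5, 0.3]
t_stat, dof = welch_t(wt_values, mutant_values)
print(f"t = {t_stat:.2f}, degrees of freedom = {dof:.1f}")
```

With only three replicates per group, the effective degrees of freedom range from 2 to 4, so small differences in spread between groups noticeably change the resulting p value.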

N196K and G301D substitutions of U2AF2 alter splicing of representative endogenous pre-mRNAs
We next tested the influence of expressing N196K- and G301D-substituted U2AF2 on splicing of representative endogenous pre-mRNAs in a human cell line (HEK 293T) (Fig. 7). We focused on three transcripts known to exhibit U2AF2-responsive exon-skipping (4): GSK3B, THYN1, and SAT1. To mimic the heterozygous context of acquired mutations in cancers, we overexpressed either WT, N196K, or G301D variants in the presence of endogenous U2AF2. We compared the effects of siRNA-mediated reductions in U2AF2 levels (Fig. S2). As expected (4), loss of U2AF2 increased inclusion of the GSK3B cassette exon and decreased inclusion of THYN1 and SAT1 cassette exons. The N196K and G301D substitutions led to subtle but reproducible changes in splicing (Fig. 7). Overexpression of the G301D-mutant U2AF2 had a lesser but similar effect to U2AF2 knockdown, supporting that this mutant can stall the splicing process. The N196K variant slightly enhanced or had similar effects to WT U2AF2 on splicing of the GSK3B, THYN1, and SAT1 sites, consistent with its RNA-binding properties. Together with the alterations in pyPY splicing, these differences demonstrate that the N196K- and G301D-mutant U2AF2 can influence splicing of endogenous gene transcripts even in the presence of the normal U2AF2 counterpart.

Discussion
A number of the cancer-associated mutations of U2AF2 affect residues at the RNA interface of the U2AF2 RRMs (15). In the present study, we demonstrate structural and functional consequences for two different representatives of this mutational class, including a leukemia-associated N196K mutation and solid tumor-associated G301D mutation. The positively charged N196K mutation of RRM1 favors U2AF2 binding to the negatively charged RNA. The mutant lysine also promotes a local conformational change and mediates a hydrogen bond bridge to the neighboring RRM2, which appears to promote the open U2AF2 conformation for RNA binding. Accordingly, the N196K mutation increases the RNA-binding affinity of U2AF2. The G301D mutation, on the other hand, introduces a negative charge abutting the 5´-terminal phosphate of the bound oligonucleotide in the crystal structure. Although the G301D protein and RNA conformations remain similar to the WT structure, the aspartate side chain is expected to repel the neighboring phosphate. In support of this conclusion, the G301D mutation penalizes the RNA-binding affinity of U2AF2.
For a prototypical pyPY minigene, expressing the N196K variant evokes an effect similar to that of WT U2AF2, most likely by binding to and activating the weak py tract. Conversely, the G301D-substituted U2AF2 is unable to activate splicing of the pyPY minigene, in agreement with its reduced RNA-binding affinity. Because U2AF2 in turn regulates expression of its heterodimeric U2AF1 subunit and likely other splicing factors (4,28), it is possible that the U2AF2-associated effects on pyPY splicing are indirect. However, the clear response of pyPY splicing to U2AF2 levels and substitutions is consistent with a primarily direct effect under the conditions of this experiment. The splicing of endogenous transcripts in human cells is even more complex because of competing factors and coupled processes such as transcription and polyadenylation. Moreover, we expressed the cancer-associated U2AF2 variants in the presence of normal U2AF2 to mimic the expected heterozygous state of cancer cells that have acquired the U2AF2 mutations. Nevertheless, expression of the N196K or G301D U2AF2 variants subtly but detectably alters splicing of endogenous transcripts. Although small, such changes could destabilize gene expression sufficiently to promote a cancerous state. Moreover, differences in the RNA-binding properties of the mutant proteins appear to affect splicing in different ways that could contribute to separate disease outcomes (e.g. AML for N196K versus colorectal/prostate carcinomas for G301D variants).

Figure 7. N196K and G301D mutations of U2AF2 alter splicing of representative transcripts. A-C, schematic diagrams of the indicated cassette exons from representative GSK3B, THYN1, or SAT1 transcripts. Splicing of these sites in HEK 293T cells, transfected with either empty control vector (pCMV) or plasmids expressing WT U2AF2 or the N196K or G301D variants, was analyzed by D-F, RT-PCR followed by agarose gel electrophoresis with ethidium bromide staining or G-I, quantitative real-time RT-PCR normalized to GAPDH. STD, molecular size standards. Immunoblots are shown in Fig. S1 (B and C), corresponding analyses of U2AF2 knockdown samples are shown in Fig. S2, and primer sequences are listed in Table S1. Two-tailed unpaired t tests with Welch's correction of the average values from three experiments were calculated for the mutants compared with WT in GraphPad Prism: n.s., not significant (p > 0.05); *, p < 0.05; **, p < 0.005; ***, p < 0.0005.
An ongoing challenge in the field of personalized medicine is to distinguish cancer "driver" mutations from the millions of neutral variants that have been documented in nearly every human gene (29). For example, inherited single-nucleotide variants can in some cases predispose carriers to cancers yet more often are simply neutral passengers. Although we did not detect significant growth differences for HEK 293T cells expressing the mutant U2AF2 proteins in the time frame of our experiments (data not shown), the WT U2AF2 allele is essential for cell viability, and accordingly, its acquired mutations typically are heterozygous. Our finding that the N196K or G301D variants of U2AF2 change splicing of representative gene transcripts, coupled with the apparent absence of N196K- or G301D-encoding mutations among inherited U2AF2 single-nucleotide variants (30), suggests that critical functional consequences could prevent passage of these mutations through the germline. Taken together, these findings support the view that the N196K or G301D mutations of U2AF2 are capable of contributing to the oncogenic dysregulation of gene expression.
Conversely, mutations that affect mutual interfaces of distinct subunits can have analogous functional consequences and cause the same disease, i.e. "guilt by association" (1,31,32). Because U2AF2 contacts the majority of 3´ splice sites (4), a mutant Py tract signal would have little impact on the transcriptome compared with a mis-sense mutation of the U2AF2 protein itself. However, acquired mutations affecting protein partners of U2AF2 (including U2AF1, SF3B1, SF1, or RNA unwindases) could trigger similar downstream effects. Accordingly, our crystal structure suggests that the AML-associated N196K substitution stabilizes the open U2AF2 conformation, as observed for the AML/myelodysplasia-associated S34F mutation of U2AF1 in complex with a subset of splice sites (23). As a second example, the cancer-associated K700E mutation of SF3B1 is expected to weaken SF3B1-RNA contacts and modulate RNA unwindases (11)(12)(13), which could mimic the G301D-dependent destabilization of the U2AF2-Py tract complex. Third, a search using cBioPortal (33,34) indicates that SF1 alterations recur among castration-resistant prostate cancers and colorectal adenocarcinomas, which are the same cancer types associated with the U2AF2 G301D mutation. Indeed, the most common mis-sense mutations of SF1 (R255Q/W and R135C/H) are located at its RNA interface and are expected to reduce RNA binding to the ternary SF1-U2AF2-U2AF1 complex, as would the G301D substitution of U2AF2. These observations suggest that in certain contexts, the long-tail N196K and G301D U2AF2 mutations may evoke consequences similar to those of more common mutations in other splicing factors, thereby dysregulating pre-mRNA splicing and contributing to neoplastic transformation.
In conclusion, the results presented here offer a concrete molecular mechanism for N196K and G301D mis-sense mutations of U2AF2 to contribute to the progression of malignancies by altering splice-site signal recognition. By analogy, other known 3D interfaces of the U2AF2 protein can explain the enrichment of cancer-associated mutations in its splicing factor partners. Taken collectively, these examples of long-tail splicing factor mutations represent a source of pre-mRNA splicing aberrations among cancers that may be more widespread than apparent based on inspection of the relatively rare, individual occurrences. Beyond the N196K/G301D-containing cluster of mutations at the U2AF2-RNA interface, other subsets of cancer-associated U2AF2 mutations are located at the inter-RRM interface of its closed conformation or in the C-terminal SF1/SF3B1-interaction motif. More studies are needed to distinguish the relevance of other long-tail mutations for U2AF2 structure, function, and associated cancers. Meanwhile, our confirmation that the cancer-associated N196K and G301D mutations modify the structural and functional properties of U2AF2 raises the possibility of targeting U2AF2 and its partners as a potential means to investigate and treat leukemias and cancers.

Expression and purification
For crystallization and RNA-binding experiments, the WT, N196K, or G301D variants of the U2AF2 RNA-binding domain included the N/C-terminal extensions of RRM1/RRM2, RRM1, RRM2, and the inter-RRM linker (residues 141-342 of NCBI RefSeq NP_009210). These U2AF2 12L proteins were expressed and purified as described (20). Following a final step of size-exclusion chromatography on a Superdex-75 prep-grade column (Cytiva Inc.) equilibrated with 100 mM NaCl, 15 mM HEPES, pH 6.8, 0.2 mM TCEP, the purified U2AF2 12L was concentrated using a Vivaspin 15R (Sartorius Corp.) centrifugal concentrator with a 10-kDa molecular mass cutoff. The protein concentration was estimated using the calculated extinction coefficient of 8,940 M⁻¹ cm⁻¹ and absorbance at 280 nm. Purified, deprotected oligonucleotides for co-crystallization were purchased from Integrated DNA Technologies Inc. Purified, fluorescein-labeled RNA oligonucleotides were purchased from Horizon Discovery Ltd. and deprotected according to the manufacturer's instructions.
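The concentration estimate from absorbance at 280 nm follows the Beer-Lambert law, c = A / (ε l). A minimal sketch is given below using the extinction coefficient quoted above; the absorbance reading, dilution factor, and 1-cm path length are hypothetical inputs chosen for illustration.

```python
EXTINCTION_COEFF = 8940.0  # M^-1 cm^-1, calculated value quoted in the text


def protein_molar_conc(a280, path_cm=1.0, dilution_factor=1.0):
    """Molar protein concentration of the undiluted stock from an A280
    reading, via Beer-Lambert: c = A * dilution / (epsilon * path length)."""
    if a280 < 0 or path_cm <= 0 or dilution_factor <= 0:
        raise ValueError("absorbance must be >= 0; path and dilution must be > 0")
    return a280 * dilution_factor / (EXTINCTION_COEFF * path_cm)


# Hypothetical reading: A280 = 0.447 measured on a 20-fold diluted sample.
stock_conc = protein_molar_conc(0.447, dilution_factor=20.0)
print(f"{stock_conc * 1e6:.0f} micromolar")
```

Converting the result to mg/ml would additionally require the molecular mass of the construct, which is not stated in this section.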

Crystallization and structure determination
Prior to crystallization, WT, N196K, or G301D U2AF2 12L variants were mixed in a 1:1.2 molar ratio with purified oligonucleotide (5´-UUUU(dU)U(5BrdU)CC-3´) and incubated on ice for 20 min. The final protein concentration was ~20 mg ml⁻¹. Diffraction quality crystals were obtained within approximately 1 week from a hanging drop of 1 µl of macromolecule layered with 1 µl of precipitant equilibrated over a 0.7-ml reservoir at 4°C. The precipitant of the WT complex was 1 M succinic acid, 0.1 M HEPES, pH 7.0, 3% (w/v) PEG monomethyl ether 2000. The precipitant of the N196K or G301D variants was 0.24 M sodium malonate, pH 7.0, 5% sucrose, and 20-25% (w/v) PEG 3350. Additionally, 0.1 µl of 5% (w/v) LDAO detergent (Hampton Research) was added to the N196K or G301D protein-oligonucleotide mixtures immediately prior to the precipitant solution. Each crystal was coated with a mixture of 1:1 (v/v) paratone-N and silicone oil and then flash-cooled in liquid nitrogen before data collection at 100 K. Crystallographic data sets were collected remotely at the Stanford Synchrotron Radiation Light Source Beamline 12-2 (35). The data were processed using the Stanford Synchrotron Radiation Light Source AUTOXDS script (A. Gonzalez and Y. Tsai) implementation of XDS (36) and CCP4 packages (37). The structures were determined using the Fourier synthesis method starting from PDB code 5EV3. The models were adjusted using COOT (38) and refined using PHENIX (39). The crystallographic data and refinement statistics are given in Table 1.

Fluorescence anisotropy RNA-binding assays
Protocols for the RNA-binding experiments were essentially as described (40). A 5´-fluorescein-labeled, 32-mer RNA oligonucleotide contained a near-consensus 3´ splice site (5´-CCUGUCCCUUUUUUUUUUUUAGGUCCUGGGCA, with the AG consensus underlined). The purified proteins and RNA were diluted separately by >100-fold into a binding buffer comprising 100 mM NaCl, 15 mM HEPES at pH 6.8, 0.2 mM TCEP, 0.1 unit ml⁻¹ Superase-In™ (Invitrogen). The final RNA concentration in the cuvette was 30 nM. The volume changes during addition of the protein were <10% to minimize dilution effects. The fluorescence anisotropy changes during titration were measured using a FluoroMax-3 spectrophotometer and temperature-controlled at 23°C by a circulating water bath. The samples were excited at 490 nm, and the emission intensities were recorded at 520 nm with a slit width of 5 nm. The fluorescence emission spectra also were monitored for similarity throughout the experiment. Each titration was fit as described (40) to obtain the apparent equilibrium dissociation constant (K D). These fits and the p values of a two-tailed unpaired t test with Welch's correction were calculated using Prism version 6.0 (GraphPad Software Inc.). The apparent equilibrium affinities (K A) are the reciprocals of each K D. The average K D or K A values and standard deviation among three replicates sequences are listed in Table S1.
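The relationship between a titration and an apparent K D can be illustrated with a generic single-site model. The sketch below is an assumption for illustration, not the fitting procedure of reference (40): it uses the exact quadratic solution for a 1:1 complex at the 30 nM RNA concentration stated above, and it assumes that anisotropy scales linearly with the fraction of RNA bound (no change in fluorescence intensity on binding). All function names and the example K D are hypothetical.

```python
import math

RNA_TOTAL_NM = 30.0  # final RNA concentration in the cuvette, from the text


def fraction_bound(protein_nm, kd_nm, rna_nm=RNA_TOTAL_NM):
    """Fraction of RNA in a 1:1 complex, from the exact quadratic solution
    that accounts for depletion of free protein by binding."""
    b = protein_nm + rna_nm + kd_nm
    complex_nm = (b - math.sqrt(b * b - 4.0 * protein_nm * rna_nm)) / 2.0
    return complex_nm / rna_nm


def anisotropy(protein_nm, kd_nm, r_free, r_bound, rna_nm=RNA_TOTAL_NM):
    """Predicted anisotropy, assuming a linear dependence on fraction bound."""
    return r_free + (r_bound - r_free) * fraction_bound(protein_nm, kd_nm, rna_nm)


def association_constant(kd_molar):
    """Apparent equilibrium affinity K_A as the reciprocal of K_D."""
    return 1.0 / kd_molar


# Hypothetical titration points for an assumed K_D of 100 nM.
for protein in (0.0, 50.0, 130.0, 1000.0):
    print(protein, round(fraction_bound(protein, 100.0), 3))
```

Fitting measured anisotropies to such a model, for example by nonlinear least squares, would yield K D together with the free and bound anisotropy limits; the quadratic form matters when the RNA concentration is not negligible relative to K D.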

Data availability
The atomic coordinates and structure factors of the WT, N196K, and G301D variants of U2AF2 12L bound to AdML oligonucleotide (accession codes 6XLV, 6XLW, and 6XLX) have been deposited at the Protein Data Bank.