Nuclear export receptor CRM1 recognizes diverse conformations in nuclear export signals

Nuclear export receptor CRM1 binds highly variable nuclear export signals (NESs) in hundreds of different cargoes. Previously we have shown that CRM1 binds NESs in both polypeptide orientations (Fung et al., 2015). Here, we show crystal structures of CRM1 bound to eight additional NESs, which reveal diverse conformations that range from loop-like to all-helix and occupy different extents of the invariant NES-binding groove. Analysis of all NES structures shows 5-6 distinct backbone conformations where the only conserved secondary structural element is one turn of helix that binds the central portion of the CRM1 groove. All NESs also participate in main chain hydrogen bonding with the human CRM1 Lys568 side chain, which acts as a specificity filter that prevents binding of non-NES peptides. The large conformational range of NES backbones explains the lack of a fixed pattern for the 3-5 hydrophobic anchor residues of an NES, which in turn explains the large array of peptide sequences that can function as NESs. DOI: http://dx.doi.org/10.7554/eLife.23961.001

Previous structures of CRM1 bound to five different NESs showed virtually identical NES-binding grooves (Fung et al., 2015; Monecke et al., 2009; Dong et al., 2009; Güttler et al., 2010). NESs from Snurportin-1 (SNUPN NES; class 1c) and Protein Kinase A Inhibitor (PKI NES; class 1a) bind CRM1 as an α-helix followed by a short β-strand, while the proline-rich NES from HIV-1 REV (Rev NES; class 2) adopts a mostly extended conformation (Figure 1B) (Güttler et al., 2010). The majority of CRM1-NES interactions involve NES hydrophobic anchor side chains, with very few polar and main chain interactions. Previously, we studied NESs with the Φ1XXΦ2XXXΦ3XXΦ4 (class 3) pattern where the i, i + 3, i + 7, i + 10 Φ positions suggested a single long amphipathic helix. However, it was perplexing how a long all-helical peptide could fit in the narrow, tapering CRM1 groove. Structures of such NESs from kinase RIO2 and cytoplasmic polyadenylation element-binding protein 4 (hRio2 NES, CPEB4 NES) showed that they do not adopt all-helical conformations but unexpectedly adopt helix-strand conformations that bind CRM1 in the opposite or minus (−) polypeptide direction to that of SNUPN NES, PKI NES and Rev NES ((+) NESs) (Fung et al., 2015). hRio2 NES and CPEB4 NES were hence reclassified as class 1a-R NESs (Figure 1A).
Figure 1. Structures of CRM1-bound NESs that match the potentially all-helical class 3 pattern. (A) Current NES sequence patterns (Φ is Leu, Val, Ile, Phe or Met and X is any amino acid). Potential amphipathic α-helices, predicted with hydrophobic patterns of i, i + 4, i + 7 or i, i + 3, i + 7 or i, i + 3, i + 7, i + 10, are shaded grey. (B) Structure of PKI NES (Φ0L) (dark blue, PDB ID: 3NBY), Rev NES (pink, 3NBZ) and CPEB4 NES (yellow, 5DIF) bound to the NES-binding groove of CRM1 (grey surface). NESs are shown in cartoon representations with their Φ side chains shown as sticks. (C) The overall structure of the engineered Sc CRM1 (grey)-Ran.GTP (orange)-RanBP1 (purple)-mDia2 NES (pale green) complex. The structure of (D) mDia2 NES (pale green), (E) CDC7 NES (green-cyan) and (F) X11L2 NES (forest) bound to the NES-binding groove of Sc CRM1 in the engineered Sc CRM1-Ran-RanBP1 complex. *The X11L2 NES sequence matches the class 3 pattern, but binds CRM1 according to the new hydrophobic pattern Φ0XXΦ1XXXΦ2XXΦ3XXXΦ4 that we termed the class 4 pattern. mDia2 NES is not shown in the leftmost panel of (D) to view the five hydrophobic pockets (P0-P4) of the CRM1 groove. Rightmost panels of (D-F): overlays of 3.0σ positive densities of kick OMIT mFo-DFc maps (calculated with peptides omitted) and final coordinates of the NES peptides. Middle panels of (D-F): black dashes show CRM1-NES hydrogen bonds and polar contacts. DOI: 10.7554/eLife.23961.002
We do not yet understand the extent of NES structural diversity, nor how NESs with different hydrophobic patterns that presumably reflect different secondary structures all bind to the seemingly invariant and three-dimensionally constrained CRM1 groove. Here, eight new structures of CRM1 bound to diverse NESs show several different and unexpected NES backbone conformations that share only a common one-turn helix element. All NESs also participate in hydrogen bonding with Lys568 of Hs CRM1, and mutagenesis/structural analysis identifies Lys568 as a selectivity filter that blocks binding of non-NES peptides.

CRM1-bound NESs adopt diverse conformations
We study three NESs that uniquely match the all-helical class 3 pattern (Φ1XXΦ2XXXΦ3XXΦ4). Because most previously studied NESs have substantial helical content, we also study five NESs that match class 2 (Φ1XΦ2XXΦ3XΦ4) and class 1b (Φ1XXΦ2XXΦ3XΦ4) patterns, where the hydrophobic residue positions do not suggest an amphipathic helix (Figure 1A). All eight NESs bind Hs CRM1 in the presence of RanGTP with dissociation constants (KDs) of 670 nM-20 µM, and were crystallized bound to the previously described engineered Sc CRM1-RanGppNHp-Yrb1p complex (Figure 1). Class 3 NESs from mouse diaphanous homolog 3 (mDia2 NES: 1179 SVPEVEALLARLRAL 1193) and the cell division cycle 7-related protein kinase (CDC7 NES: 456 QDLRKLCERLRGMDSSTP 473) are indeed all-helix peptides, both forming 3-turn α-helices that occupy only the wide part of the CRM1 groove (Figure 1D,E, Figure 1-figure supplement 4). The last residue of the mDia2 protein (Leu1193) binds the CRM1 P3 pocket, leaving the P4 pocket empty. CDC7 NES is far from the protein C-terminus, but structures of longer peptides suggest that CDC7 NES exits the groove after Met468 or Φ3 (Figure 1-figure supplement 5). The mDia2 NES and CDC7 NES sequence patterns should thus be Φ0XXΦ1XXXΦ2XXΦ3. The Φ4 anchor position is clearly not used in mDia2 NES and CDC7 NES even though Φ4 is key for activities of several other NESs (Wen et al., 1995; Meyer et al., 1996; Richards et al., 1996; Scott et al., 2002; Tsukahara and Maru, 2004). The number of Φ anchor residues necessary for CRM1 binding can vary from 3 to 5 (Figure 1-figure supplement 6). A third class 3-matching NES from beta-amyloid binding protein X11L2 (X11L2 NES: 55 SSLQELVQQFEALPGDLV 72) binds differently (Figure 1F, Figure 1-figure supplement 4). 57 LQELVQQFEAL 67 forms a 3-turn α-helix, 68 PGDL 71 forms a type I β-turn, and X11L2 NES therefore exhibits a new Φ0XXΦ1XXXΦ2XXΦ3XXXΦ4 (class 4) pattern.
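The link between hydrophobic spacing and a potential amphipathic helix can be illustrated with a simple helical-wheel calculation. The sketch below is not part of the study's methods: it assumes an ideal α-helix with 100° of rotation per residue (3.6 residues per turn), and the function name and offset tuples are illustrative. It reports the smallest arc of the wheel that contains all anchor positions; a narrow arc means the anchors cluster on one helix face.

```python
# Illustrative helical-wheel sketch; assumes an ideal alpha-helix (100 degrees per residue).
# Offsets are anchor positions relative to the first anchor, taken from the patterns in the text.
CLASS_OFFSETS = {
    "class 1b": (0, 3, 6, 8),    # Phi1-XX-Phi2-XX-Phi3-X-Phi4
    "class 2": (0, 2, 5, 7),     # Phi1-X-Phi2-XX-Phi3-X-Phi4
    "class 3": (0, 3, 7, 10),    # Phi1-XX-Phi2-XXX-Phi3-XX-Phi4
}


def hydrophobic_arc(offsets, deg_per_residue=100.0):
    """Return the smallest arc (degrees) on a helical wheel covering all anchors."""
    angles = sorted((o * deg_per_residue) % 360.0 for o in offsets)
    # Gaps between neighbouring anchors, including the wrap-around gap.
    gaps = [b - a for a, b in zip(angles, angles[1:])]
    gaps.append(360.0 - angles[-1] + angles[0])
    # The anchors fit in whatever is left after removing the largest empty gap.
    return 360.0 - max(gaps)


if __name__ == "__main__":
    for name, offsets in CLASS_OFFSETS.items():
        print(f"{name}: anchors span {hydrophobic_arc(offsets):.0f} degrees of the wheel")
```

Under this idealization the class 3 spacing places all four anchors within an 80° arc, whereas the class 1b and class 2 spacings spread them over 200° or more, which is consistent with only the class 3 pattern suggesting a single amphipathic helix.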
Structures of NESs with non-helical patterns are also informative. The previous structure of Rev NES (class 2) suggested that its three prolines may constrain against a helical conformation (Figure 1B) (Güttler et al., 2010). Class 2 NESs in the Mothers against decapentaplegic homolog 4 protein (SMAD4 NES: 133 YERVVSPGIDLSGLTLQ 149) and the fragile X mental retardation protein (FMRP NES: 423 YLKEVDQLRLERLQI 437) have few to no prolines but still bind CRM1 with mostly loop-like conformations. This left no other experimentally verified NES in the databases that unambiguously matches the class 1b pattern (Kosugi et al., 2008; Xu et al., 2012c). We engineered a class 1b NES by adding an alanine to FMRP NES (FMRP-1b NES; YLKEVDQLRALERLQID), which forms a short 1.5-turn 3₁₀ helix followed by a 3-residue β-strand (Figure 2E, Figure 2-figure supplements 1 and 2). The 3₁₀ helix favorably presents Φ1 and Φ2 into the CRM1 groove, and natural class 1b NESs are likely to bind similarly.

Structural requirements for an NES
Structures of >13 different CRM1-bound NESs are now available, and may be sorted into five or six groups according to peptide backbone conformations (Figure 3A). Class 1 NESs are helix-strand peptides with either α-helices (class 1a, 1c) or 3₁₀ helices (class 1b). Class 1-R NESs are strand-helix peptides, class 2 NESs are mostly loop-like and class 3 NESs are all-helix peptides. The helix-β-turn X11L2 NES structure revealed a new Φ0XXΦ1XXXΦ2XXΦ3XXXΦ4 (class 4) pattern.
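To make the sequence patterns concrete, the following minimal sketch scans a peptide for the spacings quoted in the text, treating Leu, Val, Ile, Phe and Met as hydrophobic anchors as defined in the Figure 1 legend. It is an illustration only, not the prediction method of any cited tool; the names `CLASS_OFFSETS` and `find_pattern_matches` are hypothetical, and only the four patterns spelled out in this article are included.

```python
# Illustrative pattern scan; anchor set and spacings are taken from the text,
# everything else (names, output format) is an assumption of this sketch.
PHI = set("LVIFM")  # hydrophobic anchor residues: Leu, Val, Ile, Phe, Met

# Anchor positions relative to the first anchor of each pattern.
CLASS_OFFSETS = {
    "class 1b": (0, 3, 6, 8),        # Phi-XX-Phi-XX-Phi-X-Phi
    "class 2": (0, 2, 5, 7),         # Phi-X-Phi-XX-Phi-X-Phi
    "class 3": (0, 3, 7, 10),        # Phi-XX-Phi-XXX-Phi-XX-Phi
    "class 4": (0, 3, 7, 10, 14),    # Phi-XX-Phi-XXX-Phi-XX-Phi-XXX-Phi
}


def find_pattern_matches(seq):
    """Return (class name, 0-based start index) for every pattern match in seq."""
    hits = []
    for name, offsets in CLASS_OFFSETS.items():
        # Slide the pattern along the peptide; overlapping matches are all reported.
        for i in range(len(seq) - offsets[-1]):
            if all(seq[i + o] in PHI for o in offsets):
                hits.append((name, i))
    return hits


if __name__ == "__main__":
    for label, seq in [("mDia2", "SVPEVEALLARLRAL"), ("X11L2", "SSLQELVQQFEALPGDLV")]:
        print(label, find_pattern_matches(seq))
```

A sequence match says nothing about which register is actually bound: the X11L2 peptide matches both the class 3 and class 4 spacings at the same start position, and the structure, not the scan, shows that it binds according to class 4.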
The only common secondary structural element in the NES structures is one turn of NES helix at Φ2X₂₋₃Φ3 (grey box, Figure 3A). This conserved turn of helix is flanked on one side by additional turns of helix (classes 1, 1-R) or by loops (class 2), and on the other side by β-strands (classes 1, 1-R, 2) or a β-turn (class 4), or the helix ends as the chain terminates or exits the groove (class 3) (Figure 3A). Dihedral (ψ) angles in the one-turn helix gradually increase, progressing from helical to β-strand conformations (Figure 3B). In all (+) NESs, main chain carbonyls of the Φ2+1 and Φ3 residues in the one-turn helix element hydrogen bond with the Sc CRM1 Lys579 (or Mm CRM1/Hs CRM1 Lys568) side chain, much like niche3/4 motifs where carbonyls of residues i and i + 2 or i + 3 coordinate a cationic group (Torrance et al., 2009). The Φ3-Lys579 hydrogen bond is possible only because the β-strand ψ angle turns the Φ3 carbonyl towards Lys579 (Figure 3C). NES helix-Lys579 hydrogen bonds are absent in (−) NESs as backbone carbonyls point in the opposite direction. Here, carbonyls of the N-terminal β-strand hydrogen bond with Lys579 (Figure 3-figure supplement 1). Therefore, another common structural feature of NESs is hydrogen bonding between the NES backbone and Sc CRM1 Lys579 (Hs CRM1 Lys568). Mutations of Hs CRM1 Lys568 impair NES binding. Mutants Hs CRM1(K568A) and Hs CRM1(K568M) bind FITC-PKI NES two to three orders of magnitude more weakly than wild type Hs CRM1, supporting the importance of Lys568-NES interactions (Figure 3-figure supplement 2).
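As a rough guide to what a two to three orders of magnitude loss in affinity means energetically, the fold change in dissociation constant can be converted to a change in binding free energy with ΔΔG = RT ln(KD,mutant / KD,wild type). The sketch below is a back-of-the-envelope conversion, not an analysis from the study; the temperature of 298.15 K is an assumption, since the assay temperature is not stated here, and the function name is illustrative.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def ddg_from_fold_change(fold_change, temperature_k=298.15):
    """Free energy penalty (kcal/mol) for a given fold increase in KD.

    temperature_k is an assumed value, not one reported in the text.
    """
    return R_KCAL * temperature_k * math.log(fold_change)


if __name__ == "__main__":
    for fold in (100, 1000):
        print(f"{fold}-fold weaker binding ~ {ddg_from_fold_change(fold):.1f} kcal/mol")
```

Under these assumptions, 100-fold and 1000-fold weaker binding correspond to penalties of roughly 2.7 and 4.1 kcal/mol, respectively.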
In summary, an active NES (1) can use many different backbone conformations to present 3-5 hydrophobic anchor residues into 3-5 CRM1 hydrophobic pockets (P0 and/or P4 are sometimes not used), (2) has one turn of helix with a helix-strand transition that binds the central portion of the CRM1 groove and (3) has a backbone conformation that can hydrogen bond with Lys568 of Hs CRM1.

Hs CRM1 Lys568 is a selectivity filter for NES recognition
What then are the CRM1 groove features that selectively recognize the key NES features? The arrangement of hydrophobic pockets in the groove likely selects NESs with suitably placed Φ residues. The groove shape, tapering and most constricted at Sc CRM1 Lys579 (Hs CRM1 Lys568), likely selects for NES helices that transition to strands or NES helices that end (Figure 3C). Is groove-constricting Hs CRM1 Lys568 perhaps key for differentiating active from false positive NES sequences? We tested mutants Hs CRM1(K568A) and Hs CRM1(K568M) for interactions with three previously identified false positive NESs that match the NES consensus but do not bind CRM1: peptides from Hexokinase-2 (Hxk2 pep: 18 DVPKELMQQIENFEKIFTV 36; class 3 match), Deformed Epidermal Autoregulatory Factor 1 homolog (DEAF1 pep: 452 SWLYLEEMVNSLLNTAQQ 469; class 1a-R match) and COMM domain-containing protein 1 (COMMD1 pep: 173 KTLSEVEESISTLISQPN 190; class 3 match) (Figure 4A) (Xu et al., 2012c, 2015). Wild type Hs CRM1 does not bind the peptides, but Hs CRM1(K568A) binds Hxk2 pep and DEAF1 pep, and Hs CRM1(K568M) binds DEAF1 pep but not Hxk2 pep, suggesting that Lys568 is important in filtering out false positive NESs (Figure 4A).
Both Sc CRM1(K579A)-bound Hxk2 pep and DEAF1 pep are all-helix peptides (Figure 4B,C, Figure 4-figure supplement 1, Figure 4-source data 1). The fourth turn of the Hxk2 pep helix packs into hydrophobic space widened by removal and rearrangement of the Lys579 and Glu582 side chains, respectively (Figure 4B). The 2.5-turn α-helix of DEAF1 pep binds in the (−) direction and is slightly longer than helices in true (−) NESs (Figure 4C). Superpositions of Hxk2 pep and DEAF1 pep onto wild type CRM1 grooves show the fourth turn of the Hxk2 pep helix and the N-terminus of the DEAF1 pep helix clashing with Sc CRM1 Lys579/Mm CRM1 Lys568 side chains (Figure 4B,C, Figure 4-figure supplement 2). The rest of the mutant Sc CRM1(K579A) groove is highly similar to the wild type groove. Therefore, the key feature of the wild type groove that prevents Hxk2 pep and DEAF1 pep binding is Lys568, which is not only a critical hydrogen bond donor for binding NESs but also, through its long side chain, blocks binding of sequences that do not meet NES structural requirements.
Structures of many diverse NES sequences suggest how one unchanging peptide-bound CRM1 groove can recognize up to a thousand different peptides. The dependence on 3-5 hydrophobic residues in 8-15-residue-long NESs arises from the substantial binding energy of anchor hydrophobic side chains interacting with 3-5 CRM1 hydrophobic pockets. However, the lack of contact with the NES backbone allows anchor side chains to be presented in many conformations, including both N- to C-terminal orientations, explaining the broad specificity defined by highly variable spacings between anchors. Interestingly, NES conformation is not entirely unrestrained, as CRM1 groove constriction imposes either exit/termination of the NES chain or its continuation in an extended configuration. Solutions for the broadly specific NES recognition contrast with those of analogous systems. MHC I and II proteins, each recognizing at least hundreds of different peptide antigens, use many peptide main chain contacts for affinity with only a few supplementary peptide side chain interactions (Zhang et al., 1998; Madden, 1995). The result for MHC proteins is a conformational selection of particular lengths of extended peptides binding in a conserved N- to C-terminal orientation, with little sequence restriction. The Calmodulin-helical peptide system is yet another contrast, as the binding domain uses its flexible fold to adapt to various helical ligands (Tidow and Nissen, 2013; Hoeflich and Ikura, 2002).
CRM1-NES structures expanded the six NES patterns derived from peptide library studies to the eleven patterns shown in Figures 1A and 3A. The ever-expanding set of NES patterns suggests that no fixed hydrophobic pattern is likely to describe the NES. Furthermore, only ~50% of consensus-matching previously reported NESs that were tested actually bound CRM1, contributing to the inefficiency of available NES predictors (with precision of 50% at 20% recall rate) (Xu et al., 2012b, 2015; Kosugi et al., 2014; Fu et al., 2011). The many available NES structures, the diversity of NES conformations and the structurally conserved one-turn helix NES element revealed here will enable development of structure- rather than sequence-based NES predictors (Raveh et al., 2011; Schindler et al., 2015; Trellet et al., 2013; Yan et al., 2016). There is a need to identify many more CRM1 cargoes, as apoptosis of different cancer cells upon CRM1 inhibition by the drug Selinexor (in clinical trials for a variety of cancers) and other inhibitors (Parikh et al., 2014; Mendonca et al., 2014; Das et al., 2015; Alexander et al., 2016; Gounder et al., 2016; Abdul Razak et al., 2016; Lapalombella et al., 2012; Inoue et al., 2013; Etchin et al., 2013; Tai et al., 2014; Cheng et al., 2014; Kim et al., 2016; Hing et al., 2016; Vercruysse et al., 2016) appears to be driven by nuclear accumulation of different sets of NES-containing cargoes, but the identities of most of these apoptosis-causing cargoes are still unknown.
Finally, we find that the Hs CRM1 Lys568 side chain acts as a filter that physically selects for NESs with helices that transition to strands or end at the narrow part of the CRM1 groove. Interestingly, Lys568 interacts electrostatically with Hs CRM1 Glu571, which is mutated to glycine or lysine in chronic lymphocytic leukemia and lymphomas with poor prognosis (Puente et al., 2011;Jardin et al., 2016;Camus et al., 2016). Disease mutations will abolish Glu571-Lys568 contacts and possibly affect NES binding and selectivity.
NES peptides are modeled according to the positive difference density (2mFo-DFc map) at the binding groove after refinement of the CRM1-Ran-RanBP1 model. In all structures, there are good electron densities for the NES main chain and directions of side chain density in the helical portion of the peptides allow unambiguous determination that they are all oriented in the positive (+) NES orientation. Side chain assignments of the NES peptides are guided by (1) densities of Φ side chains that point into the binding groove, (2) densities for long non-Φ side chains such as arginine, phenylalanine and methionine and (3) physical considerations such as steric clashes.
For example, to model the bound mDia2 NES peptide (sequence: GGSY-1179 SVPEVEALLARLRAL 1193), we made use of the obvious electron densities (mFo-DFc map) for long side chains to guide sequence assignment. There is a strong side chain density suitable for an arginine side chain on the peptide (white dashed circle in Figure 1-figure supplement 2A). There are only two arginine residues in the mDia2 NES peptide, Arg1189 and Arg1191. If the long side chain density is assigned to Arg1189, then Arg1191 would end up pointing into the binding groove, an energetically very unfavorable and unlikely situation. Furthermore, Ala1188 would end up in the P2 pocket of CRM1 where there is an obvious density for a longer hydrophobic side chain (left panel, Figure 1-figure supplement 2A). On the other hand, when Arg1191 is assigned to the long and continuous side chain density (adjacent to helix H12A of CRM1), the remaining side chains in the NES end up in positions that are consistent with the electron densities.
For the FMRP-1b NES (sequence: 1 GGS-YLKEVDQLRALERLQID 20), there are no obvious long side chain densities that could help with modeling. There are however obvious densities for several side chains in the first two turns of the NES helix. These side chain densities are consistent with two possible sequence assignments: 5 LKEVDQLRAL 14 or the more C-terminal 11 LRALERLQID 20. We tested modeling of FMRP-1b NES by refining both peptide models and by testing a mutant peptide that should distinguish between the two models. Ten cycles of PHENIX refinement of the 5 LKEVDQLRAL 14 model resulted in positive and negative difference densities (mFo-DFc map) at several NES side chains, which suggested an incorrect assignment (left panels, Figure 1-figure supplement 2B). In contrast, difference densities are absent when the 11 LRALERLQID 20 model is refined (right panels, Figure 1-figure supplement 2B). The final FMRP-1b NES structure was therefore modeled as 11 LRALERLQID 20. The sequence assignment of FMRP-1b NES was also tested using a mutant FMRP-1b NES that has the sequence YLKEVDQLRALER. If the NES is 5 LKEVDQLRAL 14, FMRP-1b NES mutant YLKEVDQLRALER should bind well to CRM1. However, if 11 LRALERLQID 20 is the FMRP-1b NES, mutant YLKEVDQLRALER should not bind CRM1 as the C-terminal half of the NES, 17 LQID 20, which includes Φ3 and Φ4, is missing. Results in Figure 1-figure supplement 3 show that FMRP-1b NES mutant YLKEVDQLRALER does not bind CRM1, providing further support that the NES is indeed 11 LRALERLQID 20 as currently assigned.
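The logic of the truncation test can be checked by simple residue bookkeeping. The sketch below is only an illustration of that reasoning, using the peptide sequences and the 1-20 numbering quoted above; the helper names are hypothetical. It locates the truncated mutant within the full construct and asks which candidate NES window it still contains in full.

```python
# Sequences and numbering are from the text; helper names are illustrative.
FULL_PEPTIDE = "GGSYLKEVDQLRALERLQID"   # residues 1-20 of the FMRP-1b NES construct
TRUNCATED_MUTANT = "YLKEVDQLRALER"      # mutant peptide used in the binding test

CANDIDATE_WINDOWS = {
    "5 LKEVDQLRAL 14": (5, 14),
    "11 LRALERLQID 20": (11, 20),
}


def residue_span(full, fragment):
    """Return the 1-based (first, last) residue numbers of fragment within full."""
    start = full.index(fragment) + 1
    return start, start + len(fragment) - 1


def window_is_intact(window, span):
    """True if every residue of the candidate window is present in the span."""
    return span[0] <= window[0] and window[1] <= span[1]


if __name__ == "__main__":
    span = residue_span(FULL_PEPTIDE, TRUNCATED_MUTANT)
    print(f"Mutant covers residues {span[0]}-{span[1]}")
    for label, window in CANDIDATE_WINDOWS.items():
        print(label, "intact in mutant:", window_is_intact(window, span))
```

The mutant spans residues 4-16, so it retains the whole 5-14 window but lacks residues 17-20 of the 11-20 window. Loss of binding by the mutant is therefore the outcome expected only if the NES is 11 LRALERLQID 20.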

NES activity assays
Pull-down binding assays, in vivo NES activity assay and differential bleaching experiments for determining binding affinities were all performed the same way as described in Fung et al. (2015). The data were analyzed in PALMIST (Scheuermann et al., 2016) and plotted with GUSSI (Brautigam, 2015).