Structural basis of ClpXP recognition and unfolding of ssrA-tagged substrates

When ribosomes fail to complete normal translation, all cells have mechanisms to ensure degradation of the resulting partial proteins to safeguard proteome integrity. In Escherichia coli and other eubacteria, the tmRNA system rescues stalled ribosomes and adds an ssrA tag or degron to the C-terminus of the incomplete protein, which directs degradation by the AAA+ ClpXP protease. Here, we present cryo-EM structures of ClpXP bound to the ssrA degron. C-terminal residues of the ssrA degron initially bind in the top of an otherwise closed ClpX axial channel and subsequently move deeper into an open channel. For short-degron protein substrates, we show that unfolding can occur directly from the initial closed-channel complex. For longer degron substrates, our studies illuminate how ClpXP transitions from specific recognition into a nonspecific unfolding and translocation machine. Many AAA+ proteases and protein-remodeling motors are likely to employ similar multistep recognition and engagement strategies.


Introduction
ClpXP and related AAA+ proteases maintain cellular health by degrading incomplete, damaged, or unneeded proteins in a process that must be specific to avoid destruction of essential intracellular proteins. In bacteria and eukaryotic organelles, AAA+ proteases typically recognize substrates via short N- or C-terminal peptide sequences. Escherichia coli ClpXP, for example, degrades proteins bearing a C-terminal degron called the ssrA tag that is added during tmRNA rescue of stalled ribosomes (Keiler et al., 1996; Keiler, 2015). During rescue, tmRNA binds in the empty A-site of a stalled ribosome, adds a charged alanine to the nascent polypeptide in a tRNA-like reaction, replaces the original mRNA with a short open reading frame that directs translation of the remaining residues of the ssrA degron, and finally recruits translation termination factors via a stop codon. A different bacterial mechanism, which is similar to eukaryotic systems, adds alanine tails to the nascent polypeptide during the ribosome-rescue reaction (Buskirk and Green, 2017; Lytvynenko et al., 2019).
The sequence of the E. coli ssrA tag is AANDENYALAA-COO⁻ (Keiler et al., 1996). The terminal Ala-Ala-COO⁻ dipeptide of this degron is the most important element for ClpXP degradation (Flynn et al., 2001), and related degrons ending in Ala-Ala target other cellular proteins to ClpXP (Flynn et al., 2003; Neher et al., 2006; Lytvynenko et al., 2019). The ssrA tag initially binds in the axial channel of the hexameric AAA+ ClpX ring, where the pore-1, pore-2, and RKH loops contribute to recognition (Siddiqui et al., 2004; Farrell et al., 2007; Martin et al., 2008a; Iosefson et al., 2015). Subsequent mechanical reactions requiring ATP hydrolysis unfold adjacent regions of native protein structure and then translocate the denatured polypeptide through the channel and into the degradation chamber of the double-ring ClpP14 peptidase for proteolysis (Olivares et al., 2018). Here, we establish the molecular basis of the recognition step in targeted ClpXP degradation of ssrA-tagged proteins. We also characterize subsequent unfolding/translocation steps that initiate processive degradation.

Structure determination
For cryo-EM, we used an ssrA-tagged green fluorescent protein substrate (GFP-G3YG9SENYALAA; ssrA residues underlined), a single-chain E. coli ClpX∆N pseudohexamer (Martin et al., 2005), and E. coli ClpP. Approximately 15 s before vitrification, we combined the GFP substrate and ATP with a mixture of the ClpX∆N variant, ClpP14, and ATPγS. Classification and three-dimensional reconstruction of EM images containing one ClpX∆N pseudohexamer and one ClpP tetradecamer generated density maps of ClpX bound to one heptameric ring of ClpP14 and 7-10 residues of the ssrA degron in two distinct conformations at resolutions of 3.1-3.2 Å (Figures 1A, 1B; Table 1). As seen in previous cryo-EM structures (Fei et al., 2020; Ripstein et al., 2020), subunits of the ClpX∆N hexamer formed a shallow spiral (labeled ABCDEF from the top to the bottom of the spiral in the clockwise direction), which docked asymmetrically with a flat ClpP7 ring. The pore of the cis ClpP ring contacting ClpX was open, as expected (Fei et al., 2020; Ripstein et al., 2020), but the pore of the trans ClpP ring was closed in both new structures. The ssrA degron bound in the top of the ClpX channel in a structure we call the recognition complex and moved ~25 Å or 6 residues deeper into the channel in a structure we call the intermediate complex (Figure 1B).

Determinants of degron recognition
In the recognition complex, the ssrA degron bound high in the ClpX channel. Access to the lower channel was blocked by a previously unvisualized conformation of the pore-2 loop of ClpX subunit A (Figures 1B, 2A, 2C; movie S1). As discussed below, multiple experiments support the role of this structure in specific substrate recognition. First, we found that mutation of the penultimate or ultimate C-terminal alanines of the ssrA tag in 29-residue peptide substrates increased KM for ClpXP degradation but had little effect on Vmax (Figure 3A). A prior study showed that mutation of the antepenultimate residue of the ssrA tag (leucine) or the residue two amino acids upstream (tyrosine) also increases KM modestly (Flynn et al., 2001), which can be rationalized based upon recognition-complex contacts. Second, previously characterized Y153A, V154F, and R228A ClpX mutations increase KM for ClpXP degradation of ssrA-tagged substrates 50-fold or more (Siddiqui et al., 2004; Farrell et al., 2007; Martin et al., 2008a; Iosefson et al., 2015). Based on the recognition-complex structure, we constructed new T199A, T199S, T199V, V202A, and H230A variants. In assays of GFP-ssrA degradation, the T199A, T199V, V202A, and H230A mutations also caused large increases in KM (Figure 3B). By contrast, KM for degradation by the conservative T199S variant increased only ~4-fold (Figure 3A), supporting a key role for a hydrogen bond between the side-chain hydroxyl of Thr199 in ClpX and the α-carboxylate of the ssrA degron (Figure 2A). Third, compared to wild-type ClpXP, the R228A variant displays reduced specificity for an ssrA-tagged substrate and increased specificity for an N-degron substrate (Farrell et al., 2007). Finally, human ClpXP has leucines at positions corresponding to Thr199 and His230 in E. coli ClpX and does not degrade ssrA-tagged substrates, but a human hybrid containing transplanted pore-2 and RKH loops from the E. coli enzyme acquires this activity (Martin et al., 2008a).
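The kinetic argument above rests on the Michaelis-Menten rate law, in which a higher KM with an unchanged Vmax slows degradation at low substrate concentrations but not at saturation. The following minimal sketch illustrates that behavior. All parameter values (VMAX, KM_WT, KM_MUT) and the function name mm_rate are hypothetical, chosen only for illustration, and are not measurements from this study.

```python
def mm_rate(substrate, vmax, km):
    """Michaelis-Menten rate: v = Vmax * [S] / (KM + [S])."""
    if substrate < 0 or km <= 0:
        raise ValueError("substrate must be >= 0 and km must be > 0")
    return vmax * substrate / (km + substrate)


# Hypothetical, unitless illustration values (NOT data from the paper).
VMAX = 1.0      # same maximal rate for both enzymes
KM_WT = 1.0     # assumed KM for a wild-type-like enzyme
KM_MUT = 50.0   # assumed 50-fold higher KM for a recognition-defective variant

if __name__ == "__main__":
    for s in (0.1, 1.0, 10.0, 1000.0):
        v_wt = mm_rate(s, VMAX, KM_WT)
        v_mut = mm_rate(s, VMAX, KM_MUT)
        # The ratio approaches 1 as [S] becomes much larger than both KM values.
        print(f"[S]={s:>7}: v_wt={v_wt:.3f}  v_mut={v_mut:.3f}  ratio={v_mut / v_wt:.3f}")
```

Under these assumed values, the two enzymes differ sharply at low substrate concentrations and converge near saturation, which is the pattern expected when a mutation weakens recognition without impairing the subsequent mechanical steps.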

The intermediate complex resembles translocation complexes
In the intermediate complex, five ClpX pore-1 loops and four pore-2 loops packed against degron side chains with a periodicity of two residues (Figures 2B, 2C; movie S2). This arrangement of pore loops interacting with polypeptide in the channel has been observed previously in ClpXP complexes with other substrates and in different AAA+ proteases and protein-remodeling machines (Fei et al., 2020; Ripstein et al., 2020; Puchades et al., 2020). These enzyme structures are thought to reflect snapshots during non-specific translocation. Indeed, ClpXP translocates a variety of sequences, including polymeric tracts of glycine, proline, lysine, arginine, glutamate, and glutamine (Barkow et al., 2009), and the specific contacts observed in the recognition complex were absent in the intermediate complex. Non-specific translocation allows ClpXP to degrade any protein after degron recognition and unfolding of attached native structure. Prior to recognition, our results suggest that the ClpX channel is occluded, thereby preventing non-specific binding and degron-independent degradation.

Dependence of substrate unfolding on degron length
Can a stable protein substrate, like GFP, be unfolded directly from the recognition complex or is substrate engagement by additional pore-1 and pore-2 loops deeper in the ClpX channel, as a consequence of one or more translocation steps, required to allow mechanical substrate denaturation? If direct unfolding from the recognition complex is possible, then a degron of ~5 residues, which is the number of ssrA-tag residues interacting with ClpX in the recognition complex, should be sufficient for degradation. By this model, longer degrons should also support degradation, but shorter degrons should not because they cannot make contacts needed for recognition. To test this model, we constructed substrates with degrons of 3, 5, 7, 9, or 11 residues following the last structured residue of GFP (Figure 4A) and assayed degradation. Strikingly, GFP-LAA was not degraded, whereas GFP-YALAA and substrates with longer tags were robustly degraded (Figure 4B). Modeling revealed that the native barrel of GFP-YALAA docked snugly with the top of the AAA+ ClpX ring, with the YALAA in the same position as in the recognition complex (Figures 4C, 4D; movie S3). By contrast, the tag of GFP-LAA was too short to allow formation of recognition-complex contacts without severe steric clashes with ClpX. It might be argued that the C-terminal β-strand of GFP-YALAA, which is ~20 residues in length, unfolds to allow additional C-terminal residues of the substrate to bind deeper in the channel of ClpX. However, global GFP unfolding occurs with a half-life of years (Kim et al., 2000), and the C-terminal β-strand remains stably associated even when it is non-covalently attached to the remaining native structure (Nager et al., 2011). The inability of ClpXP to degrade GFP-LAA also argues against a model for degradation in which the C-terminal β-strand of GFP spontaneously denatures. We conclude that a power stroke initiated directly from the recognition complex can unfold GFP-YALAA (Figure 4E).
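The degron-length prediction tested above can be sketched in a few lines. This is an illustrative sketch only: the text names just GFP-LAA and GFP-YALAA, so treating the 7-, 9-, and 11-residue tags as successive C-terminal suffixes of the ssrA sequence is an assumption, and the helper names (c_terminal_degron, predicted_degradable) and the hard threshold of 5 residues are hypothetical simplifications of the model described in the text.

```python
SSRA_TAG = "AANDENYALAA"  # E. coli ssrA tag sequence, as given in the text

# Approximate number of ssrA residues contacting ClpX in the recognition
# complex (~5 in the text); used here as a simple, assumed cutoff.
MIN_RECOGNITION_LENGTH = 5


def c_terminal_degron(length, tag=SSRA_TAG):
    """Return the last `length` residues of the tag (assumed construct design)."""
    if not 1 <= length <= len(tag):
        raise ValueError("length must be between 1 and the tag length")
    return tag[-length:]


def predicted_degradable(length):
    """Direct-unfolding model: the tag must be long enough to be recognized."""
    return length >= MIN_RECOGNITION_LENGTH


if __name__ == "__main__":
    for n in (3, 5, 7, 9, 11):
        name = f"GFP-{c_terminal_degron(n)}"
        verdict = "degraded" if predicted_degradable(n) else "not degraded"
        print(f"{name:<16} ({n:>2} residues): predicted {verdict}")
```

For the two constructs named in the text, this simple cutoff reproduces the reported outcome: GFP-LAA falls below the threshold and GFP-YALAA meets it.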
In the recognition complex, subunits ABCDE of ClpX contained ATP/ATPγS and subunit [...]
[Figure caption] Top. A substrate with a relatively long degron (~20 residues) is recognized, and subsequent ATP-dependent power strokes then move the degron deeper into the ClpX channel in the intermediate complex and then the engaged complex, from which unfolding occurs. Bottom. A substrate with a short degron (~5 residues) forms a recognition complex that is engaged and can therefore carry out direct ATP-dependent unfolding.
[...] axial channel is open and five pore-1 loops and multiple pore-2 loops contact every two residues of the substrate polypeptide in the channel (Fei et al., 2020; Ripstein et al., 2020). This structural feature has been widely observed in AAA+ proteases and protein-remodeling machines, suggesting that these diverse molecular machines employ a common mechanism of substrate translocation (for review, see Puchades et al., 2020).

Discussion
Strikingly, the recognition complex of ClpX bound to the ssrA degron is unique compared to previously determined structures. For example, all previous structures have open channels, whereas the axial channel of the AAA+ ring of ClpX in the recognition complex is closed by the pore-2 loop of subunit A, which makes specific contacts with the C-terminal residue of the ssrA tag. Moreover, only the pore-1 loops of the top two subunits in the ClpX spiral contact the substrate degron in the recognition complex, as opposed to contacts between five pore-1 loops and substrates in other known structures. The contacts we see between ClpX and the ssrA degron in the recognition complex explain multiple biochemical results, including ones that show that the C-terminal Ala-Ala dipeptide of the ssrA tag is most important for ClpXP degradation (Flynn et al., 2001; Figure 3A) and others that demonstrate that the side chains of six ClpX residues in the pore-1 loop (Tyr153, Val154), pore-2 loop (Thr199, Val202), and RKH loop (Arg228, His230) play critical roles both in substrate binding and in substrate specificity (Siddiqui et al., 2004; Farrell et al., 2007; Martin et al., 2008a; Iosefson et al., 2015; Figure 3B).
In addition to providing a specific binding site for the ssrA degron, the closed axial [...]
Structures, by themselves, can suggest but do not establish order in a kinetic pathway. However, a recent study of the kinetics of ClpXP association with a substrate bearing a 20-residue ssrA degron similar to the one studied here provides evidence for three sequentially occupied substrate-bound conformations (Saunders et al., 2020). The first two are likely to correspond to our recognition and intermediate complexes. Conversion of the first to the second complex depends on the rate of ATP hydrolysis, as does formation of the third kinetically defined state, which is probably similar to the fully engaged complex depicted in Figure 5. Prior ClpXP complexes showing the native portion of a protein substrate contacting the top of the central channel with an attached peptide filling the channel (Fei et al., 2020) provide structural evidence for this fully engaged state. Our results indicate that ClpX can unfold short-degron substrates directly from the closed-channel recognition complex, in which just two pore-1 loops contact the degron, and long-degron substrates from a subsequent open-channel engaged complex in which five pore-1 loops engage the substrate (Figure 5). Although unfolding in these two cases must be somewhat different in terms of the detailed mechanism, substrate residues in the upper part of the channel are gripped most tightly during ClpXP unfolding of substrates with long degrons (Bell et al., 2019), supporting a model in which ClpX-substrate contacts near the top of the channel play key roles in both unfolding mechanisms.
The multistep mechanism proposed here for protein degradation by ClpXP may be used by related AAA+ proteases that recognize unstructured degrons, as it nicely resolves how unfolding motors make initiation highly specific (and thus limited to appropriate protein targets) but then allow nonspecific unfolding and translocation for subsequent degradation and/or remodeling. Whether other AAA+ enzymes also use closed axial channels for degron recognition remains to be determined. More generally, AAA+ proteases and protein-remodeling machines must deal with problems similar to those encountered by multisubunit enzymes in transcription, translation, DNA replication, and protein secretion, which also require transitions from specific recognition conformations to complexes that utilize the chemical energy of ATP or GTP hydrolysis to processively and non-specifically move along their polymeric substrates.
After three rounds of 2D classification, 344,069 complexes with one ClpX∆N hexamer were selected for 3D reconstruction. Using a 40-Å low-pass filtered ClpP map as the search model (EMDB: EMD-20434; Fei et al., 2020) but no mask, 3D auto-refinement without symmetry (C1) yielded a ClpXP map at 3.5-Å resolution. CTF-refinement and particle polishing improved the overall resolution of this map to 2.8 Å. After focused classification on ClpX without alignment, multiple runs with different class numbers converged to two major classes, which were later named the recognition complex (3.1 Å resolution) and intermediate complex (3.2 Å resolution). Data analysis and reconstruction were performed within the Relion 3.0.8 pipeline (Zivanov et al., 2018). We docked structures of ClpP (PDB 6PPE) or ClpXP-substrate complexes (PDB 6PP5 and 6PP7) into EM maps using Chimera (Pettersen et al., 2004), rigid-body refined ClpX domains using Coot (Emsley and Cowtan, 2004), and performed real-space refinement using PHENIX (Adams et al., 2010). The ssrA degron was first modeled and refined as polyalanine, and specific side chains were modeled and refined subsequently.
To model GFP-YALAA binding to ClpXP, we extended the C-terminal β-strand of GFP (PDB 1EMA; residues 2-229) with the sequence YALAA and aligned this sequence with the YALAA of the recognition-complex degron using the PyMOL Molecular Graphics System, v. 2.0 (Schrödinger, LLC). Analysis in MolProbity (Williams et al., 2018) revealed multiple clashes between GFP and the RKH loops of ClpX, which were minimized manually in Coot (Emsley and Cowtan, 2004) by rotation of the residue 2-229 segment of GFP relative to the rest of the complex. To minimize clashes further, we created a map of the modeled complex and optimized geometry using the Calculate F(model) and Real-space refinement utilities, respectively, in PHENIX (Adams et al., 2010). The final model had a MolProbity score of 1.0 (100th percentile) and one minor clash (0.44 Å) between GFP and ClpX.
X.F., T.A. Bell, S.R.B., and R.T.S. performed cryo-EM experiments, model building, or biochemical experiments. T.A. Baker and R.T.S. supervised research. All authors contributed to writing and/or revising the manuscript.