Cotranslational Protein Folding inside the Ribosome Exit Tunnel

Summary
At what point during translation do proteins fold? It is well established that proteins can fold cotranslationally outside the ribosome exit tunnel, whereas studies of folding inside the exit tunnel have so far detected only the formation of helical secondary structure and collapsed or partially structured folding intermediates. Here, using a combination of cotranslational nascent chain force measurements, inter-subunit fluorescence resonance energy transfer studies on single translating ribosomes, molecular dynamics simulations, and cryoelectron microscopy, we show that a small zinc-finger domain protein can fold deep inside the vestibule of the ribosome exit tunnel. Thus, for small protein domains, the ribosome itself can provide the kind of sheltered folding environment that chaperones provide for larger proteins.

Related to Fig. 1. (a) Sequence of the full ADR1a-SecM(L=46) construct. ADR1a is in red, the SecM AP in green, residues from the periplasmic domain of E. coli LepB in black, and linker residues generated during construction in grey. (b) Sequences of the ADR1a-AP part of all constructs analyzed in Fig. 1d. Zn2+-binding residues are highlighted in blue.

Enzymes and Chemicals
Unless otherwise stated, all enzymes were obtained from Thermo Scientific (Waltham, MA, USA) and New England Biolabs (Ipswich, MA, USA).
All other reagents were from Sigma-Aldrich (St. Louis, MO, USA).

DNA Manipulations
All ADR1a constructs were generated from the previously described pING1 plasmid carrying a truncated lepB gene containing a [6L,13A] H segment insert and the Escherichia coli SecM arrest peptide, FSTPVWISQAQGIRAGP, under the control of an arabinose-inducible promoter (Ismail et al., 2012). A soluble, non-membrane-targeted LepB derivative was generated by deleting codons 4-77 using PCR, corresponding to the removal of transmembrane segments 1 and 2. The resulting plasmid was digested with SpeI and KpnI to release the [6L,13A] segment, and oligonucleotides corresponding to a GSGS-flanked ADR1a domain (GSGS-KPYPCGLCNRCFTRRDLLIRHAQKIHSGN-SGSG) were ligated in its place.
Shorter linker lengths, L, between ADR1a and the arrest peptide were generated by shortening the linker from its N-terminal end by PCR as previously described (Ismail et al., 2012). Site-directed mutagenesis was performed to generate constructs with the non-functional FSTPVWISQAQGIRAGA arrest peptide and constructs with ADR1a domains in which either one or both of the Zn2+-binding His residues in the sequence KPYPCGLCNRCFTRRDLLIRHAQKIHSGN were changed to Ala. For RNA transcription using the T7 promoter, all constructs were subcloned into pET19b (Novagen, Madison, WI, USA) using NcoI and BamHI.
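The sequence variants described above can be written out explicitly. The following is a minimal sketch, assuming only the ADR1a and SecM arrest peptide sequences quoted in the text; the function name `his_to_ala` and the residue numbering (counted within the 29-residue ADR1a fragment) are illustrative choices, not names used by the authors.

```python
# Sequences quoted in the text.
ADR1A = "KPYPCGLCNRCFTRRDLLIRHAQKIHSGN"
SECM_AP = "FSTPVWISQAQGIRAGP"

def his_to_ala(seq, positions):
    """Replace His with Ala at the given 1-based positions (hypothetical helper)."""
    residues = list(seq)
    for pos in positions:
        if residues[pos - 1] != "H":
            raise ValueError(f"residue {pos} is {residues[pos - 1]}, not His")
        residues[pos - 1] = "A"
    return "".join(residues)

# 1-based positions of the His residues within the ADR1a fragment.
his_positions = [i + 1 for i, aa in enumerate(ADR1A) if aa == "H"]

# Single and double His-to-Ala variants.
single_mutants = [his_to_ala(ADR1A, [p]) for p in his_positions]
double_mutant = his_to_ala(ADR1A, his_positions)

# Non-functional arrest peptide: the C-terminal Pro is changed to Ala.
nonfunctional_ap = SECM_AP[:-1] + "A"

print(his_positions, single_mutants, double_mutant, nonfunctional_ap)
```

The fragment contains exactly two His residues, so the two single mutants and the double mutant cover all the His-to-Ala combinations mentioned in the text.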

In Vitro Transcription and Translation for Measurements of fFL
In vitro transcription was performed with T7 RNA polymerase according to the manufacturer's protocol (Promega) using PCR products as templates for the generation of truncated nascent chains. The RNA obtained was purified using the RNeasy Mini Kit (Qiagen). Translation was performed in the commercially available PUREfrex™ system (Shimizu et al., 2005) and in an S135 E. coli extract. The S135 cell extract was prepared as previously described (Schwarz et al., 2007). To obtain an essentially Zn2+-free S135 cell extract for the Zn2+-titration assay ( before being added to an equal volume of 20% ice-cold trichloroacetic acid (TCA).
Samples were incubated on ice for 30 min and spun for 5 min at 20,800 g at 4°C.
Pellets were washed with cold acetone, spun again for 5 min at 4°C, and subsequently solubilized in Tris-SDS solution (10 mM Tris-Cl, pH 7.5, 2% SDS) at 95°C for 10 min. Samples were spun for 5 min at room temperature and the lysate was used for immunoprecipitation using LepB antisera. The samples were resolved by SDS-PAGE and quantitated as described above. Experiments were repeated three times using independent culture incubations, and standard errors (s.e.m.) were calculated.

Single-Ribosome Inter-Subunit FRET Experiments.
fMet-tRNAfMet-bound 30S pre-initiation complexes (PICs), Cy3B-labeled on the 16S rRNA (Marshall et al., 2008), were formed on the ADR1a-SecM(L=24; Δ1-158) mRNA constructs and immobilized on the surface of pre-treated zero-mode waveguide (ZMW) chips (SMRT Cell, Pacific Biosciences) through hybridization of the mRNAs to biotinylated splint DNA oligos. Elongation mixtures, containing 200 nM fluorescence-quencher-labeled (BHQ-2) 50S ribosomal subunits, 240 nM EF-G, and 3 µM total aa-tRNA·EF-Tu·GTP ternary complex (TC), were delivered to the ZMW chips in a modified PacBio RS sequencer in which all individual ZMWs are illuminated with a 532 nm laser and fluorescence data are acquired over time. Preparation of native or fluorophore-labeled biomolecules was performed as described in and references therein. The elongation reactions were carried out in a Tris-based polymix buffer at 20°C in the presence of 1 µM IF2, 4 mM GTP, 2 mM Trolox, and a PCA/PCD oxygen-scavenging system. Fluorescence data were collected at 10 Hz for 10 min, and filtered and analyzed using MATLAB (MathWorks) scripts as described previously. ZMW chips were loaded stochastically at 30% occupancy. The 30% of ZMWs with the lowest signal were used to calculate the background and σ. Wells with a signal more than n·σ above background that lasted longer than 10 s were selected for each integer n in [1, 10]. The minimum of the discrete derivative of the resulting function (number of picked wells as a function of n) was used to identify the best n at least two high-low-high-FRET cycles signaling intersubunit rotation. Average state lifetimes were calculated by fitting the individual lifetimes to a single-exponential distribution using maximum-likelihood parameter estimation. Only lifetimes from productive states were included (i.e., low-FRET states that were followed by a high-FRET state and vice versa) to eliminate artifacts from photophysical effects.
The density of elongating ribosomes, decreasing with codon number due to both photobleaching and erroneous translation termination, was calculated and summarized from assigned FRET states in n individual traces.
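The lifetime analysis described above can be illustrated with a short sketch. It is not the authors' MATLAB code: the trace representation (a list of labeled dwells), the state labels, and the function names are assumptions made for illustration, and the example trace is invented. For a single-exponential distribution, the maximum-likelihood estimate of the mean lifetime is the arithmetic mean of the observed dwell times.

```python
import math

def productive_lifetimes(dwells, state):
    """Dwell times of `state` that are followed by the opposite FRET state.

    `dwells` is a time-ordered list of (label, duration_s) tuples; this
    representation is an assumption made for the sketch.
    """
    other = "high" if state == "low" else "low"
    return [
        duration
        for (label, duration), (next_label, _) in zip(dwells, dwells[1:])
        if label == state and next_label == other
    ]

def fit_single_exponential(lifetimes):
    """Maximum-likelihood fit of p(t) = k * exp(-k * t).

    Returns (mean lifetime tau, rate k, log-likelihood at the optimum).
    """
    n = len(lifetimes)
    if n == 0 or min(lifetimes) <= 0:
        raise ValueError("need at least one positive lifetime")
    total = sum(lifetimes)
    tau = total / n                 # MLE of the mean lifetime
    rate = 1.0 / tau                # MLE of the rate constant
    log_likelihood = n * math.log(rate) - rate * total
    return tau, rate, log_likelihood

# Invented example trace: the last low-FRET dwell ends in photobleaching
# and is therefore not counted as productive.
trace = [("high", 1.2), ("low", 0.8), ("high", 2.0), ("low", 1.6), ("bleached", 5.0)]
high_dwells = productive_lifetimes(trace, "high")
low_dwells = productive_lifetimes(trace, "low")
tau_high, k_high, _ = fit_single_exponential(high_dwells)
print(high_dwells, low_dwells, tau_high, k_high)
```

In practice, dwells from many traces would be pooled per codon and per FRET state before fitting.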

Molecular Dynamics Simulations
The cotranslational folding curve of the ribosome-ADR1a nascent chain complex was calculated on an arrested ribosome using the coarse-grained model of O'Brien and coworkers (O'Brien et al., 2011, 2012), in which amino acids are represented as one interaction site, purine-containing nucleotides as three interaction sites, and pyrimidine-containing nucleotides as four interaction sites. In this model, electrostatic interactions are treated using Debye-Hückel theory with a 10 Å Debye screening length. We utilized the force-field and Langevin dynamics protocol published previously (O'Brien et al., 2011, 2012). Briefly, a structure-based force-field (Ueda et al., 1978; Onuchic and Wolynes, 2004) was used (O'Brien et al., 2011, 2012) such that the stability of the folded zinc-finger in isolation was equal to -2.0 kcal/mol at 310 K. ADR1a was then covalently attached to unstructured linkers having the same sequences as used in the experiments (see Fig. S1). Linker lengths of 17 to 46 residues were simulated. At each linker length, replica-exchange simulations (Sugita and Okamoto, 1999) were run with 8 temperature windows ranging between 290 and 370 K. A simulation structure of the zinc-finger domain was classified as folded if its root-mean-squared deviation (RMSD) from the ADR1a NMR structure was < 3.5 Å, and as unfolded if the RMSD was > 5.5 Å. The WHAM equations (Kumar et al., 1992) were then utilized to calculate the probability of the domain being folded as a function of linker length at 310 K.
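The RMSD-based classification and the relation between stability and folded probability can be sketched as follows. This is a simplified illustration, not the published protocol: it counts frames without WHAM reweighting, it assumes the folded probability is taken over all frames, and it assumes a two-state relation between stability and population. The function names and the example RMSD values are invented.

```python
import math

R_KCAL = 0.0019872041  # gas constant in kcal/(mol K)

def classify(rmsd, folded_cut=3.5, unfolded_cut=5.5):
    """Label a frame by its RMSD (Å) from the native structure."""
    if rmsd < folded_cut:
        return "folded"
    if rmsd > unfolded_cut:
        return "unfolded"
    return "intermediate"

def folded_fraction(rmsds):
    """Unweighted fraction of frames classified as folded (assumption:
    normalized over all frames; the paper uses WHAM-reweighted probabilities)."""
    labels = [classify(r) for r in rmsds]
    return labels.count("folded") / len(labels)

def probability_from_stability(delta_g, temperature=310.0):
    """Two-state folded population for a folding free energy delta_g (kcal/mol)."""
    return 1.0 / (1.0 + math.exp(delta_g / (R_KCAL * temperature)))

def stability_from_probability(p_folded, temperature=310.0):
    """Inverse of the relation above: delta_g = -RT ln(p_F / p_U)."""
    return -R_KCAL * temperature * math.log(p_folded / (1.0 - p_folded))

# Invented RMSD values (Å) for a handful of frames.
example_rmsds = [2.1, 3.0, 4.2, 6.0, 2.8, 7.5, 3.4, 5.0]
frac = folded_fraction(example_rmsds)

# Folded population implied by a stability of -2.0 kcal/mol at 310 K.
p_isolated = probability_from_stability(-2.0)
print(frac, round(p_isolated, 3))
```

Under the two-state assumption, a stability of -2.0 kcal/mol at 310 K corresponds to a folded population of roughly 0.96 for the isolated domain.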

Chain Complexes
The ADR1a-SecM(L=25) construct, which is at the peak of the force profile in Fig. 1d, was chosen for large-scale preparation and cryo-EM analysis. The E. coli SecM stalling sequence was modified by mutating 5 residues to obtain the Sup1 version of the M. succiniproducens SecM AP (HPPIRGSP) (Yap and Bernstein, 2009). The resulting sequence was overlapped by PCR to the DNA fragment encoding the last 29 amino acids of the yeast ADR1a protein, yielding ADR1a-SecM(Ms-Sup1; L=25), Fig. S4a. Primers containing 5' SapI sites were used to PCR-amplify ADR1a-SecM(Ms-Sup1; L=25), which was subsequently cloned into a p7XNH vector by using the FX cloning method (Geertsma, 2014

Cryo-EM Specimen Preparation, Data Collection, Processing, and Model Building
Carbon-coated holey grid preparation of ADR1a-SecM(Ms-Sup1; L=25) RNCs was carried out as described previously. Cryo-EM data were collected on a Titan Krios TEM (FEI, USA) operated at 300 keV and equipped with a back-thinned Falcon II (FEI, USA) direct electron detector. The camera was calibrated for a nominal magnification of 75,000x, resulting in a pixel size of 1.37 Å at the specimen. Seven blocks of frames s-1 were recorded in automatic mode with a dose of 5 e-/Å2 per block at defocus values between -1 and -3.2 µm. Frames were aligned using the software developed by the Yifan Cheng lab at UCSF (Li et al., 2013).
Micrographs showing drift or contamination were manually discarded from the dataset. All processing was performed using the SPIDER software package (Frank et al., 1996). The initial dataset of 496,340 particles was first cleaned of non-ribosomal particles (306,243 ribosomal particles left) and subsequently sorted for the presence of A-, P-, and E-site tRNAs. The dataset that contained strong density for tRNA in the P-site was further refined by applying a cross-correlation cut-off. The final dataset contained 151,900 particles and was refined to a final average resolution of 4.8 Å according to the FSC criterion at a cut-off of 0.14. Potential over-fitting was excluded by truncating high frequencies (low-pass filter at 8 Å) during the whole refinement process (Scheres and Chen, 2012).
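As an illustration of how a resolution estimate is read off an FSC curve at a fixed cut-off, the following sketch finds the first point at which the curve falls below the threshold and interpolates linearly between the neighboring shells. It is not SPIDER code; the function name and the FSC values are invented for the example.

```python
def fsc_resolution(freqs, fsc, threshold=0.14):
    """Resolution (Å) at which the FSC curve first drops below `threshold`.

    `freqs` are spatial frequencies in 1/Å in ascending order and `fsc` the
    corresponding correlation values. Returns None if the curve never
    crosses the threshold.
    """
    for i in range(1, len(freqs)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two shells around the crossing.
            frac = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            crossing = freqs[i - 1] + frac * (freqs[i] - freqs[i - 1])
            return 1.0 / crossing
    return None

# Invented FSC curve for demonstration only.
example_freqs = [0.05, 0.10, 0.20, 0.25]
example_fsc = [1.00, 0.90, 0.24, 0.04]
resolution = fsc_resolution(example_freqs, example_fsc)
print(resolution)
```

Real FSC curves are sampled on many more shells, so the interpolation step matters less there than in this coarse example.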
For structural comparison and interpretation of the cryo-EM density obtained, we fitted the structure of the E. coli 70S ribosome (PDB ID: 3OFR) using UCSF Chimera (Pettersen et al., 2004). A poly-alanine model of the SecM-stalled nascent chain was built based on the model of a TnaC-stalled peptide (PDB ID: 4YU8)