Quantitative reconstitution of yeast RNA processing bodies

Significance Biomolecular condensates concentrate related biomolecules within cells. Biochemical reconstitutions using one to a few molecules indicate that many condensates are formed through liquid–liquid phase separation. However, it is unclear to what extent the principles gleaned from simplified reconstitutions apply to complex cellular condensates that are enriched in tens to hundreds of different molecules. Here, we used the seven most highly concentrated proteins in yeast RNA processing bodies (P bodies) and RNA to reconstitute these archetypal condensates. Combining proteins and RNA at their cellular concentrations generated in vitro P bodies with protein partitioning and dynamics that are quantitatively consistent with cellular measurements. Our findings suggest that much of the information specifying condensate properties is carried by interactions between the most abundant components.


Cloning of expression plasmids
We examined seven proteins that are highly concentrated in S. cerevisiae P bodies: Dcp1, Dcp2, Dhh1, Edc3, Lsm1-7, Pat1, and Xrn1 (Fig. 1A) (1). We did not include Upf1 because it does not affect the targeting of most normal mRNAs to P bodies; it is instead an accessory factor affecting only a subset of mRNAs undergoing aberrant translational termination and nonsense-mediated decay (NMD) (2). Genes for Dcp1, Dcp2, Dhh1, Edc3, and Pat1 were cloned into a modified pMAL plasmid (New England Biolabs) using standard methods and NdeI and BamHI restriction sites. To use this strategy, silent mutations were first introduced into the coding sequences to eliminate restriction sites in Dcp2 (one NdeI site), Edc3 (one BamHI site), and Pat1 (two NdeI sites), using standard site-directed mutagenesis. The modified pMAL plasmid (pMTTH) contains TEV-cleavable N-terminal MBP and C-terminal His6 tags. These genes were also cloned into a modified MTTH vector with monomeric EGFP inserted after the P-body gene and before the second TEV cleavage site/His6 tag.
Expression plasmids for Lsm1-7 and Xrn1 were generous gifts from Yigong Shi and Lionel Benard, respectively.
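The check for internal restriction sites described above can be illustrated with a minimal sketch. It assumes the standard recognition sequences for NdeI (CATATG) and BamHI (GGATCC); the function names and the toy sequence are hypothetical and are not taken from the actual constructs.

```python
# Recognition sequences for the two cloning enzymes (both are palindromic,
# so scanning one strand is sufficient).
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC"}


def count_sites(seq: str, site: str) -> int:
    """Count occurrences of `site` in `seq`, allowing overlaps."""
    seq = seq.upper()
    count = 0
    start = seq.find(site)
    while start != -1:
        count += 1
        start = seq.find(site, start + 1)
    return count


def internal_sites(cds: str) -> dict:
    """Return the number of NdeI and BamHI sites found in a coding sequence."""
    return {name: count_sites(cds, site) for name, site in SITES.items()}


# Hypothetical toy sequence, not a real P-body gene.
toy_cds = "ATGGCACATATGGGTGGATCCAAACATATGTAA"
print(internal_sites(toy_cds))
```

Any non-zero count marks a site that would need to be removed by a silent mutation before NdeI/BamHI cloning.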

Protein expression and purification
Lsm1-7 and Xrn1 were expressed and purified as described previously (3)(4)(5). The general protocol for expressing and purifying the other P-body proteins is described below, with protein-specific information provided thereafter. Plasmids were transformed into Escherichia coli BL21 (DE3) cells and grown overnight at 37 °C on Luria Broth (LB)/Ampicillin agar plates. Individual colonies were resuspended in a 50 mL culture of LB/Ampicillin and grown overnight at 37 °C. Cells were collected by centrifugation (3,400 x g, 10 minutes), resuspended in LB, and added to six one-liter cultures of LB/Ampicillin in four-liter unbaffled flasks. Cultures were grown at 37 °C until an OD600 of ~0.5, at which point the temperature was decreased to 18 °C. Cultures were induced with 1 mM IPTG at an OD600 of ~1.0 and grown overnight at 18 °C. Cells were collected by centrifugation (4,700 x g, 40 minutes) and the pellet was resuspended in 25 mM Tris pH 8, 10% (v/v) glycerol, 500 mM NaCl, 10 mM imidazole, and 5 mM β-mercaptoethanol (BME). Suspensions were transferred to 50 mL conical tubes and stored at -80 °C for future use.
The amylose eluate was diluted threefold into 25 mM Tris pH 8, 0 mM NaCl, and 5 mM BME for a final concentration of 50 mM NaCl, and filtered through a 0.45 µm Whatman filter (GE Healthcare). Filtrate was loaded onto a Q sepharose ion-exchange column and eluted using a 50-500 mM NaCl gradient over ten column volumes, collecting 2.5 mL fractions. Fractions containing the desired protein were loaded onto an SD200 26/600 size exclusion column equilibrated with 10 mM MES pH 7, 5% glycerol, 300 mM KOAc, and 5 mM BME. Fractions containing the desired protein were concentrated by ultrafiltration (Amicon centricon) with 3k (Lsm1-7) and 30k (all other proteins) molecular weight cutoffs. Single-use aliquots were flash frozen in liquid nitrogen and stored at -80 °C for future use.
Cleaving the MBP-fusion tag before ion-exchange and size-exclusion columns resulted in a partial loss of protein for Dhh1 and a total loss of protein for Edc3 and Pat1. Thus, to simplify procedures and maintain consistency, the MBP-tag was retained on all proteins during purification, and condensate formation was initiated by TEV cleavage.
Protein-specific deviations from the general protocol are as follows. Only two liters of culture were used for Dcp1 and Dhh1. Dcp2 was expressed in Terrific Broth (TB) instead of LB due to low expression of the full-length protein. 10 mM ATP was included in the wash buffers for the Ni2+ column for Dhh1 in order to disrupt interactions with RNA. Without ATP, Dhh1 eluted as a series of peaks from the ion-exchange column, presumably due to an inhomogeneous distribution of bound RNA molecules. 1 M urea was included in the ion-exchange buffers for Pat1 to prevent aberrant oligomerization and/or precipitation. The size exclusion buffer for Pat1 included 1 M KOAc to similarly eliminate aberrant oligomerization and/or precipitation.
EGFP-fusion proteins for Dcp1, Dcp2, Dhh1, Edc3, and Pat1, and the mCherry-fusion protein for Pat1, were purified as described above. Lsm1-7 and Xrn1 were labeled with maleimide AlexaFluor488 (Thermo Fisher Scientific). Labeling was carried out overnight at a 5:1 AlexaFluor488:protein ratio, after the last ion-exchange column and before the size exclusion column. Free dye was removed using a 5 mL Desalting HP column (Cytiva) before the labeled protein was purified using size exclusion.
See Table S5 for protein sequences.

RNA reagents
RPL41A and RNA10 RNA (Fig. S11) were purchased from Integrated DNA Technologies and resuspended in 10 mM Tris-HCl, 1 mM EDTA, pH 7.0 buffer; single-use aliquots were flash frozen in liquid nitrogen and stored at -80 °C for future use. Total yeast RNA was purchased from Sigma-Aldrich and solubilized and stored as described above. MFA2 RNA was generated using in vitro transcription. Template DNA was generated by PCR from the pRP802 plasmid (Parker lab) using the following oligos: TAATACGACTCACTATAGCGAGC and TTTTTCATGAAAAAATCTGTTAAAGTGATAACTAC. Note that RPL41A and MFA2 RNAs are oligoadenylated, as such RNAs are bound by Lsm1-7 and Pat1 with tighter affinity than polyadenylated RNAs (6). MFA2 RNA was transcribed from this DNA template using the Invitrogen Ambion MEGAscript T7 Transcription Kit by following the standard protocol with 15% Cy5-UTP included. The reaction was allowed to proceed overnight, then treated with the Megaclear Transcription Clean-Up Kit (Thermo Scientific

Microscopy
Microscopy experiments were carried out in 384-well glass bottom microwell plates (Brooks Life Micrographs were analyzed using FIJI (20). For Z-stacks, the Z plane with the highest intensity was selected for analysis. A threshold was set at threefold above the background signal for each channel. Concentrations were determined from fluorescence intensities using standard curves for RFP (Pat1), Alexa Fluor 488 (Lsm1-7 and Xrn1), and GFP (Dcp1, Dcp2, Dhh1, Edc3, and Pat1) (Fig. S32-34). Small condensates that are close in size (xy area) to the point spread function (PSF) of a microscope exhibit diluted intensities (1). We empirically determined that the intensity of condensates smaller than 2.5 µm in diameter exhibits a linear dependence on condensate size, whereas the intensities of condensates larger than 2.5 µm in diameter were independent of condensate size (Fig. S35). Thus, condensates larger than 2.5 µm in diameter were selected for further analysis of condensate concentration. We used a fluorescence intensity cut-off of threefold above background for each fluorophore, and assigned the center of mass for condensates in each of the microscope channels. Condensates with centers of mass within 1 µm were considered to be overlapping, to account for slight discrepancies in the designation of the center of mass and for stage drift during imaging.
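The two selection rules above (a 2.5 µm diameter cutoff for concentration measurements and a 1 µm center-of-mass criterion for overlap) can be sketched as follows. This is an illustrative sketch only: the actual analysis was done in FIJI, and the data layout, function names, and example values here are hypothetical.

```python
import math

MIN_DIAMETER_UM = 2.5      # size cutoff stated in the text
MAX_CENTER_DIST_UM = 1.0   # overlap criterion stated in the text


def select_for_concentration(condensates):
    """Keep only condensates larger than the size cutoff."""
    return [c for c in condensates if c["diameter_um"] > MIN_DIAMETER_UM]


def centers_overlap(a, b):
    """True if two centers of mass lie within the overlap distance."""
    return math.dist(a["center_um"], b["center_um"]) <= MAX_CENTER_DIST_UM


def fraction_overlapping(channel_a, channel_b):
    """Fraction of channel_a condensates that have a partner in channel_b."""
    if not channel_a:
        return 0.0
    hits = sum(
        any(centers_overlap(a, b) for b in channel_b) for a in channel_a
    )
    return hits / len(channel_a)


# Hypothetical measurements (positions and diameters in µm).
pat1 = [
    {"center_um": (0.0, 0.0), "diameter_um": 3.0},
    {"center_um": (10.0, 10.0), "diameter_um": 1.2},
    {"center_um": (20.0, 5.0), "diameter_um": 4.1},
    {"center_um": (30.0, 30.0), "diameter_um": 2.8},
]
edc3 = [
    {"center_um": (0.5, 0.5), "diameter_um": 2.9},
    {"center_um": (25.0, 5.0), "diameter_um": 3.5},
    {"center_um": (30.0, 30.6), "diameter_um": 2.7},
]

print(len(select_for_concentration(pat1)))
print(fraction_overlapping(pat1, edc3))
```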
RNA is not degraded by Xrn1, or any potentially contaminating host proteins, during the course of microscopy experiments (Fig. S36). Xrn1 is not active in these experiments for two reasons. 1) The defined RNA species used in these experiments are not monophosphorylated; monophosphorylated RNA is the preferred substrate for Xrn1 (13). 2) Even in experiments when the preferred monophosphorylated RNAs are used (not in this article), Xrn1 is not active in the absence of Mg2+ (13).
Our experiments in this article lack Mg2+. In further experiments we have formed condensates with P-body proteins and monophosphorylated RNA. RNA is stable within these condensates for hours and is not degraded by Xrn1 until Mg2+ is added to the reaction. For the proteolysis experiments, 0.8 µg trypsin was added to each reaction. The condensed phase concentrations of Pat1-RFP and Dhh1-GFP were monitored before and after trypsin addition. Relative condensed phase protein concentrations based on pre-trypsin levels are displayed (Fig. S30).

Electrophoretic Mobility Shift Assays
Reactions included 10 nM Alexa Fluor647-labeled RPL41A RNA. P-body proteins were titrated over >100-fold range, with exact values depending on expression/purification yield and affinity for RNA (Fig. S10), in 300 mM KOAc, 10 mM MES pH 7 or pH 5.8, and 5 mM BME. Reactions were incubated on ice for 2 hr, then resolved on a 6% native PAGE gel. 0.5x MBE buffer (MES, pH 7 or 5.8; Borate; EDTA) and gels were equilibrated and run at 4 °C. Gels were imaged using Bio-Rad ChemiDoc MP. Bands were detected using Bio-Rad Image Lab software (v. 6.1) and curves were fit using the Specific Binding with Hill Slope equation on Graphpad Prism (v. 9.0.0 for Mac).
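For reference, the "Specific Binding with Hill Slope" model has the form Y = Bmax · X^h / (Kd^h + X^h). The sketch below evaluates this model and recovers its parameters with a coarse grid search. It is not the Prism fitting routine, which uses nonlinear regression; the grids and the synthetic titration are hypothetical and serve only to illustrate the shape of the fit.

```python
def hill(x, bmax, kd, h):
    """Specific binding with Hill slope: Bmax * x^h / (Kd^h + x^h)."""
    return bmax * x**h / (kd**h + x**h)


def fit_hill(xs, ys, kd_grid, h_grid):
    """Coarse grid search over (Kd, h).

    For each (Kd, h) pair, Bmax is solved in closed form by linear least
    squares, and the pair with the smallest sum of squared errors is kept.
    Returns (bmax, kd, h).
    """
    best = None
    for kd in kd_grid:
        for h in h_grid:
            shape = [hill(x, 1.0, kd, h) for x in xs]
            denom = sum(s * s for s in shape)
            if denom == 0:
                continue
            bmax = sum(y * s for y, s in zip(ys, shape)) / denom
            sse = sum((y - bmax * s) ** 2 for y, s in zip(ys, shape))
            if best is None or sse < best[0]:
                best = (sse, bmax, kd, h)
    if best is None:
        raise ValueError("no valid parameter combination in the grids")
    return best[1], best[2], best[3]


# Synthetic titration (protein in nM, fraction of RNA bound); hypothetical.
xs = [1, 3, 10, 30, 100, 300, 1000, 3000]
ys = [hill(x, 1.0, 100.0, 2.0) for x in xs]

kd_grid = [10, 20, 50, 100, 200, 500, 1000]
h_grid = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]
bmax, kd, h = fit_hill(xs, ys, kd_grid, h_grid)
print(bmax, kd, h)
```

With real gel data, a nonlinear least-squares routine would replace the grid search; the grid version is used here only to keep the example dependency-free.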

Native Gels
Reactions with Dcp1 were incubated at room temperature at the indicated pH (5.8 or 7) without TEV cleavage for 2 hours. 2x sample buffer (62.5 mM MES pH 7, 25% glycerol, 1% bromophenol blue) was then added and samples were loaded onto a native polyacrylamide (37.5:1 acrylamide:bis-acrylamide) gel with 4% stacking and 6% separating portions. Gels used 25 mM MES pH 7 as the buffer. Running buffer with 25 mM MES pH 7 and 192 mM glycine was used and gels were pre-run for 30 minutes before loading samples.
See also Fig. 1A, and S2-S3. (F) Structural model for the middle domain (MD) of Pat1 (Pat1 241-422) was generated using AlphaFold (16,17). Secondary structure and disorder predictions are in agreement that Pat1 MD contains structured elements (Figure S3). However, these α-helices likely lack a defined and stable stereospecific three-dimensional structure due to: the lack of extensive tertiary contacts within the MD domain or to the HEAT domain, the flexible linkers between the α-helices, and consequently the low confidence of this model for the regions outside of the α-helices.  (11), and Xrn1 C domain binds to Dcp1 (9); these structures have been omitted for brevity.

Figure S3. Intrinsically disordered regions within P-body proteins.
Disorder score was calculated for P-body proteins using Disprot VSL2 (15). Higher scores suggest disordered regions and lower scores suggest structured domains.
See also Fig. 1A and S2. (A) SDS-PAGE gel of purified P-body proteins used in this study. Proteins were purified using a four-step purification process: Ni2+ affinity, amylose affinity, ion exchange, and size exclusion columns. Dhh1, Edc3, and Pat1 required the MBP tag for purification to prevent loss of protein.
See methods for details. (B) Using the purified proteins, TEV cleavage of the MBP tag was used to initiate phase separation. TEV was added at a 50:1 P-body protein:TEV ratio as higher amounts of TEV See also Fig. 1B-1C.

Figure S5. Estimated cellular concentrations of P-body proteins.
The absolute numbers of proteins in a Saccharomyces cerevisiae cell were obtained from Ho et al. (33). Briefly, Ho et al. combined mass spectrometry and fluorescent microscopy studies with quantitative proteome-wide data. We converted the number of proteins per cell to concentration using the following assumptions: (i) cell volume of 62 µm³; (ii) P-body proteins were not restricted from any organelle, as the cytoplasm makes up the majority of the cell volume and most P-body proteins also reside in the nucleus, which is the largest compartment after the cytoplasm (34). Each dot is from a single quantitative report and the red line indicates the median. The median cellular concentration estimates for P-body proteins range from 90 to 370 nM.
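The conversion from copies per cell to concentration follows from dividing the number of moles by the cell volume. The sketch below assumes the 62 µm³ cell volume stated above and uses 1 µm³ = 1e-15 L; the function name and the example copy number are hypothetical.

```python
AVOGADRO = 6.02214076e23   # molecules per mole
CELL_VOLUME_UM3 = 62.0     # cell volume assumed in the text
LITERS_PER_UM3 = 1e-15     # 1 µm³ = 1 fL = 1e-15 L


def copies_to_nanomolar(copies_per_cell, volume_um3=CELL_VOLUME_UM3):
    """Convert protein copies per cell to a whole-cell concentration in nM."""
    moles = copies_per_cell / AVOGADRO
    liters = volume_um3 * LITERS_PER_UM3
    return moles / liters * 1e9


# Hypothetical copy number, for illustration only.
print(round(copies_to_nanomolar(10_000), 1))
```

Under these assumptions, each nanomolar of concentration corresponds to roughly 37 copies per cell.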
See also Fig. 1D.

There is a reasonable correlation between histidine density and pH-sensitive condensate formation (R² = 0.49) that is enhanced if Dcp1 is not considered (Figure 2I: R² = 0.75). In the outlier, Dcp1, a single dominant histidine residue (H206) controls pH-sensitive condensate formation. In contrast, the protein isoelectric point does not correlate with pH-sensitive condensate formation (R² = 0.02). These data suggest that multiple pH-sensitive histidine residues contribute to homotypic condensate formation and are distributed throughout Dcp2 and the other P-body proteins.

See also Table S2; Fig. 3A-C and S11.

Figure S11. Types of RNA used in this study.
Portions of RPL41A covering the 5′ UTR and translational start site and the 3′ UTR and start of the polyA tail were combined in this 60 nucleotide RNA. RNA10 is a small single-stranded RNA previously used in RNA helicase studies (35). Total RNA contains more diverse RNA types and features (length, secondary structure, etc.). Yeast Mating Factor A (MFA2) is a full-length mRNA (348 nucleotides) that localizes to cellular P bodies (36). Predicted secondary structures for RPL41A and MFA2 from the RNAstructure website are shown (37).
See also Fig. 3A-C, 5B, S10, S19, S22, and S23.   (A) Representative micrographs for titration of Edc3 and truncations. Total concentration of protein and contrast are labeled above and below the micrographs, respectively. All are EGFP fusion proteins. (B) Cartoon schematic representing: Edc3 has a lower saturation concentration (Csat) than any of the truncations, suggesting that the disordered middle region and the structured YjeF-N domain synergize to drive Csat lower in full-length Edc3. The condensate protein concentration for Edc3 283-551 is higher, albeit at higher total protein concentrations.
See also Fig. 4C and 4M.  See also Fig. 4E and 4O.

Figure S17. Comparison between P-body protein regions forming homotypic condensates in vitro and contributing to P-body formation in vivo.
The ability of P-body protein regions to exhibit homotypic LLPS in vitro ( Fig. 4 and S12-16) and contribute to P-body formation in vivo (1, 22,23,30,38) are scored in a qualitative fashion. Scoring for homotypic LLPS refers to experimental conditions and data from Fig. 4.
There is reasonable agreement between P-body protein regions that exhibit LLPS in vitro (Fig. 4) and those that contribute to P-body formation in vivo (Fig. S17). Our in vitro results are consistent with a synergy between multiple regions within Dcp2, and within Dhh1, contributing to P-body formation in cells (1, 38). Furthermore, our results are consistent with the importance of Edc3 283-551 and Pat1 423-796 contributing to P-body formation (23,30). However, there are discrepancies for some P-body protein regions (Fig. S17). Edc3 1-66 does not form homotypic condensates, yet is important for P-body formation in vivo (23). This discrepancy is likely due to heterotypic interactions between Edc3 1-66 and helical leucine motifs in Dcp2 that contribute to P-body formation (1, 27). Dcp1 forms homotypic condensates in vitro, but represses P-body formation in cells by activating the RNA-decapping of Dcp2 (22), an activity not present in homotypic Dcp1 condensates. Edc3 67-282 and Pat1 241-422 form homotypic condensates in vitro, but are not important for P-body formation in vivo (23,30). We do not currently understand these latter discrepancies; however, there are some inherent limitations to our in vitro/in vivo comparisons for P-body protein regions (Fig. S17). Since the threshold concentration for Edc3 67-282 is at least tenfold higher than that of Edc3 in vitro (Fig. S14), one possibility is that Edc3 67-282 may contribute to P-body formation, but only at higher expression levels than were tested in vivo (23). Furthermore, our in vitro assays are quantitative, whereas some of the previous cellular data are more qualitative, scoring P-body formation based on one or two molecular markers (23,30). Thus, protein regions with weaker contributions to phase separation, such as Edc3 67-282, may be missed by qualitative examinations in cells.
While investigating the LLPS of isolated protein regions informs on the importance of homotypic interactions, this approach does not always correlate with condensate formation in cells. For example, regions whose contributions to P-body formation are mediated via heterotypic interactions with other components will be missed by this approach (see Fig. 5 and Discussion).
See also Fig. 4, S12-16, and Discussion.

Figure S18. Line profiles for overlap between Pat1 and Dcp2 or Edc3.
(A) Representative micrographs for overlap between Edc3-EGFP or Dcp2-EGFP with Pat1-mCh when Edc3 or Dcp2 are added at the time of condensate initiation (0 hr), and two or eight hours after condensate formation is initiated. (B) Reactions were set up with all P-body proteins and RNA except Edc3. Edc3 was added back in as condensate formation was initiated (0 hr). Line profiles for Pat1 (blue) and Edc3 (yellow). Correlation coefficient between the two line profiles is displayed. (C) As in (A) but Edc3 added 2 hours after condensate formation was initiated. (D) As in (A) but Edc3 added 8 hours after condensate formation was initiated. (E) As in (B) except Dcp2 was set aside instead of Edc3. Dcp2 was added back as condensate formation was initiated (0 hr). (F) As in (D) but Dcp2 was added 2 hours after condensate formation was initiated. (G) As in (D) but Dcp2 was added 8 hours after condensate formation was initiated.
(H) Quantification of the Pearson correlation coefficients of Edc3 and Dcp2 with Pat1 when added at condensate initiation (black) or after condensates have been formed for 2 hours (gray). Thirty condensates were measured from two replicate experiments.
During the course of our experiments, we observed that sufficient preincubation (1.5 hours) of all molecules together was required before initiating the reactions with TEV cleavage in order for different proteins to colocalize with one another. This suggests that a balance between heterotypic and homotypic interactions leads to P-body formation (and also that the in vitro system is slow to equilibrate). To further investigate this idea we intentionally investigated time as a variable in this set of experiments. We incubated different subsets of the P-body reconstitution separately for different amounts of time after TEV cleavage was initiated, before mixing the solutions together. We found that Dcp2 is readily recruited into the condensates containing the rest of the P-body proteins and RNA, at each timepoint tested. In contrast, Edc3 forms condensates that either coat (2 hr) or dock on the condensates with the rest of the P-body proteins and RNA. Images were taken 24 hr after TEV cleavage was initiated and therefore 16-24 hr after the solutions were mixed together. Thus, these images likely represent the systems at equilibrium. One interpretation of these data is that homotypic condensates mature over time and fail to coalesce with heterotypic condensates. Indeed, alternative lines of investigation ( Fig. 5 and S28-S31) also suggest that homotypic condensates mature more rapidly than heterotypic condensates. However, we note that additional plausible interpretations for this data exist such as potential differences in surface tension between homotypic and heterotypic condensates (39).
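The Pearson correlation coefficient reported for pairs of line profiles in Figure S18 can be computed as in the following minimal sketch. The intensity values are hypothetical, and the profiles are assumed to be sampled at the same positions along the line.

```python
import math


def pearson(profile_a, profile_b):
    """Pearson correlation coefficient between two equal-length profiles."""
    if len(profile_a) != len(profile_b) or len(profile_a) < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    n = len(profile_a)
    mean_a = sum(profile_a) / n
    mean_b = sum(profile_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(profile_a, profile_b))
    var_a = sum((a - mean_a) ** 2 for a in profile_a)
    var_b = sum((b - mean_b) ** 2 for b in profile_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / math.sqrt(var_a * var_b)


# Hypothetical intensity profiles across one condensate (arbitrary units).
pat1_profile = [10, 12, 40, 90, 95, 88, 35, 11, 10]
colocalized = [8, 11, 35, 80, 92, 85, 30, 12, 9]      # fills the condensate
coating = [9, 60, 85, 30, 12, 28, 80, 55, 10]         # enriched at the rim

print(pearson(pat1_profile, colocalized))
print(pearson(pat1_profile, coating))
```

A protein that fills the same condensate as Pat1 gives a coefficient near 1, whereas a protein that coats the surface gives a much lower value.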

Figure S20. Specific interactions drive partitioning into reconstituted P-bodies.
(A) Representative micrograph of reconstituted P-bodies using Pat1-mCh as a marker, and with free EGFP (not tagged to any P-body protein), SIM-3R-EGFP, EML4-ALK-mEmerald, and Not1 754-1000 -Alexa Fluor 647 added. All P-body proteins and RNA were included in the reaction mixture, and P-bodies were formed at pH 5.8 and 300 mM KOAc. Micrographs in the same row represent different channels of the same experiment. (B) Quantification of overlap in condensates between the Pat1-mCh and other channels. (C) Representative micrographs of homotypic condensates formed by EGFP, SIM-3R, EML4-ALK, and Not1 754-1000. Experimental conditions as described in (A), except only the individual proteins were included in these reactions (no P-body proteins and RNA). (D) Total condensate area per slide for control proteins. Pat1 data included for reference. EGFP and SIM-3R do not partition into reconstituted P bodies, nor do they form homotypic condensates on their own. EML4-ALK forms a few small homotypic condensates, and these condensates rarely overlap with reconstituted P bodies. Not1, a protein recruited to P bodies in cells, does not form homotypic condensates and is recruited into reconstituted P bodies, likely through interaction with Dhh1 (21). Collectively, these results suggest that in vitro P bodies are formed by specific interactions involving P-body protein and RNA molecules.

Figure S21. Correlation between P-body protein partition coefficients in vitro and in vivo.
We compared the partition coefficients of P-body proteins in S. cerevisiae (1) to in vitro values observed under different experimental conditions. (A) Cellular protein concentrations (range 90-370 nM, see Figure S5 and Methods), pH 5.8, and 30 °C incubation temperature. Black diagonal line is where in vivo and in vitro values are equivalent; gray dotted lines indicate twofold differences. This is the same data as in Fig. 5C, included here for reference to other experimental conditions. (C) Overlap between Pat1 (black) or RNA (gray) and P-body proteins. Dcp1, Dcp2, Dhh1, and Edc3 are EGFP-fusion proteins and Lsm1-7 and Xrn1 are conjugated to AlexaFluor488. Relative to acidic pH (5.8) conditions (Fig. 5A-B and S19), there is less overlap between RNA and all P-body proteins, and between Dcp1 or Xrn1 and Pat1. These results suggest that acidic pH is the more relevant condition for P-body formation, as all components overlap in that condition.
See also Fig. 5A-B and S20. Four different types of RNA were tested in our reconstitution: RNA10, a 10-nucleotide single-stranded RNA; RPL41A, a 60-nucleotide RNA consisting of portions of the coding and untranslated regions of ribosomal protein of the large subunit 41A; MFA2, a 348-nucleotide full-length mRNA of yeast Mating Factor A that is known to localize to P bodies; and total RNA from yeast. For all RNA sources we found that RNA had little impact on condensate formation and partitioning of proteins into condensates. Furthermore, we observed that all RNAs were more highly enriched in condensates under acidic pH conditions, consistent with enhanced binding to P-body proteins under these conditions (Fig. 3). Lastly, the partitioning of RNA species correlates with their length. These data indicate that RNA is not required for in vitro P-body formation and does not impact protein partitioning, but can increase condensate size. One caveat here is that longer and more structured mRNAs, which are more highly enriched in cellular P bodies (40)(41)(42), may have more dramatic influences on protein partitioning and condensate size.
See also Fig. S11, S19, S22, and Methods.    The individual proteins Pat1, Edc3, and Dhh1 had the highest number of condensates and total condensate area, supporting their importance in P-body formation. The number and area of Pat1 alone is comparable to Pat1 in heterotypic condensates, suggesting the importance of Pat1 for P-body formation.
See also Fig. 5G. All three proteins are slightly more dynamic in heterotypic condensates as compared to homotypic condensates.
See also Fig. 5H.

Condensates with RPL41A RNA are more rapidly and completely proteolyzed by trypsin, suggesting that RNA promotes a more mobile and reversible condensate material state. Arginine more readily dissolves condensates with RPL41A RNA compared to those without RNA. The buffer control demonstrates that condensates with all proteins or with Pat1 have a similar number of condensates and similar condensate area with or without RNA.

Figure S32. EGFP standard curves for quantitative microscopy.
(A) Standard curves were measured using six different EM gain settings to obtain a linear relationship between protein concentration and pixel intensity over a wide range of protein concentrations. Note the change in x-axis scale between different EM gain settings. Standard curves were done in pH 7 (left) and pH 5.8 (right). MBP-Edc3 1-66 -EGFP was used to generate standard curves because it did not form condensates at any of these concentrations. (B) In our buffer conditions we did not observe a significant difference in EGFP fluorescence at pH 7 and pH 5.8 [compare left and right columns in (A)], in contrast with a previous finding (43). Using previously reported buffer conditions (Haupts et al.: 10 mM citrate/citric acid buffer and 100 mM KOAc), we observe higher fluorescence intensity at pH 7, consistent with a previous report (43). However, under our buffer conditions fluorescence intensity is similar at pH 7 and pH 5.8. We hypothesize this difference is due to higher salt in our buffer (300 mM vs 125 mM KOAc), but did not further pursue identifying key differences between buffers.

Figure S33. RFP standard curves for quantitative microscopy.
Standard curves were measured using six different EM gain settings to obtain a linear relationship between protein concentration and pixel intensity over a wide range of protein concentrations. Note the change in x-axis scale between different EM gain settings. Standard curves were done in pH 7 and pH 5.8. A single SUMO interacting motif (SIM) fused to RFP was used to generate standard curves because it did not form condensates at any of these concentrations.

Figure S34. AlexaFluor488 (AF488) standard curves for quantitative microscopy.
Standard curves were measured using six different EM gain settings to obtain a linear relationship between protein concentration and pixel intensity over a wide range of protein concentrations. Note the change in x-axis scale between different EM gain settings. Standard curves were done in pH 7 and pH 5.8. Free AF488 (not bound to any protein) was used to generate standard curves.

For diameters greater than 16 pixels the fluorescence intensity is unrelated to the size of the condensate. 16 pixels corresponds to approximately 2.5 µm on our microscope/detector. Therefore, condensates with diameters greater than 2.5 µm were selected for concentration measurements to avoid dilution of intensities in condensates close in size to the PSF.
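The standard curves in Figures S32-S34 describe a linear relationship between fluorophore concentration and pixel intensity at a given EM gain setting. A minimal sketch of how such a curve could be fit and then inverted to estimate a condensate concentration is shown below. The calibration points and function names are hypothetical, and a separate curve would be needed for each fluorophore and gain setting.

```python
def fit_line(concentrations, intensities):
    """Ordinary least-squares fit of intensity = slope * conc + intercept."""
    n = len(concentrations)
    if n != len(intensities) or n < 2:
        raise ValueError("need at least two paired calibration points")
    mean_x = sum(concentrations) / n
    mean_y = sum(intensities) / n
    sxx = sum((x - mean_x) ** 2 for x in concentrations)
    if sxx == 0:
        raise ValueError("calibration concentrations must not all be equal")
    sxy = sum(
        (x - mean_x) * (y - mean_y)
        for x, y in zip(concentrations, intensities)
    )
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def intensity_to_concentration(intensity, slope, intercept):
    """Invert the standard curve to estimate concentration from intensity."""
    return (intensity - intercept) / slope


# Hypothetical calibration at one EM gain setting (µM, arbitrary units).
conc_um = [0.0, 1.0, 2.0, 4.0]
pixel_intensity = [100.0, 300.0, 500.0, 900.0]

slope, intercept = fit_line(conc_um, pixel_intensity)
print(slope, intercept)
print(intensity_to_concentration(700.0, slope, intercept))
```

The inversion is only meaningful within the linear range covered by the calibration points, which is why several gain settings were used to span a wide concentration range.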