Substrate-specific effects of natural genetic variation on proteasome activity

Protein degradation is an essential biological process that regulates protein abundance and removes misfolded and damaged proteins from cells. In eukaryotes, most protein degradation occurs through the stepwise actions of two functionally distinct entities, the ubiquitin system and the proteasome. Ubiquitin system enzymes attach ubiquitin to cellular proteins, targeting them for degradation. The proteasome then selectively binds and degrades ubiquitinated substrate proteins. Genetic variation in ubiquitin system genes creates heritable differences in the degradation of their substrates. However, the challenges of measuring the degradative activity of the proteasome independently of the ubiquitin system in large samples have limited our understanding of genetic influences on the proteasome. Here, using the yeast Saccharomyces cerevisiae, we built and characterized reporters that provide high-throughput, ubiquitin system-independent measurements of proteasome activity. Using single-cell measurements of proteasome activity from millions of genetically diverse yeast cells, we mapped 15 loci across the genome that influence proteasomal protein degradation. Twelve of these 15 loci exerted specific effects on the degradation of two distinct proteasome substrates, revealing a high degree of substrate-specificity in the genetics of proteasome activity. Using CRISPR-Cas9-based allelic engineering, we resolved a locus to a causal variant in the promoter of RPT6, a gene that encodes a subunit of the proteasome’s 19S regulatory particle. The variant increases RPT6 expression, which we show results in increased proteasome activity. Our results reveal the complex genetic architecture of proteasome activity and suggest that genetic influences on the proteasome may be an important source of variation in the many cellular and organismal traits shaped by protein degradation.

tion that much of the genetic basis of variation in proteasome activity will involve indirect effects.
This notion is consistent with recent theoretical 2 and empirical 3,4 observations suggesting that most trait heritability is driven by weak trans-acting variation in non-core (i.e., non-proteasome for proteasome activity) genes. Indeed, we argue that this is a novel and important result in the manuscript: We show that most heritable variation in proteasome activity likely results from individual genetic differences in non-proteasome genes. Consequently, understanding how genetic effects on the proteasome contribute to variation in traits such as health, aging, and disease will require understanding how these indirect genetic effects influence the proteasome. We believe our manuscript is an important first step in this regard.
[Figure caption, partial: BY / RM missense variants per gene (bottom), separated by category.]

Reviewer 1 Minor Comments
Some additional comments: Line 41 "Ubiquitin system enzymes bind degradation-promoting signal sequences" -generally, only the E3 enzymes do.
To better convey the functions of ubiquitin system enzymes, the phrase mentioned by the reviewer is replaced with the following in the revised manuscript (page 4, para. 1, line 43): "Ubiquitin system enzymes target proteins for degradation by binding degradation-promoting signal sequences (termed "degrons") and covalently attaching chains of the small protein ubiquitin."
Line 57: "Until recently, it was largely unknown how individual genetic differences affect UPS protein degradation" There are lots of genetic data on genetic differences that affect the UPS. I think what is meant here is the question of how natural or semi-natural genetic variation affects the UPS. This should be clarified.
The reviewer's interpretation is correct and the revised manuscript (page 4, para. 2, line 59) includes the following sentence that more clearly articulates the indicated sentence's meaning: "Until recently, it was largely unknown how natural genetic variation affects UPS protein degradation."
Line 88: "proteasome can exist in multiple configurations" What exactly is meant by this phrase?
The revised manuscript (page 5, para. 3, line 92) includes the following paragraph that clarifies the meaning of the phrase mentioned by the reviewer by providing an expanded review of the literature: "Proteasomes can also be assembled in multiple configurations that impart distinct affinities for different classes of substrates. In yeast, the catalytically active 20S core particle may be uncapped or singly or doubly capped with the 19S regulatory particle or the proteasome activator Blm10 5 .
Reflecting this, many QTLs for Ac/N-degrons affect all or a majority of the full set of Ac/N-degrons. By contrast, Arg/N-degrons are created and recognized via molecular mechanisms that affect individual or small subsets of Arg/N-degrons." I don't follow this reasoning: each class requires a common set of factors (E1, E2s, E3s) and has a small number of processing enzymes specific to certain degrons: for Ac/N-degrons, some will require MAPs (not sure how the N-end substrates were made for this study though) and different ones will require different Ac transferases, while for Arg/N-degrons, a very small number will require a deamidase and a small number will require Arg transferase.
We have substantially revised the text mentioned by the reviewer to better explain why we expected the sets of QTLs affecting Arg/N-degrons to be more substrate-specific than those affecting Ac/N-degrons (page 16, para. 2, line 237): "The N-end Rule is divided into two primary branches based on how N-degrons are generated and recognized 19-22 . Based on the molecular mechanisms of Arg/N-degron processing and recognition, affects Ac/N-degrons similarly 1 ."
Line 264: "multiple factors specifically regulate the degradation of ubiquitin-independent proteasomal substrates, without affecting the degradation of ubiquitinated substrates (ref. 80)." Please give an example or two.
In the sentence mentioned by the reviewer, we were specifically referring to factors, such as Elp2 or Ncs6, that have been shown to regulate the ubiquitin-independent turnover of proteasomal substrates but do not affect the turnover of ubiquitinated substrates 26,27 . However, we do not know if these or similar genes are causal for our proteasome activity QTLs and have therefore removed the indicated sentence from the revised manuscript. We thank the reviewer for alerting us to this error, which we have corrected in the revised manuscript.
We have also added additional examples that show that increasing proteasome subunit expression increases proteasome activity, including Padmanabhan et al. 2016 (page 29, para. 2, line 523).
Line 427: "Thus, PAAF1 association with Rpt6 creates a stable Rpt6 pool that can be used to rapidly drive proteasome assembly." This doesn't really address the question of how increasing levels of one subunit of the ~30+ subunit proteasome complex drives assembly of the whole particle.
There are some examples of how this can work, but none involve Rpt6 to my knowledge.
The revised manuscript (page 29, para. 3, line 539) contains an expanded discussion of how increasing RPT6 expression could increase proteasome activity: While increasing proteasome subunit expression is thus an established means of increasing proteasome activity, the mechanism(s) of this effect are not well-understood. They may involve coordinated increases in the expression of additional proteasome genes or enhanced proteasome assembly. In the case of RPT6, one possibility is that increasing expression levels increases the number of 19S regulatory particles and, in turn, the fraction of 26S proteasomes. The proteasome pool comprises uncapped 20S core particle and 20S singly or doubly capped with 19S regulatory particles ("26S proteasomes") or other proteasome activators such as Blm10. Estimates of the fraction of uncapped 20S proteasomes in the proteasome pool vary across species and cell types, but are generally no less than 30% 11,15,28,29 , suggesting that a large fraction of the proteasome pool could be converted to 26S proteasomes. Moreover, the 26S fraction is dynamic and responsive to changes in 19S subunit expression. For example, in human cells, decreasing the expression of either of the 19S subunits Rpt6 and Rpn2 reduces the fraction of 26S proteasomes 30 . Current models of proteasome assembly posit that the 20S core particle can serve as a template for assembling the 19S regulatory particle [31][32][33] . Rpt6 plays a critical role in this process -insertion of its C-terminal tail into the α2-α3 pocket is the first step in assembling the 19S regulatory particle's base that sits atop the 20S core particle [32][33][34] . After insertion, Rpt6 functions as an anchor to which other RPT heterodimers are added [32][33][34] . These findings suggest that increasing RPT6 expression could increase the 26S proteasome fraction by promoting the formation of an assembly intermediate that acts as a scaffold for further 19S assembly onto the 20S core.
Experimentally, it should be straightforward to examine the RPT6 hit more carefully using the CRISPR'ed allele created. First, does the SNP really affect only RPT6 transcription and not the nearby ALG13 gene, to which it is actually closer? The latter gene is involved in an essential biological process, protein glycosylation, and indirect effects from changes of its expression could certainly impact the protein homeostasis network (for example, endoplasmic reticulum-associated protein degradation).
In the revised manuscript, we present data showing that the causal RPT6 -175 variant increases Rpt6 abundance but does not affect Alg13 (page 20, para. 4, line 348 and Figure 6C / D).
Second, the authors make the interesting speculation that because the RM strain is from a "European wine" lineage, enhanced proteasome activity might be selected for its possibly enhanced ethanol tolerance. This could be tested experimentally with the CRISPR'd and matched BY strain too.
We have removed the text suggesting that RPT6 -175 may have arisen as a result of adaptation to the wine-making environment from the revised manuscript. We retain figure 8A to provide readers with a view of how the causal RPT6 -175 RM allele alters the molecular properties of the RPT6 promoter.
Third, the SNP is proposed to allow Yap1 binding in the RM but not BY strain. The dependence of increased RPT6 mRNA on Yap1 in the CRISPR'd strain can be measured.
We have removed text from the revised manuscript suggesting that the increase in Rpt6 results from Yap1 binding to the RPT6 promoter.
Finally, if the SNP increases Rpt6 levels, the prediction of the authors is that there will be increased proteasomes as a result. This is also testable (although it might be beyond the expertise of a genetics-intensive lab, so I would not demand this).
We did not perform this experiment because it is outside of the laboratory's expertise. We thank the reviewer for not requiring this experiment as part of the revision.
In summary, it would be important, in my opinion, to show that at least one QTL (RPT6-175) directly affects proteasome activity or regulation in a way that can be understood at least roughly in mechanistic terms.
In the revised manuscript, we demonstrate that the RPT6 -175 variant increases Rpt6 levels without affecting Alg13 levels. We further show that overexpressing RPT6 is sufficient to increase the degradation of multiple UPS substrates that we used to detect the QTL containing RPT6 -175. Collectively, these results provide strong support for our interpretation that RPT6 -175 increases proteasome activity by increasing Rpt6 levels.

Reviewer 2
In this manuscript Collins and coworkers examine the influence of natural genetic variation in yeast on the stability of two degrons (as a proxy for proteasome activity). To do so they leverage fluorescence protein timers, where steady-state ratiometric measurements of fluorescence are related to the stability of each reporter. Using flow cytometry they perform bulk segregant analysis and map over a dozen loci that are linked to differences in reporter stability. Many such linkages appear to selectively influence one reporter but not the other. Finally, using CRISPR-Cas9 approaches, the authors demonstrate that one of the alleles they identified can recapitulate the effects they identified statistically. Overall this is a fine manuscript and the data seem sound. Additional experiments, analyses, and qualifications are needed (described below) to support or clarify specific claims.
Likewise, there are multiple opportunities to improve impact and generalizability.
We thank the reviewer for their positive feedback and thoughtful suggestions to improve the manuscript.
As described below, we have used these comments to conduct additional analyses and incorporate additional text that will increase the impact and generalizability of the revised manuscript.
38 . Using this approach, we observed that the TDH3 promoter produced a nearly 4-fold increase in the Rpn4 TFT's abundance as compared to the Rpn4 TFT expressed from the ACT1 promoter (Reviewer Response Figure 2A).
However, we found no difference in proteasome activity between BY strains harboring the TFT driven by the ACT1 versus TDH3 promoters (Reviewer Response Figure 2B), suggesting that extra copies of the Rpn4 degron do not affect proteasome activity. We acknowledge that we cannot rule out that expressing the Rpn4 TFT via the ACT1 promoter alters UPS activity relative to strains without any TFT and that no further alteration then occurs in response to the greatly increased expression conferred by the TDH3 promoter, but do not think that this is likely.
Likewise, controls for growth phase, cell shape, size, adhesion, and other properties that might influence turnover are important. It may not be possible to fully account for all confounding variables, but they should at least be commented upon.
Our protocols for fluorescence-activated cell sorting and flow cytometry gate cells on the basis of forward and side scatter to restrict cell populations to haploid cells of the same approximate size, shape, and, by extension, cell cycle phase. This gating approach also eliminates any aggregates of adherent cells from analysis. This information is included in the Methods section (page 36, para. 2, line 715).
2. Comparison to prior literature for these degrons. The FP timers enable single cell resolution and impressive scale. At the same time, they report on steady-state fluorescent ratios rather than actual kinetic proteasome function. How do the differences between the ODC and RPN4 degrons measured here compare to conventional measurements of their proteasomal turnover?
Previous studies determined that the half-life of the mouse ODC degron in yeast is approximately 6 minutes 39 . The half-life of the N-terminal ubiquitin-independent Rpn4 degron has not, to our knowledge, been precisely determined, but appears to be between 10 and 20 minutes 40,41 . We observed that the steady-state ODC TFT ratio was approximately half that of the Rpn4 TFT, suggesting each reporter's output provides an accurate estimate of its degradation kinetics. Because we are unable to obtain a precise estimate of the half-life of the Rpn4 ubiquitin-independent degron, we have not included this information in the revised manuscript. We 1 and others 27,42 have also shown that the TFT ratio accurately reports a protein's degradation kinetics by comparing the TFT ratio to the well-characterized half-lives of the 20 possible N-degrons of the UPS N-end rule 1,27 . Previous work also demonstrated that the TFT ratio provides greater sensitivity and dynamic range compared to conventional measures of protein degradation by either cycloheximide chase or pulse-chase labeling 27 .
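As an illustration of why a steady-state TFT ratio can track degradation kinetics, the following is a minimal sketch under simplifying assumptions: constant synthesis, first-order degradation, and single-step first-order maturation of each fluorophore. It is not the analysis used in the manuscript. The function name and the maturation half-times (6 minutes for the green fluorophore, 40 minutes for the red) are hypothetical placeholders chosen only to make the example concrete.

```python
import math


def tft_ratio(half_life_min, gfp_maturation_min=6.0, rfp_maturation_min=40.0):
    """Steady-state mature RFP / mature GFP ratio for a tandem timer.

    Assumes (illustrative only) first-order degradation and one-step
    first-order maturation. Under those assumptions the mature fraction
    of a fluorophore with maturation rate m and degradation rate k is
    m / (m + k), so the ratio of the two mature fractions is returned.
    """
    k = math.log(2) / half_life_min          # degradation rate constant
    m_g = math.log(2) / gfp_maturation_min   # assumed GFP maturation rate
    m_r = math.log(2) / rfp_maturation_min   # assumed RFP maturation rate
    mature_rfp = m_r / (m_r + k)
    mature_gfp = m_g / (m_g + k)
    return mature_rfp / mature_gfp


if __name__ == "__main__":
    # Half-lives taken from the ranges quoted in the response above.
    for label, half_life in [("ODC degron", 6.0), ("Rpn4 degron", 15.0)]:
        ratio = tft_ratio(half_life)
        print(f"{label}: RFP/GFP = {ratio:.3f}, -log2 = {-math.log2(ratio):.3f}")
```

In this simplified model, a shorter half-life lowers the RFP / GFP ratio (and raises its negative log2), while a very stable protein approaches a ratio of 1 because both fluorophores have time to mature.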
Likewise, do the ratios obtained with this assay behave as one might expect when titrating proteasome inhibitors etc.?
We have used gene deletions to characterize how the TFT ratio changes in response to perturbations that alter ubiquitin proteasome system activity here and elsewhere 1 . We observe that the TFT ratio reports the expected stabilization of both N-degrons and ubiquitin-independent degrons in response to deleting RPN4 and the expected branch-specific stabilization of N-degrons in response to deleting UBR1 or DOA10. Others have shown that the TFT ratio increases as expected in response to bortezomib treatment 43 .
Finally, it is critical that the manuscript include some discussion of the quantitative ranges of difference in degradation rates that can be observed, and an estimation of those that cannot, based on the maturation kinetics of the GFP and RFP variants used.
The revised Results section contains information on the quantitative ranges of difference in degradation rates that can be observed with the superfolder GFP / mCherry TFT: "Our TFTs contained the faster-maturing green fluorescent protein (GFP 44 ) superfolder GFP (sfGFP) and the slower-maturing red fluorescent protein (RFP 45 ) mCherry (Figure 2A). The two fluorophores in the TFT mature at different rates and, as a result, the RFP / GFP ratio changes over time. If the TFT's degradation rate is faster than the RFP's maturation rate, the TFT's output, expressed as the −log2 RFP / GFP ratio, is directly proportional to its degradation rate (Figure 2B). The superfolder GFP / mCherry tandem fluorescent timer can measure the degradation of substrates with half-lives ranging from several minutes to several hours 27 , making it an ideal reporter system for studying short-lived proteasomal substrates."
The revised Discussion section elaborates on which proteins can be studied using the superfolder GFP / mCherry TFT: "Based on these results, we anticipate that variant effects on the degradation of individual proteins will also be highly substrate-specific. Understanding how natural genetic variation affects the proteome through effects on the degradation of individual proteins will thus require reporters that can sensitively measure the degradation of proteins with half-lives ranging from several minutes to several hours 46,47 . The mCherry / sfGFP TFT is well-suited to this purpose. Previous studies have shown that this TFT should be suitable for measuring the degradation of approximately 80% of yeast proteins based on their half-lives 27,27 , assuming the protein tolerates the TFT tag. Recently, a genome-wide TFT tagging approach successfully used the mCherry / sfGFP timer to measure the turnover of approximately 70% (around 4,000 proteins) of the yeast proteome 48 , suggesting that degradation QTLs for most proteins could be mapped using this reporter. TFTs with red fluorescent proteins that mature over longer time scales, such as mRuby or dsRed, can be used to measure the degradation of longer-lived proteins 27 ."
3. Comparison to known genetic architecture of the UPS and other core regulatory processes.
Others have examined UPS function with deletion and overexpression experiments, among other approaches. How do the results here compare in terms of number of genes identified, overlap, etc.?
It is difficult to directly compare our results with those obtained using genome-wide deletion or overexpression-based approaches. A critical difference between screening approaches and the genetic mapping method we employed is that genetic mapping detects the effects of natural genetic variation. As we show in Reviewer Response Figure 1, there is a limited source of natural variation in proteasome genes that could directly affect proteasome activity. Instead, we find many QTL regions contain genes with no known links to the proteasome. We believe this is an important insight from our work, since it suggests that genetic effects on the proteasome that contribute to variation in traits such as health, longevity, and disease will arise through small, indirect effects, as we elaborate on in the Discussion section (page 26, para. 2, line 433).
Do the authors find essential genes that were not previously seen?
Answering this question comprehensively would require experimental dissection of causal genes, as we have performed for the (known) essential proteasome component Rpt6. We note that several QTLs did not contain any genes with known roles in proteasome function or assembly, suggesting unknown components or, perhaps more likely, indirect effects, as described in the revised Results section (page 19, para. 2, line 299).
Did prior studies also find substrate specific genetic contributions?
Two previous proteome-wide efforts found multiple substrate-specific changes in the proteome following deletion or perturbation of UPS genes 38,49 . These works appear as references 13 and 14 in the revised manuscript. Collectively, these results suggest that genetic effects on the proteasome and ubiquitin system are similarly numerous.
4. Completeness of the genetic dissection and origins of degron specific differences. How much of the difference between BY and RM can be explained by the loci that the authors identified?
Also, how much of the specificity in mapping for each degron is a simple property of their different stabilities to begin with? From the investigator's prior work with N-end rule reporters and GFP fusions it might be possible (and interesting) to speculate on how, quantitatively, this type of regulation may integrate with other types of trans regulatory variation.
We cannot readily calculate the amount of variance explained by our proteasome activity QTLs, as explained in the revised Discussion (page 28, para. 3, line 506). It is also difficult to precisely determine how reporter stability influences QTL substrate specificity. Less stable reporters allow greater selection strength between the high and low proteasome activity pools, which may influence QTL detection power 53 . However, as discussed in the revised manuscript (page 17, para. 2, line 261), we observe multiple substrate-specific QTLs for more stable reporters here (e.g., VIIc and XV) and in our prior study 1 , suggesting that QTL substrate-specificity does not simply reflect differences in reporter dynamic range.
5. Tie up the mechanism. I commend the allele reconstruction experiment, but it also left me wondering about the proposed mechanism, which is presently based almost entirely on speculation.
First, although altered regulation of Rpt6 is obviously the easier explanation, the authors should do an experiment to exclude Alg13. Second, seeing allele-specific rescue with Yap1 overexpression or increased Rpt6 overexpression would significantly strengthen their case. Of course there can be challenges with such experiments, but they are simple and well worth trying.
In the revised manuscript, we present results showing that the causal RPT6 -175 variant increases Rpt6, but not Alg13, levels. We further show that overexpression of RPT6 increases the degradation of multiple substrates with which we detect the QTL containing RPT6 -175 (page 20, para. 4, line 353).
These results provide strong support for our conclusion that RPT6 -175 increases proteasome activity by increasing RPT6 expression.
6. Ecological impact. The authors determine that the RM allele is derived, which seems reasonable. But at the same time in the tree in Fig. S1, which I would recommend moving to the main text, there are several niches that also have a high frequency of the allele. It may be challenging to distinguish positive from balancing selection, but it does look like there may be some enrichment for fermentative niches? I would recommend that the authors investigate this possibility.
To investigate a potential enrichment of the RM allele of RPT6 -175 in fermentative niches, we computed the allele's frequency across the clades defined in Peter et al., 2018 54 and classified clades as "fermentative" or "non-fermentative". Reviewer Response Figure 3 shows the results of this analysis. We did not detect enrichment of the RM allele of RPT6 -175 in fermentative niches versus non-fermentative niches (Mann-Whitney U p = 0.25). This is likely due, in part, to the low
7. In my view, Figure 1 can probably go to the supplement, whereas Figure S1 should probably be in the main text.
We appreciate the reviewer's suggestion and acknowledge that Figure 1 is not needed in the main text for readers in the field of proteasomal protein degradation. However, we anticipate that this work will also be of interest to readers in the field of complex trait genetics, in particular, individuals studying the genetics of gene expression. Based on discussions in our laboratory, we are confident that these readers will appreciate the visual distinction between ubiquitin-dependent and -independent proteasomal protein degradation Figure 1 provides and have, therefore, left this figure in the main text in the revised manuscript.
Per the reviewer's suggestion, Figure S1 appears as part of main text Figure 8 in the revised manuscript.
8. For all quantitative statements I would ask the authors to provide not only the p-value and statistical test used but also the magnitude of the effect.
The revised manuscript includes effect sizes for all statistical comparisons.
9. What happened with the outlier in Fig. 2E?
We cannot explain the BY rpn4∆ ODC TFT outlier in Figure 2E. This strain may have acquired a compensatory mutation in response to the deleterious effect of deleting the RPN4 gene. However, we do not have data supporting this notion and we therefore continue to include data from this strain in Figure 2E in the revised manuscript.
10. Limitations and advantages of the BSA approach should be discussed more fully. For example epistasis might be very hard to see, etc.
The revised Discussion section (page 28, para. 3, line 507) describes the advantages and limitations of bulk segregant analysis: "We identified QTLs for proteasome activity using bulk segregant analysis, an approach that has previously been used to characterize the genetic basis of variation in a host of molecular, cellular, and organismal traits 52, 55-58 . By assaying large numbers of individuals, bulk segregant analysis provides high statistical power to detect variant effects on a trait 52,57 . Here, we used high-throughput reporters to measure proteasome activity in millions of recombinant progeny from a cross of the BY and RM strains, which allowed us to reproducibly identify proteasome activity QTLs. Moreover, bulk segregant analysis is efficient in terms of time, labor, and resources as compared to linkage or association mapping. In particular, by generating two "bulks" with extreme phenotypes, we could detect proteasome activity QTLs through pooled whole-genome sequencing, rather than genotyping individual meiotic progeny. However, the choice of bulk segregant analysis also involves limitations that arise from this pooled whole-genome sequencing approach. Because we do not ascertain the genotypes of individual meiotic progeny, we cannot readily estimate the heritability of proteasome activity or the variance explained by the QTLs we detect. For the same reason, we are unable to detect genetic interactions between loci. Recent advances 59 could enable efficient, statistically powerful mapping of proteasome activity using individual meiotic progeny in future studies, which would address these limitations."
11. Can the authors please add columns for the number of genes and polymorphisms within each confidence interval described in Table 1?
The revised manuscript includes Supplementary Table 5, which contains the number of genes and polymorphisms in each confidence interval in Table 1.
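For readers less familiar with the pooled comparison described in the response to comment 10, the following is a minimal sketch of the core idea behind bulk segregant analysis: at each variant, compare the RM allele frequency between the high and low phenotype pools using sequencing read counts. It is a deliberately simplified illustration, not the statistical method used in the manuscript. The function names, the two-proportion z statistic, and the example read counts are all assumptions introduced here.

```python
import math


def allele_frequency_difference(high_rm, high_by, low_rm, low_by):
    """RM allele frequency in the high pool minus that in the low pool."""
    high_freq = high_rm / (high_rm + high_by)
    low_freq = low_rm / (low_rm + low_by)
    return high_freq - low_freq


def two_proportion_z(high_rm, high_by, low_rm, low_by):
    """Two-proportion z statistic for the pool difference at one variant.

    Treats each read as an independent draw, which ignores linkage
    between neighboring variants and pool-sampling noise (a simplification).
    """
    n_high = high_rm + high_by
    n_low = low_rm + low_by
    pooled = (high_rm + low_rm) / (n_high + n_low)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_high + 1 / n_low))
    if se == 0:
        return 0.0  # both pools fixed for the same allele: no signal
    return allele_frequency_difference(high_rm, high_by, low_rm, low_by) / se


if __name__ == "__main__":
    # Hypothetical read counts at a single variant (RM reads, BY reads).
    diff = allele_frequency_difference(70, 30, 30, 70)
    z = two_proportion_z(70, 30, 30, 70)
    print(f"allele frequency difference = {diff:.2f}, z = {z:.2f}")
```

Because only pooled read counts enter the calculation, a sketch like this also shows why individual genotypes, and therefore heritability estimates and locus-by-locus interactions, are not recoverable from the pooled design.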