  • Letter

Genetics of single-cell protein abundance variation in large yeast populations

This article has been updated

Abstract

Variation among individuals arises in part from differences in DNA sequences, but the genetic basis for variation in most traits, including common diseases, remains only partly understood. Many DNA variants influence phenotypes by altering the expression level of one or several genes. The effects of such variants can be detected as expression quantitative trait loci (eQTL)1. Traditional eQTL mapping requires large-scale genotype and gene expression data for each individual in the study sample, which limits sample sizes to hundreds of individuals in both humans and model organisms and reduces statistical power2,3,4,5,6. Consequently, many eQTL are probably missed, especially those with smaller effects7. Furthermore, most studies use messenger RNA rather than protein abundance as the measure of gene expression. Studies that have used mass-spectrometry proteomics8,9,10,11,12,13 reported unexpected differences between eQTL and protein QTL (pQTL) for the same genes9,10, but these studies have been even more limited in scope. Here we introduce a powerful method for identifying genetic loci that influence protein expression in the yeast Saccharomyces cerevisiae. We measure single-cell protein abundance through the use of green fluorescent protein tags in very large populations of genetically variable cells, and use pooled sequencing to compare allele frequencies across the genome in thousands of individuals with high versus low protein abundance. We applied this method to 160 genes and detected many more loci per gene than previous studies. We also observed closer correspondence between loci that influence protein abundance and loci that influence mRNA abundance of a given gene. Most loci that we detected were clustered in ‘hotspots’ that influence multiple proteins, and some hotspots were found to influence more than half of the proteins that we examined. The variants that underlie these hotspots have profound effects on the gene regulatory network and provide insights into genetic variation in cell physiology between yeast strains.

Figure 1: Multiple loci affect protein levels.
Figure 2: X-pQTL hotspots.
Figure 3: Hotspot effects.

Change history

  • 10 January 2014

    A minor change was made to the opening paragraph.

References

  1. Rockman, M. V. & Kruglyak, L. Genetics of global gene expression. Nature Rev. Genet. 7, 862–872 (2006)

  2. Smith, E. N. & Kruglyak, L. Gene–environment interaction in yeast gene expression. PLoS Biol. 6, e83 (2008)

  3. Rockman, M. V., Skrovanek, S. S. & Kruglyak, L. Selection at linked sites shapes heritable phenotypic variation in C. elegans. Science 330, 372–376 (2010)

  4. Huang, G. J. et al. High resolution mapping of expression QTLs in heterogeneous stock mice in multiple tissues. Genome Res. 19, 1133–1140 (2009)

  5. West, M. A. L. et al. Global eQTL mapping reveals the complex genetic architecture of transcript-level variation in Arabidopsis. Genetics 175, 1441–1450 (2007)

  6. Lappalainen, T. et al. Transcriptome and genome sequencing uncovers functional variation in humans. Nature 501, 506–511 (2013)

  7. Brem, R. B. & Kruglyak, L. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc. Natl Acad. Sci. USA 102, 1572–1577 (2005)

  8. Foss, E. J. et al. Genetic basis of proteome variation in yeast. Nature Genet. 39, 1369–1375 (2007)

  9. Foss, E. J. et al. Genetic variation shapes protein networks mainly through non-transcriptional mechanisms. PLoS Biol. 9, e1001144 (2011)

  10. Ghazalpour, A. et al. Comparative analysis of proteome and transcriptome variation in mouse. PLoS Genet. 7, e1001393 (2011)

  11. Wu, L. et al. Variation and genetic control of protein abundance in humans. Nature 499, 79–82 (2013)

  12. Khan, Z., Bloom, J. S., Garcia, B. A., Singh, M. & Kruglyak, L. Protein quantification across hundreds of experimental conditions. Proc. Natl Acad. Sci. USA 106, 15544–15548 (2009)

  13. Skelly, D. A. et al. Integrative phenomics reveals insight into the structure of phenotypic diversity in budding yeast. Genome Res. 23, 1496–1504 (2013)

  14. Ehrenreich, I. M. et al. Dissection of genetically complex traits with extremely large pools of yeast segregants. Nature 464, 1039–1042 (2010)

  15. Huh, W.-K. et al. Global analysis of protein localization in budding yeast. Nature 425, 686–691 (2003)

  16. Edwards, M. D. & Gifford, D. K. High-resolution genetic mapping with pooled sequencing. BMC Bioinformatics 13, S8 (2012)

  17. Picotti, P. et al. A complete mass-spectrometric map of the yeast proteome applied to quantitative trait analysis. Nature 494, 266–270 (2013)

  18. Brem, R. B., Yvert, G., Clinton, R. & Kruglyak, L. Genetic dissection of transcriptional regulation in budding yeast. Science 296, 752–755 (2002)

  19. Litvin, O., Causton, H. C., Chen, B. J. & Pe’er, D. Modularity and interactions in the genetics of gene expression. Proc. Natl Acad. Sci. USA 106, 6441–6446 (2009)

  20. Zitomer, R. S. & Lowry, C. V. Regulation of gene expression by oxygen in Saccharomyces cerevisiae. Microbiol. Rev. 56, 1–11 (1992)

  21. Gaisne, M., Bécam, A. M., Verdiere, J. & Herbert, C. J. A. A ‘natural’ mutation in Saccharomyces cerevisiae strains derived from S288c affects the complex regulatory gene HAP1 (CYP1). Curr. Genet. 36, 195–200 (1999)

  22. Harbison, C. T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99–104 (2004)

  23. Butler, G. Hypoxia and gene expression in eukaryotic microbes. Annu. Rev. Microbiol. 67, 291–312 (2013)

  24. Zaman, S., Lippman, S. I., Zhao, X. & Broach, J. R. How Saccharomyces responds to nutrients. Annu. Rev. Genet. 42, 27–81 (2008)

  25. Zaman, S., Lippman, S. I., Schneper, L., Slonim, N. & Broach, J. R. Glucose regulates transcription in yeast through a network of signaling pathways. Mol. Syst. Biol. 5, 245 (2009)

  26. Spor, A. et al. Niche-driven evolution of metabolic and life-history strategies in natural and domesticated populations of Saccharomyces cerevisiae. BMC Evol. Biol. 9, 296 (2009)

  27. Warringer, J. et al. Trait variation in yeast is defined by population history. PLoS Genet. 7, e1002111 (2011)

  28. Fraser, H. B., Moses, A. M. & Schadt, E. E. Evidence for widespread adaptive evolution of gene expression in budding yeast. Proc. Natl Acad. Sci. USA 107, 2977–2982 (2010)

  29. Lewis, J. A. & Gasch, A. P. Natural variation in the yeast glucose-signaling network reveals a new role for the Mig3p transcription factor. G3 Genes Genomes Genetics 2, 1607–1612 (2012)

  30. Henras, A. K. et al. The post-transcriptional steps of eukaryotic ribosome biogenesis. Cell. Mol. Life Sci. 65, 2334–2359 (2008)

  31. Howson, R. et al. Construction, verification and experimental use of two epitope-tagged collections of budding yeast strains. Comp. Funct. Genomics 6, 2–16 (2005)

  32. Tong, A. H. Y. & Boone, C. High-throughput strain construction and systematic synthetic lethal screening in Saccharomyces cerevisiae. Methods in Microbiology 36, 369–707 (2007)

  33. Newman, J. R. S. et al. Single-cell proteomic analysis of S. cerevisiae reveals the architecture of biological noise. Nature 441, 840–846 (2006)

  34. Adey, A. et al. Rapid, low-input, low-bias construction of shotgun fragment libraries by high-density in vitro transposition. Genome Biol. 11, R119 (2010)

  35. Bloom, J. S., Ehrenreich, I. M., Loo, W. T., Lite, T.-L. V. & Kruglyak, L. Finding the sources of missing heritability in a yeast cross. Nature 494, 234–237 (2013)

  36. Meyer, M. & Kircher, M. Illumina Sequencing Library Preparation for Highly Multiplexed Target Capture and Sequencing. Cold Spring Harbor Protocols http://dx.doi.org/10.1101/pdb.prot5448 (2010)

  37. Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows–Wheeler transform. Bioinformatics 25, 1754–1760 (2009)

  38. Broman, K. W., Wu, H., Sen, S. & Churchill, G. A. R/qtl: QTL mapping in experimental crosses. Bioinformatics 19, 889–890 (2003)

  39. Yvert, G. et al. Trans-acting regulatory variation in Saccharomyces cerevisiae and the role of transcription factors. Nature Genet. 35, 57–64 (2003)

  40. Storey, J. D. & Tibshirani, R. Statistical significance for genomewide studies. Proc. Natl Acad. Sci. USA 100, 9440–9445 (2003)

  41. Spivak, A. T. & Stormo, G. D. ScerTF: a comprehensive database of benchmarked position weight matrices for Saccharomyces species. Nucleic Acids Res. 40, D162–D168 (2012)


Acknowledgements

We are grateful to C. DeCoste at the Princeton Flow Cytometry Resource Facility for technical assistance and advice on the experiments. This work was supported by National Institutes of Health (NIH) grant R01 GM102308, a James S. McDonnell Centennial Fellowship, and the Howard Hughes Medical Institute (L.K.), German Science Foundation research fellowship AL 1525/1-1 (F.W.A.), a National Science Foundation fellowship (J.S.B.), and NIH postdoctoral fellowship F32 GM101857-02 (S.T.).

Author information

Contributions

F.W.A. and L.K. conceived the project, designed research and wrote the paper. F.W.A. and A.H.S. performed experiments. F.W.A. analysed the data. S.T. provided advice on yeast strain construction, the initial experimental design and other experimental procedures. J.S.B. provided advice on experimental procedures and data analysis.

Corresponding authors

Correspondence to Frank W. Albert or Leonid Kruglyak.

Ethics declarations

Competing interests

The authors declare no competing financial interests.

Extended data figures and tables

Extended Data Figure 1 Overview of the experimental design.

Extended Data Figure 2 Illustration of FACS design.

Shown are GFP intensity and forward scatter (FSC, a measure of cell size) recorded during FACS. The correlation between cell size and GFP intensity is clearly visible. The superimposed collection gates are an illustration, and do not show the actual gates used for this gene. a, The low GFP (blue) and high GFP (red) gates sample extreme levels of GFP within a defined range of cell sizes. b, For the ‘null’ experiments, the same cell size range is collected, but without selecting on GFP.
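The gating logic in this legend can be sketched in a few lines of Python. This is an illustration only, not the sorting procedure used in the study: the function names, the (FSC, GFP) tuple layout and the `tail_fraction` parameter are assumptions, and the legend does not state what fraction of cells each gate collected.

```python
def sort_gates(cells, fsc_range, tail_fraction=0.02):
    """Return (low_gfp, high_gfp) cells within a fixed cell-size window.

    cells: list of (fsc, gfp) tuples, one per cell.
    fsc_range: (min_fsc, max_fsc) window that controls for cell size.
    tail_fraction: hypothetical share of size-gated cells per GFP gate.
    """
    fsc_min, fsc_max = fsc_range
    # Restrict to the size window first, then rank the survivors by GFP.
    in_size = sorted(
        (c for c in cells if fsc_min <= c[0] <= fsc_max),
        key=lambda c: c[1],
    )
    if not in_size:
        return [], []
    k = max(1, int(len(in_size) * tail_fraction))
    return in_size[:k], in_size[-k:]


def null_gate(cells, fsc_range):
    """'Null' sort: same size window, but no selection on GFP."""
    fsc_min, fsc_max = fsc_range
    return [c for c in cells if fsc_min <= c[0] <= fsc_max]
```

Restricting both gates to the same FSC window is what keeps the visible correlation between cell size and GFP intensity from being mistaken for a difference in protein abundance.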

Extended Data Figure 3 Sequence analyses and X-pQTL detection example.

In all panels, physical genomic coordinates are shown on the x-axes. The position of the gene (LEU1) is indicated by the purple vertical line. Top panel: frequency of the BY allele in the high (red) and low (blue) GFP population. SNPs are indicated by dots, and loess-smoothed averages as solid lines. Note the fixation of the BY allele in all segregants at the gene position and at the mating type locus on chromosome III, as well as the fixation of the RM allele at the synthetic genetic array marker integrated at the CAN1 locus on the left arm of chromosome V. Middle panel: subtraction of allele frequencies in the low from those in the high GFP population. SNPs are indicated by grey dots, with the loess-smoothed average indicated in black. Note that, on average, there is no difference between the high and the low populations. Positive difference values correspond to a higher frequency of the BY allele in the high GFP population, which we interpret as higher expression being caused by the BY allele at that locus. The red horizontal lines indicate the 99.99% quantile from the empirical ‘null’ sort experiments. They are shown for illustration only and were not used for peak calling. The blue vertical boxes indicate positions of genome-wide X-pQTL, with the width representing the 2-lod drop interval. Bottom panel: lod scores obtained from MULTIPOOL (ref. 16). The red horizontal line is the genome-wide significance threshold (lod = 4.5). Stars indicate X-pQTL called by our algorithm; these positions correspond to the blue bars in the middle panel. For this gene, 14 X-pQTL are called.
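The quantities in this legend can be illustrated with a short Python sketch. It makes several assumptions that are not taken from the paper: allele counts are given as (BY, RM) pairs per SNP, a simple rolling mean stands in for loess smoothing, and the 2-lod drop interval is taken as the contiguous run of positions around a single peak whose lod stays within 2 units of the maximum. It does not reproduce MULTIPOOL or the peak-calling algorithm used in the study.

```python
def by_frequency(by_count, rm_count):
    """Fraction of reads carrying the BY allele at one SNP."""
    total = by_count + rm_count
    return by_count / total if total else float("nan")


def frequency_difference(high_counts, low_counts):
    """BY allele frequency in the high GFP pool minus that in the low pool."""
    return [
        by_frequency(*high) - by_frequency(*low)
        for high, low in zip(high_counts, low_counts)
    ]


def rolling_mean(values, half_window=2):
    """Crude smoother used here in place of loess (an assumption)."""
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half_window): i + half_window + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def lod_drop_interval(positions, lods, drop=2.0, threshold=4.5):
    """Return (left, peak, right) positions for the highest peak, or None.

    Handles one peak on one chromosome; threshold 4.5 is the genome-wide
    value quoted in the legend.
    """
    peak = max(range(len(lods)), key=lods.__getitem__)
    if lods[peak] < threshold:
        return None
    cutoff = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= cutoff:
        right += 1
    return positions[left], positions[peak], positions[right]
```

A positive value from `frequency_difference` corresponds to the BY allele being enriched in the high GFP pool, matching the sign convention of the middle panel.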

Extended Data Figure 4 Reproducibility examples.

Shown are allele frequency differences between the high and low GFP populations along the genome of replicates for three genes. The gene positions are indicated by purple vertical lines; note that YMR315W and GCN1 were ‘local’ experiments where peaks at the gene position are visible. The red horizontal lines indicate the 99.99% quantile from the empirical ‘null’ sort experiments. Note the near-perfect agreement for strong X-pQTL, with some differences discernible at weaker loci. See Supplementary Note 1 for details.

Extended Data Figure 5 Example of a local X-pQTL at the gene MAE1.

Shown is the difference in the frequency of the BY allele between the high and the low GFP population along the genome. Red dashed horizontal lines indicate the 99.99% quantile from the empirical ‘null’ sort experiments. They are shown for illustration only and were not used for peak calling.
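As a rough illustration of how a reference line like the 99.99% quantile could be derived from ‘null’ sorts, the sketch below pools allele frequency differences from null experiments and takes an empirical quantile. Two assumptions are made here and are not stated in the legend: the quantile is computed on absolute differences, and a nearest-rank definition of the quantile is used. The function names are hypothetical.

```python
import math


def empirical_quantile(values, q):
    """Nearest-rank quantile: the value at rank ceil(q * n) in sorted order."""
    if not values:
        raise ValueError("need at least one value")
    ordered = sorted(values)
    # Round before ceil so that floating-point noise in q * n cannot
    # push the rank up by one.
    rank = max(1, math.ceil(round(q * len(ordered), 9)))
    return ordered[rank - 1]


def null_threshold(null_differences, q=0.9999):
    """Quantile of |allele frequency difference| across null sorts."""
    return empirical_quantile([abs(d) for d in null_differences], q)


def exceeds_null(differences, threshold):
    """Flag SNPs whose absolute difference lies beyond the null line."""
    return [abs(d) > threshold for d in differences]
```

Consistent with the legend, a line obtained this way is a visual guide only; it is not a substitute for the lod-based peak calling.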

Extended Data Figure 6 Distributions of X-pQTL effect sizes for X-pQTL with and without a corresponding eQTL.

Effect sizes are shown as the absolute allele frequency differences between the high and low GFP population.

Extended Data Figure 7 The impact of small effect sizes on the π1 estimate.

Each panel shows the P-value distribution obtained from 5,000 tests of a given effect size x, if two groups of 50 individuals each are compared using a t-test. The effect size x is given along with the corresponding variance explained (VE), the π1 estimate, and the fraction of tests that achieved nominal significance (P < 0.05). Note that π1 reaches 0.3 at VE = 0.5–1% (middle row, right columns). See Supplementary Note 2 for details.
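A simulation of this kind can be sketched with the Python standard library. The sketch rests on labelled assumptions rather than on the procedure in Supplementary Note 2: the effect size is a mean shift in units of the within-group standard deviation, P-values use a normal approximation to the t distribution (close at 98 degrees of freedom), π1 is estimated as 1 − π0 with a Storey-type estimator at a single fixed λ, and VE is computed as x²/(x² + 4) for two equal-sized groups.

```python
import math
import random
import statistics


def two_sample_p(a, b):
    """Two-sided P-value for a difference in means (normal approximation)."""
    se = math.sqrt(
        statistics.variance(a) / len(a) + statistics.variance(b) / len(b)
    )
    z = (statistics.fmean(a) - statistics.fmean(b)) / se
    return math.erfc(abs(z) / math.sqrt(2))


def simulate_pi1(effect, n_tests=5000, n_per_group=50, lam=0.5, seed=0):
    """Return (pi1_estimate, fraction of tests with P < 0.05)."""
    rng = random.Random(seed)
    pvals = []
    for _ in range(n_tests):
        a = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        b = [rng.gauss(effect, 1.0) for _ in range(n_per_group)]
        pvals.append(two_sample_p(a, b))
    # Storey-type estimate: P-values above lam are assumed to be nulls.
    pi0 = sum(p > lam for p in pvals) / (n_tests * (1.0 - lam))
    frac_sig = sum(p < 0.05 for p in pvals) / n_tests
    return 1.0 - min(pi0, 1.0), frac_sig


def variance_explained(effect):
    """VE of a mean shift between two equal groups with unit within-group SD."""
    return effect ** 2 / (effect ** 2 + 4.0)
```

Calling `simulate_pi1(x)` over a grid of small x values, alongside `variance_explained(x)`, gives one row of numbers per panel in the spirit of the figure. The exact values depend on the assumptions above and on the random seed.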

Extended Data Figure 8 Genes regulated by the hotspots on chromosomes XI, XII and XV.

The table shows genes that have an X-pQTL at three hotspots. For each gene involved in aerobic respiration, we show the X-pQTL lod scores along the genome in the top half of the plot, and the eQTL and pQTL lod scores in the bottom half on an inverted scale. The hotspot locations are shown as grey bars labelled with the names of the causative genes. Purple vertical lines indicate the gene positions. Red dashed horizontal lines are significance thresholds. Stars indicate significant QTL.

Extended Data Table 1 mRNA-specific and protein-specific local QTL
Extended Data Table 2 Hotspot regulators of protein expression

Supplementary information

Supplementary Information

This file contains Supplementary Notes 1-2 and Supplementary Tables 1-3. (PDF 258 kb)

Supplementary Data 1

This file contains full details of the genes studied. (XLSX 67 kb)

Supplementary Data 2

This file contains a list of X-pQTL identified in this study. (XLSX 109 kb)

Supplementary Data 3

This zipped file contains allele count data used in the analyses. (ZIP 32911 kb)

About this article

Cite this article

Albert, F., Treusch, S., Shockley, A. et al. Genetics of single-cell protein abundance variation in large yeast populations. Nature 506, 494–497 (2014). https://doi.org/10.1038/nature12904

  • DOI: https://doi.org/10.1038/nature12904
