Published by De Gruyter November 15, 2017

A Bayesian hierarchical model for identifying significant polygenic effects while controlling for confounding and repeated measures

  • Christopher McMahan, James Baurley, William Bridges, Chase Joyner, Muhamad Fitra Kacamarga, Robert Lund, Carissa Pardamean and Bens Pardamean

Abstract

Genomic studies of plants often seek to identify genetic factors associated with desirable traits. The process of evaluating genetic markers one by one (i.e. a marginal analysis) may not identify important polygenic and environmental effects. Further, confounding due to growing conditions/factors and genetic similarities among plant varieties may influence conclusions. When developing new plant varieties to optimize yield or thrive in future adverse conditions (e.g. flood, drought), scientists seek a complete understanding of how these factors influence desirable traits. Motivated by a study design that measures rice yield across different seasons, fields, and plant varieties in Indonesia, we develop a regression method that identifies significant genomic factors, while simultaneously controlling for field factors and genetic similarities in the plant varieties. Our approach develops a Bayesian maximum a posteriori probability (MAP) estimator under a generalized double Pareto shrinkage prior. Through a hierarchical representation of the proposed model, a novel and computationally efficient expectation-maximization (EM) algorithm is developed for variable selection and estimation. The performance of the proposed approach is demonstrated through simulation, and the approach is used to analyze rice yields from a pilot study conducted by the Indonesian Center for Rice Research.
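
To make the idea of EM-based MAP estimation under a generalized double Pareto (GDP) prior more concrete, the following is a minimal sketch for a plain linear regression. It is not the algorithm developed in the article, which additionally controls for field factors, genetic similarity and repeated measures. The sketch assumes a known noise standard deviation `sigma`, the common representation of the GDP prior as a Laplace prior on each coefficient with a gamma-distributed rate (shape `alpha`, rate `eta`), and a coordinate-descent M-step. The function names, default values and toy data are hypothetical.

```python
def soft_threshold(z, t):
    """Soft-thresholding operator: sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def gdp_map_em(X, y, sigma=1.0, alpha=1.0, eta=1.0, n_iter=50, n_sweeps=5):
    """Sketch of an EM search for a posterior mode under a GDP prior.

    Assumed hierarchy (a simplification, not the article's full model):
      y | beta    ~ Normal(X beta, sigma^2 I)
      beta_j | l_j ~ Laplace with density (l_j / (2 sigma)) exp(-l_j |beta_j| / sigma)
      l_j          ~ Gamma(shape=alpha, rate=eta)
    X is a list of rows; no column of X may be identically zero.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    col_ss = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_iter):
        # E-step: l_j | beta_j is Gamma(alpha + 1, eta + |beta_j| / sigma),
        # so its conditional mean is the per-coefficient penalty weight.
        w = [(alpha + 1.0) / (eta + abs(b) / sigma) for b in beta]
        # M-step: weighted lasso, solved approximately by coordinate descent.
        for _sweep in range(n_sweeps):
            for j in range(p):
                # Inner product of column j with the partial residual
                # that leaves coefficient j out of the fit.
                rho = sum(
                    X[i][j]
                    * (y[i] - sum(X[i][k] * beta[k] for k in range(p) if k != j))
                    for i in range(n)
                )
                beta[j] = soft_threshold(rho, sigma * w[j]) / col_ss[j]
    return beta


# Toy data (hypothetical): two orthogonal predictors, only the first matters.
X = [[1.0, 1.0], [1.0, -1.0], [1.0, 1.0], [1.0, -1.0]]
y = [3.1, 2.9, 3.0, 3.0]
beta_hat = gdp_map_em(X, y)
print(beta_hat)
```

Because the weight on a coefficient shrinks as its magnitude grows, large effects are penalized lightly while small ones are thresholded to exactly zero, which is how estimation and variable selection happen in one pass. The GDP penalty is not convex, so a scheme like this finds a local mode that can depend on the starting value.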

Acknowledgement

We would like to acknowledge the Indonesian Center for Agricultural Biotechnology and Genetic Resources Research and Development (ICABIOGRAD) for supplying data, NVIDIA and Amazon Web Services for computational support through their grants programs, the International Treaty on Plant Genetic Resources for Food and Agriculture grant W3A-PR-07-Indonesia, NIH/NIAID grant R01 AI121351, and NIH/NIDA grant R43 DA041211-01A1.



Supplemental Material

The online version of this article offers supplementary material (https://doi.org/10.1515/sagmb-2017-0044).


Published Online: 2017-11-15
Published in Print: 2017-11-27

©2017 Walter de Gruyter GmbH, Berlin/Boston
