Bayesian multilocus association mapping on ordinal and censored traits and its application to the analysis of genetic variation among Oryza sativa L. germplasms

  • Original Paper
  • Theoretical and Applied Genetics

Abstract

Association mapping can be a powerful tool for detecting quantitative trait loci (QTLs) without requiring line-crossing experiments. We previously proposed a Bayesian approach for simultaneously mapping multiple QTLs by a regression method that directly incorporates estimates of the population structure. In the present study, we extended our method to analyze ordinal and censored traits, since both types of traits are common in the evaluation of germplasm collections. Ordinal-probit and tobit models were employed to analyze ordinal and censored traits, respectively. In both models, we postulated the existence of a latent continuous variable associated with the observable data, and we used a Markov-chain Monte Carlo algorithm to sample the latent variable and determine the model parameters. We evaluated the efficiency of our approach by using simulated- and real-trait analyses of a rice germplasm collection. Simulation analyses based on real marker data showed that our models could reduce both false-positive and false-negative rates in detecting QTLs to reasonable levels. Simulation analyses based on highly polymorphic marker data, which were generated by coalescent simulations, showed that our models could be applied to genotype data based on highly polymorphic marker systems, like simple sequence repeats. For the real traits, we analyzed heading date as a censored trait and amylose content and the shape of milled rice grains as ordinal traits. We found significant markers that may be linked to previously reported QTLs. Our approach will be useful for whole-genome association mapping of ordinal and censored traits in rice germplasm collections.


References

  • Albert JH, Chib S (1993) Bayesian analysis of binary and polychotomous response data. J Am Stat Assoc 88:669–679

  • Agrama HA, Eizenga GC, Yan W (2007) Association mapping of yield and its components in rice cultivars. Mol Breed 19:341–356

  • Cowles MK (1996) Accelerating Monte Carlo Markov chain convergence for cumulative-link generalized linear models. Stat Comput 6:101–111

  • Cui KH, Peng SB, Xing YZ, Xu CG, Yu SB, Zhang Q (2002) Molecular dissection of seedling-vigor and associated physiological traits in rice. Theor Appl Genet 105:745–753

  • Donnelly PJ, Tavaré S (1995) Coalescents and genealogical structure under neutrality. Annu Rev Genet 29:401–421

  • George EI, McCulloch RE (1993) Variable selection via Gibbs sampling. J Am Stat Assoc 88:881–889

  • Godsill SJ (2001) On the relationship between Markov chain Monte Carlo methods for model uncertainty. J Comput Graph Stat 10:230–248

  • Green PJ (1995) Reversible jump Markov chain Monte Carlo computation and Bayesian model determination. Biometrika 82:711–732

  • Harushima Y, Yano M, Shomura A, Sato M, Shimano T, Kuboki Y, Yamamoto T, Lin SY, Antonio BA, Parco A, Kajiya H, Huang N, Yamamoto K, Nagamura Y, Kurata N, Khush GS, Sasaki T (1998) A high-density rice genetic linkage map with 2275 markers using a single F2 population. Genetics 148:479–494

  • Hawkes JG (1983) The diversity of crop plants. Harvard University Press, Cambridge

  • He P, Li SG, Qian Q, Ma YQ, Li JZ, Wang WM, Chen Y, Zhu LH (1999) Genetic analysis of rice grain quality. Theor Appl Genet 98:502–508

  • Hobert JP, Casella G (1996) The effect of improper priors on Gibbs sampling in hierarchical linear mixed models. J Am Stat Assoc 91:1461–1473

  • Huang N, Parco A, Mew T, Magpantay G, McCouch S, Guiderdoni E, Xu JC, Subudhi P, Angeles ER, Khush GS (1997) RFLP mapping of isozymes, RAPD, and QTLs for grain shape, brown planthopper resistance in a doubled-haploid rice population. Mol Breed 3:105–113

  • Huang H, Eversley CD, Threadgill DW, Zou F (2007) Bayesian multiple quantitative trait loci mapping for complex traits using markers of the entire genome. Genetics 176:2529–2540

  • International Rice Genome Sequencing Project (2005) The map-based sequence of the rice genome. Nature 436:793–800

  • Iwata H, Uga Y, Yoshioka Y, Ebana K, Hayashi T (2007) Bayesian association mapping of multiple quantitative trait loci and its application to the analysis of genetic variation among Oryza sativa L. germplasms. Theor Appl Genet 114:1437–1449

  • Jannink JL (2007) Identifying quantitative trait locus by genetic background interaction in association studies. Genetics 176:553–561

  • Kikuchi S, Satoh K, Nagata T, Kawagashira N, Doi K et al (2003) Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science 301:376–379

  • Kilpikari R, Sillanpää MJ (2003) Bayesian analysis of multilocus association in quantitative and qualitative traits. Genet Epidemiol 25:122–135

  • Kingman JFC (1982) The coalescent. Stochastic Process Appl 13:235–248

  • Kojima S, Takahashi Y, Kobayashi Y, Monna L, Sasaki T, Araki T, Yano M (2002) Hd3a, a rice ortholog of the Arabidopsis FT gene, promotes transition to flowering downstream of Hd1 under short-day conditions. Plant Cell Physiol 43:1096–1105

  • Kojima Y, Ebana K, Fukuoka S, Nagamine T, Kawase M (2005) Development of an RFLP-based rice diversity research set of germplasm. Breed Sci 55:431–440

  • Kuo L, Mallick B (1998) Variable selection for regression models. Sankhya Ser B 60:65–81

  • Kurata N, Nagamura Y, Yamamoto K, Harushima Y, Sue N, Wu J, Antonio BA, Shomura A, Shimizu T, Lin SY, Inoue T, Fukuda A, Shimano T, Kuboki Y, Toyama T, Miyamoto Y, Kirihara T, Hayasaka K, Miyao A, Monna L, Zhong HS, Tamura Y, Wang ZX, Momma T, Umehara Y, Yano M, Sasaki T, Minobe Y (1994) A 300 kilobase interval genetic map of rice including 883 expressed sequences. Nature Genet 8:365–372

  • Lanceras JC, Huang HL, Naivikul O, Vanavichit A, Ruanjaichon V, Tragoonrung S (2000) Mapping of genes for cooking and eating qualities in Thai Jasmine rice (KDML105). DNA Res 7:93–101

  • Lander ES, Schork NJ (1994) Genetic dissection of complex traits. Science 265:2037–2049

  • Laval G, Excoffier L (2004) SIMCOAL 2.0: a program to simulate genomic diversity over large recombining regions in a subdivided population with a complex history. Bioinformatics 20:2485–2487

  • Li ZK, Yu SB, Lafitte HR, Huang N, Courtois B, Hittalmani S, Vijayakumar CH, Liu GF, Wang GC, Shashidhar HE, Zhuang JY, Zheng KL, Singh VP, Sidhu JS, Srivantaneeyakul S, Khush GS (2003) QTL × environment interactions in rice. I. Heading date and plant height. Theor Appl Genet 108:141–153

  • Li J, Xiao J, Grandillo S, Jiang L, Wan Y, Qiyun D, Yuan L, McCouch SR (2004) QTL detection for rice grain quality traits using an interspecific backcross population derived from cultivated Asian (O. sativa L.) and African (O. glaberrima S.) rice. Genome 47:697–704

  • Malosetti M, van der Linden CG, Vosman B, van Eeuwijk FA (2007) A mixed-model approach to association mapping using pedigree information with an illustration of resistance to Phytophthora infestans in potato. Genetics 175:879–889

  • Mei HW, Luo LJ, Ying CS, Wang YP, Yu XQ, Guo LB, Paterson AH, Li ZK (2003) Gene actions of QTLs affecting several agronomic traits resolved in a recombinant inbred rice population and two testcross populations. Theor Appl Genet 107:89–101

  • Meuwissen THE, Hayes BJ, Goddard ME (2001) Prediction of total genetic value using genome-wide dense marker maps. Genetics 157:1819–1829

  • Nei M (1973) Analysis of gene diversity in subdivided populations. Proc Nat Acad Sci USA 70:3321–3323

  • Oraguzie NC, Rikkerink EHA, Gardiner SE, De Silva HN (2007) Association mapping in plants. Springer, New York

  • Pritchard JK, Stephens M, Donnelly P (2000) Inference of population structure using multilocus genotype data. Genetics 155:945–959

  • Satagopan JM, Yandell BS, Newton MA, Osborn TC (1996) A Bayesian approach to detect quantitative trait loci using Markov chain Monte Carlo. Genetics 144:805–816

  • Sillanpää MJ, Arjas E (1998) Bayesian mapping of multiple quantitative trait loci from incomplete inbred line cross data. Genetics 148:1373–1388

  • Sillanpää MJ, Arjas E (1999) Bayesian mapping of multiple quantitative trait loci from incomplete outbred offspring data. Genetics 151:1605–1619

  • Sillanpää MJ, Bhattacharjee M (2005) Bayesian association-based fine mapping in small chromosomal segments. Genetics 169:427–439

  • Sillanpää MJ, Bhattacharjee M (2006) Association mapping of complex trait loci with context-dependent effects and unknown context variable. Genetics 174:1597–1611

  • Sillanpää MJ, Hoti F (2007) Mapping quantitative trait loci from a single-tail sample of the phenotype distribution including survival data. Genetics 177:2361–2377

  • Sillanpää MJ, Kilpikari R, Ripatti S, Onkamo P, Uimari P (2001) Bayesian association mapping for quantitative traits in a mixture of two populations. Genet Epidemiol 21(Suppl 1):S692–S699

  • Sorensen D, Gianola D (2002) Likelihood, Bayesian, and MCMC methods in quantitative genetics. Springer, Heidelberg

  • Sorensen DA, Gianola D, Korsgaard I (1998) Bayesian mixed-effect model analysis of censored normal distribution with animal breeding applications. Acta Agric Scand 48:222–229

  • Thornsberry JM, Goodman MM, Doebley J, Kresovich S, Nielsen D, Buckler ES (2001) Dwarf8 polymorphisms associate with variation in flowering time. Nature Genet 28:286–289

  • Uimari P, Hoeschele I (1997) Mapping linked quantitative trait loci using Bayesian analysis and Markov chain Monte Carlo algorithms. Genetics 146:735–743

  • Uimari P, Sillanpää MJ (2001) Bayesian oligogenic analysis of quantitative and qualitative traits in general pedigrees. Genet Epidemiol 21:224–242

  • Wan XY, Wan JM, Weng JF, Jiang L, Bi JC, Wang CM, Zhai HQ (2005) Stability of QTLs for rice grain dimension and endosperm chalkiness characteristics across eight environments. Theor Appl Genet 110:1334–1346

  • Weber A, Clark RM, Vaughn L, Sanchez-Gonzalez JD, Yu JM, Yandell BS, Bradbury P, Doebley J (2007) Major regulatory genes in maize contribute to standing variation in teosinte (Zea mays ssp parviglumis). Genetics 177:2349–2359

  • Yamamoto T, Lin H, Sasaki T, Yano M (2000) Identification of heading date quantitative trait locus Hd6 and characterization of its epistatic interaction with Hd2 in rice using advanced backcross progeny. Genetics 154:885–891

  • Yamanaka S, Nakamura I, Watanabe KN, Sato YI (2004) Identification of SNPs in the waxy gene among glutinous rice cultivars and their evolutionary significance during the domestication process in rice. Theor Appl Genet 108:1200–1204

  • Yano M, Katayose Y, Ashikari M, Yamanouchi U, Monna L, Fuse T, Baba T, Yamamoto K, Umehara Y, Nagamura Y, Sasaki T (2000) Hd1, a major photoperiod sensitivity quantitative trait locus in rice, is closely related to the Arabidopsis flowering time gene CONSTANS. Plant Cell 12:2473–2484

  • Yi N (2004) A unified Markov chain Monte Carlo framework for mapping multiple quantitative trait loci. Genetics 167:967–975

  • Yi N, Xu S (2000) Bayesian mapping of quantitative trait loci for complex binary traits. Genetics 155:1391–1403

  • Yi N, George V, Allison DB (2003) Stochastic search variable selection for identifying multiple quantitative trait loci. Genetics 164:1129–1138

  • Yi N, Xu S, George V, Allison DB (2004) Mapping multiple quantitative trait loci for complex ordinal traits. Behav Genet 34:3–15

  • Yi N, Yandell BS, Churchill GA, Allison DB, Eisen EJ, Pomp D (2005) Bayesian model selection for genome-wide epistatic quantitative trait loci analysis. Genetics 170:1333–1344

  • Yi N, Banerjee S, Pomp D, Yandell BS (2007) Bayesian mapping of genome-wide interacting QTL for ordinal traits. Genetics 176:1855–1864

  • Yu J, Buckler ES (2006) Genetic association mapping and genome organization of maize. Curr Opin Biotech 17:155–160

  • Yu J, Pressoir G, Briggs W, Vroh Bi I, Yamasaki M, Doebley JF, McMullen MD, Gaut BS, Nielsen DM, Holland JB, Kresovich S, Buckler ES (2006) A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Nature Genet 38:203–208

  • Zhao K, Aranzana MJ, Kim S, Lister C, Shindo C, Tang C, Toomajian C, Zheng H, Dean C, Marjoram P, Nordborg M (2007) An Arabidopsis example of association mapping in structured samples. PLoS Genet 3:e4

  • Zweig MH, Campbell G (1993) Receiver-operating characteristic (ROC) plots—a fundamental evaluation tool in clinical medicine. Clin Chem 39:561–577


Acknowledgments

This work was supported by a grant from the Ministry of Agriculture, Forestry and Fisheries of Japan (Green Technology Project, QT1001, and Genomics for Agricultural Innovation, DD-4050).

Author information

Corresponding author

Correspondence to Hiroyoshi Iwata.

Additional information

Communicated by M. Sillanpää.

Appendices

Appendix A: Priors and posteriors

We considered the prior distributions of the parameters in the model in Eq. 4 except \( \varvec{\gamma} \) as follows:

$$ \begin{aligned} \alpha_{j} & \sim U( - \infty, \infty ), \\ \varvec{\beta}_{k} & \sim N({\mathbf{0}},{\mathbf{I}}\sigma_{{\beta_{k} }}^{2} ), \\ \sigma_{{\beta_{k} }}^{2} & \sim \upsilon_{\beta } s_{\beta }^{2} \chi_{{\upsilon_{\beta } }}^{ - 2}, \\ \end{aligned} $$

and

$$ \sigma_{e}^{2} \sim \upsilon_{e} s_{e}^{2} \chi_{{\upsilon_{e} }}^{ - 2}, $$

where \( \upsilon_{\beta } \), \( s_{\beta }^{2} \), \( \upsilon_{e} \), and \( s_{e}^{2} \) are hyperparameters of the distributions. That is, the prior of \( \sigma_{{\beta_{k} }}^{2} \) is a scaled inverted chi-square distribution with \( \upsilon_{\beta } \) degrees of freedom and scale parameter \( s_{\beta }^{2} \), and the prior of \( \sigma_{e}^{2} \) is the same distribution with different parameters (i.e., \( \upsilon_{e} \) and \( s_{e}^{2} \)).
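As a minimal sketch of what a draw from such a prior looks like, the following Python snippet samples from a scaled inverted chi-square distribution by dividing the product of the degrees of freedom and the scale by a chi-square variate. The function name `sample_scaled_inv_chi2` and the illustrative values (10 degrees of freedom, scale 1) are assumptions made for this example and are not taken from the paper.

```python
import random

def sample_scaled_inv_chi2(df, scale, rng):
    """Draw sigma^2 = df * scale / X, where X ~ chi-square(df)."""
    if df <= 0 or scale <= 0:
        raise ValueError("this sketch only handles proper priors (df > 0, scale > 0)")
    # A chi-square(df) variate is a Gamma variate with shape df/2 and scale 2.
    chi2 = rng.gammavariate(df / 2.0, 2.0)
    return df * scale / chi2

rng = random.Random(1)
# Illustrative values only; the settings actually used are listed at the end of Appendix A.
draws = [sample_scaled_inv_chi2(10.0, 1.0, rng) for _ in range(20000)]
sample_mean = sum(draws) / len(draws)  # theoretical mean is df * scale / (df - 2) = 1.25
```

Small degrees of freedom give a very heavy right tail, so a prior of this kind allows occasional large marker-effect variances.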

We assumed that the number of 1s in \( \varvec{\gamma}, \) i.e., the number of QTLs included in the model (\( N_{Q} \)), follows a truncated Poisson prior distribution. That is,

$$ N_{Q} = \sum\limits_{k = 1}^{K} {\gamma_{k} } $$

and

$$ p(N_{Q} = n) = \left\{ {\begin{array}{ll} {\dfrac{{\lambda^{n} \exp ( - \lambda )/n!}}{{\sum\nolimits_{i = 1}^{{Q_{\max } }} {\lambda^{i} \exp ( - \lambda )/i!} }}} & {{\text{if}}\;n \le Q_{\max } } \\ 0 & {{\text{if}}\;n > Q_{\max } } \\ \end{array} } \right., $$

where λ is a hyperparameter of the distribution, interpreted as the expected number of QTLs included in the model, and \( Q_{\max } \) is a hyperparameter that sets the upper limit on the number of QTLs that can be included in the model. The Poisson prior on the number of 1s in \( \varvec{\gamma} \) was proposed by Yi (2004).
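The prior can be evaluated numerically as in the sketch below. It follows the formula as printed, with the normalizing sum running over i = 1, …, Q max. The appendix does not spell out how n = 0 is treated, so rejecting n < 1 here is an assumption of this example, as is the function name `truncated_poisson_pmf`.

```python
import math

def truncated_poisson_pmf(n, lam, q_max):
    """p(N_Q = n) for the truncated Poisson prior, normalized over i = 1..q_max."""
    if n < 1:
        # Assumption of this sketch: only n >= 1 is covered by the normalizing sum.
        raise ValueError("n must be at least 1 in this sketch")
    if n > q_max:
        return 0.0
    weight = lambda i: lam ** i * math.exp(-lam) / math.factorial(i)
    c = sum(weight(i) for i in range(1, q_max + 1))
    return weight(n) / c

# Settings reported at the end of Appendix A: lambda = 1, Q_max = 15.
probs = [truncated_poisson_pmf(n, 1.0, 15) for n in range(1, 16)]
```

With λ = 1 the prior mass falls off quickly with n, so models with only a few QTLs are favoured a priori.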

The prior of the cut-points in the ordinal probit model in Eq. 2, i.e., \( \kappa_{m} \) (m = 2, 3, …, M − 1), is a uniform distribution that respects the ordering constraints. That is,

$$ \kappa_{m} |\kappa_{m - 1}, \kappa_{m + 1} \sim U[\kappa_{m - 1} ,\kappa_{m + 1} ]. $$

Bayesian implementations of the tobit model are based on data augmentation, and they have been applied to statistical genetic analysis of censored traits (Sorensen et al. 1998; Sillanpää and Hoti 2007). In the tobit model in Eq. 1, the value of \( y_{i}^{*} \) is not observed if it is greater than or equal to the threshold \( y_{\text{T}} \), although these values are necessary for estimating the parameters of the model in Eq. 3. The unobserved \( y_{i}^{*} \) values are augmented by values sampled from the fully conditional posterior distribution:

$$ y_{i}^{*} |\varvec{\alpha},\varvec{\eta},\sigma_{e}^{2} \sim TN_{{[y_{\text{T}},\infty )}} ({\mathbf{q}}_{i}\varvec{\alpha}+ {\mathbf{x}}_{i}\varvec{\eta},\sigma_{e}^{2} ) $$
(A1)

where \( TN_{[a,b)} (\mu ,\sigma^{2} ) \) is a normal distribution \( N(\mu ,\sigma^{2} ) \) truncated to [a, b), \( {\mathbf{q}}_{i} \) is the ith row of matrix Q, and \( {\mathbf{x}}_{i} \) is the ith row of matrix X. The unobserved \( y_{i}^{*} \) values were resampled at every iteration of the MCMC sampling procedure, as described in the next section.
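One simple way to draw from such a truncated normal is the inverse-CDF method, sketched below with Python's standard library. This is an illustration only and is not necessarily the sampler used in the study; the function name `sample_truncated_normal` and the numbers (mean 100, standard deviation 10, threshold 110) are hypothetical.

```python
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for its cdf and inverse cdf

def sample_truncated_normal(mean, sd, lower, upper, rng):
    """Inverse-CDF draw from N(mean, sd^2) restricted to [lower, upper]."""
    p_lo = STD_NORMAL.cdf((lower - mean) / sd)
    p_hi = STD_NORMAL.cdf((upper - mean) / sd)
    u = p_lo + (p_hi - p_lo) * rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)   # inv_cdf requires 0 < u < 1
    x = mean + sd * STD_NORMAL.inv_cdf(u)
    return min(max(x, lower), upper)      # guard against round-off in the far tails

rng = random.Random(7)
y_T = 110.0  # hypothetical censoring threshold
# Censored record: the latent value is only known to lie in [y_T, infinity).
draws = [sample_truncated_normal(100.0, 10.0, y_T, float("inf"), rng) for _ in range(20000)]
```

The inverse-CDF method loses accuracy when the truncation region lies many standard deviations from the mean; a rejection sampler designed for normal tails would be a more robust choice in that case.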

Bayesian implementations of the ordinal probit model based on data augmentation were presented by Albert and Chib (1993), and they have been applied to multi-locus QTL mapping of ordinal traits (Yi et al. 2004, 2007). In this model, \( y_{i}^{*} \) in Eq. 3 is not observed for any sample, and the variance of \( y_{i}^{*} \) is assumed to be 1 for consistency with the standard normal c.d.f. link function (Cowles 1996). Thus, the \( y_{i}^{*} \) values are augmented by values sampled from the fully conditional posterior distribution:

$$ y_{i}^{*} |\varvec{\alpha},\varvec{\eta},\varvec{\kappa},y_{i} \sim TN_{{(\kappa_{{y_{i} - 1}} ,\kappa_{{y_{i} }} ]}} ({\mathbf{q}}_{i}\varvec{\alpha}+ {\mathbf{x}}_{i}\varvec{\eta},1), $$
(A2)

where \( \varvec{\kappa} \) is the vector of bin cut-points, i.e., \( \left( {\kappa_{0} ,\kappa_{1} , \ldots ,\kappa_{M} } \right)^{T} . \) The fully conditional posterior distribution of \( \kappa_{m} \) is uniform, as follows:

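$$ \kappa_{m} |{\mathbf{y}},{\mathbf{y}}^{*} ,\kappa_{m - 1} ,\kappa_{m + 1} \sim U\left[ {\max \left\{ {\mathop {\max }\limits_{{i:\,y_{i} = m}} y_{i}^{*} ,\;\kappa_{m - 1} } \right\},\;\min \left\{ {\mathop {\min }\limits_{{i:\,y_{i} = m + 1}} y_{i}^{*} ,\;\kappa_{m + 1} } \right\}} \right]. $$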
(A3)
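The cut-point update in Eq. A3 can be sketched as follows. Each free cut-point is drawn uniformly between the largest latent value in the category below it (or the previous cut-point) and the smallest latent value in the category above it (or the next cut-point). The conventions that categories are coded 1, …, M, that κ 0 = −∞ and κ M = ∞, and that κ 1 is held fixed at 0 are assumptions of this example, as are the function name and the toy data. Note that this is the plain Gibbs update, not the Hastings-within-Gibbs variant of Cowles (1996).

```python
import random

def update_cutpoints(y, y_star, kappa, rng):
    """One Gibbs pass over the free cut-points kappa[2..M-1] (Eq. A3).

    Assumes every category has at least one observation, so all bounds are finite.
    """
    M = len(kappa) - 1
    kappa = list(kappa)
    for m in range(2, M):
        # Lower bound: largest latent value in category m, or the previous cut-point.
        lo = max([s for s, c in zip(y_star, y) if c == m] + [kappa[m - 1]])
        # Upper bound: smallest latent value in category m + 1, or the next cut-point.
        hi = min([s for s, c in zip(y_star, y) if c == m + 1] + [kappa[m + 1]])
        kappa[m] = rng.uniform(lo, hi)
    return kappa

INF = float("inf")
y = [1, 2, 2, 3, 3, 4]                      # observed categories (toy data)
y_star = [-0.5, 0.3, 0.8, 1.2, 1.7, 2.5]    # current latent values
kappa = [-INF, 0.0, 1.0, 2.0, INF]          # kappa_0 .. kappa_4
new_kappa = update_cutpoints(y, y_star, kappa, random.Random(3))
```

Because each draw is confined between neighbouring latent values, the cut-points can move only in small steps when many samples are present, which is one motivation for the accelerated scheme mentioned in Appendix B.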

Here, let \( {\mathbf{X}}^{*} = [\gamma_{1} {\mathbf{X}}_{1} ,\gamma_{2} {\mathbf{X}}_{2} , \ldots ,\gamma_{K} {\mathbf{X}}_{K} ] \). Then, Eq. 4 can be rewritten as:

$$ {\mathbf{y}}^{*} = {\mathbf{Q}}\varvec{\alpha}+ {\mathbf{X}}^{*}\varvec{\beta}+ {\mathbf{e}}, $$

where \( \varvec{\beta}= \left[ {\varvec{\beta}_{1}^{T} ,\varvec{\beta}_{2}^{T} , \ldots ,\varvec{\beta}_{K}^{T} } \right]^{T}.\) Here, let

$$ {\mathbf{Q}}\varvec{\alpha}+ {\mathbf{X}}^{*}\varvec{\beta}= {\mathbf{W}}\varvec{\theta}, $$

where

$$ {\mathbf{W}} = [{\mathbf{Q}},{\mathbf{X}}^{*} ] $$
(A4)

and \( \varvec{\theta}= \left[ {\varvec{\alpha}^{T} ,\varvec{\beta}^{T} } \right]^{T}, \) and let

$$ \varvec{\Sigma} = \left[ {\begin{array}{*{20}c} {\mathbf{0}} & {\mathbf{0}} & \cdots & {\mathbf{0}} \\ {\mathbf{0}} & {{\mathbf{I}}\sigma_{e}^{2} /\sigma_{{\beta_{1} }}^{2} } & {} & \vdots \\ \vdots & {} & \ddots & {\mathbf{0}} \\ {\mathbf{0}} & \cdots & {\mathbf{0}} & {{\mathbf{I}}\sigma_{e}^{2} /\sigma_{{\beta_{K} }}^{2} } \\ \end{array} } \right], $$
(A5)

$$ {\mathbf{C}} = {\mathbf{W}}^{T} {\mathbf{W}} +\varvec{\Sigma}, $$
(A6)

and

$$ {\mathbf{r}} = {\mathbf{W}}^{T} {\mathbf{y}}^{*} . $$
(A7)

Then, the conditional posterior distribution of the ith element of \( \varvec{\theta} \) is:

$$ \theta_{i} |\varvec{\theta}_{ - i} ,\varvec{\gamma} ,\sigma_{e}^{2} ,{\mathbf{y}}^{*} \sim N\left( {\tilde{\theta }_{i} ,\sigma_{e}^{2} /c_{i,i} } \right), $$
(A8)

where \( \varvec{\gamma} \) is a vector whose kth element is \( \gamma_{k} \), \( \tilde{\theta }_{i} = (r_{i} - {\mathbf{C}}_{i, - i}\varvec{\theta}_{ - i} )/c_{i,i} \), \( c_{i,i} \) is the ith diagonal element of the matrix C, \( r_{i} \) is the ith element of vector r, \( {\mathbf{C}}_{i, - i} \) is a row vector obtained by deleting element i from the ith row of the matrix C, and \( \varvec{\theta}_{ - i} \) is a vector obtained by deleting element i from the vector \( \varvec{\theta} \) (Sorensen and Gianola 2002, p. 566).
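The single-site update in Eq. A8 can be sketched in a few lines. Each element of θ is redrawn in turn from a normal distribution whose mean solves the ith row of Cθ = r given the other elements and whose variance is the residual variance divided by the ith diagonal element of C. The function name `gibbs_sweep_theta` and the small 2 × 2 system are hypothetical and serve only to show the arithmetic.

```python
import random

def gibbs_sweep_theta(C, r, theta, sigma2_e, rng):
    """One pass of single-site Gibbs updates over all elements of theta (Eq. A8)."""
    theta = list(theta)
    p = len(theta)
    for i in range(p):
        # C_{i,-i} theta_{-i}: contribution of all other elements to row i.
        off_diag = sum(C[i][j] * theta[j] for j in range(p) if j != i)
        mean = (r[i] - off_diag) / C[i][i]
        sd = (sigma2_e / C[i][i]) ** 0.5
        theta[i] = rng.gauss(mean, sd)
    return theta

rng = random.Random(11)
C = [[4.0, 1.0], [1.0, 3.0]]   # stands in for W^T W + Sigma
r = [1.0, 2.0]                 # stands in for W^T y*
theta = gibbs_sweep_theta(C, r, [0.0, 0.0], 1.0, rng)
```

If the residual variance is shrunk towards zero, repeated sweeps reduce to Gauss-Seidel iterations for Cθ = r, which is a convenient check of an implementation. For a marker with γ k = 0 the corresponding columns of X* are zero, but the diagonal of Σ keeps the diagonal elements of C positive, so the update remains well defined.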

The fully conditional posterior distribution of \( \sigma_{{\beta_{k} }}^{2} \) is given by

$$ \sigma_{{\beta_{k} }}^{2} |\varvec{\beta}_{k} \sim \tilde{\upsilon }_{{\beta_{k} }} \tilde{s}_{{\beta_{k} }}^{2} \tilde{\chi }_{{\tilde{\upsilon }_{{\beta_{k} }} }}^{ - 2} , $$
(A9)

where \( \varvec{\beta}_{k} \) is a column vector for the genetic effects of marker k, i.e., \( \left( {\beta_{k2} , \ldots ,\beta_{{kL_{k} }} } \right)^{T} ,\quad \tilde{\upsilon }_{{\beta_{k} }} = L_{k} + \upsilon_{\beta } - 1, \) and \( \tilde{s}_{{\beta_{k} }}^{2} = \left[ {\varvec{\beta}_{k}^{T}\varvec{\beta}_{k} + \upsilon_{\beta } s_{\beta }^{2} } \right]/\tilde{\upsilon }_{{\beta_{k} }} . \)

For the tobit model, the fully conditional posterior distribution of \( \sigma_{e}^{2} \) is given by:

$$ \sigma_{e}^{2} |\varvec{\theta},\varvec{\gamma},{\mathbf{y}}^{*} \sim \tilde{\upsilon }_{e} \tilde{s}_{e}^{2} \tilde{\chi }_{{\tilde{\upsilon }_{e} }}^{ - 2} , $$
(A10)

where \( \tilde{\upsilon }_{e} = n + \upsilon_{e} \) and \( \tilde{s}_{e}^{2} = [({\mathbf{y}}^{*} - {\mathbf{W}}\varvec{\theta})^{T} ({\mathbf{y}}^{*} - {\mathbf{W}}\varvec{\theta}) + \upsilon_{e} s_{e}^{2} ]/\tilde{\upsilon }_{e} . \) For the ordinal probit model, \( \sigma_{e}^{2} \) is fixed at 1 (see Eq. A2).
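Both variance updates (Eqs. A9 and A10) have the same form: the posterior degrees of freedom are the number of terms in the sum of squares plus the prior degrees of freedom, and the draw is the augmented sum of squares divided by a chi-square variate. The sketch below makes this explicit. The helper name `draw_variance` and the toy inputs are assumptions of this example, while the hyperparameter values are those reported at the end of Appendix A.

```python
import random

def draw_variance(sum_sq, n_terms, prior_df, prior_scale, rng):
    """Draw from post_df * s_tilde^2 * inverse-chi-square(post_df)."""
    post_df = n_terms + prior_df
    post_ss = sum_sq + prior_df * prior_scale   # equals post_df * s_tilde^2
    # chi-square(post_df) is Gamma(shape=post_df/2, scale=2)
    return post_ss / rng.gammavariate(post_df / 2.0, 2.0)

rng = random.Random(2)

# Eq. A9: marker k with L_k = 3 alleles has L_k - 1 = 2 effects (toy values).
beta_k = [0.5, -0.5]
sigma2_beta_k = draw_variance(sum(b * b for b in beta_k), len(beta_k), 2.0, 1.0, rng)

# Eq. A10: n = 2000 toy residuals, each of size 2, with upsilon_e = -2 and s_e^2 = 0.
residuals = [2.0, -2.0] * 1000
sigma2_e_draws = [
    draw_variance(sum(e * e for e in residuals), len(residuals), -2.0, 0.0, rng)
    for _ in range(200)
]
```

With many observations the draws of the residual variance concentrate near the mean squared residual (about 4 in this toy case), whereas the draw for a single marker with only two effects remains highly variable.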

The fully conditional posterior distribution of γ k is given by:

$$ \gamma_{k} |\varvec{\alpha},\varvec{\beta},\varvec{\gamma}_{ - k} ,\sigma_{e}^{2} ,{\mathbf{y}}^{*} \sim B(1, \tilde{p}_{k} ), $$
(A11)

where \( \varvec{\gamma}_{ - k} \) is a vector obtained by removing element k from the vector \( \varvec{\gamma}, \)

$$ \begin{aligned} \tilde{p}_{k} & = \left\{ {\begin{array}{ll} {a_{k} /(a_{k} + b_{k} )} & {{\text{if}}\;g_{k} < Q_{\max } } \\ 0 & {{\text{if}}\;g_{k} \ge Q_{\max } } \\ \end{array} } \right., \\ a_{k} & = \frac{{\lambda^{{g_{k} + 1}} }}{{c\,(g_{k} + 1)!}}\exp \left\{ { - \frac{1}{{2\sigma_{e}^{2} }}({\mathbf{y}}^{*} - {\mathbf{Q}}\varvec{\alpha}- {\mathbf{X}}\varvec{\eta}_{k}^{*} )^{T} ({\mathbf{y}}^{*} - {\mathbf{Q}}\varvec{\alpha}- {\mathbf{X}}\varvec{\eta}_{k}^{*} ) - \lambda } \right\},\quad{\text{and}} \\ b_{k} & = \frac{{\lambda^{{g_{k} }} }}{{c\,g_{k} !}}\exp \left\{ { - \frac{1}{{2\sigma_{e}^{2} }}({\mathbf{y}}^{*} - {\mathbf{Q}}\varvec{\alpha}- {\mathbf{X}}\varvec{\eta}_{k}^{**} )^{T} ({\mathbf{y}}^{*} - {\mathbf{Q}}\varvec{\alpha}- {\mathbf{X}}\varvec{\eta}_{k}^{**} ) - \lambda } \right\}, \\ \end{aligned} $$

where

$$ \begin{aligned} g_{k} & = \sum\limits_{i = 1}^{K} {\gamma_{i} } - \gamma_{k} \quad {\text{and}} \\ c & = \sum\limits_{i = 1}^{{Q_{\max } }} {\frac{{\lambda^{i} \exp ( - \lambda )}}{i!}} . \\ \end{aligned} $$

The vector \( \varvec{\eta}_{k}^{*} \) is the column vector of \( \varvec{\eta} \) with the entries corresponding to marker k replaced by \( \varvec{\beta}_{k} \). Similarly, \( \varvec{\eta}_{k}^{**} \) is obtained from \( \varvec{\eta} \) with the entries corresponding to marker k replaced by the null vector (0).
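In practice, the exponentials in a k and b k can underflow, so the inclusion probability is better computed from the log of their ratio. In the sketch below the prior weights are taken directly from the truncated Poisson prior, p(N Q = n) ∝ λ^n / n!, so the factors exp(−λ) and c cancel. The function name `inclusion_probability` and its arguments (the two residual sums of squares with marker k included and excluded) are assumptions made for this example.

```python
import math

def inclusion_probability(rss_in, rss_out, g_k, lam, q_max, sigma2_e):
    """p_k = a_k / (a_k + b_k), computed from log(a_k / b_k)."""
    if g_k >= q_max:
        return 0.0  # the model already holds the maximum number of QTLs
    # Prior ratio lam^(g+1)/(g+1)! over lam^g/g! equals lam / (g + 1).
    log_ratio = (math.log(lam) - math.log(g_k + 1)
                 - (rss_in - rss_out) / (2.0 * sigma2_e))
    # Numerically stable logistic transform of log_ratio.
    if log_ratio >= 0:
        return 1.0 / (1.0 + math.exp(-log_ratio))
    z = math.exp(log_ratio)
    return z / (1.0 + z)

# lambda = 1 and Q_max = 15 as in the text; the sums of squares are toy numbers.
p_neutral = inclusion_probability(50.0, 50.0, 0, 1.0, 15, 1.0)   # marker explains nothing
p_crowded = inclusion_probability(50.0, 50.0, 1, 1.0, 15, 1.0)   # one QTL already in the model
p_strong = inclusion_probability(30.0, 50.0, 1, 1.0, 15, 1.0)    # marker reduces the RSS by 20
```

With λ = 1, a marker that does not change the fit is included with probability 1/2 when the model is empty and 1/3 when one other QTL is already included, which shows how the prior penalizes larger models.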

In the analyses, we set the hyperparameters of the prior distributions to \( \upsilon_{\beta } = 2,s_{\beta }^{2} = 1,\upsilon_{e} = - 2,s_{e}^{2} = 0,\lambda = 1, \) and \( Q_{\max } = 15. \) With this setting, the prior of \( \sigma_{e}^{2} \) becomes flat (i.e., an improper uniform distribution). Thus, the parameters \( \varvec{\alpha}, \) \( \sigma_{e}^{2} \) and \( \varvec{\kappa} \) had improper priors in this study. When improper priors are assigned, the posterior distributions are not guaranteed to be proper (Hobert and Casella 1996). One way to avoid an improper posterior distribution caused by an improper prior is to specify upper and/or lower limits for the parameters. In this study, however, we did not specify such limits for \( \varvec{\alpha},\sigma_{e}^{2} \) and \( \varvec{\kappa}, \) because the improper priors of these parameters seemed to work well in the simulation studies.
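The flatness of the prior of \( \sigma_{e}^{2} \) can be checked by substituting these values into the kernel of the scaled inverted chi-square density. The kernel below is the standard textbook form, stated here as background rather than quoted from the paper:

$$ p(\sigma_{e}^{2} ) \propto (\sigma_{e}^{2} )^{{ - (\upsilon_{e} /2 + 1)}} \exp \left( { - \frac{{\upsilon_{e} s_{e}^{2} }}{{2\sigma_{e}^{2} }}} \right)\quad \Rightarrow \quad p(\sigma_{e}^{2} ) \propto (\sigma_{e}^{2} )^{0} \exp (0) = 1\quad {\text{for}}\;\upsilon_{e} = - 2,\;s_{e}^{2} = 0. $$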

Appendix B: MCMC sampling

On the basis of the above prior and posterior distributions, we can use the Gibbs sampler to generate MCMC samples from the posterior distribution of the model parameters. For sampling \( y_{i}^{*} \) and \( \kappa_{m} \) in the model in Eq. 2, however, we used the multivariate Hastings-within-Gibbs algorithm proposed by Cowles (1996), since it substantially improves the convergence of the MCMC estimation. Setting the initial values of the parameters to \( \sigma_{e}^{2} = 1, \) \( \sigma_{{\beta_{k} }}^{2} = 1, \) \( \alpha_{j} = 0, \) \( \beta_{kl} = 0, \) and \( \gamma_{k} = 0, \) the MCMC sampling proceeds as follows:

  1. Update W and Σ with Eqs. A4 and A5, respectively.

  2. Sample \( \varvec{\kappa} \) from the full conditional posterior distribution described in Eq. A3. (This step is only necessary for ordinal data.)

  3. Sample \( {\mathbf{y}}^{*} \). The full conditional posterior distribution of \( {\mathbf{y}}^{*} \) is described in Eq. A1 for censored data and Eq. A2 for ordinal data.

  4. Update C and r with Eqs. A6 and A7, respectively.

  5. Sample \( \varvec{\alpha} \) and \( \varvec{\beta} \) (i.e., \( \varvec{\theta} \)) from the full conditional posterior distribution described in Eq. A8.

  6. Sample \( \sigma_{{\beta_{k} }}^{2} \) for each k from the full conditional posterior distribution described in Eq. A9.

  7. Sample \( \sigma_{e}^{2} \) from the full conditional posterior distribution described in Eq. A10. (This step is not necessary for ordinal data.)

  8. Sample \( \varvec{\gamma} \) from the full conditional posterior distribution described in Eq. A11.

The above process was repeated many times (see “Data analysis procedure” in “Materials and methods”) to obtain the MCMC samples.
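To show how the steps fit together, the following is a deliberately reduced sketch of the loop for censored data. It is not the authors' implementation. It covers only steps 3, 4, 5, and 7, treats every column of W as having a flat prior (so Σ = 0 and no marker variances or indicators γ are sampled), and runs on simulated data. All names, the data-generating values, and the chain length are assumptions made for this example.

```python
import random
from statistics import NormalDist

STD = NormalDist()

def draw_left_truncated(mean, sd, lower, rng):
    """Inverse-CDF draw from N(mean, sd^2) truncated to [lower, infinity)."""
    a = STD.cdf((lower - mean) / sd)
    u = min(max(a + (1.0 - a) * rng.random(), 1e-12), 1.0 - 1e-12)
    return max(lower, mean + sd * STD.inv_cdf(u))

def tobit_gibbs(y, censored, W, y_T, n_iter, rng):
    n, p = len(y), len(W[0])
    theta, sigma2 = [0.0] * p, 1.0   # initial values: effects 0, residual variance 1
    y_star = list(y)
    # C = W^T W; Sigma is zero here because all effects get flat priors in this sketch.
    C = [[sum(W[i][a] * W[i][b] for i in range(n)) for b in range(p)] for a in range(p)]
    trace = []
    for _ in range(n_iter):
        sd = sigma2 ** 0.5
        for i in range(n):           # step 3: augment censored records (Eq. A1)
            if censored[i]:
                mu = sum(W[i][j] * theta[j] for j in range(p))
                y_star[i] = draw_left_truncated(mu, sd, y_T, rng)
        # step 4: r = W^T y*
        r = [sum(W[i][j] * y_star[i] for i in range(n)) for j in range(p)]
        for j in range(p):           # step 5: single-site updates (Eq. A8)
            off = sum(C[j][k] * theta[k] for k in range(p) if k != j)
            theta[j] = rng.gauss((r[j] - off) / C[j][j], (sigma2 / C[j][j]) ** 0.5)
        # step 7: residual variance (Eq. A10) with upsilon_e = -2 and s_e^2 = 0
        rss = sum((y_star[i] - sum(W[i][j] * theta[j] for j in range(p))) ** 2
                  for i in range(n))
        sigma2 = rss / rng.gammavariate((n - 2) / 2.0, 2.0)
        trace.append((list(theta), sigma2))
    return trace

# Simulated example: intercept 100, effect 10, residual s.d. 5, censoring at 112.
rng = random.Random(42)
n, y_T = 200, 112.0
W = [[1.0, float(i % 2)] for i in range(n)]
latent = [100.0 + 10.0 * W[i][1] + rng.gauss(0.0, 5.0) for i in range(n)]
censored = [v >= y_T for v in latent]
y = [min(v, y_T) for v in latent]
trace = tobit_gibbs(y, censored, W, y_T, 600, rng)
kept = trace[100:]                   # discard the first 100 iterations as burn-in
slope_mean = sum(t[0][1] for t in kept) / len(kept)
```

A complete implementation would add the cut-point step for ordinal data, the marker-specific variances (step 6), and the indicator update (step 8), and would rebuild W and Σ at the start of every iteration as in step 1.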

About this article

Cite this article

Iwata, H., Ebana, K., Fukuoka, S. et al. Bayesian multilocus association mapping on ordinal and censored traits and its application to the analysis of genetic variation among Oryza sativa L. germplasms. Theor Appl Genet 118, 865–880 (2009). https://doi.org/10.1007/s00122-008-0945-6
