Combined Linkage and Association Mapping of Quantitative Trait Loci with Missing Completely at Random Genotype Data

  • Original Research
  • Behavior Genetics

Abstract

In genetic studies, genotypes or phenotypes can be missing for various reasons. In this paper, the impact of missing genotypes is investigated for high resolution combined linkage and association mapping of quantitative trait loci (QTL). We assume that the genotype data are missing completely at random (MCAR). Two regression models, the “genotype effect model” and the “additive effect model”, are proposed to model the association between the markers and the trait locus. If the marker genotype is not missing, the models are exactly the same as those of our previous study, i.e., in the single-marker case the number of copies of the genotype or allele is used as the weight for the effect of that genotype or allele. If the marker genotype is missing, the expected number of copies of the genotype or allele is used as the weight instead. By analytical formulae, we show that the “genotype effect model” can be used to model the additive and dominance effects simultaneously, whereas the “additive effect model” can only be used to model the additive effect. Based on the two models, F-test statistics are proposed to test association between the QTL and markers. Non-centrality parameter approximations of the F-test statistics are derived to calculate and compare power; they show that the power of the F-tests is reduced by the missingness. By simulation study, we show that the two models have reasonable type I error rates for a dataset of moderate sample size. However, the type I error rates can be very slightly inflated if all individuals with missing genotypes are removed from the analysis. Hence, the proposed method can help to maintain correct type I error rates, although it does not improve power. As a practical example, the method is applied to analyze the angiotensin-1 converting enzyme (ACE) data.

References

  • Abecasis GR, Cardon LR, Cookson WOC (2000a) A general test of association for quantitative traits in nuclear families. Am J Hum Genet 66:279–292

  • Abecasis GR, Cherny SS, Cookson WOC, Cardon LR (2002) Merlin—rapid analysis of dense genetic maps using sparse gene flow trees. Nat Genet 30:97–101

  • Abecasis GR, Cookson WOC, Cardon LR (2000b) Pedigree tests of linkage disequilibrium. Eur J Hum Genet 8:545–551

  • Allison DB (2001) Joint tests of linkage and association for quantitative traits. Theor Popul Biol 60:239–251

  • Almasy L, Blangero J (1998) Multipoint quantitative trait linkage analysis in general pedigrees. Am J Hum Genet 62:1198–1211

  • Almasy L, Williams JT, Dyer TD, Blangero J (1999) Quantitative trait locus detection using combined linkage/disequilibrium analysis. Genet Epidemiol 17(Suppl 1):S31–S36

  • Amos CI (1994) Robust variance-components approach for assessing linkage in pedigrees. Am J Hum Genet 54:534–543

  • Amos CI, Elston RC (1989) Robust methods for the detection of genetic linkage for quantitative data from pedigrees. Genet Epidemiol 6:349–360

  • Boerwinkle E, Chakraborty R, Sing CF (1986) The use of measured genotype information in the analysis of quantitative phenotype in man. I. Models and analytical methods. Ann Hum Genet 50:181–194

  • Falconer DS, Mackay TFC (1996) Introduction to quantitative genetics, 4th edn. Longman, London

  • Fan RZ, Jung JS (2003) High resolution joint linkage disequilibrium and linkage mapping of quantitative trait loci based on sibship data. Hum Hered 56:166–187

  • Fan RZ, Jung JS, Jin L (2006) High resolution association mapping of quantitative trait loci, a population based approach. Genetics 172:663–686

  • Fan RZ, Spinka C, Jin L, Jung JS (2005) Pedigree linkage disequilibrium mapping of quantitative trait loci. Eur J Hum Genet 13:216–231

  • Fan RZ, Xiong MM (2003) Combined high resolution linkage and association mapping of quantitative trait loci. Eur J Hum Genet 11:125–137

  • Farrall M, Keavney B, McKenzie CA, Delèpine M, Matsuda F, Lathrop GM (1999) Fine mapping of an ancestral recombination break-point in DCP1. Nat Genet 23:270–271

  • Feingold E (2002) Invited editorial: regression-based quantitative-trait-locus mapping in the 21st century. Am J Hum Genet 71:217–222

  • Fulker DW, Cherny SS, Cardon LR (1995) Multiple interval mapping of quantitative trait loci, using sib-pairs. Am J Hum Genet 56:1224–1233

  • Fulker DW, Cherny SS, Sham PC, Hewitt JK (1999) Combined linkage and association sib-pair analysis for quantitative traits. Am J Hum Genet 64:259–267

  • George V, Tiwari HK, Zhu XF, Elston RC (1999) A test of transmission/disequilibrium for quantitative traits in pedigree data, by multiple regression. Am J Hum Genet 65:236–245

  • Goldgar DE (1990) Multipoint analysis of human quantitative genetic variation. Am J Hum Genet 47:957–967

  • Graybill FA (1976) Theory and application of the linear model. Pacific Grove, California

  • Harville DA (1997) Matrix algebra from a statistician’s perspective. Springer

  • Haseman JK, Elston RC (1972) The investigation of linkage between a quantitative trait and a marker locus. Behav Genet 2:3–19

  • Hedrick PW (1987) Gametic disequilibrium measures: proceed with caution. Genetics 117:331–341

  • Jung JS, Fan RZ, Jin L (2005) Combined linkage and association mapping of quantitative trait loci by multiple markers. Genetics 170:881–898

  • Keavney B, McKenzie CA, Connell JM, Julier C, Ratcliffe PJ, Sobel E, Lathrop M, Farrall M (1998) Measured haplotype analysis of the angiotensin-1 converting enzyme gene. Hum Mol Genet 7:1745–1751

  • Lange K (2002) Mathematical and statistical methods for genetic analysis, 2nd edn. Springer

  • Li M, Boehnke M, Abecasis GR (2005) Joint modeling of linkage and association: identifying SNPs responsible for a linkage signal. Am J Hum Genet 76:934–949

  • Little RJA, Rubin DB (2002) Statistical analysis with missing data, 2nd edn. Wiley

  • Pinheiro JC, Bates DM (2000) Mixed-effects models in S and S-plus. Springer

  • Pratt SC, Daly M, Kruglyak L (2000) Exact multipoint quantitative-trait linkage analysis in pedigrees by variance components. Am J Hum Genet 66:1153–1157

  • Sham PC, Cherny SS, Purcell S, Hewitt JK (2000) Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. Am J Hum Genet 66:1616–1630

  • Wang T, Elston RC (2005) The bias introduced by population stratification in IBD based linkage analysis. Hum Hered 60:134–142

  • Xiong MM, Jin L (2000) Combined linkage and linkage disequilibrium mapping for genome screens. Genet Epidemiol 19:211–234

Acknowledgments

The research was supported by the National Science Foundation Grant DMS-0505025. We thank two anonymous reviewers for very detailed and thoughtful critiques, which improved the paper.

Author information

Corresponding author

Correspondence to Ruzong Fan.

Additional information

Edited by Pak Sham.

Appendices

Appendix A

Multiplying both sides of the “genotype effect model” (1) by \(1_{(G_{Aij}=A_gA_h)}\) and taking expectation lead to

$$ \begin{aligned} \hbox{E} (y_{ij} 1_{(G_{Aij}=A_gA_h)}) &=w_{ij} \gamma \hbox{E} [1_{(G_{Aij}=A_gA_h)}]+ \hbox{E} [1_{(G_{Aij}=A_gA_h)}] \beta_{gh} \\ &= \left\{ \begin{array}{ll}(1-\varepsilon_A)[w_{ij} \gamma +\beta_{gg} ] P_{A_g}^2 &\hbox{ if }g=h \\ (1-\varepsilon_A)[w_{ij} \gamma +\beta_{gh} ] \cdot 2 P_{A_g}P_{A_h} &\hbox{ if }g \neq h\end{array}\right..\end{aligned} $$
(17)

Let \(G_{Qij}\) be the genotype of the j-th individual of the i-th family at the trait locus Q. A true random effect model describing the trait value is \(y_{ij} = w_{ij}\gamma + g_{ij} H_{ij} + e_{ij},\) where

$$g_{ij} = \left\{ \begin{array}{ll}a & G_{Qij}=Q_1Q_1 \\ d & G_{Qij}=Q_1Q_2 \\ -a & G_{Qij}=Q_2Q_2 \end{array}\right..$$

Since the missing mechanism is missing completely at random, we have

$$ \begin{aligned} P(G_{Qij}=Q_1 Q_1, G_{Aij}=A_gA_g|G_{Aij} \neq ?) &= [P(Q_1A_g)]^2, \\ P(G_{Qij}=Q_1 Q_2, G_{Aij}=A_gA_g|G_{Aij} \neq ?) &= 2 P(Q_1A_g) P(Q_2A_g), \\ P(G_{Qij}=Q_2 Q_2, G_{Aij}=A_gA_g|G_{Aij} \neq ?) &= [P(Q_2A_g)]^2. \end{aligned} $$

Utilizing the relations \(P(Q_1A_g)=D_{A_gQ}+P_{A_g}q_1\) and \(P(Q_2A_g)=-D_{A_gQ}+P_{A_g}q_2,\) we have

$$ \begin{aligned} \hbox{E} (y_{ij} 1_{(G_{Aij}=A_gA_g)}) &=w_{ij} \gamma \hbox{E} [1_{(G_{Aij}=A_gA_g)}]+ \hbox{E} [ g_{ij} 1_{(G_{Aij}=A_gA_g)}] \\ &= w_{ij} \gamma P(A_g A_g | G_{Aij} \neq ?)P(G_{Aij} \neq ?) + \hbox{E} [ g_{ij} 1_{(G_{Aij}=A_gA_g)}| G_{Aij} \neq ?] P(G_{Aij} \neq ?) \\ &= (1-\varepsilon_A) \left[ w_{ij}\gamma P_{A_g}^2+ a [P(Q_1A_g)]^2 + d \cdot 2P(Q_1A_g)P(Q_2A_g)-a [P(Q_2A_g)]^2 \right] \\ &=(1-\varepsilon_A) \left[ w_{ij}\gamma P_{A_g}^2+ \mu P_{A_g}^2+ 2 D_{A_gQ} \alpha_Q P_{A_g}-\delta_Q D^2_{A_gQ} \right]. \end{aligned} $$
(18)

Equating Eqs. 17 and 18 yields Eq. 5 when g = h. Now assume that g ≠ h. Since the missing mechanism is missing completely at random, we have

$$ \begin{aligned} P(G_{Qij}=Q_1 Q_1, G_{Aij}=A_gA_h|G_{Aij} \neq ?) &= 2 P(Q_1A_g) P(Q_1A_h), \\ P(G_{Qij}=Q_1 Q_2, G_{Aij}=A_gA_h|G_{Aij} \neq ?) &= 2P(Q_1A_g) P(Q_2A_h) + 2P(Q_1A_h) P(Q_2A_g), \\ P(G_{Qij}=Q_2 Q_2, G_{Aij}=A_gA_h|G_{Aij} \neq ?) &= 2P(Q_2A_g) P(Q_2A_h). \end{aligned} $$

Utilizing relations \(P(Q_1A_g)=D_{A_gQ}+P_{A_g}q_1,\,\, P(Q_2A_g)=-D_{A_gQ}+P_{A_g}q_2,\,\, P(Q_1A_h)=D_{A_hQ}+P_{A_h}q_1,\,\, P(Q_2A_h)=-D_{A_hQ}+P_{A_h}q_2,\) we have

$$ \begin{aligned} \hbox{E} (y_{ij} 1_{(G_{Aij}=A_gA_h)}) &=w_{ij} \gamma \hbox{E} [1_{(G_{Aij}=A_gA_h)}]+ \hbox{E} [ g_{ij} 1_{(G_{Aij}=A_gA_h)}] \\ &= w_{ij} \gamma P(A_g A_h | G_{Aij} \neq ?)P(G_{Aij} \neq ?) + \hbox{E} [ g_{ij} 1_{(G_{Aij}=A_gA_h)}| G_{Aij} \neq ?] P(G_{Aij} \neq ?) \\ &= (1-\varepsilon_A) \left[ w_{ij}\gamma\,\cdot\,2P_{A_g} P_{A_h}+ 2a \left(P(Q_1A_g)P(Q_1A_h) - P(Q_2A_g)P(Q_2A_h) \right) \right.\\ &\left. + d \left(2P(Q_1A_g)P(Q_2A_h) + 2P(Q_2A_g)P(Q_1A_h) \right) \right] \\ &= (1-\varepsilon_A) \left[ 2P_{A_g} P_{A_h} w_{ij}\gamma+ 2P_{A_g} P_{A_h} \mu + 2 \alpha_Q \left(D_{A_gQ} P_{A_h}+ D_{A_hQ} P_{A_g} \right) -2 \delta_Q D_{A_gQ} D_{A_hQ} \right]. \end{aligned} $$
(19)

Equating Eqs. 17 and 19 yields Eq. 5 when g ≠ h.
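The last algebraic step in Eqs. 18 and 19 can be checked numerically. The following is a minimal sketch, not part of the original derivation: it draws random haplotype frequencies and compares the expression in terms of haplotype frequencies with the closed form in terms of \(\mu, \alpha_Q, \delta_Q.\) The definitions \(\mu = a(q_1-q_2)+2dq_1q_2,\) \(\alpha_Q = a + d(q_2-q_1)\) and \(\delta_Q = 2d\) are the usual quantitative-genetics ones and are assumed here, since this appendix does not restate them; the function name and arguments are hypothetical.

```python
import random

def check_appendix_a(m=3, a=1.0, d=0.4, eps=0.2, w_gamma=0.7, seed=1):
    """Largest gap between the last two lines of Eqs. 18 and 19 for random haplotype frequencies."""
    rng = random.Random(seed)
    # hap[0][g] = P(Q_1 A_g), hap[1][g] = P(Q_2 A_g); normalised to sum to 1.
    raw = [[rng.random() for _ in range(m)] for _ in range(2)]
    total = sum(map(sum, raw))
    hap = [[x / total for x in row] for row in raw]
    q1, q2 = sum(hap[0]), sum(hap[1])                  # trait allele frequencies
    P = [hap[0][g] + hap[1][g] for g in range(m)]      # marker allele frequencies
    D = [hap[0][g] - P[g] * q1 for g in range(m)]      # D_{A_g Q}
    # Assumed (standard) definitions, not restated in the appendix.
    mu = a * (q1 - q2) + 2 * d * q1 * q2
    alpha_q = a + d * (q2 - q1)
    delta_q = 2 * d
    worst = 0.0
    for g in range(m):
        for h in range(g, m):
            if g == h:   # Eq. 18
                direct = (w_gamma * P[g] ** 2 + a * hap[0][g] ** 2
                          + d * 2 * hap[0][g] * hap[1][g] - a * hap[1][g] ** 2)
                closed = (w_gamma * P[g] ** 2 + mu * P[g] ** 2
                          + 2 * D[g] * alpha_q * P[g] - delta_q * D[g] ** 2)
            else:        # Eq. 19
                direct = (w_gamma * 2 * P[g] * P[h]
                          + 2 * a * (hap[0][g] * hap[0][h] - hap[1][g] * hap[1][h])
                          + d * (2 * hap[0][g] * hap[1][h] + 2 * hap[1][g] * hap[0][h]))
                closed = (2 * P[g] * P[h] * (w_gamma + mu)
                          + 2 * alpha_q * (D[g] * P[h] + D[h] * P[g])
                          - 2 * delta_q * D[g] * D[h])
            worst = max(worst, abs((1 - eps) * (direct - closed)))
    return worst
```

A gap at the level of floating-point rounding indicates that the two forms agree; the check relies on \(P(Q_1A_g)=D_{A_gQ}+P_{A_g}q_1\) and \(P(Q_2A_g)=-D_{A_gQ}+P_{A_g}q_2.\)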

Appendix B

In relations (17), replacing \(\beta_{gh}\) by \(\alpha_g + \alpha_h\) and taking summation lead to

$$ \begin{aligned} \hbox{E} (y_{ij}1_{(G_{Aij} \neq ?)}) &= \sum_{1 \le g \le h \le m} \hbox{E} (y_{ij} 1_{(G_{Aij}=A_gA_h)}) \\ &= (1-\varepsilon_A)\sum_{g=1}^m \sum_{h=1}^m \left(w_{ij} \gamma +\alpha_g+\alpha_h \right) P_{A_g}P_{A_h}\\ &= (1-\varepsilon_A) \left(w_{ij} \gamma + 2\sum_{g=1}^m \alpha_gP_{A_g} \right).\end{aligned} $$

Since the missing mechanism is missing completely at random, one has \(\hbox{E} (y_{ij}1_{(G_{Aij} \neq ?)}) = \hbox{E} (y_{ij} | G_{Aij} \neq ?)(1-\varepsilon_A) = (1-\varepsilon_A)\hbox{E}\,y_{ij} = (1-\varepsilon_A)(w_{ij}\gamma+\mu).\) Thus, \(\sum_{g=1}^m \alpha_gP_{A_g} = \mu/2.\)

Again, replacing \(\beta_{gh}\) by \(\alpha_g + \alpha_h\) in relations (17) and taking summation with respect to h lead to

$$ \begin{aligned} \hbox{E} \left[ y_{ij}1_{(G_{Aij} =A_gA_g)} + \frac{1}{2} \sum_{h \neq g} y_{ij}1_{(G_{Aij} =A_gA_h)} \right] &=(1-\varepsilon_A) \sum_{h=1}^m \left(w_{ij} \gamma +\alpha_g+\alpha_h \right) P_{A_g}P_{A_h} \\ &= (1-\varepsilon_A)P_{A_g} \left(w_{ij} \gamma + \alpha_g+ \sum_{h=1}^m \alpha_hP_{A_h} \right) \\ &= (1-\varepsilon_A)P_{A_g} \left(w_{ij} \gamma + \alpha_g+ \mu/2 \right). \end{aligned} $$
(20)

Notice \(\sum_{g=1}^m D_{A_g Q} =0.\) Adding relation (18) to one half of the sum of relations (19) over h ≠ g leads to

$$ \hbox{E} \left[ y_{ij}1_{(G_{Aij} =A_gA_g)} + \frac{1}{2} \sum_{h \neq g} y_{ij}1_{(G_{Aij} =A_gA_h)} \right] =(1-\varepsilon_A)P_{A_g} \left[ w_{ij} \gamma +\mu +D_{A_gQ} \alpha_Q /P_{A_g} \right]. $$
(21)

Equating the right-hand terms of relations (20) and (21) leads to (6).
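For readers following the algebra, the final step can be written out. Cancelling the common factor \((1-\varepsilon_A)P_{A_g}\) and the covariate term \(w_{ij}\gamma\) from the right-hand sides of (20) and (21) gives the display below. That this is the content of (6) is an assumption, since (6) is stated in the main text and not in this appendix.

$$ \alpha_g + \frac{\mu}{2} = \mu + \frac{D_{A_gQ}\,\alpha_Q}{P_{A_g}} \quad \Longrightarrow \quad \alpha_g = \frac{\mu}{2} + \frac{D_{A_gQ}\,\alpha_Q}{P_{A_g}}. $$

As a consistency check, multiplying by \(P_{A_g}\) and summing over g recovers \(\sum_{g=1}^m \alpha_gP_{A_g} = \mu/2,\) because \(\sum_{g=1}^m D_{A_g Q} =0.\)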

Appendix C

Assume that there are no covariates, and the dataset is a population sample. Then the model matrix of “genotype effect model” (1) is \(X_i=X_{Ai1}^{\tau}= (x_{Ai1}^{(11)},\ldots, x_{Ai1}^{(mm)}, x_{Ai1}^{(12)},\ldots, x_{Ai1}^{(1m)},\ldots,x_{Ai1}^{(m-1,m)}), i=1,\ldots, N.\) To show non-centrality parameter approximation (7), we first notice the following relation

$$ \hbox{E} [ X_1^{\tau} X_1] = (1- \varepsilon_A) \hbox{diag} (P_{A_1}^2, v^{\tau}) + \varepsilon_A \left(\begin{array}{c}P_{A_1}^2 \\ v \end{array}\right) (P_{A_1}^2, v^{\tau}), $$
(22)

where v is a column vector given by \( v^{\tau} =\left(P_{A_2}^2,\ldots, P_{A_m}^2, 2P_{A_1}P_{A_2},\ldots, 2P_{A_1}P_{A_m},\ldots,2P_{A_{m-1}}P_{A_m} \right).\) In addition, \(\hbox{diag} (P_{A_1}^2,v^{\tau})\) is a diagonal matrix whose diagonal elements are given by the elements of \((P_{A_1}^2,v^{\tau}).\) We may verify (22) by \(\hbox{E} [(x_{A11}^{(gh)})^2] = \hbox{E}\, 1_{(G_{A11}=A_gA_h)}+P(A_gA_h)^2\, \hbox{E}\, 1_{(G_{A11}=?)}=P(A_gA_h)(1-\varepsilon_A)+P(A_gA_h)^2\varepsilon_A,\) and for \((g,h) \neq (k,l), \ \hbox{E} [x_{A11}^{(gh)} x_{A11}^{(kl)} ] = P(A_gA_h)P(A_kA_l)\, \hbox{E}\, 1_{(G_{A11}=?)} =P(A_gA_h)P(A_kA_l) \varepsilon_A.\)

Let us denote by u the vector of elementwise reciprocals of v, i.e., \(u^{\tau}=\left(P_{A_2}^{-2},\ldots, P_{A_m}^{-2}, [2P_{A_1}P_{A_2}]^{-1},\ldots, [2P_{A_1}P_{A_m}]^{-1}, \cdots, [2P_{A_{m-1}}P_{A_m}]^{-1} \right).\) Applying the law of large numbers and the Sherman–Morrison formula for the inverse of a rank-one update, \((M+a b^{\tau})^{-1} = M^{-1}-(M^{-1} a) (b^{\tau} M^{-1})/ (1+ b^{\tau} M^{-1} a),\) we can calculate the following approximation

$$ \begin{aligned} T (X^{\tau} X)^{-1} T^{\tau} &\approx T \left[N \hbox{E} \left(X_1^{\tau} X_1 \right) \right]^{-1} T^{\tau} \\&= N^{-1} \cdot T \left[ (1- \varepsilon_A) \hbox{diag}(P_{A_1}^2,v^{\tau}) + \varepsilon_A\begin{pmatrix} P_{A_1}^2 \\ v\end{pmatrix}(P_{A_1}^2, v^{\tau}) \right]^{-1} T^{\tau}\\ &= [(1-\varepsilon_A)N]^{-1} \cdot T \left[ \hbox{diag} (P_{A_1}^{-2},u^{\tau})- \varepsilon_A \left(\begin{array}{c} 1\\ 1 \\ \vdots\\ 1 \end{array}\right)(1,1,\ldots,1) \right] T^{\tau}\\ &= [(1-\varepsilon_A)N]^{-1} \cdot T \hbox{diag} (P_{A_1}^{-2}, u^{\tau})T^{\tau}. \end{aligned} $$

Using the above relation, we may show the non-centrality parameter approximation (7) in the same way as in Appendix III of Fan et al. (2006).
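The inverse used in the approximation above can be confirmed without any matrix inversion, by multiplying the right-hand side of (22) by the claimed inverse \([\hbox{diag}(1/p) - \varepsilon_A 1 1^{\tau}]/(1-\varepsilon_A),\) where p denotes the vector of genotype frequencies \((P_{A_1}^2, v^{\tau})^{\tau}.\) The sketch below does this in plain Python; the helper names are hypothetical and Hardy-Weinberg genotype frequencies are assumed, as in the appendix.

```python
def genotype_freqs(allele_freqs):
    """Hardy-Weinberg genotype frequencies: homozygotes first, then heterozygotes (order of v)."""
    m = len(allele_freqs)
    homo = [p * p for p in allele_freqs]
    het = [2 * allele_freqs[g] * allele_freqs[h] for g in range(m) for h in range(g + 1, m)]
    return homo + het

def moment_matrix(p, eps):
    """Right-hand side of (22): (1 - eps) * diag(p) + eps * p p^T."""
    n = len(p)
    return [[(1 - eps) * p[i] * (i == j) + eps * p[i] * p[j] for j in range(n)]
            for i in range(n)]

def claimed_inverse(p, eps):
    """[diag(1/p) - eps * 1 1^T] / (1 - eps), obtained from the Sherman-Morrison formula."""
    n = len(p)
    return [[((1 / p[i]) * (i == j) - eps) / (1 - eps) for j in range(n)]
            for i in range(n)]

def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def max_identity_error(allele_freqs, eps):
    """Largest entrywise deviation of moment_matrix @ claimed_inverse from the identity."""
    p = genotype_freqs(allele_freqs)
    prod = matmul(moment_matrix(p, eps), claimed_inverse(p, eps))
    n = len(p)
    return max(abs(prod[i][j] - (i == j)) for i in range(n) for j in range(n))
```

The identity holds because the genotype frequencies sum to one. The rank-one term \(\varepsilon_A 1 1^{\tau}\) then drops out of \(T(\cdot)T^{\tau}\) whenever each row of T sums to zero, which is presumably why it is absent from the last line of the approximation.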

Appendix D

Assume that there are no covariates, and the dataset is a population sample. Then the model matrix of “additive effect model” (3) is \(X_i=Z_{Ai1}^{\tau}= (x_{Ai1}^{(1)},\ldots, x_{Ai1}^{(m)}), i=1,\ldots, N.\) To show non-centrality parameter approximation (8), we first notice the following relation

$$ \hbox{E} [ Z_{A11} Z_{A11}^{\tau}] = 2(1- \varepsilon_A) \left[ \hbox{diag} (P_{A_1}, \cdots, P_{A_m}) + \left(\begin{array}{c} P_{A_1} \\ \vdots \\ P_{A_m}\end{array}\right)(P_{A_1}, \cdots, P_{A_m}) \right] + 4 \varepsilon_A \left(\begin{array}{c}P_{A_1} \\ \vdots \\ P_{A_m}\end{array}\right)(P_{A_1}, \cdots, P_{A_m}), $$

which can be verified by \(\hbox{E} [(x_{A11}^{(g)})^2] = 4\hbox{E} 1_{(G_{A11}=A_gA_g)}+\sum_{h \neq g} \hbox{E} 1_{(G_{A11} =A_gA_h)} + 4P_{A_g}^2 \hbox{E} 1_{(G_{A11}=?)} =2(1-\varepsilon_A)P_{A_g} [1+ P_{A_g}] +4P_{A_g}^2\varepsilon_A,\) and for \(h \neq g, \ \hbox{E} [x_{A11}^{(g)} x_{A11}^{(h)} ]=(1- \varepsilon_A)\,\cdot\,2P_{A_g}P_{A_h} + 4P_{A_g}P_{A_h} \varepsilon_A.\) Let \(X=(Z_{A11},\ldots, Z_{AN1})^{\tau}.\) Applying the law of large numbers and the Sherman–Morrison formula for the inverse of a rank-one update, \((M+a b^{\tau})^{-1} = M^{-1}-(M^{-1} a) (b^{\tau} M^{-1})/ (1+ b^{\tau} M^{-1} a),\) we can calculate the following approximation

$$ \begin{aligned} K (X^{\tau} X)^{-1} K^{\tau} &\approx K \left[ N \hbox{E} \left(Z_{A11} Z_{A11}^{\tau} \right) \right]^{-1} K^{\tau} \\ &= N^{-1} \cdot K \left[ 2(1- \varepsilon_A) \hbox{diag} (P_{A_1},\ldots, P_{A_m}) +2(1+ \varepsilon_A) \left(\begin{array}{c} P_{A_1} \\ \vdots \\ P_{A_m} \end{array}\right) (P_{A_1},\ldots, P_{A_m}) \right]^{-1} K^{\tau}\\ &= [2(1- \varepsilon_A)N]^{-1} \cdot K \left[ \hbox{diag} (P_{A_1}^{-1},\ldots, P_{A_m}^{-1})- (1+\varepsilon_A) \left(\begin{array}{c} 1 \\ 1 \\ \vdots \\ 1 \end{array}\right) (1,1,\ldots,1)/2 \right] K^{\tau}\\ &= [2(1- \varepsilon_A)N]^{-1} \cdot K \hbox{diag} (P_{A_1}^{-1},\ldots, P_{A_m}^{-1}) K^{\tau}. \end{aligned} $$

Using the above relation, we may show the non-centrality parameter approximation (8) in the same way as in Appendix IV of Fan et al. (2006).
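The second-moment matrix at the start of this appendix can be reproduced by exact enumeration. The sketch below is illustrative only: it codes the additive-model covariate as the allele count when the genotype is observed and as the expected count \(2P_{A_g}\) when it is missing, enumerates ordered allele pairs under Hardy-Weinberg proportions with a genotype-independent missing rate, and compares the result with the closed form. Function names, and the use of `None` for the missing genotype, are assumptions of this example.

```python
from itertools import product

def dosage(genotype, g, allele_freqs):
    """Covariate for allele g: allele count if observed, expected count 2 * P_g if missing."""
    if genotype is None:              # None stands for the missing genotype "?"
        return 2 * allele_freqs[g]
    return sum(1 for allele in genotype if allele == g)

def exact_second_moments(allele_freqs, eps):
    """E[Z Z^T] by enumeration over ordered allele pairs plus the missing state (MCAR)."""
    m = len(allele_freqs)
    states = [((a, b), (1 - eps) * allele_freqs[a] * allele_freqs[b])
              for a, b in product(range(m), repeat=2)]
    states.append((None, eps))
    moments = [[0.0] * m for _ in range(m)]
    for genotype, prob in states:
        z = [dosage(genotype, g, allele_freqs) for g in range(m)]
        for g in range(m):
            for h in range(m):
                moments[g][h] += prob * z[g] * z[h]
    return moments

def formula_second_moments(allele_freqs, eps):
    """Closed form: 2(1 - eps)[diag(P) + P P^T] + 4 eps P P^T."""
    m = len(allele_freqs)
    return [[2 * (1 - eps) * (allele_freqs[g] * (g == h) + allele_freqs[g] * allele_freqs[h])
             + 4 * eps * allele_freqs[g] * allele_freqs[h]
             for h in range(m)] for g in range(m)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two equally sized matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))
```

For example, `max_abs_diff(exact_second_moments(P, eps), formula_second_moments(P, eps))` should be at rounding level for any allele frequency vector `P` that sums to one.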

Appendix E

For g = 1,2,…,m and k = 1,…,n, let us denote \(D_{A_gB_k}=P(A_gB_k)-P_{A_g}P_{B_k},\) which are measures of LD between markers A and B. Here, \(P(A_gB_k)\) is the frequency of haplotype \(A_gB_k.\) It can be shown that for g ≠ h, k ≠ l, h ≠ h′, l ≠ l′, (g,h) ≠ (g′,h′), (k,l) ≠ (k′,l′)

$$ \begin{array}{l} \hbox{E}\,x_{Aij}^{(g)}=2P_{A_g}, \hbox{E} (x_{Aij}^{(g)})^2 =(1-\varepsilon_A)(2P_{A_g}^2 +2P_{A_g})+4P_{A_g}^2 \varepsilon_A, \hbox{E} [x_{Aij}^{(g)} x_{Aij}^{(h)}] = 2P_{A_g}P_{A_h}(1-\varepsilon_A)+ 4P_{A_g}P_{A_h} \varepsilon_A, \\ \hbox{E}\,x_{Bij}^{(k)}=2P_{B_k}, \hbox{E} (x_{Bij}^{(k)})^2 = (1-\varepsilon_B)(2P_{B_k}^2 +2P_{B_k})+4P_{B_k}^2 \varepsilon_B,\hbox{E} [x_{Bij}^{(k)} x_{Bij}^{(l)}] = 2P_{B_k}P_{B_l}(1-\varepsilon_B)+ 4P_{B_k}P_{B_l} \varepsilon_B, \\ \hbox{E}\,z_{Aij}^{(gh)}=0, \hbox{E} (z_{Aij}^{(gh)})^2=(1-\varepsilon_A)P_{A_g}^2 P_{A_h}^2 [ P_{A_g}+P_{A_h}]^2, \hbox{E}\,z_{Bij}^{(kl)}=0, \hbox{E} (z_{Bij}^{(kl)})^2=(1-\varepsilon_B) P_{B_k}^2 P_{B_l}^2[ P_{B_k}+P_{B_l}]^2, \\ \hbox{E} [ x_{Aij}^{(g)} z_{Aij}^{(gh)} ]= \hbox{E} [ x_{Aij}^{(g)} z_{Aij}^{(hh^{\prime})} ]=\hbox{E} [ x_{Bij}^{(k)} z_{Bij}^{(kl)} ]= \hbox{E} [ x_{Bij}^{(k)} z_{Bij}^{(ll^{\prime})} ]=\hbox{E} [ x_{Aij}^{(g)} z_{Bij}^{(kl)} ]=\hbox{E} [ x_{Bij}^{(k)} z_{Aij}^{(gh)} ]=0, \\ \hbox{E} [ x_{Aij}^{(g)} x_{Bij}^{(k)} ]=2D_{A_gB_k}(1-\varepsilon_A)(1-\varepsilon_B) +4P_{A_g}P_{B_k},\hbox{E} [ z_{Aij}^{(gh)} z_{Aij}^{(gh^{\prime})}]=(P_{A_g}P_{A_h}P_{A_h^{\prime}})^2(1-\varepsilon_A), \\ \hbox{E} [ z_{Aij}^{(gh)} z_{Aij}^{(g^{\prime}h^{\prime})} ]=0,\hbox{E} [ z_{Bij}^{(kl)} z_{Bij}^{(kl^{\prime})} ] =(P_{B_k}P_{B_l}P_{B_l^{\prime}})^2(1-\varepsilon_B), \hbox{E} [ z_{Bij}^{(kl)} z_{Bij}^{(k^{\prime}l^{\prime})} ]=0,\\ \hbox{E} [ z_{Aij}^{(gh)} z_{Bij}^{(kl)} ]= \left[ P_{A_h} \left(P_{B_l}D_{A_gB_k}-P_{B_k} D_{A_gB_l} \right) - P_{A_g} \left(P_{B_l}D_{A_hB_k}-P_{B_k} D_{A_hB_l} \right) \right]^2(1-\varepsilon_A) (1-\varepsilon_B), \\ \hbox{E} [ y_{ij} x_{Aij}^{(g)} ] = 2P_{A_g} (w_{ij} \gamma + \mu)+2 \alpha_Q D_{A_gQ}(1-\varepsilon_A), \hbox{E} [ y_{ij} x_{Bij}^{(k)} ] = 2P_{B_k} (w_{ij} \gamma + \mu)+2 \alpha_Q D_{B_kQ}(1-\varepsilon_B), \\ \hbox{E} [ y_{ij} z_{Aij}^{(gh)} ] = \delta_Q \left[ P_{A_g} D_{A_h Q} - P_{A_h} D_{A_g Q} 
\right]^2(1-\varepsilon_A),\hbox{E} [ y_{ij} z_{Bij}^{(kl)} ] = \delta_Q \left[ P_{B_k} D_{B_l Q} - P_{B_l} D_{B_k Q} \right]^2 (1-\varepsilon_B). \end{array} $$
(23)

The quantities in (23) imply that the elements of \(V_A\) are given by

$$ \begin{aligned} \hbox{Cov} \left(x_{Aij}^{(g)}, x_{Aij}^{(h)}\right) &= -2 P_{A_g}P_{A_h} (1-\varepsilon_A), \\ \hbox{Var} \left(x_{Aij}^{(g)}\right) &= 2 P_{A_g}(1-P_{A_g})(1-\varepsilon_A), \\ \hbox{Cov} \left(x_{Aij}^{(g)}, x_{Bij}^{(k)}\right)&= 2D_{A_g B_k } (1-\varepsilon_A)(1-\varepsilon_B), \\ \hbox{Cov} \left(x_{Bij}^{(k)}, x_{Bij}^{(l)}\right)&= -2 P_{B_k}P_{B_l} (1-\varepsilon_B), \\ \hbox{Var} \left(x_{Bij}^{(k)}\right) &= 2 P_{B_k}(1-P_{B_k})(1-\varepsilon_B). \end{aligned} $$

Since \(\hbox{E}\,Z_{A \cup B}^{(ij)}\) is a vector of zeros by the quantities in (23), it can be shown that \(V_D =\hbox{Cov}\left(Z_{A \cup B}^{(ij)}, Z_{A \cup B}^{(ij)}\right) = \hbox{E} \left(Z_{A \cup B}^{(ij)} (Z_{A \cup B}^{(ij)})^\tau\right).\) Moreover, the quantities in (23) imply that the covariance matrix \(\hbox{Cov}\left(X_{A \cup B}^{(ij)}, Z_{A \cup B}^{(ij)}\right)\) is a zero matrix. In addition, the covariances between the trait value \(y_{ij}\) and the variables \(x_{Aij}^{(g)}, x_{Bij}^{(k)}, z_{Aij}^{(gh)}\) and \(z_{Bij}^{(kl)}\) are

$$ \begin{aligned} \hbox{Cov} \left(y_{ij}, x_{Aij}^{(g)}\right) &= 2 \alpha_Q (1-\varepsilon_A) D_{A_gQ}, \\ \hbox{Cov} \left(y_{ij}, x_{Bij}^{(k)}\right) &= 2\alpha_Q (1-\varepsilon_B) D_{B_kQ},\\ \hbox{Cov} \left(y_{ij},z_{Aij}^{(gh)}\right)&=\hbox{E} \left[y_{ij} z_{Aij}^{(gh)}\right],\\ \hbox{Cov} \left(y_{ij},z_{Bij}^{(kl)}\right)&=\hbox{E} \left[y_{ij} z_{Bij}^{(kl)}\right]. \end{aligned} $$

Taking the variances and covariances of \(y_{ij}\) and \(x_{Aij}^{(g)}, x_{Bij}^{(k)}, z_{Aij}^{(gh)}, z_{Bij}^{(kl)}\) based on relation (12), we may get the regression coefficients (13) of models (10) and (12).
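The attenuation factor \((1-\varepsilon_A)(1-\varepsilon_B)\) in \(\hbox{Cov} (x_{Aij}^{(g)}, x_{Bij}^{(k)})\) can be illustrated by simulation. The following Monte Carlo sketch makes several assumptions that go beyond what this appendix states: two biallelic markers, random mating (two independently drawn haplotypes per individual), and missingness that is independent between the two markers. A missing genotype is replaced by the expected allele count, as in the additive effect model. All names are hypothetical.

```python
import random

def simulate_cov(hap_freqs, eps_a, eps_b, n=200_000, seed=0):
    """Return (Monte Carlo estimate, closed form) of Cov(x_A, x_B) for allele index 0 at each marker.

    hap_freqs maps (a, b) in {0, 1}^2 to the haplotype frequency P(A_a B_b).
    """
    rng = random.Random(seed)
    haps = list(hap_freqs)
    weights = [hap_freqs[h] for h in haps]
    p_a = sum(f for (a, _), f in hap_freqs.items() if a == 0)
    p_b = sum(f for (_, b), f in hap_freqs.items() if b == 0)
    sx = sy = sxy = 0.0
    for _ in range(n):
        h1, h2 = rng.choices(haps, weights=weights, k=2)   # random mating
        x = (h1[0] == 0) + (h2[0] == 0)                     # allele count at marker A
        y = (h1[1] == 0) + (h2[1] == 0)                     # allele count at marker B
        if rng.random() < eps_a:                            # MCAR at marker A
            x = 2 * p_a
        if rng.random() < eps_b:                            # MCAR at marker B, independent of A
            y = 2 * p_b
        sx += x
        sy += y
        sxy += x * y
    estimate = sxy / n - (sx / n) * (sy / n)
    d_ab = hap_freqs[(0, 0)] - p_a * p_b                    # LD coefficient D_{A B}
    return estimate, 2 * d_ab * (1 - eps_a) * (1 - eps_b)
```

With enough replicates the two returned values should agree to within Monte Carlo error; setting both missing rates to zero recovers the complete-data covariance \(2D_{A_gB_k}.\)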

Appendix F

Notice \(\Upsigma_i^{-1}= \frac 1 {\sigma^2} (\gamma_{hj})_{(s+2) \times (s+2)}.\) Let \(X_i\) be the model matrix of family i = 1, 2, …, I. Then

$$ X_i=\left(\begin{array}{ccccccccccccc}1 & x_{Ai1}^{(1)}& \cdots & x_{Ai1}^{(m-1)} & x_{Bi1}^{(1)}& \cdots & x_{Bi1}^{(n-1)} & z_{Ai1}^{(12)}& \cdots & z_{Ai1}^{(m-1,m)} & z_{Bi1}^{(12)}& \cdots & z_{Bi1}^{(n-1,n)}\\ 1 & x_{Ai2}^{(1)}& \cdots & x_{Ai2}^{(m-1)} & x_{Bi2}^{(1)}& \cdots & x_{Bi2}^{(n-1)} & z_{Ai2}^{(12)}& \cdots & z_{Ai2}^{(m-1,m)} & z_{Bi2}^{(12)}& \cdots & z_{Bi2}^{(n-1,n)}\\ \vdots & \vdots& \cdots & \vdots & \vdots& \cdots & \vdots & \vdots& \cdots & \vdots & \vdots& \cdots & \vdots \\ 1 & x_{Ai,s+2}^{(1)}& \cdots & x_{Ai,s+2}^{(m-1)} & x_{Bi,s+2}^{(1)}& \cdots & x_{Bi,s+2}^{(n-1)} & z_{Ai,s+2}^{(12)}& \cdots & z_{Ai,s+2}^{(m-1,m)} & z_{Bi,s+2}^{(12)}& \cdots & z_{Bi,s+2}^{(n-1,n)}\end{array}\right).$$

Denote \(\gamma= \sum_{k=1}^{s+2} \sum_{l=1}^{s+2} \gamma_{kl}.\) Applying the law of large numbers leads to the approximation

$$ \begin{array}{l} \sum_{i=1}^I X_i^{\tau} \Upsigma_i^{-1} X_i / I \approx\\ \frac 1 {\sigma^2} \left(\begin{array}{ccc} \gamma & \gamma [ \hbox{E}(X_{A \cup B}^{(11)})]^{\tau} & O_1 \\ \gamma \hbox{E} (X_{A \cup B}^{(11)}) & \sum_{k=1}^{s+2} \gamma_{kk} V_A+b V_{A2}+\gamma \hbox{E} (X_{A \cup B}^{(11)}) [\hbox{E} (X_{A \cup B}^{(11)})]^{\tau} & O_2 \\ O_3 & O_4 &\sum_{k=1}^{s+2} \gamma_{kk} V_D+ \sum_{k=3}^{s+2} \sum_{l=k+1}^{s+2} \gamma_{kl} V_{D2}/2 \end{array}\right), \end{array} $$
(24)

where \(O_i,\ i = 1,2,3,4,\) are zero vectors or matrices, and \(\hbox{E}\left( X_{A \cup B}^{(11)}\right) = (2 P_{A_1},\ldots, 2 P_{A_{m-1}}, 2 P_{B_1},\ldots, 2 P_{B_{n-1}})^{\tau}.\)

Let

$$S=\left(\begin{array}{ccccc} 0 & 1 & 0 & \cdots & 0 \\ 0 & 0 & 1 & \cdots & 0 \\ 0 & 0 & 0 & \cdots & 1 \end{array}\right)$$

be the test matrix corresponding to hypothesis \(H_{ABad0},\) and \({\phi=}({\alpha,}\,\alpha_{A1},\ldots, \alpha_{A(m-1)}, \alpha_{B1},\ldots, \alpha_{B(n-1)},\delta_{A12},\ldots, \delta_{A(m-1)m},\delta_{B12},\ldots, \delta_{B(n-1)n})^{\tau}\) be the column vector of regression coefficients of “genotype effect model” (12). Utilizing regression coefficients (13), we may show (15) by plugging approximation (24) into \(\lambda_{ABad} =(S \phi)^{\tau}[S(\sum_{i=1}^I X_i^{\tau}\Upsigma_i^{-1} X_i)^{-1} S^{\tau}]^{-1}(S \phi).\) Note that Theorem 8.5.11 of Harville (1997) can be used to calculate the inverse of the matrix on the right-hand side of (24).

Appendix G

For pedigrees in graph A of Fig. 1, the constants \(b_1\) and \(b_2\) of \(\lambda_{AB,ad}\) in (16) are given by

$$ \begin{aligned} b_1 &= [ \gamma_{15} +(\gamma_{17} + \cdots + \gamma_{1,11})/2]+ [ \gamma_{25} +(\gamma_{27} + \cdots + \gamma_{2,11})/2] \\ & + [ \gamma_{36} +(\gamma_{37} + \cdots + \gamma_{3,11})/2]+ [ \gamma_{46} +(\gamma_{47} + \cdots +\gamma_{4,11})/2] \\ & +(\gamma_{57} + \cdots +\gamma_{5,11})+(\gamma_{67} + \cdots +\gamma_{6,11}) +\sum_{k=7}^{11} \sum_{l=k+1}^{11} \gamma_{kl}, \\ b_2&= \sum_{k=7}^{11} \sum_{l=k+1}^{11} \gamma_{kl}/2. \end{aligned} $$

For pedigrees in graph B of Fig. 1, the constants \(b_1\) and \(b_2\) are given by

$$ \begin{aligned} b_1 &= \gamma_{1,12}+[ \gamma_{2,12} +(\gamma_{2,13} + \cdots + \gamma_{2,16})/2]+ [ \gamma_{3,12}+ \cdots+ \gamma_{3,16}]/2 \\ & +[ \gamma_{4,12} +\cdots+ \gamma_{4,16}]/2 +[ \gamma_{5,12}/2+ (\gamma_{5,13} + \cdots + \gamma_{5,16})] \\ & + [ (\gamma_{6,13}+\cdots +\gamma_{6,16})+ (\gamma_{6,17}+\gamma_{6,18})/2]+ [ \gamma_{7,13} + \cdots+ \gamma_{7,18}]/2 \\ & +[(\gamma_{8,13} + \cdots +\gamma_{8,16})/2 +(\gamma_{8,17}+\gamma_{8,18}) ] +(\gamma_{9,17}+\gamma_{9,18})+ (\gamma_{10,17}+\gamma_{10,18})/2 \\ & +(\gamma_{11,17}+\gamma_{11,18})/2 +(\gamma_{12,13} + \cdots +\gamma_{12,16})/4+(\gamma_{13,14} + \gamma_{13,15} +\gamma_{13,16}) \\ & +(\gamma_{14,15} +\gamma_{14,16}) +\gamma_{15,16}+[ \gamma_{13,17} + \cdots+ \gamma_{16,17}]/4+[ \gamma_{13,18} + \cdots+ \gamma_{16,18}]/4 +\gamma_{17,18}, \\ b_2&= [(\gamma_{13,14} +\gamma_{13,15}+ \gamma_{13,16})+(\gamma_{14,15} +\gamma_{14,16})+ \gamma_{15,16}]/2 +\gamma_{17,18}/2. \end{aligned} $$
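Because these constants are plain index sums over the entries \(\gamma_{kl}\) of \(\sigma^2 \Upsigma_i^{-1},\) they are easy to transcribe into code. The sketch below does so for the graph A case only, assuming the 11 × 11 matrix \((\gamma_{kl})\) is already available as nested lists; the function name is hypothetical and the member numbering is taken to be that of the formulas above.

```python
def graph_a_constants(gamma):
    """b_1 and b_2 for pedigrees in graph A, from an 11 x 11 matrix gamma (nested lists)."""
    def g(k, l):
        # 1-based access, matching the subscripts gamma_{kl} in the formulas.
        return gamma[k - 1][l - 1]

    def row_sum(k, lo, hi):
        # gamma_{k,lo} + ... + gamma_{k,hi}
        return sum(g(k, l) for l in range(lo, hi + 1))

    # sum_{k=7}^{11} sum_{l=k+1}^{11} gamma_{kl}
    upper = sum(g(k, l) for k in range(7, 12) for l in range(k + 1, 12))
    b1 = (g(1, 5) + row_sum(1, 7, 11) / 2
          + g(2, 5) + row_sum(2, 7, 11) / 2
          + g(3, 6) + row_sum(3, 7, 11) / 2
          + g(4, 6) + row_sum(4, 7, 11) / 2
          + row_sum(5, 7, 11) + row_sum(6, 7, 11)
          + upper)
    b2 = upper / 2
    return b1, b2
```

Passing an all-ones matrix is a quick way to check the index bookkeeping, since the result then equals the weighted count of terms in each formula. The graph B constants could be transcribed in the same way.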

About this article

Cite this article

Fan, R., Liu, L., Jung, J. et al. Combined Linkage and Association Mapping of Quantitative Trait Loci with Missing Completely at Random Genotype Data. Behav Genet 38, 316–336 (2008). https://doi.org/10.1007/s10519-008-9194-3

  • DOI: https://doi.org/10.1007/s10519-008-9194-3
