Analysis of an outcome-dependent enriched sample: hypothesis tests

Statistical Methods & Applications

Abstract

An outcome-dependent sample is generated by a stratified survey design where the stratification depends on the outcome. It is also known as a case–control sample in epidemiological studies and a choice-based sample in econometric studies. An outcome-dependent enriched (ODE) sample results from combining an outcome-dependent sample with an independently collected random sample. Consider the situation where the conditional probability of a categorical outcome given its covariates follows an explicit model with an unknown parameter, whereas the marginal probabilities of the outcome and its covariates are left unspecified. Profile-likelihood (PL) and weighted-likelihood (WL) methods have been employed to estimate the model parameter from an ODE sample. This article develops the PL- and WL-based families of tests on the model parameter from an ODE sample. Asymptotic properties of their test statistics are derived. The PL likelihood-ratio, Wald and score tests are shown to obey classical inference, i.e. their test statistics are asymptotically equivalent and chi-squared distributed. In contrast, the WL likelihood-ratio statistic asymptotically has a weighted chi-squared distribution and is not equivalent to the WL Wald and score statistics. Our theoretical derivation and simulation show that tests based on these new statistics attain the nominal type I error rate and have good power. Advantages of ODE sampling together with the implementation of the PL and WL methods are demonstrated in an illustrative example.
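The sampling design described in the abstract can be illustrated with a minimal simulation sketch. Everything specific here is an assumption made for illustration only: the binary outcome, the logistic model, the normal covariate, the parameter values and the sample sizes are hypothetical and are not taken from the article.

```python
import math
import random

def simulate_population(n, beta0, beta1, rng):
    """Draw (x, y) pairs with P(y = 1 | x) = logistic(beta0 + beta1 * x)."""
    population = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)  # covariate law chosen only for illustration
        p = 1.0 / (1.0 + math.exp(-(beta0 + beta1 * x)))
        y = 1 if rng.random() < p else 0
        population.append((x, y))
    return population

def draw_ode_sample(population, n_per_outcome, n_random, rng):
    """Combine an outcome-stratified sample with an independent random sample."""
    outcome_dependent = []
    for outcome, size in n_per_outcome.items():
        stratum = [unit for unit in population if unit[1] == outcome]
        outcome_dependent.extend(rng.sample(stratum, size))
    enrichment = rng.sample(population, n_random)  # ignores the outcome
    return outcome_dependent, enrichment

rng = random.Random(2015)
population = simulate_population(20000, beta0=-3.0, beta1=1.0, rng=rng)
od_part, random_part = draw_ode_sample(population, {0: 200, 1: 200}, 400, rng)

# Share of cases in each component of the enriched sample.
case_share_od = sum(y for _, y in od_part) / len(od_part)
case_share_random = sum(y for _, y in random_part) / len(random_part)
print(case_share_od, case_share_random)
```

In this sketch the outcome-dependent component is balanced by design, while the random component reflects the (rare) population prevalence, which is the information the enrichment contributes about the marginal outcome distribution.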

References

  • Agresti AA (2002) Categorical data analysis. Wiley-Interscience, Hoboken

  • Breslow NE, Cain KC (1988) Logistic regression for two-stage case–control data. Biometrika 75:11–20

  • Breslow NE (1996) Statistics in epidemiology: the case–control study. J Am Stat Assoc 91:14–28

  • Breslow N, McNeney B, Wellner JA (2003) Large sample theory for semiparametric regression models with two-phase, outcome dependent sampling. Ann Stat 31:1110–1139

  • Chatterjee N, Chen HY, Breslow NE (2003) A pseudoscore estimator for regression problems with two-phase sampling. J Am Stat Assoc 98:158–168

  • Chatterjee N, Chen YH (2007) Maximum likelihood inference on a mixed conditionally and marginally specified regression model for genetic epidemiologic studies with two-phase sampling. J R Stat Soc B 69:123–142

  • Chen HY (2003) A note on the prospective analysis of outcome-dependent samples. J R Stat Soc B 65:575–584

  • Cosslett SR (1981a) Efficient estimation of discrete-choice models. In: Manski C, McFadden D (eds) Structural analysis of discrete data with econometric applications. The MIT Press, Cambridge, pp 51–111

  • Cosslett SR (1981b) Maximum likelihood estimator for choice-based samples. Econometrica 49:1289–1316

  • Doll R, Hill AB (1950) Smoking and carcinoma of the lung. Br Med J 221:739–748

  • Doll R, Peto R, Boreham J, Sutherland I (2004) Mortality in relation to smoking: 40 years’ observations on male British doctors. Br Med J 328:1519–1527

  • Harville DA (1997) Matrix algebra from a statistician’s perspective. Springer, New York

  • Holt D, Ewings PD (1989) Logistic models for contingency tables. In: Skinner CJ, Holt D, Smith TMF (eds) Analysis of complex surveys. Wiley, New York, pp 261–279

  • Johnson NL, Kotz S (1970) Continuous univariate distributions, vol 2. Houghton Mifflin, Boston

  • Kang Q, Nelson PI, Vahl CI (2010) Parameter estimation from an outcome-dependent sample using weighted likelihood method. Statist Sinica 20:1529–1550

  • Kullback S (1997) Information theory and statistics. Dover Publications, New York

  • Manski CF, Lerman SR (1977) The estimation of choice probabilities from choice based samples. Econometrica 45:1977–1988

  • Manski CF, McFadden D (1981) Alternative estimators and sample designs for discrete choice analysis. In: Manski C, McFadden D (eds) Structural analysis of discrete data with econometric applications. The MIT Press, Cambridge, MA, pp 2–50

  • Manski CF, Thompson TS (1989) Estimation of best predictors of binary response. J Econ 40:97–123

  • Morgenthaler S, Vardi Y (1986) Choice-based samples: a nonparametric approach. J Econ 32:109–125

  • Prentice RL, Pyke R (1979) Logistic disease incidence models and case–control studies. Biometrika 66:403–411

  • Rao CR (1973) Linear statistical inference and its applications, 2nd edn. Wiley, New York

  • Rao JNK, Thomas DR (1989) Chi-squared tests for contingency table. In: Skinner CJ, Holt D, Smith TMF (eds) Analysis of complex surveys. Wiley, New York, pp 89–114

  • Roberts G, Rao JNK, Kumar S (1987) Logistic regression analysis of sample survey data. Biometrika 74:1–12

  • Rose S, van der Laan MJ (2009) Why match? Investigating matched case–control study designs with causal effect estimation. Int J Biostat 5: Article 1

  • Scott A, Wild C (1986) Fitting logistic models under case–control or choice based sampling. J R Stat Soc B 48:170–182

  • Scott AJ, Wild CJ (1997) Fitting regression models to case–control data by maximum likelihood. Biometrika 84:57–71

  • Vardi Y (1985) Empirical distributions in selection bias models. Ann Stat 13:178–203

  • Wang XF, Zhou HB (2006) A semiparametric empirical likelihood method for biased sampling schemes with auxiliary covariates. Biometrics 62:1149–1160

  • Zhou H, Weaver MA, Qin H, Longnecker MP, Wang MC (2002) A semiparametric empirical likelihood method for data from an outcome-dependent sampling scheme with a continuous outcome. Biometrics 58:413–421

  • Zhou H, Song R, Wu YS, Qin J (2011) Statistical inference for a two-stage outcome-dependent sampling design with a continuous outcome. Biometrics 67:194–202

Acknowledgments

We thank Paul I. Nelson for his constructive comments on this paper. We also thank the anonymous reviewer and the associate editor for their insightful comments and suggestions.

Author information

Corresponding author

Correspondence to Christopher I. Vahl.

Appendices

Appendix A: Proof of Theorem 1

First we adopt Rao’s (1973, sec. 6e) strategy to convert \({ LR }^{W}\) into a quadratic form. Note that \(\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})=\mathbf{0}\) and \(\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\Psi }}(\hat{{\varvec{\upbeta }}}^{W}))=\mathbf{0}\). The chain rule implies \(\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\Psi }}({\varvec{\upbeta }}))=\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }})\mathbf{M}^{\prime }\). Subject \(\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})\) and \(\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\Psi }}(\hat{{\varvec{\upbeta }}}^{W}))\) to first-order Taylor-series expansions at \({\varvec{\uptheta }}^{*}\) and \({\varvec{\upbeta }}^{*}\), respectively, and apply Lemma 1. This leads to

$$\begin{aligned} \hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*}&= -\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{H}^{W})^{-1}+O_p (N^{-1}), \nonumber \\ \hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*}&= -\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})\mathbf{M}^{\prime }(\mathbf{MH}^{W}\mathbf{M}^{\prime })^{-1}+O_p (N^{-1}). \end{aligned}$$
(6)

Perform second-order Taylor-series expansions on \(l_N^W ({\varvec{\uptheta }}^{*})\) at \(\hat{{\varvec{\uptheta }}}^{W}\) and \(\hat{{\varvec{\upbeta }}}^{W}\), separately, and take the difference. This leads to

$$\begin{aligned} { LR }^{W}\!+\!N[(\hat{{\varvec{\uptheta }}}^{W}\!-\!{\varvec{\uptheta }}^{*})\mathbf{H}^{W}(\hat{{\varvec{\uptheta }}}^{W}\!-\!{\varvec{\uptheta }}^{*})^{\prime }\!-\!(\hat{{\varvec{\upbeta }}}^{W}\!-\!{\varvec{\upbeta }}^{*})\mathbf{MH}^{W}\mathbf{M}^{\prime }(\hat{{\varvec{\upbeta }}}^{W}\!-\!{\varvec{\upbeta }}^{*})^{\prime }]\!+\!O_P (N^{-1/2})\!=\!0. \end{aligned}$$

Plugging Formula (6) into the above equation yields

$$\begin{aligned} { LR }^{W}=N\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})\mathbf{O}^{W}[\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})]^{\prime }+o_p (1). \end{aligned}$$

From Johnson and Kotz (1970, pp. 150–151), \({ LR }^{W}\) has the same asymptotic distribution as \(\sum _{i=1}^q {[e_i^W \chi _i^2 (1,0)]} \), where \(e_1^W \ge \cdots \ge e_q^W \) are the eigenvalues of \(\mathbf{O}^{W}\mathbf{V}^{W}\). Note that \(-\mathbf{O}^{W}\mathbf{H}^{W}\) is idempotent of rank \(r\). By Lemma 1, both \(\mathbf{V}^{W}\) and \(-\mathbf{H}^{W}\) are p.d. Hence \(\mathbf{O}^{W}\) is positive semi-definite of rank \(r\) and, consequently, the eigenvalues of \(\mathbf{O}^{W}\mathbf{V}^{W}\) satisfy \(e_1^W \ge \cdots \ge e_r^W >0\) and \(e_{r+1}^W =\cdots =e_q^W =0\). This completes the proof of Theorem 1(i).
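Because the limiting law in Theorem 1(i) is a weighted sum of independent \(\chi ^2 (1,0)\) variables, critical values and p-values can be approximated by simulation once the positive eigenvalues have been estimated. The following is a minimal Monte Carlo sketch, not the authors' procedure; the weights and the observed statistic are hypothetical placeholders.

```python
import random

def simulate_weighted_chi2(weights, n_draws, rng):
    """Draw from sum_i e_i * chi2_i(1, 0) using independent standard normals."""
    draws = []
    for _ in range(n_draws):
        draws.append(sum(e * rng.gauss(0.0, 1.0) ** 2 for e in weights))
    return draws

def upper_quantile(draws, alpha):
    """Empirical (1 - alpha) quantile, used as a Monte Carlo critical value."""
    ordered = sorted(draws)
    index = min(len(ordered) - 1, int((1.0 - alpha) * len(ordered)))
    return ordered[index]

def monte_carlo_p_value(observed, draws):
    """Share of simulated values at least as large as the observed statistic."""
    return sum(d >= observed for d in draws) / len(draws)

rng = random.Random(0)
weights = [1.8, 0.6]  # hypothetical positive eigenvalues e_1 >= e_2 of O^W V^W
draws = simulate_weighted_chi2(weights, 50000, rng)
critical_value = upper_quantile(draws, 0.05)
p_value = monte_carlo_p_value(7.0, draws)  # 7.0 is a made-up observed LR^W
print(critical_value, p_value)
```

Only the \(r\) positive eigenvalues need to be passed in, since the remaining \(q-r\) eigenvalues are zero and contribute nothing to the sum.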

A similar strategy is used to derive the limiting distribution of \({ LR }^{P}\). Briefly, we have

$$\begin{aligned} { LR }^{P}=N\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})\mathbf{O}^{P}[\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})]^{\prime }+o_p (1), \end{aligned}$$

where \(\mathbf{O}^{P}=\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{N}-(\mathbf{H}^{P})^{-1}\) and \(\mathbf{N}=diag(\mathbf{M},\mathbf{I}_K )\). To prove Theorem 1(ii), it suffices to show that \(\mathbf{O}^{P}\mathbf{V}^{P}\) is idempotent of rank \(r\). By Lemma 1, \(\exists {\varvec{\Gamma }}\) such that [formula missing in the source]. Given the fact that the last \(K\) rows of \(\mathbf{N}\) are \((\mathbf{0}\quad \mathbf{I}_K )\) and \(\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{NH}^{P}\mathbf{N}^{\prime }=\mathbf{N}^{\prime }\), we obtain

$$\begin{aligned} \mathbf{O}^{P}\mathbf{V}^{P}=-\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{NH}^{P}+\mathbf{I}_{q+K}. \end{aligned}$$

Obviously, \(-\mathbf{N}^{\prime }(\mathbf{NH}^{P}\mathbf{N}^{\prime })^{-1}\mathbf{NH}^{P}+\mathbf{I}_{q+K} \) is idempotent of rank \(r\).

Appendix B: Proof of Theorem 3

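Partition \(\mathbf{H}^{W}\) in accordance with \({\varvec{\uptheta }}=({\begin{array}{ll} {\varvec{\upalpha }}&{} {\varvec{\upbeta }} \\ \end{array} })\) into four submatrices: \(\mathbf{H}_{11}^W \), \(\mathbf{H}_{12}^W \), \(\mathbf{H}_{21}^W \), \(\mathbf{H}_{22}^W \). Let \(\mathbf{Q}^{W}=(\mathbf{I}_r \quad -\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1})\), \(\mathbf{H}^{W11}=[\mathbf{H}_{11}^W -\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1}\mathbf{H}_{21}^W ]^{-1}\), and \(\mathbf{R}=(\mathbf{I}_r \quad \mathbf{0})^{\prime }\). Theorem 8.5.11 of Harville (1997) states that \((\mathbf{H}^{W})^{-1}=\left( {\begin{array}{ll} \mathbf{H}^{W11}&{} -\mathbf{H}^{W11}\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1} \\ -(\mathbf{H}_{22}^W )^{-1}\mathbf{H}_{21}^W \mathbf{H}^{W11}&{} (\mathbf{H}_{22}^W )^{-1}+(\mathbf{H}_{22}^W )^{-1}\mathbf{H}_{21}^W \mathbf{H}^{W11}\mathbf{H}_{12}^W (\mathbf{H}_{22}^W )^{-1} \\ \end{array} }\right) \), which yields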

$$\begin{aligned} (\mathbf{H}^{W})^{-1}\mathbf{R}=\left( \mathbf{Q}^{W}\right) ^{\prime }\mathbf{H}^{W11}. \end{aligned}$$
(7)
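Formula (7) can be checked numerically in a small case. The sketch below assumes \(q=3\), \(r=1\), \(\mathbf{R}=(1,0,0)^{\prime }\) and a hypothetical symmetric negative-definite \(\mathbf{H}^{W}\) whose entries are chosen only for illustration. Rather than inverting \(\mathbf{H}^{W}\), it verifies the equivalent statement \(\mathbf{H}^{W}(\mathbf{Q}^{W})^{\prime }\mathbf{H}^{W11}=\mathbf{R}\).

```python
def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical symmetric negative-definite H^W (q = 3 parameters, r = 1 tested).
H = [[-4.0, 1.0, 0.5],
     [1.0, -3.0, 0.8],
     [0.5, 0.8, -2.0]]

H11 = H[0][0]                       # 1 x 1 block for alpha
H12 = H[0][1:]                      # 1 x 2 block
H22 = [row[1:] for row in H[1:]]    # 2 x 2 block for beta
H22_inv = inverse_2x2(H22)

# Row vector H12 (H22)^{-1}.
h12_h22inv = [sum(H12[k] * H22_inv[k][j] for k in range(2)) for j in range(2)]

# Q^W = (I_r, -H12 (H22)^{-1}) and H^{W11} = [H11 - H12 (H22)^{-1} H21]^{-1}.
Q = [1.0] + [-v for v in h12_h22inv]
schur = H11 - sum(h12_h22inv[j] * H[1 + j][0] for j in range(2))
H_upper11 = 1.0 / schur

# Right-hand side of Formula (7): the column (Q^W)' H^{W11}.
rhs = [q_entry * H_upper11 for q_entry in Q]

# Multiplying by H^W should recover R = (1, 0, 0)'.
recovered_R = [sum(H[i][k] * rhs[k] for k in range(3)) for i in range(3)]
print(recovered_R)
```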

According to Lemma 1, Slutsky’s theorem and Formula (7), we have

$$\begin{aligned} Score^{W}&= N\nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})[\mathbf{Q}^{W}\mathbf{V}^{W} (\mathbf{Q}^{W})^{\prime }]^{-1}[\nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})]^{\prime }+o_p (1), \nonumber \\ Wald^{W}&= N(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})[\mathbf{R}^{\prime }(\mathbf{H}^{W})^{-1}\mathbf{V}^{W}(\mathbf{H}^{W})^{-1}\mathbf{R}]^{-1}(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})^{\prime }+o_p (1) \nonumber \\&= N(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})[\mathbf{H}^{W11}\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }\mathbf{H}^{W11}]^{-1}(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})^{\prime }+o_p (1) \nonumber \\&= N(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})(\mathbf{H}^{W11})^{-1}[\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }]^{-1}(\mathbf{H}^{W11})^{-1}(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}\!-\!\mathbf{a})^{\prime }\!+\!o_p (1).\nonumber \\ \end{aligned}$$
(8)

Perform first-order Taylor-series expansions on \(\nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})\) and \(\nabla _{\varvec{\upbeta }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})\) at \({\varvec{\upbeta }}^{*}\), respectively. It follows from \(\nabla _{\varvec{\upbeta }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})=\mathbf{0}\) that

$$\begin{aligned} \nabla _{\varvec{\upalpha }} l_N^W (\mathbf{a},\hat{{\varvec{\upbeta }}}^{W})=\nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{Q}^{W})^{\prime }+O_p (N^{-1}). \end{aligned}$$
(9)

Also note that \(\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})=\mathbf{0}\). Performing a first-order Taylor-series expansion on \(\nabla _{\varvec{\uptheta }} l_N^W (\hat{{\varvec{\uptheta }}}^{W})\) at \(({\begin{array}{ll} \mathbf{a}&{} {{\varvec{\upbeta }}^{*}} \\ \end{array} })\) and applying Formula (7) leads to

$$\begin{aligned} \nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{Q}^{W})^{\prime }=-(\hat{{\varvec{\uptheta }}}^{W}\mathbf{R}-\mathbf{a})(\mathbf{H}^{W11})^{-1}+O_p (N^{-1}). \end{aligned}$$
(10)

It is thus seen from Formulas (8), (9) and (10) that \(Score^{W}=Wald^{W}+o_p (1)\). The asymptotic distribution of \(Score^{W}\) and \(Wald^{W}\) is a direct result of Lemma 1 and Johnson and Kotz (1970, pp. 150–151). This completes the proof of Theorem 3(i).

To prove Theorem 3(ii), first note that \(\mathbf{MH}^{W}\mathbf{M}^{\prime }=\mathbf{H}_{22}^W \). Perform second-order Taylor-series expansions on \(l_N^W ({\varvec{\uptheta }}^{*})\) and \(l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})\) at \(\hat{{\varvec{\uptheta }}}^{W}\) and \(\hat{{\varvec{\upbeta }}}^{W}\), separately, and take the difference. This generates

$$\begin{aligned} { LR }^{W}&= 2N[l_N^W ({\varvec{\uptheta }}^{*})-l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})]-N(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})\mathbf{H}^{W}(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*})^{\prime }\\&+N(\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})\mathbf{H}_{22}^W (\hat{{\varvec{\upbeta }}}^{W}-{\varvec{\upbeta }}^{*})^{\prime }+O_P (N^{-1/2}). \end{aligned}$$

For \({\varvec{\upalpha }}^{*}=\mathbf{a}+N^{-1/2}{\varvec{\Delta }}\), we have

$$\begin{aligned} l_N^W ({\varvec{\uptheta }}^{*})-l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})&= N^{-1/2}{\varvec{\Delta }}[\nabla _{\varvec{\upalpha }} l_N^W ({\varvec{\uptheta }}^{*})]^{\prime }-0.5N^{-1}{\varvec{\Delta }}\mathbf{H}_{11}^W {\varvec{\Delta }}^{\prime }+O_p (N^{{-3}/2}), \\ {\varvec{\upbeta }}^{*}-\hat{{\varvec{\upbeta }}}^{W}&= \nabla _{\varvec{\upbeta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{H}_{22}^W )^{-1}+O_p (N^{-1})\\&= [\nabla _{\varvec{\upbeta }} l_N^W ({\varvec{\uptheta }}^{*})-N^{-1/2}{\varvec{\Delta }}\mathbf{H}_{12}^W ](\mathbf{H}_{22}^W )^{-1}+O_p (N^{-1}). \end{aligned}$$

Recall that \(\hat{{\varvec{\uptheta }}}^{W}-{\varvec{\uptheta }}^{*}=-\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{H}^{W})^{-1}+O_p (N^{-1})\). Collecting the information above, we then convert \({ LR }^{W}\) to a quadratic form as

$$\begin{aligned} { LR }^{W}&= -[N^{1/2}\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{Q}^{W})^{\prime }-{\varvec{\Delta }}(\mathbf{H}^{W11})^{-1}]\mathbf{H}^{W11}[N^{1/2}\nabla _{\varvec{\uptheta }} l_N^W ({\varvec{\uptheta }}^{*})(\mathbf{Q}^{W})^{\prime }\nonumber \\&-{\varvec{\Delta }}(\mathbf{H}^{W11})^{-1}]^{\prime }+o_p (1) \nonumber \\&= N\nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})(\mathbf{Q}^{W})^{\prime }(-\mathbf{H}^{W11})\mathbf{Q}^{W}[\nabla _{\varvec{\uptheta }} l_N^W (\mathbf{a},{\varvec{\upbeta }}^{*})]^{\prime }+o_p (1). \end{aligned}$$
(11)

Apply Cholesky decomposition to the \(r\times r\) p.d. matrix \(\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }\) to get \(\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }=\mathbf{LL}^{\prime }\). Let \(e_1^W \ge \cdots \ge e_r^W >0\) be the eigenvalues of \(-\mathbf{L}^{\prime }\mathbf{H}^{W11}\mathbf{L}\) and let \(\mathbf{P}\) be the associated orthogonal matrix of eigenvectors, i.e. \(-\mathbf{P}^{\prime }\mathbf{L}^{\prime }\mathbf{H}^{W11}\mathbf{LP}=diag(e_1^W ,\ldots ,e_r^W )\). Further denote by \(\mathbf{p}_i \) the \(i\)th row vector of \(\mathbf{P}\). From Johnson and Kotz (1970, pp. 150–151) and Lemma 1, \({ LR }^{W}\) has a limiting distribution of \(\sum _{i=1}^r {[e_i^W \chi _i^2 (1,\varphi _i )]} \), where \(\varphi _i =0.5{\varvec{\Delta }}(\mathbf{L}^{\prime }\mathbf{H}^{W11})^{-1}(\mathbf{p}_i )^{\prime }\mathbf{p}_i (\mathbf{H}^{W11}\mathbf{L})^{-1}{\varvec{\Delta }}^{\prime }\). Because \((\mathbf{L}^{\prime }\mathbf{H}^{W11})^{-1}\mathbf{P}^{\prime }\mathbf{P}(\mathbf{H}^{W11}\mathbf{L})^{-1}\) is p.d., \(\varphi _1 =\cdots =\varphi _r =0\) iff \({\varvec{\Delta }}=\mathbf{0}\). It is easy to see that \(\mathbf{O}^{W}\mathbf{V}^{W}=-(\mathbf{Q}^{W})^{\prime }\mathbf{H}^{W11}\mathbf{Q}^{W}\mathbf{V}^{W}\) has the same nonzero eigenvalues as \(-\mathbf{L}^{\prime }\mathbf{H}^{W11}\mathbf{L}\). This completes the proof of Theorem 3(ii).
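When a single parameter is tested (\(r=1\)), the Cholesky factor \(\mathbf{L}\) is a scalar and the only positive weight reduces to \(e_1^W =-\mathbf{H}^{W11}\mathbf{Q}^{W}\mathbf{V}^{W}(\mathbf{Q}^{W})^{\prime }\). The sketch below computes this weight for \(q=3\); the matrices standing in for \(\mathbf{H}^{W}\) and \(\mathbf{V}^{W}\) and the observed statistic are hypothetical, and dividing \({ LR }^{W}\) by the weight is shown only as one way to read the theorem under the null hypothesis.

```python
def inverse_2x2(m):
    """Closed-form inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def lr_weight_r1(H, V):
    """Weight e_1 = -H^{W11} Q^W V^W (Q^W)' for q = 3 parameters and r = 1."""
    H12 = H[0][1:]
    H22_inv = inverse_2x2([row[1:] for row in H[1:]])
    h12_h22inv = [sum(H12[k] * H22_inv[k][j] for k in range(2)) for j in range(2)]
    Q = [1.0] + [-v for v in h12_h22inv]
    schur = H[0][0] - sum(h12_h22inv[j] * H[1 + j][0] for j in range(2))
    qvq = sum(Q[i] * V[i][j] * Q[j] for i in range(3) for j in range(3))
    return -(1.0 / schur) * qvq

# Hypothetical stand-ins for H^W (negative definite) and V^W (positive definite).
H = [[-4.0, 1.0, 0.5],
     [1.0, -3.0, 0.8],
     [0.5, 0.8, -2.0]]
V = [[6.0, -1.0, -0.5],
     [-1.0, 4.0, -1.0],
     [-0.5, -1.0, 3.0]]

weight = lr_weight_r1(H, V)

# If V = -H (the information equality), the weight collapses to 1.
neg_H = [[-h for h in row] for row in H]
weight_classical = lr_weight_r1(H, neg_H)

lr_observed = 9.2                     # made-up value of LR^W
rescaled = lr_observed / weight       # compare with the chi2(1) cutoff 3.841
print(weight, weight_classical, rescaled > 3.841)
```

The second call illustrates why the weighting matters only when \(\mathbf{V}^{W}\ne -\mathbf{H}^{W}\): with equality the statistic has the classical \(\chi ^2 (1)\) limit.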

With respect to Theorem 3(iii), partition \(\mathbf{H}^{P}\) into \(\mathbf{H}_{11}^P \), \(\mathbf{H}_{12}^P \), \(\mathbf{H}_{21}^P \), \(\mathbf{H}_{22}^P \) by separating out \({\varvec{\upalpha }}\) from \({\varvec{\upbeta }}\) and \({\varvec{\upxi }}_{+Y} \). Set \(\mathbf{Q}^{P}=(\mathbf{I}_r \quad -\mathbf{H}_{12}^P (\mathbf{H}_{22}^P )^{-1})\) and \(\mathbf{H}^{P11}=[\mathbf{H}_{11}^P -\mathbf{H}_{12}^P (\mathbf{H}_{22}^P )^{-1}\mathbf{H}_{21}^P ]^{-1}\). The fact that [formula missing in the source] for some \({\varvec{\Gamma }}\) assures that \(-\mathbf{H}^{P11}=[\mathbf{Q}^{P}\mathbf{V}^{P}(\mathbf{Q}^{P})^{\prime }]^{-1}\) (we leave the proof of this equation to the reader). Our formulation of \(Score^{P}\) is a direct application of this equality. Analogous to the proof of Theorem 3(i), we have

$$\begin{aligned} Wald^{P}=Score^{P}+o_p (1)=\chi ^{2}(r,-0.5{\varvec{\Delta }}(\mathbf{H}^{P11})^{-1}{\varvec{\Delta }}^{\prime })+o_p (1). \end{aligned}$$
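The noncentral limit above gives an approximate local power for the PL tests. The sketch below treats the case \(r=1\) at the 5% level. It assumes the noncentrality convention suggested by the factor 0.5 in the text, namely that \(\chi ^2 (1,\varphi )\) is the law of \((Z+\sqrt{2\varphi })^2\) for standard normal \(Z\); this convention, the value used for \(\mathbf{H}^{P11}\) and the grid of \({\varvec{\Delta }}\) values are assumptions, not quantities from the article.

```python
import math

def standard_normal_cdf(x):
    """Standard normal distribution function via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def local_power_r1(delta, h_p11, z_crit=1.959964):
    """Approximate power of a 5% test with r = 1 when alpha* = a + delta / sqrt(N)."""
    # Noncentrality -0.5 * Delta (H^{P11})^{-1} Delta' in the scalar case.
    phi = -0.5 * delta * (1.0 / h_p11) * delta
    # Assumed convention: chi2(1, phi) is distributed as (Z + sqrt(2 * phi))^2.
    mu = math.sqrt(2.0 * phi)
    return 1.0 - standard_normal_cdf(z_crit - mu) + standard_normal_cdf(-z_crit - mu)

h_p11 = -0.25  # hypothetical H^{P11}; negative because -H^{P11} is p.d.
powers = {delta: local_power_r1(delta, h_p11) for delta in (0.0, 0.5, 1.0, 2.0)}
print(powers)
```

At \({\varvec{\Delta }}=\mathbf{0}\) the function returns the nominal level, and the power increases with \(|{\varvec{\Delta }}|\), in line with the statement that the noncentrality vanishes only under the null hypothesis.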

Like Formula (11), \({ LR }^{P}\) can be converted to a quadratic form as

$$\begin{aligned} { LR }^{P}&= -[N^{1/2}\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})(\mathbf{Q}^{P})^{\prime }-{\varvec{\Delta }}(\mathbf{H}^{P11})^{-1}]\mathbf{H}^{P11}[N^{1/2}\nabla _{\varvec{\upupsilon }} l_N^P ({\varvec{\upupsilon }}^{*})(\mathbf{Q}^{P})^{\prime }\\&-{\varvec{\Delta }}(\mathbf{H}^{P11})^{-1}]^{\prime }+o_p (1) \\&= -N\nabla _{\varvec{\upupsilon }} l_N^P (\mathbf{a},{\varvec{\upbeta }}^{*},{\varvec{\upxi }}_{+Y}^*)(\mathbf{Q}^{P})^{\prime }\mathbf{H}^{P11}\mathbf{Q}^{P}[\nabla _{\varvec{\upupsilon }} l_N^P (\mathbf{a},{\varvec{\upbeta }}^{*},{\varvec{\upxi }}_{+Y}^*)]^{\prime }+o_p (1) \\&= -N\nabla _{\varvec{\upalpha }} l_N^P (\mathbf{a},\hat{{\varvec{\upbeta }}}^{P},{\breve{{\varvec{\upxi }}}}_{+Y} )\mathbf{H}^{P11}[\nabla _{\varvec{\upalpha }} l_N^P (\mathbf{a},\hat{{\varvec{\upbeta }}}^{P},{\breve{{\varvec{\upxi }}}}_{+Y} )]^{\prime }+o_p (1)=Score^{P}+o_p (1). \end{aligned}$$

About this article

Cite this article

Vahl, C.I., Kang, Q. Analysis of an outcome-dependent enriched sample: hypothesis tests. Stat Methods Appl 24, 387–409 (2015). https://doi.org/10.1007/s10260-014-0285-4

  • DOI: https://doi.org/10.1007/s10260-014-0285-4
