Marker-dependent observation and carry-forward of internal covariates in Cox regression

Cook, Richard J.; Lawless, Jerald F.; Xie, Bingfeng

doi:10.1007/s10985-022-09561-9

Published: 20 June 2022

Volume 28, pages 560–584 (2022)

Lifetime Data Analysis

Abstract

Studies of chronic disease often involve modeling the relationship between marker processes and disease onset or progression. The Cox regression model is perhaps the most common and convenient approach to analysis in this setting. In most cohort studies, however, biospecimens and biomarker values are only measured intermittently (e.g. at clinic visits) so Cox models often treat biomarker values as fixed at their most recently observed values, until they are updated at the next visit. We consider the implications of this convention on the limiting values of regression coefficient estimators when the marker values themselves impact the intensity for clinic visits. A joint multistate model is described for the marker-failure-visit process which can be fitted to mitigate this bias and an expectation-maximization algorithm is developed. An application to data from a registry of patients with psoriatic arthritis is given for illustration.

References

Aalen O, Borgan Ø, Fekjær H (2001) Covariate adjustment of event histories estimated from Markov chains: the additive approach. Biometrics 57(4):993–1001
Altman D (1991) Categorising continuous variables. Br J Cancer 64(5):975
Andersen P, Gill R (1982) Cox’s regression model for counting processes: a large sample study. The Annals of Statistics 10(4):1100–1120
Andersen P, Liestol K (2003) Attenuation caused by infrequently updated covariates in survival analysis. Biostatistics 4(4):633–649
Andy S, Keeffe E (2003) Elevated AST or ALT to nonalcoholic fatty liver disease: accurate predictor of disease prevalence? The American Journal of Gastroenterology 98(5):955–956
Butler A, English E, Kilpatrick E, Östlundh L, Chemaitelly H, Abu-Raddad L, Alberti K, Atkin S, John W (2021) Diagnosing type 2 diabetes using Hemoglobin A1c: a systematic review and meta-analysis of the diagnostic cutpoint based on microvascular complications. Acta Diabetologica 58(3):279–300
Cook R, Lawless J (2018) Multistate Models for the Analysis of Life History Data. Chapman and Hall/CRC, Boca Raton, FL
Cook R, Lawless J (2021) Independence conditions and the analysis of life history studies with intermittent observation. Biostatistics 22(3):455–481
Datta S, Satten G (2001) Validity of the Aalen-Johansen estimators of stage occupation probabilities and Nelson-Aalen estimators of integrated transition hazards for non-Markov models. Statistics & Probability Letters 55(4):403–411
de Bruijne M, Cessie S, Kluin-Nelemans H, Houwelingen H (2001) On the use of Cox regression in the presence of an irregularly observed time-dependent covariate. Stat Med 20(24):3817–3829
Dempster A, Laird N, Rubin D (1977) Maximum likelihood from incomplete data via the EM algorithm. J Royal Stat Soc: Series B (Methodological) 39(1):1–38
Eeg-Olofsson K, Cederholm J, Nilsson P, Zethelius B, Svensson AM, Gudbjörnsdottir S, Eliasson B (2010) New aspects of HbA1c as a risk factor for cardiovascular diseases in type 2 diabetes: an observational study from the Swedish National Diabetes Register (NDR). J Intern Med 268(5):471–482
Gelman A, Park D (2009) Splitting a predictor at the upper quarter or third and the lower quarter or third. The American Statistician 63(1):1–8
Gladman D, Chandran V (2011) Observational cohort studies: lessons learnt from the University of Toronto Psoriatic Arthritis Program. Rheumatology 50(1):25–31
Jewell N, Kalbfleisch J (1996) Marker processes in survival analysis. Lifetime Data Anal 2(1):15–29
Jiang S, Cook R, Zeng L (2020) Mitigating bias from intermittent measurement of time-dependent covariates in failure time analysis. Stat Med 39(13):1833–1845
Lin D, Wei LJ (1989) The robust inference for the Cox proportional hazards model. Journal of the American Statistical Association 84(408):1074–1078
Louis T (1982) Finding the observed information matrix when using the EM algorithm. J Royal Stat Soc: Series B (Methodological) 44(2):226–233
Martinussen T (1999) Cox regression with incomplete covariate measurements using the EM-algorithm. Scandinavian J Stat 26(4):479–491
McQuarrie E, Traynor J, Taylor A, Freel E, Fox J, Jardine A, Mark P (2014) Association between urinary sodium, creatinine, albumin, and long-term survival in chronic kidney disease. Hypertension 64(1):111–117
Papageorgiou G, Mauff K, Tomer A, Rizopoulos D (2019) An overview of joint modeling of time-to-event and longitudinal outcomes. Annual Review of Statistics and Its Application 6:223–240
R Core Team (2018) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, https://www.R-project.org/
Raboud J, Reid N, Coates R, Farewell V (1993) Estimating risks of progressing to AIDS when covariates are measured with error. J Royal Stat Soc: Series A 156(3):393–406
Rahman P, Gladman D, Cook R, Zhou Y, Young G, Salonen D (1998) Radiological assessment in psoriatic arthritis. Br J Rheumatol 37(7):760–765
Royston P, Altman D, Sauerbrei W (2006) Dichotomizing continuous predictors in multiple regression: a bad idea. Stat Med 25(1):127–141
Struthers C, Kalbfleisch J (1986) Misspecified proportional hazard models. Biometrika 73(2):363–369
Tsiatis A, Davidian M (2001) A semiparametric estimator for the proportional hazards model with longitudinal covariates measured with error. Biometrika 88(2):447–458
Tsiatis A, Davidian M (2004) Joint modeling of longitudinal and time-to-event data: an overview. Statistica Sinica 14(3):809–834
Wong G, Chan H, Tse YK, Yip T, Lam K, Lui G, Wong V (2018) Normal on-treatment ALT during antiviral treatment is associated with a lower risk of hepatic events in patients with chronic hepatitis B. J Hepatol 69(4):793–802
Wulfsohn M, Tsiatis A (1997) A joint model for survival and longitudinal data measured with error. Biometrics 53(1):330–339

Acknowledgements

It is our great pleasure to contribute to the special issue of Lifetime Data Analysis in honour of David Oakes, whose many important contributions to the field of life history analysis are marked by their relevance, creativity, rigour, and clarity. We thank the Guest Editors Jong H. Jeong and Amita Manatunga for the opportunity to take part in this special and much-deserved recognition. This work was funded by the Natural Sciences and Engineering Research Council of Canada (RGPIN-2017-04207 for R.J.C. and RGPIN-2017-04055 for J.F.L.). R.J.C. is a Mathematics Faculty Research Chair at the University of Waterloo. The authors thank Drs. Dafna Gladman and Vinod Chandran of the Centre for Prognosis in Rheumatic Diseases for stimulating discussion and permission to use the data in the application.

Author information

Authors and Affiliations

Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, ON, N2L 3G1, Canada
Richard J. Cook, Jerald F. Lawless & Bingfeng Xie

Corresponding author

Correspondence to Richard J. Cook.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix

1.1 Computation of $r^{(k)}(s)= E( {\bar{Y}}_i(s) d N_i(s))$

Here we describe the computation of the expectation $r^{(0)}(s)= E\{ {\bar{Y}}_i(s) d N_i(s)\}$ based on the joint model. We use E to denote expectations taken with respect to the joint model depicted in Fig. 3 and P to denote the corresponding probabilities based on the joint process. Note that

$$\begin{aligned} r^{(0)}(s) = d \varLambda _2(s) {E}_{Z(0), X_2} \{ E_{{\bar{Y}}(s), X_1(s)} \left( {\bar{Y}}(s) \exp (\beta _1 X_1(s) +\beta _2'X_2) |Z(0), X_2\right) \} \end{aligned}$$

which can be computed as

$$\begin{aligned} d \varLambda _2(s) {E}_{{Z}(0),X_2} \{ \textstyle \sum _{k=0}^1 \exp (\beta _1 k + \beta _2' X_2) {P}( {{\mathcal {Z}}}(s) \in {{\mathcal {C}}}^z_k |\,Z(0), X_2) \}\,. \end{aligned}$$

(A.1)

For $k=1$ we partition $r^{(1)}(s) = E \{ {\bar{Y}}_i(s) X_i^{\circ }(s) d N_i(s) \}$ conformably with $X^\circ (t) = (X^\circ _1(t), X_2')'$ and write $r^{(1)}(s) = (r_1^{(1)}(s), [r^{(1)}_2(s)]')'$. The leading term has the form

$$\begin{aligned} {E}_{X_2} \left\{ E_{X^{\circ }(s), {\bar{Y}}(s)} \left[ E\{ {\bar{Y}}(s) X_1^{\circ }(s) d \varLambda _2(s | X(s)) | X_1^{\circ }(s)= 1, X_2, {\bar{Y}}(s) = 1 \} | X_2 \right] \right\} \,. \end{aligned}$$

This can be written as

$$\begin{aligned} d \varLambda _2(s) \, {E}_{{Z}(0), X_2} \{ \textstyle \sum _{k=0}^1 \exp ( \beta _1 k + \beta '_2 X_2) {P}(\mathcal {Z}(s) \in {{\mathcal {C}}}^{zx^\circ }_{k1} | {Z}(0), X_2) \} \end{aligned}$$

(A.2)

since ${\bar{Y}}(s) X_1^{\circ }(s) = 1$ if and only if ${{\mathcal {Z}}}(s) \in {{\mathcal {C}}}^{zx^\circ }_{01} \cup {{\mathcal {C}}}^{zx^\circ }_{11}$. For $r_2^{(1)}(s)$ we have

$$\begin{aligned} d \varLambda _2(s) {E}_{{Z}(0),X_2} \{ X_2 \textstyle \sum _{k=0}^1 \exp ( \beta _1 k + \beta '_2 X_2) {P}({{\mathcal {Z}}}(s) \in {{\mathcal {C}}}^{z}_k |{Z}(0), X_2) \} \,. \end{aligned}$$

(A.3)

1.2 Computation of $r^{(k)}(s;\psi )$

Note that $r^{(0)}(s; \psi ) = E \{ {\bar{Y}}_i(s) \exp (\psi _1 X_1^{\circ }(s) + \psi _2' X_2) \}$ is

$$\begin{aligned} {E}_{{Z}(0), X_2} \{ E_{{\bar{Y}}(s), X_1^{\circ }(s)} \{ {\bar{Y}}(s) \exp (\psi _1 X_1^{\circ }(s) + \psi '_2 X_2) |{Z}(0),X_2 \} \} \,, \end{aligned}$$

which is computed as

$$\begin{aligned} E_{{Z}(0),X_2} \{ \textstyle \sum _{l=0}^1 \exp (\psi _1 l + \psi _2'X_2) {P}( {{\mathcal {Z}}}(s) \in {{\mathcal {C}}}^{x^\circ }_l | {Z}(0), X_2) \}\,. \end{aligned}$$

(A.4)

We again partition $r^{(1)}(s; \psi ) = E \{ {\bar{Y}}(s) X^{\circ }(s) \exp (\psi ' X^{\circ }(s)) \} $ and note for the first element

$$\begin{aligned} r_1^{(1)}(s; \psi ) = {E}_{{Z}(0), X_2} \{ e^{\psi _1 + \psi '_2 X_2} {P}({{\mathcal {Z}}}(s) \in {{\mathcal {C}}}^{x^\circ }_1| {Z}(0), X_2) \} \,. \end{aligned}$$

(A.5)

The remaining $p\times 1$ vector $r_2^{(1)}(s;\psi )$ is given by

$$\begin{aligned} {E}_{{Z}(0), X_2} \{ X_2 \textstyle \sum _{l= 0}^1 \exp (\psi _1 l + \psi _2'X_2 ) {P}({{\mathcal {Z}}}(s) \in {{\mathcal {C}}}^{x^\circ }_l | {Z}(0), X_2) \} \,. \end{aligned}$$

(A.6)

Together equations (A.1)–(A.6) facilitate computation of (10) and hence determination of $\psi ^*$.
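The expressions (A.4)–(A.6) can be sketched numerically as follows. This is only an illustration under stated assumptions: the outer expectation over $(Z(0), X_2)$ is approximated by an average over sampled baseline values, and the occupancy probabilities $P({{\mathcal {Z}}}(s) \in {{\mathcal {C}}}^{x^\circ }_l | Z(0), X_2)$ are taken as already computed inputs. The function name `r_moments` and the argument layout are hypothetical, not taken from the article.

```python
import numpy as np

def r_moments(psi1, psi2, x2, p_carry):
    """Approximate r^(0)(s; psi), r_1^(1)(s; psi) and r_2^(1)(s; psi) at one time s.

    x2      : (n, p) array of sampled baseline covariates X_2 (assumed input)
    p_carry : (n, 2) array; column l holds P(Z(s) in C^{x°}_l | Z(0), X_2)
              for the carried-forward marker value l = 0, 1 (assumed input)
    """
    x2 = np.asarray(x2, dtype=float)
    p_carry = np.asarray(p_carry, dtype=float)
    # exp(psi_1 * l + psi_2' X_2) * P(...) for l = 0, 1, one row per sampled subject
    marker_effect = np.exp(psi1 * np.array([0.0, 1.0]))
    weights = np.exp(x2 @ np.asarray(psi2, dtype=float))[:, None] * marker_effect[None, :] * p_carry
    row_sum = weights.sum(axis=1)
    r0 = row_sum.mean()                                # (A.4)
    r1_first = weights[:, 1].mean()                    # (A.5): only l = 1 contributes
    r1_rest = (x2 * row_sum[:, None]).mean(axis=0)     # (A.6)
    return r0, r1_first, r1_rest
```

The two probabilities in a row need not sum to one, because they also carry the at-risk indicator ${\bar{Y}}(s)$. Evaluating such a function on a grid of times $s$ would supply the ingredients needed in (10).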

Joint model fitting via an EM algorithm

In what follows we give the complete data partial log-likelihoods and score functions for the intensities governing transitions in Fig. 3 and use ${{\mathcal {L}}}$ to distinguish these likelihoods from the observed data log-likelihood. The complete data here is conceived as containing the times of all marker transitions over (0, V), in which case the complete data likelihood can be factored into component functions that can be maximized separately (Cook and Lawless 2018, Chapter 2).

1.1 Complete data scores for the marker process

Markov intensities are adopted for the marker process giving the complete data partial log-likelihood contributions of the form

$$\begin{aligned} \log {{\mathcal {L}}}_j = \sum _{i=1}^n \int _0^{\infty } {\bar{Y}}_{ij}(s) \left\{ d N_{ij}(s) \log d \varLambda _j(s | X_{i2}) - d \varLambda _j(s | X_{i2}) \right\} \,,~ j = 0, 1 \,, \end{aligned}$$

with $d \varLambda _j(t | X_2) = d \varLambda _{jo}(t) \exp (\gamma '_j X_2)$ where $d \varLambda _{jo}(t)$ is the baseline transition rate. Flexibility is desired for the marker process, but semiparametric methods are challenging since the $j \rightarrow 1-j$ transitions are not observed. We therefore adopt piecewise-constant baseline rate functions and let $0 = b_{j0}< b_{j1}< \cdots< b_{j, K_j - 1} < b_{j K_j} = \infty $ denote cutpoints for the $j \rightarrow 1-j$ baseline rate, giving ${{\mathcal {B}}}_{jk} = [b_{j, k-1}, b_{jk})$, $k = 1, \ldots , K_j$, $j = 0, 1$. We let $d \varLambda _{jo}(s)/ds = \exp (\alpha _{jk})$ if $s \in {{\mathcal {B}}}_{jk}$ and $\alpha _j = (\alpha _{j1}, \ldots , \alpha _{j K_j})'$. The complete data score functions for the $\alpha _{jk}$ are

$$\begin{aligned} \sum _{i=1}^n \int _{0}^\infty {\bar{Y}}_{ij}(s) I(s \in {{\mathcal {B}}}_{jk}) [ dN_{ij}(s) - d\varLambda _j(s|X_{i2}) ] \end{aligned}$$

(B.1)

which can be rewritten as

$$\begin{aligned} U_{jk} = \sum _{i=1}^n \left[ N_{ijk} - e^{\alpha _{jk} + \gamma _j' X_{i2}} S_{ijk} \right] \,, \end{aligned}$$

(B.2)

where $N_{ijk} = \int _0^{\infty } I(s \in {{\mathcal {B}}}_{jk}) d {\bar{N}}_{ij}(s)$ is the number of $j \rightarrow 1-j$ transitions over ${{\mathcal {B}}}_{jk}$ for individual i and $S_{ijk} = \int _0^{\infty } {\bar{Y}}_{ij}(s) I(s \in {{\mathcal {B}}}_{jk}) ds$ is their time at risk for a transition out of state j in sub-interval ${{\mathcal {B}}}_{jk}$, $j=0, 1$. The complete data partial score for $\gamma _j$ is

$$\begin{aligned} U_{j, K_j + 1}&= \sum _{i=1}^n \int _0^{\infty } {\bar{Y}}_{ij}(u) \left[ d N_{ij}(u) - d \varLambda _j(u | X_{i2}) \right] X_{i2} \nonumber \\&= \sum _{i=1}^n \sum _{k=1}^{K_j} \left[ N_{ijk} - e^{\alpha _{jk} + \gamma '_j X_{i2}} S_{ijk} \right] X_{i2} \end{aligned}$$

(B.3)

and we write $U_j = (U_{j1}, \ldots , U_{j K_j}, U'_{j, K_j + 1})'$ which is the estimating function for $\theta _j = (\alpha '_j, \gamma '_j)'$. The key elements missing from the intermittent visit process are $S_{ij} = (S_{ij1}, \ldots , S_{ijK_j})'$ and $N_{ij} = (N_{ij1}, \ldots , N_{ijK_j})'$, $j = 0, 1$.
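The quantities $N_{ijk}$ and $S_{ijk}$ and the score (B.2) can be illustrated for a fully observed marker path. The sketch below assumes the path is stored as entry times and the states entered, with follow-up ending at a known time; the function names `counts_and_exposure` and `score_alpha` are hypothetical helpers, not the authors' code.

```python
import numpy as np

def counts_and_exposure(entry_times, states, end_time, cuts, j):
    """Return (N_k, S_k) over the pieces B_k = [b_{k-1}, b_k) for marker state j.

    entry_times : increasing times at which each sojourn starts (first is 0)
    states      : marker state (0 or 1) occupied during each sojourn
    end_time    : end of follow-up; the last sojourn stops here without a transition
    cuts        : finite interior cutpoints b_1 < ... < b_{K-1}
    """
    edges = np.concatenate(([0.0], np.asarray(cuts, dtype=float), [np.inf]))
    n_counts = np.zeros(len(edges) - 1)
    exposure = np.zeros(len(edges) - 1)
    exits = list(entry_times[1:]) + [end_time]
    for m, (start, stop, state) in enumerate(zip(entry_times, exits, states)):
        if state != j:
            continue
        # overlap of the sojourn [start, stop) with each piece
        lo = np.maximum(edges[:-1], start)
        hi = np.minimum(edges[1:], stop)
        exposure += np.clip(hi - lo, 0.0, None)
        # every sojourn but the last ends with a j -> 1-j transition at `stop`
        if m < len(states) - 1:
            n_counts[np.searchsorted(edges, stop, side="right") - 1] += 1
    return n_counts, exposure

def score_alpha(alpha, gamma, x2, n_counts, exposure):
    """Score (B.2): sum over i of N_ijk - exp(alpha_jk + gamma_j' X_i2) S_ijk."""
    rate = np.exp(np.asarray(alpha)[None, :] + (np.asarray(x2) @ np.asarray(gamma))[:, None])
    return (np.asarray(n_counts) - rate * np.asarray(exposure)).sum(axis=0)
```

For fixed $\gamma _j$, setting $U_{jk} = 0$ gives $e^{\alpha _{jk}} = \sum _i N_{ijk} / \sum _i e^{\gamma _j' X_{i2}} S_{ijk}$. Under intermittent observation the inputs `n_counts` and `exposure` are not available and would be replaced by their conditional expectations given the observed data.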

1.2 Complete data scores for failure and censoring intensities

Here the complete data partial log-likelihood for the failure $(l=2)$ and censoring $(l=3)$ process intensities is given by

$$\begin{aligned} \log {{\mathcal {L}}}_l = \sum _{i=1}^n \int _0^{\infty } {\bar{Y}}_i(s) \left\{ d N_{il}(s) \log d \varLambda _l(s | X_i(s)) - d \varLambda _l(s | X_i(s)) \right\} \,. \end{aligned}$$

We let $0 = b_{l0}< b_{l1}< \cdots< b_{l, K_l - 1} < b_{l K_l} = \infty $ define $K_l - 1$ finite cutpoints, giving intervals ${{\mathcal {B}}}_{lk}= [b_{l,k-1}, b_{lk})$ for the piecewise-constant intensities of the form $d \varLambda _{lo}(t)/dt = \exp (\alpha _{lk})$ if $t \in {{\mathcal {B}}}_{lk}$, $k = 1, \ldots , K_l$, where $d \varLambda _l(t | X(t)) = d \varLambda _{lo}(t) \exp (\beta ' X(t))$ if $l=2$ and $d \varLambda _l(t | X(t)) = d \varLambda _{lo}(t) \exp (\eta _c' X(t))$ if $l=3$. The complete data score function for $d\varLambda _{lo}(s)$ is

$$\begin{aligned} \sum _{i=1}^n \int _{0}^\infty {\bar{Y}}_{i}(s) [ dN_{il}(s) - d\varLambda _l(s|X_i(s)) ] \end{aligned}$$

(B.4)

which under the piecewise-constant hazard model gives $U_{lk} = \partial \log {{\mathcal {L}}}_l / \partial \alpha _{lk}$ of the form

$$\begin{aligned} \sum _{i=1}^n \left\{ \int _0^{\infty }{\bar{Y}}_i(u) I(u \in {{\mathcal {B}}}_{lk}) d N_{il}(u) - e^{\alpha _{lk}} \left[ {{\mathcal {S}}}_{i0k}^{(l)} + {{\mathcal {S}}}_{i1k}^{(l)} e^{\phi _1} \right] e^{\phi '_2 X_{i2}} \right\} \,, \end{aligned}$$

(B.5)

where $\phi = (\phi _1,\phi _2')'$ equals $\beta $ for $l=2$ and $\eta _c$ for $l=3$, and ${{\mathcal {S}}}_{ijk}^{(l)} = \int _0^{\infty } {\bar{Y}}_{ij}(s) I(s \in {{\mathcal {B}}}_{lk}) ds$; the superscript (l) reflects the fact that this is the time at risk of a transition out of state j in interval ${{\mathcal {B}}}_{lk}$ (as opposed to ${{\mathcal {B}}}_{jk}$, $j = 0, 1$). For $\phi $ we have $U_{l, K_l + 1} = \partial \log {{\mathcal {L}}}_l / \partial \phi $ given by

$$\begin{aligned} \sum _{i=1}^n \sum _{k=1}^{K_{l}} \int _{{{\mathcal {B}}}_{lk}} {\bar{Y}}_i(u) I(u \in {{\mathcal {B}}}_{lk}) \left[ d N_{il}(u) - e^{\alpha _{lk} + \phi ' X_i(u)} du\right] X_i(u)\,. \end{aligned}$$

(B.6)
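The step from (B.4) to (B.5) can be spelled out as follows. This is a sketch that assumes, as the notation suggests, that $X_i(u) = (X_{i1}(u), X_{i2}')'$ with a binary marker $X_{i1}(u)$ and that ${\bar{Y}}_{ij}(u) = {\bar{Y}}_i(u) I(X_{i1}(u) = j)$. For $u \in {{\mathcal {B}}}_{lk}$ the intensity is $e^{\alpha _{lk} + \phi _1 X_{i1}(u) + \phi _2' X_{i2}}$, so the compensator term splits by marker state:

$$\begin{aligned} \int _0^{\infty } {\bar{Y}}_i(u) I(u \in {{\mathcal {B}}}_{lk}) e^{\alpha _{lk} + \phi ' X_i(u)} du&= e^{\alpha _{lk} + \phi _2' X_{i2}} \sum _{j=0}^1 e^{\phi _1 j} \int _0^{\infty } {\bar{Y}}_{ij}(u) I(u \in {{\mathcal {B}}}_{lk}) du \\&= e^{\alpha _{lk}} \left[ {{\mathcal {S}}}_{i0k}^{(l)} + {{\mathcal {S}}}_{i1k}^{(l)} e^{\phi _1} \right] e^{\phi _2' X_{i2}} \,. \end{aligned}$$

In (B.5) the unobserved marker path therefore enters the compensator only through the two occupation times ${{\mathcal {S}}}_{i0k}^{(l)}$ and ${{\mathcal {S}}}_{i1k}^{(l)}$.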

1.3 Complete data scores for modulated Poisson visit process intensities

Here we let $d \varLambda _4(t | X_i(t))$ be the rate function for visits, giving a complete data partial log-likelihood for the visit intensity

$$\begin{aligned} \log {{\mathcal {L}}}_4 \propto \sum _{i=1}^n \int _0^{\infty } {\bar{Y}}_i(s) \left\{ d A_i(s) \log d \varLambda _4(s | X_i(s)) - d \varLambda _4(s | X_i(s)) \right\} \,. \end{aligned}$$

For the visit process, we let $0 = b_{40}< b_{41}< \cdots< b_{4, K_4 - 1} < b_{4 K_4} = \infty $ define $K_4 - 1$ finite cutpoints, giving intervals ${{\mathcal {B}}}_{4k}= [b_{4,k-1},b_{4k})$, and let $d \varLambda _{4o}(t)/dt = \exp (\alpha _{4k})$ if $t \in {{\mathcal {B}}}_{4k}$, $k = 1, \ldots , K_4$, where $d \varLambda _4(t | X(t)) = d \varLambda _{4o}(t) \exp (\eta _a' X(t))$. The complete data score functions are $U_{4k} = \partial \log {{\mathcal {L}}}_4 / \partial \alpha _{4k}$ and $U_{4, K_4 + 1} = \partial \log {{\mathcal {L}}}_4 / \partial \eta _a$ given by

$$\begin{aligned} U_{4k} = \sum _{i=1}^n \left\{ \int _{{{\mathcal {B}}}_{4k}} {\bar{Y}}_i(u) I(u \in {{\mathcal {B}}}_{4k}) dA_i(u) - e^{\alpha _{4k}} \left[ {{\mathcal {S}}}_{i0k}^{(4)} e^{\eta _a' (0, X_{i2}')'} + {{\mathcal {S}}}_{i1k}^{(4)} e^{\eta _a' (1, X_{i2}')'}\right] \right\} \end{aligned}$$

(B.7)

and

$$\begin{aligned} U_{4, K_4 + 1} = \sum _{i=1}^n \sum _{k=1}^{K_4} \int _0^{\infty } {\bar{Y}}_i(u) I(u \in {{\mathcal {B}}}_{4k}) \left[ d A_{i}(u) - e^{\alpha _{4k} + \eta '_a X_i(u) } du \right] X_i(u) \,. \end{aligned}$$

(B.8)

1.4 Computation of the conditional expectations

The elements that are missing include $S_{ijk}$ and $N_{ijk}$ in the complete data estimating functions for marker processes, and the covariate path $\{X_i(u), 0 < u \}$ for the failure, censoring and visit process intensities. We let

$$\begin{aligned} D_i = \{ {\bar{Y}}_i(s), d {\bar{N}}_{i2}(s), d {\bar{N}}_{i3}(s), d A_i(s), X_{i1}^{\circ }(s), 0< s <V_i, X_{i2} \} \end{aligned}$$

denote the observed data for individual i, $i = 1, \ldots , n$. Then

$$\begin{aligned} E\{S_{ijk} | D_i \} = \int _0^{\infty } I(s \in {{\mathcal {B}}}_{jk}) E\{ {\bar{Y}}_{ij}(s) | D_i \} ds = \int _{{{\mathcal {B}}}_{jk}} {\bar{Y}}_i(s) P({{\mathcal {Z}}}_i(s) \in {{\mathcal {C}}}_j^z | D_i) ds \end{aligned}$$

and

$$\begin{aligned} E\{ N_{ijk} | D_i \} = \int _{{{\mathcal {B}}}_{jk}} {\bar{Y}}_i(s) P({{\mathcal {Z}}}_i(s) \in {{\mathcal {C}}}_j^z | D_i) d \varLambda _j(s | X_{i2}) \,. \end{aligned}$$

For the other estimating functions we require only

$$\begin{aligned} E\{S_{ijk}^{(l)} | D_i \} = \int _{{{\mathcal {B}}}_{lk}} {\bar{Y}}_i(s) P( {{\mathcal {Z}}}_i(s) \in {{\mathcal {C}}}_j^z | D_i) ds \,. \end{aligned}$$

All of these expectations are based on the conditional probability $P({{\mathcal {Z}}}_i(s) \in {{\mathcal {C}}}_j^z | D_i)$. We discuss how to compute this here by considering two special cases.
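Once the conditional probability is available as a function of $s$, the expected time at risk is a one-dimensional integral over the part of the interval that precedes $V_i$. A minimal numerical sketch follows, assuming ${\bar{Y}}_i(s) = 1$ for $s < V_i$ and a user-supplied callable `post_prob` returning $P({{\mathcal {Z}}}_i(s) \in {{\mathcal {C}}}_j^z | D_i)$; both the callable and the function name are hypothetical.

```python
import numpy as np

def expected_time_at_risk(post_prob, lower, upper, v_i, n_grid=2001):
    """Trapezoid approximation to E{S_ijk | D_i} over [lower, upper) truncated at v_i.

    post_prob : callable s -> P(Z_i(s) in C_j^z | D_i) (assumed to be available)
    lower, upper : interval end points b_{k-1} and b_k (upper may be np.inf)
    v_i : end of follow-up for the individual
    """
    hi = min(upper, v_i)
    if hi <= lower:
        return 0.0  # the interval lies entirely after follow-up ends
    grid = np.linspace(lower, hi, n_grid)
    probs = np.array([post_prob(s) for s in grid])
    # composite trapezoid rule written out explicitly
    return float(np.sum(0.5 * (probs[1:] + probs[:-1]) * np.diff(grid)))
```

The expectation $E\{ N_{ijk} | D_i \}$ has the same structure with $ds$ replaced by $d \varLambda _j(s | X_{i2})$, which under the piecewise-constant model is the same integral multiplied by $e^{\alpha _{jk} + \gamma _j' X_{i2}}$.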

1.5 Case 1: $a_{i r_i(u)}< u < a_{i, r_i(u) + 1}$ where $A_i(u) = r_i(u)$

The numerator of $P({{\mathcal {Z}}}_i(u) \in {{\mathcal {C}}}_1^z | D_i)$ is given by

$$\begin{aligned}&\prod _{j=1}^{r_i(u)} P({{\mathcal {Z}}}_i(a_{ij}^-) = (j-1, X_{i1}(a_{ij}^-), X_{i1}^{\circ }(a_{ij}^-)) | \bar{{\mathcal {H}}}_i^{\circ }(a_{ij}^-)) \lambda _4(a_{ij} | X_{i1}(a_{ij}^-), \bar{{\mathcal {H}}}_i^{\circ }(a_{ij}^-)) \\&\quad \times \biggl \{ P({{\mathcal {Z}}}_i(u) = (r_i(u), x_1 = 1, X_{i1}^{\circ }(a_{i r_i(u)})) | {{\mathcal {H}}}_i(a_{i r_i(u)}^{+})) \\&\qquad \times P({{\mathcal {Z}}}_i(a_{i, r_i(u) + 1}^{-}) = (r_i(u), X_{i1}(a_{i, r_i(u) + 1}^{-}), X_{i1}^{\circ }(a_{i, r_i(u) + 1}^{-})) | {{\mathcal {Z}}}_i(u) = z_i(u), \bar{{\mathcal {H}}}_i^{\circ }(a_{i, r_i(u) + 1}^{-})) \\&\qquad \times \lambda _4(a_{i, r_i(u) + 1} | X_{i1}(a_{i, r_i(u) + 1}^{-}), \bar{{\mathcal {H}}}_i^{\circ }(a_{i, r_i(u) + 1}^{-})) \biggr \} \\&\quad \times \prod _{j = r_i(u) + 2}^{r_i} P( {{\mathcal {Z}}}_i(a_{ij}^-) = (j-1, X_{i1}(a_{ij}^-), X_{i1}^{\circ }(a_{ij}^-)) | \bar{{\mathcal {H}}}_i^{\circ }(a_{ij}^-)) \lambda _4(a_{ij} | X_{i1}(a_{ij}^-), \bar{{\mathcal {H}}}_i^{\circ }(a_{ij}^-)) \\&\quad \times \sum _{x_1=0}^1 \biggl \{ P({{\mathcal {Z}}}_i(V_i^-) = (r_i, X_{i1}(V_i^-) = x_1, X_{i1}^{\circ }(V_i^-)) | \bar{{\mathcal {H}}}_i^{\circ }(V_i^-)) \\&\qquad \times [\lambda _2(V_i | x_1, X_{i2})]^{\delta _i} [\lambda _3(V_i | x_1, X_{i2})]^{1 - \delta _i} \biggr \} \end{aligned}$$

where $z_i(u) = (r_i(u), 1, X_{i1}^{\circ }(u))$ and the denominator is given by the observed data likelihood (12).

1.6 Case 2: $a_{i r_i}< u < V_i$

Here the numerator of $P({{\mathcal {Z}}}_i(u) \in {{\mathcal {C}}}_1^z | D_i)$ is given by

$$\begin{aligned}&\prod _{j=1}^{r_i} P({{\mathcal {Z}}}_i(a_{ij}^-) = (j-1, X_{i1}(a_{ij}^-), X_{i1}^{\circ }(a_{ij}^-)) | \bar{{\mathcal {H}}}_i^{\circ }(a_{ij}^-)) \lambda _4(a_{ij} | X_{i1}(a_{ij}^-), \bar{{\mathcal {H}}}_i^{\circ }(a_{ij}^-)) \\&\quad \times \sum _{x_1^{\dag }=0}^{1} \biggl \{ P( {{\mathcal {Z}}}_i(u) = (r_i, x_1 = 1, X_{i1}^{\circ }(u)) | \bar{{\mathcal {H}}}_i^{\circ }(a_{ir_i}^+)) \\&\qquad \times P({{\mathcal {Z}}}_i(V_i) = (r_i, x_1^{\dag }, X_{i1}^{\circ }(V_i)) | {{\mathcal {Z}}}_i(u) = z_i(u), \bar{{\mathcal {H}}}_i^{\circ }(V_i)) \\&\qquad \times [\lambda _2(V_i | x_1^{\dag }, X_{i2})]^{\delta _i} [\lambda _3(V_i | x_1^{\dag }, X_{i2})]^{1 - \delta _i} \biggr \} \end{aligned}$$

where $z_i(u) = (r_i(u), 1, X_{i1}^{\circ }(u))$ and the denominator is given by the observed data likelihood (12).
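The numerator-over-denominator structure of these two cases can be illustrated in a deliberately simplified setting. The sketch below is not the computation above: it assumes a time-homogeneous two-state marker chain with constant rates `a` ($0 \rightarrow 1$) and `b` ($1 \rightarrow 0$), assumes the marker is observed exactly at two bracketing times, and ignores the failure, censoring and visit intensities altogether. All names are hypothetical.

```python
import math

def trans_prob(x_from, x_to, t, a, b):
    """Transition probability over a gap t for a two-state chain with rates a (0->1), b (1->0)."""
    total = a + b
    decay = math.exp(-total * t)
    p01 = a / total * (1.0 - decay)
    p10 = b / total * (1.0 - decay)
    if x_from == 0:
        return p01 if x_to == 1 else 1.0 - p01
    return p10 if x_to == 0 else 1.0 - p10

def smoothed_prob(x_mid, x0, x1, t0, u, t1, a, b):
    """P(X(u) = x_mid | X(t0) = x0, X(t1) = x1) for t0 < u < t1.

    Numerator: path probability through x_mid at time u.
    Denominator: probability of the observed end points (the analogue of the
    observed data likelihood in this simplified setting).
    """
    numerator = trans_prob(x0, x_mid, u - t0, a, b) * trans_prob(x_mid, x1, t1 - u, a, b)
    denominator = trans_prob(x0, x1, t1 - t0, a, b)
    return numerator / denominator
```

In the full joint model each factor would additionally account for the absence of visits and of failure or censoring between observation times, and for the marker-dependent visit intensity $\lambda _4$ at each visit, as in the displayed expressions.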

About this article

Cite this article

Cook, R.J., Lawless, J.F. & Xie, B. Marker-dependent observation and carry-forward of internal covariates in Cox regression. Lifetime Data Anal 28, 560–584 (2022). https://doi.org/10.1007/s10985-022-09561-9

Received: 01 November 2021
Accepted: 07 June 2022
Published: 20 June 2022
Issue Date: October 2022
DOI: https://doi.org/10.1007/s10985-022-09561-9
