Bayesian tree-based heterogeneous mediation analysis with a time-to-event outcome

Sun, Rongqian; Song, Xinyuan

doi:10.1007/s11222-023-10340-1

Bayesian tree-based heterogeneous mediation analysis with a time-to-event outcome

Original Paper
Published: 01 November 2023

Volume 34, article number 24, (2024)
Cite this article

Statistics and Computing Aims and scope Submit manuscript

Rongqian Sun¹ &
Xinyuan Song¹

313 Accesses
Explore all metrics

Abstract

Mediation analysis aims at quantifying and explaining the underlying causal mechanism between an exposure and an outcome of interest. In the context of survival analysis, mediation models have been widely used to achieve causal interpretation for the direct and indirect effects on the survival of interest. Although heterogeneity in treatment effect is drawing increasing attention in biomedical studies, none of the existing methods have accommodated the presence of heterogeneous causal pathways pointing to a time-to-event outcome. In this study, we consider a heterogeneous mediation analysis for survival data based on a Bayesian tree-based Cox proportional hazards model with shared topologies. Under the potential outcomes framework, individual-specific conditional direct and indirect effects are derived on the scale of the logarithm of hazards, survival probability, and restricted mean survival time. A Bayesian approach with efficient sampling strategies is developed to estimate the conditional causal effects through the Monte Carlo implementation of the mediation formula. Simulation studies show the satisfactory performance of the proposed method. The proposed model is then applied to an HIV dataset extracted from the ACTG175 study to demonstrate its usage in detecting heterogeneous causal pathways.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Does Cox analysis of a randomized survival study yield a causal treatment effect?

Article 24 June 2015

The built-in selection bias of hazard ratios formalized using structural causal models

Article Open access 15 February 2024

Defining causal mediation with a longitudinal mediator and a survival outcome

Article 14 September 2018

References

Albert, J.M., Nelson, S.: Generalized causal mediation analysis. Biometrics 67(3), 1028–1038 (2011)
MathSciNet Google Scholar
Athey, S., Wager, S.: Estimating treatment effects with causal forests: an application. Obs. Stud. 5(2), 37–51 (2019)
Google Scholar
Avin, C., Shpitser, I., Pearl, J.: Identifiability of path-specific effects. In: Proceedings of the 19th International Joint Conference on Artificial Intelligence, pp. 357–363. Edinburgh, Scotland (2005)
Baron, R.M., Kenny, D.A.: The moderator-mediator variable distinction in social psychological research: conceptual, strategic, and statistical considerations. J. Pers. Soc. Psychol. 51(6), 1173 (1986)
Google Scholar
Bonetti, M., Gelber, R.D.: Patterns of treatment effects in subsets of patients in clinical trials. Biostatistics 5(3), 465–481 (2004)
Google Scholar
Chipman, H.A., George, E.I., McCulloch, R.E.: BART: Bayesian additive regression trees. Ann. Appl. Stat. 4(1), 266–298 (2010)
MathSciNet Google Scholar
Cho, S.-H., Huang, Y.-T.: Mediation analysis with causally ordered mediators using Cox proportional hazards model. Stat. Med. 38(9), 1566–1581 (2019)
MathSciNet Google Scholar
Cui, Y., Kosorok, M.R., Sverdrup, E., Wager, S., Zhu, R.: Estimating heterogeneous treatment effects with right-censored data via causal survival forests. J. R. Stat. Soc. Ser. B Stat Methodol. 85(2), 179–211 (2023)
Google Scholar
Curth, A., Lee, C., van der Schaar, M.: SurvITE: learning heterogeneous treatment effects from time-to-event data. Adv. Neural. Inf. Process. Syst. 34, 26740–26753 (2021)
Google Scholar
Daniel, R., De Stavola, B., Cousens, S., Vansteelandt, S.: Causal mediation analysis with multiple mediators. Biometrics 71(1), 1–14 (2015)
MathSciNet Google Scholar
Didelez, V.: Defining causal mediation with a longitudinal mediator and a survival outcome. Lifetime Data Anal. 25, 593–610 (2019)
MathSciNet Google Scholar
Dorie, V., Hill, J., Shalit, U., Scott, M., Cervone, D.: Automated versus do-it-yourself methods for causal inference: lessons learned from a data analysis competition. Stat. Sci. 34(1), 43–68 (2019)
MathSciNet Google Scholar
Fulcher, I.R., Tchetgen, E.T., Williams, P.L.: Mediation analysis for censored survival data under an accelerated failure time model. Epidemiology (Cambridge, Mass.) 28(5), 660 (2017)
Hahn, P.R., Murray, J.S., Carvalho, C.M.: Bayesian regression tree models for causal inference: regularization, confounding, and heterogeneous effects (with discussion). Bayesian Anal. 15(3), 965–1056 (2020)
MathSciNet Google Scholar
Hammer, S.M., Katzenstein, D.A., Hughes, M.D., Gundacker, H., Schooley, R.T., Haubrich, R.H., Henry, W.K., Lederman, M.M., Phair, J.P., Niu, M., et al.: A trial comparing nucleoside monotherapy with combination therapy in HIV-infected adults with CD4 cell counts from 200 to 500 per cubic millimeter. N. Engl. J. Med. 335(15), 1081–1090 (1996)
Google Scholar
Hastings, W.K.: Monte Carlo sampling methods using Markov chains and their applications. Biometrika 57(1), 97–109 (1970)
MathSciNet Google Scholar
Hayes, A.F.: An index and test of linear moderated mediation. Multivar. Behav. Res. 50(1), 1–22 (2015)
MathSciNet Google Scholar
Henderson, N.C., Louis, T.A., Rosner, G.L., Varadhan, R.: Individualized treatment effects with censored data via fully nonparametric Bayesian accelerated failure time models. Biostatistics 21(1), 50–68 (2020)
MathSciNet Google Scholar
Hill, J., Linero, A., Murray, J.: Bayesian additive regression trees: a review and look forward. Annu. Rev. Stat. Appl. 7(1), 251–278 (2020)
MathSciNet Google Scholar
Hu, L., Ji, J., Li, F.: Estimating heterogeneous survival treatment effect in observational data using machine learning. Stat. Med. 40(21), 4691–4713 (2021)
MathSciNet Google Scholar
Huang, Y.-T., Yang, H.-I.: Causal mediation analysis of survival outcome with multiple mediators. Epidemiology (Cambridge, Mass.), 28(3), 370 (2017)
Ibrahim, J.G., Chen, M.-H., Sinha, D.: Bayesian Survival Analysis. Springer, New York (2001)
Google Scholar
Imai, K., Keele, L., Yamamoto, T.: Identification, inference and sensitivity analysis for causal mediation effects. Stat. Sci. 25(1), 51–71 (2010)
MathSciNet Google Scholar
Jenks, J.D., Hoenigl, M.: CD4: CD8 ratio and CD8+ cell count for prognosticating mortality in HIV-infected patients on antiretroviral therapy. J. Lab. Precis. Med., 3 (2018)
Kang, K., Pan, D., Song, X.: A joint model for multivariate longitudinal and survival data to discover the conversion to Alzheimer’s disease. Stat. Med. 41(2), 356–373 (2022)
MathSciNet Google Scholar
Lin, S.-H., Young, J.G., Logan, R., VanderWeele, T.J.: Mediation analysis for a survival outcome with time-varying exposures, mediators, and confounders. Stat. Med. 36(26), 4153–4166 (2017)
MathSciNet Google Scholar
Linero, A.R.: Bayesian regression trees for high-dimensional prediction and variable selection. J. Am. Stat. Assoc. 113(522), 626–636 (2018)
MathSciNet Google Scholar
Linero, A.R., Basak, P., Li, Y., Sinha, D.: Bayesian survival tree ensembles with submodel shrinkage. Bayesian Anal. 17(3), 997–1020 (2022)
MathSciNet Google Scholar
Linero, A.R., Sinha, D., Lipsitz, S.R.: Semiparametric mixed-scale models using shared Bayesian forests. Biometrics 76(1), 131–144 (2020)
MathSciNet Google Scholar
Linero, A.R., Yang, Y.: Bayesian regression tree ensembles that adapt to smoothness and sparsity. J. R. Stat. Soc. Ser. B Stat Methodol. 80(5), 1087–1110 (2018)
MathSciNet Google Scholar
Linero, A. R., Zhang, Q.: Mediation analysis using Bayesian tree ensembles. Psychol. Methods (2022)
Lu, W., Zhang, H.H., Zeng, D.: Variable selection for optimal treatment decision. Stat. Methods Med. Res. 22(5), 493–504 (2013)
MathSciNet Google Scholar
MacKinnon, D.P., Fairchild, A.J., Fritz, M.S.: Mediation analysis. Annu. Rev. Psychol. 58, 593–614 (2007)
Google Scholar
MacKinnon, D.P., Lockwood, C.M., Hoffman, J.M., West, S.G., Sheets, V.: A comparison of methods to test mediation and other intervening variable effects. Psychol. Methods 7(1), 83 (2002)
Google Scholar
Malinsky, D., Shpitser, I., Richardson, T.: A Potential Outcomes Calculus for Identifying Conditional Path-Specific Effects. In: The 22nd International Conference on Artificial Intelligence and Statistics, pp. 3080–3088. PMLR (2019)
Metropolis, N., Rosenbluth, A.W., Rosenbluth, M.N., Teller, A.H., Teller, E.: Equation of state calculations by fast computing machines. J. Chem. Phys. 21(6), 1087–1092 (1953)
Monsalvo, M., Vallejo, A., Fontecha, M., Vivancos, M.J., Vizcarra, P., Casado, J.L.: CD4/CD8 ratio improvement in HIV-1-infected patients receiving dual antiretroviral treatment. Int. J. STD AIDS 30(7), 656–662 (2019)
Murray, J.S.: Log-linear Bayesian additive regression trees for multinomial logistic and count regression models. J. Am. Stat. Assoc. 116(534), 756–769 (2021)
MathSciNet Google Scholar
Pearl, J.: Direct and indirect effects. In Proceedings of the Seventeenth Conference on Uncertainty and Artificial Intelligence, pp. 411–420. Morgan Kaufman, San Francisco, CA (2001)
Preacher, K.J., Rucker, D.D., Hayes, A.F.: Addressing moderated mediation hypotheses: theory, methods, and prescriptions. Multivar. Behav. Res. 42(1), 185–227 (2007)
Google Scholar
Qin, J., Shen, Y.: Statistical methods for analyzing right-censored length-biased data under cox model. Biometrics 66(2), 382–392 (2010)
MathSciNet Google Scholar
Royston, P., Parmar, M.K.: The use of restricted mean survival time to estimate the treatment effect in randomized clinical trials when the proportional hazards assumption is in doubt. Stat. Med. 30(19), 2409–2421 (2011)
MathSciNet Google Scholar
Rubin, D.B.: Estimating causal effects of treatments in randomized and nonrandomized studies. J. Educ. Psychol. 66(5), 688–701 (1974)
Rubin, D.B.: Causal inference using potential outcomes: design, modeling, decisions. J. Am. Stat. Assoc. 100(469), 322–331 (2005)
MathSciNet Google Scholar
Sinha, D., Ibrahim, J.G., Chen, M.-H.: A Bayesian justification of Cox’s partial likelihood. Biometrika 90(3), 629–641 (2003)
MathSciNet Google Scholar
Sun, R., Zhou, X., Song, X.: Bayesian causal mediation analysis with latent mediators and survival outcome. Struct. Equ. Model. 28(5), 778–790 (2021)
MathSciNet Google Scholar
Tan, Y.V., Roy, J.: Bayesian additive regression trees and the general BART model. Stat. Med. 38(25), 5048–5069 (2019)
MathSciNet Google Scholar
Ten Have, T.R., Joffe, M.M.: A review of causal estimation of effects in mediation analyses. Stat. Methods Med. Res. 21(1), 77–107 (2012)
MathSciNet Google Scholar
Wang, W., Albert, J.M.: Causal mediation analysis for the Cox proportional hazards model with a smooth baseline hazard estimator. J. R. Stat. Soc. Ser. C Appl. Stat. 66(4), 741 (2017)
MathSciNet Google Scholar
Xue, F., Tang, X., Kim, G., Koenen, K.C., Martin, C.L., Galea, S., Wildman, D., Uddin, M., Qu, A.: Heterogeneous mediation analysis on epigenomic PTSD and traumatic stress in a predominantly African American Cohort. J. Am. Stat. Assoc. 117(540), 1669–1683 (2022)
MathSciNet Google Scholar
Yin, G., Ibrahim, J.G.: A general class of Bayesian survival models with zero and nonzero cure fractions. Biometrics 61(2), 403–412 (2005)
MathSciNet Google Scholar
Zhou, X., Song, X.: Mediation analysis for mixture Cox proportional hazards cure models. Stat. Methods Med. Res. 30(6), 1554–1572 (2021)
MathSciNet Google Scholar

Download references

Acknowledgements

This research was fully supported by GRF Grants (14303622, 14302220) from Research Grant Council of the Hong Kong Special Administration Region.

Author information

Authors and Affiliations

Department of Statistics, Chinese University of Hong Kong, Hong Kong, China
Rongqian Sun & Xinyuan Song

Authors

Rongqian Sun
View author publications
You can also search for this author in PubMed Google Scholar
Xinyuan Song
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

R.S. and X.S. wrote the manuscript. R.S. and X.S. reviewed the manuscript.

Corresponding author

Correspondence to Xinyuan Song.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendices

Appendix A: Derivation of Eqs. (3) and (4)

Under the identification assumptions stated in Sect. 2.3, the targeted function of potential survival time for subject i, $\omega \{T_i(a, M_i(a'))\}$, is identified as

$$\begin{aligned}{} & {} \omega \{T_i(a,M_i(a'))\} {=} \int \omega \{T_i(a, m) | M_i(a') {=} m, \textbf{x}_i = \varvec{x}\}dF_{M_i(a')|\textbf{x}_i=\varvec{x}} \nonumber \\{} & {} \quad {\left[ \text {By Assumption } 4\right] }\nonumber \\{} & {} \quad = \int \omega \{T_i(a, m) |\textbf{x}_i = \varvec{x}\}dF_{M_i(a')|\textbf{x}_i=\varvec{x}}\nonumber \\{} & {} \quad {\left[ \text {By Assumption } 1\right] }\nonumber \\{} & {} \quad = \int \omega \{T_i(a, m) |A_i = a, \textbf{x}_i = \varvec{x}\}dF_{M_i(a')|\textbf{x}_i=\varvec{x}} \nonumber \\{} & {} \quad {\left[ \text {By Assumption } 2\right] }\nonumber \\{} & {} \quad = \int \omega \{T_i(a, m) |A_i = a, M_i = m, \textbf{x}_i = \varvec{x}\}dF_{M_i(a')|\textbf{x}_i=\varvec{x}} \nonumber \\{} & {} \quad {\left[ \text {By Assumption } 3\right] }\nonumber \\{} & {} \quad = \int \omega \{T_i(a, m) |A_i = a, M_i = m, \textbf{x}_i = \varvec{x}\}dF_{M_i(a')|A_i = a', \textbf{x}_i=\varvec{x}} \nonumber \\{} & {} \quad {\left[ \text {By Consistency}\right] }\nonumber \\{} & {} \quad = \int \omega \{T_i |A_i = a, M_i = m, \textbf{x}_i = \varvec{x}\}dF_{M_i | A_i = a', \textbf{x}_i=\varvec{x}}. \nonumber \\ \end{aligned}$$

(A.1)

The last equation follows from the basic consistency assumption of the counterfactual framework, which states that an individual’s potential outcome under their observed exposure history is exactly their observed outcome (Imai et al. 2010; Ten Have and Joffe 2012), i.e., $M_i(A_i) = M_i$ and $T_i(A_i, M_i(A_i)) = T_i$. We used the Monte Carlo method to integrate over $F_{M_i|A_i, \textbf{x}_i}$, and also adopted the comonotonic sampling strategy (Linero and Zhang 2022) to enhance the efficiency of the numerical implementation of the mediation formula. Major steps are outlined in Algorithm A1.

By using inverse transform sampling with shared $u_i$, $M_i^{(itr)}(1)$ and $M_i^{(itr)}(0)$ are sampled as comonotonic pairs. As explained by Linero and Zhang (2022), this approach is effective in reducing Monte Carlo errors by making the $M_i^{(itr)}(a)$ pairs as correlated as possible, thereby allowing for accurate estimation of the conditional PSEs with only one sample per iteration.

Table 10 Average bias, relative bias, RMSE, coverage rate and length of the 95% credible interval for the average PSEs on the scale of log hazard, survival probability, and RMST at the maximal observed time under setup (i) with $n = 1000$ and a true baseline hazard of $\lambda _0(t)=1/2t^{1/2}$

Full size table

Table 11 Average MSE of mediation effects across individuals and average (in-sample) PRMSE of the mediator and the time-to-event outcome on three scales under setup(i) with $\lambda _0(t)=1/2t^{1/2}$

Full size table

The group-average or sample-average PSEs are identified through an additional integration over the covariate distribution $F_{\textbf{x}}$. We followed the common practice in the mediation literature to approximate the integration using empirical distribution of the covariates based on the collected sample (Albert and Nelson 2011; Bonetti and Gelber 2004), under which circumstance Eq. 4 simplifies to

$$\begin{aligned}{} & {} {} {\bar{\omega }}\{T(a,M(a'))\} \approx \frac{1}{n}\sum _{i=1}^n \omega \{T_i(a,M_i(a'))\} \nonumber \\{}{} & {} {} = \frac{1}{n}\sum _{i=1}^n \int \omega \{T_i |A_i = a, M_i = m, {\textbf {x}}_i = \varvec{x}\} dF_{M_i | A_i = a', {\textbf {x}}_i=\varvec{x}}.\nonumber \\ \end{aligned}$$

(A.2)

Appendix B: Full conditional distributions of the tree ensembles

Denote ${\textbf{Y}} = (Y_1, Y_2, \dots , Y_n)^T$, ${\textbf{M}} = (M_1, M_2, \dots , M_n)^T$, ${\textbf{A}} = (A_1, A_2, \dots , A_n)^T$, ${\textbf{X}} = (\textbf{x}_1^T, \textbf{x}_2^T, \dots , \textbf{x}_n^T)^T$, and $\varvec{\delta } = (\delta _1, \dots , \delta _n)^T$. Let ${\mathcal {D}} = \{{\textbf{Y}},{\textbf{X}},{\textbf{A}},\varvec{\delta }, {\textbf{M}}\}$ denote the observed data. The complete data likelihood of the proposed model is expressed as

$$\begin{aligned} p({\mathcal {D}}|\cdot )= & {} \prod _{i=1}^{n}\exp \Bigg [-\Lambda _0(Y_i)\exp \{\upsilon (\textbf{x}_i, M_i)+ \tau (\textbf{x}_i)A_i\}\big ]\nonumber \\{} & {} \quad \big [\lambda _0(Y_i)\exp \{\upsilon (\textbf{x}_i,M_i)+ \tau (\textbf{x}_i)A_i\}\Bigg ]^{\delta _i}\nonumber \\{} & {} \quad \times \frac{1}{\sqrt{2\pi }\sigma }\exp \{-\frac{(M_i - \upsilon _M(\textbf{x}_i) - \tau _M(\textbf{x}_i)A_i )^2}{2\sigma ^2}\},\nonumber \\ \end{aligned}$$

(A.3)

where $\Lambda _0(t) = \int _0^t \lambda _0(u)du$ is the cumulative baseline hazard function. After assuming prior independence among the unknown parameters $\{\mathcal {T}_j, \mathcal {M}_j\}_{j=1}^J$, $\{\widetilde{\mathcal {T}}_h, \widetilde{\mathcal {M}}_h^{(M)}, \widetilde{\mathcal {M}}_h\}_{h=1}^H$, $\{\breve{\mathcal {T}}_k, \breve{\mathcal {M}}_j\}_{k=1}^K$, $\varvec{\lambda }$, and $\sigma $, and prior independence among the individual binary trees inside each tree ensembles as well, the full conditional distribution of the parameters are derived as

follows.

(I) Full conditional distribution of $\{\mathcal {T}_j, \mathcal {M}_j\}_{j=1}^J$

Let $R_{ij} = M_i - \sum _{l \ne j}g({\textbf {x}}_i;\mathcal {T}_l, \mathcal {M}_l) - \tau _M({\textbf {x}}_i)A_i$ and $\varvec{R}_j = (R_{1j}, \dots , R_{nj})^T$ denote the partial residuals in the mediator regression equation w.r.t the jth binary tree $\{\mathcal {T}_j,\mathcal {M}_j\}$. By prior independence and the i.i.d. assumption on the sample data,

$$\begin{aligned}{} & {} {} p\left( \{\mathcal {T}_j,\mathcal {M}_j\}_{j = 1}^{J}|{\mathcal {D}}, \sigma , \{\widetilde{\mathcal {T}}_h,\widetilde{\mathcal {M}}_h^{(M)}\}_{h=1}^H \right) \\{}{} & {} {} =\prod _{j=1}^{J}p(\mathcal {M}_j|\varvec{R}_j, \mathcal {T}_j, \sigma )p(\mathcal {T}_j|\varvec{R}_j,\sigma ), \end{aligned}$$

where

$$\begin{aligned}{} & {} {} p( \mathcal {M}_j|{\mathcal {D}},\mathcal {T}_j,\sigma ) \propto p(\varvec{R}_j|\mathcal {M}_j,\mathcal {T}_j, \sigma )p(\mathcal {M}_j|\mathcal {T}_j)\nonumber \\{}{} & {} {} \propto \prod _{l=1}^{b_j}\prod _{i:\mathbb {1}_{il}=1}\exp \Big \{-\frac{1}{2\sigma ^2}(R_{ij} -\mu _{jl})^2\Big \}\exp \Big \{-\frac{1}{2\sigma _\mu ^2}\mu _{jl}^2\Big \}\nonumber \\{}{} & {} {} \propto \prod _{l=1}^{b_j}\exp \Big \{-\frac{1}{2\sigma _\mu ^2}\mu _{jl}^2-\frac{1}{2\sigma ^2}\sum _{i:\mathbb {1}_{il} = 1}(R_{ij} - \mu _{jl})^2\Big \}\nonumber \\{}{} & {} {} \overset{d}{=}\ \prod _{l=1}^{b_j}N(a_l,\sigma ^2_l), \end{aligned}$$

(A.4)

with $\sigma _l^2 = \frac{\sigma ^2\sigma _\mu ^2}{\sigma ^2 + \sigma _\mu ^2\sum _i\mathbb {1}_{il}}$, $a_l = \frac{\sigma _\mu ^2\sum _{i:\mathbb {1}_{il} = 1}R_{ij}}{\sigma ^2 + \sigma _\mu ^2\sum _i\mathbb {1}_{il}}$, $\sigma ^2_\mu = \frac{c^2_1}{4Jk_1^2}$, $\mathbb {1}_{il}$ denoting that subject i is allocated to the lth terminal node of $\mathcal {T}_j$ with mean parameter $\mu _{jl}$, and ‘$\overset{d}{=}$’ representing equivalence in distribution. The conjugacy makes it feasible to draw $\mathcal {T}_j$ from its marginal posterior distribution where $\mathcal {M}_j$ is integrated out to avoid the need for reversible jumps. That is,

$$\begin{aligned}{} & {} {} p(\mathcal {T}_j| \varvec{R}_j,\sigma ) \propto p(\mathcal {T}_j)\left( \int p(\varvec{R}_j|\mathcal {M}_j, \mathcal {T}_j, \sigma )p(\mathcal {M}_j|\mathcal {T}_j)d\mathcal {M}_j\right) \nonumber \\{}{} & {} {} \propto p(\mathcal {T}_j)\prod _{l=1}^{b_j} \int \prod _{i:\mathbb {1}_{il}=1}\frac{1}{\sqrt{2\pi \sigma ^2}}\exp \left\{ -\frac{1}{2\sigma ^2}(R_{ij} - \mu _{jl})^2\right\} \nonumber \\{}{} & {} {} \quad \times \frac{1}{\sqrt{2\pi \sigma ^2_\mu }}\exp \left\{ -\frac{\mu _{jl}^2}{2\sigma ^2_\mu }\right\} d\mu _{jl}\nonumber \\{}{} & {} {} \propto p(\mathcal {T}_j)\prod _{l=1}^{b_j} \left( \frac{1}{\sqrt{2\pi \sigma ^2}}\right) ^{n}\exp \left\{ -\frac{\sum _{i=1}^n(R_{ij})^2}{2\sigma ^2}\right\} \nonumber \\{}{} & {} {} \quad \times \sqrt{\frac{\sigma _l^2}{\sigma ^2_\mu }}\exp \left\{ \frac{\left( \sum _{i:\mathbb {1}_{il} = 1}R_{ij}\right) ^2\sigma _l^2}{2\sigma ^4}\right\} . \end{aligned}$$

(A.5)

To acquire posterior samples of $\mathcal {T}_j$, we employed the Metropolis-Hastings algorithm (Metropolis et al. 1953; Hastings 1970), with a classical proposal distribution suggested by (Chipman et al. 2010) that includes four possible moves with probabilities as follows: splitting at a leaf node (0.25), pruning a pair of leaf nodes (0.25), changing the splitting rule of an internal node (0.4), and swapping the splitting rule between an internal node and its child (0.1). Given the current iteration $\mathcal {T}_j^{(itr)}$, a candidate tree $\mathcal {T}_j^*$ is proposed according to the proposal distribution $q(\cdot |\cdot )$, and is accepted as $\mathcal {T}_j^{(itr+1)}$ with a probability of $\min \left\{ 1, \frac{q\left( \mathcal {T}_j^{(itr)}\big |\mathcal {T}_j^*\right) p\left( \mathcal {T}_j^*\big |\varvec{R}^{(itr)}_j, \sigma ^{(itr)}\right) }{q\left( \mathcal {T}_j^*\big |\mathcal {T}_j^{(itr)}\right) p\left( \mathcal {T}_j^{(itr)}\big |\varvec{R}^{(itr)}_j, \sigma ^{(itr)}\right) }\right\} $.

(II) Full conditional distribution of $\{\widetilde{\mathcal {T}}_h, \widetilde{\mathcal {M}}_h, \widetilde{\mathcal {M}}_h^{(M)}\}_{h=1}^H$

Let $\widetilde{R}_{ih}^{(M)} = M_i - \upsilon _M({\textbf {x}}_i) - \sum _{l \ne h}g({\textbf {x}}_i;\widetilde{\mathcal {T}}_l, \widetilde{\mathcal {M}}_l^{(M)})A_i$ and $\varvec{\widetilde{R}}_h^{(M)} = (\widetilde{R}_{1\,h}^{(M)}, \dots , \widetilde{R}_{nh}^{(M)})^T$ denote the partial residuals in the mediator regression equation w.r.t the hth binary tree $\{\widetilde{\mathcal {T}}_h,\widetilde{\mathcal {M}}_h^{(M)}\}$. Let $\widetilde{R}_{ih} = \upsilon ({\textbf {x}}_i, M_i) + \sum _{l \ne h}g({\textbf {x}}_i;\widetilde{\mathcal {T}}_l, \widetilde{\mathcal {M}}_l)A_i$ and $\varvec{\widetilde{R}}_h = (\widetilde{R}_{1h}, \dots , \widetilde{R}_{nh})^T$ denote the partial residuals in the PH model w.r.t $\{\widetilde{\mathcal {T}}_h,\widetilde{\mathcal {M}}_h\}$. The full conditionals of $\{\widetilde{\mathcal {T}}_h, \widetilde{\mathcal {M}}_h, \widetilde{\mathcal {M}}_h^{(M)}\}_{h=1}^H$ are derived similarly, with the only difference being that the leaf node parameters are bivariate due to the shared tree topologies. Assuming independence among the leaf node parameters, we have

$$\begin{aligned}{} & {} {} p\left( \{\widetilde{\mathcal {T}}_h,\widetilde{\mathcal {M}}_h^{(M)}, \widetilde{\mathcal {M}}_h\}_{h = 1}^{H}|{\mathcal {D}}, \sigma , \{(\mathcal {T}_j,\mathcal {M}_j)\}_{j=1}^J, \{\breve{\mathcal {T}}_k,\breve{\mathcal {M}}_k\}_{k=1}^K, \varvec{\lambda } \right) \\{}{} & {} {} =\,\prod _{h=1}^{H}p\left( \widetilde{\mathcal {M}}_h^{(M)},\widetilde{\mathcal {M}}_h|\varvec{\widetilde{R}}_h^{(M)}, \varvec{\widetilde{R}}_h,\widetilde{\mathcal {T}}_h, \varvec{\lambda },\sigma \right) \\{}{} & {} {} p\left( \widetilde{\mathcal {T}}_h|\varvec{\widetilde{R}}_h^{(M)}, \varvec{\widetilde{R}}_h,\varvec{\lambda },\sigma \right) =\,\prod _{h=1}^{H}p\left( \widetilde{\mathcal {M}}_h| \varvec{\widetilde{R}}_h,\widetilde{\mathcal {T}}_h, \varvec{\lambda }\right) p\left( \widetilde{\mathcal {M}}_h^{(M)}|\varvec{\widetilde{R}}_h^{(M)}, \widetilde{\mathcal {T}}_h, \sigma \right) \\{}{} & {} {} \quad \times p\left( \widetilde{\mathcal {T}}_h|\varvec{\widetilde{R}}_h^{(M)}, \varvec{\widetilde{R}}_h,\varvec{\lambda },\sigma \right) \end{aligned}$$

where

$$\begin{aligned}{} & {} {} p\left( \widetilde{\mathcal {M}}_h| \varvec{\widetilde{R}}_h,\widetilde{\mathcal {T}}_h, \varvec{\lambda }\right) \propto p\left( \varvec{\widetilde{R}}_h|\widetilde{\mathcal {M}}_h,\widetilde{\mathcal {T}}_h,\varvec{\lambda }\right) p\left( \widetilde{\mathcal {M}}_h|\widetilde{\mathcal {T}}_h\right) \nonumber \\{}{} & {} {} \propto \prod _{l=1}^{b_h}\prod _{i:\mathbb {1}_{il}=1}\exp \left[ \delta _i\left\{ \widetilde{R}_{ih} + \widetilde{\mu }_{hl}A_i\right\} - \exp \left\{ \widetilde{R}_{ih} + \widetilde{\mu }_{hl}A_i\right\} \right. \nonumber \\{}{} & {} {} \quad \left. \times \sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\right] \exp \left[ \widetilde{\mu }_{hl}{\widetilde{\zeta }} - \exp (\widetilde{\mu }_{hl}){\widetilde{\eta }}\right] \nonumber \\{}{} & {} {} \propto \prod _{l=1}^{b_h} \exp \left[ \widetilde{\mu }_{hl}\left( \sum _{i:\mathbb {1}_{il}= 1}\delta _iA_i + {\widetilde{\zeta }}\right) \right. \nonumber \\{}{} & {} {} \quad \left. - \exp (\widetilde{\mu }_{hl})\left\{ {\widetilde{\eta }} +\sum _{i:\mathbb {1}_{il}= 1,A_i=1}\exp (\widetilde{R}_{ih})\sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\right\} \right] \nonumber \\{}{} & {} {} \overset{d}{=}\prod _{l=1}^{b_h}logGam({\widetilde{\zeta }}^*, {\widetilde{\eta }}^*), \end{aligned}$$

(A.6)

with ${\widetilde{\zeta }}^* = \sum _{i:\mathbb {1}_{il}= 1}\delta _iA_i + {\widetilde{\zeta }}$ and ${\widetilde{\eta }}^* = {\widetilde{\eta }} +\sum _{i:\mathbb {1}_{il}= 1,A_i=1}$$\exp (\widetilde{R}_{ih})\sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s$; and

$$\begin{aligned}{} & {} {} p\left( \widetilde{\mathcal {M}}_h^{(M)}|\varvec{\widetilde{R}}_h^{(M)}, \widetilde{\mathcal {T}}_h, \sigma \right) \propto p\left( \varvec{\widetilde{R}}_h^{(M)}|\widetilde{\mathcal {M}}_h^{(M)},\widetilde{\mathcal {T}}_h, \sigma \right) p\left( \widetilde{\mathcal {M}}_h^{(M)}|\widetilde{\mathcal {T}}_h\right) \nonumber \\{}{} & {} {} \propto \prod _{l=1}^{b_h}\prod _{i:\mathbb {1}_{il}=1}\exp \left\{ -\frac{1}{2\sigma ^2}\left( \widetilde{R}_{ih}^{(M)} -\widetilde{\mu }_{hl}^{(M)}\right) ^2\right\} \exp \left\{ -\frac{1}{2\sigma _{\widetilde{\mu }}^2}\left( \widetilde{\mu }_{hl}^{(M)}\right) ^2\right\} \nonumber \\{}{} & {} {} \overset{d}{=}\ \prod _{l=1}^{b_h}N({\widetilde{a}}_l,{\widetilde{\sigma }}^2_l), \end{aligned}$$

(A.7)

with ${\widetilde{\sigma }}_l^2 = \frac{\sigma ^2\sigma _{\widetilde{\mu }}^2}{\sigma ^2 + \sigma _{\widetilde{\mu }}^2\sum _i\mathbb {1}_{il}}$, ${\widetilde{a}}_l = \frac{\sigma _{\widetilde{\mu }}^2\sum _{i:\mathbb {1}_{il} = 1}\widetilde{R}_{ih}^{(M)}}{\sigma ^2 + \sigma _{\widetilde{\mu }}^2\sum _i\mathbb {1}_{il}}$, and $\sigma ^2_{\widetilde{\mu }} = \frac{c^2_2}{4Hk_2^2}$.

Using the conjugate normal and log-gamma priors once again leads to closed-form marginal posteriors for $\widetilde{\mathcal {T}}_h$, which is integrated over $\{\widetilde{\mathcal {M}}_h,\widetilde{\mathcal {M}}_h^{(M)}\}$ and expressed as

$$\begin{aligned}{} & {} p\left( \widetilde{\mathcal {T}}_h|\varvec{\widetilde{R}}_h, \varvec{\widetilde{R}}_h^{(M)},\varvec{\lambda },\sigma \right) \propto \nonumber \\{} & {} \quad p(\widetilde{\mathcal {T}}_h)\int p\left( \varvec{\widetilde{R}}_h|\widetilde{\mathcal {M}}_h, \widetilde{\mathcal {T}}_h, \varvec{\lambda }\right) p\left( \widetilde{\mathcal {M}}_h|\widetilde{\mathcal {T}}_h\right) d\widetilde{\mathcal {M}}_h\nonumber \\{} & {} \quad \int p\left( \varvec{\widetilde{R}}_h^{(M)}|\widetilde{\mathcal {M}}_h^{(M)}, \widetilde{\mathcal {T}}_h, \sigma \right) p\left( \widetilde{\mathcal {M}}_h^{(M)}|\widetilde{\mathcal {T}}_h\right) d\widetilde{\mathcal {M}}_h^{(M)}\nonumber \\{} & {} \quad \propto p(\widetilde{\mathcal {T}}_h)\prod _{l=1}^{b_h} \int \exp \Bigg [\sum _{i:\mathbb {1}_{il}= 1}\delta _i\widetilde{R}_{ih} + \widetilde{\mu }_{hl}\sum _{i:\mathbb {1}_{il}= 1}\delta _iA_i\nonumber \\{} & {} \quad - \exp (\widetilde{\mu }_{hl})\sum _{i:\mathbb {1}_{il}=A_i= 1}\exp (\widetilde{R}_{ih})\sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\nonumber \\{} & {} \quad - \sum _{i:\mathbb {1}_{il}= 1,A_i=0}\exp (\widetilde{R}_{ih})\sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\Bigg ]\nonumber \\{} & {} \quad \times \frac{{\widetilde{\eta }}^{{\widetilde{\zeta }}}}{\Gamma ({\widetilde{\zeta }})}\exp \left( \widetilde{\mu }_{hl} {\widetilde{\zeta }} - \exp (\widetilde{\mu }_{hl}){\widetilde{\eta }}\right) d\widetilde{\mu }_{hl}\nonumber \\{} & {} \quad \times \int \prod _{i:\mathbb {1}_{il}=1}\frac{1}{\sqrt{2\pi \sigma ^2}}\exp \left\{ -\frac{1}{2\sigma ^2}\left( \widetilde{R}_{ih}^{(M)} - \widetilde{\mu }_{hl}^{(M)}\right) ^2\right\} \nonumber \\{} & {} \quad \frac{1}{\sqrt{2\pi \sigma ^2_{\widetilde{\mu }}}}\exp \left\{ -\frac{\left( \widetilde{\mu }_{hl}^{(M)}\right) ^2}{2\sigma ^2_{\widetilde{\mu }}}\right\} d\widetilde{\mu }_{hl}^{(M)}\nonumber \\{} & {} \quad \propto p(\widetilde{\mathcal {T}}_h)\prod _{l=1}^{b_h}\exp \left( \sum _{i:\mathbb {1}_{il}= 1}\delta _i\widetilde{R}_{ih}\right. \nonumber \\{} & {} \left. \quad - \sum _{i:\mathbb {1}_{il}= 1,A_i=0}\exp (\widetilde{R}_{ih})\sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\right) \frac{{\widetilde{\eta }}^{{\widetilde{\zeta }}}\Gamma ({\widetilde{\zeta }}^*)}{\Gamma ({\widetilde{\zeta }}) ({\widetilde{\eta }}^*)^{{\widetilde{\zeta }}^*}}\nonumber \\{} & {} \quad \times \left( \frac{1}{\sqrt{2\pi \sigma ^2}}\right) ^{n}\exp \left\{ -\frac{\sum _{i=1}^n(\widetilde{R}_{ih}^{(M)})^2}{2\sigma ^2}\right\} \nonumber \\{} & {} \quad \sqrt{\frac{{\widetilde{\sigma }}_l^2}{\sigma ^2_{\widetilde{\mu }}}}\exp \left\{ \frac{\left( \sum _{i:\mathbb {1}_{il} = 1}\widetilde{R}_{ih}^{(M)}\right) ^2{\widetilde{\sigma }}_l^2}{2\sigma ^4}\right\} . \end{aligned}$$

(A.8)

Posterior draws of $\widetilde{\mathcal {T}}_h$ can thus be obtained using MH algorithm with the same proposal distribution described above.

(III) Full conditional distribution of $\{\breve{\mathcal {T}}_k, \breve{\mathcal {M}}_k\}_{k=1}^K$

Let $\breve{R}_{ik} = \sum _{l \ne k}g({\textbf {x}}_i, M_i;\breve{\mathcal {T}}_l, \breve{\mathcal {M}}_l) + \tau ({\textbf {x}}_i)A_i$ and $\varvec{\breve{R}}_k = (\breve{R}_{1k}, \dots , \breve{R}_{nk})^T$ denote the partial residuals in the PH model w.r.t $\{\breve{\mathcal {T}}_k,\breve{\mathcal {M}}_k\}$. The full conditionals of $\{\breve{\mathcal {T}}_k, \breve{\mathcal {M}}_k\}_{k=1}^K$ are derived as

$$\begin{aligned}{} & {} {} p\left( \{\breve{\mathcal {T}}_k,\breve{\mathcal {M}}_k\}_{k = 1}^{K}|{\mathcal {D}}, \varvec{\lambda }, \{\widetilde{\mathcal {T}}_h,\widetilde{\mathcal {M}}_h\}_{h=1}^H \right) \\{}{} & {} {} = \prod _{k=1}^{K}p\left( \breve{\mathcal {M}}_k|\varvec{\breve{R}}_k, \breve{\mathcal {T}}_k, \varvec{\lambda }\right) p\left( \breve{\mathcal {T}}_k|\varvec{\breve{R}}_k,\varvec{\lambda }\right) , \end{aligned}$$

where

$$\begin{aligned}{} & {} {} p\left( \breve{\mathcal {M}}_k|\varvec{\breve{R}}_k, \breve{\mathcal {T}}_k, \varvec{\lambda }\right) \propto p\left( \varvec{\breve{R}}_k|\breve{\mathcal {M}}_k, \breve{\mathcal {T}}_k, \varvec{\lambda }\right) p\left( \breve{\mathcal {M}}_k|\breve{\mathcal {T}}_k\right) \nonumber \\{}{} & {} {} \propto \prod _{l=1}^{b_k}\prod _{i:\mathbb {1}_{il}=1}\exp \left[ \delta _i\left\{ \breve{R}_{ik} + \breve{\mu }_{kl}\right\} \right. \nonumber \\{}{} & {} {} \quad \left. - \exp \left\{ \breve{R}_{ik} + \breve{\mu }_{kl}\right\} \sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\right] \exp \left[ \breve{\mu }_{kl}\breve{\zeta } - \exp (\breve{\mu }_{kl})\breve{\eta }\right] \nonumber \\{}{} & {} {} \propto \prod _{l=1}^{b_k} \exp \left[ \breve{\mu }_{kl}\left( \sum _{i:\mathbb {1}_{il}= 1}\delta _i + \breve{\zeta }\right) \right. \nonumber \\{}{} & {} {} \quad \left. - \exp (\breve{\mu }_{kl})\left\{ \breve{\eta } +\sum _{i:\mathbb {1}_{il}= 1}\exp (\breve{R}_{ik})\sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\right\} \right] \nonumber \\{}{} & {} {} \overset{d}{=}\prod _{l=1}^{b_k}\text{ log-gamma }(\breve{\zeta }^*, \breve{\eta }^*), \end{aligned}$$

(A.9)

with $\breve{\zeta }^* = \sum _{i:\mathbb {1}_{il}= 1}\delta _i + \breve{\zeta }$ and $\breve{\eta }^* = \breve{\eta } +\sum _{i:\mathbb {1}_{il}= 1}\exp (\breve{R}_{ik}) \sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s$. The marginal posterior distribution of $\breve{\mathcal {T}}_k$ is then derived as

$$\begin{aligned}{} & {} {} p(\breve{\mathcal {T}}_k|\varvec{\breve{R}}_k, \varvec{\lambda }) \propto p(\breve{\mathcal {T}}_k)\int p\left( \varvec{\breve{R}}_k|\breve{\mathcal {M}}_k, \breve{\mathcal {T}}_k, \varvec{\lambda }\right) p\left( \breve{\mathcal {M}}_k|\breve{\mathcal {T}}_k\right) d\breve{\mathcal {M}}_k,\nonumber \\{}{} & {} {} \propto p(\breve{\mathcal {T}}_k)\prod _{l=1}^{b_k}\int \exp \left[ \sum _{i:\mathbb {1}_{il}= 1}\delta _i\breve{R}_{ik} + \breve{\mu }_{kl}\sum _{i:\mathbb {1}_{il}= 1}\delta _i \right. \nonumber \\{}{} & {} {} \quad \left. - \exp (\breve{\mu }_{kl})\sum _{i:\mathbb {1}_{il}= 1}\exp (\breve{R}_{ik})\sum _{s:i\in {\mathcal {R}}(s)}\lambda _s\delta _s\right] \nonumber \\{}{} & {} {} \quad \times \frac{\breve{\eta }^{\breve{\zeta }}}{\Gamma (\breve{\zeta })}\exp \left\{ \breve{\mu }_{kl}\breve{\zeta } - \exp (\breve{\mu }_{kl})\breve{\eta }\right\} d\breve{\mu }_{kl}\nonumber \\{}{} & {} {} \propto p(\breve{\mathcal {T}}_k)\prod _{l=1}^{b_k}\exp \left( \sum _{i:\mathbb {1}_{il}= 1}\delta _i\breve{R}_{ik}\right) \nonumber \\{}{} & {} {} \quad \int \frac{\breve{\eta }^{\breve{\zeta }}}{\Gamma (\breve{\zeta })}\exp \Bigg [ \breve{\mu }_{kl}\bigg (\sum _{i:\mathbb {1}_{il}= 1}\delta _i + \breve{\zeta }\bigg ) \nonumber \\{}{} & {} {} \quad - \exp (\breve{\mu }_{kl})\bigg \{\breve{\eta }+\sum _{i:\mathbb {1}_{il}= 1}\exp (\breve{R}_{ik})\sum _{s: i\in {\mathcal {R}}(s)}\lambda _s\delta _s\bigg \}\Bigg ]d\breve{\mu }_{kl}\nonumber \\{}{} & {} {} \propto p(\breve{\mathcal {T}}_k)\prod _{l=1}^{b_k}\exp \left( \sum _{i:\mathbb {1}_{il}= 1}\delta _i\breve{R}_{ik}\right) \frac{\breve{\eta }^{\breve{\zeta }}\Gamma (\breve{\zeta }^*)}{\Gamma (\breve{\zeta })(\breve{\eta }^*)^{\breve{\zeta }^*}}. \end{aligned}$$

(A.10)

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

Reprints and permissions

About this article

Cite this article

Sun, R., Song, X. Bayesian tree-based heterogeneous mediation analysis with a time-to-event outcome. Stat Comput 34, 24 (2024). https://doi.org/10.1007/s11222-023-10340-1

Download citation

Received: 30 May 2023
Accepted: 12 October 2023
Published: 01 November 2023
DOI: https://doi.org/10.1007/s11222-023-10340-1

Keywords

Access this article

Log in via an institution

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Bayesian tree-based heterogeneous mediation analysis with a time-to-event outcome

Abstract

Access this article

Similar content being viewed by others

Does Cox analysis of a randomized survival study yield a causal treatment effect?

The built-in selection bias of hazard ratios formalized using structural causal models

Defining causal mediation with a longitudinal mediator and a survival outcome

References

Acknowledgements