A quadratic upper bound algorithm for regression analysis of credit risk under the proportional hazards model with case-cohort data

  • Original Paper
Statistics and Computing

Abstract

A case-cohort design is a cost-effective biased-sampling scheme for studies of survival data. We study the regression analysis of credit risk by fitting the proportional hazards model to data collected via the case-cohort design. Using the minorization-maximization principle, we develop a new quadratic upper-bound algorithm for computing the estimators and establish its convergence. The proposed algorithm inverts the derived upper-bound matrix only once in the whole process, and the upper-bound matrix does not depend on the parameters. These features give the algorithm a simple update and a low per-iteration cost, especially for high-dimensional problems. Rcpp is an R package that lets users write R extensions in C++. We implement the proposed algorithm with Rcpp, which makes the R program run more efficiently and enables fast computation. We conduct simulation studies to illustrate the performance of the proposed algorithm, and we analyze a real mortgage dataset to evaluate credit risk.

References

  • Baesens, B., Roesch, D., Scheule, H.: Credit risk analytics: measurement techniques, applications, and examples in SAS. Wiley, New York (2016)

  • Banasik, J., Crook, J.N., Thomas, L.C.: Not if but when will borrowers default. J. Op. Res. Soc. 50(12), 1185–1190 (1999)

  • Becker, M.P., Yang, I., Lange, K.: EM algorithms without missing data. Stat. Methods Med. Res. 6(1), 38–54 (1997)

  • Bellotti, T., Crook, J.: Credit scoring with macroeconomic variables using survival analysis. J. Op. Res. Soc. 60(12), 1699–1707 (2009)

  • Bellotti, T., Crook, J.: Retail credit stress testing using a discrete hazard model with macroeconomic factors. J. Op. Res. Soc. 65(3), 340–350 (2014)

  • Breslow, N.E., Wellner, J.A.: Weighted likelihood for semiparametric models and two-phase stratified samples with application to Cox regression. Scand. J. Stat. 34(1), 86–102 (2007)

  • Böhning, D., Lindsay, B.: Monotonicity of quadratic-approximation algorithms. Ann. Inst. Stat. Math. 40(02), 641–663 (1988)

  • Cai, J., Zeng, D.: Sample size/power calculation for case-cohort studies. Biometrics 60(4), 1015–1024 (2004)

  • Cai, J., Zeng, D.: Power calculation for case-cohort studies with nonrare events. Biometrics 63(4), 1288–1295 (2007)

  • Chen, K., Lo, S.-H.: Case-cohort and case-control analysis with Cox’s model. Biometrika 86(4), 755–764 (1999)

  • Cox, D.R.: Regression models and life-tables. J. Roy. Stat. Soc. Ser. B (Methodol.) 34(2), 187–202 (1972)

  • De Pierro, A.R.: A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography. IEEE Trans. Med. Imaging 14(1), 132–137 (1995)

  • Ding, J., Zhou, H., Liu, Y., Cai, J., Longnecker, M.P.: Estimating effect of environmental contaminants on women’s subfecundity for the MoBa study data with an outcome-dependent sampling scheme. Biostatistics 15(4), 636–650 (2014)

  • Ding, J., Tian, G.-L., Yuen, K.C.: A new MM algorithm for constrained estimation in the proportional hazards model. Comput. Stat. Data Anal. 84, 135–151 (2015)

  • Dirick, L., Claeskens, G., Baesens, B.: An Akaike information criterion for multiple event mixture cure models. Eur. J. Oper. Res. 241(2), 449–457 (2015)

  • Dirick, L., Claeskens, G., Baesens, B.: Time to default in credit scoring using survival analysis: a benchmark study. J. Op. Res. Soc. 68(6), 652–665 (2017)

  • Eddelbuettel, D., Francois, R.: Rcpp: seamless R and C++ integration. J. Stat. Softw. 40(8), 1–18 (2011)

  • Eddelbuettel, D., Sanderson, C.: RcppArmadillo: accelerating R with high-performance C++ linear algebra. Comput. Stat. Data Anal. 71, 1054–1063 (2014)

  • Efron, B., Tibshirani, R.J.: An introduction to the bootstrap. Chapman and Hall/CRC (1994)

  • Fan, J., Li, R.: Variable selection via nonconcave penalized likelihood and its oracle properties. J. Am. Stat. Assoc. 96(456), 1348–1360 (2001)

  • Fan, J., Li, R.: Variable selection for Cox’s proportional hazards model and frailty model. Ann. Stat. 30(1), 74–99 (2002)

  • Fan, J., Lv, J.: Sure independence screening for ultrahigh dimensional feature space. J. R. Stat. Soc. Ser. B (Statistical Methodology) 70(5), 849–911 (2008)

  • Huang, J., Horowitz, J.L., Wei, F.: Variable selection in nonparametric additive models. Ann. Stat. 38(4), 2282–2313 (2010)

  • Hunter, D.R., Lange, K.: Computing estimates in the proportional odds model. Ann. Inst. Stat. Math. 54(1), 155–168 (2002)

  • Hunter, D.R., Lange, K.: A tutorial on MM algorithms. Am. Stat. 58(1), 30–37 (2004)

  • Im, J.-K., Apley, D.W., Qi, C., Shan, X.: A time-dependent proportional hazards survival model for credit risk analysis. J. Op. Res. Soc. 63(3), 306–321 (2012)

  • Kang, S., Cai, J., Chambless, L.: Marginal additive hazards model for case-cohort studies with multiple disease outcomes: an application to the Atherosclerosis Risk in Communities (ARIC) study. Biostatistics 14(1), 28–41 (2013)

  • Kang, S., Lu, W., Liu, M.: Efficient estimation for accelerated failure time model under case-cohort and nested case-control sampling. Biometrics 73(1), 114–123 (2017)

  • Kong, L., Cai, J., Sen, P.K.: Weighted estimating equations for semiparametric transformation models with censored data from a case-cohort design. Biometrika 91(2), 305–319 (2004)

  • Kulich, M., Lin, D.Y.: Additive hazards regression for case-cohort studies. Biometrika 87(1), 73–87 (2000)

  • Kulich, M., Lin, D.Y.: Improving the efficiency of relative-risk estimation in case-cohort studies. J. Am. Stat. Assoc. 99(467), 832–844 (2004)

  • Lange, K.: Optimization, vol. 1. Springer, New York, NY (2004)

  • Lange, K.: Numerical analysis for statisticians. Springer, Berlin (2010)

  • Lange, K., Hunter, D.R., Yang, I.: Optimization transfer using surrogate objective functions. J. Comput. Graph. Stat. 9(1), 1–20 (2000)

  • Liu, D., Cai, T., Lok, A., Zheng, Y.: Nonparametric maximum likelihood estimators of time-dependent accuracy measures for survival outcome under two-stage sampling designs. J. Am. Stat. Assoc. 113(522), 882–892 (2018)

  • Lu, W., Tsiatis, A.A.: Semiparametric transformation models for the case-cohort study. Biometrika 93(1), 207–214 (2006)

  • Narain, B.: Survival analysis and the credit granting decision. In: Thomas, L.C., Crook, J.N., Edelman, D.B. (eds.) Credit scoring and credit control (1992)

  • Prentice, R.L.: A case-cohort design for epidemiologic cohort studies and disease prevention trials. Biometrika 73(1), 1–11 (1986)

  • Scheike, T.H., Martinussen, T.: Maximum likelihood estimation for Cox’s regression model under case–cohort sampling. Scand. J. Stat. 31(2), 283–293 (2004)

  • Self, S.G., Prentice, R.L.: Asymptotic distribution theory and efficiency results for case-cohort studies. Ann. Stat. 16(1), 64–81 (1988)

  • Steingrimsson, J.A., Strawderman, R.L.: Estimation in the semiparametric accelerated failure time model with missing covariates: improving efficiency through augmentation. J. Am. Stat. Assoc. 112(519), 1221–1235 (2017)

  • Stepanova, M., Thomas, L.: Survival analysis methods for personal loan data. Oper. Res. 50(2), 277–289 (2002)

  • Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B (Methodological) 58(1), 267–288 (1996)

  • Tibshirani, R.: The lasso method for variable selection in the Cox model. Stat. Med. 16(4), 385–395 (1997)

  • Tong, E.N.C., Mues, C., Thomas, L.C.: Mixture cure models in credit scoring: if and when borrowers default. Eur. J. Oper. Res. 218(1), 132–139 (2012)

  • Yu, J., Liu, Y., Sandler, D.P., Cai, J., Zhou, H.: Outcome-dependent sampling design and inference for Cox’s proportional hazards model. J. Stat. Plan. Inference 178, 24–36 (2016)

  • Zeng, D., Lin, D.Y.: Efficient estimation of semiparametric transformation models for two-phase cohort studies. J. Am. Stat. Assoc. 109(505), 371–383 (2014)

  • Zhang, C.-H., Huang, J.: The sparsity and bias of the lasso selection in high-dimensional linear regression. Ann. Stat. 36(4), 1567–1594 (2008)

  • Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101(476), 1418–1429 (2006)

Acknowledgements

This research is supported in part by the National Natural Science Foundation of China (11671310 to J.D.).

Author information

Corresponding author

Correspondence to Yanqin Feng.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Appendix A: proofs of the theorems

The asymptotic properties of \(\widehat{{\varvec{\beta }}}_P\) were established by Self and Prentice (1988) for statistical inference. Before presenting the result, we introduce some notation. Let \({\varvec{\beta }}_0\) denote the true value of \({\varvec{\beta }}\) and let \(\tau \) denote the time at which the study ends or is discontinued. Let \(C = \{i: \xi _i = 1 \text { or } \Delta _i = 1\}\) and \(\widetilde{C} = \{i:\xi _i = 1\}\). For \(d = 0, 1, 2,\) define

$$\begin{aligned} S^{(d)}({\varvec{\beta }}, t)&= \frac{1}{n} \sum _{i \in C} Y_{i}(t) \exp \left( Z_{i}^{\prime } {\varvec{\beta }}\right) Z_{i}^{\otimes d}, \\ \widetilde{S}^{(d)}({\varvec{\beta }}, t)&= \frac{1}{\widetilde{n}} \sum _{i \in \widetilde{C}} Y_{i}(t) \exp \left( Z_{i}^{\prime } {\varvec{\beta }}\right) Z_{i}^{\otimes d}, \\ Q^{(d)}({\varvec{\beta }}, t, w)&= \frac{1}{n} \sum _{i \in C} Y_{i}(t) Y_{i}(w) \exp \left( 2 Z_{i}^{\prime } {\varvec{\beta }}\right) Z_{i}^{\otimes d}, \\ \widetilde{Q}^{(d)}({\varvec{\beta }}, t, w)&= \frac{1}{\widetilde{n}} \sum _{i \in \widetilde{C}} Y_{i}(t) Y_{i}(w) \exp \left( 2 Z_{i}^{\prime } {\varvec{\beta }}\right) Z_{i}^{\otimes d}, \end{aligned}$$
(A.1)

where, for a vector \(a\), \(a^{\otimes 0} = 1\), \(a^{\otimes 1} = a\) and \(a^{\otimes 2} = aa^{\top }\). Define

$$\begin{aligned} E({\varvec{\beta }}, t)=\frac{S^{(1)}({\varvec{\beta }}, t)}{S^{(0)}({\varvec{\beta }}, t)}, \quad \widetilde{E}({\varvec{\beta }}, t)=\frac{\widetilde{S}^{(1)}({\varvec{\beta }}, t)}{\widetilde{S}^{(0)}({\varvec{\beta }}, t)}. \end{aligned}$$
(A.2)
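The quantities in (A.1) and (A.2) are weighted sums over the subjects still at risk, which the following minimal Python sketch makes concrete. It assumes the at-risk indicator is \(Y_i(t) = I(T_i \ge t)\), which this appendix does not define explicitly; the function names `s_moments` and `e_ratio` and the argument layout are hypothetical and are not taken from the authors' Rcpp code.

```python
import numpy as np

def s_moments(beta, t, times, Z, members, n):
    """Return (S0, S1, S2) at time t, following the pattern of (A.1).

    times   : (N,) observed times T_i
    Z       : (N, p) covariates, row i is Z_i
    members : (N,) boolean mask of the index set (C or C-tilde)
    n       : normalising constant (n or n-tilde)
    Assumption: Y_i(t) = 1 if T_i >= t, else 0.
    """
    at_risk = (times >= t) & members
    Zr = Z[at_risk]
    w = np.exp(Zr @ beta)                  # exp(Z_i' beta) for subjects at risk
    s0 = w.sum() / n                       # d = 0: scalar
    s1 = w @ Zr / n                        # d = 1: vector
    s2 = (Zr * w[:, None]).T @ Zr / n      # d = 2: sum of w_i Z_i Z_i'
    return s0, s1, s2

def e_ratio(beta, t, times, Z, members, n):
    """E(beta, t) = S1 / S0 as in (A.2)."""
    s0, s1, _ = s_moments(beta, t, times, Z, members, n)
    return s1 / s0

# Small illustration on made-up data.
rng = np.random.default_rng(0)
times = rng.exponential(size=6)
Z = rng.normal(size=(6, 2))
everyone = np.ones(6, dtype=bool)
print(e_ratio(np.zeros(2), 0.0, times, Z, everyone, n=6))
```

At \({\varvec{\beta }} = 0\) the ratio reduces to the average covariate vector of the subjects at risk, which is a convenient sanity check for any implementation.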

Conditions (C1)-(C7) ensure the asymptotic convergence of \(\widehat{{\varvec{\beta }}}_P\):

  (C1) \(\widetilde{n} / N \rightarrow \alpha \) for some \(\alpha \in (0, 1)\).

  (C2) \(\int _0^{\tau } \lambda _0(t)dt < \infty \).

  (C3) There exists \(\delta >0\) such that \(n^{-\frac{1}{2}} \sup _{i, t \in [0, \tau ]} Y_{i}(t)\left| Z_{i}\right| I\left\{ Z_{i}^{\prime } {\varvec{\beta }}_{0}>-\delta \left| Z_{i}\right| \right\} {\mathop {\rightarrow }\limits ^{P}} 0\).

  (C4) There exist \(s^{(d)}({\varvec{\beta }}, t)\) defined on \(\mathbb {B} \times [0, \tau ]\) satisfying:

    (1) \(\sup _{{\varvec{\beta }} \in \mathbb {B}, t \in [0, \tau ]}\left\| S^{(d)}({\varvec{\beta }}, t)-s^{(d)}({\varvec{\beta }}, t)\right\| {\mathop {\longrightarrow }\limits ^{P}} 0;\)

    (2) \(s^{(d)}({\varvec{\beta }}, t)\) are continuous functions of \({\varvec{\beta }}\) uniformly in \(t \in [0, \tau ]\); \(s^{(d)}({\varvec{\beta }}, t)\) are bounded on \(\mathbb {B} \times [0, \tau ]\); \(s^{(0)}({\varvec{\beta }}, t)\) is bounded away from 0; and \(s^{(1)}({\varvec{\beta }}, t)=\nabla _{{\varvec{\beta }}} s^{(0)}({\varvec{\beta }}, t), s^{(2)}({\varvec{\beta }}, t)=\nabla _{{\varvec{\beta }}}^{2} s^{(0)}({\varvec{\beta }}, t)\) for \({\varvec{\beta }} \in \mathbb {B}, t \in [0, \tau ]\);

    (3) The matrix

    $$\begin{aligned} \Sigma =\int _{0}^{\tau } v\left( {\varvec{\beta }}_{0}, t\right) s^{(0)}\left( {\varvec{\beta }}_{0}, t\right) \lambda _{0}(t) \textrm{d} t \end{aligned}$$
    (A.3)

    is positive definite, where \(v({\varvec{\beta }}, t)=s^{(2)}({\varvec{\beta }}, t) / s^{(0)}({\varvec{\beta }}, t) -\left( s^{(1)}({\varvec{\beta }}, t) / s^{(0)}({\varvec{\beta }}, t)\right) ^{\otimes 2}\).

  (C5) The sequence of distributions of \(n^{\frac{1}{2}}(\widetilde{E}({\varvec{\beta }}_{0}, t)-E ({\varvec{\beta }}_{0}, t))\) is tight on the space of càdlàg functions equipped with the product Skorohod topology.

  (C6) There exist \(q^{(d)}({\varvec{\beta }}, t, w)\) defined on \(\mathbb {B} \times [0, \tau ]^{2}\) satisfying:

    (1) \(\sup _{{\varvec{\beta }} \in \mathbb {B},(t, w) \in [0,\tau ]^{2}}\Vert Q^{(d)}({\varvec{\beta }}, t, w)-q^{(d)}({\varvec{\beta }}, t, w)\Vert {\mathop {\longrightarrow }\limits ^{P}} 0;\)

    (2) \(q^{(d)}({\varvec{\beta }}, t, w), d = 0,1,2\), are continuous functions of \({\varvec{\beta }}\) uniformly in \((t, w) \in [0, \tau ]^{2}\), and \(q^{(d)}({\varvec{\beta }}, t, w)\) are bounded on \(\mathbb {B} \times [0, \tau ]^{2}\);

    (3) \(\sup _{n \ge 1} E\left( Q^{(d)}({\varvec{\beta }}, t, w)\right) \) is bounded.

  (C7) For \(d=0,1,2\),

    $$\begin{aligned} \sup _{{\varvec{\beta }} \in \mathbb {B}, t \in [0, \tau ]}\left\| \widetilde{S}^{(d)}({\varvec{\beta }}, t)-s^{(d)}({\varvec{\beta }}, t)\right\| {\mathop {\longrightarrow }\limits ^{P}} 0, \end{aligned}$$
    (A.4)

    and

    $$\begin{aligned} \sup _{{\varvec{\beta }} \in \mathbb {B},(t, w) \in [0, \tau ]^{2}}\left\| \widetilde{Q}^{(d)}({\varvec{\beta }}, t, w)-q^{(d)}({\varvec{\beta }}, t, w)\right\| {\mathop {\longrightarrow }\limits ^{P}} 0. \end{aligned}$$
    (A.5)

Lemma A.1

Under the regularity conditions (C1)-(C7), as \(n\rightarrow \infty \), \(\widehat{{\varvec{\beta }}}_P\) converges to \({\varvec{\beta }}_0\) in probability and

$$\begin{aligned} \sqrt{n}\left( \widehat{{\varvec{\beta }}}_{P}-{\varvec{\beta }}_{0}\right) {\mathop {\longrightarrow }\limits ^{d}} N\left( 0, \Sigma ^{-1}+\Sigma ^{-1} \Omega \Sigma ^{-1}\right) , \end{aligned}$$

where

$$\begin{aligned} \Omega&=\int _{0}^{\tau } \int _{0}^{\tau } G\left( {\varvec{\beta }}_{0}, t, w\right) s^{(0)}\left( {\varvec{\beta }}_{0}, t\right) s^{(0)}\left( {\varvec{\beta }}_{0}, w\right) \lambda _{0}(t) \lambda _{0}(w) \textrm{d} t \, \textrm{d} w,\\ G({\varvec{\beta }}, t, w)&=(1-\alpha ) \alpha ^{-1} \Bigl( \left( s^{(0)}({\varvec{\beta }}, t) s^{(0)}({\varvec{\beta }}, w)\right) ^{-1} h^{(1)}({\varvec{\beta }}, t, w) \\&\quad +\left( s^{(0)}({\varvec{\beta }}, t) s^{(0)}({\varvec{\beta }}, w)\right) ^{-2} s^{(1)}({\varvec{\beta }}, t) s^{(1)}({\varvec{\beta }}, w)^{\top } h^{(0)}({\varvec{\beta }}, t, w) \\&\quad -s^{(0)}({\varvec{\beta }}, t)^{-1} s^{(0)}({\varvec{\beta }}, w)^{-2} s^{(1)}({\varvec{\beta }}, w) h^{(2)}({\varvec{\beta }}, w, t) \\&\quad -s^{(0)}({\varvec{\beta }}, w)^{-1} s^{(0)}({\varvec{\beta }}, t)^{-2} s^{(1)}({\varvec{\beta }}, t) h^{(2)}({\varvec{\beta }}, t, w) \Bigr) \end{aligned}$$

and

$$\begin{aligned} \begin{aligned}&h^{(i)}({\varvec{\beta }}, t, w)=q^{(i)}({\varvec{\beta }}, t, w)-s^{(i)}({\varvec{\beta }}, t) s^{(i)}({\varvec{\beta }}, w), i = 0,1,2. \end{aligned} \end{aligned}$$
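In practice, Lemma A.1 is used by plugging estimates of \(\Sigma \) and \(\Omega \) into the sandwich form \(\Sigma ^{-1}+\Sigma ^{-1} \Omega \Sigma ^{-1}\). The sketch below only illustrates that last step in Python; it assumes the estimates `sigma_hat` and `omega_hat` have already been computed by some other routine, and the function names are hypothetical.

```python
import numpy as np

def case_cohort_cov(sigma_hat, omega_hat, n):
    """Approximate covariance of beta-hat_P implied by Lemma A.1.

    sigma_hat, omega_hat : (p, p) plug-in estimates of Sigma and Omega
                           (assumed given; not computed here)
    n                    : the sample size appearing in sqrt(n) in the lemma
    """
    sigma_inv = np.linalg.inv(sigma_hat)
    v = sigma_inv + sigma_inv @ omega_hat @ sigma_inv   # limiting covariance
    return v / n                                         # scale back from sqrt(n)

def standard_errors(sigma_hat, omega_hat, n):
    """Square roots of the diagonal of the approximate covariance."""
    return np.sqrt(np.diag(case_cohort_cov(sigma_hat, omega_hat, n)))
```

The first term \(\Sigma ^{-1}\) is the usual full-cohort contribution, and the second term is the extra variability due to sampling the subcohort, so the second term vanishes when \(\Omega = 0\).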

Proof of Theorem 3.1:

The observed information matrix is

$$\begin{aligned}&H_P({\varvec{\beta }})=\sum _{i=1}^{N} \Delta _{i} \nonumber \\&\quad \left[ \frac{\sum _{l \in \widetilde{R}\left( T_{i}\right) } {\varvec{Z}}_{l}^{\otimes 2} e^{{\varvec{Z}}_{l}^{\prime } {\varvec{\beta }}}}{\sum _{l \in \widetilde{R}\left( T_{i}\right) } e^{{\varvec{Z}}_{l}^{\prime } {\varvec{\beta }}}}- \left\{ \frac{\sum _{l \in \widetilde{R}\left( T_{i}\right) } {\varvec{Z}}_{l} e^{{\varvec{Z}}_{l}^{\prime } {\varvec{\beta }}}}{\sum _{l \in \widetilde{R}\left( T_{i}\right) } e^{{\varvec{Z}}_{l}^{\prime } {\varvec{\beta }}}}\right\} ^{\otimes 2} \right] . \end{aligned}$$
(A.6)

Define

$$\begin{aligned} p_k^i = \frac{e^{{\varvec{Z}}_k'{\varvec{\beta }}}}{\sum _{l \in \widetilde{R}\left( T_{i}\right) }e^{{\varvec{Z}}_l'{\varvec{\beta }}}},\; k \in \widetilde{R}(T_i). \end{aligned}$$
(A.7)

\(H_P({\varvec{\beta }})\) can be written as

$$\begin{aligned} H_P({\varvec{\beta }}) = \sum _{i=1}^N\Delta _i\left[ \sum _{j\in \widetilde{R}(T_i)}{\varvec{Z}}_j{\varvec{Z}}_j^\top p_j^i-\left\{ \sum _{j\in \widetilde{R}(T_i)}{\varvec{Z}}_jp_{j}^i\right\} ^{\otimes 2}\right] \triangleq \sum _{i=1}^N\Delta _i M_i \;. \end{aligned}$$
(A.8)

For any \(x \ne 0\), we have

$$\begin{aligned} x^\top M_i x&= \sum _{j \in \widetilde{R}(T_i)}(x^\top {\varvec{Z}}_j)^2p_j^i - \left[ \sum _{j\in \widetilde{R}(T_i)}x^\top {\varvec{Z}}_jp_{j}^i\right] ^2 \\&= \textbf{E}_i[W^2] - \textbf{E}_i[W]^2, \end{aligned}$$
(A.9)

where W is a discrete random variable with support \(\{W_j: W_j = x^\top {\varvec{Z}}_j, j \in \widetilde{R}(T_i)\}\) and the expectation \(\textbf{E}_i[\cdot ]\) is taken with respect to the probabilities \(\{p_k^i,\; k \in \widetilde{R}(T_i)\}\). Thus \(x^\top M_i x\) is the variance of W.

Let \(w_{\textrm{min}}\) and \(w_{\textrm{max}}\) denote the minimum and maximum values of \(W_{j}.\) Among all distributions supported on \([w_{\textrm{min}}, w_{\textrm{max}}]\), the variance is largest when \(w_{\textrm{min}}\) and \(w_{\textrm{max}}\) are each taken with probability 1/2. So the quantity in (A.9) is bounded as follows:

$$\begin{aligned} \begin{aligned} x^{\top } M_{i} x&\le \frac{w_{\textrm{min}}^{2}+w_{\textrm{max}}^{2}}{2}-\left[ \frac{w_{\textrm{min}}+w_{\textrm{max}}}{2}\right] ^{2} \\&\le \frac{w_{\textrm{min}}^{2}+w_{\textrm{max}}^{2}}{2} \le \frac{\sum _{j \in \widetilde{R}(T_i)} W_{j}^{2}}{2}\\&=\frac{\sum _{j \in \widetilde{R}(T_i)} x^{\top } {\varvec{Z}}_{j} {\varvec{Z}}_{j}^{\top } x}{2}. \end{aligned} \end{aligned}$$

Hence, \(M_i \le \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_j{\varvec{Z}}_j^\top / 2\), and we can deduce that:

$$\begin{aligned} H_P({\varvec{\beta }})=\sum _{i =1}^N \Delta _{i} M_{i} \le \sum _{i = 1}^N \Delta _{i} \sum _{j \in \widetilde{R}(T_{i})} {\varvec{Z}}_{j} {\varvec{Z}}_{j}^{\top } / 2. \end{aligned}$$
(A.10)
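The argument leading to (A.10) can be checked numerically for a single risk set. The following Python sketch builds \(M_i\) from (A.7) and (A.8) on made-up covariates, confirms that \(x^\top M_i x\) equals the variance in (A.9), and checks that the bound matrix minus \(M_i\) has no negative eigenvalue. The data are random, and the helper names `risk_set_probs` and `m_matrix` are hypothetical.

```python
import numpy as np

def risk_set_probs(Zr, beta):
    """p_j^i of (A.7); rows of Zr are the Z_j in one risk set."""
    eta = Zr @ beta
    p = np.exp(eta - eta.max())      # subtract the max for numerical stability
    return p / p.sum()

def m_matrix(Zr, beta):
    """M_i of (A.8): weighted second moment minus outer product of the weighted mean."""
    p = risk_set_probs(Zr, beta)
    mean = p @ Zr
    return (Zr * p[:, None]).T @ Zr - np.outer(mean, mean)

rng = np.random.default_rng(0)
Zr = rng.normal(size=(8, 3))         # 8 subjects in the risk set, 3 covariates
beta = rng.normal(size=3)
x = rng.normal(size=3)

M = m_matrix(Zr, beta)
p = risk_set_probs(Zr, beta)
W = Zr @ x                           # W_j = x' Z_j
variance = p @ W**2 - (p @ W) ** 2   # E_i[W^2] - E_i[W]^2
bound = Zr.T @ Zr / 2.0              # sum_j Z_j Z_j' / 2
gap = np.linalg.eigvalsh(bound - M).min()

print(x @ M @ x, variance)           # the two numbers agree
print(gap)                           # nonnegative up to rounding
```

Because the bound matrix does not involve `beta`, it can be assembled once from the covariates alone.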

Let p be an n-dimensional vector and let D(p) be the diagonal matrix whose k-th diagonal element is the k-th element of p. By (A.8), the quadratic form \(x^{\top } M_i x\) can be rewritten as:

$$\begin{aligned} x^{\top } M_{i} x&=\sum _{j \in \widetilde{R}(T_i)}\left\{ x^{\top } {\varvec{Z}}_{j}\right\} ^{2} p_j^i-\left[ \sum _{j \in \widetilde{R}(T_i)} x^{\top } {\varvec{Z}}_{j} p_j^i\right] ^{2} \\&=\left\{ {\varvec{Z}}^{\top } x\right\} ^{\top } D\left( p^i\right) \left\{ {\varvec{Z}}^{\top } x\right\} -\left\{ {\varvec{Z}}^{\top } x\right\} ^{\top } p^i\left[ \left\{ {\varvec{Z}}^{\top } x\right\} ^{\top } p^i\right] ^{\top } \\&=\left\{ {\varvec{Z}}^{\top } x\right\} ^{\top }\left[ D\left( p^i\right) -p^i\left( p^i\right) ^{\top }\right] \left\{ {\varvec{Z}}^{\top } x\right\} , \end{aligned}$$
(A.11)

where \({\varvec{Z}} = [{\varvec{Z}}_1, \ldots , {\varvec{Z}}_{n_i}]\) is the matrix whose columns are the covariate vectors of the subjects in \(\widetilde{R}(T_i)\), \(p^i = [p_1^i, \ldots , p_{n_i}^i]^{\top }\), and \(n_i\) is the number of elements of \(\widetilde{R}(T_i)\). Let \(w = [W_1,\ldots , W_{n_i}]^{\top } = {\varvec{Z}}^{\top } x\) and \(G(p) = D(p) - pp^{\top }\). Then (A.11) can be rewritten as

$$\begin{aligned} x^{\top } M_i x = w^{\top } G(p^i) w. \end{aligned}$$
(A.12)
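The matrix form (A.12) is easy to verify on a small example. The Python sketch below uses random, made-up inputs and stores the covariate vectors as the columns of `Zmat`, so that `Zmat.T @ x` gives the vector w; the names are hypothetical. It also illustrates a fact that is used implicitly later in the proof: since the entries of p sum to one, \(G(p)\) annihilates constant vectors, so \(w^{\top } G(p) w\) is unchanged when a constant is added to every component of w, as one expects of a variance.

```python
import numpy as np

def g_matrix(p):
    """G(p) = D(p) - p p' for a probability vector p."""
    return np.diag(p) - np.outer(p, p)

rng = np.random.default_rng(1)
n_i, dim = 6, 2
Zmat = rng.normal(size=(dim, n_i))        # column j is Z_j
beta = rng.normal(size=dim)
x = rng.normal(size=dim)

eta = Zmat.T @ beta
p = np.exp(eta - eta.max())
p /= p.sum()                               # p^i of (A.7)
w = Zmat.T @ x                             # w_j = x' Z_j

# M_i of (A.8), written with the column layout of Zmat
M = (Zmat * p) @ Zmat.T - np.outer(Zmat @ p, Zmat @ p)

lhs = x @ M @ x                            # left-hand side of (A.12)
rhs = w @ g_matrix(p) @ w                  # right-hand side of (A.12)
shifted = (w + 3.0) @ g_matrix(p) @ (w + 3.0)

print(lhs, rhs, shifted)                   # all three agree up to rounding
```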

We seek an upper bound for \(M_i\) that does not depend on \({\varvec{\beta }}\). In (A.12), \({\varvec{\beta }}\) enters only through \(G(p^i)\), so it suffices to bound \(G(p^i)\). Consider the ratio

$$\begin{aligned} \left( w^{\top } G(p^i) w\right) / \left( w^{\top } G(p^{*}) w\right) , \end{aligned}$$
(A.13)

where \(p^{*} = [1/n_i, \ldots , 1/n_i]^{\top }\). If this ratio is bounded by a constant that does not depend on \({\varvec{\beta }}\), then we obtain an upper bound. Since the argument following (A.9) shows that

$$\begin{aligned} w^{\top } G(p^i) w&\le \frac{w_{\textrm{min}}^{2}+w_{\textrm{max}}^{2}}{2}-\left[ \frac{w_{\textrm{min}}+w_{\textrm{max}}}{2}\right] ^{2} \\&= \frac{(w_{\max } - w_{\min })^2}{4}, \end{aligned}$$

we need only find a lower bound for \(w^{\top } G(p^{*}) w \).

Rewrite \(w^{\top } G(p^{*}) w \) as:

$$\begin{aligned} w^{\top } G\left( p^{*}\right) w&=w^{\top } D\left( p^{*}\right) w-\left( w^{\top } p^{*}\right) ^{2} \\&=\sum _{j \in \widetilde{R}(T_i)} w_{j}^{2} / n_{i}-\left( \sum _{j \in \widetilde{R}(T_i)} w_{j} / n_{i}\right) ^{2}. \end{aligned}$$

This is the variance of a random variable that takes each value \(w_j\), \(j= 1, 2,\ldots , n_i\), with probability \(1/ n_i\), so our goal is equivalent to finding a lower bound for this variance. Keep the maximum and the minimum of w fixed and treat all other components of w as free variables. The variance is then minimized when all remaining components take the value \((w_{\min } + w_{\max }) / 2\), so that

$$\begin{aligned} w^{\top } G\left( p^{*}\right) w&\ge \left( w_{\max }-\frac{w_{\max }+w_{\min }}{2}\right) ^2 \frac{1}{n_{i}}+\left( w_{\min }-\frac{w_{\max }+w_{\min }}{2}\right) ^2 \frac{1}{n_{i}} \\&=\frac{\left( w_{\max }-w_{\min }\right) ^{2}}{2 n_{i}}. \end{aligned}$$

Hence we bound the ratio:

$$\begin{aligned} \frac{w^{\top } G\left( p^{i}\right) w}{w^{\top } G\left( p^{*}\right) w} \le \frac{\left( w_{\max }-w_{\min }\right) ^{2}}{4} / \frac{\left( w_{\max }-w_{\min }\right) ^{2}}{2 n_{i}}=\frac{n_{i}}{2}. \end{aligned}$$

It follows that

$$\begin{aligned} x^{\top } M_{i} x&=w^{\top } G\left( p^{i}\right) w \le w^{\top } \frac{n_{i}}{2} G\left( p^{*}\right) w \\&=\frac{n_{i}}{2}\left\{ {\varvec{Z}}^{\top } x\right\} ^{\top }\left[ D\left( p^{*}\right) -p^{*}\left( p^{*}\right) ^{\top }\right] \left\{ {\varvec{Z}}^{\top } x\right\} \\&=\frac{n_{i}}{2} \sum _{j \in \widetilde{R}(T_i)}\left\{ x^{\top } {\varvec{Z}}_{j}\right\} ^{2} \frac{1}{n_{i}}-\frac{n_{i}}{2}\left[ \sum _{j \in \widetilde{R}(T_i)} x^{\top } {\varvec{Z}}_{j} \frac{1}{n_{i}}\right] ^{2} \\&=x^{\top }\left[ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j} {\varvec{Z}}_{j}^{\top } / 2-\left\{ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j}\right\} ^{\otimes 2} /\left( 2 n_{i}\right) \right] x. \end{aligned}$$

Hence

$$\begin{aligned} M_{i} \le \left[ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j} {\varvec{Z}}_{j}^{\top }-\left\{ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j}\right\} ^{\otimes 2} / n_{i}\right] / 2, \end{aligned}$$

and we can deduce that

$$\begin{aligned} H_P({\varvec{\beta }})=\sum _{i=1}^N\Delta _{i} M_{i} \le \sum _{i = 1}^N \Delta _{i} \left[ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j} {\varvec{Z}}_{j}^{\top }-\left\{ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j}\right\} ^{\otimes 2} / n_{i}\right] / 2. \end{aligned}$$

\(\square \)
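Both steps of the proof above, the ratio bound \(n_i/2\) and the resulting matrix inequality for \(M_i\), can be checked by simulation. The Python sketch below draws random vectors w and random probability vectors p for one risk set of size 7, records the largest ratio observed, and then checks the matrix inequality over many random values of \({\varvec{\beta }}\). All inputs are made up and the helper names (`g_matrix`, `b2_term`, `m_matrix`) are hypothetical. Note that the bracketed bound equals the sum of centred cross-products \(\sum _j ({\varvec{Z}}_j - \bar{{\varvec{Z}}})({\varvec{Z}}_j - \bar{{\varvec{Z}}})^{\top }\), halved, which is how `b2_term` computes it.

```python
import numpy as np

def g_matrix(p):
    """G(p) = D(p) - p p'."""
    return np.diag(p) - np.outer(p, p)

def b2_term(Zr):
    """[sum_j Z_j Z_j' - (sum_j Z_j)^{(x)2} / n_i] / 2 for one risk set."""
    Zc = Zr - Zr.mean(axis=0)            # centre at the unweighted mean
    return Zc.T @ Zc / 2.0

def m_matrix(Zr, beta):
    """M_i of (A.8) for one risk set; rows of Zr are the Z_j."""
    eta = Zr @ beta
    p = np.exp(eta - eta.max())
    p /= p.sum()
    mean = p @ Zr
    return (Zr * p[:, None]).T @ Zr - np.outer(mean, mean)

rng = np.random.default_rng(2)
n_i = 7
p_star = np.full(n_i, 1.0 / n_i)

# Ratio check: w' G(p) w / w' G(p*) w should never exceed n_i / 2.
worst_ratio = 0.0
for _ in range(2000):
    w = rng.normal(size=n_i)
    p = rng.dirichlet(np.ones(n_i))
    ratio = (w @ g_matrix(p) @ w) / (w @ g_matrix(p_star) @ w)
    worst_ratio = max(worst_ratio, ratio)

# Matrix check: the bound minus M_i should be positive semidefinite for any beta.
Zr = rng.normal(size=(n_i, 3))
min_gap = min(
    np.linalg.eigvalsh(b2_term(Zr) - m_matrix(Zr, rng.normal(scale=2.0, size=3))).min()
    for _ in range(200)
)

print(worst_ratio, n_i / 2)   # the first number stays below the second
print(min_gap)                # nonnegative up to rounding
```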

Proof of Theorem 3.2:

Let \({\varvec{\beta }}^{(m+1)} = {\varvec{\beta }}^{(m)} + \textbf{B}^{-1}U_P({\varvec{\beta }}^{(m)})\). We first prove that \(l_P({\varvec{\beta }}^{(m)})\) is a nondecreasing sequence. Consider the quadratic approximation of \(l_P({\varvec{\beta }}^{(m+1)}) - l_P({\varvec{\beta }}^{(m)})\):

$$\begin{aligned} \begin{aligned}&Q({\varvec{\beta }}^{(m+1)}|{\varvec{\beta }}^{(m)})- l_P({\varvec{\beta }}^{(m)})= U_P({\varvec{\beta }}^{(m)})^{\top }({\varvec{\beta }}^{(m+1)} - {\varvec{\beta }}^{(m)}) \\&\quad - \frac{1}{2}({\varvec{\beta }}^{(m+1)} - {\varvec{\beta }}^{(m)}) ^{\top } \textbf{B} ({\varvec{\beta }}^{(m+1)} - {\varvec{\beta }}^{(m)})\\&\quad = U_P({\varvec{\beta }}^{(m)})^{\top }\textbf{B}^{-1}U_P({\varvec{\beta }}^{(m)}) - \frac{1}{2}U_P({\varvec{\beta }}^{(m)})^{\top }\textbf{B}^{-1}\textbf{B}\textbf{B}^{-1}U_P({\varvec{\beta }}^{(m)})\\&\quad = \frac{1}{2}U_P({\varvec{\beta }}^{(m)})^{\top }\textbf{B}^{-1}U_P({\varvec{\beta }}^{(m)}) \ge 0, \end{aligned} \end{aligned}$$

where the inequality is strict if \(U_P({\varvec{\beta }}^{(m)})\ne 0\). Since \(l_P({\varvec{\beta }}^{(m+1)})\ge Q({\varvec{\beta }}^{(m+1)}|{\varvec{\beta }}^{(m)})\), which holds because \(H_P({\varvec{\beta }})\le \textbf{B}\), we have \(l_P({\varvec{\beta }}^{(m + 1)}) \ge l_P({\varvec{\beta }}^{(m)})\).

Suppose, for the purpose of contradiction, that \(\Vert U_P({\varvec{\beta }}^{(m)})\Vert \) is bounded away from 0. Then

$$\begin{aligned} l_P({\varvec{\beta }}^{(m+1)}) - l_P({\varvec{\beta }}^{(m)}) \ge Q({\varvec{\beta }}^{(m+1)}|{\varvec{\beta }}^{(m)})- l_P({\varvec{\beta }}^{(m)}) = \frac{1}{2}U_P({\varvec{\beta }}^{(m)})^{\top }\textbf{B}^{-1}U_P({\varvec{\beta }}^{(m)}) > 0, \end{aligned}$$

so the increments of \(l_P({\varvec{\beta }}^{(m)})\) are bounded below by a positive constant, which contradicts the boundedness of \(l_P({\varvec{\beta }})\). Therefore, \(U_P({\varvec{\beta }}^{(m)})\) converges to 0, from which we deduce that \({\varvec{\beta }}^{(m)}\) converges to \(\widehat{{\varvec{\beta }}}_P\). \(\square \)
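The update analysed in this proof can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' Rcpp implementation: the log pseudo-partial likelihood is taken to be \(l_P({\varvec{\beta }}) = \sum _i \Delta _i [{\varvec{Z}}_i^{\prime }{\varvec{\beta }} - \log \sum _{l \in \widetilde{R}(T_i)} e^{{\varvec{Z}}_l^{\prime }{\varvec{\beta }}}]\), which is the form consistent with (A.6); the risk set of a case is assumed to contain the case itself plus the subcohort members still at risk; and every function name and the toy data generator are hypothetical.

```python
import numpy as np

def loglik_score(beta, Z, delta, risk):
    """Assumed l_P(beta) and its gradient U_P(beta).

    risk[i, l] is True when subject l belongs to the risk set of case i.
    """
    eta = Z @ beta
    ll, score = 0.0, np.zeros_like(beta)
    for i in np.flatnonzero(delta):
        idx = risk[i]
        w = np.exp(eta[idx])
        ll += eta[i] - np.log(w.sum())
        score += Z[i] - (w @ Z[idx]) / w.sum()
    return ll, score

def bound_matrix(Z, delta, risk):
    """Parameter-free bound from the proof of Theorem 3.1 (centred form, halved)."""
    B = np.zeros((Z.shape[1], Z.shape[1]))
    for i in np.flatnonzero(delta):
        Zc = Z[risk[i]] - Z[risk[i]].mean(axis=0)
        B += Zc.T @ Zc / 2.0
    return B

def qub(Z, delta, risk, tol=1e-8, max_iter=10000):
    """Iterate beta <- beta + B^{-1} U_P(beta); B is inverted only once."""
    beta = np.zeros(Z.shape[1])
    B_inv = np.linalg.inv(bound_matrix(Z, delta, risk))
    history = []
    for _ in range(max_iter):
        ll, U = loglik_score(beta, Z, delta, risk)
        history.append(ll)
        if np.linalg.norm(U) < tol:
            break
        beta = beta + B_inv @ U
    return beta, history

def toy_case_cohort(n=80, p=2, seed=0):
    """Hypothetical toy data: exponential event times and a random subcohort."""
    rng = np.random.default_rng(seed)
    Z = rng.normal(size=(n, p))
    event = rng.exponential(1.0 / np.exp(Z @ np.full(p, 0.5)))
    cens = rng.exponential(2.0, size=n)
    T, delta = np.minimum(event, cens), event <= cens
    xi = rng.random(n) < 0.5                     # subcohort indicator
    # l is in the risk set of i if T_l >= T_i and l is in the subcohort or l == i
    risk = (T[None, :] >= T[:, None]) & (xi[None, :] | np.eye(n, dtype=bool))
    return Z, delta, risk

Z, delta, risk = toy_case_cohort()
beta_hat, history = qub(Z, delta, risk, tol=1e-6)
print(beta_hat, len(history))
```

The recorded values in `history` should never decrease, which is the ascent property proved above. Each iteration costs one score evaluation and one matrix-vector product, since the inverse of the bound matrix is reused throughout.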

Proof of Theorem 3.3:

From the proof of Theorem 3.2 we know that

$$\begin{aligned} Q\left( \varvec{\beta }^{(m+1)} \mid \varvec{\beta }^{(m)}\right) -l_P\left( \varvec{\beta }^{(m)}\right) =\frac{1}{2} U_P\left( \varvec{\beta }^{(m)}\right) ^{\top } \textbf{B}^{-1} U_P\left( \varvec{\beta }^{(m)}\right) . \end{aligned}$$

Note that the rate of convergence, measured by the degree of improvement at each iteration, decreases monotonically as the bound matrix \(\textbf{B}\) from (A.19) increases. Moreover,

$$\begin{aligned} \begin{aligned} \textbf{B}_{1}-\textbf{B}_{2}&=\sum _{i=1}^{N} \Delta _{i} \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j} {\varvec{Z}}_{j}^{\top } / 2-\sum _{i=1}^{N} \Delta _{i}\\&\left[ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j} {\varvec{Z}}_{j}^{\top }-\left\{ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j}\right\} ^{\otimes 2} / n_{i}\right] / 2 \\&=\sum _{i=1}^{N} \Delta _{i}\left[ \left\{ \sum _{j \in \widetilde{R}(T_i)} {\varvec{Z}}_{j}\right\} ^{\otimes 2} / n_{i}\right] / 2 \ge 0. \end{aligned} \end{aligned}$$

Since \(\textbf{B}_{1} \ge \textbf{B}_{2}\), we deduce that the QUB2 algorithm converges faster than the QUB1 algorithm. \(\square \)
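The comparison of the two bound matrices can be illustrated numerically. The Python sketch below builds \(\textbf{B}_1\) and \(\textbf{B}_2\) from the expressions in this proof on made-up data, checks that their difference has no negative eigenvalue, and compares the guaranteed per-step improvement \(\frac{1}{2} U^{\top } \textbf{B}^{-1} U\) for an arbitrary vector U. For brevity it uses full-cohort risk sets and a random stand-in for the score vector; both are simplifications for illustration only, and the function name `bounds` is hypothetical.

```python
import numpy as np

def bounds(Z, delta, risk):
    """Return (B1, B2) summed over the cases, as in the proof of Theorem 3.3."""
    p = Z.shape[1]
    B1, B2 = np.zeros((p, p)), np.zeros((p, p))
    for i in np.flatnonzero(delta):
        Zr = Z[risk[i]]
        total = Zr.sum(axis=0)
        cross = Zr.T @ Zr                              # sum_j Z_j Z_j'
        B1 += cross / 2.0
        B2 += (cross - np.outer(total, total) / len(Zr)) / 2.0
    return B1, B2

rng = np.random.default_rng(2)
n, p = 50, 3
Z = rng.normal(size=(n, p))
T = rng.exponential(size=n)
delta = rng.random(n) < 0.6
risk = T[None, :] >= T[:, None]        # full-cohort risk sets, illustration only

B1, B2 = bounds(Z, delta, risk)
min_eig = np.linalg.eigvalsh(B1 - B2).min()

U = rng.normal(size=p)                  # stand-in for a score vector
gain1 = 0.5 * U @ np.linalg.solve(B1, U)
gain2 = 0.5 * U @ np.linalg.solve(B2, U)

print(min_eig)                          # nonnegative up to rounding
print(gain1, gain2)                     # gain2 is at least gain1
```

The second check reflects the fact that, for positive definite matrices, \(\textbf{B}_1 \ge \textbf{B}_2\) implies \(\textbf{B}_2^{-1} \ge \textbf{B}_1^{-1}\), so the smaller bound guarantees at least as much improvement per step for the same score vector.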

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Huang, C., Ding, J. & Feng, Y. A quadratic upper bound algorithm for regression analysis of credit risk under the proportional hazards model with case-cohort data. Stat Comput 33, 78 (2023). https://doi.org/10.1007/s11222-023-10248-w

  • DOI: https://doi.org/10.1007/s11222-023-10248-w
