A tutorial on rank-based coefficient estimation for censored data in small- and large-scale problems

Chung, Matthias; Long, Qi; Johnson, Brent A.

doi:10.1007/s11222-012-9333-9


Published: 26 May 2012

Statistics and Computing, volume 23, pages 601–614 (2013)



Abstract

The analysis of survival endpoints subject to right-censoring is an important research area in statistics, particularly among econometricians and biostatisticians. The two most popular semiparametric models are the proportional hazards model and the accelerated failure time (AFT) model. Rank-based estimation in the AFT model is computationally challenging because it requires optimizing a non-smooth loss function. Previous work has shown that rank-based estimators may be written as solutions to linear programming (LP) problems. However, the size of the LP problem is O(n² + p) subject to n² linear constraints, where n denotes the sample size and p denotes the dimension of the parameters. As n and/or p increases, the feasibility of such a solution in practice becomes questionable. Among data mining and statistical learning enthusiasts, there is interest in extending ordinary low-dimensional regression coefficient estimators into high-dimensional data mining tools through regularization. Applying this recipe to rank-based coefficient estimators leads to formidable optimization problems, which may be avoided through smooth approximations to non-smooth functions. We review smooth approximations and quasi-Newton methods for rank-based estimation in AFT models. The computational cost of our method is substantially smaller than that of the corresponding LP problem, and the method applies equally to small- and large-scale problems. The algorithm described here allows one to couple rank-based estimation for censored data with virtually any regularization and is exemplified through four case studies.


References

Boyd, S.P., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Brown, B.M., Wang, Y.G.: Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92, 149–158 (2005)
Brown, B.M., Wang, Y.G.: Induced smoothing for rank regression with censored survival times. Stat. Med. 26, 828–836 (2007)
Cai, T., Huang, J., Tian, L.: Regularized estimation for the accelerated failure time model. Biometrics 65, 394–404 (2009)
CAMDA: Critical Assessment of Microarray Data Analysis (2003). http://www.camda.duke.edu/camda03.html
Candes, E., Tao, T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35(6), 2313–2351 (2007)
Chung, J., Chung, M., O’Leary, D.: Designing optimal filters for ill-posed inverse problems. SIAM J. Sci. Comput. 33(6), 3132–3152 (2011)
Conrad, M., Johnson, B.A.: A quasi-Newton algorithm for efficient computation of Gehan estimates. Technical Report TR 2010-02. Department of Biostatistics and Bioinformatics, Emory University (2010)
Cox, D.R.: Regression models and life-tables. J. R. Stat. Soc., Ser. B, Stat. Methodol. 34, 187–220 (1972)
Cox, D.R., Oakes, D.: Analysis of Survival Data. Chapman & Hall, London (1984)
Dickson, E.R., Grambsch, P.M., Fleming, T.R., Fisher, D., Langworthy, A.: Prognosis in primary biliary cirrhosis: model for decision making. Hepatology 10(1), 1–7 (1989)
Fleming, T.R., Harrington, D.P.: Counting Processes and Survival Analysis, vol. 8. Wiley, New York (1991)
Fygenson, M., Ritov, Y.: Monotone estimating equations for censored data. Ann. Stat. 22, 732–746 (1994)
Gehan, E.A.: A generalized Wilcoxon test for comparing arbitrarily single-censored samples. Biometrika 52, 203–223 (1965)
Gill, P.E., Murray, W., Wright, M.H.: Practical Optimization. Academic Press, New York (1981)
Hadamard, J.: Sur les Problèmes aux Dérivées Partielles et Leur Signification Physique (1902)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)
Heller, G.: Smoothed rank regression with censored data. J. Am. Stat. Assoc. 102(478), 552–559 (2007)
Hoerl, A.E., Kennard, R.W.: Ridge regression: biased estimation for nonorthogonal problems. Technometrics 55–67 (1970)
Huang, J., Ma, S., Xie, H.: Regularized estimation in the accelerated failure time model with high-dimensional covariates. Biometrics, 813–820 (2006)
Huber, P.J.: Robust estimation of a location parameter. Ann. Math. Stat. 35(1), 73–101 (1964)
Hunter, D.R., Lange, K.: A tutorial on MM algorithms. Am. Stat. 30–37 (2004)
Jin, Z., Lin, D.Y., Wei, L.J., Ying, Z.: Rank-based inference for the accelerated failure time model. Biometrika 90(2), 341–353 (2003)
Johnson, B.A.: Variable selection in semiparametric linear regression with censored data. J. R. Stat. Soc. B 70, 351–370 (2008)
Johnson, B.A.: Rank-based estimation in the ℓ ₁-regularized partly linear model with application to integrated analyses of clinical predictors and gene expression data. Biostatistics 10, 659–666 (2009a)
Johnson, B.A.: On lasso for censored data. Electron. J. Stat. 3, 485–506 (2009b)
Johnson, B.A., Lin, D., Zeng, D.: Penalized estimating functions and variable selection in semiparametric regression models. J. Am. Stat. Assoc. 103, 672–680 (2008)
Johnson, B.A., Long, Q., Chung, M.: On path restoration for censored outcomes. Biometrics 67, 1379–1388 (2011)
Johnson, L.M., Strawderman, R.L.: Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96(3), 577–590 (2009)
Kaipio, J.P., Somersalo, E.: Statistical and Computational Inverse Problems. Springer, Berlin (2005)
Kalbfleisch, J.D., Prentice, R.L.: The Statistical Analysis of Failure Time Data, vol. 5, 2nd edn. Wiley, New York (1980)
Koenker, R., Bassett, G. Jr.: Regression quantiles. Econometrica 33–50 (1978)
Koenker, R., Ng, P.: A Frisch-Newton algorithm for sparse quantile regression. Acta Math. Appl. Sin. 21(2), 225–236 (2005)
Koenker, R.W., D’Orey, V.: Algorithm AS 229: computing regression quantiles. J. R. Stat. Soc., Ser. C, Appl. Stat. 36(3), 383–393 (1987)
Lin, D.Y., Geyer, C.J.: Computational methods for semiparametric linear regression with censored data. J. Comput. Graph. Stat. 1(1), 77–90 (1992)
Meier, L., Van de Geer, S., Bühlmann, P.: The group lasso for logistic regression. J. R. Stat. Soc., Ser. B, Stat. Methodol. 70(1), 53–71 (2008)
Morris, C., Norton, E., Zhou, X.: Parametric duration analysis of nursing home usage. In: Case Studies in Biometry, pp. 231–248 (1994)
Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Berlin (2006)
Owen, A.B.: A robust hybrid of lasso and ridge regression. Technical Report, Department of Statistics, Stanford University, Palo Alto, CA (2006)
Prentice, R.L.: Linear rank tests with right censored data. Biometrika 65(1), 167–179 (1978)
Reid, N.: A conversation with Sir David Cox. Stat. Sci. 9, 439–455 (1994)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58(1), 267–288 (1996)
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. J. R. Stat. Soc., Ser. B, Stat. Methodol. 67(1), 91–108 (2005)
Tsiatis, A.A.: Estimating regression parameters using linear rank tests for censored data. Ann. Stat. 18(1), 354–372 (1990)
Vogel, C.R.: Computational Methods for Inverse Problems, vol. 23. SIAM, Philadelphia (2002)
Wei, L.J., Ying, Z., Lin, D.Y.: Linear regression analysis of censored survival data based on rank tests. Biometrika 77(4), 845–851 (1990)
Wu, S., Shen, X., Geyer, C.J.: Adaptive regularization using the entire solution surface. Biometrika 96(3), 513–527 (2009)
Xu, J., Leng, C., Ying, Z.: Rank-based variable selection with censored data. Stat. Comput. 20, 165–176 (2010)
Ying, Z.: A large sample study of rank estimation for censored regression data. Ann. Stat. 21, 76–99 (1993)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc., Ser. B, Stat. Methodol. 68(1), 49–67 (2006)
Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc., Ser. B, Stat. Methodol. 67(2), 301–320 (2005)


Author information

Authors and Affiliations

Department of Mathematics, Virginia Tech, Blacksburg, VA, 24061-0123, USA
Matthias Chung
Department of Biostatistics and Bioinformatics, Rollins School of Public Health, Emory University, Atlanta, GA, 30322, USA
Qi Long & Brent A. Johnson


Corresponding author

Correspondence to Brent A. Johnson.

Additional information

This paper is adapted from an earlier Emory technical report by Conrad and Johnson (2010). This work was supported in part by US NIH PHS Grant UL1 RR025008 from the Clinical and Translational Science Award program.

Appendix

Operating characteristics of polynomial-smoothed Gehan estimator

In this section, we outline the large sample properties of the estimator $\widehat{{\boldsymbol {\beta }}}_{G,\varepsilon }$. Let the parameter ${\boldsymbol {\beta }}$ belong to a parameter space $\mathbb{B}$, a compact subset of $\mathbb{R}^{p}$, and let $f_{0}({\boldsymbol {\beta }})$ be a convex function for ${\boldsymbol {\beta }}\in\mathbb{B}$. The proof of Theorem 1 relies on the following two facts regarding the loss functions $f_{G}({\boldsymbol {\beta }})$ and $f_{G,\varepsilon }({\boldsymbol {\beta }})$.

Lemma 1

Under Conditions A1–A3 in Johnson and Strawderman (2009, p. 586),

$$\sup_{{\boldsymbol {\beta }}\in\mathbb{B}} \bigl \vert f_{G}({\boldsymbol {\beta }}) - f_0( {\boldsymbol {\beta }}) \bigr \vert \rightarrow0 \quad \mbox{\textit{almost surely}}. $$

Lemma 2

Under Conditions A1–A3 in Johnson and Strawderman (2009, p. 586),

$$\sup_{{\boldsymbol {\beta }}\in\mathbb{B}} \bigl \vert f_{G,\varepsilon }({\boldsymbol {\beta }}) - f_0( {\boldsymbol {\beta }}) \bigr \vert \rightarrow0 \quad \mbox{\textit{almost surely}}. $$

Lemma 1 coincides with Lemma 1 of Johnson and Strawderman (2009), holds under exactly the same conditions, and is stated here without proof.

Outline proof of Lemma 2

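By the triangle inequality, we have

$$\bigl \vert f_{G,\varepsilon }({\boldsymbol {\beta }}) - f_0( {\boldsymbol {\beta }}) \bigr \vert \leq \bigl \vert f_{G,\varepsilon }({\boldsymbol {\beta }}) - f_{G}( {\boldsymbol {\beta }}) \bigr \vert + \bigl \vert f_{G}({\boldsymbol {\beta }}) - f_0( {\boldsymbol {\beta }}) \bigr \vert . $$

(19)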

By Lemma 1, the second term in (19) can be made arbitrarily small, uniformly for all ${\boldsymbol {\beta }}\in\mathbb{B}$, except on a set of probability measure zero. The first term in (19) is

Hence, the absolute difference between the Gehan loss and its smooth approximation can be made arbitrarily small, for every ${\boldsymbol {\beta }}\in\mathbb{B}$. The conclusion then follows. □
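The claim that the smooth loss stays uniformly close to the Gehan loss can be checked numerically. The sketch below is only an illustration under stated assumptions: it takes the Gehan loss in the common form $n^{-2}\sum_{i}\sum_{j}\delta_i\{e_i({\boldsymbol {\beta }})-e_j({\boldsymbol {\beta }})\}^{-}$ with residuals $e_i({\boldsymbol {\beta }})=\log Y_i-\mathbf{x}_i^{\top}{\boldsymbol {\beta }}$, and it replaces the negative part by a piecewise-quadratic smoother chosen here for convenience, which is not necessarily the polynomial used in the paper. For this particular smoother the pointwise gap is at most ε/4, so the gap between the two losses is at most ε/4 for every β. The data and the names `neg_part`, `smooth_neg_part`, and `gehan_loss` are hypothetical.

```python
import random

def neg_part(z):
    """Negative part of z, i.e. max(0, -z)."""
    return max(0.0, -z)

def smooth_neg_part(z, eps):
    """Assumed C^1 smoother of max(0, -z): quadratic on (-eps, eps), exact outside."""
    if z <= -eps:
        return -z
    if z >= eps:
        return 0.0
    return (z - eps) ** 2 / (4.0 * eps)

def gehan_loss(beta, X, logy, delta, h):
    """n^{-2} * sum_i sum_j delta_i * h(e_i - e_j), with e_i = logy_i - x_i' beta."""
    n = len(logy)
    e = [logy[i] - sum(b * x for b, x in zip(beta, X[i])) for i in range(n)]
    total = 0.0
    for i in range(n):
        if delta[i]:  # only uncensored observations contribute as index i
            for j in range(n):
                total += h(e[i] - e[j])
    return total / n ** 2

# Small synthetic data set (purely illustrative).
random.seed(1)
n = 40
X = [[random.gauss(0, 1), random.gauss(0, 1)] for _ in range(n)]
logy = [random.gauss(0, 1) for _ in range(n)]
delta = [1 if random.random() < 0.7 else 0 for _ in range(n)]

# Largest observed gap over random beta values, for decreasing eps.
gaps = {}
for eps in (0.5, 0.1, 0.02):
    worst = 0.0
    for _ in range(50):
        beta = [random.uniform(-2, 2), random.uniform(-2, 2)]
        exact = gehan_loss(beta, X, logy, delta, neg_part)
        smooth = gehan_loss(beta, X, logy, delta,
                            lambda z: smooth_neg_part(z, eps))
        worst = max(worst, abs(smooth - exact))
    gaps[eps] = worst
    print(f"eps={eps}: largest gap {worst:.5f}, bound {eps / 4:.5f}")
```

The observed gaps shrink with ε and never exceed ε/4, which is the behavior the first term in (19) needs.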

Proof of Theorem 1

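Under Conditions A1–A3 of Johnson and Strawderman (2009), $f_{G}({\boldsymbol {\beta }})$ and $f_{G,\varepsilon }({\boldsymbol {\beta }})$ converge uniformly to the convex function $f_{0}({\boldsymbol {\beta }})$ by Lemmas 1 and 2, respectively. By Condition A4, $f_{0}({\boldsymbol {\beta }})$ is strictly convex at its unique minimizer, ${\boldsymbol {\beta }}_{0}$. Thus, the minimizers of the random convex functions $f_{G,\varepsilon }({\boldsymbol {\beta }})$ and $f_{G}({\boldsymbol {\beta }})$ converge almost surely to ${\boldsymbol {\beta }}_{0}$. □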
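A small simulation can illustrate the consistency statement: minimizing a smoothed Gehan loss with a quasi-Newton method should return an estimate close to the true coefficient vector when n is moderately large. The following is a minimal sketch under assumptions, not the authors' implementation. It uses a piecewise-quadratic smoother of the negative part (an assumption made here, possibly different from the paper's polynomial), synthetic AFT data with independent censoring, and SciPy's general-purpose BFGS routine. All names (`smoothed_gehan`, `beta_true`, and so on) are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def smooth_neg_part(z, eps):
    """Assumed C^1 smoother of max(0, -z), quadratic on (-eps, eps)."""
    return np.where(z <= -eps, -z,
                    np.where(z >= eps, 0.0, (z - eps) ** 2 / (4.0 * eps)))

def smooth_neg_part_deriv(z, eps):
    """Derivative of the smoother above; values lie in [-1, 0]."""
    return np.where(z <= -eps, -1.0,
                    np.where(z >= eps, 0.0, (z - eps) / (2.0 * eps)))

def smoothed_gehan(beta, X, logy, delta, eps):
    """Return the smoothed Gehan loss and its gradient with respect to beta."""
    n = len(logy)
    e = logy - X @ beta                      # residuals on the log scale
    z = e[:, None] - e[None, :]              # z[i, j] = e_i - e_j
    loss = np.sum(delta[:, None] * smooth_neg_part(z, eps)) / n ** 2
    w = delta[:, None] * -smooth_neg_part_deriv(z, eps)   # nonnegative weights
    grad = (w.sum(axis=1) - w.sum(axis=0)) @ X / n ** 2   # sum_ij w_ij (x_i - x_j)
    return loss, grad

# Synthetic AFT data: log T = x' beta + error, independent censoring.
rng = np.random.default_rng(0)
n, eps = 200, 0.05
beta_true = np.array([1.0, -0.5])
X = rng.normal(size=(n, 2))
log_t = X @ beta_true + 0.5 * rng.normal(size=n)
log_c = rng.normal(loc=1.0, scale=1.0, size=n)
logy = np.minimum(log_t, log_c)              # observed log time
delta = (log_t <= log_c).astype(float)       # 1 = event observed, 0 = censored

loss_at_zero = smoothed_gehan(np.zeros(2), X, logy, delta, eps)[0]
fit = minimize(smoothed_gehan, x0=np.zeros(2), args=(X, logy, delta, eps),
               jac=True, method="BFGS")
beta_hat = fit.x
print("estimate:", beta_hat, "true:", beta_true)
```

Because the smoothed loss is convex and continuously differentiable, a gradient-based quasi-Newton method applies directly, whereas the unsmoothed loss is only piecewise linear.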

Asymptotic distribution

The polynomial-smoothed Gehan estimator bears a close similarity to Heller’s (2007) estimator, and one expects the asymptotic distribution theory to follow similarly. A straightforward calculation confirms that $K_{\varepsilon }(z)$ in $\Psi_{G,\varepsilon }({\boldsymbol {\beta }})$ in (14) is a survivor function and that $k_{\varepsilon }(z)=(d/dz)K_{\varepsilon }(z)$ is symmetric about zero with finite second moment (that is, Condition C3 of Heller 2007, p. 553). Define the asymptotic slope matrix $\mathbf {A}_{\varepsilon }({\boldsymbol {\beta }})$ and asymptotic covariance $\mathbf {B}_{\varepsilon }({\boldsymbol {\beta }})$,

Then, assuming that the covariate matrix has finite second moment and that $\mathbf {A}_{\varepsilon }({\boldsymbol {\beta }})$ is non-singular in a neighborhood of the true value ${\boldsymbol {\beta }}_{0}$, one can show that $\sqrt{n}(\widehat{{\boldsymbol {\beta }}}_{G,\varepsilon } - {\boldsymbol {\beta }}_{0})$ converges in distribution to a mean-zero normal random vector with asymptotic covariance

$$\bigl\{\mathbf {A}_{\varepsilon }({\boldsymbol {\beta }}_0)\bigr\}^{-1} \mathbf {B}_{\varepsilon }({\boldsymbol {\beta }}_0)\bigl\{\mathbf {A}_{\varepsilon }( {\boldsymbol {\beta }}_0)\bigr\}^{-1}, $$

(see Heller 2007, Appendix). As with Heller’s estimator, both $\mathbf {A}_{\varepsilon }({\boldsymbol {\beta }})$ and $\mathbf {B}_{\varepsilon }({\boldsymbol {\beta }})$ are directly estimable from the data, the latter derived from a theory of U-statistics.
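Given estimates of the two matrices, the sandwich covariance and the resulting standard errors take only a few lines of linear algebra. The sketch below assumes that estimates `A_hat` and `B_hat` have already been obtained by some means; how to compute them is not shown, and the numbers used are made up purely to exercise the function. The function name `sandwich_covariance` is hypothetical.

```python
import numpy as np

def sandwich_covariance(A_hat, B_hat, n):
    """Approximate covariance of the estimator: A^{-1} B A^{-1} / n."""
    A_inv_B = np.linalg.solve(A_hat, B_hat)          # A^{-1} B
    # Right-multiply by A^{-1} without forming an explicit inverse:
    # X A = A^{-1} B  is equivalent to  A^T X^T = (A^{-1} B)^T.
    cov = np.linalg.solve(A_hat.T, A_inv_B.T).T
    return cov / n

# Made-up inputs for illustration only.
A_hat = np.array([[2.0, 0.0], [0.0, 4.0]])
B_hat = np.array([[4.0, 0.0], [0.0, 16.0]])
cov = sandwich_covariance(A_hat, B_hat, n=100)
std_err = np.sqrt(np.diag(cov))
print(std_err)
```

Dividing by n converts the asymptotic covariance of $\sqrt{n}(\widehat{{\boldsymbol {\beta }}}_{G,\varepsilon } - {\boldsymbol {\beta }}_{0})$ into an approximate covariance for the estimator itself.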


About this article

Cite this article

Chung, M., Long, Q. & Johnson, B.A. A tutorial on rank-based coefficient estimation for censored data in small- and large-scale problems. Stat Comput 23, 601–614 (2013). https://doi.org/10.1007/s11222-012-9333-9


Received: 29 December 2011
Accepted: 25 April 2012
Published: 26 May 2012
Issue Date: September 2013
DOI: https://doi.org/10.1007/s11222-012-9333-9
