Abstract
The analysis of survival endpoints subject to right-censoring is an important research area in statistics, particularly among econometricians and biostatisticians. The two most popular semiparametric models are the proportional hazards model and the accelerated failure time (AFT) model. Rank-based estimation in the AFT model is computationally challenging because it requires optimizing a non-smooth loss function. Previous work has shown that rank-based estimators may be written as solutions to linear programming (LP) problems. However, the size of the LP problem is \(O(n^{2}+p)\) subject to \(n^{2}\) linear constraints, where n denotes the sample size and p denotes the dimension of the parameter vector. As n and/or p increases, the feasibility of such a solution in practice becomes questionable. Among data mining and statistical learning enthusiasts, there is interest in extending ordinary regression coefficient estimators for low dimensions into high-dimensional data mining tools through regularization. Applying this recipe to rank-based coefficient estimators leads to formidable optimization problems, which may be avoided through smooth approximations to non-smooth functions. We review smooth approximations and quasi-Newton methods for rank-based estimation in AFT models. The computational cost of our method is substantially smaller than that of the corresponding LP problem, and the method applies equally to small- and large-scale problems. The algorithm described here allows one to couple rank-based estimation for censored data with virtually any regularization and is exemplified through four case studies.
References
Boyd, S.P., Vandenberghe, L.: Convex Optimization. Cambridge University Press, Cambridge (2004)
Brown, B.M., Wang, Y.G.: Standard errors and covariance matrices for smoothed rank estimators. Biometrika 92, 149–158 (2005)
Brown, B.M., Wang, Y.G.: Induced smoothing for rank regression with censored survival times. Stat. Med. 26, 828–836 (2007)
Cai, T., Huang, J., Tian, L.: Regularized estimation for the accelerated failure time model. Biometrics 65, 394–404 (2009)
CAMDA: Critical Assessment of Microarray Data Analysis (2003). http://www.camda.duke.edu/camda03.html
Candes, E., Tao, T.: The Dantzig selector: statistical estimation when p is much larger than n. Ann. Stat. 35(6), 2313–2351 (2007)
Chung, J., Chung, M., O’Leary, D.: Designing optimal filters for ill-posed inverse problems. SIAM J. Sci. Comput. 33(6), 3132–3152 (2011)
Conrad, M., Johnson, B.A.: A quasi-Newton algorithm for efficient computation of Gehan estimates. Technical Report TR 2010-02. Department of Biostatistics and Bioinformatics, Emory University (2010)
Cox, D.R.: Regression models and life-tables. J. R. Stat. Soc., Ser. B, Stat. Methodol. 34, 187–220 (1972)
Cox, D.R., Oakes, D.: Analysis of Survival Data. Chapman & Hall, London (1984)
Dickson, E.R., Grambsch, P.M., Fleming, T.R., Fisher, D., Langworthy, A.: Prognosis in primary biliary cirrhosis: model for decision making. Hepatology 10(1), 1–7 (1989)
Fleming, T.R., Harrington, D.P.: Counting Processes and Survival Analysis, vol. 8. Wiley, New York (1991)
Fygenson, M., Ritov, Y.: Monotone estimating equations for censored data. Ann. Stat. 22, 732–746 (1994)
Gehan, E.A.: A generalized Wilcoxon test for comparing arbitrarily single-censored samples. Biometrika 52, 203–223 (1965)
Gill, P.E., Murray, W., Wright, M.H.: Practical Optimization. Academic Press, New York (1981)
Hadamard, J.: Sur les Problèmes aux Dérivées Partielles et Leur Signification Physique (1902)
Hastie, T., Tibshirani, R., Friedman, J.: The Elements of Statistical Learning, 2nd edn. Springer, New York (2009)
Heller, G.: Smoothed rank regression with censored data. J. Am. Stat. Assoc. 102(478), 552–559 (2007)
Hoerl, A.E., Kennard, R.W.: Ridge Regression: Biased Estimation for Nonorthogonal Problems. Technometrics 55–67 (1970)
Huang, J., Ma, S., Xie, H.: Regularized estimation in the accelerated failure time model with high-dimensional covariates. Biometrics, 813–820 (2006)
Huber, P.J.: Robust estimation of a location parameter. Ann. Math. Stat. 35(1), 73–101 (1964)
Hunter, D.R., Lange, K.: A tutorial on MM algorithms. Am. Stat. 30–37 (2004)
Jin, Z., Lin, D.Y., Wei, L.J., Ying, Z.: Rank-based inference for the accelerated failure time model. Biometrika 90(2), 341–353 (2003)
Johnson, B.A.: Variable selection in semiparametric linear regression with censored data. J. R. Stat. Soc. B 70, 351–370 (2008)
Johnson, B.A.: Rank-based estimation in the ℓ 1-regularized partly linear model with application to integrated analyses of clinical predictors and gene expression data. Biostatistics 10, 659–666 (2009a)
Johnson, B.A.: On lasso for censored data. Electron. J. Stat. 3, 485–506 (2009b)
Johnson, B.A., Lin, D., Zeng, D.: Penalized estimating functions and variable selection in semiparametric regression models. J. Am. Stat. Assoc. 103, 672–680 (2008)
Johnson, B.A., Long, Q., Chung, M.: On path restoration for censored outcomes. Biometrics 67, 1379–1388 (2011)
Johnson, L.M., Strawderman, R.L.: Induced smoothing for the semiparametric accelerated failure time model: asymptotics and extensions to clustered data. Biometrika 96(3), 577–590 (2009)
Kaipio, J.P., Somersalo, E.: Statistical and Computational Inverse Problems. Springer, Berlin (2005)
Kalbfleisch, J.D., Prentice, R.L.: The Statistical Analysis of Failure Time Data, vol. 5, 2nd edn. Wiley, New York (1980)
Koenker, R., Bassett, G. Jr.: Regression quantiles. Econometrica 33–50 (1978)
Koenker, R., Ng, P.: A Frisch-Newton algorithm for sparse quantile regression. Acta Math. Appl. Sin. 21(2), 225–236 (2005)
Koenker, R.W., D’Orey, V.: Algorithm AS 229: computing regression quantiles. J. R. Stat. Soc., Ser. C, Appl. Stat. 36(3), 383–393 (1987)
Lin, D.Y., Geyer, C.J.: Computational methods for semiparametric linear regression with censored data. J. Comput. Graph. Stat. 1(1), 77–90 (1992)
Meier, L., Van de Geer, S., Bühlmann, P.: The group lasso for logistic regression. J. R. Stat. Soc., Ser. B, Stat. Methodol. 70(1), 53–71 (2008)
Morris, C., Norton, E., Zhou, X.: Parametric duration analysis of nursing home usage. In: Case Studies in Biometry, pp. 231–248 (1994)
Nocedal, J., Wright, S.J.: Numerical Optimization, 2nd edn. Springer, Berlin (2006)
Owen, A.B.: A robust hybrid of lasso and ridge regression. Technical Report, Department of Statistics, Stanford University, Palo Alto, CA (2006)
Prentice, R.L.: Linear rank tests with right censored data. Biometrika 65(1), 167–179 (1978)
Reid, N.: A conversation with Sir David Cox. Stat. Sci. 9, 439–455 (1994)
Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. B 58(1), 267–288 (1996)
Tibshirani, R., Saunders, M., Rosset, S., Zhu, J., Knight, K.: Sparsity and smoothness via the fused lasso. J. R. Stat. Soc., Ser. B, Stat. Methodol. 67(1), 91–108 (2005)
Tsiatis, A.A.: Estimating regression parameters using linear rank tests for censored data. Ann. Stat. 18(1), 354–372 (1990)
Vogel, C.R.: Computational Methods for Inverse Problems, vol. 23. SIAM, Philadelphia (2002)
Wei, L.J., Ying, Z., Lin, D.Y.: Linear regression analysis of censored survival data based on rank tests. Biometrika 77(4), 845–851 (1990)
Wu, S., Shen, X., Geyer, C.J.: Adaptive regularization using the entire solution surface. Biometrika 96(3), 513–527 (2009)
Xu, J., Leng, C., Ying, Z.: Rank-based variable selection with censored data. Stat. Comput. 20, 165–176 (2010)
Ying, Z.: A large sample study of rank estimation for censored regression data. Ann. Stat. 21, 76–99 (1993)
Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc., Ser. B, Stat. Methodol. 68(1), 49–67 (2006)
Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418–1429 (2006)
Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. R. Stat. Soc., Ser. B, Stat. Methodol. 67(2), 301–320 (2005)
Additional information
This paper is adapted from an earlier Emory technical report by Conrad and Johnson (2010). This work was supported in part by US NIH PHS Grant UL1 RR025008 from the Clinical and Translational Science Award program.
Appendix
Operating characteristics of polynomial-smoothed Gehan estimator
In this section, we outline the large sample properties of the estimator \(\widehat{{\boldsymbol {\beta }}}_{G,\varepsilon }\). Let the parameter \({\boldsymbol {\beta }}\) belong to a parameter space \(\mathbb{B}\), a compact subset of \(\mathbb{R}^{p}\), and let \(f_{0}({\boldsymbol {\beta }})\) be a convex function for \({\boldsymbol {\beta }}\in\mathbb{B}\). The proof of Theorem 1 relies on the following two facts regarding the loss functions \(f_{G}({\boldsymbol {\beta }})\) and \(f_{G,\varepsilon }({\boldsymbol {\beta }})\).
Lemma 1
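Under Conditions A1–A3 in Johnson and Strawderman (2009, p. 586), \(\sup_{{\boldsymbol {\beta }}\in\mathbb{B}} |f_{G}({\boldsymbol {\beta }}) - f_{0}({\boldsymbol {\beta }})| \rightarrow 0\) almost surely as \(n \rightarrow \infty\).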
Lemma 2
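Under Conditions A1–A3 in Johnson and Strawderman (2009, p. 586), \(\sup_{{\boldsymbol {\beta }}\in\mathbb{B}} |f_{G,\varepsilon }({\boldsymbol {\beta }}) - f_{0}({\boldsymbol {\beta }})| \rightarrow 0\) almost surely as \(n \rightarrow \infty\).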
Lemma 1 coincides with Lemma 1 of Johnson and Strawderman (2009), holds under exactly the same conditions, and is stated here without proof.
Outline proof of Lemma 2
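By the triangle inequality, we have
\[ |f_{G,\varepsilon }({\boldsymbol {\beta }}) - f_{0}({\boldsymbol {\beta }})| \le |f_{G,\varepsilon }({\boldsymbol {\beta }}) - f_{G}({\boldsymbol {\beta }})| + |f_{G}({\boldsymbol {\beta }}) - f_{0}({\boldsymbol {\beta }})|. \qquad (19) \]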
By Lemma 1, the second term in (19) can be made arbitrarily small, uniformly for all \({\boldsymbol {\beta }}\in\mathbb{B}\), except on a set of probability measure zero. The first term in (19) is
Hence, the absolute difference between the Gehan loss and its smooth approximation can be made arbitrarily small, for every \({\boldsymbol {\beta }}\in\mathbb{B}\). The conclusion then follows. □
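The key step in the outline above is that the smoothed loss stays uniformly close to the Gehan loss. The following is a minimal numerical sketch of that step, not the authors' implementation. It assumes a Gehan loss of the form \(n^{-2}\sum_{i}\sum_{j}\delta_{i}(e_{i}-e_{j})^{-}\) with residuals \(e_{i}=\log T_{i}-x_{i}\beta\), and it swaps in a hypothetical quadratic smoother of \(z^{-}=\max(0,-z)\) on \((-\varepsilon,\varepsilon)\), since the polynomial smoother used in the paper is not reproduced in this section. For this particular smoother the gap is at most \(\varepsilon/4\) for every \(\beta\). The data and the function names are illustrative only.

```python
import random


def neg_part(z):
    """Non-smooth building block of the Gehan loss: z^- = max(0, -z)."""
    return max(0.0, -z)


def smooth_neg_part(z, eps):
    """Hypothetical C^1 smoother of z^- (an assumption, not the paper's K_eps)."""
    if z <= -eps:
        return -z
    if z >= eps:
        return 0.0
    return (z - eps) ** 2 / (4.0 * eps)


def gehan_loss(beta, x, log_t, delta, piece):
    """n^-2 * sum_i sum_j delta_i * piece(e_i - e_j), e_i = log_t_i - x_i * beta."""
    n = len(x)
    e = [log_t[i] - x[i] * beta for i in range(n)]
    total = 0.0
    for i in range(n):
        if delta[i]:
            for j in range(n):
                total += piece(e[i] - e[j])
    return total / n ** 2


# Synthetic one-covariate data; delta is an arbitrary 0/1 indicator here,
# because the bound does not depend on how the censoring arose.
random.seed(0)
n = 40
x = [random.uniform(-1, 1) for _ in range(n)]
log_t = [xi + random.gauss(0, 0.5) for xi in x]
delta = [1 if random.random() < 0.7 else 0 for _ in range(n)]

eps = 0.1
gaps = []
for k in range(-20, 21):  # grid of beta values in [-2, 2]
    beta = k / 10
    f_g = gehan_loss(beta, x, log_t, delta, neg_part)
    f_ge = gehan_loss(beta, x, log_t, delta, lambda z: smooth_neg_part(z, eps))
    gaps.append(f_ge - f_g)

max_gap = max(gaps)
print(max_gap <= eps / 4)
```

Because each smoothed term exceeds its non-smooth counterpart by at most \(\varepsilon/4\) and the indicators \(\delta_{i}\) are at most one, the bound holds uniformly over \(\beta\), which is what lets the first term in (19) be driven to zero by shrinking \(\varepsilon\).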
Proof of Theorem 1
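Under Conditions A1–A3 of Johnson and Strawderman, \(f_{G}({\boldsymbol {\beta }})\) and \(f_{G,\varepsilon }({\boldsymbol {\beta }})\) converge uniformly to the convex function \(f_{0}({\boldsymbol {\beta }})\) by Lemmas 1 and 2, respectively. By Condition A4, \(f_{0}({\boldsymbol {\beta }})\) is strictly convex at its unique minimizer, \({\boldsymbol {\beta }}_{0}\). Thus, the minimizers of the random convex functions \(f_{G,\varepsilon }({\boldsymbol {\beta }})\) and \(f_{G}({\boldsymbol {\beta }})\) converge almost surely to \({\boldsymbol {\beta }}_{0}\). □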
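The consistency argument can be illustrated with a small simulation. This is a sketch under stated assumptions rather than the method of the paper: it uses a single covariate, a hypothetical quadratic smoother in place of the paper's polynomial smoother, an assumed \(n^{-2}\) normalization, and a golden-section search standing in for the quasi-Newton solver so that the example needs only the standard library. All names and simulation settings are illustrative.

```python
import math
import random


def smooth_neg_part(z, eps):
    """Hypothetical convex C^1 smoother of max(0, -z) (an assumption)."""
    if z <= -eps:
        return -z
    if z >= eps:
        return 0.0
    return (z - eps) ** 2 / (4.0 * eps)


def smoothed_gehan(beta, x, y, delta, eps):
    """Smoothed Gehan loss for one covariate; y holds observed log times."""
    n = len(x)
    e = [y[i] - x[i] * beta for i in range(n)]
    total = 0.0
    for i in range(n):
        if delta[i]:  # only uncensored subjects index the outer sum
            for j in range(n):
                total += smooth_neg_part(e[i] - e[j], eps)
    return total / n ** 2


def golden_section(f, lo, hi, tol=1e-4):
    """Minimize a convex one-dimensional function on [lo, hi]."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    fc, fd = f(c), f(d)
    while b - a > tol:
        if fc < fd:  # minimizer lies in [a, d]
            b, d, fd = d, c, fc
            c = b - inv_phi * (b - a)
            fc = f(c)
        else:  # minimizer lies in [c, b]
            a, c, fc = c, d, fd
            d = a + inv_phi * (b - a)
            fd = f(d)
    return (a + b) / 2


# Simulated AFT data with independent right-censoring; true slope beta0 = 1.
random.seed(1)
beta0 = 1.0
n = 80
x = [random.uniform(-1, 1) for _ in range(n)]
log_t = [beta0 * xi + random.gauss(0, 0.2) for xi in x]
log_c = [random.uniform(0.5, 3.0) for _ in range(n)]
y = [min(t, c) for t, c in zip(log_t, log_c)]
delta = [1 if t <= c else 0 for t, c in zip(log_t, log_c)]

beta_hat = golden_section(
    lambda b: smoothed_gehan(b, x, y, delta, eps=0.05), -1.0, 3.0
)
print(round(beta_hat, 3))
```

The smoothed loss is convex in \(\beta\) because the smoother is convex and the residuals are linear in \(\beta\), so a bracketing search suffices in one dimension. In higher dimensions a gradient-based quasi-Newton method, as reviewed in the paper, would be the natural replacement.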
Asymptotic distribution
The polynomial-smoothed Gehan estimator bears a close similarity to Heller’s (2007) estimator, and one expects the asymptotic distribution theory to follow similarly. A straightforward calculation confirms that \(K_{\varepsilon }(z)\) in \(\Psi _{G,\varepsilon }({\boldsymbol {\beta }})\) in (14) is a survivor function and that \(k_{\varepsilon }(z)=(d/dz)K_{\varepsilon }(z)\) is symmetric about zero with finite second moment (that is, Condition C3 of Heller 2007, p. 553). Define the asymptotic slope matrix \(A_{\varepsilon }({\boldsymbol {\beta }})\) and asymptotic covariance \(B_{\varepsilon }({\boldsymbol {\beta }})\),
Then, assuming that the covariate matrix has finite second moment and that \(A_{\varepsilon }({\boldsymbol {\beta }})\) is non-singular in a neighborhood of the true value \({\boldsymbol {\beta }}_{0}\), one can show that \(\sqrt{n}(\widehat{{\boldsymbol {\beta }}}_{G,\varepsilon } - {\boldsymbol {\beta }}_{0})\) converges in distribution to a mean-zero normal random vector with asymptotic covariance
(see Heller 2007, Appendix). As with Heller’s estimator, both \(A_{\varepsilon }({\boldsymbol {\beta }})\) and \(B_{\varepsilon }({\boldsymbol {\beta }})\) are directly estimable from the data, the latter derived from a theory of U-statistics.
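For orientation, estimators defined by a smooth estimating function with a slope matrix and a covariance matrix generically have a limiting covariance of sandwich form. The display below is a sketch of that generic structure under the assumption that it carries over to this setting. It is not a transcription of the paper's expression, and any multiplicative constant arising from the U-statistic normalization is absorbed into \(B_{\varepsilon }\).

```latex
% Generic sandwich form (assumed structure, constants absorbed into B_eps)
\sqrt{n}\,\bigl(\widehat{\boldsymbol{\beta}}_{G,\varepsilon} - \boldsymbol{\beta}_{0}\bigr)
  \;\xrightarrow{d}\;
  N\!\Bigl(\mathbf{0},\;
    A_{\varepsilon}(\boldsymbol{\beta}_{0})^{-1}\,
    B_{\varepsilon}(\boldsymbol{\beta}_{0})\,
    A_{\varepsilon}(\boldsymbol{\beta}_{0})^{-\top}
  \Bigr)
```

A plug-in estimate would replace \(A_{\varepsilon }\) and \(B_{\varepsilon }\) by their sample analogues evaluated at \(\widehat{{\boldsymbol {\beta }}}_{G,\varepsilon }\).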
Cite this article
Chung, M., Long, Q. & Johnson, B.A. A tutorial on rank-based coefficient estimation for censored data in small- and large-scale problems. Stat Comput 23, 601–614 (2013). https://doi.org/10.1007/s11222-012-9333-9
DOI: https://doi.org/10.1007/s11222-012-9333-9