Abstract
Survival analysis is a commonly used technique in medical research to identify important predictors of adverse events and to develop guidelines for patient treatment. When it is applied to large amounts of patient data, efficient optimization routines become a necessity. We propose efficient training algorithms for three kinds of linear survival support vector machines: 1) ranking-based, 2) regression-based, and 3) combined ranking and regression. We perform optimization in the primal using truncated Newton optimization and use order statistic trees to lower the computational cost of training. We also extend the same optimization technique to non-linear models. Our results demonstrate the superiority of the proposed optimization scheme over existing training algorithms, which fail on large datasets because of their inherently high time and space complexities. We validate the proposed survival models on six real-world datasets and show that pure ranking-based approaches outperform regression and hybrid models.
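To illustrate why a sorted structure helps with ranking on censored data, the following is a minimal sketch and not the authors' training algorithm. It assumes the usual definition of a comparable pair for right-censored data (the sample with the shorter time must have an observed event), which the abstract does not spell out. The function names are hypothetical, and a sorted Python list with `bisect` stands in for an order statistic tree; list insertion is linear, so a balanced tree would be needed to reach linearithmic total cost.

```python
from bisect import bisect_right, insort
from itertools import groupby


def concordant_pairs_naive(times, events, scores):
    """Count correctly ranked comparable pairs by checking every pair, O(n^2)."""
    n = len(times)
    count = 0
    for i in range(n):
        if not events[i]:
            continue  # assumption: a censored sample cannot be the earlier member
        for j in range(n):
            if times[i] < times[j] and scores[i] < scores[j]:
                count += 1
    return count


def concordant_pairs_sorted(times, events, scores):
    """Same count, visiting samples by decreasing time with a sorted score list."""
    order = sorted(range(len(times)), key=lambda k: times[k], reverse=True)
    seen = []  # sorted scores of samples with strictly larger time
    count = 0
    for _, group in groupby(order, key=lambda k: times[k]):
        group = list(group)
        # Query first, insert afterwards, so tied times are never compared.
        for i in group:
            if events[i]:
                # Rank query: how many stored scores exceed scores[i]?
                count += len(seen) - bisect_right(seen, scores[i])
        for i in group:
            insort(seen, scores[i])
    return count


# Hypothetical toy data: observed time, event indicator, predicted score
# (a higher score is taken to mean a longer predicted survival time).
times = [5, 3, 8, 3, 10]
events = [1, 1, 0, 0, 1]
scores = [0.4, 0.1, 0.9, 0.5, 0.7]
naive_count = concordant_pairs_naive(times, events, scores)
sorted_count = concordant_pairs_sorted(times, events, scores)
```

Both functions return the same count; the second replaces the inner loop over all samples with a single rank query per sample, which is the kind of operation an order statistic tree answers in logarithmic time.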
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Pölsterl, S., Navab, N., Katouzian, A. (2015). Fast Training of Support Vector Machines for Survival Analysis. In: Appice, A., Rodrigues, P., Santos Costa, V., Gama, J., Jorge, A., Soares, C. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2015. Lecture Notes in Computer Science, vol 9285. Springer, Cham. https://doi.org/10.1007/978-3-319-23525-7_15
DOI: https://doi.org/10.1007/978-3-319-23525-7_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-23524-0
Online ISBN: 978-3-319-23525-7
eBook Packages: Computer Science, Computer Science (R0)