Abstract
A major challenge in precision medicine is developing biomarkers that can effectively guide patient treatment in a manner that benefits both the individual and the population. Much of the difficulty stems from the poor reproducibility of existing approaches and from the complexity of the problem itself. Machine learning tools with rigorous statistical inference properties have great potential to move this area forward. In this chapter, we review existing pipelines for biomarker discovery and validation from a statistical perspective and identify several key areas where improvements are needed. We then outline a framework for developing a master pipeline, firmly grounded in statistical principles, that can yield better reproducibility, leading to improved biomarker development and greater success in precision medicine.
Copyright information
© 2016 Springer International Publishing AG
About this chapter
Cite this chapter
Hidalgo, S.J.T. et al. (2016). A Master Pipeline for Discovery and Validation of Biomarkers. In: Holzinger, A. (ed.) Machine Learning for Health Informatics. Lecture Notes in Computer Science, vol 9605. Springer, Cham. https://doi.org/10.1007/978-3-319-50478-0_13
DOI: https://doi.org/10.1007/978-3-319-50478-0_13
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50477-3
Online ISBN: 978-3-319-50478-0
eBook Packages: Computer Science (R0)