A Master Pipeline for Discovery and Validation of Biomarkers

Machine Learning for Health Informatics

Abstract

A major challenge in precision medicine is the development of biomarkers that can effectively guide patient treatment in a manner that benefits both the individual and the population. Much of the difficulty stems from the poor reproducibility of existing approaches and from the complexity of the problem itself. Machine learning tools with rigorous statistical inference properties have great potential to move this area forward. In this chapter, we review existing pipelines for biomarker discovery and validation from a statistical perspective and identify key areas where improvements are needed. We then outline a framework for developing a master pipeline, firmly grounded in statistical principles, that can yield better reproducibility, leading to improved biomarker development and greater success in precision medicine.

Author information

Corresponding author

Correspondence to Michael R. Kosorok.

Copyright information

© 2016 Springer International Publishing AG

About this chapter

Cite this chapter

Hidalgo, S.J.T. et al. (2016). A Master Pipeline for Discovery and Validation of Biomarkers. In: Holzinger, A. (ed.) Machine Learning for Health Informatics. Lecture Notes in Computer Science, vol 9605. Springer, Cham. https://doi.org/10.1007/978-3-319-50478-0_13

  • DOI: https://doi.org/10.1007/978-3-319-50478-0_13

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50477-3

  • Online ISBN: 978-3-319-50478-0

  • eBook Packages: Computer Science, Computer Science (R0)
