A Master Pipeline for Discovery and Validation of Biomarkers

Hidalgo, Sebastian J. Teran; Lawson, Michael T.; Luckett, Daniel J.; Chaudhari, Monica; Chen, Jingxiang; Choudhury, Arkopal; Di Florio, Arianna; Jiang, Xiaotong; Nguyen, Crystal T.; Kosorok, Michael R.

doi:10.1007/978-3-319-50478-0_13

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9605)

Abstract

A major challenge in precision medicine is developing biomarkers that can effectively guide patient treatment in a manner that benefits both the individual and the population. Much of the difficulty stems from the poor reproducibility of existing approaches and from the complexity of the problem. Machine learning tools with rigorous statistical inference properties have great potential to move this area forward. In this chapter, we review existing pipelines for biomarker discovery and validation from a statistical perspective and identify key areas where improvements are needed. We then outline a framework for developing a master pipeline, firmly grounded in statistical principles, that can yield better reproducibility, leading to improved biomarker development and greater success in precision medicine.

Author information

Authors and Affiliations

Department of Biostatistics, Yale University, New Haven, CT, USA
Sebastian J. Teran Hidalgo
Department of Biostatistics, University of North Carolina, Chapel Hill, NC, USA
Michael T. Lawson, Daniel J. Luckett, Monica Chaudhari, Jingxiang Chen, Arkopal Choudhury, Xiaotong Jiang, Crystal T. Nguyen & Michael R. Kosorok
Division of Psychological Medicine and Clinical Neurosciences, Cardiff University, Cardiff, Wales, UK
Arianna Di Florio

Corresponding author

Correspondence to Michael R. Kosorok.

Editor information

Editors and Affiliations

Institute for Medical Informatics, Statistics and Documentation, Medical University Graz, Graz, Austria
Andreas Holzinger

About this chapter

Cite this chapter

Hidalgo, S.J.T. et al. (2016). A Master Pipeline for Discovery and Validation of Biomarkers. In: Holzinger, A. (ed.) Machine Learning for Health Informatics. Lecture Notes in Computer Science, vol 9605. Springer, Cham. https://doi.org/10.1007/978-3-319-50478-0_13

DOI: https://doi.org/10.1007/978-3-319-50478-0_13
Published: 10 December 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-50477-3
Online ISBN: 978-3-319-50478-0
eBook Packages: Computer Science, Computer Science (R0)
