
Markov Decision Processes: Application to Treatment Planning


Introduction

The availability of large, heterogeneous patient datasets has recently enabled the development of data-driven methods for improving healthcare. Many such efforts combine medical data with machine learning for predictive purposes, using these datasets to diagnose diseases [5] or forecast patients’ disease trajectories [21].

Patient data can also be used for prescriptive purposes, by optimizing the type or timing of treatment that patients receive. Because patients’ medical conditions unfold over time, such prescriptive analytics must take into account patients’ transitions among disease states, and how treatment actions in the present can affect future health outcomes.

Markov decision processes (MDPs) are a powerful tool for such modeling. An MDP characterizes a system (such as a patient) that transitions among states over time. State transition probabilities depend both on the current state of the system and the action (such as a treatment) taken in that state.
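To make this definition concrete, below is a minimal sketch of a treatment-planning MDP solved by value iteration, which repeatedly applies the Bellman optimality update V(s) ← max_a [ R(s, a) + γ Σ_{s'} P(s' | s, a) V(s') ] until the state values converge [19]. The states, actions, transition probabilities, rewards, and discount factor in the sketch are hypothetical placeholders chosen for illustration; they are not taken from this entry or its references.

```python
import numpy as np

# Minimal illustrative treatment-planning MDP solved by value iteration.
# All states, actions, probabilities, rewards, and the discount factor
# are hypothetical placeholders, not values from the entry.

states = ["healthy", "moderate", "severe"]   # patient health states
actions = ["wait", "treat"]                  # candidate treatment decisions

# P[a][s, s'] = probability of moving from state s to s' under action a;
# each row sums to 1.
P = {
    "wait":  np.array([[0.90, 0.08, 0.02],
                       [0.10, 0.70, 0.20],
                       [0.00, 0.15, 0.85]]),
    "treat": np.array([[0.95, 0.04, 0.01],
                       [0.40, 0.50, 0.10],
                       [0.10, 0.40, 0.50]]),
}

# R[a][s] = immediate reward for taking action a in state s
# (a health utility minus a notional treatment cost).
R = {
    "wait":  np.array([1.0, 0.5, 0.0]),
    "treat": np.array([0.8, 0.4, -0.1]),
}

gamma = 0.95                 # discount factor on future rewards
V = np.zeros(len(states))    # state values, initialized to zero

# Value iteration: apply the Bellman optimality update until convergence.
for _ in range(10_000):
    Q = np.array([R[a] + gamma * P[a] @ V for a in actions])  # Q[a, s]
    V_new = Q.max(axis=0)
    if np.max(np.abs(V_new - V)) < 1e-8:
        V = V_new
        break
    V = V_new

# The greedy policy picks the highest-value action in each state.
Q = np.array([R[a] + gamma * P[a] @ V for a in actions])
policy = [actions[i] for i in Q.argmax(axis=0)]
for s, a, v in zip(states, policy, V):
    print(f"state={s:<8}  action={a:<5}  value={v:.3f}")
```

In an applied setting, the transition matrices and rewards would instead be estimated from patient data, and the resulting policy would map each observed health state to a recommended treatment action.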


References

  1. Alagoz O, Maillart L, Schaefer A, Roberts M (2004) The optimal timing of living-donor liver transplantation. Manag Sci 50(10):1420–1430

  2. Ayer T, Alagoz O, Stout N (2012) OR Forum—A POMDP approach to personalize mammography screening decisions. Oper Res 60(5):1019–1034

  3. Baucum M, Khojandi A, Vasudevan R (2020) Improving deep reinforcement learning with transitional variational autoencoders: a healthcare application. IEEE J Biomed Health Inform 25(6):2273–2280

  4. Baucum M, Khojandi A, Vasudevan R, Ramdhani R (2020) Optimizing patient-specific medication regimen policies using wearable sensors in Parkinson’s disease. https://bmjopen.bmj.com/content/bmjopen/7/11/e018374.full.pdf, preprint

  5. Fatima M, Pasha M (2017) Survey of machine learning algorithms for disease diagnostic. J Intell Learn Syst Appl 9(01):1

  6. Feinberg E (2011) Total expected discounted reward MDPs: existence of optimal policies. In: Wiley encyclopedia of operations research and management science. Wiley, Hoboken, NJ

  7. Feinberg E, Shwartz A (2002) Handbook of Markov decision processes: methods and applications. Kluwer Academic Publishers, Boston, MA

  8. Grand-Clément J, Chan C, Goyal V, Chuang E (2021) Interpretable machine learning for resource allocation with application to ventilator triage. arXiv preprint arXiv:2110.10994

  9. Ibrahim R, Kucukyazici B, Verter V, Gendreau M, Blostein M (2016) Designing personalized treatment: an application to anticoagulation therapy. Prod Oper Manag 25(5):902–918

  10. Komorowski M, Celi L, Badawi O, Gordon A, Faisal A (2018) The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care. Nat Med 24(11):1716–1720

  11. Lakkaraju H, Rudin C (2017) Learning cost-effective and interpretable treatment regimes. In: Artificial Intelligence and Statistics. PMLR, pp 166–175

  12. Liu Y, Li S, Li F, Song L, Rehg J (2015) Efficient learning of continuous-time hidden Markov models for disease progression. Adv Neural Inf Process Syst 28:3600–3608

  13. Lu M, Shahn Z, Sow D, Doshi-Velez F, Lehman L (2020) Is deep reinforcement learning ready for practical applications in healthcare? A sensitivity analysis of Duel-DDQN for hemodynamic management in sepsis patients. In: AMIA Annual Symposium Proceedings, vol 2020. American Medical Informatics Association, p 773

  14. Nemati S, Ghassemi M, Clifford G (2016) Optimal medication dosing from suboptimal clinical examples: a deep reinforcement learning approach. In: 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, pp 2978–2981

  15. Parbhoo S, Bogojeska J, Zazzi M, Roth V, Doshi-Velez F (2017) Combining kernel and model based learning for HIV therapy selection. AMIA Summits Transl Sci Proc 2017:239

  16. Peine A, Hallawa A, Bickenbach J, Dartmann G, Fazlic L, Schmeink A, Ascheid G, Thiemermann C, Schuppert A, Kindle R (2021) Development and validation of a reinforcement learning algorithm to dynamically optimize mechanical ventilation in critical care. NPJ Digit Med 4(1):1–12

  17. Prasad N, Cheng L, Chivers C, Draugelis M, Engelhardt B (2017) A reinforcement learning approach to weaning of mechanical ventilation in intensive care units. arXiv preprint arXiv:1704.06300

  18. Puterman M (1990) Markov decision processes. Handb Oper Res Manag Sci 2:331–434

  19. Puterman M (2014) Markov decision processes: discrete stochastic dynamic programming. Wiley, Hoboken, NJ

  20. Raghu A, Komorowski M, Ahmed I, Celi L, Szolovits P, Ghassemi M (2017) Deep reinforcement learning for sepsis treatment. arXiv preprint arXiv:1711.09602

  21. Saranya G, Pravin A (2020) A comprehensive study on disease risk predictions in machine learning. Int J Electr Comput Eng 10(4):4217

  22. Schaefer A, Bailey M, Shechter S, Roberts M (2005) Modeling medical treatment using Markov decision processes. In: Operations research and health care. Springer, Boston, MA, pp 593–612

  23. Schell G, Marrero W, Lavieri M, Sussman J, Hayward R (2016) Data-driven Markov decision process approximations for personalized hypertension treatment planning. MDM Policy Pract 1(1). https://doi.org/10.1177/2381468316674214

  24. Shechter S, Bailey M, Schaefer A, Roberts M (2008) The optimal time to initiate HIV therapy under ordered health states. Oper Res 56(1):20–33

  25. Steimle L, Denton B (2017) Markov decision processes for screening and treatment of chronic diseases. In: Markov decision processes in practice. Springer, Cham, Switzerland, pp 189–222

  26. Tang S, Modi A, Sjoding M, Wiens J (2020) Clinician-in-the-loop decision making: reinforcement learning with near-optimal set-valued policies. In: International Conference on Machine Learning. PMLR, pp 9387–9396

  27. White III C, White D (1989) Markov decision processes. Eur J Oper Res 39(1):1–16

  28. Wiesemann W, Kuhn D, Rustem B (2013) Robust Markov decision processes. Math Oper Res 38(1):153–183

  29. Xu H, Mannor S (2010) Distributionally robust Markov decision processes. Adv Neural Inf Process Syst 23:2505–2513

  30. Yu C, Liu J, Nemati S, Yin G (2021) Reinforcement learning in healthcare: a survey. ACM Comput Surv (CSUR) 55(1):1–36

  31. Zhou Z, Wang Y, Mamani H, Coffey D (2019) How do tumor cytogenetics inform cancer treatments? Dynamic risk stratification and precision medicine using multi-armed bandits. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3405082, preprint


Author information

Correspondence to Anahita Khojandi.



Copyright information

© 2023 Springer Nature Switzerland AG

About this entry


Cite this entry

Baucum, M., Khojandi, A. (2023). Markov Decision Processes: Application to Treatment Planning. In: Pardalos, P.M., Prokopyev, O.A. (eds) Encyclopedia of Optimization. Springer, Cham. https://doi.org/10.1007/978-3-030-54621-2_844-1


  • DOI: https://doi.org/10.1007/978-3-030-54621-2_844-1


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-54621-2

  • Online ISBN: 978-3-030-54621-2

  • eBook Packages: Springer Reference Mathematics, Reference Module Computer Science and Engineering
