Introduction
The availability of large, heterogeneous patient datasets has recently enabled the development of data-driven methods for improving healthcare. Many such efforts combine medical data with machine learning for predictive purposes, using large patient datasets to diagnose diseases [5] or to forecast patients' disease trajectories [21].
Patient data can also be used for prescriptive purposes, by optimizing the type or timing of treatment that patients receive. Because patients’ medical conditions unfold over time, such prescriptive analytics must take into account patients’ transitions among disease states, and how treatment actions in the present can affect future health outcomes.
Markov decision processes (MDPs) are a powerful tool for such modeling. An MDP characterizes a system (such as a patient) that transitions among states over time. State transition probabilities depend both on the current state of the system and on the action (such as a treatment) taken in that...
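The MDP formalism described above can be made concrete with a small numerical sketch. The example below is purely illustrative and not taken from the entry: the patient states ("healthy", "sick", "critical"), the two hypothetical treatments, and all transition probabilities and rewards are invented for demonstration. Value iteration is then used to recover the optimal treatment policy.

```python
import numpy as np

# Hypothetical toy MDP (all numbers invented for illustration):
# a patient occupies one of three health states, and the clinician
# chooses between two treatment actions at each decision epoch.
states = ["healthy", "sick", "critical"]   # |S| = 3
actions = ["conservative", "aggressive"]   # |A| = 2

# P[a][s, s'] = probability of moving from state s to s' under action a.
P = np.array([
    [[0.90, 0.08, 0.02],   # conservative treatment
     [0.30, 0.60, 0.10],
     [0.05, 0.35, 0.60]],
    [[0.85, 0.10, 0.05],   # aggressive treatment: better recovery odds
     [0.50, 0.40, 0.10],   # from sick/critical, slightly riskier overall
     [0.20, 0.40, 0.40]],
])

# R[a][s] = immediate reward (e.g., a quality-of-life score) for taking
# action a in state s; aggressive treatment carries a small added cost.
R = np.array([
    [1.0, 0.2, -1.0],
    [0.8, 0.0, -1.2],
])

gamma = 0.95  # discount factor on future rewards

# Value iteration: repeatedly apply the Bellman optimality operator
# until the value function converges.
V = np.zeros(len(states))
for _ in range(10_000):
    Q = R + gamma * (P @ V)        # Q[a, s] = R[a, s] + gamma * E[V(s')]
    V_new = Q.max(axis=0)          # best achievable value in each state
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new

# The optimal policy picks, in each state, the action maximizing Q.
policy = [actions[a] for a in Q.argmax(axis=0)]
for s, (v, a) in enumerate(zip(V, policy)):
    print(f"{states[s]:>8}: V* = {v:7.2f}, optimal treatment = {a}")
```

Under these assumed dynamics, the policy prescribes a different treatment depending on the patient's current state, illustrating how an MDP couples present actions to future health outcomes through the transition probabilities.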
References
Alagoz O, Maillart L, Schaefer A, Roberts M (2004) The optimal timing of living-donor liver transplantation. Manag Sci 50(10):1420–1430
Ayer T, Alagoz O, Stout N (2012) OR Forum—A POMDP approach to personalize mammography screening decisions. Oper Res 60(5):1019–1034
Baucum M, Khojandi A, Vasudevan R (2020) Improving deep reinforcement learning with transitional variational autoencoders: A healthcare application. IEEE J Biomed Health Inform 25(6):2273–2280
Baucum M, Khojandi A, Vasudevan R, Ramdhani R (2020) Optimizing patient-specific medication regimen policies using wearable sensors in Parkinson's disease. https://bmjopen.bmj.com/content/bmjopen/7/11/e018374.full.pdf, preprint
Fatima M, Pasha M (2017) Survey of machine learning algorithms for disease diagnostic. J Intell Learn Syst Appl 9(01):1
Feinberg E (2011) Total expected discounted reward MDPs: existence of optimal policies. In: Wiley encyclopedia of operations research and management science. Wiley, Hoboken, NJ
Feinberg E, Shwartz A (2002) Handbook of Markov decision processes: methods and applications. Kluwer Academic Publishers, Boston, MA
Grand-Clément J, Chan C, Goyal V, Chuang E (2021) Interpretable machine learning for resource allocation with application to ventilator triage. ArXiv preprint 2110.10994
Ibrahim R, Kucukyazici B, Verter V, Gendreau M, Blostein M (2016) Designing personalized treatment: an application to anticoagulation therapy. Prod Oper Manag 25(5):902–918
Komorowski M, Celi L, Badawi O, Gordon A, Faisal A (2018) The artificial intelligence clinician learns optimal treatment strategies for sepsis in intensive care. Nat Med 24(11):1716–1720
Lakkaraju H, Rudin C (2017) Learning cost-effective and interpretable treatment regimes. In: Artificial Intelligence and Statistics. PMLR, pp 166–175
Liu Y, Li S, Li F, Song L, Rehg J (2015) Efficient learning of continuous-time hidden Markov models for disease progression. Adv Neural Inf Process Syst 28:3600–3608
Lu M, Shahn Z, Sow D, Doshi-Velez F, Lehman L (2020) Is deep reinforcement learning ready for practical applications in healthcare? A sensitivity analysis of Duel-DDQN for hemodynamic management in sepsis patients. In: AMIA Annual Symposium Proceedings, vol 2020. American Medical Informatics Association, p 773
Nemati S, Ghassemi M, Clifford G (2016) Optimal medication dosing from suboptimal clinical examples: a deep reinforcement learning approach. In: 2016 38th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC). IEEE, pp 2978–2981
Parbhoo S, Bogojeska J, Zazzi M, Roth V, Doshi-Velez F (2017) Combining kernel and model based learning for HIV therapy selection. AMIA Summits Transl Sci Proc 2017:239
Peine A, Hallawa A, Bickenbach J, Dartmann G, Fazlic L, Schmeink A, Ascheid G, Thiemermann C, Schuppert A, Kindle R (2021) Development and validation of a reinforcement learning algorithm to dynamically optimize mechanical ventilation in critical care. NPJ Digit Med 4(1):1–12
Prasad N, Cheng L, Chivers C, Draugelis M, Engelhardt B (2017) A reinforcement learning approach to weaning of mechanical ventilation in intensive care units. ArXiv preprint 1704.06300
Puterman M (1990) Markov decision processes. Handb Oper Res Manag Sci 2:331–434
Puterman M (2014) Markov decision processes: discrete stochastic dynamic programming. Wiley, Hoboken, NJ
Raghu A, Komorowski M, Ahmed I, Celi L, Szolovits P, Ghassemi M (2017) Deep reinforcement learning for sepsis treatment. ArXiv preprint 1711.09602
Saranya G, Pravin A (2020) A comprehensive study on disease risk predictions in machine learning. Int J Electr Comput Eng 10(4):4217
Schaefer A, Bailey M, Shechter S, Roberts M (2005) Modeling medical treatment using Markov decision processes. In: Operations research and health care. Springer, Boston, MA, pp 593–612
Schell G, Marrero W, Lavieri M, Sussman J, Hayward R (2016) Data-driven Markov decision process approximations for personalized hypertension treatment planning. MDM Policy Pract 1(1), https://doi.org/10.1177/2381468316674214
Shechter S, Bailey M, Schaefer A, Roberts M (2008) The optimal time to initiate HIV therapy under ordered health states. Oper Res 56(1):20–33
Steimle L, Denton B (2017) Markov decision processes for screening and treatment of chronic diseases. In: Markov decision processes in practice. Springer, Cham, Switzerland, pp 189–222
Tang S, Modi A, Sjoding M, Wiens J (2020) Clinician-in-the-loop decision making: Reinforcement learning with near-optimal set-valued policies. In: International Conference on Machine Learning. PMLR, pp 9387–9396
White III C, White D (1989) Markov decision processes. Eur J Oper Res 39(1):1–16
Wiesemann W, Kuhn D, Rustem B (2013) Robust Markov decision processes. Math Oper Res 38(1):153–183
Xu H, Mannor S (2010) Distributionally robust markov decision processes. Adv Neural Inf Process Syst 23:2505–2513
Yu C, Liu J, Nemati S, Yin G (2021) Reinforcement learning in healthcare: a survey. ACM Comput Surv (CSUR) 55(1):1–36
Zhou Z, Wang Y, Mamani H, Coffey D (2019) How do tumor cytogenetics inform cancer treatments? Dynamic risk stratification and precision medicine using multi-armed bandits. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3405082, preprint
© 2023 Springer Nature Switzerland AG
Baucum, M., Khojandi, A. (2023). Markov Decision Processes: Application to Treatment Planning. In: Pardalos, P.M., Prokopyev, O.A. (eds) Encyclopedia of Optimization. Springer, Cham. https://doi.org/10.1007/978-3-030-54621-2_844-1