Optimizing the Expected Mean Payoff in Energy Markov Decision Processes

Brázdil, Tomáš; Kučera, Antonín; Novotný, Petr

doi:10.1007/978-3-319-46520-3_3


Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 9938)

Included in the following conference series:

International Symposium on Automated Technology for Verification and Analysis


Abstract

Energy Markov Decision Processes (EMDPs) are finite-state Markov decision processes where each transition is assigned an integer counter update and a rational payoff. An EMDP configuration is a pair s(n), where s is a control state and n is the current counter value. The configurations are changed by performing transitions in the standard way. We consider the problem of computing a safe strategy (i.e., a strategy that keeps the counter non-negative) which maximizes the expected mean payoff.
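The definitions above can be illustrated with a minimal Python sketch. It is not the algorithm from the paper: the toy EMDP, the state and action names, the two strategies, and the `simulate` helper are all hypothetical, and the average payoff of a finite sampled prefix only approximates the expected mean payoff, which is defined over infinite runs. The sketch shows configurations s(n) as `(state, counter)` pairs, transitions carrying an integer counter update and a rational payoff, and what it means for a strategy to be safe.

```python
import random
from dataclasses import dataclass
from fractions import Fraction


@dataclass(frozen=True)
class Transition:
    target: str        # successor control state
    prob: Fraction     # probability of this outcome
    update: int        # integer counter update
    payoff: Fraction   # rational payoff


# Hypothetical toy EMDP: EMDP[state][action] lists the probabilistic outcomes.
EMDP = {
    "work": {
        "run": [Transition("work", Fraction(1), -1, Fraction(5))],
        "rest": [Transition("charge", Fraction(1), 0, Fraction(0))],
    },
    "charge": {
        "load": [
            Transition("work", Fraction(1, 2), 2, Fraction(0)),
            Transition("work", Fraction(1, 2), 1, Fraction(0)),
        ],
    },
}


def cautious(state, counter):
    """Counter-aware strategy: spend energy only while some is left."""
    if state == "work":
        return "run" if counter >= 1 else "rest"
    return "load"


def greedy(state, counter):
    """Counter-oblivious strategy: always collect payoff; unsafe in general."""
    return "run" if state == "work" else "load"


def simulate(emdp, strategy, state, counter, steps, rng):
    """Sample a run prefix of `steps` >= 1 transitions from state(counter).

    Returns (configs, mean): the visited configurations and the average
    payoff of the prefix, or mean = None if the counter became negative.
    """
    configs = [(state, counter)]
    total = Fraction(0)
    for _ in range(steps):
        outcomes = emdp[state][strategy(state, counter)]
        weights = [float(t.prob) for t in outcomes]
        chosen = rng.choices(outcomes, weights=weights)[0]
        state, counter = chosen.target, counter + chosen.update
        total += chosen.payoff
        configs.append((state, counter))
        if counter < 0:          # safety violated: stop the run here
            return configs, None
    return configs, total / steps
```

In this toy model the `cautious` strategy is safe from any configuration with a non-negative counter, because it takes the only counter-decreasing action solely when the counter is at least 1. The `greedy` strategy earns a higher payoff per step at first but drives the counter below zero, which is the trade-off a safe optimal strategy has to resolve.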

The research was funded by the Czech Science Foundation Grant No. P202/12/G061 and by the People Programme (Marie Curie Actions) of the European Union’s Seventh Framework Programme (FP7/2007-2013) under REA grant agreement no [291734].


Notes

1. The payoff may correspond to some independent performance measure, or it can reflect the use of the critical resource represented by the counter.
2. Formally, the decision algorithm answers “yes” iff, say, the first possibility holds.
3. As a finite description, we can imagine a program with unbounded integer variables that encodes the strategy’s execution.


Author information

Authors and Affiliations

Faculty of Informatics MU, Botanická 68a, 602 00, Brno, Czech Republic
Tomáš Brázdil & Antonín Kučera
IST Austria, Klosterneuburg, Austria
Petr Novotný


Corresponding author

Correspondence to Petr Novotný.

Editor information

Editors and Affiliations

AIST, Osaka, Japan
Cyrille Artho
Inria Rennes, Rennes, France
Axel Legay
Bar Ilan University, Ramat Gan, Israel
Doron Peled


About this paper

Cite this paper

Brázdil, T., Kučera, A., Novotný, P. (2016). Optimizing the Expected Mean Payoff in Energy Markov Decision Processes. In: Artho, C., Legay, A., Peled, D. (eds) Automated Technology for Verification and Analysis. ATVA 2016. Lecture Notes in Computer Science, vol 9938. Springer, Cham. https://doi.org/10.1007/978-3-319-46520-3_3


DOI: https://doi.org/10.1007/978-3-319-46520-3_3
Published: 22 September 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-46519-7
Online ISBN: 978-3-319-46520-3
eBook Packages: Computer Science, Computer Science (R0)
