Safe-Exploration of Control Policies from Safe-Experience via Gaussian Processes

  • Conference paper
Learning and Intelligent Optimization (LION 2022)

Abstract

Control of many real-life systems relies heavily on the knowledge of a domain expert, who usually adopts a safe control policy to deal with uncertainty. Here, safe means that the policy aims to avoid system disruptions or significant deviations from the desired behaviour, usually at the cost of sub-optimal performance. This paper proposes a statistically sound approach that exploits the collected experience to safely explore new policies, accepting a reasonable risk in terms of safety while improving performance. Gaussian Process regression is the core of the approach: it provides a probabilistic approximation of both the system’s dynamics and its performance, learned from historical data collected while applying the safe policy. Being a probabilistic model, a Gaussian Process provides both an estimate of the level of safety and, more importantly, the associated predictive uncertainty, which is crucial for the safe exploration of new, more efficient policies. The approach avoids the typically expensive implementation of a digital twin of the system, which simulation-optimization approaches require, as well as the formulation of a stochastic programming problem. Results on two case studies inspired by real-life systems show improved performance with respect to the initial safe policy while maintaining reasonable safety of the systems.
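
The general idea of using a Gaussian Process estimate together with its predictive uncertainty to restrict exploration can be illustrated with a minimal sketch. It is not the authors' algorithm: the one-dimensional policy parameter, the toy historical data, the squared-exponential kernel, the safety threshold `H`, the confidence multiplier `BETA`, and the lower-confidence-bound safety rule are all assumptions made here for illustration. Two Gaussian Processes are fitted to hypothetical data logged under a safe policy, one for a safety margin and one for performance; a candidate policy counts as safe only if the pessimistic safety estimate stays above the threshold, and the best predicted performance is then picked among the safe candidates.

```python
import math

def rbf(a, b, length=0.2, var=1.0):
    """Squared-exponential kernel between two scalars (assumed kernel choice)."""
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def forward(L, b):
    """Solve L y = b by forward substitution."""
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def backward(L, y):
    """Solve L^T x = y by back substitution."""
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_fit(X, y, noise=1e-4):
    """Fit a GP with a constant (empirical) mean and fixed hyperparameters."""
    mean = sum(y) / len(y)
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = backward(L, forward(L, [v - mean for v in y]))
    return X, L, alpha, mean

def gp_predict(model, x):
    """Posterior mean and standard deviation of the latent function at x."""
    X, L, alpha, mean = model
    k = [rbf(x, xi) for xi in X]
    mu = mean + sum(ki * ai for ki, ai in zip(k, alpha))
    v = forward(L, k)
    var = max(rbf(x, x) - sum(vi * vi for vi in v), 0.0)
    return mu, math.sqrt(var)

# Hypothetical history collected under a safe policy (toy data, not from the paper).
X_hist = [0.20, 0.25, 0.30, 0.35, 0.40]    # policy parameter values already tried
safety_hist = [1.0 - x for x in X_hist]    # assumed safety margin (higher is safer)
perf_hist = [x for x in X_hist]            # assumed performance (higher is better)

H = 0.5      # assumed minimum acceptable safety margin
BETA = 2.0   # assumed multiplier: larger values mean more cautious exploration

gp_safety = gp_fit(X_hist, safety_hist)
gp_perf = gp_fit(X_hist, perf_hist)

# A candidate is kept only if its pessimistic safety estimate stays above H.
candidates = [i / 100 for i in range(101)]
safe_set = []
for x in candidates:
    mu_s, sd_s = gp_predict(gp_safety, x)
    if mu_s - BETA * sd_s >= H:
        safe_set.append(x)

# Among the safe candidates, pick the one with the best predicted performance.
best_x = max(safe_set, key=lambda x: gp_predict(gp_perf, x)[0])
best_mu, best_sd = gp_predict(gp_perf, best_x)
print(f"safe candidates: {len(safe_set)}, chosen policy parameter: {best_x:.2f}")
```

In this sketch, candidates far from the historical data are rejected because their predictive uncertainty is large, not because their predicted safety is low. Lowering `BETA` widens the explored region at the price of a higher risk of violating the threshold.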

Notes

  1. https://www.mathworks.com/help/simulink/ug/model-a-house-heating-system.html

  2. https://github.com/facebook/prophet/blob/main/examples/example_yosemite_temps.csv

Author information

Correspondence to Antonio Candelieri.

Copyright information

© 2022 Springer Nature Switzerland AG

About this paper

Cite this paper

Candelieri, A., Ponti, A., Archetti, F. (2022). Safe-Exploration of Control Policies from Safe-Experience via Gaussian Processes. In: Simos, D.E., Rasskazova, V.A., Archetti, F., Kotsireas, I.S., Pardalos, P.M. (eds) Learning and Intelligent Optimization. LION 2022. Lecture Notes in Computer Science, vol 13621. Springer, Cham. https://doi.org/10.1007/978-3-031-24866-5_18

  • DOI: https://doi.org/10.1007/978-3-031-24866-5_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-24865-8

  • Online ISBN: 978-3-031-24866-5

  • eBook Packages: Computer Science (R0)
