Safe-Exploration of Control Policies from Safe-Experience via Gaussian Processes

Candelieri, Antonio; Ponti, Andrea; Archetti, Francesco

doi:10.1007/978-3-031-24866-5_18

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13621)

Included in the following conference series:

International Conference on Learning and Intelligent Optimization

Abstract

Control of many real-life systems relies strongly on the knowledge of a domain expert, who usually adopts a safe control policy to deal with uncertainty. Here, safe means that the policy aims to avoid system disruptions or significant deviations from the desired behaviour, usually at the cost of sub-optimal performance. This paper proposes a statistically sound approach that exploits the collected experience to safely explore new policies, accepting a reasonable risk in terms of safety while improving performance. Gaussian Process regression is the core of the approach: it provides a probabilistic approximation of both the system’s dynamics and its performance, based on historical data collected while applying the safe policy. Being a probabilistic model, a Gaussian Process provides both an estimate of the level of safety and, more importantly, the associated predictive uncertainty, which is crucial for implementing the safe exploration of new, efficient policies. The approach avoids the typically expensive implementation of a digital twin of the system, required by simulation-optimization approaches, as well as the formulation of a stochastic programming problem. Results on two case studies, inspired by real-life systems, show improved performance with respect to the initial safe policy while keeping the systems reasonably safe.
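To make the role of predictive uncertainty concrete, the following is a minimal sketch and not the authors' implementation. It assumes a one-dimensional policy parameter, a zero-mean Gaussian Process with a squared-exponential kernel, a hypothetical safety margin that must stay non-negative, and a lower-confidence-bound rule with an assumed multiplier `beta`. All data values, names, and thresholds are illustrative assumptions.

```python
import math

def rbf(a, b, length=0.5, var=1.0):
    # Squared-exponential kernel (an assumed choice, not taken from the paper).
    return var * math.exp(-0.5 * ((a - b) / length) ** 2)

def cholesky(A):
    # Lower-triangular factor L with A = L L^T, for a small dense matrix.
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = math.sqrt(A[i][i] - s)
            else:
                L[i][j] = (A[i][j] - s) / L[j][j]
    return L

def solve_lower(L, b):
    # Forward substitution for L y = b.
    y = []
    for i in range(len(b)):
        y.append((b[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i])
    return y

def solve_upper_t(L, y):
    # Back substitution for L^T x = y.
    n = len(y)
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[k][i] * x[k] for k in range(i + 1, n))) / L[i][i]
    return x

def gp_posterior(X, y, x_new, noise=1e-4):
    # Posterior mean and standard deviation of a zero-mean GP at x_new.
    K = [[rbf(a, b) + (noise if i == j else 0.0) for j, b in enumerate(X)]
         for i, a in enumerate(X)]
    L = cholesky(K)
    alpha = solve_upper_t(L, solve_lower(L, y))
    k_star = [rbf(a, x_new) for a in X]
    mean = sum(k * a for k, a in zip(k_star, alpha))
    v = solve_lower(L, k_star)
    variance = max(rbf(x_new, x_new) - sum(t * t for t in v), 0.0)
    return mean, math.sqrt(variance)

# Hypothetical history collected under the safe policy (illustrative values).
X_hist = [0.0, 0.2, 0.4]          # policy parameter values already tried
safety_hist = [1.0, 0.8, 0.5]     # safety margin, safe when >= 0
perf_hist = [0.1, 0.3, 0.5]       # performance, higher is better

def is_safe(x, beta=2.0, threshold=0.0):
    # Pessimistic check: the lower confidence bound must clear the threshold.
    mean, std = gp_posterior(X_hist, safety_hist, x)
    return mean - beta * std >= threshold

candidates = [round(0.1 * i, 1) for i in range(16)]   # 0.0, 0.1, ..., 1.5
safe_candidates = [x for x in candidates if is_safe(x)]
# Among candidates judged safe, pick the best predicted performance.
best = max(safe_candidates, key=lambda x: gp_posterior(X_hist, perf_hist, x)[0])
print("safe candidates:", safe_candidates)
print("next policy parameter to try:", best)
```

Because the prior mean is zero and the predictive standard deviation grows away from the historical data, candidates far from past experience fail the pessimistic check. Exploration therefore stays close to what the safe policy has already visited, and a larger `beta` makes the rule more conservative.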

Author information

Authors and Affiliations

University of Milano-Bicocca, Milan, Italy
Antonio Candelieri, Andrea Ponti & Francesco Archetti

Corresponding author

Correspondence to Antonio Candelieri.

Editor information

Editors and Affiliations

SBA Research, Vienna, Austria
Dimitris E. Simos
Moscow Aviation Institute (National Research University), Moscow, Russia
Varvara A. Rasskazova
Università degli Studi di Milano-Bicocca, Milan, Italy
Francesco Archetti
Wilfrid Laurier University, Waterloo, ON, Canada
Ilias S. Kotsireas
University of Florida, Gainesville, FL, USA
Panos M. Pardalos

About this paper

Cite this paper

Candelieri, A., Ponti, A., Archetti, F. (2022). Safe-Exploration of Control Policies from Safe-Experience via Gaussian Processes. In: Simos, D.E., Rasskazova, V.A., Archetti, F., Kotsireas, I.S., Pardalos, P.M. (eds) Learning and Intelligent Optimization. LION 2022. Lecture Notes in Computer Science, vol 13621. Springer, Cham. https://doi.org/10.1007/978-3-031-24866-5_18

DOI: https://doi.org/10.1007/978-3-031-24866-5_18
Published: 05 February 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24865-8
Online ISBN: 978-3-031-24866-5
eBook Packages: Computer Science, Computer Science (R0)
