Abstract
The control of many real-life systems relies heavily on the knowledge of a domain expert, who usually adopts a safe control policy to deal with uncertainty. Here, safe means that the policy aims to avoid system disruptions or significant deviations from the desired behaviour, usually at the cost of sub-optimal performance. This paper proposes a statistically sound approach that exploits the collected experience to safely explore new policies, accepting a reasonable risk in terms of safety while improving performance. Gaussian Process regression is the core of the approach: it provides a probabilistic approximation of both the system's dynamics and its performance, based on historical data collected while applying the safe policy. Being a probabilistic model, a Gaussian Process provides both an estimate of the level of safety and, more importantly, the associated predictive uncertainty, which is crucial for implementing the safe exploration of new, more efficient policies. The approach avoids the typically expensive implementation of a digital twin of the system, which simulation-optimization approaches require, as well as the formulation of a stochastic programming problem. Results on two case studies, inspired by real-life systems, show an improvement in performance with respect to the initial safe policy, while maintaining a reasonable level of safety.
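The role of the predictive uncertainty can be illustrated with a minimal sketch. This is not the algorithm of the paper: it is a generic lower-confidence-bound rule, and the toy safety and performance functions, the kernel settings, the threshold and the risk parameter `beta` are all assumptions made for illustration. A candidate policy is treated as safe only if the Gaussian Process mean of the safety measure, minus `beta` standard deviations, stays above the threshold.

```python
import numpy as np

def rbf_kernel(a, b, length_scale=0.2, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of inputs."""
    diff = a[:, None] - b[None, :]
    return variance * np.exp(-0.5 * (diff / length_scale) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-4):
    """Posterior mean and standard deviation of a zero-mean GP at x_test."""
    K = rbf_kernel(x_train, x_train) + noise * np.eye(len(x_train))
    K_star = rbf_kernel(x_train, x_test)
    chol = np.linalg.cholesky(K)
    alpha = np.linalg.solve(chol.T, np.linalg.solve(chol, y_train))
    mean = K_star.T @ alpha
    v = np.linalg.solve(chol, K_star)
    var = np.diag(rbf_kernel(x_test, x_test)) - np.sum(v ** 2, axis=0)
    return mean, np.sqrt(np.clip(var, 0.0, None))

# Hypothetical "safe experience": a conservative policy parameter in [0.1, 0.3].
x_hist = np.linspace(0.1, 0.3, 5)
safety_hist = 1.0 - x_hist  # toy safety margin (higher is safer), assumed
perf_hist = x_hist          # toy performance (higher is better), assumed

# Candidate policies, including ones far from the observed safe region.
candidates = np.linspace(0.0, 1.0, 101)
mu_s, sd_s = gp_posterior(x_hist, safety_hist, candidates)
mu_p, _ = gp_posterior(x_hist, perf_hist, candidates)

threshold = 0.5  # assumed minimum acceptable safety margin
beta = 2.0       # assumed risk parameter: larger means more cautious
lower = mu_s - beta * sd_s
safe = lower >= threshold

# Among candidates deemed safe, pick the best predicted performance.
best = int(np.argmax(np.where(safe, mu_p, -np.inf)))
x_next = candidates[best]
print(f"safe candidates: {safe.sum()} of {len(candidates)}; next policy: {x_next:.2f}")
```

Far from the historical data the predictive standard deviation grows back towards the prior, so the lower bound drops below the threshold and those candidates are excluded. Lowering `beta` accepts more risk and enlarges the set of policies that can be explored.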
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Candelieri, A., Ponti, A., Archetti, F. (2022). Safe-Exploration of Control Policies from Safe-Experience via Gaussian Processes. In: Simos, D.E., Rasskazova, V.A., Archetti, F., Kotsireas, I.S., Pardalos, P.M. (eds) Learning and Intelligent Optimization. LION 2022. Lecture Notes in Computer Science, vol 13621. Springer, Cham. https://doi.org/10.1007/978-3-031-24866-5_18
DOI: https://doi.org/10.1007/978-3-031-24866-5_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24865-8
Online ISBN: 978-3-031-24866-5
eBook Packages: Computer Science, Computer Science (R0)