Optimal control of a black-box system based on surrogate models by spatial adaptive partitioning method

doi:10.1016/j.isatra.2019.11.012

ISA Transactions

Volume 100, May 2020, Pages 63-73

https://doi.org/10.1016/j.isatra.2019.11.012

Highlights

• Optimal control problems of a black-box system can be solved using surrogate models.
• A hierarchical model based on spatial adaptive partitioning strategy is accurate.
• Combining sensitivity index with interval length can well guide spatial partitioning.

Abstract

When the dynamic model of a classical optimal control problem is explicit, the problem can be transformed into a nonlinear programming problem and solved by a traditional method. In some cases, however, no explicit mathematical model of the state equations is available, only input–output data obtained from a simulation model. A hybrid model composed of functional mockup unit blocks generated on multiple platforms is a typical example. In this work, we regard these blocks as black-box models and use a hierarchical neural network model as a surrogate for the right-hand-side derivative functions of the state equations. Specifically, to obtain a highly accurate hierarchical neural network model, we explore a spatial adaptive partitioning criterion that combines global sensitivity indices with the interval lengths of local spaces, based on the input–output data. Numerical results verify that surrogate models obtained by the spatial adaptive partitioning method are more accurate than models trained with several other partition criteria. A mathematical example and a trajectory optimization problem for the black-box industrial robot Manutec r3 demonstrate the effectiveness of the proposed strategy.

Introduction

The optimal control problem (OCP) based on first-principles mathematical models can be solved with existing indirect methods, dynamic programming, or direct methods. Indirect methods derive a boundary value problem (BVP) from the OCP according to Pontryagin's Maximum Principle [1]. Dynamic programming [2] attempts to solve the Hamilton–Jacobi–Bellman equation approximately. Because of the curse of dimensionality [3], [4], however, it is difficult to apply in high-dimensional spaces. Direct methods [5], the other popular class, do not attempt to solve the necessary conditions for optimality; instead, they discretize the dynamic variables directly to convert the OCP into a nonlinear programming problem (NLP) [6], [7].
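As a rough illustration of the direct approach, the sketch below transcribes a toy OCP (minimize $\int_{0}^{1} u^{2}\, dt$ subject to $\dot{x} = u$, $x(0) = 0$, $x(1) = 1$) into a finite-dimensional problem with a piecewise-constant control. The toy problem, the explicit Euler discretization, the quadratic penalty on the terminal constraint, and the plain gradient descent loop are all assumptions made for brevity and do not come from the paper; a practical implementation would normally pass the NLP to a dedicated solver.

```python
# Direct transcription of a toy OCP (illustrative assumptions only):
#   minimize  integral of u(t)^2 over [0, 1]
#   subject to x' = u, x(0) = 0, x(1) = 1
# The control is piecewise constant on n intervals and the state is
# propagated with explicit Euler, so the unknowns are u_0 .. u_{n-1}.

def simulate(u, x0=0.0, t_f=1.0):
    """Return x(t_f) for the piecewise-constant control sequence u."""
    h = t_f / len(u)
    x = x0
    for uk in u:
        x += h * uk
    return x

def cost(u, rho=1000.0, x_target=1.0):
    """Discretized running cost plus a quadratic terminal penalty."""
    h = 1.0 / len(u)
    miss = simulate(u) - x_target
    return h * sum(uk * uk for uk in u) + rho * miss * miss

def gradient(u, rho=1000.0, x_target=1.0):
    """Analytic gradient of cost() with respect to each u_k."""
    h = 1.0 / len(u)
    miss = simulate(u) - x_target
    return [2.0 * h * uk + 2.0 * rho * miss * h for uk in u]

def solve(n=20, rho=1000.0, iters=200):
    """Gradient descent with step 1/L, L being the largest Hessian eigenvalue."""
    h = 1.0 / n
    step = 1.0 / (2.0 * h * (1.0 + rho))
    u = [0.0] * n
    for _ in range(iters):
        g = gradient(u, rho)
        u = [uk - step * gk for uk, gk in zip(u, g)]
    return u

u_opt = solve()
```

For this toy problem the exact optimal control is $u(t) = 1$; the penalized solution approaches it as the penalty weight `rho` grows.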

Both indirect and direct methods require explicit dynamic equations of the OCP. For a complex physical system, providing them may not be easy. The equations cannot be extracted explicitly when the system components are imported from multiple engineering domains or the entire system spans different modeling and simulation platforms [8]. A Modelica-based platform can directly extract C code of the state equations [9] by compiling a system model built entirely on that platform, and can thus tackle the related OCPs. However, in some practical situations, the simulation code of many computer programs is embedded [10], or some components of the whole system need to be imported from other platforms as functional mockup units (FMUs). In these cases, the state equations of the entire system cannot be established by extracting C code directly.

To address this dilemma, we propose an approach that treats the entire system model as a black-box model and develops a surrogate model for the right-hand-side (RHS) functions of the state equations of the OCP. Typically, a surrogate model can be constructed with various function structures, including splines [11], orthogonal polynomials [12], Kriging [13], Radial Basis Functions (RBF) [14], and other response surface models. Furthermore, models such as Neural Networks (NN) [15] and Support Vector Machines (SVM) [16] are also frequently used for function approximation.
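The idea of replacing the RHS function with a surrogate can be sketched as follows: sample a black-box derivative function, fit an explicit model to the samples, and then integrate the fitted model in place of the original. Everything specific here is an assumption for illustration: `black_box_rhs` is a hypothetical stand-in for an opaque simulator call, and a quadratic polynomial fitted by least squares is used instead of a neural network so that the example stays short and dependency-free.

```python
# Surrogate of a right-hand-side function (illustrative sketch).
# Assumption: a quadratic polynomial replaces the NN used in the paper.

def black_box_rhs(x, u):
    """Hypothetical opaque simulator returning dx/dt for state x, control u."""
    return -x * x + u

def features(x, u):
    """Quadratic polynomial basis in (x, u)."""
    return [1.0, x, u, x * x, x * u, u * u]

def solve_linear(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    sol = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * sol[c] for c in range(r + 1, n))
        sol[r] = (m[r][n] - s) / m[r][r]
    return sol

def fit_surrogate(samples):
    """Least-squares fit through the normal equations (P^T P) w = P^T y."""
    rows = [features(x, u) for x, u in samples]
    ys = [black_box_rhs(x, u) for x, u in samples]
    k = len(rows[0])
    ptp = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    pty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(k)]
    w = solve_linear(ptp, pty)
    return lambda x, u: sum(wi * fi for wi, fi in zip(w, features(x, u)))

def integrate(rhs, x0, controls, h):
    """Explicit Euler rollout of x' = rhs(x, u) under a control sequence."""
    xs = [x0]
    for u in controls:
        xs.append(xs[-1] + h * rhs(xs[-1], u))
    return xs

# Sample the black box on a grid over [-1, 1] x [-1, 1], then compare rollouts.
grid = [i / 5.0 - 1.0 for i in range(11)]
surrogate = fit_surrogate([(x, u) for x in grid for u in grid])
controls = [0.5] * 50
true_traj = integrate(black_box_rhs, 0.0, controls, 0.02)
approx_traj = integrate(surrogate, 0.0, controls, 0.02)
max_gap = max(abs(a - b) for a, b in zip(true_traj, approx_traj))
```

Because the hypothetical black box happens to lie in the span of the chosen basis, the two trajectories agree to rounding error; for a real simulator the gap would depend on the surrogate's accuracy over the region the trajectory visits.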

Although the approximation methods described above are available, it is still difficult to develop a high-precision surrogate model over the whole state space, especially for high-dimensional data [17]. Two lines of research address this problem: training local models [18] and reducing the dimensionality of the training data [17]. The former partitions the entire state space into subspaces according to an appropriate partition criterion and constructs a separate local model for each subspace. Compared with establishing a single global model, training multiple local models is a more reasonable way to approximate a complicated function. The latter projects the data into a low-dimensional space using dimensionality reduction techniques such as local linear embedding [19] and self-organizing maps [20]. After a surrogate model is trained in the low-dimensional space, the global model in the original space can be evaluated through the mapping between the two spaces. However, this requires a priori knowledge of the dimension of the low-dimensional space. Furthermore, not all data can be matched to a suitable low-dimensional data manifold.

After comprehensively considering the function structures and approximation methods mentioned above, we adopt the NN model as the basic model for the following reasons. It is convenient to extract the explicit expression of an NN model. In addition, the NN model is relatively concise and its expression is independent of the data points. Hence, the NN model does not become extremely complicated as the number of training data points increases. However, it is difficult to achieve a high-precision approximation using only one response surface model throughout the whole design space. It is therefore necessary to partition the entire space into local spaces according to an appropriate partition criterion and to develop an NN model in each local space.

A suitable spatial partitioning strategy is crucial for constructing the multiple NNs, which together are called a hierarchical neural network model. Existing tree-based learning methods include related research on partitioning training data sets. Although these methods differ from our surrogate model construction, which partitions the space and resamples points in each partitioned subspace, they still offer some inspiration. In most previous partition strategies, new subsets are obtained by splitting existing subsets along one dimension [21], [22]. In this literature, the basic strategy for selecting the partition dimension is to minimize impurity [23] or mean squared error [21].

In addition, the well-known Classification and Regression Tree (CART) [24] identifies the best split among all potential splits on each input variable by exhaustive search. CART predicts the samples in each of the two child nodes by their mean output value and selects the split point that yields the minimum total deviation. In the Smoothed and Unsmoothed Piecewise-Polynomial Regression Trees (SUPPORT) [23] and Generalized, Unbiased, Interaction Detection and Estimation (GUIDE) [25] algorithms, the partition criterion is to compute a statistic of the residuals and choose the partition dimension with the smallest associated $p$-value. Moreover, a model-based recursive partitioning method scores parameter instability in each node to select the split dimension [26]. The evtree algorithm constructs a globally optimal CART with an evolutionary algorithm [27], in contrast to conventional locally optimal methods. Besides, the segmented regression approach [28] and the Mathematical Programming Tree (MPTree) [29] adopt a mathematical programming model to optimize the positions of the split points and the regression coefficients of the child nodes. The semi-supervised decision tree [30] detects the valley between data-concentrated sub-regions in the input space and partitions the input space accordingly.
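The exhaustive split search described for CART can be sketched in a few lines. This is a simplified regression-tree split on a single node, with midpoints between consecutive distinct values taken as candidate thresholds; the function names and the toy data are illustrative assumptions, not taken from [24].

```python
# CART-style exhaustive search for one split (simplified sketch).

def sse(values):
    """Sum of squared deviations of the values from their mean."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(X, y):
    """Try every dimension and every midpoint threshold; keep the lowest SSE.

    Returns a tuple (score, dimension, threshold).
    """
    best = None
    for dim in range(len(X[0])):
        levels = sorted(set(row[dim] for row in X))
        for lo, hi in zip(levels, levels[1:]):
            thr = 0.5 * (lo + hi)
            left = [yi for row, yi in zip(X, y) if row[dim] <= thr]
            right = [yi for row, yi in zip(X, y) if row[dim] > thr]
            # Each child is predicted by its mean output, as in CART.
            score = sse(left) + sse(right)
            if best is None or score < best[0]:
                best = (score, dim, thr)
    return best

# Toy data: the output depends only on the second input variable.
X = [(0.1, 0.9), (0.4, 0.2), (0.6, 0.8), (0.9, 0.1)]
y = [1.0, 0.0, 1.0, 0.0]
score, dim, thr = best_split(X, y)
```

Every candidate requires a pass over the node's samples, which is the cost that the text attributes to traversal-based criteria.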

Most of the local models used in these existing algorithms are constant or linear fits. Because their local approximation ability is weak, more spatial partitions would be needed if these algorithms were used in our work. The resulting large number of non-smooth boundaries in the surrogate model causes severe discontinuities in the model derivatives, which hinders the solution of OCPs. Selecting NN models as the local models enhances the local approximation ability and reduces the number of local subspaces required. Beyond that, the essential approach in the algorithms mentioned above is to traverse all potential split dimensions and split points and choose the one with the minimum deviation. However, the computational cost increases exponentially with the number of samples and input variables.

In this research, we propose a new partitioning strategy, with the NN model as the local model, to develop high-precision surrogate models for solving OCPs of black-box models. The core of the strategy is a spatial partitioning criterion that combines sensitivity indices with the interval lengths of local spaces. The interval length of a local space is the length of the line segment obtained by projecting the multidimensional local space onto one dimension. According to this criterion, a partition index is calculated for each input variable, and the dimension with the largest value is selected. Traditional methods choose the split dimension and split point by calculating the sum of deviations after trying each potential partition and fitting local models; the new strategy avoids the time consumed by these trials. Numerical examples verify that the surrogate model developed by our method is more accurate for identical training time, and that the method is effective in solving OCPs of black-box models.
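A minimal sketch of how such a criterion might be evaluated is given below. The exact formula for the partition index is not stated in this excerpt, so the combination used here (first-order Sobol' index multiplied by the interval length relative to the global range) is purely an assumption, as are the choice to estimate the indices inside the local box, the Monte Carlo estimator, the function names, and the toy model.

```python
import random

# Partition-dimension selection (hypothetical sketch, not the paper's formula).
# Assumed index for input i: S_i * (local length_i / global length_i),
# where S_i is a Monte Carlo estimate of the first-order Sobol' index.

def first_order_sobol(f, box, n=4000, seed=0):
    """Estimate first-order Sobol' indices of f over a box of (lo, hi) pairs."""
    rng = random.Random(seed)

    def sample():
        return [lo + (hi - lo) * rng.random() for lo, hi in box]

    A = [sample() for _ in range(n)]
    B = [sample() for _ in range(n)]
    fA = [f(a) for a in A]
    fB = [f(b) for b in B]
    outputs = fA + fB
    mean = sum(outputs) / len(outputs)
    var = sum((v - mean) ** 2 for v in outputs) / (len(outputs) - 1)
    indices = []
    for i in range(len(box)):
        acc = 0.0
        for a, b, ya, yb in zip(A, B, fA, fB):
            ab = a[:i] + [b[i]] + a[i + 1:]  # row of A with column i from B
            acc += (yb - mean) * (f(ab) - ya)
        indices.append(acc / n / var)
    return indices

def partition_indices(f, box, global_box):
    """Assumed criterion: sensitivity index times relative interval length."""
    sens = first_order_sobol(f, box)
    return [
        max(s, 0.0) * (hi - lo) / (ghi - glo)
        for s, (lo, hi), (glo, ghi) in zip(sens, box, global_box)
    ]

def select_dimension(f, box, global_box):
    """Return the input dimension with the largest partition index."""
    idx = partition_indices(f, box, global_box)
    return max(range(len(idx)), key=lambda i: idx[i])

def toy_model(x):
    """Hypothetical black-box output that is more sensitive to x[0]."""
    return 4.0 * x[0] + x[1]

global_box = [(0.0, 1.0), (0.0, 1.0)]
first_choice = select_dimension(toy_model, global_box, global_box)
# After x[0] has been refined to a short interval, x[1] takes over.
later_choice = select_dimension(toy_model, [(0.0, 0.125), (0.0, 1.0)], global_box)
```

No trial splits or local model fits are needed to rank the dimensions here; only evaluations of the model (or of stored input–output data) enter the index, which is the saving the paragraph above refers to.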

The rest of this paper is structured as follows. Section 2 reviews the main numerical methods for solving OCPs based on explicit mathematical models of the state equations, as well as Sobol’ sensitivity analysis. Section 3 describes the proposed adaptive spatial partitioning strategy and the construction of local NN models. Section 4 demonstrates the feasibility of our method through two examples. The first example is a nonlinear dynamic system with known solutions; it illustrates the validity of surrogate models in solving OCPs with black-box models. The second example presents the model and OCP of the industrial robot Manutec r3, in which sub-blocks are imported from several platforms in the form of FMUs. The state equations in this case must be treated as black-box functions during the optimization process. The results and related discussion are also given in that section. Section 5 concludes the paper.

Section snippets

Numerical methods for OCPs

The general OCP is formulated as follows:

$\min_{u(t),\, t_{f}} J = \phi(x(t_{0}), t_{0}, x(t_{f}), t_{f}) + \int_{t_{0}}^{t_{f}} L(x(t), u(t), t)\, dt$

$\text{s.t.} \quad \dot{x} = f(x(t), u(t), t)$

$\psi(x(t_{0}), t_{0}, x(t_{f}), t_{f}) = 0$

$g(x(t), u(t), t) \leq 0$

$u_{L} \leq u(t) \leq u_{U}$

The goal of the OCP is to identify the optimal control inputs $u^{*}(t)$ that minimize the performance index $J$. The performance index has two terms: $\phi$ is known as the Mayer term, and $L$ is the integrand of the Lagrange term. $t \in [t_{0}, t_{f}]$ denotes time, and $x(t_{0}) \in \mathbb{R}^{m}$ and $x(t_{f}) \in \mathbb{R}^{m}$ are the initial and final states. $x(t) \in \mathbb{R}^{m}$ and $u(t) \in \mathbb{R}^{n}$ are the state

Spatial adaptive partitioning strategy and construction of hierarchical neural network model

Our research team developed a multi-domain physical system modeling and simulation platform named Mworks [41], based on the Modelica language [42]. Like JModelica.org [9], it can construct dynamic models and define control optimization problems. To extend the applicability of our platform, modules from other platforms such as AMESim, ADAMS and SIMULINK are integrated according to the functional mockup interface (FMI) [8]. Since the FMU blocks of these hybrid models are binary, the explicit

Numerical results and discussion

Surrogate modeling of black-box dynamic systems and the spatial adaptive partitioning strategy combining sensitivity indices with interval length are demonstrated here with two examples. The first example is a simple problem with known solutions. We treat it as a black-box model and develop its surrogate model to solve the OCP. The results obtained are compared with the known solutions to verify the feasibility of our method. The second example is a point-to-point trajectory

Conclusions

In this article, we investigate OCPs of a dynamic model established by multi-platform hybrid modeling. Sub-models or components created on different platforms serve as FMU blocks of the entire model. Since all or part of the simulation code in a hybrid model is embedded, we regard it as a black-box model. Without explicit dynamic equations, traditional methods are not applicable in such circumstances. To cope with this problem, the spatial adaptive partitioning

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 51575205) and the National Key Research and Development Program of China (Grant No. 2018YFB1701702). This support is gratefully acknowledged.

References (48)

Biegler L.T. An overview of simultaneous strategies for dynamic optimization. Chem Eng Process: Process Intensif (2007)
Biegler L.T. et al. Large-scale nonlinear programming using IPOPT: An integrating framework for enterprise-wide dynamic optimization. Comput Chem Eng (2009)
Åkesson J. et al. Modeling and optimization with Optimica and JModelica.org: Languages and tools for solving large-scale dynamic optimization problems. Comput Chem Eng (2010)
Negrellos-Ortiz I. et al. Dynamic optimization of a cryogenic air separation unit using a derivative-free optimization approach. Comput Chem Eng (2018)
Talebitooti R. et al. Shape design optimization of cylindrical tank using b-spline curves. Comput Fluids (2015)
Yin H. On multidimensional scaling and the embedding of self-organising maps. Neural Netw (2008)
Yang L. et al. Mathematical programming for piecewise linear regression analysis. Expert Syst Appl (2016)
Yang L. et al. A regression tree approach using mathematical programming. Expert Syst Appl (2017)
Kim K. A hybrid classification algorithm by subspace partitioning through semi-supervised decision tree. Pattern Recognit (2016)
Wang X. et al. A symplectic pseudospectral method for nonlinear optimal control problems with inequality constraints. ISA Trans (2017)
Garg D. et al. A unified framework for the numerical solution of optimal control problems using pseudospectral methods. Automatica (2010)
Liu P. et al. Fast engineering optimization: A novel highly effective control parameterization approach for industrial dynamic processes. ISA Trans (2015)
Liu P. et al. A novel non-uniform control vector parameterization approach with time grid refinement for flight level tracking optimal control problems. ISA Trans (2018)
Li M. et al. Solutions of nonlinear constrained optimal control problems using quasilinearization and variational pseudospectral methods. ISA Trans (2016)
Ross I.M. et al. A review of pseudospectral optimal control: From theory to flight. Annu Rev Control (2012)
Nossent J. et al. Sobol' sensitivity analysis of a complex environmental model. Environ Model Softw (2011)
Saltelli A. et al. How to avoid a perfunctory sensitivity analysis. Environ Model Softw (2010)
Sobol I.M. Uniformly distributed sequences with an additional uniform property. USSR Comput Math Math Phys (1976)
Vaz A.I.F. et al. Robot trajectory planning with semi-infinite programming. European J Oper Res (2004)
Pontryagin L.S. Mathematical theory of optimal processes (2018)
Elbert P. et al. Implementation of dynamic programming for $n$-dimensional optimal control problems with final state constraints. IEEE Trans Control Syst Technol (2013)
Bellman R. Dynamic programming (2013)
Bertsekas D.P. Dynamic programming and optimal control, vol. 1 (1996)
Betts J.T. Practical methods for optimal control and estimation using nonlinear programming, vol. 19 (2010)

Cited by (7)

A new sequential sampling method of surrogate models for design and optimization of dynamic systems
2021, Mechanism and Machine Theory
Citation excerpt:
Besides, state responses are obtained through integrating the surrogate model of derivative function so that the characteristics of dynamic systems can be preserved. For instance, a high-fidelity hierarchical surrogate model of derivative function is constructed to solve the trajectory planning problem of a black-box industrial robot in the work of Qiao et al. through spatial adaptive partitioning method [28]. RSMs have been widely used in static optimization problems, which can effectively reduce the number of expensive function evaluations [29,30].
When solving combined plant and control optimization (co-design) problems of actual dynamic systems, models with computationally expensive state derivative functions are often encountered. Jacobian information cannot be effectively extracted in the optimization process since the expressions of these models are extremely verbose or the models are given in the form of simulation programs, which makes gradient-based optimization algorithms difficult to apply. Alternative approaches based on finite difference techniques are then selected. These methods require numerous expensive evaluations of the original model, hence co-design optimizations of these complicated systems are basically impractical. Here, we propose a new sequential sampling strategy based on error filtering and distance clustering to construct and update surrogate models. The computational cost is significantly reduced by deploying these cheap surrogate models and their easy-to-extract Jacobian information. Dynamic optimization examples of 2 DOF and 3 DOF robots compare several sampling strategies and show the feasibility and efficiency of the proposed method. A co-design example of a wind turbine follows to demonstrate its application prospects in engineering design.
Reliability Index Allocation for Industrial Robot Based Combinatorial Weighting Game Theory
2024, SSRN
A Right-Hand Side Function Surrogate Model-Based Method for the Black-Box Dynamic Optimization Problem
2023, Journal of Mechanical Design
An Asymmetric Collision-Free Optimal Trajectory Planning Method for Three DOF Industrial Robotic Arms
2023, Symmetry
A Crossrate-Based Approach for Reliability-Based Multidisciplinary Dynamic System Design Optimization
2023, Applied Sciences (Switzerland)
A Transformation-Based Improved Kriging Method for the Black Box Problem in Reliability-Based Design Optimization
2023, Mathematics
