Elsevier

ISA Transactions

Volume 100, May 2020, Pages 63-73
Research article
Optimal control of a black-box system based on surrogate models by spatial adaptive partitioning method

https://doi.org/10.1016/j.isatra.2019.11.012

Highlights

  • Optimal control problems of a black-box system can be solved using surrogate models.

  • A hierarchical model based on a spatial adaptive partitioning strategy is accurate.

  • Combining sensitivity indices with interval length guides spatial partitioning well.

Abstract

When the dynamic model of a classical optimal control problem is explicit, we can transform the problem into a nonlinear programming problem and solve it with a traditional method. However, in some cases, no explicit mathematical model of the state equations is available; only input–output data obtained from a simulation model are provided. A hybrid model composed of functional mockup unit blocks generated on multiple platforms is a typical example. In this work, we regard these blocks as black-box models and use a hierarchical neural network model as a surrogate for the right-hand-side derivative functions of the state equations. Specifically, to obtain a highly accurate hierarchical neural network model, we explore a spatial adaptive partitioning criterion that combines global sensitivity indices with the interval lengths of local spaces, based on the input–output data. Numerical results verify that surrogate models obtained by the spatial adaptive partitioning method are more accurate than models trained with several other partition criteria. A mathematical example and a trajectory optimization problem for the black-box industrial robot Manutec r3 indicate the effectiveness of the proposed strategy.

Introduction

An optimal control problem (OCP) based on a first-principles mathematical model can be solved with existing indirect methods, dynamic programming, or direct methods. Indirect methods derive a boundary value problem (BVP) from the OCP according to Pontryagin's Maximum Principle [1]. Dynamic programming [2] attempts to solve the Hamilton–Jacobi–Bellman equation approximately. Owing to the curse of dimensionality [3], [4], however, it is difficult to apply in high-dimensional spaces. Direct methods [5], the other popular class, do not attempt to solve the necessary conditions for optimality; instead, they discretize the dynamic variables directly to convert the OCP into a nonlinear programming problem (NLP) [6], [7].
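The direct approach described above can be illustrated with a minimal sketch. The toy problem, the function name `transcribe_and_solve`, and the projected gradient iteration are all assumptions chosen for illustration; they are not the formulation or the NLP solver used in the paper. The control is discretized as piecewise-constant values, the dynamics are integrated with forward Euler, and the resulting finite-dimensional problem is solved numerically.

```python
# Toy OCP (an illustrative assumption, not taken from the paper):
#   minimize  J = integral_0^1 u(t)^2 dt
#   subject to  dx/dt = u,  x(0) = 0,  x(1) = 1
# Analytic optimum: u(t) = 1, J = 1.

def transcribe_and_solve(n_intervals=20, iterations=100):
    h = 1.0 / n_intervals            # uniform time step
    step = 0.25 / h                  # gradient step; halves the error each iteration here

    def project(controls):
        # Forward Euler gives x(1) = h * sum(u_k). Enforce x(1) = 1 by
        # orthogonal projection onto the hyperplane h * sum(u_k) = 1.
        shift = (1.0 / h - sum(controls)) / len(controls)
        return [uk + shift for uk in controls]

    # NLP variables: one piecewise-constant control value per interval.
    u = project([float(k) for k in range(n_intervals)])   # deliberately poor initial guess
    for _ in range(iterations):
        grad = [2.0 * h * uk for uk in u]                  # gradient of J = h * sum(u_k^2)
        u = project([uk - step * g for uk, g in zip(u, grad)])

    cost = h * sum(uk * uk for uk in u)
    x = [0.0]                                              # state recovered by Euler integration
    for uk in u:
        x.append(x[-1] + h * uk)
    return u, x, cost

u, x, cost = transcribe_and_solve()
```

In practice the discretized problem would be handed to a general-purpose NLP solver; the projected gradient loop only stands in for that step so that the sketch needs no external libraries.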

Both indirect and direct methods require explicit dynamic equations of the OCP. For a complex physical system, providing them may not be easy. The equations cannot be extracted explicitly when the system components are imported from multiple engineering domains or the entire system spans different modeling and simulation platforms [8]. A Modelica-based platform can directly extract the C code of the state equations [9] by compiling a system model built solely in that platform, and can thereby tackle the related OCPs. However, in some practical situations, the simulation code of many computer programs is embedded [10], or some components of the system need to be imported from other platforms as functional mockup units (FMUs). In these cases, the state equations of the entire system cannot be established by extracting C code directly.

To address this dilemma, we propose to treat the entire system model as a black-box model and to develop a surrogate model for the right-hand-side (RHS) functions of the state equations of the OCP. Typically, a surrogate model can be constructed with various function structures, including splines [11], orthogonal polynomials [12], Kriging [13], radial basis functions (RBF) [14], and other response surface models. Furthermore, models such as neural networks (NN) [15] and support vector machines (SVM) [16] are also frequently used for function approximation.

Although the approximation methods described above are available, it is still difficult to develop a high-precision surrogate model over the whole state space, especially for high-dimensional data [17]. To deal with this problem, research has addressed training local models [18] and reducing the dimensionality of the training data [17]. The former partitions the entire state space into subspaces in accordance with an appropriate partition criterion and constructs a separate local model for each subspace. Compared with establishing a single global model, training multiple local models is a more reasonable way to approximate a complicated function. The latter projects the data into a low-dimensional space using dimensionality reduction techniques such as local linear embedding [19] and self-organizing maps [20]. After a surrogate model is trained in the low-dimensional space, the global model in the original space can be evaluated through the mapping between the two spaces. However, a priori knowledge of the dimension of the low-dimensional space is required. Furthermore, not all data can be matched to a suitable low-dimensional manifold.
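The benefit of local models over a single global model can be seen in a small sketch. Everything here is an assumption made for illustration: the test function `sin(2*pi*x)` stands in for a black-box function, the partition is a fixed set of equal subintervals rather than an adaptive one, and least-squares lines replace the neural network local models that the paper adopts.

```python
import math

def fit_line(xs, ys):
    """Least-squares line y = a*x + b through the given points."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def rmse(model, xs, ys):
    """Root mean squared error of `model` on the points (xs, ys)."""
    return math.sqrt(sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs))

def build_local_models(f, lo, hi, n_parts, samples_per_part):
    """Partition [lo, hi] into equal subintervals and fit one line per part."""
    width = (hi - lo) / n_parts
    models = []
    for k in range(n_parts):
        start = lo + k * width
        xs = [start + width * i / (samples_per_part - 1) for i in range(samples_per_part)]
        models.append(fit_line(xs, [f(x) for x in xs]))

    def predict(x):
        k = min(int((x - lo) / width), n_parts - 1)   # subinterval containing x
        a, b = models[k]
        return a * x + b
    return predict

def f(x):
    return math.sin(2.0 * math.pi * x)                # stand-in for a black-box function

test_x = [i / 200 for i in range(201)]
test_y = [f(x) for x in test_x]

# Same total number of samples (100) for both models.
global_model = build_local_models(f, 0.0, 1.0, n_parts=1, samples_per_part=100)
local_model = build_local_models(f, 0.0, 1.0, n_parts=4, samples_per_part=25)

global_rmse = rmse(global_model, test_x, test_y)
local_rmse = rmse(local_model, test_x, test_y)
```

With the same sampling budget, the four local fits follow the curve far more closely than the single global line, which is the motivation for partitioning before training.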

After weighing the function structures and approximation methods mentioned above, we adopt the NN model as the basic model for the following reasons. It is convenient to extract the explicit expression of an NN model. In addition, the NN model is relatively concise, and its expression is independent of the data points. Hence, the NN model does not become excessively complicated as the number of training data points increases. However, it is difficult to achieve a high-precision approximation using only one response surface model over the whole design space. It is therefore necessary to partition the entire space into local spaces in accordance with an appropriate partition criterion and to develop an NN model in each local space.

A suitable spatial partitioning strategy is crucial for constructing the multiple NNs, which together are called a hierarchical neural network model. Existing tree-based learning methods include related research on partitioning training data sets. Although these algorithms differ from our surrogate model construction method, which partitions the space and resamples points in each partitioned subspace, they still offer some inspiration. In most previous partition strategies, new subsets are obtained by splitting existing subsets along one dimension [21], [22]. Minimizing impurity [23] or mean squared error [21] is the basic strategy for selecting the partition dimension in this literature.

In addition, the well-known Classification and Regression Tree (CART) [24] identifies the best split among all potential splits on each input variable by exhaustive search. CART selects the split point that minimizes the deviations obtained when the samples in each of the two child nodes are predicted by their mean output value. The Smoothed and Unsmoothed Piecewise-Polynomial Regression Trees (SUPPORT) algorithm [23] and the Generalized, Unbiased, Interaction Detection and Estimation (GUIDE) algorithm [25] compute a statistic of the residuals and choose the partition dimension with the smallest associated p-value. Moreover, a model-based recursive partitioning method scores parameter instability in each node to select the split dimension [26]. The evtree algorithm constructs a globally optimal CART with an evolutionary algorithm [27], in contrast to conventional locally optimal methods. Besides, the segmented regression approach [28] and the Mathematical Programming Tree (MPTree) [29] adopt a mathematical programming model to optimize the positions of split points and the regression coefficients of child nodes. The semi-supervised decision tree [30] detects the valley between data-concentrated sub-regions in the input space and partitions the input space accordingly.
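As a rough illustration of the exhaustive CART-style search mentioned above, the sketch below tries every input dimension and every midpoint between consecutive distinct sample values, predicts each child node by its mean output, and keeps the split with the smallest total squared deviation. The function name `best_cart_split` and the sample data are hypothetical, and the sketch covers only a single regression split, not a full tree.

```python
def best_cart_split(X, y):
    """Exhaustive search for one split; returns (dimension, threshold, total_sse)."""
    def sse(values):
        # Sum of squared deviations from the mean (the node's constant prediction).
        if not values:
            return 0.0
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values)

    best = None
    for d in range(len(X[0])):
        levels = sorted(set(row[d] for row in X))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(levels, levels[1:]):
            threshold = 0.5 * (lo + hi)
            left = [yi for row, yi in zip(X, y) if row[d] <= threshold]
            right = [yi for row, yi in zip(X, y) if row[d] > threshold]
            total = sse(left) + sse(right)
            if best is None or total < best[2]:
                best = (d, threshold, total)
    return best

# Hypothetical samples: the output jumps when the first input passes 2.5.
X = [[0.0, 5.0], [1.0, 1.0], [2.0, 4.0], [3.0, 2.0], [4.0, 3.0], [5.0, 0.0]]
y = [0.0, 0.1, 0.0, 1.0, 1.1, 0.9]

dim, threshold, total = best_cart_split(X, y)
```

Every candidate split requires a pass over the samples, which is the trial-and-evaluate cost that the paper's partition criterion aims to avoid.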

Most local models used in these existing algorithms are constant or linear fits. Because of their weak local approximation ability, more spatial partitions would be needed if we used these algorithms in our work. The resulting excess of non-smooth boundaries in the surrogate model causes large discontinuities in the model derivatives, which hinders the solution of OCPs. Selecting NN models as the local models enhances the local approximation ability and reduces the number of local subspaces required. Beyond that, the essential approach of these algorithms is to traverse all potential split dimensions and split points and choose the one with the minimum deviations. However, the computational cost increases exponentially with the number of samples and input variables.

In this research, we propose a new partitioning strategy, with the NN model as the local model, to develop high-precision surrogate models for solving OCPs of black-box models. The core of our strategy is a spatial partitioning criterion that combines sensitivity indices with the interval lengths of local spaces. The interval length of a local space is the length of the line segment obtained by projecting the multidimensional local space onto one dimension. In accordance with this criterion, a partition index is calculated for each input variable, and the dimension with the largest value is selected. Unlike traditional methods, which choose the split dimension and point by calculating the sum of deviations after trial partitioning and local model fitting, the new strategy reduces the time consumed by these trials. Numerical examples verify that, for identical training time, the surrogate model developed by our method is more accurate, and that the method is effective in solving OCPs of black-box models.
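The selection step can be sketched as follows. The exact formula that combines the sensitivity index with the interval length is not given in this excerpt, so the product of the sensitivity index and the normalized interval length used below is purely an assumption, as are the midpoint split, the function names, and the numerical values.

```python
def choose_split_dimension(sensitivity, local_bounds, global_bounds):
    """Return (dimension, indices). The product form is an ASSUMED combination."""
    indices = []
    for s, (lo, hi), (g_lo, g_hi) in zip(sensitivity, local_bounds, global_bounds):
        relative_length = (hi - lo) / (g_hi - g_lo)   # interval length, normalized by the full range
        indices.append(s * relative_length)
    best = max(range(len(indices)), key=indices.__getitem__)
    return best, indices

def bisect_box(local_bounds, dim):
    """Split a box into two along `dim` (splitting at the midpoint is an assumption)."""
    lo, hi = local_bounds[dim]
    mid = 0.5 * (lo + hi)
    left = list(local_bounds)
    left[dim] = (lo, mid)
    right = list(local_bounds)
    right[dim] = (mid, hi)
    return left, right

sensitivity = [0.6, 0.3, 0.1]                            # hypothetical sensitivity indices
global_bounds = [(0.0, 1.0), (0.0, 1.0), (0.0, 1.0)]
local_bounds = [(0.0, 0.25), (0.0, 1.0), (0.0, 1.0)]     # first dimension already refined twice

dim, indices = choose_split_dimension(sensitivity, local_bounds, global_bounds)
left, right = bisect_box(local_bounds, dim)
```

In this example the most sensitive variable is not selected, because its interval in the local space is already short; the second variable has the largest index instead. No trial partition or local model fit is needed to make the choice.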

The rest of this paper is structured as follows. Section 2 reviews the main numerical methods for solving OCPs based on explicit mathematical models of the state equations, as well as Sobol’ sensitivity analysis. Section 3 describes the proposed adaptive spatial partitioning strategy and the construction of local NN models. Section 4 demonstrates the feasibility of our method through two examples. The first is a nonlinear dynamic system with known solutions and illustrates the validity of surrogate models in solving OCPs with black-box models. The second presents the model and OCP of the industrial robot Manutec r3, in which sub-blocks are imported from several platforms in the form of FMUs. The state equations in this case must be treated as black-box functions during the optimization process. The results and related discussion are also given there. Section 5 concludes the paper.

Section snippets

Numerical methods for OCPs

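The general OCPs are as follows:

$$
\begin{aligned}
\min_{u(t),\,t_f}\quad & J = \phi\big(x(t_0), t_0, x(t_f), t_f\big) + \int_{t_0}^{t_f} L\big(x(t), u(t), t\big)\,dt \\
\text{s.t.}\quad & \dot{x} = f\big(x(t), u(t), t\big) \\
& \psi\big(x(t_0), t_0, x(t_f), t_f\big) = 0 \\
& g\big(x(t), u(t), t\big) \le 0 \\
& u_L \le u(t) \le u_U
\end{aligned}
$$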

The goal of an OCP is to identify the optimal control inputs $u(t)$ that minimize the performance index $J$. The performance index has two terms: $\phi$ is known as the Mayer term, and $L$ is the integrand of the Lagrange term. $t \in [t_0, t_f]$ denotes time, and $x(t_0) \in \mathbb{R}^m$ and $x(t_f) \in \mathbb{R}^m$ are the initial and final states. $x(t) \in \mathbb{R}^m$ and $u(t) \in \mathbb{R}^n$ are the state

Spatial adaptive partitioning strategy and construction of hierarchical neural network model

Our research team developed a multi-domain physical system modeling and simulation platform named Mworks [41], based on the Modelica language [42]. Like JModelica.org [9], it can construct dynamic models and define control optimization problems. To extend the applicability of our platform, modules from other platforms such as AMESim, ADAMS and SIMULINK are integrated according to the functional mockup interface (FMI) [8]. Since the FMU blocks of these hybrid models are binary, the explicit

Numerical results and discussion

Surrogate modeling of black-box dynamic systems and the spatial adaptive partitioning strategy combining sensitivity indices with interval length are demonstrated here using two examples. The first example is a simple problem with known solutions. We treat it as a black-box model and develop its surrogate model to solve the OCP. The results are compared with the known solutions to verify the feasibility of our method. The second example is a point-to-point trajectory

Conclusions

In this article, we investigate OCPs of a dynamic model established by multi-platform hybrid modeling. Sub-models or components created on different platforms serve as FMU blocks of the entire model. Since all or part of the simulation code in a hybrid model is embedded, we regard it as a black-box model. Without explicit dynamic equations, traditional methods are not applicable in such circumstances. To cope with this problem, the spatial adaptive partitioning

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.

Acknowledgments

This work is supported by the National Natural Science Foundation of China (Grant No. 51575205) and the National Key Research and Development Program of China (Grant No. 2018YFB1701702). This support is gratefully acknowledged.

References (48)

  • Garg D. et al. A unified framework for the numerical solution of optimal control problems using pseudospectral methods. Automatica (2010)

  • Liu P. et al. Fast engineering optimization: A novel highly effective control parameterization approach for industrial dynamic processes. ISA Trans (2015)

  • Liu P. et al. A novel non-uniform control vector parameterization approach with time grid refinement for flight level tracking optimal control problems. ISA Trans (2018)

  • Li M. et al. Solutions of nonlinear constrained optimal control problems using quasilinearization and variational pseudospectral methods. ISA Trans (2016)

  • Ross I.M. et al. A review of pseudospectral optimal control: From theory to flight. Annu Rev Control (2012)

  • Nossent J. et al. Sobol’ sensitivity analysis of a complex environmental model. Environ Model Softw (2011)

  • Saltelli A. et al. How to avoid a perfunctory sensitivity analysis. Environ Model Softw (2010)

  • Sobol’ I.M. Uniformly distributed sequences with an additional uniform property. USSR Comput Math Math Phys (1976)

  • Vaz A.I.F. et al. Robot trajectory planning with semi-infinite programming. European J Oper Res (2004)

  • Pontryagin L.S. Mathematical theory of optimal processes (2018)

  • Elbert P. et al. Implementation of dynamic programming for n-dimensional optimal control problems with final state constraints. IEEE Trans Control Syst Technol (2013)

  • Bellman R. Dynamic programming (2013)

  • Bertsekas D.P. Dynamic programming and optimal control, vol. 1 (1996)

  • Betts J.T. Practical methods for optimal control and estimation using nonlinear programming, vol. 19 (2010)