Data-driven adjoint-based calibration of port-Hamiltonian systems in time domain

We present a gradient-based algorithm to identify the system matrices of a linear port-Hamiltonian system from given input-output time data. Aiming for a direct structure-preserving approach, we employ techniques from optimal control with ordinary differential equations and define a constrained optimization problem. Input-to-state stability is discussed, which is the key step towards the existence of optimal controls. Further, we derive the first-order optimality system taking into account the port-Hamiltonian structure. Indeed, the proposed method preserves the skew-symmetry and positive (semi-)definiteness of the system matrices throughout the optimization iterations. Numerical results with perturbed and unperturbed synthetic data, as well as an example from the PHS benchmark collection, demonstrate the feasibility of the approach.


Introduction
In structure-preserving modelling of coupled dynamical systems the port-Hamiltonian framework allows for constructing overall port-Hamiltonian systems (PHS) provided that (a) all subsystems are PHS and (b) a power-conserving interconnection between the inputs and outputs of the subsystems is provided [MM19,vdS06,EMv07,DMSB09]. In realistic applications this approach reaches its limits if, for a specific subsystem, either no knowledge that would allow the definition of a physics-based PHS is available, or one is forced to use user-specified simulation packages with no information about the intrinsic dynamics, so that only the input-output characteristics are available. In both cases a remedy is to generate input-output data, either by physical measurements or by evaluation of the simulation package, and derive a PHS surrogate that fits these input-output data best. This PHS surrogate can then be used to model the subsystem and overall leads to a coupled PHS with structure-preserving properties.
There is an extensive literature on system identification methods, see for example [Lju99]. Here we focus on the identification of port-Hamiltonian systems. Port-Hamiltonian realizations are a special class of realizations of passive systems. More precisely, in [CGH23] it is shown that passive systems have a port-Hamiltonian realization. For a detailed survey on passive systems we refer to [Wil72]. Data-driven port-Hamiltonian realizations of dynamic systems from input-output data have been investigated using various approaches. The Loewner framework has been used to construct a port-Hamiltonian realization from frequency domain data in [ALI17,BGVD20]. Another frequency domain approach is proposed in [Sch21], where a parametrization of the class of PHS is used to permit the usage of unconstrained optimization solvers during identification. In [CGB22] the frequency response data are inferred from the time-domain input-output data, and then frequency domain methods are used to construct a port-Hamiltonian realization. Further, in [CMH19] a best-fit linear state-space model is derived from time-domain data and then, in a post-processing step, a nearest port-Hamiltonian realization is inferred. Based on input, state and output time-domain data and a given quadratic Hamiltonian, the authors of [MNU23] construct a port-Hamiltonian realization using dynamic mode decompositions.
We propose a direct time-domain approach for constructing a best-fit PHS model in one step using input-output data. Techniques from optimal control with ordinary differential equations and a constrained optimization problem are employed. We derive the first-order optimality system taking into account the port-Hamiltonian structure. The proposed method preserves the skew-symmetry and positive (semi-)definiteness of the system matrices throughout the optimization iterations. Our approach does not generate frequency data as an intermediate step, and data of the state variable are not needed. We remark that our method is overdetermined because we identify all system parameters including the quadratic Hamiltonian. We account for this by showing different test cases in the numerical section. If the Hamiltonian is known, it is possible to use our approach to derive a best-fit PHS with a given Hamiltonian.
The paper is organized as follows: in the next section, we define the identification problem of computing the best fit of a PHS to given input-output data and prove the continuous dependence of the state on the input, which is the key step to obtain the existence of solutions to the identification problem. The adjoint equation and optimality conditions for all system matrices, i.e., the positive definite scaling matrix Q, the fixed-rank semi-definite dissipation matrix R, the skew-symmetric system matrix J, the input matrix B and the initial value, are derived in Section 3. The gradient-descent algorithm and numerical schemes for the identification process are stated in Section 4. Various numerical tests are provided in Section 5: in particular, we provide several examples for identification problems in the deterministic case and we test the robustness in case of noisy data. The numerical examples are rounded off with an investigation of the method as a model order reduction approach; here we use a single mass-spring-damper chain example from the PHS benchmark collection [Sch22]. We conclude with an outlook.

Identification problem
For a given input u : [0, T] → R^m, a data set y_data : [0, T] → R^m, which can consist of continuous data on the interval [0, T] for T > 0 or of a continuous interpolation of measurements at discrete time points, and reference values for the identification w_ref, we consider the identification problem with cost functional subject to the state constraint (2), where w = (J, R, Q, B, x_0) contains the matrices and initial data to be identified, having the properties (3). If no information on reference data is available, we recommend setting λ = 0. Note that only output data and, for λ > 0, reference values enter the cost functional. Hence, the dimension n of the internal state is unknown. We assume in the following that n is chosen a priori and remark that a deliberately small choice can also be interpreted as a model order reduction for the internal state.
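The state constraint (2) is not reproduced in this excerpt; a linear PHS in standard form reads ẋ = (J − R)Qx + Bu with output y = B⊤Qx. The following sketch simulates such a system with an explicit Euler scheme of the kind used later in the paper. The 2×2 matrices and the sinusoidal input are purely illustrative and not taken from the paper's experiments.

```python
import math

def mat_vec(A, v):
    # matrix-vector product for list-of-lists matrices
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def simulate_phs(J, R, Q, B, x0, u, T=1.0, dt=1e-3):
    """Explicit Euler for x' = (J - R) Q x + B u(t) with output y = B^T Q x."""
    n = len(J)
    m = len(B[0])
    # assemble A = (J - R) Q once
    A = [[sum((J[i][k] - R[i][k]) * Q[k][j] for k in range(n)) for j in range(n)]
         for i in range(n)]
    x = list(x0)
    outputs = []
    for s in range(int(round(T / dt))):
        Qx = mat_vec(Q, x)
        outputs.append([sum(B[i][j] * Qx[i] for i in range(n)) for j in range(m)])
        step = [ai + bi for ai, bi in zip(mat_vec(A, x), mat_vec(B, u(s * dt)))]
        x = [xi + dt * si for xi, si in zip(x, step)]
    return outputs

# illustrative lossless 2x2 system: J skew-symmetric, R = 0, Q = B = I
J = [[0.0, 1.0], [-1.0, 0.0]]
R = [[0.0, 0.0], [0.0, 0.0]]
Q = [[1.0, 0.0], [0.0, 1.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
ys = simulate_phs(J, R, Q, B, [0.0, 0.0], lambda t: [math.sin(2 * math.pi * t), 0.0])
```

Since m components of the output are observed but n internal states evolve, the dimension n of the surrogate is a free modelling choice, exactly as discussed below.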
For notational convenience, we define the following. In applications we often have m ≤ n, hence we obtain only partial information about the internal state x. In particular, we have no information about the initial condition x_0. This is the reason why we include the initial data in the identification process for w. Although we use techniques from optimal control theory, we may refer to w as parameters in the following and introduce the parameter space with the set of admissible parameters U_ad. Moreover, we define the output space Y = H^1((0, T), R^m).
For later reference we define the identification task at hand.
Problem 1. We seek to find system matrices and initial data, w = (J, R, Q, B, x_0), which solve the minimization problem (IdP).
In [GJT23], we proposed a sensitivity-based approach to identify the system matrices of a port-Hamiltonian system and argued that this is a valid ansatz for small-scale systems, as it requires solving auxiliary problems for each basis element of the matrix spaces. Here, we derive an adjoint-based approach. Before we begin the derivation of the algorithm, we analyze the well-posedness of the identification problem. First, we prove that the state solution depends continuously on the data.
Lemma 2. For every w ∈ U_ad and u ∈ C([0, T], R^m) the state equation (2) admits a unique solution x ∈ C^1([0, T], R^n). Moreover, the solution depends continuously on the data; in more detail, for w, w' ∈ U_ad and corresponding solutions x, x' there exists a constant C > 0 bounding the difference of the solutions in terms of the difference of the data. Proof. For given w ∈ U_ad and u ∈ C([0, T], R^m) the existence of a unique solution to (2) follows by standard ODE theory [Tes12]. For the second statement we first estimate ∥x − x'∥_{L^2((0,T),R^n)}: an application of the Gronwall inequality and integration over [0, T] yield (4). Moreover, an analogous estimate for the derivatives, again using the Gronwall inequality and integration over [0, T], yields (5). Adding (4) and (5) and taking the square root yields the desired result.
The well-posedness of the state equation allows us to introduce the solution operator S. Moreover, we define the reduced cost functional Ĵ. Both will be helpful in the proof of the well-posedness result of the identification problem and also play a major role in the derivation of the gradient-descent algorithm later on. The next ingredient for the well-posedness result is the weak continuity of the state operator.
Lemma 3. The state operator e defined in (6) is weakly continuous.
Proof. We consider {x_n}_n ⊂ H^1((0, T), R^n) with x_n ⇀ x and {w_n}_n ⊂ U_ad with w_n ⇀ w. Note that U_ad is finite dimensional, thus weak and strong convergence coincide. Then, testing with φ ∈ L^2((0, T), R^n), the weak convergence of x_n and w_n implies that all terms tend to zero as n → ∞. Note that weak convergence of {x_n}_n implies boundedness of the sequence {x_n}_n. Moreover, by the embedding H^1((0, T), R^n) ↪ C([0, T], R^n) we obtain x_n(0) → x(0), and further (x_0)_n → x_0 by the convergence of w_n. Altogether, this yields the desired result.
Theorem 4. Let λ > 0 or let the minimizing sequence of (IdP) be bounded. Then the identification problem (IdP) admits a minimizer w* ∈ U_ad.
Proof. We follow the lines of [Trö10,HPUU08] and consider a minimizing sequence {w_n}_n. If λ > 0, Ĵ(w) is coercive w.r.t. w and hence the minimizing sequence is bounded. In the other case the minimizing sequence is bounded by assumption. As U_ad is a subset of a reflexive, finite-dimensional space, we can extract a convergent subsequence {w_{n_k}}_k with w_{n_k} → w*. Note that w = 0 implies a vanishing state; hence the continuous dependence on the data shown in Lemma 2 implies the boundedness of {S(w_{n_k})}_k ⊂ H^1((0, T), R^n) and we can extract a weakly convergent subsequence S(w_{n_{k_ℓ}}) ⇀ x in H^1((0, T), R^n). The weak continuity of the state operator shown in Lemma 3 yields e(S(w_{n_{k_ℓ}}), w_{n_{k_ℓ}}) ⇀ e(x, w*), and the weak lower semicontinuity of the norm allows us to estimate, which proves that S(w*) = x. We note that the weak lower semicontinuity of J implies the weak lower semicontinuity of Ĵ. This proves the existence of a minimizer.
Remark. We want to emphasize that even though the state equation is linear in x, it is non-linear in w. Thus, in general, the identification problem is nonconvex and we cannot expect to obtain a uniqueness result for (IdP).
Here, we aim for a general approach, also feasible for high-dimensional systems, and therefore discuss an adjoint-based approach in the following. A challenge in this context is the structure of the system matrices (3), which we aim to preserve in each step of the identification process. To this end, we employ techniques from optimization on manifolds [AMS08] and compute the respective retractions for the system matrices to preserve their structure in each iteration. To the authors' knowledge, there is no explicit formula for the retraction in the space of semi-definite matrices. Hence, we use the flat metric for R; and since the space of skew-symmetric matrices as well as R^{n×m} and R^n are vector spaces, we are naturally in the flat case for J, B and x_0, respectively.
One of our objectives is the derivation of a gradient-based algorithm for the identification. We derive the adjoint-based gradient descent scheme step-wise, i.e., we discuss the computation of the first-order optimality system and then show that the optimality condition of this system coincides with the gradient of the reduced cost functional.

First-order optimality system
The main goal of this section is the derivation of the gradient of the reduced cost functional Ĵ. We follow a standard approach from optimization with differential equations, see for example [HPUU08], and use the Lagrangian of the constrained dynamics, which is given by L(x, w, p) = J(x, w) − ⟨e(x, w), p⟩, where p denotes the Lagrange multiplier or adjoint variable. The first-order optimality system is derived by solving dL = 0. As d_x L(x, w, p)[h_x] = 0 must hold for arbitrary h_x, we can identify the adjoint equation. The derivative d_p L(x, w, p)[h_p] = 0 yields the state equation, and d_w L(x, w, p)[h_w] = 0 gives the optimality condition. For simplicity we split the derivation into the different matrices. Altogether, we obtain the following result.
Theorem 5. A stationary point w̄ = (Q̄, R̄, J̄, B̄, x̄_0) of Problem 1 satisfies the optimality condition (8), where p satisfies the adjoint equation (7) and x̄ the state equation (2) with output ȳ, and where a ⊗ b denotes the dyadic product ab^⊤ for a, b ∈ R^n.
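The adjoint equation (7) is referenced above but not reproduced in this excerpt. For the generic output-tracking cost ½∫|y − y_data|² subject to ẋ = Ax + Bu, y = Cx (here A = (J − R)Q and C = B⊤Q), the adjoint has the standard form −ṗ = A⊤p + C⊤(y − y_data) with terminal condition p(T) = 0 and is integrated backwards in time. A minimal sketch under this assumption:

```python
def transpose(A):
    return [list(col) for col in zip(*A)]

def solve_adjoint(A, C, y, y_data, T=1.0, dt=1e-3):
    """Integrate -p' = A^T p + C^T (y - y_data) backwards from p(T) = 0
    with an explicit Euler scheme in reverse time; y, y_data are sample lists."""
    n = len(A)
    At, Ct = transpose(A), transpose(C)
    steps = int(round(T / dt))
    p = [0.0] * n                       # terminal condition p(T) = 0
    ps = [None] * steps
    for s in range(steps - 1, -1, -1):  # march backwards in time
        res = [yi - di for yi, di in zip(y[s], y_data[s])]  # output residual
        rhs = [sum(At[i][j] * p[j] for j in range(n))
               + sum(Ct[i][k] * res[k] for k in range(len(res)))
               for i in range(n)]
        p = [pi + dt * ri for pi, ri in zip(p, rhs)]
        ps[s] = list(p)
    return ps

# scalar sanity check: A = 0, C = 1, constant residual 1 gives p(0) ~ T
A, C = [[0.0]], [[1.0]]
y = [[1.0]] * 1000
ps = solve_adjoint(A, C, y, [[0.0]] * 1000)
```

If the fitted output matches the data exactly, the residual vanishes and p ≡ 0, so the gradient of the tracking term vanishes as well — consistent with stationarity.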
To stay on the respective manifolds in each iteration of the identification procedure, we use the retractions from manopt [BMAS14] in the implementation.

Relationship to the gradient of the reduced cost functional
We are now well-prepared to identify the gradient of our identification problem. We discuss the case including the estimation of the initial state; if this is not needed, the gradient update with respect to x_0 can be neglected. We have already seen that the solution operator allows us to define the reduced cost functional, which leads us to a reformulation of (IdP) as an unconstrained optimization problem, i.e., the state constraint is treated implicitly. The gradient we derive in the following is the one of the reduced cost functional. In fact, the whole gradient-descent algorithm will only vary w and thereby vary the state implicitly.
Let w = (Q, R, J, B, x_0) be the parameters or, to be more precise, the matrices and initial data to be identified. We denote by ⟨·, ·⟩_{W*,W} : W* × W → R the dual pairing of W and its dual W*. Moreover, A* denotes the adjoint of the operator A. To identify the gradient of the reduced cost functional we first use the derivative of the state operator, which allows us to find (10). In the previous subsection we already computed the adjoint equation and the gradient components corresponding to (Q, R, J, B, x_0), respectively.
We emphasize that the computation (10) shows that the left-hand side of system (8) is in fact the gradient of the reduced cost functional at w. We will use this information in the gradient-descent algorithm proposed in the following section.

Algorithm and numerical schemes
An optimal solution has to satisfy the conditions in Theorem 5 all at once. However, due to the forward/backward structure of the coupled state and adjoint equations, it is in general difficult to solve the system directly. This is the reason why we propose an iterative approach based on the gradient (8). In fact, for an initial guess of system matrices and initial condition w we solve the state equation (2) and obtain S(w); using this information we solve the corresponding adjoint equation (7). The state and adjoint solutions allow us to evaluate the gradient at w. With −∇Ĵ(w) we update our initial guess for the second iteration using a step size obtained by the Armijo rule with initial step size σ_0 [HPUU08]. The following iterates are based on a conjugate-gradient (CG) update as proposed in [Sat22]; in more detail, the search direction (11) is given in terms of a scaling parameter and a transport map T^(k) that transports the tangent vector at w_k to a corresponding tangent vector at w_{k+1}. This transport is required in order to have well-defined CG steps in the manifold setting, see [Sat22] for more details. The transport maps for the respective manifolds implemented in the manopt package [BMAS14] are used. In the implementation, we reset the CG step every fifth iteration.
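The Armijo rule mentioned above can be sketched as a backtracking line search in the flat (Euclidean) case. The paper only fixes the initial step size σ_0; the shrink factor 0.5 and sufficient-decrease constant 1e-4 below are common textbook defaults, not values from the paper.

```python
def armijo_step(f, w, grad, direction, sigma0=1000.0, beta=0.5, c=1e-4,
                max_backtracks=60):
    """Backtracking Armijo rule: shrink sigma until sufficient decrease holds."""
    fw = f(w)
    # directional derivative <grad, direction> (flat case, Euclidean inner product)
    slope = sum(g * d for g, d in zip(grad, direction))
    sigma = sigma0
    for _ in range(max_backtracks):
        w_new = [wi + sigma * di for wi, di in zip(w, direction)]
        if f(w_new) <= fw + c * sigma * slope:   # sufficient decrease
            return sigma, w_new
        sigma *= beta
    return sigma, w  # fallback: no acceptable step found

# illustrative use on f(w) = |w|^2 with the steepest-descent direction
f = lambda w: sum(wi * wi for wi in w)
w = [3.0, -4.0]
grad = [2.0 * wi for wi in w]
direction = [-g for g in grad]
sigma, w_new = armijo_step(f, w, grad, direction)
```

In the manifold setting of the paper, the straight-line update w + ση is replaced by the retraction of ση at w; the acceptance test is unchanged.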
The iteration is stopped once an appropriate stopping criterion is fulfilled; as stopping criterion we check the relative cost. Further options are possible: we can stop if the gradient vanishes numerically, i.e., ∥∇Ĵ(w)∥ < ϵ for 0 < ϵ ≪ 1, or if the update from one gradient step to the next, ∥∇Ĵ(w_k) − ∇Ĵ(w_{k+1})∥ < ϵ, is sufficiently small. Additionally, we can impose a maximal number of iterations. We summarize the steps in Algorithm 1.
Algorithm 1: Conjugate-gradient algorithm for the identification process
Data: initial guess w, algorithmic parameters, stopping criterion
Result: optimized matrices and initial data w
1) solve state equation → S(w);
2) solve adjoint equation → p;
3) evaluate gradient (8) → ∇Ĵ(w);
4) compute the CG direction (11) → η;
5) find admissible step size with Armijo rule → σ;
6) update control w → w + ση;
7) if stopping criterion is not fulfilled → go to 1); else return optimal control;
For the numerical results presented in the next section, we use the following algorithmic parameters unless explicitly stated otherwise: a maximal budget of 100 CG iterations and an initial step size of 1000 for the Armijo rule. The time step size of the explicit Euler discretizations of the state and adjoint ODE, respectively, is 0.001, and the upper bound of the time interval considered for the ODEs is T = 1. The integrals in the gradient expressions are computed with a simple left-endpoint approximation.
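The left-endpoint approximation used for the gradient integrals is the left Riemann sum; with the step size dt = 0.001 and T = 1 from above it reads:

```python
def left_endpoint_integral(f, T=1.0, dt=1e-3):
    """Left-endpoint (left Riemann) rule: sample f at the left end of each
    of the int(T/dt) subintervals and weight each sample by dt."""
    steps = int(round(T / dt))
    return dt * sum(f(s * dt) for s in range(steps))
```

For f(t) = t on [0, 1] the rule returns 0.4995, i.e., the exact value 0.5 minus dt/2 — a first-order error consistent with the accuracy of the explicit Euler scheme, which is why the paper notes that higher accuracy in the cost cannot be expected.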
To avoid numerical errors interfering with the structure of the matrices, we check for skew-symmetry or symmetry, respectively, and if needed overwrite the new iterate for J by J → (1/2)(J − J^⊤), and analogously for the symmetric matrices. As we assume to have no knowledge about the systems, we use no reference values for the tests and set J_2(w) ≡ 0. Moreover, we set m = 2 for all the numerical tests and use the inputs shown in Figure 1, where chirp denotes the chirp signal from the scipy.signal toolbox with frequency 3 at time t = 0 and frequency 20 at time t = 1, and f_0 = 3. We initialize B with diag(1, 3) in the first m × m entries and zero everywhere else. The matrix Q representing the Hamiltonian is initialized as the n × n identity matrix. For R we set the first 2 × 2 block to the identity and zero everywhere else. Of course it is nontrivial to find a good initial guess for R in the dynamics as, in particular, the number of dissipative elements is unknown. Furthermore, we assume that the system is in equilibrium if no input is applied; therefore we refrain from estimating the initial state x_0 and set x_0 ≡ 0 in most of the examples. To generate a feasible initial J we draw an n × n matrix with independent uniformly distributed entries in [−1, 1], take the upper part of this matrix (starting with the first off-diagonal) and fill the lower part of the matrix with the negative of the transpose of the upper part.
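The structure projections and the random initialization of J described above can be sketched as follows:

```python
import random

def skew_project(J):
    """Map J to its skew-symmetric part (J - J^T) / 2."""
    n = len(J)
    return [[(J[i][j] - J[j][i]) / 2.0 for j in range(n)] for i in range(n)]

def sym_project(R):
    """Map R to its symmetric part (R + R^T) / 2 (the analogous overwrite
    for the symmetric matrices)."""
    n = len(R)
    return [[(R[i][j] + R[j][i]) / 2.0 for j in range(n)] for i in range(n)]

def random_skew(n, rng=random):
    """Fill the strict upper triangle with Uniform[-1, 1] entries and mirror
    them with a sign flip, as described for the initial guess of J."""
    J = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            J[i][j] = rng.uniform(-1.0, 1.0)
            J[j][i] = -J[i][j]
    return J
```

Both projections are idempotent, so applying them after every iterate only removes floating-point drift and never changes a matrix that already has the correct structure.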
To check the robustness of the identified system, we perform a cross validation using the test signal u_test, where square_0.1 is the square signal from the scipy.signal toolbox with duty parameter 0.1.
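For readers without scipy at hand, the two signal generators can be reproduced with small stand-ins that mirror the scipy.signal conventions (cosine-phased linear chirp; square wave with period 2π and a duty fraction). These are illustrative re-implementations, not the paper's code.

```python
import math

def chirp(t, f0=3.0, f1=20.0, t1=1.0):
    """Linear chirp sweeping from f0 at t=0 to f1 at t=t1
    (cosine convention, mirroring scipy.signal.chirp with method='linear')."""
    return math.cos(2.0 * math.pi * (f0 * t + 0.5 * (f1 - f0) / t1 * t * t))

def square(t, duty=0.1):
    """Square wave with period 2*pi: +1 during the first `duty` fraction of
    each period, -1 otherwise (mirroring scipy.signal.square)."""
    phase = (t % (2.0 * math.pi)) / (2.0 * math.pi)
    return 1.0 if phase < duty else -1.0
```

With duty 0.1 the square wave is +1 for only a tenth of each period, so the cross-validation input excites the system quite differently from the smooth chirp used for training.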

Proof of concept by numerical tests
To underline the feasibility of the proposed approach, we show three test cases. First, we generate synthetic data and show that Algorithm 1 is able to find system matrices that approximate the data set. Then we repeat the test with a randomly perturbed data set. Our third test case examines the performance of the proposed algorithm for model order reduction; here we apply it to the single mass-spring-damper chain from the PHS benchmark collection [Sch22]. The code is publicly available on GitHub.

Gradient test
To validate our implementation, we perform a gradient check using finite differences in arbitrary directions h with elements in the respective tangent spaces. As we are in the manifold setting, the finite difference is formed with the retraction R_w at w. Since w = (J, R, Q, B, x_0), the retractions and also the inner product (·, ·)_w are defined element-wise. For details we refer to the code gradient_check.py on GitHub or [BMAS14]. Also the directions in h depend on the respective manifolds of J, R, Q, B and x_0. We first test the finite differences where only one of the parameters J, R, Q, B, x_0 is varied, then we vary all parameters at once. For both tests, the relative difference between the directional derivative computed with the inner product and the (sum of the) finite differences is below 2%.
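In the flat case, where the retraction is simply addition, such a gradient check reduces to comparing the directional derivative ⟨∇Ĵ(w), h⟩ with a finite difference. A minimal sketch on a toy cost with known gradient (the paper's check uses the manifold retractions instead of plain addition):

```python
def fd_gradient_check(f, grad_f, w, h, eps=1e-6):
    """Relative difference between <grad f(w), h> and a central finite
    difference of f along h, in the flat case (retraction = addition)."""
    analytic = sum(g * hi for g, hi in zip(grad_f(w), h))
    wp = [wi + eps * hi for wi, hi in zip(w, h)]
    wm = [wi - eps * hi for wi, hi in zip(w, h)]
    fd = (f(wp) - f(wm)) / (2.0 * eps)
    return abs(analytic - fd) / max(abs(fd), 1e-14)

# toy cost f(w) = |w|^2 with known gradient 2w
f = lambda w: sum(wi * wi for wi in w)
rel = fd_gradient_check(f, lambda w: [2.0 * wi for wi in w],
                        [1.0, -2.0, 0.5], [0.3, 0.1, -0.7])
```

For a quadratic cost the central difference is exact up to rounding, so the relative difference is tiny; for the PHS cost functional, agreement to within a few percent, as reported above, validates the adjoint gradient.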

Synthetic data (deterministic)
In the following we show results for a synthetic data set with n = 5 and m = 2. First, we fix Q and x_0 and fit J, R and B. Then we fix all matrices and just fit the initial value of the internal dynamics x_0, which is in practice unknown.
Identification of J, R, B and Q
In this section we fit the system matrices J, R, B and Q. In the first experiment, we neglect Q, and in the second test, we include Q and penalize deviations of Q from the identity matrix.
In the first test, we assume to have no information about the system, hence we set λ = 0 and fit the matrices J, R, B. In particular, we fix a positive-definite matrix Q which is unchanged during the identification. This is justified as Q always multiplies x in the dynamics; hence learning J, R, B will cover all necessary information about the system. The reference output is generated using random matrices with the appropriate structure. The algorithm terminated after the maximal number of iterations, as can be seen in Figure 2, where we show the evolution of the cost functional on the left-hand side and additional information on the difference between the fitted and reference output in the maximum norm. Note that we cannot expect higher accuracy, since the error in the cost is of the order of the accuracy of the Euler scheme that we use to solve the state and adjoint systems. The reduction of the cost functional is above 99%.
In Figure 3 we show the output obtained with the initial matrices before the identification in blue, the orange graph shows the output of the fitted system, and the green dashed line shows the output obtained with the reference data. As the error values above already indicated, the fitted output matches the reference output very well.
We close the numerical analysis of the identification of J, R and B with a cross validation test. The input signal u_test is now applied to the fitted and reference systems and we compare the corresponding outputs. Figure 4 shows the results of this cross validation graphically. In numbers, we obtain for the L2-difference ∥y − y_test∥_{L²((0,1);R²)} = 1.63e-1 and for the maximum norm ∥y − y_test∥_∞ = 2.96e-1. In the following, we learn Q in addition to the other parameters learned above. To stabilize the identification process, we choose λ = 0.001 and include the regularization term J_2(Q) in the cost functional. On the one hand, this additional term stabilizes the identification process; on the other hand, it tries to keep Q close to the identity, which is away from the true value diag(1, 2, 3, 4, 5).
The algorithm terminated after the maximal number of iterations, as can be seen in Figure 5, where we show the evolution of the cost functional on the left-hand side and additional information on the difference between the fitted and reference output in the maximum norm. The reduction of the cost functional is above 97%.
In Figure 6 we show the output obtained with the initial matrices before the identification in blue, the orange graph shows the output of the fitted system, and the green dashed line shows the output obtained with the reference data. As the error values above already indicated, the fitted output matches the reference output very well.
We close the numerical analysis of the identification of J, R, B and Q with a cross validation test. The input signal u_test is now applied to the fitted and reference systems and we compare the corresponding outputs. Figure 7 shows the results of this cross validation graphically. In numbers, we obtain for the L2-difference ∥y − y_test∥_{L²((0,1);R²)} = 2.41e-1 and for the maximum norm ∥y − y_test∥_∞ = 4.16e-1. As expected, the penalization of Q stabilizes the parameter fitting process, but as the true value of Q is in general unknown, it may cause an error by keeping Q close to the identity.

Identification of x_0
Now we assume the system matrices J, Q, R and B to be known and we are interested in fitting the initial value of the internal state x_0, which is in practice unknown. The parameters are chosen as before.
The algorithm terminated after the maximal number of iterations, as can be seen in Figure 8, where we show the evolution of the cost functional on the left-hand side and additional information on the difference between the fitted and reference output in the maximum norm. Note that we cannot expect higher accuracy, since the error in the cost is of the order of the accuracy of the Euler scheme that we use to solve the state and adjoint systems. The reduction of the cost functional is above 99%.
In Figure 9 we show the output obtained with the initial matrices before the identification in blue, the orange graph shows the output of the fitted system, and the green dashed line shows the output obtained with the reference data. As the error values above already indicated, the fitted output matches the reference output very well.
We close the section on the deterministic data with a cross validation test. The input signal u_test is now applied to the fitted and reference systems and we compare the corresponding outputs. Figure 10 shows the results of this cross validation graphically. In numbers, we obtain for the L2-difference ∥y − y_test∥_{L²((0,1);R²)} = 4.58e-2 and for the maximum norm ∥y − y_test∥_∞ = 8.75e-2.

Synthetic data (randomly perturbed)
Let us consider the same setting as in the previous subsection but with randomly perturbed output data. In every time step we add independent normally distributed vectors n_i ∈ R^m, leading to the perturbed data given by y_data,σ(t_i) = y_data(t_i) + σ n_i.
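The perturbation y_data,σ(t_i) = y_data(t_i) + σ n_i can be generated as follows (the fixed seed is only for reproducibility and is an implementation choice, not from the paper):

```python
import random

def perturb_output(y_data, sigma, seed=0):
    """Add i.i.d. standard-normal noise scaled by sigma to every output sample."""
    rng = random.Random(seed)
    return [[yi + sigma * rng.gauss(0.0, 1.0) for yi in row] for row in y_data]

# example: perturb a zero output trajectory with noise level sigma = 0.05
y = [[0.0, 0.0] for _ in range(5)]
yp = perturb_output(y, 0.05)
```

The three noise levels σ = 0.01, 0.05, 0.25 studied below correspond to increasingly severe corruption of the training data.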
Figure 11 shows the evolution of the cost functionals over the identification iterations for the different noise levels. For σ = 0.25 the cost is almost constant; hence the noise overlays the information. However, the error values in Table 1 show that the identification remains stable. This is also true for the cross validation shown in Figure 12 and the corresponding errors in Table 2. It is interesting to see that the robustness in the cross validation test increases with the noise level in the training.

Model order reduction test with single mass-spring-damper chain
The last test case we consider is taken from the PHS benchmark collection [Sch22], and we aim to investigate the robustness of our approach in the case of model order reduction, i.e., if the model size n deviates from that of the real data. We generated the standard example with 50 mass-spring-damper cells that are each connected with their neighboring masses by springs. The last mass is connected to a wall via a spring, while at the two first masses external forces are applied. These are two-dimensional inputs, leading to m = 2. Moreover, each mass is connected with the ground by a damper. All masses are set to 4, the damping coefficients to 1 and the stiffnesses to 4. The output y consists of the velocities of the controlled masses. For an illustration and more details we refer to the documentation of the MSD chain example referenced in [Sch22]. We extracted the system matrices J, Q, R ∈ R^{100×100} and the input matrix B ∈ R^{100×2} from the Julia implementation and stored them in Python npy-format to use them with our implementation of the algorithm.

σ: 0.01 / 0.05 / 0.25
∥y − y_test∥_{L²((0,1);R²)}: 2.09e-1 / 1.89e-1 / 1.01e-1
∥y − y_test∥_∞: 4.1e-1 / 3.58e-1 / 2.04e-1
Table 2: Errors obtained in the cross validation for different noise levels.
In Figure 14 we show the evolution of the cost functional over the identification iterations and the relative error of the output in the maximum norm for different model sizes n, with true reference size n = 100. As expected, models with larger n are able to capture the dynamics better; however, even for n = 20 the error after the identification is below 10%, and the values for n = 60, 80 and 100 are below 5%. While the cost corresponds to the L²-norm, we show the relative error of the output measured in the maximum norm on the right panel of Figure 14. While the relative error before the identification is about 359% for all model sizes, the error after the identification ranges between 5% and 10%. The exact values (9.48%, 7.47%, 4.51%, 4.45%, 3.93%, 6.66%) are shown in Table 3 and the fitted outputs are shown in Figure 15.

Conclusion and outlook
In this paper we have proposed a way of calibrating linear PHS models from given input-output time data in a direct approach: structure preservation defines a constrained optimization problem, which is solved by an adjoint-based conjugate gradient descent algorithm. There is no need to first derive a best-fit linear state-space model and then, in a post-processing step, find the nearest PHS realization. In addition, there is no need for a parametrization to transfer the constrained optimization problem into an unconstrained one. Numerical results underpin the feasibility of our approach for both synthetic data and a model reduction example employing a mass-spring-damper benchmark problem.

Declarations
• Funding: Not applicable
• Competing interests: The authors declare that they have no competing interests.
• Ethics approval: Not applicable

Figure 1: Signal u for the identification (left) and signal u_test for cross validation (right).

Figure 2: Left: Evolution of the cost over the iterations of the identification in log-plot. Right: cost values before/after the identification and difference of fitted output and reference output in the maximum norm.

Figure 3: Left: First component of the output before the identification (initial), after the identification (fitted) and for the reference matrices (reference) used for the generation of the synthetic data. Right: Second component of the output before the identification (initial), after the identification (fitted) and for the reference matrices (reference) used for the generation of the synthetic data.

Figure 4: Left: First component of the output obtained in the cross validation. Right: Second component of the output obtained in the cross validation.

Figure 5: Left: Evolution of the cost over the iterations of the identification in log-plot. Right: cost values before/after the identification and difference of fitted output and reference output in the maximum norm.

Figure 6: Left: First component of the output before the identification (initial), after the identification (fitted) and for the reference matrices (reference) used for the generation of the synthetic data. Right: Second component of the output before the identification (initial), after the identification (fitted) and for the reference matrices (reference) used for the generation of the synthetic data.

Figure 7: Left: First component of the output obtained in the cross validation. Right: Second component of the output obtained in the cross validation.

Figure 8: Left: Evolution of the cost over the iterations of the identification in log-plot. Right: cost values before/after the identification and difference of fitted output and reference output in the maximum norm.

Figure 9: Left: First component of the output before the identification (initial), after the identification (fitted) and for the reference matrices (reference) used for the generation of the synthetic data. Right: Second component of the output before the identification (initial), after the identification (fitted) and for the reference matrices (reference) used for the generation of the synthetic data.

Figure 10: Left: First component of the output obtained in the cross validation. Right: Second component of the output obtained in the cross validation.

Figure 13: Left: first component of the initial output for different model sizes n. Right: second component of the initial output for different model sizes n.

Figure 13 shows the output for the initial configuration before the identification for different model sizes n and the reference output as a black dashed line.

Figure 14: Left: evolution of the cost functional over the identification iterations for different model sizes n. Right: relative differences of the model output y_n and the reference output y_ref measured in the maximum norm for different model sizes n.

Figure 15: Left: first component of the fitted output for different model sizes n. Right: second component of the fitted output for different model sizes n.

Table 1: Additional information on the identification with different noise levels.

Table 3: Relative errors of the output measured before and after the identification.