Reset-Free Data-Driven Gain Estimation: Power Iteration using Reversed-Circulant Matrices

A direct data-driven iterative algorithm is developed to accurately estimate the $H_\infty$ norm of a linear time-invariant system from continuous operation, i.e., without resetting the system. The main technical step involves a reversed-circulant matrix that can be evaluated in a model-free setting by performing experiments on the real system.


Introduction
Direct data-driven approaches to estimate system properties, such as system norms, enable the extraction of information while avoiding the need to model the entire system.In [6,Section 12.2], an iterative approach to estimate the H ∞ norm directly from data is developed, which is extended and analyzed in [11], [13], [10], [7].In sharp contrast, model-based approaches first estimate a model, either a parametric model with subsequent norm calculation [3], or a nonparametric model [5].These model-based approaches require substantial user intervention compared to data-driven approaches.Furthermore, these data-driven approaches can be viewed as iterative experiment design, see [11,Section 4.2].
A key assumption that underlies the developments in these direct data-driven norm estimation approaches is that the system needs to be reset after each iteration, which tacitly leads to substantial theoretical and practical drawbacks.The main reason to assume a full reset is that it allows to write the system as a linear system of equations, enabling a direct application of a power iteration.If such a reset is not performed, then past iterations still have effect during the current iteration, which is further complicated by a nonlinear normalization [11] at each iteration.A reset directly leads to a theoretical Email addresses: t.a.e.oomen@tue.nl(Tom Oomen), crro@kth.se(Cristian R. Rojas).
drawback, since the convergence rate of the estimated norm in these reset-based approaches is O(N −2 ) [14], where N denotes the experiment length of a single iteration.A frequency-domain sampling approach would lead to faster convergence results by eliminating transient effects, see [5], but still requires resets and does not benefit from the earlier mentioned optimal experiment design compared to iterative schemes.As major practical drawback, the application of the required resets requires additional experimental time and effort, which is application dependent but typically intensive.Although many approaches have been developed that enable data-driven estimation of system properties, these approaches typically require the system to be reset every iteration.The aim of this paper is to develop an approach that has improved accuracy and facilitates practical implementation by avoiding resets.
The main contribution of this paper is a reset-free iterative data-driven algorithm.The main technical result is a periodic response matrix that is transformed into a new reversed-circulant matrix.Specific focus is on a H ∞ -norm estimation procedure, similar to [6, Section 12.2], [11], [13], [10], [7] yet reset-free.In particular, by analyzing the eigenvalue decomposition of the reversedcirculant matrix, an iterative power iteration method is devised that estimates the H ∞ -norm using real-valued inputs.
The main direct benefit of the proposed method is a data-driven algorithm that provides more accurate estimates of the H ∞ -norm compared to reset-based approaches, which is confirmed in a simulation example.In addition, there is a substantial benefit in practical applications, including the one reported in [9], where the system does not need to be reset through a cumbersome re-initialisation and homing procedure, reducing experimental time and user effort.The broader significance of the paper lies in advocating the possibility that resetfree, i.e., continuous operation, may have theoretical and practical advantages for general data-driven estimation schemes, cf.[7] for a recent overview, which may unnecessarily restrict to reset-based approaches that are unnecessarily theoretically conservative and experimentally impractical.Note that these benefits are recognised in certain specific control approaches, e.g. in [12] and references therein, where also continuous operation is advocated.It is thus expected that continuous operation may have benefits for broader classes of data-driven algorithms, e.g., [6], [1].

Iterative experiments
Consider the discrete-time single-input single-output linear time-invariant system P (z) given by the statespace realization with matrices of appropriate dimensions and λ max (A) < 1.Data are collected in batches of length N ∈ Z >0 , where the j th batch or iteration is denoted by Consequently, by processing N data batch-wise,

Traditional data-driven power iterations via resetting
In this section, the main idea behind traditional datadriven norm estimation is outlined, including [11].The system (1) is reset after each experiment, i.e., hence (1) collapses to Next, the induced 2-norm is computed as (4) Note that where T N ∈ R N ×N is the involutory permutation matrix , which can be interpreted as a time-reversal operator.
Next, (5) reveals that T N J is symmetric, hence As a result, J i2 equals the largest eigenvalue λ max (T N J).In addition, λ max (T N J) → P ∞ for N → ∞, see [11,Theorem 3] for a proof, where for SISO systems The main idea of the data-driven approach is enabled by (6).In particular, (6) enables that all knowledge of the system J can directly be evaluated on the system instead of using model knowledge.In particular, the power method [15, Section 9.1], which is an iterative algorithm to determine the maximal eigenvalue of a matrix, can be directly applied to the real system J by applying a certain input and measuring the output.Then, the output is time-reversed through the operation T N and re-applied as input.By initializing with a random u 0 , the input lim j→∞ u j converges to an eigenvector corresponding to λ max = max |λ(JT N )|.As a result, the maximum eigenvalue directly follows from a Rayleigh quotient.

Problem formulation
Traditional iterative methods in Section 2.2 are based on the reset in (3) and have several shortcomings.First, lim N →∞ λ max (JT N ) = P ∞ is a limit result, with a relatively slow convergence rate of O(N −2 ), where N denotes the experiment length, see, e.g., [14] for a recent investigation of this and [5,8] for approaches that eliminate such transients effects in nonparametric modelbased and model-free settings, respectively.Second, the reset requires substantial experimental effort to stop the system completely.The main problem addressed in this paper is to develop a direct data-driven H ∞ -norm estimation procedure for the system (1) that alleviates the shortcomings associated with resetting (3).

Reset-free data-driven estimation approach
The main idea is to devise a data-driven mechanism that excites the system (1) with a periodic input, i.e., Assuming ρ(A) < 1, hence ρ(F ) < 1, this leads to and thus Hence, continuous reset-free operation essentially enables experimenting on the N ×N periodic response matrix H(I −F ) −1 G+J.The maximum gain of this matrix corresponds to the H ∞ norm for N → ∞, as is revealed next.
Theorem 1 Consider the matrix H(I − F ) −1 G + J defined by (1), and let Note that the DFT matrix F is unitary, i.e., F * F = F F * = I.Next, circulant matrices are defined, since these matrices play a key role in the proof of Theorem 1 and the main procedure of this paper.
is a circulant matrix.
, and PROOF.[Lemma 3] For the lower-triangular part, where J is possibly nonzero, the (p, q) th element (p > q) of H( where use is made of (I −A N ) −1 A = A(I −A N ) −1 , which can be verified by premultiplying and postmultiplying both sides by I − A N .For the upper-triangular part, the (p, q) th element (q > p) of Comparing these entries to those in (10) reveals that they indeed lead to a circulant matrix.Next, (11) directly follows by noting that a k is at the (1, i + 1) location of (10) for k > 0, while for a 0 an additional D term is included.Finally, any circulant matrix satisfies the eigenvalue decomposition (12), see [2,Fact 5.16.7].
Remark 4 In Lemma 3, circularity of H(I −F ) −1 G+J follows from the fact that (2) is LTI.Lifting in (1) is often also performed for linear periodically time varying systems, in which case the matrix F * (H(I−F ) −1 G+J)F in Theorem 1 is not necessarily diagonal [4].
PROOF.[Theorem 1] Substituting a k from (11), letting ω m = 2πm N and using , which yields the desired result.✷ Compared to [11,Theorem 3], the result in Theorem 1 only involves frequency discretization, and hence transient effects do not play a role.This improves norm estimation properties for finite N , as is confirmed in Section 6.
4 The need for time reversal: Reversed-circulant matrices Motivated by Theorem 1, the central idea in this paper is to apply the power iteration to the matrix H(I − F ) −1 G + J.In the periodically operated case, there is a direct connection between the eigenvalues and the frequency response, and hence properties such as the H ∞ norm.However, the input u ∈ R N must be real-valued, while the eigenvalues of H(I −F ) −1 G+J are complex by Theorem 1.These complex eigenvalues introduce a rotation and hence convergence cannot be expected if the power iteration is applied directly to H(I − F ) −1 G + J, see [15, Section 7.12].
The following theorem is key to the subsequent steps.
For 0 < m < N/2, first assume that λ m = 0, and let The vectors v m and v −m are linearly independent (and also from the remaining vectors w m , m / ∈ {m, N − m}, as they are built only from w m and w Consider all the vectors w m for which λ m = 0.These are eigenvectors of C corresponding to the eigenvalue 0, i.e., Cw m = 0.Then, Rw m = T N Cw m = 0, so w m is also an eigenvector of R corresponding to the eigenvalue λ m = 0. Together with the previous eigenvectors, they span R N , which thus exhausts the search for eigenvectors of R.This concludes the proof.✷ Hence, Theorem 5 reveals that time-reversal essentially enforces the matrix T N (H(I − F ) −1 G + J) to be symmetric, implying real-valued eigenvalues that have the same magnitude as those of H(I − F ) −1 G + J.

Algorithm
In this section, the power iteration is implemented.Additional steps are taken to actually implement the algorithm on the matrix T N (H(I − F ) −1 G + J) to enforce the periodic response (8) and bounded inputs.To this end, let where λ ∈ R\0 is discussed in Lemma 6, and is a normalization constant to normalize the input power to one, see [11].In addition, Here, α j enforces that the input is only updated every n update iterations to ensure that a steady-state response is obtained.Note that this additional transient time is at least of the same order of magnitude as approaches based on resetting, including [11], since it takes a similar transient time to bring a system to rest.Practically, n update can be set such that the system is sufficiently in steady-state, which is similar to, e.g., periodic excitation in frequency domain system identification or the iterative inversion-based control, see [12] and references therein for details.
Lemma 6 Let u 1 be a random vector.Then, the vectors u j generated by iteration (13) converge to the eigenvector corresponding to the largest eigenvalue λ m + λ, where λ m is defined in (12).
A proof of Lemma 6 follows from [15, Section 9.3], where the randomness of the vector u 1 is used to ensure it has a component in the direction of the eigenvector corresponding to λ m + λ.The shift λ is essential to let the algorithm converge to the largest positive eigenvalue in Theorem 5, since there is always a negative one with the same magnitude.Indeed, a negative eigenvalue would violate the input in (7), since the input would start switching signs.A suitable order of magnitude for the choice regarding λ is given by P ∞ , which renders all eigenvalues λ m positive, enhancing convergence, see [11,Remark 9].If there are multiple dominant eigenvectors λ m + λ, then the eigenvector iteration will converge to a subspace that exhibits similar norm estimation properties [15,Section 9.3].Finally, it is noted that standard results apply regarding conditioning of the matrix [15, Section 9.3], yet it is emphasized that the presence of noise may be dominant in case the iteration ( 13) is applied to a physical system, see [11] for a detailed analysis.
Given the iteration ( 13), an estimator for the H ∞ norm is directly obtained from (4), see also [11], as β j = 1 N u T j y j .

Example
Consider the system see Figure 1 for a Bode magnitude plot.
First, the result of Theorem 1 is compared to the traditional reset-based appproach (6).In particular, the spectral radius λ max (F * (H(I − F ) −1 G + J)F ) is computed for different values of N .In addition, the traditional result based on resetting is computed, i.e., λ max (T N J).
The values are computed for a range of values of N .For λ max (F * (H(I − F ) −1 G + J)F ), N is increased by doubling its value, ensuring that the frequency points are retained.Furthermore, these approximations for finite N are compared to their true value P ∞ .The results are presented in Figure 2. Clearly, the value of λ max (F * (H(I − F ) −1 G + J)F ) converges much faster to the true value P ∞ as expected.In addition, the result λ max (T N J) does produce a norm estimate of zero for N < 50 due to the strict delays in (16).The resetfree estimate λ max (F * (H(I −F ) −1 G+J)F ) clearly does not suffer from these delays.
Next, the approach of Section 5 is implemented.Here, N = 50 and n update = 10.The resulting input signals are depicted in Figure 3.The input already after one update contains a dominant frequency component, which further becomes sinusoidal after several iterations.This frequency excites the resonance peak in 2. Norm estimate comparison λ max (F * (H(I −F ) −1 G+J)F) (blue), λ max (TN J) (red), and P ∞ (black).The estimate λ max (F * (H(I − F ) −1 G + J)F) converges much faster to the true system norm P ∞ , hence the reset-free approach leads to a better approximation.
Figure 1, without the transient envelope effects that typically arise in the reset-based approach of [11], see [9,Figure 11].This is also evidenced in Figure 4, where the estimated norm β j rapidly converges to the underlying value λ max (F * (H(I − F ) −1 G + J)F ).Also, the update of the input each n update = 10 iterations can be distinguished, leading to a transient in the iteration domain after each update.
Summarizing, the approach of Section 5 effectively estimates the norm directly from data through an iterative procedure.To further compare this procedure with preexisting approaches, including the reset-based approach in [11], note that J in (1) is equal to J = 0 for N = 50 due to the strict delay in (16).Consequently, a reset-based approach does not lead to a useful estimate of the gain.
The superior estimation results of the reset-free algorithm are promising for robust control and related fields.In particular, robust control typically require an accurate model error bound in terms of the H ∞ norm.The reset-free approach can be directly applied to the model error, see [9], leading to very accurate results with a reduced experimental burden to reset the system.Similarly, the algorithm enables online fault detection and isolation through suitable experiments.Finally, the algorithm can be seen as an online optimal experiment design, providing more accurate models in system identification as well as related data-driven estimation algorithms.

Conclusion
In this paper, the standard assumption regarding resetting the system at each iteration in iterative data-driven estimation algorithms is alleviated.This has both theoretical advantages in terms of more accurate estimates   and practical advantages in terms of experimental effort.Its theoretical advantages are illustrated on a data-driven H ∞ norm estimation problem.The practical advantages compared to earlier data-driven norm estimation algorithms are reduced experimental time and reduced user intervention, since it does not require cumbersome resetting and re-initialisation procedures of the experimental setup.In this respect, the reset-free method has major practical advantages compared to earlier algorithms and no foreseen disadvantages.The extension to related data-driven estimation algorithms is immediate, and may enhance theoretical and practical advantages of a large class of these approaches.
For future research, it is of interest to explore whether it is possible to reduce n update in (15).Currently, convergence is only guaranteed if n update is selected such that the input reaches a steady-state.Numerous experiments have shown that the algorithm works practically if the input in iteration j has limited leakage to iteration j + 1, in which case n update = 1 often leads to convergence.A formal convergence proof is complicated by the normalization (14), see also [11].Furthermore, bounds on the error β j − P ∞ depending on N are of interest for selecting N based on prior knowledge.

Fig. 3 .
Fig. 3. Input signals.The initial input signal u1(k) (blue) is random noise.Already after the first update, the input u10(k) (red) contains a dominant frequency component.After convergence, the input, u10000(k) (black) clearly contains a dominant frequency component that corresponds to the peak frequency in Figure 1.