Data-driven parameter tuning for rational feedforward controller: Achieving optimal estimation via instrumental variable

Feedforward control has been widely used to improve the tracking performance of precision motion systems. This paper develops a new data-driven feedforward tuning approach associated with rational basis functions. The aim is to obtain the global optimum with optimal estimation accuracy. First, the instrumental variable is employed to ensure the unbiased estimation of the global optimum. Then, the optimal instrumental variable which leads to the highest estimation accuracy is derived, and a new reﬁned instrumental variable method is exploited to estimate the optimal instrumental variable. Moreover, the estimation accuracy of the optimal parameter is further improved through the proposed parameter updating law. Simulations are conducted to test the parameter estimation accuracy of the proposed approach, and it is demonstrated that the global optimum is unbiasedly estimated with optimal parameter estimation accuracy in terms of variance with the proposed approach. Experiments are performed and the results validate the excellent performance of the proposed approach for varying tasks.

high tracking performance. Specifically, the feedforward controller is parameterised in filter-based methods and the optimal parameter is determined through parameter optimisation algorithms [23,24]. A widely used strategy is to parameterise the feedforward controller linearly using polynomial basis functions (PBFs) [25][26][27][28]. This linear parameterisation guarantees the stability of the feedforward controller inherently and makes the parameter optimisation problem convex. These merits facilitate the application of PBFs and this approach has been extended towards input shaping [29] and multivariable systems [11,30]. The main drawback of PBFs is that the resulting feedforward controller can only approximate the inverse of the plant with unit denominator, which severely limits the achievable performance and extrapolation properties. A more promising strategy is to parameterise the feedforward controller using rational basis functions (RBFs) [31,32]. Compared to PBFs, the feedforward controller parameterised using RBFs is able to approximate the inverse plants of general rational motion systems. Hence, higher tracking performance and extrapolation properties are available. However, the parameter optimisation problem associated with RBFs is non-convex. Pre-existing approaches seek the optimal parameter through non-linear optimisation approach [31], which cannot ensure the global optimisation in general.
Besides, instrumental variable is an important component in many filter-based methods. The application of the instrumental variable generally leads to unbiased estimation of the optimal parameter [3,27], and the parameter estimation accuracy in terms of variance can be optimised [28,33]. However, in preexisting filter-based approaches associated with the instrumental variable, the noise in the estimated regressor is ignored while calculating the covariance matrix. Therefore, the achieved estimation accuracy after optimisation is non-optimal actually.
Based on the instrumental variable, this paper proposes a novel data-driven parameter tuning approach for feedforward controller parameterised using RBFs. The proposed approach aims at obtaining the global optimum with optimal estimation accuracy. Specifically, an analysis framework based on the instrumental variable is proposed, which leads to the unbiased estimation of the optimal parameter inherently. The estimation accuracy in terms of variance is analysed, in which the noise in the estimated regressor will be taken into consideration to ensure optimal parameter estimation accuracy after optimisation. The optimal instrumental variable is determined such that the covariance matrix is minimised. A refined instrumental variable (RIV) approach is proposed to estimate the optimal instrumental variable. Moreover, a novel parameter updating law is proposed to further improve the estimation accuracy of the global optimum based on the above optimisation. The performance of the proposed approach is tested through simulations and experiments. The results indicate that the proposed approach not only estimates the global optimum unbiasedly with optimal accuracy but also achieves high performance for varying tasks.
The preliminary results have been published in [34], in which the least square method is exploited to estimate the optimal parameter unbiasedly for feedforward controller parameterised using RBFs. This paper greatly improves the parameter estima-tion accuracy, compared to [34], through the introduction of an instrumental variable. As shown here in the simulations and experiments, the improved parameter estimation accuracy of the global optimum with the proposed approach contributes to higher reference-tracking performance and extrapolation properties.
This paper is organised as follows. In Section 2, the notations used here are provided. In Section 3, the problem to be solved is thoroughly stated. In Section 4, the optimal parameter is obtained with an optimal accuracy using the instrumental variable. In Section 5, the proposed parameter updating law which further improves the parameter estimation accuracy is introduced in detail. In Section 6, simulations are conducted to test the parameter estimation accuracy of the proposed approach. In Section 7, the reference-tracking performance and extrapolation properties of the proposed approach are compared with pre-existing approaches via experiments. In Section 8, the overall conclusions of this paper are given.

PRELIMINARIES
This paper considers only discrete-time, linear time-invariant and single-input single-output (SISO) systems. Finite-time task is performed and signals are assumed to be of length N ∈ ℤ + . A positive definite matrix A is denoted as A ≻ 0. For a vector u ∈ ℝ N , the weighted two-norm is given by ‖u‖ 2 W = u T Wu, with (⋅) T the transpose of a matrix or vector. Besides, the kth element of u is expressed as u[k](0 ≤ k ≤ N − 1). The discrete transfer function is denoted as G (q −1 ), with q −1 the backward shift operator q −1 u[k] = u[k − 1]. q −1 is omitted when this does not lead to any confusion.
Let g[k] be the unit impulse response of G (q −1 ). Given the input and output signals u, y ∈ ℝ N , the relation between u and y is given by y when zero initial condition is assumed. As finite-time task with length N is performed, it follows that u[k] = 0 when k < 0 and k > N − 1. Therefore, the relation between u and y can be rewritten as where G ∈ ℝ N ×N is the lower triangular Toeplitz matrix corresponding to G (q −1 ).

System description
This paper employs the two degrees-of-freedom (2-DOFs) control system, as shown in Figure 1. The plant P is generally FIGURE 1 Block diagram of the two degrees-of-freedom (2-DOFs) control configuration unknown. The feedback controller C is applied for robust stability and noise rejection. Meanwhile, the feedforward controller F is exploited to reduce the reference-tracking error.
In Figure 1, r is a finite-time reference with length N , y is the measured output, and e = r − y is the tracking error. Besides, the noise v is given by v = H (q −1 )w, where w is normally distributed white noise with zero mean and variance 2 , and H (q −1 ) is a monic and stable transfer function. From Figure 1, the tracking error e is given by where S = 1∕(1 + PC ) is the sensitivity function and S p = P∕(1 + PC ) is the process sensitivity function. It is observed from (2) that e can be divided into two parts: the referenceinduced error e r = Sr − F ⋅ S p r and the noise-induced error e v = −Sv. Similar to Assumption 2.1 in [28], it is assumed that C is designed such that S (q −1 )H (q −1 ) = 1, which makes e v = −w.

Feedforward parameterisation
In filter-based feedforward tuning approaches, F is parameterised and the feedforward tuning problem is converted into parameter optimisation problem. This paper parameterises F using RBFs as follows: and

Problem statements
Generally, F is designed to eliminate e r to enhance the reference-tracking performance. Ideally, e r = 0 holds for arbitrary references when F = P −1 . Hence, good tracking performance and extrapolation properties can be achieved simultaneously by making F as close as possible to P −1 . With the parameterisation in (3) and (4), F is able to approximate the inverse plants of general rational motion systems. Therefore, the key to achieve high performance is to find the global optimum accurately. Biasedness and variance are two critical indicators of the parameter estimation accuracy, which are the main focuses of this paper. Therefore, the aim of this paper is to propose a novel data-driven parameter tuning approach for feedforward controller parameterised using (3) and (4), and the detailed requirements for the proposed approach are as follows: (R 1 ) achieve unbiased estimation of the global optimum; (R 2 ) achieve optimal estimation accuracy in terms of variance.
Remark 1. Indeed, the selection of the basis function Ψ has great influence on the convergence performance of filter-based methods. However, similar to pre-existing literatures [23][24][25][26][27][28][29][30][31][32][33][34], the detailed selection method of Ψ is out of the scope of this paper and will be an interesting topic for the future research. Hence, the focus of this paper is to accurately obtain the global optimum with the given basis function Ψ, and the widely used differentiators [25][26][27][28] and delay units [3, 11 23, 24] are selected as the basis functions for the simulations and experiments, respectively.

ESTIMATION OF THE OPTIMAL PARAMETER BASED ON INSTRUMENTAL VARIABLE
This paper introduces the proposed approach in detail in Sections 4 and 5, which constitutes the main contribution of this paper. For convenience of reading, the framework of Sections 4 and 5 is presented in Figure 2.
As shown in Figure 2, the expression of the global optimum will be given in (10) and (11) in Section 4.1 based on the feedforward parameterisation in (3) and (4). In Sections 4.2-4.4, the variables required in (10) and (11), including Sr, S p r and the instrumental variable, are derived to make the global optimum calculable. Specifically, the estimation of Sr and S p r is realised in Section 4.2. The covariance matrix originated from the noise components in the estimated Sr and S p r is computed and minimised in Section 4.3. An RIV method is proposed in Section 4.4 to estimate the optimal instrumental variable obtained in Section 4.3. Substitute the Sr and S p r estimated in Section 4.2 and the optimal instrumental variable in Section 4.4 into (10) and (11), the global optimum can be obtained.

Determine the optimal parameter via instrumental variable
Ideally, the global optimum opt is the parameter such that e r ( opt ) = 0. Combining the feedforward parameterisation in (3) and (4) In order to estimate opt unbiasedly from (5), the objective criterion based on the instrumental variable is given by where Z ∈ ℝ N ×n z is the instrumental variable, L(q −1 ) is the pre-filter, W ≻ 0 is the weighting matrix and ( ) is given by Substituting (4) into (7), we have that with the regressor = [ A , B ] and It is observed from (8) that ( ) is linear with respect to , which implies that V ( ) is a quadratic function of . Therefore, opt can be directly solved as the analytical solution to V ( )∕ = 0, which yields where Remark 2. The motivations of selecting V ( ) in (6) as the objective criterion are twofold. First, the resulting optimisation problem is convex and the global optimum opt can be directly obtained from (10) and (11), which is different from the nonconvex optimisation problem suffered when the norm of the reference-induced error is optimised directly. Second, the introduction of the instrumental variable in (6) is crucial for the optimisation of parameter estimation accuracy in Sections 4 and 5.

Estimation of Sr and S p r
It is observed from (9) to (11) that the calculation of opt requires the knowledge of Z , L(q −1 ), W , Ψ, Sr and S p r. Of all these variables, Z , L(q −1 ), W will be determined by minimising the covariance matrix in the next subsection, the basis function Ψ is selected by the user in advance, but Sr and S p r cannot be given directly due to the unknown plant. This paper estimates Sr and S p r using the measured data from a reference-tracking trial marked as E 0 . Specifically, E 0 is carried out without feedforward control, that is, F = 0. Define e 0 and y 0 as the tracking error and measured output of E 0 , they are given by with v 0 = H (q −1 )w 0 the measurement noise in E 0 . DefineŜ r andŜ pr as the estimated Sr and S p r, respectively. Ignore the noise term Sv 0 in (12),Ŝ r andŜ pr can be directly solved as follows: Combining (12) and (13), the relation between the estimated values and real values is given by Considering that v 0 is with zero mean,Ŝ r andŜ pr obtained from (13) are the unbiased estimations of Sr and S p r, respectively. Combining the preselected Ψ and the instrumental variable determined in the next subsection, the optimal parameter can be calculated through (10).

Estimation accuracy of the optimal parameter
In the following description,̂,R z ,R ze and̂o pt denote the estimated variables corresponding to the noise-free variables , R z , R ze and opt . Specifically, noise-free variables are the real values calculated based on real variables Sr and S p r, while the estimated variables are calculated based on the estimated vari-ablesŜ r andŜ pr in (13). Therefore, noise components induced by v 0 in (14) are included in all the estimated variables, which is the main difference between the estimated variables and their corresponding noise-free variables.
The difference between̂o pt and opt is given bŷ Definẽ=̂− as the noise-induced component in the estimated regressor̂. Combining (9) and (14), we have that̃= [̃A,̃B] and It is observed from (5) that Sr = opt . Combining with (11) and (16), it follows that where The asymptotic distribution of̂o pt is given by where  (0, P ) denotes the normal distribution with zero mean, P is the covariance matrix, which is typically used as the performance index in terms of parameter estimation accuracy and can be minimised to improve the parameter estimation accuracy [28]. Based on (15) and (17), we have that where denotes the expectation with respect to the noise v, and L and K are the lower triangular Toeplitz matrixes of L(q −1 ) and K (q −1 ), respectively. As shown in (20), P is a function of Z , L(q −1 ) and W . In order to achieve optimal estimation accuracy (i.e. requirement R 2 ), Z , L(q −1 ), W should be determined such that P is minimised. Next, the lower bound of P is derived and the optimal instrumental variable is given in Theorem 1.

Theorem 1. The lower bound of P , which is denoted as P
The equivalence between P and P opt holds by selecting Z , L(q −1 ), W as follows: Proof. Define two matrixes and as follows: Therefore, P in (20) can be rewritten using and as follows: Using Lemma A.4 in [35], we have that Besides, it is obvious that P is equal to P opt while selecting the variables as (22). Hence, P opt is the optimal covariance matrix and the optimal variables are presented in (22), which completes the proof. □ Remark 3. According to the theory about the instrumental variable [35], unbiased estimation of opt (i.e. requirement R 1 ) can be achieved from (10) if Z is correlated with the reference r and uncorrelated with the noise v. Obviously, these two conditions are both satisfied by the Z opt in (22). Therefore,̂o pt is an unbiased estimation of opt inherently.
Remark 4. From (22), it is observed that the dimension of Z opt satisfies n z = n . Therefore, R z ∈ ℝ n ×n is square and (10) can be simplified as follows: Remark 5. In pre-existing approaches [28],̃is ignored while calculating and minimising P , which yields In Section 6, simulations are conducted and it is shown that the variables in (27) are non-optimal actually.

4.4
Estimation of Z opt and L opt (q −1 ) It is observed from (22) that the calculation of Z opt and L opt (q −1 ) requires the accurate knowledge of the unknown plant P and the real optimal parameter opt , which are generally unavailable in practical implementations. Hence, an RIV method is developed in this subsection to obtain Z opt and L opt (q −1 ) iteratively based onŜ r andŜ pr obtained in (13) without exploiting any information of P and opt . In this subsection, an auxiliary index j is introduced to mark the iteration number in the proposed RIV method. Specifically, the parameter < j > , the instrumental variable Z < j > and the filter L < j > (q −1 ) are updated after each iteration, and opt , Z opt and L opt (q −1 ) can be obtained after convergence.
In the proposed RIV method, the initial parameter <0> is selected and known by the user in advance. The steps of the proposed RIV method in the j th iteration are as follows.
(1) Obtain the estimation of the noise-free Sr and S p r as follows: (2) Based on (28), we have that , ] .
(29) (3) Calculate the estimation of K (q −1 ) as follows: (4) Combining (29) and (30), the approximation of Z opt and L opt (q −1 ) are derived as follows: (11), the estimations of R z and R ze are given by (6) The estimation of opt is determined as follows: (7) Check if the parameter < j > is convergent: if convergence is achieved, let opt = < j > , Z opt = Z < j > and L opt (q −1 ) = L < j > (q −1 ); if not, let j = j + 1 and return to the step (1).
Remark 6. Note that the above iterations are performed offline and no actual experiment is involved, which indicates that only one experiment, that is E 0 , is required to obtain opt .

FURTHER IMPROVEMENT OF PARAMETER ESTIMATION ACCURACY
As shown in Figure 2, the proposed parameter updating law which aims at improving the parameter estimation accuracy is given in Section 5.1. The unbiasedness and variance of the parameters obtained with this parameter updating law are analysed in Sections 5.2 and 5.3, respectively.

The proposed parameter updating law
In order to further improve the parameter estimation accuracy based on the above analysis, a novel parameter updating law is proposed as follows: where k is the iteration number, and wherê( k−1 ) is the estimated ( k−1 ) in noisy condition, which is given bŷ With the parameter updating law in (34), it is obvious that thêo pt obtained in Section 4 is 1 when the initial parameter 0 is zero. The estimation accuracy of 1 has been analysed in Section 4. In the following description, the unbiasedness and variance of k (k ≥ 2) will be investigated. (34) is not offline, and one reference-tracking trial is required in every iteration to updatê( k−1 ).

Unbiasedness
According to Remark 3, 1 is the unbiased estimation of opt , that is, Combining (34)-(36), the difference between k (k ≥ 2) and opt is given by Therefore, we have that Equation (39) indicates that k is still the unbiased estimation of opt when k ≥ 2.

Estimation accuracy in terms of variance
Define P k as the covariance matrix of k , that is, Obviously, we have that P 1 = P opt . Combining (38) and (40), where L opt and B n are the Toeplitz matrixes of L opt (q −1 ) and B( n ), respectively, and Γ is the noise-free variable corresponding toΓ in (35). As n (n ≥ 1) is the unbiased estimation of opt , it can be assumed that B( n ) ≈ B( opt ). Based on the above assumption, (41) can be rewritten as follows: Therefore, the covariance matrix can be arbitrarily small after enough iterations, which implies that the parameter estimation accuracy can be greatly improved.

Procedure of the proposed approach
The overall procedure of the proposed approach is presented in Table 1.

Set-ups
In the following simulations, the controlled plant and the feedback controller are given by The noise w is normally distributed white noise with zero mean and standard deviation = 5 nm. H (q −1 ) is derived such that H (q −1 )S (q −1 ) = 1 is satisfied. The references to be tracked in the simulations are the fourth-order trajectory r s , as shown in Figure 3. The parameters of r s are provided in Table 2.

FIGURE 3
The references to be applied in the simulations and experiments. The feedforward controller is parameterised using (3) and (4) and the basis functions are given by with T = 0.0002 s is the sampling time. Note that the plant is exactly known in (44) in the simulations. Hence, the optimal parameter opt can be obtained in advance, as shown in Table 3.
In this case study, the parameter estimation accuracy of the proposed approach is compared with pre-existing approaches. Moreover, as the plant is known, Z opt and L opt (q −1 ) are both available in this simulation. Hence, the parameter estimation accuracy achieved using the real Z opt and L opt (q −1 ) can be exploited as the benchmark case. Specifically, the comparison is made among the following four methods: Q 1 : the pre-existing approach associated with RBFs in [34]; Q 2 : the proposed approach with the pre-existing selection of the instrumental variable in (27); Q 3 : the proposed approach with the proposed selection of the instrumental variable in (22); Q 4 : the benchmark case in which the real optimal instrumental variable is applied directly.
In Q 1 , the coefficient k in experiments E 2 and E 4 in [34] will be high enough to ensure the optimal estimation accuracy of opt . In Q 2 and Q 3 , Z opt and L opt (q −1 ) are assumed to be unknown and the RIV method presented in Section 4.4 is applied to determine Z opt and L opt (q −1 ). Five computational iterations are performed in the RIV method for both Q 2 and Q 3 , which are enough for them to be convergent. By contrast, Z opt and L opt (q −1 ) are assumed to be known in advance in Q 4 and can be directly used without computational iterations.
In order to test the parameter estimation accuracy of Q 1 -Q 4 , a Monte Carlo simulation is conducted with 1000 realisations to track the reference r s for each of the four methods. To enable fair comparison, the initial parameters for Q 1 -Q 4 are all set as 0 = [0, 0, 0, 0, 0]. In each realisation, 10 iterations are performed for Q 2 -Q 4 , and one iteration is performed for Q 1 as Q 1 is non-iterative.

6.2
Simulation results

Unbiasedness and variance
The means and standard deviations of the parameters obtained from Q 1 to Q 4 after the first iteration are presented in Table 3. It is observed that the means of the parameters obtained from Q 1 to Q 4 are all unbiased. For Q 1 , this is owing to the conduction of repeated experiments E 3 and E 4 in [34]. For Q 2 -Q 4 , the unbiased estimation is guaranteed as the two conditions in Remark 3 are both met by the instrumental variables applied in all these three methods. In Q 1 , four experiments are required to achieve the unbiased estimation of opt . By contrast, only one experiment, that is E 0 in Section 4.2, is needed for Q 2 -Q 4 in the first iteration. Therefore, the proposed approach achieves unbiased estimation of opt more efficiently than Q 1 . From Table 3, it can also be observed that the standard deviations of parameters obtained by Q 3 is much smaller than Q 1 and Q 2 and very close to Q 4 . Two conclusions can be made based on the above results: (1) the parameter estimation accuracy of the proposed approach is greatly improved compared to Q 1 after the application of the instrumental variable and the minimisation of the covariance matrix in Section 4.3; (2) optimal estimation accuracy is achieved by the instrumental variable proposed in (22).
The evolutions of standard deviations of Q 2 -Q 4 are illustrated in Figure 4. It is shown that the curves of Q 3 are always much lower than Q 2 and almost overlap with Q 4 . This validates the optimal estimation accuracy of the proposed instrumental variable again. Moreover, it can be observed from Figure 4 that the standard deviations of Q 2 -Q 4 are all reduced iteration by iteration. Hence, the improvement of parameter estimation accuracy brought in by the proposed parameter updating law in Section 5 is verified. 3.9 × 10 −6 3.9 × 10 −6 ± 7.4 × 10 −8 3.9 × 10 −6 ± 3.9 × 10 −9 3.9 × 10 −6 ± 1.9 × 10 −10 3.9 × 10 −6 ± 1.9 × 10 −10

FIGURE 4
The evolutions of standard deviations Q 2 -Q 4 , where 'std' is the abbreviation of 'standard deviation'

Worst-case performance
In order to illustrate the relation between the parameter estimation accuracy and reference-tracking performance, the worstcase performance is investigated. Specifically, the worst-case tracking error means that the reference-tracking error signal whose 2-norm is the largest among the 1000 realisations. Define wc k as the parameter in the kth iteration which leads to the worst performance among the 1000 realisations, that is, where the auxiliary index i means the ith realisation. Hence, the worst-case tracking error is given by e k wc = e( wc k ). The evolutions of the 2-norm of e k wc of Q 1 -Q 4 are presented in Figure 5. Several worst-case tracking errors of Q 1 -Q 4 are given in Figure 6. It can be observed that the noise-induced error is dominant in the worst-case tracking errors of Q 3 and Q 4 , and the reference-induced error is almost eliminated with only one iteration. By contrast, significant reference-induced error still exists in the worst-case tracking errors of Q 1 and Q 2 after the first iteration. In the subsequent iterations, with the parameter updated using (34), the reference-induced error in the worst-case tracking error of Q 2 is reduced iteration by iteration, as shown in Figure 5. It can be observed from Figure 6 that   The ultra-precision reticle stage the reference-induced error nearly disappears in the worst-case tracking error of Q 2 after the 10th iteration.
Compare the above results with Figure 4 and Table 3, it can be concluded that the magnitude of worst-case tracking error is highly correlated with the parameter estimation accuracy. Specifically, the improvement of parameter estimation accuracy will lead to the reduction of worst-case tracking error. Therefore, the robustness against the noise is improved after the optimisation of the parameter estimation accuracy with the proposed approach.

System description
Experiments are carried out on the ultraprecision reticle stage developed in our lab. As shown in Figure 7, the reticle stage is driven by moving-magnet planar motors. Specifically, the coils and the permanent-magnet array are fixed on the stator and mover of the reticle stage, respectively. The mover is magnetically suspended above the stator while the reticle stage moves. Therefore, the motion of the reticle stage is frictionless. Moreover, with this moving-magnet structure, the cooling system is installed on the stator and no cable is connected to the mover. Hence, the disturbances resulted from the cable is avoided compared to moving-coil planar motors. The position of the mover is measured by the laser interferometers with subnanometer resolution. The reticle stage is controlled in all six DOFs, including three translations (x, y and z) and three rotations (t x , t y and t z ). Static decoupling matrix is applied to make the plant diagonally dominant, which enables the application of SISO feedback and feedforward control to each DOF of the reticle stage. The sampling period of the control system is T s = 0.0002 s. This paper considers the scanning direction (i.e. y direction) for the experiments. The references to be applied include r I , r II and r III , which are presented in Figure 4 and Table 2. In the experiments, the reference-tracking performance and extrapolation properties of the following three methods are compared. M 1 : standard ILC in [12]; M 2 : the pre-existing approach associated with RBFs in [34]; M 3 : the proposed approach mentioned here.
In M 1 , the optimal feedforward signal is pursued and no feedforward controller parameterisation is required. Based on the results of preliminary experiments, the feedforward controller is parameterised as follows for M 2 and M 3 : where the term (

Reference-tracking performance
The reference-tracking performance of M 1 -M 3 towards the reference r I is investigated. In this case study, no feedforward control is applied in the initial task for all the three methods. Ten iterations are performed for M 1 and M 3 , and M 2 is executed non-iteratively. The performance of these three methods are presented in Figures 8 and 9. It is shown that great performance enhancement is achieved with all three methods, and M 1 achieves the best performance. This result validates the superior performance of signal-based methods towards repeating tasks. On the other hand, it can be observed that M 3 achieves higher performance than M 2 with the same feedforward parameterisation. This can be explained by the conclusion of the simulations that higher parameter estimation accuracy contributes to smaller worst-case error.
Actually, the optimality of the parameter obtained by M 3 can be reflected by Figure 10. In the proposed RIV method in The achieved reference-tracking errors of M 1 (after 10 iterations), M 2 (after 1 iteration) and M 3 (after 10 iterations) while tracking r I .

FIGURE 10
The estimatedŜ r andŜ pr from (13) and the noise-free estimation S <10> r and S <10> pr from (28) Note:Ŝ r in (a) is also the initial reference-tracking error for M 1 -M 3 . Section 4.4, it can be observed from (28) that S < j > r = Sr and S < j > pr = S p r both holds if and only if F ( < j −1> ) = P −1 , which means < j −1> = opt . In the experiments, 10 computational iterations are executed in the step (d) (see Table 1) in M 3 . It is shown that the curves of S <10> r and S <10> pr overlap withŜ r and S pr , respectively. Hence, accurate estimation of opt is achieved since the first iteration of M 3 , which is consistent with the curve of M 3 in Figure 8.

Extrapolation properties
In order to test the extrapolation properties of M 1 -M 3 , the optimal feedforward signal (M 1 ) or feedforward controller (M 2 and M 3 ) obtained while tracking r I are directly applied to track the other two references, that is, r II and r III . The reference-tracking errors of M 1 -M 3 are depicted in Figure 11. It is obvious that great performance deterioration is encountered with M 1 after the reference changes, which reveals the poor extrapolation properties of M 1 . Comparatively, the performance of M 2 and M 3 is maintained even if the reference is changed. Hence, the great advantage of filter-based method over signal-based method on extrapolation properties is validated.
On the other hand, the performance of M 3 is better than M 2 for all the three references. Fundamentally, the extrapolation properties of filter-based method are determined by the degree of approximation of the feedforward controller F to the inverse plant P −1 . It has been shown in the simulations that the parameter estimation accuracy of M 3 is much better than M 2 . Therefore, the feedforward controller F obtained by M 3 is probably much closer to P −1 than that of M 2 . This explains the better performance of M 3 compared to M 2 while tracking r II and r III . The high performance resulted from the accurate feedforward parameter estimation of M 3 is validated.

CONCLUSION
This paper proposed a novel data-driven rational feedforward tuning approach. In the proposed approach, the instrumental variable was exploited, and the unbiased estimation of the optimal parameter was achieved inherently. In order to achieve the optimal estimation accuracy, the variance of the estimated value of the optimal parameter was analysed. The optimal instrumental variable which leads to the minimisation of the covariance matrix was derived, and an RIV approach was proposed to estimate the optimal instrumental variable. Based on the above optimal instrumental variable, a novel parameter updating law was proposed. It is proved that the proposed parameter updating law was able to improve the parameter estimation accuracy while maintaining the unbiasedness. Throughout the whole parameter tuning process, only data measured from reference-tracking trials was required. Hence, the proposed approach was totally data-driven. In the simulations and experiments, the advantages of the proposed approach were demonstrated: (1) the unbiasedness and optimal estimation accuracy in terms of variance; (2) high performance for repeating tasks and good extrapolation properties. The research is ongoing towards extending the proposed approach to multivariable and parameter-varying systems.