Model-Based ILC with a Modified Q-Filter for Complex Motion Systems: Practical Considerations and Experimental Verification on a Wafer Stage

Iterative learning control (ILC) is one of the most popular tracking control methods for systems that repeatedly execute the same task. A system model is usually used in the analysis and design of ILC. Model-based ILC results in general in fast convergence and good performance. However, the model uncertainties and nonrepetitive disturbances hamper its practical applications. One of the commonly used solutions is the introduction of a low-pass filter, namely, the Q-filter. However, it is indicated in this paper that the existing Q-filter configurations compromise the servo performance, although improving the robustness. Motivated by the combination of performance and robustness, a novel Q-filter configuration in ILC is presented in this paper. Some practical considerations, such as the configuration of ILC in a feedback control system, the time delay compensation, and the learning coefficient, are provided in the implementation of the proposed ILC algorithm. The effectiveness and superiority of the proposed ILC versus existing Q-filter ILC are demonstrated by both theoretical analysis and experimental verification on a wafer stage.


Introduction
High-performance motion is typically required in many manufacturing environments [1][2][3][4] where a tool must track a prescribed reference trajectory with high speed as well as high accuracy. One example is the wafer stage, which is responsible for the precision positioning of the wafer in IC (integrated circuit) manufacturing. The wafer stage performs constant-velocity scanning during exposure, after which acceleration takes place to bring the stage to the next exposure position [5]. In next-generation photolithography, the wafer stage is subject to tightening requirements on servo performance due to larger throughput and smaller critical dimension [6]. However, feedback controllers such as the PID controller alone cannot meet these requirements because the closed-loop bandwidth is limited by mechanical resonances and electrical amplifiers [7,8]. More and more effort is thus devoted to feedforward control techniques.
Considering the repetitive nature of wafer scanning, it is natural to incorporate the information from previous iterations into the control command of the current iteration in order to eliminate the recurring servo error. As one such algorithm, iterative learning control (ILC) has found widespread application in trajectory tracking and disturbance rejection [9][10][11][12] since it was initially proposed by Uchiyama [13] and Arimoto et al. [14]. For instance, an iterative learning controller achieves about 93% improvement over the feedback controller in terms of tracking accuracy in the wafer stage described in [15].
The early work on ILC focused on the design of a single learning filter (called the L-filter). It uses a gain times the error from the last iteration to update the control input as follows:

$$u^{j+1}(k) = u^{j}(k) + L(q)\,e^{j}(k), \quad (1)$$

where the superscript $j$ denotes the iteration index, $k$ denotes the discrete time index defined on the interval $[0, N-1]$, $q$ represents the forward time-shift operator, $u$ is the control input, $e$ is the error signal, and $L(q)$ is the learning filter.
The ILC algorithm in (1) seems to be very effective from a mathematical perspective and can converge to zero tracking error [16]. However, in practice, it usually exhibits unacceptable learning performance, as illustrated in Figure 1. This can be explained by the following:
(1) Most ILC algorithms, such as frequency-domain ILC [17] and optimal ILC [18], are to a certain extent model based from the perspective of convergence conditions and performance properties. However, it is intractable in practice to obtain an accurate system model, especially at high frequencies. Frequency-domain analysis reveals that the robustness of (1) to model uncertainty is limited [19]. The poor robustness leads to initial convergence followed by divergence or instability when high frequencies propagate through iterations. (See [20] for more discussion of this phenomenon from several points of view.)
(2) Although it attenuates repetitive disturbances, ILC propagates noise and nonrepetitive disturbances, which could degrade the servo performance [21].
To improve the robustness of ILC, it is recommended to use a low-pass filter to prevent high frequencies and noise from entering the learning feedback loop [22,23]. The widely used ILC algorithm is given as follows:

$$u^{j+1}(k) = Q(q)\left[u^{j}(k) + L(q)\,e^{j}(k)\right], \quad (2)$$

where $Q(q)$ is the low-pass filter, often called the Q-filter. The Q-filter restricts the bandwidth of the learning process, thereby avoiding the propagation of high frequencies.
Remark 1. For a system with relative degree $m$, define $L'(q) = q^{-m}L(q)$; then the ILC algorithm in (2) can be written as $u^{j+1}(k) = Q(q)\left[u^{j}(k) + L'(q)\,e^{j}(k+m)\right]$, which is an equally popular ILC formulation [24].
The design of the Q-filter in (2) has been addressed extensively in the literature. In [25], a nonparametric Q-filter that requires no explicit properties of the nonrepetitive disturbances is developed. A zero-phase Q-filter is designed in [16] to eliminate the bad learning transient; however, it is indicated there that the ILC algorithm in (2) leads to a trade-off between robustness and performance. More precisely, a Q-filter with high bandwidth results in improved performance but at the expense of robustness, and vice versa. Although the time-varying Q-filters in [22,26,27] and the nonlinear Q-filters in [28] extend the robustness and performance boundaries given by the fixed Q-filter in [16], ILC algorithms in the form of (2) cannot converge to zero tracking error unless $Q(q) = 1$ [24]. This motivates the following work in the paper:
(1) Three different ILC configurations under the two-DOF (degree of freedom) control architecture are compared in terms of both theoretical analysis and practical considerations.
(2) A novel Q-filter configuration in model-based ILC is proposed. It adjusts the control input using the filtered error signal along with the original control signal from the previous iteration, rather than the filtered one as in (2). The proposed algorithm provides improved performance (zero tracking error) versus the Q-filter configuration in (2) while maintaining high robustness.
(3) Some additional considerations, such as the zero-phase filter design, the time delay compensation, and the learning coefficient, are provided for the implementation of the proposed ILC algorithm.
The rest of the paper is organized as follows: Section 2 describes the wafer stage considered in this paper, followed by its modelling. Section 3 presents the proposed model-based ILC algorithm with a novel Q-filter configuration. In addition, some considerations for the practical implementation are given. In Section 4, experimental results are provided to validate the effectiveness and superiority of the proposed algorithm. Concluding remarks are given in Section 5.

Application Context
2.1. Wafer Stage. In order to reduce the overhead time created by wafer exchange, thereby improving throughput, two wafer stages are used during wafer scanning. While the first stage performs overhead activities such as wafer unload/load, horizontal alignment, and measurement of the surface topography, the second one exposes the previously measured wafer [29]. When both stages have finished their tasks, the stages are swapped and a new cycle begins. As shown in Figure 2, each of the stages consists of two modules: a long-stroke module and a short-stroke module. The former, used for coarse positioning, has an H-bridge design with a work range of 400 mm and micrometer-level positioning accuracy. The latter is responsible for fine positioning with a 2 mm work range and nanometer-level positioning accuracy.

Complexity
The wafer stage is controlled in six logical axes: three translations, $x$, $y$, and $z$, and three rotations, $r_x$, $r_y$, and $r_z$. The control system adopts a 6-degree of freedom (DOF) controller structure in combination with force and measurement decoupling designs. Scanning of every field on a wafer is performed in the y-direction by conducting a series of point-to-point motions with constant velocity. After an exposure scan, simultaneous x and y accelerations bring the stage to the next exposure position. Motion in the xy-plane enables the full wafer exposure. Motions in $z$, $r_x$, $r_y$, and $r_z$ are used to keep the wafer surface in the focal plane of the lens. In this paper, for reasons of clarity, only the x-direction long-stroke module is considered. This choice is rather arbitrary but captures the features that also exist in the remaining directions.

Modelling.
The frequency response of the x-direction long-stroke wafer stage is measured by a sine sweep experiment with a sampling time $T_s = 200$ μs. A series of sinusoidal input signals in the range from 1.0 Hz to 1000 Hz is injected into the stage. The amplitude gain and phase shift are measured by comparing the discrete Fourier transforms of the position and the control signal. Figure 3 shows the measured frequency characteristic, from which it can be observed that the long-stroke wafer stage can be modeled as a double-integrator-based system with a mass of approximately 23.85 kg, in series connection with several resonances at high frequencies. These resonances characterize the structural flexibilities of the stage. In addition, the phase decline below −180° indicates the existence of time delay due to several sources such as the actuator system and current control circuits.
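The gain and phase computation described above can be sketched as follows. This is a minimal illustration and not the authors' measurement code: the function names, the 50 Hz excitation, and the synthetic signals (gain 0.5, phase lag 30°) are assumptions, and the record is assumed to contain an integer number of periods so that the single-frequency DFT has no leakage.

```python
import cmath
import math


def single_bin_dft(samples, freq_hz, ts):
    """DFT coefficient of `samples` at `freq_hz` for sampling period `ts`."""
    w = 2.0 * math.pi * freq_hz * ts
    return sum(x * cmath.exp(-1j * w * k) for k, x in enumerate(samples))


def frf_point(u, y, freq_hz, ts):
    """Gain and phase (degrees) of y relative to u at one excitation frequency."""
    ratio = single_bin_dft(y, freq_hz, ts) / single_bin_dft(u, freq_hz, ts)
    return abs(ratio), math.degrees(cmath.phase(ratio))


TS = 200e-6      # sampling time from the text
FREQ_HZ = 50.0   # hypothetical excitation frequency
N = 1000         # 10 full periods of 50 Hz at 200 us sampling

# Synthetic data standing in for a measurement: gain 0.5, phase lag 30 degrees.
u = [math.sin(2.0 * math.pi * FREQ_HZ * k * TS) for k in range(N)]
y = [0.5 * math.sin(2.0 * math.pi * FREQ_HZ * k * TS - math.radians(30.0))
     for k in range(N)]

gain, phase_deg = frf_point(u, y, FREQ_HZ, TS)
print(f"gain = {gain:.3f}, phase = {phase_deg:.1f} deg")
```

Repeating the computation for each frequency of the sweep yields one point of the Bode plot per excitation.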
By considering the rigid mode, the vibration modes, and the time delay component, the wafer stage is modelled by the transfer function $P(s)$ in (3), where $K_t$ is the gain including the torque constant and the amplifier with current control, $M$ is the mass of the wafer stage, $K_i$ is the modal constant of the $i$th vibration mode, $\omega_i$ is the natural angular frequency, $\zeta_i$ is the damping coefficient, and $T_d$ is the delay time. Table 1 lists the parameters in $P(s)$, while the solid curves in Figure 3 show the frequency response of $P(s)$.
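To illustrate why a phase below −180° points to a time delay, the sketch below evaluates only the rigid-body part of such a model together with a pure delay, that is, $K_t/(Ms^2)\,e^{-sT_d}$. The vibration modes are deliberately left out, and the values of `KT` and `TD` are hypothetical placeholders rather than the entries of Table 1; only the mass is taken from the text.

```python
import cmath
import math


def rigid_body_with_delay(freq_hz, kt, mass, td):
    """Frequency response of kt / (mass * s^2) * exp(-s * td) at s = i*2*pi*f."""
    s = 1j * 2.0 * math.pi * freq_hz
    return kt / (mass * s ** 2) * cmath.exp(-s * td)


def unwrapped_phase_deg(freq_hz, td):
    """Analytic phase: -180 deg from the double integrator plus the delay lag."""
    return -180.0 - 360.0 * freq_hz * td


MASS = 23.85   # kg, from the text
KT = 1.0       # hypothetical gain
TD = 1.0e-3    # hypothetical delay in seconds

for f in (10.0, 100.0):
    h = rigid_body_with_delay(f, KT, MASS, TD)
    print(f, 20.0 * math.log10(abs(h)), unwrapped_phase_deg(f, TD))
```

The magnitude falls by 40 dB per decade, as expected for a double integrator, while the phase lag beyond −180° grows linearly with frequency at a rate set by the delay.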

Model-Based ILC with a Modified Q-Filter
In this paper, the learning filter $L$ and the Q-filter are designed separately, although they can be designed simultaneously in some one-step procedures [30]. This kind of ILC design procedure is usually referred to as a two-step ILC design [26]. Note that the following developments are presented in the frequency domain using the z-domain representation. The z-transform of a system can be obtained by replacing $q$ with $z$. The frequency response is given by replacing $z$ with $e^{i\theta}$ for $\theta \in [-\pi, \pi]$. Hereafter, the argument $z$ will be omitted for compactness of notation.

ILC Configurations.
Since ILC is incapable of attenuating nonrepetitive disturbances, a two-DOF control structure is typically used in practice, where ILC is integrated into an existing closed-loop system as an add-on scheme. Based on the choice of (1) the learning signal and (2) the injection point of the learned control signal, three alternative ILC configurations are usually adopted in precision motion systems [31]. They are illustrated in Figure 4, where $C_{fb}$ denotes the feedback controller designed in advance, $u_{ff}$ is the control effort learned by ILC, $e$ is the error signal, and $d$ and $v$ denote the input and output disturbances of the plant, respectively.
In configurations I [32] and II [25], the learning signal used in ILC is the tracking error, that is, $r - y^j$. The learned control signal $u_{ff}^j$ in configuration I is injected at the input of the plant, whereas the one in configuration II is injected at the input of the feedback controller. In configuration III [31], the learning signal is $C_{fb}(r - y^j)$, and the learned control signal is injected at the input of the plant. The ILC algorithms involved in the three configurations are given in (4), and the error propagation on which the following analysis and comparison are based is given in (5). From (5), the ideal learning filters for configuration I and for configurations II and III are given in (6).
From (6), it seems that there is no difference between the three ILC configurations since each of them can achieve zero convergence rate with the ideal learning filter. However, from the frequency characteristics of the ideal learning filters shown in Figure 5, it can be observed that configurations II and III are well suited for frequency-domain design since the DC gain of the learning filter is close to 1, whereas the DC gain in configuration I tends to infinity, which may lead to numerical issues.
The choice of closed-loop controllers restricts the user to implementing only ILC configuration II. All ILC configurations can be implemented if an open-loop controller is used.
On the other hand, in ILC configurations I and III, if the ILC algorithm yields a large undesired control signal (resulting from incorrect numerical computation or an unstable ILC algorithm), the direct injection of this signal could saturate or even damage the plant. This can be avoided in ILC configuration II since the ILC signal is filtered by the feedback controller $C_{fb}$ before being injected into the plant.
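The algebra behind (4)–(6) can be sketched as follows. This is a derivation under stated assumptions rather than a reproduction of the original display equations: the update laws are taken without a Q-filter, the disturbances $d$ and $v$ are assumed repetitive so that they cancel in the iteration-to-iteration difference, and $S = 1/(1 + PC_{fb})$ and $G = PC_{fb}/(1 + PC_{fb})$ denote the sensitivity and closed-loop functions.

```latex
% Assumed update laws (no Q-filter), with e^j = r - y^j
\begin{align*}
\text{Configurations I and II:}\quad & u_{ff}^{j+1} = u_{ff}^{j} + L\,e^{j}, \\
\text{Configuration III:}\quad & u_{ff}^{j+1} = u_{ff}^{j} + L\,C_{fb}\,e^{j}.
\end{align*}
% Resulting error propagation for repetitive d and v
\begin{align*}
\text{Configuration I:}\quad & e^{j+1} = (1 - P S L)\,e^{j}, \\
\text{Configuration II:}\quad & e^{j+1} = (1 - G L)\,e^{j}, \\
\text{Configuration III:}\quad & e^{j+1} = (1 - P S C_{fb} L)\,e^{j} = (1 - G L)\,e^{j}.
\end{align*}
% Learning filters that make the propagation factor zero
\begin{align*}
\text{Configuration I:}\quad & L = (P S)^{-1} = \frac{1 + P C_{fb}}{P}, \\
\text{Configurations II and III:}\quad & L = G^{-1} = \frac{1 + P C_{fb}}{P C_{fb}}.
\end{align*}
```

For a double-integrator plant and a feedback controller with integral action, $G^{-1} = 1 + 1/(PC_{fb})$ tends to 1 at DC, whereas $(PS)^{-1} = 1/P + C_{fb}$ grows without bound, which is consistent with the DC-gain observation above.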

Learning Filter Design.
Based on the above discussion, for the sake of implementation and safety, ILC configuration II is adopted in this paper. From (6), the model-inversion learning filter is given by (7).

Remark 2. If the relative degree of the closed-loop system satisfies $l \geq 1$, then the learning filter in (7) will be noncausal. Unlike the usual notion of noncausality, the ILC algorithm with a noncausal learning filter is still implementable in practice because the entire data from all previous iterations are available.

Remark 3. If the learning filter in (7) is unstable, which usually happens when sampling a continuous-time system with a fast sampling time [33], model-inverse techniques for nonminimum-phase systems can be adopted, such as the ZPETC method in [34], the ZMETC method in [35], and the noncausal series approximation method in [36]. See [37] for their comparison.

Remark 4. From Figure 5(b), it can be observed that the learning filter can be approximated by $L = 1$ at low frequencies. This approximation can be used in practice if the Q-filter is well designed. Despite its popularity, ILC configuration I is more complicated in terms of the calculation of the learning filter $L = [P/(1 + PC_{fb})]^{-1}$. This is another reason why we choose ILC configuration II rather than I.

Preexisting Q-Filter Configuration.
As discussed in Section 1, if only a single learning filter is used in model-based ILC, divergence or instability may occur as the iterations proceed because of model uncertainty at high frequencies. A Q-filter is thus often introduced to enhance the robustness of the learning process. Referring to ILC configuration II, the Q-filter is usually configured in ILC as given in (8).

3.3.1. Convergence. The frequency-domain condition for the convergence of the ILC algorithm in (8) is given as follows.

Theorem 1. Consider ILC configuration II in Figure 4(b) and the ILC algorithm in (8). The ILC algorithm is convergent if condition (9) holds.

Proof. From Figure 4(b), the tracking error in the $j$th iteration is

$$e^{j} = S r - G u_{ff}^{j} - P S d^{j} - S v^{j}, \quad (10)$$

where $S = 1/(1 + PC_{fb})$ is the sensitivity function and $G = PC_{fb}/(1 + PC_{fb})$ is the closed-loop system function. From (8) and (10), the tracking error in the $(j+1)$th iteration is obtained as (11). From (10), we have $G u_{ff}^{j} = S r - e^{j} - P S d^{j} - S v^{j}$; substituting this into (11) yields the error propagation (12), which indicates that the ILC algorithm in (8) is convergent if (13) holds.

3.3.2. Performance. Suppose that the input and output disturbances of the system in Figure 4(b) are repetitive. If the ILC algorithm in (8) is convergent, then the limit relation (14) follows from (12), and the converged tracking error $e^{\infty}$ is given by (15). It can be observed that $e^{\infty} = 0$ holds for all $r$, $d$, and $v$ if and only if the ILC algorithm in (8) is convergent and $Q = 1$. Therefore, perfect tracking necessitates $Q = 1$ at the cost of robustness. This indicates that the Q-filter configuration in (8) leads to a trade-off between robustness and performance.
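The steps of this argument can be written out as follows. This is a sketch that assumes the existing Q-filter update in configuration II has the same structure as (2), with the Q-filter acting on both the previous control signal and the learning term. The intermediate expressions are derived here from the error relation $e^{j} = S r - G u_{ff}^{j} - P S d^{j} - S v^{j}$ and are not copied from the original displays.

```latex
% Assumed form of the existing Q-filter update, cf. (2)
\begin{align*}
u_{ff}^{j+1} &= Q\left(u_{ff}^{j} + L\,e^{j}\right).
\end{align*}
% Substituting G u_ff^j = S r - e^j - P S d^j - S v^j into e^{j+1}
\begin{align*}
e^{j+1} &= S r - G u_{ff}^{j+1} - P S d^{j+1} - S v^{j+1} \\
        &= Q(1 - G L)\,e^{j} + (1 - Q) S r
           - P S\left(d^{j+1} - Q d^{j}\right) - S\left(v^{j+1} - Q v^{j}\right).
\end{align*}
% Convergence condition and converged error for repetitive d and v
\begin{align*}
\left| Q(1 - G L) \right| &< 1, \\
e^{\infty} &= \frac{(1 - Q)\left(S r - P S d - S v\right)}{1 - Q(1 - G L)}.
\end{align*}
```

The factor $(1 - Q)$ in the numerator makes the trade-off explicit: the converged error vanishes for arbitrary $r$, $d$, and $v$ only when $Q = 1$.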
3.4. Proposed Q-Filter Configuration. As discussed in Section 3.3, the existing ILC algorithm in (8) cannot converge to zero tracking error unless $Q = 1$, which motivates the ILC algorithm in (16), in which the Q-filter is configured only in the filtering of the error signal.

Theorem 2. Consider ILC configuration II in Figure 4(b) and the ILC algorithm in (16). The ILC algorithm is convergent if condition (17) holds.

Proof. From (5), we can easily get the tracking error propagation from iteration $j$ to $j + 1$, given in (18), which leads straightforwardly to the frequency-domain convergence condition in (17).
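A sketch of the proof, assuming the proposed update keeps the previous control signal unfiltered and applies the Q-filter to the learning term only (as the description above indicates), and using the error relation $e^{j} = S r - G u_{ff}^{j} - P S d^{j} - S v^{j}$ of configuration II:

```latex
% Assumed form of the proposed update
\begin{align*}
u_{ff}^{j+1} &= u_{ff}^{j} + Q L\,e^{j}.
\end{align*}
% Error propagation from iteration j to j+1
\begin{align*}
e^{j+1} - e^{j} &= -G\left(u_{ff}^{j+1} - u_{ff}^{j}\right)
                   - P S\left(d^{j+1} - d^{j}\right) - S\left(v^{j+1} - v^{j}\right), \\
e^{j+1} &= (1 - Q G L)\,e^{j}
           - P S\left(d^{j+1} - d^{j}\right) - S\left(v^{j+1} - v^{j}\right).
\end{align*}
% Resulting convergence condition
\begin{align*}
\left| 1 - Q G L \right| &< 1.
\end{align*}
```

In contrast to the existing configuration, no term proportional to $(1 - Q) S r$ appears, so repetitive disturbances drop out of the propagation entirely.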
3.4.2. Performance. Under the assumption that $d$ and $v$ are repetitive, the converged tracking error yielded by the proposed ILC algorithm in (16) satisfies (19), which leads to

$$e^{\infty} = 0. \quad (20)$$

The zero asymptotic tracking error indicates that the proposed algorithm achieves perfect tracking performance while maintaining the robustness to uncertainties at high frequencies.
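The difference between the two configurations can be checked numerically at a single frequency. The sketch below is an illustration under assumptions, not the authors' code: the update laws are taken as $u_{ff}^{j+1} = Q(u_{ff}^{j} + L e^{j})$ for the existing configuration and $u_{ff}^{j+1} = u_{ff}^{j} + Q L e^{j}$ for the proposed one, disturbances are omitted, and the values of `G`, `Q`, and `S_R` are hypothetical.

```python
import cmath


def iterate_ilc(update, g, s_r, iterations=200):
    """Iterate e = S*r - G*u_ff at one frequency and return the final error."""
    u_ff = 0j
    for _ in range(iterations):
        error = s_r - g * u_ff
        u_ff = update(u_ff, error)
    return s_r - g * u_ff


G = 0.8 * cmath.exp(-0.3j)   # hypothetical closed-loop response at one frequency
L = 1.0                      # low-frequency approximation L = 1 (Remark 4)
Q = 0.7                      # hypothetical Q-filter gain at that frequency
S_R = 1.0 + 0j               # hypothetical value of S*r at that frequency


def existing_update(u_ff, error):
    """Q-filter applied to both the control signal and the learning term."""
    return Q * (u_ff + L * error)


def proposed_update(u_ff, error):
    """Q-filter applied to the learning term only."""
    return u_ff + Q * L * error


e_existing = iterate_ilc(existing_update, G, S_R)
e_proposed = iterate_ilc(proposed_update, G, S_R)
e_closed_form = (1 - Q) * S_R / (1 - Q * (1 - G * L))

print(abs(e_existing), abs(e_closed_form), abs(e_proposed))
```

With these values the existing configuration settles at a nonzero error that matches the closed-form expression, while the proposed configuration drives the error to zero despite the model mismatch between `L` and the inverse of `G`.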
3.5. Practical Considerations. In the real implementation of the proposed algorithm (16), some practical aspects should be considered.
3.5.1. Zero-Phase Q-Filter. The Q-filter is used to maintain convergence at all frequencies even in the face of model uncertainties. Although any low-pass filter could be used as the Q-filter, a zero-phase filter is generally preferable in ILC since it introduces no phase lag. In [16,32], a forth-and-back filtering principle is provided to apply a regular low-pass filter in a zero-phase manner. In [19], several ways of representing the zero-phase filter are offered using matrices and transforms.
A comparison of the frequency characteristics of a second-order regular Q-filter and a second-order zero-phase Q-filter is illustrated in Figure 6.
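The forth-and-back filtering principle can be sketched as follows. For brevity this illustration uses a first-order low-pass filter instead of the second-order filters compared in Figure 6; the filter coefficient, the triangular test pulse, and all names are assumptions.

```python
def lowpass_forward(x, a):
    """Causal first-order low-pass: y[k] = a*y[k-1] + (1-a)*x[k], zero initial state."""
    y, prev = [], 0.0
    for xk in x:
        prev = a * prev + (1.0 - a) * xk
        y.append(prev)
    return y


def lowpass_zero_phase(x, a):
    """Filter forward, then filter the time-reversed result and reverse it again."""
    forward = lowpass_forward(x, a)
    backward = lowpass_forward(forward[::-1], a)
    return backward[::-1]


N, CENTER = 201, 100
# Symmetric triangular pulse centred at sample 100 (hypothetical test signal).
pulse = [max(0.0, 1.0 - abs(k - CENTER) / 10.0) for k in range(N)]

causal = lowpass_forward(pulse, 0.8)
zero_phase = lowpass_zero_phase(pulse, 0.8)

peak_causal = causal.index(max(causal))
peak_zero_phase = zero_phase.index(max(zero_phase))
print(peak_causal, peak_zero_phase)
```

The causal filter delays the peak of the pulse, whereas the forward-backward pass leaves it at the original sample. The price is that the magnitude response is applied twice, and that the whole signal must be available, which is the case in ILC because the error of the previous iteration is stored.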

Learning coefficient.
In practice, more than one iteration may be desired to average out the influence of nonrepetitive disturbances and measurement noise, although zero convergence rate can be achieved theoretically. Therefore, the learning filter $L$ is usually multiplied by a coefficient $0 < \gamma < 1$ to reduce the convergence rate, thereby making the error converge smoothly and averaging out uncertainties through iterations. Note that a large learning coefficient induces fast convergence but is associated with large noise amplification, and vice versa.
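The trade-off can be quantified in an idealized setting. The sketch below assumes a perfect model ($GL = 1$), $Q = 1$, and an output noise that is independent from one iteration to the next; under these assumptions the repetitive error contracts by $|1 - \gamma|$ per iteration, and the steady-state error variance equals $2/(2 - \gamma)$ times the variance of the noise seen through $S$. The function names are hypothetical, and the formulas hold only under the stated assumptions.

```python
def error_contraction(gamma):
    """Per-iteration factor on the repetitive error when G*L = 1 and Q = 1 (assumed)."""
    return abs(1.0 - gamma)


def noise_variance_gain(gamma):
    """Steady-state var(e) / var(S*v) for iteration-independent noise (assumed model)."""
    return 2.0 / (2.0 - gamma)


def iterations_to_reduce(gamma, factor):
    """Smallest number of iterations after which (1 - gamma)^j <= factor."""
    remaining, count = 1.0, 0
    while remaining > factor:
        remaining *= error_contraction(gamma)
        count += 1
    return count


for gamma in (0.5, 0.95):
    print(gamma, iterations_to_reduce(gamma, 0.01), noise_variance_gain(gamma))
```

A smaller coefficient needs more iterations to remove the repetitive error but passes less of the nonrepetitive noise into the converged error, which matches the qualitative statement above.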

Delay Compensator.
A time ahead $z^{\alpha}$ is usually incorporated into the ILC controller in order to compensate (1) the relative degree of the system, (2) the time delay resulting from mechanical dynamics, sensors, actuators, and amplifiers as discussed in Section 2.2, and (3) the phase delay caused by a nonzero-phase Q-filter. A proper time ahead is of significant importance to the ILC performance. Insufficient or excessive time ahead would lead to slow convergence or even divergence.
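The effect of the time ahead can be illustrated with a deliberately simple model. The sketch below assumes that the product of the closed-loop system and the learning filter reduces to a pure delay of `delay` samples and that the propagation factor has the form $|1 - \gamma z^{\alpha} Q z^{-\text{delay}}|$; the helper names and the numerical values are hypothetical.

```python
import cmath
import math


def advance(signal, alpha):
    """Time ahead by alpha samples: out[k] = signal[k + alpha], zero padded at the end."""
    return (signal[alpha:] + [0.0] * alpha)[:len(signal)]


def propagation_magnitude(theta, alpha, delay, gamma, q_mag=1.0):
    """|1 - gamma * z^alpha * Q * z^-delay| at z = exp(i*theta), pure-delay assumption."""
    z = cmath.exp(1j * theta)
    return abs(1.0 - gamma * q_mag * z ** (alpha - delay))


def worst_case(alpha, delay, gamma, points=512):
    """Maximum propagation magnitude over a grid of frequencies in [0, pi]."""
    return max(propagation_magnitude(math.pi * i / points, alpha, delay, gamma)
               for i in range(points + 1))


GAMMA, DELAY = 0.95, 3   # hypothetical learning coefficient and loop delay
for alpha in (2, 3, 5):
    print(alpha, worst_case(alpha, DELAY, GAMMA))

print(advance([1.0, 2.0, 3.0, 4.0], 1))
```

When the time ahead matches the delay, the factor equals $1 - \gamma$ at every frequency; with a mismatch in either direction it exceeds 1 at high frequencies unless the Q-filter attenuates them, which is why both insufficient and excessive time ahead can cause divergence.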
In summary, the proposed ILC algorithm, including the learning coefficient $\gamma$ and the time ahead $z^{\alpha}$, is given in (21).

Referring to (17), the frequency-domain convergence condition of the ILC algorithm in (21) is given in (22).
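One consistent way to write the complete algorithm and its convergence condition is sketched below. It assumes that the learning coefficient, the time ahead, and the Q-filter all act on the learning term of configuration II and that the previous control signal is left unfiltered; the expressions follow from the propagation argument of Theorem 2 and are not copied from the original displays.

```latex
% Assumed form of the complete update law
\begin{align*}
u_{ff}^{j+1} &= u_{ff}^{j} + \gamma\, z^{\alpha}\, Q\, L\, e^{j}.
\end{align*}
% Corresponding frequency-domain convergence condition, z = e^{i\theta}
\begin{align*}
\left| 1 - \gamma\, z^{\alpha}\, Q\, G\, L \right| &< 1 .
\end{align*}
```

Setting $\gamma = 1$ and $\alpha = 0$ recovers the condition $|1 - QGL| < 1$ of the basic proposed algorithm.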

Experimental Verification
In this section, the proposed ILC algorithm in (21) is experimentally validated on a wafer stage as shown in Figure 7.
The wafer stage is mounted on an air bearing with 400 kPa air pressure. The position of the linear motor is measured by a Renishaw linear incremental encoder with an effective resolution of 0.1 μm and a maximum velocity of 0.5 m/s. The stage is driven by an all-digital power amplifier based on the field-programmable gate array (FPGA) XC3S400. The bandwidth of the drives is about 2.0 kHz. The wafer stage system is stabilized by a PID feedback controller. The bandwidth of the position loop is about 60 Hz. The drives and the controllers communicate with each other through high-speed fibers. The proposed algorithm is implemented in C on a TMS320C6414TGLZ DSP controller. The sampling period of the control system is $T_s = 200$ μs. The internal data of the DSP are transmitted to the computer through a network cable and then displayed on the screen. Although it is common industrial practice to use a second-order reference trajectory based on a rigid-body consideration of the system, high-order motion profiles are preferred in ultraprecision motion systems since they excite fewer resonant dynamics. In this paper, a third-order polynomial motion trajectory with constraints on the 1st to the 3rd derivatives is used, as shown in Figure 8. It is generated by a trajectory planning algorithm that takes the system dynamics into account.
4.1. Effectiveness of the Proposed Algorithm. In the experiments, the learning coefficient $\gamma$ is set to 0.95, and the zero-phase Q-filter in (23) is used as an alternative to the regular low-pass filter, where $T_s$ is the sampling period and $\omega_c = 2\pi \times 100$ is the frequency in rad/s. In order to determine the best time ahead $\alpha$, experiments with $\alpha$ in the range of 14 to 39 are performed. After convergence, the stable error signals are shown in Figure 9, from which it can be observed that the best servo performance is achieved when the time ahead is set to $\alpha = 29$. The convergence process is shown in Figure 10. Figure 11 shows the frequency responses of the error propagation functions under $\alpha = 14$, $\alpha = 29$, and $\alpha = 39$. The following indications can be obtained from Figures 10 and 11.
(1) The proposed algorithm is convergent, although the nonrepetitive disturbances and measurement noise could lead to slight fluctuation in the converged error signal.
(2) The learning is most efficient under α = 29 since the corresponding error propagation function has the lowest magnitude at each frequency.
(3) Insufficient ($\alpha = 14$) or excessive ($\alpha = 39$) time ahead would lead to slow convergence or even divergence. It can be forecast that high-frequency errors at about 70 Hz to 100 Hz will be amplified for the learning process with $\alpha = 14$, and divergence would happen as the iterations proceed, since the magnitude of the error propagation function exceeds 1 at these frequencies.

4.2. Superiority of the Proposed Algorithm. To further verify the high performance of the proposed method, an experimental comparison with the existing ILC algorithm in (8) is performed. The Q-filter in (8) is set the same as the one in (23). The time ahead is set to 29. The tracking error in each iteration is shown in Figure 12, from which the following observations can be obtained:
(1) The conventional ILC algorithm improves robustness against uncertainties, but at the cost of performance. The tracking error converges to 4 μm during the constant-velocity phase and 20 μm during the acceleration phase.

Conclusion
In order to deal with the trade-off between servo performance and robustness against model uncertainties in existing model-based ILC, a novel Q-filter configuration is proposed in this paper. Three commonly used ILC configurations in the two-DOF control structure are compared from the perspective of theory and practice. Theoretical analysis reveals the compromise that the existing ILC algorithms make on servo performance. Unlike conventional Q-filter configurations, the Q-filter in the proposed ILC algorithm is configured only in the error signal. It avoids the weakening of the control signal and ensures the filtering of the high frequencies in the error signal. Some additional practical considerations are provided for implementing the proposed ILC algorithm. Experimental results confirm the effectiveness and superiority of the proposed method.
The observation in the experimental results that the tracking error in the acceleration phase is always larger than that in the constant-velocity phase motivates the development of a Q-filter with a varying cut-off frequency in future work.

Figure 1: The phenomenon of initial convergence followed by divergence in ILC with only a learning filter.

Figure 3: Frequency response functions in Bode representation of the x-direction long-stroke wafer stage.

3.4.1. Convergence. The following theorem presents the convergence condition of the proposed ILC algorithm in (16).

Figure 6: Frequency characteristic comparison of regular Q-filter and zero-phase Q-filter.

Figure 9: Converged tracking errors with different times ahead.

Figure 11: Frequency characteristics of error propagation functions under different times ahead.

Table 1: Parameters in the wafer stage model.