A Progressive Output Strategy for Real-time Feedback Control Systems

ABSTRACT
The real-time requirements imposed on a feedback control system are often hard to meet, as the controller spends a disproportionately large amount of time waiting for a control cycle to reach its final state. Once that final state is established, multiple tasks have to be prioritized and launched simultaneously, and the system is given an extremely short time window to generate its output. This huge gap between the wait and action times, perceived as a load unbalancing problem, prevents control decisions from being made in real time. To address this problem, we present a progressive output strategy that divides a control cycle into a few fine-grained control intervals and schedules the entire workload across them. Dubbed the Progressive Output Strategy (PROS), this approach actively requests that intermediate states be created between adjacent control cycles in an adaptive manner. Specifically, as the sensing information arrives, a system that adopts PROS generates a series of intermediate solutions that eventually converge to the final optimal control signal. This way, the controller no longer wastes its time idling while waiting for all the data to arrive for one-shot decision-making. Rather, the system cuts down the waiting time and acts on the intermediate data/states throughout the entire control cycle. Experimental results confirm that adopting PROS in a feedback control loop distributes the workload evenly over a control cycle and thereby reduces the time delay by as much as two orders of magnitude, which is essential for meeting the most stringent timing requirements.


INTRODUCTION
A system that employs a feedback control loop can tightly integrate sensing, computing, and actuating components to handle uncertainties and dynamically adapt its behavior (Lindberg M. et al 2010), (Jacob R. et al 2016). In a feedback loop, as physical processes evolve over time, the computation time is tightly linked to the timing in the physical domain, giving rise to the real-time requirement (Jacob R. et al 2016). To date, a great deal of research has focused on how to speed up the computation in the feedback control loop. These works center on four themes: (1) making the system model numerically easy to compute (Shu L. et al 2012), (2) decomposing the optimization into sub-problems and solving them in a distributed manner (Boyd S. et al 2011), (3) generating a near-optimal solution at each control step until the final solution is reached (Wang Y. et al 2010), (Bak S. et al 2017), (Li H. et al 2018), and (4) producing an explicit control law (e.g., Explicit Model Predictive Control (EMPC)) (Bemporad A. et al 2002), (Oberdieck R. et al 2017).
Despite all these advances in speeding up the computation for fast feedback loop control, it is still hard for a control system to meet the real-time requirement by focusing only on the feedback loop itself. Rather, the workload unbalancing problem that was ignored in previous works needs to be carefully addressed. This problem becomes particularly important when all the data have finally arrived and multiple high-priority tasks have to be launched together, pressuring the system to output its results as soon as possible, as explained in the motivating example shown in Fig. 1. Here a traffic control system is put in place to help alleviate traffic congestion by optimizing the traffic signal operation strategy. Observed traffic states are used as the feedback information, and the controller needs to adjust the traffic signals in response to changes in traffic flow. The traffic information is received every one or several control cycles, each of which can last as long as a few minutes; during this time, the controller has to sit idle, waiting for the arrival of traffic data. However, when the traffic data finally arrive, the controller is expected to generate the optimal signal strategy within just a few milliseconds. This huge gap between wait and action times, perceived as a load unbalancing problem that prevents the control system from achieving its real-time goals, is addressed in this paper.
A straightforward approach to solving this load unbalancing problem is to spread the workload over the entire control cycle. More specifically, instead of having the controller waste its time idling while waiting for all the data needed to make a control decision, the system cuts down the waiting time and acts on the intermediate data/states that are available at specific intervals. Throughout this process, the controller keeps generating suboptimal solutions and refining them in the next control interval. By the time all the data needed for making a decision in a control cycle are finally available, the controller can have a "warm start". As a result, the controller, building on the suboptimal solution obtained thus far, is able to quickly complete its computation and generate the actual control signal, as shown in Fig. 2.
The sampling rate, which relates to the number of sampling intervals in a control cycle, needs to be carefully selected, as it has serious implications for sampling complexity and control performance (Haimovich H. et al 2013), (Miskowicz M. et al 2014), (Sahoo A. et al 2015), (Wu L. et al 2017), (Tzoumas V. et al 2018). To the best of our knowledge, there is no open literature that deals with the sampling issues pertaining to workload balancing. Based on the idea of using the idle time within a control cycle to generate sub-optimal output solutions, we present the Progressive Output Strategy (PROS) (see Fig. 3). PROS has two components: the sampling component and the computing component.
In each control cycle, the sampling component adaptively samples the state of the plant, after which it triggers computation based on the partial information it has already received. The computing component updates the estimate of the optimal solution and outputs an optimal solution at the end of the current control cycle. During the wait time, an intermediate solution can be generated as the state of the physical process changes. In this way, instead of staying completely idle, the sampling component requests sensing information and activates the computing process. Because the workload is spread throughout the entire control cycle, the amount of time that needs to be spent on computation at each sampling time tends to be small. As a result, one may use a low-end microcontroller, as opposed to a more expensive high-end machine, to achieve the same real-time performance. Although there are works that consider both sampling and computing at the same time (Tarbouriech S. et al 2016), (Pan Y V. et al 2015), (Hans J F. et al 2014), as far as we know, this paper is the first attempt to reduce the time delay by adopting a strategy that spreads the workload. Our specific contributions are summarized as follows:
• A novel problem on how to add load balancing into real-time feedback control is formulated.
• We present the PROS strategy, which progressively obtains the optimal control signal within a control cycle.
• Empirical experiments show that our approach can reduce the time delay by two orders of magnitude on average with a low sampling complexity.
A physical process can be mathematically represented as a seven-tuple P=(X,U,ρ,X_0,X_g,T,T_c), where
• X is the set of states, X⊆R^n.
• U is the set of feasible control signals, U⊆R^m.
• ρ:X×U→X is the transition function (or dynamic function) of the physical system. ρ can be approximated by a linear model; that is, x_(t+1)=Ax_t+Bu_t for state vector x_t and input u_t at the t^th control cycle.
• X_0 is the set of initial states, and X_0⊂X.
• X_g is the set of target states, and X_g⊂X.
• T is the control horizon, defined as the total number of control cycles.
• T_c is the time span of one control cycle.
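As a small illustration of the linear model x_(t+1)=Ax_t+Bu_t, the following Python sketch rolls the state forward over a few control cycles. The matrices and inputs are hypothetical values chosen only for illustration; they are not taken from the experimental setup of this paper.

```python
def step(A, B, x, u):
    """One transition x_{t+1} = A x_t + B u_t (matrices given as lists of rows)."""
    return [
        sum(A[i][j] * x[j] for j in range(len(x)))
        + sum(B[i][k] * u[k] for k in range(len(u)))
        for i in range(len(A))
    ]


def rollout(A, B, x0, inputs):
    """Apply a sequence of control signals and return all visited states."""
    states = [x0]
    for u in inputs:
        states.append(step(A, B, states[-1], u))
    return states


# Hypothetical two-state plant with one input (illustrative values only).
A = [[1.0, 0.5], [0.0, 1.0]]
B = [[0.0], [1.0]]
trajectory = rollout(A, B, [0.0, 0.0], [[1.0], [1.0], [0.0]])
```

Each entry of `trajectory` is the state at the start of one control cycle, which is the quantity the controller would measure in the feedback loop.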
The goal is to determine the best feedback loop based on the following criteria:
• Minimize the performance loss J_P (Eq. (1)), where the performance is defined as a weighted distance between the target state x_g∈X_g and the current state x_t∈X. The weights comprise a penalizing matrix for the state deviation from the reference trajectory and a cost matrix for the control signals.
• Minimize the total time delay (Eq. (2)), measured as the total computation time needed to obtain all the control signals, that is, the accumulated time delay of the control signal over the control cycles.
According to classic control theory, a controller can be expressed as a quadratic program (QP) if the physical system can be approximated by a linear model and the cost function is quadratic. The controller needs to minimize the performance loss while taking into account constraints (e.g., safety) applicable to the control signals; the resulting QP is referred to as Eq. (3). Its weighting matrices are positive definite and are obtained by incorporating the linear model into the performance loss in Eq. (1).
As indicated in Eq. (3), the controller needs to receive the measured state of the system at the end of each control cycle. The approximate linear model is then applied to predict future states from this measurement, and the controller minimizes a cost function that accounts for both the current and predicted states. The controller's output is an optimal control sequence over the prediction horizon, but only the portion covering the upcoming control cycle is implemented in the system. At the next time step, the states are measured again and the collected data are used to calculate the optimal solution for the following interval, in a rolling horizon fashion.
Although the rolling horizon style introduced above makes the controller more robust against external disturbances, it also leads to a load unbalancing problem. According to the formulation of the feedback loop, the state is measured at the end of a control cycle, and the control signal is expected to be available at the same time, which imposes a heavy workload on the control system.
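For concreteness, the two criteria and the resulting QP can be written in a common linear-quadratic form. The block below is a sketch under assumptions and is not necessarily the notation of Eqs. (1)-(3): the symbols Q, R, d_t, J_D, N, H, F, G, W, and E are introduced here for illustration only, following the usual condensed formulation of model predictive control with the target state shifted to the origin.

```latex
% Assumed notation (not from the paper): Q, R are weighting matrices,
% d_t is the computation delay in cycle t, N is the prediction horizon.
\begin{aligned}
J_P &= \sum_{t=0}^{T-1} \Big[ (x_t - x_g)^{\top} Q \,(x_t - x_g) + u_t^{\top} R \, u_t \Big],
      \qquad Q \succ 0,\; R \succ 0, \\
J_D &= \sum_{t=0}^{T-1} d_t, \\
\mathrm{QP}(x_t):\quad
    &\min_{U}\; \tfrac{1}{2}\, U^{\top} H \, U + x_t^{\top} F \, U
      \quad \text{s.t.} \quad G\,U \le W + E\,x_t,
      \qquad U = \big[\, u_t^{\top}, \ldots, u_{t+N-1}^{\top} \big]^{\top}.
\end{aligned}
```

In this form the measured state x_t enters the QP only as a parameter, which is the property that the state-oriented computing strategy relies on.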

PROS: PROGRESSIVE OUTPUT STRATEGY
The PROS framework controls the sampling interval, warm-starts the computation, and imposes a low workload for the computation of the control signal. As shown in Fig. 3, the sampling component of PROS adaptively sets the interval for collecting information from the sensors. This component is responsible for distributing the load over the wait time with a minimum number of sampling intervals. To achieve this goal, we propose an urgency-triggered sampling strategy, in which urgency is defined based on the load information provided by the computing component and the remaining time in the current control cycle.

State-oriented Computing Strategy
The main idea of PROS is to adjust the controller's sampling interval, which requires the computing component to adapt quickly to state changes. In light of the parametric active-set algorithm implemented in qpOASES (Hans J F. et al 2014), which is an effective method for solving a sequence of QPs by exploiting their geometrical properties, we herein present the State-Oriented Computing Strategy (SOCS).
To obtain the solution of QP(x_t) at the current state x_t, SOCS iteratively searches along a straight line in the state space, starting from the preceding solution of QP(x_(t-1)). Since the state of the physical system is viewed as a parameter of Eq. (3), the state space is treated the same as the parameter space. Fig. 4 shows the parameter space of a two-dimensional example. According to (Hans J F. et al 2014), the parameter space can be divided into multiple regions. In each region, the optimal solution to Eq. (3) has an identical active set, which is the set of indices of the active constraints. This means that if the previous state and the current state fall into the same region, their corresponding optimal solutions have the same active set, and thus no iteration is needed. On the other hand, if the two states are located in different state regions, several active set changes (iterations) are necessary to move from the previous solution to the current one. As shown in Fig. 4, we use a trajectory to indicate the dynamic change of states. Each time the trajectory crosses a boundary between state regions, the active set is updated, which corresponds to one iteration.
Note that SOCS generates sub-optimal control signals as the state changes (along a straight line originating from x_(t-1)), and only the latest solution (obtained when the path reaches the state x_t) is applied to the real physical system.
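The link between region crossings and iterations can be illustrated on a toy problem. The sketch below uses a hypothetical scalar parametric QP, min_u 0.5u^2 - xu subject to lo ≤ u ≤ hi, whose parameter space splits into three regions (lower bound active, no bound active, upper bound active). It is a minimal illustration of the idea, not the SOCS implementation.

```python
def region_index(x, lo=-1.0, hi=1.0):
    """Region of the parameter x for the toy QP: min_u 0.5*u**2 - x*u, lo <= u <= hi.

    0: lower bound active, 1: no bound active, 2: upper bound active.
    """
    if x < lo:
        return 0
    if x > hi:
        return 2
    return 1


def optimal_control(x, lo=-1.0, hi=1.0):
    """Closed-form minimizer of the toy QP: the state clipped to the bounds."""
    return max(lo, min(hi, x))


def iterations_between(x_prev, x_cur, lo=-1.0, hi=1.0):
    """Active-set changes needed on the straight line from x_prev to x_cur."""
    return abs(region_index(x_cur, lo, hi) - region_index(x_prev, lo, hi))
```

For example, moving the state from 0.2 to 0.8 stays inside one region and needs no iteration, whereas moving from -2.0 to 2.0 crosses two boundaries and needs two.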
With the quantities defined in Eqs. (4a)-(4e), the state and constraint vectors are given by Eqs. (5a)-(5c), where the parameter τ is the step length along the parameter change direction, which is determined by Eq. (10). We assume that the starting point is a known optimal solution U_0^* and λ_0^* (with corresponding optimal active set A_0^*) of the last QP(x_0), and we intend to solve QP(x_0^new). The basic idea of qpOASES is to move from x_0 towards x_0^new, and thus from (U_0^*,λ_0^*) towards (U_new^*,λ_new^*), while keeping primal and dual feasibility, i.e. optimality, at all the intermediate points.
The optimality condition is met at τ=0, as the solution starts from the previous optimal solution. The active set stays unchanged as long as no previously inactive constraint becomes active and no previously active constraint becomes inactive. Correspondingly, the maximum possible step length τ_max is determined by Eqs. (10a)-(10c). If τ_max equals one, the new state x_0^new has been reached and, at the same time, the solution of the new quadratic program QP(x_0^new) has been found. Otherwise, a constraint has to be added to or removed from the active set A, which is what prevents τ_max from reaching 1. Once the active set is updated, the procedure repeats, and a new step direction and size are obtained. The iteration stops when τ_max is equal to one, indicating that the solution of QP(x_0^new) has been found.
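As an illustration of how a maximum step length of this kind is typically computed, the following sketch implements the generic ratio test used in parametric active-set methods. It assumes that the slacks of the inactive constraints and the multipliers of the active constraints vary linearly in τ; the function name and argument layout are hypothetical and are not taken from qpOASES or from Algorithm 1.

```python
def max_step(slack, d_slack, lam, d_lam, eps=1e-12):
    """Largest tau in [0, 1] that keeps slack + tau*d_slack >= 0 (primal
    feasibility of inactive constraints) and lam + tau*d_lam >= 0 (dual
    feasibility of active constraints).

    Returns (tau, blocking), where blocking is None if the full step is
    possible, ("add", i) if inactive constraint i becomes active, or
    ("drop", i) if active constraint i becomes inactive. The index i is the
    position within the list that was passed in.
    """
    tau, blocking = 1.0, None
    for i, (s, ds) in enumerate(zip(slack, d_slack)):
        if ds < -eps and s / -ds < tau:  # slack shrinks and hits zero first
            tau, blocking = s / -ds, ("add", i)
    for i, (l, dl) in enumerate(zip(lam, d_lam)):
        if dl < -eps and l / -dl < tau:  # multiplier shrinks and hits zero first
            tau, blocking = l / -dl, ("drop", i)
    return tau, blocking
```

When the returned step is smaller than one, the reported constraint is the one whose status has to change before the search along the line can continue.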
The detailed algorithm is summarized in Algorithm 1. The input is the optimal solution (U_(i-1)^*,λ_(i-1)^*) of the problem QP(x_(i-1)) at the previous control cycle, where x_(i-1) is the system state (i.e. the problem parameter). From the optimal solution, we can obtain the corresponding optimal active set A_(i-1)^*, which is the set of indices of the active constraints (constraints that are satisfied with equality). The output of the algorithm is the optimal solution and the corresponding active set of the current control cycle.
As SOCS produces a sequence of optimal solutions for QPs along the path, it is possible to break from this sequence any time and initiate a new path from the current iteration towards the next QP.

Urgency-triggered Sampling Strategy
This framework allows the workload of a control cycle to be distributed over several sampling intervals, and it comprises the following steps:
1) The system samples the plant and collects the data.
2) The system then breaks the cycle time into a number of sampling cycles (intervals), each of which lasts for wait.time.
3) During each sampling cycle (interval), computing work can be performed, and an intermediate solution, as an approximation of the final output, is generated at the end of every interval.
Let the execution time of the computing component be less than or equal to the sample interval length, i.e., exec.time≤wait.time. The intermediate solution then remains primal and dual feasible as τ_max converges to 1; according to Eq. (10), primal feasibility is expressed by Eq. (11) and dual feasibility by Eq. (12). As a result, the most important part of this framework is the choice of the sample interval length wait.time. One simple approach is to divide T_c directly into a fixed number of intervals. To spread as much of the workload as possible across the idle time, wait.time should be made as short as possible while avoiding a collision between the sampling and computing operations. Any potential conflict can be avoided by setting wait.time to the maximum execution time of the computing component (Eq. (13)).
A coordinated sampling and control scheme is summarized in Algorithm 2. We first initialize the current active set with the working set obtained from the previous control cycle. We then run the proposed sampling and computing cooperation strategy for a fixed number of iterations, which is a constant determined offline. The sensors do not need to measure the system continuously throughout a control cycle; instead, they measure the state x_(i|t-1) once every wait.time (seconds). After the sensor information x_(i|t-1) is obtained, the suboptimal control signal U_(i|t-1)^* is derived by running SOCS as explained in the previous section.
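The fixed sampling scheme can be sketched in a few lines of Python. This is a minimal sketch under assumptions: the function names, the callables `sample_state` and `socs_update`, and all numeric values are hypothetical stand-ins, and the number of intervals is assumed to be the cycle time divided by wait.time.

```python
import math


def fixed_wait_time(exec_times, cycle_time):
    """Return (wait_time, n_intervals) for a fixed sampling schedule."""
    wait_time = max(exec_times)  # worst-case execution time bounds the interval
    # Small tolerance guards against floating-point round-off in the division.
    n_intervals = max(1, math.floor(cycle_time / wait_time + 1e-9))
    return wait_time, n_intervals


def run_fixed_cycle(sample_state, socs_update, solution, n_intervals):
    """Sample once per interval and refine the warm-started solution each time."""
    for i in range(n_intervals):
        x = sample_state(i)                  # hypothetical sensor read
        solution = socs_update(solution, x)  # hypothetical SOCS refinement
    return solution


# Illustrative numbers only: a 0.1 s cycle and three measured execution times.
wait_time, n_intervals = fixed_wait_time([0.004, 0.01, 0.007], cycle_time=0.1)

# Toy stand-ins: the "plant" drifts linearly and the "solver" clips the state.
final = run_fixed_cycle(
    sample_state=lambda i: 0.2 * (i + 1),
    socs_update=lambda prev, x: max(-1.0, min(1.0, x)),
    solution=0.0,
    n_intervals=n_intervals,
)
```

Because the interval is never shorter than the slowest observed execution, each refinement finishes before the next sample is due, which is the collision-avoidance property described above.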
In the fixed sampling strategy, the length of wait.time is determined offline. A two-dimensional example is illustrated in Fig. 5. In applications where frequent sampling is allowed, the fixed sampling strategy is expected to significantly reduce the number of iterations needed to generate the optimal control signal, and thus help improve the real-time performance. However, in some cases sampling, such as perception-based sampling, can be expensive or time-consuming. In these applications, the design objective is two-fold: minimizing the number of samples, and spreading most of the workload across the idle time. As shown in Fig. 4, the regions of the state space are geometrically irregular, indicating that active set changes may happen at irregular times. A fixed wait.time, with which a control cycle is divided into equal intervals, would fail to achieve both objectives. Rather, we propose an adaptive sampling strategy in which the sample interval is determined by Eq. (15) from f and T_r, where f is a strictly monotonically decreasing function of est.load and T_r is the remaining time of the current control cycle. One can see from Algorithm 1 that on the path from the current sampled state x_(i-1|t) to the next one x_(i|t), SOCS generates a step length sequence τ_1, τ_2, …, τ_K (Eq. (16)), where τ_k, k=1,2,…,K, is the step size from one state region to the next. Since the number of iterations K can be considered a workload estimate, a reasonable choice of the function f is given in Eq. (17), in which α is a weighting parameter.
According to Eq. (15), if the remaining time is short and the estimated future load, est.load, is high, there is an urgency to complete the computation, and wait.time is cut short. For this reason, the strategy is named the urgency-triggered sampling strategy; it is illustrated in Fig. 6 and detailed in Algorithm 3. Unlike in Algorithm 2, wait.time is adjusted online. The algorithm tracks the cumulative time spent, which is the accumulated sum of wait.time and the execution time of SOCS. If the remaining time falls below a predetermined threshold, the sampling interval is set to a prescribed value instead.
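The behavior of an urgency-triggered interval can be sketched as follows. The sketch rests on several assumptions that are not taken from Eqs. (15) and (17): the interval is assumed to be the product f(est.load)·T_r, f is assumed to be 1/(1+α·est.load), which is one strictly decreasing choice, and below the threshold the sketch simply uses up the remaining time. The parameter names and default values are hypothetical.

```python
def urgency_wait_time(est_load, remaining, alpha=0.5, min_wait=0.01, threshold=0.02):
    """Pick the next sampling interval from the estimated load and remaining time.

    est_load  : workload estimate, e.g. the iteration count K of the last SOCS run
    remaining : time left in the current control cycle (seconds)
    alpha     : hypothetical weight on the load estimate
    min_wait  : lower bound, e.g. the maximum execution time of the solver
    threshold : below this remaining time, the sketch samples once at the cycle end
    """
    if remaining <= threshold:
        return remaining  # assumption: one last sample at the end of the cycle
    f = 1.0 / (1.0 + alpha * est_load)  # assumed strictly decreasing in est_load
    return min(remaining, max(min_wait, f * remaining))


# Higher estimated load or less remaining time both shorten the interval.
light = urgency_wait_time(est_load=2, remaining=0.1)
heavy = urgency_wait_time(est_load=8, remaining=0.1)
late = urgency_wait_time(est_load=2, remaining=0.05)
```

With these choices a heavily loaded controller samples sooner than a lightly loaded one, and the interval shrinks as the end of the control cycle approaches, which matches the qualitative behavior described above.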

Experimental Setup
To evaluate our methodology on an industrial control application, we apply it to the DC servomechanism reported in (Bemporad A. et al 1998). Details of the experimental setup are provided in Tab. 1. The SOCS algorithm is implemented in Matlab using the qpOASES toolbox (Hans J F. et al 2014).
Several tests are conducted to investigate the behavior of three alternative computing strategies:
C1: traditional QP solver with an active set strategy.
C2: traditional QP solver initialized with the previous solution.
C3: state-oriented computing strategy presented in Section 3.
Next, we combine the above computing strategies with different sampling strategies:
S1: fixed sampling strategy, in which all the sampling intervals are set to be the same.
S2: urgency-triggered sampling strategy, in which the sampling interval is adaptively determined by the estimate of the future load and the remaining time of the control cycle, as presented in Section 3.
For S1, one control cycle is divided into equal intervals, and the duration of an interval is determined by the number of samples. According to Eq. (13), the sampling interval of S1 should be set to the maximum execution time observed in the tests C1-C3, which in this case is 0.01 seconds. Accordingly, the fixed sampling interval is 0.01 seconds, which means that S1 takes 10 samples per control cycle. Since S2 requires the load information provided by C3, S2 can only be combined with C3.

Simulation Results
The optimization algorithm solving the underlying QP problems in C1, C2, and C3 is an active-set method, which can be easily warm-started. It is therefore reasonable to use the number of iterations to profile the workload at different times. Since the real-time constraint applies at the end of each cycle (when the control signal must be output), the focus is placed on the workload in the last sample interval. Fig. 7 depicts the number of iterations at each time step for different strategies within three randomly selected control cycles.
Load unbalancing problem: Strategies C1, C2 and C3 have only one computing component, and their sampling interval spans the entire control cycle. As shown in Fig. 7, all the iterations are completed in one time step, leading to a load unbalancing problem that may cause a large time delay. Although C2 and C3 can warm-start from the previous solution, their performance improvement is not obvious.
Load spreading: The load unbalancing problem is addressed by load spreading. To do this, computing components C2 and C3 are combined with the sampling strategies S1 and S2. Unlike the simple combinations C2+S1 and C3+S1, the combination C3+S2 (PROS) proposed in this paper makes the sampling and computing components cooperate and adaptively controls the sampling interval and computing complexity. As shown in Fig. 7, C2+S1 cannot deliver load spreading, because its warm-start strategy is based on the previous solution and does not take advantage of the sampling strategy. In contrast, owing to the state-oriented warm-start strategy in C3, C3+S1 and C3+S2 are able to shift part of the load into what would otherwise be idle time. However, C3+S1 relies on frequent sampling, which may cause a large communication delay or add sampling complexity. This problem is further investigated in the following.
Fig. 8 plots the time delay and the number of samples obtained by applying different strategies over 40 control cycles. Here both C3+S1 and C3+S2 significantly outperform the other baseline cases. The performance gap between C3+S1 and C3+S2 is also reasonably small, around 0.12 milliseconds on average. However, the number of samples taken by C3+S2 is much lower than that of C3+S1, 72.25% lower on average. This result confirms that the proposed PROS can achieve near-optimal performance with rather low sampling complexity.
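The reported reduction can be turned into sample counts with a short calculation. The sketch assumes that C3+S1 takes the 10 samples per control cycle stated in the experimental setup in every one of the 40 cycles; the resulting totals are derived here and are not figures quoted from Tab. 2.

```python
cycles = 40          # control cycles in the experiment
s1_per_cycle = 10    # fixed samples per cycle for S1 (from the setup)
reduction = 0.7225   # average reduction reported for C3+S2

s1_total = cycles * s1_per_cycle        # samples taken by C3+S1
s2_total = s1_total * (1 - reduction)   # implied samples taken by C3+S2
s2_per_cycle = s2_total / cycles        # implied average per control cycle
```

Under this assumption, C3+S2 takes about 111 samples in total, or roughly 2.8 per control cycle, compared with 400 for C3+S1.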
A summary of the experimental results over 40 control cycles is shown in Tab. 2. For Avg. Time Delay and Avg. # Iteration, only the last sampling period of each control cycle is evaluated. As one can see, C3 clearly outperforms the other competing control algorithms in terms of average time delay. Combining it with S2 further reduces the numbers of iterations and samples. Therefore, C3+S2 (PROS) achieves a small time delay with a reasonable sampling complexity.

CONCLUSION
Having observed the impact of the load unbalancing problem on the feedback control loop, we developed a cooperative framework, PROS, which coordinates the sampling and computing components of a controller to spread the load over the entire control cycle. The sampling component requests an estimate of the future load from the computing component and uses it to determine the sampling interval within a control cycle, which in turn activates the computation of an intermediate solution. In this way, the intermediate solution progressively approaches the optimal control signal as the sampled state changes within a control cycle. Experimental results showed that PROS outperformed the competing strategies by a wide margin. The proposed PROS can be extended to the feedback control of nonlinear systems.

DISCLOSURE STATEMENT
No potential conflict of interest was reported by the authors.