Experimental evaluation of sensor attacks and defense mechanisms in feedback systems

In this work, we evaluate theoretical results on the feasibility of, the worst-case impact of, and defense mechanisms against a stealthy sensor attack in an experimental setup. We demonstrate that for a controller with stable dynamics the stealthy sensor attack can be conducted and that the theoretical worst-case impact is close to the impact achieved in practice. However, although the attack should theoretically be possible when the controller has integral action, we show that the integral action slows the attacker down and that the attacker is not able to remain stealthy if it does not have perfect knowledge of the controller state. In addition, we investigate the effect of different anomaly detectors on the attack impact and conclude that the impact under detectors with internal dynamics is smaller. Finally, we use noise injection into the controller dynamics to unveil the otherwise stealthy attacks.


Introduction
As more critical infrastructures and industrial processes are connected via communication networks, the threat of cyber-attacks on these systems increases rapidly (Hemsley and Fisher, 2018). The attacks can range from sophisticated attacks, such as the Stuxnet attack on an Iranian uranium enrichment facility (Kushner, 2013), to attacks where the attacker takes over the whole control system, as in the attack on the Ukrainian power grid (Lee et al., 2016). Since these attacks target critical infrastructures, their impact on society can be severe and new methodologies to secure control systems need to be considered (Cárdenas et al., 2008).
As a response to the threat of attacks on control systems, a new field of security complementary to IT security measures has emerged, which uses physical process knowledge and control-theoretic results to investigate both attack strategies and defense mechanisms (Dibaji et al., 2019). While many attack strategies have been investigated, we focus on sensor attacks in this work, because accurate sensor measurements are important for feedback control. Therefore, we want to investigate how a manipulation of these measurements can influence the closed-loop performance. There exists a large body of work on sensor attacks, see, for example, (Mo and Sinopoli, 2010), (Murguia and Ruths, 2016), and (Guo et al., 2018). All of these consider a strong attacker model with full model knowledge of the closed-loop system and the ability to manipulate sensor measurements. Furthermore, if a detector is monitoring the system, the attacker also wants to remain stealthy, that is, not trigger an alarm in the detector. In addition, the attacker is assumed to know the internal states of the closed-loop system, such as the controller and the detector state. However, knowledge of the internal states is unrealistic at the beginning of the attack. In (Umsonst and Sandberg, 2021), we prove that the attacker is able to estimate the controller state perfectly if and only if the linear controller dynamics do not have any eigenvalues outside the unit circle. Once the attacker knows the controller state, it also needs to determine the detector's internal state before injecting its stealthy worst-case attack if a detector with internal dynamics is used. We investigate the problem of detector state estimation in (Umsonst and Sandberg, 2019) and conclude that the attacker can estimate the detector state as well, under the assumption of linear detector dynamics.
With knowledge about both the controller and the detector state, the attacker can now inject its worst-case attack. Here, we will use the convex worst-case impact estimation problem proposed in (Umsonst et al., 2017) to estimate the worst-case impact of a stealthy sensor attack. However, the results on sensor attacks are often theoretical in nature and might not hold when real data is used. The need for evaluation of the attack and detection schemes is also pointed out in the survey on cyber-physical security by Lun et al. (2019). Therefore, the goal of this work is to inspect the impact of the sensor attack in an experimental setup. The real process we use in the experiment is the Temperature Control Lab (Park et al., 2020), which is an Arduino-based process used in (remote) education of control students. In this work, we focus on the impact on, and defense mechanisms for, the physical system in the loop and not the cyber-aspects of the attack.
The contribution of this work is twofold. First, we combine three previously independent results, derived in (Umsonst and Sandberg, 2021), (Umsonst and Sandberg, 2019), and (Umsonst et al., 2017), respectively, into one complete sensor attack strategy with three distinct stages. This illustrates that even a powerful attacker with perfect model knowledge and access to the sensor measurements cannot launch a stealthy attack immediately in a realistic scenario, but needs to complete two more stages before it can launch the worst-case attack. Second, we evaluate the theoretical results of the three-stage sensor attack strategy in an experimental setup with a real process. We show that if the attacker does not complete the first stage successfully, the attack is detected in one of the later stages. Further, the completion of the first stage of the attack can be delayed by adding an integral part to the controller. The controllers we investigate are the linear quadratic Gaussian (LQG) controller and the LQG controller with integral action. Moreover, we evaluate how close the theoretical worst-case impact estimation is to the actual impact of the attack and simultaneously verify that detectors with internal dynamics can mitigate the impact. The detectors we investigate are the $\chi^2$ detector and the multivariate exponentially weighted moving average (MEWMA) detector. In addition, we investigate active noise injection into the controller dynamics to reveal otherwise stealthy attacks by preventing a successful completion of the first stage. Urbina et al. (2016) conducted experiments with a water treatment system under both actuator and sensor attacks. Their results are related to the ones we obtain in this paper. However, Urbina et al. (2016) do not consider the first two stages of the attack, which are necessary for the stealthy execution of the attack.
Although the controller with integral action leads to a larger attack impact than a controller without integral action both in our work and in (Urbina et al., 2016), we show that the integral action prevents the attack from being stealthy. Porter et al. (2019) compare four different detectors with respect to their attack detection capability in simulation and real-world experiments. In the present work, we only compare two of the four detectors investigated in (Porter et al., 2019), but look at a more sophisticated attack strategy. We also use a technique similar to the watermarking in (Porter et al., 2019) to achieve a detection of stealthy attacks. However, instead of adding a noise signal to the controller output, $u(k)$, we inject the noise directly into the controller dynamics in this work.
The remainder of the paper is structured as follows. In Section 2, we introduce the components of the experimental setup, namely the hardware, the controllers, and the anomaly detectors investigated. The sensor attack strategy with its three stages is presented in Section 3. Furthermore, Section 3 shows the experimental results for each attack stage and compares them with the theoretical results. Defense mechanisms such as the choice of the controller, the choice of the detector, and the active noise injection are discussed and investigated in Section 4. Section 5 concludes the paper.
Notation: Let $x \in \mathbb{R}^n$ be an $n$-dimensional real-valued vector and $A \in \mathbb{R}^{m \times n}$ a real-valued matrix with $m$ rows and $n$ columns. The $p$-norm of a vector $x$ is denoted by $\|x\|_p$, and $\mathrm{diag}(A, B)$ is a block diagonal matrix with $A$ and $B$ as its block diagonal entries. Further, $I_n$ represents the $n$-dimensional identity matrix. A Gaussian random variable $x$ with mean $\mu$ and covariance matrix $\Sigma$ is denoted as $x \sim \mathcal{N}(\mu, \Sigma)$.

The closed-loop system
In this section, we present the closed-loop system composed of the temperature control lab, which is the hardware used in the experiment, the controller, and the anomaly detector. Figure 1 shows a block diagram of the closed-loop system under a sensor attack.

The Temperature Control Lab
As the hardware in our experiment we use the Temperature Control Lab (TCLab) (Park et al., 2020). The TCLab is an Arduino-based process, which consists of two heaters with radiators and two temperature sensors, one for each heater (see Figure 1). The heaters are close to each other such that their temperatures are coupled. In this process, we can set the output power of each heater, $Q_1$ and $Q_2$, respectively, to a value between 0 % and 100 %. Different modelling strategies for the TCLab are evaluated by Park et al. (2020) and the conclusion is that a four-dimensional physics-based model describes the process best, where the temperature unit for the dynamics is Kelvin while we use degrees Celsius in the figures. The first two states of the model represent the temperatures of the heaters ($T_{H,1}$ and $T_{H,2}$) and their dynamics are based on convective and radiative heat transfers between the heaters and the ambient temperature $T_{\mathrm{amb}}$. The last two states represent the state of each sensor ($T_{S,1}$ and $T_{S,2}$) and are represented by a linear low-pass filter. A more detailed description of the TCLab and its dynamics can be found in Appendix A.

Fig. 1. Block diagram of the experimental setup depicting the TCLab, the attack on the measurements, the controller, and the anomaly detector, where the TCLab is controlled around the steady-state input $Q_\infty$ and steady-state output $T_\infty$. The attacker is able to eavesdrop on the measurements (dashed line) and inject a malicious additive signal $y_a(k)$ to the measurements.

Controller design
In this section, we design two controllers that control the TCLab around a certain steady state. Here, we only provide a coarse description of the controller design, while more information on the controller design can be found in Appendix A. Remember that the main focus of this work lies on evaluating theoretical results on sensor attacks in an experiment with a real process. Therefore, we design the controllers such that they perform satisfactorily with respect to the steady state, but do not take any more constraints, such as rise time constraints, into account.
We design two linear controllers that control the TCLab at a steady-state temperature. To find a linear model, we linearize the non-linear physics-based dynamics around a steady-state value. One can determine that in steady state we have $T_{S,1} = T_{H,1}$ and $T_{S,2} = T_{H,2}$, so that we only need to find the steady-state values of the heater temperatures and the heater power. These are denoted as $T_\infty$ and $Q_\infty$, respectively, in Figure 1. The linearized dynamics of the TCLab and the equations to determine the steady-state values are provided in Appendix A.
Since the experiments with the TCLab are conducted at room temperature, we set $T_{\mathrm{amb}} = 294.15$ K $= 21$ °C and we choose $T_{H,\infty} = 313.15$ K $= 40$ °C. Linearizing the dynamics around this steady state and discretizing the linearized equations with a sampling time of $T_s = 1$ s leads to our linearized discrete-time model,
$$x(k+1) = A x(k) + B u(k), \qquad y(k) = C x(k).$$
Based on this linearized model, we want to design two different linear time-invariant output feedback controllers for the TCLab of the form
$$x_c(k+1) = A_c x_c(k) + B_c \tilde{y}(k), \qquad u(k) = C_c x_c(k),$$
where $x_c(k) \in \mathbb{R}^{n_c}$ is the internal controller state and $\tilde{y}(k)$ is the measurement received by the controller, where $\tilde{y}(k) = y(k)$ cannot be guaranteed due to the attacker (see Figure 1). The matrices $A_c$, $B_c$, and $C_c$ are matrices of appropriate dimension that we determine through the control design. We design two different controllers because we want to investigate the influence of different controllers on the attack. The controllers we design are an LQG controller and an LQG controller with integral action, subsequently called LQI controller. For the control design, we need to choose cost matrices for both the state and the input, as well as covariance matrices for the process and measurement noise. We use the guidelines provided by Athans (1971) to determine the cost matrices for the LQR problem and also the covariance matrices for the Kalman filter.
For the LQG controller, we choose state and control input cost matrices $Q_x = 10 I_4$ and $R_u = 2 I_2$, respectively, such that the controller minimizes $\sum_{k=0}^{\infty} 10\, x(k)^\top x(k) + 2\, u(k)^\top u(k)$. For the LQI controller, the system is extended by two integrator states, which ensure that $y(k)$ converges to the reference value of zero, since $y(k)$ represents the deviation of the output from the desired steady state. Therefore, we choose state and control input cost matrices as $Q_{x,i} = \mathrm{diag}(10 I_4, 2 I_2)$ and $R_u = 2 I_2$, respectively, for the LQI controller such that the controller minimizes
$$\sum_{k=0}^{\infty} 10\, x(k)^\top x(k) + 2\, x_{\mathrm{int}}(k)^\top x_{\mathrm{int}}(k) + 2\, u(k)^\top u(k).$$
For the Kalman filter used in the LQG and LQI controllers, we set the process noise matrix as $\Sigma_w = 5 I_4$. Further, we choose the measurement noise matrix as $\Sigma_v = I_2$. The matrices $A_c$, $B_c$, and $C_c$ are $A_c = A - BK - LC$, $B_c = L$, and $C_c = -K$ for the LQG controller, where $x_c(k)$ is an estimate $\hat{x}(k)$ of the state $x(k)$, $K$ is the controller gain, and $L$ is the steady-state Kalman filter gain. Further, for the LQI controller, we have [...], where $\hat{x}(k)$ is again an estimate of the state $x(k)$ and $x_{\mathrm{int}}(k)$ is the integrator state. Here, $K_{\hat{x}}$ and $K_{\mathrm{int}}$ are the respective controller gains for the state estimate and the integrator to determine $u(k)$.

In Figure 2, the measurement trajectories of the TCLab over a period of 2700 s are shown when the LQG controller is used (upper plot) and when the LQI controller is used (lower plot). We do not show the first 900 s of the experiment, because they include the transient from room temperature to the desired temperature of 40 °C, which is indicated by a dash-dotted line in Figure 2. We observe that both controllers are able to maintain the temperatures around the desired steady-state temperature, but the LQI controller is closer to the desired steady state due to the integral action.
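The LQG controller matrices quoted above follow from combining a steady-state Kalman filter with state feedback. As a short derivation, assuming the filter is written in predictor form and the controller has the state-space form $x_c(k+1) = A_c x_c(k) + B_c \tilde{y}(k)$, $u(k) = C_c x_c(k)$:

```latex
\begin{align*}
\hat{x}(k+1) &= A\hat{x}(k) + Bu(k) + L\bigl(\tilde{y}(k) - C\hat{x}(k)\bigr),
  \qquad u(k) = -K\hat{x}(k), \\
\hat{x}(k+1) &= (A - BK - LC)\,\hat{x}(k) + L\,\tilde{y}(k).
\end{align*}
```

Comparing terms with $x_c(k) = \hat{x}(k)$ gives $A_c = A - BK - LC$, $B_c = L$, and $C_c = -K$.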

Anomaly detectors
Both the LQG and the LQI controllers produce a residual signal,
$$\tilde{r}(k) = \tilde{y}(k) - C T_c x_c(k),$$
where $T_c \in \mathbb{R}^{n_x \times n_c}$ extracts the state estimate $\hat{x}(k)$ from the controller state. The residual is the difference between the measured and the predicted output at time step $k$. If our model for the TCLab is accurate around the steady state, the residual should be close to zero. The sample mean $\mu_r$ and sample covariance $\Sigma_r$ of the residual in the nominal case shown in Figure 2 are [...] for the LQI controller. This shows us that the controllers are predicting $y(k)$ with a high accuracy in steady state. Hence, the residual signal can be used as a way to determine if the plant is behaving nominally or anomalously. Therefore, we use the normalized residual $r(k) = \Sigma_r^{-1/2} \tilde{r}(k)$ as the input to an anomaly detector, which produces an output $y_D(k) \in \mathbb{R}_{\geq 0}$.
The detector output is compared to a threshold $J_D > 0$. If the output exceeds the threshold, i.e., $y_D(k) > J_D$, an alarm is triggered; otherwise no alarm is triggered. When choosing the threshold $J_D$ there is a trade-off between the false alarm rate and the ability to detect anomalies. Furthermore, Urbina et al. (2016) point out that there is also a trade-off between the number of false alarms and the impact of a stealthy attacker. However, since neither the anomaly nor the attacker is known to the operator, the threshold is often chosen as the smallest value that achieves an acceptable false alarm rate.
In this work, we want to investigate both a stateless detector, which does not have internal dynamics, and a stateful detector, which has internal dynamics, because the detector type will also influence the attacker's impact on the closed-loop system. The stateless detector is a $\chi^2$ detector defined as $y_D(k+1) = r(k)^\top r(k)$.
Intuitively, this detector determines the size of the residual at the current time step and then compares it to a threshold. Murguia and Ruths (2016) give a closed-form solution for the $\chi^2$ detector threshold $J_D^{\chi^2}$ that achieves a desired average time between false alarms under the assumption that $r(k) \sim \mathcal{N}(0, I_2)$ is independent and identically distributed (i.i.d.).
The stateful detector is a MEWMA detector, defined as
$$x_D(k+1) = \beta r(k) + (1 - \beta) x_D(k), \qquad y_D(k+1) = \frac{2 - \beta}{\beta} \|x_D(k+1)\|_2^2, \tag{1}$$
where $\beta \in (0, 1]$ and $x_D(0) = 0$. Further, if $y_D(k+1) > J_D$ then the detector state is reset to zero, i.e., $x_D(k+1) = 0$. The MEWMA detector filters the residual signal through a low-pass filter before it takes the squared Euclidean norm to determine the detector output. For tuning the threshold of the MEWMA detector, $J_D^M$, no closed-form solution exists, but Runger and Prabhu (1996) propose a way to approximate the threshold that achieves a certain average time between false alarms under the assumption that $r(k) \sim \mathcal{N}(0, I_2)$ is i.i.d.
For the thresholds, we choose $J_D = J_D^{\chi^2} = 5.9915$ for the $\chi^2$ detector and $J_D = J_D^M = 4.3918$ for the MEWMA detector with $\beta = 0.2$. These thresholds result in an average time between false alarms of 20 time steps based on the assumption that $r(k) \sim \mathcal{N}(0, I_2)$. This assumption typically does not hold in reality. Since tuning the threshold for general noise distributions is a non-trivial task and not the goal of this paper, we will use these thresholds from now on.
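As a minimal sketch of the two detectors, the following Python snippet implements a $\chi^2$ detector and a MEWMA detector with reset. The MEWMA scaling factor $(2-\beta)/\beta$, the class name, and the function names are assumptions for illustration and are not taken from the experimental code. The closed-form $\chi^2$ threshold uses the fact that, for a two-dimensional standard Gaussian residual, $r^\top r$ is $\chi^2$-distributed with two degrees of freedom, so that $P(y_D > J) = e^{-J/2}$.

```python
import math


def chi2_threshold(avg_time_between_false_alarms):
    """Threshold for a 2-dimensional residual r ~ N(0, I_2).

    r^T r is chi-squared with 2 degrees of freedom, so
    P(r^T r > J) = exp(-J / 2). Setting this equal to the false alarm
    rate 1 / tau gives J = 2 * ln(tau).
    """
    return 2.0 * math.log(avg_time_between_false_alarms)


def chi2_detector(r):
    """Stateless detector output y_D = r^T r."""
    return sum(ri * ri for ri in r)


class MewmaDetector:
    """Stateful MEWMA detector (assumed form, see the lead-in)."""

    def __init__(self, beta, threshold, dim=2):
        self.beta = beta
        self.threshold = threshold
        self.state = [0.0] * dim  # x_D(0) = 0

    def step(self, r):
        """Filter r(k), return (y_D(k+1), alarm), and reset on alarm."""
        b = self.beta
        self.state = [b * ri + (1.0 - b) * xi
                      for ri, xi in zip(r, self.state)]
        y = (2.0 - b) / b * sum(xi * xi for xi in self.state)
        alarm = y > self.threshold
        if alarm:
            self.state = [0.0] * len(self.state)
        return y, alarm
```

For an average time between false alarms of 20 time steps, `chi2_threshold(20)` evaluates to $2 \ln 20 \approx 5.9915$, which matches the $\chi^2$ threshold chosen above.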

The sensor attack model
In this section, we introduce the sensor attack model. We begin by giving a short overview of the investigated sensor attack and its three stages. We also state the assumptions made on the attacker. Then we look into each of the stages of the attack in more detail and investigate how the theoretical results for each attack stage hold up in the experiment. Before that we make the following assumption.

Assumption 1
The system has reached its steady state before the attack begins.
In control systems, we often want to achieve a certain steady state, for example, to achieve an optimal temperature for a chemical reaction or to operate a centrifuge at a desired speed. Therefore, we assume that the system is in its steady state when the attacker attacks. Furthermore, steady-state behavior is a common assumption in the literature on security of control systems.

Attacker model
Let us begin with a short overview of the attack procedure and introduce the attacker model.

Assumption 2
The attacker has full model knowledge about the linearized plant and controller dynamics and, therefore, knows the matrices $A$, $B$, $C$, $A_c$, $B_c$, $C_c$, $T_c$, and the covariance matrices, $\Sigma_w$ and $\Sigma_v$, assumed by the operator. Furthermore, the attacker knows the detector used, its threshold $J_D$, and the covariance matrix of the residual, $\Sigma_r$. Moreover, the attacker is able to read and manipulate the measurements $y(k)$ from $k \geq k_I \geq 0$ on.
Assumption 2 shows that we consider a powerful attacker that not only has perfect knowledge of the plant, controller, and detector models used but is also able to eavesdrop on and manipulate the sensor measurements. This is in line with Shannon (1949), who argues that a system should be designed for the worst case, and given enough time the attacker is able to obtain a perfect model of the plant and controller. Further, note that without loss of generality we set the time when the attacker enters the closed-loop system to $k = k_I$. Furthermore, due to the ability to manipulate the sensor measurements, the controller receives $\tilde{y}(k) = y(k) + y_a(k)$ instead of $y(k)$ as shown in Figure 1, where $y_a(k)$ is a signal constructed by the attacker. Finally, observe that the attacker does not have access to real-time data from inside the controller and the detector, $x_c(k)$ and $x_D(k)$, respectively.

Assumption 3
The attacker wants to remain stealthy to the detector, i.e., $y_D(k+1) \leq J_D$ once the attacker has started to inject a non-zero attack signal $y_a(k)$.
A detection of the attack by the system operator leads to countermeasures against the attack, which is why the attacker wants to remain stealthy. Furthermore, if the attacker is only eavesdropping and has not yet injected a non-zero $y_a(k)$, the attack is impossible to detect by using the proposed anomaly detector. Due to Assumption 3, the attacker uses the following attack strategy (Murguia and Ruths, 2016),
$$y_a(k) = -y(k) + C T_c x_c(k) + \Sigma_r^{1/2} a(k), \tag{2}$$
where $T_c x_c(k) = \hat{x}(k)$ is the estimate of the plant's state made by the operator. This attack strategy, if successfully executed, gives the attacker full control over the residual signal, i.e., $r(k) = a(k)$, and, therefore, makes it possible for the attacker to remain stealthy. The attack (2) is a closed-loop attack, since it uses the measurements directly for the attack signal, and $a(k)$ can be seen as a reference signal that the attacker injects into the system. However, for a successful execution of (2), the attacker, in addition to Assumption 2, needs to know $C T_c x_c(k)$, which cannot be known immediately when the attacker enters the system. Therefore, the first stage, Stage I, of the attack strategy is to estimate the quantity $C T_c x_c(k)$ perfectly. This already shows us that although the attacker is powerful according to Assumption 2, it cannot immediately launch (2) when entering the system. Note that $y_a(k) = 0$ in Stage I such that the attacker is only eavesdropping and does not need to be concerned with its stealthiness.
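To see why this strategy gives the attacker control over the residual, the following short derivation may help. It is a sketch that assumes the attack signal has the form $y_a(k) = -y(k) + C T_c x_c(k) + \Sigma_r^{1/2} a(k)$ and that the normalized residual is $r(k) = \Sigma_r^{-1/2}\bigl(\tilde{y}(k) - C T_c x_c(k)\bigr)$; both forms are assumptions consistent with the description above.

```latex
\begin{align*}
\tilde{y}(k) &= y(k) + y_a(k) = C T_c x_c(k) + \Sigma_r^{1/2} a(k), \\
r(k) &= \Sigma_r^{-1/2}\bigl(\tilde{y}(k) - C T_c x_c(k)\bigr) = a(k).
\end{align*}
```

If the attacker only has an estimate $\hat{x}_c(k)$ and uses it in place of $x_c(k)$, the same steps give $r(k) = a(k) - \Sigma_r^{-1/2} C T_c \bigl(x_c(k) - \hat{x}_c(k)\bigr)$, so any remaining controller state estimation error enters the detector input directly.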
Once the attacker has managed to estimate $C T_c x_c(k)$, it can launch the attack in (2) and take over full control of the detector input. Due to Assumption 3, the attacker needs to design $a(k)$ in such a way that no alarm is triggered. In case the detector has an internal state $x_D(k)$, the attacker needs to estimate $x_D(k)$ as well. Otherwise, a certain detector state in combination with the input $a(k)$ can trigger an alarm. In addition, the attacker is able to have a larger stealthy impact if it has knowledge of the detector state. Therefore, Stage II of the attack is to estimate the detector state. Finally, when both $C T_c x_c(k)$ and the detector state are known to the attacker, it can design a trajectory for $a(k)$ that maximizes the attack impact, which is defined in Section 3.4, while remaining stealthy. Injecting this attack trajectory is Stage III of the attack.
Next, we describe each of the three stages of the attack in more detail by presenting theoretical results for each stage and evaluating the results with our experimental setup. For the experiment, we assume that a MEWMA detector (1) is used to be able to illustrate all three stages of the attack. As in Figure 2, we do not show the first 900 s, because after that time the system has reached its steady state. In addition, the results we show in each of the stages for the LQG or LQI controller are from the same experiment for the respective controller. This means, for example, that the controller state estimate of Stage I and the detector state estimate of Stage II will be used in Stage III. Furthermore, we will denote the start time and length of each stage by $k_i$ and $N_i$, respectively, where $i \in \{I, II, III\}$. Since the stages follow up on each other, we have $k_I = 900$, $k_{II} = k_I + N_I - 1$, and $k_{III} = k_I + N_I + N_{II} - 1$.
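The stage start times can be evaluated with a few lines of Python. This is only an arithmetic sketch of the relations stated above; the function name is hypothetical, and the stage lengths are the ones used in the experiment (300 s or 1800 s for Stage I and 120 s for Stage II, with a sampling time of 1 s).

```python
def stage_start_times(k_I, N_I, N_II):
    """Start times of Stage II and Stage III as defined in the text."""
    k_II = k_I + N_I - 1
    k_III = k_I + N_I + N_II - 1
    return k_II, k_III


# Stage I lasts 300 time steps in the LQG case and 1800 in the LQI case;
# Stage II lasts 120 time steps in both cases.
lqg_times = stage_start_times(k_I=900, N_I=300, N_II=120)
lqi_times = stage_start_times(k_I=900, N_I=1800, N_II=120)
print("LQG:", lqg_times)
print("LQI:", lqi_times)
```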

Stage I: Controller state estimation
In the first stage of the attack, the attacker needs to obtain a perfect estimate of the operator's predicted measurement, since otherwise it cannot launch the attack (2). Therefore, the attacker wants to determine an estimate, $\hat{x}_c(k)$, of the controller state, $x_c(k)$, such that $\lim_{k \to \infty} \|x_c(k) - \hat{x}_c(k)\|_2 = 0$. We provided a necessary and sufficient condition for when the perfect estimation is possible in (Umsonst and Sandberg, 2021).
Theorem 1 (Umsonst and Sandberg (2021)). Let the linear closed-loop system be influenced by Gaussian process and measurement noise. Then, under Assumption 2 and with knowledge of $\Sigma_w$ and $\Sigma_v$, the attacker can perfectly estimate the controller state $x_c(k)$ if and only if $\rho(A_c) \leq 1$, where $\rho(A_c)$ denotes the spectral radius of the controller's system matrix $A_c$.
In addition to Theorem 1, we point out that if $\rho(A_c) < 1$ holds, then the estimation is exponentially fast, and the attacker can use a non-optimal time-invariant observer and does not need to know the noise properties. As mentioned before, Theorem 1 was derived under the assumption of a linear closed-loop system under the influence of Gaussian noise. Therefore, we now want to determine if the result still holds in our experimental setup, where the plant dynamics are non-linear and the noise has an unknown distribution. Let $e_c(k) = x_c(k) - \hat{x}_c(k)$ such that $\|e_c(k)\|_\infty$ denotes the maximum absolute estimation error of the controller state by the attacker. If we use the LQG controller, we have $\rho(A_c) = 0.9449 < 1$ such that the attacker is able to use a non-optimal observer to estimate the controller state exponentially fast. For the LQG case, the attacker uses an open-loop estimator of the form
$$\hat{x}_c(k+1) = A_c \hat{x}_c(k) + B_c \tilde{y}(k), \tag{3}$$
which guarantees an exponentially fast convergence to the true controller state. Therefore, Stage I in the LQG case is executed for 300 s in the experiment and $\|e_c(k)\|_\infty$ is depicted in the upper plot of Figure 3. We see that the convergence is indeed exponentially fast and after 300 s the estimation error is smaller than $2.3 \cdot 10^{-8}$. Although the estimation error is not zero, we consider that the attacker has successfully estimated the controller state after Stage I in the LQG controller case. Next, we look at the LQI controller case, where we have $\rho(A_c) = 1$ due to the integral action. Therefore, the attacker cannot use the open-loop estimator (3) and needs to use a closed-loop estimator, which takes the plant dynamics and noise statistics into account as well. With perfect system knowledge the attacker can design a time-varying Kalman filter to estimate the controller state perfectly.
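The exponential convergence of an open-loop estimator can be illustrated with a scalar stand-in for the controller. In this sketch the scalar value `a_c = 0.9449` (chosen to match the reported spectral radius), the input gain `b_c = 0.5`, the random measurement sequence, and the function name are hypothetical; the controller in the experiment is multi-dimensional.

```python
import random


def simulate_open_loop_estimator(a_c, b_c, x_c0, steps, seed=0):
    """Run a scalar controller and the attacker's copy of it side by side.

    Controller:  x_c(k+1)    = a_c * x_c(k)    + b_c * y(k)
    Attacker:    xhat_c(k+1) = a_c * xhat_c(k) + b_c * y(k),  xhat_c(0) = 0
    Both are driven by the same eavesdropped measurement y(k), so the
    error obeys e_c(k+1) = a_c * e_c(k) regardless of y(k).
    """
    rng = random.Random(seed)
    x_c, xhat_c = x_c0, 0.0
    errors = [abs(x_c - xhat_c)]
    for _ in range(steps):
        y = rng.gauss(0.0, 1.0)  # stand-in for the measured output deviation
        x_c = a_c * x_c + b_c * y
        xhat_c = a_c * xhat_c + b_c * y
        errors.append(abs(x_c - xhat_c))
    return errors


errors = simulate_open_loop_estimator(a_c=0.9449, b_c=0.5, x_c0=1.0, steps=300)
print(f"error after 300 steps: {errors[-1]:.3e}")
```

Because the error is simply multiplied by `a_c` in every step, an initial error of one shrinks to about $0.9449^{300} \approx 4 \cdot 10^{-8}$ after 300 steps, which is the same order of magnitude as the error reported for the experiment.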
However, in our case, the attacker does not have perfect knowledge about the noise processes. For example, the covariance matrix of the process noise, $\Sigma_w$, is chosen as $\Sigma_w = 5 I_4$ to obtain a certain control performance, but it is not guaranteed to be the correct covariance matrix for the process noise. We already see that the assumptions necessary for Theorem 1 do not hold in the experiment and the attacker is not able to design an optimal time-varying Kalman filter to estimate $x_c(k)$. Nevertheless, we want to investigate how the time-varying Kalman filter performs in estimating $x_c(k)$ for the chosen noise covariance matrices. Due to the integral action, exponential convergence is not guaranteed either and, therefore, Stage I is executed for $N_I = 1800$ s in the LQI controller case. The maximum estimation error is shown as the solid line in the lower plot of Figure 3. The trajectory of the maximum estimation error in the LQI controller case is very different from the trajectory in the LQG controller case. For example, the maximum absolute error has not converged to zero after 1800 s and the error even increases again towards the end of Stage I. This shows us that the attacker is not able to obtain a perfect estimate of the controller state in our experiment. However, the attack (2) does not need a perfect estimate of the complete controller state, but only a perfect estimate of the residual. Hence, we take a look at the estimation error of the residual signal, $e_r(k)$, as well. In the lower plot of Figure 3 the maximum residual estimation error $\|e_r(k)\|_\infty$ scaled with a factor of 500 is shown by the dashed line. Here, we observe that $\|e_r(k)\|_\infty \in [0.03, 0.05]$ for all of Stage I. Because of this observation, we want to investigate if the stealthy attack is still possible and also consider the LQI controller case in Stage II and Stage III in the following.
We want to point out that the controller state estimation continues throughout Stage II and Stage III, because the attacker needs the true controller state to launch (2), but it does not have access to the controller state itself. Therefore, the estimation needs to continue throughout the whole attack sequence.

Stage II: Detector state estimation
Assume now that the attacker successfully managed to estimate the controller state. Then it can launch the attack in (2), which gives the attacker full control over the detector input $r(k)$. However, if the detector has an internal state, $x_D(k)$, which can, for example, be reset, an inappropriately chosen $a(k)$ can trigger an alarm and lead to the detection of the attacker. Therefore, in Stage II, the attacker tries to find an estimate, $\hat{x}_D(k)$, such that $\|x_D(k) - \hat{x}_D(k)\|_2 \to 0$ as $k \to \infty$. Due to the reset in the MEWMA detector, the attacker injects a carefully designed signal $a(k)$ into the detector, which does not trigger an alarm and causes the estimate $\hat{x}_D(k)$ to converge to the true state. We determine the following result on the length, $N_{II}$, of Stage II needed to achieve a certain accuracy $\gamma$ for the detector state estimation error $e_D(k) = x_D(k) - \hat{x}_D(k)$.

Proposition 1 (Umsonst and Sandberg (2019)). Assume that the attack executes its strategy (2) such that $r(k) = a(k)$. If the MEWMA detector is used, then $\|e_D(k)\|_2 \leq \gamma$ is achieved after
$$N_{II} \geq \frac{\ln\bigl(\gamma \sqrt{(2 - \beta)/(\beta J_D)}\bigr)}{\ln(1 - \beta)}$$
time steps independent of the value of $x_D(k_{II})$.
In (Umsonst and Sandberg, 2019), we show that during Stage II the attacker is able to simultaneously estimate the detector state and inject a stealthy attack signal, which increases the expected value of the operator's estimation error, $\mathrm{E}\{e(k)\} = \mathrm{E}\{x(k) - \hat{x}(k)\}$, while mimicking the statistics of the detector output. The dynamics of $\mathrm{E}\{e(k)\}$ under the attack (2) are [...], and we assume $\mathrm{E}\{e(k_{II})\} = 0$ at the beginning of Stage II. The attacker can design a stealthy attack signal that mimics the detector output statistics by solving the following optimization problem at each time step, $\max_{a(k)}$ [...], where $k \geq k_{II}$, $y_{D,k+1}$ is a sample from a distribution with support $[0, J(k+1)]$ that mimics the statistics of the detector output without triggering an alarm, and $J(k)$ [...]. Furthermore, it is assumed that $r(k) \sim \mathcal{N}(0, I_2)$ when drawing the sample $y_{D,k+1}$. Note that since $\mathrm{E}\{e(k_{II})\} = 0$, the attacker knows all relevant signals for this optimization problem and can also solve it offline before the attack. More details can be found in (Umsonst and Sandberg, 2019).
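As a sketch of how the attack can drive the expected estimation error, suppose (these are assumptions, not statements from the text) that the plant follows $x(k+1) = Ax(k) + Bu(k) + w(k)$ with zero-mean $w(k)$, that the operator's estimate is updated in Kalman predictor form with gain $L$, and that under the attack the unnormalized residual equals $\Sigma_r^{1/2} a(k)$. Then:

```latex
\begin{align*}
\hat{x}(k+1) &= A\hat{x}(k) + Bu(k) + L\,\Sigma_r^{1/2} a(k), \\
e(k+1) &= x(k+1) - \hat{x}(k+1) = A e(k) + w(k) - L\,\Sigma_r^{1/2} a(k), \\
\mathrm{E}\{e(k+1)\} &= A\,\mathrm{E}\{e(k)\} - L\,\Sigma_r^{1/2} a(k).
\end{align*}
```

Under these assumptions the expected error depends only on its initial value and on the attacker's own signal $a(k)$, which is consistent with the remark that the problem can be solved offline.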
From Proposition 1 we determine that $\|x_D(k) - \hat{x}_D(k)\|_2 \leq 1.6406 \cdot 10^{-12}$ when $N_{II} = 120$, $\beta = 0.2$, and $J_D^M = 4.3918$. Therefore, we let Stage II run for 120 s for both the LQG and the LQI controller case. The results of Stage II for the LQG controller case are shown in Figure 4, where we also include the last 100 s before the start of Stage II and the vertical dash-dotted line marks the beginning of Stage II. The upper plot shows the maximum detector state estimation error, $\|e_D(k)\|_\infty$, where $e_D(k) = x_D(k) - \hat{x}_D(k)$. We observe that before the start of Stage II the maximum detector state estimation error fluctuates a lot, since in this time period $\hat{x}_D(k) = 0$. As soon as the attacker starts Stage II with its controller estimate from Stage I, $\|e_D(k)\|_\infty$ decreases exponentially fast to zero and we have that $\|e_D(k)\|_\infty \leq 9.06 \cdot 10^{-13}$ at the end of Stage II. The lower plot of Figure 4 shows the MEWMA detector output before and during Stage II. Recall that the attack signal is designed in such a way that it mimics the detector output statistics under the assumption that the residual has a standard Gaussian distribution while not crossing the threshold. First, we note that the attacker successfully manages to remain stealthy and the detector output does not cross the threshold. Second, the detector output is still stochastic, but from a visual inspection it seems noisier during Stage II than before. The reason is that the assumption $r(k) \sim \mathcal{N}(0, I_2)$ used when designing the attack typically does not hold in the experiment. This noisier behavior could make an observant system operator suspicious.

Fig. 4. For the LQG controller case, the maximum detector state estimation error is shown in the upper plot before and during Stage II of the attack. We observe that the estimation error decreases exponentially fast during Stage II. The lower plot shows the detector output before and during Stage II. During Stage II the output is still stochastic and never crosses the threshold, which comes from the attack design during Stage II.
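The accuracy quoted from Proposition 1 can be reproduced numerically under an assumed MEWMA form. This sketch assumes the detector state follows $x_D(k+1) = \beta r(k) + (1-\beta) x_D(k)$ with output $y_D = \frac{2-\beta}{\beta}\|x_D\|_2^2$, so that a state that has not triggered an alarm satisfies $\|x_D\|_2 \leq \sqrt{\beta J_D/(2-\beta)}$, and that an estimator driven by the same input $a(k)$ has an error that contracts by $1-\beta$ per step. The function name is hypothetical.

```python
import math


def mewma_state_error_bound(beta, threshold, n_steps):
    """Upper bound on ||x_D - xhat_D||_2 after n_steps of Stage II.

    Assumes xhat_D starts at zero, the true state starts inside the
    no-alarm region, and both are driven by the same known input, so
    e_D(k+1) = (1 - beta) * e_D(k).
    """
    initial_error_bound = math.sqrt(beta * threshold / (2.0 - beta))
    return (1.0 - beta) ** n_steps * initial_error_bound


bound = mewma_state_error_bound(beta=0.2, threshold=4.3918, n_steps=120)
print(f"bound after 120 steps: {bound:.4e}")
```

For $\beta = 0.2$, $J_D^M = 4.3918$, and 120 steps this evaluates to approximately $1.64 \cdot 10^{-12}$, in line with the value stated above.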
Next, we consider Stage II of the LQI controller case. Recall that the attacker did not manage to successfully complete Stage I, but has a small estimation error for the part of the controller state that is needed to launch (2). Therefore, it does not have complete control over the input to the detector, in contrast to the LQG controller case. This uncertainty leads to very different outcomes than in the LQG controller case, as seen in Figure 5. The upper plot in Figure 5 again shows the maximum estimation error before and during Stage II, where the beginning of Stage II is marked by the vertical dash-dotted line. During Stage II, we observe that the estimation error initially decreases, then increases in a spike, and then decreases again. Further, the detector state estimation error does not decrease to zero by the end of Stage II, where the maximum estimation error equals 0.1171. The spikes in the detector state estimation error can be explained by looking at the detector output during Stage II (lower plot in Figure 5). We observe that the detector output in Stage II crosses the threshold, which leads to a reset of the detector state. This reset explains the sharp increase of the maximum detector state estimation error. Recall that the attacker has access to neither $x_D(k)$ nor $y_D(k)$, so it does not know when the reset actually happens. Hence, in the LQI controller case, the uncertainty in the attacker's controller state estimate leads to a detection of the attacker during the second stage of the attack.

Stage III: Targeting the plant
In the third and final stage of the attack, the attacker launches the attack that maximizes the impact on the plant while remaining stealthy. Here, the attacker wants to maximize the expected value of the plant's state $E\{x(N_a)\}$ at the end of attack Stage III, where $N_a = k_I + N_I + N_{II} + N_{III}$ marks the end of the third stage of the attack. Let $a \in \mathbb{R}^{N_{III} n_y}$ be the complete attack trajectory during Stage III, then $E\{x(N_a)\} = T_{xa} a$ under the assumption of linear dynamics for both the plant and the controller, where $T_{xa} \in \mathbb{R}^{n_x \times N_{III} n_y}$ describes the influence of the attack trajectory on the expected plant state at the end of the attack. In (Umsonst et al., 2017), we pose a convex optimization problem that lets us estimate the worst-case impact of a stealthy attack. Under the assumption of linear plant and controller dynamics, the problem of worst-case impact estimation is formulated as the optimization problem (4), where $k \in \{k_{III}, \ldots, k_{III} + N_{III} - 1\}$. Note that this optimization problem assumes that at the beginning of the third stage of the attack $x(k) = 0$, $x_c(k) = 0$, and $x_D(0) = 0$, if the detector has an internal state. Due to Assumption 1, assuming $x(k) = 0$ and $x_c(k) = 0$ is not a restrictive assumption, because in steady state these values should be close to zero. Furthermore, the change in the states due to Stage II is typically small, because of the short time during which Stage II is executed. Moreover, since we assumed $x_D(k) = 0$ at the beginning of Stage III when determining the worst-case impact, we choose the first attack signal in Stage III such that it removes the influence of the detector state at the beginning of the third stage of the attack, so that this assumption is fulfilled. Finally, we set the attack length for Stage III to $N_{III} = 1800$ s, which is equivalent to a worst-case attack that lasts for half an hour.
With that, we want to evaluate if the impact on the real plant in the experiment, the TCLab, is the same as the theoretical estimation of the worst-case attack impact provided by (4).
We begin with the LQG controller case, for which the attacker could successfully complete Stage I and Stage II. Therefore, the attacker now has all the information needed to launch a worst-case attack. When solving the worst-case impact estimation problem (4) under the assumption of linearity, we obtain that the theoretical maximum increase in temperature due to the attack is 4.79 °C and that the attack focuses on the temperature corresponding to the first measurement. The measurements of the TCLab, the input to the TCLab, and the detector output 300 s before Stage II, during Stage II, and during Stage III are shown in Figure 6. The lower plot of Figure 6 shows the detector output during the attack and we observe that the attacker is able to remain stealthy during the attack, since no alarm is triggered. However, we also observe that the detector output approaches a constant value equal to the threshold in Stage III. Therefore, the attack itself is easily detected by a visual inspection of the detector output. The reason for this constant detector output is that the constraints in the worst-case impact estimation problem (4) only enforce stealthiness, and not, for example, mimicry of the detector output statistics. Furthermore, by looking at the input to the plant (center plot in Figure 6), we note that the attack in Stage III leads to a constant input to the plant, such that the plant reaches a new steady state. We also observe that the input increases slightly in Stage II, due to the attack design, which maximizes the operator's estimation error. Stage II is, however, too short to have a significant impact in the LQG controller case. Finally, the outputs of the TCLab are shown in the upper plot of Figure 6. As expected, the attack targets the temperature corresponding to the first measurement (solid line). The temperature of that heater increases from an average of 39.73 °C before the attack to an average of 43.37 °C. This is an increase of 3.64 °C, which is approximately 1.15 °C smaller than the estimated worst-case attack impact.
Next, we present the attack for Stage III when an LQI controller is used. Recall that the attack in Stage II is based on samples of the truncated detector output distribution, so the attacker could theoretically remain undetected in Stage II. When solving (4) in the LQI controller case, we obtain that the theoretical worst-case impact is a change of 9.13 °C from the steady state before the attack and that the attacker targets the temperature of the second heater in this case. This already demonstrates two things before looking at the results from the experiment. First, the attacker targets a different state depending on the controller used. Second, the use of an LQI controller increases the theoretically possible worst-case impact by almost a factor of two. Intuitively, if the attacker feeds a constant signal into the LQG controller, the controller output converges to a constant signal as well, while the LQI controller integrates the constant signal, such that the impact increases compared to the LQG controller.
The trajectories of the true TCLab measurements, the control input, and the detector output 300 s before Stage II, during Stage II, and during Stage III are shown in Figure 7. We start by investigating the temperature increase during Stage III shown in the upper plot of Figure 7. We observe that the temperature of the second heater increases steadily from an average of 40 °C before the attack to 54.06 °C at the end of the attack. This is a temperature increase of 14.06 °C, which is approximately 5 °C larger than the theoretical worst-case impact. Part of this increase compared to the theoretical value is due to the attack during Stage II. When Stage III starts, the temperature has already risen from 40 °C to around 42.5 °C. Discounting that increase, the experimental impact is only approximately 2.5 °C larger than the theoretical worst-case impact. Unlike in the LQG controller case, the actual attack impact is larger than, but still close to, the estimated one. The center plot of Figure 7 shows the trajectory of the input to the TCLab. We note that the input during Stage III seems to increase linearly, which results from the combination of an almost constant attack signal at the beginning of Stage III and the integral action in the controller. We also observe the large increase in the control input during Stage II, which could explain the crossings of the detector threshold during Stage II. While the impact of the attack in the LQI controller case is more severe than in the LQG case, we would like to remind the reader that this attack can already be detected in Stage II. Furthermore, the detector output oscillates in Stage III, where it repeatedly crosses the threshold and is reset to zero (see lower plot in Figure 7). Therefore, we think that even if the attacker managed to stealthily execute Stage II, its attack strategy in Stage III would lead to a quick detection.
Fig. 7. This figure shows several closed-loop system trajectories before Stage II, during Stage II and during Stage III of the attack for the LQI controller case. The beginning of Stage II is marked by the first vertical dash-dotted line from the left and the beginning of Stage III is marked by the second vertical dash-dotted line. The upper plot shows the two measurements taken from the TCLab with the desired steady-state temperature as a dash-dotted horizontal line, the center plot shows the power applied to the heaters, and the lower plot shows the detector output with the threshold as a horizontal dash-dotted line. These trajectories show that the attacker is not able to remain stealthy, but has a larger impact on the system than in the LQG case.
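The comparison between theoretical and experimental impacts under the MEWMA detector can be retraced with a few lines of arithmetic. The following is a minimal sketch that only uses the temperatures quoted in the text; the helper name `impact_gap` is hypothetical.

```python
def impact_gap(theory: float, before: float, after: float) -> tuple[float, float]:
    """Return (measured increase, measured increase minus theoretical impact) in deg C."""
    measured = after - before
    return measured, measured - theory


# LQG controller: theoretical impact 4.79, average temperature 39.73 -> 43.37.
lqg_measured, lqg_gap = impact_gap(theory=4.79, before=39.73, after=43.37)

# LQI controller: theoretical impact 9.13, temperature 40 -> 54.06.
lqi_measured, lqi_gap = impact_gap(theory=9.13, before=40.0, after=54.06)

# LQI controller, discounting the rise to about 42.5 that already happened in Stage II.
lqi_stage3_measured, lqi_stage3_gap = impact_gap(theory=9.13, before=42.5, after=54.06)

print(f"LQG: measured {lqg_measured:.2f}, gap {lqg_gap:+.2f}")
print(f"LQI: measured {lqi_measured:.2f}, gap {lqi_gap:+.2f}")
print(f"LQI (Stage III only): measured {lqi_stage3_measured:.2f}, gap {lqi_stage3_gap:+.2f}")
```

A negative gap means the experiment stayed below the theoretical worst case (LQG), while a positive gap means it exceeded the estimate (LQI).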

Discussion of defense mechanisms
In this section, we want to discuss the results we obtained from the experiments and how the choice of the controller and the detector can be used as a defense mechanism. Furthermore, we investigate active noise injection into the controller dynamics to prevent controller state estimation.

Choice of controller and detector
We begin by discussing how the choice of controller affects the sensor attack. From Theorem 1, we know that a controller with poles inside or on the unit circle enables the attacker to estimate the controller state perfectly. However, this result is obtained under the assumption of linear plant dynamics and Gaussian noise. In Section 3.2, we showed that in the LQG controller case the attacker is able to estimate the controller state exponentially fast, but in the LQI controller case it is not clear if the attacker can perfectly estimate the controller state. Even with a time horizon for Stage I that is six times longer than in the LQG controller case, the estimation error is not close to zero. The lack of exact knowledge of the controller state leads to a detection by the detector in later stages of the attack, as shown in Figure 5 and Figure 7. Therefore, we conclude that if a controller with stable dynamics is used, the attacker is able to execute all three stages of the attack without being detected. If the controller has eigenvalues on the unit circle, this is not possible and the attack is quickly detected. Luckily, controllers in practice often include integral action to take care of disturbances and reach a desired steady state. Hence, the integral action of a controller can lead to a better protection of the system. However, if the attacker manages to obtain a perfect estimate of the controller state in another way, the worst-case stealthy attack impact in the LQI controller case can be much larger than in the LQG controller case. In addition to that, the experimental results in Urbina et al. (2016) also show that having an integral part makes sensor attacks worse, but can help mitigate actuator attacks. A theoretical insight into why an integrator in the controller makes sensor attacks more impactful while mitigating actuator attacks is provided by Sandberg (2021). Therefore, it is always important to consider several attack strategies when evaluating the resilience of the closed-loop system against attacks.
Next, we want to investigate how the choice of detector influences the attack impact. A metric for detector comparison, which takes the attack impact and the time between false alarms into account, is proposed by Urbina et al. (2016). One detector is considered better than another if it results in a lower stealthy attack impact while guaranteeing the same amount of false alarms. Therefore, the detector choice can also be seen as a defense mechanism. Here, we compare the attack impact on the TCLab when an LQG controller and a $\chi^2$ detector are used with the attack impact when a MEWMA detector is used. Recall that both detector thresholds are tuned to achieve the same average time between false alarms under the assumption of $r(k) \sim \mathcal{N}(0, I_2)$. Since the LQI controller prevents the attacker from launching a stealthy attack, we do not consider the case of an LQI controller in this comparison.
The $\chi^2$ detector does not have an internal state, such that the attacker does not need to execute Stage II of the attack. Therefore, one could argue that the MEWMA detector already has an advantage over the $\chi^2$ detector. However, the execution of Stage II does not take a long time, $N_{II} = 120$ s in Figure 4. Hence, the extra stage does not seem to be a big inconvenience for the attacker. Figure 8 shows trajectories of the true measurements of the TCLab, of the input to the TCLab, and of the detector output for Stage I and Stage III when an LQG controller and a $\chi^2$ detector are used. As in the case with a MEWMA detector, the solution of (4) targets the first heater, and the theoretical impact is 16.7881 °C, which is around 3.5 times larger than when a MEWMA detector is used. This already shows us that under the $\chi^2$ detector the attacker is able to launch a stronger attack. However, unlike in the MEWMA case, the actual impact on the TCLab is larger than the theoretical impact. The average temperature of around 40.4 °C before the attack increases to an average temperature of 58.5 °C for Heater 1 at the end of the attack (see upper plot of Figure 8), which is approximately 1.7 °C larger than the theoretical impact. Our estimate of the worst-case impact is again very close to the true impact though. Furthermore, the lower plot of Figure 8 shows us that the attack is still stealthy. However, as with the MEWMA detector, a visual examination of the detector output would still raise the suspicion of a system operator.
This leads us to conclude that the detector choice can also be a good defense mechanism, since for the LQG controller case we see that the increase in the average temperature under the $\chi^2$ detector is approximately 5 times larger than when the MEWMA detector is used, where both detector thresholds are tuned to the same mean time between false alarms.
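To make the structural difference between the two detectors concrete, the following minimal sketch contrasts a stateless $\chi^2$ detector with a MEWMA detector that resets its state on an alarm, as described for the experiments. It assumes the common MEWMA form $x_D(k+1) = \beta r(k) + (1-\beta) x_D(k)$ with output $\frac{2-\beta}{\beta} \|x_D(k+1)\|_2^2$; the exact detector equations are not restated in this section, so the class and function names and the output scaling are assumptions.

```python
import math


def chi2_threshold_2d(false_alarm_prob: float) -> float:
    """Closed-form threshold of a chi-squared test with 2 degrees of freedom."""
    return -2.0 * math.log(false_alarm_prob)


def chi2_output(residual: list[float]) -> float:
    """Stateless detector output r(k)^T r(k) for a normalized residual."""
    return sum(r * r for r in residual)


class MewmaDetector:
    """Assumed MEWMA form: exponentially weighted state with reset on alarm."""

    def __init__(self, beta: float, threshold: float, dim: int = 2) -> None:
        self.beta = beta
        self.threshold = threshold
        self.state = [0.0] * dim

    def step(self, residual: list[float]) -> tuple[float, bool]:
        b = self.beta
        # Exponentially weighted moving average of the residual.
        self.state = [b * r + (1.0 - b) * x for r, x in zip(residual, self.state)]
        output = (2.0 - b) / b * sum(x * x for x in self.state)
        alarm = output > self.threshold
        if alarm:
            # Reset of the detector state after an alarm.
            self.state = [0.0] * len(self.state)
        return output, alarm


# A small persistent bias in the residual, chosen only for illustration.
biased_residual = [1.0, 0.0]
detector = MewmaDetector(beta=0.2, threshold=4.3918)
history = [detector.step(biased_residual) for _ in range(6)]
alarms = [alarm for _, alarm in history]
```

With these illustrative numbers, the $\chi^2$ output for the biased residual is 1 at every step and stays far below a threshold of about 5.99 (the value returned by `chi2_threshold_2d(0.05)`, which coincides with the reported $J_D = 5.9915$), whereas the MEWMA state accumulates the bias and raises an alarm at the sixth step. This is one way to see why a detector with internal dynamics leaves less room for a persistent stealthy attack.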

Noise injection
Finally, we want to discuss noise injection to prevent the attacker from successfully executing Stage I of the attack when a controller is used that has no poles outside of the unit circle. The basic idea, as proposed in (Umsonst and Sandberg, 2021), is to add a noise signal $\nu(k) \sim \mathcal{N}(0, \Sigma_\nu)$ to the controller state dynamics. Since the attacker does not have any access to the noise signal $\nu(k)$, the noise prevents the attacker from obtaining a perfect estimate of the controller state. Therefore, the attacker will not be able to remain stealthy during Stage II and Stage III of the attack, similar to the LQI controller case discussed previously. A disadvantage of injecting additional noise into the closed loop is the performance degradation. The performance degradation affects the steady-state behavior and the controller output by making them noisier, which might damage certain actuators if the signal becomes too noisy. By changing the normalization matrix of the residual from $\Sigma_r^{-1/2}$ to $(\Sigma_r + C T_c \Sigma_\nu T_c^\top C^\top)^{-1/2}$, the additional noise does not affect the detector performance under nominal conditions. Hence, in the case of Gaussian process and measurement noise, the mean time between false alarms does not change with the additional noise and this adjusted normalization. In line with Assumption 2, we assume the following.
Assumption 4 The attacker knows the distribution of ν(k).
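The adjusted residual normalization can be illustrated numerically. The sketch below computes the inverse matrix square root of the residual covariance with and without the injected noise and checks that only the adjusted normalization whitens the residual. The matrices `Sigma_r` and `G` (standing in for the product $C T_c$) are hypothetical placeholders, not the values identified for the TCLab; only $\Sigma_\nu = 0.01 I_4$ is taken from the text.

```python
import numpy as np


def inv_sqrt_spd(matrix: np.ndarray) -> np.ndarray:
    """Inverse symmetric square root of a symmetric positive definite matrix."""
    eigvals, eigvecs = np.linalg.eigh(matrix)
    return eigvecs @ np.diag(1.0 / np.sqrt(eigvals)) @ eigvecs.T


# Hypothetical nominal residual covariance (2 sensors).
Sigma_r = np.array([[2.0, 0.3],
                    [0.3, 1.5]])
# Hypothetical stand-in for C T_c, mapping the injected noise to the residual.
G = np.array([[0.5, 0.1, 0.0, 0.2],
              [0.0, 0.4, 0.3, 0.1]])
# Injected noise covariance used in the experiment.
Sigma_nu = 0.01 * np.eye(4)

# Residual covariance once the noise injection is active.
Sigma_new = Sigma_r + G @ Sigma_nu @ G.T

W_old = inv_sqrt_spd(Sigma_r)    # normalization without noise injection
W_new = inv_sqrt_spd(Sigma_new)  # adjusted normalization

print(np.round(W_new @ Sigma_new @ W_new.T, 6))  # identity: residual is whitened
print(np.round(W_old @ Sigma_new @ W_old.T, 6))  # slightly inflated covariance
```

With the old normalization the normalized residual covariance is slightly larger than the identity, which would shorten the mean time between false alarms; the adjusted normalization restores the identity covariance assumed when tuning the threshold.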
Since the LQI controller has an inherent protection against the estimation of the controller state, we look at the LQG controller case when a $\chi^2$ detector is used. The experiment has the following timeline. From 900 s to 1199 s the noise injection is not used, and from 1200 s on we start the noise injection with $\Sigma_\nu = 0.01 I_4$. This is to investigate how the injected noise influences the nominal behavior of the closed-loop system. Furthermore, the attacker executes Stage I in the interval [1200, 1499] s, i.e., $k_I = 1200$ and $N_I = 300$, to demonstrate how the noise injection prevents the controller state estimation. Then from 1500 s on the attacker injects the stealthy attack signal of Stage III for $N_{III} = 900$ s. When solving (4) with the new normalization for the residual, the estimated worst-case stealthy attack impact is 18.2846 °C, which is larger than in the case without the noise injection. However, recall that the injected noise should lead to an early detection of the attack. Figure 9 shows the measured temperature (upper plot) and the heater power (lower plot) of the noise injection experiment. The first vertical dash-dotted line from the left marks the start of Stage I and the noise injection, while the second line marks the beginning of Stage III. We start by noting that the heater power becomes slightly noisier once the noise injection begins, while the temperature measurements are not much noisier. The overall increase in the noise level in the nominal case is very low, so we conclude that the noise injection does not degrade the system performance much. In Stage III of the attack, the average temperature of Heater 1 increases from around 40.4 °C to around 58.2 °C, which is a very similar increase to the LQG controller case without noise injection under a $\chi^2$ detector. Now that we have looked at the effect of the noise injection on the nominal behavior and the attack impact, let us investigate the effect on Stage I and the stealthiness in Stage III of the attack. The upper plot of Figure 10 shows the maximum absolute controller state estimation error $\|e_c(k)\|_\infty$ during Stage I. Unlike in the case without noise injection (upper plot of Figure 3), the error does not converge exponentially to zero and there are no signs of a decreasing trend in the error. The error $e_c(k)$ is, however, unbiased, which means that on average the estimate of the controller state is correct. If the attacker now launches Stage III of the attack assuming it has a perfect estimate, the detector output almost immediately crosses the threshold, as shown in the lower plot of Figure 10. The average value of the detector output during Stage III is 6.3433, which is larger than the threshold $J_D = 5.9915$. Therefore, we see that the noise injection leads to a detection of the stealthy attack in Stage III. Furthermore, we point out that, before the attacker injects an additional signal, the detector output before and after the start of the noise injection does not look visually different. Hence, we can conclude that the injection of additional noise into the controller dynamics leads to a detection of the worst-case stealthy attack, while not degrading the performance of the closed-loop system and the detector significantly.
Fig. 9. The upper plot shows the temperature measurements of the TCLab before the noise injection and during the attack. The lower plot shows the heater power applied to the TCLab before the noise injection and during the attack. We observe that the noise injection does not influence the nominal behavior significantly, while the attack has a similar impact to the case without noise injection.
Remark 1 In (Umsonst and Sandberg, 2021) we propose a convex semi-definite program to obtain an optimal noise covariance matrix $\Sigma_\nu$ that takes the performance degradation into account. However, this optimal noise distribution is obtained under the assumption of a linear system, Gaussian noise, and exact knowledge about the process and measurement noise covariance matrices. Since these assumptions do not hold in the experiment, we only demonstrated the effectiveness of the noise injection in revealing stealthy attacks in this section without taking the optimality of $\Sigma_\nu$ into account.

Conclusion
In this work, we investigate a sensor attack on a feedback system in an experimental setup. The attack consists of three stages, where the first two stages are a preparation for the third stage by estimating internal loop signals such as the controller and the detector state. In the third stage, the attacker launches its worst-case attack that remains stealthy to the detector and achieves the largest impact on the plant. With the experiment, we evaluate if the theoretical results of each stage hold when using real data. We observe that each of the stages can be successfully completed when a controller with stable dynamics is used, such as the LQG controller. Furthermore, the theoretically estimated worst-case impacts are close to the actual worst-case impact obtained from the experiment. By simply adding an integral part to the controller, which is common in practice, the attacker is not able to complete the first stage of the attack exponentially fast. Further, without completing the first stage, the attack is detected in the following stages as well in the case of a controller with integral action. However, the attack impact increases when an integral action is used such as in the LQI controller, which means that if estimating the controller state can be achieved through other means than the ones investigated here, using an integral action can degrade the system performance under the attack. In case integral action is not a sufficient defense mechanism, we evaluate the noise injection into the controller dynamics as a defense mechanism. We showed that noise injection prevents the attacker from completing the first stage as well, while only slightly degrading the system performance. In addition to that, we also investigate how the attack impact depends on the detector used and show that the stealthy worst-case impact is smaller when a detector with an internal state is used.
For future research directions, we want to look into different attacker objectives and investigate if detector dynamics can increase the worst-case impact for certain objectives. Furthermore, the second stage is only examined for detectors with linear dynamics such that we want to investigate if there are detector dynamics, which prevent the attacker from estimating the state of the detector. Finally, we would like to extend the experimental setup to a more sophisticated testbed for cyber-physical security, where, for example, the sensor and actuator measurements are transmitted wirelessly. In this testbed more aspects of cyber-physical security could be tested than only the physical impact on the plant.

Acknowledgments
This work is supported in part by the Swedish Research Council (grant 2016-00861) and the Swedish Civil Contingencies Agency (grant MSB 2020-09672).

A Appendix
This appendix includes a more detailed exposition of the modeling and the identification of the Temperature Control Lab as well as the controller design.

A.1 Temperature Control Lab
The Temperature Control Lab (TCLab) is an Arduino-based process with two heaters and two sensors to measure the heater temperatures (see Figure A.1). The heaters are close to each other such that there exists a coupling between the heater temperatures. Further, we are able to set the power outputs $Q_1$ for Heater 1 and $Q_2$ for Heater 2 to a value between 0 % and 100 %. Several models to describe the TCLab dynamics are investigated in Park et al. (2020), and the model that is deemed the most accurate is the physics-based model (A.1). Here, $T_{H,1}(t)$, $T_{H,2}(t)$, and $T_{amb}$ are the temperature of Heater 1, the temperature of Heater 2, and the ambient temperature, respectively. Further, $T_{S,1}(t)$ and $T_{S,2}(t)$ are the sensor measurements of Heater 1 and Heater 2, respectively. The temperature unit in (A.1) is Kelvin, while in the subsequent plots we use degrees Celsius to display the temperature measurements. Table A.1 shows the parameters and their (range of) values of the physical model. Note that for some parameters in Table A.1 the value is specified by an interval. This means these parameters need to be estimated, which we will do next.
The code to interact with the TCLab and for parameter estimation can be found at APmonitor (2021). The code from APmonitor (2021) is adjusted to estimate both $\tau_{c,1}$ and $\tau_{c,2}$ instead of assuming that $\tau_{c,1} = \tau_{c,2}$ as in Park et al. (2020). To estimate the unknown parameters, we apply piecewise constant control inputs to the TCLab and use the measured output signals for parameter estimation (see Figure A.2), where we set the ambient temperature, $T_{amb}$, to be the average of the first measurement of the two temperature sensors. With the control inputs and the sensor measurements we want to determine $\alpha_1$, $\alpha_2$, $U$, $U_s$, $\tau_{c,1}$, and $\tau_{c,2}$ by solving the optimization problem (A.2), as in the provided code, where $t_i$ are the time points at which we measured the temperature, $N_{id}$ is the number of measurements, and $T_{S,1,meas}(t_i)$ and $T_{S,2,meas}(t_i)$ are the measured temperatures for the sensors of Heater 1 and Heater 2, respectively. To estimate the unknown parameters, we collected data for thirty minutes, i.e., $N_{id} = 1800$, and used a sampling time of one second, such that $t_i - t_{i-1} = 1$ s (see Figure A.2). With the data in Figure A.2, the parameters that minimize the objective in (A.2) are $\alpha_1 = 0.00854\ \mathrm{W}/\%$, $\alpha_2 = 0.00480\ \mathrm{W}/\%$, $U = 4.05\ \mathrm{W}/(\mathrm{m}^2\,\mathrm{K})$, $U_s = 26.44\ \mathrm{W}/(\mathrm{m}^2\,\mathrm{K})$, $\tau_{c,1} = 25.16$ s, and $\tau_{c,2} = 22.50$ s. The value of the objective function with the data from Figure A.2 is 1.46. Finally, to validate our estimated model parameters, we recorded ten more minutes of data, i.e., $N_{id} = 600$, and compared it with the output of the theoretical model with the estimated parameters (see Fig-

A.2 Controller design
Now that we introduced the TCLab, its dynamics, and identified the parameters, we want to control the TCLab around a certain temperature. For that we design a linear controller, which utilizes a linear discrete-time model of the TCLab. We would like to point out that we want to design a controller that performs satisfactorily around the steady state and we do not consider more constraints on the rise time and the settling time, for example.
To find a linear model, we linearize the non-linear physics-based dynamics (A.1) around a steady-state value. We observe that $\lim_{t\to\infty} T_{S,i}(t) = \lim_{t\to\infty} T_{H,i}(t) = T_{H,i\infty}$ for $i \in \{1, 2\}$, thus we only need to find the steady-state values of the heater temperatures and the heater outputs, denoted as $T_{H,1\infty}$, $T_{H,2\infty}$, $Q_{1\infty}$, and $Q_{2\infty}$, respectively. The equations for the steady-state values are obtained by setting the time derivatives in (A.1) to zero, where we assume that the ambient temperature $T_{amb}$ is known and fixed. Alternatively, we can add the two steady-state equations and subtract them from each other to obtain an equivalent pair of steady-state equations. Now, we can fix the desired steady-state temperatures $T_{H,1\infty}$ and $T_{H,2\infty}$, and solve for $Q_{1\infty}$ and $Q_{2\infty}$. With $T_{H,1\infty} = T_{H,2\infty} = T_{H,\infty}$ we derive that $Q_{2\infty} = \frac{\alpha_1}{\alpha_2} Q_{1\infty}$, which together with the resulting value of $Q_{1\infty}$ gives the necessary steady-state inputs to achieve $T_{H,1\infty} = T_{H,2\infty} = T_{H,\infty}$. Note that we need $T_{H,\infty} \ge T_{amb}$ to ensure that both $Q_{1\infty} \ge 0$ and $Q_{2\infty} \ge 0$.
Figure A.4 shows a steady-state input that theoretically results in both temperatures having the same value, where the ambient temperature is $T_{amb} = 21$ °C. In this figure, we use $T_{H,\infty} = 40$ °C such that the required steady-state inputs are $Q_{1\infty} = 21.73$ % and $Q_{2\infty} = 38.72$ %. For $k \ge 1000$, i.e., after the transient, the average temperature of the first heater is approximately 40 °C, while the second heater reaches 41.12 °C. Hence, we observe that Heater 1 reaches the desired steady-state temperature of 40 °C, while the second heater is around 1.1 °C warmer than 40 °C, so the heaters do not reach the same temperature. Despite performing well with the validation data as shown in Figure A.3, our estimated model parameters do not seem to be able to model the steady-state behavior well. Hence, our theoretical model does not describe the TCLab perfectly.
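The relation between the two steady-state inputs can be checked against the reported numbers. The following minimal sketch uses only the rounded parameter estimates quoted in the text, so a small deviation from the reported $Q_{2\infty}$ is to be expected; the variable names are hypothetical.

```python
# Rounded parameter estimates from the identification (in W/%).
alpha_1 = 0.00854
alpha_2 = 0.00480

# Reported steady-state input for Heater 1 (in %).
q1_inf = 21.73

# Equal steady-state temperatures require Q_2 = (alpha_1 / alpha_2) * Q_1.
q2_inf = alpha_1 / alpha_2 * q1_inf

print(f"Q2_inf = {q2_inf:.2f} % (reported: 38.72 %)")
```

The result is about 38.66 %, within 0.1 percentage points of the reported 38.72 %; the remaining difference is plausibly due to the rounding of $\alpha_1$ and $\alpha_2$.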
Next, we look into the feedback controller design to achieve a better control around the desired steady-state value than the feedforward injection of the steady-state values.
In the linearization, we did not linearize with respect to the ambient temperature since it is assumed to be constant. Note that with the assumption of a constant ambient temperature, the linearized model does not depend on the ambient temperature itself.
Choosing $T_{H,1\infty} = T_{H,2\infty} = 40$ °C and discretizing the linearized equations with a sampling time of $T_s = 1$ s leads to our linearized discrete-time model. Next, we design two linear controllers for the TCLab based on the linearized discrete-time model, which should control the TCLab around its steady state. In the following, we set $T_{amb} = 21$ °C, since the experiments are conducted at room temperature. The closed-loop block diagram is shown in Figure A.5.
The first controller we design is a linear quadratic Gaussian controller (LQG controller), which is a combination of a linear quadratic regulator and a Kalman filter. The linear quadratic regulator designs a state-feedback control law that minimizes a quadratic cost function, where $Q_x \in \mathbb{R}^{4 \times 4}$ and $R_u \in \mathbb{R}^{2 \times 2}$ are cost matrices that penalize the state and the controller input, respectively. Since we can only obtain a noisy measurement $y(k)$, a steady-state Kalman filter is used to estimate the state $x(k)$. The Kalman filter minimizes the mean square error, $E\{\|x(k) - \hat{x}(k)\|_2^2\}$, where $\hat{x}(k)$ is the Kalman filter's estimate of the plant's state. Therefore, the controller input is given by $u(k) = -K\hat{x}(k)$, where the optimal controller gain $K$ is given by (A.4) and $P$ is the solution to the Riccati equation (A.5). The steady-state Kalman filter dynamics are driven by the steady-state Kalman gain $L$, which is obtained from the solution $P_L$ of the corresponding Riccati equation. Here, $\Sigma_w \in \mathbb{R}^{4 \times 4}$ and $\Sigma_v \in \mathbb{R}^{2 \times 2}$ are the covariance matrices of the process noise affecting the linear system dynamics and of the measurement noise affecting the measurements.
When designing the LQG controller, the parameters $Q_x$, $R_u$, $\Sigma_w$, and $\Sigma_v$ are our design variables. We use the guidelines provided by Athans (1971) to tune these design variables. Since the Kalman filter estimate is used to determine the control signal, it is advised not to choose the matrices $Q_x$ and $R_u$ for the LQR design and the noise covariance matrices $\Sigma_w$ and $\Sigma_v$ for the Kalman filter design independently (Athans, 1971).
Since the theoretical steady state is not reached when applying the steady-state input (see Figure A.4), we want the controller to have a tighter control around the steady state and therefore we choose a larger cost matrix for the state in the LQR problem. Since the steady-state control input is relatively small, we also decide not to penalize the control input with a large cost matrix, which leads to the following choice of state and control input cost matrices, $Q_x = 10 I_4$ and $R_u = 2 I_2$, respectively. For the Kalman filter, the value of the process noise covariance matrix can be interpreted as how much we trust the theoretical model. Hence, a large process noise covariance matrix compared to the measurement noise covariance matrix means that we trust the sensor measurements more than our model. Since the theoretical result does not match the real steady-state value in Figure A.4, we set the process noise matrix as $\Sigma_w = 5 I_4$. Further, we know that the sensors have an accuracy of ±1 °C, so we choose the measurement noise matrix as $\Sigma_v = I_2$.
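The design steps above can be sketched numerically. The example below uses the weights from the text ($Q_x = 10 I_4$, $R_u = 2 I_2$, $\Sigma_w = 5 I_4$, $\Sigma_v = I_2$), but the plant matrices `A`, `B`, and `C` are hypothetical placeholders with the same dimensions as the TCLab model, since the identified linearized model is not reproduced here. The Riccati equations are solved by a plain fixed-point iteration, and the gain expressions are the standard textbook forms, which may differ in notation from (A.4) and (A.5).

```python
import numpy as np


def solve_dare(A, B, Q, R, max_iter=5000):
    """Solve the discrete-time algebraic Riccati equation by fixed-point iteration."""
    P = Q.copy()
    for _ in range(max_iter):
        K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)
        P_next = Q + A.T @ P @ (A - B @ K)
        if np.allclose(P_next, P, rtol=1e-10, atol=0.0):
            return P_next
        P = P_next
    return P


# Hypothetical stable plant: states are two heater and two sensor temperatures.
A = np.array([[0.990, 0.005, 0.00, 0.00],
              [0.005, 0.990, 0.00, 0.00],
              [0.040, 0.000, 0.96, 0.00],
              [0.000, 0.040, 0.00, 0.96]])
B = np.array([[0.02, 0.00],
              [0.00, 0.01],
              [0.00, 0.00],
              [0.00, 0.00]])
C = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])

# Design weights taken from the text.
Q_x, R_u = 10.0 * np.eye(4), 2.0 * np.eye(2)
Sigma_w, Sigma_v = 5.0 * np.eye(4), np.eye(2)

# LQR gain: K = (R_u + B^T P B)^{-1} B^T P A.
P = solve_dare(A, B, Q_x, R_u)
K = np.linalg.solve(R_u + B.T @ P @ B, B.T @ P @ A)

# Steady-state Kalman gain via the dual Riccati equation:
# L = P_L C^T (C P_L C^T + Sigma_v)^{-1}.
P_L = solve_dare(A.T, C.T, Sigma_w, Sigma_v)
L = P_L @ C.T @ np.linalg.inv(C @ P_L @ C.T + Sigma_v)

print("LQR closed-loop poles:", np.abs(np.linalg.eigvals(A - B @ K)))
print("Estimator poles:", np.abs(np.linalg.eigvals(A - A @ L @ C)))
```

In this sketch both sets of poles lie strictly inside the unit circle, i.e., the resulting controller has stable dynamics, which is the situation in which the attacker can complete Stage I according to the main text.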
We test the designed LQG controller in the following. We apply the steady-state input from $t = 0$ s on and activate the controller at the same time. The trajectories of the temperature measurements and the heater power can be seen in Figure A.6. We observe that the controller leads to a spike in the heater power, which leads to a faster convergence close to the steady-state value than in Figure A.4. Furthermore, the spike in the control input does not exceed the maximum control input of 100 %. After reaching their steady state, both temperatures remain at an average temperature of approximately 40 °C for $t \in [900, 3600]$ s, which shows us that the controller successfully manages to keep the temperatures at the desired steady-state temperature. The mean value of the heater power for $t \in [900, 3600]$ s is 21.73 % for Heater 1 and 38.71 % for Heater 2, which is close to the steady-state inputs $Q_{1\infty} = 21.73$ % and $Q_{2\infty} = 38.71$ %.
Next, we add integral action to the LQG controller. The LQG controller with integral action is subsequently called the LQI controller. The control input of the LQI controller is a linear feedback, with gain $K_{\mathrm{int}}$, of the estimated plant state and the integrator state, where $x_{\mathrm{int}}(k) \in \mathbb{R}^{n_y}$ is the state of the integrator and $d(k)$ is the desired reference, which the output $y(k)$ should track. Since $y(k)$ represents the deviation from the reference steady-state temperature, it should be close to zero, such that $d(k) = 0$ is used in the following. To design the controller matrix $K_{\mathrm{int}}$, we again use equations (A.4) and (A.5), but with $A_{\mathrm{aug}}$ and $B_{\mathrm{aug}}$ instead of $A$ and $B$, where
$$A_{\mathrm{aug}} = \begin{bmatrix} A & 0 \\ -T_s C & I \end{bmatrix} \quad \text{and} \quad B_{\mathrm{aug}} = \begin{bmatrix} B \\ 0 \end{bmatrix}$$
are the system and input matrices of the plant augmented with the integrator state. Since the augmented system has two more states, we need to change the cost matrix $Q_x$ in (A.5) as well. The new cost matrix for the augmented system state is chosen as $Q_{x,\mathrm{int}} = \mathrm{diag}(10 I_4, 2 I_2)$, while we still use $R_u = 2 I_2$ for the cost matrix of the controller input. Since we have access to the integrator states, we do not need to recalculate the observer gain $L$. Using the LQI controller, we obtain temperature trajectories that reach the steady state (see Figure A.7). Initially, the controller leads to oscillatory behavior until the steady state is reached. The reason for these oscillations is the large deviation from the steady-state value at the beginning of the experiment, which accumulates in the integrator and results in integrator windup. Anti-windup schemes can remedy the oscillatory behavior but are not investigated here. For more information on anti-windup schemes, we refer the reader to Galeani et al. (2009). Furthermore, we see that in the beginning the controller demands heater power values below 0 % and above 100 %, but the saturation of the heater power limits the values to the interval between 0 % and 100 %.
Once the TCLab reaches temperatures around the desired steady-state value, we see that the control input stays inside the interval between 0 % and 100 % and that both heaters have the same temperature. The average heater power in the interval [900, 3600] s is 21.54 % for Heater 1 and 43.24 % for Heater 2. The average heater power during steady state for Heater 2 is thus larger than the corresponding average under the LQG controller in Figure A.6, while the heater power for Heater 1 is similar. The reason is that, due to the integral action, the LQI controller can adjust to changes in, for example, the ambient temperature. Therefore, the LQI controller is able to keep the temperatures at the desired steady-state value.
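The LQI design described above can be sketched as follows. This is a minimal illustration, not the implementation used in the experiments: `A`, `B`, and `C` are hypothetical placeholders rather than the identified TCLab model, the sampling time `T_s = 1.0` is an assumed value, and the integrator update $x_{\mathrm{int}}(k+1) = x_{\mathrm{int}}(k) + T_s (d(k) - y(k))$ is the form we infer from the $-T_s C$ block of the augmented system matrix.

```python
import numpy as np
from scipy.linalg import block_diag, solve_discrete_are

# Hypothetical discrete-time model (NOT the identified TCLab model).
A = np.array([[0.90, 0.02, 0.05, 0.00],
              [0.02, 0.90, 0.00, 0.05],
              [0.05, 0.00, 0.93, 0.00],
              [0.00, 0.05, 0.00, 0.93]])
B = np.array([[0.10, 0.00],
              [0.00, 0.10],
              [0.00, 0.00],
              [0.00, 0.00]])
C = np.array([[0.0, 0.0, 1.0, 0.0],
              [0.0, 0.0, 0.0, 1.0]])
T_s = 1.0  # assumed sampling time in seconds

n_x, n_u = B.shape
n_y = C.shape[0]

# Plant augmented with the integrator state.
A_aug = np.block([[A, np.zeros((n_x, n_y))],
                  [-T_s * C, np.eye(n_y)]])
B_aug = np.vstack([B, np.zeros((n_y, n_u))])

# Cost matrices as chosen in the text.
Q_x_int = block_diag(10.0 * np.eye(n_x), 2.0 * np.eye(n_y))
R_u = 2.0 * np.eye(n_u)

# LQR gain for the augmented system (feedback sign u = -K_int z is an assumption).
P = solve_discrete_are(A_aug, B_aug, Q_x_int, R_u)
K_int = np.linalg.solve(R_u + B_aug.T @ P @ B_aug, B_aug.T @ P @ A_aug)


def integrator_step(x_int, y, d, T_s):
    """Inferred integrator update: accumulate the tracking error d - y."""
    return x_int + T_s * (d - y)


def spectral_radius(M):
    """Largest eigenvalue magnitude of a square matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(M))))
```

Note that `A_aug` itself has eigenvalues on the unit circle because of the integrator, so the design relies on the feedback `K_int` to move them inside; `spectral_radius(A_aug - B_aug @ K_int)` can be used to confirm this for a given model.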