Linear quadratic Gaussian control with advanced continuous-time disturbance models for building thermal regulation

This paper introduces a linear quadratic control scheme for a continuous-time system with observations taken at discrete times. Particular attention is given to the derivation of the disturbance terms in the model. Control performance may depend critical on accurate disturbance forecasts. This is the case for building climate control, where solar rays pass through e.g. windows and deliver significant amounts of energy and the dynamics can be very fast, fluctuating, and spontaneous. We thus argue that it is critical for control performance to sufficiently describe and include disturbances in the control description to obtain satisfactory control accuracy. We suggest and derive in details a control framework based on continuous-time stochastic differential equations (SDEs) and linear quadratic Gaussian control using an advanced continuous-time disturbance model to supply disturbance forecasts. The numerical simulation results suggest that control with embedded forecasts handles uncertainties well and provides up to 26% performance improvements compared to standard disturbance mitigation techniques. Furthermore, we demonstrate that the quadratic controller has a useful trade-off between variability in the control signal, economic cost, and variability around the reference point.


Introduction
As the share of renewable energy in the electricity grid continues to grow, related problems become more prominent.Among these problems are misalignments in energy production and consumption, voltage overload, congestion, etc.The traditional solution up to modern times has been to control the production such that the electricity demand is covered.However, in an efficient implementation of future weatherdriven energy systems, this is no longer an option.One solution is to focus on demand-response methodologies to unlock and control the flexibility at the consumer side [1,2].A lot of recent work therefore centre around the concept known as energy flexibility [3,4].The idea is to control the demand to align it with the production by utilising the inherent energy storage in households and buildings (such as thermal mass or stationary batteries), see e.g.[5] for an introduction and specific application examples.A key technology for enabling this solution is sophisticated predictive control of buildings.A well described energy flexibility setup involves a two-level control structure, where the upperlevel controller (e.g. for voltage) computes a price signal which shifts the overall energy demand of the lower level controllers (e.g.individual buildings) to times where the CO2-concentration in the electricity mix is lower [2,6].
Linear quadratic regulation (LQR) is a very well studied and applied control scheme due to its optimal linear control law and simplicity [7,8].The linear quadratic Gaussian (LQG) regulator extends the LQR scheme by also considering Gaussian system noise.Hence, LQG deals with the stochastic case of LQR, which may be closer to reality [9].Disturbances also constitute an important aspect of control [10].The literature identifies the solar radiation and ambient air temperature as critical factors dictating the heating and cooling needs for buildings [11,12].Model-based predictive control (MPC) for smart energy systems is an active research area with many examples of e.g.conventional [13], stochastic [14] and robust MPC [15].The use of LQG control in the literature is not as widely used for HVAC systems where often the simpler LQR scheme is used instead [16][17][18][19][20]. Furthermore, the common standard for mitigating disturbances in control (such as weather disturbances for buildings) is to apply offset-free regulation, see e.g.Errouissi et al. [21] or Taylor et al. [22] for applications in building climate control.Here, one introduces an additional integrating state that is able to cancel out a constant disturbance.LQG optimal control minimises the quadratic deviation of a linear transformation of the states to a set-point.Thus, compared to certainty-equivalent linear control (such as economic MPC with linear costs on constraint violations), LQG control inherently deals with uncertainty in a different manner.The quadratic penalty steers the mean value of the system state into the reference signal.Linear cost on the other hand steers the https://doi.org/10.1016/j.apenergy.2022.120086Received 28 February 2022; Received in revised form 22 September 2022; Accepted 30 September 2022 median onto the reference point [23].We discuss the differences and their consequences in more detail later.
For building climate control that uses weather forecasts, one of the dominating standards is to use meteorological forecasts [24].Such forecasts have the advantage of giving relatively accurate forecasts potentially several days ahead.However, they typically do not include solar radiation forecasts and tend to be less accurate for short-term purposes [25,26].Many examples of short-term forecasting models exist in the literature, such as linear regression-based models [27], artificial neural networks [28], and more advanced time series techniques [29].Thilker et al. [10] propose an SDE-based model for predicting key attributes such as the solar radiation and outdoor air temperature.This form has the advantage that it fits naturally in the building model description (which is also formulated using SDEs).This paper shows the benefits of using an embedded model for disturbance forecasts in a LQG control setup for a residential building and highlights possible savings.

Main aim and contribution of the paper
This paper discusses the role of disturbances in dynamical systems in great details and relates it to building thermal control.We derive a zero-order hold discretised system and optimal control problem from the continuous-time system model of a building undergoing disturbances from the ambient air temperature and solar radiation.Furthermore, this paper develops a LQG control framework for a building thermal model with an embedded disturbance model that supplies short-term weather forecasts.We carry out a simulation study suggesting that the proposed LQG controller with embedded disturbance model performs almost as good as a controller that uses perfect disturbance forecasts.

Structure and outline of the paper
This paper has the following structure.Section 2 describes the standard LQG control scheme and derives the optimal control law and state estimate.Section 3 presents, motivates and derives the optimal control problem for the extended multi-step LQG control, which also includes the electricity prices and accounts for the weather disturbances.Section 5 presents the numerical simulation-based results and Section 6 summarises the findings and discusses future work and possibilities.

Continuous-time LQG control with embedded disturbance forecasts
This section introduces the continuous-time linear quadratic control problem.We start by describing the importance of conditional expectations of the future in predictive control.We also deal with the discretisation in time of the dynamics and the objective function and how to deal with disturbances in such discretisation.Zero-order hold is the dominating discretisation standard.However, such discretisation introduces additional error terms that may affect the performance, and thus are important to characterise.In this section, we discuss errors and their effects on the system.
In this paper, we denote time dependence in subscript () =   and we use   as short-hand notation for    .This paper deals with continuous-discrete time state space models on the following form [30] d ⊺  ] ⊺ .The weather is, however, governed by strong non-linearities and cannot be treated as a linear system, which is why it is necessary to uncouple them.Section 4.2 elaborates on this.

Relation between forecasting and control
In the following, let   be a continuous-time stochastic process (not necessarily the same as in Eq. ( 1)).The following term measures the expected -order moment between   and a signal where ‖ ⋅ ‖  is the usual -norm of vectors,  ≥ 1, and   = { 0 ,  1 , … ,   } is all historic observations up till time   .Putting  = 2, we obtain the non-central second order moment.Such an objective measures the variance of a process.Now, let   be controllable in the sense given in [31].Given information until time   , we can write the minimum-variance optimal control problem as The optimal minimiser of the minimum-variance problem (3) can be shown to be the conditional expectation, denotes the conditional variance with respect to   .The above attains its minimum exactly for   = E[  |  ].This implies that we need to use conditional expectations to solve stochastic control problems with a quadratic cost function as in (3).Hence, we need to estimate the following For linear systems with Gaussian noise, the Kalman filter is an optimal estimator of the conditional expectation.In a discrete form (which we derive in Section 2.2), the continuous-time system in Eq. ( 1) can be written Since the conditional expectation of the system state depends on disturbances,   , we also need forecasts of the conditional expectations of these to do optimal predictive control.That is, we need to supply the following forecasts of the disturbances to the optimal control problem Note that putting  = 1 in (2) yields the absolute value, for which the median of   is the optimal minimiser.By tilting the absolute value function appropriately, certain quantiles become the minimiser.This knowledge can be very useful in forming the right objective function to mimic the desired behaviour of the system.It might be that one side of the reference trajectory is a very expensive operational area, and thus shaping the objective to make the reference trajectory align with the 99% quantile may be appropriate.
The quadratic objective function thus measures the squared norm of deviation between the conditional expectation of the stochastic process and the signal   .However, as we shall also do in the present paper, it is very common to include additional terms in the quadratic cost function in (2) [23] such that the problem becomes where   is a reference trajectory,   is the control signal, and   is a cost related to the input at time .The subscripts  and  denotes the weighted 2-norm (with weights  and ).For energy systems, this objective is appropriate for stabilising e.g.voltages in grids, aligning energy consumption of building stocks etc.The first regularisation term reflects the cost of the input deviating too much from a desired reference point.The second term typically reflects some economic costs related to input in time.In the context of building thermal control using a heat pump, the latter can be the electricity price related to the heat pump operations.

Discretisation of the continuous-time dynamics
In this section, we derive a discretisation of the continuous time dynamical system in (1).Due to the discrete nature of the way computers perform numerics, it is necessary to somehow discretise the optimisation problem in (7).For energy systems, the disturbances typically play an important role and are themselves encumbered with uncertainty.We shall therefore put extra attention towards how to handle the disturbance term in (1),     , in the discretisation.
Let   be the sample time between the control points,   =  +1 −   .We assume that the input  is constant during the time intervals [  ,  +1 [ for all  ∈ N, i.e. zero-order hold.The solution to   in (1) gets the form where the matrices ,  and process noise   are defined by and  in ( 9) have known closed-form solutions and are easy to compute.However, it is not straightforward how to deal with the integral ∫ (8).The rest of this section deals with this integral when disturbance forecasts are supplied as zero-order hold values.
As Section 4 describes, the embedded disturbance model in (1b) estimates the mean value of the disturbances between time samples [  ,  +1 [, that is d is the true mean value of the disturbances, the conditional mean value supplied by the disturbance model given observations up till time   , and   is i.i.d.Gaussian distributed white noise.To rewrite the system in (1) with the weather predictions, we can rewrite the integral in (8) as ≈  d| where In (11e), it is assumed (wrongfully) that the disturbances act constantly on the system such that   − d = 0 and the integral The term   can be thought of as the error related to the zero-order hold discretisation.Fig. 1 depicts the calculations involved in (11).The hatched areas indicate the two factors in the integral in (11c) between two time samples   and  +1 .It becomes evident that the integral of the product of the two terms in general is not zero.Unfortunately, the disturbance processes are strongly non-linear and behave indescribably, which implies that   − d becomes correlated in time (as Fig. 1 illustrates).For this reason, the easiest solution is to neglect the error as in (11e).Since the uncertainty of the disturbances is described by Brownian motions, it is possible to characterise the uncertainty of the disturbances between time samples using a Brownian bridge.In such a setup, by conditioning the Brownian bridge on attaining the estimated disturbance values at times {  ,  +1 }, the disturbance process between the time steps is Gaussian.It is then straight forward to compute uncertainties and densities for the disturbances at all points in time.
Inserting (11e) into (8) gives us the system with   =    and   ∼ (0,  1 ) and   ∼ (0,  2 ). 1 is the covariance of the uncertainty related to the disturbance estimate and ) d.Aggregating the noise terms,   =   +   , gives the conventional discrete-time stochastic state space system It immediately reveals that the computation of E[ + |   ],  = 1, … ,  requires conditional mean forecasts of the disturbances.

Error quantification of the discretisation
Since   is difficult to determine, the easiest solution is to disregard it.But can we say something about the error we make by this simplification?The discretised system (with all its terms) gets the form: where The error term   can be bounded by the following: Due to the Gaussian noise acting on the system, we do not know the deterministic position of the system -instead, the position is given by a Gaussian density, centred around x| .
⋅ max The first two norms of (15b) are related to the singular values of the dynamics  and -i.e. if the dynamics are fast, the error might be larger (due to the matrix exponential that acts as a weighting factor) and vice versa.The last term relates the error to the maximum quadratic variation between the continuous process   and its mean value in the interval [  ,  +1 [.Note that the error term ‖  ‖ 2 2 scales linearly in time, in line with the variance of a standard Brownian motion.Also note that had the weighting factor exp (   ( +1 − ) ) not been present, the integral would simply vanish.It is hence the system's dynamical influence that causes the error of the zero order hold.

Discretisation of the continuous-time objective function
The objective function has the purpose of weighting the performance of control solutions to make them comparable.E.g., typically, the purpose of controlling the indoor air temperature of a building is to maintain a comfortable temperature while minimising the electricity consumption and/or price (and perhaps more criteria such as variations in the input signal).It is thus a tuning problem to correctly weight the contributions to mimic desired behaviour.
In the previous subsection, we obtained the system on a discretetime form using a zero-order hold discretisation scheme.This has the advantage of parameterising the input signal as a set of values { + } −1 =0 , which is numerically tractable in an optimisation problem.Given information up until time   ,   = { 0 ,  1 , … ,   } the finitehorizon continuous-time LQG control objective function from Eq. ( 7) gets the form where   and ū are reference trajectories.Note that we employ the weighted 2-norm.How to discretise the objective function is not straight forward and can be done in multiple ways.In general, the discretisation is an approximation of the continuous-time objective.However, in the linear-quadratic case, an exact discretisation exists [32].
The derivation is tedious, though, and if the time between samples and control inputs is small, the integral can be approximated well by an Euler discretisation, i.e. evaluating the objective function point-wise.Let be a partition of the future control times of the system.We discretise the objective function point-wise such that the integral in ( 16) becomes a sum over the values at the time points in ( 17) Using that   =   , the first term in the sum simplifies to where  ′ =  ⊤  and  ⊤  = − ⊤  .The second part of the objective function equivalently has the form The combined discretised objective function (where we omit terms that are independent of the system state or the input signal) is ] . (21)

Solution to the LQG control problem with embedded disturbance model
The previous section presented and derived the discretisation of the system dynamics and objective function to make computations tractable for the computer.There are multiple ways to solve the LQG optimal control problem.Singh and Pal [33] derive an optimal linear feedback law based on the current state and future disturbance inputan extension to the classical LQR feedback law.However, such laws are not able to deal with constraints, which requires (in general) numerical solvers to solve.In this section, we derive solutions to the finite horizon constrained LQG optimal control problem.
When dealing with stochastic systems as in (1) governed by both Gaussian system and observation noise, the LQG control problem can be divided into two sub-problems due to the separation principle (under certain conditions) [34].Given a noisy observation of the system,   , the steps are 1.Reconstruct the system states,   , using the regular Kalman filter (which is an optimal state estimator in the linear case with Gaussian noise), i.e.
2. Solve the LQG optimal control problem using the reconstruction as a certainty-equivalent state estimate.
The LQR and LQG optimal control problems are special cases when it comes to their solutions and ability to separate the estimation and optimal control problems.This section presents and solves both problems.

State estimation (filter)
Due to the Gaussian error term in (13),   =   +   , the system experiences random forces that pushes it away from its deterministic path given by the linear system in Eq. ( 5).We thus do not know the exact position of the system.To estimate the system state, filters and observers are common choices [30,35].However, in case of a linear state space model with Gaussian process noise, the Kalman filter provides an optimal state estimator.It estimates a distribution of the system state at time   , (  |   ) ∼ ( x| , P| ) given the past observations.Fig. 2 depicts the distributional development in the Kalman filter update: We start out with an initial state estimate at time   given by a distribution ( x| , P| ).The system development gives a predicted estimate of the successive state, ( +1 |   ) ∼ ( x+1| , P+1| ).When the next observation,  +1 , becomes available, the Kalman filter combines the system prediction and the observation at time  +1 , to compute the optimal system state estimate, Algorithm 1 lists the necessary computational steps in the Kalman filter to update the system state distribution.

Solution to the LQG optimal control problem
Consider the following finite horizon, inequality constrained optimal control problem given x| and { d+| } −1 =0 , with the discrete-time objective function We solve the optimal control problem in (22) by introducing a notation for the system for all points in time from  = 0, … ,  − 1.
where the matrices and vectors are We also introduce the vector containing the prediction errors  +| = x+| −  + ,  ∈ N, which are due to the Gaussian error term in (13), Inserting the new notation in to the objective function ultimately isolates the stochasticity to the prediction errors (the predictions and prediction errors are independent under the linear quadratic assumption) where ] ⊤ .Since the last term does not depend on  , we can omit it in the optimisation problem.The next step is to insert (24) into the cost function Notice that we eliminated the equality constraints (22b) since they are now embedded in the cost function.We can write the objective,   , as a quadratic function in with the matrix and vector 22) is equivalent to the following convex quadratic programme  with φ as in (29).The objective, φ , is convex and enables fast numerical solvers such as interior-point methods, see e.g.[36].Had the problem been unconstrained, the solution to the quadratic optimal control problem would have been given in closed-form [37].This has the advantages of being fast to evaluate and gives useful insights into how the solution depends on the various parameters.The LQG control framework is now given by a filtering-and an optimisation step.Given an observation,   , we reconstruct the current state, x| , in the filtering step, and solve the optimal control problem to retrieve the optimal input.In the MPC algorithm, we require an addition computational step to compute the disturbance forecasts.Algorithm 2 lists the LQG control framework.

The smart building and disturbance model
In the numerical case study (to simulate the performance of controllers using different disturbance forecasting schemes) we consider the heat dynamics of a building given by a model based on SDEs.Furthermore, we use a SDE-based disturbance model to supply forecasts.This section introduces both models briefly.

The smart building model
This paper considers a model based on SDEs with a linear model for the building heat dynamics and a non-linear model for the disturbances.The SDE representation in (1) provides a natural way to express physical systems due to the continuous-time formulation.
This paper uses the continuous-time heat dynamics model of a building identified and estimated by [38].The authors show that a sufficient model for describing the heat dynamics of the specific building involves two heat accumulating media temperatures: the room air and the floor.We denote the state variable () = [   (),   () ]  , where   and   are the room air and floor temperature, respectively.Furthermore, the authors identify how the important disturbances act on the smart building; that is the ambient air temperature,   (), and the solar radiation,   ().Lastly the room is equipped with an electrical heater to supply heat.Fig. 3 illustrates the heat dynamics and interactions between the model components.Mathematically, the following matrices govern the continuous-time smart building model on linear form as in ( 1)  1 gives the values and descriptions of the parameters in (31).The model in (31) is of course very simple and could be extended to include many effects such as the relative humidity or CO 2 -levels, both of which impacts the indoor climate [39].

Table 1
The values used in the model for a single smart building in (31)

The disturbance model
An important contribution of this paper is to show that disturbance modelling is crucial for control performance.We use the advanced dynamical disturbance model introduced in [6,10].The disturbance model predicts the mean value of the solar radiation,   (), and the ambient air temperature,   (), in the time interval [  ,  +1 [.The model is based on stochastic differential equations and incorporates many climate processes.Fig. 4 gives an overview of the model: It includes models of the following components • Cloud cover,  • Global solar radiation (based on the direct and diffuse radiation),   • Net radiation,   • Ambient air temperature,   Fig. 4 illustrates the way the components are coupled and thoroughly explains the dynamics and interactions.Ultimately, the model predicts the amount of solar radiation hitting a horizontal surface (in Watts) and the ambient air temperature.That is, given an observation of the disturbances at time   , ŷ, , the disturbance model returns a sequence of disturbance forecasts where d+| = [  ,+| ,  ,+| ]  is the prediction of the disturbances in the time interval [  + ,  ++1 [,  ∈ N and  is the prediction horizon.The building-and disturbance systems in (31) are one-way coupled and the observation of the building system does not contain much additional information about the disturbance state.For this reason, we let separate Kalman filters reconstruct each system for a given time instance.Furthermore, we let the MPC use the weather forecasts as input to the grey-box model for the disturbances to solve the optimal control problem.
Fig. 5 shows a forecasting example with a prediction horizon of four days.The disturbance model forecasts the expected value, which visibly goes to a steady state after some time (after which the forecasts corresponds to a mean value).
Remark.The presented model includes only a limited climatic processes for predicting the ambient air temperature and global solar radiation.If relevant for the control objective, other variables could be added to the disturbance model -e.g.humidity factor, CO 2 , etc.

Numerical case study
In this section, we carry out a simulation study to quantify the effects of using embedded disturbance forecasts for the continuoustime LQG controller based on the optimal control problem in (30) using the discretised dynamics and objective function in ( 13) and ( 21), respectively.We compare the results with a controller that uses perfect forecasts to give an upper bound on the possible control performance.We also compare a controller that uses offset-free control -also known as persistent forecasts.The latter is a standard way of dealing with disturbances in the control literature, see e.g.[40][41][42] for introductions to the technique.

Simulation setup
To include weather disturbances into the simulations and mimic natural settings, we use actual weather data as the true disturbances acting on the building model.The controllers use weather predictions supplied by the disturbance forecasting schemes (either persistent, advanced, or perfect forecasts).The results in this section (simulation results in Figs.7 and  For more details on the data, it is thoroughly described in [43].
The controllers use a prediction horizon of 84 h and a time sample of 1 h.The constraints on the input are  min = 0 W,  max = 1500 W,  min = −500 W, and  max = 500 W. The electricity price is parameterised as   =   c , where c is electricity price data taken from Nordpool and   is a constant scalar to weight the electricity price in the optimisation.To give a visual example of the control solution and better understand the differences of the strategies, Fig. 6 shows the first two weeks of the 7 months of the control simulation.The overall behavioural pattern seems to be more or less the same for all controllers.They tend to buy electricity at the same times.However, the controller using persistent forecasts seems to consistently overheat in periods with significant amounts of sun, compared to the other controllers.Here, the controllers using advanced and perfect forecasts seem to supply more equal control solutions.

8.
The savings in terms of the objective of the controllers using advanced-and perfect forecasts compared to a controller using persistent forecasts.The solid and dashed lines use   = 0 and   = 0.5 ⋅ 10 −5 , respectively.E.g.The dashed lines at the weight   = 0.0002 shows that the advanced-and persistent forecasts performs around 10% better compared to the persistent forecasts for   =   = 0.5 ⋅ 10 −5 .

Comparison of performances between forecasting schemes
Fig. 7 shows the pareto-fronts displaying the trade-offs between variance in the signals and economic costs.Fig. 7(a) shows the tradeoff between the variance of the solution and the control signal: If we require less variance of the process around the reference signal, the variance of the input increases and vice versa.This trade-off is a consequence of the LQG control that weights the variance of solution and the input.The controller using advanced forecasts is able to obtain smaller variation of the solution for a given tolerable input variation compared to persistent forecasts.Fig. 7(b) shows the trade-off between the variance in the solution and the economic costs.It shows again that it is more expensive to require less variation in the solution.Again, the controller using advanced forecasts is able to obtain a certain variation in the solution for a smaller economic cost.
Overall, the advanced forecasts are able to deliver better solutions in terms of economic costs and variations in the input and solution.The performance is sometimes close to that of using perfect forecasts.
Lastly, Fig. 8 shows the savings of controllers using perfect and advanced forecasts as a function of the economic price weight compared to a controller using persistent forecasts.The objective function savings

Fig. 1 .
Fig. 1.An illustration of the calculations of the model discretisation.The disturbance model estimates the mean value in the time sample [  ,  +1 ].The slanted hatched area indicates the factor   − d and the vertical hatched area shows the exponential factor (in 1 dimension) in the integral in (11c).

Fig. 3 .
Fig. 3.A diagram of the smart building model and the interactions.The room air exchanges heat with the floor and the ambient air and is heated by electrical heaters.The solar radiation enters through the windows and typically delivers significant amounts of heat.

Fig. 4 .
Fig.4.Two illustrations of how the climatic processes interact and how the disturbance model is designed.In 4(a) the two components of the solar radiation is displayed: The global radiation consists of direct-and diffuse radiation where the latter is reflected off from objects in the atmosphere.In 4(b), the interactions of the components near Earth's surface is depicted.The net radiation is the net input of short-and long wave radiation.It plays a crucial role in describing the ambient air temperature.The sea surrounding Denmark highly regulates the air temperature.The heat capacity of water is much larger compared to that of air, and the sea temperature (  ) thus acts as a slow and regulating component in the model. ∞ represents a constant heat loss of the air.

Fig. 5 .
Fig. 5.The advanced dynamical disturbance model compared to persistent forecasts and true disturbances.
8) are based on 7 months of weather data, where observations are separated by 1 h.The data was collected from a weather station located in Taastrup in Denmark in the period from 1971 to 1973.The data include the following variables:• Cloud cover ([okta]) • Direct radiation ([W/m 2 ]) • Diffuse radiation ([W/m 2 ]) • Net radiation ([W//m 2 ]) • Air temperature ([ • C])

𝒖, and 𝒅 are
the system, input, and disturbance states, respectively.  ,   and   are matrices governing the state evolution, input, and disturbances, respectively.  is standard Brownian motion,   is the observation,   is the observation noise, and   is the control variable.The disturbance model in (1b) is non-linear and can be used to forecast the disturbance states.Had the disturbances   been described by a linear model, we could couple the building and weather states into a single state space model [ = (     +     +     ) d +   d  (1a) d  =   (  , )d +   ( .