Event-triggered second-moment stabilisation under action-dependent Markov packet drops

This paper considers the problem of second-moment stabilisation of a scalar linear plant with process noise. It is assumed that the sensor communicates with the controller over an unreliable channel, whose state evolves according to a Markov chain, with the transition matrix on a timestep depending on whether there is a transmission on that timestep. Under such a setting, an event-triggered transmission policy is proposed which meets the objective of exponential convergence of the second moment of the plant state to an ultimate bound. Further, upper bounds on the transmission fraction of the proposed policy are provided. The results are illustrated through an example scenario of control in the presence of a battery-equipped energy-harvesting sensor. The proposed control design as well as the analytical guarantees are verified through simulations for the example scenario.


INTRODUCTION
In the literature, the problem of control over time-varying action-dependent channels has been understudied. This paper addresses this gap using the approach of event triggering for controlling a scalar linear system over an unreliable action-dependent Markov channel.

Literature review
The last two decades have seen extensive research on various issues and design methods in networked control systems (NCS) [1][2][3][4]. One such area is event-triggered control [5][6][7][8][9], which has been applied in numerous contexts for various control goals. However, the volume of work on event-triggered control in the stochastic setting is still not as considerable as that in the deterministic setting. Some early works in the stochastic setting include [10][11][12][13]. Several papers that consider event-triggered transmissions under stochastic packet drops exist in the context of estimation [14], linear quadratic Gaussian (LQG) control [15][16][17], non-linear systems [18], multi-loop control of linear systems [19,20] and stabilisation [21][22][23]. However, these works consider only independent and identically distributed (i.i.d.) packet drops. An exception in the works on event-triggered control is our previous paper [24], which considers Markov packet drops. Even in the literature on NCS, a very common assumption is that the packet drops are i.i.d. across time. However, in order to better capture time-correlation effects in networks, there has been recent consideration of packet-drop probabilities evolving according to a Markov chain. Some recent works considering Markov packet drops include stability of Kalman filtering over networks [25,26], channel selection for control of multi-loop non-linear systems [27], and mean-square stabilisation with quantised feedback [28,29]. Beyond packet drops, some other works on NCS with Markovian channels include [30] for Kalman filtering with Markov inter-reception times, control under Markov missing data [31], mean-square stabilisation with the channel data rate evolving as a Markov chain [32] and over a noisy fading channel where the evolution of the fading gain is Markovian [33,34], as well as in the context of control over vehicular ad hoc networks [35,36] (see also references therein).
In the literature on communication systems, Markov models for channels have a long history, starting with the works of Gilbert [37] and Elliott [38]. Reference [39] is a relatively recent survey on Markov modelling of fading channels. Channels whose properties depend on past actions also serve as useful models for communication systems as well as for other applications. Some examples in the communication literature include [40], which considers streaming in buffer-enabled wireless networks, and [41], which is on communication in underwater acoustic channels. Action-dependent Markov processes also model systems other than communication channels. Reference [42] is a recent survey on models and research works on systems whose operation depends on a 'utilisation-dependent component', such as queueing in action-dependent servers [43], iterative learning algorithms and systems with energy-harvesting (EH) components, among other applications. Reference [44] considers a communication system powered by an EH battery, modelled as an action-dependent Markov channel. This model shares significant conceptual commonality with the model we use for simulations in Section 6.

Contributions
The major contributions of this paper are as follows.
• We consider the problem of second-moment stabilisation over a channel with action-dependent Markov packet drops.
To the best of our knowledge, such channels have not been considered before in the context of NCS. For example, the works [28,29] consider Markov packet drops without dependence on past transmission actions. We provide a necessary condition on the plant dynamics and the channel parameters for our transmission policy to achieve the control objective. This necessary condition is similar to the conditions often found in data-rate-limited control [45] and NCS in general.
• The proposed event-triggered transmission policy is similar in spirit to our earlier works [21,24]. However, [21] considers only i.i.d. Bernoulli packet drops and [24] considers Markov packet drops. In contrast, here we consider action-dependent Markov packet drops, which results in a coupling of the evolution of the plant and channel states. This aspect makes the analysis necessary for providing theoretical guarantees on performance significantly more challenging. In particular, the two main analytical contributions in this part are a theoretical guarantee of second-moment stability and an upper bound on the fraction of timesteps, over a time horizon, on which a transmission occurs under the event-triggered policy.
• We model the problem of control with a battery-equipped EH sensor using the proposed action-dependent Markov channel framework and illustrate our proposed event-triggered policy and results through simulations. This example also demonstrates the wider applicability of our model, beyond the problem of control over wireless communication channels.

Notation
We let ℝ, ℤ, ℕ and ℕ_0 denote the sets of real numbers, integers, natural numbers and non-negative integers, respectively. We use the standard font for scalar quantities and boldface for vectors and matrices. The notations 1, 1_i and I denote the vector with all entries equal to 1, the vector whose ith entry is 1 and all other entries are 0, and the identity matrix, respectively, of appropriate dimensions. We use ρ(A) to denote the spectral radius of a real square matrix A. We denote the space of probability vectors (i.e. vectors with non-negative entries that sum to 1) of n dimensions as ℙ_n.

SYSTEM DESCRIPTION
In this section, we describe the plant, channel, controller and the control objective. A schematic of the system is provided in Figure 1.

Plant and controller model
Consider a scalar linear plant with process noise

x_{k+1} = a x_k + u_k + v_k. (1)

The parameter a is the inherent gain of the plant, which we assume is unstable, that is, |a| > 1. The variables x_k, u_k and v_k are the plant state, the control input and the process noise, respectively, at timestep k ∈ ℕ_0. We assume that v_k is i.i.d. across timesteps k and independent of all the other system variables. Its distribution has zero mean and finite variance, that is, 𝔼[v_k] = 0 and 𝔼[v_k²] = M < ∞. The sensor determines the transmission decision t_k ∈ {0, 1} at each timestep k according to an event-triggered transmission policy, on the basis of the plant state and all the information available on timestep k. Even if the sensor transmits a packet at timestep k (t_k = 1), the packet may be dropped by the communication channel according to a packet-drop model which we describe in Section 2.2. We let r_k be the reception indicator: r_k = 1 if a packet is successfully received by the controller on timestep k and r_k = 0 otherwise. The controller uses the controller state, x̂⁺_k, to generate the input u_k := L x̂⁺_k, where L is such that ā := (a + L) ∈ (−1, 1). The controller state x̂⁺_k itself evolves as

x̂⁺_k = r_k x_k + (1 − r_k) x̂_k, (2)

where x̂_k := ā x̂⁺_{k−1} is the estimate of the plant state given past data. Corresponding to the controller state and the plant state estimate, we define the estimation error z_k and the controller state error z⁺_k as

z_k := x_k − x̂_k,  z⁺_k := x_k − x̂⁺_k. (3)

The two quantities differ only on successful reception times. It is possible to write the plant state evolution as

x_{k+1} = ā x_k − L z⁺_k + v_k. (4)

Equations (2)-(4) compositely describe the evolution of the plant state, the controller state and the estimate of the plant state.
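As a concrete illustration, the closed-loop recursion described above can be sketched in a few lines of Python. This is a sketch under stated assumptions: the numerical values of a, L and the noise variance are arbitrary illustrative choices (not from the paper), and the reception indicator is drawn i.i.d. here purely as a placeholder for the channel model of Section 2.2.

```python
import numpy as np

rng = np.random.default_rng(0)

a, L = 1.2, -0.9       # unstable plant gain |a| > 1; L chosen so that abar is stable
abar = a + L           # closed-loop gain, here 0.3, inside (-1, 1)
M = 0.1                # process-noise variance (illustrative)

x = 1.0                # plant state x_0
xhat = x               # estimate of x_k given past data (x_0 assumed known)
xs = []
for k in range(50):
    r = rng.random() < 0.7          # placeholder reception indicator r_k
    xhat_plus = x if r else xhat    # controller state: reset to x_k on reception
    u = L * xhat_plus               # control input u_k = L * xhat_plus
    x = a * x + u + rng.normal(0.0, np.sqrt(M))   # plant update
    xhat = abar * xhat_plus         # next estimate given past data
    xs.append(x)
```

With frequent receptions the loop behaves like the stable recursion x_{k+1} ≈ ā x_k + v_k, while long bursts of drops let the estimation error grow by the unstable factor a per timestep.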

Channel model
We model the communication channel as an action-dependent finite state-space Markov channel. The channel can be in one among a finite number of states on each timestep. The state of the channel on a given timestep describes the quality of service it provides; here, it determines the packet-drop probability on that timestep. We denote the channel state at timestep k by γ_k ∈ {1, …, n}, with n a finite positive integer. We assume that the probability distribution of γ_{k+1} depends on γ_k and t_k, the transmission decision on timestep k. Thus, the evolution of the channel is an action-dependent Markov process. We let p^(0)_{ij} and p^(1)_{ij} denote the probabilities of the channel state transitioning from j to i given that t_k equals 0 and 1, respectively. Thus, p^(t)_{ij} := Pr[γ_{k+1} = i | γ_k = j, t_k = t] for t ∈ {0, 1}. We let P_0 and P_1 be column-stochastic matrices whose (i, j)th elements are p^(0)_{ij} and p^(1)_{ij}, respectively. We model the unreliability of the channel through a packet-drop probability e_i for each element i of the channel state-space. Thus, if on timestep k the channel state γ_k = i and the sensor transmits a packet, then the channel drops it with probability e_i ∈ [0, 1] and communicates the packet successfully to the controller with probability (1 − e_i), that is, r_k = 1 w.p. (1 − e_i) t_k and r_k = 0 otherwise, where 'w.p.' stands for 'with probability'. Thus, the packet drop on each timestep is Bernoulli, though not i.i.d. We collect the probabilities of packet drops across all possible channel states in the vector e := [e_1, e_2, …, e_n]^T ∈ [0, 1]^n. Correspondingly, we define the transmission success probability vector d as d := 1 − e.
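A minimal sampler for this channel model may help fix ideas. The two-state matrices below are illustrative Gilbert-Elliott-style numbers, not parameters from the paper; the columns are "from" states, since P_0 and P_1 are column-stochastic.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative 2-state action-dependent channel (the values are assumptions).
P0 = np.array([[0.9, 0.4],     # transition matrix when t_k = 0 (column-stochastic)
               [0.1, 0.6]])
P1 = np.array([[0.8, 0.3],     # transition matrix when t_k = 1
               [0.2, 0.7]])
e = np.array([0.1, 0.8])       # packet-drop probability in each channel state
d = 1.0 - e                    # transmission success probabilities

def channel_step(gamma, t):
    """Sample (r_k, gamma_{k+1}) from channel state gamma given decision t."""
    r = int(t == 1 and rng.random() < d[gamma])     # reception requires a transmission
    P = P1 if t == 1 else P0                        # action-dependent transition
    gamma_next = int(rng.choice(2, p=P[:, gamma]))  # column gamma is the next-state law
    return r, gamma_next
```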

Sensor's information pattern
Next, we describe the information available to the sensor to make the transmission decisions t_k. Apart from the plant state x_k, which the sensor can measure perfectly on each timestep k, we assume that if a successful reception occurs on timestep k, then the controller acknowledges it by relaying the reception indicator variable r_k and the channel state γ_k over an error-free feedback channel. However, the sensor may use this channel feedback information only on subsequent timesteps. To describe all the information available to the sensor on timestep k more formally, we first introduce the variables R_k and R⁺_k to track the latest reception time before and the latest reception time until timestep k, respectively. Thus, R_k := max{s ∈ ℕ_0 : s < k, r_s = 1} and R⁺_k := max{s ∈ ℕ_0 : s ≤ k, r_s = 1}. The variable R_k is useful for the sensor's decision making, while R⁺_k is helpful in the analysis. Further, we let S_j for j ∈ ℕ_0 be the jth successful (random) reception time, that is, S_0 := 0 and S_{j+1} := min{s > S_j : r_s = 1}, where, without loss of generality, we have assumed that the zeroth successful reception occurs on timestep 0.
From the controller feedback, the sensor knows R_k and γ_{R_k} before deciding t_k, from which the sensor can utilise the channel evolution model to obtain p_k ∈ ℙ_n, the probability distribution of the channel state given R_k, γ_{R_k} and all the transmission decisions from R_k to k − 1, that is, p_k(i) := Pr[γ_k = i | R_k, γ_{R_k}, t_{R_k}, …, t_{k−1}], where p_k(i) is the ith element of the vector p_k. Letting p⁺_k denote the belief after the channel feedback (if any) on timestep k, we can obtain p_k recursively as p_{k+1} = (t_k P_1 + (1 − t_k) P_0) p⁺_k, with p⁺_k = 1_{γ_k} if r_k = 1 and p⁺_k = p_k otherwise. In the following remark, we discuss the case when the channel state feedback may not be error-free.
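The belief recursion just described can be sketched as follows; the update rule (collapse to 1_{γ_k} on reception, then propagate through P_0 or P_1 according to t_k) is a reconstruction consistent with the surrounding discussion of perfect channel state feedback.

```python
import numpy as np

def belief_update(p, t, r, gamma, P0, P1):
    """One step of the sensor's channel-state belief recursion.

    p     : current belief p_k over the n channel states (probability vector)
    t, r  : transmission decision t_k and reception indicator r_k
    gamma : channel state index fed back on reception (used only if r == 1)
    Returns the next belief p_{k+1}.
    """
    p_plus = np.eye(len(p))[gamma] if r == 1 else p   # perfect feedback collapses belief
    return (P1 if t == 1 else P0) @ p_plus            # action-dependent propagation
```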
Remark 2.1 (Value of p⁺_k under erroneous channel state feedback). The probability distribution p_k represents the belief of the sensor about the true value of the channel state γ_k, which evolves based on the action-dependent Markov transition matrix and the intermittently available feedback through p⁺_k. Under perfect channel state feedback, on a reception timestep (r_k = 1), the sensor knows the value of γ_k and therefore updates the intermediate belief p⁺_k to 1_{γ_k}; else (r_k = 0), it uses the current belief p_k for the same. In case of imperfect channel feedback, the channel state information acquired from the controller can be represented via a probability distribution p̃_k, and the value of p⁺_k can be set to p̃_k when r_k = 1. The analysis can then be suitably modified.
We denote by I_k the information available to the sensor about the controller's knowledge of the plant state before transmission, while we use I⁺_k to denote the information available to the sensor after the channel state feedback (if any). Thus, I⁺_k = I_k when r_k = 0, and I⁺_k contains r_k and γ_k over I_k when r_k = 1. Note that the channel state feedback by the controller is represented as r_{k−1} γ_{k−1} and r_k γ_k in I_k and I⁺_k, respectively. If r_k = 1 then r_k γ_k = γ_k, and if r_k = 0 then r_k γ_k = 0 and thus no channel state feedback is available. Note that {I_k}_{k∈ℕ_0} and {I⁺_k}_{k∈ℕ_0} are action-dependent Markov processes. In particular, the probability distribution of I_k conditioned on {I_s, t_s}_{s=0}^{k−1} can be shown to be the same as the one conditioned on {I_{k−1}, t_{k−1}}. Similarly, {I⁺_k} is 'sufficient information' to determine the distribution of I⁺_{k+1} given all the past information.

Control objective
Given the plant and the controller models in Section 2.1, the only decision making left to be designed is the sensor's transmission policy π, which determines t_k for each timestep k. In particular, we seek to design a feedback transmission policy using the available information I_k on timestep k. The offline control objective that we seek to guarantee is the exponential second-moment stabilisation of the plant state to an ultimate bound. Formally, we want to ensure

𝔼_π[x_k² | I_0] ≤ max{c^{2k} x_0², B},  ∀ k ∈ ℕ_0, (7)

which is to have the second moment of the plant state decay exponentially at least at a rate of c² until it settles to the ultimate bound B. We assume that the convergence rate parameter c² ∈ (ā², 1). Note that (7) prescribes the restriction on the plant state evolution in an offline fashion, in terms of only the initial information. However, a recursive formulation of the control objective is more conducive to designing a feedback transmission policy.
To design a feedback transmission policy, we need to define an online version of the control objective, which is conditioned upon the information sets I⁺_{R⁺_k} that become available to the sensor through feedback received from the channel. First, we define the performance function h_k for every timestep k. Then, the online objective is to ensure that the expected value of the performance function at the next successful reception time is negative, that is,

𝔼_π[h_{S_{j+1}} | I⁺_{S_j}] < 0,  ∀ j ∈ ℕ_0. (8)

We borrow Lemma III.1 from [21], which demonstrates that any transmission policy that satisfies the online objective also satisfies the offline objective.
Lemma 2.1 (Sufficiency of the online objective [21]). If a transmission policy π satisfies the online objective (8), then it also satisfies the offline objective (7). □
Note that in the control objective (7), the sources of randomness that determine the expectation are the transmission policy π, the random channel behaviour and the process noise. The transmission policy and the random channel behaviour determine the successful reception times, while the process noise affects the evolution of the performance function during the inter-reception times. As the online objective (8) is essentially a condition on the evolution of the performance function during the inter-reception times, Lemma 2.1 continues to hold in the setting of this paper.

TWO-STEP DESIGN OF TRANSMISSION POLICY
Designing a transmission policy so that the described system meets the control objective (7) or even the stricter online objective (8) poses many challenges. The main challenge stems from the random packet drops, which makes the necessity of a transmission on timestep k dependent on future transmission decisions. Further, the future evolution of the channel state depends on all the past and current transmission decisions. Thus, the transmission decisions t k cannot be made in a myopic manner and instead must be made by evaluating their impact on the channel and the control objective over a sufficiently long time frame. To tackle this problem, we adopt a two-step design procedure. This general design principle is similar to that in [21], wherein the reader can find a more detailed discussion about this procedure as well as its merits. We now describe the two steps of the design procedure.
In the first step, for each timestep k, we consider a family of nominal policies with look-ahead parameter D ∈ ℕ. A nominal policy with parameter D involves a 'hold-off' period of D timesteps from k to k + D − 1 during which t_k = 0, followed by perpetual transmission, that is, t_k = 1 for all timesteps after k + D − 1. Thus, letting π^D_k be the nominal policy with parameter D, we have

t_s = 0 for s ∈ {k, …, k + D − 1} and t_s = 1 for s ≥ k + D. (9)

In the second step of the design procedure, we construct the event-triggered policy, π^D_et, using the nominal policies as building blocks. Given (9), one can reason that if the nominal policy with parameter D ∈ ℕ satisfies the online objective from the current timestep k, then a transmission on the current timestep is not necessary to meet the online objective. Further, if the online objective cannot be met from timestep k using the nominal policy π^D_k, then it may be necessary to transmit on timestep k. This forms the basis for the construction of the event-triggered policy, which we detail next.
First, we need a method to check whether the nominal policy π^D_k satisfies the online objective from timestep k. For this, we define the look-ahead function, ℒ^D_k, as the expected value of the performance function h_k at the next successful reception timestep S_{j+1} under the nominal policy, that is,

ℒ^D_k := 𝔼_{π^D_k}[h_{S_{j+1}} | I_k]. (10)

We can evaluate ℒ^D_k as a total expectation, over all possible values of S_{j+1}, as

ℒ^D_k = Σ_{w=D}^∞ Ω_D(w, p_k) 𝔼[h_{k+w} | I_k, S_{j+1} = k + w], (11)

where Ω_D(w, p_k) is the probability of the event that the first successful reception after timestep k is at timestep k + w under the nominal policy π^D_k and given p_k, the probability distribution of the channel state at time k conditioned on the information at time R_k. Formally,

Ω_D(w, p_k) := Pr_{π^D_k}[S_{j+1} = k + w | p_k]. (12)

The closed form of Ω_D(w, p) is given as follows:

Ω_D(w, p) = d^T (P_1 E)^{w−D} P_0^D p,  w ≥ D, (13)

where E is the diagonal matrix with the elements of e on its main diagonal. The explanation of (13) is as follows: the probability vector p, when left-multiplied by P_0^D, provides the probability vector of the channel state immediately after the hold-off period, which is of D timesteps. The said vector, when left-multiplied by (P_1 E)^{w−D}, provides the probabilities of, subsequent to the hold-off period, making a transmission attempt (w − D) times successively but failing to achieve reception on every attempt. Finally, left-multiplication by d^T gives the probability of finally having a successful reception on the (k + w)th timestep. Thus, (13) is the closed form of Ω_D(w, p) defined in (12).
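The reception-time probabilities just described are easy to compute numerically: hold off for D steps under P_0, survive w − D failed attempts (each a drop followed by a P_1 transition), then succeed. The sketch below follows that verbal description; the example matrices in the accompanying check are assumptions, and the check confirms the probabilities sum to one when ρ(P_1 E) < 1.

```python
import numpy as np

def omega(D, w, p, P0, P1, e):
    """Probability that the first reception after k is at k + w under the
    nominal policy with hold-off D, starting from channel belief p."""
    assert w >= D
    d = 1.0 - e
    q = np.linalg.matrix_power(P0, D) @ p                   # belief after hold-off
    q = np.linalg.matrix_power(P1 @ np.diag(e), w - D) @ q  # w - D failed attempts
    return float(d @ q)                                     # success at step k + w
```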

The event-triggered policy
The main idea behind the proposed event-triggered policy is the following. A negative sign of the look-ahead function ℒ^D_k indicates that it is not 'necessary' to transmit on timestep k, as there exists a transmission sequence (given by the nominal policy) that meets the objective at least on the next random reception timestep. However, if the sign of ℒ^D_k is non-negative, it means that the sensor cannot afford to hold off transmission for D timesteps from the current timestep k and still ensure that the online objective is not violated on some future timestep. In the proposed event-triggered transmission policy, the sensor evaluates ℒ^D_k at every timestep k, and when it turns non-negative the sensor keeps transmitting on every timestep until a successful reception occurs; then the sensor again waits for ℒ^D_k to turn non-negative. The event-triggered transmission policy may be described formally as

t_k = 1 if and only if 𝒯_k ≤ k < Z_k, (14)

where 𝒯_k is the first timestep after R_k on which ℒ^D_k ≥ 0 and Z_k is the first timestep after R_k on which there is a successful reception. Note that the event-triggered policy is described recursively in terms of R_k, the latest reception time before k, and the look-ahead function ℒ^D_k. As a result, the policy in (14) is valid for all time k ≥ 0. In the analysis of the policy (14) in the sequel, it is useful to refer to the jth reception time, denoted by S_j. Similarly, we let T_j denote the first timestep after S_j on which ℒ^D_k ≥ 0. One can think of the policy (14) as operating in one of two modes: 'do not transmit' or 'transmit'. The policy switches from the first mode to the second at a timestep k exactly when ℒ^D_k ≥ 0 for the first time after the last successful reception. After a successful reception, the policy shifts back to the 'do not transmit' mode. Thus, from this perspective, ℒ^D_k ≥ 0 can be thought of as the event-triggering rule.
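The two-mode behaviour described above can be captured in a short skeleton. The look-ahead computation itself is abstracted behind a callable here (a stand-in, not the paper's closed form), so the snippet only demonstrates the mode switching of the event-triggered policy.

```python
def run_policy(lookahead, channel, K):
    """Mode-switching skeleton of the event-triggered policy.

    lookahead(k) -> float : stand-in for the look-ahead function at timestep k
    channel(t)   -> 0 or 1: reception indicator given transmission decision t
    Returns the list of transmission decisions t_0, ..., t_{K-1}.
    """
    transmitting = False               # start in the 'do not transmit' mode
    decisions = []
    for k in range(K):
        if not transmitting and lookahead(k) >= 0:
            transmitting = True        # event triggered: look-ahead turned non-negative
        t = int(transmitting)
        if channel(t) == 1:
            transmitting = False       # successful reception: stop transmitting
        decisions.append(t)
    return decisions
```

For instance, a look-ahead that first turns non-negative at k = 3 over a lossless channel yields the decision string 0, 0, 0, 1, ….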

IMPLEMENTATION AND PERFORMANCE GUARANTEES
In this section, we describe the implementation details of the proposed event-triggered policy, and analyse the system under this policy through several intermediate results. At the end of the section, we provide sufficient conditions on the ultimate bound B and the look-ahead parameter D such that the system meets the online objective (and the offline objective) under the event-triggered policy.

Closed-form expression of the look-ahead criterion
For the implementation of the event-triggered policy (14), we need an easy method to compute the look-ahead function ℒ^D_k. In particular, we provide here a closed-form expression of the look-ahead function. We begin by expanding the expectation term in (11) following [46]; we refer to this expansion as (15). From (11) and (15), it is evident that the convergence of ℒ^D_k requires the convergence of infinite series of the form

g_D(b, p) := Σ_{w=D}^∞ b^w Ω_D(w, p), (16)

with p ∈ ℙ_n and D ∈ ℕ, and for values of b equal to ā², c², a², aā and 1, each of which is at most a² in magnitude. Each of the terms g_D(b, p) involves an infinite matrix geometric series. The criteria for convergence and the closed form of g_D(b, p) for these values of b allow us to determine the same for ℒ^D_k. For this, we use the well-known result that, for a non-negative matrix K, the infinite matrix geometric series Σ_{m=0}^∞ K^m converges, to (I − K)^{−1}, if and only if ρ(K) < 1. (17) To obtain a closed-form expression of g_D(b, p), first substitute the closed form (13) of Ω_D(w, p) into (16). If ρ(bP_1E) < 1, then we obtain

g_D(b, p) = b^D d^T (I − b P_1 E)^{−1} P_0^D p. (18)

In the following result, we apply the convergence criterion of a matrix geometric series to provide a necessary and sufficient condition for ℒ^D_k to be well defined.
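Numerically, the closed form can be checked against a truncated version of the series. The sketch below assumes the reconstruction g_D(b, p) = Σ_{w≥D} b^w Ω_D(w, p) with Ω_D as in (13); under ρ(bP_1E) < 1 the tail is geometric, so a few hundred terms suffice for the comparison.

```python
import numpy as np

def g_closed(D, b, p, P0, P1, e):
    """Closed form of the series sum_{w>=D} b^w d^T (P1 E)^(w-D) P0^D p,
    i.e. b^D d^T (I - b P1 E)^{-1} P0^D p, valid when rho(b P1 E) < 1."""
    E, d, n = np.diag(e), 1.0 - e, len(e)
    K = b * (P1 @ E)
    assert np.max(np.abs(np.linalg.eigvals(K))) < 1     # convergence criterion
    return float(b**D * d @ np.linalg.solve(np.eye(n) - K,
                                            np.linalg.matrix_power(P0, D) @ p))
```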

Lemma 4.1. ℒ^D_k converges for all probability vectors p_k if and only if a² ρ(P_1 E) < 1.
Proof. From (11)-(12) and (15)-(16), we see that an expansion of ℒ^D_k involves terms of the form g_D(b, p) with b equal to ā², aā, a² and c². Using (17), and noting that ρ(b_1 P_1 E) > ρ(b_2 P_1 E) when |b_1| > |b_2|, we can state that ρ(a² P_1 E) > ρ(b P_1 E) for b equal to ā², aā and c². Thus, ρ(a² P_1 E) = a² ρ(P_1 E) < 1 is a necessary and sufficient condition for the convergence of ℒ^D_k. □ We now proceed to give a closed-form expression of the look-ahead function ℒ^D_k in the following lemma.
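Lemma 4.1's condition is straightforward to check numerically for given channel parameters; the helper below is a direct transcription, and the example values used in the check are assumptions.

```python
import numpy as np

def lookahead_well_defined(a, P1, e):
    """Check the condition of Lemma 4.1: a^2 * rho(P1 E) < 1."""
    rho = np.max(np.abs(np.linalg.eigvals(P1 @ np.diag(e))))
    return bool(a * a * rho < 1)
```

Intuitively, a reliable channel (small drop probabilities e_i) tolerates a more unstable plant; as the e_i approach 1, the admissible values of a² shrink towards 1.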

Lemma 4.2 (Closed form of the look-ahead function).
Suppose that a² ρ(P_1 E) < 1. Then the look-ahead function ℒ^D_k admits a closed-form expression in which the closed form of the function g_D(b, p) is given in (18), while f_D(b, p) and the threshold index are defined in (19). Proof. Most terms in the closed form of ℒ^D_k follow directly from (11), the series expansion of ℒ^D_k, the closed form of Ω_D(w, p) in (13), the expansion of the expectation term (15), the definition (16) and the closed form (18) of g_D(b, p). We only need to simplify the remaining summation. We split this summation into two parts based on whether c^{2w} N_k is larger or smaller than B. Observe that the index defined in (19) is the smallest integer w ≥ D such that B ≥ c^{2w} N_k. Then, we obtain [r1] by observing that ρ(bP_1E) < 1. With this, we obtain the complete closed-form expression of the look-ahead function ℒ^D_k. □ Note that the closed form of ℒ^D_k is a third-degree polynomial in the plant state x_k, the error z_k and the individual elements of p_k, and is amenable to online computation. Furthermore, note that the look-ahead function ℒ^D_k possesses a mathematical structure consisting of a linear operator with a unit-dimensional row-space acting on the stochastic vector p_k.

Necessary condition on the ultimate bound B
We now seek a necessary condition on the ultimate bound B for there to exist a transmission policy that satisfies the online objective. To this end, we introduce the open-loop performance function, H(w, y), which we define as the expectation of the performance function h_{S_{j+1}} conditioned upon I⁺_{S_j} and the events that S_{j+1} = S_j + w and x²_{S_j} = y. Note that H(w, x²_{S_j}) is very similar to (15). Note also that H(w, x²_{S_j}) < 0 indicates that, given the information I⁺_{S_j}, the online objective is satisfied on timestep S_j + w; conversely, a positive sign implies that the online objective is expected to be violated on timestep S_j + w. Using this observation, we demonstrate in the following proposition that for B less than a critical B_0, there exists no transmission policy that can satisfy the online objective. The critical value w**(y) is such that l_1(w**(y), y) = l_2(w**(y), y). Now, it suffices to prove the following two claims. Claim (a): l_1(w, y) > 0 for all w ∈ ℕ and y ∈ (B, B_0). Claim (b): l_2(w, y) > 0 for all w ∈ ℕ and y ∈ (B, B_0). First, note that l_1(0, y) = 0 for all values of y. Next, evaluating the partial derivative of l_1(w, y) with respect to w at w = 0 and for y ∈ (B, B_0), we obtain a quantity lower-bounded as > log(ā²/c²) B_0 + M log(a²) =^{[r2]} 0.
Note that we have used the fact that ā² < c² to obtain [r1], and the definition of B_0 in [r2]. Since l_1(w, y) is a quasi-convex function of w [21, Lemma IV.8], it is increasing for all w > 0, which proves claim (a). Now, we prove claim (b). We first derive a function g(w) that is a lower bound on l_2(w, y) for w ≥ 0 and y ∈ (B, B_0).
The partial derivative of g(w) evaluated at w = 0 is zero, where in [r4] we have used the definition of B_0. Since g(0) = 0, g(w) has slope 0 at w = 0 and g is strictly convex in w, we conclude that l_2(w, y) > g(w) > 0 for all w ∈ ℕ, which proves claim (b) and thus concludes the proof. □ Proposition 4.1 demonstrates that B > B_0 is a necessary condition on B for a transmission policy to satisfy the online objective. Note that this is a necessary condition on B even in the setting of [21,24], where no such condition was provided. In the following subsection, we further analyse the open-loop performance function H(w, y) to find a sufficient criterion on B and D that guarantees that the online objective is met under the event-triggered policy.

4.3 The performance-evaluation function, ℒ̃^D_{S_j}
For the purpose of analysing the system performance between any two successive reception times S_j and S_{j+1}, we define the performance-evaluation function, ℒ̃^D_{S_j}. Its definition is similar to that of ℒ^D_k in (10), though we define ℒ̃^D_{S_j} only for k = S_j (successful reception times) and condition upon the information set I⁺_{S_j} instead of I_{S_j}. In particular, we let

ℒ̃^D_{S_j} := Σ_{w=D}^∞ Ω̃_D(w, γ_{S_j}) 𝔼[h_{S_j+w} | I⁺_{S_j}, S_{j+1} = S_j + w]. (22)

Here, Ω̃_D(w, γ) is the probability of getting a successful reception w timesteps after S_j, starting with channel state γ on S_j, under the nominal policy π^{D−1}_{S_j+1}. The purpose of the function Ω̃_D(w, γ) is analogous to that of Ω_D(w, p) in ℒ^D_k, and it is formally defined as

Ω̃_D(w, γ) := Pr_{π^{D−1}_{S_j+1}}[S_{j+1} = S_j + w | γ_{S_j} = γ]. (23)

The closed form of Ω̃_D(w, γ) can be obtained in a manner similar to the closed form of Ω_D(w, p), and is given as

Ω̃_D(w, γ) = d^T (P_1 E)^{w−D} P_0^{D−1} P_1 1_γ,  w ≥ D. (24)

Note that in (24), the probability function Ω̃_D(w, γ) takes the channel state as an argument instead of a probability distribution p, since our assumed channel state feedback mechanism stipulates perfect feedback, that is, p_{S_j} = 1_{γ_{S_j}}, and thus p_{S_j} is a deterministic function of γ_{S_j}. Before proceeding, we discuss conceptual and structural differences between ℒ̃^D_{S_j} and ℒ^D_k in the following remark.

Remark 4.1 (Choice of the nominal policy in ℒ̃^D_{S_j}). In (23), we condition upon the nominal policy π^{D−1}_{S_j+1} rather than π^D_{S_j} (as done in the i.i.d. case in [21] and in the Markov channel case in [24]). The reason for doing this is that, in the case of non-action-dependent channels (P_0 = P_1), once γ_{S_j} is known, the resulting closed form of the probability function Ω̃_D(w, γ) is the same irrespective of whether we condition the probability in (23) upon the nominal policy π^{D−1}_{S_j+1} or π^D_{S_j}. However, this is not true for action-dependent Markov channels, since the stipulation that t_{S_j} = 1 leads to the calculation of the belief on timestep S_j + 1 as p_{S_j+1} = P_1 1_{γ_{S_j}} instead of p_{S_j+1} = P_0 1_{γ_{S_j}}. This is visible in the closed form of Ω̃_D(w, γ) in (24), and this would clearly not be an issue if P_0 = P_1, as aforementioned.
For a well-chosen value of B, it can be shown that the open-loop performance function possesses the property of sign monotonicity. This property is an important characteristic of H(w, y) and will prove useful in later results. The value of B* defined in Proposition 4.2 can be numerically computed using the procedure in the Appendix, which is based on the proof of Lemma IV.13 in [21]. We now provide a closed-form expression of the performance-evaluation function ℒ̃^D_{S_j}, similar to the closed form of ℒ^D_k in Lemma 4.2.

Lemma 4.3 (Closed form of performance-evaluation function).
Suppose that a² ρ(P_1 E) < 1. Then the performance-evaluation function ℒ̃^D_{S_j} admits a closed form analogous to that of ℒ^D_k in Lemma 4.2, with the threshold index defined accordingly. Proof. Recall the infinite series expansion of ℒ̃^D_{S_j} in (22). To evaluate it, we substitute H(w, x²_{S_j}) with its closed form from (21) and that of Ω̃_D(w, γ_{S_j}) from (24). Correspondingly, we get an expression that is the sum of multiple infinite series, as in the derivation of ℒ^D_k in Lemma 4.2. To evaluate the said terms, we define the summation functions f̃_D(b, γ) and g̃_D(b, γ) given in the statement of the lemma, which are analogous to f_D(b, p) and g_D(b, p), respectively, and are used for obtaining the stated expression. □
1: Here, [r1] follows from (10), while in [r2] we can replace the policy π with π^{D+1}_k because the event t_k = 0 is consistent with the policy π^{D+1}_k on timestep k, and once t_k = 0 is fixed, the expected value of ℒ^D_{k+1} is independent of the transmission policy used on subsequent timesteps. In [r2], we also use the fact that if t_k = 0 then R_{k+1} = R_k. Finally, [r3] uses the fact that {I_k, t_k} is sufficient information, and then the tower property.
2: For proving this part, we observe that I_k, together with the additional information that r_k = 1 and the value of γ_k, implies the knowledge of I⁺_k. Considering this fact and proceeding with a methodology similar to the proof of claim 1, we observe that

Remark 4.2 (Comparison with [21]). Note that the statement of part 1 of Proposition 4.3 differs from Proposition IV.4(a) (first part) of [21], which considers the expected value of ℒ^D_{k+1} in the setting of a channel with i.i.d. Bernoulli packet drops, in that we condition ℒ^D_{k+1} upon the stricter condition that t_k = 0 as opposed to r_k = 0 in [21]. This is because, if the probabilities of channel state transition are action dependent, then a timestep with a transmission but no reception (i.e. t_k = 1, r_k = 0) affects the subsequent evolution of the channel state differently from a timestep with no transmission. Consider the vector-valued function Q(γ) : ℕ_0 → ℝ^n. We provide the proof of Proposition 4.4 in the Appendix. Next, we consolidate the results so far to provide a theoretical guarantee that the event-triggered policy satisfies the online objective (8).
Theorem 4.1 (Performance guarantee of the event-triggered policy). If B > B * (see Appendix) and the look-ahead parameter D satisfies the condition Q(D) < 0, then the event-triggered policy (14) guarantees that the online objective (8), and therefore the original offline objective (7), are met.
Proof. Given Lemma 2.1, it suffices to show that the online objective (8) is met by the event-triggered policy. We centre the proof around the following two claims.
These two claims, (a) and (b), together guarantee that the online objective is met; here, {S_i} are the random reception times and S_j = R⁺_k. To prove Claim (a), we note that it follows from the definition of the open-loop performance function H(w, y) in (20). We now prove Claim (b). It can be seen from Proposition 4.3 that the stated bound holds for all k ∈ (S_j, T_j)_ℤ, where [r1] is obtained using the tower property and the fact that t_k = 0 for k ∈ (S_j, T_j)_ℤ, while [r2] is obtained from Proposition 4.3. Furthermore, Proposition 4.3 (b) implies the analogous bound at the triggering time. Next, we condition the expected value of h_{S_{j+1}} on the information from timestep T_j as well as timestep S_j and, using the tower property of conditional expectations, we obtain an expression in which the inner expectation in [r3] is conditioned under the nominal policy π^0_{T_j}, since for all timesteps k ∈ [T_j, S_{j+1}]_ℤ we have transmissions (t_k = 1). We consider two cases: T_j ≤ S_j + D and T_j > S_j + D. In the first case, since t_k = 0 for k ∈ (S_j, T_j)_ℤ, we use (25) and (26) to bound (27). We now consider the second case, in which T_j > S_j + D. Since we have t_k = 0 for k ∈ (S_j, T_j)_ℤ, we use (25) to bound (27), since ℒ^D_k is negative, by definition, for k ∈ (S_j, T_j)_ℤ. This proves Claim (b), and hence also the result. □ We conclude this section by commenting on the extension of the event-triggered policy to vector systems. The objective (7) can easily be extended to a general vector system of the form x_{k+1} = A x_k + B u_k + v_k, with x_k ∈ ℝ^n, 𝔼[v_k] = 0 and 𝔼[v_k v_k^T] = M = M^T > 0, with the control objective being to find a policy π that stabilises the second moment of the state to an ultimate bound. The control scheme could be u_k = L x̂⁺_k (analogous to the scalar case), with (A + BL) being Schur stable. There are two primary approaches towards the vector-case extension. The first approach is applicable when it is possible to decompose the vector system into n scalar subsystems, and correspondingly obtain n look-ahead criteria on every timestep.
We can then use the largest value of the n look-ahead criteria so obtained in the triggering condition (14), thereby creating an event-triggered policy that stabilises the worst-case mode of the system, and can thus stabilise the entire system. The second approach involves scalarising the vector system using an appropriate l_p norm of the state variables and of the matrices involved in the various calculations. This approach was used for vector systems over a Bernoulli packet-drop channel in [21], and can easily be extended to the present case.

TRANSMISSION FRACTION
… for all x_{S_j} ∈ ℝ. Supposing the two claims are true, consider the transmission fraction during the jth horizon, Δ_j, conditioned on I⁺_{S_j}. We note that it satisfies the inequality in (28), since the transmission fraction is increasing in the term 𝔼_et^D[|Δ_j^{(1)}| | I⁺_{S_j}]. To prove Claim (a), we start by demonstrating the bound for a given value of ℓ ∈ ℕ. To this end, we consider two cases, x²_{S_j} ∈ Λ_1 = [0, Bc^{−2ℓ}) and x²_{S_j} ∈ Λ_2 = [Bc^{−2ℓ}, ∞), respectively. If x²_{S_j} ∈ Λ_1, then we have … where the inequality follows from Claim (a) of Proposition 4.4, the first equality from (A1) and the second equality from the fact that x²_{S_j} ∈ Λ_1. Now consider the case x²_{S_j} ∈ Λ_2. Recall from the proof of Claim (a) of Proposition 4.4 that … Hence, from the design of the event-triggered policy (14), it follows that T_j > S_j + ℓ; in other words, in expectation no transmission takes place for at least ℓ timesteps after S_j. Thus, … Now consider Claim (b). Note that t_k = 1 for all k ∈ Δ_j^{(1)}, and from the event-triggered policy, 𝔼_et^D[|Δ_j^{(1)}|] is simply the expected number of timesteps until reception under a string of continuous transmission attempts, starting from timestep T_j and the channel state at T_j. To capture this, we define the quantity in (30), which bounds |Δ_j^{(1)}| …
Remark 5.2 (Trade-off between control performance and transmission fraction). Suppose that, for a given threshold and some ℓ ∈ ℕ, we have Q(ℓ) < 0 but Q(ℓ + 1)_i ≥ 0 for at least one i ∈ [1, n]_ℤ. Then, if the operational value of the look-ahead parameter is D, we note that D + ℓ = … The system designer can choose a high value of D (conservative control), but this results in a lower value of ℓ, and thus a larger upper bound on the transmission fraction. Conversely, a lower value of D (aggressive control) leads to a higher value of ℓ, and thus a smaller upper bound on the transmission fraction.
We show in the following result that an upper bound on the asymptotic transmission fraction can be obtained by setting the threshold to Bc^{−2D} in the upper bound on the transient transmission fraction provided in Theorem 5.1. Proof. The proof is similar to that of Theorem 5.1, except for one key difference. In Theorem 5.1, the horizon parameter was obtained as the maximiser of Q(D + ·) under the constraint that Q(D + ·) < 0. This ensured that the transmission fraction over the horizon (S_j, S_{j+1}]_ℤ is upper bounded by a quantity which we want to be negative so that (31) is valid. Thus, we let … which follows from the fact that c² > ā² and the definitions of Q(·). The rest of the proof follows along the same lines as that of Theorem 5.1. □

ILLUSTRATIVE EXAMPLE
In this section, we validate our transmission policy design through simulations and illustrate the wider applicability of our channel model and proposed design method with a model-based example. We consider control with a battery-powered energy-harvesting (EH) sensor, where the state of charge (SoC) of the battery constitutes the 'channel' state. The channel state evolves according to a linear saturated system with noise, which fits the action-dependent Markov channel framework.
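To make the framework concrete, the following minimal Python sketch (our own illustrative code; the names `channel_step`, `P0`, `P1` and `e` are ours) performs one step of an action-dependent Markov channel: the transition matrix applied on a timestep depends on the transmission decision, and a transmitted packet is dropped with the probability attached to the current channel state.

```python
import random

def channel_step(state, transmit, P0, P1, e, rng=random):
    """One step of an action-dependent Markov channel: the transition matrix
    used on a timestep depends on whether a transmission is attempted, and a
    transmitted packet is dropped with probability e[state]."""
    received = transmit and rng.random() >= e[state]
    P = P1 if transmit else P0
    # sample the next channel state from row `state` of the active matrix
    u, acc = rng.random(), 0.0
    for nxt, p in enumerate(P[state]):
        acc += p
        if u < acc:
            return nxt, received
    return len(P[state]) - 1, received  # guard against rounding error
```

With deterministic rows this reduces to a simple state machine, which makes the intended behaviour easy to check by hand.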

Energy-harvesting sensor
In this subsection, we model an EH sensor with a battery. The amount of energy harvested by the sensor is assumed to be stochastic, and insufficient harvested energy can cause transmissions to fail. We model the SoC of the battery as a discrete-valued quantity in the set [0, s̄]_ℤ, where s̄ > 0 represents the maximum SoC. The battery SoC on timestep k, which is also the 'channel' state in our framework, takes values in [0, s̄]_ℤ. On every timestep, the battery first provides energy for a transmission if one is required (t_k = 1), and then harvests energy according to an arrival process {Z_k}_{k=1}^∞, which we assume to be i.i.d. We let the energy cost of making a successful transmission be a fixed natural number; if the battery holds less than this amount of energy, the transmission fails and no energy is extracted from the battery. The above dynamics can be represented by the linear saturated system (32), in which the intermediate state captures the battery level after a possible transmission, which draws energy from the battery. We now derive the Markov transition matrices P_0 and P_1. From (32), we can obtain the (i, j)th element of P_0 and P_1, corresponding to t_k = 0 and t_k = 1 respectively, where s(i) ∈ [0, s̄]_ℤ is the ith discrete level that the battery SoC could be in. For the purpose of simulations, let Z_k follow a Poisson distribution with arrival rate λ > 0, so that Pr[Z_k = q] = e^{−λ} λ^q (q!)^{−1} for q ≥ 0, and Pr[Z_k = q] = 0 for any q < 0. In order to determine the packet-drop probabilities, that is, the vector e, we note that for any state s, the probability of a packet drop is 1 if s is below the transmission cost, and 0 otherwise. We write this formally in (33), where e(j) represents the jth element of the vector e.
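A possible construction of P_0, P_1 and the drop vector e along the lines described above is sketched below (illustrative Python, not the authors' code; `e_tx` stands in for the paper's transmission-cost symbol and `lam` for the Poisson arrival rate).

```python
import math

def poisson_pmf(q, lam):
    """Pr[Z = q] for the Poisson harvest Z with rate lam (0 for q < 0)."""
    if q < 0:
        return 0.0
    return math.exp(-lam) * lam**q / math.factorial(q)

def battery_transition_matrices(s_max, e_tx, lam):
    """Build P0 / P1 (no-transmission / transmission) over SoC levels
    0..s_max and the drop-probability vector e."""
    n = s_max + 1
    P0 = [[0.0] * n for _ in range(n)]
    P1 = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for t, P in ((0, P0), (1, P1)):
            # energy is drawn only if a transmission is attempted and affordable
            inter = i - e_tx if (t == 1 and i >= e_tx) else i
            for j in range(n):
                if j < s_max:
                    P[i][j] = poisson_pmf(j - inter, lam)
                else:
                    # saturation at full charge absorbs all larger harvests
                    P[i][j] = 1.0 - sum(poisson_pmf(q, lam)
                                        for q in range(s_max - inter))
    # a packet drop is certain iff the SoC cannot cover one transmission
    e = [1.0 if j < e_tx else 0.0 for j in range(n)]
    return P0, P1, e
```

Each row of P_0 and P_1 sums to one, P_1 coincides with P_0 on rows where the SoC cannot cover a transmission, and e equals 1 exactly on those states.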

Simulation results
For the EH sensor model, we choose the parameters s̄ = 15, an energy cost of 8 units per transmission and arrival rate λ = 0.85, while for the plant parameters, we choose the values a = 1.05, c = 0.98, ā = 0.95c, M̃ = 0.25, B = 10 and x_0 = 15.5B. From the calculations presented in the Appendix, we find that B* = 2.32, and therefore the condition B > B* is satisfied. We carried out simulations using MATLAB. In order to generate empirical results, we simulate the system evolution 5000 times and then average these results. For the channel, we set the initial state to 1 for all simulated trajectories, that is, the battery starts off completely discharged.
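A hedged sketch of the kind of Monte Carlo averaging described here, restricted to the battery SoC under a periodic transmission policy (illustrative Python with our own names; the paper's simulations additionally evolve the plant state and the event trigger):

```python
import math
import random

def sample_poisson(lam, rng):
    """Inverse-transform sample of a Poisson(lam) harvest Z_k."""
    u = rng.random()
    q, p = 0, math.exp(-lam)
    cdf = p
    while cdf < u:
        q += 1
        p *= lam / q
        cdf += p
    return q

def mean_soc_periodic(s_max=15, e_tx=8, lam=0.85, period=3,
                      horizon=100, trials=2000, seed=0):
    """Empirical mean battery SoC under a periodic time-triggered policy
    (transmit on every multiple of `period`), starting discharged."""
    rng = random.Random(seed)
    totals = [0.0] * horizon
    for _ in range(trials):
        soc = 0
        for k in range(horizon):
            totals[k] += soc
            if k % period == 0 and soc >= e_tx:
                soc -= e_tx  # energy drawn only on a feasible attempt
            soc = min(soc + sample_poisson(lam, rng), s_max)  # harvest, saturate
    return [t / trials for t in totals]
```

Averaging many independent trajectories in this way produces the empirical mean curves of the kind shown in the figures.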
The simulation results are presented in Figures 2 and 3. In particular, Figure 2(a) shows the evolution of the empirical mean of the plant state for different values of the look-ahead parameter D. We note that a higher value of D leads to tighter control at the cost of more transmissions, in line with the trade-off described in Remark 5.2. Figure 2(b) shows the evolution of the empirical mean of the battery SoC. In order to compare the performance of the event-triggered policy with a periodic time-triggered policy, we also include in Figure 2(a) and (b) the evolution of the plant and channel state under the periodic policy, which sets t_k = 1 for every k that is an integer multiple of a fixed period, and t_k = 0 otherwise. It is interesting to note in Figure 2(b) that the battery SoC (channel state) settles to a constant value after an initial transient, and this constant value is smaller for larger values of D; that is, a higher value of D draws more energy from the battery. The benefit in terms of energy savings in the EH battery under the proposed policy, relative to periodic time-triggered policies, is evident from Figure 2(b). In order to demonstrate the pattern of transmission times under the event-triggered policy, we display in Figure 2(c) a stem plot of transmissions and receptions for one realisation of the system evolution under event-triggered transmissions. Figure 3(a) shows the empirical transmission fraction over 5000 timesteps; it can be seen that it reaches a steady-state value for large k, with greater values of D leading to higher asymptotic values. Figure 3(b) shows the empirical transient transmission fraction generated during the simulation, while Figure 3(c) shows the theoretical upper bounds on both the transient transmission fraction (given in Theorem 5.1) and the asymptotic transmission fraction (given in Corollary 5.1). From Figure 3(c), it can be seen that the theoretical upper bound on the asymptotic transmission fraction is similar to the transient bound evaluated at the threshold Bc^{−2D}, as noted in the proof of Corollary 5.1.
As expected, both the empirical transmission fractions and their respective upper bounds are greater for larger values of D, which demonstrates the trade-off between performance and transmission fraction discussed in Remark 5.2.

CONCLUSION
This paper considers an NCS consisting of a scalar linear plant with process noise and a non-collocated sensor and controller. The sensor communicates over a time-varying channel whose state evolves according to an action-dependent Markov process, and the state of the channel determines the probability with which a packet transmitted by the sensor is dropped. In this setting, we have designed an event-triggered transmission policy that guarantees second-moment stabilisation of the plant state at a desired rate of convergence to an ultimate bound. We have also derived upper bounds on the transient and the asymptotic transmission fraction, the fraction of timesteps on which the sensor transmits. We have verified and illustrated our analysis and theoretical guarantees through simulations of an example scenario, in which we considered the problem of control with an EH, battery-equipped sensor. Future work includes the incorporation of imperfect measurements of the plant and channel states, and the application of the proposed action-dependent Markov channel framework to control over shared channels and over channels modelled as queuing processes.
To prove Claim (a2), we establish the required upper bound under the assumption that x²_{S_j} ∈ Λ_2. Note that … where we have again used the fact that c² < 1. From this bound, one can upper bound x²_{S_j} f̃(c², ·) as … This concludes the proof of Claim (a2). We now recall the closed form of the criterion at S_j. If x²_{S_j} ∈ Λ_1, we have f̃(c², ·) − g̃(c², ·) = 0 and x²_{S_j} < Bc^{−2ℓ}, while g̃(ā², ·) ≥ 0. These facts, along with Claim (a), imply the claimed upper bound when x²_{S_j} ∈ Λ_1. In the case that x²_{S_j} ∈ Λ_2, using Claim (b), the fact that g̃(ā², ·) < g̃(c², ·) (since ā² < c²), and lastly the fact that x²_{S_j} ≥ Bc^{−2ℓ}, we conclude the claimed upper bound when x²_{S_j} ∈ Λ_2. Thus, the bound holds uniformly for all x²_{S_j} ∈ [0, ∞).
We start the proof of Claim (b) by noting that the quantity in question can be written as … From the element-wise non-negativity of P_0^{(ℓ−1)} P_1 for all ℓ ∈ ℕ and all channel states in [1, n]_ℤ, we conclude that a sufficient condition to ensure negativity for a given D and all j ∈ ℕ_0 is that Q(D) < 0. We now show that every element of Q(ℓ) is monotonically increasing in ℓ, so that Q(D) < 0 ensures Q(ℓ) < 0 for all ℓ ∈ [1, D]_ℤ; the second derivative of Q(ℓ) consists of terms proportional to log²(ā²) (ā²)^ℓ and M̃ log²(a²) (a²)^ℓ.
Note that each element of the second derivative is strictly positive. Thus, each element of Q(ℓ) is strictly convex in ℓ. Also, note that the first derivative of Q(ℓ) at ℓ = 0 is … where [r1] follows from the fact that B ≥ B_0. Since each element of Q(ℓ) is strictly convex for ℓ ∈ ℝ and increasing at ℓ = 0, it follows that each element of Q(ℓ) is monotonically increasing for ℓ ≥ 0. Thus, Q(D) < 0 implies Q(ℓ) < 0, and thereby the look-ahead criterion at S_j is negative for all ℓ ∈ [1, D]_ℤ. □

Procedure to compute a sufficient lower bound B* on the ultimate bound B
Here, we provide a procedure to compute the lower bound B* on B referred to in Proposition 4.2. This procedure is based on the proof of Lemma IV.13 in [21], and we present it here for completeness. First, we define the following constants: P_1 := log(a²/ā²), P_2 := log(a²c²/ā²), P_3 := log(1/c²).
Then … We can also generalise our results to the case where there is no process noise (M̃ = 0). For this scenario, note that a more basic definition of the function F**(·) is given in Equation (24b) of our previous work [21]. From this definition, it is easy to see that when M̃ = 0, F**(y) := ā^{2ℓ̄} y − B, where ℓ̄ is such that c^{2ℓ̄} y = B. In particular, ℓ̄ = 0 when y = B, and hence F**(B) = 0 for all B ≥ 0. Further, note that B_0 = 0 if M̃ = 0. Hence, if M̃ = 0, we can choose B = B* = B_0 = 0, which then guarantees asymptotic stability of the plant state to zero.
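For concreteness, the noiseless case can be written out as a short derivation; here ℓ̄ is our stand-in notation for the exponent defined implicitly by c^{2ℓ̄} y = B.

```latex
% Noiseless case (\tilde M = 0): solve c^{2\bar\ell} y = B for \bar\ell,
% then substitute into F^{**}(y) = \bar a^{2\bar\ell} y - B.
\bar\ell \;=\; \frac{\log(B/y)}{2\log c},
\qquad
F^{**}(y) \;=\; \bar a^{2\bar\ell}\, y - B
          \;=\; \Bigl(\tfrac{B}{y}\Bigr)^{\log\bar a/\log c} y - B ,
```

so that y = B gives ℓ̄ = 0 and F**(B) = ā⁰B − B = 0, consistent with the discussion above.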