Some novel aspects of the positive linear observer problem: Differential privacy and optimal l1 sensitivity

We present several results concerning the $\ell_1$ sensitivity, a crucial parameter for differential privacy, of a positive linear observer. Specifically, for compartmental systems we derive explicit analytic expressions for positive observers that minimize a bound for the $\ell_1$ sensitivity. Results are given for single-output systems and for classes of multiple-output systems. For single-output general positive systems, we characterize the optimal $\ell_1$ sensitivity bound of a positive observer with a given convergence rate. We also make some initial observations on sensitivity for more general classes of positive observers. © 2020 The Authors. Published by Elsevier Ltd on behalf of The Franklin Institute. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).


Introduction
Motivated by applications in areas such as transportation, population dynamics, and communications, there has been considerable interest in the theory of positive systems over recent decades. Throughout, we consider the discrete-time linear time invariant (LTI) positive system
$$x(t+1) = Ax(t), \quad y(t) = Cx(t), \qquad (1)$$
where $A \in \mathbb{R}^{n\times n}_+$ and $C \in \mathbb{R}^{p\times n}_+$, together with Luenberger-type observers of the form
$$\hat{x}(t+1) = A\hat{x}(t) + L(y(t) - C\hat{x}(t)) = (A - LC)\hat{x}(t) + Ly(t). \qquad (2)$$
An optimal gain is one that leads to a differentially private mechanism with minimal noise (variance), and hence to the most accurate mechanism possible. The structure of the paper is as follows. In Section 2, we recall the relevant background to our work. In Section 3, we present our main results on optimal positive observers for compartmental systems; results are given for the single-output and multiple-output cases. In Section 4, we consider the problem of minimising the $\ell_1$ sensitivity bound for a single-output system where the norm of the observer system matrix is specified. We note briefly in Section 5 that it is possible to improve on the optimal sensitivity for classical observers by considering the more general class of [4]. Finally, in Section 6, we present our concluding remarks and some ideas for future work.
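As a quick numerical illustration of this setup, the following sketch simulates the system (1) and observer (2) and checks that the estimate converges to the true state; the matrices $A$, $c$ and the gain $l$ are assumed illustrative values, not data from the paper.

```python
import numpy as np

# simulate the plant (1) and observer (2); A, c and the gain l are
# assumed illustrative values with ||A - l c^T|| < 1
A = np.array([[0.5, 0.25],
              [0.5, 0.5]])
c = np.array([0.5, 1.0])
l = np.array([0.25, 0.0])

x = np.array([1.0, 2.0])      # true state
xhat = np.zeros(2)            # observer estimate
for _ in range(30):
    y = c @ x                                   # scalar output y = c^T x
    xhat = (A - np.outer(l, c)) @ xhat + l * y  # observer update (2)
    x = A @ x                                   # plant update (1)
err = np.abs(x - xhat).sum()   # l1 estimation error after 30 steps
```

Since the error dynamics are $e(t+1) = (A - lc^T)e(t)$ with $\|A - lc^T\| = 0.875 < 1$ here, the $\ell_1$ error contracts geometrically.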

Background and technical preliminaries
$\mathbb{R}^n$ denotes the vector space of $n$-tuples of real numbers and $\mathbb{R}^{m\times n}$ the space of $m \times n$ matrices with real entries. For $x \in \mathbb{R}^n$: $x \ge 0$ means that $x_i \ge 0$ for $1 \le i \le n$. $\mathbb{R}^n_+$ denotes the nonnegative orthant $\mathbb{R}^n_+ := \{x \in \mathbb{R}^n \mid x \ge 0\}$. Similarly, for $A \in \mathbb{R}^{m\times n}$: $A \ge 0$ means that $a_{ij} \ge 0$ for $1 \le i \le m$ and $1 \le j \le n$. We will sometimes use the notation $[A]_{ij}$ to denote the $i,j$ entry of a matrix $A$. $\mathbb{R}^{m\times n}_+$ denotes the cone of nonnegative matrices. $A \ge B$ denotes that $A - B \ge 0$ is a nonnegative matrix. $A^T$ denotes the transpose of $A$.
For a vector $c \in \mathbb{R}^n_+$, the support of $c$, $\mathrm{supp}(c)$, is given by $\mathrm{supp}(c) = \{j : c_j \ne 0\}$. Throughout, we use $\|x\|$ to denote the $\ell_1$ norm of $x \in \mathbb{R}^n$ and, for $T \in \mathbb{R}^{m\times n}$, $\|T\|$ denotes the $\ell_1$ induced norm of $T$. It is well known [25] that this is given by
$$\|T\| = \max_{1 \le j \le n} \sum_{i=1}^m |t_{ij}|.$$
Our main results concern the case where the system (1) is compartmental [16][17][18]. A matrix $A$ in $\mathbb{R}^{n\times n}_+$ is compartmental if $\sum_{i=1}^n a_{ij} \le 1$ for all $1 \le j \le n$. If the system matrix $A$ in Eq. (1) is compartmental, then Eq. (1) is a compartmental system.
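Both the $\ell_1$ induced norm (maximum absolute column sum) and the compartmental property are straightforward to check numerically; a minimal sketch, with $A$ an assumed example matrix:

```python
import numpy as np

def l1_induced_norm(T):
    """l1 induced norm of a matrix: the maximum absolute column sum."""
    return np.abs(T).sum(axis=0).max()

def is_compartmental(A):
    """A nonnegative matrix is compartmental if every column sum is at most 1."""
    return bool(np.all(A >= 0) and np.all(A.sum(axis=0) <= 1))

A = np.array([[0.5, 0.2],
              [0.5, 0.3]])
print(l1_induced_norm(A))   # 1.0 (the first column sums to 1.0)
print(is_compartmental(A))  # True
```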

Differential privacy and linear observers
We very briefly recall the most relevant facts concerning differential privacy in the context of this paper. For reasons of space, we omit many of the fundamental details on probability; for more details and background, see [20].
For differential privacy, we first must choose a similarity relation on the space of signals $y(\cdot)$. Following [21], given two constants $K > 0$, $0 < \alpha < 1$, the signals $y, y'$ are said to be similar, $y \sim y'$, if there is some $t_0 \ge 0$ with
$$y(t) = y'(t) \text{ for } t < t_0, \qquad \|y(t) - y'(t)\| \le K\alpha^{t - t_0} \text{ for } t \ge t_0. \qquad (4)$$
The similarity relation captures changes in the signal due to the behaviour of a small number of individuals at time $t_0$. The parameters $K$ and $\alpha$ need to be chosen in advance and depend on the application context and the level of protection required: $K$ describes how large an initial change is permitted, while $\alpha$ describes the speed with which this change decays to zero. The sensitivity of the system (2) is the key parameter that determines the magnitude of the noise required in order to achieve differential privacy. We use $\hat{x}_y$ to denote the observer signal corresponding to the output $y$.

Definition 2.1. The $\ell_1$ sensitivity $\Delta$ of the observer (2) is given by:
$$\Delta := \sup_{y \sim y'} \sum_{t=0}^{\infty} \|\hat{x}_y(t) - \hat{x}_{y'}(t)\|.$$
A real-valued Laplace random variable with mean 0 and scale parameter $b > 0$ has probability density function (pdf) $f(x) = \frac{1}{2b}e^{-|x|/b}$. Given $\varepsilon > 0$, an $\varepsilon$ differentially private mechanism for the observer (2) can be constructed by adding noise from a Laplace distribution to each component of $\hat{x}(t)$ at each time $t \ge 0$. The key fact is the following.
Proposition 2.1 [20]. Let Eq. (2) have sensitivity $\Delta$ and let $\varepsilon > 0$ be given. Choose $b \ge \Delta/\varepsilon$, and define the mechanism $\hat{X}_y$ by adding an independent Laplace random variable with mean 0 and scale parameter $b$ to each component of $\hat{x}_y(t)$ for all $t$. Then $\hat{X}_y$ is $\varepsilon$ differentially private.
Proposition 2.1 highlights the important role played by the sensitivity of the system in determining the amount of noise required for differential privacy. Exact computation of the sensitivity can be difficult. However, a differentially private mechanism for Eq. (2) can be constructed using the Laplace distribution provided we have an upper bound for the $\ell_1$ sensitivity of Eq. (2).
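A sketch of the resulting Laplace mechanism, under the assumption that an upper bound on the sensitivity is available; the sensitivity bound and the value of $\varepsilon$ below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def laplace_mechanism(xhat, sensitivity_bound, epsilon):
    """Add i.i.d. Laplace noise of scale b = sensitivity_bound / epsilon to
    each observer state, as in Proposition 2.1; any upper bound on the true
    sensitivity suffices, at the cost of adding more noise than necessary."""
    b = sensitivity_bound / epsilon
    return xhat + rng.laplace(loc=0.0, scale=b, size=xhat.shape)

xhat = np.array([1.0, 2.0, 3.0])   # observer state at some time t
private = laplace_mechanism(xhat, sensitivity_bound=0.5, epsilon=0.1)
```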
Recall that Eq. (2) is a positive observer for the positive system (1) if and only if the following conditions are satisfied [3]: $A - LC \ge 0$ and $LC \ge 0$. In order to ensure that an observer of the form (2) has a finite $\ell_1$ sensitivity with respect to the similarity relation (4), we require that $\|A - LC\| < 1$. Thus we are considering a more restricted form of the positive observer problem. In [23] the following bound for the $\ell_1$ sensitivity of a linear observer was derived.

Theorem 2.1. Consider the observer (2) with $\|A - LC\| < 1$ and let $K > 0$, $0 < \alpha < 1$ be given. The sensitivity of Eq. (2) with respect to the similarity relation (4) satisfies the following bound:
$$\Delta \le \frac{K}{1-\alpha}\cdot\frac{\|L\|}{1 - \|A - LC\|}.$$
To minimize the amount of noise added to the observer state $\hat{x}$, we want an observer gain $L$ that minimizes the upper bound in Theorem 2.1. As $K$ and $\alpha$ are fixed parameters, we focus on minimizing the function
$$\Phi(L) := \frac{\|L\|}{1 - \|A - LC\|} \qquad (7)$$
for $L$ such that Eq. (2) is a positive observer with $\|A - LC\| < 1$. It is worth noting that the function $\Phi$ is essentially the $\ell_1$ norm of the zero-initial-state observer (2). Hence, it quantifies the sensitivity of this system to perturbations in the signal $y$ in terms of the $\ell_1$ norm on the associated sequence spaces. An interesting direction for future work would be to exploit this observation to investigate whether results on robust observer design, and the dual problem of robust state feedback, can be related to the problems of differential privacy discussed here. Later examples will help clarify the (somewhat complicated) relationship between $\Phi(L)$ and the norm of the observer gain $L$. In particular, we show that for some system classes, reducing $\|L\|$ can increase $\Phi(L)$, but the opposite may also occur.
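The objective just introduced is easy to evaluate directly; a minimal sketch, where $A$, $C$ and $L$ are assumed illustrative values:

```python
import numpy as np

def l1_induced_norm(T):
    """Maximum absolute column sum."""
    return np.abs(T).sum(axis=0).max()

def phi(A, L, C):
    """Sensitivity-bound objective Phi(L) = ||L|| / (1 - ||A - L C||);
    returns infinity when ||A - L C|| >= 1, where the bound is not valid."""
    r = l1_induced_norm(A - L @ C)
    if r >= 1:
        return np.inf
    return l1_induced_norm(L) / (1 - r)

A = np.array([[0.25, 0.5],
              [0.5, 0.25]])
C = np.array([[1.0, 0.0]])
L = np.array([[0.25],
              [0.25]])
value = phi(A, L, C)   # ||L|| = 0.5 and ||A - LC|| = 0.75, so Phi(L) = 2.0
```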

Compartmental systems
We now assume Eq. (1) is a compartmental system and seek to minimize the function $\Phi$ in Eq. (7) over observer gain matrices $L$ that define a positive observer for Eq. (1). First note that if $\|A\| < 1$, we can choose $L = 0$, giving $\Phi(L) = 0$; as this case is trivial for the problem we are studying here, we assume that $\|A\| = 1$ (recall that $A$ is compartmental).

Single output systems: p = 1
We first consider the case where $p = 1$, so that $C = c^T$ for some nonnegative column vector $c$. Given a compartmental matrix $A \in \mathbb{R}^{n\times n}_+$ with $\|A\| = 1$ and a column vector $c \in \mathbb{R}^n_+$, we say that $(A, c)$ is a feasible pair if there exists some nonnegative $l \in \mathbb{R}^n_+$ with $\|A - lc^T\| < 1$, $A - lc^T \ge 0$. Note that when $p = 1$, the conditions $lc^T \ge 0$ and $l \ge 0$ are equivalent [3]. We denote the set of all such $l$ by $F_{A,c}$. We first characterize feasible pairs for the compartmental case.
Lemma 3.1. Let a compartmental matrix $A \in \mathbb{R}^{n\times n}_+$ with $\|A\| = 1$ and $c \in \mathbb{R}^n_+$ be given, and let $J := \{j : \sum_{i=1}^n a_{ij} = 1\}$. Then $(A, c)$ is a feasible pair if and only if: (i) $J \subseteq \mathrm{supp}(c)$; (ii) there exists some $k$ such that $a_{kj} > 0$ for all $j \in \mathrm{supp}(c)$.

Proof. First assume that (i) and (ii) hold. Choose some $k$ such that $a_{kj} > 0$ for all $j \in \mathrm{supp}(c)$ and set $x = \min\{a_{kj}/c_j : j \in \mathrm{supp}(c)\}$. Clearly, $x > 0$. Now define $l \in \mathbb{R}^n_+$ by $l_k = x$ and $l_i = 0$ for $i \ne k$. It follows immediately that for $i \ne k$, $a_{ij} - l_ic_j = a_{ij} \ge 0$ for all $j$. Moreover, for $i = k$, $a_{kj} - l_kc_j = a_{kj} \ge 0$ for $j \notin \mathrm{supp}(c)$, while for $j \in \mathrm{supp}(c)$ the definition of $x$ implies that $a_{kj} - l_kc_j = a_{kj} - xc_j \ge 0$; hence $A - lc^T \ge 0$. Furthermore, $x > 0$ and $J \subseteq \mathrm{supp}(c)$; these conditions imply that for $j \in J$,
$$\sum_{i=1}^n (a_{ij} - l_ic_j) = 1 - xc_j < 1.$$
From the definition of $J$, we conclude that $\|A - lc^T\| < 1$ and hence $(A, c)$ is a feasible pair. Conversely, assume that $(A, c)$ is a feasible pair and let $l \ge 0$ be such that $A - lc^T \ge 0$ and $\|A - lc^T\| < 1$. As $\|A\| = 1$, it follows that $J$ is non-empty and that $l \ne 0$. Let $j \in J$ be given. Then $\sum_{i=1}^n a_{ij} = 1$ and, as $\|A - lc^T\| < 1$, we must have $\sum_{i=1}^n (a_{ij} - l_ic_j) < 1$. This implies that $c_j > 0$ and hence $j \in \mathrm{supp}(c)$. This proves (i). As $l \ne 0$, we can choose some $i$ with $l_i > 0$. Then, as $a_{ij} - l_ic_j \ge 0$ for all $i, j$, we must have $a_{ij} > 0$ for all $j \in \mathrm{supp}(c)$. This proves (ii) and completes the proof of the Lemma.
When $p = 1$, the gain matrix $L$ is simply a column vector $l \in \mathbb{R}^n_+$; hence, for this subsection we slightly alter our notation for the function in Eq. (7) and write $\Phi(l)$ for $l$ in $F_{A,c}$. In our next result, we give an explicit analytic characterisation of the minimum value of $\Phi(l)$ for $l \in F_{A,c}$.

Theorem 3.1. Let a compartmental matrix $A \in \mathbb{R}^{n\times n}_+$ with $\|A\| = 1$ and $c \in \mathbb{R}^n_+$ be given, and assume that $F_{A,c}$ is non-empty. Then
$$\min_{l \in F_{A,c}} \Phi(l) = \frac{1}{\min_{j \in J} c_j}.$$
Proof. We first note that for any $l \in F_{A,c}$ and any $j \in J$,
$$\|A - lc^T\| \ge \sum_{i=1}^n (a_{ij} - l_ic_j) = 1 - c_j\|l\|.$$
This implies that $1 - \|A - lc^T\| \le \|l\|\min_{j\in J} c_j$ and hence that
$$\Phi(l) = \frac{\|l\|}{1 - \|A - lc^T\|} \ge \frac{1}{\min_{j\in J} c_j}.$$
Thus to complete the proof, we need to show that there exists some $\hat{l} \in F_{A,c}$ with $\Phi(\hat{l}) = 1/\min_{j\in J} c_j$. Choose $j_0 \in J$ with $c_{j_0} = \min_{j\in J} c_j$, write $J^c = \{1,\ldots,n\}\setminus J$ and $\alpha_k = \sum_{i=1}^n a_{ik}$, and set
$$M := \min\left\{\frac{1 - \alpha_k}{c_{j_0} - c_k} : k \in J^c,\ c_k < c_{j_0}\right\},$$
with $M := +\infty$ if no such $k$ exists. Then, as $\alpha_k < 1$ for all $k \in J^c$, $M > 0$. By assumption, $F_{A,c}$ is non-empty. Thus, by Lemma 3.1, $J \subseteq \mathrm{supp}(c)$ and we can choose some $i_0$ such that $a_{i_0j} > 0$ for all $j \in \mathrm{supp}(c)$. Define the vector $\hat{l} \in \mathbb{R}^n_+$ by $\hat{l}_{i_0} = \min\{b, M\}$, where $b = \min\{a_{i_0j}/c_j : j \in \mathrm{supp}(c)\}$, and $\hat{l}_i = 0$ for $i \ne i_0$. It is readily verified that $A - \hat{l}c^T \ge 0$ and $\hat{l} \in F_{A,c}$. To finish the proof, we show that for $\hat{l}$ constructed above, $\|A - \hat{l}c^T\| = 1 - \hat{l}_{i_0}c_{j_0}$. Indeed, for $j \in J$, the $j$th column sum of $A - \hat{l}c^T$ is $1 - \hat{l}_{i_0}c_j \le 1 - \hat{l}_{i_0}c_{j_0}$, while for $k \in J^c$, the choice $\hat{l}_{i_0} \le M$ ensures that $\alpha_k - \hat{l}_{i_0}c_k \le 1 - \hat{l}_{i_0}c_{j_0}$. Putting the previous calculations together, it now follows immediately that
$$\Phi(\hat{l}) = \frac{\hat{l}_{i_0}}{\hat{l}_{i_0}c_{j_0}} = \frac{1}{\min_{j\in J} c_j},$$
which completes the proof.
Constructing the optimal $\hat{l}$: It is relatively straightforward to construct the optimal observer gain by following the steps in the proof of Theorem 3.1.

1. Find the minimum value of the entries of $c$ corresponding to indices in $J$; choose one such entry $c_{j_0}$.
2. Choose $i_0$ such that $a_{i_0j} > 0$ for all $j \in \mathrm{supp}(c)$, and compute $b = \min\{a_{i_0j}/c_j : j \in \mathrm{supp}(c)\}$ and the constant $M$ from the proof.
3. Construct $\hat{l}$ by setting its $i_0$th entry equal to the minimum of $b$ and $M$ and all other entries to zero.

Remark: Theorem 3.1 gives an explicit, analytic expression for the minimum of the sensitivity bound for a single-output compartmental system, namely $1/\min_{j\in J} c_j$. Further, it provides a constructive way of obtaining the gain vector $\hat{l}$ that achieves this minimum.
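The construction steps can be sketched numerically as follows. The constant `M` follows the reconstruction in the proof above, and the matrices are assumed illustrative values; for them, $J = \{1\}$ (first column), $\min_{j\in J} c_j = 0.5$, and the optimal value of $\Phi$ is $2$.

```python
import numpy as np

def optimal_gain(A, c):
    """Sketch of the construction in Theorem 3.1 for a single-output
    compartmental system with ||A|| = 1; the constant M follows the
    reconstruction in the proof above."""
    n = A.shape[0]
    alpha = A.sum(axis=0)                 # column sums alpha_j
    J = np.where(alpha == 1.0)[0]
    supp = np.where(c > 0)[0]
    assert set(J) <= set(supp)            # feasibility, Lemma 3.1 (i)
    i0 = next(i for i in range(n) if np.all(A[i, supp] > 0))  # Lemma 3.1 (ii)
    j0 = J[np.argmin(c[J])]               # index attaining min_{j in J} c_j
    b = np.min(A[i0, supp] / c[supp])     # keeps A - l c^T nonnegative
    Jc = [k for k in range(n) if alpha[k] < 1.0 and c[k] < c[j0]]
    M = min(((1.0 - alpha[k]) / (c[j0] - c[k]) for k in Jc), default=np.inf)
    l_hat = np.zeros(n)
    l_hat[i0] = min(b, M)
    return l_hat

A = np.array([[0.5, 0.25],
              [0.5, 0.5]])
c = np.array([0.5, 1.0])
l_hat = optimal_gain(A, c)
r = np.abs(A - np.outer(l_hat, c)).sum(axis=0).max()
phi_val = l_hat.sum() / (1 - r)   # equals 1 / min_{j in J} c_j = 2.0
```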

Extension to multiple output systems (p ≥ 2)
We next consider the more general case where $p \ge 2$. Given a compartmental matrix $A \in \mathbb{R}^{n\times n}_+$ with $\|A\| = 1$ and $C \in \mathbb{R}^{p\times n}_+$, we say that $(A, C)$ is a feasible pair if there exists some $L \in \mathbb{R}^{n\times p}$ with $A - LC \ge 0$, $LC \ge 0$ and $\|A - LC\| < 1$; we denote the set of all such $L$ by $F_{A,C}$. We wish to minimize the function $\Phi(L)$ given by Eq. (7) over $L \in F_{A,C}$. Lemma 3.1 was important for our construction of the optimal observer gain vector $\hat{l}$ in Theorem 3.1. It is tempting to conjecture the following natural generalisation of this result. For $1 \le k \le p$, let $c^{(k)}$ denote the $k$th row of $C$. As above, let $J$ denote the set $\{j : \sum_{i=1}^n a_{ij} = 1\}$. A natural conjecture generalising Lemma 3.1 is that $(A, C)$ is a feasible pair if and only if the following two conditions are satisfied:
(F1) for every $j \in J$ there is some $k$ with $c_{kj} > 0$; that is, $J \subseteq \bigcup_{k=1}^p \mathrm{supp}(c^{(k)})$;
(F2) for each $k \in \{1, \ldots, p\}$, there exists some $i_k$ such that $a_{i_kj} > 0$ for all $j \in \mathrm{supp}(c^{(k)})$.
Unfortunately, this conjecture is not true as is shown by the following example.

Example 3.2. One can construct a pair of matrices $A$, $C$ for which $A$ is compartmental with $\|A\| = 1$ and $(A, C)$ is feasible, yet conditions (F1) and (F2) do not both hold. We will revisit this example after our next result, which extends Theorem 3.1 to multiple-output compartmental systems satisfying conditions (F1) and (F2) above.

Theorem 3.2. Let a compartmental matrix $A \in \mathbb{R}^{n\times n}_+$ with $\|A\| = 1$ and $C \in \mathbb{R}^{p\times n}_+$ satisfying (F1) and (F2) be given, and for $1 \le j \le n$ set $\gamma_j := \sum_{k=1}^p c_{kj}$. Then the pair $(A, C)$ is feasible and moreover there exists $\hat{L} \in F_{A,C}$ such that for all $L$ in $F_{A,C}$:
$$\Phi(\hat{L}) = \frac{1}{\min_{j\in J}\gamma_j} \le \Phi(L),$$
where $\Phi$ is given by Eq. (7).
Proof. Let $L \in F_{A,C}$ be given. Then, as $A - LC \ge 0$, for any $j \in J$:
$$\|A - LC\| \ge \sum_{i=1}^n [A - LC]_{ij} = 1 - \sum_{i=1}^n [LC]_{ij}.$$
Note that while we are not assuming $L \ge 0$ here, we do have $C \ge 0$ and hence for $1 \le i, j \le n$:
$$[LC]_{ij} = \sum_{k=1}^p l_{ik}c_{kj}.$$
It follows that for any $j \in J$:
$$\sum_{i=1}^n [LC]_{ij} = \sum_{k=1}^p \left(\sum_{i=1}^n l_{ik}\right)c_{kj} \le \|L\|\sum_{k=1}^p c_{kj} = \|L\|\gamma_j.$$
Combining this with the first inequality, we see that $\|A - LC\| \ge 1 - \|L\|\min_{j\in J}\gamma_j$. It now follows immediately that for any $L \in F_{A,C}$:
$$\Phi(L) \ge \frac{1}{\min_{j\in J}\gamma_j}.$$
It remains for us to show that there exists some $\hat{L}$ in $F_{A,C}$ such that $\Phi(\hat{L})$ attains this lower bound.
To begin, write $J^c$ for the complement of $J$, $J^c = \{1, \ldots, n\}\setminus J$. Also, for $1 \le j \le n$, let $\alpha_j = \sum_{i=1}^n a_{ij}$; thus $\alpha_j < 1$ for all $j \in J^c$. It is easy to see that by choosing $x > 0$ sufficiently small, we can ensure that
$$\alpha_j - x\gamma_j \le 1 - x\min_{j\in J}\gamma_j \quad \text{for all } j \in J^c. \qquad (14)$$
Now, assumption (F2) implies that for $1 \le k \le p$, there is some (not necessarily unique) $i_k$ in $\{1, \ldots, n\}$ such that $a_{i_kj} > 0$ for all $j$ with $c_{kj} > 0$. We use this fact to construct $\hat{L}$. First, choose $i_1$ such that $c_{1j} > 0$ implies $a_{i_1j} > 0$. Then, for some $x > 0$ (to be determined later), set $\hat{l}_{i_11} = x$, $\hat{l}_{s1} = 0$ for $s \ne i_1$. Repeat this, choosing $i_k$ for $k = 2, 3, \ldots, p$ and in each case setting $\hat{l}_{i_kk} = x$ and $\hat{l}_{sk} = 0$ otherwise. Note that the $i_k$ selected need not all be distinct. However, it can be seen from the construction of $\hat{L}$ that: (i) each column of $\hat{L}$ has exactly one non-zero entry, which is equal to $x$; (ii) $\hat{l}_{sq} > 0$, $c_{qj} > 0$ implies that $a_{sj} > 0$ (as in this case $s = i_q$).
From (i), it follows that $\sum_{i=1}^n \hat{l}_{ij} = x$ for $1 \le j \le p$ and that $\|\hat{L}\| = x$. From point (ii), it follows that by choosing $x$ sufficiently small (and positive) we can ensure that $A - \hat{L}C \ge 0$. Clearly, as $\hat{L} \ge 0$, $\hat{L}C \ge 0$. Next, note that if $j \in J$, there is some $k \in \{1, \ldots, p\}$ such that $c_{kj} > 0$. Hence, by the construction of $\hat{L}$, $\hat{l}_{i_kk} = x > 0$. This implies that $\sum_{i=1}^n [A - \hat{L}C]_{ij} < 1$ for any such $j$ and, by the definition of $J$, $\|A - \hat{L}C\| < 1$. Thus $\hat{L} \in F_{A,C}$. Finally, note that
$$\|A - \hat{L}C\| = \max_{1\le j\le n}(\alpha_j - x\gamma_j).$$
Now, if we choose $x > 0$ sufficiently small so that Eq. (14) holds, then we can ensure that $\max_{1\le j\le n}(\alpha_j - x\gamma_j) = 1 - x\min_{j\in J}\gamma_j$. As $\|\hat{L}\| = x$ by construction, it follows that for such a choice of $x > 0$:
$$\Phi(\hat{L}) = \frac{x}{x\min_{j\in J}\gamma_j} = \frac{1}{\min_{j\in J}\gamma_j}.$$
This completes the proof.

Remarks:
(i) Theorem 3.2 explicitly characterizes the minimum value of $\Phi(L)$ for a multiple-output compartmental system satisfying (F1), (F2). Further, the proof is constructive, as it demonstrates how to build an optimal observer gain matrix $\hat{L}$. (ii) For $p \ge 2$, it is in general not necessary for the gain matrix $L$ of a positive observer (2) to be nonnegative. However, the optimal $\hat{L}$ constructed in Theorem 3.2 is indeed nonnegative. The problem of determining classes of positive linear systems for which there exists a nonnegative optimal observer gain is an interesting one for further research. It is certainly not going to be true for arbitrary positive systems, as [3] contains examples of systems (1) for which there exists a positive observer with $LC \ge 0$ but no positive observer with $L \ge 0$.
Constructing an optimal observer gain $\hat{L}$: It is possible to construct an optimal gain $\hat{L}$ for a system satisfying (F1), (F2) by following the steps in the proof of Theorem 3.2. As an illustration, consider a pair $(A, C)$ for which it is straightforward to see that (F1) and (F2) are satisfied and $J = \{1, 3\}$, so that the optimal observer sensitivity bound is $1/\min_{j\in J}\gamma_j$. Following the algorithm steps to construct $\hat{L}$, $\gamma_1 = 1$ and $b_1 = \min\{4/3, 5/8\} = 5/8$. Next we note that $i_1 = 2$, $i_2 = 3$. A simple calculation shows that $b_2 = 2/5$, so we can (for example) choose $x = 1/5$. Our next example shows that the conclusion of Theorem 3.2 does not necessarily hold without the assumptions (F1), (F2) on the matrix pair $(A, C)$.
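The gain construction in the proof of Theorem 3.2 can be sketched as follows; the matrices $A$, $C$ and the value of $x$ are assumed illustrative choices satisfying (F1) and (F2), for which $J = \{1, 3\}$ (columns 1 and 3), $\gamma = (1, 0, 1/2)$ and the optimal value of $\Phi$ is $2$.

```python
import numpy as np

def l1_norm(T):
    return np.abs(T).sum(axis=0).max()

def multi_output_gain(A, C, x):
    """Sketch of the gain construction in the proof of Theorem 3.2: column k
    of L_hat has a single entry x, placed at a row i_k that is strictly
    positive on the support of row k of C (assumption (F2))."""
    n, p = A.shape[0], C.shape[0]
    L = np.zeros((n, p))
    for k in range(p):
        supp_k = np.where(C[k] > 0)[0]
        i_k = next(i for i in range(n) if np.all(A[i, supp_k] > 0))
        L[i_k, k] = x
    return L

A = np.array([[0.5, 0.25, 0.0],
              [0.25, 0.25, 0.5],
              [0.25, 0.25, 0.5]])
C = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.0, 0.5]])
L_hat = multi_output_gain(A, C, x=0.25)
r = l1_norm(A - L_hat @ C)
phi_val = l1_norm(L_hat) / (1 - r)   # equals 1 / min_{j in J} gamma_j = 2.0
```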

Trade-offs for l 1 sensitivity
When designing an observer (2), the speed of convergence to the true state, as determined by $\|A - LC\|$, is an important consideration. In general, there will be a trade-off between obtaining the lowest possible value for the sensitivity bound (or, equivalently, for the function $\Phi$) and minimising the norm $\|A - LC\|$. In this section, we consider this trade-off for a general positive LTI system, not necessarily compartmental, with a single output.
Consider the system (1), where $A$ is a nonnegative matrix, not necessarily compartmental, and $C$ is a (non-zero) nonnegative row vector, which we will write as $c^T$ for some $c \in \mathbb{R}^n_+\setminus\{0\}$. Thus, we adopt the notation of Section 3.1 and consider the interplay between the objective functions $\|A - lc^T\|$ and $\Phi(l)$, where $l \in \mathbb{R}^n_+$ is constrained to lie in the feasibility region $F_{A,c}$. Formally, we will address the following question in detail.

Problem 4.1. Given $A \in \mathbb{R}^{n\times n}_+$, $c \in \mathbb{R}^n_+$ and $\eta \in [0, 1)$, find the minimum value of $\Phi(l)$ subject to
$$l \ge 0, \qquad A - lc^T \ge 0, \qquad \|A - lc^T\| = \eta. \qquad (15)$$
We first characterise the possible values of $\eta = \|A - lc^T\|$ for $l$ satisfying the first two conditions of Eq. (15).

Lemma 4.1. Let $A \in \mathbb{R}^{n\times n}_+$, $c \in \mathbb{R}^n_+$ be given and suppose that $l \in \mathbb{R}^n_+$ is such that $A - lc^T \ge 0$. Define the vector $\hat{l}$ for $1 \le i \le n$ by
$$\hat{l}_i = \min\left\{\frac{a_{ik}}{c_k} : k \in \mathrm{supp}(c)\right\}.$$
Then
$$\|A - \hat{l}c^T\| \le \|A - lc^T\| \le \|A\|.$$
Proof. As $A - lc^T \ge 0$ and $l \ge 0$, it follows that for all $i$ and all $k \in \mathrm{supp}(c)$, $l_ic_k \le a_{ik}$; hence $l \le \hat{l}$. As $c \ge 0$ and the $\ell_1$ induced matrix norm is monotonic on nonnegative matrices [25], it follows that:
$$0 \le A - \hat{l}c^T \le A - lc^T \le A \implies \|A - \hat{l}c^T\| \le \|A - lc^T\| \le \|A\|.$$

Remarks:
(i) For the rest of this section, given $A \in \mathbb{R}^{n\times n}_+$ and $c \in \mathbb{R}^n_+$, we shall use $\eta_{\min}$ and $\eta_{\max}$ to denote $\|A - \hat{l}c^T\|$ and $\|A\|$ respectively, with $\hat{l}$ as in Lemma 4.1. (ii) If $\eta_{\min} = \eta_{\max}$, then $\|A - lc^T\| = \|A\|$ for any $l$ satisfying the first two conditions of Eq. (15). There are two possibilities in this case: $\|A\| \ge 1$, in which case the problem is not feasible; or $\|A\| < 1$, and $l = 0$ gives a solution of Problem 4.1 with $\Phi(l) = 0$. Thus if $\eta_{\min} = \eta_{\max}$, the problem is either infeasible or trivial, so we shall assume for the rest of this section that $\eta_{\min} < \eta_{\max}$; this implies that $\hat{l} \ne 0$. (iii) Let $A \in \mathbb{R}^{n\times n}_+$, $c \in \mathbb{R}^n_+$ be such that $\eta_{\min} < \eta_{\max}$. Then it is not difficult to see that for any $\eta \in [\eta_{\min}, \eta_{\max}]$ there exists some $l \in \mathbb{R}^n_+$ that satisfies Eq. (15); this is essentially an application of the Intermediate Value Theorem coupled with the continuity of the norm. (iv) We are interested in the function $\Phi$ as a bound for the $\ell_1$ sensitivity of a linear observer (2). This bound, given by Theorem 2.1, is only valid if $\|A - lc^T\| < 1$ holds. For this reason, we will assume for the rest of this section that $\eta_{\min} < 1$.
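The vector $\hat{l}$ of Lemma 4.1 and the resulting interval $[\eta_{\min}, \eta_{\max}]$ are easy to compute directly; a minimal sketch with assumed illustrative data:

```python
import numpy as np

def l1_norm(T):
    return np.abs(T).sum(axis=0).max()

def eta_range(A, c):
    """Compute l_hat_i = min_{k in supp(c)} a_{ik}/c_k (Lemma 4.1) and the
    resulting range [eta_min, eta_max] of achievable ||A - l c^T||."""
    supp = np.where(c > 0)[0]
    l_hat = (A[:, supp] / c[supp]).min(axis=1)
    eta_min = l1_norm(A - np.outer(l_hat, c))
    eta_max = l1_norm(A)
    return l_hat, eta_min, eta_max

A = np.array([[0.5, 0.25],
              [0.5, 0.5]])
c = np.array([0.5, 1.0])
l_hat, eta_min, eta_max = eta_range(A, c)   # eta_min = 0.625, eta_max = 1.0
```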
For $\eta \in [\eta_{\min}, \eta_{\max}]$, let $F(\eta)$ denote the set of $l$ satisfying Eq. (15); that is, $F(\eta) := \{l \in \mathbb{R}^n_+ : A - lc^T \ge 0,\ \|A - lc^T\| = \eta\}$.

Lemma 4.2. Let $A \in \mathbb{R}^{n\times n}_+$, $c \in \mathbb{R}^n_+\setminus\{0\}$ be such that $\eta_{\min} < \eta_{\max}$. Then any $k$ with $\sum_{i=1}^n a_{ik} = \|A\|$ satisfies $k \in \mathrm{supp}(c)$.

Proof. By assumption, $\|A - \hat{l}c^T\| < \|A\|$. This implies that for any $j$ with $\sum_{i=1}^n a_{ij} = \|A\|$, we must have $c_j > 0$; that is, any index $k$ with $\sum_{i=1}^n a_{ik} = \|A\|$ belongs to $\mathrm{supp}(c)$.

The next result characterizes the norm of $l \in F(\eta)$ in terms of $\eta$.

Lemma 4.3. Let $\eta \in [\eta_{\min}, \eta_{\max}]$ and $l \in F(\eta)$ be given, and define
$$M(\eta) := \max\left\{0,\ \max_{j\in\mathrm{supp}(c)}\frac{\alpha_j - \eta}{c_j}\right\}, \qquad (20)$$
where $\alpha_j = \sum_{i=1}^n a_{ij}$. Then $\|l\| \ge M(\eta)$.

Proof. As $l \in F(\eta)$, $A - lc^T \ge 0$ and, for any $j \in \mathrm{supp}(c)$:
$$\eta = \|A - lc^T\| \ge \sum_{i=1}^n (a_{ij} - l_ic_j) = \alpha_j - c_j\|l\|.$$
From the definition of $\mathrm{supp}(c)$, it follows immediately that:
$$\|l\| \ge \frac{\alpha_j - \eta}{c_j} \quad \text{for all } j \in \mathrm{supp}(c), \quad \text{and hence } \|l\| \ge M(\eta).$$
Remark: The last result gives us a lower bound for $\|l\|$ where $l$ is in $F(\eta)$ and $\eta \in [\eta_{\min}, \eta_{\max}]$. We next show that this lower bound is in fact a minimum.
Proposition 4.1. Let $A \in \mathbb{R}^{n\times n}_+$, $c \in \mathbb{R}^n_+\setminus\{0\}$ be given. Assume that $\eta_{\min} < \eta_{\max}$. For any $\eta \in [\eta_{\min}, \eta_{\max}]$, let $M(\eta)$ be given by Eq. (20). Then:
$$\min\{\|l\| : l \in F(\eta)\} = M(\eta).$$
Proof. From Lemma 4.3, it is enough to show that there exists $l^* \in F(\eta)$ with $\|l^*\| = M(\eta)$. As $\eta_{\min} < \eta_{\max}$, it follows that there exists $l \in F(\eta)$ with $l \ne 0$; choose such an $l$. As $A - lc^T$ and $A$ agree in every column outside $\mathrm{supp}(c)$, we have, for all $j \notin \mathrm{supp}(c)$, $\sum_{i=1}^n a_{ij} \le \eta$. Define $l^* := (M(\eta)/\|l\|)\,l$; as $M(\eta) \le \|l\|$ by Lemma 4.3, we have $l^* \le l$, so $A - l^*c^T \ge 0$ is retained and $\|l^*\| = M(\eta)$. First of all, note that for all $j \in \mathrm{supp}(c)$, the $j$th column sum of $A - l^*c^T$ is $\alpha_j - c_jM(\eta) \le \eta$. Next, note that the definition of $M(\eta)$ implies that $\alpha_{j_0} - c_{j_0}M(\eta) = \eta$ for some $j_0 \in \mathrm{supp}(c)$, so that $\|A - l^*c^T\| = \eta$ and $l^* \in F(\eta)$.

Remark:
It is now straightforward to apply Proposition 4.1 to answer Problem 4.1. With the application to differential privacy in mind, we make the assumption that $\eta_{\min} < 1$ (in order to ensure that the bound in Theorem 2.1 is valid). The next result follows immediately from the definition of $\Phi$ in Eq. (7) and the set $F(\eta)$.

Corollary 4.1. Let $A \in \mathbb{R}^{n\times n}_+$, $c \in \mathbb{R}^n_+\setminus\{0\}$ be given, with $\eta_{\min} < \eta_{\max}$ and $\eta_{\min} < 1$. For $\eta \in [\eta_{\min}, \min\{\eta_{\max}, 1\})$, the minimum value of $\Phi(l)$ for $l \in F(\eta)$ is $\frac{M(\eta)}{1 - \eta}$.
Remark: In the corollary, if $\eta_{\max} < 1$ we can include the right endpoint $\eta_{\max}$ (that is, allow $\eta \in [\eta_{\min}, \eta_{\max}]$), but this makes no material difference to the argument or conclusion.

Implications for globally optimising the sensitivity bound
We now give two simple applications of Corollary 4.1 to the problem of finding a global minimum of $\Phi$ over $l$ in $F_{A,c}$.

Proposition 4.2. Let $A \in \mathbb{R}^{n\times n}_+$, $c \in \mathbb{R}^n_+\setminus\{0\}$ be given. Assume that $\eta_{\min} < \eta_{\max}$ and $\eta_{\min} < 1$. If $\sum_{i=1}^n a_{ij} \ge 1$ for all $j \in \mathrm{supp}(c)$, then the infimum of $\Phi(l)$ for $l$ in $F_{A,c}$ is $\frac{M(\eta_{\min})}{1 - \eta_{\min}}$.
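The trade-off curve $\eta \mapsto M(\eta)/(1 - \eta)$ of Corollary 4.1 can be evaluated numerically; a sketch with assumed illustrative data, where $M(\eta)$ follows Eq. (20) and, for the data below, $\eta_{\min} = 0.75$ and $\eta_{\max} = \|A\| = 1$:

```python
import numpy as np

def tradeoff_bound(A, c, eta):
    """Minimum of Phi over F(eta), namely M(eta) / (1 - eta), with
    M(eta) = max(0, max_{j in supp(c)} (alpha_j - eta)/c_j) as in Eq. (20).
    Valid for eta in [eta_min, min(eta_max, 1))."""
    alpha = A.sum(axis=0)
    supp = np.where(c > 0)[0]
    M = max(0.0, ((alpha[supp] - eta) / c[supp]).max())
    return M / (1.0 - eta)

A = np.array([[0.5, 0.375],
              [0.5, 0.5]])
c = np.array([0.5, 0.0625])
fast = tradeoff_bound(A, c, eta=0.75)    # fast observer: bound 8.0
slow = tradeoff_bound(A, c, eta=0.875)   # slower observer: bound 2.0
```

For this data, insisting on faster convergence (smaller $\eta$) forces a larger gain norm $M(\eta)$ and hence a larger sensitivity bound, illustrating the trade-off discussed above.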
Finally for this section, we prove a simple result for the case where the non-zero entries of c are all equal.

The l 1 sensitivity of positive observers with coordinate transformation
In [4], a more general form of positive observer was studied for continuous-time systems. For the discrete-time setting, the observer structure studied in [4] takes the form:
$$z(t+1) = Fz(t) + Gy(t), \qquad \hat{x}(t) = T^{-1}z(t). \qquad (22)$$
In order for this to define a positive observer for Eq. (1), the following conditions are sufficient: $F \ge 0$, $G \ge 0$, $T^{-1} \ge 0$ and $TA - FT = GC$. By suitably adapting the calculation previously published in [23] for the observer (2), we can readily derive the following bound for the $\ell_1$ sensitivity of Eq. (22). As before, we require that $\|F\| < 1$ in order to ensure that the sensitivity bound is finite.

Proposition 5.1. Consider the observer (22) with $\|F\| < 1$. Let $K > 0$, $0 < \alpha < 1$ be given. The sensitivity of Eq. (22) with respect to the similarity relation (4) satisfies the following bound:
$$\Delta \le \frac{K}{1-\alpha}\cdot\frac{\|T^{-1}\|\|G\|}{1 - \|F\|}.$$
Remark: It is easy to see that a classical Luenberger observer (2) corresponds to the choice $T = I$, $F = A - LC$, $G = L$, so Theorem 2.1 is a corollary of Proposition 5.1. It is immediate that an optimal observer of the form (22) cannot perform worse than the optimal observer of the classical type. However, characterising the set of possible observers satisfying Eq. (22) may be significantly more complicated than for the classical case. Hence, the problem of determining an optimal observer belonging to the more general class is likely to be far more challenging and is beyond the scope of the present paper. The point of the next example is simply to show that it is possible to improve significantly on the theoretical minimum for a classical observer by considering the more general type.
In Example 5.1, we consider an observer (22) for a single-output compartmental system. Thus, an observer is defined by a triple $(F, T, g)$ in $\mathbb{R}^{n\times n}_+ \times \mathbb{R}^{n\times n} \times \mathbb{R}^n_+$ satisfying $T^{-1} \ge 0$ and $TA - FT = gc^T$. We construct such an observer for which $\frac{\|T^{-1}\|\|g\|}{1 - \|F\|}$ is strictly less than the theoretical minimum given by Theorem 3.1 for the classical observer. The resulting upper bound of $9/5$ for the general observer is significantly lower than the theoretical minimum value of $3$ for a classical observer in this case.
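As a sanity check of the remark that the classical observer is the special case $T = I$, the bound of Proposition 5.1 can be evaluated numerically; the data below are assumed illustrative values, and for them the general bound coincides with $\Phi(l)$:

```python
import numpy as np

def l1_norm(T):
    return np.abs(T).sum(axis=0).max()

def general_bound(F, T, g):
    """Evaluate the sensitivity-bound objective ||T^{-1}|| ||g|| / (1 - ||F||)
    for the general observer (22), as in Proposition 5.1."""
    Tinv = np.linalg.inv(T)
    return l1_norm(Tinv) * np.abs(g).sum() / (1.0 - l1_norm(F))

# classical Luenberger observer as the special case T = I, F = A - l c^T, g = l;
# A, c, l are assumed illustrative values
A = np.array([[0.5, 0.25],
              [0.5, 0.5]])
c = np.array([0.5, 1.0])
l = np.array([0.25, 0.0])
F = A - np.outer(l, c)
bound = general_bound(F, np.eye(2), l)   # coincides with Phi(l) = 2.0
```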

Concluding remarks
Theorems 3.1 and 3.2 provide simple, usable expressions for the minimum value of the $\ell_1$ sensitivity bound in Theorem 2.1, as well as a constructive procedure for computing an optimal observer gain. In Section 4, we characterized the interplay between the $\ell_1$ sensitivity bound and the rate of convergence for single-output positive LTI systems. We have provided several numerical examples to illustrate the results of the paper; Example 5.1 shows explicitly that the more general type of observer in [4] can offer significantly improved performance.
There are several interesting directions for future research. One is extending Theorem 3.2 to a more general class of multiple-output compartmental systems: specifically, can we relax the assumptions (F1) and (F2) of Section 3.2? The optimal gain matrix in Theorem 3.2 is nonnegative even though, as we have noted, this need not be the case for general positive systems. This suggests the problem of identifying matrix pairs $(A, C)$ with $p \ge 2$ for which the minimum value of $\Phi(L)$ for $L \in F_{A,C}$ occurs at a nonnegative matrix $L$. Another question is to consider arbitrary output matrices $C$ in the multi-output case and the corresponding optimization. Finally, determining the minimum value of the bound for the more general observer class in Section 5 presents a significant challenge.