Approximate Relational Hoare Logic for Continuous Random Samplings

Approximate relational Hoare logic (apRHL) is a logic for the formal verification of the differential privacy of databases written in the programming language pWHILE. Strictly speaking, however, this logic deals only with discrete random samplings. In this paper, we define a graded relational lifting of the subprobabilistic variant of the Giry monad, which describes differential privacy. We extend the logic apRHL with this graded lifting to deal with continuous random samplings, and we give a generic method for deriving proof rules of apRHL for continuous random samplings.


Introduction
Differential privacy is a definition of privacy of randomized databases proposed by Dwork, McSherry, Nissim and Smith [7]. A randomized database satisfies ε-differential privacy (is ε-differentially private) if, for any two adjacent data, the difference between their output probability distributions is bounded by the privacy strength ε. Differential privacy guarantees high secrecy against database attacks regardless of the attackers' background knowledge, and it satisfies composition laws, with which we can calculate the privacy strength of a composite database from the privacy strengths of its components.
Approximate relational Hoare logic (apRHL) [2,16] is a probabilistic variant of the relational Hoare logic [4] for the formal verification of the differential privacy of databases written in the programming language pWHILE. In apRHL, a parametric relational lifting, which relates probability distributions, plays a central role in describing differential privacy in the framework of verification. This parametric lifting is an extension of the relational lifting [10, Section 3] that captures probabilistic bisimilarity of Markov chains [13] (see also [6, Lemma 4]). The concept of differential privacy is described in the category of binary relations and mappings between them, and is verified by the logic apRHL.
Strictly speaking, however, apRHL deals only with random samplings from discrete distributions, while the algorithms in many actual studies of differential privacy are modelled with continuous distributions, such as the Laplace distributions over the real line. It is therefore desirable to extend apRHL to deal with continuous random samplings.

Contributions
The main contributions of this paper are the following two points:
• We define the graded relational lifting of the sub-Giry monad describing differential privacy for continuous random samplings. This graded relational lifting is developed without the witness distributions of probabilistic couplings, and hence is constructed in a different way from the coupling-based parametric lifting of relations given in the studies of apRHL [1,2,16].
• We extend the logic apRHL with this graded lifting to deal with continuous random samplings; we call the resulting logic the continuous apRHL. In the continuous apRHL, we mainly extend the proof rules for relation compositions and the frame rule. We also develop a generic method to construct proof rules for random samplings. By importing the new rules added to apRHL+ in [1], we give a formal proof of the differential privacy of the above-threshold algorithm for real-valued queries [8, Section 3.6].

Preliminaries
We denote by Meas the category of measurable spaces and measurable functions between them, and by Set the category of all sets and functions. The category Meas is complete and cocomplete, and the forgetful functor U : Meas → Set preserves products and coproducts. We also denote by $\omega\mathrm{CPO}_\bot$ the category of ω-complete partial orders with a least element and continuous functions.

A Category of Relations between Measurable Spaces
We introduce the category BRel(Meas) of binary relations between measurable spaces as follows:
• An object is a triple (X, Y, Φ) consisting of measurable spaces X and Y and a relation Φ between X and Y (i.e. Φ ⊆ UX × UY). We remark that Φ does not need to be a measurable subset of the product space X × Y.
When we write an object (X, Y, Φ) in BRel(Meas), we omit the underlying spaces X and Y if they are obvious from the context. We write p for the forgetful functor p : BRel(Meas) → Meas × Meas which extracts the underlying spaces: (X, Y, Φ) ↦ (X, Y). We call an endofunctor F on BRel(Meas) a relational lifting of an endofunctor E on Meas if (E × E)p = pF.

Sato
The Sub-Giry Monad
The Giry monad on Meas was introduced in [9] to give a categorical approach to probability theory; each arrow X → Y in the Kleisli category of the Giry monad bijectively corresponds to a probabilistic transition from X to Y, and the Chapman-Kolmogorov equation corresponds to the associativity law of the Giry monad.
We recall the sub-probabilistic variant of the Giry monad, which we call the sub-Giry monad (see also [17, Section 4]):
• For any measurable space $(X, \Sigma_X)$, the measurable space $(GX, \Sigma_{GX})$ is defined as follows: the underlying set GX is the set of subprobability measures over X, and the σ-algebra $\Sigma_{GX}$ is the coarsest one that makes the evaluation function $\mathrm{ev}_A : GX \to [0, 1]$ (mapping ν to ν(A)) measurable for each $A \in \Sigma_X$.
• For each f :
The monad G is commutative strong with respect to the cartesian product in Meas.
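As a rough illustration of how Kleisli composition encodes the Chapman-Kolmogorov equation, the following Python sketch works in the finite discrete case only, where a subprobability measure is a dictionary of point masses. The names `unit`, `bind`, `compose` and the transition `step` are hypothetical and chosen for this example; the measure-theoretic structure of G itself is not modelled.

```python
from typing import Callable, Dict, Hashable

SubDist = Dict[Hashable, float]          # finite subprobability measure (total mass <= 1)
Kernel = Callable[[Hashable], SubDist]   # Kleisli arrow X -> G(Y) in the finite case


def unit(x: Hashable) -> SubDist:
    """Dirac measure at x, playing the role of the monad unit."""
    return {x: 1.0}


def bind(nu: SubDist, k: Kernel) -> SubDist:
    """Integrate the kernel k against nu: (nu >>= k)(z) = sum_y nu(y) * k(y)(z)."""
    out: SubDist = {}
    for y, p in nu.items():
        for z, q in k(y).items():
            out[z] = out.get(z, 0.0) + p * q
    return out


def compose(g: Kernel, f: Kernel) -> Kernel:
    """Kleisli composition 'g after f' (finite Chapman-Kolmogorov equation)."""
    return lambda x: bind(f(x), g)


# Hypothetical transition on states {0, 1}; from state 0 some mass is lost,
# which is what distinguishes subprobability from probability measures.
def step(s: Hashable) -> SubDist:
    return {0: 0.5, 1: 0.25} if s == 0 else {1: 1.0}


two_steps = compose(step, step)
```

Associativity of `compose` is the finite counterpart of the associativity law mentioned above.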
The Kleisli category $\mathrm{Meas}_G$ is often called the category SRel of stochastic relations [17, Section 3]. The category SRel is $\omega\mathrm{CPO}_\bot$-enriched (with respect to the cartesian monoidal structure) with the following pointwise order: $f \sqsubseteq g$ if and only if $f(x)(B) \le g(x)(B)$ for all $x \in X$ and $B \in \Sigma_Y$. The least upper bound $\sup_{n \in \mathbb{N}} f_n$ of any ω-chain $f_0 \sqsubseteq f_1 \sqsubseteq \cdots \sqsubseteq f_n \sqsubseteq \cdots$ is given by $(\sup_n f_n)(x)(B) = \sup_n (f_n(x)(B))$. The least element of each SRel(X, Y) (written $\bot_{X,Y}$) is the constant function returning the null measure over Y. The continuity of composition is obtained from the following two facts:
• From the definition of the Lebesgue integral, for any ω-chain $\{\nu_n\}_n$ of subprobability measures over X, $\int_X f \, d(\sup_n \nu_n) = \sup_n \int_X f \, d\nu_n$ holds.
• From the monotone convergence theorem, we have $\int_X \sup_n f_n \, d\nu = \sup_n \int_X f_n \, d\nu$.
This enrichment is equivalent to the partially additive structure on SRel [17, Section 5]: for any ω-chain $\{f_n\}_{n \in \mathbb{N}}$ of $f_n : X \to Y$ in SRel, we have the summable sequence $\{g_n\}_n$ where $g_0 = f_0$ and $g_{n+1} = f_{n+1} - f_n$. Conversely, for any summable sequence $\{g_n\}_{n \in \mathbb{N}}$, the functions $f_n = \sum_{k=0}^{n} g_k$ form an ω-chain.
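The correspondence between ω-chains and summable sequences can be checked on numbers. The sketch below fixes one point x and one measurable set B, so that each f_n(x)(B) is a single real number; the concrete values in `chain` are hypothetical and chosen to be dyadic so that floating-point arithmetic is exact.

```python
from itertools import accumulate

# Hypothetical values f_n(x)(B) of an omega-chain at one fixed x and one fixed B.
chain = [0.0, 0.5, 0.75, 0.875, 0.9375]


def differences(fs):
    """g_0 = f_0 and g_{n+1} = f_{n+1} - f_n."""
    return [fs[0]] + [b - a for a, b in zip(fs, fs[1:])]


def partial_sums(gs):
    """f_n = g_0 + ... + g_n."""
    return list(accumulate(gs))


g = differences(chain)
recovered = partial_sums(g)
```

Monotonicity of the chain is exactly non-negativity of every `g[n]`, and the partial sums recover the original chain.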

Differential privacy
Throughout this paper, we define the approximate differential privacy as follows:
What we modify from the original definition [8, Definition 2.4] is the domain and codomain of c; we replace the domain from N to R, and replace the codomain from a discrete probability space to $G(\mathbb{R}^n)$. We apply this definition to the interpretation of pWHILE programs. The input and output spaces can be other spaces: in Section 4 we consider the above-threshold algorithm Above, whose output space is Z. The above modification is essential in describing and verifying the differential privacy of this algorithm because it takes a sample from the Laplace distribution over the real line.
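To make the role of continuous outputs concrete, here is a small numerical sketch. It assumes the standard shape of approximate differential privacy, Pr[c(d) ∈ A] ≤ exp(ε) Pr[c(d′) ∈ A] + δ for adjacent d and d′, and checks it for a Laplace mechanism on a finite grid of intervals A only, so it is a sanity check rather than a proof. The function names and the grid are hypothetical.

```python
import math


def laplace_cdf(x, mu, b):
    """CDF of the Laplace distribution with location mu and scale b."""
    if x < mu:
        return 0.5 * math.exp((x - mu) / b)
    return 1.0 - 0.5 * math.exp(-(x - mu) / b)


def interval_prob(lo, hi, mu, b):
    """Probability that a Laplace(mu, b) sample falls in [lo, hi]."""
    return laplace_cdf(hi, mu, b) - laplace_cdf(lo, mu, b)


def dp_holds_on_intervals(mu1, mu2, b, eps, delta=0.0, grid=None):
    """Check both DP inequalities for every interval with endpoints on the grid."""
    grid = grid or [i / 4 for i in range(-40, 41)]
    bound = math.exp(eps)
    tol = 1e-12  # absorbs floating-point rounding when the bound is tight
    for i, lo in enumerate(grid):
        for hi in grid[i + 1:]:
            p = interval_prob(lo, hi, mu1, b)
            q = interval_prob(lo, hi, mu2, b)
            if p > bound * q + delta + tol or q > bound * p + delta + tol:
                return False
    return True
```

With query answers 0 and 1 (sensitivity 1), the check passes for scale b = 1/ε and fails when the scale is halved, which matches the usual calibration of the Laplace mechanism.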

A Graded Monad for Differential Privacy
The composition law of differential privacy plays a crucial role in the compositional verification of the differential privacy of database programs. Barthe, Köpf, Olmedo, and Zanella-Béguelin constructed a parametric relational lifting describing differential privacy, and developed a framework for compositional verification of differential privacy [2].
Following this relational approach, we construct the parametric relational lifting of the Giry monad to describe differential privacy for continuous random samplings. This lifting forms a graded monad on the category BRel(Meas) in the sense of [11]. The axioms of graded monads correspond to the (sequential) composition law of differential privacy. An M-graded monad $(\{T_e\}_{e \in M}, \eta, \mu^{e_1,e_2}, \sqsubseteq^{e_1,e_2})$ on C is called an M-graded lifting of a monad (T,

A Graded Relational Lifting of Giry Monad for Differential Privacy
Let M be the cartesian product of the monoids ([1, ∞), ×, 1) and ([0, ∞), +, 0), equipped with the product order of the numerical orders. For each (γ, δ) ∈ M, we define the following mapping of BRel(Meas)-objects by
Proof. Since the functor p is faithful, it suffices to show:
First, the following equation holds:
where ≤ is the numerical order relation on G1 ≃ [0, 1]. We omit the proof of this equation. It can be shown in the same way as [12, Theorem 12].
The M-graded lifting $\{G^{(\gamma,\delta)}\}_{(\gamma,\delta) \in M}$ describes only one side of the inequalities in the definition of differential privacy. By symmetrising this, we obtain the following M-graded lifting $\{G^{(\gamma,\delta)}\}_{(\gamma,\delta) \in M}$ exactly describing differential privacy for continuous probabilities:
In the original works [2,3] on apRHL, the following relational lifting $(-)^{\sharp(\gamma,\delta)}$ is introduced to describe differential privacy. This lifting relates two distributions if there are intermediate distributions $d_L$ and $d_R$, called witnesses, whose skew distance, defined by $\Delta_\gamma(d_L, d_R) = \sup_{C \subseteq X \times Y} \max\{ d_L(C) - \gamma\, d_R(C),\ d_R(C) - \gamma\, d_L(C) \}$, is less than or equal to δ. We denote by D the subdistribution monad over Set. Let Ψ be a relation between sets X and Y, and let $d_1 \in DX$ and $d_2 \in DY$ be two subdistributions. We define the relation $\Psi^{\sharp(\gamma,\delta)} \subseteq DX \times DY$ as follows:
Proposition 2.5 For any countable discrete spaces X and Y, and relation Ψ ⊆
We remark that GX = DX for any countable discrete space X. When X is not countable, we have the above results by embedding each d ∈ DX in the set DX′ of subprobability distributions over the countable subspace X′ = X ∩ supp(d).
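For finite discrete distributions the skew distance can be computed directly. The sketch below assumes the usual formulation in which the one-sided quantity sup_A (d1(A) − γ·d2(A)) is attained on the set of points where d1(x) > γ·d2(x), so it equals the sum of the positive parts; the function names are hypothetical, and both distributions are taken over one common finite set.

```python
def one_sided(d1, d2, gamma):
    """sup over subsets A of d1(A) - gamma * d2(A), for finite subdistributions."""
    support = set(d1) | set(d2)
    return sum(max(d1.get(x, 0.0) - gamma * d2.get(x, 0.0), 0.0) for x in support)


def skew_distance(d1, d2, gamma):
    """Symmetric skew distance: the larger of the two one-sided quantities."""
    return max(one_sided(d1, d2, gamma), one_sided(d2, d1, gamma))


# Hypothetical output distributions of a randomized-response style mechanism
# on two adjacent inputs.
d_a = {"yes": 0.75, "no": 0.25}
d_b = {"yes": 0.25, "no": 0.75}
```

Under this reading, `skew_distance(d_a, d_b, gamma) <= delta` says that the two outputs are related with grading (γ, δ); with γ = 1 the value is the total variation distance.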
is proved by the witnesses given by

The Continuous apRHL
We introduce a variant of the approximate probabilistic relational Hoare logic (apRHL) to deal with continuous random samplings. We name it the continuous apRHL.

The Language pWHILE
We recall and reformulate categorically the language pWHILE [2]. In this paper, we mainly refer to the categorical semantics of a probabilistic language given in [5,Section 2]. The language pWHILE is constructed in the standard way, hence we sometimes omit the details of its construction.

Syntax
We introduce the syntax of pWHILE by the following BNF:
Here, τ is a value type; x is a variable; p is an operation; d is a probabilistic operation; e is an expression; ν is a probabilistic expression; i is an imperative; c is a command (or program). We remark that constants are 0-ary operations.
We introduce the following syntactic sugar for simplicity:

Typing Rules
We introduce typing rules for the language pWHILE. A typing context is a finite set $\Gamma = \{x_1 : \tau_1, x_2 : \tau_2, \ldots, x_n : \tau_n\}$ of pairs of a variable and a value type such that each variable occurs only once in the context. We give the typing rules of pWHILE as follows:
$$\frac{\Gamma \vdash^t e_1 : \tau_1 \quad \cdots \quad \Gamma \vdash^t e_n : \tau_n \quad p : (\tau_1, \ldots, \tau_n) \to \tau}{\Gamma \vdash^t p(e_1, \ldots, e_n) : \tau}$$
Here, the type $(\tau_1, \ldots, \tau_n) \to \tau$ of each operation p and of each probabilistic operation d is assumed to be given in advance.
The sets of free variables of commands, expressions, and probabilistic expressions (denoted by FV(c), FV(e), and FV(ν)) are defined inductively in the standard way.

Denotational Semantics
We introduce a denotational semantics of pWHILE in Meas. We give the interpretations [[τ]] of the value types τ:
We interpret a typing context $\Gamma = \{x_1 : \tau_1, x_2 : \tau_2, \ldots, x_n : \tau_n\}$ as the product space $[[\tau_1]] \times \cdots \times [[\tau_n]]$.
The interpretations of expressions are defined inductively by:
The interpretations of commands are defined inductively by:
Here, $\Gamma = \{x_1 : \tau_1, \ldots, x_n : \tau_n\}$, $f_k = \pi_2$, and $f_l = \pi_l \circ \pi_2$ ($l \neq k$).
which is obtained from the distributivity of the category Meas.
We remark that, from the commutativity of the monad G, if Γ ⊢ x : τ and

Judgements of apRHL
A judgement of apRHL is $c_1 \sim_{\gamma,\delta} c_2 : \Psi \Rightarrow \Phi$, where $c_1$ and $c_2$ are commands, and Ψ and Φ are objects in BRel(Meas). We call the relations Ψ and Φ the precondition and the postcondition of the judgement, respectively. Inspired by the validity of asymmetric apRHL [2], we introduce the validity of judgements of apRHL.

Proof Rules
We mainly take the proof rules of apRHL from [2,16], but we modify the [comp] and [frame] rules to verify differential privacy for continuous random samplings.
The relational lifting $G^{(\gamma,\delta)}$ does not preserve every relation composition. However, it preserves the composition of relations if the relations are measurable, that is, if the images and inverse images of measurable sets along them are also measurable (see also [12, Section 3.3]). Generally speaking, it is difficult to check the measurability of relations, hence the continuous apRHL is weak at dealing with relation compositions. However, we have the following two special cases:
• The equality/diagonal relation on any space is a measurable relation.
• Any relation between discrete spaces is automatically a measurable relation.

Hence, the following [comp] rule is an extension of the original [comp] rule in [2]:
Φ and Φ′ are measurable relations
We define the [frame] rule with the construction Range(−):
] is countable discrete then the condition $(\nu_1, \nu_2) \in \mathrm{Range}(\Theta)$ is equivalent to $\mathrm{supp}(\nu_1) \times \mathrm{supp}(\nu_2) \subseteq \Theta$, and hence the above [frame] rule is an extension of the original [frame] rule in [2].
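In the countable discrete case, the side condition of the [frame] rule reduces to a check on supports, which is easy to express in code. The following sketch covers only that special case, with finite distributions represented as dictionaries and Θ as a set of pairs; `supp` and `in_range_discrete` are hypothetical names.

```python
def supp(nu):
    """Support of a finite subdistribution: the points carrying positive mass."""
    return {x for x, p in nu.items() if p > 0}


def in_range_discrete(nu1, nu2, theta):
    """Discrete reading of (nu1, nu2) in Range(theta): supp(nu1) x supp(nu2) is inside theta."""
    return all((x, y) in theta for x in supp(nu1) for y in supp(nu2))


# Hypothetical example: theta relates memory 0 on the left to memory "a" on the right.
theta = {(0, "a")}
left = {0: 0.5, 1: 0.0}   # point 1 carries no mass, so it is outside the support
right = {"a": 1.0}
```

Points with zero mass are ignored, which is why `left` above satisfies the condition even though the pair (1, "a") is not in Θ.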

Soundness
The

Mechanisms
In this part, we give a generic method to construct the rules for random samplings, and by instantiating the method we show the soundness of the proof rules in prior work: [Lap] for the Laplace mechanism [7], [Exp] for the exponential mechanism [14], [Gauss] for the Gaussian mechanism [8, Theorem 3.22, Theorem A.1], and [Cauchy] for the mechanism based on Cauchy distributions [15].
Let f : X × Y → R be a positive measurable function, and ν be a measure over Y. We define the following function $f_a : \Sigma_Y \to [0, 1]$ by

We remark that the function f(a, −) : Y → R is measurable. If the function is not 'almost everywhere zero' and is Lebesgue integrable, that is, $0 < \int_Y f(a, -) \, d\nu < \infty$, then $f_a(-)$ is a probability measure.
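The construction can be imitated numerically. The sketch below assumes that f_a(B) is the normalised integral of f(a, −) over B, that is, the integral over B divided by the integral over Y, which is consistent with the remark that f_a is a probability measure when the total integral is finite and non-zero. It takes ν to be Lebesgue measure on a bounded interval and uses a midpoint rule; `integrate`, `normalised_measure` and the Laplace-shaped kernel are hypothetical choices for illustration.

```python
import math


def integrate(g, lo, hi, n=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / n
    return h * sum(g(lo + (i + 0.5) * h) for i in range(n))


def normalised_measure(f, a, lo=-60.0, hi=60.0):
    """Return (b_lo, b_hi) -> f_a([b_lo, b_hi]), with nu = Lebesgue measure on [lo, hi]."""
    total = integrate(lambda y: f(a, y), lo, hi)
    if not (0.0 < total < math.inf):
        raise ValueError("f(a, -) must have a finite, non-zero integral")

    def f_a(b_lo, b_hi):
        return integrate(lambda y: f(a, y), max(b_lo, lo), min(b_hi, hi)) / total

    return f_a


sigma = 2.0


def laplace_kernel(a, y):
    """Unnormalised Laplace density centred at a with scale sigma."""
    return math.exp(-abs(y - a) / sigma)


f0 = normalised_measure(laplace_kernel, 0.0)
f1 = normalised_measure(laplace_kernel, 1.0)
```

For this kernel, comparing `f0` and `f1` on an interval gives a ratio of at most exp(1/sigma), up to numerical error, which is the kind of bound the proof rules for random samplings rely on.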
The following proposition, which is an extension of [2, Lemma 7], plays the central role in the construction of sound proof rules for random samplings.
Proposition 3.2 Let f : X × Y → R be a positive measurable function, and ν be a measure over Y. For all a, a′ ∈ X, γ, γ′ ≥ 1, δ ≥ 0, and $Z \in \Sigma_Y$ (window set), if the following three conditions hold, then for any $B \in \Sigma_Y$, we have $f_a(B) \le \gamma\gamma' f_{a'}(B) + \delta$.
From the [rand] rule, the following rule is proved:
We give the function f : R × R → R by $f(a, y) = \exp(-(y - a)^2 / (2\sigma^2))$, where $\sigma^2$ (with σ > 0) is the variance of the Gaussian mechanism. We introduce the probabilistic operation $\mathrm{Gauss}_\sigma$ : real → real with $[[\mathrm{Gauss}_\sigma]] = f_{(-)}$, whose continuity is easily proved.
From the [rand] rule, we obtain the following rule:

An Example: The Above Threshold Algorithm
Barthe, Gaboardi, Grégoire, Hsu, and Strub extended the logic apRHL to the logic apRHL+ with new proof rules to describe the sparse vector technique (see also [8, Section 3.6]). They gave a formal proof of the differential privacy of the above threshold algorithm in the arXiv preprint [1].
In this section, we demonstrate that the differential privacy of the above threshold algorithm with real-valued queries can be proved with almost the same proof as in [1]. The new proof rules of apRHL+ are still sound in the framework of the continuous apRHL.

We consider the following algorithm AboveT:
if T ≤ S ∧ r = |Q| + 1 then
6: r ← j;
7: j ← j + 1
We recall the setting of this algorithm. This algorithm has two fixed parameters: the threshold t : real and the set Q : queries of queries, where |Q| : int is the number of queries in Q. The input variable is d : data, and the output variable is r : int. We prepare the new value types queries and data with [[data]] = $\mathbb{R}^N$ and queries = int (alias), and the typings j : int, T : real, and S : real. We assume that an operation eval : (queries, int, data) → real is given for evaluating the i-th query in Q on the input d. We require [[eval]] to be 1-sensitive in the data d, that is,
The differential privacy of Above is characterised as follows:
The following rules in apRHL+ are sound in the framework of the continuous apRHL:
Hence we extend the continuous apRHL by adding these rules, and we can therefore construct, in the extended continuous apRHL, a formal proof that is almost the same as the one in [1].
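For readers who prefer executable code, here is a minimal Python sketch of an above-threshold loop of this shape. Only the conditional update of r and the increment of j are taken from the listing; the initialisation (j = 1, r = |Q| + 1), the loop running through all of Q, and the query-noise scale 4/ε are assumptions following the standard presentation of the sparse vector technique in [8, Section 3.6], while the threshold-noise scale 2/ε matches the value σ = 2/ε used in the formal proof. Queries are modelled as hypothetical Python callables standing in for eval.

```python
import random


def laplace(rng, mu, scale):
    """Laplace(mu, scale) sample, as a difference of two exponential samples."""
    rate = 1.0 / scale
    return mu + rng.expovariate(rate) - rng.expovariate(rate)


def above_t(queries, d, t, eps, rng):
    """Return the 1-based index of the first query whose noisy answer reaches the
    noisy threshold, or len(queries) + 1 if there is none."""
    n = len(queries)
    j = 1                             # assumed initialisation
    r = n + 1                         # r = |Q| + 1 encodes "nothing found yet"
    T = laplace(rng, t, 2.0 / eps)    # noisy threshold, scale 2/eps
    while j <= n:                     # assumed loop bound: visit every query
        S = laplace(rng, queries[j - 1](d), 4.0 / eps)  # assumed scale 4/eps
        if T <= S and r == n + 1:
            r = j
        j += 1
    return r
```

Passing an explicit `random.Random` instance keeps runs reproducible; the sketch says nothing about the privacy proof itself, which is the subject of the judgements in this section.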
The soundness of the rule [Forall-Eq] is proved from the following lemma:

Formal Proof
We now demonstrate that the (ε, 0)-differential privacy of algorithm AboveT is proved with almost the same proof as in [1].
From the [Forall-Eq] rule with variable r, it suffices to prove, for every integer i,
We denote by $c_0$ the sub-command consisting of the initialization line 2 of AboveT. From the rules [assn], [LapGen] (with r = r′ = 1 and σ = 2/ε), [seq], and [frame], we obtain
where
We denote by $c_1$ and $c_2$ the main loop and the body of the main loop, respectively (i.e. $c_1$ = while (j < |Q|) do $c_2$). We aim to prove the following judgement by using the [while] rule:
To prove this, it suffices to show the following cases for the loop body $c_2$:
Here, we use the following loop invariant:
The judgement in the case (i) is proved from the rules
$(\|d_1 - d_2\|_1 \le 1 \wedge T_1 + 1 = T_2) \Rightarrow (S_1 + 1 = S_2 \wedge T_1 + 1 = T_2)$.
The case (iii) is proved in a similar way to (i).
This appendix will be deleted from the final version of this paper.

A Appendix
We show some omitted proofs in this paper.
A.1 Proofs in Section 1.2
Proposition A.1 The composition of the category SRel = $\mathrm{Meas}_G$ is continuous with respect to the ordering ⊑.
Proof. Consider a measurable function h : Y → GZ and an ω-chain $\{f_n : X \to GY\}_n$ with respect to ⊑. We fix x ∈ X. Since the ω-chain of measures $\{f_n(x)\}_n$ is bounded, it converges strongly to $(\sup_n f_n)(x)$. This implies that, from the definition of the Lebesgue integral, for any $C \in \Sigma_Z$ and x ∈ X, we obtain
is a measurable function X → GY.
Proof. For each x ∈ X, the finiteness of the measures $f_1(x)$ and $f_2(x)$ implies the countable additivity of $(f_1 - f_2)(x)$ as follows:
where $\bigcup_n B_n$ is the union of a countable disjoint collection $B_0, B_1, \ldots$. Therefore $f_1 - f_2$ is at least a function of the form X → GY. The σ-algebra of GY is generated by the following countable collection:
Since
is measurable for all $A \in \Sigma_Y$ and $\alpha \in [0, 1] \cap \mathbb{Q}$ (i = 1, 2). We then calculate
Hence, the function $f_1 - f_2$ is measurable. ✷

A.2 Proofs in Section 2.2
We recall the definition of the indicator function χ A : X → [0, 1] of a subset A ⊆ X: