Quantitative Logics for Equivalence of Effectful Programs

In order to reason about effects, we can define quantitative formulas to describe behavioural aspects of effectful programs. These formulas can for example express probabilities that (or sets of correct starting states for which) a program satisfies a property. Fundamental to this approach is the notion of quantitative modality, which is used to lift a property on values to a property on computations. Taking all formulas together, we say that two terms are equivalent if they satisfy all formulas to the same quantitative degree. Under sufficient conditions on the quantitative modalities, this equivalence is equal to a notion of Abramsky's applicative bisimilarity, and is moreover a congruence. We investigate these results in the context of Levy's call-by-push-value with general recursion and algebraic effects. In particular, the results apply to (combinations of) nondeterministic choice, probabilistic choice, and global store.


Introduction
There are many notions of program equivalence for languages with effects. In this paper, we explore the notion of behavioural equivalence, which states that programs may be considered behaviourally equivalent if they satisfy the same behavioural properties. This can be made rigorous by defining a logic, where each formula φ denotes a certain behavioural property. We write (P ... |= φ) to express the satisfaction of formula φ by term P ... , which is usually given by a Boolean truth value (true or false). Two terms P ... and R ... are said to be behaviourally equivalent if they satisfy the same formulas. Such an approach is taken in for example [9].
In particular, we use this method to define equivalence for a language with algebraic effects in the sense of Plotkin and Power [26]. Effects can be seen as aspects of computation which involve interaction with the world 'outside' the environment in which the program runs. They include exceptions, nondeterminism, probabilistic choice, global store, input/output, cost, etc. The examples given have common ground in the work of Moggi [21], and can moreover be expressed by specific effect-triggering operations, making them 'algebraic' in nature.
In the presence of such algebraic effects, computation terms need not simply reduce to a single terminal term (that is, a value); they may also invoke effects on the way. Following [26,13], we consider a computation term to evaluate to an effect tree, whose nodes are effect operators and whose leaves are terminal terms. The paper [28] introduced modalities that lift Boolean properties of values to Boolean properties of the trees modelling their computations. See [23,22,27] for alternative ways in which logics can be used to describe properties of effects.
The use of a Boolean logic does not, however, readily adapt to several examples of effects, for example the combination of probability and nondeterminism. The literature on compositional program verification shows the usefulness of quantitative (e.g. real-number valued) program logics for verifying programs with probabilistic behaviour, possibly in combination with nondeterminism [14,20]. The paper [28] develops a general Boolean-valued framework which, although featuring many examples, does not apply to the combination of probability and nondeterminism.
This paper provides a general framework for quantitative logics for expressing behavioural properties of programs with effects, generalising the Boolean-valued framework from [28]. We consider a quantitative (quantity-valued) satisfaction relation '|=', where (P ... |= φ) is given by an element from a quantitative truth space A (a degree of satisfaction). This allows us to ask open questions about programs, like "What is the probability that ..." or "What are the correct global starting states for ...". We define equivalence by stating that programs P ... and R ... are equivalent if for any formula φ we have (P ... |= φ) = (R ... |= φ) (P ... satisfies φ precisely as much as R ... does). A key feature of the logic is the use of quantitative modalities to lift quantitative properties on value types to quantitative properties on computation types.
As in [28], we are able to establish that the behavioural equivalence defined as above is a congruence, as long as suitable properties on the quantitative modalities are satisfied. These properties require notions of monotonicity, continuity, and a notion of preservation over sequencing called decomposability. As in [28], the congruence is established by proving that given one of the properties (leaf-monotonicity), our behavioural equivalence is equal to an effect-sensitive notion of Abramsky's applicative bisimilarity [1,3]. Given further properties on the modalities, this relation can be proven to be compatible using Howe's method [11].
The main contribution of this paper is the generalisation of [28], and the corresponding generalised results. This goes through smoothly, though there are some subtleties, such as what to take as primitive in a quantitative setting. In particular, we will see the necessity of a threshold operation. The other main contributions are the examples illustrating the quantitative approach. Some examples, such as the combination of nondeterminism with probabilistic choice or with global store, do not fit into the Boolean-valued framework of [28], but do work here. There are also examples, such as probability, global store, and cost, whose treatment is more natural in our quantitative setting, even though they also fit in the framework of [28].
As a vehicle of our investigation we use Levy's call-by-push-value (CBPV) [17,16], together with general recursion and the aforementioned algebraic effects. As such, it generalises [28] in a second way by using call-by-push-value to incorporate both call-by-name (CBN) and call-by-value (CBV) evaluation strategies. This is significant, since once either divergence or effects are present, the distinction between the reduction strategies becomes vital. For example, if we take some probabilistic choice por signifying a fair coin flip, we have that 'por(λx . 0, λx . 1) ≡ λx . por(0, 1)' holds in CBN, but not in CBV. So it is interesting to consider CBPV, as it expresses both these behaviours. The distinction is expressed in the difference between production-types FA, where one explicitly observes effects, and types like A → C, where the observation of effects is postponed to a later moment. As such, this language is an ideal backdrop for studying effects.
In Section 2 we give the operational semantics of the language, starting with the effect-free version and working towards our treatment of algebraic effects. In Section 3 we present our quantitative logic, introducing quantitative modalities to deal with the observation of effects. In Section 4 we look at the resulting behavioural equivalence and the properties that establish the congruence property (or compatibility in its technical form). In Section 5 we relate this equivalence to applicative (bi)similarities by defining a relator using our modalities. This then allows us to adapt a Howe's method proof of compatibility from [3,28] for this equivalence. We finish in Section 6 with some discussions.

Operational semantics
We use a simply-typed call-by-push-value functional language as in [16,17], together with general recursion and a ground type for natural numbers, making it a call-by-push-value variant of PCF [24]. To this, we add algebraic-effect-triggering operators in the sense of Plotkin and Power [26]. We first focus on the effect-free part of the language, as we want to consider effects independently of the underlying language.

The language
We give a brief overview of the language and its semantics. The types are divided into two flavours, Value types and Computation types. Value types contain value terms that are passive: they don't compute anything on their own. Computation types contain computation terms which are active, which means they either return something to or ask something of the environment.
Value types A, B and computation types C, D are given by: where I is any finite indexing set. By asserting finiteness of I in the case of product types, the number of program terms is kept countable (a property which will have benefits later on in the formulation of the logic).
The type U C is a thunk type, which consists of terms which are frozen. These terms were initially computation terms but are made inactive by packaging them into a thunk. The type N is the type of natural numbers, containing the non-negative integers. With this type, we can program any computable function on the natural numbers as in PCF [24]. The type FA is a producer type, which actively evaluates and returns values of type A to the current environment. As was stated, this is the type at which we can observe effects. The type A → C is a type of functions, which is a computation type since its terms are actively awaiting input.
We have a countably-infinite collection of term variables x, and term contexts: Γ ::= ∅ | Γ, x : A. Note that contexts only contain Value types, meaning that like in call-by-value, we can only ever substitute value terms. This is no loss of generality, as we can simulate substituting computation terms by packaging them into a thunk. The terms of the language are as follows: Value terms: V, W :: Computation terms: M , N : We underline terms (M ) and types (C) when they are computation terms and computation types respectively. We will also use E ... , F ... and P ... , R ... to denote general types and their terms, e.g. they could be either value or computation types/terms. Following [16], their typing rules are given in Fig. 1. We distinguish two typing judgements, ⊢ v and ⊢ c , for value and computation terms respectively. We write Terms(E ... ) for the set of closed terms of type E ... . Note the addition of the fixpoint operator fix(−) which has been added to allow for general recursion and hence divergence. We write n : N for the numeral representing the n-th natural number.

Semantics
We give the semantics of this language by specifying a reduction strategy for computation terms in the style of a CK-machine [5]. We distinguish a special class of computation terms, called terminal terms, which will not reduce further. They consist of: return(V ) : FA, λx : A . M : A → C, and ⟨M i | i ∈ I⟩ : Π i∈I C i .
We first give the rules for terms we can directly reduce. We denote these using relation symbol : The behaviour of the other non-terminal computation terms; M to x . N , M · V and M · i, is implemented using a system of stacks defined recursively: S, Z : We write S{M } for the computation resulting from applying S to M , which can be seen as evaluating the program M within the environment S.
Whenever one encounters a computation of which one needs to first evaluate a subterm, one unfolds the continuation into the stack and focusses on evaluating that subterm. This method is given by the stack reduction relation in the following way:

Adding algebraic effect operators
We add algebraic effects in the style of [13], given by specific effect operators. We use a type variable α for computation types. Effects are given by operators of the following arities (like in [13,25]): For each effect under consideration, we bundle together the effect operators pertaining to that effect in a set called an effect signature Σ. Given such a signature, new computation terms can be constructed according to the typing rules in Fig. 2.
Example 3 (Probabilistic choice and global store). We will also consider the combination of the previous two examples, probabilistic choice with global store, given by effect signature Σ pg := Σ p ∪ Σ g .
Example 4 (Cost). If we want to keep track of costs of an evaluation, we take the signature Σ c := {cost c : α → α | c ∈ C}, where we have a countable set of real-valued costs C. The computation cost c (M ) assigns a cost of c to the evaluation of M . This cost can represent a time delay or some other resource.
Example 5 (Combinations with nondeterminism). We consider a binary operator nor : α 2 → α for nondeterministic choice, which contrary to probabilistic choice is entirely unpredictable. One interpretation is to consider it under the control of some external agent or scheduler (e.g. a compiler), which one may wish to model as being cooperative (angelic), antagonistic (demonic), or neutral. We will consider nondeterminism and its operator in combination with any one of the previous three examples. The resulting signatures are named Σ pn , Σ gn , Σ gpn , and Σ cn respectively.
Example 6 (Combinations with error). Lastly, given some set of error messages E, we consider adding error raising effect operations {raise e : α 0 → α | e ∈ E} to the language, where raise e () stops the evaluation of a term, and displays message e. There is no continuation possible afterwards.
In the presence of such effects, the evaluation of a computation term might halt when encountering an algebraic effect operator. We broaden the semantics, where a computation term now evaluates to an effect tree, a coinductively generated term using operations from our effect signature Σ together with terminal terms and a symbol for divergence ⊥. This idea appears in [26], but here we adapt the formulation from [13] to call-by-push-value.
We define the notion of an effect tree over any set X, where X can be thought of as a set of terminal terms.
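Definition 2.1. An effect tree (henceforth tree) over a set X, determined by a signature Σ of effect operations, is a labelled and possibly infinite depth tree whose nodes have the following possible forms:
1. A leaf node labelled ⊥ (the symbol for divergence).
2. A leaf node labelled x where x ∈ X.
3. A node labelled with an effect operator from Σ, with one child tree for each argument of the operator.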
The set of trees over set X and signature Σ is denoted T Σ (X). We can equip this set with a partial order ≤, where t ≤ r if t can be constructed from r by pruning (possibly infinitely many) subtrees and labelling the pruning points with ⊥. Moreover, this order is ω-complete, so each ascending chain of trees t 0 ≤ t 1 ≤ . . . has a least upper bound ⊔ n t n .
For any x ∈ X, we denote η(x) ∈ T Σ (X) for the tree which only consists of one leaf labelled x. We also have a map µ : T Σ (T Σ (X)) → T Σ (X) which flattens a tree of trees into one tree, by transforming the leaves (which are trees) into subtrees.
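As a rough illustration of the unit η and the flattening map µ, the following sketch represents finite effect trees as small Python dataclasses. The class names `Bot`, `Leaf`, `Op` and the operator label `"por"` are hypothetical choices made for this example only; infinite trees and infinite-arity operators are not covered.

```python
from dataclasses import dataclass
from typing import Any, Tuple


@dataclass(frozen=True)
class Bot:
    """Leaf labelled with the divergence symbol."""


@dataclass(frozen=True)
class Leaf:
    """Leaf labelled with an element x of X."""
    value: Any


@dataclass(frozen=True)
class Op:
    """Node labelled with an effect operator, with one child per argument."""
    name: str
    children: Tuple[Any, ...]


def eta(x):
    """Unit: the tree consisting of a single leaf labelled x."""
    return Leaf(x)


def mu(tt):
    """Flatten a tree whose leaves are trees into a single tree."""
    if isinstance(tt, Bot):
        return tt
    if isinstance(tt, Leaf):
        return tt.value  # the leaf label is itself a tree
    return Op(tt.name, tuple(mu(child) for child in tt.children))


inner = Op("por", (Leaf(1), Bot()))
double = Op("por", (Leaf(Leaf(0)), Leaf(inner)))
flat = mu(double)
```

Here `flat` is the tree in which the leaf holding `inner` has been replaced by `inner` itself, and `mu(eta(inner))` gives back `inner`.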
For each computation type C we define the evaluation map | − | : Terms(C) → T Σ (Terms(C)), which returns a tree whose leaves are either labelled with ⊥ or labelled with a terminal term of type C. We define this inductively by constructing for each n ∈ N the n-th approximation of the tree. Using this, we define |M | := ⊔ n |ε, M | n . We view |M | as an operational semantics of M in which M is reduced to its (possibly) observable computational behaviours, namely the tree of effect operations potentially performed in the evaluation of M . See Figure 3 for two examples of effect trees.
These trees are still quite syntactic, and may contain lots of unobservable information irrelevant to the real-world behaviour of programs. In the next section, we will set up the quantitative logic which will extract from such trees only the relevant information, using quantitative modalities.

Quantitative Logic
We define a quantitative logic expressing behavioural properties of terms. Each type has a set of formulas, which can be satisfied by terms of that type to varying degrees of satisfaction. These degrees of satisfaction are given by truth values from a complete lattice.
A complete lattice is a set A with a partial order ⊴, where for each subset X ⊆ A there is a least upper bound sup(X) and a greatest lower bound inf(X). In particular, we define T := sup(A) = inf(∅) as the completely true value, and F := inf(A) = sup(∅) as the completely false value.
We also equip this space with a notion of negation or involution, which is a bijective map ¬ : A → A such that ∀a ∈ A, ¬(¬a) = a and ∀a, b ∈ A, (a ⊴ b) ⇔ (¬b ⊴ ¬a). We will use the words involution and negation interchangeably. Given the conditions of an involution, it holds that ¬T = F and ¬F = T . Examples of complete lattices with involution/negation used in this paper are:
3. The powerset P(X) over some set X, whose order is given by inclusion ⊆, so T = X and F = ∅. Negation is given by the complement, where ¬A := X − A.
4. For A a complete lattice and X a set, the function space A X with point-wise order is a complete lattice.
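The involution laws can be checked mechanically on a small instance of the powerset example. The sketch below assumes a three-element carrier set chosen only for illustration, and uses Python's `frozenset` with `<=` as inclusion; the helper names are hypothetical.

```python
from itertools import combinations


def powerset(xs):
    """All subsets of xs, as frozensets."""
    xs = list(xs)
    return [frozenset(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]


X = frozenset({0, 1, 2})   # illustrative carrier set
A = powerset(X)            # truth space P(X), ordered by inclusion


def neg(a):
    """Negation as complement relative to X."""
    return X - a


# Law 1: negation is an involution.
involutive = all(neg(neg(a)) == a for a in A)
# Law 2: a below b exactly when not-b is below not-a.
order_reversing = all((a <= b) == (neg(b) <= neg(a)) for a in A for b in A)

top, bottom = X, frozenset()
```

Both flags come out true for this instance, and negation swaps `top` and `bottom`, matching ¬T = F and ¬F = T.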
We construct a logic for our language in order to define a behavioural preorder. For each type E ... , value or computation, we have a set of formulas Form(E ... ). Greek letters φ, ψ, . . . are used for formulas over value types, underlined Greek letters φ, ψ, . . . for formulas over computation types, and underdotted Greek letters φ ... , ψ ... , . . . for formulas over any type. We are aiming to define a quantitative relation (P ... |= φ ... ) to denote the element of A which describes the degree to which the term P ... satisfies the formula φ ... (e.g. this may describe the probability of satisfaction or the amount of time needed for satisfaction). We choose the formulas according to the following two design criteria, as in [28].
Firstly, we design our logic to only contain behaviourally meaningful formulas. This means we only want to test properties that are in some sense observable by users and/or other programs. For example, for the natural numbers type N we have a formula {n} which checks whether a term is equal to the numeral n. For function types we have formulas of the form V → φ which tests a program on a specific input V , and checks how much the resulting term satisfies φ.
Secondly, we desire our logic to be as expressive as possible. To this end, we add countable disjunction (suprema) and conjunction (infima) over formulas, together with negation ¬. Moreover, we add two natural quantity-specific primitives: a threshold operation and constants. Both such operations are used frequently (albeit implicitly) in practical examples of quantitative verification, e.g. in [20].

Quantitative modalities
Fundamental to the design of the logic is how we interpret algebraic effects. In CBPV, effects are observed in producer types FA. In order to formulate observable properties of FA-terms in our logic, we include a set of quantitative modalities which lift formulas on A to formulas on FA. We bundle our selection of quantitative modalities together in a set Q.
Each modality q ∈ Q denotes a function ⟦q⟧ : T_Σ(A) → A, which is used to translate a tree of truth values into a single truth value. Given a quantitative predicate Θ : X → A on a set X, we can use a modality q to lift it to a quantitative predicate q(Θ) : T_Σ(X) → A. In the examples, we will define the denotation of a modality q by giving for each n ∈ N an approximation ⟦q⟧_n. These will follow the rules ⟦q⟧_0(t) = F, ⟦q⟧_n(⊥) = F, and ⟦q⟧_{n+1}(η(a)) = a, together with effect-specific rules given in the examples below. Given these approximations, the denotation ⟦q⟧(t) is given by sup{⟦q⟧_n(t) | n ∈ N}.
Example 1 (Probabilistic choice). We use as quantitative truth space the real number interval A := [0, 1] with ⊴ := ≤, whose elements denote probabilities. We take a single modality E for expectation, where (M |= E(Θ)) gives the expected value of (V |= Θ) given the probability distribution M induces on its return values V . This is achieved by giving E the denotation ⟦E⟧ : T_{Σp}(A) → A which sends a tree of real numbers to the expected real number, where the approximation of the denotation is given by: ⟦E⟧_{n+1}(p-or(t, r)) = (⟦E⟧_n(t) + ⟦E⟧_n(r))/2.
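The approximation rules for the expectation modality can be sketched directly on finite trees. The tuple encoding below (`("bot",)`, `("leaf", a)`, `("por", t, r)`) and the function name `expect` are assumptions made for this illustration; the rules themselves are the general ones (value F = 0 at depth 0 and at ⊥, the label at a leaf) plus the averaging rule for p-or.

```python
def expect(n, t):
    """n-th approximation of the expectation modality on a tree of reals in [0, 1]."""
    if n == 0:
        return 0.0                      # depth 0 always gives F = 0
    kind = t[0]
    if kind == "bot":
        return 0.0                      # divergence gives F = 0
    if kind == "leaf":
        return t[1]                     # a leaf gives its label
    if kind == "por":
        # fair probabilistic choice: average of the two subtrees, one level down
        return (expect(n - 1, t[1]) + expect(n - 1, t[2])) / 2
    raise ValueError(f"unknown node kind: {kind}")


# p-or(1, p-or(1/2, bottom))
tree = ("por", ("leaf", 1.0), ("por", ("leaf", 0.5), ("bot",)))
approximations = [expect(n, tree) for n in range(5)]
```

The approximations increase with n and stabilise at 0.625 once n is large enough to reach every leaf, so for this finite tree the supremum is reached at a finite stage.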
Example 2 (Global store). Given a set of locations L, we have a set of states S := L → N. Our set of truth values is given by the powerset A := P(S) with ⊴ := ⊆. We have a single modality G, where (M |= G(Θ)) gives the set of starting states for which M terminates with a value V such that the end state is contained in (V |= Θ). We define this formally with the following rules: ⟦G⟧_{n+1}(lookup_l(t_0, t_1, . . . )) = {s ∈ S | s ∈ ⟦G⟧_{max(0,n−s(l))}(t_{s(l)})} and ⟦G⟧_{n+1}(update_{l,m}(t)) = {s | s[l := m] ∈ ⟦G⟧_n(t)}.
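Since the set of states is infinite, a sketch of the global store modality is easier to write as a membership test than as an explicit set. The code below assumes states are Python dictionaries from location names to naturals, leaves carry a predicate on states, and the children of a lookup node are given by a function on naturals; all of these encodings and the name `in_G` are hypothetical.

```python
def in_G(n, t, s):
    """Decide whether state s lies in the n-th approximation of G applied to tree t."""
    if n == 0:
        return False                    # depth 0 gives F, the empty set
    kind = t[0]
    if kind == "bot":
        return False                    # divergence gives the empty set
    if kind == "leaf":
        return t[1](s)                  # leaf label is a set of states, as a predicate
    if kind == "lookup":
        _, loc, child = t
        # rule for depth n = k + 1: continue in child s(loc) at depth max(0, k - s(loc))
        return in_G(max(0, n - 1 - s[loc]), child(s[loc]), s)
    if kind == "update":
        _, loc, m, sub = t
        # rule for depth n = k + 1: continue at depth k in the updated state
        return in_G(n - 1, sub, {**s, loc: m})
    raise ValueError(f"unknown node kind: {kind}")


def x_is_3(s):
    return s["x"] == 3


# Tree of a program that reads x, writes x + 1, and returns a value satisfied on states with x = 3.
increment = ("lookup", "x", lambda k: ("update", "x", k + 1, ("leaf", x_is_3)))
```

Starting from a state with x = 2 the membership test succeeds once the depth is at least 5, while a state with x = 0 is never accepted.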
Example 3 (Probabilistic choice and global store). For this combination of effects, we take as truth space the functions A := [0, 1]^S with point-wise order, where S is the set of global states and [0, 1] the lattice of probabilities with standard order. Intuitively, this space assigns to each starting state a probability that a property is satisfied. We define a single modality EG which, for each state s ∈ S, is given by the following rules: ⟦EG⟧_{n+1}(p-or(t, r))(s) := (⟦EG⟧_n(t)(s) + ⟦EG⟧_n(r)(s))/2, ⟦EG⟧_{n+1}(lookup_l(t_0, t_1, . . . ))(s) = ⟦EG⟧_{max(0,n−s(l))}(t_{s(l)})(s), and ⟦EG⟧_{n+1}(update_{l,m}(t))(s) = ⟦EG⟧_n(t)(s[l := m]).

Example 4 (Cost). We use the infinite real number interval A := [0, ∞] with ⊴ := ≥, denoting an abstract notion of cost (e.g. time). Trees are just branches in this example. We have a single modality C, where (M |= C(Θ)) is the cost it takes for M to evaluate plus the cost given by Θ for the value it returns. Note that for any tree t that is either infinite or has leaf ⊥, we have ⟦C⟧(t) = ∞. This reflects the idea that a diverging computation will exceed any possible finite cost.
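A small sketch can make the reversed order concrete. It assumes, as a natural reading of the description above rather than a rule quoted from it, that a node cost_c(t) adds c to the cost of its subtree; the tuple encoding and function names are likewise hypothetical. Because the order on costs is ≥, the false value F is ∞ and the supremum of the approximations is their numerical minimum.

```python
import math


def cost(n, t):
    """n-th approximation of the cost modality (assumed rule: cost_c(t) adds c)."""
    if n == 0:
        return math.inf                 # F is infinity under the reversed order
    kind = t[0]
    if kind == "bot":
        return math.inf                 # divergence exceeds any finite cost
    if kind == "leaf":
        return t[1]                     # cost assigned to the returned value
    if kind == "cost":
        _, c, sub = t
        return c + cost(n - 1, sub)
    raise ValueError(f"unknown node kind: {kind}")


def cost_up_to(t, depth):
    """Supremum, in the reversed order, of the approximations up to the given depth."""
    return min(cost(n, t) for n in range(depth + 1))


terminating = ("cost", 2.0, ("cost", 3.0, ("leaf", 1.5)))
diverging = ("cost", 1.0, ("bot",))
```

For `terminating` the approximations stay at ∞ until depth 3 and then settle at 6.5, while `diverging` stays at ∞ at every depth.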

Example 5 (Combinations with nondeterminism). To add nondeterminism to any of the previous examples, we keep their truth space and extend the definition of their modality q ∈ {E, G, EG, C} in two ways, creating an optimistic modality q ♦ and a pessimistic modality q □. For the combination with probability, we can see the nondeterministic choice as being controlled by some external agent, which chooses a strategy for resolving the nondeterministic choice nodes, like in a Markov decision process. E ♦ finds the optimal strategy to get the best expectation, whereas E □ finds the worst strategy. Similarly, C ♦ will search for the minimum possible execution cost, while C □ will look for the maximum cost.
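The optimistic and pessimistic readings of expectation can be sketched on finite trees. The sketch assumes that at a nondeterministic node the optimistic modality takes the larger and the pessimistic modality the smaller of the two subtree values, which matches the best and worst strategy description but is not quoted from the defining equations; the encoding and names are hypothetical.

```python
def expect_nd(t, resolve):
    """Expectation on a finite tree, resolving nor nodes with the given function."""
    kind = t[0]
    if kind == "bot":
        return 0.0
    if kind == "leaf":
        return t[1]
    if kind == "por":
        return (expect_nd(t[1], resolve) + expect_nd(t[2], resolve)) / 2
    if kind == "nor":
        # assumption: the scheduler picks one branch, best or worst case
        return resolve(expect_nd(t[1], resolve), expect_nd(t[2], resolve))
    raise ValueError(f"unknown node kind: {kind}")


def optimistic(t):
    return expect_nd(t, max)


def pessimistic(t):
    return expect_nd(t, min)


# p-or(nor(1/4, 3/4), 1)
tree = ("por", ("nor", ("leaf", 0.25), ("leaf", 0.75)), ("leaf", 1.0))
```

On this tree a helpful scheduler yields 0.875 and an antagonistic one yields 0.625.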
For instance, if the denotation |M | of a term M of type FN is given by the first tree in Fig. 3, then
Example 6 (Combinations with error). There are two ways of defining combinations with error messages, akin to the sum and tensor approach of combining effects from e.g. [12]. Let Σ, A and Q be the signature, truth space, and quantitative modalities of the effects to which we want to add error messages from a set E. Given a modality q ∈ Q and some function f : E → A, assigning to each message a value, we define a new modality q_f which, besides inheriting the rules from q, follows the rule ⟦q_f⟧_{n+1}(raise_e) = f(e). We define two new sets of modalities for this combination, giving a different interpretation of error.
E.g. in the presence of global store (Example 2), the modalities from Q + are not able to observe the final global state when an error message has been raised, whereas some modalities from Q × can. For instance, for e ∈ E and f : E → A such that f (e) := {s[l := 1] | s ∈ S}, it holds that G f is in Q × but not in Q + . Moreover, (update l (1, raise e ()) |= G f (⊤)) = T whereas (update l (0, raise e ()) |= G f (⊤)) = F. Those two terms are however not distinguishable by any modality from Q + .
All the Boolean-valued examples of modalities for effects in [28], can also be accommodated in our quantitative setting by taking A := {T , F}. These include for instance Input/Output.

Formulation of the logic
We write Form(E ... ) for the set of formulas over type E ... , which is defined by induction on the structure of E ... . Fig.  4 gives the inductive rules for generating these formulas. We have modality formulas q(φ), constant formulas κ a , and step formulas φ ... a . Note that conjunctions and disjunctions (i.e., meets and joins) are taken over countable sets of formulas only.

Figure 4: Formula constructors
The modality formula q(φ) is particularly important, as it expresses how the quantitative modalities are used to observe effects. The last couple of satisfaction rules are for formula constructors occurring at each type.
All formulas together form the general logic V. We distinguish a specific fragment of V, the positive logic V + excluding all formulas which use ¬(). The logic V + can be interpreted without giving an involution on A. We end this section by looking at some interesting properties we can construct using the logic, illustrating the expressibility of the logic.
In case of global store (Example 2), we can construct formulas in the style of Hoare logic. For instance, taking two subsets P, Q ∈ A = P(S) of global states, the statement M |= G(κ Q ) P will give T , precisely if, when starting the execution of M with a state from P , the execution will terminate with a state from Q.
As another example, in case of global store with probability (Example 3), where A := [0, 1]^S , we can construct, given a formula φ and a distribution of states µ ∈ [0, 1]^S , a formula Σ_µ(φ) such that (M |= Σ_µ(φ))(s) = min(1, Σ_{s∈S} µ(s)·(M |= φ)(s)). Then (M |= Σ_µ(EG(κ T ))) expresses the probability of termination of M , given that the starting state is sampled from µ.
In the same vein, we can look at the combination of probability and nondeterminism (Example 1 and 5), where (M |= ⋁_{a,b∈[0,1]} (E ♦ (κ T ) a ∧ E □ (κ T ) b ∧ κ (a+b)/2 )) expresses the probability that M terminates, given that the agent/scheduler in control of nondeterministic choice is sampled from a distribution of which 50% is helpful and 50% is antagonistic.

Behavioural equivalence
We can define a behavioural preorder for any sub-collection of formulas L.
Definition 4.1. For any fragment of the logic L ⊆ V, the logical preorder ⊑ L is given by: P ... ⊑ L R ... ⇐⇒ ∀φ ... ∈ L, (P ... |= φ ... ) ⊴ (R ... |= φ ... ). The general behavioural preorder ⊑ is the logical preorder ⊑ V , whereas the positive behavioural preorder ⊑ + is the logical preorder ⊑ V + . We denote ≡ and ≡ + for the logical equivalences ≡ V and ≡ V + respectively (the behavioural equivalences). These closed relations can be extended to relations on open terms by using the open extension (where two open terms are related if they are related for any substitution of variables).
A basic formula is a non-constant formula (not necessarily atomic) which on the top level does not have conjunction ⋀, disjunction ⋁, negation ¬, constant formula κ a or step-construction (−) a . It is not difficult to see that both ⊑ and ⊑ + are completely determined by basic formulas. Note that since V + ⊆ V, it holds that (⊑) ⊆ (⊑ + ) and (≡) ⊆ (≡ + ).
Proof. Note that at each type level, the preorder is completely determined by basic formulas. All other formulas depend solely on the satisfaction of basic formulas, by a simple induction. As such, the above characterisations are a simple consequence of unfolding the satisfaction relation of basic formulas.

Congruence properties
A relation on terms is compatible if it is preserved over the typing rules from Fig. 1. We introduce the three properties that we will require in order to establish that (the open extensions of) the behavioural preorders are compatible, hence precongruences. The space T_Σ(A), which forms the basis of the technical definition of the modalities, plays a fundamental role in this. The first property considers the leaf order T_Σ(⊴) on T_Σ(A), where t T_Σ(⊴) r if r can be created by replacing leaves a ∈ A of t by leaves b ∈ A of higher value a ⊴ b. The ⊥ leaves can however not be replaced.
This property is useful for establishing a variety of different results, but mainly just shows that modalities preserve the implicit (point-wise) order on formulas. The second property considers the ω-complete tree order ≤ on T Σ (A), defined just after Definition 2.1.
Definition 4.6. A modality q ∈ Q is Scott tree continuous if for any ascending chain t_0 ≤ t_1 ≤ t_2 ≤ . . . it holds that ⟦q⟧(⊔_{n∈N} t_n) = sup{⟦q⟧(t_n) | n ∈ N}. This property is necessary in the congruence proof for inductively approximating the satisfaction value of infinite trees generated by the fixpoint operator and infinite arity effect operators.
The third and final property is the most technical one, and considers the preservation of the behavioural preorder over sequence operations such as (−) to x . (−). It considers the monad multiplication map µ : T Σ (T Σ (A)) → T Σ (A), and requires that the abstract generalisation of the behavioural preorder on T Σ (T Σ (A)) and T Σ (A) is preserved by the µ-map. To formulate this, we first need to define these abstract relations.
For a function h : X → A (a valuation on X) and a modality q ∈ Q, we write t ∈ q(h) for ⟦q⟧(h*(t)), where h*(t) = t[x → h(x)] is the tree obtained by applying h to each leaf of t.
For any relation R ⊆ X × Y , and valuation h : X → A, we define (R ↑(h)) : Y → A to be the function such that R ↑(h)(b) := sup{h(a) | a ∈ X, aRb}. We classify abstract quantitative behavioural properties on T A. A function H : T Σ (A) → A is called quantitative behaviourally saturated if for any two trees t, t ′ ∈ T A such that t is below t ′ in the preorder on T A, it holds that H(t) ⊴ H(t ′ ). We write QBS for the set of quantitative behaviourally saturated functions. Note that H ∈ QBS if and only if there is a function F : T A → A such that H = ↑(F ). Moreover, for any q ∈ Q, it is easy to see that ⟦q⟧ ∈ QBS. We define a relation on quantitative double trees T T A.
Definition 4.8. We define the preorder on T T A by: for any two quantitative double trees r, r ′ ∈ T T A, r is below r ′ if and only if ∀q ∈ Q, ∀H ∈ QBS, (r ∈ q(H)) ⊴ (r ′ ∈ q(H)).
Proof. For '⇒', note that for any t ∈ T A, F (t) ↑(F )(t) so the result follows from leaf-monotonicity and the fact that ↑(F ) ∈ QBS. For '⇐', use that for H ∈ QBS, ↑(H) = H.
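The right-predicate R ↑(h)(b) := sup{h(a) | a ∈ X, aRb} used in this section is easy to compute for a finite relation. The sketch below assumes the truth space [0, 1] with its usual order, so that the supremum of a finite set is its maximum and the supremum of the empty set is F = 0; the function name and the sample data are hypothetical.

```python
def right_predicate(relation, h, false_value=0.0):
    """Lift a valuation h on X along a finite relation R, given as a set of pairs (a, b)."""
    def lifted(b):
        # supremum of h(a) over all a related to b; empty supremum is the false value
        return max((h[a] for (a, b2) in relation if b2 == b), default=false_value)
    return lifted


R = {("a", "x"), ("b", "x"), ("c", "y")}
h = {"a": 0.25, "b": 0.75, "c": 0.5}
lifted = right_predicate(R, h)
```

Here `lifted("x")` is the larger of h(a) and h(b), `lifted("y")` is h(c), and an element with nothing related to it gets the false value.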
We can now define the third property, decomposability, together with its stronger counterpart, sequentiality (footnote 4). Definition 4.10. Q is decomposable if for all t, r ∈ T_Σ(T_Σ(A)), whenever t is below r in the preorder of Definition 4.8, µt is below µr in the corresponding preorder on T_Σ(A). A modality q ∈ Q is sequential if for all t ∈ T_Σ(T_Σ(A)), ⟦q⟧(µt) = ⟦q⟧(⟦q⟧*(t)).
Lemma 4.11. If all modalities q ∈ Q are leaf-monotone and sequential, then Q is decomposable.
The three properties defined above allow us to establish compatibility: Theorem 4.12. If Q is a decomposable set of leaf-monotone and Scott tree continuous modalities, then ⊑ and ⊑ + are compatible, hence precongruences.
All our examples satisfy these three properties. Both leaf-monotonicity and Scott tree continuity are consequences of the inductive and hence continuous definitions of the modalities, while decomposability holds by observing that any modality from the examples is sequential. We illustrate this in the following lemma.
Proof. Take r, r ′ ∈ T_Σ(T_Σ(A)) as above and assume ⟦E⟧(µr) > a ∈ [0, 1]. Since ⟦E⟧(µr) = sup_n ⟦E⟧_n(µr), there must be an n ∈ N such that ⟦E⟧_n(µr) > a. By the recursive definition of ⟦E⟧(−) we can see that ⟦E⟧_n(r[t → F ′_n(t)]) ≥ ⟦E⟧_n(µr), and hence ⟦E⟧(⟦E⟧*(r)) > a. Conversely, assume ⟦E⟧(⟦E⟧*(r)) > a; then there must be an m such that ⟦E⟧_m(⟦E⟧*(r)) > a. Now, ⟦E⟧_m only looks at a finite number of leaves, and hence there must be a k such that ⟦E⟧_m(⟦E⟧*_k(r)) > a. Again, studying the recursive definition of ⟦E⟧(−), we observe that ⟦E⟧_{m+k}(µr) ≥ ⟦E⟧_m(⟦E⟧*_k(r ′ )), so we conclude that ⟦E⟧(µr) > a. This holds for all such a ∈ A, so ⟦E⟧(µr) = ⟦E⟧(⟦E⟧*(r)).
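Sequentiality of the expectation modality can be checked on a concrete finite double tree. The sketch below is only a sanity check on one instance, not a proof; it uses a hypothetical tuple encoding of trees and computes the denotation on finite trees by direct recursion, which agrees with the supremum of the approximations in that case.

```python
def expect(t):
    """Expectation of a finite tree of reals in [0, 1]."""
    kind = t[0]
    if kind == "bot":
        return 0.0
    if kind == "leaf":
        return t[1]
    return (expect(t[1]) + expect(t[2])) / 2      # "por" node


def mu(tt):
    """Flatten a finite tree whose leaves are trees."""
    kind = tt[0]
    if kind == "bot":
        return tt
    if kind == "leaf":
        return tt[1]
    return ("por", mu(tt[1]), mu(tt[2]))


def map_leaves(f, t):
    """Apply f to every leaf label, leaving the tree shape and bottoms intact."""
    kind = t[0]
    if kind == "bot":
        return t
    if kind == "leaf":
        return ("leaf", f(t[1]))
    return ("por", map_leaves(f, t[1]), map_leaves(f, t[2]))


inner_a = ("por", ("leaf", 1.0), ("bot",))
inner_b = ("leaf", 0.25)
double = ("por", ("leaf", inner_a), ("por", ("leaf", inner_b), ("leaf", ("bot",))))

flatten_then_observe = expect(mu(double))
observe_inner_then_outer = expect(map_leaves(expect, double))
```

Both routes give 0.3125 on this double tree, as the sequentiality equation predicts.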
We end this section with an example of an equivalence and an inequivalence. It has to be said that the purpose of this paper is to give a widely applicable approach to defining equivalence, not to prove equivalence of terms. Moreover, for practical purposes, establishing an inequivalence is easier than establishing an equivalence, since one only has to find one formula which distinguishes the two.
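To illustrate how a single formula can separate two terms, the sketch below compares the effect tree of a fair choice between returning 0 and returning 1 with the tree of a term that simply returns 0, using the formula E({0}). It assumes that the formula {n} takes the value T = 1 on the numeral n and F = 0 on any other numeral; the tuple encoding and function names are hypothetical.

```python
def expect(theta, t):
    """Satisfaction value of E(theta) on a finite tree whose leaves are returned numerals."""
    kind = t[0]
    if kind == "bot":
        return 0.0
    if kind == "leaf":
        return theta(t[1])
    return (expect(theta, t[1]) + expect(theta, t[2])) / 2   # "por" node


def is_numeral(n):
    """Assumed semantics of the formula {n}: 1 on the numeral n, 0 otherwise."""
    return lambda v: 1.0 if v == n else 0.0


coin = ("por", ("leaf", 0), ("leaf", 1))   # fair choice between return(0) and return(1)
zero = ("leaf", 0)                          # return(0)

sat_coin = expect(is_numeral(0), coin)
sat_zero = expect(is_numeral(0), zero)
distinguished = sat_coin != sat_zero
```

The two satisfaction values differ (0.5 against 1.0), so this one formula already witnesses that the two terms are not behaviourally equivalent.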

Applicative Bisimilarity
We investigate how our quantitative modalities can be used to define a notion of Abramsky's applicative bisimilarity [1], related to the behavioural equivalence (Theorem 5.7), starting off by defining a relator [18,29].
We write xRy for (x, y) ∈ R. Remember from the previous section that (t ∈ q(h)) = q (t[x → h(x)]) and (R ↑(h))(b) := sup{h(a) | a ∈ X, aRb}. Note that Q( ) = and Q( ) = (see Lemma 4.9). The following characterisation of the relator is immediate: The following lemma shows that this satisfies the usual properties of monotone relators from [18,29]. The proof is technical yet straightforward, and is left out to preserve space. Lemma 5.3. If all quantitative modalities from Q are leaf-monotone, then Q(−) has the following properties: 4 Sequentiality is one of two properties for q to be an Eilenberg-Moore algebra for the monad T Σ (−) 1. If R is reflexive, then so is Q(R).

For R ⊆ X × Y and S ⊆ Y × Z, Q(R)Q(S) ⊆ Q(RS).
Here RS is relational concatenation.
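The right-predicate R↑(h) recalled above can be computed directly for finite relations. The sketch below is an illustration under several assumptions: A is taken to be [0, 1] with bottom element 0, the only modality is the expectation for a fair binary choice, and the relator condition is read off from the proof of Theorem 5.6 as the inequality q(h*(t)) ≤ q((R↑(h))*(r)), checked here for one fixed h only. The full relator quantifies over all q ∈ Q and all h, which this code does not attempt. All function names are hypothetical.

```python
from fractions import Fraction


def right_predicate(R, h, X):
    """(R↑(h))(b) = sup { h(a) | a in X, a R b }; the empty sup is taken to be 0."""
    def lifted(b):
        return max((h(a) for a in X if (a, b) in R), default=Fraction(0))
    return lifted


def expectation(t, h):
    """E(h*(t)) for a finite tree t built from ("leaf", x), ("bot",), ("or", l, r)."""
    tag = t[0]
    if tag == "leaf":
        return Fraction(h(t[1]))
    if tag == "bot":
        return Fraction(0)
    return (expectation(t[1], h) + expectation(t[2], h)) / 2


def related_for(R, X, t, r, h):
    """One instance of the relator inequality, for a single predicate h."""
    return expectation(t, h) <= expectation(r, right_predicate(R, h, X))


X = {"a", "b"}
R = {("a", "c"), ("b", "c")}                      # relation between X and {"c", "d"}
h = {"a": Fraction(1, 3), "b": Fraction(2, 3)}.get
t = ("or", ("leaf", "a"), ("leaf", "b"))

print(related_for(R, X, t, ("leaf", "c"), h))
print(related_for(R, X, t, ("or", ("leaf", "c"), ("bot",)), h))
```

Here the tree that returns c with certainty dominates t for this h, whereas the tree that may diverge does not, since divergence contributes 0 to the expectation.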
Fundamental to the definition of the relator is the notion of the right-predicate R↑(h). When the relation in question is our behavioural preorder, these right-predicates can be expressed in the logic.
Proof. We use Lemma 4.3 to define φ_D:
In the case that R is a relation on terms of some value type A, we write Q(R) for the relation on terms of type FA given by Q({(return(V), return(W)) | V R_A W}). A relation R on terms is well-typed if it only relates terms of the same type and context, and R is closed if it only relates closed terms.
Definition 5.5. A well-typed closed relation R is an applicative Q-simulation if:
Applicative Q-similarity is the largest applicative Q-simulation, whereas applicative Q-bisimilarity is the largest symmetric applicative Q-simulation.
Theorem 5.6. If all quantitative modalities from Q are leaf-monotone, then the positive behavioural preorder ⊑ + is the applicative Q-similarity.
Proof. Note that ⊑+ satisfies the first six properties of a Q-simulation as a consequence of Lemma 4.4. We prove the seventh property. Assume M ⊑+ N, q ∈ Q and D : Terms(A) → A. We use Lemma 5.4 to find a formula φ_D such that φ_D(V) = (⊑+ ↑(D))(V). By reflexivity of ⊑+, we have D(V) ≤ (⊑+ ↑(D))(V) in the order of A, so by leaf-monotonicity and M ⊑+ N it holds that: q(D*(|M|)) ≤ (M |= q(φ_D)) ≤ (N |= q(φ_D)) = q((⊑+ ↑(D))*|N|). We can conclude that |M| Q(⊑+_A) |N|, so ⊑+ is a Q-simulation.
It remains to prove that ⊑+ contains any other Q-simulation R. To do that, we show that R preserves any formula φ ... in the following sense: if P ... R R ... , then (P ... |= φ ... ) ≤ (R ... |= φ ... ). We do this by induction on formulas, using the fact that any formula is well-founded. Assume P ... R R ... . Suppose R preserves any formula from X ⊆ Form(P ... ). Then (P .. It is not difficult to prove that R preserves most basic formulas. The only difficult formula to consider is q(φ) ∈ Form(FA). Assume M R N, so by the simulation property |M| Q(R) |N|. By the induction hypothesis and relator property 2 in Lemma 5.3, it holds that |M|Q . We conclude that M ⊑+ N. Hence ⊑+ is the largest Q-simulation, and so it is equal to Q-similarity.
Note the crucial use of Lemma 5.4 in the proof, which explains the need for step-formulas in the logic.
Theorem 5.7. If all quantitative modalities from Q are leaf-monotone, then the general behavioural preorder ⊑ is the largest symmetric applicative Q-simulation, and hence equal to applicative Q-bisimilarity.
Proof. Firstly, it holds by Lemma 4.2 that ⊑ is symmetric. Secondly, ⊑ is a Q-simulation by the same proof as above. Lastly, any symmetric Q-simulation R is included in ⊑, using a similar proof as above, proving with induction on formulas φ

Howe's method
In this subsection, we briefly outline how Howe's method [10,11] can be used to establish compatibility for the open extension of applicative Q-similarity and Q-bisimilarity, as in [3,28]. Firstly, we need some properties of the relators in addition to Lemma 5.3. The proofs are technical and are omitted to save space.
Lemma 5.8. If all q ∈ Q are leaf-monotone and Scott tree continuous, then the following four properties hold: 2. for any chain of trees t 0 ≤ t 1 ≤ t 2 ≤ . . . , ∀n(t n Q(R)r n ) ⇒ (⊔ n t n )Q(R)(⊔ n r n ).
As a consequence of the above lemmas, the following holds.
One of the contributions of this paper is identifying the properties of quantitative modalities under which the above relator properties are satisfied, so that we can apply Howe's method. The application of Howe's method itself is, however, not novel; it is an adaptation of the proof used for the call-by-value case in [3,28] (untyped and simply-typed respectively), using results from [15]. As such, details of the proof have been omitted. In short, Howe's method allows us to establish the following theorem.
Theorem 5.11. If Q is a decomposable set of leaf-monotone and Scott tree continuous quantitative modalities, then Q-similarity and Q-bisimilarity are compatible.
Combining Theorems 5.6, 5.7, and 5.11, we can derive Theorem 4.12: the general and positive behavioural equivalences and preorders are compatible.

Discussions
We have generalised the logic from [28] to a quantitative logic for terms of a call-by-push-value language with general recursion and several (combinations of) algebraic effects. The quantitative logic is expressive, contains only meaningful behavioural properties, and induces a compatible program equivalence on terms.
In this paper, we consider program properties (or observations) as the primary way of describing program behaviour. According to this philosophy, the generalisation to quantitative properties is natural. Alternatively, one could consider relations (or comparisons) as primary, and instead generalise to quantitative relations. The resulting theory is that of metrics, along the lines of [2,4,19]. Relating the logic from this paper, or a variation thereof, to metrics (e.g. like the ones in [7]) is a topic for future research.
The quantitative logic does not, however, naturally induce a metric on the terms. This is mainly because of the inclusion of step-formulas φ a , which take the quantitative information from φ and collapse it to a binary value. These step-formulas are necessary for relating the behavioural equivalence with the applicative bisimilarity. Their necessity can be seen as a natural consequence of the non-linearity of the language. E.g., in the case of probability with A := [0, ∞], the step-formula can be constructed using products of formulas.
The quantitative logic is very expressive, allowing one to deal with some awkward combinations of effects that are not amenable to a Boolean treatment. Despite the many examples of combinations of effects, there is no general theory for quantitative modalities of combined effects. Such a theory is a potential subject for further research. It would also be interesting to look at other examples of effects which the quantitative logic could describe, like the algebraic jump effect described in [6], or some form of concurrency.
The logic and examples from [28] can be considered as further examples for this paper, where one considers A := {T , F}. The property of Scott openness is the Boolean version of a combination of Scott tree continuity and leaf-monotonicity, and the notion of decomposability is a quantitative generalisation of the notion from [28] with the same name. It should be noted, however, that most modalities from [28] are not sequential.
Along the lines of [28], it is possible to define a pure variation of the logic. This is a logic independent of the term syntax, using function formulas of the form The logical equivalence of this pure logic will be equal to the behavioural equivalence, if the behavioural equivalence is compatible.
The denotations q : TA → A of quantitative modalities are, in the case of the running examples, Eilenberg-Moore algebras. These are algebras a : TX → X such that a ∘ η_X = id_X and a ∘ Ta = a ∘ µ_X; the second equation coincides with the property of sequentiality in this paper. As such, our example modalities potentially fit into the framework of Hasuo [8]. Connections between the two approaches may be explored in the future.
Since the theory has been formulated for call-by-push-value, it is not difficult to extract logics for specific reduction strategies, including call-by-name, call-by-value and lazy PCF [16,17]. The language can also be extended with universal polymorphic and recursive types. These extensions of the language are worked out in the author's forthcoming thesis. Further extensions could also be considered in the future.
4. This follows from the fact that if x R y then R↑(h)(y) ≥ h(x), so we can use leaf-monotonicity. Now for the second property.
Proof of Corollary 5.10.
1. Using point (iii) of Lemma 5.8 on the assumptions we get f * (t) Q(Q(S)) g * (r). We can then apply Lemma 5.9 to get the correct result.

A.1 The Howe closure
Given this definition, a well-typed relation R is compatible if and only if its compatible refinement is contained in R. The Howe closure is also the least solution to the equation S = S(R) • and the least solution to the inclusion S(R) • ⊆ S.
We look at some preliminary results, mostly from Lassen [15]: If R is reflexive, then:
Proof. We prove the properties separately.
2. Note that the compatible refinement of a reflexive relation is reflexive. Proof. We prove the properties individually.
1. We use that R is transitive, hence (R) • is transitive meaning (R) 2. This follows from applying property 2 of Lemma 5.3 to the previous statement.

A.2 The Howe closure of an applicative Q-simulation
We look at the Howe closure of a Q-simulation preorder R. We assume that Q is a decomposable set of leaf-monotone and Scott tree continuous modalities. The lemmas proven in the previous two subsections are satisfied, hence we know that (R) ⊆ (R) • by Lemma A.2. We prove that (R) • is a Q-simulation by explicitly checking the seven conditions from Definition 5.5.
Proof. Using the inductive definition of (R) • there must be an L : N such that V(R) • L and LRW , the latter meaning L = W because of the simulation property. The fact that V(R) • L must have come as a conclusion from either C3 or C4. In the first case, V = Z = L and hence V = L = W . In the second case, V = S(V ′ ) and L = S(L ′ ) with V ′ (R) • L ′ , and the proof has been reduced to showing V ′ = L ′ , since then V = S(V ′ ) = S(L ′ ) = L = W . We do induction on the structure of V , which cannot go on forever since V is a syntactically finite term. So eventually we get to Z and we can make a conclusion of the form V = nS(V ′′ ) = nS(Z) = nS(L ′′ ) = L = W for some n ∈ N. That concludes the proof.
The following lemma is evident from the compatibility properties.
Lemma A.5. By compatibility of (R) • it holds that: We can easily prove two more simulation properties.
Proof. There is a pair (l, L) such that (j, V ) (R) • Σi∈I Ai (l, L)(R) • (k, W ). The latter implies l = k and LRW by simulation property. The former statement can only have come from compatible extension rule C13, so j = l and V (R) • L. We can now use Lemma A.3 to conclude that V (R) • W .
Proof. There is a pair (L, L ′ ) such that (V, V ′ ) (R) • A×B (L, L ′ )(R) • (W, W ′ ). The latter implies LRW and L ′ RW ′ by the simulation property. The former statement can only have come from compatible extension rule C15, so V (R) • L and V ′ (R) • L ′ . We can now use Lemma A.3 to conclude that V (R) • W and V ′ (R) • W ′ .
So all conditions of being a Q-simulation except condition 6 are satisfied. Condition 6 is the most difficult to prove and requires an induction on the reduction relation of terms.
It requires us to look at terms P ... , R ... of type FA such that P ... (R) • R ... , and prove that |P ... |Q((R) • )|R ... |. Using Lemma 5.8, this can be reduced to showing |P ... | n Q((R) • )|R ... | for all n. This allows us to do an induction on the denotation map |P ... |. In general, one would look at the shape of P ... and see what it reduces to after one step, so that one can use the induction hypothesis. This is a relatively straightforward investigation in the fine-grained call-by-value case.
For call-by-push-value, we have the problem that effects may occur in any computation type, which is particularly problematic when considering non-producer types. Concretely, our P ′ is of a computation type, and it may not be of the form λx . M . This is problematic, as we still have no indication of what P ... might reduce to. To investigate that, we would require another case analysis on P ... ′ , which results in a bureaucratic nightmare. We say that the application case is uninformative, and we need to continue the case analysis until we find a term that is not of the form of an application, which we call informative. Doing structural induction on P ... , we observe the following result.
Lemma A.10. Any computation term P ... is of the form S{P ... ′ } where S is a frame and P ... ′ is an informative term.
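The decomposition in Lemma A.10 can be pictured as peeling eliminators off a term until an informative term remains. The following is a minimal sketch, assuming that the uninformative shapes are exactly the applications M · V and M · i, and that a frame can be represented as a list of pending eliminators. The tuple encoding of terms and the names decompose and plug are hypothetical and serve only as illustration.

```python
def decompose(term):
    """Split a term into (frame, informative term).

    Terms are tuples; ("app", M, V) encodes M · V and ("proj", M, i) encodes M · i.
    Any other tag is treated as informative (an assumption of this sketch).
    """
    frame = []
    while term[0] in ("app", "proj"):
        frame.append((term[0], term[2]))  # remember the eliminator and its argument
        term = term[1]                    # continue with the term in head position
    frame.reverse()                       # innermost eliminator first
    return frame, term


def plug(frame, term):
    """Rebuild S{term} by applying the eliminators of the frame in order."""
    for kind, arg in frame:
        term = (kind, term, arg)
    return term


# (λx. return x) · 1 · V, written with the hypothetical tuple encoding.
P = ("app", ("proj", ("lam", "x", ("return", "x")), 1), "V")

S, P_inner = decompose(P)
print(S, P_inner)
```

Plugging the informative term back into the frame returns the original term, which is the shape of the statement P ... = S{P ... ′ }. An informative term decomposes into the empty frame and itself.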
Definition A.11. Two frames S and Z match when the following statements hold.

If
We have the following property.
. Now for the induction step, assume the statement holds for any smaller frames S ′ .  Matching frames are very handy, since they can make use of compatibility: The last important property of frames is that they work nicely with respect to the reduction relation.
We have the necessary tools to prove the following lemma. Proof. We do an induction on n.
Induction step (n + 1). We assume as the induction hypothesis that for any P ...
We do a case distinction on P ... ′ : C, which is informative, so not of the form M · V or M · i. We start with the three unfold cases, where the S frame is actively used.
1. If P ... ′ = return(V ) : C, which can only be of F-type, so the frame S must be ε as no other frames accept a term of this type. Hence P . ′ could only have been derived via the lambda compatibility rule C17, so R ...
We can do the following derivation using Lemma A.14 and the induction hypothesis: That finishes the case distinction, so we know that for any shape of P ... ′ it holds that |S{P ... ′ }| n+1 Q((R) • )|Z{R ... ′ }|. As was discussed before, this is sufficient in establishing that |P ... | n+1 Q((R) • )|R ... |, and hence this finishes the induction step. So the proof by induction has been finished.
We can conclude that M (R) • N ⇒ |M |Q((R) • )|N | for closed terms of type FA. As such, we can conclude:
In particular, the Howe closure of Q-similarity is a Q-simulation, and hence coincides with Q-similarity itself. Since the Howe closure of a preorder is itself compatible, we can conclude that Q-similarity is compatible. We can now derive Theorem 5.11 as stated in Section 5.1, with the same method as in [28]. The bisimilarity part of this result is established using what is known as the transitive closure trick (see e.g. [28]).