Bisimulation for Feller-Dynkin Processes

Bisimulation is a concept that captures behavioural equivalence. It has been studied extensively on nonprobabilistic systems, on discrete-time Markov processes and on so-called continuous-time Markov chains. In the latter, time is continuous but the evolution still proceeds in jumps. We propose two definitions of bisimulation on continuous-time stochastic processes where the evolution is a flow through time. We show that they are equivalent and that, when restricted to discrete time, our concept of bisimulation encompasses the standard discrete-time concept. The concept we introduce is not a straightforward generalization of discrete-time concepts.


Introduction
Bisimulation [Mil80,Par81,San09] is a fundamental concept in the theory of transition systems capturing a strong notion of behavioural equivalence. In particular, it is a notion stronger than that of trace equivalence. Bisimulation has been widely studied for discrete-time systems where transitions happen as steps, both on discrete [LS91] and continuous state spaces [BDEP97,DEP02,Pan09]. In all these types of systems a crucial ingredient of the definition of bisimulation is the ability to talk about the next step. Thus, the general format of the definition of bisimulation is that one has some property that must hold "now" (in the states being compared) and then one says that the relation is preserved in the next step. Some attempts have been made to talk about continuous time [DP03], but even in what are called continuous-time Markov chains there is a discrete notion of time step; it is only that there is a real-valued duration associated with each state that makes such systems continuous-time. They are often called "jump processes" in the mathematical literature (see, for example, [RW00,Whi02]), a phrase that better captures the true nature of such processes.
Outside of computer science, there is a vast range of systems that involve true continuous-time evolution: deterministic systems governed by differential equations and stochastic systems governed by "noisy" differential equations called stochastic differential equations. These have been extensively studied for over a century since the pioneering work of Einstein [Ein05] on Brownian motion. In the computer science literature there have been studies of very special systems that feature continuous time: timed automata [AD94] and hybrid systems [ACH+95]. In these systems the time evolution is assumed to be piecewise constant (timed automata) or piecewise smooth (hybrid automata) and bisimulation is defined without recourse to talking about the next step. However, a general formalism that covers processes like diffusion is not available as far as we are aware.
In this work we aim at a general theory of bisimulation for stochastic systems with true continuous-time evolution. We focus on a class of systems called Feller-Dynkin processes for which a good mathematical theory exists. These systems are the most general version of Markov processes defined on continuous state spaces and with continuous-time evolution. Such systems encompass Brownian motion and its many variants.
The obvious extension of previous definitions of bisimulation on discrete Markov processes or on jump processes fails to provide a meaningful notion of behavioural equivalence, as we will illustrate later on. It is a mistake to think that one can get a reasonably good understanding of such systems by considering suitable "limits" of discrete-time systems. Intuitively, the notion of bisimulation is sensitive to small changes that are not captured when taking the limit. It is true that, for example, Brownian motion can be seen as arising as a limit, in the sense of convergence in distribution, of a discrete random walk as both the discrete time unit and the step size go to zero. However, entirely new phenomena occur with the trajectories of Brownian motion which are not understandable through the limiting process, at least not in any naive sense: the probability of being at any single state x at a given time t is zero, but the probability of hitting x before a given time s is strictly positive.
To avoid those issues, we work with the set of trajectories of the system. A number of possible ways had to be explored and in the end the particular version we present here turned out to have the desired properties: (a) it corresponds to our intuition in a number of examples and (b) it correctly specializes to the discrete-time case.
Section 2 explains the mathematical background on Feller-Dynkin processes and Brownian motion. In section 3, we show why a naive extension of the previous definitions of bisimulation does not work and we propose a new definition of bisimulation as an equivalence relation that we illustrate on a number of examples. In section 4, we give an equivalent definition of bisimulation as a cospan of morphisms extending the previous notion of span of "zig-zag" morphisms in the discrete-time case. In section 5, we show that our definition of bisimulation is coherent with the previous definition of bisimulation in discrete time. Much remains to be done, of course, as we describe in the concluding section.

Background on Feller-Dynkin processes
We assume that basic concepts of topology, measure theory and probability on continuous spaces are well known; see, for example, [Bil08,Dud89,Pan09].
The basic arena for the action is a probability space.
Definition 2.1. A probability space is a triple (S, F, P) where S is a space (usually some kind of topological space), F is a σ-algebra (usually its Borel algebra) and P is a probability measure on F.
Given a measurable space (X, Σ), a Markov kernel is a map τ : X × Σ → [0, 1] such that for any fixed A ∈ Σ the map τ(·, A) : X → [0, 1] is measurable, and for any fixed x ∈ X, τ(x, ·) is a (sub)probability measure. These kernels describe transition probability functions.
A crucial concept is that of a filtration. Filtrations will play a central role in the description of a process.
Definition 2.2. A filtration on a measurable space (Ω, F) is a nondecreasing family (F_t)_{t≥0} of sub-σ-algebras of F, i.e. F_s ⊆ F_t ⊆ F whenever 0 ≤ s ≤ t.
This concept is used to capture the idea that at time t what is "known" or "observed" about the process is encoded in the sub-σ-algebra F_t.
Definition 2.3. A stochastic process is a collection of random variables (X_t)_{0≤t<∞} on a measurable space (Ω, F) that take values in a second measurable space (S, S) called the state space. We say that a stochastic process is adapted to a filtration (F_t) if for each t ≥ 0, X_t is F_t-measurable.
Note that a stochastic process is always adapted to the filtration (G_t), where G_t is defined as the σ-algebra generated by all the random variables {X_s | s ≤ t}. The filtration (G_t)_{t≥0} is also referred to as the natural filtration associated to (X_t)_{t≥0}.
Before stating the definition of the continuous-time processes we will be interested in, let us first recall the definition of their discrete-time counterparts.
Definition 2.4. A labelled Markov process (LMP) is a triple (X, Σ, τ) where (X, Σ) is a measurable space and τ is a Markov kernel.
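On a finite state space, a (sub-)Markov kernel is simply a matrix whose rows are (sub)probability vectors, and τ(x, A) is a sum over the row of x. The following is a minimal sketch under that finiteness assumption; the function names and the three-state example are illustrative and not part of the paper.

```python
from fractions import Fraction

def is_sub_markov_kernel(tau):
    """Check that every row of tau is a (sub)probability distribution."""
    for row in tau:
        if any(p < 0 for p in row):
            return False
        if sum(row) > 1:
            return False
    return True

def kernel_measure(tau, x, subset):
    """Return tau(x, A) for a set A of state indices (finite additivity)."""
    return sum(tau[x][y] for y in subset)

half = Fraction(1, 2)
# Hypothetical three-state LMP used only for illustration.
tau = [
    [0, half, half],  # state 0 moves to state 1 or 2 with probability 1/2 each
    [0, 1, 0],        # state 1 is absorbing
    [0, 0, half],     # state 2 stays put with probability 1/2 (sub-probability row)
]
```

For instance, `kernel_measure(tau, 0, {1, 2})` sums the two entries of the first row, and the last row illustrates why the definition allows subprobability measures.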
We will quickly review the theory of continuous-time processes on continuous state spaces; much of this material is adapted from "Diffusions, Markov Processes and Martingales, Volume I" by Rogers and Williams [RW00] and we use their notation. Another useful source is "Functional analysis for probability and stochastic processes" by A. Bobrowski [Bob05].
Let E be a locally compact Hausdorff space with countable base and let it be equipped with its Borel σ-algebra ℰ = B(E). We write E_∂ = E ∪ {∂} for the one-point compactification of E. The physical picture is that the added state, ∂, represents a point at infinity; we will view it as an absorbing state.
We say that a continuous real-valued function f on E "vanishes at infinity" if for every ε > 0 there is a compact subset K ⊂ E such that ∀x ∈ E \ K we have |f(x)| ≤ ε. The space C_0(E) of such functions is a Banach space with the sup norm.
Definition 2.5. A semigroup of operators on any Banach space X is a family of linear continuous (bounded) operators T_t : X → X indexed by t ∈ R≥0 such that
∀s, t ≥ 0, T_s ∘ T_t = T_{s+t} and T_0 = I.
The first equation above is called the semigroup property. Each operator in a semigroup is continuous; however, there is also a useful continuity property of the semigroup as a whole.
Definition 2.6. For X a Banach space, we say that a semigroup T_t : X → X is strongly continuous if ∀x ∈ X, lim_{t↓0} ‖T_t x − x‖ = 0.
Definition 2.7. A Feller-Dynkin semigroup (FDS) is a strongly continuous semigroup (P̂_t)_{t≥0} of linear operators on C_0(E) (the space of continuous functions on E which vanish at infinity) satisfying the additional condition: ∀t ≥ 0, ∀f ∈ C_0(E), if 0 ≤ f ≤ 1 then 0 ≤ P̂_t f ≤ 1.
The following important proposition relates these FDS with Markov processes, which allows one to see the connection with more familiar probabilistic transition systems.
Proposition 2.8. Given such an FDS, it is possible to define a unique family of sub-Markov kernels (P_t)_{t≥0} : E × ℰ → [0, 1] such that for all t ≥ 0 and f ∈ C_0(E), P̂_t f(x) = ∫ f(y) P_t(x, dy).
A very important ingredient in the theory is the space of trajectories of a FD process (FD semigroup) as a probability space. This space does not appear explicitly in the study of labelled Markov processes but one does see it in the study of continuous-time Markov chains and jump processes.
Definition 2.9. We define a trajectory ω on E_∂ to be a cadlag function ω : [0, ∞) → E_∂ such that if either ω(t−) := lim_{s<t, s→t} ω(s) = ∂ or ω(t) = ∂, then ∀u ≥ t, ω(u) = ∂.
It is possible to associate to such an FDS a canonical FD process. Let Ω be the set of trajectories ω : [0, ∞) → E_∂.
Definition 2.10. The canonical FD process associated to the FDS (P̂_t) is (Ω, G, (G_t)_{t≥0}, (X_t)_{0≤t≤∞}, (P^x)_{x∈E_∂}) where
• the random variable X_t : Ω → E_∂ is defined as X_t(ω) = ω(t),
• G = σ(X_s | 0 ≤ s < ∞) and G_t = σ(X_s | 0 ≤ s ≤ t),
• given any probability measure µ on E_∂, by the Kolmogorov extension theorem, there exists a unique probability measure P^µ on (Ω, G) such that for all n ∈ N, 0 ≤ t_1 ≤ t_2 ≤ ... ≤ t_n and x_0, x_1, ..., x_n in E_∂,
P^µ(X_0 ∈ dx_0, X_{t_1} ∈ dx_1, ..., X_{t_n} ∈ dx_n) = µ(dx_0) P^{+∂}_{t_1}(x_0, dx_1) P^{+∂}_{t_2−t_1}(x_1, dx_2) ... P^{+∂}_{t_n−t_{n−1}}(x_{n−1}, dx_n),
where P^{+∂}_t is the Markov kernel extending the Markov kernel P_t to E_∂ by P^{+∂}_t(x, {∂}) = 1 − P_t(x, E) and P^{+∂}_t(∂, {∂}) = 1. We set P^x = P^{δ_x}.
This is the version of the system that will be most useful for us. In order to bring it more in line with the kind of transition systems that have hitherto been studied in the computer science literature, we introduce a finite set of atomic propositions AP and such a FD process is equipped with a function obs : E → 2^AP. This function is extended to a function obs : E_∂ → 2^AP ⊎ {∂} by setting obs(∂) = ∂.
Instead of following the dynamics of the system step by step as one does in a discrete system, we have to study the behaviour of sets of trajectories. The crucial ingredient is the distribution P^x, which gives a measure on the space of trajectories for a system started at the point x.

Brownian motion as a FD process
Brownian motion is a stochastic process describing the irregular motion of a particle being buffeted by invisible molecules. Its range of applicability now extends far beyond this initial application [KS12]. The following definition is from [KS12].
Definition 2.11. A standard one-dimensional Brownian motion is a Markov process B = (W_t, F_t), 0 ≤ t < ∞, adapted to the filtration (F_t) and defined on a probability space (Ω, F, P), with the properties
1. W_0 = 0 almost surely,
2. for 0 ≤ s < t, W_t − W_s is independent of F_s and is normally distributed with mean 0 and variance t − s.
In this very special process, one can start at any place, and there is an overall translation symmetry which makes calculations more tractable. In order to do any calculations we use the following fundamental formula: if the process is at x at time 0, then at time t > 0 the probability that it is in the (measurable) set D is given by
P_t(x, D) = ∫_{y∈D} (1/√(2πt)) exp(−(x − y)²/(2t)) dy.
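When D is an interval [lo, hi], the Gaussian transition probability of standard Brownian motion, P_t(x, D) = ∫_D (2πt)^{-1/2} exp(−(x − y)²/(2t)) dy, reduces to a difference of two normal CDF values. The sketch below evaluates it with the standard library; the function names are illustrative choices, not notation from the paper.

```python
import math

def std_normal_cdf(u):
    """Cumulative distribution function of the standard normal law."""
    return 0.5 * (1.0 + math.erf(u / math.sqrt(2.0)))

def brownian_transition(t, x, lo, hi):
    """P_t(x, [lo, hi]) for standard Brownian motion started at x, with t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    s = math.sqrt(t)  # standard deviation of W_t - W_0
    return std_normal_cdf((hi - x) / s) - std_normal_cdf((lo - x) / s)
```

The translation symmetry mentioned above shows up as `brownian_transition(t, x, lo, hi) == brownian_transition(t, x + c, lo + c, hi + c)` up to floating-point error, and the reflection symmetry as invariance under negating x and swapping the negated endpoints.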

Bisimulation
The concept of bisimulation is fundamental and its history is well documented [San09]. We recall the definition of bisimulation on continuous state spaces with discrete time steps [DEP02,Pan09]; we call it a DT-bisimulation to emphasize that it pertains to discrete-time systems. We consider LMPs equipped with a family of atomic propositions AP, where P ∈ AP is interpreted on a specific LMP as a subset of the state space represented by its characteristic function χ_P.
Definition 3.1. Given an LMP (X, Σ, τ, (χ_P)_{P∈AP}), a DT-bisimulation R is an equivalence relation on X such that if x R y, then
• for all P ∈ AP, χ_P(x) = χ_P(y), and
• for all R-closed sets B ∈ Σ, τ(x, B) = τ(y, B).
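On a finite state space, the two requirements of a DT-bisimulation (related states satisfy the same atomic propositions and give equal probability to every R-closed set) can be checked mechanically. The sketch below assumes a finite LMP given as a matrix and an equivalence given as a partition; since R-closed sets are then unions of equivalence classes, it suffices to compare probabilities class by class. All names and the four-state example are hypothetical.

```python
from fractions import Fraction

def is_dt_bisimulation(tau, labels, classes):
    """Check the DT-bisimulation conditions for a partition of a finite LMP.

    tau[x][y] is the transition probability, labels[x] the set of atomic
    propositions true at x, classes a list of non-empty lists of states.
    """
    for cls in classes:
        rep = cls[0]
        for x in cls[1:]:
            # Related states must satisfy the same atomic propositions.
            if labels[x] != labels[rep]:
                return False
            # R-closed sets are unions of classes, so by additivity it is
            # enough to compare the mass given to each class.
            for target in classes:
                if sum(tau[x][z] for z in target) != sum(tau[rep][z] for z in target):
                    return False
    return True

half = Fraction(1, 2)
tau = [
    [0, half, half, 0],  # state 0 splits between states 1 and 2
    [0, 1, 0, 0],        # state 1 is absorbing
    [0, 0, 1, 0],        # state 2 is absorbing
    [0, 1, 0, 0],        # state 3 moves to state 1
]
labels = [set(), {"P"}, {"P"}, set()]
```

Here the partition `[[0, 3], [1, 2]]` passes the check, while the finer partition `[[0, 3], [1], [2]]` does not, because states 0 and 3 give different mass to the class `{1}`.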

Naive approach
The key idea of bisimulation is that "what can be observed now is the same" and bisimulation is preserved by the evolution.In order to capture this we need two conditions: the first captures what is immediately observable and the second captures the idea that the evolution preserves bisimulation.
Let us consider the naive extension of bisimulation in discrete time: an equivalence relation R on the state space E such that whenever x R y (x, y ∈ E):
(initiation 1) obs(x) = obs(y), and
(induction 1) for all R-closed sets C in ℰ and for all times t, P_t(x, C) = P_t(y, C).
Let us illustrate on an example why this definition is not enough.
We consider the case of Brownian motion on the reals where there is a single atomic proposition marking 0: obs(0) = 1 and obs(x) = 0 for x ≠ 0.
Intuitively, we would like two states x and y to be bisimilar if and only if |x| = |y|, as the only symmetry that this system has is point reflection with respect to 0.
However, the two conditions (initiation 1) and (induction 1) are not strong enough to enforce that this equivalence relation is the greatest bisimulation.
Let us define the equivalence R = {(0, 0)} ∪ {(x, y) | x ≠ 0 and y ≠ 0}. This equivalence satisfies both conditions (induction 1) and (initiation 1). The latter follows directly from the definitions of R and obs.
For the induction condition, the only R-closed sets are ∅, {0}, R* = R \ {0} and R, and for any state z ≠ 0 and time t ≥ 0, P_t(z, ∅) = P_t(z, {0}) = 0 and P_t(z, R*) = P_t(z, R) = 1. Hence R relates, for instance, the states 1 and 2, which the intuition above says should not be bisimilar.

Definition
As we have just shown, unlike in the discrete-time case we cannot just say that the "next step" preserves the relation. Therefore we have to talk about the trajectories; but then we need to choose the right condition on sets of trajectories.
Definition 3.2. An equivalence relation R on the state space E is a bisimulation if whenever x R y, the following conditions are satisfied:
(initiation 1) obs(x) = obs(y), and
(induction 2) for all R-closed sets B in G, P^x(B) = P^y(B), where by R-closed we mean that for all ω ∈ B, if a trajectory ω′ is such that for all times t ≥ 0, ω(t) R ω′(t), then ω′ ∈ B.
Clearly equality is a bisimulation. And by definition of P^x, condition (induction 2) implies (induction 1).
We have chosen to give names to the conditions. The reason for choosing those names will become clear in section 3.3.2.
Remark 3.3. Usually, for discrete time, instead of a single kernel τ, a labelled Markov process is a family of Markov kernels indexed by a family of actions. These actions correspond to the environment or the user acting on the process. The second condition of bisimulation is then stated on the corresponding Markov kernels for all actions. It is possible to do the same for continuous time: we can consider a family of FD processes indexed by a set of actions. Condition (induction 2) is then stated for all these actions. Everything done afterwards can be adapted to that setting in the same way.
Lemma 3.4. An equivalence relation R is a bisimulation if and only if whenever x R y, the following conditions are satisfied:
(initiation 2) for all obs-closed sets B in G, P^x(B) = P^y(B), where by obs-closed we mean that for all ω ∈ B, if a trajectory ω′ is such that obs ∘ ω = obs ∘ ω′, then ω′ ∈ B, and
(induction 2) for all R-closed sets B in G, P^x(B) = P^y(B).
Proof. Let us consider a bisimulation R. Let us now consider two states x, y such that x R y and an obs-closed measurable set B. First note that the set B is R-closed: if ω ∈ B and for all t ≥ 0, ω(t) R ω′(t), then, by definition of bisimulation (initiation condition), obs(ω(t)) = obs(ω′(t)) for all t ≥ 0. Since the set B is obs-closed, this means that ω′ ∈ B and hence B is R-closed. Using the induction condition, we have that P^x(B) = P^y(B).
Let us now consider an equivalence R that satisfies both conditions. Let x, y be two states such that x R y and let us define the set B_x = {ω | obs(ω(0)) = obs(x)}. The set B_x is obs-closed and P^x(B_x) = 1. Therefore P^y(B_x) = 1 (by (initiation 2)) and therefore obs(y) = obs(x).
Definition 3.5. Two states are bisimilar if there is a bisimulation that relates them.
Proposition 3.6. Given two bisimulations R_1 and R_2, the transitive closure R of R_1 ∪ R_2 is a bisimulation.
Proof. Clearly R is an equivalence.
Let us prove that the equivalence R satisfies both conditions. Assume x R y. This means that there is a finite sequence x_0, x_1, ..., x_n such that x R_1 x_0 R_2 x_1 R_1 x_2 R_2 ... R_2 x_n R_1 y.
Let us consider an obs-closed set B in G; since both R_1 and R_2 are bisimulations, we have that P^x(B) = P^{x_0}(B) = P^{x_1}(B) = ... = P^{x_n}(B) = P^y(B).
Let us now consider an R-closed set B. First, note that the set B is R_1-closed: consider ω ∈ B and a trajectory ω′ such that for all times t ≥ 0, ω(t) R_1 ω′(t). Then, in particular, for all times t ≥ 0, ω(t) R ω′(t), and since B is R-closed, ω′ ∈ B. Similarly, B is R_2-closed. Since R_1 is a bisimulation, we have that P^x(B) = P^{x_0}(B), P^{x_n}(B) = P^y(B) and P^{x_{2k+1}}(B) = P^{x_{2k+2}}(B) (for all suitable k). And since R_2 is a bisimulation, we have that P^{x_{2k}}(B) = P^{x_{2k+1}}(B) (for all suitable k). We then have P^x(B) = P^{x_0}(B) = P^{x_1}(B) = ... = P^{x_n}(B) = P^y(B).
Proposition 3.7. The relation "is bisimilar to" is the greatest bisimulation.
Proof. Let us denote by R_max the relation "is bisimilar to": R_max = {(x, y) | there exists a bisimulation R such that x R y}.
It is enough to prove that it is a bisimulation. First note that it is an equivalence. Indeed, it is reflexive and symmetric since equality is a bisimulation and every bisimulation is symmetric. For transitivity, note that if x R_max y and y R_max z, then there are two bisimulations R_1 and R_2 such that x R_1 y and y R_2 z. By proposition 3.6, we know that the transitive closure R of R_1 ∪ R_2 is a bisimulation, and in particular x R y, y R z and therefore x R z. Since R is a bisimulation, this means that x is bisimilar to z, which proves the transitivity of R_max.
Consider x and y such that x is bisimilar to y. This means that there is a bisimulation R such that x R y. The initiation condition for R gives us that obs(x) = obs(y), which is also the initiation condition we want for R_max. Consider now an R_max-closed set B. The set B is also R-closed: consider ω ∈ B and a trajectory ω′ such that for all times t ≥ 0, ω(t) R ω′(t).
Then, we also have that for all times t ≥ 0, ω(t) R_max ω′(t), and since the set B is R_max-closed, we have that ω′ ∈ B. And since x R y, we have that P^x(B) = P^y(B), which concludes the proof.
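The transitive closure of R_1 ∪ R_2 used in propositions 3.6 and 3.7 has a simple computational counterpart when the state space is finite: merging the classes of the two equivalences with a union-find structure. The sketch below is only an illustration under that finiteness assumption; the function name and the representation of equivalences as partitions are choices made here.

```python
def transitive_closure_classes(n, classes1, classes2):
    """Classes of the transitive closure of R1 ∪ R2 on states 0..n-1.

    Each equivalence is given as a partition (list of non-empty lists).
    """
    parent = list(range(n))

    def find(x):
        # Follow parent links to the representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(x, y):
        rx, ry = find(x), find(y)
        if rx != ry:
            parent[ry] = rx

    # Merge every class of R1 and of R2 into a single block.
    for classes in (classes1, classes2):
        for cls in classes:
            for x in cls[1:]:
                union(cls[0], x)

    groups = {}
    for x in range(n):
        groups.setdefault(find(x), []).append(x)
    return sorted(groups.values())
```

Two states end up in the same block exactly when they are linked by a finite alternating chain of R_1- and R_2-related states, which is the chain x_0, ..., x_n appearing in the proof of proposition 3.6.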
We now consider several examples and give their greatest bisimulation.
Proving that an equivalence is the greatest bisimulation follows the following outline: first proving that the equivalence satisfies conditions (initiation 1) and (induction 2) (and hence it is a bisimulation), and then using (initiation 2) to prove that it is the greatest bisimulation possible.

Deterministic Drift
Consider a deterministic drift on the real line R with constant speed a ∈ R.
We consider two cases: with 0 as the only distinguished point and with all the integers distinguished from the other points.
With zero distinguished: Let us consider the case when there is a single atomic proposition, and obs(x) = 1 if and only if x = 0.
Proposition 3.8. Two states x and y are bisimilar if and only if either ax > 0 and ay > 0, or x = y.
Proof. To make this proof not too tedious, we will assume that a > 0 (the case a < 0 works in a similar fashion and the case a = 0 is of no interest). Denote R = {(x, x) | x ≤ 0} ∪ {(x, y) | x > 0 and y > 0}. Let us consider x and y such that x R y. We have to consider two cases:
• If obs(x) = 1, this means that x = 0. The state 0 is only related by R to itself, which means that y = 0 and therefore obs(x) = obs(y).
• If obs(x) = 0, this means that x ≠ 0. Since the state 0 is only related by R to itself, y ≠ 0 and therefore obs(x) = obs(y).
Consider a measurable set B. First, for any z ∈ R, let us denote ω z the trajectory ω z (t) = z + at and note that P z (B) = δ B (ω z ).
Consider an R-closed measurable set B. We want to show that P^x(B) = P^y(B). For that, there are two cases to consider:
• either x ≤ 0, in which case x = y since x R y, which proves the point,
• or x > 0, in which case y > 0. In that case, for all t ≥ 0, ω_x(t) > 0 and ω_y(t) > 0, and in particular for all times t, ω_x(t) R ω_y(t). Since B is R-closed, ω_x ∈ B if and only if ω_y ∈ B, and hence P^x(B) = P^y(B).
This concludes the second part of the proof.
Let us now prove that this is the greatest such bisimulation. We use condition (initiation 2) for that. Consider x > 0 and y ≤ 0. For all t ≥ 0, ω_x(t) ≠ 0, but ω_y(−y/a) = 0. Define B = {ω | ω(−y/a) = 0}. This set is obs-closed but P^x(B) = 0 and P^y(B) = 1. These two states cannot be bisimilar. The same set B separates two distinct states x < y ≤ 0, since ω_x(−y/a) = x − y ≠ 0.
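The separating set used in this argument can be made concrete. For a deterministic drift, P^z is the Dirac mass on the single trajectory ω_z(t) = z + at, so the probability of B = {ω | ω(s) = 0} is either 0 or 1. The sketch below uses exact rational arithmetic; the function names and the numerical values are illustrative assumptions.

```python
from fractions import Fraction

def drift_trajectory(z, a):
    """The unique trajectory t -> z + a*t of the drift started at z."""
    return lambda t: z + a * t

def prob_at_zero(z, a, time):
    """P^z(B) for B = {omega | omega(time) = 0}: a Dirac mass on omega_z."""
    return 1 if drift_trajectory(z, a)(time) == 0 else 0

a = Fraction(2)   # drift speed, assumed positive
x = Fraction(3)   # a state to the right of 0
y = Fraction(-1)  # a state to the left of 0
hit = -y / a      # time at which the trajectory from y reaches 0
```

With these values `prob_at_zero(y, a, hit)` is 1 while `prob_at_zero(x, a, hit)` is 0, which is the disagreement on an obs-closed set that rules out x and y being bisimilar.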
With all integers distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x ∈ Z.
Proposition 3.9. Two states x and y are bisimilar if and only if x − ⌊x⌋ = y − ⌊y⌋, i.e. x − y ∈ Z.
Proof. Denote R = {(x, y) | x − y ∈ Z}. First let us prove that R is a bisimulation. Take x R y and denote k = x − y ∈ Z.
Note that x ∈ Z if and only if y ∈ Z and therefore obs(x) = obs(y).
Consider an R-closed set B. In particular, this means that B + k = B, where B + k = {ω + k | ω ∈ B}. Deterministic drift is invariant under translation, which means that P^y(B) = P^{y+k}(B + k) = P^x(B).
Let us now prove that this is the greatest bisimulation. Let x, y ∈ R. Here we are going to assume that a > 0; the case a < 0 works exactly the same but considering ⌊x⌋ instead of ⌈x⌉. Define z = ⌈x⌉ − x. For any s ∈ R, let us denote by ω_s the trajectory ω_s(t) = s + at. Note that ω_x(z/a) = x + z = ⌈x⌉ ∈ Z and ω_y(z/a) = y + z = y − x + ⌈x⌉. This means that ω_y(z/a) ∈ Z if and only if y − x ∈ Z. Finally, define B = {ω | ω(z/a) ∈ Z}. This set is obs-closed and measurable, and we have shown that P^x(B) = 1, whereas P^y(B) = 1 if and only if y − x ∈ Z. Hence P^x(B) = P^y(B) if and only if y − x ∈ Z.
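The separating set B = {ω | ω(z/a) ∈ Z} can be evaluated exactly for a drift with a > 0: starting from x, the first integer is reached at time z/a with z = ⌈x⌉ − x, and the trajectory from y is at an integer at that time exactly when y − x ∈ Z. The following sketch checks this with rational arithmetic; the function names are hypothetical.

```python
import math
from fractions import Fraction

def first_integer_hit_time(x, a):
    """Time z/a with z = ceil(x) - x at which the drift from x reaches ceil(x) (a > 0)."""
    return (math.ceil(x) - x) / a

def separated_by_B(x, y, a):
    """True when B = {omega | omega(z/a) in Z} has P^x(B) = 1 but P^y(B) = 0."""
    t = first_integer_hit_time(x, a)
    position_from_y = y + a * t
    return position_from_y.denominator != 1  # not an integer

a = Fraction(3)
x = Fraction(1, 4)
```

For example, the state 9/4 is not separated from x = 1/4 (their difference is an integer), whereas the state 1/2 is.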

Fork
One could think that since trajectories are already included in the initiation condition (initiation 2), the additional induction condition is not necessary. However, this example illustrates the crucial role of the induction condition in the definition of bisimulation. It is an extension of the standard "vending machine" example in discrete time to our continuous-time setting, and it shows that even the condition (induction 1) is enough to discriminate between states that (initiation 2) cannot distinguish.
Let us consider the following state space: There are two atomic propositions (denoted P and Q on the diagram) that are satisfied by the final state of some of the branches. The process is a drift at a constant speed to the right. When it reaches a fork, it moves to either branch with probability 1/2 (and stops when it hits an atomic proposition).
The kernel is defined as follows for t ≤ 100: for j = 2, 3, t = 0
P_t((x, j), (x + t, j)) = 1 for all j and for all t such that (x + t, j) exists,
P_t((y, 4), (y + t, j)) = 1/2 for j = 5, 6 and for all t such that (y + t, j) exists,
P_t((100, j), (100, j)) = 1 for j = 2, 3, 5, 6.
The basic claim is that the states x_0 and y_0 cannot be bisimilar since the states x_1, x_2, y_1 cannot be bisimilar either. This is where condition (induction 2) is really important, since the two states x_0 and y_0 have similar traces as they both satisfy the condition (initiation 2).
Proposition 3.10. The two states x_0 and y_0 satisfy the condition (initiation 2).
Proof. From state x_0, there are only two trajectories possible, each with probability 1/2: From state y_0, there are only two trajectories possible, each with probability 1/2: However, for all times t ≥ 0, obs(ω_{x_1}(t)) = obs(ω_{y_1}(t)) and obs(ω_{x_2}(t)) = obs(ω_{y_2}(t)), which means that if a set B is obs-closed, then ω_{x_1} ∈ B (resp. ω_{x_2} ∈ B) if and only if ω_{y_1} ∈ B (resp. ω_{y_2} ∈ B). Putting all this together, we get that for any obs-closed set B:
Proposition 3.11. The states x_0 and y_0 cannot be bisimilar.

Examples based on Brownian motion
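Let us now consider two different states x and y. We can define the set B_t = {ω | ∃s < t, ω(s) = 0}. This set is obs-closed. It can also be expressed as B_t = T_0^{−1}([0, t)), where T_0 is the hitting time of 0 for Brownian motion, and we know that for any state z ≠ 0, P^z(T_0 ∈ ds) = (|z| / √(2πs³)) exp(−z²/(2s)) ds. This distribution depends on z only through |z| and differs for different values of |z|, so P^x(B_t) = P^y(B_t) for all t requires |x| = |y|. This proves that no equivalence strictly bigger than R may satisfy (initiation 2).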
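Integrating the hitting-time density of standard Brownian motion gives the closed form P^z(T_0 < t) = erfc(|z| / √(2t)) for z ≠ 0 (the reflection principle), which makes the dependence on |z| easy to inspect numerically. The sketch below evaluates P^z(B_t) for B_t = {ω | ∃s < t, ω(s) = 0}; the function name is an illustrative choice.

```python
import math

def hit_zero_before(z, t):
    """P^z(B_t): probability that Brownian motion from z visits 0 before time t."""
    if t <= 0:
        return 0.0  # the time window [0, t) is empty
    if z == 0:
        return 1.0  # the process starts at 0
    return math.erfc(abs(z) / math.sqrt(2.0 * t))
```

The value is symmetric in z and strictly decreasing in |z| for fixed t > 0, so two states with different absolute values are told apart by some B_t, in line with the argument above.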
With all integers distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x ∈ Z.
Proposition 3.13. Two states x and y are bisimilar if and only if x − ⌊x⌋ = y − ⌊y⌋ or x − ⌊x⌋ = ⌈y⌉ − y.
Proof. First let us prove that R = {(x, y) | x − ⌊x⌋ = y − ⌊y⌋ or x − ⌊x⌋ = ⌈y⌉ − y} is indeed a bisimulation. This relies on the invariance under translation and symmetry of the problem.
Let us consider x R y. There are two cases to consider:
• x − ⌊x⌋ = y − ⌊y⌋, i.e. k = y − x ∈ Z. This means that x ∈ Z if and only if y ∈ Z and therefore obs(x) = obs(y). Now consider an R-closed measurable set B of trajectories. Since it is R-closed, we have that B + k = B, and by invariance under translation P^x(B) = P^{x+k}(B + k) = P^y(B).
• x − ⌊x⌋ = ⌈y⌉ − y. Using the previous case, we can assume that x and y are in [0, 1] and x = 1 − y. We have that x ∈ {0, 1} if and only if y ∈ {0, 1} and therefore obs(x) = obs(y). Now consider an R-closed measurable set B of trajectories. Since it is R-closed, we have that 1 − B = B, and by symmetry P^x(B) = P^{1−x}(1 − B) = P^y(B).
Let us show that it is the greatest such bisimulation. Consider x ∉ Z and y ∈ Z. We have that obs(x) ≠ obs(y) and therefore these two states cannot be bisimilar.
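Let us now consider bisimilar states x, y ∉ Z and the sets B_t = {ω | ∃s ∈ [0, t), ω(s) ∈ Z}. These sets are obs-closed. Furthermore, they can also be expressed as B_t = ∪_{n∈Z} T_n^{−1}([0, t)), where T_n is the hitting time of n. This proves that the sets B_t are measurable. Let us compute P^z(B_t) for any z ∈ R: by invariance under translation, P^z(B_t) = P^{z−⌊z⌋}(T_0 ∧ T_1 < t). Since this is true for all t ≥ 0 and all z ∈ R, we have that P^{x−⌊x⌋}((T_0 ∧ T_1) ∈ ds) = P^{y−⌊y⌋}((T_0 ∧ T_1) ∈ ds) and therefore the following Laplace transforms are equal: E^{x−⌊x⌋}[e^{−λ(T_0∧T_1)}] = E^{y−⌊y⌋}[e^{−λ(T_0∧T_1)}] for all λ ≥ 0. Using [KS12], we have that for z ∈ [0, 1], E^z[e^{−λ(T_0∧T_1)}] = cosh((z − 1/2)√(2λ)) / cosh(√(2λ)/2). Therefore, we know that cosh((x − ⌊x⌋ − 1/2)√(2λ)) = cosh((y − ⌊y⌋ − 1/2)√(2λ)). Using simple properties of cosh, we get that either x − ⌊x⌋ − 1/2 = y − ⌊y⌋ − 1/2 or x − ⌊x⌋ − 1/2 = −(y − ⌊y⌋ − 1/2), i.e. x − ⌊x⌋ = y − ⌊y⌋ or x − ⌊x⌋ = ⌈y⌉ − y. This proves that no equivalence strictly bigger than R may satisfy (initiation 2). Let us now prove that this is the greatest bisimulation.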
This set is obs-closed, however, for z > 1, we have that Therefore if there is a bisimulation R such that xRy, in particular we have that P x (B t ) = P y (B t ) and hence |x| = |y|.
Since we have that P x (B t ) = P y (B t ) for all t ≥ 0, we get that P x+1 (T 0 ∧ T 2 ∈ ds) = P y+1 (T 0 ∧T 2 ∈ ds) and therefore the corresponding Laplace transforms are equal: Therefore, we know that cosh x Using simple properties of cosh, we get that either x = y or x = −y which concludes the proof.

Brownian motion with drift
Let us consider a Brownian motion with drift, that is, the process whose position at time t is W_t + at (where W_t is the standard Brownian motion and a > 0; the case a < 0 is symmetric).
With zero distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x = 0.
Proposition 3.15. Two states x and y are bisimilar if and only if x = y.
Proof. As stated before, the equivalence where a state is only related to itself is a bisimulation.
Let us now show that this is the greatest bisimulation. Let us consider two different states x and y. Similarly to what we did for the standard Brownian motion, we can rule out the case where x = 0 (and y ≠ 0) or y = 0 (and x ≠ 0) by simply looking at the function obs.
For x, y ≠ 0, consider as before the set B_t = {ω | ∃s < t, ω(s) = 0}. It can also be expressed as B_t = T_0^{−1}([0, t)), where T_0 is the hitting time of 0 for the Brownian motion with drift, and we know that for any state z, P^z(T_0 ∈ ds) = (|z| / √(2πs³)) exp(−(z + as)²/(2s)) ds. Since we have that for all t, P^x(B_t) = P^y(B_t), then we also have that for all s > 0, |x| exp(−(x + as)²/(2s)) = |y| exp(−(y + as)²/(2s)). Since x, y ≠ 0, we have that for all s, t > 0, exp(−(x + as)²/(2s) + (x + at)²/(2t)) = exp(−(y + as)²/(2s) + (y + at)²/(2t)), which is equivalent to (s − t)y² = (s − t)x². This means that in that case |x| = |y|. Going back to the original expression, we have that for all s > 0, −(x + as)² = −(y + as)² and therefore 2asx = 2asy. Since a ≠ 0, we get that x = y in order to have (initiation 2).
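The algebra behind this step can be spelled out. The derivation below is a sketch that takes as given the standard first-passage density of 0 for a Brownian motion with drift a started at z ≠ 0; it only expands the square and compares the two sides, and the intermediate presentation is ours rather than the paper's.

```latex
% Assumed: hitting-time density of 0 for Brownian motion with drift a, started at z \neq 0.
\begin{align*}
\mathbb{P}^z(T_0 \in ds) &= \frac{|z|}{\sqrt{2\pi s^3}}
    \exp\!\left(-\frac{(z+as)^2}{2s}\right) ds,
&\frac{(z+as)^2}{2s} &= \frac{z^2}{2s} + az + \frac{a^2 s}{2}.
\end{align*}
% Equality of the densities at x and y for every s > 0 (the common factor e^{-a^2 s/2} cancels):
\begin{align*}
|x|\, e^{-ax} \exp\!\left(-\frac{x^2}{2s}\right)
  &= |y|\, e^{-ay} \exp\!\left(-\frac{y^2}{2s}\right)
  \qquad \text{for all } s > 0.
\end{align*}
% Dividing the identity at s by the identity at t \neq s removes the s-independent factors:
\begin{align*}
\exp\!\left(-\frac{x^2}{2}\Big(\frac{1}{s}-\frac{1}{t}\Big)\right)
  &= \exp\!\left(-\frac{y^2}{2}\Big(\frac{1}{s}-\frac{1}{t}\Big)\right)
  \;\Longrightarrow\; x^2 = y^2, \\
|x| = |y| \text{ and } e^{-ax} = e^{-ay}
  &\;\Longrightarrow\; x = y \qquad (a \neq 0).
\end{align*}
```

The drift is what breaks the reflection symmetry: without the factor e^{-az}, only |x| = |y| could be concluded, as in the driftless case.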
With all integers distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x ∈ Z.
Proposition 3.16. Two states x and y are bisimilar if and only if x − ⌊x⌋ = y − ⌊y⌋.
Proof. First let us prove that R = {(x, y) | x − ⌊x⌋ = y − ⌊y⌋} is indeed a bisimulation. This relies on the invariance under translation of the problem. Note that compared to standard Brownian motion with all integers distinguished, the drift "removes" the invariance under symmetry.
Indeed, let us consider x R y.Let k = y − x ∈ Z (i.e.x + k = y).Clearly obs(x) = obs(y) as in the standard Brownian motion case.Let us consider an R-closed set B (B + k = B as in the standard case).We have that: Let us show that it is the greatest such bisimulation.Similarly to the standard case, x / ∈ Z and y ∈ Z cannot be bisimilar since they don't have the same observables.
Let us now consider x, y / ∈ Z and the sets B t = {ω | ∃s ∈ [0, t) ω(s) ∈ Z} such that for all t, P x (B t ) = P y (B t ).Similarly to what we did in the case of standard BM, for all z, Since for all t, P x (B t ) = P y (B t ), we get that E x− x [e −λ(T 0 ∧T 1 ) ] = E y− y [e −λ(T 0 ∧T 1 ) ].For 0 ≤ z < 1 and all λ ≥ 0, We can denote k = √ 2λ + a 2 and we can define for z ∈ (0, 1) and k ≥ a, g z (k) = sinh((1 − z)k)e −az + sinh(zk)e a(1−z) .We have that for all k ≥ a, g x− x (k) = g y− y (k).We want to prove that x − x = y − y .This is done through the following lemma.Lemma 3.17.Consider z 1 , z 2 ∈ (0, 1).If g z 1 (k) = g z 2 (k) for all k ≥ a, then z 1 = z 2 .
Proof.Of lemma.First, note that for z ∈ (0, 1), Let us study the second case z 1 = z and z 2 = 1 − z for z ∈ (0, 1) \ {1/2}.We have that for all k ≥ a, g z (k) = g 1−z (k).This equation amounts to sinh(k(1 − z))e −az + sinh(kz)e a(1−z) = sinh(kz)e −a(1−z) + sinh(k(1 − z))e az By reorganizing the terms, we get that Considering the left hand-side, we have that: We can therefore consider the limit of the left-hand side of the equation: This means that in order to satisfy (initiation 2), we need to have that x − x = y − y and therefore R is the greatest bisimulation.
With an interval distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x ∈ [−1, 1].Proposition 3.18.Two states x and y are bisimilar if and only if x = y.
Proof. As stated, equality is a bisimulation. Let us prove that it is the greatest.
Let x, y ∉ [−1, 1] and for all t ≥ 0, For z > 1, we have that This function is injective on [1, +∞), which means that we cannot have two distinct bisimilar states x, y > 1.
For z < −1, we have that This function is injective on (−∞, −1], which means that we cannot have two distinct bisimilar states x, y < −1.
Assume x > 1 and y < −1.We then have that For λ = 0, we get that x = 1, which is not possible, and therefore x and y cannot be bisimilar.
We can denote k = √ 2λ + a 2 and we can define for z ∈ (0, 1) and k ≥ a, h z (k) = sinh((z + 1)k)e a(1−z) + sinh((1 − z)k)e −a(1+z) .We have that for all k ≥ a, h x (k) = h y (k).We want to prove that x = y.This is done through the following lemma.Lemma 3.19.
Let us study the second case z 1 = z and By reorganizing the terms, we get that Considering the left hand-side, we have that: We can therefore consider the limit of the left-hand side of the equation: With this the overall proof is complete.

Brownian motion with absorbing wall
Another usual variation on Brownian motion is to add boundaries and to consider that the process does not move anymore or dies once it has hit a boundary. Since all our previous examples involved probability distributions (as opposed to subprobabilities), we will see the boundary as killing the process.
Absorption at 0: let us consider the case of Brownian motion with absorption at the origin and without any atomic proposition. The state space is R_{>0}.
Proposition 3.20. Two states x and y are bisimilar if and only if x = y.
Proof. We know that equality is a bisimulation. Let us prove that it is the greatest.
For all t ≥ 0, the set B_t = {ω | ω(t) = ∂} is obs-closed. Let us clarify the intuition behind that set B_t: it is the set of trajectories such that the process following one of these trajectories is dead at time t.
For every state x > 0 and every t > 0, P^x_abs(B_t) = P^x(T_0 ≤ t) = (2/√(2π)) ∫_{x/√t}^{∞} e^{−u²/2} du, where T_0 is the hitting time of 0 for standard Brownian motion. This quantity is strictly decreasing in x, so the only way to have P^x_abs(B_t) = P^y_abs(B_t) is to have x = y.
This means that R is a bisimulation.
Let us now prove that it is the greatest bisimulation.For all t ≥ 0, the set For all x ∈ (0, b) and t ≥ 0, Similarly to what was done in the case of standard Brownian motion with all integers distinguished, we get that P x abs (B t ) = P y abs (B t ) for all t ≥ 0 if and only if x = y or x = b − y.
Absorption at 0 and 2b with atomic proposition at b: let us consider the case of Brownian motion with absorption at the origin and at 2b > 0, so the state space is (0, 2b), and with a single atomic proposition such that obs(b) = 1 and obs(x) = 0 for x ≠ b.
Proposition 3.22. Two states x and y are bisimilar if and only if x = y or y = 2b − x.
Proof. Let us define the equivalence R = {(x, y) | x = y or y = 2b − x}. Let us show that this relation R is a bisimulation. Clearly x = b if and only if 2b − x = b, and therefore obs(x) = obs(2b − x). The proof of the induction condition is similar to claim ??.
Similarly to proposition 3.21, we also have that this is the greatest bisimulation.
Absorption at 0 and 4b with atomic proposition at b: let us consider the case of Brownian motion with absorption at the origin and at 4b > 0, so the state space is (0, 4b), and with a single atomic proposition such that obs(b) = 1 and obs(x) = 0 for x ≠ b.
Proposition 3.23. Two states x and y are bisimilar if and only if x = y.
Proof. Using Proposition 3.21, it is clear that a state x can only be bisimilar to either itself or 4b − x. For z < b, Similarly, for z > b, This function is strictly decreasing on (b, 4b), hence we cannot have x ∈ (b, 2b) bisimilar to 4b − x. Moreover for x < b, we get And we get that E

Feller-Dynkin cospan
The concept of bisimulation that we have discussed so far is defined between states of a process. One often wants to compare different processes with different state spaces. For this one needs to use functions that relate the state spaces of different processes. One does want to preserve the relational character of bisimulation. In the coalgebra literature one uses spans of so-called "zigzag" morphisms. In previous work [DDLP06] on (discrete-time) Markov processes, cospans have been considered instead, as this leads to a smoother theory. Intuitively, the difference is whether one thinks of an equivalence relation as a set of ordered pairs or as a collection of equivalence classes.

Feller-Dynkin homomorphism
This definition of bisimulation can easily be adapted to states in different Markov processes by constructing the disjoint union of the Markov processes.
The disjoint union of two Markov processes is defined as follows. Given two FD processes (E_j, ℰ_j, (P̂^j_t), (P^j_t), Ω_j, G_j, (P^x_j), obs_j)_{j=1,2}, we write i_1 : E_1 → E_1 ⊎ E_2 and i_2 : E_2 → E_1 ⊎ E_2 for the two corresponding inclusions. The disjoint union of the two FD processes is the process (E_1 ⊎ E_2, ℰ, (P̂_t), (P_t), Ω, G, (P^x), obs) where:
• the topology on E_1 ⊎ E_2 is generated by the topologies on E_1 and where O_1 and O_2 are opens of E_1 and E_2 respectively,
• ℰ is the Borel σ-algebra generated by this topology. It can also be expressed as the σ-algebra generated by {i
• for any state x ∈ E_1 ⊎ E_2, any time t ≥ 0 and any function f in C_0(E_1 ⊎ E_2), we define the semigroup: where f_j : E_j → R is defined by f_j(y) = f ∘ i_j(y). The semigroup P̂_t inherits the desired properties from P̂^1_t and P̂^2_t for it to be an FDS,
• for any state x ∈ E_1 ⊎ E_2, any time t ≥ 0 and any measurable set C ∈ ℰ, the kernel can be made explicit as: ,
• for any state x ∈ E_1 ⊎ E_2, we set
• the set of trajectories on (E_1 ⊎ E_2)_∂ is denoted Ω. Note that a trajectory in Ω can switch between E_1 and E_2. The set Ω is equipped with a σ-algebra G as is standard for FD processes. For a state x, we can make the probability distribution explicit for B ∈ G:
We can also make explicit what a bisimulation is in that context (for readability, we omit the inclusions i_1 and i_2):
Definition 4.1. Given two FD processes (E_j, ℰ_j, (P̂^j_t), (P^j_t), Ω_j, G_j, (P^x_j), obs_j)_{j=1,2}, a bisimulation between the two FDPs is an equivalence R on E_1 ⊎ E_2 such that for all x R y (x ∈ E_i, y ∈ E_j),
(initiation 1) obs_i(x) = obs_j(y), and
(induction 2) for all measurable R-closed sets B, P^x(B ∩ Ω_i) = P^y(B ∩ Ω_j).
This condition can also be stated as follows. For all sets B_1 ∈ G_1 and B_2 ∈ G_2, P^x_i(B_i) = P^y_j(B_j) if the two sets satisfy the following condition: In that formulation, B_k = B ∩ Ω_k and the condition states that the set B is R-closed in terms of the sets B_1 and B_2.
Note that R ∩ (E_j × E_j) is a bisimulation on (E_j, ℰ_j, (P^j_t), (P^x_j)). To proceed with our cospan idea we need a functional version of bisimulation; we call these Feller-Dynkin homomorphisms, or FD-homomorphisms for short.
Definition 4.2. A continuous function f : E → E′ is called an FD-homomorphism if it satisfies the following conditions:
• for all x ∈ E and for all measurable sets B′ ⊂ Ω′, P^{f(x)}(B′) = P^x(B) where
Note that if f and g are FD-homomorphisms, then so is g ∘ f.
Proposition 4.3. The equivalence relation R defined on E ⊎ E′ as is a bisimulation on E.
Proof. Consider x and y such that x R y. We are going to assume that f(x) = f(y) and we will be treating the case x R f(x) at the same time.
Second, let us check the induction condition (induction 2). Consider an R- As f is an FD-homomorphism, we have that P^{f(x)}(B′) = P^x(B).
Let us show that B = B ∩ Ω.
Since the set B is R-closed and f ∘ ω ∈ B, we have that ω ∈ B, which proves the first inclusion.
• Consider ω ∈ B ∩ Ω. The trajectory f ∘ ω is well-defined and is in Ω′ since f is continuous. Similarly to what was done for the first inclusion, we get that f ∘ ω ∈ B since ω ∈ B and B is R-closed. This proves that ω ∈ B. We get that P^{f(x)}(B ∩ Ω′) = P^x(B ∩ Ω).
Corollary 4.4. The equivalence relation R defined on E as is a bisimulation on E.
Here is an example with one atomic proposition. Let M_1 be the standard Brownian motion on the real line with obs_1(x) = 1 if and only if x ∈ Z. Let M_2 be the reflected Brownian motion on [0, 1] with obs_2(x) = 1 if and only if x = 0 or 1. Let M_3 be the reflected Brownian motion on [0, 1/2] with obs_3(x) = 1 if and only if x = 0. Let M_4 be the standard Brownian motion on the circle of radius 1/(2π) (we will identify points on the circle with the angle with respect to the vertical) with obs_4(x) = 1 if and only if x = 0.
We can define some natural mappings between these processes: where Note that the condition in the definition of φ_3 means that y is the closest integer to x.
Proposition 4.5. These morphisms are FD-homomorphisms.
Proof. Note that all these functions are continuous.
First note that obs_3 ∘ φ_1(θ) = 1 if and only if φ_1(θ) = 0 (by definition of obs_3). By definition of φ_1, this is |θ|/2π = 0, i.e. θ = 0. But this corresponds to the only case where obs_4(θ) = 1 (note that what we have proven is an equivalence). We have therefore proven that obs The second condition is obvious by definition of Brownian motion on these sets.
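The observation-preservation half of this argument can be sanity-checked on sample points. In the sketch below, `phi1` follows the expression |θ|/2π quoted in the proof. The maps `fold_unit` (from [0, 1] to [0, 1/2]) and `dist_to_int` (from the real line to [0, 1/2], the distance to the closest integer) are natural candidates introduced here as assumptions; they are not claimed to be the paper's own definitions. The check covers only the condition on obs, not the condition on the probability measures.

```python
import math

# Atomic propositions of the four processes, as described in the text.
def obs1(x): return 1 if x == math.floor(x) else 0      # integers on the real line
def obs2(x): return 1 if x in (0.0, 1.0) else 0         # endpoints of [0, 1]
def obs3(x): return 1 if x == 0.0 else 0                # origin of [0, 1/2]
def obs4(theta): return 1 if theta == 0.0 else 0        # angle 0 on the circle

def phi1(theta):
    """Circle (angle in (-pi, pi]) -> [0, 1/2], as quoted in the proof."""
    return abs(theta) / (2.0 * math.pi)

def fold_unit(x):
    """[0, 1] -> [0, 1/2]; assumed folding map."""
    return min(x, 1.0 - x)

def dist_to_int(x):
    """Real line -> [0, 1/2]; assumed map: distance to the closest integer."""
    return abs(x - math.floor(x + 0.5))

def preserves_obs(f, obs_src, obs_dst, samples):
    """Check obs_src(x) == obs_dst(f(x)) on the given sample points."""
    return all(obs_src(x) == obs_dst(f(x)) for x in samples)

reals = [k / 8.0 for k in range(-32, 33)]      # dyadic points, exact in floating point
unit = [k / 8.0 for k in range(0, 9)]
angles = [k * math.pi / 8.0 for k in range(-7, 9)]

print(preserves_obs(dist_to_int, obs1, obs3, reals))
print(preserves_obs(fold_unit, obs2, obs3, unit))
print(preserves_obs(phi1, obs4, obs3, angles))
```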
We equip this set with the smallest topology that makes π_∼ continuous (where the topology on E_1 ⊎ E_3 is the topology inherited from the inclusions). Note that this corresponds to the pushout in Top.
This proves that B_1 ∈ G_1.
First note that if φ_1(z_1) = φ_3(z_3), then there exists z_2 ∈ E_2 such that z_1 = h(z_2) and z_3 = g(z_2). Since h and g are FD-homomorphisms, we get that This indeed defines a probability distribution on (Ω_4, G_4) (this is a direct consequence of the fact that P^{z_1} and P^{z_3} are probability distributions).
Let us now check that this is indeed a Feller-Dynkin process. We define first the corresponding kernel for t ≥ 0, x ∈ E_4 and C ∈ ℰ_4,

Given an LMP (X, Σ, τ, (χ_A)_{A∈AP}), we can always view it as an FD process on (E, ℰ) with E = X × [0, 1) and ℰ = Σ × B([0, 1)) by adding to the space the following kernel: for all x ∈ X and C ∈ Σ, t ≥ 0 and s ∈ [0, 1), We also define (obs(x, s)

Let us recall the definition of bisimulation in the discrete-time setting.
Definition 5.1. Given an LMP (X, Σ, τ, (χ_A)_{A∈AP}), a DT-bisimulation R is an equivalence relation on X such that if x R y, then

Proof. Let us denote by π_R : X → X/R the quotient. We can also define some function τ(π_R(x), A) = τ(x, π_R^{−1}(A)). Note that the choice of x does not change the right-hand term since R is a DT-bisimulation and π_R^{−1}(A) is an R-closed set. A sequence of changes of variables yields:

Proof. First note that the relation R′ is indeed an equivalence since R is one.
Let us show that obs(x, s) = obs(y, s). This is a direct consequence of the fact that obs_i(x, s) = χ_{A_i}(x) (and similarly for y) and, since x R y, χ_{A_i}(x) = χ_{A_i}(y).
Finally, we want to prove that P^{(x,s)}(B) = P^{(y,s)}(B) for any measurable R′-closed set B.
The measurable R′-closed set B is of the form {ω | ∀i ∈ N ω(i) ∈ A_i} where A_i ∈ ℰ_∂ is an R′-closed set.
Let us denote for all n ∈ N, B_n = {ω | ∀i ≤ n ω(i) ∈ A_i} and let us show that P^{(x,s)}(B_n) = P^{(y,s)}(B_n). To prove that, we can also assume that ∀(x, t) ∈ A_i, t = s. We denote A′_i = {z | (z, s) ∈ A_i}.
Assume ∂ ∈ A_k with k ≤ n. We can write explicitly: For that reason, we can deal with the two following cases and conclude in full generality that for all n ∈ N, P^{(x,s)}(B_n) = P^{(y,s)}(B_n):
• First, if for all i ≤ n, ∂ ∉ A_i, then we have that P^{+∂}_1((z, s), A_i) = τ(z, A′_i) and in this case we have that P^{(x,s)}(B_n) = ∫_{x_0∈A_0} ... ∫_{x_n∈A_n} δ_{(x,s)}(dx_0) P^{+∂}_1(x_0, dx_1) ... P^{+∂}_1(x_{n−1}, dx_n) = ∫_{x_1∈A_1} ...
Moreover, since P^{(x,s)}(B) = lim_{n→∞} P^{(x,s)}(B_n) (and similarly for y), we get the desired result.
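For a finite state space, the discrete-time notion recalled above can be checked mechanically. The sketch below assumes the usual definition of a DT-bisimulation on an LMP: related states satisfy the same atomic propositions and assign the same probability to every R-closed set. On a finite space the R-closed sets are unions of equivalence classes, so by additivity it suffices to compare the mass sent to each class. The data layout (`classes`, `tau`, `labels`) and the function name are hypothetical.

```python
from itertools import combinations

def is_dt_bisimulation(classes, tau, labels, tol=1e-12):
    """Check the assumed DT-bisimulation conditions on a finite LMP.

    classes: list of disjoint sets partitioning the state space (the equivalence R)
    tau:     dict state -> dict successor -> probability (sub-probabilities allowed)
    labels:  dict state -> frozenset of atomic propositions holding at that state
    """
    for cls in classes:
        for x, y in combinations(sorted(cls), 2):
            if labels[x] != labels[y]:
                return False
            for target in classes:
                px = sum(tau[x].get(z, 0.0) for z in target)
                py = sum(tau[y].get(z, 0.0) for z in target)
                if abs(px - py) > tol:
                    return False
    return True

tau = {
    "s": {"u": 0.5, "v": 0.5},
    "t": {"u": 1.0},
    "u": {"u": 1.0},
    "v": {"v": 1.0},
}
labels = {"s": frozenset(), "t": frozenset(),
          "u": frozenset({"A"}), "v": frozenset({"A"})}

coarse = [{"s", "t"}, {"u", "v"}]   # u and v identified
fine = [{"s", "t"}, {"u"}, {"v"}]   # u and v kept apart

print(is_dt_bisimulation(coarse, tau, labels))
print(is_dt_bisimulation(fine, tau, labels))
```

The per-class sums computed here are exactly the values of the quotient kernel τ(x, π_R^{−1}(A)) for a single class A, which is why they must not depend on the representative x.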
Definition 5.4. An equivalence R on the state space of an LMP viewed as an FD process is time-coherent if for all x, y in the state space of the LMP and for all 0 ≤ t < 1, (x, t) R (y, t) ⇒ ∀s ∈ [0, 1) (x, s) R (y, s).
Given any equivalence R on the state space of an LMP viewed as an FD process, we define its time-coherent closure (denoted time(R)) as the smallest time-coherent equivalence containing R.
Proposition 5.5. If R is a bisimulation on an LMP viewed as an FD process, then so is time(R).

GJP06]. It seems that some of the details in that work are possibly incorrect, so we hope to fix those details and to adapt similar ideas to our framework.
Another important and interesting question is that of approximations (see [DGJP03, DDP03, CDPP14]) of our Markov processes.Here we will undoubtedly face new subtleties as we will have to cope with both spatial and temporal limits.
Finally, a fundamental result in this area is the logical characterization of bisimulation [vB76,HM80], which was also extended to the probabilistic case [DEP02]. We hope to be able to provide such a logic for continuous-time processes based on the set of [0, 1]-valued functions used to obtain a bisimulation metric. A game interpretation of bisimulation could also be provided [FKP17]. Perhaps some interesting insights could also come from nonstandard analysis [FK17], where there is also a notion of equivalence but one which is quite different from bisimulation. In that work the notion of adapted spaces is fundamental.
3.4.1 Standard Brownian Motion
With zero distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x = 0.
Proposition 3.12. Two states x and y are bisimilar if and only if |x| = |y|.
Proof. First let us prove that R = {(x, y) | |x| = |y|} is a bisimulation. Consider x R y, i.e. |x| = |y|. This means that x = 0 if and only if y = 0. In other terms, obs(x) = 1 if and only if obs(y) = 1 and hence obs(x) = obs(y). Let now B be an R-closed measurable set of trajectories. This means that ω ∈ B if and only if −ω ∈ B since |ω(t)| = |−ω(t)| for all times t ≥ 0. Therefore P^x(B) = P^{−x}(−B) = P^{−x}(B), where −B := {t ↦ −ω(t) | ω ∈ B}.
With an interval distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x ∈ [−1, 1].
Proposition 3.14. Two states x and y are bisimilar if and only if |x| = |y|.
Proof. First let us prove that R = {(x, y) | |x| = |y|} is a bisimulation. Consider x R y, i.e. |x| = |y|. Clearly, x ∈ [−1, 1] if and only if y ∈ [−1, 1] and therefore obs(x) = obs(y). Let us now look at the induction condition. Consider an R-closed measurable set B of trajectories. This means that −B := {t ↦ −ω(t) | ω ∈ B} = B and therefore P^x(B) = P^y(B).

Absorption at 0 and b: let us consider the case of Brownian motion with absorption at the origin and at b > 0 and without any atomic proposition. The state space is therefore (0, b).
Proposition 3.21. Two states x and y are bisimilar if and only if x = y or x = b − y.
Proof. Let us define the equivalence R = {(x, x), (x, b − x) | x ∈ (0, b)}. First note that it is indeed a bisimulation. There are no atomic propositions. Let us now consider an R-closed set B of trajectories. This means that b − B := {t ↦ b − ω(t) | ω ∈ B} = B. So for x ∈ (0, b), we have that
First, state b is not bisimilar to 3b since obs(b) = 1 and obs(3b) = 0. Let us now show that x ∈ (0, 2b) with x ≠ b and 4b − x ∈ (2b, 4b) are not bisimilar. Let us define B_t = {ω | ∃s ∈ [0, t) ω(s) = b}. This set is indeed obs-closed. Similarly to what was done in the standard Brownian motion case, we can compare E^x_abs[e^{−λ T^abs_b}] and E^{4b−x}_abs[e^{−λ T^abs_b}] instead.

Proof.
Let us start by clarifying what the equivalence time(R) is. Define the relation Q = {((x, s), (y, s)) | ∃t (x, t) R (y, t)}; it is reflexive and symmetric. Let us consider its transitive closure tc(Q). The relation tc(Q) is an equivalence. Moreover, it contains the equivalence R and it is time-coherent.
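On a finite LMP this construction can be sketched directly. Since Q relates (x, s) and (y, s) for every s as soon as a single time t witnesses (x, t) R (y, t), the closure tc(Q) is determined by a partition of the LMP states alone. The sketch below computes that partition with a union-find structure; the input format (a list of witness triples (x, y, t)) and the function name are assumptions made for illustration.

```python
def time_coherent_classes(states, related):
    """Partition of the LMP states induced by tc(Q).

    states:  iterable of LMP states
    related: iterable of triples (x, y, t), each meaning (x, t) R (y, t)
    Two states end up in the same class when a chain of witnesses links them,
    whatever the times of the individual witnesses.
    """
    parent = {x: x for x in states}

    def find(x):
        # Find the class representative, compressing the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y, _t in related:
        parent[find(x)] = find(y)

    classes = {}
    for x in parent:
        classes.setdefault(find(x), set()).add(x)
    return sorted(sorted(c) for c in classes.values())

# a ~ b is witnessed at time 0.1 and b ~ c at time 0.7; d is related to nothing.
print(time_coherent_classes("abcd", [("a", "b", 0.1), ("b", "c", 0.7)]))
```

The closure time(R) then relates (x, s) and (y, s) for every s in [0, 1) exactly when x and y fall in the same class.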