Towards a classification of behavioural equivalences in continuous-time Markov processes

Bisimulation is a concept that captures behavioural equivalence of states in a transition system. In [6], we proposed two equivalent definitions of bisimulation on continuous-time stochastic processes where the evolution is a flow through time. In the present paper, we develop the theory further: we introduce different concepts that correspond to weaker behavioural equivalences and compare them to bisimulation. In particular, we study the relation between bisimulation and symmetry groups of the dynamics. We also provide a game interpretation for two of the behavioural equivalences. We then compare those notions to their discrete-time analogues.


Introduction
Bisimulation [16,18,20] is a fundamental concept in the theory of transition systems capturing a strong notion of behavioural equivalence. In particular, it is a notion stronger than that of trace equivalence. Bisimulation has been widely studied for discrete time systems where transitions happen as steps, both on discrete [15] and continuous state spaces [4,9,17]. In all these types of systems a crucial ingredient of the definition of bisimulation is the ability to talk about the next step. Thus, the general format of the definition of bisimulation is that one has some property that must hold "now" (in the states being compared) and then one says that the relation is preserved in the next step.
Outside of computer science, there is a vast range of systems that involve continuous-time evolution: deterministic systems governed by differential equations and stochastic systems governed by "noisy" differential equations called stochastic differential equations. These have been extensively studied for over a century since the pioneering work of Einstein [12] on Brownian motion.
In [6], we introduced a notion of bisimulation for stochastic systems with true continuous-time evolution. Some attempts had previously been made to talk about continuous time [10], but even in what are called continuous-time Markov chains there is a discrete notion of time step; it is only the real-valued duration associated with each state that makes such systems continuous-time. They are often called "jump processes" in the mathematical literature (see, for example, [19,21]), a phrase that better captures the true nature of such processes.
We focused on a class of systems called Feller-Dynkin processes for which a good mathematical theory exists. These systems are Markov processes defined on continuous state spaces and with continuous time evolution. Such systems encompass Brownian motion and its many variants.
In this paper, we explore four other notions of behavioural equivalence for such continuous-time processes. The strongest notion is that of a group of symmetries. It is stronger than the notion of bisimulation introduced in [6] and it captures the symmetries of the system.
Temporal equivalence is a notion that is weaker than bisimulation. It looks closer to the definition of bisimulation in discrete time than the definition we provided in [6]; however, it also strongly relies on trajectories. Temporal equivalence can be summed up as trace equivalence with some additional step-like constraints. Whether a group of symmetries is strictly stronger than bisimulation, and whether temporal equivalence is strictly weaker, are still open questions.
The third notion is that of trace equivalence. It is the weakest of all those behavioural equivalences and an example in [6] shows that it is a strictly weaker notion.
Finally, we give two game interpretations, one for bisimulation and one for temporal equivalence. They closely mirror that provided in [13,7]. The game for bisimulation also emphasizes the importance of trajectories for the study of behavioural equivalences in continuous time.
The relations between those different behavioural equivalences can be displayed as follows.

group of symmetries ⟹ bisimulation ⟹ temporal equivalence ⟹ trace equivalence

We end this paper by studying discrete-time systems and by revisiting the examples provided in our previous study. This seems to indicate that the correct notion that extends bisimulation to continuous-time systems is that of temporal equivalence and not the initial definition provided in [6].

Feller-Dynkin Processes
We assume that basic concepts like topology, measure theory and basic concepts of probability on continuous spaces are well known; see, for example [3,11,17].
The basic arena for the action is a probability space.

Definition 2.1. A probability space is a triple $(S, \mathcal{F}, P)$ where $S$ is a space (usually some kind of topological space), $\mathcal{F}$ is a σ-algebra on $S$ (usually its Borel algebra) and $P$ is a probability measure on $(S, \mathcal{F})$.

Given a measurable space $(X, \Sigma)$, a (sub-)Markov kernel is a map $\tau : X \times \Sigma \to [0, 1]$ which is measurable in its first argument, i.e. $\tau(\cdot, A) : X \to \mathbb{R}$ is measurable for any fixed $A \in \Sigma$, and such that for any fixed $x \in X$, $\tau(x, \cdot)$ is a (sub)probability measure. These kernels describe transition probability functions.

Definition 2.2. A filtration on a measurable space $(\Omega, \mathcal{F})$ is a nondecreasing family $(\mathcal{F}_t)_{t \geq 0}$ of sub-σ-algebras of $\mathcal{F}$, i.e. $\mathcal{F}_s \subseteq \mathcal{F}_t \subseteq \mathcal{F}$ for $0 \leq s < t < \infty$.

This concept is used to capture the idea that at time $t$ what is "known" or "observed" about the process is encoded in the sub-σ-algebra $\mathcal{F}_t$.

Definition 2.3. A stochastic process is a collection of random variables $(X_t)_{0 \leq t < \infty}$ on a measurable space $(\Omega, \mathcal{F})$ that take values in a second measurable space $(S, \mathcal{S})$ called the state space. We say that a stochastic process is adapted to a filtration $(\mathcal{F}_t)_{t \geq 0}$ if for each $t \geq 0$, $X_t$ is $\mathcal{F}_t$-measurable.

Note that a stochastic process is always adapted to the filtration $(\mathcal{G}_t)_{t \geq 0}$, where for each $t \geq 0$, $\mathcal{G}_t$ is defined as the σ-algebra generated by all the random variables $\{X_s \mid s \leq t\}$. The filtration $(\mathcal{G}_t)_{t \geq 0}$ is also referred to as the natural filtration associated to $(X_t)_{t \geq 0}$.

Before stating the definition of the continuous-time processes we will be interested in, let us first recall the definition of their discrete-time counterparts.

Definition 2.4. A labelled Markov process (LMP) is a triple $(X, \Sigma, \tau)$ where $(X, \Sigma)$ is a measurable space and $\tau$ is a Markov kernel.
We will quickly review the theory of continuous-time processes on continuous state space; much of this material is adapted from "Diffusions, Markov Processes and Martingales, Volume I" by Rogers and Williams [19] and we use their notations. Another useful source is "Functional analysis for probability and stochastic processes" by A. Bobrowski [5].

Let $E$ be a locally compact, Hausdorff space with countable base which is also σ-compact and Polish, and let it be equipped with the Borel σ-algebra $\mathcal{E} = \mathcal{B}(E)$. We write $E_\partial$ for the one-point compactification of $E$. The physical picture is that the added state, $\partial$, represents a point at infinity; we will view it as an absorbing state. Denoting $\mathcal{O}$ the topology on $E$, the space $E_\partial = E \uplus \{\partial\}$ is equipped with the topology $\mathcal{O} \cup \{(E \setminus K) \cup \{\partial\} \mid K \subseteq E \text{ compact}\}$.

We say that a continuous real-valued function $f$ on $E$ "vanishes at infinity" if for every $\varepsilon > 0$ there is a compact subset $K \subset E$ such that $\forall x \in E \setminus K$ we have $|f(x)| \leq \varepsilon$. The space $C_0(E)$ of such functions is a Banach space with the sup norm.

Definition 2.5. A semigroup of operators on any Banach space is a family of linear continuous (bounded) operators $T_t$ indexed by $t \in \mathbb{R}_{\geq 0}$ such that
$$\forall s, t \geq 0,\ T_{s+t} = T_s \circ T_t \quad \text{and} \quad T_0 = I \text{ (the identity)}.$$
The first equation above is called the semigroup property. The operators in a semigroup are continuous; however, there is a useful continuity property of the semigroup as a whole.

Definition 2.6. For $X$ a Banach space, we say that a semigroup $T_t : X \to X$ is strongly continuous if
$$\forall x \in X,\ \lim_{t \downarrow 0} \| T_t x - x \| = 0.$$

Definition 2.7. A Feller-Dynkin (FD) semigroup is a strongly continuous semigroup $(P_t)_{t \geq 0}$ of linear operators on $C_0(E)$ (the space of continuous functions on $E$ which vanish at infinity) satisfying the additional condition:
$$\forall t \geq 0,\ \forall f \in C_0(E),\ \text{if } 0 \leq f \leq 1 \text{ then } 0 \leq P_t f \leq 1.$$

The following important proposition relates these FD semigroups with Markov processes, which allows one to see the connection with more familiar probabilistic transition systems.

Proposition 2.8. Given such an FD semigroup, it is possible to define a unique family of sub-Markov kernels $(P_t)_{t \geq 0} : E \times \mathcal{E} \to [0, 1]$ such that for all $t \geq 0$ and $f \in C_0(E)$,
$$P_t f(x) = \int f(y)\, P_t(x, dy).$$

A very important ingredient in the theory is the space of trajectories of a FD process (FD semigroup) as a probability space.
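This space does not appear explicitly in the study of labelled Markov processes but one does see it in the study of continuous-time Markov chains and jump processes.

Definition 2.9. We define a trajectory $\omega$ on $E_\partial$ to be a cadlag¹ function $\omega : [0, \infty) \to E_\partial$ such that if either $\omega(t-) := \lim_{s < t,\, s \to t} \omega(s) = \partial$ or $\omega(t) = \partial$, then $\omega(u) = \partial$ for all $u \geq t$.

It is possible to associate to such an FD semigroup a canonical FD process.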
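Let $\Omega$ be the set of trajectories $\omega : [0, \infty) \to E_\partial$.

Definition 2.10. The canonical FD process associated to the FD semigroup $(P_t)_{t \geq 0}$ is $(\Omega, \mathcal{G}, (\mathcal{G}_t)_{t \geq 0}, (X_t)_{t \geq 0}, (P^x)_{x \in E_\partial})$ where
• the random variable $X_t : \Omega \to E_\partial$ is defined as $X_t(\omega) = \omega(t)$,
• $\mathcal{G} = \sigma(X_s \mid 0 \leq s < \infty)$² and $\mathcal{G}_t = \sigma(X_s \mid 0 \leq s \leq t)$,
• given any probability measure $\mu$ on $E_\partial$, by the Kolmogorov extension theorem, there exists a unique probability measure $P^\mu$ on $(\Omega, \mathcal{G})$ such that for all $n \in \mathbb{N}$, $0 \leq t_1 \leq t_2 \leq \dots \leq t_n$ and $x_0, x_1, \dots, x_n$ in $E_\partial$,³
$$P^\mu(X_0 \in dx_0, X_{t_1} \in dx_1, \dots, X_{t_n} \in dx_n) = \mu(dx_0)\, P^{+\partial}_{t_1}(x_0, dx_1) \cdots P^{+\partial}_{t_n - t_{n-1}}(x_{n-1}, dx_n),$$
where $P^{+\partial}_t$ is the Markov kernel extending the Markov kernel $P_t$ to $E_\partial$ by $P^{+\partial}_t(x, \{\partial\}) = 1 - P_t(x, E)$ and $P^{+\partial}_t(\partial, \{\partial\}) = 1$. We set $P^x = P^{\delta_x}$.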
This is the version of the system that will be most useful for us. In order to bring it more in line with the kind of transition systems that have hitherto been studied in the computer science literature, we introduce a finite set of atomic propositions $AP$, and such a FD process is equipped with a function $obs : E \to 2^{AP}$. This function is extended to a function $obs : E_\partial \to 2^{AP} \uplus \{\partial\}$ by setting $obs(\partial) = \partial$.

¹ By cadlag we mean right-continuous with left limits.
² The σ-algebra $\mathcal{G}$ is the same as the one induced by the Skorohod metric, see theorem 16.6 of [2].
³ The $dx_i$ in this equation should be understood as infinitesimal volumes. This notation is standard in probability and should be understood by integrating it over measurable state sets $C_i$.
Instead of following the dynamics of the system step by step as one does in a discrete system we have to study the behaviour of sets of trajectories. The crucial ingredient is the distribution P x which gives a measure on the space of trajectories for a system started at the point x.

Brownian motion as a FD process
Brownian motion is a stochastic process describing the irregular motion of a particle being buffeted by invisible molecules. Its range of applicability now extends far beyond this initial application [14]. The following definition is from [14].

Definition 2.11. A standard one-dimensional Brownian motion is a Markov process $(W_t)_{t \geq 0}$ adapted to the filtration $(\mathcal{F}_t)_{t \geq 0}$, defined on a probability space $(\Omega, \mathcal{F}, P)$, with the properties
1. $W_0 = 0$ almost surely,
2. for $0 \leq s < t$, $W_t - W_s$ is independent of $\mathcal{F}_s$ and is normally distributed with mean 0 and variance $t - s$.
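In this very special process, one can start at any place, and there is an overall translation symmetry which makes calculations more tractable. In order to do any calculations we use the following fundamental formula: if the process is at $x$ at time 0, then at time $t$ the probability that it is in the (measurable) set $D$ is given by
$$P_t(x, D) = \int_{y \in D} \frac{1}{\sqrt{2\pi t}} \exp\left(-\frac{(x - y)^2}{2t}\right) dy.$$
The associated FD semigroup is the following: for $f \in C_0(\mathbb{R})$ and $x \in \mathbb{R}$,
$$P_t(f)(x) = \int_{y} \frac{f(y)}{\sqrt{2\pi t}} \exp\left(-\frac{(x - y)^2}{2t}\right) dy.$$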
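As a minimal numerical sketch of this transition probability, the kernel can be evaluated on an interval using the Gaussian distribution with mean x and variance t, which is what Definition 2.11 gives for the increment of a Brownian motion started at x. The function names `normal_cdf` and `brownian_kernel` are ours, chosen for illustration, and only intervals are handled.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def brownian_kernel(t: float, x: float, a: float, b: float) -> float:
    """Probability that Brownian motion started at x lies in [a, b] at time t > 0."""
    if t <= 0:
        raise ValueError("t must be positive")
    s = math.sqrt(t)  # standard deviation of the position at time t
    return normal_cdf((b - x) / s) - normal_cdf((a - x) / s)

# Translation symmetry: shifting both the start and the target set by the
# same amount leaves the probability unchanged.
p = brownian_kernel(2.0, 0.0, -1.0, 1.0)
q = brownian_kernel(2.0, 5.0, 4.0, 6.0)
print(p, q)
```

The two printed values coincide, which is the translation symmetry mentioned above; the same check with the reflection x ↦ −x illustrates the symmetry used later for the group {s, id}.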

Bisimulation
We introduced a notion of bisimulation in [6] that we will recall in this section. We will also define two weaker notions: trace equivalence and temporal equivalence. The latter seems to be a better generalization of bisimulation in discrete-time systems than our original definition of bisimulation.

Another well-known concept is that of trace equivalence. Temporal equivalence can be viewed as trace equivalence which additionally accounts for step-like branching. As such, it is weaker than bisimulation but stronger than trace equivalence. As shown in section 6.1, it seems to be the notion that best generalizes discrete-time bisimulation since the induction requirement of bisimulation is actually very strong.

Definition 3.5. A temporal equivalence is an equivalence relation $R$ on $E$ such that for all $x, y \in E$, if $x \mathrel{R} y$, then
(initiation) for all measurable time-obs-closed sets $B$, $P^x(B) = P^y(B)$, and
(induction) for all measurable $R$-closed sets $C$, for all times $t$, $P_t(x, C) = P_t(y, C)$.

Lemma 3.6. There is a greatest temporal equivalence.
Proof. Define $\mathcal{R}$ as the set of temporal equivalences and $R$ as the transitive closure of $\bigcup_{R' \in \mathcal{R}} R'$.

First note that the relation $R$ is an equivalence. The equivalence $\{(x, x) \mid x \in E\}$ is a temporal equivalence and hence $x \mathrel{R} x$. Furthermore, if $x \mathrel{R} y$, there exist $(x_i)_{i=0,\dots,n}$ in $E$ and $(R_j)_{j=0,\dots,n-1}$ in $\mathcal{R}$ such that $x_0 = x$, $x_n = y$ and for every $i \in \{0, \dots, n-1\}$, $x_i \mathrel{R_i} x_{i+1}$. Since $R_i$ is an equivalence, $x_{i+1} \mathrel{R_i} x_i$, and hence $y \mathrel{R} x$. Finally, by definition the relation $R$ is transitive.

Now, we can prove that $R$ is a temporal equivalence. Consider $x \mathrel{R} y$, i.e. there exist $(x_i)_{i=0,\dots,n}$ in $E$ and $(R_j)_{j=0,\dots,n-1}$ in $\mathcal{R}$ such that $x_0 = x$, $x_n = y$ and for every $i \in \{0, \dots, n-1\}$, $x_i \mathrel{R_i} x_{i+1}$.

For the initiation condition, consider a measurable time-obs-closed set $B$. Since each $R_i$ is a temporal equivalence, $P^{x_i}(B) = P^{x_{i+1}}(B)$ for every $i$, and hence
$$P^x(B) = P^{x_0}(B) = P^{x_1}(B) = \dots = P^{x_n}(B) = P^y(B).$$

For the induction condition, consider $t \geq 0$ and a measurable $R$-closed set $C$. Then the set $C$ is $R_i$-closed for every $i \in \{0, \dots, n-1\}$: consider $z \mathrel{R_i} z'$ with $z \in C$; then by definition of $R$, $z \mathrel{R} z'$, and since $C$ is $R$-closed, $z' \in C$. Since $R_i$ is a temporal equivalence, $P_t(x_i, C) = P_t(x_{i+1}, C)$ and hence
$$P_t(x, C) = P_t(x_0, C) = P_t(x_1, C) = \dots = P_t(x_n, C) = P_t(y, C),$$
which concludes the proof.

Remark 3.7. Two states that are related by a bisimulation are called bisimilar. There is a greatest bisimulation that corresponds to this equivalence.
Similarly, two states that are related by a temporal equivalence are called temporally equivalent. This equivalence is the greatest temporal equivalence. Lemma 3.8. A bisimulation is also a temporal equivalence. If two states are temporally equivalent, then they are trace equivalent.
Proof. Let R be a bisimulation and consider two states x and y such that x R y.
Consider a time-obs-closed set $B$. Then it is also time-$R$-closed: consider two trajectories $\omega$ and $\omega'$ such that $\omega \in B$ and for every time $t$, $\omega(t) \mathrel{R} \omega'(t)$. Then for every time $t$, $obs(\omega(t)) = obs(\omega'(t))$ (by the initiation condition of bisimulation). Since $B$ is time-obs-closed and $\omega \in B$, $\omega'$ is also in $B$. Using the induction condition of bisimulation, we get that $P^x(B) = P^y(B)$.

Consider a measurable $R$-closed set $C$ and a time $t$. Define the set $B = \{\omega \mid \omega(t) \in C\} = X_t^{-1}(C)$. It is measurable and time-$R$-closed. We can then apply the induction condition and we get
$$P_t(x, C) = P^x(X_t^{-1}(C)) = P^y(X_t^{-1}(C)) = P_t(y, C).$$
This concludes the proof that $R$ is a temporal equivalence.
The second part of the lemma follows directly from the initiation condition of a temporal equivalence: this is precisely trace equivalence.
In [6], we also introduced the notion of FD-homomorphism that extends the discrete-time notion of zigzags [8]. We showed that cospans of FD-homomorphisms and bisimulations correspond to one another. In particular, if R is a bisimulation, the quotient by R yields a homomorphism.
Symmetry groups of the process

Definition
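Given a function $f$ on the state space, we write $f_*$ for the induced map on trajectories, $f_*(\omega) = f \circ \omega$, and $f_*(B) = \{f_*(\omega) \mid \omega \in B\}$ for a set $B$ of trajectories.

Consider a (non-empty) group of homeomorphisms $H$ on $E$. Then it is possible to define a relation $R$ on $E$ as follows: $x \mathrel{R} y$ if and only if there exists $h \in H$ such that $h(x) = y$.

Lemma 4.2. The relation $R$ is an equivalence.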
Proof. First note that since the group H is non-empty, it contains at least the identity. This means in particular that for any x ∈ E ∂ , x R x.
For symmetry, consider x R y, i.e. there exists h ∈ H such that h(x) = y.
Since H is closed under inverses, h −1 ∈ H and h −1 (y) = x and hence y R x.
For transitivity, consider x R y and y R z, i.e. there exists h 1 , h 2 ∈ H such that h 1 (x) = y and h 2 (y) = z. Then (h 2 • h 1 )(x) = z and since H is closed under composition, we have that h 2 • h 1 ∈ H and hence x R z.
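The argument of Lemma 4.2 can be checked mechanically on a finite toy example. The sketch below is only an illustration under our own assumptions: the state space is a finite list, each map is a dictionary, and the names `orbit_relation` and `is_equivalence` are hypothetical helpers, not notation from the paper.

```python
def orbit_relation(states, maps):
    """Pairs (x, y) such that h(x) == y for some map h in `maps`."""
    return {(x, h[x]) for x in states for h in maps}

def is_equivalence(states, rel):
    """Check reflexivity, symmetry and transitivity of a finite relation."""
    reflexive = all((x, x) in rel for x in states)
    symmetric = all((y, x) in rel for (x, y) in rel)
    transitive = all((x, w) in rel
                     for (x, y) in rel for (z, w) in rel if y == z)
    return reflexive and symmetric and transitive

# Toy example: states -2..2 and the group {id, s} with s(x) = -x.
states = [-2, -1, 0, 1, 2]
identity = {x: x for x in states}
flip = {x: -x for x in states}
rel = orbit_relation(states, [identity, flip])
print(is_equivalence(states, rel))
```

If the set of maps is not closed under inverse and composition (for instance a single cyclic shift without the identity), the same check fails, which is why the group structure is needed.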
Given a group of symmetries $H$, we will denote $R_H$ its corresponding equivalence on the state space: $x \mathrel{R_H} y$ if and only if there exists $h \in H$ such that $h(x) = y$.

Remark 4.4. One of the requirements for being a group of symmetries is to be closed under inverse and composition. This condition is useful for getting an equivalence on the state space; however, it is usually easier (if possible) to view a group of symmetries $H$ as generated by a set $H_{gen}$ of homeomorphisms: the set $H$ is then the closure under inverse and composition of the set $H_{gen}$. If the set $H_{gen}$ satisfies the following conditions, then the set $H$ is a group of symmetries:
• for all $f \in H_{gen}$, $obs \circ f = obs$, and
• for all measurable sets $B$ such that for all $f \in H_{gen}$, $f_*(B) = B$, for all $x \in E_\partial$ and for all $g \in H_{gen}$, $P^x(B) = P^{g(x)}(B)$.
is in H gen for every i. First note that since for every f ∈ H gen , obs • f = obs implies that obs = obs • f −1 and hence for every i ∈ 1, n , obs • f i = obs. Finally, we get that Now, consider a set B such that for all h ∈ H, h * (B) = B. In particular, for every f ∈ H gen , f * (B) = B. This implies that for every y in E ∂ and for Remark 4.5. If we have two groups of symmetries H 1 and H 2 , it may be that H 1 ∪ H 2 is not a group of symmetries, but it generates a group of symmetries by closing it under composition. Proof. Consider ω ∈ B and any time t ≥ 0.

Relation to bisimulation
. This is true for any time t and since B is time-R H -closed, we have that f * (ω) ∈ B. To prove the converse implication, consider ω ∈ B. We define ω as ω Proof. Consider two equivalent states x R H y, i.e. there exists h ∈ H such that h(x) = y.
Second, let us consider a measurable, time-R H -closed B. Using lemma 4.7, we know that for every f ∈ H, f * (B) = B, and hence P x (B) = P h(x) (B) = P y (B), which concludes the proof.
There are a few additional remarks to be made here. Remark 4.9. It may be tempting to view functions in a group of symmetries as FD-homomorphisms. However, this is not necessarily the case. Indeed, a FD-homomorphism f : To illustrate this, consider Brownian motion with an atomic proposition on 0 and the group of symmetries {s, id} like in previous remark. Define the following trajectories: For every time t, |ω 1 (t)| = |ω 2 (t)| = |ω 3 (t)|, which means that any time-R Hclosed set that contains one of these trajectories should contain all of them.
To account for this, we can make the condition more complex by allowing the use of different functions from $H$ as time goes by. More formally, define the set $H_{traj}$ as the set of functions $F$ obtained in the following way. Given a set $I$ such that $I = \mathbb{N}$ or $I = \{0, \dots, m\}$ (for $m \in \mathbb{N}$), an $I$-indexed family of times $(t_i)$ such that $t_0 = 0 < t_1 < t_2 < \dots$ and $\bigcup_{i \in I} [t_i, t_{i+1}) = \mathbb{R}_{\geq 0}$ (where $t_{m+1}$ is understood as $\infty$), and an $I$-indexed family of homeomorphisms $f_i \in H$, we can define $F : \Omega \to \Omega$ such that for $\omega \in \Omega$ and $t \in [t_i, t_{i+1})$, $F(\omega)(t) = f_i(\omega(t))$.

However, this is still not an equivalence: consider Brownian motion as previously and the trajectories $\omega(t) = t \sin(1/t)$ (with $\omega(0) = 0$) and $\omega'(t) = |\omega(t)|$. These two trajectories are $R_H$-related at all times, but the last condition stated still does not account for this.
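The piecewise construction of the maps F described above can be sketched as follows, for the finite index set case only. This is an illustration under our own assumptions: trajectories are modelled as plain Python functions of time, and `piecewise_map` is a hypothetical helper name.

```python
import bisect
import math

def piecewise_map(times, funcs):
    """Build F with F(omega)(t) = funcs[i](omega(t)) for t in [times[i], times[i+1]).

    `times` starts at 0 and is strictly increasing; the last interval is
    understood to extend to infinity.
    """
    if not times or times[0] != 0 or len(times) != len(funcs):
        raise ValueError("need times[0] == 0 and one function per interval")
    if any(a >= b for a, b in zip(times, times[1:])):
        raise ValueError("times must be strictly increasing")

    def F(omega):
        def image(t):
            if t < 0:
                raise ValueError("time must be non-negative")
            i = bisect.bisect_right(times, t) - 1  # interval containing t
            return funcs[i](omega(t))
        return image
    return F

# Toy example with the group {id, s}, s(x) = -x, acting on the real line.
flip = lambda x: -x
ident = lambda x: x
F = piecewise_map([0, 1.0, 2.5], [ident, flip, ident])
omega = lambda t: math.sin(t) + 0.5
image = F(omega)
print(image(0.5), image(2.0), image(3.0))
```

By construction the image trajectory agrees with the original one up to sign at every time, so the two are related by the orbit relation of {id, s} at all times. Note that the result of such a map need not be cadlag in general; the sketch does not check this.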

Game Interpretation
The following games are adaptations from [13,7] to our setting of continuous-time processes. It is especially interesting to note that the game interpretation of bisimulation emphasizes once again the role of trajectories in that concept, whereas the game interpretation of temporal equivalence resembles that in discrete time very closely.

We define the following game. Duplicator's plays are pairs of trajectories that he claims are time-bisimilar. Spoiler is trying to prove him wrong.

Game interpretation of bisimulation
• Given two trajectories $\omega$ and $\omega'$, Spoiler chooses $t \geq 0$ and a non-empty set $B \in \mathcal{G}$ such that $P^{\omega(t)}(B) \neq P^{\omega'(t)}(B)$.
• Duplicator answers by choosing $\omega_0 \in B$ and $\omega_1 \notin B$ such that $obs \circ \omega_0 = obs \circ \omega_1$, and the game continues from $(\omega_0, \omega_1)$.

A player who cannot make a move at any point loses. Duplicator wins if the game goes on forever. The only way for Spoiler to win is to choose a time-obs-closed set.

Theorem 5.3. Two trajectories $\omega$ and $\omega'$ are time-bisimilar if and only if Duplicator has a winning strategy from $(\omega, \omega')$.
Proof. Denote R the greatest bisimulation.
For the first implication, if two trajectories $\omega$ and $\omega'$ are time-bisimilar, we know that for all $t \geq 0$ and for all time-$R$-closed sets $B'$, $P^{\omega(t)}(B') = P^{\omega'(t)}(B')$. Spoiler chooses a time $t \geq 0$ and a measurable set $B$ such that $P^{\omega(t)}(B) \neq P^{\omega'(t)}(B)$. This means that the set $B$ that Spoiler chose cannot be time-$R$-closed. That is why Duplicator can find two trajectories $\omega_0 \in B$ and $\omega_1 \notin B$ that are time-bisimilar. This strategy is winning for Duplicator, since it allows him to respond to every move from Spoiler and Duplicator wins all infinite plays.

For the reverse implication, define the following relation $R'$ on trajectories: $\omega \mathrel{R'} \omega'$ if and only if Duplicator has a winning strategy from $(\omega, \omega')$.

Note that $R'$ is an equivalence:
• reflexivity: Spoiler has no valid move from $(\omega, \omega)$, hence Duplicator wins.
• symmetry: Assume $\omega \mathrel{R'} \omega'$. Whatever move $(B, t)$ Spoiler makes when Duplicator says $(\omega', \omega)$ is also a valid move from $(\omega, \omega')$. Duplicator can then play as he would have from $(\omega, \omega')$, and if he had a winning strategy then, it is also a winning strategy now. This means that $\omega' \mathrel{R'} \omega$.
• transitivity: Assume $\omega \mathrel{R'} \omega'$ and $\omega' \mathrel{R'} \omega''$. Now consider the game when Duplicator starts by saying $(\omega, \omega'')$. Spoiler then says $(B, t)$ such that $P^{\omega(t)}(B) \neq P^{\omega''(t)}(B)$. In this case, note that we have $P^{\omega(t)}(B) \neq P^{\omega'(t)}(B)$ or $P^{\omega'(t)}(B) \neq P^{\omega''(t)}(B)$ (or both). Duplicator then picks one of those situations (or if only one of them is true, he picks this one) and replies what he would have replied in the game starting with the corresponding start: $(\omega, \omega')$ or $(\omega', \omega'')$. Since Duplicator had a winning strategy in both games, he has one here. Hence $\omega \mathrel{R'} \omega''$.

Define the following relation $R_1$ on states: $z \mathrel{R_1} z'$ if and only if $\omega_z \mathrel{R'} \omega_{z'}$. This relation is an equivalence (this is a direct consequence of the fact that $R'$ is itself an equivalence). Furthermore, this relation is a bisimulation. To prove this, assume it is not a bisimulation, i.e. there exists $x \mathrel{R_1} y$ such that either $obs(x) \neq obs(y)$ or there exists a measurable time-$R_1$-closed set $B$ of trajectories such that $P^x(B) \neq P^y(B)$. We can start by excluding the first case. Indeed we know that $\omega_x \mathrel{R'} \omega_y$, which means that $obs \circ \omega_x = obs \circ \omega_y$, i.e. $obs(x) = obs(y)$.

We can now show that there is a contradiction. Consider the game starting from $(\omega_x, \omega_y)$. Spoiler says $(B, 0)$. Now, whatever move $(\omega, \omega')$ (with $\omega \in B$ and $\omega' \notin B$) Duplicator picks, there exists $t \geq 0$ such that $\omega(t)$ and $\omega'(t)$ are not $R_1$-related (since $B$ is time-$R_1$-closed). This means that Spoiler has a winning strategy from $(\omega_{\omega(t)}, \omega_{\omega'(t)})$ that he can play. This contradicts the fact that Duplicator has a winning strategy from $(\omega_x, \omega_y)$, which proves that $R_1$ is a bisimulation.
Corollary 5.4. Two states $x$ and $y$ are bisimilar if and only if Duplicator has a winning strategy from $(\omega_x, \omega_y)$.

Remark 5.5. We can also define the relation $R_2$ on states: $x \mathrel{R_2} y$ if and only if there exist $\omega, \omega', t$ such that $\omega \mathrel{R'} \omega'$, $\omega(t) = x$ and $\omega'(t) = y$.

Trivially, if $x \mathrel{R_1} y$, then $x \mathrel{R_2} y$. Now assume that $x \mathrel{R_2} y$, and consider $\omega, \omega', t$ according to the definition of $R_2$. Duplicator has a winning strategy from $(\omega_x, \omega_y)$. Indeed, either Spoiler is stuck from the start, in which case Duplicator wins, or Spoiler says $(B, t)$. This means that $P^x(B) \neq P^y(B)$. Duplicator then replies what he would have said in the game starting from $(\omega, \omega')$ if Spoiler had said $(B, t)$. This proves that $R_1 = R_2$.

Game interpretation of temporal equivalence
We define the following game. Duplicator's plays are pairs of states that he claims are temporally equivalent. Spoiler is trying to prove him wrong.
• Given two states $x$ and $y$, Spoiler chooses $t \geq 0$ and a non-empty set $C \in \mathcal{E}$ such that $P_t(x, C) \neq P_t(y, C)$.
• Duplicator answers by choosing $x_1 \in C$ and $y_1 \notin C$ that are trace-equivalent, and the game continues from $(x_1, y_1)$.

A player who cannot make a move at any point loses. Duplicator wins if the game goes on forever. The only way for Spoiler to win is to choose a set that is closed under trace equivalence. Duplicator's only valid moves are pairs of trace equivalent states.

Theorem 5.6. Two states $x$ and $y$ are temporally equivalent if and only if Duplicator has a winning strategy from $(x, y)$.
Proof. Denote R the greatest temporal equivalence.
For the first implication, if two states $x$ and $y$ are temporally equivalent, we know that for all $t \geq 0$ and for all $R$-closed sets $C'$, $P_t(x, C') = P_t(y, C')$. Spoiler chooses a time $t \geq 0$ and a measurable set $C$ such that $P_t(x, C) \neq P_t(y, C)$. This means that the set $C$ that Spoiler chose cannot be $R$-closed. That is why Duplicator can find two states $x_1 \in C$ and $y_1 \notin C$ that are temporally equivalent. This strategy is winning for Duplicator, since it allows him to respond to every move from Spoiler and Duplicator wins all infinite plays.

For the reverse implication, define the following relation $R'$ on the state space: $x \mathrel{R'} y$ if and only if Duplicator has a winning strategy from $(x, y)$.

Note that $R'$ is an equivalence:
• reflexivity: Spoiler has no valid move from $(x, x)$, hence Duplicator wins.
• symmetry: Assume $x \mathrel{R'} y$. Whatever move $(C, t)$ Spoiler makes when Duplicator says $(y, x)$ is also a valid move from $(x, y)$. Duplicator can then play as he would have from $(x, y)$, and if he had a winning strategy then, it is also a winning strategy now. This means that $y \mathrel{R'} x$.
• transitivity: Assume $x \mathrel{R'} y$ and $y \mathrel{R'} z$. Now consider the game when Duplicator starts by saying $(x, z)$. Spoiler then says $(C, t)$ such that $P_t(x, C) \neq P_t(z, C)$. In this case, $P_t(x, C) \neq P_t(y, C)$ or $P_t(y, C) \neq P_t(z, C)$ (or both). Duplicator then picks one of those situations (or if only one of them is true, he picks this one) and replies what he would have replied in the game starting with the corresponding start: $(x, y)$ or $(y, z)$. Since Duplicator had a winning strategy in both games, he has one here. Hence $x \mathrel{R'} z$.

Furthermore, this relation is a temporal equivalence. To prove this, assume it is not a temporal equivalence, i.e. there exists $x \mathrel{R'} y$ such that either $x$ and $y$ are not trace equivalent, or there exists a measurable $R'$-closed set $C$ and a time $t$ such that $P_t(x, C) \neq P_t(y, C)$.

Duplicator's only valid moves are pairs of trace equivalent states, so only the second case is possible. Now consider $(C, t)$ to be Spoiler's move from $(x, y)$. Whatever move $(x_1, y_1)$ Duplicator chooses, it is not possible to have $x_1 \mathrel{R'} y_1$ since $C$ is $R'$-closed. Since the game is determined, Spoiler has a winning strategy from $(x_1, y_1)$, which contradicts the fact that Duplicator has a winning strategy from $(x, y)$. This proves that $R'$ is a temporal equivalence.

Justification for these behavioural equivalences
The goal of this work is to extend the notion of bisimulation that exists in discrete time to a continuous-time setting. Therefore two important questions are the following: do we get back the definition of bisimulation that existed in discrete time when we restrict Feller-Dynkin processes to (some kind of) discrete-time processes? How well do these notions behave on examples?

Discrete-time Case
It is common in discrete time to consider several actions. Everything that was exposed in this paper can easily be adapted to accommodate several actions. However, we will not mention actions in this section either for the sake of readability.
Given an LMP (X, Σ, τ, (χ A ) A∈AP ) where Σ = σ(T ) where T is a topology on X, we can always view it as a FD process where transitions happen at every time unit. Since the process has to remain memoryless, a state of the FD process is a pair of a state in X and a time explaining how long it has been since the last transition. For trajectories to be cadlag, that time is in [0, 1).
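To make this embedding concrete, here is a minimal sketch for a finite LMP given by a transition matrix. It assumes, following the description above, that a transition occurs exactly when the clock component reaches 1, so that after time t from state (x, s) the process has made ⌊t + s⌋ transitions and its clock reads t + s − ⌊t + s⌋. The names `mat_mul`, `mat_pow` and `embedded_kernel` are ours, and the absorbing state ∂ is left implicit.

```python
import math

def mat_mul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_pow(tau, n):
    """n-step transition matrix of the chain with one-step matrix tau."""
    size = len(tau)
    result = [[1.0 if i == j else 0.0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = mat_mul(result, tau)
    return result

def embedded_kernel(tau, x, s, t):
    """Law at time t of the embedded process started from (x, s).

    Returns (phase, row): `phase` is the new clock value in [0, 1) and
    row[z] is the probability that the LMP component is z.
    """
    if not 0.0 <= s < 1.0 or t < 0.0:
        raise ValueError("need 0 <= s < 1 and t >= 0")
    steps = math.floor(t + s)   # number of unit-time transitions taken
    phase = t + s - steps       # time elapsed since the last transition
    return phase, mat_pow(tau, steps)[x]

tau = [[0.5, 0.5], [0.0, 1.0]]
print(embedded_kernel(tau, 0, 0.25, 0.5))
print(embedded_kernel(tau, 0, 0.75, 0.5))
```

In the first call no transition has happened yet, so the LMP component is still 0 with probability 1; in the second call the clock has wrapped around once and the law is the first row of the transition matrix.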
We will write obs(x) = (χ A (x)) A∈AP to mimic what we have in continuous time.
The following lemma is in [6].

Lemma 6.2. Consider a DT-bisimulation $R$. If $x \mathrel{R} y$, then for all $n \geq 1$ and for all $R$-closed sets $A_1, \dots, A_n$,
$$P^x(X_1 \in A_1, \dots, X_n \in A_n) = P^y(X_1 \in A_1, \dots, X_n \in A_n).$$

It is possible to define the notion of trajectories in the LMP and that of trace equivalence just as we did in the case of FD-processes. A trajectory is a function $\omega : \mathbb{N} \to X \uplus \{\partial\}$ such that if $\omega(n) = \partial$, then for every $k \geq n$, $\omega(k) = \partial$. Two states $x$ and $y$ are trace equivalent if for every set $B$ of trajectories that is measurable and time-obs-closed, $P^x(B) = P^y(B)$ (where $P^x$ and $P^y$ are obtained using the Daniell-Kolmogorov theorem as in section 2).
We have the following result that will be later useful to us. Lemma 6.3. In an LMP with finitely many atomic propositions, any two states x and y that are DT-bisimilar are trace equivalent.
Proof. Consider a non-empty, measurable set B that is time-obs-closed.
Define the following sets for k ∈ N: Since AP is finite, D k is also finite for every k ∈ N.
First, note that B = k∈N B k . Indeed, by definition B ⊂ B k for every k ∈ N which proves the direct inclusion. For the reverse inclusion, note that since B is time-R-closed, We also have that {ω | obs(ω(i)) = obs(ω (i))} Using these expressions and infinite distributivity of set union and intersections, the equality B = k∈N B k follows.
) and similarly for P y (B k ), which shows that for every k ∈ N, P x (B k ) = P y (B k ). This result can also be extended to countably many atomic propositions in the following way: Lemma 6.4. In an LMP with countably many atomic propositions, any two states x and y that are DT-bisimilar are trace equivalent.
Proof. Consider a non-empty, measurable set B that is time-obs-closed. We will denote A 0 the atomic proposition corresponding to ∂.
Define the following sets for k ∈ N: with γ ∈ D k,l . The set D k is finite for every k, l ∈ N.
First, note that B = k,l∈N B k . Indeed, by definition B ⊂ B k,l for every k, l ∈ N which proves the direct inclusion. For the reverse inclusion, note that since B is time-R-closed, We also have that Using these expressions and infinite distributivity of set union and intersections, the equality B = k,l∈N B k,l follows.
Second, for γ ∈ D k,l , This proves that B k,l (γ) is measurable. Furthermore, P x (B k,l (γ)) = P y (B k,l (γ)) using lemma 6.2. Now, B k,l = γ∈D k,l B k,l (γ) is measurable since D k,l is finite. Since for γ = γ , B k,l (γ) ∩ B k,l (γ ) = ∅, we also have that P x (B k,l ) = γ∈D k,l P x (B k,l (γ)) and similarly for P y (B k,l ), which shows that for every k, l ∈ N, P x (B k,l ) = P y (B k,l ).
and therefore $P^x(B) = P^y(B)$ by down-continuity of the measures $P^x$ and $P^y$.

Proposition 6.5. If the equivalence $R$ is a DT-bisimulation, then the relation $R'$ defined as
$$R' = \{((x, s), (y, s)) \mid x \mathrel{R} y,\ s \in [0, 1)\}$$
is a temporal equivalence.

Proof. Consider $(x, s) \mathrel{R'} (y, s)$, $t \geq 0$ and a measurable and $R'$-closed set $C$. By definition of $P_t$, $P_t((x, s), C) = \tau_{\lfloor t+s \rfloor}(x, C')$ where $C' = \{z \mid (z, s') \in C\}$ with $s' = t + s - \lfloor t + s \rfloor$ (and similarly for $y$).

The set $C'$ is $R$-closed. Indeed, consider two states $z \in C'$ and $z' \in X$ such that $z \mathrel{R} z'$. These conditions imply that $(z, s') \in C$ and $(z, s') \mathrel{R'} (z', s')$. Since the set $C$ is $R'$-closed, $(z', s') \in C$ and hence, by definition of the set $C'$, $z' \in C'$.

Since $(x, s) \mathrel{R'} (y, s)$, we also have that $x \mathrel{R} y$. By lemma 6.2, we have that $\tau_{\lfloor t+s \rfloor}(x, C') = \tau_{\lfloor t+s \rfloor}(y, C')$. This allows us to conclude that $P_t((x, s), C) = P_t((y, s), C)$.
The initiation condition (trace equivalence) is a direct consequence of lemma 6.3. Remark 6.6. In [6], we compared bisimulation and DT-bisimulation. However some further studies led us to realize that understanding time-R-closed sets is harder than expected when R is an equivalence, even in the seemingly simple context of discrete-time systems. In particular, we have never managed to prove that the σ-algebra of measurable time-R-closed sets is generated by the X −1 t (A) where t ∈ R and A is a measurable and R-closed subset of the state space.
The result presented in this paper seems to indicate that temporal equivalence is actually the notion that extends DT-bisimulation to continuous time and that the definition of bisimulation in [6] may be too strong in some contexts. There remains to understand which contexts. Proposition 6.7. If the equivalence R is a temporal equivalence, then the relation R defined as the transitive closure of the relation Proof. First note that R is indeed an equivalence.
Since $R$ is a temporal equivalence, $P_1((x_i, t_i), C) = P_1((x_{i+1}, t_i), C)$ for every $i \leq n$. Additionally, note that for every $z \in E$ and every $s \in [0, 1)$, $P_1((z, s), C) = \tau(z, C')$. This proves that $\tau(x, C') = \tau(y, C')$.

Remark 6.8. This result is stronger than the one in [6] since we only ask for $R$ to be a temporal equivalence instead of a bisimulation and, additionally, we do not impose further restrictions on the equivalence $R$ such as time-coherence in [6].
These results can be summed up in the following theorem relating temporal equivalence and DT-bisimulation.

Theorem 6.9. Two states $x$ and $y$ (in the LMP) are DT-bisimilar if and only if for all $t \in [0, 1)$, the states $(x, t)$ and $(y, t)$ (in the Feller-Dynkin process) are temporally equivalent.

Remark 6.10. This result seems to indicate that temporal equivalence is the notion of behavioural equivalence that best extends bisimulation from discrete time to continuous time, rather than the notion of bisimulation introduced in [6], which is much stronger than temporal equivalence.
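The time-indexed construction used in Propositions 6.5 and 6.7 can be illustrated on a finite LMP. The sketch below is only an illustration under stated assumptions, not part of the original development: it assumes a finite state space, represents the one-step kernel τ as a row-stochastic matrix of exact fractions, and the names `tau`, `kernel_power` and `embedded_kernel` are hypothetical. It computes the law of the embedded process started at (x, s) after time t, which performs ⌊t + s⌋ discrete steps and ends at clock value t + s − ⌊t + s⌋.

```python
from fractions import Fraction
from math import floor

def kernel_power(tau, n):
    """n-step kernel of a finite LMP: the n-th power of the row-stochastic matrix tau."""
    size = len(tau)
    result = [[Fraction(int(i == j)) for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = [[sum(result[i][k] * tau[k][j] for k in range(size))
                   for j in range(size)] for i in range(size)]
    return result

def embedded_kernel(tau, x, s, t):
    """Law at time t of the embedded process started at (x, s), with 0 <= s < 1.

    Returns (s_prime, row): the clock value s' = t + s - floor(t + s) and the
    distribution of the discrete component after floor(t + s) steps.
    """
    steps = floor(t + s)
    s_prime = t + s - steps
    return s_prime, kernel_power(tau, steps)[x]

# Hypothetical two-state LMP, used only for illustration.
tau = [[Fraction(1, 2), Fraction(1, 2)],
       [Fraction(0), Fraction(1)]]
s_prime, row = embedded_kernel(tau, 0, Fraction(1, 2), Fraction(3, 2))
```

With s = 1/2 and t = 3/2 the process performs two discrete steps and the clock returns to 0, so `row` is the first row of the two-step kernel. Exact fractions are used so that the floor is not affected by rounding.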

Basic examples
We now revisit some examples from [6] and clarify them in our new framework.

Deterministic Drift
Consider a deterministic drift on the real line $\mathbb{R}$ with constant speed $a \in \mathbb{R}_{>0}$ and a single atomic proposition. We consider two cases: with 0 as the only distinguished point, and with all the integers distinguished from the other points.
With zero distinguished: Let us consider the case when there is a single atomic proposition, and obs(x) = 1 if and only if x = 0.
We have proven in [6] that two states $x$ and $y$ are bisimilar if and only if either $x > 0$ and $y > 0$ or $x = y$, i.e. the equivalence
$R = \{(x, y) \mid x > 0 \text{ and } y > 0\} \cup \{(x, x) \mid x \in \mathbb{R}\}$
is the greatest bisimulation. It is a temporal equivalence and furthermore, to prove that it is the greatest bisimulation we have only used the initiation condition of temporal equivalence. This proves that $R$ is also the greatest temporal equivalence. The equivalence $R$ is also trace equivalence.
Let us now study groups of symmetries. Given $x, y \in \mathbb{R}_{>0}$, we define the function $f_{x,y} : \mathbb{R} \to \mathbb{R}$ by

Note that $f_{x,y} \circ f_{y,x} = \mathrm{id}$ and $f_{x,x} = \mathrm{id}$. However, the set of all those functions is not closed under composition, so we consider $H$, the closure under composition of $F := \{f_{x,y} \mid x, y > 0\}$.

Lemma 6.11. The set $H$ is a group of symmetries and the equivalence generated is $R$.
Proof. There really is only one condition to check: consider a measurable set $B$ such that for all $f \in F$, $f_*(B) = B$, and consider $z \in \mathbb{R}$ and $f_{x,y}$ for $x, y > 0$. We want to show that $P^z(B) = P^{f_{x,y}(z)}(B)$.
There are two cases to consider. First, $z \leq 0$, in which case $f_{x,y}(z) = z$ and the desired result holds. Second, $z > 0$, in which case let us denote $z' = f_{x,y}(z)$. Note that $P^z(B) = \chi_B(\omega_z)$ (the indicator function, where $\omega_z$ is the trajectory defined by $\omega_z(t) = z + at$) and $P^{f_{x,y}(z)}(B) = \chi_B(\omega_{z'})$. Since $f_{z,z'} \circ \omega_z = \omega_{z'}$, $\omega_z \in B$ if and only if $\omega_{z'} \in B$, which concludes the proof of the first point.
The second point is straightforward.
Note that since R is the greatest bisimulation, we know that there is no group of symmetries that generates a bigger equivalence (there may be different groups of symmetries though, but they generate at most R).
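The characterisation of trace equivalence for the drift with zero distinguished can be checked on concrete values. The following is a minimal sketch, assuming that the observation along the trajectory t ↦ x + at equals 1 exactly when the trajectory sits at 0; the function names are hypothetical and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def zero_hitting_time(x, a):
    """Time at which the drift t -> x + a*t (a > 0) visits 0, or None if it never does."""
    if x > 0:
        return None
    return Fraction(-x) / Fraction(a)

def same_observations(x, y, a):
    """Two starting points yield the same obs-trace iff they visit 0 at the same time (or never)."""
    return zero_hitting_time(x, a) == zero_hitting_time(y, a)
```

Any two positive starting points never visit 0 and are therefore indistinguishable, whereas two distinct non-positive starting points visit 0 at different times.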
With all integers distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x ∈ Z.
We have proven in [6] that two states $x$ and $y$ are bisimilar if and only if $x - \lfloor x \rfloor = y - \lfloor y \rfloor$; we denote this equivalence by $R$. As in the previous case, this equivalence is also a temporal equivalence and furthermore, to prove that it is the greatest bisimulation we have only used the initiation condition of temporal equivalence. This proves that $R$ is also the greatest temporal equivalence. The equivalence $R$ is also trace equivalence.
The proof that this is a bisimulation also shows that the group of functions $\{h_k \mid k \in \mathbb{Z}\}$, where $h_k(z) = z + k$, is a group of symmetries. As in the previous case, there may be other groups of symmetries, but the equivalence generated on the state space cannot be greater.
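The effect of the translations h_k on observations can be illustrated by listing the times at which a drifting trajectory visits an integer. This is a sketch under the assumption that the trajectory from x is t ↦ x + at with a > 0; the helper names and the finite `horizon` are hypothetical choices made for the example.

```python
from fractions import Fraction
from math import floor

def frac_part(x):
    """Fractional part x - floor(x)."""
    return x - floor(x)

def integer_visit_times(x, a, horizon):
    """Times t in [0, horizon] at which x + a*t is an integer (a > 0)."""
    times = []
    k = -floor(-x)  # smallest integer greater than or equal to x
    while Fraction(k - x) / a <= horizon:
        times.append(Fraction(k - x) / a)
        k += 1
    return times
```

Two starting points with the same fractional part produce the same list of visit times, which is the sense in which h_k preserves the observations along trajectories.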

Fork
We used the following example to show how important the induction condition is in the definition of bisimulation. It is an extension of the standard "vending machine" example in discrete time to our continuous-time setting.
Let us consider the following state space: There are two atomic propositions (denoted P and Q on the diagram), that are satisfied by the final state of some of the branches. The process is a drift at a constant speed to the right. When it reaches a fork, it moves to either branch with probability 1/2 (and stops when it hits an atomic proposition).
The state space is made explicit as:

The kernel is defined as follows for $t \leq 100$:
$P_t((x, j), (x + t, j)) = 1$ for all $j$ and for all $t$ such that $(x + t, j)$ exists,
$P_t((y, 4), (y + t, j)) = \frac{1}{2}$ for $j = 5, 6$ and for all $t$ such that $(y + t, j)$ exists,
$P_t((100, j), (100, j)) = 1$ for $j = 2, 3, 5, 6$.

What we showed was that the states $x_0$ and $y_0$ cannot be bisimilar. In fact, the greatest temporal equivalence (which is also the greatest bisimulation) is

Regarding groups of symmetries, let us define the following functions:
and $g = f_{2,5} \circ f_{3,6} = f_{3,6} \circ f_{2,5}$. Note that $f_{2,5} \circ f_{2,5} = \mathrm{id}$ and similarly for $f_{3,6}$. The set $\{\mathrm{id}, f_{2,5}, f_{3,6}, g\}$ is a group of symmetries that generates the equivalence $R$.
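The algebraic part of the last claim (that the four maps form a group under composition) can be checked mechanically. The sketch below rests on an assumption that is not confirmed by the text above: it takes f_{i,j} to be the map exchanging branches i and j while keeping the position along the branch, with states encoded as pairs (position, branch). All names are hypothetical, and the check covers only closure under composition, not invariance of the dynamics.

```python
from itertools import product

def swap_branches(i, j):
    """Map on states (position, branch) exchanging branches i and j (assumed form of f_{i,j})."""
    def f(state):
        position, branch = state
        if branch == i:
            return (position, j)
        if branch == j:
            return (position, i)
        return state
    return f

def compose(f, g):
    """Composition f after g."""
    return lambda state: f(g(state))

def same_map(f, g, states):
    """Compare two maps on a finite sample of states."""
    return all(f(s) == g(s) for s in states)

def is_closed(maps, states):
    """Check that composing any two maps yields a map already in the list."""
    return all(any(same_map(compose(a, b), c, states) for c in maps)
               for a, b in product(maps, repeat=2))

# Finite sample of states: a few positions on branches 1 to 6 (chosen for testing only).
states = [(p, b) for p in range(0, 101, 10) for b in range(1, 7)]
identity = lambda state: state
f25, f36 = swap_branches(2, 5), swap_branches(3, 6)
g = compose(f25, f36)
group = [identity, f25, f36, g]
```

Under this encoding the two swaps are commuting involutions, so the four maps form a group isomorphic to the Klein four-group; dropping g from the list breaks closure.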

Examples based on Brownian motion
It is especially interesting to read the proofs that were done for these examples with this new framework in mind. All the proofs follow the same steps. First, we define an equivalence and show that it is a bisimulation by exhibiting a set of functions which is a group of symmetries. Second, we show that it corresponds to trace equivalence and hence that it is the greatest bisimulation (and likewise for temporal equivalence and groups of symmetries). We will only restate the results in our framework and let the reader check that the proofs indeed follow this pattern.

Two states $x$ and $y$ are trace equivalent if and only if $|x| = |y|$. The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries $\{s, \mathrm{id}\}$ where $s(x) = -x$ for every $x \in \mathbb{R}$.
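The role of the reflection s(x) = −x can be illustrated on a discretised path. The sketch below uses a simple symmetric ±1 random walk as a stand-in for Brownian motion, which is an assumption made purely for illustration; the names are hypothetical. Negating the increments gives a path with the same probability, so the reflection maps each path from x to an equally likely path from −x that visits 0 at exactly the same times.

```python
import random

def walk(start, increments):
    """Discretised path: running sums of the increments added to the start point."""
    path = [start]
    for step in increments:
        path.append(path[-1] + step)
    return path

def reflect(x):
    """The symmetry s(x) = -x."""
    return -x

rng = random.Random(0)
increments = [rng.choice([-1, 1]) for _ in range(50)]
path_from_x = walk(3, increments)
path_from_minus_x = walk(-3, [-step for step in increments])
```

The second path is the mirror image of the first, so observing whether the process is at 0 cannot distinguish the two starting points.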

Brownian motion with drift
Let us consider a Brownian process with drift: $\tilde{W}_t = W_t + at$ (where $W_t$ is the standard Brownian motion and $a > 0$; the case $a < 0$ is symmetric).
With zero distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x = 0.
A state is only trace equivalent to itself. The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries {id}.
With all integers distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if x ∈ Z.
Two states $x$ and $y$ are trace equivalent if and only if $x - \lfloor x \rfloor = y - \lfloor y \rfloor$. The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries $\{h_k \mid k \in \mathbb{Z}\}$ where $h_k(z) = z + k$.

With an interval distinguished: Let us consider the case when there is a single atomic proposition and obs(x) = 1 if and only if $x \in [-1, 1]$.
A state is only trace equivalent to itself. The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries {id}.

Brownian motion with absorbing wall
Another usual variation on Brownian motion is to add boundaries and to consider that the process does not move anymore or dies once it has hit a boundary. Since all our previous examples involved probability distributions (as opposed to subprobabilities), we will see the boundary as killing the process.
Absorption at 0: let us consider the case of Brownian motion with absorption at the origin and without any atomic proposition. The state space is $\mathbb{R}_{>0}$.
A state is only trace equivalent to itself. The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries {id}.
Absorption at 0 and b: let us consider the case of Brownian motion with absorption at the origin and at b > 0 and without any atomic proposition. The state space is therefore (0, b).
Two states x and y are trace equivalent if and only if x = y or x = b − y.
The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries $\{\mathrm{id}, s_b\}$ where $s_b(x) = b - x$.

Absorption at 0 and 2b with atomic proposition at b: let us consider the case of Brownian motion with absorption at the origin and at $2b > 0$, so the state space is $(0, 2b)$, and with a single atomic proposition such that obs(b) = 1 and obs(x) = 0 for $x \neq b$.
Two states x and y are trace equivalent if and only if x = y or x = 2b − y.
The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries $\{\mathrm{id}, s_{2b}\}$ where $s_{2b}(x) = 2b - x$ for every $x \in \mathbb{R}$.
Absorption at 0 and 4b with atomic proposition at b: let us consider the case of Brownian motion with absorption at the origin and at $4b > 0$, so the state space is $(0, 4b)$, and with a single atomic proposition such that obs(b) = 1 and obs(x) = 0 for $x \neq b$.
A state is only trace equivalent to itself. The corresponding equivalence is also the greatest bisimulation and temporal equivalence. This equivalence is also generated by the group of symmetries {id}.

Poisson process
This is an example that we did not consider in [6]. A Poisson process models, for instance, the number of customers arriving at a taxi stop. It is a continuous-time process $(N_t)_{t \geq 0}$ on the set of natural numbers $\mathbb{N}$, a discrete space. We define the set $\Omega$ of trajectories as usual on the state space. The probability distribution on the set $\Omega$ is defined as

We are going to study two cases. In the first case, we are able to test whether an even or odd number of customers have arrived. In the second case, we are able to test whether there are more customers than a critical value.
Testing parity of number of customers: There is a single atomic proposition on the state space: obs(k) = 1 if and only if $k$ is even.

Proposition 6.12. Two states $x$ and $y$ are bisimilar if and only if $x \equiv y \mod 2$.
Proof. Let us define the following equivalence
$R = \{(x, y) \mid x \equiv y \mod 2\}$.
First, it is indeed a bisimulation. Consider $y = x + 2n$ where $n \in \mathbb{N}$ (note that obs(x) = obs(y)) and $B$ a measurable, time-$R$-closed set.
For the reverse direction, let M be the set of non-decreasing trajectories.
This concludes the proof that $R$ is a bisimulation. Now, notice that $x \mathrel{R} y$ if and only if obs(x) = obs(y). Since this condition is necessary for trace equivalence and every bisimulation is contained in trace equivalence, $R$ is trace equivalence, the greatest bisimulation and the greatest temporal equivalence.
Remark 6.13. This situation may look a lot like the deterministic or Brownian drift with parity as the atomic proposition. However, there is one key difference here: since the state space only contains non-negative numbers, the translations by an even number do not form a group of symmetries. These translations are however FD-homomorphisms. Proving that there is no greater group of symmetries than $\{\mathrm{id}\}$ is not as trivial as it may look.
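A quick numerical check of one consequence of Proposition 6.12 is sketched below. It assumes the standard description of the Poisson process, namely that under P^x the count at time t is x plus a Poisson random variable with mean `rate * t`; the parameter `rate`, the truncation `terms` and the function names are hypothetical. It only compares the probability of observing an even count at a single time, which is a necessary condition for bisimilarity and not a proof of it.

```python
from math import exp

def poisson_pmf(n, mean):
    """P(N = n) for a Poisson variable, computed iteratively to avoid large factorials."""
    p = exp(-mean)
    for i in range(1, n + 1):
        p *= mean / i
    return p

def prob_even_at(x, rate, t, terms=200):
    """P^x(N_t is even), truncating the Poisson series after `terms` terms."""
    mean = rate * t
    return sum(poisson_pmf(n, mean) for n in range(terms) if (x + n) % 2 == 0)
```

Starting points of the same parity give the same value, and for an even starting point the value agrees with the closed form (1 + exp(−2 · rate · t)) / 2.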
Testing for a critical value: Fix $m \in \mathbb{N}_{\geq 0}$ and define the function obs by obs(x) = 1 if and only if $x \geq m$.

Proposition 6.14. Two states $x \neq y$ are bisimilar if and only if $x, y \geq m$.
Proof. Denote
$R = \{(x, y) \mid x, y \geq m\} \cup \{(x, x) \mid x \in \mathbb{N}\}$.
Let us show that it is a bisimulation. Consider $x \mathrel{R} y$ and assume $x \neq y$. This means that $x, y \geq m$ and hence obs(x) = obs(y). There are now two cases to consider:
• If $B$ is empty, $P^y(B) = P^x(B) = 0$.
We claim that $B' = \{\omega \mid \omega(0) \geq m\} \cap M$. The direct inclusion is by definition. For the reverse inclusion, consider a non-decreasing trajectory $\omega$ such that $\omega(0) \geq m$. This implies that for every time $t \geq 0$, $\omega(t) \geq m$, and since we also have that $\omega'(t) \geq m$, in particular $\omega(t) \mathrel{R} \omega'(t)$. Since $\omega' \in B' \subset B$ and $B$ is time-$R$-closed, $\omega \in B$ and hence $\omega \in B'$.
To prove that it is the greatest bisimulation, we show that it corresponds to trace equivalence. The proof above can be easily adapted to show that x, y ≥ m are trace equivalent.
Consider the case when $x \neq y$ are both less than $m$. Consider a time $t > 0$ and define $B_t = \{\omega \mid \omega(t) \geq m\}$. This set is time-obs-closed and for $k < m$,

This allows us to conclude that if $x \neq y$, $P^x(B_t) \neq P^y(B_t)$.
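The separation of states below the critical value can be made concrete numerically. The sketch below assumes the standard Poisson law, i.e. that under P^k the number of arrivals by time t is Poisson with mean `rate * t`; the parameter `rate` and the function name are hypothetical. It evaluates P^k(B_t), the probability of having reached the critical value m by time t.

```python
from math import exp

def prob_at_least(m, k, rate, t):
    """P^k(N_t >= m) when N_t - k is Poisson(rate * t); equals 1 when k >= m."""
    needed = m - k  # further arrivals required to reach the critical value
    if needed <= 0:
        return 1.0
    mean = rate * t
    term, below = exp(-mean), 0.0
    for j in range(needed):
        below += term            # adds P(exactly j arrivals)
        term *= mean / (j + 1)   # moves to P(exactly j + 1 arrivals)
    return 1.0 - below
```

For a fixed t > 0 the value is strictly increasing in k as long as k < m, so two distinct states below m are told apart by B_t, while every state at or above m gives probability 1.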

Conclusion
The main lesson we have learned is that the continuous-time setting is far more complex and richer than the discrete-time setting. There are entirely new phenomena at work, for example, the concept of local time or the fact that exit and entry times are not always easily definable when the state space is also a continuum. Not surprisingly, there are different possible extensions of the discrete-time equivalences to the continuous-time setting. We have uncovered a few different behavioural equivalences; we expect some of them to be equivalent with some reasonable restrictions on the systems studied. The question of when they are really different is open and tends to get mired in measurability issues.
One of the interesting prospects is the pursuit of the symmetry group point of view. There are Noether-like theorems for such systems [1] and it would be very interesting to explore such theorems in our setting.
There is still ongoing work to extend logical characterization and event bisimulation to the continuous-time setting. We are also exploring behavioural metrics in this setting.