On the constructions of the skew Brownian motion

This article summarizes the various ways to construct the Skew Brownian motion and shows how they are connected. Recent applications of this process in modelling and numerical simulation motivate this survey. The article ends with a brief account of related results, extensions and applications of the Skew Brownian motion.


Introduction
The Skew Brownian motion appeared in the 1970s in [44,87] as a natural generalization of the Brownian motion: it is a process that behaves like a Brownian motion except that the sign of each excursion is chosen using an independent Bernoulli random variable of parameter $p$. As a consequence, if $X$ is a Skew Brownian motion of parameter $p$ (that is, $p$ is the probability that an excursion is positive) started at 0, then $P[X_t > 0] = p$ for any $t > 0$, while away from 0, $X$ behaves like a Brownian motion. As shown in [41], this process is a semi-martingale which is a strong solution to the Stochastic Differential Equation (SDE) with local time
$$X_t = x + B_t + (2p - 1) L^0_t(X),$$
where $L^0_t(X)$ is the local time at 0 of $X$. At the same time, N. Portenko constructed in [69,70] a process whose infinitesimal generator has a singular drift concentrated on some hypersurface. This is a way to model a permeable barrier. Indeed, he considers a diffusion with generator $L$ in which $N = (N_1, \dots, N_d)$ is the vector conormal to a smooth hypersurface $S$, the coefficients $a$, $b$ and $q$ are continuous, and $|q(x)| < 1$ for all $x \in S$. He showed that the solution to $\partial u/\partial t = Lu$ with $u(0, x) = \varphi(x)$ is also a solution to the PDE
$$\begin{cases} \dfrac{\partial u(t,x)}{\partial t} = Lu(t,x) \ \text{on } \mathbb{R}^*_+ \times (\mathbb{R}^d \setminus S), \\ (1 + q(x))\, N_+(x) \cdot \nabla u(t,x) = (1 - q(x))\, N_-(x) \cdot \nabla u(t,x), \\ u(t,\cdot) \text{ is continuous on } S, \\ u(0,x) = \varphi(x), \end{cases}$$
where $N_+$ and $N_-$ are the inner and outer conormals to $S$. The condition on the left and right fluxes on the surface $S$ is called a transmission condition or a flux condition.
In dimension one with a = 1 and b = 0, this process is the Skew Brownian motion. Thus, the infinitesimal generator of this diffusion process has a rather natural interpretation, since it corresponds to half the Laplace operator plus a "generalized drift" given by a Dirac mass at 0 with a coefficient that corresponds to its skewness. Besides, the effect of the drift is translated into the PDE as a transmission condition. It was also noted in [92] that SDEs with local time may be used to model diffusion processes in a medium with a permeable barrier.
A recent series of works [27,56,58,61] shows that the properties of the Skew Brownian motion may be used in a systematic way to provide different schemes to simulate diffusion processes generated by an operator $L$ whose coefficients $a$ and $\rho$ have discontinuities of the first kind. All these Monte Carlo methods rely on a probabilistic interpretation of the transmission conditions at the points where the coefficients are discontinuous, and on appropriate changes of scale. Indeed, the diffusion process $X$ generated by $L$ is the solution to an SDE with local time of type
$$X_t = x + \int_0^t \sigma(X_s)\, dB_s + \int_0^t b(X_s)\, ds + \int_{\mathbb{R}} \nu(dx)\, L^x_t(X), \tag{2}$$
where $\nu$ is a finite, signed measure that has a mass at the points where $a$ or $\rho$ are discontinuous.
SDEs of type (2) were first studied in [24,51,52] (see also [9,25] for some extensions), where existence and uniqueness of a strong solution are proved under rather general conditions (σ must be uniformly elliptic, bounded and of finite variation, b must be bounded, and ν must be of finite mass with |ν({ x })| < 1 for any point x ∈ R).
SDEs of this type generalize the usual SDEs and have been studied under hypotheses on the coefficients that are as weak as possible: [9,10,30], ... It is also worth noting that the articles [24,52] give an account of the "critical cases" between strong and weak solutions of SDEs, in the sense that they deal with the weakest possible conditions on the coefficients ensuring the existence of a strong solution (see also [6] for an example of application to the theory of SDEs). In addition, some Dirichlet processes can be constructed as solutions of equations of this type [8,25].
Another natural extension of the Skew Brownian motion is the Walsh Brownian motion. It is a diffusion process that moves on rays emanating from a single point and was introduced by J. Walsh in [87]. This diffusion process can be used locally as a description of a spider martingale [94, Sect. 17, p. 103], and is a special case of a diffusion on a graph, since we recover a transmission condition at each vertex. Diffusions on graphs are particularly important in the study of dynamical Hamiltonian systems, as shown in the pioneering work of M. Freidlin and M. Weber [34], but could also be useful for modelling many physical or biological diffusion phenomena. It is then possible to use Monte Carlo methods to solve this type of problem.
One should also note that the Walsh Brownian motion was useful in providing a counter-example to a natural question on Brownian filtrations, as B. Tsirelson showed in [85].
The goal of this article is then to summarize the different ways to construct a Skew Brownian motion (using PDEs, Dirichlet forms, approximations by diffusions with smooth coefficients and by random walks, scale functions and speed measures, SDEs, excursion theory...) and their relationships. The last section briefly presents some extensions of the Skew Brownian motion and their applications to various fields.

Notations
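As usual, we denote by $\mathbb{R}_+$ (resp. $\mathbb{R}_-$) the set of non-negative (resp. non-positive) real numbers, and by $\mathbb{R}^*_+$ (resp. $\mathbb{R}^*_-$) the set $\mathbb{R}_+ \setminus \{0\}$ (resp. $\mathbb{R}_- \setminus \{0\}$). The space of continuous functions from X to Y is denoted by C(X, Y). The space of square integrable functions f on X is denoted by $L^2(X)$. On $\mathbb{R}^d$ with d ≥ 1, we denote by $H^1(\mathbb{R}^d)$ the completion of the space of smooth functions with compact support with respect to the norm
$$\|f\|_{H^1(\mathbb{R}^d)} = \left( \int_{\mathbb{R}^d} |f(x)|^2\, dx + \int_{\mathbb{R}^d} |\nabla f(x)|^2\, dx \right)^{1/2}.$$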

A function in $H^1(\mathbb{R}^d)$ then has a square integrable generalized derivative $\partial_{x_i} f$ with respect to any coordinate $x_i$ in $\mathbb{R}^d$. Let us recall that when d = 1, for any function f in $H^1(\mathbb{R})$, there exists a continuous function $\tilde f$ such that $\tilde f(x) = f(x)$ almost everywhere.
The space $H^2(\mathbb{R})$ consists of the functions f in $H^1(\mathbb{R})$ such that ∇f also belongs to $H^1(\mathbb{R})$.

Differential operators with generalized coefficients
Let δ denote the Dirac mass at 0. As a first step, we are interested in solving, for some q ∈ [−1, 1], the parabolic PDE
$$\begin{cases} \dfrac{\partial u(t,x)}{\partial t} = \dfrac{1}{2} \triangle u(t,x) + q \delta \nabla u(t,x), \\ u(0,x) = \varphi(x), \end{cases} \tag{3}$$
which is equivalent to constructing the semi-group associated to the operator with a singular first-order differential term
$$L = \frac{1}{2} \triangle + q \delta \nabla. \tag{4}$$
In [69,70], N. Portenko constructed the semi-group generated by L and showed that it is a Feller semi-group. In fact, N. Portenko works in $\mathbb{R}^m$ instead of $\mathbb{R}$, where δ is a Dirac mass on a sufficiently smooth surface S. Here, we restrict ourselves to the simpler form of L given by (4), and we give a short account of the whole construction in Section 11.10.1 below.

Perturbation of the heat kernel
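One wants to construct the semi-group $(Q_t)_{t>0}$, if it exists, generated by L. Let us denote by $(P_t)_{t>0}$ the semi-group generated by $\frac{1}{2}\triangle$ over $\mathbb{R}$. It is well known that $P_t$ has a density p(t, x, y) given by the heat (or Gaussian) kernel: for any continuous, bounded function ϕ,
$$P_t \varphi(x) = \int_{\mathbb{R}} p(t,x,y) \varphi(y)\, dy \quad \text{with} \quad p(t,x,y) = \frac{1}{\sqrt{2\pi t}} \exp\left( -\frac{(x-y)^2}{2t} \right).$$
It is natural to construct $Q_t$ as a perturbation of $P_t$, that is
$$Q_t \varphi(x) = P_t \varphi(x) + q \int_0^t p(t-\tau, x, 0)\, \nabla Q_\tau \varphi(0)\, d\tau. \tag{5}$$
The problem here is to give a meaning to $\nabla Q_t \varphi(0)$, since we will see that $\nabla Q_t \varphi$ is discontinuous at 0. However, let us remark that for any continuous function f and any τ > 0, $x \mapsto p(\tau, x, 0) f(0)$ is of class $C^1(\mathbb{R}; \mathbb{R})$ and
$$\frac{\partial}{\partial x} p(\tau, x, 0) f(0) \Big|_{x=0} = 0. \tag{6}$$
So, if we inject in the right-hand side of (5) the value of $Q_t \varphi(x) \overset{\text{def.}}{=} u(t,x)$ given by (5), one gets with (6) that
$$Q_t \varphi(x) = P_t \varphi(x) + q \int_0^t p(t-\tau, x, 0)\, \nabla P_\tau \varphi(0)\, d\tau. \tag{7}$$
We use (7) as the definition of $Q_t$.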
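The perturbation formula can be checked numerically. The sketch below assumes that the correction term has the form $q \int_0^t p(t-\tau, x, 0)\, \partial_x p(\tau, 0, y)\, d\tau$ at the level of densities, and compares it with the closed form $q\, \mathrm{sgn}(y)\, p(t, |x|+|y|, 0)$, which follows from the first-passage decomposition of the Brownian motion. The function names and the midpoint quadrature are illustrative choices, not part of the construction.

```python
import math

def p(t, x, y):
    """Heat kernel of (1/2) Laplacian: Gaussian density with variance t."""
    return math.exp(-(x - y) ** 2 / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def dp_dx_at_zero(t, y):
    """d/dx p(t, x, y) evaluated at x = 0, that is (y / t) * p(t, 0, y)."""
    return (y / t) * p(t, 0.0, y)

def perturbation_term(t, x, y, q, n=40000):
    """q * int_0^t p(t - s, x, 0) * d/dx p(s, 0, y) ds, midpoint rule with n cells."""
    h = t / n
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * h  # midpoints avoid the endpoints s = 0 and s = t
        total += p(t - s, x, 0.0) * dp_dx_at_zero(s, y)
    return q * h * total

def closed_form_term(t, x, y, q):
    """Assumed closed form: q * sgn(y) * p(t, |x| + |y|, 0)."""
    sign_y = (y > 0) - (y < 0)
    return q * sign_y * p(t, abs(x) + abs(y), 0.0)
```

For instance, `perturbation_term(1.0, 0.5, -0.8, 0.4)` and `closed_form_term(1.0, 0.5, -0.8, 0.4)` should agree up to the quadrature error, provided x and y are both non-zero so that the integrand is smooth.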

Single layer potential
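Another idea to construct the semi-group is to consider that the PDE (3) corresponds to the potential generated by a charge whose value at the point 0 is
$$V(t, \varphi) = q\, \frac{\nabla u(t, 0+) + \nabla u(t, 0-)}{2}.$$
From standard results (see [31,38] for example), one knows that the solution $Q_t \varphi(x) \overset{\text{def.}}{=} u(t,x)$ of such a problem is given on $\mathbb{R}^*_+$ and $\mathbb{R}^*_-$ by
$$u(t,x) = P_t \varphi(x) + \int_0^t p(t-\tau, x, 0)\, V(\tau, \varphi)\, d\tau. \tag{8}$$
Yet V(τ, ϕ) also involves the value of u. Indeed, a computation similar to the one done previously leads also to (7), and thus
$$V(\tau, \varphi) = q \nabla P_\tau \varphi(0).$$
The advantage of this formulation is that one also knows from standard results on potential theory (see [38] for example) that
$$\nabla u(t, 0\pm) = \nabla P_t \varphi(0) \mp V(t, \varphi),$$
which means, with (8) and the value of V just obtained, that
$$\nabla u(t, 0+) = (1 - q) \nabla P_t \varphi(0) \quad \text{and} \quad \nabla u(t, 0-) = (1 + q) \nabla P_t \varphi(0).$$
Then it is immediate that $(t,x) \mapsto u(t,x)$ satisfies $\alpha \nabla u(t, 0+) = (1 - \alpha) \nabla u(t, 0-)$ with $\alpha = (1+q)/2$. Hence, one can give a proper meaning to (3):
$$\begin{cases} \dfrac{\partial u(t,x)}{\partial t} = \dfrac{1}{2} \triangle u(t,x) \ \text{on } \mathbb{R}^*_+ \times \mathbb{R}^*, \\ \alpha \nabla u(t, 0+) = (1 - \alpha) \nabla u(t, 0-), \\ u(t, \cdot) \text{ is continuous at } 0, \\ u(0,x) = \varphi(x). \end{cases} \tag{12}$$
Proposition 1. When ϕ is continuous and bounded, and α ∈ [0, 1], there exists a unique solution to (12) for which the maximum principle holds.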
Proof. The existence and uniqueness of the solution of (12) may be proved using the equivalence between this problem and the same PDE written in a variational form: see Section 3.1.
But we could also prove this result without using the notion of weak solutions. The existence of u is proved, since we have constructed such a solution with the help of the semi-group (Q t ) t>0 .
Let us note also that the uniqueness of u follows from the maximum principle. If the initial condition ϕ is non-negative (resp. non-positive), then u is nonnegative (resp. non-positive). Hence, if ϕ = 0, then u is both non-negative and non-positive and is then equal to 0. As (12) is linear, this yields the uniqueness of its solution.
We deal now with the case α ∈ (0, 1). Assume that ϕ ≤ 0 and ϕ has a compact support. Indeed, if u is a solution of (12), then u is solution to ∂u ∂t (t, x) = 1 2 △u(t, x) both on R * + × R + and R * + × R − , with the boundary condition ϕ on R + and R − , and the lateral boundary condition u(t, 0) on R * + ×{ 0 }. The solution u for each of these equations satisfies u(t, x) → 0 when |x| → +∞ or t → +∞.
If ϕ ≥ 0, then −ϕ ≤ 0 and then u ≥ 0, since −u is solution to (12) with −ϕ as an initial condition. To conclude, let us note that if ϕ(x) = C for all x ∈ R, then u(t, x) = C.
It remains to drop the assumption that ϕ has a compact support. From (7), if ϕ is continuous, |ϕ| is bounded and ϕ(x) = 0 on (−R, R), then sup (t,x)∈R * + ×R |Q t ϕ(x)| converges to 0 as R → ∞. By combining this result with the previous one when ϕ has a compact support, one easily gets that if ϕ ≤ C (resp. ϕ ≥ C), then the solution $Q_t \varphi(x)$ of (12) satisfies $Q_t \varphi(x) \le C$ (resp. $Q_t \varphi(x) \ge C$). The next proposition follows from the previous facts.

Construction of a Skew Brownian motion
The consequences of the construction given by N. Portenko are summarized in the following theorem.
Theorem 1. If |q| ≤ 1, then $(Q_t)_{t>0}$ is the semi-group of a strong Markov process $(X_t, \mathcal{F}_t, P_x; t \ge 0, x \in \mathbb{R})$ on a probability space $(\Omega, \mathcal{F}, P)$, which we call a Skew Brownian motion of parameter α (abbreviated as SBM(α)) with α = (1 + q)/2. This process is continuous and conservative. Besides, there exist a $(\mathcal{F}_t, P)$-Brownian motion B and a continuous additive functional η such that
$$X_t = x + B_t + \eta_t.$$
The additive functional η is of finite variation, and its variation increases only when the process is at 0.
Of course, it remains to prove that the infinitesimal generator of X is L given by (4). It is clear from (7) that $Q_t$ has a density q(t, x, y) given by
$$q(t,x,y) = p(t,x,y) + q \int_0^t p(t-\tau, x, 0)\, \frac{\partial p}{\partial x}(\tau, 0, y)\, d\tau. \tag{14}$$
In fact, a more convenient expression of q(t, x, y) will be given later in (17). To identify the generator, one can establish two identities, (15) and (16), valid for any continuous and bounded function ψ. The equality (16) identifies the diffusion coefficient, which is equal to 1 here, while (15) allows us to identify the drift term of X with $q\delta_0$. This is a generalization of the way of characterizing the drift term and the diffusion coefficient of a diffusion process, as presented for example in the book [71].

An explicit construction of the semi-group
We have constructed a semi-group (Q t ) t>0 such that Q t ϕ(x) is a solution to the PDE (12). It is possible to look for a solution to (12) using an explicit computation.
Let ϕ be a continuous, bounded function. Let us remark that . Thus, P t ψ(x) is solution to the heat equation on R * + × R * with the initial condition equal to 0. Moreover, ∇P t ψ(0+) = −∇P t ϕ(0) and ∇P t ψ(0−) = ∇P t ϕ(0). Thus, we are looking for a solution T t ϕ(x) written under the form where λ is chosen in order that α∇T t ϕ( This result is a direct consequence of Proposition 1. Proposition 3. The semi-groups (T t ) t>0 and (Q t ) t>0 are equal.
Hence, one gets the following expression for Q t , which is more tractable than (14):
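As a numerical illustration, the sketch below implements the closed form usually quoted for the transition density of the SBM(α) with respect to the Lebesgue measure, $q(t,x,y) = g(t, x-y) + (2\alpha - 1)\, \mathrm{sgn}(y)\, g(t, |x|+|y|)$, where g is the Gaussian kernel. It is an assumption here that this is the expression the computation leads to; the names `gauss`, `sbm_density` and `integrate` are illustrative.

```python
import math

def gauss(t, z):
    """Gaussian kernel g(t, z) = exp(-z^2 / (2 t)) / sqrt(2 pi t)."""
    return math.exp(-z * z / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)

def sbm_density(t, x, y, alpha):
    """Assumed closed form of the SBM(alpha) transition density (Lebesgue reference measure)."""
    sign_y = (y > 0) - (y < 0)
    return gauss(t, x - y) + (2.0 * alpha - 1.0) * sign_y * gauss(t, abs(x) + abs(y))

def integrate(f, a, b, n=20000):
    """Midpoint rule on [a, b] with n cells."""
    h = (b - a) / n
    return h * sum(f(a + (k + 0.5) * h) for k in range(n))

def mass_on_positive_side(t, x, alpha, cutoff=12.0):
    """Approximation of P_x[X_t > 0], truncating the integral at `cutoff`."""
    return integrate(lambda y: sbm_density(t, x, y, alpha), 0.0, cutoff)

def total_mass(t, x, alpha, cutoff=12.0):
    """Approximation of the total mass of the density, which should be 1."""
    negative = integrate(lambda y: sbm_density(t, x, y, alpha), -cutoff, 0.0)
    return negative + mass_on_positive_side(t, x, alpha, cutoff)
```

Under this formula, the process started at 0 is positive at time t with probability α, and the density is discontinuous in y at 0, with one-sided limits in the ratio α/(1 − α).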

Using Dirichlet forms
The idea is to get rid of the singular first-order term by transforming L into a symmetric operator in a proper Hilbert space.

Weak solutions of PDE
Here, we assume that α = (q + 1)/2 ∈ { 0, 1 }. The problem with the parabolic PDE (12) is to deal with the transmission condition at 0 given by So, an alternative way to consider (12) is to look for a weak solution of the following PDE where A is the divergence form operator By a weak solution to (19), we mean a function u such that for all ψ ∈ C ∞ ([0, T ]; R) with ψ(T, x) = 0, then, integrating formally (19) with respect to ψ(t, x)ρ(x) dx with ρ(x) = 1/2a(x) and using integrations by parts, As u is smooth on the domain where the coefficient a is smooth, it is easily deduced that u is smooth on (0, T ]×R * + and (0, T ]×R * − . Let us choose ψ(t, x) = ψ 1 (x)ψ 2 (t) for two functions ψ 1 and ψ 2 smooth enough and such that ψ 2 (T ) = 0. Using an integration by parts on R * + and R * − and the freedom of choice of ψ 1 and ψ 2 , we are led to (18). Moreover, one knows that u(t, x) is continuous on (0, T ] × R. Thus, the weak solution of (19) is also a solution of (12) and the converse is also true. The article [49] and the book [48, § III.13, p. 224] contain accounts on the properties of the solution of the transmission problem.
The domain Dom(A) of A is the set of functions f of $H^1(\mathbb{R})$ such that Af belongs to $L^2(\mathbb{R})$. With this domain, A is a self-adjoint (hence closed) operator with respect to the scalar product $\langle f, g \rangle_{L^2(\mathbb{R};a)} = \int_{\mathbb{R}} f(x) g(x) a(x)\, dx$ of $L^2(\mathbb{R}; a)$. It is also well known that a divergence-form operator such as A is the infinitesimal generator of a Feller semi-group $(\tilde P_t)_{t>0}$ with a transition density function $(t,x,y) \mapsto \Gamma(t,x,y)$ with respect to $a(x)\, dx$, that is $\tilde P_t f(x) = \int_{\mathbb{R}} \Gamma(t,x,y) f(y) a(y)\, dy$ for any continuous and bounded function f on $\mathbb{R}$. Of course, $u(t,x) = \tilde P_t \varphi(x)$, and from the self-adjointness of A, Γ(t, x, y) = Γ(t, y, x) for all $(t,x,y) \in \mathbb{R}^*_+ \times \mathbb{R}^2$. Proposition 4 is a direct consequence of Proposition 1.
Proposition 5. (i) As $(\tilde P_t)_{t>0}$ is a Feller semi-group, it is the semi-group of a strong Markov stochastic process which is continuous and conservative (and which is then a SBM(α) thanks to Proposition 4).
(iii) There exist some constants $C_1, C_2, C_3, C_4 > 0$ depending only on α such that for all $(t,x,y) \in \mathbb{R}^*_+ \times \mathbb{R}^2$,
$$C_1\, g(C_2 t, x, y) \le \Gamma(t,x,y) \le C_3\, g(C_4 t, x, y), \tag{21}$$
where $g(t,x,y) = \frac{1}{\sqrt{2\pi t}} \exp(-(x-y)^2/2t)$ is the heat kernel. The inequality (21) is called the Aronson inequality. It allows us to deduce some general results about the tightness of a family of processes generated by divergence form operators, or a bound on the probability of leaving a sphere centered on x prior to some time T (see for example [54,76,78] for applications).
Of course, the transmission condition (18) implies that A note on the adjoint of A with respect to L 2 (R). When one uses the scalar product f, g L 2 (R) = R f (x)g(x) dx, it is easily seen that the adjoint A * of A is given by This also explains why the density p(t, x, y) of the SBM(α) with respect to the Lebesgue measure is discontinuous as a function of y, in account to the Forward Kolmogorov equation. Of course, this can also be deduced from Proposition 5(ii).

Dirichlet forms
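Let (E, Dom(E)) be the quadratic form
$$E(f,g) = \frac{1}{2} \int_{\mathbb{R}} a(x) \nabla f(x) \nabla g(x)\, dx$$
for any f and g in $\mathrm{Dom}(E) = H^1(\mathbb{R})$. Let $\langle \cdot, \cdot \rangle_{L^2(a,\mathbb{R})}$ be the scalar product $\langle f, g \rangle_{L^2(a,\mathbb{R})} = \int_{\mathbb{R}} f(x) g(x) a(x)\, dx$. Then (E, Dom(E)) is the bilinear form associated to (A, Dom(A)) by
$$E(f,g) = -\langle Af, g \rangle_{L^2(a,\mathbb{R})}, \quad f \in \mathrm{Dom}(A),\ g \in H^1(\mathbb{R}).$$
It is indeed easily checked that (E, Dom(E)) is a Dirichlet form [59,39], that is, a symmetric bilinear form that is closed and contractive.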
We can now give a third construction (however with somewhat complicated tools that we use here in a simple case) of a SBM(α). Proposition 6. The Dirichlet form (E, Dom(E)) generates a continuous, Hunt (hence strong Markov) process, which is a SBM(α).
Proof. This follows from standard results in the theory of Dirichlet forms, since (E, Dom(E)) is a regular, local Dirichlet form. Indeed, the infinitesimal generator of the process generated by (E, Dom(E)) is (A, Dom(A)), hence this process is a SBM(α).
Remark 1. In general, the process constructed using the previously invoked results on Dirichlet forms is only defined for quasi-every starting point, and not for every starting point. Yet in dimension one and under our assumptions on E, the only set of zero capacity is the empty set.

The Itô-Fukushima decomposition
The theory of Dirichlet forms gives us another way to prove Theorem 1. We denote by $(X, \mathcal{F}_t, P_x; t \ge 0, x \in \mathbb{R})$ the SBM(α) and by $(\theta_t)_{t \ge 0}$ its shift operator.
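A continuous additive functional (CAF) is a continuous process Y such that $Y_{t+s} = Y_s + Y_t \circ \theta_s$ for all $s, t \ge 0$, almost surely. A CAF Y is characterized by a Radon measure ν, called its Revuz measure, in the following way: if f is the solution to
$$\langle f, \varphi \rangle_{L^2(a,\mathbb{R})} + E(f, \varphi) = \int_{\mathbb{R}} \varphi(x)\, \nu(dx), \quad \forall \varphi \in H^1(\mathbb{R}),$$
then $x \mapsto E_x\left[ \int_0^{+\infty} e^{-t}\, dY_t \right]$ is a continuous version of f. Let $G_\alpha$ be the resolvent of (E, Dom(E)), that is the operator that maps f to the unique solution $u \in H^1(\mathbb{R})$ to
$$\alpha \langle u, \varphi \rangle_{L^2(a,\mathbb{R})} + E(u, \varphi) = \langle f, \varphi \rangle_{L^2(a,\mathbb{R})}, \quad \forall \varphi \in H^1(\mathbb{R}).$$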
Using the resolvent of the Dirichlet form, we see immediately that if ν( dx) = h(x) dx, the corresponding CAF is · 0 h(X s )/a(X s ) ds. We say that a CAF Y is of zero quadratic variation if Of course, a CAF is said to be locally of zero quadratic variation if there exists a sequence { τ n } n∈N of stopping times (with respect to the minimal filtration generated by X satisfying the usual hypotheses) converging almost surely to +∞ and such that N ·∧τ n is of zero quadratic variation.
Theorem 2 (Itô-Fukushima decomposition). Let f be a function locally in $H^1(\mathbb{R})$. Then there exist a local, square-integrable martingale $M^f$ with $\langle M^f \rangle_t = \int_0^t |\nabla f(X_s)|^2\, ds$, and a CAF $N^f$ locally of zero quadratic variation such that
$$f(X_t) = f(X_0) + M^f_t + N^f_t, \quad t \ge 0.$$
Proof. This theorem follows from the results in [39].
Corollary 1. If f belongs locally to H 2 (R) (we choose a version of f such that f ′ is continuous), then where (ℓ t ) t≥0 is the CAF associated to the Revuz measure δ 0 .
Remark 2. Of course, up to some multiplicative constant, the CAF ℓ is the local time at 0 of the process X (see Proposition VI.45.10 in [74]).
Proof. Using a localization argument, we may assume that f belongs to H 2 (R), and we choose a version of f such that f ′ is continuous. By an integration by part, for any smooth function ϕ with compact support, Hence, It follows that f = G 1 ν and that where Y is a CAF associated to ν. Yet the CAF associated to ν is Y t = t 0 (f (X s ) − f ′′ (X s )/2) ds − β f ℓ t , which yields the result.

Approximations of the coefficients
We give now two results about the approximation of the coefficients of a divergence form operator. Let (ρ, a, b) be measurable functions from R to R. We assume that there exist some constants λ and Λ for which Let (L, Dom(L)) be the infinitesimal generator Under (22), (L, Dom(L)) is the infinitesimal generator of a continuous, conservative strong Markov process (See [78,54], ... for example).
Proposition 7. Let (ρ n , a n , b n ) n∈N be a family of measurable coefficients such that (22) holds uniformly and 1 a n for some 1 ≤ q ≤ ∞. Then the stochastic process X n generated by the operator 1 2ρ n d dx a n d dx + b n d dx converges in distribution to the process X generated by L. Proof. Under (23), the convergence of the resolvent and the semi-group of L n to that of L follows from results in [99,100] for example. Hypothesis (22) implies that one may compare the density transition functions of the processes X n and X uniformly with respect to the Gaussian kernel, and that these functions are locally Hölder continuous in space and time. Thus, this is sufficient to deduce the weak convergence of X n to X: see [54,76] for example.
Corollary 2. Let a n , ρ n and b n be such that (ρ n , a n , b n ) n∈N satisfies (22) and then the conclusion of Proposition 7 is true.
In order to approximate the SBM by some classical SDEs, it is then possible to use Proposition 7 or Corollary 2 with the operator A defined in (20). We will see in the next section an application of this result to identify two different constructions of the SBM. We now give another application, in which the SBM is constructed as a renormalized limit of some SDE with drift. The following result first appeared in [75] (see also Theorem 14 and Remark 15 below for a restatement of this result in the more general context of SDEs with local time).
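Proposition 8. Let X be the solution to the SDE
$$X_t = x + B_t + \int_0^t b(X_s)\, ds, \tag{24}$$
where B is a Brownian motion and $b \in L^1(\mathbb{R}; \mathbb{R})$. If $X^n_t = n^{-1} X_{tn^2}$, then $X^n$ converges in distribution to a SBM(α) with $\alpha = e^\kappa/(1 + e^\kappa)$ and $\kappa = 2 \int_{-\infty}^{+\infty} b(y)\, dy$.
Proof. Let us set $\Phi(x) = 2 \int_\zeta^x b(y)\, dy$ for any $\zeta \in \mathbb{R}$. As b belongs to $L^1(\mathbb{R}; \mathbb{R})$, one can set ζ = −∞. Assume that a = ρ = 1. The process $(X, P_x)$ generated by the corresponding operator is then the solution to (24). The infinitesimal generator of $X^n$ is $\frac{1}{2} \frac{d^2}{dx^2} + n b(nx) \frac{d}{dx}$, which can be written in divergence form as $\frac{e^{-\Phi_n(x)}}{2} \frac{d}{dx} \left( e^{\Phi_n(x)} \frac{d}{dx} \right)$ with $\Phi_n(x) = 2 \int_{-\infty}^x n b(ny)\, dy$. Moreover, $\Phi_n(x) = 2 \int_{-\infty}^{nx} b(y)\, dy$ converges to κ when x > 0 and to 0 when x < 0. Thus with (20), where a(x) has been divided by (1 − α), and Corollary 2, it is easily checked that $X^n$ converges in distribution to the Skew Brownian motion of parameter $\alpha = e^\kappa/(1 + e^\kappa)$.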
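The relation between the drift and the limiting skewness is easy to evaluate numerically. The sketch below assumes that κ is twice the integral of b over the whole real line, as suggested by the limit of $\Phi_n$ in the proof, and approximates this integral by a midpoint rule on a truncated interval. The function names and the truncation bounds are illustrative.

```python
import math

def kappa_from_drift(b, lo=-50.0, hi=50.0, n=200000):
    """Approximate kappa = 2 * int_R b(y) dy by the midpoint rule on [lo, hi].

    The truncation to [lo, hi] is an assumption: b must be negligible outside.
    """
    h = (hi - lo) / n
    return 2.0 * h * sum(b(lo + (k + 0.5) * h) for k in range(n))

def skewness_from_drift(b, lo=-50.0, hi=50.0, n=200000):
    """Parameter alpha = e^kappa / (1 + e^kappa) of the limiting SBM."""
    kappa = kappa_from_drift(b, lo, hi, n)
    return math.exp(kappa) / (1.0 + math.exp(kappa))

def gaussian_bump(y):
    """Example drift with integral 1/2, so that kappa = 1."""
    return 0.5 * math.exp(-y * y) / math.sqrt(math.pi)
```

A drift with zero integral gives α = 1/2, that is a standard Brownian motion in the limit, while changing b into −b changes α into 1 − α.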

The scale function and the speed measure
Studying a Markov process (X, P x ) x∈R with infinitesimal generator (L, Dom(L)) is much simpler in dimension one, since its behaviour can generally be described by two unique (up to multiplicative and additive constants) strictly increasing functions S and V . On this description, see the books [12,44,74],...
The function S, defined on the state space of the process, is called the scale function and satisfies is also a Markov process and is on its natural scale. The function V is a primitive of a measure m on the state space of the process, which is called the speed measure. Indeed, this measure characterizes the average exit time from any given interval for the process. If a < x < b, then Any Markov process on its natural scale may be written as B T (·) , where B is a Brownian motion and T (·) is a random time change computed from B and m (or equivalently V ). If m is absolutely continuous with respect to the Lebesgue measure, then the time spent by the process at any point has zero Lebesgue measure.
Remark 3. The process $Y = S(X)$ has the scale function $\tilde S(x) = x$ and the integrated speed measure $\tilde V = V \circ S^{-1}$. If the speed measure m of X has a density ϕ(x) with respect to the Lebesgue measure, then the speed measure $\tilde m$ of Y has the density $\varphi(S^{-1}(x))/S'(S^{-1}(x))$ with respect to the Lebesgue measure.
The boundary conditions are also encoded by S and m, but from now on, we only deal with processes whose state space is R and that do not explode.
The infinitesimal generator (L, Dom(L)) can be described by The following theorem provides a way to prove the convergence of stochastic processes by looking at their scale functions and speed measures. It allows us to identify the process constructed by its speed measure and its scale function with a process constructed by Dirichlet forms or PDEs results.

Theorem 3 ([36]). Let $(X, P_x; x \in \mathbb{R})$ be a conservative, continuous process defined by the functions (S, V), and let $(X, P^n_x; x \in \mathbb{R})_{n \in \mathbb{N}}$ be a family of conservative, continuous processes defined by functions $(S^n, V^n)$. If $S^n$ and $V^n$ converge to S and V, then $P^n_x$ converges weakly to $P_x$ in the space of continuous functions for any starting point x.

The Skew Brownian motion
Comparing (25) and (20) suggests to construct the process (X, P x ) x∈R with scale function and integrated speed measures Remark 4. If (S, V ) are the scale function and the integrated speed measure of a diffusion process, then (κS + λ, κ −1 V + λ) for κ > 0 and λ ∈ R are also possible scale functions and integrated speed measure.
The question is to know if (X, P x ) x∈R is equal to the process previously constructed with Feller semi-groups.
Proposition 9. The scale function S and the integrated speed measure V of the SBM(α) are given by (26).
Proof. Let (a n , ρ n , b n ) n∈N be a family of measurable functions satisfying (22). Let us consider If a n , ρ n and b n are smooth enough, then it is known that A n is the infinitesimal generator of a continuous stochastic process (X, P n x ) x∈R that can be described either by the functions or by its semi-group, which is a Feller semi-group. Hence, if almost everywhere, 0, then Proposition 7 together with Theorem 3 prove that the process generated by A via its Feller semi-group and the one by its scale function and speed measure are equal in distribution.
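To make the role of the scale function concrete, the sketch below uses one admissible normalization, S(x) = x/α for x ≥ 0 and S(x) = x/(1 − α) for x < 0. This normalization is an assumption, since scale functions are only defined up to an affine transformation (see Remark 4). The sketch evaluates the classical exit formula $P_x[\tau_b < \tau_a] = (S(x) - S(a))/(S(b) - S(a))$ for a < x < b. The function names are illustrative.

```python
def scale(x, alpha):
    """One possible scale function of the SBM(alpha), for 0 < alpha < 1 (assumed normalization)."""
    return x / alpha if x >= 0 else x / (1.0 - alpha)

def prob_hit_upper_first(x, a, b, alpha):
    """P_x[hit b before a] for a < x < b, computed from the scale function."""
    s_a, s_x, s_b = scale(a, alpha), scale(x, alpha), scale(b, alpha)
    return (s_x - s_a) / (s_b - s_a)
```

Starting from 0, the process exits the symmetric interval (−1, 1) through the upper end with probability α, and α = 1/2 gives back the Brownian value (x − a)/(b − a).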

The SBM as solution of a SDE with local time
Combining the description of the Skew Brownian motion together with the Itô-Tanaka formula allows us to strengthen the semi-martingale decomposition in Theorems 1 and 2 and to express the Skew Brownian motion as the strong solution of some SDE with its local time.

The Itô-Tanaka formula
Let f be a function from R to R which is the difference of two convex functions.
) exists for almost every x. In addition, there exists a signed measure µ, called the second derivative measure, such that for any piecewise C 1 function with compact support on R. If f has a second derivative, then f ′′ is the density of µ with respect to the Lebesgue measure. Let X be a real-valued semi-martingale. Then there exists a process (L x− t (X)) t≥0,x∈R , called the left local time, such that for any function f as above, This is the Itô-Tanaka formula. The left local time is continuous, has finite variation and satisfies Remark 5. Here, we use a normalisation of the local time which is different from the one in [46].
We also set for x ∈ R, which we call the right local time.
on a countable set [46, Problem 6.21, p. 213], the second derivative measure of g is equal to µ(− dx). Hence, with the Itô-Tanaka formula, As g(Y t ) = f (X t ), summing this expression with (29), we get where is the symmetric local time. Formula (32) is the symmetric Itô-Tanaka formula.
Let us note that if M is the martingale part of X, then t 0 1 {Xs=x} dM s is equal to 0, as it follows easily from computing the expectation of the brackets of the integral. Remark 6. We have defined the left, right and symmetric local time of a diffusion process X at 0 in view of using the Itô-Tanaka formula. An alternative construction is for the symmetric local time at 0, and for the left and right local time at 0. This is a consequence of the occupation time formula [46, Theorem 7.1(iii)] which asserts that if f is a measurable, bounded function, then for all t ≥ 0. This formula follows easily from the Itô formula when f is smooth enough, and then by a density argument.
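The occupation time formula suggests a simple estimator of the symmetric local time at 0 from a sampled path. The sketch below assumes the approximation $L^0_t(X) \approx \frac{1}{2\varepsilon} \int_0^t 1_{\{|X_s| \le \varepsilon\}}\, ds$, which is the usual one when the quadratic variation of X is ds (as for a Brownian motion or a Skew Brownian motion). The function name and the left Riemann sum are illustrative choices.

```python
def symmetric_local_time_estimate(path, dt, eps):
    """Estimate the symmetric local time at 0 of a path sampled every dt.

    Computes (1 / (2 eps)) times the time spent in [-eps, eps], using a left
    Riemann sum over the sampling intervals. Assumes d<X>_s = ds.
    """
    time_near_zero = sum(dt for x in path[:-1] if abs(x) <= eps)
    return time_near_zero / (2.0 * eps)
```

In practice eps should be small, but large compared with the typical increment of the sampled path (of order the square root of dt), otherwise the estimate is dominated by discretization noise.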

The SDE the SBM solves
Using the scale function and the speed measure, it is then easy to construct the SBM as the solution of some SDE. For this, let us consider first the SDE where B is a one-dimensional Brownian motion and y ∈ R. According to a result due to S. Nakao [63] (see also [51,52]), this SDE has a unique strong solution. Thus, let B be a standard Brownian motion with its natural filtration (transformed to satisfy the standard hypotheses). The strong solution to (35) with respect to B is also given by The , this proves that S is the scale function of X and V is the integrated speed measure of X (The function V is identified through (36) and Remark 3 with the help of Theorem 16.84 in [12] for example). Hence, for any t ≥ 0. It remains to express L 0 t (Y ) in terms of L 0 t (X). Again with the symmetric Itô-Tanaka formula, On the other hand, the second derivative measure of |r(x)| is the Dirac measure δ 0 at 0. Hence, since sgn From the uniqueness of the decomposition of X as a semi-martingale that where L 0 t (X) is the symmetric local time of X at 0 and x = r(y). As r is oneto-one and since the solution to (35) is unique, one gets also that X is unique. The construction of the SDE (37) relies on the fact that S is one-to-one, S ′ is constant on R + and R * − and S(R ± ) ⊂ R ± . Then (35) is equivalent to (37). We then also get that weak existence and weak uniqueness for (37), as the SDE (35) has also a unique weak solution. This will be used below to deal with the martingale problem. Remark 7. In [52], J.-F. Le Gall extended results on strong existence and uniqueness to SDEs of type dX t = σ(X t ) dB t + R ν( dx) dL x t (X) under rather general conditions on the coefficient σ and the measure ν. In this article, the reader will also find another way to prove that the diffusion process generated by (27) also converges to the solution of (37).
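The SDE with local time has a well-known discrete counterpart, in the spirit of [41]: a random walk that moves up or down with probability 1/2 away from 0, and moves up with probability α when it sits at 0. The sketch below is not taken from the construction above; it computes the exact law of this skew random walk by dynamic programming, with illustrative function names.

```python
def skew_walk_law(alpha, n_steps):
    """Exact law of the skew random walk after n_steps, started at 0.

    Returns a dict {site: probability}. At 0 the walk steps up with
    probability alpha; elsewhere it is a simple symmetric random walk.
    """
    law = {0: 1.0}
    for _ in range(n_steps):
        new_law = {}
        for site, prob in law.items():
            up = alpha if site == 0 else 0.5
            new_law[site + 1] = new_law.get(site + 1, 0.0) + prob * up
            new_law[site - 1] = new_law.get(site - 1, 0.0) + prob * (1.0 - up)
        law = new_law
    return law

def prob_positive(alpha, n_steps):
    """Probability that the skew random walk is strictly positive after n_steps."""
    return sum(prob for site, prob in skew_walk_law(alpha, n_steps).items() if site > 0)
```

After an odd number of steps the walk cannot be at 0, and the probability of being positive is exactly α, which mirrors the property P[X_t > 0] = α of the Skew Brownian motion started at 0.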
More recently, R. Bass and Z.-Q. Chen also considered this kind of equation in [9] and extended some of the results of [52]. Section 11.8 contains a short account of this theory.

About the left, right and symmetric local time
As $t \mapsto L^0_t(X)$ is a continuous additive functional that increases only when t belongs to $Z = \{ s \ge 0 : X_s = 0 \}$, any continuous additive functional $t \mapsto \eta_t$ with the same property is almost surely equal to $C_\eta L^0(X)$, where $C_\eta$ is a constant that depends only on η (see Proposition VI.45.10 in [74]). Hence, $L^0(X)$, $L^{0-}(X)$ and $L^{0+}(X)$ are all proportional.
It follows from (30) and (31) that As the quadratic variation of . Outside 0, the process X behaves like the Brownian motion and thus x → L x+ t (X) is continuous on R \ { 0 }. With equalities (39), this proves the following theorem, initially due to J. Walsh.

The martingale problem
Of course, one may wish to describe the distribution of the SBM(α) with the help of the martingale problem (see [46,79] among many other books).
Due to the presence of the local time in the SDE describing the SBM(α), this is not the most suitable construction. Yet we will be able to define a martingale problem and show that it is well posed.
Let us denote by D(α) the set of continuous, bounded functions f on R with two bounded derivatives f ′ and f ′′ on R * such that f ′′ (0+) and f ′′ (0−) exist, and αf ′ (0+) = (1 − α)f ′ (0−). Hence, f is the difference of two convex functions and its second generalized derivative is With the occupation time formula (34), Thus, We denote by (X t ) t≥0 the canonical process on C(R + ; R). Let us define by Thus, let us now state the martingale problem.
is said to be a solution of the martingale problem associated to D(α).
where M is a local martingale. Let us compute its brackets: with the Itô formula applied to x → x 2 , where N is a local martingale. Since one deduces easily that M t = t 0 S ′ (X s ) 2 ds. From the representation theorem (see for example Theorem 4.2 in [46, p. 170]), there exists an extension ( Ω, F , Q) of the probability space (C(R + ; R), Bor(C(R + ; R)), Q) as well as a filtration ( F t ) t≥0 and a ( Q, Since S ′ is constant on R + and R * − and S(R ± ) ⊂ R ± , one can set Y t = S(X t ) and then Y t = S(x) + t 0 S ′ (Y s ) dW s for any t ≥ 0, Q-almost surely. Thus, (Y, ( F t ) t≥0 , Q) is a weak solution to (35), which is known to be unique. Thus S −1 (Y ) = X is the SBM(α).

Decomposition of the excursions' measure
Another construction is the following: Let Y be a Reflected Brownian motion, and let $Z = \{ s \ge 0 : Y_s = 0 \}$ be the (closed) set of its zeros. The Lebesgue measure of Z is zero, but this set cannot be ordered. However, the set $\mathbb{R}_+ \setminus Z$ can be decomposed as a countable union $\cup_{n \in \mathbb{N}} J_n$ of intervals $J_n$. Each interval $J_n$ corresponds to some excursion of Y, that is, if $J_n = (\ell_n, r_n)$, then $Y_t > 0$ for $t \in (\ell_n, r_n)$ and $Y_{\ell_n} = Y_{r_n} = 0$.
Fix α ∈ [0, 1]. To each $J_n$, we associate a Bernoulli random variable $e_n$ which is independent from all the other random variables (and from the Reflected Brownian motion Y) and such that $P[e_n = 1] = \alpha$ and $P[e_n = -1] = 1 - \alpha$. Let X be the process given by
$$X_t = \begin{cases} e_n Y_t & \text{if } t \in J_n, \\ 0 & \text{if } t \in Z. \end{cases}$$
Theorem 6 ([44]). The process X is a Skew Brownian motion of parameter α.
Proof. Obviously, the process X behaves like a Brownian motion on any time interval J n . By Remark 4, its scale function S and its integrated speed measure V are of type where the constant γ has to be specified. If τ 1 (resp. τ −1 ) are the first time the process X reaches 1 (resp. −1), On the other hand, using the decomposition by excursions and the independence of the e n 's with respect to Y , Thus, γ = α, and X is the Skew Brownian motion of parameter α.
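The excursion construction can be mimicked on a lattice. The sketch below replaces the Reflected Brownian motion Y by a reflected simple random walk (an assumption made only for the sake of a short simulation) and draws an independent sign for each excursion away from 0. The function names, the number of steps and the seed are illustrative.

```python
import random

def skew_path_endpoint(alpha, n_steps, rng):
    """Endpoint of a path obtained by flipping the excursions of a reflected walk.

    `height` plays the role of the reflected process Y and `sign` the role of
    the Bernoulli variable e_n attached to the current excursion.
    """
    height = 0
    sign = 1
    for _ in range(n_steps):
        if height == 0:
            # A new excursion starts: draw its sign independently of the walk.
            sign = 1 if rng.random() < alpha else -1
            height = 1
        else:
            height += 1 if rng.random() < 0.5 else -1
    return sign * height

def fraction_positive(alpha, n_steps=201, n_paths=4000, seed=0):
    """Monte Carlo estimate of the probability that the endpoint is positive."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_paths) if skew_path_endpoint(alpha, n_steps, rng) > 0)
    return hits / n_paths
```

With an odd number of steps the walk never ends at 0, so the estimated fraction of positive endpoints should be close to α, up to the Monte Carlo error.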
Let τ be the right continuous inverse of the local time $L^0_t(X)$ of a continuous diffusion process X. Each jump of τ corresponds to an excursion of this process, that is, for each t such that $\tau(t) \neq \tau(t-)$, there exists an interval $J_n = (\ell_n, r_n)$ such that $\ell_n = \tau(t-)$ and $r_n = \tau(t)$. Let U be the set of excursions, that is the set of continuous functions f from $\mathbb{R}_+$ to $\mathbb{R}$ such that f(0) = 0, f(t) = 0 for all t ≥ ζ for some ζ > 0 and either f(t) > 0 for t ∈ (0, ζ) or f(t) < 0 for t ∈ (0, ζ).
For any $n \in \mathbb{N}$, $f(t) = X((\ell_n + t) \wedge r_n)$ is some element of U, with the lifetime $\zeta = r_n - \ell_n$. Accordingly, to each $t \in J$, where $J = \{ t \ge 0 : \tau(t) \neq \tau(t-) \}$, one may associate a point f(t) in U, corresponding to some excursion of X. A striking result due to K. Itô is that the process $(t, f(t))_{t\ge 0}$ is a homogeneous Poisson point process with values in $\mathbb{R}_+ \times U$ and intensity measure $dt \times P$. The measure P is σ-finite, but its total mass is infinite. This measure P, called the excursions' measure, fully characterizes the diffusion process.
For a measurable subset Γ of U such that P[ Γ ] is finite, P[ Γ ] denotes the average number of excursions in Γ per unit of local time.
On the theory of excursions, see [11,74]. The following proposition is a direct consequence of the construction of the Skew Brownian motion given by the theory of excursions (see also [11]).
Proposition 11. For the Skew Brownian motion of parameter α, the measure P may be written where $P_+$ (resp. $P_-$) is the excursions' measure of the Reflected Brownian motion
Remark 8. In [89] (see also [43]), S. Watanabe shows how to construct a diffusion given its excursions' measure, which provides another way to construct the SBM from Proposition 11.
Remark 9. This proposition has to be connected with the relations between the left, right and symmetric local times in (39). (26).
Remark 11. The article [55] studies the link between the coefficients in a decomposition of type (41) for a general diffusion with coefficients discontinuous at 0, that is related to the derivatives of some Green functions at 0. We recover the decomposition (41)
Another construction of the density transition function. The reflection principle for the Brownian motion together with the decomposition of excursions allows us to give another construction of the density transition function of the Skew Brownian motion, following a result of J. Walsh [87]. Let τ be the first time the Skew Brownian motion hits 0, and B be a Brownian motion. Then If (x, y) belongs to $\mathbb{R}^*_+ \times \mathbb{R}_-$ or to $\mathbb{R}^*_- \times \mathbb{R}_+$, by the continuity of the paths, where p(t, x, y) is the transition density function of the Brownian motion. On the other hand, by the reflection principle and the construction of the Skew Brownian motion, Hence, after a short computation, this gives us Formula (17).
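Formula (17) is not reproduced in this part of the text. As an illustration, the sketch below implements the classical closed form of the transition density of the SBM(α), which we assume to be the formula referred to, and checks numerically that it integrates to 1 and that the mass of the positive half-line, starting from 0, is α. The helper names `gauss`, `sbm_density` and `integrate` are ours.

```python
import math


def gauss(t, z):
    """Brownian transition density p(t, 0, z)."""
    return math.exp(-z * z / (2.0 * t)) / math.sqrt(2.0 * math.pi * t)


def sbm_density(alpha, t, x, y):
    """Transition density of the SBM(alpha) from x to y at time t.

    Classical closed form, assumed here to coincide with Formula (17).
    """
    if x < 0:
        # -X is an SBM(1 - alpha): reduce to a non-negative starting point.
        return sbm_density(1.0 - alpha, t, -x, -y)
    if y > 0:
        # Same side: Brownian term plus a reflected term weighted by 2*alpha - 1.
        return gauss(t, y - x) + (2.0 * alpha - 1.0) * gauss(t, y + x)
    # Opposite side: the path crosses 0, then keeps a negative sign w.p. 1 - alpha.
    return 2.0 * (1.0 - alpha) * gauss(t, y - x)


def integrate(f, a, b, n=20000):
    """Midpoint rule; with an even n on a symmetric interval, 0 is a cell edge."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    alpha, t = 0.7, 1.0
    print("total mass:", integrate(lambda y: sbm_density(alpha, t, 0.5, y), -10.0, 10.0))
    print("P_0[X_t > 0]:", integrate(lambda y: sbm_density(alpha, t, 0.0, y), 0.0, 10.0))
```

The density is discontinuous in y at 0 when α ≠ 1/2, which is why the integration grid is chosen so that 0 falls on the edge of a cell.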

Approximation by random walks (I)
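As one can expect, a Skew Brownian motion may be approximated by the following random walk: Let $(S_k)_{k\ge 0}$ be the random walk starting from 0 with transition probabilities $P[S_{k+1} = i + 1 \mid S_k = i] = \alpha$ if $i = 0$ and $1/2$ otherwise, and $P[S_{k+1} = i - 1 \mid S_k = i] = 1 - \alpha$ if $i = 0$ and $1/2$ otherwise. We set for all t ≥ 0 and any integer n, $X^n_t = \frac{1}{n} S_{\lfloor n^2 t \rfloor} + \frac{n^2 t - \lfloor n^2 t \rfloor}{n} (S_{1+\lfloor n^2 t \rfloor} - S_{\lfloor n^2 t \rfloor})$.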
The following theorem may be found in [41], and then in [52] and in [17] in a more general setting. P. Étoré used this result in [27] to construct a numerical scheme and compute its speed of convergence (see Section 11.9.3). We give another proof of this theorem in Section 9 by constructing the random walk $(S^n_k)_{k\in\mathbb{N}}$ from the trajectories of the SBM(α).
Theorem 7 ([41]). The sequence $(X^n)_{n\in\mathbb{N}}$ converges in distribution in the space of continuous functions to the SBM(α) X.
Proof. Let us first prove the convergence of the marginals of $X^n$. For this, we remark that Hence, ]. Using the results of Section 7, one has also that The result follows from the convergence of the normalized reflected random walk $|X^n|$ to the reflected Brownian motion |X| given by the Donsker theorem.
The proof of the convergence of the finite-dimensional distributions of the $X^n$'s is similar and uses the same kind of computations as in (42). Yet the computations become heavy, so we skip them: in the next Section, we will see how to construct our random walk $(S^n_k)_{k\in\mathbb{N}}$ from a trajectory of X, and then how to prove the convergence of $X^n_t$ (which is then defined on the probability space of X) in probability to $X_t$. The convergence of the finite-dimensional distributions is then immediate.
The tightness of $(X^n)_{n\in\mathbb{N}}$ follows from the Kolmogorov criterion [46, Theorem 2.8, p. 53]. For this, let us note that Then all the $\xi_k$ are independent with variance 1, and $E[\xi_k]$ is equal to 2α − 1 if $S_k = 0$ and to 0 if $S_k \neq 0$. Thus for any integers p > q > 0, Hence, it follows from (43) that for any s, t ≥ 0, E[ |X n t − X n s | 2 ] ≤ 3 max{2α, 1}(t − s) 2 . One deduces that for any γ < 1/2, there exists a random variable $K^n_\gamma$ such that sup This implies the tightness of $(X^n)_{n\in\mathbb{N}}$.
In [17], it is proved that $(X^n_t, Y^n_t)_{t\ge 0}$ converges in distribution to $(B^\alpha_t, \int_0^t f(B^\alpha_s)\,dB^\alpha_s)_{t\ge 0}$, where $(X^n_t)_{t\ge 0}$ and $(Y^n_t)_{t\ge 0}$ are the linear interpolations of $X^n_{k/n} = (\sqrt{n})^{-1} S_k$ and $Y^n_{k/n} = \sum_{i=1}^k f(X^n_{(i-1)/n})(X^n_{i/n} - X^n_{(i-1)/n})$, and f is a measurable, bounded function. This allows us to prove some results related to the "horizontal-vertical" random walk.
Remark 13. In [41], it is also noted that if, conditionally on $S_i = 0$, $S_{i+1}$ is distributed as an integrable random variable Z with values in $\mathbb{Z}$, then $n^{-1} S_{\lfloor n^2 t \rfloor}$ converges to a SBM(α) with $\alpha = E[Z^+]/E[|Z|]$.
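The random walk of this section is straightforward to simulate. The following sketch is ours and only illustrative: it assumes that the walk moves up from 0 with probability α and is symmetric elsewhere, and it evaluates the rescaled, linearly interpolated process at a given time. The function names `skew_random_walk` and `rescaled` are hypothetical.

```python
import random


def skew_random_walk(alpha, n_steps, rng):
    """Walk with P[+1] = alpha at 0 and P[+1] = 1/2 elsewhere (assumed transitions)."""
    s = 0
    path = [0]
    for _ in range(n_steps):
        p_up = alpha if s == 0 else 0.5
        s += 1 if rng.random() < p_up else -1
        path.append(s)
    return path


def rescaled(path, n, t):
    """Value at time t of the interpolated process n^{-1} S_{n^2 t}.

    Requires len(path) >= floor(n^2 t) + 2.
    """
    u = n * n * t
    k = int(u)
    return (path[k] + (u - k) * (path[k + 1] - path[k])) / n


if __name__ == "__main__":
    rng = random.Random(7)
    alpha, n, t = 0.3, 10, 1.0
    path = skew_random_walk(alpha, n * n + 1, rng)
    print("X^n_t for one trajectory:", rescaled(path, n, t))
```

Only the behaviour at 0 differs from the simple symmetric random walk, which mirrors the fact that the SBM(α) behaves like a Brownian motion away from 0.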

Approximation by random walks (II)
We give another way to prove Theorem 7 that starts from a trajectory X(ω) of the SBM(α).
For the sake of simplicity, let us assume that $X_0 = 0$. Fix some integer n, and let us construct recursively from X the following sequence of stopping times: $\tau^n_0 = 0$ and $\tau^n_{k+1} = \inf\{ t \ge \tau^n_k : |X_t - X_{\tau^n_k}| = 1/n \}$. Of course, as the trajectories of the SBM(α) are continuous, $|X_{\tau^n_{k+1}} - X_{\tau^n_k}| = n^{-1}$. In other words, the sequence $(\tau^n_k)_{k=0,1,\dots}$ records the successive passage times of X(ω) on the grid $\{ k/n : k \in \mathbb{Z} \}$. Finally, set $S^n_k = X_{\tau^n_k}$ and define $X^n$ by where $\delta t = 1/n^2$.
With the scale function given in (26), $P[X_{\tau^n_{k+1}} = (\ell + 1)/n \mid X_{\tau^n_k} = \ell/n]$ is equal to α if ℓ = 0 and to 1/2 otherwise. With the strong Markov property of X, this means that the random walk $(S^n_k)_{k\in\mathbb{N}}$ is equal in distribution, up to the scaling factor n, to the random walk previously constructed in Section 8.
The following theorem is a "specialization" to the SBM(α) of Theorem 4.1 in [52].
Theorem 8. The process X n constructed from X as above converges uniformly on [0, T ] in probability to X with respect to P.
Proof. We first prove the convergence in probability of $X^n_t$ to $X_t$ for any t ∈ [0, T]. Let us note first that since the reflected excursions of the SBM(α) are the same as those of the Brownian motion, for k = 0, 1, ..., the increments $\tau^n_{k+1} - \tau^n_k$ are independent and have the same distribution (whatever the value of α). In particular, $E[\tau^n_{k+1} - \tau^n_k] = 1/n^2 = \delta t$ (this explains our choice of δt). The key relation is Hence, In the proof of Theorem 7, the tightness of $(X^n)_{n\in\mathbb{N}}$ has been established by showing that it satisfies the Kolmogorov criterion. Hence, if osc($X^n$, δ) is the modulus of continuity of $X^n$, then for any δ > 0 and any γ < 1/2, where $K^n_\gamma$ is the random γ-Hölder constant of $X^n$, which is known to satisfy $\sup_{n\in\mathbb{N}} E[K^n_\gamma] < +\infty$. One may replace $X^n$ by X in (44), and the result follows from the convergence in probability of $\tau^n_{\lfloor n^2 t \rfloor}$ to t. By a scaling argument, $(\tau^n_{k+1} - \tau^n_k)_{k=0,\dots,\lfloor n^2 t \rfloor}$ is equal in distribution to $(n^{-2}\sigma_k)_{k=0,\dots,\lfloor n^2 t \rfloor}$, where the $\sigma_k$'s are independent, identically distributed with the distribution of the first exit time from [−1, 1] of the Brownian motion starting from 0. The $\sigma_k$'s have a mean equal to 1 and a finite variance, so that the desired result is a consequence of the weak law of large numbers. Now, to prove the uniform convergence in probability of $X^n$ to X, it is sufficient to note that for $0 = t_1 < t_2 < \dots$
Hence, one deduces that $\lim_{n\to\infty} P[\sup_{t\in[0,T]} |X^n_t - X_t| \ge C] = 0$ from the stochastic equicontinuity of $(X^n)_{n\in\mathbb{N}}$ and of X by choosing δ (and then the points $\{t_i\}_{i=0}^{\ell}$) small enough, and afterwards using the convergence in probability of the $X^n_{t_i}$ to $X_{t_i}$ for i = 1, ..., ℓ.
In order to set up a simulation scheme, the rate of convergence is computed in [27,28], in the more general context of diffusion processes with discontinuous coefficients (besides, using a proper time increment, we are not bound to use a random walk on a regular grid, as shown in [28]; see also points D. and E. in Section 11.9.3).
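The two facts used in the proof about the exit time σ of the Brownian motion B from [−1, 1], namely a mean equal to 1 and a finite variance, can be checked by optional stopping. The following short derivation is ours and is given only as a sketch; the value 2/3 of the variance is not stated in the text above.

```latex
% sigma: first exit time from [-1,1] of a Brownian motion B started at 0,
% so that B_sigma^2 = B_sigma^4 = 1.
\begin{align*}
  & B_t^2 - t \ \text{is a martingale}
    \;\Longrightarrow\;
    E[\sigma] = E[B_\sigma^2] = 1, \\
  & B_t^4 - 6 t B_t^2 + 3 t^2 \ \text{is a martingale}
    \;\Longrightarrow\;
    1 - 6 E[\sigma] + 3 E[\sigma^2] = 0
    \;\Longrightarrow\;
    \operatorname{Var}(\sigma) = \tfrac{5}{3} - 1 = \tfrac{2}{3}, \\
  & \tau^n_{\lfloor n^2 t \rfloor}
    \overset{\text{dist.}}{=} \frac{1}{n^2} \sum_{k=0}^{\lfloor n^2 t \rfloor - 1} \sigma_k,
    \qquad
    E\big[\tau^n_{\lfloor n^2 t \rfloor}\big] = \frac{\lfloor n^2 t \rfloor}{n^2} \to t,
    \qquad
    \operatorname{Var}\big(\tau^n_{\lfloor n^2 t \rfloor}\big)
      = \frac{2 \lfloor n^2 t \rfloor}{3 n^4} \le \frac{2t}{3 n^2} \to 0 .
\end{align*}
```

By the Chebyshev inequality, the last line gives the convergence in probability of the passage times to t.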

A "follow the leader" construction
We present here another construction based on techniques coming from the theory of continuous multi-armed bandits (see [60] for example). Indeed, we present here in a simpler case the proof given in [2], which deals with a variably skewed Brownian motion (see Section 11.6). Some of the results given in [17] are also related to this construction. Let us fix α ∈ (0, 1), β = 2α − 1 ∈ (−1, 1) and γ = (1 + β)/(1 − β) > 0. The idea is to construct a process $(Z^1_t, Z^2_t, U^1_t)_{t\ge 0}$ with values in $(\mathbb{R}_+)^2 \times (\mathbb{R}_+)^2 \times \mathbb{R}_+$ from which one can get the positive excursions of the SBM(α) from $Z^1$, its negative excursions from $Z^2$ and its local time from U.
Indeed, $Z = (Z^1, Z^2)$ moves either horizontally or vertically when away from the line $\Upsilon = \{ (x, y) : y = \gamma x \}$. The horizontal (resp. vertical) displacements of Z away from $(U^1_t, \gamma U^1_t)$ correspond to a positive (resp. negative) excursion of the SBM(α) arising when its local time takes the value $L_t = (1 + \gamma) U^1_t$. So, we can read "graphically" the signs and the heights of the excursions on the local time scale: See Figure 1. It is then possible to reconstruct a SBM(α) from $(Z^1, Z^2)$.

Fig 1. A representation of the excursions of the SBM(α)
The heuristic idea is to construct $Z^1$ and $Z^2$ as two time-changed independent Brownian motions $(B^1_{T^1(t)})_{t\ge 0}$ and $(B^2_{T^2(t)})_{t\ge 0}$, where either $T^1(t)$ or $T^2(t)$ increases at rate 1 while the other remains constant. The times at which the switches between $T^1$ and $T^2$ occur are obtained by comparing the supremum, as a function of time, of $\gamma B^1$ with that of $B^2$.
We now turn to the rigorous construction. Let us define a subset D of $(\mathbb{R}_+)^2$ by whose closure $\overline{D}$ has the following properties: It follows from these properties that D is described by the part of its boundary that intersects $(\mathbb{R}^*_+)^2$: See Figure 2. In addition, there exists a unique process $T(t) = (T^1(t), T^2(t))$, called a strategy, such that [88] (a) T(0) = (0, 0) and $T^1$, $T^2$ are non-decreasing; (b) $T^1(t) + T^2(t) = t$ for t ≥ 0; (c) for all $s_1, s_2 \ge 0$, the set $\{ T^1(t) \le s_1, T^2(t) \le s_2 \}$ is $\mathcal{F}_{(s_1,s_2)}$-measurable; (d) the graph of $(T(t))_{t\ge 0}$ parametrizes the boundary of D.

Fig 2. Construction of the time changes
Indeed, T 1 (t) increases at rate 1 when B 2 T 2 (t) > γB 1 T 1 (t) and T 2 (t) increases at rate 1 when B 2 T 2 (t) < γB 1 T 1 (t) . In other words, the strategy "follows the leader" between γB 1 and B 2 .
Let us now consider a maximal time interval (t, t′) with t′ > t such that $T^1$ is strictly increasing on it, which means that $T^2$ is constant on (t, t′). We first note that as $(T^1(t), T^2(t))$ belongs to the boundary of D, As (t, t′) is chosen to be maximal, for all u ∈ [t, , which means that $(B^1_{T^1(u)})_{u\in[t,t']}$ performs an excursion below the level $B^1_{T^1(t)}$ on [t, t′]. As $(T^1(t), T^2(t))$ belongs to the boundary of D, there exists no time t < T 2 (t) such that sup has a local maximum and $B^2_{T^2(t)}$ reaches its maximum. We now set for r ≥ 0, $Z^i_r = B^i_{T^i(r)}$ for i = 1, 2 and $U^1_r = \sup_{s\in[0,r]} Z^1_s$, $U^2_r = \gamma U^1_r$. From the previous considerations, Thus, on the time interval [t, t′] above, $Z^1$ performs an excursion below the level $U^1_t$ and then Z moves horizontally away from Υ, with $Z_t = Z_{t'} = U_t \in \Upsilon$.
A similar study could be done for T 2 , and we see now how to define our candidate to be a SBM: Proposition 13. The process X is a SBM(α) and L is its local time.
Proof. Let $(\mathcal{G}_t)_{t\ge 0}$ be the filtration defined by $\mathcal{G}_t = \mathcal{F}_{T(t)}$. Let us remark first that $B_t = -Z^2_t + Z^1_t$ has quadratic variation $\langle B \rangle_t = T^1(t) + T^2(t) = t$ from the very definition of the strategy and is $(\mathcal{G}_t)_{t\ge 0}$-adapted. By Paul Lévy's theorem, B is a Brownian motion and then $X_t = B_t + \beta L_t$ with β = 2α − 1. The process |X| is equal to the distance of Z from Υ and then, by construction, . Again by computing the quadratic variation of this latter process, B is a $(\mathcal{G}_t)_{t\ge 0}$-Brownian motion. Besides, Thus, with the Itô-Tanaka formula, $L_t$ is the symmetric local time of X. Finally, from (45), $X_t = B_t + \frac{\gamma - 1}{\gamma + 1} L_t$, and the skewness parameter α is identified from the value of γ.
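The identification of α from γ at the end of the proof is elementary algebra. For convenience, here is the computation, using only the relations β = 2α − 1 and γ = (1 + β)/(1 − β) fixed at the beginning of this section.

```latex
\begin{align*}
  \gamma &= \frac{1+\beta}{1-\beta}
          = \frac{2\alpha}{2(1-\alpha)}
          = \frac{\alpha}{1-\alpha}, \\
  \frac{\gamma-1}{\gamma+1}
         &= \frac{(1+\beta)-(1-\beta)}{(1+\beta)+(1-\beta)}
          = \beta = 2\alpha - 1, \\
  \alpha &= \frac{\gamma}{1+\gamma}.
\end{align*}
```

In particular, the slope γ of the line Υ is the ratio α/(1 − α) of the probabilities of a positive and of a negative excursion.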

General constructions of "Skew" diffusion processes
The general theory of one-dimensional diffusion processes gives all the tools to construct rather easily a "skew" diffusion process. There are basically two ways to do so.
First, consider diffusion processes living on [0, +∞), or symmetric processes, and change the sign of each excursion with independent Bernoulli random variables. For this, one may use the description of a strong Markov process in terms of a minimal process (i.e., a process that is killed when it reaches 0) and an entrance law (see [11] for a full account of this theory). See [86] for related results and constructions.
In [4,5] a Skew Bessel process is constructed this way. Indeed, the symmetry of the process plays an important role in the previous construction, so here is another one. Let X be a process with a scale function S and a speed measure m, let $x_0$ be a point in the state space of X and let γ > 0. Let us define a new process $X^\gamma$ by the scale function $S^\gamma$ and the speed measure $m^\gamma$ defined by if S is such that $S(x_0) = 0$. As the behavior of a process around a point x is specified locally by its scale function and speed measure and the process generated by (S, m) is the same as the process generated by $(\lambda S, \lambda^{-1} m)$ for all λ > 0, the process $X^\gamma$ behaves like X except at $x_0$. The "skewness" of this process may be appreciated by the result stated in Remark 10.
The article [21] shows how to construct an asymmetric Bessel process by this method and provides applications to the computation of Asian options with discontinuous volatility. In [22], various processes are constructed in this way, in order to model some prices whose volatility suddenly increases or decreases when the price goes above or below a threshold.

Walsh's Brownian motion
Initially introduced by J. Walsh in [87], Walsh's Brownian motion is a diffusion process on a set of n rays in $\mathbb{R}^2$ emanating from 0. To each ray $I_i$ is associated a weight $\alpha_i$ corresponding heuristically to the probability for the process to enter this ray. On each ray, the process behaves like a Brownian motion.
Of course, due to the irregularities of the trajectories of the Brownian motion, this description makes no sense as it stands, but this process may be described by its excursions' measure $P = \sum_{i=1}^n \alpha_i P_i$, where $P_i$ is the excursion measure of the Reflected Brownian motion on the ray $I_i$. On this process, see also [4].
The Skew Brownian motion corresponds to the case n = 2. The spider martingale generalizes the notion of Walsh's Brownian motion. This object has given rise to an abundant literature on Brownian filtrations, especially by giving a negative answer to the question "if a Brownian motion is adapted to some filtration, is this filtration generated by a Brownian motion?" (see [7,85], [94, Sect. 17, p. 103] and the works cited therein), but this goes far beyond the scope of this article.

A generalized arcsine law
For the Walsh's Brownian motion of Section 11.2, let $A_i(t)$ be the occupation time up to time t of the ray $I_i$, that is $A_i(t) = \int_0^t 1_{\{X(s) \in I_i\}}\,ds$. As for the Brownian motion, $A_i(t) \overset{\mathrm{dist.}}{=} t A_i(1)$ for any t ≥ 0. The arcsine law for the Brownian motion may be extended to this case.
For α = 1/2, this is the arcsine distribution. In Theorem 9, the occupation time of R + or R − at time 1 of the Skew Brownian motion is a random variable of type ξ 1/2,α .
If the Skew Brownian motion is replaced by a Skew Bessel process of dimension 2 − 2ν, then the occupation time at time 1 is distributed as one of the ξ ν,α [5,90]. In [90], it is also proved that for a general one-dimensional diffusion, the possible limit of A R + (t)/t, where A R + (t) is the occupation time of R + , is a random variable of type ξ ν,γ .
In the case of a Brownian motion on a graph (see Section 11.9.2 below), analytical considerations also allow us to compute the Laplace transforms of the occupation times on each of the edges of the graph [23].

The Skew Brownian motion as a flow
As the SBM is a strong solution to the SDE, it is natural to study the dependence of the SBM(α) with respect to its starting point.

Theorem 10 ([96]). Let $X^1$ and $X^2$ be two strong solutions to (46) with either: (a) The same starting point x and two different parameters $\beta_1$ and $\beta_2$ in (−1, 1). (b) Different starting points $x_1$ and $x_2$ and the same parameter β ∈ (−1, 1) \ {0}. Then, the $L^1$ distance between $X^1$ and $X^2$ is given by either

Generalization of the Skew Brownian motion (I)
A possible generalization of the Skew Brownian motion is to use a variable "skewness coefficient", which leads to an SDE with local time of type where β is a differentiable function on $\mathbb{R}_+$ with $\beta'(x) \in (-1, 1)$ for all $x \in \mathbb{R}_+$. This SDE has been studied in [2].
The proof of the theorem relies on an extension of the construction given in Section 10. The variably skew Brownian motion has been used recently for the Skorokhod embedding problem [18].

Generalization of the Skew Brownian motion (II)
Another natural generalization is the solution of the SDE which is motivated by some homogenization -i.e., change of scale -results [92].
Theorem 12 ([91]). If σ is bounded below by a positive constant, bounded and Lipschitz continuous, b is bounded and α is continuous with |α| ≤ 1, then the solution of (49) is pathwise unique.
Remark 14. In [91], S. Weinryb uses a normalization of the local time different from ours.

Generalization of the Skew Brownian motion (III)
The SDE solved by the Skew Brownian motion may be generalized as a SDE of type where ν is a signed measure. Of course, the case σ = 1 and $\nu = (2\alpha - 1)\delta_0$ corresponds to the SBM(α). Let BV be the space of functions $f : \mathbb{R} \to \mathbb{R}$ of bounded variation such that i) f is right continuous, ii) there exists ε > 0 such that f(x) ≥ ε.
Let M be the space of all signed measures ν on R such that |ν({x})| < 1 for all x ∈ R.
Of course, the proof relies on an argument similar to the one given in Section 5.2 on the SDE that the SBM(α) solves. Here, the idea is to construct a harmonic function in the following way [52, Lemma 2.1]: for ν ∈ M, there exists a unique (up to a multiplicative constant) function $f_\nu$ such that As f is in BV, $f'(dx)$ is the bounded measure associated to f as in (28). Moreover, $\nu^c$ is the continuous part of ν. In some sense, the measure ν corresponds to the logarithmic derivative of the function f(x). Then, we set $F_\nu(x) = \int_0^x f_\nu(y)\,dy$. It follows that X is solution to (50) if and only if $Y = F_\nu(X)$ is solution to The existence of a weak solution follows from the martingale problem [79,46] while the pathwise uniqueness follows from a result due to S. Nakao [63]. The Yamada-Watanabe theorem allows us to conclude (see [46] for example). For the SBM(α), $F_\nu(x)$ corresponds, up to a multiplicative constant, to the scale function S(x) given in (26).
In [52], J.-F. Le Gall also provides the following convergence theorem.
Theorem 14 ([52, Theorem 3.1]). Let (ν n ) n∈N be a sequence of measures in M and (σ n ) n∈N be a sequence of functions in BV such that for some constants M and ε, i) sup n∈N |ν n (R)| ≤ M , ii) ε ≤ σ n (x) ≤ M for all n ∈ N, x ∈ R and sup n∈N |ν n ({x})| ≤ 1 − ε for all x ∈ R. Let us denote by X n the solution to (37) with σ and ν replaced by σ n and ν n , with respect to the same Brownian motion (and then on the same probability space) and assume that X n 0 converges in L 1 to X 0 .
Assume moreover, that there exist two functions σ and f in BV such that . Then $X^n$ converges uniformly in $L^1(\mathbb{R})$ to X on [0, T] for any T > 0, where X is the strong solution to $X_t = X_0 + \int_0^t \sigma(X_s)\,dB_s + \int_{\mathbb{R}} \nu(dx) L^x_t(X)$.
Remark 15. This theorem underlines the importance of the function f and allows us to recover the result of Proposition 8. It has to be noted that one may construct a sequence $(\nu_n)_{n\in\mathbb{N}}$ of measures in M satisfying the hypotheses of Theorem 14 and converging weakly to a measure µ, but for which the measure ν given by this theorem differs from µ. Proposition 8 illustrates this point if we take for drift b a symmetric function with compact support in [−1, 1] with $\int_{-\infty}^{\infty} b(x)\,dx = \kappa/2$, so that $n b(x/n)\,dx$ converges to the Dirac measure $(\kappa/2)\delta_0$. Yet $n^{-1} X_{tn^2}$ converges to the solution to $Y_t = B_t + \frac{e^\kappa - 1}{e^\kappa + 1} L^0_t(Y)$.
J.-F. Le Gall also proved some extension of the Donsker theorem when σ = 1 in (50). This Donsker theorem was used and improved by P. Étoré in [27,28,29] in order to approximate one-dimensional diffusion processes with discontinuous coefficients by random walks (see also Section 11.9.3).
Recently, R. Bass and Z. Q. Chen have shown in [9] strong existence and pathwise uniqueness of (50) under the assumptions that i) ν ∈ M, ii) a is bounded below and above by a positive constant, iii) there exists a strictly increasing function f such that Let us note that if σ ∈ BV, then σ satisfies (52) with $f(x) = 2\|\sigma\|_\infty \int_{-\infty}^x |d\sigma|$, so that the results of [9] are slightly more general than those in [52].
The article [9] also provides us with a comparison principle and shows that there is no solution to (50) when ν({x}) > 1 for some x ∈ R (the non-existence of a solution was stated in [52] without proof).
As the local time plays a central role in the theory of one-dimensional diffusion processes, SDEs with local time are also considered in the study of diffusions with singular drift, in order to get some diffusions under weak hypotheses (see for example [9,10,30]), and in the study of Dirichlet processes [8,25]. In addition, they also appear as a tool to study the "critical cases" for results on strong/weak existence and uniqueness of solutions of SDEs (see [24,52] for example).
Finally, let us note that this kind of process appears also naturally when one studies diffusion processes generated by differential operators with discontinuous coefficients: see Section 11.9.

Spatial change of variable
The symmetric Itô-Tanaka formula allows us to state a stability result of solutions of (50) under a subclass of functions in BV.
Proposition 17. Let F be a function whose derivative f belongs to BV and is bounded above and below by a positive constant. If X is the solution to (50) with σ ∈ BV and ν ∈ M, then If F is one-to-one, then Y = F(X) is also solution to some SDE of type (50) with a coefficient $(f\sigma) \circ F^{-1} \in$ BV and a measure µ ∈ M with for some function h.
Proof. Formula (53) is a direct consequence of the symmetric Itô-Tanaka formula (32). After a one-to-one change of variable, if ϕ belongs to BV, then it is easily checked that R ϕ ′ ( dx)k(x) = R ϕ ′ ( dx)k(F −1 (x)) dx for any bounded, measurable function k. The measure µ is then computed with Lemma 2.1 in [52], which asserts the existence of h, the previous change of variable and the relation L ). As we have seen, the existence and uniqueness results on SDEs with local time rely on some application of this formula, to get rid of the local time. In [6], M. Barlow used this kind of transform to construct counter-examples to the uniqueness of strong solutions to some SDEs.
From a numerical point of view, this change of variable is important in order to simulate the process X, as shown in Section 11.9.3.
In [67], Y. Ouknine made use of the Itô-Tanaka formula to study the distribution of solutions of SDEs of type (50) with σ constant on $\mathbb{R}_+$ and on $\mathbb{R}^*_-$ and $\nu = \kappa\delta_0$. This class of SDEs remains stable when one uses for F in Proposition 17 functions of type $F(x) = p x^+ + q x^-$.

Random time change
Random time change is also a tool to study SDE with local time, as shown first in [24]. In [61,Chapter 5], M. Martinez used a random time change based on the SBM instead of a Brownian motion to provide us with existence and uniqueness results of one-dimensional SDEs in the spirit of the work of H. Engelbert and J. Schmidt (see [24] or [46,Section 5.5.A]).

Diffusions with discontinuous coefficients and diffusions on graphs
We now present some results about PDEs with discontinuous coefficients and diffusion equations on graphs. This gives the general framework to consider both a large subclass of SDEs of type (50) and extensions of the Walsh's Brownian motion presented in Section 11.2.

Infinitesimal generators and associated diffusion processes
Let a and ρ be two measurable functions on $\mathbb{R}$ such that 0 < λ ≤ a(x) ≤ Λ and λ ≤ ρ(x) ≤ Λ for all $x \in \mathbb{R}$. Let also b be a measurable bounded function. Let (L, Dom(L)) be the differential operator Using scale functions and speed measures, or analytical results on the fundamental solution of $\frac{\partial}{\partial t} - L$, or Dirichlet forms, it can be proved that (L, Dom(L)) is the infinitesimal generator of a continuous, conservative, strong Markov process $(X_t, t \ge 0; \mathcal{F}_t, t \ge 0; P_x, x \in \mathbb{R})$ (see [39,54,78] for example).
The next proposition follows from an application of Corollary 2 and Theorem 14, (See [27,29,54]). Note that a localization argument shall be used if b does not belong to L 1 (R; R) (see [78,Lemma II.1.12] for example).
Proposition 18. We assume in addition to the previous hypotheses that a and ρ belong to BV. Then for any starting point x, X is the unique strong solution to the SDE where B is a Brownian motion, $L^y_t(X)$ is the symmetric local time of X at the point y and The SDE (55) is a particular case of (50) encountered in Section 11.8.
The article [40] shows how to compute explicitly the density transition function and the Green function of this process using spectral analysis.

Transmission conditions and diffusions on a graph
The solution of the parabolic PDE $\frac{\partial u}{\partial t} = Lu$ with u(0, x) = f(x), where L is given by (54), has to be understood as a weak solution, that is as a function $u(t, x) \in C(0, T; L^2(\mathbb{R})) \cap L^2(0, T; H^1(\mathbb{R}))$ such that for all $\psi \in C^\infty([0, T]; \mathbb{R})$, Now, if the coefficients a and ρ belong to BV and $\{x_i\}_{i\in J}$ denotes the points at which a is discontinuous, then u is also solution to the transmission problem where u ∈ C 1,2 (R * [49]).
Remark 16. Note that we can get any arbitrary transmission condition at some point $x_i$ without changing the diffusion coefficient on the intervals $(x_{i-1}, x_i)$ and $(x_i, x_{i+1})$. For this, it is sufficient to multiply the coefficient a by $\mu_j$ and ρ by $1/\mu_j$ on $(x_j, x_{j+1})$ for a well chosen sequence $\{\mu_j\}_{j\in J}$.
Differential operators on graphs are natural extensions of one-dimensional differential operators with transmission conditions at some given points. Let G be an oriented graph with a length $\ell_e > 0$ associated to each edge e. Thus each edge e may be seen as a segment of $\mathbb{R}$ and a differential operator L e = where e ∼ v means that v is an endpoint of the edge e and $\frac{\partial f}{\partial n_{v,e}}$ is the derivative of f in the direction of the edge e at the vertex v.
The condition (57) prescribes the flux in each direction, and is clearly a transmission condition. The one-dimensional problem (56) can be seen as a diffusion equation on a graph whose vertices are the x i 's and whose edges are the intervals (x i , x i+1 ).
Indeed this generalization goes further, since one may derive an Itô formula for this diffusion process with a local time at each vertex [33]. Besides, approximations by random walks are considered in [26]. The differential operator (L, Dom(L)) is the infinitesimal generator of a stochastic process X, which can be seen as the limit of a diffusion process moving in small tubes around the edges and reflected at the boundary: See [35]. Diffusions on graphs are useful to describe the limiting behavior of perturbed Hamiltonian systems (see [34] and the subsequent works). Some of the Monte Carlo methods presented below in Section 11.9.3 may be adapted to this case. On some particular graphs, it is also possible to compute explicitly the transition density function of X: see [65] for example.

Numerical simulations
The simulation of the solution of an SDE with discontinuous diffusion coefficients is a problem which has hardly been treated (see however [45,16,93]). Until recently, the case of processes generated by a divergence form operator of type $\frac{1}{2}\partial_{x_i}(a_{i,j}\partial_{x_j})$ had not been rigorously investigated, and the simulation of the process it generates in the multi-dimensional case remains an open problem (one knows from [80] that the diffusion may be approximated by a Markov chain, but the computation of the transition probabilities of this Markov chain is intractable in the general case). Due to many fields of applications (geophysics, electro/magneto-encephalography, ecology, astrophysics, ...), it is however of practical importance, and the main problem is related to the understanding of the behavior of the diffusion process when it reaches a hypersurface where a is discontinuous.
In dimension one, Proposition 18 together with Proposition 17 allows us to describe exactly what happens to a diffusive particle that reaches a point where either ρ or a is discontinuous, if its trajectory is a realization of a trajectory of the process generated by the differential operator L given by (54). Although one may think of using Corollary 2 to regularize the coefficients, this leads to unstable simulations (see [29] for numerical examples). Moreover, in practical situations, the jumps of the coefficients may be very high (for example, in a fissured porous medium, the ratio may be of order 1,000 or higher). In addition, it does not explain what happens to the diffusion.
In dimension one, several schemes to simulate diffusion processes with discontinuous coefficients may be given, and we present here briefly several approaches. In many cases, we have assumed that the coefficients a and ρ belong to BV and that their points of discontinuity have no cluster points, or that they can be approximated well by such coefficients. We also note that one may assume that b = 0 by transforming the coefficients a and ρ into a(x) exp(−h(x)) and ρ(x) exp(h(x)) with $h(x) = 2\int_0^x b(z)/(a(z)\rho(z))\,dz$. This is the Zvonkin transform, as we saw in the proof of Proposition 8.
We present several schemes. Once the behavior of the diffusion is understood, it becomes possible to provide other schemes or to mix them.
A. If there is one discontinuity, one may easily adapt the method proposed by E. Hausenblas in [42] for the reflected diffusion processes. There, the simulation skips the small excursions when the process reaches the boundary, and the particle jumps according to the entrance law of the diffusion process of excursions whose size is greater than a fixed parameter ε. The adaptation can be done using the decomposition of excursions measures given in [55], and by choosing the sign of the excursion whose size is greater than ε with an independent Bernoulli random variable.
B. Using Proposition 17, one may find some one-to-one function F to get rid of the term with the local time in (55). There, one obtains some SDE with a drift term and a diffusion term whose coefficients are discontinuous. It is then possible to apply results on the simulations of SDEs with discontinuous coefficients. This is the strategy chosen by M. Martinez and D. Talay [61,62], where they use the Euler scheme whose convergence is shown in [93]. In addition, the speed of convergence is computed.
C. Still using Proposition 17, after having approximated the coefficients by piecewise constant coefficients (after the drift has been removed), one may find a one-to-one function G that transforms the solution X to (55) into the solution to the SDE where $\{x_n\}_{n\in J}$ is the set of points where a and ρ are discontinuous. Hence, around each point $x_n$, Y behaves like a Skew Brownian motion. The idea is then the following: We construct a sequence of points $\{y_k\}_{k\in K}$ such that $\{G(x_n)\}_{n\in J} \subset \{y_k\}_{k\in K}$ and $y_{k-1} < y_k < y_{k+1}$. We then set For this, we assume that $\theta_0 = 0$ and that $Y_{\theta_0} = Y_0$ belongs to $\{y_k\}_{k\in K}$. It is then possible to simulate exactly the Markov chain $(\theta_i, Y_{\theta_i})$. If $Y_{\theta_i} = y_k$ does not belong to $\{x_n\}_{n\in J}$, then this amounts to simulating the first exit time and position from $[y_{k-1}, y_{k+1}]$ for the Brownian motion, which can be done numerically with the help of analytical expressions of the density of the Brownian motion killed when it exits from some interval. If $Y_{\theta_i} = y_k = x_n$ for n ∈ J, then we have to assume that the $\{y_k\}_{k\in K}$ are such that $x_n - y_{k-1} = y_{k+1} - x_n$ (this is why we use a finer grid $\{y_k\}_{k\in K}$). Then, using the "symmetry" of the negative and positive excursions of the Skew Brownian motion, the first exit time and position for Y from $[y_{k-1}, y_{k+1}]$ can be deduced from the ones of the killed Brownian motion and the skewness parameter of Y around $G(x_n)$. Let us note that it is also possible to simulate $Y_T$ for an arbitrary time T. For this, we decide at each step with a Bernoulli random variable whether $\{\theta_{i-1} + \theta_i \le T\}$ or not. In the former case, it is also possible to simulate $(\theta_i, Y_{\theta_i})$ given $\{\theta_{i-1} + \theta_i \le T\}$ when one knows $(\theta_{i-1}, Y_{\theta_{i-1}})$. In the latter case, we are interested in simulating $Y_T$ given $\theta_i > T - \theta_{i-1}$, which can also be done from the density of the Brownian motion killed when it exits from some interval.
Once we have simulated the $(\theta_i, Y_{\theta_i})_{i=0,1,\dots}$, it is easy to use G to get the distribution of $(\tau \wedge T, X_{\tau \wedge T})$, where τ is the first exit time from some interval, and T > 0 is an arbitrary time.

D. We have already noted in Section 11.9.3 that J.-F. Le Gall has proved a Donsker theorem for SDEs of type $dZ_t=dB_t+\int_{\mathbb{R}}\nu(dx)\,dL^x_t(Z)$. This means that he constructed a random walk $(S_k)_{k\in\mathbb{N}}$ such that $n^{-1}S_{\lfloor n^2t\rfloor}$ converges in distribution to $Z_t$ for any $t\ge 0$. The transition probabilities of $S_k$ can be explicitly deduced from the measure $\nu$.
In [27], P. Étoré used the previous map $G$ to reduce the simulation of $X$ to this case, and then approximated the diffusion $Y$ using the random walk. He also computed the speed of convergence of this random walk. In some sense, this is also a simplification of the algorithm presented in C., where the $\{y_k\}_{k\in K}$ are equally spaced (with spacing $1/n$) and the exit times $\theta_{i+1}-\theta_i$ are replaced by their expectations, which are the same at every position and equal to $1/n^2$.
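As an illustration of the random walk approach, the following Python sketch simulates the simplest case only: the skew random walk attached to the Skew Brownian motion of parameter $p$, which steps up with probability $p$ when it sits at 0 and behaves like a simple symmetric random walk elsewhere. It is a sketch under this assumption, not the general construction of J.-F. Le Gall nor the algorithm of [27]; the function name `skew_random_walk` and its arguments are hypothetical.

```python
import random

def skew_random_walk(p, n, t, n_paths, seed=0):
    """Sample n^{-1} S_{floor(n^2 t)} for the skew random walk of parameter p.

    Hypothetical helper: the walk steps up with probability p when it sits
    at 0 and with probability 1/2 elsewhere.
    """
    rng = random.Random(seed)
    n_steps = int(n * n * t)  # floor(n^2 t) for t >= 0
    samples = []
    for _ in range(n_paths):
        s = 0
        for _ in range(n_steps):
            up = p if s == 0 else 0.5
            s += 1 if rng.random() < up else -1
        samples.append(s / n)
    return samples

if __name__ == "__main__":
    xs = skew_random_walk(p=0.7, n=9, t=1.0, n_paths=4000)
    # n^2 t = 81 is odd, so no sample is exactly 0.
    frac_positive = sum(x > 0 for x in xs) / len(xs)
    print(f"fraction of positive endpoints: {frac_positive:.3f}")
```

With an odd number of steps the walk cannot end at 0, and the fraction of positive endpoints should be close to $p$, in line with the property $P[X_t>0]=p$ of the Skew Brownian motion.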
E. In [28] (see also [29]), we constructed a two-dimensional Markov chain $(\theta_k,Z_k)_{k\in\mathbb{N}}$ from a realization of a trajectory of $X$ generated by $L$ by setting where $\mathcal{G}$ is an arbitrary grid whose spacing between consecutive points need not be constant.
Let us set for $t\in[\theta_k,\theta_{k+1}]$ and $K(t)=\inf\{k\in\mathbb{N}\mid\theta_k\ge t\}$. When the mesh of $\mathcal{G}$ decreases to 0, one may show that $(\theta_{K(t)},Z(t))$ converges uniformly in $t\in[0,T]$ in probability to $(t,X_t)$, and the rate of convergence may be computed. This construction no longer uses the SDE (55), and the assumption that $a$ and $\rho$ belong to BV may be dropped. We then approximate $X$ by simulating a Markov chain with the distribution of $(Z_k,\theta_k)_{k=0,1,2,\dots}$. For this, the transition probabilities of $(Z_k)_{k\in\mathbb{N}}$ are deduced from the scale function, since for
F. In [19,22] , where $B$ is a Brownian motion, the authors write the density $p(t,x,y)$ of $X$ as where $X^\beta$ is the SBM($\beta$) with , $p^\beta(t,x,y)$ is its density. They then consider several "Skew models" (such as the Self Exciting Threshold (SET) Cox-Ingersoll-Ross, SET Vasicek, SET Libor market, ...) and use spectral decompositions to compute and/or approximate the density given in (59). They also consider the case of Skew Bessel processes.
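The role played by the scale function in item E. can be illustrated on the Skew Brownian motion itself. The sketch below assumes the normalization $s(x)=x/p$ for $x\ge 0$ and $s(x)=x/(1-p)$ for $x<0$ of the scale function of the SBM of parameter $p$, and uses the classical exit formula for one-dimensional diffusions: started at $y_k$, the process reaches $y_{k+1}$ before $y_{k-1}$ with probability $(s(y_k)-s(y_{k-1}))/(s(y_{k+1})-s(y_{k-1}))$. The names `sbm_scale` and `up_probability` are hypothetical, and the code is not taken from [28,29].

```python
def sbm_scale(x, p):
    """Scale function of the SBM of parameter p (one possible normalization)."""
    return x / p if x >= 0 else x / (1.0 - p)

def up_probability(scale, left, mid, right):
    """Probability that a diffusion started at mid hits right before left,
    given its scale function."""
    return (scale(mid) - scale(left)) / (scale(right) - scale(left))

if __name__ == "__main__":
    p = 0.7
    scale = lambda x: sbm_scale(x, p)
    # Symmetric grid around the skew point 0: the chain moves up with probability p.
    print(up_probability(scale, -0.5, 0.0, 0.5))
    # Grid cell that stays on the positive side: plain Brownian behaviour.
    print(up_probability(scale, 1.0, 1.5, 2.0))
```

On a grid that is symmetric around 0 the upward probability reduces to $p$, while away from 0 one recovers the probability $1/2$ of the Brownian motion on an equally spaced grid.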

Parameter estimation
In [1,61], O. Bardou and M. Martinez have constructed a scheme to estimate both the skewness parameter $\gamma$ and the position of the point $x$ of the doubly reflected Skew Brownian motion $dX_t=dB_t+\gamma\,dL^x_t(X)+dL^{-1}_t(X)-dL^{1}_t(X)$. Their construction relies on the ergodicity of the underlying process and on its connections with PDEs with discontinuous coefficients.

Multi-dimensional extensions
The natural multi-dimensional extension of the Skew Brownian motion is an SDE involving the local time of the process on some hyper-surface. Yet it is a difficult task to construct such a process. We then briefly present the method proposed by N. Portenko on generalized diffusion processes, where the semi-group of the process is constructed using a Volterra series. Afterwards, we give some references to other constructions, some of them relying on fixed point theorems on SDEs, and others on Dirichlet forms.

Generalized diffusion processes
In [69,70,71], N. Portenko develops the concept of diffusions with singular drift, which we already used in Section 2.
Let $S$ be a surface in $\mathbb{R}^d$ which separates $\mathbb{R}^d$ into two regions, the interior region $D$ and the exterior region $\mathbb{R}^d\setminus D$, let $N$ be a vector field on $S$, and let $q$ be a continuous function from $S$ to $\mathbb{R}$. We assume that $S$ satisfies both the interior and the exterior sphere property, which means that for any $x\in S$, one can find balls $B$ and $B'$ with $B\subset D$, $B'\subset\mathbb{R}^d\setminus D$ and $\overline{B}\cap S=\overline{B'}\cap S=\{x\}$.
Let also $a$ be a continuous function with values in the space of symmetric $d\times d$ matrices, which is uniformly elliptic and bounded. A generalized diffusion is a diffusion process whose infinitesimal generator $L$ is formally written The idea is then first to construct the semi-group $(P_t)_{t>0}$ associated with $L$, and then to show that it generates a Feller process $(X,\mathbb{P}_x)$. The solution of the parabolic PDE is then $u(t,x)=\int_{\mathbb{R}^d}p(t,x,y)\varphi(y)\,dy=\mathbb{E}_x[\varphi(X_t)]$.
As the $\delta$-function imposes a flux condition on $S$, it is natural to use a single layer potential, generated by a distribution of charges $V(t,\cdot)$ on $S$ at time $t$, which we choose to be
$$V(t,x)=\frac{q(x)}{2}\bigl(N_+(x)\cdot\nabla u(t,x+)+N_-(x)\cdot\nabla u(t,x-)\bigr),\quad x\in S.$$
Let $p_0(t,x,y)$ be the fundamental solution to $L_0=\frac{1}{2}a_{i,j}\partial^2_{x_ix_j}$. The potential generated by $V(t,x)$ on $\mathbb{R}^d$ is then equal to
$$v(t,x)=\int_0^t\int_S p_0(t-s,x,y)V(s,y)\,d\sigma_y\,ds,$$
where $\sigma_y$ is the Lebesgue measure on $S$, and according to classical results on potential theory, $v$ is harmonic on $\mathbb{R}^d\setminus S$ and continuous on $\mathbb{R}^d$ (and thus when crossing $S$). In addition, for $x\in S$,
$$N_\pm(x)\cdot\nabla v(t,x)=\int_0^t\int_S N_+(x)\cdot\nabla p_0(t-s,x,y)V(s,y)\,d\sigma_y\,ds\mp V(t,x).\tag{64}$$
As $v(0,x)=0$, we then seek $u$ as the superposition of $v$ and $u_0(t,x)$ defined by
$$u_0(t,x)=\int_{\mathbb{R}^d}p_0(t,x,y)\varphi(y)\,dy.$$
By combining the previous equation with Equations (63), (64), we obtain that, in order for the flux condition in (62) to be satisfied, the charge distribution $V$ shall be a solution to
$$V(t,x)=\int_{\mathbb{R}^d}N_+(x)\cdot\nabla p_0(t,x,y)\varphi(y)\,dy+\int_0^t\int_S N_+(x)\cdot\nabla p_0(t-s,x,y)V(s,y)q(y)\,d\sigma_y\,ds.\tag{66}$$
This equation may be solved using a Volterra series, which means by constructing recursively
$$V^{(0)}(t,x)=\int_{\mathbb{R}^d}N_+(x)\cdot\nabla p_0(t,x,y)\varphi(y)\,dy,\qquad V^{(n+1)}(t,x)=\int_0^t\int_S N_+(x)\cdot\nabla p_0(t-s,x,y)V^{(n)}(s,y)q(y)\,d\sigma_y\,ds$$
and establishing the convergence of the series $V(t,x)=\sum_{n\ge 0}V^{(n)}(t,x)$ (the advantage we had in Section 2 was that there was no need to consider such a series, since $N_+(x)\cdot\nabla p_0(t,0,y)=0$ when $p_0(t,x,y)$ is the one-dimensional heat kernel). We have then constructed a charge distribution on $S$ whose single layer potential $u$ gives a solution to (61) outside $S$. It is then easily obtained that $u$ is a solution to (62).
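To illustrate how a Volterra series is summed in practice, here is a toy Python sketch for a scalar equation $V(t)=f(t)+\int_0^t k(t,s)V(s)\,ds$, with the terms $V^{(0)}=f$ and $V^{(n+1)}(t)=\int_0^t k(t,s)V^{(n)}(s)\,ds$ computed by the trapezoidal rule. The smooth kernel $k$, the uniform grid and the function name `volterra_series` are assumptions made for illustration only: the kernel $N_+(x)\cdot\nabla p_0$ of (66) is singular and also integrates over $S$, so this is not an implementation of N. Portenko's construction.

```python
import math

def volterra_series(f, kernel, t_max, n_grid, n_terms):
    """Approximate the solution of V(t) = f(t) + int_0^t kernel(t, s) V(s) ds
    on a uniform grid by summing V^(0) + V^(1) + ... + V^(n_terms)."""
    h = t_max / n_grid
    ts = [i * h for i in range(n_grid + 1)]
    term = [f(t) for t in ts]          # V^(0) = f
    total = list(term)
    for _ in range(n_terms):
        new_term = [0.0] * (n_grid + 1)
        for i in range(1, n_grid + 1):
            # Trapezoidal rule for int_0^{t_i} kernel(t_i, s) V^(n)(s) ds.
            acc = 0.5 * (kernel(ts[i], ts[0]) * term[0]
                         + kernel(ts[i], ts[i]) * term[i])
            for j in range(1, i):
                acc += kernel(ts[i], ts[j]) * term[j]
            new_term[i] = h * acc
        term = new_term                # V^(n+1)
        total = [a + b for a, b in zip(total, term)]
    return ts, total

if __name__ == "__main__":
    # Toy case f = 1, kernel = 1: V^(n)(t) = t^n / n!, so the series sums to exp(t).
    ts, v = volterra_series(lambda t: 1.0, lambda t, s: 1.0, 1.0, 200, 20)
    print(v[-1], math.exp(1.0))
```

In this toy case each term of the series is $t^n/n!$, which makes the factorial decay responsible for the convergence of Volterra series visible.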
We now wish to construct the semi-group $(P_t)_{t>0}$ of $L$. For this, we set
$$P^0_t\varphi(x)=\int_{\mathbb{R}^d}p_0(t,x,y)\varphi(y)\,dy\quad\text{and}\quad P_t\varphi(x)=P^0_t\varphi(x)+\int_0^t\int_S p_0(t-s,x,y)V(s,y)q(y)\,d\sigma_y\,ds,$$
where $x\in\mathbb{R}^d$, $t>0$ and $V$ is the solution to (66). This is a perturbation formula. It is then possible to prove that $P_{s+t}\varphi(x)=P_sP_t\varphi(x)$ for any $x\in\mathbb{R}^d$, that $P_t\varphi(x)\to\varphi(x)$ as $t\to 0$ for $x\in\mathbb{R}^d$, and that $N_+(x)\cdot\nabla P_t\varphi(x)=(1-q(x))V(t,x)$ and $N_-(x)\cdot\nabla P_t\varphi(x)=(1+q(x))V(t,x)$, so that $(t,x)\mapsto P_t\varphi(x)$ is a solution to (62). As in the proof of Proposition 1, if $|q(x)|\le 1$, then $P_t\varphi$ is non-negative when $\varphi$ is non-negative. In addition $P_t1=1$, so that $(P_t)_{t>0}$ is a contraction semi-group for the uniform norm.
Theorem 15 ([69, Theorem 1]). Under the previous hypotheses on $S$, $a$ and $q$, and if $|q(x)|\le 1$ for all $x\in S$, the semi-group $(P_t)_{t>0}$ generates a conservative, continuous Markov process $(X,\mathbb{P}_x,x\in\mathbb{R}^d)$.
Remark 17. If $S$ is some hyper-plane in $\mathbb{R}^n$ and $a$ is the identity matrix, then similar computations also lead to the same result [69, Section 7].
One may wish to characterize the diffusion process so constructed by its diffusion and drift coefficients. It is standard that the diffusion coefficient is characterized by $a(x)\theta\cdot\theta=\lim_{t\to 0}\frac{1}{t}\mathbb{E}_x[((X_t-x)\cdot\theta)^2]$ for any $\theta\in\mathbb{R}^d$. Using the semi-group and a continuous test function $\psi$ with compact support, one can show that where $P(t,x,dy)$ is the transition probability of the process. Still using the test function $\psi$, one gets that is which characterizes a drift concentrated on $S$.
In [70], N. Portenko decomposes the process X as a semi-martingale.
Theorem 16 ([70, Theorems 1 and 2]). There exists a continuous additive functional $(\zeta_t)_{t\ge 0}$ of bounded variation such that $M_t=X_t-X_0-\zeta_t$ is a continuous square integrable martingale with $\langle M\rangle_t=\int_0^t a(X_s)\,ds$ and such that $\frac{d}{dt}\mathbb{E}_x[\zeta^i_t]=T_t(q(x)N^i_+(x))$ and $\zeta^i_t=\int_0^t 1_{\{X_s\in S\}}\,d\zeta^i_s$ for $i=1,\dots,d$.
Some of these results have been extended recently (see for example [72]), in particular to take into account some elastic killing effect when the process passes through the surface $S$.

SDE with local time in the multi-dimensional case
There exist several works on SDEs with local time that may appear as generalizations of the SBM. However, their constructions are rather technical in general, and the hypotheses tedious to write. This is why we give here just a few references.
A. The article [84] provides a construction of some diffusion $X$ associated to the Dirichlet form where $G$ is a domain of $\mathbb{R}^d$, $\alpha\in(0,1)$, $a$ is uniformly elliptic, $\rho>0$ and, of course, $a$ and $\rho$ satisfy suitable integrability conditions. If $a$ and $\rho$ are "weakly differentiable", then one can decompose $X$ as the solution to some SDE with a local time associated to $\partial G$.
B. Given a measure $\mu$ which is singular with respect to the Lebesgue measure, Y. Oshima proved in [66] the weak existence and uniqueness of the $d$-dimensional process solution to Lipschitz continuous. This diffusion is shown in [97] to be a generalized diffusion in the sense of N. Portenko [71] with drift $(\beta\nu+\alpha\circ\pi_S)\delta_S$ and a diffusion term $\mathrm{Id}+\sigma\circ\pi_S\,\delta_S$.

Domains of applications
Being related to diffusion processes with discontinuous coefficients and to diffusions on graphs, the Skew Brownian motion has potential applications in many fields: in astrophysics [98], in geophysics and the study of heterogeneous media [35,50,56,57,73,92], in perturbed Hamiltonian systems [37], in the study of hysteresis phenomena [32], in biology (some modelling of nerve impulses involves parabolic PDEs as in Section 11.9.2 [64]), in ecology (the dynamics of populations moving between different refuges is studied in [15]), in finance and actuarial sciences [19,20,21], in optimization (see below)... In all these applications, the properties of the Skew Brownian motion may be used to solve the related problems either by Monte Carlo methods or analytically.

Applications to optimal problems
The article [53] looks for a drift $b\ge 0$ with $\int_0^1 b(x)\,dx<+\infty$ that minimizes $\mathbb{E}_{x_0}[\tau]$, where $\tau=\inf\{t\ge 0\mid X_t=1\}$ and $X$ is the solution to $dX_t=dB_t+b(X_t)\,dt$ with a reflecting boundary at 0. In fact, the optimal drift is a measure that can be computed explicitly, and that is of type $b(dx)=b(x)\,dx+\kappa\delta_{x_0}(dx)$. Hence it involves a process of Skew Brownian motion type.
The constructions in [2] and in [17] are also related to methods for solving resource allocation problems (see [60] for example). Recently, the variably skew Brownian motion presented in Section 11.6 has also been used for solving the Skorokhod embedding problem [18].