On resolving singularities of piecewise-smooth discontinuous vector fields via small perturbations

A two-fold singularity is a point on a discontinuity surface of a piecewise-smooth vector field at which the vector field is tangent to the surface on both sides. Due to the double tangency, forward evolution from a two-fold is typically ambiguous. This is an especially serious issue for two-folds that are reached by the forward orbits of a non-zero measure set of initial points. The purpose of this paper is to explore the concept of perturbing the vector field so that forward evolution is well-defined, and characterising the perturbed dynamics in the limit that the size of the perturbation tends to zero. This concept is applied to a two-fold in two dimensions. Three forms of perturbation: hysteresis, time-delay, and noise, are analysed individually. In each case, the limit leads to a novel probabilistic notion of forward evolution from the two-fold.


Introduction
Systems of piecewise-smooth, discontinuous ODEs are utilised in many areas to model phenomena involving switching events. Except in special cases, the dynamical behaviour local to a point on a discontinuity surface conforms to one of three scenarios: crossing, stable sliding, or repelling. These are illustrated in Fig. 1. In the first scenario trajectories simply cross the discontinuity surface. In the second scenario the discontinuity surface attracts nearby trajectories. Here Filippov's solution [1,2,3,4] may be used to define sliding motion on the discontinuity surface. Sliding motion has been described in models of diverse phenomena, including oscillators subject to dry friction [5,6,7,8], relay control [9,10,11], and population dynamics [12,13,14,15]. In the third scenario trajectories head away from the discontinuity surface from both sides. In this case forward evolution from a point on the discontinuity surface is ambiguous or non-unique and a differential inclusion may be used to define a set-valued solution [16,17,18]. However, from an applied viewpoint this is typically not an important issue because there are no initial conditions for which forward evolution leads to the discontinuity surface. Points on discontinuity surfaces at which the vector field is tangent to the surface on one side usually represent boundaries at which the nature of the discontinuity surface changes between one of the three generic scenarios. For example, at the visible tangency shown in Fig. 1, forward evolution changes from sliding motion to regular motion along the tangent trajectory. Here the tangency is referred to as visible because the tangent trajectory is a solution of the system. In contrast, the tangency shown in the bottom right sketch of Fig. 1 is said to be invisible because the tangent trajectory is not seen.
Points on discontinuity surfaces at which the vector field is tangent to the surface on both sides are known as two-folds. Such points arise generically in systems of at least three dimensions [19,20], or as codimension-one bifurcations in planar systems [21]. As with repelling discontinuity surfaces, forward evolution from two-folds is generally ambiguous. However, unlike for repelling discontinuity surfaces, the ambiguity of forward evolution from a two-fold is a serious issue if many trajectories go into the two-fold. This is the case for the two-fold depicted in Fig. 2, because the area of the set of all points whose forward orbits intersect the two-fold is non-zero.
For planar systems there are several distinct generic two-folds. These may be distinguished by whether neither, one, or both tangent trajectories are visible, and the relative direction of these trajectories at the two-fold [21,22]. For Fig. 2, both tangent trajectories are visible and they point in the same direction at the two-fold. The tangent trajectories are the orbits of the left and right vector fields that pass through the origin. We write them as, x ± (t) ≡ (x ± (t), y ± (t)), and for simplicity assume, x ± (0) = (0, 0).
In three dimensions, a two-fold for which at least one tangent trajectory is visible, may exhibit an ambiguity similar to that of Fig. 2 in that the forward orbits of a nonzero volume of initial points intersect the two-fold [19,20]. Dynamics local to generic two-folds in three dimensions have been systematically classified [23,24,25]. In more than three dimensions, two-folds have similar properties [26]. Also, it has recently been shown that two-folds arise in models of simple circuit systems [27], and have deep connections to folded nodes of slow-fast systems [28,29].
However, dynamical behaviour near two-folds should only be understood with the knowledge that as mathematical models nonsmooth equations only represent an approximation to reality. Discontinuities are used to model events, such as impacts, that on an appropriately fine scale are not instantaneous and perhaps actually smooth, although highly nonlinear. Model inaccuracies are crucial at sensitive regions of phase space; indeed two-folds are points of extreme sensitivity.
To address this issue, in this paper physically motivated perturbations of a discontinuous ODE system are analysed with the aim of resolving the ambiguity of forward evolution at twofolds. In order to obtain analytical results the scenario depicted in Fig. 2 is studied because it is the simplest two-fold that exhibits the serious ambiguity issue of interest. Higher-dimensional scenarios are left for future work.
The following three types of perturbation are considered separately. Hysteresis -a common feature of control systems [30] -is incorporated by replacing the discontinuity surface with a hysteretic band of width 2ε. In this context, and throughout the paper, ε > 0 is assumed to be small. Second, time-delay is added. Specifically, it is supposed that the functional form of the equations governing forward evolution switches at a time ε after trajectories intersect the discontinuity surface. Small time-delay is inherent in many switching systems, such as relay control systems [30], and represents the time lag between when the switching threshold is crossed and a change in dynamics is enacted. Lastly, noise of amplitude ε is added to the differential equations. Noise and uncertainty are ubiquitous in real-world systems. Additive white Gaussian noise is used here as this is possibly the simplest manner by which noise may be incorporated into the system.
With a small hysteretic band, stable sliding motion approaching the two-fold is replaced by rapid switching motion. Importantly, for any ε > 0, forward evolution is uniquely determined in a neighbourhood of the two-fold. Forward orbits pass close to the two-fold and then are directed away from the two-fold along a path close to either x + (t) or x − (t). We are not interested in the x + (t) Figure 2: Dynamics local to a generic planar two-fold for which both tangent trajectories, x + (t) and x − (t), are visible and point in the same direction at the two-fold. precise path that orbits take, only whether they eventually "head right", that is follow close to x + (t) after passing by the two-fold, or "head left". Given, say, an interval of initial points on the discontinuity surface below the two-fold, for any ε > 0, some fraction of forward orbits of these points head right after passing by the two-fold. In the limit, ε → 0, this fraction approaches a constant value, ̺ hy . When the system is instead perturbed by the inclusion of time-delay, the dynamics fit a similar description, but in the limit, ε → 0, the fraction of forward orbits that head right approaches a different value, ̺ td . The addition of noise yields a system of stochastic differential equations that have a unique probabilistic solution for any ε > 0. From a suitable initial point below the two-fold, the stochastic system defines a probability that sample paths from this point eventually head right. As ε → 0, this probability approaches a value, ̺ no . Thus in each case, with ε > 0 forward evolution is well-defined, and upon passing near the two-fold most orbits (or sample paths in the case of the noise) follow a path close to x + (t) or x − (t). In each case we obtain a limiting fraction or probability, ̺, of trajectories that head right.
The concept of adding noise to remove an ambiguity is not new. Under general assumptions the addition of noise to a system of nonsmooth ODEs with multiple solutions gives a stochastic system with a unique solution [31,32,33]. In the zero-noise limit, this solution may represent a useful weighted selection of all possible solutions [34,35]. Recently such ideas have been extended and applied to transport equations for fluid dynamics [36]. However, different ways of adding noise can give different solutions in the zero-noise limit [37]. In addition, for a slow-fast system, when both the zero-noise limit and slow-fast limit are taken, the limiting solution may depend on the order in which the limits are applied [38]. A complete description of the zero-noise limit has been achieved for some simple one-dimensional systems [39,40].
Variations on this concept have been applied in alternate contexts. For instance, forward evolution from an unstable hyperbolic equilibrium of an N-dimensional system of ODEs is motionless, but with arbitrarily small noise, sample paths escape the equilibrium and do so, with high probability, along a path near to the invariant manifold associated with the largest eigenvalue of the equilibrium [41]. For systems of heteroclinic networks, trajectories have a "choice" at saddle equilibria as to which heteroclinic path to take. The addition of noise can be used to replace this choice with a probability [42,43].
The remainder of the paper is organised as follows. Equations describing the planar two-fold of Fig. 2 are introduced in §2. Perturbation by hysteresis, time-delay and noise are described and analysed in §3, §4, and §5, respectively. Conclusions are given in §6.

A planar visible-visible two-fold
We consider the dynamics of a general two-dimensional system of ODEs near a smooth discontinuity surface. By choosing coordinates, (X, Y ), such that the discontinuity surface coincides with the Y -axis, a description of the local dynamics of the system may be written as where we assume that the four functions that appear in (2.1) are smooth on the closure of their associated half-planes. In order for (2.1) to exhibit the two-fold of Fig. 2 at the origin, we must have (2. 2) The first two equations of (2.2) specify that the left and right vector fields are tangent to the discontinuity surface at the origin, the second two equations specify that both tangent trajectories are visible, and the third two equations specify that both tangent trajectories are directed upward.
In order to minimise the number of parameters that must be considered, we define the scaled state variables Since only dynamics near the origin is of interest, we expand the two halves of the vector field as Taylor series centred about the origin, using O(k) to denote terms that are order k, or larger, in x and y. Specifically, (2.1) may be rewritten as Consequently, A and B may take any positive values. The limiting fractions or probabilities, ̺ hy , ̺ td and ̺ no , determined in the the following sections, depend only on A and B, and not on any higher order terms. For this reason we write ̺ = ̺(A, B) for each type of perturbation. Furthermore, in each case ̺ satisfies the symmetry property To see this, note that if we were to scale x and y in (2.4) such that the vector field for x < 0 is simply (ẋ,ẏ) = (−y, 1), plus higher order terms, then the vector field for x > 0 would become (ẋ,ẏ) = 1 A y, 1 B , plus higher order terms. Reversing the sign of x would then recover (2.4), but with A and B replaced by 1 A and 1 B , and the concepts of left and right switched. Therefore if ̺(A, B) represents the limiting fraction or probability of trajectories heading right, then is the probability of trajectories heading left. Assuming these probabilities sum to 1 (which is the case for each of the three types of perturbation), we therefore have (2.6).

Hysteresis
In this section the switching condition of the system (2.4) is replaced with a hysteretic band spanning −ε ≤ x ≤ ε. More precisely, suppose that the equations governing forward evolution switch from the right half-system to the left half-system when an orbit reaches x = −ε, and similarly the governing equations switch from the left half-system to the right half-system when an orbit reaches x = ε. Algebraically this equates to In [44], an analogous O(ε) hysteretic band was added to a piecewise-smooth system exhibiting a stable periodic orbit with sliding motion. Here the authors found that the periodic orbit persisted only for relatively small values of ε. More generally, hysteresis may be the cause of complicated and chaotic dynamics [45,46,47]. The stable sliding motion of (2.4) that occurs at x = 0 for y < 0, becomes rapid switching motion across −ε ≤ x ≤ ε for the system (3.1), see Fig. 3. Upon reaching the vicinity of the origin, some orbits head right, and the others head left. For each orbit, there is no ambiguity associated with its forward evolution. The purpose of this section is to precisely formulate, and then compute, the fraction of orbits that head right.
There are two critical trajectories that separate orbits by whether they eventually head right or eventually head left. These trajectories have a tangential intersection with one of the hysteretic switching manifolds (x = ±ε). The intersections occur at some points, (−ε, y L ) and (ε, y R ), as shown in Fig. 3. For our purposes it is sufficient to note that the tangency points satisfy We follow the critical trajectories backward from their tangential intersections assuming that switching occurs at every intersection with x = ±ε, as shown in Fig. 3. The values, y k , for k ≥ 1, are used to denote the successive intersections of the critical trajectories with x = 0 at which the trajectories are governed by the right half-system. By using odd values of k for the critical trajectory passing through (−ε, y L ), and even values of k for the critical trajectory passing through (ε, y R ), we have y k < y k−1 , for all k ≥ 2. Now consider the forward orbit of a point, (0, y), with y k < y ≤ y k−1 for some k ≥ 2, that initially follows the right half-system. Simply, if k is odd the orbit eventually heads right, and if k is even the orbit eventually heads left.
If we treat the initial point, (0, y), as fixed, and decrease ε to zero, the future of the forward orbit changes between eventually heading right and eventually heading left, and in the limit ε → 0 its future is undefined. For this reason, we instead consider a large collection of initial points and study the fraction of points for which forward evolution eventually heads right. To make this precise, we first let x * > ε be an ε-independent value chosen sufficiently small to ensure that global dynamics do not play a role. We then specify that an orbit "heads right" if the orbit first exits the region |x| < x * , at x = x * . We define the following indicator function for the forward orbit of any point, (0, y), that initially follows the right half-system: The assumption that orbits initially follow the right half-system is taken without loss of generality and the value of ̺ hy is not affected by this choice.) For any interval of y-values, I, we let represent the fraction of the interval I for which forward evolution eventually heads right. Already we have shown Figure 3: A sketch of phase space near the origin of the hysteretic system (3.1). The thick curves are the critical switching trajectories that divide orbits that eventually head right and orbits that eventually head left. For instance, as indicated, the forward orbit of any point, (0, y), with y 3 < y ≤ y 2 , that initially follows the right half-system, eventually heads right, whereas if y 2 < y ≤ y 1 , then the orbit eventually heads left.
Our goal is to derive where [y min , y max ] is a small ε-independent interval of the y-axis below the origin. In the following paragraphs we will see that ̺ hy is independent of y min and y max , and thus ̺ hy is an fundamental quantity of the unperturbed system. Before we consider such ε-independent intervals, it is useful to first compute an explicit formula for q [y k ,y k−2 ] . For any initial point (x 0 , y 0 ), with x 0 = O(ε) and y 0 = O ( √ ε), if evolution (forward or backward) is governed purely by the left half-system up to a time t = O ( √ ε), then the location of the orbit at this time is given by Similarly, if evolution is purely governed by the right half-system, then the orbit is located at Similar substitutions using both (3.7) and (3.8) lead to In this fashion we obtain the recursion relation Then it is easy to show that the combination (3.9)-(3.11), gives the explicit formula By substituting (3.12) into (3.5), and considering large k, we obtain Naturally we would like to use (3.13) to evaluate (3.6). Let k min and k max be the smallest and largest odd integers for which y k ∈ [y min , y max ]. Then, by (3.4), where the error term corresponds to the O(ε) differences, y max − y k min and y kmax − y min . The quotient, (3.14), may be written as a convex combination of, For this reason we have the bound: where the extrema are computed over all odd k with k min + 2 ≤ k ≤ k max . However, k min , k max → ∞ as ε → 0, and since it has not been shown that the O( √ ε) term in (3.13) remains bounded as k → ∞, equations (3.13) and (3.15) are insufficient for us to derive the anticipated and intuitive result, (3.16), given below. We have overcome this technical difficulty via an involved procedure of carefully bounding error terms. We state the following result and provide a proof in Appendix A.
Lemma 1. Given any −1 ≪ y min < y max < 0, The condition, −1 ≪ y min , ensures that global features of the system do not influence the dynamics. By (3.6) and (3.16), As mentioned above, ̺ hy is independent of the values of y min and y max . A plot of ̺ hy (A, B) is shown in Fig. 4.

Time-delay
Now suppose that the switching condition of (2.4) involves a time-delay of length ε: The inclusion of time-delay may destroy periodic orbits with sliding [44]. Dynamics and bifurcations specific to general piecewise-smooth ODE systems with time-delayed switching conditions are discussed in [48], and in [49] for systems with symmetry. If both hysteresis and time-delay are incorporated in the switching condition, there are yet more potential complications [50]. About the negative y-axis, dynamics of the time-delayed system (4.1) switches at curves, Γ − and Γ + (shown in Fig. 5) that may be thought of as the result of evolving the y-axis by a time ε under each half-system. As in the previous section we use series solutions to obtain analytical results. If x 0 = O(ε 2 ) and y 0 = O(ε), then evolution up to a time, t = O(ε), for the left half-system of (4.1) is given by and for the right half-system of (4.1), is given by As shown in Fig. 5, upon passing close to the origin, most forward orbits cease switching and either head right or head left. However, there are two critical orbits that switch across the positive y-axis indefinitely, or at least until global features of the system induce different dynamical behaviour. As in the previous section, we let y k for k ≥ 1 denote the intersections of these orbits with the negative y-axis at which the orbits are governed by the right half-system, see and our goal is to calculate x Γ − Γ + for some −1 ≪ y min < y max < 0. The remainder of this section consists of a derivation of formulas involving the switching points of the two critical orbits, a recursion relation for y k , and a numerical evaluation of (4.7). The orbit of (4.1) that is shown in Fig. 6 is located at (0, z 0 ) at t = 0 and satisfies the following two assumptions. First, it is assumed that for all t ∈ (−ε, 0), the orbit lies to the right of x = 0. Consequently, until t = ε, the orbit is governed by the right half-system. Second, we assume that z 0 < 0 is sufficiently small that at t = ε, the orbit is located at a point (x 1 , z 1 ) with x 1 > 0. By (4.3) we have The points, (x k , z k ), for k ≥ 1, are used denote successive switching points of the orbit. We let T 1 denote the time taken for the orbit to first return to x = 0, then Similarly we let T k , for each k ≥ 2, denote the time taken for the orbit to reach x = 0 from each (x k−1 , z k−1 ). Finally we let S k , for each k ≥ 1, denote the elapsed time between the consecutive switching points (x k , z k ) and (x k+1 , z k+1 ). Note that, S 1 -the time taken for the orbit to travel between (x 1 , z 1 ) and (x 2 , z 2 ) -is, by definition, also equal to the length of time that the orbit spends in the left half-plane between (0, z 0 ) and (x 1 , z 1 ), that is, Similarly S 2 is equal to the length of time that the orbit spent previously in the right, that is S 2 = ε − T 1 + T 2 , and for k ≥ 3, S k = S k−2 − T k−1 + T k . From this, also by (4.11), we obtain the simpler recursion relation:  and (4.14) For any values of A and B, the above recursion relations may be used to numerically identify the value, z 0 = y 1 (A, B), plotted in the inset of (5), for which, in the limit ε → 0, the orbit switches across the positive y-axis infinitely many times. The analogous value,ỹ 1 < 0, for the orbit that initially heads right, then switches indefinitely, may obtained in a similar fashion. Alternatively, by symmetry,ỹ Note that it appears to be not possible to obtain an explicit expression for y 1 (A, B) from the recursion relations. For backward evolution on the negative y-axis, illustrated in Fig. 5, the following recursion relation may be derived from (4.2) and (4.3) (in the same manner as in the previous section): (4.16) Also The formulas are sufficient for us to perform an efficient numerical evaluation of the fraction, q [y k ,y k−2 ] . To do this, for any A and B, y 1 is computed via an iterative scheme that identifies a value of z 0 for which recursion according to (4.12), (4.13) and (4.14), and starting from (4.8)-(4.11), gives values (x k , z k ) that lie on successively opposite sides of x = 0 for as many values of k as one chooses to compute. Then (4.15) is used to repeat this process and computeỹ 1 . Successive y k are computed via (4.16) and (4.17), and finally q [y k ,y k−2 ] is evaluated by (4.6).
It is expected that ̺ td = lim k→∞ lim ε→0 q [y k ,y k−2 ] , since this identity holds for the hysteretic perturbation of the previous section (it is an elementary consequence of Lemma 1). However, since here an exact expression for y k is not known, and in view of the complexity in proving Lemma 1, a verification of this claim is left for future studies. Fig. 7 shows a plot of ̺ td approximated by evaluating q [y k ,y k−2 ] at suitably large k and small ε via the iterative scheme described above.

Noise
This section concerns the following formulation of (2.4) in the presence of noise: Here ε denotes the overall noise amplitude and d ij are constants that control the relative strength of the two-dimensional noise in the x and y directions. Alternative formulations include coloured noise, spatially dependent noise, and multiplicative or parametric noise, each of which may be more suitable in certain applications.

General aspects of the zero-noise limit
Equation (5.1) is an example of a multi-dimensional stochastic differential equation, for which the drift, φ, may be discontinuous. If φ is measurable and bounded and the diffusion matrix, D, is non-singular, then (5.2) has a unique strong solution from any initial point [31,32,51,52]. In following discussion, p ε (x, t; x 0 ) denotes the probability density function of x(t), given If φ is smooth, (5.3) has a unique solution, call it x det (t; x 0 ). Straight-forward asymptotic arguments are sufficient to demonstrate weak convergence: p ε (x, t; x 0 ) → δ(x − x det (t; x 0 )) as ε → 0, where δ denotes the Dirac-delta function [53,54]. If φ is continuous but non-differentiable, (5.3) may have multiple solutions and p ε may converge to a weighted sum of delta functions centred at some of these solutions. In particular we may have as ε → 0, where x 1 (t; x 0 ) and x 2 (t; x 0 ) are two different solutions to (5.3).
In the short article [39], stochastic calculus methods are used to prove (5.4) in one dimension when φ is continuous, odd, nondecreasing, convex on [0, ∞), and with a finite value of lim ∆→0 ∆ 0 1 φ(x) dx, when x 0 = 0. Here, ̺ = 1 2 by symmetry, and of the uncountably many solutions to (5.3), the x 1 and x 2 that appear in (5.4) are the two solutions for which x(t) = 0 for all t > 0. These are called the minimal and maximal solutions because, for any other solution, x(t), for all t > 0, x 1 (t) < x(t) < x 2 (t). In [40], similar results are obtained when φ is not odd and the value of ̺ is derived using first passage concepts. Similar results for multiple dimensions have recently been described in [55].
For discontinuous φ, it is necessary to consider generalised notions of a solution to (5.3) [18]. In [56] was is shown that solutions to (5.2) may limit only to Filippov solutions of (5.3) as the noise amplitude is taken to zero. However, in the case that there are multiple solutions, this does not tell us which Filippov solutions contribute to the limiting density. On stable sliding regions there is a unique Filippov solution, and if sliding occurs for the duration of time interval, [0, t], it may be shown directly that p ε (x, t; x 0 ) limits to a delta-function centred at the Filippov solution as ε → 0, via an asymptotic expansion of p [57].
If φ is discontinuous, then, to the author's knowledge, only when φ is one-dimensional and piecewise-constant is an exact expression for p ε (x, t; x 0 ) known. Specifically, for any constants φ ± ∈ R, the transitional probability density function of is given by see [58,59]. Moreover, if φ − < 0 < φ + , then multiple Filippov solutions emanate from x = 0, and by using Laplace's method [60] to evaluate (5.6) asymptotically, it may be shown that This matches the results for continuous φ, in that φ − t and φ + t are the minimal and maximal Filippov solutions to (5.5) with ε = 0. Large deviations for the case (φ − , φ + ) = (−1, 1) and generalisations are described in [61].

Asymptotics of a first passage representation
Our interest is in forward evolution of (5.1) from points on the negative y-axis. For small ε > 0, upon passing near the two-fold, sample paths follow close to either x + (t) or x − (t) with high probability. This is illustrated in Fig. 8. Here the probability that a sample path will eventually head right or head left is determined via first passage methods for arbitrarily small ε > 0. We restrict our attention to an ε-independent rectangle, [−x * , x * ] × [−y * , y * ], in the (x, y)plane, and assume x * , y * > 0 are sufficiently small that global features of the dynamics may be ignored. For any initial point, (x(0), y(0)) = (x 0 , y 0 ), inside the rectangle, we let Q ε (x 0 , y 0 ) denote the probability that first passage by (5.1) to the boundary of the rectangle occurs at a point with x > 0. We then define ̺ no ≡ lim ε→0 Q ε (0, y 0 ; x * , y * ) , (5.8) for, −y * < y 0 < 0, as this represents the probability that forward evolution from the negative yaxis heads right after passing by the origin in the zero-noise limit. Below it will become apparent that this limiting probability is independent of the values of x * , y * and y 0 as implied by (5.8).
As described in stochastic differential equation texts, such as [62], Q ε is the C 1 solution to the elliptic boundary value problem comprised of the backward Fokker-Planck equation for (5.1): and the boundary conditions The boundary conditions follow from the observation that first passage occurs in zero time for any point on the boundary of the rectangle. Therefore, as stated in (5.10), on the boundary, Q ε = 1 if x 0 > 0, and Q ε = 0 otherwise. A description of Q ε for small ε may be obtained from an asymptotic expansion of (5.9)-(5.10). Since only lim ε→0 Q ε is of interest, only the leading order term in this expansion is required here. A formal justification of the expansion is a difficult analytical task deferred for future work.
Here the scaling laws of the asymptotic expansion are obtained in an intuitive fashion via a blow-up of the dynamics of (5.1) about the origin. Specifically, with the general scaling, where each λ i > 0, (5.1) becomes The appropriate values of λ i are those for which three terms in (5.12) are O(1) and all other terms are of higher order. This gives two possibilities. First we may have λ 1 = λ 2 = λ 3 = 2, but this is merely the scaling that in general sets drift and diffusion to the same order and thus covers a time-scale too short to describe stochastic departure from the discontinuity surface. Hence, we use the second possibility, which is Since (5.13) scales x to a higher degree than y, noise in y is of higher order than noise in x. By absorbing the magnitude of the noise in x into ε, we may assume d 2 11 + d 2 12 = 1 , (5.14) so that the noise in x is normalised. In the limit, ε → 0, we definẽ x → u ,ỹ → v ,t → s , (5.15) with which the stochastic differential equation, (5.12), takes the reduced form We now return to the boundary value problem for the probability, Q ε . In terms of the limiting scaled variables, (5.15), the leading order term for the asymptotic expansion of Q ε , call it Q, satisfies the backward Fokker-Planck equation of (5.16): Since the boundary conditions for Q ε (5.10) are at the boundary of a fixed rectangle, the boundary conditions for Q are at infinity. However, by the nature of the reduction, (5.17) is parabolic and consequently we cannot impose a boundary condition on Q in the limit v 0 → −∞. This is not problematic because in the v-direction (5.16) has positive drift and no noise, thus for sample paths to (5.16), v(t) → ∞ as t → ∞ with probability 1. Q satisfies the remaining three boundary conditions, The goal is to determine ̺ no (5.8), which may now be written as ̺ no = lim v 0 →−∞ Q(0, v 0 ).

The limiting probability, ̺ no
The partial differential equation, (5.17), takes a more familiar form if we introduce the pseudotime In terms of r, the parabolic boundary value problem is Reinterpreting Q as a function of u 0 and r, we have Due to the discontinuity at u 0 = 0 and the presence of r on the right hand side of (5.20), it appears to be not possible to obtain an explicit expression for Q from (5.20)-(5.21). Fig. 9 shows a typical plot of Q(u 0 , r) as computed by finite difference approximations. Fig. 10 shows a plot of ̺ no (A, B) computed by then approximating the limit (5.22) at a suitably large value of r. The following lemma provides the value of ̺ no at some special values of A and B:   Finally, consider the transitional probability density function for (5.1), call it p ε (x, t; x 0 ). Write x 0 = (x 0 , y 0 ), and suppose x 0 = 0 and −1 ≪ y 0 < 0. Let t slide denote the time taken for the deterministic sliding trajectory of (2.4) to travel from x 0 to the origin. Then for small t > t slide , the above results imply as ε → 0.

Conclusions
The two-fold of Fig. 2 has two important properties: (i) many orbits reach the two-fold, and, (ii) forward evolution from the two-fold is ambiguous. To clarify the second property, the two-fold has uncountably many distinct forward orbits in the sense of Filippov. Specifically there are two tangent trajectories, x + (t) and x − (t), and there are orbits that travel along the positive y-axis for some time, then enter and evolve in either the left or right half-plane. Local dynamics are described by the two-dimensional ODE system, (2.4), which has two parameters, A and B, that relate to the strength of the vector field for x < 0, relative to that for x > 0, in the x and y directions respectively. In this paper three different O(ε) perturbations to (2.4) have been studied: hysteresis, timedelay and noise. Each perturbation removes the ambiguity of forward evolution and in each case forward evolution close to the two-fold has been analysed. This microscopic analysis of the two-fold uncovers a macroscopic view of the dynamics. For small ε > 0, most orbits (or sample paths in the case of noise), follow close to either x + (t) or x − (t). In the limit, ε → 0, almost all orbits or sample paths follow exactly one of the these trajectories. Thus the possibility of evolution along the positive y-axis for a nonzero length of time may be excluded. This observation mirrors theoretical results for stochastically perturbed continuous, non-differentiable systems in the zero-noise limit [39,40].
With a hysteretic band of width 2ε the system is described by (3.1), with a time-delay of t = ε the system is described by (4.1), and with O(ε) noise the system is described by (5.1). It is interesting that the three different types of perturbation exhibit three different scaling laws. Specifically, for hysteresis, x = O(ε), for orbits approaching the two-fold about the negative y-axis. For time-delay, x = O(ε 2 ), near the two-fold. Finally for noise, x = O ε 4 3 , near the two-fold.
For perturbation by hysteresis and time-delay, ̺ hy and ̺ td , respectively, represent the fraction of orbits that head right (that is, follow close to x + (t) after passing by the two-fold) in the limit ε → 0. Similarly, for the stochastic perturbation, ̺ no represents the limiting probability that sample paths head right. As a function of A and B, an explicit expression was only obtained for ̺ hy , specifically (3.17). For ̺ td and ̺ no efficient algorithms were established to evaluate these values numerically. Lemma 2 gives special values of ̺ no . Plots of each ̺(A, B) are shown in Figs. 4, 7 and 10. Each plot has the same scale and viewpoint so that they may be compared fairly. For example, we notice that as A → 0, for hysteresis and time-delay ̺ → 0, but for noise ̺ → 1 2 . Also ̺ no (A, B) appears relatively flat, but, by (5.23), may take value in (0, 1). The primary goal of this work has been to resolve the ambiguity of forward evolution from a two-fold. To conclude, we may interpret forward evolution of the unperturbed system (2.4) from the two-fold as probabilistic, with evolution along x + (t) with probability, ̺(A, B), and along x − (t) with probability, 1 − ̺ (A, B). However, the value of ̺ depends upon the type of perturbation under consideration. This observation indicates that such a regularisation of two-folds is potentially futile in the absence of additional information about the system. As a mathematical model the system represents only an approximation to real-world behaviour. For a given application, physical reasons may justify the use of one of the three types of perturbation over the other two. It remains to apply a probabilistic representation for forward evolution from a two-fold to a physical system. Certainly by discounting Filippov solutions that traverse part of the repelling sliding region, the two-piece probabilistic solution proposed here is more sophisticated than permitting all possible Filippov solutions. It remains to extend the ideas described here to two-folds in three or more dimensions that arise generically in discontinuous systems of ODEs.

A Proof of Lemma 1
Here we write (3.11) and (3.12) as There are two main parts to the proof. First it is shown that there exists M 1 ∈ R such that given 0 < θ < 1. Second it is shown that for some M 2 ∈ R. Lemma 1 is then a consequence of a few additional observations. The key point is that the error in (A.4) is of higher order than ε which enables us to usefully evaluate q [y k ,y k−2 ] at k = O(ε −1 ). We restrict our attention to y ∈ [y * , 0], ε ∈ [0, ε * ], where y * and ε * are suitably small. For this reason we may assume that the error in (A.1), G, is a smooth bounded function and let M 1 = 2 max(|G|) + max(|ζ 1 |) + 1 2 max(|ζ 2 |). Trivially (A.3) holds for k = 1 and k = 2. Mathematical induction is required to verify (A.3) for all k ≤ ε −θ . Combining (A.1) and (A.2) leads to for odd values of k, and a similar formula for even k. By appropriately bounding the large square-root term in (A.5), which requires assuming ε * is sufficiently small that it may be shown that By further using (A.6), noting 1 k > 1 − , and applying the induction hypothesis, we Consequently, |ζ k+2 | ≤ (k + 2)M 1 , for odd values of k. The result for even k may be shown in the same manner. This completes the inductive demonstration of (A.3).
Since (3.14) involves only odd values of k, for remainder of the proof it is assumed that k is odd (even values of k may be treated similarly). Substituting (A.2) into (3.5) yields . (A.9) By applying (A.3), an evaluation of (A.9) at k = O(ε −θ ) gives . (A.10) To minimise the magnitude of the error we take θ = 1 5 . It is likely that a stronger bound on the error is achievable, but (A.10) is sufficient for this proof. The primary difficulty is bounding ζ k−1 − ζ k , in (A.9), for which (A.3) was used crudely to obtain |ζ k−1 − ζ k | ≤ 2kM 1 .

C Derivation of (5.24)
If B = 1, then (5.16) has two special properties that provide simplification. First, v(s) increases at a unit rate at all times. Assuming, for simplicity, v(0) = 0, we therefore have du(s) = −As , u < 0 s , u > 0 ds + dW (s) , (C.1) and ̺ no = lim s→∞ P[u(s) > 0] Second, we may write (C.1) as The key observation is that the drift in (C.3) is odd with respect to time, that is, for all u and s. For this reason, roughly speaking, drift over negative times is cancelled by drift over positive times and consequently the probability that u(s) is positive approaches 1 2 as s → ∞. Indeed (5.24) is a consequence of the following general result: Lemma 3. Suppose φ is a bounded, measurable function satisfying (C.4) with possibly finitely many values of u at which it is discontinuous. For any T > 0, for the stochastic process (C.2), P u(T ) > 0 u(−T ) = 0 = 1 2 . The desired result, (5.24), follows by applying Lemma 3 to (C.3) and taking T → ∞. Below, Lemma 3 is proved for smooth φ. For brevity a generalisation permitting discontinuities in φ is omitted. This may be achieved by simply adding consistency conditions for the transitional probability density function of (C.2) at points of discontinuity.
For (C.2), the probability density function of u(s), call it p(u, s), is the unique bounded solution to ∂p ∂s = − ∂(φp) ∂u + 1 2 Here we let By the Feynman-Kac formula [63,59], Q is the unique, bounded, C 1 solution to By the symmetry property, (C.4), the solutions to these two partial differential equations are related by ∂Q ∂u 0 (u 0 , s 0 ) = p(u 0 , −s 0 ), for all u 0 and s 0 . Now define for all B > 0. In view of (2.6), this is equivalent to (5.25). For (5.16), the limit, A → ∞, represents infinitely strong drift for u < 0. Therefore a sample path from a point (u, v) with u < 0 and v < 0 is immediately sent to (0, v), with probability 1. If instead u < 0 and v > 0, then the u-value of the solution goes to −∞, with probability 1. Consequently, it suffices to analyse dynamics for u > 0 and impose a reflecting boundary condition on the negative v-axis and an absorbing boundary condition on the positive v-axis. Specifically, we may assume v(s) ≡ s, and that u(s) is governed by du(s) = s ds + dW (s) . (D. 2) The probability density function for u(s), call it p(u, s), therefore satisfies In (D.7), Ai(z) is the Airy function of the first kind, and Br denotes a suitable Bromwich contour in the complex plane. Throughout this exposition Bromwich contours may be assumed to be vertical lines at a non-negative real value. The following expression of p for s ≥ 0 was obtained by methods similar to those in [64] (for brevity the details are omitted): and Bi(z) is the Airy function of the second kind. We may use (D. and take u, s → ∞ with ∆ = O(1).
As u → ∞, only the piece of (D.9) with u > ξ is important. From this we utilise various basic properties of Airy functions to obtain, for large u, Since (D.17) is independent of the value of γ, to evaluate it we may take the limit γ → 0. To achieve this we apply the following version of Plemelj-Sokhotski's formula [66]: lim γ→0 + h(x) x−iγ dx = iπh(0) + C.P. h(x) x dx, (for integration over an interval containing zero), which produces