On the Stability and the Exponential Concentration of Extended Kalman-Bucy filters

The exponential stability and the concentration properties of a class of extended Kalman-Bucy filters are analyzed. New estimation concentration inequalities around partially observed signals are derived in terms of the stability properties of the filters. These non-asymptotic exponential inequalities allow one to design confidence interval type estimates in terms of the filter forgetting properties with respect to erroneous initial conditions. For uniformly stable signals, we also provide explicit non-asymptotic estimates for the exponential forgetting rate of the filters and the associated stochastic Riccati equations w.r.t. Frobenius norms. These non-asymptotic exponential concentration and quantitative stability estimates seem to be the first results of this type for this class of nonlinear filters. Our techniques combine $\chi$-square concentration inequalities and Laplace estimates with spectral and random matrix theory, and the non-asymptotic stability theory of quadratic type stochastic processes.


Introduction
The linear-Gaussian stochastic filtering problem was solved at the beginning of the 1960s by Kalman and Bucy in their seminal articles [7,8,21]. Since then, Kalman-Bucy filters have become one of the most powerful estimation algorithms in applied probability, statistical inference, information theory and engineering sciences. The Kalman-Bucy filter is designed to estimate in an optimal way (minimum variance) the internal states of linear-Gaussian time series from a sequence of partial and noisy measurements. The range of applications goes from tracking, navigation and control to computer vision, econometrics, statistics, finance, and many others. For linear-Gaussian filtering problems, the conditional distribution of the internal states of the signal given the observations up to a given time horizon is Gaussian. The Kalman-Bucy filter and the associated Riccati equation coincide with the evolution of the conditional averages and the conditional error covariance matrices of these conditional Gaussian distributions.
Using natural local linearization techniques, Kalman-Bucy filters are also currently used to solve nonlinear and/or non-Gaussian signal observation filtering problems. The resulting Extended Kalman-Bucy filter (abbreviated EKF) often yields powerful and computationally efficient estimators. Nevertheless it is well known that it fails to be optimal with respect to the minimum variance criterion. For a more thorough discussion on the origins and the applications of these observer type filtering techniques we refer to the articles [25,33,34] and the book by D. Simon [32].
There is a vast literature on the applications and the performance of the extended Kalman filter, mostly on discrete time filtering problems, but very few articles address its stability properties, and none its exponential concentration properties.
In the last two decades, the convergence properties of the EKF have mainly been developed in three different but related directions. The first commonly used approach is to analyze the long-time behaviour of the estimation error between the filter and the partially observed signal. To bypass the fluctuations induced by the signal noise and the observation perturbations, one natural strategy is to design judicious deterministic observers as the asymptotic limit of the EKF when the observation and the sensor noise tend to zero. As underlined in [6], in the deterministic setting the original covariance matrices of the stochastic signal and of the observation perturbations are interpreted as design/tuning parameters: confidence matrices associated, respectively, with the trusted model and with the measurements.
For a more detailed discussion on deterministic type observers as the limit of filters when the sensor and the observation noise tend to zero we refer the reader to the seminal article [3] and the more recent study [6]. Several articles have proposed observability and controllability conditions under which the estimation error of the corresponding discrete time observer converges to zero [3,5,33,34]. These regularity conditions allow one to control the maximal and the minimal eigenvalues of the solution of the Riccati equation (and of its inverse).
One of the drawbacks of this approach is that it gives no precise information on the stochastic EKF itself, but only on the limiting noise-free deterministic observer. Moreover, to our knowledge there does not exist any uniform result that allows one to quantify the difference between the filter and its asymptotic limit with respect to the time parameter. Another drawback is that the initial estimation errors need to be rather small and the signal model close to linear.
In general practical and stochastic situations, mean square errors do not converge to zero as the time parameter tends to $\infty$. The reasons are twofold. Firstly, the observation noise of the sensors cannot be totally cancelled. Secondly, the internal signal states are usually only partially observed, and some components may not be fully observable.
A second closely related strategy is to design a Lyapunov function to ensure the stochastic stability of the EKF. Here again these Lyapunov functions are expressed in terms of the inverse of the solution of the Riccati equation. These stability properties ensure that the mean square estimation error is uniformly bounded w.r.t. the time horizon [3,22,27,28]. The regularity conditions are also based on a series of local observability and controllability conditions. As with any variance type estimate, these mean square error controls are somewhat difficult to use in practical situations, where they only yield rather crude confidence interval estimates.
The third and more recent approach is based on the contraction theory developed by W. Lohmiller and J.J.E. Slotine in the seminal articles [23,24], and further developed in [6]. This approach is also designed to study deterministic type observers. The idea is to control the estimation error between a couple of close EKF trajectories in a given region w.r.t. the metric induced by the quadratic form associated with the inverse of the solution of the Riccati equation. This approach considers the partially observed signal as a deterministic system and requires the filter to start in a basin of attraction of the true state. In summary, these techniques show that the observer induced by the EKF converges locally exponentially to the state of the signal when the quadratic form induced by the inverse of the Riccati equation is sufficiently regular and under appropriate observability and controllability conditions.
The objective of this article is to complement these three approaches with a novel stochastic analysis based on exponential concentration inequalities and uniform χ-square type estimates for stochastic quadratic type processes.
Our regularity conditions are somehow stronger than the ones discussed in the above referenced articles but they do not rely on suitable local initial conditions nearby the true signal state. Last but not least our methodology applies to stochastic filtering problems, not to deterministic type observers.
In our framework the signal process is a uniformly and exponentially stable Langevin type diffusion, and the sensor function is the identity matrix up to a change of basis.
In this apparently simple nonlinear filtering problem the quantitative analysis of the EKF exponential stability is based on sophisticated probabilistic tools. The complexity of these stochastic processes can be measured by the fact that the EKF is a nonlinear diffusion process equipped with a diffusion correlation matrix satisfying a coupled nonlinear and stochastic Riccati equation.
This study has been motivated by one of our recent research projects on the refined convergence analysis of Ensemble type Kalman-Bucy filters. To derive some useful uniform convergence results with respect to the time horizon we have shown in [14] that the signal process needs to be uniformly stable and fully observed by some noisy sensor. These rather strong conditions cannot be relaxed even for linear Gaussian filtering models. We plan to extend these results to nonlinear filtering models based on the non-asymptotic estimates presented in this article.
In this context we present new exponential concentration inequalities to quantify the stochastic stability of the EKF. They allow one to derive confidence intervals for the deviations of the stochastic flow of the EKF around the internal states of the partially observed signal. These estimates also show that the fluctuations induced by any erroneous initial condition tend to zero as the time horizon tends to $+\infty$.
Our second objective is to develop a non-asymptotic quantitative analysis of the stability properties of the EKF. In contrast to the linear-Gaussian case discussed in [14], the Riccati equation associated with the EKF depends on the states of the filter. The resulting system is a nonlinear stochastic process evolving in multidimensional inner product spaces. To analyze these complex models we develop a stability theory of quadratic type stochastic processes. Our main contribution is a non-asymptotic $L^p$-exponential stability theorem. This theorem shows that the $L^p$-distance between two solutions of the EKF and the stochastic Riccati equation with possibly different initial conditions converges to zero as the time horizon tends to $+\infty$. We also provide a non-asymptotic estimate of the exponential decay rate.
The rest of the article is organized as follows: In the next two sections, Section 1.1 and Section 1.2, we present the nonlinear filtering models discussed in the article and we state the main results developed in this work. Section 2 is concerned with the stability properties of quadratic type processes. This section presents the main technical results used in the further development of the article. Most of the technical proofs are provided in the appendix. Section 3 is dedicated to the stochastic stability properties of the signal and the EKF. The end of the article is mainly concerned with the proofs of the two main theorems presented in Section 1.2.

Description of the models
This section presents the nonlinear filtering models discussed in this article. We also discuss and illustrate our regularity conditions with several classes of Langevin type signal processes partially observed by noisy sensors.
Consider a time homogeneous nonlinear filtering problem of the following form
$$
\begin{cases}
dX_t = A(X_t)\,dt + R_1^{1/2}\,dW_t\\
dY_t = B X_t\,dt + R_2^{1/2}\,dV_t
\end{cases}
\qquad (1)
$$
In the above display, $(W_t, V_t)$ is an $(r_1+r_2)$-dimensional Brownian motion, $X_0$ is an $\mathbb{R}^{r_1}$-valued Gaussian random vector with mean and covariance matrix $(\mathbb{E}(X_0), P_0)$ (independent of $(W_t, V_t)$), the symmetric matrices $R_1^{1/2}$ and $R_2^{1/2}$ are invertible, $B$ is an $(r_2\times r_1)$-matrix, and $Y_0 = 0$. The drift of the signal is a differentiable vector valued function $A : x \in \mathbb{R}^{r_1} \mapsto A(x) \in \mathbb{R}^{r_1}$ with a Jacobian denoted by $\partial A : x \in \mathbb{R}^{r_1} \mapsto \partial A(x) \in \mathbb{R}^{(r_1\times r_1)}$.
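The Extended Kalman-Bucy filter associated with the filtering problem (1) is defined by the evolution equations
$$
\begin{cases}
d\hat X_t = A(\hat X_t)\,dt + P_t B' R_2^{-1}\,\big[dY_t - B\hat X_t\,dt\big]\\
\partial_t P_t = \partial A(\hat X_t)\,P_t + P_t\,\partial A(\hat X_t)' + R_1 - P_t S P_t \quad\text{with}\quad S := B' R_2^{-1} B
\end{cases}
\qquad (2)
$$
where $B'$ stands for the transpose of the matrix $B$. For nonlinear signal processes the random matrices $P_t$ cannot be interpreted as the error covariance matrices. Nevertheless, rewriting the EKF in terms of the signal process we have
$$
d(X_t - \hat X_t) = \big[(A(X_t) - A(\hat X_t)) - P_t S\,(X_t - \hat X_t)\big]\,dt + R_1^{1/2}\,dW_t - P_t B' R_2^{-1/2}\,dV_t.
$$
Replacing $(A(X_t) - A(\hat X_t))$ by the first order approximation $\partial A(\hat X_t)(X_t - \hat X_t)$ we define a process
$$
d\tilde X_t = \big[\partial A(\hat X_t) - P_t S\big]\,\tilde X_t\,dt + R_1^{1/2}\,dW_t - P_t B' R_2^{-1/2}\,dV_t.
$$
It is a simple exercise to check that the solution of the Riccati equation (2) coincides with the $\mathcal{F}_t$-conditional covariance matrices of $\tilde X_t$; that is, for any $t \ge 0$ we have $P_t = \mathbb{E}\big(\tilde X_t \tilde X_t' \mid \mathcal{F}_t\big)$.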
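To make the structure of the filter concrete, the following is a minimal numerical sketch of an explicit Euler discretization of an EKF of this type. It assumes the standard form of the model, namely a signal $dX_t = A(X_t)dt + R_1^{1/2}dW_t$, an observation $dY_t = BX_tdt + R_2^{1/2}dV_t$, and the Riccati drift $\partial A(\hat X_t)P_t + P_t\partial A(\hat X_t)' + R_1 - P_tSP_t$ with $S = B'R_2^{-1}B$. The drift $A(x) = -Qx - \tanh(x)$, the matrix `Q`, the noise levels, the step size and the function names `ekf_step` and `simulate` are all hypothetical choices made for illustration; they are not taken from the article.

```python
import numpy as np

def ekf_step(x_hat, P, dY, dt, A, dA, B, R1, R2_inv):
    """One explicit Euler step of the EKF state and of its Riccati equation."""
    J = dA(x_hat)                       # Jacobian of the drift at the current estimate
    gain = P @ B.T @ R2_inv             # gain matrix P B' R2^{-1}
    S = B.T @ R2_inv @ B
    x_new = x_hat + A(x_hat) * dt + gain @ (dY - B @ x_hat * dt)
    P_new = P + (J @ P + P @ J.T + R1 - P @ S @ P) * dt
    return x_new, 0.5 * (P_new + P_new.T)   # re-symmetrize against round-off

def simulate(T=2.0, dt=1e-3, seed=0):
    """Simulate a hypothetical 2d signal, its observations and the discretized EKF."""
    rng = np.random.default_rng(seed)
    Q = np.array([[2.0, 0.3], [0.3, 1.5]])               # hypothetical SPD matrix
    A = lambda x: -Q @ x - np.tanh(x)                    # hypothetical stable drift
    dA = lambda x: -Q - np.diag(1.0 / np.cosh(x) ** 2)   # its Jacobian
    B = np.eye(2)                                        # fully observed signal
    s1, s2 = 0.5, 0.2                                    # hypothetical noise levels
    R1, R2_inv = s1**2 * np.eye(2), np.eye(2) / s2**2
    x = np.array([1.0, -1.0])                            # true signal state
    x_hat = np.array([-2.0, 3.0])                        # erroneous initial estimate
    P = np.eye(2)
    traces = [np.trace(P)]
    for _ in range(int(round(T / dt))):
        dW = rng.normal(size=2) * np.sqrt(dt)
        dV = rng.normal(size=2) * np.sqrt(dt)
        dY = B @ x * dt + s2 * dV                        # observation increment
        x_hat, P = ekf_step(x_hat, P, dY, dt, A, dA, B, R1, R2_inv)
        x = x + A(x) * dt + s1 * dW                      # Euler-Maruyama signal step
        traces.append(np.trace(P))
    return x, x_hat, P, np.array(traces)
```

Calling `simulate()` returns the final signal state, the final estimate, the final matrix $P_t$ and the trajectory of $\mathrm{tr}(P_t)$. An explicit Euler scheme is used only for readability; it requires a small step size and is not meant as a recommended integrator.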

Langevin-type signal processes
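In the further development of the article we assume that the Jacobian matrix of $A$ satisfies the following regularity conditions:
$$
\begin{cases}
-\lambda_{\partial A} := \sup_{x\in\mathbb{R}^{r_1}} \rho\big(\partial A(x) + \partial A(x)'\big) < 0\\
\|\partial A(x) - \partial A(y)\| \le \kappa_{\partial A}\,\|x-y\| \quad\text{for some}\quad \kappa_{\partial A} < \infty
\end{cases}
\qquad (3)
$$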
where $\rho(P) := \lambda_{\max}(P)$ stands for the maximal eigenvalue of a symmetric matrix $P$. In the above display $\|\partial A(x) - \partial A(y)\|$ stands for the $L_2$-norm of the matrix operator $(\partial A(x) - \partial A(y))$, and $\|x-y\|$ for the Euclidean distance between $x$ and $y$. A first order Taylor expansion shows that
$$
(3) \Longrightarrow \langle x-y,\, A(x)-A(y)\rangle \le -\lambda_A\,\|x-y\|^2 \quad\text{with}\quad \lambda_A \ge \lambda_{\partial A}/2 > 0.
$$
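The Taylor expansion step can be spelled out as follows. This is only a sketch of the standard argument, using nothing but the first condition in (3); the shorthand $z := x-y$ is introduced here for convenience.

```latex
\begin{align*}
\langle x-y,\ A(x)-A(y)\rangle
 &= \int_0^1 \big\langle z,\ \partial A(y+\epsilon z)\, z\big\rangle\ d\epsilon
 \qquad (z := x-y)\\
 &= \frac{1}{2}\int_0^1 \big\langle z,\ \big[\partial A(y+\epsilon z)+\partial A(y+\epsilon z)'\big]\, z\big\rangle\ d\epsilon\\
 &\le \frac{1}{2}\ \sup_{u\in\mathbb{R}^{r_1}}\rho\big(\partial A(u)+\partial A(u)'\big)\ \|z\|^2
 \ =\ -\frac{\lambda_{\partial A}}{2}\ \|x-y\|^2 .
\end{align*}
```

The inequality therefore always holds with the constant $\lambda_{\partial A}/2$, and possibly with a larger one, which is consistent with $\lambda_A \ge \lambda_{\partial A}/2$.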
The above rather strong conditions provide the contraction needed to ensure the stability of the EKF. They are also used to derive uniform estimates w.r.t. the time horizon for Ensemble Kalman-Bucy particle filters [15]. For linear systems $A(x) = Ax$, associated with some matrix $A$, the parameter $\lambda_A = \lambda_{\partial A}/2$ coincides, up to a sign, with the logarithmic norm of $A$.
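For a linear drift the quantities above can be computed directly. The short sketch below evaluates $\lambda_{\partial A}$ as defined in (3) for a hypothetical non-symmetric matrix and checks numerically the resulting dissipativity inequality with $\lambda_A = \lambda_{\partial A}/2$; the matrix and the helper name `lambda_dA_linear` are illustrative assumptions.

```python
import numpy as np

def lambda_dA_linear(A):
    """lambda_{dA} of condition (3) for a linear drift x -> A x (constant Jacobian A)."""
    return -np.linalg.eigvalsh(A + A.T).max()

A = np.array([[-1.0, 2.0], [0.0, -3.0]])    # hypothetical stable, non-symmetric matrix
lam = lambda_dA_linear(A)                   # equals 4 - 2*sqrt(2) for this matrix

rng = np.random.default_rng(0)
z = rng.normal(size=(1000, 2))              # samples of the difference x - y
lhs = np.einsum("ij,jk,ik->i", z, A, z)     # <x - y, A (x - y)>
rhs = -(lam / 2) * np.sum(z**2, axis=1)     # -lambda_A ||x - y||^2 with lambda_A = lam / 2
```

Here `lhs <= rhs` holds for every sample, since $\langle z, Az\rangle = \frac12\langle z, (A+A')z\rangle$.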
The prototype of signals satisfying these conditions are multidimensional diffusions with drift functions $(A, \partial A) = (-\partial V, -\partial^2 V)$ associated with a gradient Lipschitz strongly convex confining potential $V : x \in \mathbb{R}^{r_1} \mapsto V(x) \in [0,\infty[$. The logarithmic norm condition (3) is met as soon as $\partial^2 V \ge v\, Id$ with $v = 2|\lambda_{\partial A}|$. Equivalently the smallest eigenvalue $\lambda_{\min}(\partial^2 V(x))$ of the Hessian is uniformly lower bounded by $v$. In this case (3) is met with $\lambda_{\partial A} = v/2$. These conditions are fairly standard in the stability theory of nonlinear diffusions; we refer the reader to the review article [26] and the references therein. Choosing $R_1 = \sigma_1^2\, Id$ and $A = -\beta\,\partial V$, for some $\beta, \sigma_1 \ge 0$, the signal process $X_t$ reduces to a multidimensional Langevin diffusion
$$
dX_t = -\beta\,\partial V(X_t)\,dt + \sigma_1\,dW_t.
$$
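This process is reversible w.r.t. the invariant distribution $\mu_\beta$, where $\mu_\beta$ is the probability distribution on $\mathbb{R}^{r_1}$ given by
$$
\mu_\beta(dx) = \frac{1}{Z_\beta}\,\exp\Big(-\frac{2\beta}{\sigma_1^2}\,V(x)\Big)\,dx \quad\text{with}\quad Z_\beta := \int \exp\Big(-\frac{2\beta}{\sigma_1^2}\,V(x)\Big)\,dx \in\, ]0,\infty[.
$$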
In the above display $dx$ stands for the Lebesgue measure on $\mathbb{R}^{r_1}$. The Lipschitz-continuity condition of the Hessian $\partial^2 V$ introduced in (3) ensures the continuity of the stochastic Riccati equation (2) w.r.t. the fluctuations around the random states $\hat X_t$. We illustrate this condition with a nonlinear example given by the function with some symmetric positive definite matrices $(Q_1, Q_2)$ and some given vector $q \in \mathbb{R}^{r_1}$. In this case we have In this situation we have This shows that conditions (3) are met with the parameters
$$
(\lambda_{\partial A}, \kappa_{\partial A}) = \beta\,\big(2^{-1}\lambda_{\min}(Q_1),\ 2\,\lambda_{\max}^{3/2}(Q_2)\big).
$$
A proof of (6) is provided in the appendix on page 29. More generally these regularity conditions also hold if we replace in (5) the parameter $\sigma_1$ by any choice of covariance matrix $R_1$. Also observe that the Langevin diffusion associated with the null form $Q = 0$ coincides with the conventional linear-Gaussian filtering problem discussed in [14]. Stochastic gradient-flow diffusions of the form (5) arise in a variety of application domains. In mathematical finance and mean field game theory [9,17], these Langevin models describe the interacting collective behaviour of $r_1$ individuals. For instance in the Langevin model discussed in [17] the state variables $X_t = (X^i_t)_{1\le i\le r_1}$ represent the log-monetary reserves of $r_1$ banks lending and borrowing to each other. The quadratic potential function is given by In this context, the parameter $\beta$ represents the mean-reversion rate between banks. More general interacting potential functions can be considered. Mean field type diffusion processes are also used to design low-representation of fluid flow velocity fields. These vortex-type particle filtering problems are developed in some detail in the pioneering articles by E. Mémin and his co-authors [10,11,13,30]. These probabilistic interpretations of the 2d-incompressible Navier-Stokes equation represent the vorticity map as a mixture of basis functions centered around each vortex.
In this connection, we mention that our approach also applies to interacting diffusion gradient flows described by a potential function of the form for some gradient Lipschitz strongly convex confining potential $U_i : \mathbb{R}^{i} \mapsto [0,\infty[$, $i = 1, 2$.
In this situation, we have We further assume that In this case, we have This shows that conditions (3) are met with The detailed proofs of (7)-(8) are provided in the appendix on page 29.

Observability conditions
When the observation variables are the same as the ones of the signal, the signal observation has the same dimension as the signal and reduces to some equation of the form
$$
dY_t = b\,X_t\,dt + \sigma_2\,dV_t \qquad (9)
$$
for some parameters $b \in \mathbb{R}$ and $\sigma_2 \ge 0$. These sensors are used in data grid-type assimilation problems when measurements can be evaluated at each cell. These fully observed models are discussed in [19, Section 4] in the context of the Lorenz-96 filtering problems. These observation processes are also used in the article [4] for applications to nonlinear and multiscale filtering problems. In this context, the observed variables represent the slow components of the signal. For partially observed signals we cannot expect any stability properties of the EKF and the EnKF without introducing some structural conditions of observability and controllability on the signal-observation equation (1). Observe that the EKF equation (2) implies that
$$
d(\hat X_t - X_t) = \big[(A(\hat X_t) - A(X_t)) - P_t S\,(\hat X_t - X_t)\big]\,dt + P_t B' R_2^{-1/2}\,dV_t - R_1^{1/2}\,dW_t.
$$
This equation shows that the stability properties of this process depend on the nature of the real eigenvalues of the symmetric matrices $(\partial A(x) - PS)_{\rm sym}$, with $x \in \mathbb{R}^{r_1}$. In contrast with the conventional Kalman-Bucy filter the Riccati equation (2) is a stochastic equation. As a result, the stability property of the EKF is not induced by some kind of observability condition that ensures the existence of a steady state deterministic covariance matrix.
The random fluctuations of the matrices $\partial A(\hat X_t)$ entering in the Riccati equation (2) may corrupt the stability of the EKF, even if the linearized filtering problem around some chosen state is observable and controllable. For a more thorough discussion on the stability properties of Kalman-Bucy filters and Riccati equations for linear Gaussian filtering problems we refer the reader to [1,2,18,35,37,38].
The stability of diffusion processes is generally much better documented than their possible divergence. For instance, in contrast with conventional Kalman-Bucy filters, the stability properties of the EKF are not induced by some kind of observability or controllability condition. The only known result in this direction is the recent pioneering work by X. T. Tong, A. J. Majda and D. Kelly [36] in the context of discrete generation Ensemble Kalman filters. One of the main assumptions of the article is that the sensor matrix has full rank. The authors also provide a concrete numerical example of a filtering problem with sparse observations for which the EnKF experiences a catastrophic divergence.
In a recent article [14], in the context of linear drift functions we also show that the uniform propagation of chaos properties of the EnKF require strong signal stability properties and the same type of observability conditions. There are some strong similarities between the EKF and the EnKF. The first one comes from the fact that the predictable part of the EKF is stochastic and nonlinear. The predictable part of the EnKF also depends on stochastic covariance matrices. These interaction functions are clearly nonlinear in the internal states of the particle system.
The second one comes from the fact that the Riccati equation associated with the EKF is stochastic. The stochastic perturbation theorem obtained in [14,Theorem 3.1] also shows that the sample covariance matrices satisfy a stochastic diffusion type Riccati equation.
Without any strong observability conditions, these stochastic nonlinearities may severely corrupt the stability of the EKF. In the further development of the article we shall assume that the sensor function has the same form as the one discussed in [14]. More precisely, we assume that the following observability condition is satisfied:
$$
S = \rho(S)\, Id \quad\text{for some}\quad \rho(S) > 0. \qquad (10)
$$
The fully observed model discussed in (9) clearly satisfies condition (10) with the parameter $\rho(S) = (b/\sigma_2)^2$. As mentioned above, in the context of linear-Gaussian filtering problems this condition is also essential to ensure the uniform convergence of the Ensemble Kalman-Bucy filter w.r.t. the time parameter. Section 4 in the article [14] provides a detailed discussion on spectral estimates and semigroup contraction inequalities based on this condition. A geometric description of global divergence regions in the set of positive covariance matrices is also provided in the context of 2-dimensional partially observed filtering problems.
Last but not least, we mention that (10) is satisfied when the filtering problem is similar to the ones discussed above; that is, up to a change of basis functions. For instance (10) is met with $S = I_{r_1}$ for sensors with orthonormal matrices $B R_2^{-1/2}$. Under this condition, up to a change of observation basis, the observation process reduces to Bq is invertible can be turned into that form. To check this claim we observe that In this situation the filtering model $(X_t, Y_t)$ satisfies (10) with $\rho(S) = 1$. In addition, we have Bq $\Longrightarrow (A, \partial A) = (\partial U, \partial^2 U)$.

Statement of the main results
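We let $\phi_t(x) := X_t$ and $\varphi_t(x) := x_t$ be the stochastic and the deterministic flows of the stochastic and the deterministic systems
$$
\begin{cases}
dX_t = A(X_t)\,dt + R_1^{1/2}\,dW_t\\
\partial_t x_t = A(x_t)
\end{cases}
\quad\text{starting at}\quad x_0 = X_0 = x.
$$
We also let $\overline{\Phi}_t := (\Phi_t, \Psi_t)$ be the stochastic flow associated with the EKF and the Riccati stochastic differential equations; that is
$$
\overline{\Phi}_t(\hat X_0, P_0) = \big(\Phi_t(\hat X_0, P_0),\ \Psi_t(\hat X_0, P_0)\big) := \big(\hat X_t, P_t\big).
$$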
Given $(r_1\times r_2)$ matrices $P, Q$ we define the Frobenius inner product $\langle P, Q\rangle = \mathrm{tr}(P'Q)$ and the associated norm $\|P\|_F^2 = \mathrm{tr}(P'P)$, where $\mathrm{tr}(C)$ stands for the trace of the matrix $C$. We also equip the product space $\mathbb{R}^{r_1}\times\mathbb{R}^{r_1\times r_1}$ with the inner product $\langle (x_1, P_1), (x_2, P_2)\rangle := \langle x_1, x_2\rangle + \langle P_1, P_2\rangle$ and the norm $\|(x, P)\|^2 := \langle (x, P), (x, P)\rangle$.
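As a small illustration of these definitions, the Frobenius inner product and the norm on the product space can be evaluated as follows. The function names `frobenius_inner` and `product_norm_sq` are hypothetical helpers introduced only for this sketch.

```python
import numpy as np

def frobenius_inner(P, Q):
    """Frobenius inner product <P, Q> = tr(P' Q)."""
    return float(np.trace(P.T @ Q))

def product_norm_sq(x, P):
    """Squared norm ||(x, P)||^2 = <x, x> + <P, P> on the product space."""
    return float(x @ x) + frobenius_inner(P, P)

rng = np.random.default_rng(0)
P, Q = rng.normal(size=(3, 2)), rng.normal(size=(3, 2))
entrywise = float(np.sum(P * Q))      # tr(P' Q) is the sum of entrywise products
```

In particular, the squared distance between two filter/Riccati pairs in this norm is `product_norm_sq(x1 - x2, P1 - P2)`.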
We recall the $\chi$-square Laplace estimate The proof of (11) and more refined estimates are housed in the appendix. We have the rather crude almost sure estimate This readily yields the upper bound
$$
\mathrm{tr}(P_t) \le \tau_t(P) := e^{-\lambda_{\partial A} t}\,\mathrm{tr}(P_0) + \mathrm{tr}(R_1)/\lambda_{\partial A} \ \Longrightarrow\ \sup_{t\ge 0}\mathrm{tr}(P_t) \le \mathrm{tr}(P_0) + \mathrm{tr}(R_1)/\lambda_{\partial A}. \qquad (12)
$$
Most of the analysis developed in the article relies on the following quantities: Our first main result concerns the stochastic stability of the EKF and is described in terms of the function $\delta \in [0,\infty[\ \mapsto \varpi(\delta)$. More precisely we have the following exponential concentration theorem.
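The step leading to (12) can be sketched as follows. The sketch assumes that the Riccati equation (2) has the usual extended Kalman-Bucy form $\partial_t P_t = \partial A(\hat X_t)P_t + P_t\,\partial A(\hat X_t)' + R_1 - P_tSP_t$ with a positive semi-definite matrix $S$, and that $P_t$ remains positive semi-definite; these are assumptions of the sketch. It uses the elementary bound $\mathrm{tr}(MP) \le \lambda_{\max}(M)\,\mathrm{tr}(P)$, valid for symmetric $M$ and positive semi-definite $P$, together with $\mathrm{tr}(P_tSP_t) \ge 0$.

```latex
\begin{align*}
\partial_t\, \mathrm{tr}(P_t)
 &= \mathrm{tr}\big(\big[\partial A(\hat X_t)+\partial A(\hat X_t)'\big]\, P_t\big)
    + \mathrm{tr}(R_1) - \mathrm{tr}(P_t S P_t)\\
 &\le -\lambda_{\partial A}\ \mathrm{tr}(P_t) + \mathrm{tr}(R_1),\\[2pt]
\mathrm{tr}(P_t)
 &\le e^{-\lambda_{\partial A} t}\ \mathrm{tr}(P_0)
    + \frac{1-e^{-\lambda_{\partial A} t}}{\lambda_{\partial A}}\ \mathrm{tr}(R_1)
 \ \le\ e^{-\lambda_{\partial A} t}\ \mathrm{tr}(P_0) + \frac{\mathrm{tr}(R_1)}{\lambda_{\partial A}} .
\end{align*}
```

The second line follows from the first by Gronwall's lemma, and the last bound is exactly $\tau_t(P)$ in (12).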
Theorem 1.1. For any initial states $(x, \hat x, p) \in \mathbb{R}^{r_1+r_1+(r_1\times r_1)}$, any time horizon $t \in [0,\infty[$, and any $\delta \ge 0$ the probabilities of the following events are greater than $1 - e^{-\delta}$:
$$
\|\phi_t(x) - \varphi_t(x)\|^2 \le \varpi(\delta)\ \mathrm{tr}(R_1)/\lambda_A \qquad (14)
$$
$$
\|\phi_t(x) - \Phi_t(\hat x, p)\|^2 \le 4\,\varpi(\delta)\ \ldots
$$
The proofs of the concentration inequalities (14) and (15) are provided respectively in Section 3.1 and Section 3.2. See also Theorem 3.1 and Theorem 3.2 for related Laplace $\chi$-square estimates of time average distances.
The role of each quantity in (14) and (15) is clear. The sizes of the "confidence events" are proportional to the signal or the observation perturbations, and inversely proportional to the stability rate of the systems. More interestingly, formula (15) shows that the impact of the initial conditions becomes exponentially small as the time horizon increases.
Our next objective is to better understand the stability properties of the EKF and the corresponding stochastic Riccati equation. To this end, it is convenient to strengthen our regularity conditions. We further assume that and for some $\alpha > 1$ In contrast with the linear-Gaussian case, the Riccati equation (2) depends on the internal states of the EKF. As a result its stability properties are characterized by a stochastic Lyapunov exponent that depends on the random trajectories of the filter as well as on the signal-observation processes. Condition (17) is a technical condition that allows one to control uniformly the fluctuations of these stochastic exponents with respect to the time horizon. By (16) this condition is met as soon as Loosely speaking, when the signal is not sufficiently stable the erroneous initial conditions of the EKF may be too sensitive to small perturbations of the sensor. When the exponential decay to equilibrium of the signal is stronger than these spectral instabilities the EKF and the corresponding stochastic Riccati equations are stable and forget any erroneous initial conditions. We set $\Delta_t := \|\overline{\Phi}_t(\hat X_0, P_0) - \overline{\Phi}_t(\check X_0, \check P_0)\|^2$ and $\Lambda/\lambda_{\partial A} :=$ We are now in position to state our second main result. Assume conditions (16) and (17)
We end this section with some comments on our regularity conditions. Notice that $\Lambda$ does not depend on the parameter $\delta$ nor on $\rho(S)$. As mentioned above, we believe that these technical conditions can be relaxed to some extent. These conditions are stronger than the ones discussed in [14] for linear-Gaussian models. In contrast with the linear case, the Riccati equation in nonlinear settings is a stochastic process in matrix spaces. For this class of models, these technical conditions are used to control the fluctuations of the stochastic Riccati equation entering into the EKF.

Stability properties of quadratic type processes
Let $(U_t, V_t, W_t, Y_t)$ be some non-negative processes defined on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$ equipped with a filtration $\mathcal{F} = (\mathcal{F}_t)_{t\ge 0}$ of $\sigma$-fields. Also let $(Z_t, Z^+_t)$ be some processes and $M_t$ be some continuous $\mathcal{F}_t$-martingale. We use the notation
$$
dY_t \le Z^+_t\,dt + dM_t \iff \big(dY_t = Z_t\,dt + dM_t \ \text{ with }\ Z_t \le Z^+_t\big) \qquad (18)
$$
Let us mention some useful properties of the above stochastic inequalities.
Let $(\overline{Y}_t, \overline{Z}^+_t, \overline{Z}_t, \overline{M}_t)$ be another collection of processes satisfying the above inequalities. In this case it is readily checked that
$$
d(Y_t + \overline{Y}_t) \le (Z^+_t + \overline{Z}^+_t)\,dt + d(M_t + \overline{M}_t).
$$
Let $(H, \langle\cdot,\cdot\rangle)$ be some inner product space, and let $A_t : x \in H \mapsto A_t(x) \in H$ be a linear operator-valued stochastic process with finite logarithmic norm $\rho(A_t) < \infty$. Consider an $H$-valued stochastic process $X_t$ such that for some continuous $\mathcal{F}_t$-martingale $M_t$ with angle bracket satisfying the following property This section is concerned with the long-time quantitative behaviour of the above quadratic type processes. The main difficulty here comes from the fact that $A_t$ is a stochastic flow of operators. As a result we cannot apply conventional Lyapunov techniques based on Dynkin's formula, supermartingale theory and/or more conventional Gronwall type estimates.
The next theorem provides a way to estimate these processes in terms of geometric type processes and exponential martingales.
More generally, for any $n \ge 1$ we have with the rescaled processes The proof of this theorem is rather technical; it is therefore housed in Section 5.2 in the appendix.
When $\rho(A_t) \le -a_t$ and $W_t \le w_t$ for some constants $a_t, w_t$, and $X_0 = 0$ we have with $\lambda_n(a_s, w_s) := a_s - \frac{n-1}{2}\, w_s$.
Proof. The first assertion is a direct consequence of the estimates stated in Theorem 2.1.
Replacing $W_t$ and $\rho(A_t)$ by $w_t$ and $(-a_t)$ from the start in the proof of Theorem 2.1 we find that In the above display we have used the fact that This ends the proof of the corollary.
Proposition 2.3. Assume that $\rho(A_t) \le -a$ for some parameter $a > 0$, and $X_0 = 0 = W_t$. Also assume that for any $n \ge 1$ and any $t \ge 0$ we have $\mathbb{E}(U_t^n)^{1/n} \le u_t$ and $\mathbb{E}(V_t^n)^{1/n} \le v_t$ for some functions $u_t, v_t \ge 0$. In this situation, for any $\epsilon \in ]0,1]$ we have the uniform estimates for any functions $(u_t(a), v_t(a))$ such that $\int_0^t e^{-a(t-s)}\, u_s\, ds \le u_t(a)$ and In addition, when $v_t = v$ for any $\epsilon \in [0,1]$ we have
The proof of the proposition is provided in the appendix, Section 5.3. We end this section with some comments on the estimate (24). Let us suppose that $d\|X_t\|^2 = \big[-a\,\|X_t\|^2 + u\big]\,dt + dM_t$ for some $u \ge 0$ (with $X_0 = 0$). In this case, by Jensen's inequality we have The r.h.s. of (24) gives the estimate The above estimates coincide for any $\epsilon \in [0,1]$ and any $u \ge 0$ as soon as

The signal process
This section is mainly concerned with the stochastic stability properties of the signal process.
One natural way to derive some useful concentration inequalities is to compare the flow of the stochastic process with the one of the noise-free deterministic system discussed in the beginning of Section 1.2. We start with a brief review of the long-time behaviour of the semigroup $\varphi_t(x)$. It is readily checked that
$$
\partial_t \|\varphi_t(x) - \varphi_t(y)\|^2 \le -2\lambda_A\, \|\varphi_t(x) - \varphi_t(y)\|^2 \ \Longrightarrow\ \|\varphi_t(x) - \varphi_t(y)\| \le e^{-\lambda_A t}\, \|x-y\|.
$$
This contraction property ensures the existence and the uniqueness of a fixed point
$$
\forall t \ge 0 \quad \varphi_t(x^\star) := x^\star \iff A(x^\star) = 0 \ \Longrightarrow\ \|\varphi_t(x) - x^\star\| \le e^{-\lambda_A t}\, \|x - x^\star\|.
$$
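We let $\delta\phi_t(x)$ be the Jacobian of the stochastic flow $\phi_t(x)$. We have the matrix valued equation
$$
\partial_t\, \delta\phi_t(x) = \partial A(\phi_t(x))\ \delta\phi_t(x) \ \Longrightarrow\ \delta\phi_t(x)\, u = \exp\Big(\int_0^t \partial A(\phi_s(x))\, ds\Big)\, u
$$
for any $u \in \mathbb{R}^{r_1}$. This implies that
$$
\|\delta\phi_t(x)\| := \sup_{\|u\|\le 1} \|\delta\phi_t(x)\, u\| \le \exp(-\lambda_{\partial A}\, t/2) \longrightarrow_{t\to\infty} 0.
$$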
Using the formula
$$
\phi_t(y) - \phi_t(x) = \int_0^1 \delta\phi_t\big(x + \epsilon (y-x)\big)\,(y-x)\ d\epsilon,
$$
we easily check the almost sure exponential stability property
$$
\|\phi_t(x) - \phi_t(y)\| \le \exp(-\lambda_{\partial A}\, t/2)\ \|x-y\|. \qquad (25)
$$
The same analysis applies to estimate the Jacobian $\delta\varphi_t(x)$ of the deterministic flow $\varphi_t(x)$.
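The almost sure contraction of two stochastic flows driven by the same Brownian motion can be observed numerically. The sketch below runs an Euler-Maruyama scheme from two initial states with common noise increments, for a hypothetical gradient drift $A = -\partial V$ with $V(x) = \frac12\langle x, Qx\rangle + \sum_i \log\cosh(x_i)$; the matrix `Q`, the noise level and the function name `coupled_paths` are illustrative assumptions. The rate `m` used in the check is the uniform lower bound on the Hessian of this particular potential, so that `m` is a valid contraction rate for this example.

```python
import numpy as np

Q = np.array([[2.0, 0.3], [0.3, 1.5]])      # hypothetical SPD matrix
grad_V = lambda x: Q @ x + np.tanh(x)       # gradient of V(x) = <x,Qx>/2 + sum_i log cosh(x_i)
m = np.linalg.eigvalsh(Q).min()             # the Hessian of V is bounded below by m * Id

def coupled_paths(x, y, sigma=0.5, T=3.0, h=1e-3, seed=0):
    """Euler-Maruyama paths from x and y driven by the SAME Brownian increments."""
    rng = np.random.default_rng(seed)
    n = int(round(T / h))
    dist = np.empty(n + 1)
    dist[0] = np.linalg.norm(x - y)
    for k in range(n):
        dW = rng.normal(size=x.shape) * np.sqrt(h)   # common noise increment
        x = x - grad_V(x) * h + sigma * dW
        y = y - grad_V(y) * h + sigma * dW
        dist[k + 1] = np.linalg.norm(x - y)          # the noise cancels in x - y
    times = h * np.arange(n + 1)
    return times, dist

times, dist = coupled_paths(np.array([3.0, -2.0]), np.array([-1.0, 4.0]))
envelope = np.exp(-m * times) * dist[0]     # exponential envelope e^{-m t} ||x - y||
```

Because the two paths share the same increments, the distance `dist` is a deterministic function of the drift only, and it stays below `envelope` at every time step for this step size.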
Using the estimate $\|\phi_t(X_0) - \phi_t(\mathbb{E}(X_0))\| \le e^{-\lambda_{\partial A} t/2}\, \|X_0 - \mathbb{E}(X_0)\|$ we also have
$$
\lambda_{\partial A} \int_0^t \|\phi_s(X_0) - \phi_s(\mathbb{E}(X_0))\|^2\, ds \le \|X_0 - \mathbb{E}(X_0)\|^2
$$
from which we conclude that The next proposition quantifies the relative stochastic stability of the flows $(\varphi_t, \phi_t)$ in terms of $L^n$-norms and $\chi$-square uniform Laplace estimates.
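Proposition 3.1. For any $n \ge 1$ and any $x \in \mathbb{R}^{r_1}$ we have the uniform moment estimates In addition, for any $\epsilon \in ]0,1]$ we have the uniform Laplace estimates
Combining (26) with the concentration inequality (33) we prove that the probability of the event $\|\phi_t(x) - \varphi_t(x)\|^2 \le \varpi(\delta)\ \mathrm{tr}(R_1)/\lambda_A$ is greater than $1 - e^{-\delta}$, for any $\delta \ge 0$ and any initial state $x \in \mathbb{R}^{r_1}$. This ends the proof of (14).
Proof of Proposition 3.1: We have $d(X_t - x_t) = [A(X_t) - A(x_t)]\,dt + R_1^{1/2}\,dW_t$ with $X_0 = x_0$, and therefore, by Itô's formula,
$$
d\|X_t - x_t\|^2 = \big[2\,\langle X_t - x_t,\ A(X_t) - A(x_t)\rangle + \mathrm{tr}(R_1)\big]\,dt + dM_t \le \big[-2\lambda_A\,\|X_t - x_t\|^2 + \mathrm{tr}(R_1)\big]\,dt + dM_t
$$
with the martingale
$$
dM_t := 2\,\langle X_t - x_t,\ R_1^{1/2}\,dW_t\rangle \ \Longrightarrow\ \partial_t \langle M\rangle_t = 4\ \mathrm{tr}\big(R_1 (X_t - x_t)(X_t - x_t)'\big) \le 4\ \mathrm{tr}(R_1)\ \|X_t - x_t\|^2.
$$
The end of the proof is now a direct consequence of (22) and Proposition 2.3 applied to The proof of the proposition is now completed.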

The Extended Kalman-Bucy filter
This section is mainly concerned with the stochastic stability and the concentration properties of the semigroup of the EKF stochastic process. As for the signal process discussed in Section 3.1 these properties are related to $L^p$-mean error estimates and related $\chi$-square type Laplace inequalities. Our main results are described by the following theorem. Let $(\hat X_t, P_t)$ be the solution of the evolution equations (2) starting at $(\hat X_0, P_0)$.
Theorem 3.2. For any $n \ge 1$ we have For any $\epsilon \in ]0,1]$ and any $P_0$ there exists some time horizon $t_0(\epsilon, P_0)$ such that In addition for any $t \ge s \ge 0$ and any $\epsilon \in ]0,1]$ we have
Before getting into the details of the proof of this theorem we mention that (15) is a direct consequence of (27) combined with (33) and (25). Indeed, applying (33) to (27) we readily check that the probability of the events $\|\phi_t(x) - \Phi_t(\hat x, p)\|^2$ is greater than $1 - e^{-\delta}$, for any $\delta \ge 0$ and any initial states $(x, \hat x, p) \in \mathbb{R}^{r_1+r_1+(r_1\times r_1)}$. In this connection, the Laplace estimates (29) readily imply that the probability of the events $\frac{1}{t-s}$ is greater than $1 - e^{-\delta}$, for any $\delta \ge 0$ and any time horizon $t$. Now we come to the proof of the theorem.
Proof of Theorem 3.2: We set $X_t := \phi_t(\hat X_0) - \hat X_t$. We have This yields the estimate Also observe that This implies that This implies that Applying Proposition 2.3 to $A_t x := -a x = -2\lambda_A x$, we find that for any $\epsilon \in ]0,1]$.
Using the fact that for any non-negative real numbers $x, y, \lambda$ we have we find that for any $t \ge t(\epsilon)$, for any $\epsilon \in [0,1[$ and some $t(\epsilon)$.
The end of the proof is now a direct consequence of (22) and Proposition 2.3 applied to $A_t x := -a x = -2\lambda_A x$ and $u_t = v_t/4 \le u = v/4 := \big[\mathrm{tr}(R_1) + \tau_s^2(P)\,\rho(S)\big]$ with $t \in [s, \infty[$. The proof of the theorem is now completed.

Proof of Theorem 1.2
We let $(\hat X_t, P_t)$ be the solution of Equations (2) starting at $(\hat X_0, P_0)$. We denote by $(\check X_t, \check P_t)$ the solution of these equations starting at some possibly different state $(\check X_0, \check P_0)$. Firstly we have
$$
\big((12),\ (26)\ \text{and}\ (27)\big) \ \Longrightarrow\ \forall n \ge 1 \quad \sup_{t\ge 0}\ \mathbb{E}\Big(\big[\|\hat X_t - \check X_t\|^2 + \|P_t - \check P_t\|_F^2\big]^n\Big) < \infty.
$$
We couple the equations with the same observation processes. In this situation we find the evolution equation with the martingale $dM_t :=$ This implies that $d(\widehat{X}_t - \check{X}_t) = \left([A(\widehat{X}_t) - A(\check{X}_t)] + \check{P}_t S (\check{X}_t - \widehat{X}_t) + (P_t - \check{P}_t) S (X_t - \widehat{X}_t)\right) dt + dM_t$ from which we conclude that with the martingale $dM_t = 2 \langle \widehat{X}_t - \check{X}_t, dM_t \rangle$.
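The angle bracket of $M_t$ is given by Recalling that $\lambda_A \geq \lambda_{\partial A}/2$, also observe that the drift term in (30) is bounded by In much the same way we have In the last assertion we have used the matrix decomposition Recalling that $2^{-1}\partial_t \|P_t - \check{P}_t\|^2 = 2^{-1}\partial_t \langle P_t - \check{P}_t, P_t - \check{P}_t \rangle = \langle P_t - \check{P}_t, \partial_t (P_t - \check{P}_t) \rangle = \mathrm{tr}\left((P_t - \check{P}_t)\,\partial_t (P_t - \check{P}_t)\right)$, we find that $2\,\mathrm{tr}\left(\partial A(\check{X}_t)(P_t - \check{P}_t)^2\right) + 2\,\mathrm{tr}\left([\partial A(\widehat{X}_t) - \partial A(\check{X}_t)]\,P_t\,(P_t - \check{P}_t)\right)$.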
This implies that We set Notice that Observe that $\ldots (3\lambda_{\partial A} - \rho(S)) + \sqrt{\tfrac{1}{4}(\lambda_{\partial A} - \rho(S))^2 + (\beta_t + \alpha_t)^2} \leq -\lambda_{\partial A} + \beta_t + \alpha_t \leq \rho(A_t) := -\left(\lambda_{\partial A} - 2\kappa_{\partial A}\,\tau_t(P) - \rho(S)\,\|X_t - \widehat{X}_t\|\right)$.
The final step is based on the following technical lemma. Assume that (17) is satisfied for some $\alpha > 1$. In this situation, for any $\epsilon \in ]0, 1]$ there exists some time horizon $s$ such that for any $t \geq s$ we have the almost sure estimate for some positive random process $Z_t$ s.t. $\sup_{t \geq 0} \mathbb{E}\left(Z_t^{\alpha\delta}\right) < \infty$. The end of the proof of Theorem 1.2 is a direct consequence of this lemma, so we give it first. Combining (31) with (20) we find that This ends the proof of Theorem 1.2.
For any $\epsilon \in ]0, 1]$, there exists some time horizon $\varsigma_\epsilon(P_0)$ such that $t \geq s \geq \varsigma_\epsilon(P_0) \Longrightarrow (1 - \epsilon) \leq \Delta_{\partial A}(s)/\Delta_{\partial A} \leq 1 \Longrightarrow \rho(A_t) \leq -(1 - \epsilon)\Delta_{\partial A} + \rho(S)\,\|X_t - \widehat{X}_t\|$. On the other hand, the contraction inequality (25) implies that $\int_s^t \|X_r - \widehat{X}_r\|\,dr = \int_s^t \|\phi_{r-s}(\phi_s(X_0)) - \widehat{X}_r\|\,dr \leq \int_s^t \|\phi_{r-s}(X_s) - \phi_{r-s}(\widehat{X}_s)\|\,dr + \int_s^t \|\phi_{r-s}(\widehat{X}_s) - \widehat{X}_r\|\,dr$. The above inequality yields the almost sure estimate Using the estimate $x - 1/4 \leq x^2$, which is valid for any $x$, we find that Using (17) we can also choose $s$ sufficiently large so that In this situation, by (29), we have We conclude that The last assertion comes from the formula On the other hand we have This shows that Under the assumption (17) and using (28) we have We can also choose $s$ sufficiently large so that This ends the proof of the lemma.
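The coupling argument used in this proof can be illustrated numerically in the simplest special case. The sketch below is not the extended filter of Equations (2): it assumes a scalar linear signal $dX_t = a X_t\,dt + \sigma\,dW_t$ observed through $dY_t = c X_t\,dt + dV_t$ with unit observation noise, for which the filter reduces to the classical Kalman-Bucy equations. The parameters `a`, `sigma`, `c`, the Euler step `dt` and the two initial conditions are illustrative assumptions. Two filters are driven by the same observation increments and the gaps between their means and variances are recorded.

```python
import math
import random

def simulate_coupled_filters(a=-1.0, sigma=1.0, c=1.0, T=10.0, dt=1e-3, seed=0):
    """Euler scheme for a scalar linear signal and two Kalman-Bucy filters
    fed with the same observation increments but started differently.
    Returns the list of gaps (|mean gap|, |variance gap|) along the path."""
    rng = random.Random(seed)
    x = rng.gauss(0.0, 1.0)   # signal X_0 (hypothetical initial law)
    xh, p = 5.0, 4.0          # first filter: (mean, variance)
    xc, pc = -5.0, 0.1        # second filter, erroneous initial condition
    gaps = [(abs(xh - xc), abs(p - pc))]
    sq = math.sqrt(dt)
    for _ in range(int(round(T / dt))):
        dw = sq * rng.gauss(0.0, 1.0)
        dv = sq * rng.gauss(0.0, 1.0)
        dy = c * x * dt + dv  # shared observation increment
        x += a * x * dt + sigma * dw
        # Mean updates, using the variances at the start of the step.
        xh_new = xh + a * xh * dt + p * c * (dy - c * xh * dt)
        xc_new = xc + a * xc * dt + pc * c * (dy - c * xc * dt)
        # Riccati updates (deterministic in the linear case).
        p += (2.0 * a * p + sigma ** 2 - (c * p) ** 2) * dt
        pc += (2.0 * a * pc + sigma ** 2 - (c * pc) ** 2) * dt
        xh, xc = xh_new, xc_new
        gaps.append((abs(xh - xc), abs(p - pc)))
    return gaps

if __name__ == "__main__":
    gaps = simulate_coupled_filters()
    for k in (0, 2000, 5000, 10000):
        print(f"t = {k * 1e-3:5.1f}  mean gap = {gaps[k][0]:.3e}  variance gap = {gaps[k][1]:.3e}")
```

With a stable drift (`a < 0`) both gaps shrink along the run, which is the qualitative forgetting behaviour the theorem quantifies; the sketch makes no claim about the explicit rates.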

Concentration properties and Laplace estimates
This section is mainly concerned with the proof of (11). The initial state $X_0$ of the signal is a Gaussian random variable with mean $\widehat{X}_0$ and some covariance matrix $P_0$. In this case $X_0 - \widehat{X}_0 \overset{law}{=} P_0^{1/2} W_1$ and Recalling that $\|W_1\|^2$ is distributed according to the chi-squared distribution with $r_1$ degrees of freedom we have $\forall \gamma < 1/(2\rho(P_0)) \quad \mathbb{E}\left(e^{\gamma \|X_0 - \widehat{X}_0\|^2}\right) \leq \mathbb{E}\left(e^{\gamma \rho(P_0) \|W_1\|^2}\right) = (1 - 2\gamma\rho(P_0))^{-r_1/2} < \infty$.
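The closed form of the chi-square Laplace transform used above can be checked by simulation. In the following minimal sketch, `theta` plays the role of $\gamma\rho(P_0)$ and `r` the role of $r_1$; the function names, the sample size and the seed are assumptions made for illustration only.

```python
import math
import random

def chi2_laplace_exact(theta, r):
    """E[exp(theta * ||W||^2)] for W ~ N(0, I_r); finite only for theta < 1/2."""
    if theta >= 0.5:
        raise ValueError("the Laplace transform is infinite for theta >= 1/2")
    return (1.0 - 2.0 * theta) ** (-r / 2.0)

def chi2_laplace_mc(theta, r, n_samples=200_000, seed=1):
    """Plain Monte Carlo estimate of the same expectation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        sq_norm = sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(r))
        total += math.exp(theta * sq_norm)
    return total / n_samples

if __name__ == "__main__":
    theta, r = 0.1, 3
    print("closed form :", chi2_laplace_exact(theta, r))
    print("Monte Carlo :", chi2_laplace_mc(theta, r))
```

The value `theta = 0.1` is chosen below $1/4$ so that the Monte Carlo estimator has finite variance; closer to the boundary $1/2$ the closed form remains valid but the simulation becomes unreliable.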
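Choosing $\gamma = (1 - 2r_1\epsilon)/(2\rho(P_0))$, with $\epsilon \in ]0, 1/(2r_1)[$ we find that $\forall \epsilon \in ]0, 1/(2r_1)[ \quad \mathbb{E}\left(\exp\left(\ldots\right)\right)$ We check (11) by choosing More generally, for any nonnegative random variable $Z$ such that $\mathbb{E}(Z^{2n})^{1/n} \leq z^2 n$ for some parameter $z \geq 0$ and for any $n \geq 1$ we have $\mathbb{E}(Z^{2n}) \leq (z^2 n)^n \leq \frac{e}{\sqrt{2}} \left(\frac{e}{2} z^2\right)^n \mathbb{E}(V^{2n})$ for some Gaussian and centered random variable $V$ with unit variance. We check this claim using Stirling's approximation $\mathbb{E}(V^{2n}) = 2^{-n} \frac{(2n)!}{n!} \geq e^{-1}\, 2^{-n}\, \frac{\sqrt{4\pi n}\,(2n)^{2n} e^{-2n}}{\sqrt{2\pi n}\; n^n e^{-n}} = \sqrt{2}\, e^{-1} \left(\frac{2}{e}\right)^n n^n$.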
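The Stirling-type lower bound on the even Gaussian moments, and the resulting comparison between $n^n$ and $\mathbb{E}(V^{2n})$, can be verified exactly for small $n$. The sketch below uses integer arithmetic for the moments; the function names are hypothetical and the range of $n$ is an arbitrary choice.

```python
import math

def gaussian_even_moment(n):
    """E[V^(2n)] = (2n)! / (2^n n!) for a centered Gaussian V with unit variance."""
    return math.factorial(2 * n) // (2 ** n * math.factorial(n))

def stirling_lower_bound(n):
    """Lower bound sqrt(2)/e * (2/e)^n * n^n on E[V^(2n)]."""
    return math.sqrt(2.0) / math.e * (2.0 / math.e) ** n * n ** n

def moment_comparison_holds(n):
    """Checks n^n <= (e/sqrt(2)) * (e/2)^n * E[V^(2n)]."""
    rhs = math.e / math.sqrt(2.0) * (math.e / 2.0) ** n * gaussian_even_moment(n)
    return n ** n <= rhs

if __name__ == "__main__":
    for n in range(1, 11):
        exact = gaussian_even_moment(n)
        bound = stirling_lower_bound(n)
        print(n, exact, f"{bound:.4g}", exact >= bound, moment_comparison_holds(n))
```

The two checks are equivalent rearrangements of the same inequality; both are printed to make the link with the moment comparison explicit.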
By [16, Proposition 11.6.6], the probability of the following event is greater than $1 - e^{-\delta}$, for any $\delta \geq 0$. The above estimate also implies that from which we check that In summary we have Choosing $t = (1 - \epsilon)/(z^2 e)$, we conclude that

Proof of Theorem 2.1
When $U_t = V_t = 0$ we have Next we provide a proof of the second assertion based on the above formula. For any $n \geq 0$, we observe that with the collection of exponential martingales

This implies that
$\mathbb{E}\left(\exp\left(-n \ldots\right)\right)$ Arguing as above we use the decomposition This ends the proof of the first assertion. More generally, we have This yields Observe that This shows that We set Notice that This shows that This implies that Using Hölder's inequality we have This yields the estimate We conclude that Using the decomposition and the Cauchy-Schwarz inequality we check that This implies that This ends the proof of the theorem.

Proof of Proposition 2.3
By (22), for any $m \geq 1$ we have the uniform estimate $\mathbb{E}\left(\|X_t\|^{2m}\right)^{1/m} \leq (u_t(a) + m\,v_t(a)) \Longrightarrow 2\,\mathbb{E}\left(\|X_t\|^{2m}\right) \leq (2u_t(a))^m + (2v_t(a))^m\, m^m$.
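The implication above rests on the elementary convexity inequality $(u + mv)^m \leq \frac{1}{2}\left[(2u)^m + (2mv)^m\right]$ for nonnegative $u, v$ and $m \geq 1$. A quick numerical sanity check is sketched below; `u` and `v` stand for $u_t(a)$ and $v_t(a)$, and the function name and the grid of test values are arbitrary assumptions.

```python
def moment_bound_pair(u, v, m):
    """Returns (lhs, rhs) with lhs = (u + m v)^m and
    rhs = ((2u)^m + (2v)^m m^m) / 2, for u, v >= 0 and m >= 1."""
    lhs = (u + m * v) ** m
    rhs = 0.5 * ((2 * u) ** m + (2 * v) ** m * m ** m)
    return lhs, rhs

if __name__ == "__main__":
    for u in (0.0, 0.5, 2.0, 10.0):
        for v in (0.0, 0.3, 1.0, 7.0):
            for m in range(1, 9):
                lhs, rhs = moment_bound_pair(u, v, m)
                # Small relative slack absorbs floating-point rounding.
                assert lhs <= rhs * (1.0 + 1e-12) + 1e-12
    print("convexity bound holds on the test grid")
```

Equality holds when $u = mv$ (for instance `u = 2`, `v = 1`, `m = 2`), so the factor 2 in the implication cannot be improved by this argument.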
Choosing $\gamma = (1 - \epsilon)/(2e\,v_t(a))$, with $\epsilon \in ]0, 1]$ we find that This ends the proof of (23). Now we come to the proof of (24). We have This implies that $\int_0^t \|X_s\|^2\,ds \leq \ldots$
This ends the proof of the proposition.

Proof of formulae (6), (7) and (8)
We start with the proof of (6). To clarify the presentation, we write $Q$ instead of $Q_2$. Let $\langle x, y \rangle_Q = \langle x, Qy \rangle$ and $\langle x, x \rangle_Q^{1/2} = \|x\|_Q$ be the inner product and the norm associated with the symmetric positive definite matrix $Q$. Let $z(t) = x + t(y - x)$ be an interpolating path from $x$ to $y$, indexed by $t \in [0, 1]$. Also let $\varphi(t) := \langle u, \partial^2 V(z(t))\,u \rangle = \|x + t(y - x)\|_Q\, \|u\|_Q^2 + \frac{1}{\|x + t(y - x)\|_Q}\, \langle u, x + t(y - x) \rangle_Q^2$.
In the above display, $\partial U_1$ and $\partial^2 U_1$ stand for the first and second derivatives of $U_1$ on $\mathbb{R}$; $\partial_i U_2$ stands for the partial derivative of $U_2 : (x_1, x_2) \in \mathbb{R}^2 \mapsto U_2(x_1, x_2) \in [0, \infty[$ with respect to the $i$-th coordinate $x_i$; and $\partial_{i,j} := \partial_i \partial_j = \partial_j \partial_i$, with $i, j \in \{1, 2\}$. This ends the proof of (7). We end this section with the proof of (8). We have $\left|\left\langle (s_1, s_2), \left[\partial^2 U_2(x_1, x_2) - \partial^2 U_2(y_1, y_2)\right](s_1, s_2) \right\rangle\right| \leq \kappa_{\partial^2 U_2}\, \|(x_1, x_2) - (y_1, y_2)\|\, \|(s_1, s_2)\|^2$ from which we find that $|\langle s, [\ldots \left((x_i - y_i)^2 + (x_j - y_j)^2 + s_i^2 + s_j^2\right)$.
This ends the proof of (8).