Probabilistic Estimation and Control of Dynamical Systems Using Particle Filter with Adaptive Backward Sampling

Estimating and controlling dynamical systems from observable time-series data are essential for understanding and manipulating nonlinear dynamics. This paper proposes a probabilistic framework for simultaneously estimating and controlling nonlinear dynamics under noisy observation conditions. Our proposed method utilizes the particle filter not only as a state estimator and a prior estimator for the dynamics but also as a controller. This approach allows us to handle the nonlinearity of the dynamics and uncertainty of the latent state. We apply two distinct dynamics to verify the effectiveness of our proposed framework: a chaotic system defined by the Lorenz equation and a nonlinear neuronal system defined by the Morris–Lecar neuron model. The results indicate that our proposed framework can simultaneously estimate and control complex nonlinear dynamical systems.


Introduction
Estimating dynamical systems from noisy time-series data is vital to understanding complex systems [1].Furthermore, controlling dynamical systems has attracted significant attention in various fields for manipulating known complex systems.Controlling unknown complex systems requires simultaneously estimating and controlling dynamical systems from time-series data.
The state-space model has been utilized in various fields to estimate latent variables from noisy observational data [2][3][4][5][6][7][8][9][10].The state-space model (SSM), also called the general hidden Markov model, is a probabilistic framework used to represent the dynamical system governing the latent variable and the probabilistic observation producing the observation variable.Under the assumption that the parameters governing the entire dynamics are known, the latent variable can be estimated efficiently using a sequential Bayesian filter such as a particle filter [11][12][13].Nonetheless, in practice, it is unrealistic that the parameter values are completely known.If the parameters are unknown, it is necessary to estimate the latent variable and the parameters simultaneously.Therefore, many parameter estimation methods based on the particle filter have been proposed to estimate the latent variable and the parameters of the SSM (see [14] and the references therein).Some offline methods have been proposed, where every recursive parameter update requires a batch of observational data [15].Other online methods have been proposed, where the parameter values can be estimated at the same time that new observational data become available [16].However, previous online parameter estimation methods require multiple samples for each particle drawn from backward ancestor sampling every time, which leads to high computational costs.
In contrast, model predictive control has been applied in various fields to control nonlinear dynamical systems [17].Model predictive control is a control strategy that predicts a known model's behavior in a future interval and calculates an optimal control input; however, to apply model predictive control to a nonlinear model, it is necessary to formulate an optimal control problem, depicted by the calculus of variations or the Euler-Lagrange equation.Solving the optimal control problem, including nonlinearity, usually requires high computational cost and is not always feasible.In particular, particle filter-based model predictive control (PF-MPC) has been proposed [18].PF-MPC can directly handle known nonlinear models by using a particle filter; however, in previous studies on control, the model dynamics are assumed to be known [18].For considering real-world applications, the dynamics of complex systems are often unknown.
In this paper, we propose a framework for simultaneously estimating the latent variable and the parameters of the state-space model, as well as controlling the latent variable, fully based on the particle filter.The proposed method realizes a general method for estimating and controlling dynamical systems described by the state-space model by integrating and adapting the framework of the particle filter (PF), the online expectation-maximization (EM) algorithm, and model predictive control (MPC).With the PF, we estimate the latent variable of the nonlinear dynamics online.In particular, to fully realize the online algorithm for simultaneous estimation and control, we derive a novel online parameter estimation method for the state-space model by applying an adaptive smoothing (AdaSmooth) algorithm, which has recently been proposed for smoothing statistics based on the PF [19].By combining AdaSmooth with the online EM algorithm, we efficiently approximate the sufficient statistics and estimate the parameter values based on the smoothed sufficient statistics.Furthermore, we consider combining PF-based model predictive control (PF-MPC) [18] with our AdaSmooth-based online EM algorithm to form the feedback control law.By introducing the PF not only as a state estimator and a prior estimator for the dynamics but also as a controller, we can directly handle the nonlinearity of the dynamics and uncertainty of the latent state.
This paper is organized as follows.Section 2 introduces the state-space model and the particle filter.We then propose the AdaSmooth-based online EM algorithm for estimating the parameters of the state-space model.Finally, by combining PF-MPC with the state estimation PF and the AdaSmooth-based online EM algorithm, we propose a framework for estimating and controlling dynamical systems.In Section 3, we verify the effectiveness of our proposed method using simulation environments.We apply our proposed framework to two distinct dynamical systems: a chaotic system defined by the Lorenz equation and a nonlinear neuronal model.In each subsection, we first formulate the state-space model for each dynamical system.We then derive the parameter estimation procedure based on the AdaSmooth-based online EM algorithm.Next, we derive the concrete control law based on PF-MPC before applying our proposed framework to each complex nonlinear dynamical system.Futhermore, we present the results for estimating and control the chaostic system and neural system.Finally, Section 4 presents our concluding remarks.

Methods
This study proposes a probabilistic framework for concurrent data assimilation-based control of dynamical systems.To handle a realistic situation, we assume that we can only observe noisy data via some observation model, i.e., we cannot directly obtain the true latent state.Figure 1 shows the overview of our proposed framework.Our proposed framework comprises three parts: (i) estimation of the posterior distribution of the latent state from noisy observational time-series data, (ii) online estimation of the model parameters governing the dynamical system, and (iii) solving the optimal control problem using a control PF with the estimated state and parameters.(i) We apply a particle filter to estimate the latent state x t from noisy observational data y t online using data assimilation.(ii) We apply our adaptive smoothing (AdaSmooth)-based online expectation-maximization (EM) algorithm to estimate the dynamics parameter θ using the particles obtained from the state estimator in statistical machine learning.(iii) We employ another particle filter based on model predictive control to determine the approximate feedback control signal u t for controlling the target dynamics.

Formulation of State-Space Model
First, we formulate the SSM. Figure 2 shows the overview of the SSM.The SSM includes the system model and the observation model.The system model represents the dynamics of the latent variable x t .Conversely, the observation model represents how we obtain the observation variable y t under a realistic situation where some noise exists.We consider the system model, including state x t , control input u t , and certain system noise z t , with a Markov property: where t denotes a discrete time index, θ is an unknown parameter governing the entire dynamics, and f is a function determining the dynamics, which can include nonlinearity.
Corresponding to Equation (1), we also define the probabilistic representation of the system model for the SSM as follows: We then formulate the observation model for the SSM.To reflect a realistic scenario, we assume that we can only obtain the observation variable y t and not the latent variable x t directly at time t.Hence, we formulate the following observation model for the SSM: where g is a function determining the information we obtain via the observation and η t represents certain noise.Corresponding to Equation (3), we also define the probabilistic representation of the observation model for the SSM: For simplicity, we assume that the joint probability p(x 1:t , y 1:t ; θ) belongs to the exponential family, which is a common assumption.Here, x 1:t and y 1:t represent the time-series of the latent variable and observation variable x 1:t = {x 1 , . . . ,x t } and y 1:t = {y 1 , . . . ,y t }, respectively.For example, when both the latent state and observation variable are onedimensional real numbers and the model is linear Gaussian, the state-space model is described as follows: where x t and y t ∈ R are the latent state and observation variable, respectively; a, b, c, and d ∈ R are parameters; and z t−1 and η t are noise terms, both of which obey standard Gaussian distributions N (0, 1).In this linear Gaussian case, the probabilistic representation of the SSM is expressed as follows: where N (x | µ, σ 2 ) denotes the Gaussian distribution with mean µ and variance σ 2 .

Estimation of the Hidden State
To estimate the latent state x t from noisy time-series observational data y 1:t = {y 1 , y 2 , . . . ,y t } online, we apply the particle filter (PF) [11-13,20].
Here, we estimate the latent variable using the filtering distribution p(x t | y 1:t ).In the PF, the filtering distribution is approximated online using particles x , where N is the number of particles.We calculate these particles by alternately performing two steps: a resampling step and a prediction step.
In the resampling step, we resample particles obtained previously based on their weights.Although many resampling schemes have been proposed, we use multinomial resampling [21,22].Using a multinomial distribution M w t for the i-th particle at time t is sampled as follows: where i ∈ {1, 2, . . . ,N} is the index of the particle and M(w (1) , . . ., w (N) ) denotes the multinomial distribution supported on {1, 2, . . ., N}.Using the ancestor index A (i) t , the i-th particle x(i) t at time t is resampled as follows: We apply this resampling step only when the estimated effective sample size Neff satisfies Neff (t) ≤ αN, where 0 ≤ α ≤ 1 is a hyperparameter that determines the threshold of the effective sample size [23].Here, the estimated effective sample size Neff is calculated as follows: When Neff (t) > αN, instead of probabilistic resampling, as described in Equation ( 8), we simply set A (i) t = i for all particles.In the prediction step, we sample particles that approximate the filtering distribution when new observational data y t are obtained.First, we sample each particle using the system model: We then calculate the weight of each particle using the observation model.
where w(i) t denotes an un-normalized weight, and we assume that the weights from the previous time step and the ancestors are accessible.Note that the weight update given by Equation ( 12) is derived from sequential importance sampling [20], where the importance distribution is assumed to be the same as in the system model.Finally, we normalize the weights as follows: Using the sampled particles and their weights x allows us to obtain the filtering distribution of the latent variable online as follows: where δ(•) is Dirac's delta function.Here, the particles and their weights at time t = 0 are generated as follows: where p 0 (x) is the initial distribution.We estimate the latent state online by alternately applying the resampling and prediction steps as new observational data arrive.Although the PF can suffer from the path-degeneracy problem, it can be mitigated using the resampling step.In this study, we apply multinomial resampling; however, other resampling methods can be used in our proposed framework.The particles and their associated weights are used to estimate the latent state and the parameters, including a prior for the state in the controller, within our proposed framework.

Estimation of the Parameter
We utilize the particle filter-based online parameter estimation strategy to estimate the parameters governing the dynamics online.This subsection briefly introduces the original expectation-maximization (EM) algorithm [15] and particle-based online EM algorithm [14,16].We then derive an efficient parameter estimation procedure using additive smoothing (AdaSmooth) [19].

EM Algorithm
The EM algorithm [15] is a widely used offline parameter estimation method.In the EM algorithm, when we have a batch of observational data {y 1 , . . . ,y T }, we estimate the parameter sequence θ1 , . . ., θk , . . .by alternately conducting an E-step and an M-step, where T is the terminal time and k is the number of iterations.
In the E-step, we calculate the expectation of the joint probability p(x 1:T , y 1:T ; θ) given the previous iteration parameter θk−1 .
where E[•] p(x 1:T |y 1:T , θk−1 ) denotes the expectation of the joint filtering distribution with respect to x 1:T given the observational data y 1:T and the parameter θk−1 .L 1:T (θ) corresponds to the log-likelihood of the joint probability p(x 1:T , y 1:T ; θ).In the M-step, we update the estimation parameter at this iteration θk as follows: where arg max θ denotes the maximizer with respect to the parameter θ.
The EM algorithm is an appealing parameter estimation method; however, we need a batch of observational data, and the computational cost of repeating two steps as the size of the observational data increases cannot be ignored.Furthermore, it is difficult to calculate the expectation in the E-step in typical cases.Moreover, an online parameter estimation algorithm must be utilized to realize a simultaneous method for estimating and controlling nonlinear dynamics.

AdaSmooth-Based Online EM Algorithm
In this subsection, we propose the AdaSmooth-based online EM algorithm for estimating the parameters of the SSM.To overcome the problems in the original EM algorithm, some online adaptations have been proposed [14, 24,25].In particular, a particle-based online EM algorithm for time-series state-space models has been proposed [16].However, a previous parameter estimation method based on PaRIS [26] requires multiple samples for each particle drawn from backward ancestor sampling every time, which leads to high computational costs.
Figure 3 shows the flow of our proposed online EM algorithm for estimating the parameter θ t online.We consider the parameter θ t at time t rather than the iteration number k used in the offline EM algorithm.Note that the model parameter θ in the true dynamics is not time-varying, whereas the model parameter for estimating and controlling the dynamical system should be treated as a time-varying state parameter θt in an online manner.Furthermore, we approximate smoothing sufficient statistics Φ 1:t; θt−1 [S t ] using particles obtained online before and after time x via the PF [16].To efficiently estimate the parameters, we apply AdaSmooth [19] to smooth the sufficient statistics.AdaSmooth requires only one sample for each particle drawn from backward ancestor sampling.In addition, AdaSmooth decides whether we should conduct backward ancestor sampling adaptively; hence, we can obtain approximated smoothing sufficient statistics Φ 1:t; θt−1 [S t ] and estimate the parameter θt online.
} N i=1 at time t are obtained by the particle filter (PF) with new observation y t and current parameter value θt−1 , we update the sufficient statistics S t (x 0:t ) using AdaSmooth, which incorporates particles at two subsequent times: t − 1 and y (x . (Mstep) Based on the updated sufficient statistics S t (x 0:t ), we estimate the parameter value θt online.We can efficiently estimate dynamics online by repeating the E-step and the M-step as new observational data are obtained.
First, we rewrite the M-step in the EM algorithm.Assuming that the joint probability p(x 1:t , y 1:t ; θ) belongs to the exponential family, a suitable function Λ(•) [14] exists as a maximizer of the log-likelihood in a specific form.That is, we can rewrite the M-step in the context of online estimation as follows: where Φ 1:t; θt−1 denotes the expectation with respect to the joint probability p(x 1:t | y 1:t , θt−1 ).The parameter index now becomes time t, contrasting with the iteration count k used in Equation (17) in the offline context.S t is the sufficient statistics defined as follows: Here, sτ (x τ−1:τ ) = sτ (x τ−1 , x τ ) is a statistic given by the states x τ−1 and x τ before and after times τ − 1, τ.
Next, we estimate Φ 1:t; θt−1 [S t ] online.At time t, we calculate the approximated smoothed sufficient statistics in the E-step [14,16] as follows: where κ t (x t ) is a statistic updated each time via the following procedure: where κ 0 (x 0 ) = 0 and γ t is a decay rate that satisfies Here, we estimate κ t (x t ) using particles.We prepare a new set of particles κ t (x t ).Then, we derive the update dynamics of κ t (x t ) based on AdaSmooth.Essentially, we update each particle κ (i) t corresponding to Equation ( 21) as follows: where t is the ancestor index obtained in the PF resampling step.The first term in Equation (22) represents the value at a previous time.The second term in Equation ( 22) represents the sufficient statistics, including the information obtained at time t.Nonetheless, updating κ (i) t using only Equation ( 22) can lead to the particle path-degeneracy phenomenon and usually leads to instability.We introduce adaptive ancestor backward resampling to avoid this problem [19].During resampling in the PF, when the diversity of the ancestors Nanc (t) satisfies Nanc (t) ≤ βN, we conduct backward ancestor resampling to sample B (i) t .We then update the dynamics using the following instead of Equation ( 22): The first term in Equation ( 23) is the same as in Equation ( 22).The second term in Equation ( 23) represents the correction of the sufficient statistics using a new ancestor drawn from backward ancestor sampling.Here, 0 ≤ β ≤ 1 is a hyperparameter that determines the threshold of ancestor diversity.Nanc (t) is the number of unique Enoch indices {E (i) t }, defined as follows [19]: Here, the Enoch index t is updated via the following procedure: where t 0 represents the last time backward ancestor sampling was conducted.The definition of each Enoch index shows that it retains the index of the ancestor at the last time backward ancestor sampling was conducted.Hence, a decrease in Nanc (t) corresponds to a decrease in the diversity of the ancestors.The value of Nanc (t) is used to determine whether we should conduct backward ancestor sampling.
Here, we show the procedure used in backward ancestor sampling to obtain the backward ancestor index B (i) t .According to Bayes' theorem, the backward ancestor index is sampled as follows: We implement this sampling scheme efficiently using rejection-acceptance sampling [19,26].Once κ t (x t ) is approximated, the smoothing sufficient statistics are estimated as follows: Finally, we estimate the parameter using Equation (18).The parameter θt , estimated using our AdaSmooth-based online EM algorithm, is also used in the controller described in Section 2.4.
Here, we briefly compare our proposed AdaSmooth-based online EM algorithm with the offline EM algorithm and their variants without considering control contexts.The offline EM algorithm requires many iterations with a batch of data.As a result, it has high computational costs.On the other hand, our AdaSmooth-based online EM algorithm requires no iteration for each step.Hence, our proposed framework realizes efficient online estimation of model parameters.

Control Strategy for the Dynamics
To design an efficient controller for the general state-space model, PF-based model predictive control (PF-MPC) [18] is integrated with our dynamics estimator in the proposed method.Model predictive control (MPC) is a theoretical framework used for feedback control in dynamical systems.In conventional MPC, an optimal control problem is formulated at time t over a future interval called a horizon, using the model x t+1 = f (x t , u t ).Next, the problem is numerically solved to obtain the optimal control input series {u * τ } τ∈horizon over the horizon.Finally, the first control input from this series is adopted as the actual input.Specifically, in conventional MPC, the optimal control problem in the horizon is treated as solving the Euler-Lagrange equation numerically, which is computationally expensive because it includes the derivative of the model concerning the state x t and control input u t .In contrast, in PF-MPC, the optimal control problem is treated as a filtering problem for the control input u t based on an augmented state-space model within the horizon.
Figure 4 shows the overview of PF-MPC.Our framework has two distinct PF modules: one (state PF) acts as a state estimator, as described in Section 2.2, and the other (control PF) serves as a solver for the control-filtering problem defined below.First, the latent state particles and their associated weights (x are obtained via the PF.Then, the control input particles ū(i) t are sampled from the proposal distribution p( ūt | u t−1 ).Subsequently, by regarding the input variable as part of the latent state, we form augmented state particles (x over the horizon.With the given reference trajectory r t , . . ., r T+T H , we solve the control-filtering problem by applying the control PF within the horizon and obtain the filtering distribution of the control input p( ūt | r t , . . ., r T+T H ).
Here, T H is the duration of the horizon.Finally, we adopt u t as the actual input based on the filtering distribution of the initial control input.
. These particles and their weights (x , obtained via the state PF, are directly used as initial augmented particles by the PF for control in the horizon (control PF).The input particles u are sampled from the proposal distribution p( ūt | u t−1 ).Using the control PF, we can obtain the filtering distribution of the optimal control p( ūt | r t:t+T H ) given the reference trajectory; the optimal control input is calculated using a specific point estimator like the posterior mean.
Here, we discuss the details of PF-MPC.To formulate the control-filtering problem, we define an augmented state-space model over the horizon.The augmented state-space model consists of the augmented system model and the augmented observation model.
First, we formulate the augmented system model over the horizon.Within this horizon, we consider the control input u t as part of the augmented state ζ t = ( xt , ūt , ũt ), where ũt serves as an auxiliary variable to preserve the initial input throughout the horizon.Here, xt , ūt , respectively, denote the predictive state and control input within the horizon.Note that the xt and ūt notations are required to distinguish the predicted state and control input transitions over the horizon ( xt , ūt ), . . ., ( xt+T H , ūt+T H ) from the actual state and input time series {(x 1 , u 1 ), . . ., (x t , u t ), . ..}.Therefore, we define the following augmented system model for the augmented state ζ t within the horizon: where τ ∈ {t, t + 1, . . ., t + T H } is the time index within the horizon, p( xτ+1 | xτ , ūτ ; θt ) is defined using the original system model within the state-space model described in Section 2.1, p( ūτ+1 | ζ τ ) is the transition probability, and δ( ũτ+1 − ũτ ) is the probability corresponding to the deterministic transition ũτ+1 = ũτ .The parameter θt is estimated using our AdaSmooth-based online EM algorithm and is fixed throughout the horizon.Next, we formulate the augmented observation model.We consider controlling the latent state x t toward the given reference trajectory {r 1 , . . . ,r t , . ..}.We define the augmented observation model as follows: For example, a Gaussian distribution can be used as the observation model: where R τ is an appropriately sized matrix and Σ r is a diagonal covariance matrix that adjusts the error between the state and reference trajectory.Note that the reference value's dimension does not have to be the same as that of the latent variable, i.e., R τ can be a non-square matrix.Furthermore, the augmented observation model is very flexible, but its details are omitted here (see [18]).
We then solve the control-filtering problem given by the augmented state-space model [Equations ( 28) and (29)] and the given reference trajectory {r τ } t+T H τ=t .Using control PF in the horizon {t, t + 1, . . ., t + T H }, we solve the control-filtering problem efficiently to estimate the augmented state ζ t = ( xt , ūt , ũt ).Here, the initial augmented particles and their associated weights w(i) in the horizon are initialized for each particle using information obtained from the state PF as follows: where (x are the particles obtained from the state PF for the latent state estimation and p( t ) is the initial proposal distribution for the control input.To avoid confusion, we note here that the aim of the control PF is to estimate the optimal control input obtained using the initial control input via the proposal distribution.For filtering the initial control input ūt , an auxiliary variable ũt is defined to preserve the initial input ūt .Although ũ(i) t = ū(i) t at time t is defined in Equation ( 31), the predictive control inputs ū(i) t+1 , ū(i) t+2 , . . . in the horizon have values different from ũ(i) t .Therefore we need both notations ūt and ũτ for the control PF in the horizon.
Finally, we estimate the optimal control input based on the filtered particles.By extracting the auxiliary variables that retain the initial control input and their associated weights ũ(i) t+T H , w(i) , we compute the actual control input at time t via point estimation of the filtering distribution as follows: It should be noted that the auxiliary variable ũt+T H inherits the information at time t, . . ., t + T H − 1.Hence, to obtain the estimated optimal control input, we focus solely on these particles at time t, . . ., t + T H − 1 within the horizon.
Here, we briefly discuss the differences between conventional MPC and PF-MPC.MPC requires derivatives of the model concerning the state and the control input.Furthermore, exact point estimation of the latent state is required.Conversely, PF-MPC does not require model derivatives and only relies on particles representing the latent variable.The only requirement of PF-MPC is that the model can be simulated forward in time, which is satisfied in typical cases.Nonetheless, under certain assumptions, the control input obtained using PF-MPC is equivalent to that of conventional MPC [18].Moreover, to clarify the advantage of our proposed method compared with previous work, we briefly touch on the differences between the previous work [27] and our proposed framework.Three significant differences exist between the previous work and our proposed framework.First, the previous work updates parameters at fixed intervals.In contrast, our proposed framework updates parameters continuously, with the controller referencing these updated parameters; hence, our proposed framework is more adaptive.Second, the previous work uses the offline EM algorithm for parameter estimation.Due to the structure of the offline EM algorithm, the previous framework maintains a history of particles within a window.In contrast, our proposed method is based on the online EM algorithm.Using the online EM algorithm as the fundamental parameter estimation strategy, we retain only the preceding and succeeding particle sets; therefore, our framework is more memory-efficient.Third, in the previous work, there are complex hyperparameters, e.g., the parameter update interval and window length.These complex hyperparameters require sensitive hand-tuning and domain-specific knowledge.Conversely, the hyperparameters of our proposed framework are defined in a simpler form.For further details, the typical values of the hyperparameters for AdaSmooth are discussed in [19].Therefore, our proposed framework is easier to tune compared with the previous work.
We have proposed a fully PF-based framework for simultaneously estimating the latent state and the parameters governing the dynamics, and controlling the dynamics against a general state-space model.Although conventional MPC can handle the model's nonlinearity directly and PF-MPC can address the model's uncertainty, the latent variable and dynamics are assumed to be known.The previous method [27] attempts to resolve this problem; however, it requires difficult hyperparameter tuning and applies only to neuronal models.Our proposed method establishes a feedback control law for general state-space models by combining a PF and an AdaSmooth-based online EM algorithm.

Experiments
We verify the effectiveness of our proposed method in a simulation environment.We apply our proposed method to two complex systems: the Lorenz system and the Morris-Lecar neuron model.For each distinct model, we first formulate its complex dynamics as a general SSM.Next, we derive the sufficient statistics necessary for our AdaSmooth-based online EM algorithm to estimate the parameters.Then, we formulate the augmented SSM over the horizon to control the dynamics.Finally, the experimental settings and results for each experiment are shown.

Application to Chaotic Lorenz System
The Lorenz system, introduced in [28], is known as a chaotic system.Because chaotic behavior is not always desirable, eliminating it is essential for practical application.Delayed feedback control has been proposed to stabilize chaotic models [29,30].In contrast, a control strategy for chaotic behavior with simultaneous parameter estimation using statistical machine learning has not been studied.By applying our proposed method to the Lorenz system, we aim to confirm its ability to estimate the latent state and parameters of the chaotic Lorenz system and to stabilize its chaotic behavior.
The Lorenz system is given as follows: where t ′ represents continuous time; (x, y, z) ⊤ ∈ R 3 represents the three-dimensional latent states; u is the control input; and σ, r, b are parameters.Based on a previous study on suppressing chaotic behavior [30], we assume the control input only affects one element, y.
To avoid confusion, the variables used in the Lorenz system are shown in Table 1.When we fix u = 0, the Lorenz system with σ = 10, b = 8/3, r > r H exhibits subcritical Hopf bifurcation, where r H ≃ 24.73684211.We aim to control the y-element value toward the unstable fixed point y f = b(r − 1) under noisy environmental conditions of the latent state where only noisy observations of the latent state can be observed.To this end, we first derive the SSM for the Lorenz system.We then derive the E-step and the M-step of our online EM algorithm.Finally, we formulate the augmented SSM for the Lorenz system to control the latent state.With these Lorenz system formulations, we apply the proposed method to the Lorenz system.Here, we derive the SSM for the Lorenz system.We first derive the system model of the SSM.We discretize the continuous-time model given by Equation ( 36) with a time step ∆t.
where x t = [x t , y t , z t ] ⊤ ∈ R 3 corresponds to the latent state at time t∆t, u t ∈ R is the control input at time t∆t, and θ = [σ, r, b] ⊤ ∈ R 3 corresponds to the unknown parameters of the system.f (x t , u t ; θ) is a function that describes the dynamics of the Lorenz system as follows: Next, we introduce system noise governed by additive Gaussian noise with zero mean and covariance matrix Here, diag(a 1 , a 2 , a 3 ) denotes the diagonal matrix with diagonal elements a 1 , a 2 , a 3 .The system model of the SSM for the Lorenz system is formulated as follows: Next, we derive the observation model of the SSM.We assume that the latent state with additive Gaussian noise is obtained.Hence, we formulate the observation model as follows: where Σ y ∈ R 3×3 is the covariance matrix corresponding to the observation noise.

Derivation of Online EM Algorithm
Here, we derive the sufficient statistics S t (x 0:t ) and the update function Λ(•) for the Lorenz SSM, using Equations ( 39) and ( 40), in our AdaSmooth-based online EM algorithm.After calculation, we obtain the maximizer of the expectation of log-likelihood Λ(•) as follows: where the sufficient statistics S t = (A t , b t ) comprise two additive-form statistics A t ∈ R 3×3 and b t ∈ R 3 , given as follows: where Ãτ and bτ are defined as follows: (45)

Formulation of the Augmented State-Space Model for Control
We aim to suppress the chaotic behavior and control the y-element of the latent state toward the fixed point y f = b(r − 1) [30].First, we define the control transition over the horizon.To consider realistic scenarios, we impose a constraint concerning the control input at every time t as follows: where u lim > 0 is the limit of the control input.Therefore, we define the augmented system model for the control input transition as follows: where z τ+1 ∼ N (0, σ 2 u ) is Gaussian noise and clamp u lim (•) is a clamp function defined as follows: To initialize the augmented particles, we define the following initialization procedure corresponding to the initial proposal distribution p( where z ) is Gaussian noise.Note that we do not need a probabilistic form of the augmented SSM.
We then define the augmented observation model of the augmented SSM over the horizon.We aim to control the y-element of the latent state toward the fixed point y f ; hence, we define the augmented observation model for the PF-MPC controller as follows: where σ 2 r ∈ R is a variance that adjusts for acceptable error against the reference trajectory.The given reference value is also always equivalent to y f , i.e., r t = y f for t = 1, 2, . . . .

Settings
The true dynamics are assumed to be the Lorenz system defined by Equation (36), and in the simulation, they are solved using the Runge-Kutta method.Here, the true parameters are σ = 10 and r = 28, b = 8/3.We apply our proposed method to the Lorenz system to suppress its chaotic behavior toward the unstable fixed point y f = b(r − 1).Note that we can only observe the noisy time series of the latent variable, not the true value.
The hyperparameters are set as follows.For simulation, the time step is ∆t = 0.01.The variances of the system noise and the observation noise are ), respectively.For the PF, the number of parti- cles is N = 1000, and the threshold rate of resampling is α = 0.8.The initial particles are sampled from the initial distribution, defined as follows: where Σ x0 = diag(10 2 , 10 2 , 10 2 ).For parameter estimation, we use the threshold rate of backward resampling β = 0.7 and the decay rate γ t = t −1 .Following [16], we start updating the parameter after T burnIn steps.In this experiment, we set T burnIn = 100.To control the Lorenz system, we set the number of steps in the horizon to T H = 10.As a hyperparameter in the augmented SSM, the acceptable error variance and the control input limit are σ 2 r = 1 2 and u lim = 10, respectively.The variances of the augmented SSM's control transitions are σ 2 u = 1 2 and σ 2 u0 = 10 2 .

Results
Figure 5 shows the simulation results after applying our proposed framework to the Lorenz system.We confirmed that our method can simultaneously estimate and control the Lorenz dynamics.To verify whether our proposed method can suppress chaotic behavior, we did not adopt the control input until t = 5.
The graphs on the left in Figure 5 show the x, y, z elements of the latent state from top to bottom, respectively.The state estimated by the PF represents the filtering distribution of the latent state.Hence, to confirm that we can estimate the latent variable, the estimated value was calculated using the weighted mean of the particles as follows: The three graphs on the left in Figure 5 show the estimated values (x est , y est , z est ) (blue solid lines) and the true values (x true , y true , z true ) (green dashed lines), respectively.We found that the estimated values tracked the true values over time.In particular, the middle graph in Figure 5 also shows the target unstable fixed point y f (blue dotted line).We also found that the estimated values (x est , y est , z est ) accurately reproduced the true values (x true , y true , z true ), and the state was successfully manipulated.Here, the red dash-dotted line represents the simulation results when no input was applied, as shown in the graphs on the left in Figure 5. Without our proposed method, the Lorenz system still exhibited chaotic behavior.Conversely, after applying our proposed method, the Lorenz system stabilized and converged to the fixed point y f .
The latent state [x t , y t , z t ] is estimated using our proposed method over time.In addition, all of the estimated parameters converge to the true values compared to the initial values, although σ oscillates slightly.To confirm that the model exhibits chaotic behavior, we provide the control input after time t = 5 (gray dashed line in the middle-left graph).
To confirm whether our proposed method could suppress chaotic behavior, we visualized the trajectory of the latent variable (x t , y t , z t ) for the non-controlled and controlled Lorenz systems in a three-dimensional graph.Figure 6 shows the two trajectories of the non-controlled Lorenz system and the Lorenz system controlled by our proposed automatic control law.The left side of Figure 6 shows the simulation results of the true dynamics when the control input was fixed at zero: u t = 0.When we did not adopt any control input, the Lorenz system exhibited chaotic behavior around two fixed points.Conversely, by applying our proposed framework, as shown on the right in Figure 6, the Lorenz system stabilized around the fixed point determined in advance.
Next, we clarified whether our proposed method could estimate the unknown parameters.The graphs on the right in Figure 5 show the parameters governing the Lorenz system.Each estimated parameter value of three types (σ t , r t , b t ) of the Lorenz system converged to the true values over time, although the estimated value for σ t oscillated slightly.In particular, the bifurcation parameter r t , which influences the dynamics of y, was accurately estimated by our AdaSmooth-based online EM algorithm.Additionally, b t , governing the dynamics of z, was also estimated accurately by our proposed method.
Finally, we clarified whether our proposed method could estimate and control the Lorenz dynamics using the control input u t with a limited range of strengths.Figure 7 shows the control input (blue line) and the limit value (blue dash-dotted line).We maintained u t = 0 during the time period marked with a gray background.The control input u t always satisfied the norm constraint |u t | ≤ u lim .Hence, we can control the Lorenz dynamics simultaneously, even if we impose input constraints.The history of the control input u t obtained automatically by our proposed method.Note that the control input u t always satisfies the absolute value constraint |u t | ≤ u lim .Nonetheless, Figure 5 shows that we can simultaneously estimate and control chaotic nonlinear dynamical systems.

Application to Morris-Lecar Neuron Model
It is important to estimate and control the nonlinear dynamics of neurons to understand nerve systems and brain functions [31,32].Nonetheless, it is difficult to control neuronal dynamics due to their complexity and partial observability.Typically, only noisy membrane potentials are observable [33][34][35].
Following previous work [27], this section aims to control the time series of membrane potentials toward a given reference trajectory under realistic conditions, where only noisy time series of membrane potentials can be observed.To this end, we first derive the SSM for the Morris-Lecar neuron model.We then derive the sufficient statistics and the update function for our online AdaSmooth-based EM algorithm for online parameter estimation.Finally, we formulate the augmented SSM for the Morris-Lecar neuron model to control the membrane potential.We apply the proposed method to the Morris-Lecar neuron model based on these SSM formulations.

Formulation of State-Space Model
In the Morris-Lecar neuron model [36], the nonlinear dynamics of the membrane potential v and the channel variable n are described by the following continuoustime model: where v is the membrane potential, n is the channel variable, and I is the external input current.Here, is a function that determines the speed of the dynamics.m ∞ (v) and n ∞ (v) are nonlinear functions that describe the calcium and potassium dynamics; E L , E Ca , and E K are the reversal potentials; and C m is the membrane capacitance [27].To avoid confusion, the variables used in the Morris-Lecar neuron model are shown in Table 2. First, we derive the system model for the SSM.The continuous-time model given by Equation (53) is discretized with a time step ∆t.
where x t = [v t , n t ] ⊤ ∈ R 2 corresponds to the latent state at time t∆t, u t ∈ R corresponds to the control input at time t∆t, and θ = [g L , g Ca , g K ] ⊤ ∈ R 3 corresponds to the unknown parameters.Following previous work [27], we separate the net input I t into the controllable input u t and the known streaming currents from other neurons I inj t , i.e., I t = u t + I inj t .f (x t , u t ; θ) is a function representing the dynamics of the Morris-Lecar neuron model as follows: We then introduce system noise governed by white Gaussian noise with zero mean and covariance matrix The system model of the SSM for the neuron model is formulated as follows: Next, we formulate the observation model for the SSM.We consider that only noisy membrane potentials y t ∈ R can be observed.Introducing additive Gaussian noise as observation noise, we formulate the observation model as follows: where σ 2 y ∈ R is the variance corresponding to the observation noise.

Derivation of Online EM Algorithm
Here, we derive the sufficient statistics S t (x 0:t ) and the update function Λ(•) for the neuron model described by Equations ( 56) and (57), which are used in our AdaSmoothbased online EM algorithm.After calculation, we obtain the maximizer of the expectation of log-likelihood Λ(•) as follows: where the sufficient statistics S t = (A t , b t ) include two additive-form statistics A t (x 0:t ) ∈ R 3×3 and b t (x 0:t ) ∈ R 3 , given as follows: Here, V ion (x t ) ∈ R 3 represents the potentials of each ion channel, given as follows:

Formulation of the Augmented State-Space Model for Control
Here, we formulate the augmented SSM over the horizon.First, we derive the control transition of the augmented system model p( ūτ+1 | ζ τ ).In realistic scenarios, it is essential to guarantee that the net input current I t = u t + I inj t is always within a specific range.Therefore, we impose a norm constraint on the control input at every time t as follows: where I inj t is the current injected from other neurons and I lim is the limit current.To this end, the control transition within the horizon is described as follows: where z τ+1 ∼ N (0, σ where z Next, we derive the augmented observation model for the augmented SSM over the horizon.We define the following augmented observation model to control the time series of membrane potentials toward a given reference trajectory r t ∈ R: where σ 2 r ∈ R is a variance that adjusts for acceptable error between the predictive state transition and the reference trajectory within the horizon.

Settings
The true dynamics are assumed to follow the Morris-Lecar neuron model defined by Equation (53), and in simulation, they are solved using the Euler method.Here, the true parameters are those for the Homoclinic [37].We apply our proposed method to the Morris-Lecar neuron model to control the membrane potentials v t toward the desired reference membrane potentials r t .The reference trajectory r t is generated by the true model in advance.Note that we can only observe noisy membrane potentials; that is, we assume a partial observation situation.
The hyperparameters are set as follows.For simulation, the time step is ∆t = 0.1.The variances of the system noise and observation noise are Σ x = diag(σ 2 v , σ 2 n ) = diag(0.1 2 , 0.001 2 ) and σ 2 y = 0.1 2 , respectively.For the PF, the number of particles is N = 1000, and the threshold rate of resampling is α = 0.8.The initial distribution for the initial particles p 0 (x) is p 0 , where U (n | 0, 1) is a uniform distribution within [0, 1].For parameter estimation, we use the threshold rate of backward resampling β = 0.7 and the decay rate γ t = t −1 .We update the parameters after T burnIn = 1000 steps.To control the Morris-Lecar neuron model, we set the number of steps in the horizon as T H = 10.As a hyperparameter in the augmented SSM, the acceptable error variance and the control input limit are set to σ 2 r = 0.1 2 and I lim = 150, respectively.The variances of the augmented SSM's control transitions are σ 2 I = 1 2 and σ 2 I0 = 10 2 .

Results
Figure 8 shows the simulation results after applying our proposed framework to the Morris-Lecar neuronal system.Here, we controlled the neuronal dynamics toward the reference trajectory, which represents the Homoclinic behavior during time intervals 0 ≤ t ′ ≤ 250 and 500 ≤ t ′ ≤ 750, and the resting state during intervals 250 ≤ t ′ ≤ 500 and 750 ≤ t ′ ≤ 1000.First, we found that our method could simultaneously estimate and control the Morris-Lecar neuron model.The middle-left graph shows the true membrane potential v true (blue solid line), the estimated value v est (green dashed line), and the target membrane potential v ref (red dotted line).We found that the estimated value tracked the true value, as shown in the middle-left graph in Figure 8.In particular, the pale purple line represents the simulation results without adopting any control input.Without our proposed method, the Morris-Lecar neuron model converged to a specific value.However, by applying our proposed method, the Morris-Lecar neuron model was controlled toward the desired trajectory, allowing us to switch between neuronal firing and resting states.The bottom-left graph shows the true value n true (blue solid line) and the estimated value n est (green dashed line).We can see that the estimated value also closely tracked the true value.
Next, we verified whether our proposed method could estimate the parameters governing the dynamics of the neuron model.The right side of Figure 8 shows the parameters (g L , g Ca , g K ) governing the Morris-Lecar neuronal model.As shown in the three graphs on the right, each estimated parameter value converged to the true value over time, although the estimated values spiked around 500 to 600.
Finally, we verified whether the proposed framework could estimate and control the neuronal dynamics under varying strengths of the control input u t .The top-left graph shows the controllable input current u (blue solid line), the injected current I inj (green dashed line), the actual net input I = u + I inj , and the limit value (light-blue dash-dotted line).The net input u = u + I inj always satisfied the norm constraint |u + I inj | ≤ I lim .
Hence, we can simultaneously estimate and control the nonlinear neuronal dynamics while adhering to the constraint for the control signal.

Conclusions
In this paper, we have proposed a probabilistic framework for simultaneously estimating and controlling general dynamics entirely based on the particle filter.By introducing the particle filter not only as a state estimator and a prior estimator for the dynamics but also as a controller, we can directly handle the nonlinearity of the dynamics and the uncertainty of the latent state effectively and efficiently.With adaptive backward resampling, the proposed framework can suppress the degeneracy of the particles with respect to the sufficient statistics and realize accurate parameter estimation.Through experiments on two general models, including nonlinearity in distinct fields, we verified the effectiveness of our proposed framework.By applying our proposed framework to the Lorenz system in a simulation environment, we found that the proposed framework can not only estimate the latent variables from noisy observational data and the parameters governing chaotic behavior but also control the latent state toward the fixed point.Furthermore, by applying our proposed framework to the Morris-Lecar neuron model, we showed that even under noisy partial observation conditions, it can estimate the latent state and parameters and control the latent state toward the desired trajectory.Although our proposed method can simultaneously estimate the latent state and parameters and control nonlinear complex dynamics, our framework cannot handle semi-parametric or non-parametric models directly.In future work, we plan to extend our proposed framework to handle such semi-parametric or non-parametric models.

Figure 1 .
Figure1.Overview of the proposed framework for probabilistic estimation and control of dynamical systems.(i) We apply a particle filter to estimate the latent state x t from noisy observational data y t online using data assimilation.(ii) We apply our adaptive smoothing (AdaSmooth)-based online expectation-maximization (EM) algorithm to estimate the dynamics parameter θ using the particles obtained from the state estimator in statistical machine learning.(iii) We employ another particle filter based on model predictive control to determine the approximate feedback control signal u t for controlling the target dynamics.

Figure 2 .
Figure 2. The overview of the state-space model (SSM).

Figure 3 .
Figure 3.The flow of our proposed AdaSmooth-based online EM algorithm.(E-step) When the latent state particles { x

Figure 4 .
Figure 4. Overview of particle filter-based model predictive control (PF-MPC) integrated with our AdaSmooth-based online EM algorithm.The particle filter for state estimation (state PF) provides the distribution of latent variables p(x t | y 1:t ) as particles and their corresponding weights

Figure 5 .
Figure 5. Results for simultaneous estimating and controlling the Lorenz system.The dynamical behavior of the state variables x t = [x t , y t , z t ] ⊤ and model parameters θ = [σ, r, b] ⊤ are shown.By applying the reference signal [fixed point y f = b(r − 1)], all three variables (x, y, z) approach a steady point(x f , y f , z f ) = b(r − 1), b(r − 1), r − 1 .The latent state [x t , y t , z t ] is estimated using our proposed method over time.In addition, all of the estimated parameters converge to the true values compared to the initial values, although σ oscillates slightly.To confirm that the model exhibits chaotic behavior, we provide the control input after time t = 5 (gray dashed line in the middle-left graph).

Figure 6 .Figure 7 .
Figure 6.Dimensional visualization of trajectories with the non-controlled and controlled Lorenz systems.Left: The non-controlled trajectory obtained through simulation, starting with the same initial state used in the experiment.Right: The controlled trajectory obtained through simultaneous estimation and control of dynamics using our proposed method.Here, the arrows on each trajectory represent the direction of the transitions.By applying our proposed framework, the Lorenz system stabilizes around the desired fixed point (x, y, z) = b(r − 1), b(r − 1), r − 1 .

Figure 8 .
Figure8.Results for simultaneously estimating and controlling the Morris-Lecar neuronal system.The dynamics of state variables x t = [v t , n t ] ⊤ , control input u, and model parameters θ = [g L , g Ca , g K ] ⊤ are shown.Top-left graph: Control current u, streaming currents I inj , and net input u + I inj for the neuron, along with the control limit value.Middle-left graph: True membrane potential v true , estimated membrane potential v est , and reference trajectory v ref , along with the non-controlled trajectory.Bottom-left graph: True channel variable n t and estimated channel variable n est .Right three graphs: Estimated values of the parameters g L , g Ca , g K , with true values indicated by dashed lines.We can simultaneously estimate and control the Morris-Lecar neuronal dynamics by applying our proposed method.

Table 1 .
Input and output variables used in the Lorenz system.

Table 2 .
Input and output variables used in the Morris-Lecar neuron model.