Kalman Filtering for Genetic Regulatory Networks with Missing Values

The filtering problem for genetic regulatory networks (GRNs) with missing values is addressed, in which noise exists in both the state dynamics and the measurement equations; furthermore, the correlation between process noise and measurement noise is also taken into consideration. To deal with the filtering problem, a class of discrete-time GRNs with missing values, noise correlation, and time delays is established. Then a new observation model is proposed to decrease the adverse effect caused by the missing values and to decouple the correlation between process noise and measurement noise in theory. Finally, a Kalman filter is used to estimate the states of the GRNs. A typical example is provided to verify the effectiveness of the proposed method, and the results show that the concentrations of mRNA and protein can be estimated accurately.


Introduction
According to the central dogma of genetics, a specific protein is generated by a complex gene expression process (including transcription, translation, and other interaction processes) among DNAs, RNAs, and gene products [1,2]. To guide gene expression correctly, each stage of gene expression should be regulated. The regulation functions for each stage form genetic regulatory networks (GRNs). Clearly, gene expression levels are determined by GRNs. For this reason, many GRN models have been built to track the concentrations of mRNA and protein, such as the Boolean model [3,4], the Bayesian model [5][6][7], the differential equation model [8][9][10][11], and the state-space model [12,13]. However, due to system uncertainties, time-varying delays [14][15][16], and missing data [17,18] in real gene expression processes, the measurements obtained from the sensor are usually contaminated by noise and cannot represent the true values well. Thus, many filtering methods have been proposed to reveal the true values.
In studying the stability of genetic regulatory networks, noise disturbances are one of the main factors that cannot be ignored, and they are mainly composed of process noise and measurement noise. To restrain these noise disturbances, many filtering methods, such as the H∞ filter [19] and the Kalman filter [20], have been proposed to obtain stable GRNs. Although process noise and measurement noise are usually taken into consideration, the correlation between them is ignored in these methods, so they lack generality in this respect. In this paper, to make the filtering method more general, the correlation between process noise and measurement noise is taken into consideration and is also decoupled in theory.
Generally, gene expression levels (the concentrations of mRNA and protein) can be measured by DNA microarray technology, but values can go missing for many reasons, such as dust or scratches on the slide, inappropriate thresholds in preprocessing, insufficient resolution of the microarray, experimental errors during laboratory processes, or image corruption [18]. As a result, the measured gene expression levels contain a certain degree of distortion, which causes the measured concentrations to deviate from the real concentrations. To overcome this drawback, set-values filtering for GRNs with missing values was proposed in [17,21]; although this method handles that specific problem well, it does not explain the missing values in a detailed mathematical formula. Therefore, in this paper, an observation model with missing values is given, and a Kalman filter is designed to obtain stable GRNs with missing values.
Computational and Mathematical Methods in Medicine
Parameters of the GRN model: the degradation rates of mRNA ( ); the degradation rates of protein ( ); the coupling coefficient of the genetic networks ( ); the translation rate; the bounded constant which denotes the dimensionless transcriptional rate [21].
In this paper, an estimation problem for a class of discrete-time GRN models with time-varying delays, missing values, and noise correlation is considered. The rest of the paper is organized as follows. In Section 2, a discrete model of genetic regulatory networks is introduced, an observation model with missing values is built to describe them in a mathematical formula, and the correlation between process noise and measurement noise is decoupled in theory. In Section 3, a Kalman filter is designed to estimate the real concentrations of the GRNs, and the stability of the Kalman filter is analyzed. In Section 4, a typical example is provided to illustrate the effectiveness of the proposed method.
In addition, (⋅) ∈ R is a monotonic function in Hill form, which represents the feedback regulation of the protein.
In practice, actual GRNs may be influenced by the dynamic reactions of the network, time delays, and molecular noise. Based on system (4), discrete-time GRNs with an observation equation and noises are considered as system (5), in which ( ) is the noise-driven matrix and ( ) is the observation matrix.
Then, to handle the time delay of system (5), a new state vector is defined as in (7). Using the new state variable (7) gives system (8), where ( ) and V( ) are white, zero-mean, correlated noises.
As for the measurement model with missing values, measurement values are lost with a certain probability, so the measurement model with missing values can be described as in (10) [24], where ( ) is received by the estimator and the initial state (0) is independent of ( ), ( ), and V( ). The variable ( ) obeys the Bernoulli distribution and is uncorrelated with the other random variables. There are two basic properties of ( ), with 0 ≤ ( ) ≤ 1. If ( ) = 0, the measurement value is lost at that time instant; if ( ) = 1, no value is missing.
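The missing-value mechanism described above can be illustrated with a minimal Python sketch. It assumes a scalar measurement, a hypothetical arrival probability `p_arrive`, and that a lost measurement is received as zero; the names `gamma`, `z`, and `y` are illustrative and are not the paper's notation. A Bernoulli variable with arrival probability p has mean p and variance p(1 - p), which the sample statistics below approximate.

```python
import random

def simulate_missing(z, p_arrive, rng):
    """Return (gamma, y): Bernoulli arrival flags and the received values.

    gamma[k] = 1 means z[k] arrives; gamma[k] = 0 means it is lost,
    in which case the estimator receives 0 (an assumption of this sketch).
    """
    gamma = [1 if rng.random() < p_arrive else 0 for _ in z]
    y = [g * zk for g, zk in zip(gamma, z)]
    return gamma, y

rng = random.Random(0)                      # fixed seed for reproducibility
z = [1.0 + 0.1 * k for k in range(20000)]   # hypothetical noise-free measurements
p_arrive = 0.8                              # hypothetical: 20% missing rate
gamma, y = simulate_missing(z, p_arrive, rng)

# Sample mean and variance of the arrival flags
mean_gamma = sum(gamma) / len(gamma)
var_gamma = sum((g - mean_gamma) ** 2 for g in gamma) / len(gamma)
print(f"sample mean     {mean_gamma:.3f} (theory {p_arrive})")
print(f"sample variance {var_gamma:.3f} (theory {p_arrive * (1 - p_arrive):.3f})")
```

The missing rate in this sketch is 1 - `p_arrive`, so the 10% to 50% missing rates used in the numerical example would correspond to arrival probabilities between 0.9 and 0.5.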
More properties of the distribution of ( ) are shown in [24]. Substituting the observation equation of system (8) into (10), a discrete-time model of GRNs whose observation equation accounts for missing values is established as system (12). Let ( ) denote the autocovariance matrix of ( ), V ( ) denote the autocovariance matrix of V( ), and ( ) denote the cross-covariance matrix of ( ) and V( ).
To simplify the calculation, Φ( ), Γ( ), and ̃( ) can be broken down into simpler parts. Since the process noise of this system is correlated with the observation noise, the correlation between ( ) and V( ) must be decoupled. According to system (12), adding (17) to the state equation of (12) gives (18), where ( ) ∈ × . Clearly, the last two terms in (18) are the process noise. Since Kalman filtering requires the process noise and the measurement noise to be white, uncorrelated Gaussian noise, the correlation between process noise and measurement noise is considered first. Clearly, ( ) can be chosen as in (21) so that this correlation vanishes.
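The decoupling idea can be checked numerically with a small sketch. It assumes the scalar textbook case without missing values: the process noise w and measurement noise v have cross-covariance S, v has variance R, and the free gain is chosen as J = S/R, so that the modified process noise w - J v is uncorrelated with v. The symbols `S`, `R`, and `J` and all numerical values are hypothetical; the paper's own choice in (21) is not reproduced here.

```python
import random

def sample_cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

rng = random.Random(1)
n = 50000
R = 0.5      # hypothetical measurement-noise variance
S = 0.2      # hypothetical cross-covariance E[w v]

v = [rng.gauss(0.0, R ** 0.5) for _ in range(n)]
# Build w correlated with v: w = (S / R) v + e, with e independent of v,
# so that E[w v] = S.
e = [rng.gauss(0.0, 0.3) for _ in range(n)]
w = [(S / R) * vk + ek for vk, ek in zip(v, e)]

J = S / R                                       # decoupling gain (scalar case)
w_new = [wk - J * vk for wk, vk in zip(w, v)]   # modified process noise

cov_before = sample_cov(w, v)
cov_after = sample_cov(w_new, v)
print(f"cov(w, v)     = {cov_before:.3f}")
print(f"cov(w_new, v) = {cov_after:.3f}")
```

In the matrix case the same argument gives J = S R^{-1}, since E[(w - J v) v^T] = S - J R.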

Main Results
In this section, the Kalman filter is designed to obtain the minimum-variance estimate. First, the expression of the filtering error is calculated; then the Kalman gain is obtained by minimizing the covariance matrix of the filtering error; finally, the recursion of the filtering error is calculated, which completes the design of the Kalman filter.
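The design steps above can be sketched for a scalar system. This is a textbook Kalman filter with intermittent observations, not the filter derived in this paper: it assumes that the arrival flag is known to the estimator, that the process and measurement noises have already been decorrelated, and that the parameters `a`, `h`, `q`, `r`, and `p_arrive` (all hypothetical) are constant.

```python
import random

def kalman_step(x_est, P, y, arrived, a, h, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    Model: x(k+1) = a x(k) + w(k), y(k) = h x(k) + v(k),
    Var(w) = q, Var(v) = r, w and v uncorrelated.
    """
    # Predict the state and its error variance
    x_pred = a * x_est
    P_pred = a * P * a + q
    if not arrived:               # measurement lost: keep the prediction
        return x_pred, P_pred
    # Gain that minimises the posterior error variance
    K = P_pred * h / (h * P_pred * h + r)
    x_new = x_pred + K * (y - h * x_pred)
    P_new = (1.0 - K * h) * P_pred
    return x_new, P_new

rng = random.Random(2)
a, h, q, r = 0.9, 1.0, 0.01, 0.25   # hypothetical model parameters
p_arrive = 0.7                      # hypothetical: 30% missing rate
x, x_est, P = 1.0, 0.0, 1.0
sq_err_filter, sq_err_raw, n_raw = 0.0, 0.0, 0
steps = 5000
for _ in range(steps):
    x = a * x + rng.gauss(0.0, q ** 0.5)            # true state
    arrived = rng.random() < p_arrive
    y = h * x + rng.gauss(0.0, r ** 0.5) if arrived else 0.0
    x_est, P = kalman_step(x_est, P, y, arrived, a, h, q, r)
    sq_err_filter += (x - x_est) ** 2
    if arrived:                                     # raw error on received samples
        sq_err_raw += (y - x) ** 2
        n_raw += 1

rmse_filter = (sq_err_filter / steps) ** 0.5
rmse_raw = (sq_err_raw / n_raw) ** 0.5
print(f"filter RMSE {rmse_filter:.3f}, raw-measurement RMSE {rmse_raw:.3f}")
```

When a measurement is lost, the sketch simply propagates the prediction, so the error variance grows until the next measurement arrives.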

Numerical Example
In this section, an example is provided to show the effectiveness of the proposed method. The dynamics of the network have been studied experimentally in Escherichia coli [25], and the model of the 3-gene repressilator is given as follows:
$\dot{m}_i = -m_i + \dfrac{\alpha}{1 + p_j^{n}}, \qquad \dot{p}_i = -\beta (p_i - m_i),$
where $m_i$ denotes the concentrations of the three mRNAs and $p_i$ denotes the concentrations of the three repressor proteins, $\alpha$ is the feedback regulation coefficient, $\beta$ denotes the ratio of the protein decay rate to the mRNA decay rate, and $n$ is the Hill coefficient, with $i$ = lacI, tetR, cI and $j$ = cI, lacI, tetR.
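To give a feel for the repressilator dynamics, the following sketch integrates the model with a forward Euler scheme. It assumes the commonly used dimensionless form, in which each mRNA decays at unit rate and is produced at rate alpha / (1 + p_j^n), and each protein relaxes towards its mRNA at rate beta. All parameter values, initial conditions, and the step size are hypothetical, and forward Euler is used only for illustration; it is not the discretization method of [26].

```python
def repressilator_step(m, p, alpha, beta, n, dt):
    """Advance mRNA (m) and protein (p) concentrations by one Euler step.

    Gene i is repressed by the protein of gene (i - 1) mod 3, matching
    the cyclic pairing (lacI, tetR, cI) <- (cI, lacI, tetR).
    """
    m_next = [m[i] + dt * (-m[i] + alpha / (1.0 + p[i - 1] ** n))
              for i in range(3)]
    p_next = [p[i] + dt * (-beta * (p[i] - m[i])) for i in range(3)]
    return m_next, p_next

alpha, beta, n = 2.0, 1.0, 2     # hypothetical parameter values
dt, steps = 0.01, 5000           # hypothetical Euler step and horizon
m = [0.2, 0.1, 0.3]              # hypothetical initial mRNA concentrations
p = [0.1, 0.4, 0.5]              # hypothetical initial protein concentrations
for _ in range(steps):
    m, p = repressilator_step(m, p, alpha, beta, n, dt)

print("mRNA:   ", [round(x, 3) for x in m])
print("protein:", [round(x, 3) for x in p])
```

With these values the concentrations stay nonnegative and bounded by `alpha`, which is a quick sanity check before any noise or missing measurements are added.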
The discrete-time GRN model can be obtained based on the method in [26]. Let ℎ = 1, the Hill coefficient be 2, the time delay be 1, and the regulation function take the Hill form $x^2/(1 + x^2)$; the other parameters are taken as in (39). According to system (3), the mRNA and proteins regulate each other and also degrade over time, so the GRNs tend to an equilibrium if there are no noise disturbances, and the unique equilibrium can be checked easily when ( ) = 0; the system's states ( ) and ( ) with ( ) = 0 are shown in Figures 1 and 2.
... designed in this paper is effective for GRNs with missing values and noise correlation.
To test the influence of the missing rate, experiments with four missing rates of 10%, 20%, 30%, and 50% are carried out. In addition, the normalized root mean squared error (NRMSE) [27] is used to indicate the level of influence of the missing rate. The resulting NRMSE values are shown in Table 2.
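As an illustration of how such an error measure can be computed, the sketch below implements one common NRMSE convention: the root mean squared error divided by the range of the true signal. This normalization is an assumption; the definition used in [27] may differ (for example, normalizing by the mean or the norm of the true signal). The function name `nrmse` and the sample data are hypothetical.

```python
import math

def nrmse(truth, estimate):
    """Root mean squared error normalised by the range of the true values."""
    if len(truth) != len(estimate) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - e) ** 2 for t, e in zip(truth, estimate)) / len(truth)
    span = max(truth) - min(truth)
    if span == 0:
        raise ValueError("true signal is constant; range normalisation undefined")
    return math.sqrt(mse) / span

truth = [0.0, 1.0, 2.0, 3.0, 4.0]       # hypothetical true concentrations
estimate = [t + 0.4 for t in truth]     # hypothetical estimates with a constant bias
value = nrmse(truth, estimate)
print(f"NRMSE = {value:.3f}")
```

A value of zero means the estimate matches the true signal exactly, and larger values indicate a larger error relative to the signal's spread.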
As the missing rate increases from 10% to 30%, the NRMSE listed in Table 2 increases only slightly, whereas the NRMSE obtained by the set-membership filtering given in [17] increases greatly with the missing rate. Moreover, at low missing rates the set-membership filtering performs better, but at high missing rates the method proposed in this paper is more appropriate than the set-membership filtering; the cut-off point is roughly 14.66%. Thus, the proposed method is more effective for the GRN filtering problem when the missing rate is high.

Conclusion
In this paper, a discrete model of genetic regulatory networks is introduced, and an observation model with missing values is built to describe them in a mathematical formula; meanwhile, the correlation between process noise and measurement noise is decoupled in theory. Finally, a Kalman filter is designed to obtain stable GRNs. The simulation results show that the proposed method is effective for GRNs with missing values and that, compared with set-membership filtering, the Kalman filter performs better when the missing rate is high.