Statistical monitoring of dynamic processes based on dynamic independent component analysis

https://doi.org/10.1016/j.ces.2004.04.031

Abstract

Most multivariate statistical monitoring methods based on principal component analysis (PCA) implicitly assume that the observations at one time are statistically independent of observations at past times and that the latent variables follow a Gaussian distribution. However, in real chemical and biological processes, these assumptions are invalid because of the processes' dynamic and nonlinear characteristics. Therefore, monitoring charts based on conventional PCA tend to show many false alarms and poor detectability. In this paper, a new statistical process monitoring method using dynamic independent component analysis (DICA) is proposed to overcome these disadvantages. ICA is a recently developed technique for revealing hidden factors that underlie sets of measurements following a non-Gaussian distribution. Its goal is to decompose a set of multivariate data into a basis of statistically independent components without loss of information. The proposed DICA monitoring method applies ICA to the augmenting matrix with time-lagged variables. DICA can show more powerful monitoring performance in the case of a dynamic process since it can extract source signals that are independent of the auto- and cross-correlation of variables. It is applied to fault detection in both a simple multivariate dynamic process and the Tennessee Eastman process. The simulation results clearly show that the method effectively detects faults in a multivariate dynamic process.

Introduction

In most chemical plants, on-line monitoring and fault diagnosis of process operating performance are gaining importance for plant safety and for maintaining yield and quality. An important aspect of the safe operation of chemical processes is the rapid detection of faults or process upsets and the removal of the factors causing such events. Traditionally, statistical process control (SPC) has been used to monitor individual process signals to detect trends, outliers and other anomalies. However, these procedures are of limited use with high-dimensional multivariate data that are strongly cross-correlated. The need to monitor such multivariate processes has led to the development of many process monitoring schemes that use multivariate statistical methods based on principal component analysis (PCA) and partial least squares (PLS). These methods have been used and extended in various applications (Nomikos and MacGregor, 1994; Wise and Gallagher, 1996; Dong and McAvoy, 1996; Bakshi, 1998; Li et al., 2000).

Most multivariate statistical monitoring methods based on PCA implicitly assume that the observations at one time are statistically independent of observations at past times and that the latent variables follow a Gaussian distribution. However, in chemical processes, variables rarely remain at a steady state but rather are driven by random noise and uncontrollable disturbances. These effects make the variables autocorrelated and give the system dynamic properties. This suggests that a process monitoring method needs to take the serial correlations in the data into account. Ku et al. (1995) proposed dynamic PCA (DPCA), which uses an augmenting matrix with time-lagged variables. DPCA can extract the time-series model from the eigenvectors of the covariance matrix that correspond to zero eigenvalues. Because of its simplicity, DPCA has been used in many cases together with other methods. Luo et al. (1999) used multiscale analysis and DPCA for sensor fault detection. Tsung (2000) provided an integrated approach to simultaneously monitor and diagnose an automatically controlled process by using DPCA and a minimax distance classifier. Yoo et al. (2002) proposed a dynamic monitoring method for multiscale fault detection and diagnosis in the wastewater treatment process, which is based on DPCA, the D statistic, and the monitoring of individual eigenvalues of the generic dissimilarity measure (GDM).
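The augmenting matrix with time-lagged variables can be sketched as follows. This is a minimal illustration assuming NumPy; the function name `lagged_matrix`, the `lags` parameter, and the column ordering (current sample first, oldest sample last) are hypothetical choices for this example, not taken from Ku et al. (1995).

```python
import numpy as np

def lagged_matrix(X, lags):
    """Stack each observation with its `lags` previous observations.

    X has shape (n_samples, n_variables). Row i of the result is
    [x(t), x(t-1), ..., x(t-lags)] for t = i + lags, so the output has
    shape (n_samples - lags, n_variables * (lags + 1)).
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    if not 0 <= lags < n:
        raise ValueError("lags must satisfy 0 <= lags < n_samples")
    # Block j holds the data delayed by j samples.
    blocks = [X[lags - j : n - j] for j in range(lags + 1)]
    return np.hstack(blocks)

X = np.arange(10.0).reshape(5, 2)   # 5 samples of 2 variables
Xa = lagged_matrix(X, lags=2)
print(Xa.shape)                     # (3, 6)
print(Xa[0])                        # [4. 5. 2. 3. 0. 1.]
```

Applying PCA to a matrix built this way corresponds to the DPCA idea described above. The number of lags is a modelling choice that this sketch leaves to the user.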

Recently, several works using a state space model have been proposed to capture process dynamics. Negiz and Cinar (1997) proposed a monitoring method that utilizes a state space identification technique based on canonical variate analysis (CVA) to solve the dynamic problem. This method takes serial correlations into account during the dimension reduction step, like DPCA, and uses the state variables for computing the monitoring statistic in order to remove the serial correlation. Russell et al. (2000) evaluated and compared the performance of PCA, DPCA and CVA for detecting faults in a realistic chemical process simulation. They also suggested a CVA-based residual space statistic ($T_r^2$) that gave better overall sensitivity and promptness than the existing PCA, DPCA, and CVA statistics. Simoglou et al. (2002) identified the system states and the state space model parameters using the multivariate statistical projection techniques of CVA and PLS. In their paper, a number of metrics based on Hotelling's $T^2$ statistic are proposed for monitoring the state of the system, and the confidence limits for these metrics are calculated using the empirical reference distribution.

There are other approaches for monitoring dynamic processes efficiently. Kano et al. (2002) suggested a statistical process monitoring method based on the dissimilarity of process data. It is based on the idea that a change of operating condition can be detected by monitoring the distribution of time-series process data because the distribution reflects the corresponding operating condition. Chiang and Braatz (2003) proposed an advanced method to compare distributions, where the modified distance (DI), based on the Kullback-Leibler information distance, is used to measure the similarity of the measured variable between the current operating conditions and the historical operating conditions. They also suggested the modified causal dependency (CD) to measure the causal dependency of two variables. Chen and Liao (2002) developed a new monitoring method, NNPCA, which integrates two data-driven techniques, neural network (NN) and PCA, to handle the nonlinear dynamic process. The proposed technique uses the NN as a nonlinear dynamic operator to remove the nonlinear and dynamic characteristics and applies PCA to generate simple monitoring charts based on the multivariable residuals derived from the difference between the process measurements and the neural network prediction.

More recently, monitoring methods based on independent component analysis (ICA) have been developed (Kano et al., 2003; Lee 2003a, Lee 2003b). The goal of ICA is to decompose observed data into linear combinations of statistically independent components. PCA can only impose independence up to second-order statistics (mean and variance), whereas ICA involves higher-order statistics; that is, it not only decorrelates the data (second-order statistics) but also reduces higher-order statistical dependencies (Lee, 1998). Therefore, an ICA-based monitoring method can give more informative results than a PCA-based one, since ICA can extract the essential independent components that drive a process.
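The gap between decorrelation and independence can be seen in a small numerical example. The sketch below assumes NumPy and uses a constructed pair of variables (not data from the paper): `y` is a deterministic function of `x`, yet their covariance is zero, so a purely second-order method sees no relationship.

```python
import numpy as np

# Symmetric grid, so the odd moments of x vanish (up to rounding).
x = np.linspace(-1.0, 1.0, 201)
y = x**2 - np.mean(x**2)          # zero-mean, fully determined by x

# Second-order view: covariance of x and y (both are zero-mean).
second_order = np.mean(x * y)

# Higher-order view: correlation between x**2 and y exposes the dependence.
higher_order = np.corrcoef(x**2, y)[0, 1]

print(abs(second_order) < 1e-12)   # True
print(round(higher_order, 6))      # 1.0
```

Here `x` and `y` are uncorrelated but strongly dependent, which is the kind of structure that higher-order statistics can capture and covariance-based decorrelation cannot.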

In this paper, ICA monitoring on lagged variables, called DICA monitoring, is suggested for developing dynamic models and improving the monitoring performance. In order to account for autocorrelation, the time-lagged extension of the data matrix is performed before applying ICA. This paper is organized as follows. PCA and DPCA monitoring methods are introduced in Section 2. In Section 3, the DICA monitoring method is explained in detail, together with the ICA algorithm and the monitoring statistics. The superiority of the DICA monitoring method over the PCA, DPCA, and ICA methods is illustrated in Section 4 through two examples: a simple multivariate process and the Tennessee Eastman process. Finally, conclusions are presented in Section 5.

Section snippets

PCA and DPCA monitoring

PCA has been widely used in the field of process monitoring since it can handle high-dimensional, noisy, and correlated data by projecting the data onto a lower-dimensional subspace that contains most of the variance of the original data (Wise and Gallagher, 1996). It decomposes the data matrix into the sum of the outer products of score vectors and loading vectors. Two typical statistical indices, $T^2$ and the squared prediction error (SPE), are used in PCA monitoring (Kresta et al., 1991). A

Independent component analysis (ICA)

ICA is a statistical technique for revealing hidden independent components that underlie sets of random variables, measurements, or signals. In the ICA algorithm, it is assumed that at time $k$ the observed $d$-dimensional data vector $x(k) = [x_1(k), \ldots, x_d(k)]^T$ can be expressed as linear combinations of $m$ unknown independent components, $s_1(k), \ldots, s_m(k)$, given by the model

$$x(k) = A s(k) + e(k),$$

where $A \in \mathbb{R}^{d \times m}$ is the unknown mixing matrix, $s(k) = [s_1(k), \ldots, s_m(k)]^T$ is the independent component vector and $e(k)$ is the

Application

In this section, several monitoring methods, including PCA, DPCA, ICA, and DICA, are applied to monitoring problems of a simple multivariate dynamic process and the Tennessee Eastman process.
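As a point of reference for the PCA baseline, the $T^2$ and SPE indices can be computed along the following lines. This is a minimal sketch assuming NumPy. The synthetic two-factor data, the helper names `fit_pca` and `t2_spe`, and the empirical 99th-percentile control limits are all assumptions made for illustration; they are not the paper's simulation setup or its method of computing confidence limits.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit a PCA model on normal-operation data X (n_samples, n_variables)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    Z = (X - mean) / std                       # autoscale each variable
    eigvals, eigvecs = np.linalg.eigh(np.cov(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]   # largest first
    return mean, std, eigvecs[:, order], eigvals[order]

def t2_spe(X, model):
    """Return the T^2 and SPE values for one or more samples."""
    mean, std, P, lam = model
    Z = (np.atleast_2d(X) - mean) / std
    T = Z @ P                                  # scores in the PCA subspace
    t2 = np.sum(T**2 / lam, axis=1)            # variance-scaled score distance
    spe = np.sum((Z - T @ P.T) ** 2, axis=1)   # squared residual norm
    return t2, spe

# Hypothetical normal-operation data: 5 variables driven by 2 latent factors.
rng = np.random.default_rng(0)
S = rng.standard_normal((500, 2))
A = np.array([[1.0, 0.8, 0.6, 0.2, -0.5],
              [0.3, -0.6, 0.9, 1.0, 0.7]])
X_train = S @ A + 0.05 * rng.standard_normal((500, 5))

model = fit_pca(X_train, n_components=2)
t2_train, spe_train = t2_spe(X_train, model)

# Assumed control limits: empirical 99th percentiles of the training values.
t2_lim = np.percentile(t2_train, 99)
spe_lim = np.percentile(spe_train, 99)

# A bias on one sensor breaks the correlation structure, so the SPE of this
# sample should exceed its limit.
x_fault = model[0].copy()
x_fault[0] += 3.0 * model[1][0]
t2_f, spe_f = t2_spe(x_fault, model)
print(spe_f[0] > spe_lim)
```

The same two functions could be applied to a time-lagged data matrix to obtain a DPCA-style monitor. The ICA and DICA statistics described in the paper are not covered by this sketch.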

Conclusions

In this paper, a new statistical process monitoring method using dynamic independent component analysis is proposed to monitor a process with auto- and cross-correlated variables. Since the goal of ICA is to find a linear representation of non-Gaussian data so that the components are statistically independent up to more than second order statistics, ICA can reveal more useful information than PCA. The proposed monitoring method, DICA, using ICA to the augmenting matrix with time-lagged

Acknowledgements

This work was supported by grant No. R01-2002-000-00007-0 from the Korea Science & Engineering Foundation.

References (33)

  • E.B Martin et al.

    Non-parametric confidence bounds for process performance monitoring charts

    Journal of Process Control

    (1996)
  • A.C Raich et al.

    Multivariate statistical methods for monitoring continuous processes: assessment of discriminatory power disturbance models and diagnosis of multiple disturbances

    Chemometrics and Intelligent Laboratory Systems

    (1995)
  • E.L Russell et al.

    Fault detection in industrial processes using canonical variate analysis and dynamic principal component analysis

    Chemometrics and Intelligent Laboratory Systems

    (2000)
  • A Simoglou et al.

    Statistical performance monitoring of dynamic multivariate processes using state space modeling

    Computers and Chemical Engineering

    (2002)
  • B.M Wise et al.

    The process chemometrics approach to process monitoring and fault detection

    Journal of Process Control

    (1996)
  • B.R Bakshi

    Multiscale PCA with application to multivariate statistical process monitoring

    American Institute of Chemical Engineering Journal

    (1998)