Average Run Length on CUSUM Control Chart for Seasonal and Non-Seasonal Moving Average Processes with Exogenous Variables

The aim of this study was to derive explicit formulas of the average run length (ARL) of a cumulative sum (CUSUM) control chart for seasonal and non-seasonal moving average processes with exogenous variables, and then evaluate it against the numerical integral equation (NIE) method. Both methods had similarly excellent agreement, with an absolute percentage error of less than 0.50%. When compared to other methods, the explicit formula method is extremely useful for finding optimal parameters when other methods cannot. In this work, the procedure for obtaining optimal parameters—which are the reference value (a) and control limit (h)—for designing a CUSUM chart with a minimum out-of-control ARL is presented. In addition, the explicit formulas for the CUSUM control chart were applied with the practical data of a stock price from the stock exchange of Thailand, and the resulting performance efficiency is compared with an exponentially weighted moving average (EWMA) control chart. This comparison showed that the CUSUM control chart efficiently detected a small shift size in the process, whereas the EWMA control chart was more efficient for moderate to large shift sizes.


Introduction
Statistical process control (SPC) has been widely used to monitor processes and services, so as to avoid any instabilities and inconsistencies. The control chart is the main tool for SPC. Page first introduced the cumulative sum (CUSUM) control chart [1] and Roberts initially devised the exponentially weighted moving average (EWMA) control chart [2]. The CUSUM and EWMA control charts were developed from the Shewhart control chart [3], which is suitable for processes with a large shift size in the parameters of interest (the mean or variance) when the observations follow a normal distribution. Meanwhile, the CUSUM and EWMA control chart can detect small shift sizes in the parameters of interest and they are suitable for observations following more complex patterns, such as auto-correlated observations, trending and seasonal observations, and changing point observations [4][5][6].
SPC has been adopted for monitoring production and service processes in several fields, such as medical sciences, industrial manufacturing, network analysis, mechanical trading on securities, and healthcare. A systematic review of the researches on the limitations and benefits of SPC for the quality improvement of healthcare systems can be found in [7,8], and a comparison of SPC to several control charts that were implemented in the manufacturing industry was given by Saravanan and Nagaragan [9]. Moreover, range and X-bar control charts to monitor production at Swat Pharmaceutical Company were researched by Muhammad [10], while SPC in a computer-integrated manufacturing environment was investigated by Montgomery and Friedman [11] and traffic observations for IP networks that were monitored by CUSUM, Shewhart and EWMA control charts were studied and analyzed by Matias et al. [12].
An advantage of the CUSUM control chart is that it is suitable for detecting unstable processes for a subgroup of data and single observations. The following researches focus on the application of the CUSUM control chart for real observations. Sunthornwat et al. [4] designed optimal differencing and smoothing parameters for EWMA and CUSUM control charts to evaluate average run length (ARL) with an autoregressive fractionally integrated moving average (ARFIMA) model. The practical observations were obtained from the time intervals in days between explosions in mines in Great Britain from 1875 to 1951 [13]. Meanwhile, Sheng-Shu and Fong-Jung [14] reported that the CUSUM control chart performed better than the EWMA control chart in monitoring the failure mechanism of wafer production quality control. Benoit and Pierre detected the persistent changes in the mean and variance in the state of marine ecosystems, while using indicators of North Sea cod from the International Bottom Trawl Survey [15]. Their results showed that the CUSUM control chart is suitable for detecting small, persistent changes.
The observations or data in practical situations are often collected from stochastic processes that are dependent on time-space or time series. In other words, the models established under the econometric models are sometimes specified in time-series models. The observations in econometrics as a time-series model comprise of autoregressive (AR) and moving average (MA) elements. Moreover, it is very simple to identify the movement patterns of time-series observations in MA models and, often, the seasonal factor can be embedded in the observations, being modeled as a seasonal moving average (SMA) model. Moreover, the error, which is the difference between the exact value and approximated value, should be considered in the modeling. A smaller error signifies better accuracy. The error of a time-series model, which is called white noise, usually follows a normal distribution, but another form of time-series model with auto-correlated observations is exponential white noise [16][17][18][19][20].
Econometric models are related to the economic indicators or variables that affect economic forecasting. An exogenous variable is a variable that is not affected by other variables in the system. For example, the exogenous variable may depend on the government's investment policies. Exogenous variables that are popularly used in econometric models are the exchange rate, interest rate, and inflation rate, among others. These variables affect the econometric model when forecasting economic situations in the future. When forecasting in economics and other fields, if the forecasting model includes an exogenous variable, the model is usually more accurate than one without it. X denotes the exogenous variable in economic models, and models that are based on an MA or SMA time series with an exogenous variable are denoted as MAX and SMAX, respectively. For the quality control of processes and services, the efficiency of a control chart can be measured in terms of the proposed ARL [5,21]. The ARLs for the in-control state (ARL 0 ) and the out-of-control state (ARL 1 ) are evaluated as the efficiency criteria; a large ARL 0 means that the control chart can be applied to efficiently control the process, whereas a small ARL 1 demonstrates that the control chart can detect a change in the process quickly. The Numerical Integral Equation (NIE) method is widely used for the continuous distribution of observations in real-world applications. Furthermore, Banach's fixed-point theory has been adopted to prove the existence and uniqueness of the ARL in the following researches. Sunthornwat et al. investigated an explicit formula for the ARL on an EWMA control chart for the Autoregressive Fractionally Integrated Moving Average (ARFIMA) model with exponential white noise [4], while Mititelu et al. [22] solved one representing the ARL on a CUSUM control chart based on observations in a hyper-exponential distribution. Their findings show that the explicit formula ARL was more quickly evaluated than the NIE ARL. Petcharat et al. derived an explicit formula ARL, while using an integral equation method on a CUSUM control chart for an MA model with exponential white noise [23], with its existence and uniqueness being proved via Banach's fixed-point theory. However, the optimal parameters could not be obtained in an MA model.
As mentioned in this literature review, the ARL, for measuring the efficiency of a CUSUM control chart, is very important for comparing control chart performance. Moreover, the application of control charts to detect shifts in the processes from autocorrelated data, which needs a high accuracy model with the study of exogenous variables, is very interesting. For example, the MAX and SMAX models are widely used in the field of economics. In addition, finding the optimal values for the parameters in a CUSUM chart is important for these models to detect changes in a process as fast as possible. Therefore, the aim of this research was to evaluate the ARL on a CUSUM control chart of MAX and SMAX processes, and to apply Banach's fixed-point theory to prove the existence and uniqueness of the ARL. The stock price for Aeon Thana Sinsap (Thailand) PCL (AEONTS) with exogenous variables, which are the USD/THB exchange rate and the interest rate, were applied to analyze the ARL of a CUSUM control chart. The rest of this paper is organized, as follows. In the next section, preliminaries regarding the definitions and theory used in the study are given. Section 3 illustrates the explicit formula. Section 4 presents the existence and uniqueness of the explicit formula, the NIE method and the numerical procedure for obtaining optimal parameters for ARL. The computational results, a comparison, and an application to real data are reported in Section 5. Section 6 consists of the conclusion and a discussion of this research.

Preliminaries
In this section, the definitions and theories related to the fixed point theorem, MAX and SMAX processes, CUSUM control chart, and characteristic are proposed, as follows: 2.1. The MAX(q,r) and SMAX(Q,r) L Processes Definition 1. Let {Z t , t = 1, 2, . . . ,} be a sequence of MAX(q,r) process given as and {ζ t , t = 1, 2, . . . ,} be a sequence of SMAX(Q,r) L process given by the expression where ε t is a exponential white noise process, µ is a process mean, is the moving average polynomials in B of order q.
is the seasonal moving average polynomials in B of order Q; L is a natural number, B is the backward shift operator, i.e., B q ε t = ε t−q , and X it is exogenous variable and β i is a coefficient of X it .  (1) ρ(x, y) ≥ 0 i.e., ρ is finite and non-negative real valued function. (Triangular inequality).

Theorem 1. (Banach's Fixed Point
Theorem (see Richard [24])) If C * : M → M is a contraction mapping on a complete metric space (M, ρ), then there exists an unique solution s ∈ M of C * s = s.

CUSUM and EWMA Control Charts for the MAX(q,r) and SMAX(Q,r) L Processes
The CUSUM statistic under the assumption {Y t ; t = 1, 2, 3, . . .}, as a sequence of i.i.d continuous random variables with common probability density function, is considered. The CUSUM statistic (Y t ) is referred to as an upper CUSUM statistic, being based on MAX(q,r) and SMAX(Q,r) L processes. Y t can be expressed by the recursive formula, as where Z t is a sequence of the MAX(q,r) and SMAX(Q,r) L processes with exponential white noise, the starting value Y 0 = s is an initial value; s ∈ [0, h], where h is a control limit and a is usually called the reference value of CUSUM chart. The CUSUM stopping time (τ h ) with predetermined threshold h is defined as Meanwhile, the EWMA statistic for constructing EWMA control chart with smoothing parameter η ∈ (0, 1], mean µ, variance σ 2 , initial value of the process mean: M 0 = µ is defined as where Z t is generated from the MAX(q,r) and SMAX(Q,r) L processes with exponential white noise. The control limits of EWMA control chart consist of Upper control limit: where κ is the width of the control limits.

Characteristics of Average Run Length
Let {ε t , t = 1, 2, 3, . . .} be a sequence of independent and identically distributed random variables with a probability density function f (x) with the parameter λ = λ 0 , which is before a change-point time θ ≤ ∞; the parameters λ 1 > λ 0 are after the change-point time. Generally, the change-point times are considered. The expectation E θ (.) for fixed θ under probability density function f (x) with parameter λ 1 is that the change-point occurs at point θ. The appropriate chart provides a large ARL for θ = ∞. There is the behavior of in-control state of ARL, being denoted by ARL 0, or the state of no change (λ = λ 0 ). The expectation of the run length τ h in the -control state can be defined as Meanwhile, if θ = 1, in the case of the change-point time from λ 0 to λ 1 , then the ARL is evaluated as the out-of -control state of ARL, being denoted by ARL 1 , which can be defined as

The Explicit Formulas for Average Run Lengths with MAX(q,r) and SMAX(Q,r) L Processes
In this section, the derivations of the explicit formulas for the ARL of CUSUM chart, when observations are MAX(q,r) and SMAX(Q,r) L processes with exponential white noise from the integral equations, after checking the existence and uniqueness of the solutions for the ARL, are presented, as follows: The explicit formula C(s) for the ARL of MAX(q,r) process with an exponential white noise is Proof. Let C(s) be the explicit ARL of MAX(q,r) process with an exponential white noise.
Let g be constant as C(s) can be written as Now, constant g can be found as the following Substituting the constant g into Equation (7), then As previously mentioned, the value of exponential parameter λ = λ 0 ,; this implies that the process is an in-control state. Hereby, the explicit analytical solution for ARL 0 can be written as On the contrary, if the process is in an out-of-control state, the value of exponential parameter λ 1 = λ 0 (1 + δ), where λ 1 > λ 0 , and δ is the shift size. The explicit analytical solution for ARL 1 can be written as Theorem 3. The explicit formula D(s) for the ARL of SMAX(Q,r) L process is Proof. Let D(s) be the explicit ARL of SMAX(q,r) process with an exponential white noise.
For s = 0, then Now, constant k can be found as the following Substituting the constant k into Equation (11), then As previously mentioned, the value of exponential parameter λ = λ 0 ,; this implies that the process is an in-control state. Hereby, the explicit analytical solution for ARL 0 can be written as On the contrary, if the process is in an out-of-control state, the value of exponential parameter λ 1 = λ 0 (1 + δ), where λ 1 > λ 0 , and δ is the shift size. The explicit analytical solution for ARL 1 can be written as

Explicit Formulas and Numerical Integral Equation Method for Average Run Length
In this section, the existence and uniqueness of the explicit formulas for the analytical ARLs and numerical integral equation method for numerical ARLs are shown as the following. ρ(ARL n+1 , ARL n ) = ρ(C * ARL n , C * ARL n−1 )

Proof (Uniqueness).
Let Therefore, the explicit formulas ARL on the CUSUM control chart of MAX or SMAX processes with exponential white noise have existence and uniqueness.

The Numerical Integral Equation Method
According to the integral equation in Equations (6) and (10), the numerical integral equation method can evaluate the solution by the Gauss-Legendre quadrature rule as It can be rewritten in matrix form as Therefore, the approximation of average run length is evaluated by the numerical integral equation method for ARL(u) is where a j = h m j − 1 2 and w j = h m , j = 1, 2, 3, . . . , m. An absolute percentage relative error (APRE) criterion is used as a performance criterion and the APRE can be expressed as where ARL(s) exp is the ARL results from the explicit formula and ARL(s) app is an approximation of ARL from the NIE method.

The Numerical Procedure for Obtaining Optimal Parameters for MAX Designs
Step 1. Select an acceptable in-control value of ARL 0 and decide on the change parameter value (λ 1 ) for an out-of-control state; Step 2. For the given values λ 0 and T, find the optimal values of a and h to minimize the ARL 1 values that are given by Equation (9), subject to the constraint that ARL 0 = T in Equation (8), i.e., a and h are solutions of the optimality problem.
In addition, the numerical procedure for obtaining optimal parameters for SMAX designs is the same as the MAX procedure, by using Equations (12) and (13) for ARL 0 and ARL 1 , respectively. Table  shows the optimal (a, h) values for T = 370 and magnitudes of change.

Computational Results and Real Application
In this section, the results of the ARL with the explicit formula and the NIE method are provided and compared.
The ARL results of the proposed explicit formula (explicit) were compared with the NIE method while using the Gauss-Legendre quadrature rule with 1000 nodes for the CUSUM chart, based on the APRE criterion in Equation (14). The numbers in parentheses are CPU time (minutes) with the NIE method.

Numeric Results
The numerical results were obtained from the simulation. The chart was set up with reference value (a) that is greater than s, where s is an initial value (s = 0). The parameter a could be any number, but it combines with the control limit (s) that corresponds to the in-control ARL 0 = 370 and 500. In this paper, the parameter a is varied between 1.5 to 3.0. A comparison of the solution of the explicit formula (Explicit) with the NIE method for the CUSUM chart, when λ 0 = 1, β 1 = 0.5 for MAX (2,1) and SMAX (3,1) 12, are reported in Tables 1 and 2 for ARL 0 = 370, 500, with which they are in good agreement. Notice that the absolute percentage relative error is small. In Tables 3 and 4, we use Equations (8) and (9) to show ARL 0 and ARL 1 for the MAX (2,1) process with parameter a = 3 and the coefficient parameters of the process θ 1 = 0.1, θ 2 = 0.2, and β 1 = 0.5. For Equations (12) and (13), the parameters a = 3, Θ 1 = 0.1, Θ 2 = 0.2, Θ 3 = 0.3, and β 1 = 0.5. were used for the SMAX (3,1) 12 process. The parameter value λ 0 = 1 was applied to the in-control process. Meanwhile, for the out-of-control process (λ 1 > λ 0 ), parameter values λ 1 = λ 0 (1 + δ) were used, with shift sizes of 0.01, 0.03, 0.05, 0.10, 0.30, 0.50, 1.0, 1.5, and 2.0. The first row in both tables shows that the results of ARL 0 with the explicit formula were close to the NIE method, when ARL 0 approached 370 and 500. The values in parentheses are the CPU times of the ARL from both of the methods. The ARL values of the explicit formula and the NIE method were similar and tended to decrease when the level of the shift size increased. Note that the absolute percentage relative error was very small and the CPU time with the explicit formula was just a fraction of a second, while the NIE method took around 11-13 min.  Table 3. ARL values for MAX (2,1) process using explicit formula against numerical integral equation given a = 3, θ 1 = 0.1, θ 2 = 0.2, β 1 = 0.5, h = 3.265 for ARL 0 = 370 and h = 3.588 for ARL 0 = 500. In Table 5, the results in terms of the optimal reference value (a) and optimal width of the control limit (h) and the minimum ARL 1 (ARL * 1 ) of MAX (1,2) and SMAX (2,2) 12 processes for ARL 0 = 370 are shown. For example, if we want to detect a parameter change from λ 0 = 1 to λ 1 > λ 0 and the ARL 0 is 370, then the optimality procedure given above will give the optimal parameter values a = 1.66999118582 and h = 5.9737144930593 and ARL 1 * value = 13.552. The suggested explicit formulas are useful to practitioners, especially when finding the optimal parameters of the MAX and SMAX processes for the CUSUM chart.

Real-World Application
Application to real-world data was conducted to evaluate the ARL by the explicit formula and NIE method, as reported in Table 6. The AEONTS share prices in the SET with two exogenous variables, the US/THB exchange rate, and the interest rate, were collected monthly from January 2012 to December 2016 as the dataset of real observations. The first-order MA model is suitable for fitting the AEONTS share price with two exogenous variables, because the error of estimation is the smallest as compared to other models. Therefore, the first-order MA model was constructed with the process coefficients µ = 650.0555, θ 1 = 0.746083, β 1 = −11.773, β 2 = −61.94972, and the error as exponential white noise with λ 0 = 6.9094. For the ARL performance comparison, the boundary values h = 22.48, 24.71 for the CUSUM control chart and κ = 0.5333, 0.8791 for the EWMA control chart were used with conditions of ARL 0 = 370 and ARL 0 = 500, respectively. The smoothing parameter of the EWMA control chart was set to 0.1. The results in Table 6 are similar to the results in Tables 3 and 4, in that the NIE results approached the explicit formula results. In Table 7, the performance of CUSUM control chart with the explicit formula is compared with EWMA control chart by using the NIE method. The results of the performance comparison show that the CUSUM control chart provided a smaller ARL 1 than the EWMA control chart when the shift size was small, but the EWMA control chart performed better than the CUSUM control chart when the shift size was δ ≥ 0.015.  Table 7. Performance comparison of cumulative sum (CUSUM) and exponentially weighted moving average (EWMA) control charts using explicit formula (Explicit) for MAX (1,2) when λ 0 = 6.9094 for ARL 0 = 370 and ARL 0 = 500.

Conclusions and Discussion
In the theoretical computation, the ARL that was calculated from the explicit formula was in excellent agreement with the ARL obtained from the NIE method with the percentage error at less than 0.50%. However, the CPU time of the NIE method took between 10 and 16 min., whereas that of the explicit formula was less than one second. Moreover, the explicit formula for evaluating the ARL of the CUSUM control chart could not only significantly reduce the computational time, but also obtain the optimal parameters. In addition, the results from the experiment using a real-world dataset were similar to those of the theoretical computation. This shows that the CUSUM control chart is good for detecting processes with a small shift size, while the EWMA control chart can efficiently detect processes with a moderate to large shift size. Thus, it is suggested that the explicit formulas for the ARL of the CUSUM chart have real-world applications for a variety of data processes, including finance, agriculture, hydrology, and environmental. These issues should be addressed in future research. Future research could compare the results of the ARL for the CUSUM control chart with nonparametric control charts, such as the Tukey control chart. Moreover, a variety of data processes could be extended to other models or the explicit formula of ARL could be developed for other observations that correspond to the exponential family.
Author Contributions: All authors contributed significantly to the study and preparation of the article. They have read and approved the final manuscript. All authors have read and agreed to the published version of the manuscript.