SPC methods for time-dependent processes of counts—A literature review

In recent years, there has been increasing interest in SPC methods for time-dependent processes of counts. We survey recent developments in this field: feasible models for autocorrelated count processes are presented, approaches for corresponding control charts are considered, and the topic of process capability indices is briefly discussed. The article is accompanied by a comprehensive list of relevant references and concludes by outlining promising directions for future research.

Subjects: Mathematics & Statistics; Science; Statistical Computing; Statistical Theory & Methods; Statistics & Computing; Statistics & Probability; Statistics for Business; Finance & Economics


PUBLIC INTEREST STATEMENT
In many fields of application, we are concerned with count data processes. Typical examples are counts of defects per produced item in manufacturing industry, counts of new cases of an infection per time unit in health care monitoring, or counts of complaints by customers per time unit in service industry. Often, it is important to detect changes in the process as soon as possible to be able to start preventive actions or to avoid further damages. Methods of statistical process control are a suitable tool for this purpose. The article provides a detailed survey of such methods together with a comprehensive list of relevant references, and it concludes by outlining promising directions for future research.

Introduction
Methods of statistical process control (SPC) help to monitor and improve processes in manufacturing and service industries. For such a process, certain quality characteristics are measured at discrete times $t \in \mathbb{N} := \{1, 2, \ldots\}$, thus leading to a (possibly multivariate) stochastic process $(X_t)_{\mathbb{N}}$ of continuous-valued or discrete-valued random variables (variables data or attributes data, respectively). One of the most important SPC tools is the control chart, which requires the relevant quality characteristics to be measured online. Control charts are applied to a process operating in a stable state (in control), i.e. $(X_t)_{\mathbb{N}}$ is assumed to be stationary according to a specified model (in-control model). As a new measurement arrives, it is used to compute a statistic (possibly also incorporating past values of the quality characteristic), which is then plotted on the control chart with its control limits. If the statistic violates the limits, an alarm is triggered, signaling that the process may not be stable anymore (out of control) and requires corrective actions.

Besides such online monitoring to detect changes in the process, it is also important to analyze to what extent the given target values and specification limits are met by the process in its in-control state. A widely used SPC solution for this purpose are process capability indices, which are also briefly discussed in the present article. Furthermore, another type of application of control charts is considered. The use of control charts for online monitoring, as described before, is commonly referred to as the Phase-II application. But control charts may also be applied in a retrospective manner to already available in-control data, with the aim of characterizing the in-control properties of $(X_t)_{\mathbb{N}}$; this is called the Phase-I application of a control chart. More details about all these terms and concepts can be found in the textbook by Montgomery (2009) and in the survey papers by Woodall (2000), Woodall and Montgomery (2014).
In this article, we shall concentrate on one type of attributes data processes: count data processes, where each $X_t$ has a range contained in the set of non-negative integers, $\mathbb{N}_0 := \{0, 1, \ldots\}$. Typical examples are counts of defects per produced item in manufacturing industry, counts of new cases of an infection per time unit in health care monitoring, or counts of complaints by customers per time unit in service industry. A lot of work has been done regarding such attributes data processes, see the survey by Woodall (1997), but with one important restriction: the large majority of papers about SPC methods for attributes data assume the underlying process to be serially independent in its in-control state, so the counts $X_1, X_2, \ldots$ have to be independent and identically distributed (i.i.d.).
Only in the last few years has research activity concerning attributes data processes with serial dependence increased. The aim of the present paper is to survey these research activities and to outline relevant issues for future research in this area.
At this point, it is important to stress that this lack of interest in autocorrelated attributes data is in sharp contrast to the variables case. After a few scattered works during the 1960s to 1980s concerning the effects of autocorrelation on the performance of variables control charts, a lot of research activity in this direction can be observed since the 1990s, initiated, among others, by the works of Alwan and Roberts (1988), Alwan (1992, 1995). Surveys on control charts for autocorrelated variables data processes are provided by Knoth and Schmid (2004), Psarakis and Papaleonida (2007). Although autocorrelated attributes data were not a topic of research until a few years ago, Alwan (1995) had already shown that autocorrelation is indeed a common phenomenon for attributes data processes. Typical reasons for count data processes to be autocorrelated are high sampling frequency due to automated production environments in manufacturing industry, varying service times (extending over more than one time unit) in service industry, or varying incubation times and infectivities of diseases in health care monitoring.
The delay in working on SPC methods for autocorrelated attributes data processes might have been caused by the problem that simple stochastic models for such processes, i.e. models of comparable simplicity to the well-known autoregressive moving average (ARMA) models for autocorrelated variables data processes, were not known to a broader audience for a long time. Therefore, we start in Section 2 with a brief review of the basic approaches for modeling autocorrelated processes of counts. Section 3 then provides information about the most popular SPC tools, control charts and process capability indices. While Section 3.1 only presents a basic Shewhart chart and puts more emphasis on topics like performance evaluation and the effect of estimated parameters, details on advanced control charts like CUSUM and EWMA methods are presented in Section 4. Finally, we outline possible directions for future research in Section 5.

Basic models for autocorrelated counts processes
In the sequel, several common count data distributions shall be mentioned without presenting further details about them; readers interested in more background information are referred to the book by Johnson, Kemp, and Kotz (2005).
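One of the oldest approaches toward stationary count data processes is the INAR(1) model by McKenzie (1985), Al-Osh and Alzaid (1987), the integer-valued counterpart to the usual autoregressive model of order 1. This model can be understood as a special type of branching process with immigration, and it uses the binomial thinning operator by Steutel and van Harn (1979): If $X$ is a count data random variable and if $\alpha \in (0;1)$, then the random variable $\alpha \bullet X := \sum_{i=1}^{X} Z_i$ is said to arise from $X$ by binomial thinning, where the $Z_i$ are i.i.d. binary random variables with $P(Z_i = 1) = \alpha$ (we abbreviate $Z_i \sim \mathrm{Bin}(1, \alpha)$), which are also independent of $X$. So $\alpha \bullet X$ is conditionally binomially distributed, $\alpha \bullet X \sim \mathrm{Bin}(X, \alpha)$.

Let $(\epsilon_t)_{\mathbb{N}}$ be an i.i.d. process of innovations with range $\mathbb{N}_0$, mean $\mu_\epsilon$ and variance $\sigma_\epsilon^2$. The process $(X_t)_{\mathbb{N}_0}$ is said to be an INAR(1) process if it follows the recursion

$$X_t = \alpha \bullet X_{t-1} + \epsilon_t, \tag{2.1}$$

where all thinning operations are performed independently of each other and of $(\epsilon_t)_{\mathbb{N}}$, and where the thinning operations at each time $t$ as well as $\epsilon_t$ are independent of $(X_s)_{s<t}$.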
Except for using the thinning operator "$\bullet$" instead of the usual multiplication "$\cdot$", recursion (2.1) looks like the usual AR(1) recursion. In fact, it also constitutes a Markov chain with an exponentially decaying autocorrelation function (ACF), $\rho(k) := \mathrm{Corr}[X_t, X_{t-k}] = \alpha^k$, and marginal mean and variance-mean ratio are obtained as

$$\mu = \frac{\mu_\epsilon}{1-\alpha}, \qquad \frac{\sigma^2}{\mu} = \frac{\sigma_\epsilon^2/\mu_\epsilon + \alpha}{1+\alpha}, \tag{2.2}$$

where $\mu_\epsilon$ and $\sigma_\epsilon^2$ denote the mean and variance of the innovations $\epsilon_t$.

Beyond mimicking the typical AR(1)-like autocorrelation structure, the INAR(1) model is particularly relevant for typical tasks of statistical quality control due to its intuitive interpretation (see Weiß, 2007). The thinning operation $\alpha \bullet X$ itself is interpreted as expressing the number of survivors from a population of size $X$, where each individual, independent of the other individuals, has survival probability $\alpha$. So recursion (2.1) is interpreted as

$$\underbrace{X_t}_{\text{population at time } t} = \underbrace{\alpha \bullet X_{t-1}}_{\text{survivors of time } t-1} + \underbrace{\epsilon_t}_{\text{immigration}}. \tag{2.3}$$

Adapted to the application scenarios sketched in Section 1, the "population" at time $t$ might consist of faults in a system or network, of persons being infected by a certain disease, or of unanswered complaints by customers. These might be faults or infected persons or complaints that were already present at the previous time $t - 1$ ("survivors"), or which newly occurred at time $t$ ("immigration").
The most popular case of the INAR(1) family is the Poisson INAR(1) model. Here, it is assumed that the innovations $\epsilon_t$ are Poisson-distributed according to $\mathrm{Poi}(\lambda)$ such that $\mu_\epsilon = \sigma_\epsilon^2 = \lambda$. Then the stationary marginal distribution is also a Poisson distribution, $\mathrm{Poi}\big(\tfrac{\lambda}{1-\alpha}\big)$ (see Al-Osh & Alzaid, 1987), such that the observations also have a variance equal to the mean (the latter property is referred to as equidispersion). In applications, however, one often observes counts having a variance larger than the mean, i.e. having overdispersion (Weiß & Testik, 2011). According to (2.2), such a feature is easily implemented into the INAR(1) model by simply using an overdispersed distribution for the innovations, like a compound Poisson distribution (Schweer & Weiß, 2014) or the Poisson log-normal distribution (Weiß & Testik, 2015a). By the same approach, other nonstandard features like, e.g. zero inflation (excess of zeros) can also be implemented into the INAR(1) model (see Jazi, Jones, & Lai, 2012). Finally, it should be mentioned that higher order INARMA models have also been discussed in the literature, for instance, in Du and Li (1991), Weiß (2008b).
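To make the recursion $X_t = \alpha \bullet X_{t-1} + \epsilon_t$ with Poisson innovations concrete, the following minimal Python sketch simulates such a path using only the standard library. The function names, the burn-in length, and the simple Knuth-type Poisson sampler are illustrative assumptions and are not taken from the cited works.

```python
import math
import random


def binomial_thinning(alpha, x, rng):
    """Return alpha . x: the number of 'survivors' among x individuals,
    each kept independently with probability alpha (a Bin(x, alpha) draw)."""
    return sum(1 for _ in range(x) if rng.random() < alpha)


def poisson_draw(lam, rng):
    """Draw from Poi(lam) by Knuth's multiplication method (adequate for small lam)."""
    threshold = math.exp(-lam)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_inar1(alpha, lam, n, rng, x0=0, burn_in=200):
    """Simulate n values of X_t = alpha . X_{t-1} + eps_t with eps_t ~ Poi(lam).

    The first burn_in values are discarded (an assumed, ad hoc choice) so that
    the returned path is approximately stationary.
    """
    x, path = x0, []
    for t in range(burn_in + n):
        x = binomial_thinning(alpha, x, rng) + poisson_draw(lam, rng)
        if t >= burn_in:
            path.append(x)
    return path


if __name__ == "__main__":
    rng = random.Random(2015)
    path = simulate_inar1(alpha=0.5, lam=2.0, n=5000, rng=rng)
    mean = sum(path) / len(path)
    var = sum((x - mean) ** 2 for x in path) / len(path)
    print(f"sample mean {mean:.3f}, sample variance {var:.3f}")
```

With $\alpha = 0.5$ and $\lambda = 2$, a long simulated path should show a sample mean and a sample variance that are both close to $\lambda/(1-\alpha) = 4$, in line with the equidispersion property of the Poisson marginal.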
Often motivated by the aim of defining an AR(1)-like model for counts with overdispersion, a number of modifications to the basic INAR(1) model (2.1) have been proposed where the binomial thinning operator is replaced by another type of thinning (see Weiß, 2008a for a survey). As an example, Ristić, Bakouch, and Nastić (2009) introduced the negative binomial thinning operator $\alpha * X := \sum_{i=1}^{X} Z_i$, where the $Z_i$ are i.i.d. geometrically distributed counts with mean $\alpha$ that are independent of $X$. Then the innovations' distribution can be chosen in such a way that the new geometric integer-valued autoregressive (NGINAR) process of order 1, defined by

$$X_t = \alpha * X_{t-1} + \epsilon_t, \tag{2.4}$$

is stationary with geometrically distributed marginals having an arbitrary mean $\mu > 0$, provided that $\alpha \le \mu/(1 + \mu)$, and with ACF $\rho(k) = \alpha^k$.
Another popular approach for modeling stationary processes of counts are the INGARCH models, which are particularly attractive for overdispersed counts. The INGARCH model, the integer-valued counterpart to the conventional generalized autoregressive conditional heteroskedasticity model, was introduced by Heinen (2003), Ferland, Latour, and Oraichi (2006). Given the past observations, a conditional Poisson distribution with an ARMA-like recursion for the conditional means is assumed. For the special case of the INARCH(1) model, which constitutes a counterpart to the INAR(1) model discussed before, let us denote the model parameters by $\beta > 0$ and $0 < \alpha < 1$. Then the process $(X_t)_{\mathbb{Z}}$ is said to follow the INARCH(1) model if $X_t$ is conditionally Poisson distributed in the following way:

$$X_t \mid X_{t-1}, X_{t-2}, \ldots \sim \mathrm{Poi}(\beta + \alpha \cdot X_{t-1}). \tag{2.5}$$

The ACF equals $\rho(k) = \alpha^k$ like in the standard AR(1) case, and marginal mean and variance-mean ratio of the INARCH(1) process are given by

$$\mu = \frac{\beta}{1-\alpha}, \qquad \frac{\sigma^2}{\mu} = \frac{1}{1-\alpha^2}. \tag{2.6}$$

There are certainly many alternative approaches for modeling time series of counts, e.g. regression models (Kedem & Fokianos, 2002) or hidden Markov models (Zucchini & MacDonald, 2009), but these shall not be considered further in this text, since they have not been used yet in an SPC context (to the knowledge of the author).
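As a rough illustration of the INARCH(1) mechanism, where each count is drawn from a Poisson distribution with conditional mean $\beta + \alpha \cdot X_{t-1}$, the following Python sketch simulates a path and returns the theoretical mean $\beta/(1-\alpha)$ and variance-mean ratio $1/(1-\alpha^2)$ for comparison. All function names, the starting value, and the burn-in length are assumptions made for this example only.

```python
import math
import random


def poisson_draw(mean, rng):
    """Draw from Poi(mean) by Knuth's multiplication method (adequate for moderate means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def simulate_inarch1(beta, alpha, n, rng, burn_in=200):
    """Simulate n values with X_t | past ~ Poi(beta + alpha * X_{t-1})."""
    x = round(beta / (1 - alpha))  # assumed start: a value near the stationary mean
    path = []
    for t in range(burn_in + n):
        x = poisson_draw(beta + alpha * x, rng)
        if t >= burn_in:
            path.append(x)
    return path


def inarch1_moments(beta, alpha):
    """Return (marginal mean, variance-mean ratio) of a stationary INARCH(1) process."""
    return beta / (1 - alpha), 1 / (1 - alpha**2)


if __name__ == "__main__":
    rng = random.Random(2015)
    path = simulate_inarch1(beta=2.0, alpha=0.5, n=20000, rng=rng)
    mean = sum(path) / len(path)
    var = sum((x - mean) ** 2 for x in path) / len(path)
    print("theory:", inarch1_moments(2.0, 0.5))
    print("sample:", (mean, var / mean))
```

In contrast to the Poisson INAR(1) model, the simulated counts are overdispersed: for $\alpha = 0.5$ the variance-mean ratio is $4/3$, although the conditional distribution is Poisson.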

Control charts
The most common application scenario for control charts is the so-called Phase-II application (also see Section 1 before), i.e. the prospective online monitoring to detect a possible change in the process. The (unknown) time where such a process change first happens is called a change point. To be more precise, we consider the following (unconditional) change point model (Knoth, 2006): For $\tau \in \mathbb{N}$, we assume that $(X_t)_{t<\tau}$ and $(X_t)_{t\ge\tau}$ are stationary processes with distributions abbreviated as $F_0$ and $F_1$, respectively. The time index $\tau$ is the change point, which is not known in practice. For $t < \tau$, the process is said to be in control, while it is out of control for $t \ge \tau$ if $F_1 \ne F_0$.
Applying a control chart, we aim at detecting the unknown change point as early as possible. The simplest control charts are the so-called Shewhart charts, which are based on statistics $Z_t$ being a function only of the most recent observation $X_t$ (or of the most recent sample for a sample-based monitoring). Then $Z_t$ is plotted on a chart against time $t$ with time-invariant lower and upper control limits $l < u$. An alarm is triggered at time $t$ for the first time if

$$Z_t \notin [l;u] \quad \text{while} \quad Z_1, \ldots, Z_{t-1} \in [l;u]. \tag{3.1}$$

An extensive review of Shewhart control charts is given by Montgomery (2009).
Regarding count data monitoring, the so-called c chart is particularly relevant, where simply $Z_t = X_t$, i.e. the counts are directly plotted on the chart as they arrive in time. More advanced control charts, where $Z_t := f_t(X_1, \ldots, X_t; \boldsymbol{\theta})$ is an appropriately chosen measurable function of $X_1, \ldots, X_t$ and of a vector $\boldsymbol{\theta}$ of design parameters, are considered later in Section 4 in more detail. Applications of the c chart to INAR(1) processes (2.1) were considered by Weiß (2007, 2011b), Morais and Pacheco (in press), to NGINAR(1) processes (2.4) by Li, Wang, and Zhu (in press), and to INARCH(1) processes (2.5) by Weiß and Testik (2012).

[…] (2012), Kang and Lee (2014), Kang and Song (2015), while Torkamani, Niaki, Aminnayeri, and Davoodi (2014), Davoodi, Niaki, and Torkamani (2015) considered an underlying INAR(1) process; also see the references in Hudecová, Hušková, and Meintanis (2015). Note that the main difference between such change point tests and the above control charts is that the former are usually applied in an offline manner, to find the location of the change point within the available (and static) time series. Online versions of change point tests, where the in-control model is sequentially tested based on the available data at each time, are presented by Hudecová et al. (2015) for the case of the INAR(1) model (2.1), and by Kirch and Kamgaing (2015) for the case of the INARCH(1) model (2.5).
The essential step before starting process monitoring is to find an appropriate chart design, i.e. appropriate values for the control limits $l < u$ in case of the c chart. Although sometimes criticized (Kenett & Pollak, 2012), the main approach is still to consider appropriately defined mean statistics based on the run length $L$, i.e. an average run length (ARL), where $L := \min\{t \in \mathbb{N} \mid Z_t \notin [l;u]\}$ is defined as the random number of plotted points until the first alarm is triggered. The most common ARL concepts are as follows (Knoth, 2006): Defining $E_\tau[\cdot]$ as the expectation related to the change point $\tau$,
• the zero-state ARL (also initial-state ARL) is defined as
$$\mathrm{ARL} := E_1[L], \tag{3.2}$$
• the expected conditional ARL (also expected or conditional delay) is defined as
$$\mathrm{ARL}^{(\tau)} := E_\tau[L - \tau + 1 \mid L \ge \tau], \tag{3.3}$$
• the steady-state ARL is defined as
$$\mathrm{ARL}^{(\infty)} := \lim_{\tau \to \infty} \mathrm{ARL}^{(\tau)}. \tag{3.4}$$
Obviously, we have $\mathrm{ARL}^{(1)} = \mathrm{ARL}$. For any of these ARL concepts, we refer to the computed ARL value as the in-control ARL (out-of-control ARL) if $F_1 = F_0$ ($F_1 \ne F_0$); the in-control ARL is commonly abbreviated by adding the index "0". A popular approach for chart design is to choose $l < u$ such that the zero-state $\mathrm{ARL}_0$ reaches a prespecified level (expressing the robustness of the chart against false alarms), and then to evaluate the out-of-control performance based on the steady-state ARL (since the value of the change point $\tau$ is not known but it will satisfy $\tau \gg 1$ in many real applications).
It remains to ask how to compute any of the ARL concepts (3.2)-(3.4) given a certain chart design (this question arises in the same way for the advanced control charts discussed in Section 4 below). Certainly, in any case where it is possible at all to simulate the considered type of count data process, ARLs can be approximated based on such simulations with a sufficiently high number of replications (usually at least 10,000). But if $(X_t)_{\mathbb{N}}$ follows a type of discrete Markov model (note that any of the three models (2.1), (2.4), and (2.5) constitutes a discrete Markov chain), then it is often possible to adapt the Markov chain approach (MC approach) as first proposed by Brook and Evans (1972). A detailed description for several types of control charts (including the c chart for INAR(1) processes), together with corresponding software implementations, is provided by the tutorial by Weiß (2011b).
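The idea behind the MC approach can be sketched for a c chart applied to a Poisson INAR(1) process, whose transition probabilities are a convolution of the binomial survivors and the Poisson innovations. The Python sketch below is only an illustration under stated assumptions: it restricts the transition matrix to the states inside the limits and sums the survival probabilities $P(L > n)$, which gives the same value as solving the linear system of Brook and Evans (1972); the function names, the tolerance, and the assumption that $X_1$ follows the stationary marginal are choices made here, not taken from the cited tutorial.

```python
import math


def poisson_pmf(k, lam):
    """P(Y = k) for Y ~ Poi(lam)."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def inar1_transition(k, l, alpha, lam):
    """P(X_t = k | X_{t-1} = l) for the Poisson INAR(1) model:
    Bin(l, alpha) survivors convolved with Poi(lam) innovations."""
    return sum(
        math.comb(l, j) * alpha**j * (1 - alpha) ** (l - j) * poisson_pmf(k - j, lam)
        for j in range(min(k, l) + 1)
    )


def zero_state_arl(lower, upper, alpha, lam, tol=1e-12, max_steps=10**6):
    """Zero-state ARL of a c chart with limits [lower, upper].

    Assumes X_1 ~ Poi(lam / (1 - alpha)), the stationary marginal, and uses
    ARL = sum_{n >= 0} P(L > n), where P(L > n) is the probability that the
    first n plotted counts all stayed inside the limits.
    """
    states = range(lower, upper + 1)
    q = [[inar1_transition(k, l, alpha, lam) for k in states] for l in states]
    mass = [poisson_pmf(x, lam / (1 - alpha)) for x in states]  # P(X_1 = x), x inside limits
    arl = 1.0  # P(L > 0) = 1
    for _ in range(max_steps):
        survive = sum(mass)  # P(L > n) for the current n >= 1
        arl += survive
        if survive < tol:
            break
        size = len(mass)
        mass = [sum(mass[i] * q[i][j] for i in range(size)) for j in range(size)]
    return arl


if __name__ == "__main__":
    print(zero_state_arl(lower=0, upper=10, alpha=0.5, lam=2.0))
```

For $\alpha = 0$ the process reduces to i.i.d. Poisson counts, and the result should agree with the geometric run length mean $1/P(X \notin [l;u])$, which gives a simple plausibility check for the implementation.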
To conclude this section, let us briefly look at the Phase-I application of control charts, and at the related topic of the effect of estimated parameters on the control charts' performance. To design the control charts for use in Phase II, a model for the in-control behavior of the process is required (which is then used for chart design as outlined before). Since in practice, the true in-control model is hardly known, one has to fit a model to a set of historic data which are believed to stem from the in-control model. There are several issues that have to be considered carefully in this context, see the recent survey by Jones-Farmer, Woodall, Steiner, and Champ (2014). Among others, once a data sample for Phase-I analysis is available, it has to be checked if these data can be assumed at all to stem from a unique model, or if, for instance, the data are contaminated by outliers. In the latter case, such outliers have to be excluded from the data before fitting the in-control model. For this task, control charts are often used (especially Shewhart charts), which is known as the Phase-I application of control charts. In Weiß and Testik (2015b), the concrete implementation of the Phase-I analysis for an underlying INAR(1) process is discussed in detail, and the effect of undetected outliers during Phase I on the resulting chart design and performance during Phase II is studied.
Once the available data can be assumed to be "clean", the parameters of the in-control model have to be estimated. The estimated in-control model is then used for chart design for Phase II. Many articles considered the effect of estimated parameters on the charts' performance in Phase II (see Jensen, Jones-Farmer, Champ, & Woodall, 2006 for references), where the properties of the used estimators or the sample size play an important role. In the context of autocorrelated count data processes, this topic was considered by Testik (2011), Zhang, Nie, He, and Hou (2014), Weiß and Testik (2015b) for the Poisson INAR(1) model and diverse types of control charts.

Process capability indices
Saying that a process is in control only implies that it is stationary, following a specified model (see above), but it does not imply that the output of the process meets the given quality requirements. Concerning the latter issue, one has to check, for instance, to what extent the given target values and specification limits are met by the process. If the process is not consistent with the given external specifications, adjustments are necessary such that the new in-control model better agrees with the quality requirements. Process capability indices are a popular tool for evaluating the actual process capability. An introduction to such indices (especially for the variables data case) can be found in the book by Montgomery (2009); the most recent literature survey seems to be the one by Saha and Maiti (2015).
Only a few of the works about capability indices refer to attributes data processes. Perakis and Xekalaki (2005) picked up the idea of considering the actual "proportion of conformance": if the upper specification limit USL describes, e.g. the maximal acceptable number of non-conformities per produced item, then the probability $P(X > \mathrm{USL})$ is compared to a prespecified acceptable probability level $1 - p_0$. Perakis and Xekalaki (2005) considered an index defined by the quotient

$$C_{pc} := \frac{1 - p_0}{P(X > \mathrm{USL})}. \tag{3.5}$$

A related approach designed for the specific level $1 - p_0 = 0.0027$ was proposed by Borges and Ho (2001) as

$$C_f := \frac{1}{3}\,\Phi^{-1}\Big(1 - \tfrac{1}{2}\,P(X > \mathrm{USL})\Big), \tag{3.6}$$

where $\Phi$ denotes the distribution function of the standard normal distribution $N(0, 1)$.

For practice, a relevant question is how to estimate the indices (3.5) and (3.6) from given in-control data (in analogy to the Phase-I analysis discussed before). While Perakis and Xekalaki (2005) considered this task for an underlying i.i.d. process of Poisson counts, Weiß (2012b) extended this work to an underlying Poisson INAR(1) process (2.1), distinguishing between the process capability for the observations or innovations, respectively, from such an INAR(1) process.

Advanced control charts
The basic c chart presented in Section 3.1 allows for a continuous monitoring of a serially dependent count data process, but the statistic plotted on the chart at time t, which is simply the count value being observed at time t, does not comprise information about past values of the process (at least not explicitly, beyond the mere effect of autocorrelation). Therefore, the c chart (as any other Shewhart-type chart) is not particularly sensitive to small or moderate changes in the process. For this reason, several types of advanced control charts have been proposed, where the plotted statistic at time t also uses past observations of the process and hence accumulates information about the process for a longer period of time.

CUSUM charts
The traditional cumulative sum (CUSUM) control chart (Page, 1954), being applied directly to the observations $X_t$ of the process, is perhaps the most natural advanced candidate for monitoring autocorrelated processes of counts, because it preserves the discrete nature of the process by only using additions (but no multiplications). Initialized by a starting value $c_0^+ \ge 0$, the upper-sided CUSUM is defined by

$$C_0^+ := c_0^+, \qquad C_t^+ = \max\big(0,\; X_t - k^+ + C_{t-1}^+\big) \quad \text{for } t = 1, 2, \ldots, \tag{4.1}$$

where $k^+$ denotes the reference value. The starting value is commonly chosen as $c_0^+ = 0$; a value $c_0^+ > 0$ is referred to as a fast initial response (FIR) feature, and it may help to detect an initial out-of-control state more quickly. If $k^+$ and $c_0^+$ are taken as integer values, then $(C_t^+)_{\mathbb{N}_0}$ is also integer valued, or, as another example, if $k^+, c_0^+ \in \{0, 1/2, 1, 3/2, \ldots\}$, then so is $C_t^+$. An alarm is triggered if $C_t^+$ violates the control limit $h^+$ (decision interval).
While the upper-sided CUSUM is mainly designed to detect increases in the process mean, the lower-sided CUSUM, defined by

$$C_0^- := c_0^-, \qquad C_t^- = \max\big(0,\; k^- - X_t + C_{t-1}^-\big) \quad \text{for } t = 1, 2, \ldots, \tag{4.2}$$

aims at uncovering decreases in the mean. If $(C_t^+, C_t^-)$ are monitored simultaneously, then this chart combination is referred to as a two-sided CUSUM chart. An excellent book with a lot of background information about CUSUM charts is the one by Hawkins and Olwell (1998).
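The two CUSUM recursions, $C_t^+ = \max(0, X_t - k^+ + C_{t-1}^+)$ and $C_t^- = \max(0, k^- - X_t + C_{t-1}^-)$, can be sketched in a few lines of Python. The function name and the alarm rule "statistic strictly exceeds the limit" are assumptions of this example; some authors signal already when the limit is reached.

```python
def two_sided_cusum(counts, k_plus, h_plus, k_minus, h_minus, c0_plus=0, c0_minus=0):
    """Run upper- and lower-sided CUSUMs on a sequence of counts.

    Returns (t, C_plus, C_minus) at the first time t (1-based) where one of
    the statistics exceeds its limit, or None if no alarm occurs.
    With integer reference values and starting values, both statistics stay
    integer valued, so the discrete nature of the counts is preserved.
    """
    c_plus, c_minus = c0_plus, c0_minus
    for t, x in enumerate(counts, start=1):
        c_plus = max(0, x - k_plus + c_plus)     # accumulates excess above k_plus
        c_minus = max(0, k_minus - x + c_minus)  # accumulates shortfall below k_minus
        if c_plus > h_plus or c_minus > h_minus:  # assumed alarm rule: strict exceedance
            return t, c_plus, c_minus
    return None


if __name__ == "__main__":
    counts = [4, 4, 4, 8, 8, 8]  # toy data: upward shift from time 4 on
    print(two_sided_cusum(counts, k_plus=5, h_plus=5, k_minus=3, h_minus=5))
```

In this toy example the upper-sided statistic takes the values 0, 0, 0, 3, 6, so the chart signals at time 5, two observations after the shift.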

EWMA charts
Another advanced approach for process monitoring, which is also very popular in applications, is the exponentially weighted moving average (EWMA) control chart dating back to Roberts (1959). The standard EWMA recursion defined by

$$Z_t = \lambda \cdot X_t + (1 - \lambda) \cdot Z_{t-1} \quad \text{with } \lambda \in (0;1], \tag{4.3}$$

however, has an important drawback compared to the CUSUM approach of the previous Section 4.1 if applied to count data processes: it does not preserve the discrete range. Quite the contrary, the range of possible values of $Z_t$ changes in time, which rules out, among others, the possibility of an exact ARL computation by the Markov chain approach (remember Section 3.1). Therefore, Gan (1990) suggests plotting rounded values of the statistic (4.3):

$$Q_t = \mathrm{round}\big(\lambda \cdot X_t + (1 - \lambda) \cdot Q_{t-1}\big), \tag{4.4}$$

which is initialized by $Q_0 := q_0 \in \mathbb{N}_0$. The value $q_0$ might be chosen as the rounded value of the in-control mean. An alarm is triggered if $Q_t$ violates one of the control limits $0 \le l \le u$. Note that the statistics $Q_t$ can take only integer values from $\mathbb{N}_0$.
If the underlying count data process $(X_t)_{\mathbb{N}}$ is a Markov chain, then $(X_t, Q_t)_{\mathbb{N}}$ is a bivariate Markov chain with range $\mathbb{N}_0^2$, so ARLs can again be computed exactly by adapting the MC approach (see Weiß, 2009b for details). In the latter article as well as in Zhang et al. (2014), the particular case of an underlying INAR(1) process (2.1) was considered, while Li et al. (in press) investigated the EWMA approach (4.4) applied to an NGINAR(1) process (2.4).
A possible disadvantage of the rounded EWMA approach (4.4) was presented in Weiß (2011a): especially for small values of $\lambda$, which are generally recommended if small mean shifts are to be detected, one may observe some kind of "oversmoothing", i.e. $Q_t$ becomes piecewise constant in time $t$ and rather insensitive to process changes. Therefore, Weiß (2011a) proposed a modification of (4.4), where a refined rounding operation is used: For $s \in \mathbb{N}$, the operation $s$-round maps $x$ onto the nearest fraction with denominator $s$. For $s = 1$, we obtain the usual rounding operation, while 2-round rounds onto values in $\{0, 1/2, 1, 3/2, \ldots\}$, for example. The resulting $s$-EWMA chart follows the recursion

$$Q_t^{(s)} = s\text{-round}\big(\lambda \cdot X_t + (1 - \lambda) \cdot Q_{t-1}^{(s)}\big).$$

If $(X_t)_{\mathbb{N}}$ is a Markov chain (Weiß, 2011a considered the instance of an INAR(1) process (2.1)), then $(X_t, Q_t^{(s)})_{\mathbb{N}}$ again is a discrete Markov chain, now with range $\mathbb{N}_0 \times \mathbb{Q}_{0,s}^+$, where $\mathbb{Q}_{0,s}^+ := \{ r/s \mid r \in \mathbb{N}_0 \}$ is the set of all non-negative rationals with denominator $s$. So again, it is possible to adapt the MC approach by Brook and Evans (1972) for an exact ARL computation.
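The oversmoothing effect and the refined rounding can be illustrated with a short Python sketch of the recursion $Q_t^{(s)} = s\text{-round}(\lambda \cdot X_t + (1-\lambda) \cdot Q_{t-1}^{(s)})$, where $s = 1$ corresponds to the ordinary rounded EWMA. Exact rational arithmetic is used to avoid floating-point ties; the function names and the tie rule (halves are rounded up) are assumptions of this example.

```python
import math
from fractions import Fraction


def s_round(x, s):
    """Map x onto the nearest fraction with denominator s (ties rounded up; assumed rule)."""
    return Fraction(math.floor(Fraction(x) * s + Fraction(1, 2)), s)


def s_ewma_path(counts, lam, q0, s=1):
    """Return the s-EWMA statistics for the given counts.

    lam should be passed as a Fraction (or int) so that the arithmetic is exact;
    s = 1 gives the ordinary rounded EWMA with integer-valued statistics.
    """
    lam = Fraction(lam)
    q = Fraction(q0)
    path = []
    for x in counts:
        q = s_round(lam * x + (1 - lam) * q, s)
        path.append(q)
    return path


if __name__ == "__main__":
    counts = [7] * 5  # toy data: the level has moved from 4 to 7
    for s in (1, 2):
        path = s_ewma_path(counts, Fraction(1, 10), q0=4, s=s)
        print(s, [str(q) for q in path])
```

With $\lambda = 0.1$ and $q_0 = 4$, the ordinary rounded statistic stays at 4 for every observation equal to 7, whereas the 2-round version moves to 9/2 and then to 5, which illustrates why the refined rounding can be more sensitive for small $\lambda$.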

Jumps chart
The last type of advanced control chart to be presented here is the jumps chart proposed by Weiß (2009c). It considers the "jumps" $J_t := X_t - X_{t-1}$ (Weiß, 2008b), which are particularly sensitive to a reduction of autocorrelation, since this leads to increased jumps. So in view of monitoring changes in the mean and the autocorrelation structure simultaneously, Weiß (2009c) proposed to apply the combined jumps chart, where the counts $X_t$ and jumps $J_t$ are plotted simultaneously on a c chart with limits $0 \le l < u$ and a jumps chart with limits $\mp k$, respectively. If $(X_t)_{\mathbb{N}}$ is a Markov chain (Weiß, 2009c considered the instance of an INAR(1) process (2.1), Li et al., in press that of an NGINAR(1) process (2.4)), then $(X_t, J_t)_{\mathbb{N}}$ is a discrete Markov chain with range $\mathbb{N}_0 \times \mathbb{Z}$, so ARLs can be computed exactly by adapting the MC approach.
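A minimal Python sketch of the combined monitoring rule, assuming that the jumps chart signals when $|J_t| > k$ and that jumps are only available from the second observation on, could look as follows. The function name and the returned labels are hypothetical.

```python
def combined_jumps_chart(counts, lower, upper, k):
    """Monitor counts X_t on a c chart [lower, upper] and jumps J_t = X_t - X_{t-1}
    on a jumps chart with limits -k and +k.

    Returns (t, reason) for the first alarm (t is 1-based, reason is "count" or
    "jump"), or None if no alarm occurs. The c chart is checked first.
    """
    previous = None
    for t, x in enumerate(counts, start=1):
        if not lower <= x <= upper:
            return t, "count"
        if previous is not None and abs(x - previous) > k:  # assumed rule: |J_t| > k
            return t, "jump"
        previous = x
    return None


if __name__ == "__main__":
    # The count 9 is still inside [0, 10], but the jump of +6 exceeds k = 4.
    print(combined_jumps_chart([3, 4, 3, 9, 4], lower=0, upper=10, k=4))
```

The toy sequence shows the intended benefit: an unusually large jump triggers an alarm although every single count lies within the limits of the c chart.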

Conclusions
After having been neglected for a long time, SPC methods for time-dependent processes of counts have attracted rapidly increasing research interest during the last few years. The present article provides a comprehensive survey of recent developments in this field in conjunction with a list of relevant references that is as complete as possible.
We conclude this article by briefly discussing possible directions for future research in the area of SPC methods for autocorrelated attributes data. Up to now, mainly "well behaved" types of count data processes have been considered, especially those having a Poisson marginal distribution. But in view of real count processes as observed, e.g. in epidemiology, future research should also consider phenomena like an excessive number of zeros (zero inflation) or seasonality (the latter leading to a non-stationary but still "regular" in-control behavior). The topic of count data processes having a finite range $\{0, \ldots, n\}$, with a fixed upper limit $n$ reflecting, e.g. the sample size in manufacturing industry or the number of service entities in service industry, would also be very relevant in practice, but has been considered only occasionally up to now (Weiß, 2009a; Weiß & Kim, 2013; Weiß & Testik, 2015b). The same applies to multivariate count data processes; see Bersimis, Psarakis, and Panaretos (2007, p. 523). Even more sobering, it seems that the case of serially dependent processes with the full set of integers $\mathbb{Z} = \{\ldots, -1, 0, 1, \ldots\}$ as their range (Kim & Park, 2008) has not been discussed at all so far in an SPC context. It is also important to emphasize that INAR(1) processes are related to certain queue length processes with an infinite number of servers (see Schweer & Wichelhaus, 2015), so control charts for queueing systems as in Chen and Zhou (2015) and the control charts for autocorrelated counts as described in this article might be mutually enriching.
Besides other types of process models, different approaches for process monitoring and chart design should also be considered in future works. These may cover adaptive sampling procedures (e.g. variable sampling intervals) as discussed in Epprecht, Costa, and Mendes (2003), Montgomery (2009), for instance, or the additional use of runs rules as, e.g. in Alwan, Champ, and Maragah (1994), Acosta-Mejia (1999), Koutras, Bersimis, and Maravelakis (2007). Related to the latter approach, the so-called synthetic control charts attracted a lot of research interest in recent years, but recently also drew sound criticism (Knoth, in press). Concerning chart design, it might be interesting to apply economic design principles (Celano, 2011; Montgomery, 2009) in the context of autocorrelated counts, and the Phase-I analysis for such processes (choice of estimators, effect of parameter estimation, etc.) also deserves more attention.
Finally, much more research effort should be devoted to other types of discrete-valued and serially dependent processes, especially to categorical processes (both ordinal and nominal). There are some works for the special case of serially dependent binary attributes, e.g. the Markov Binary CUSUM chart for continuously monitoring a Markov-dependent Bernoulli process as proposed by Mousavi and Reynolds (2009), or the Markov Binomial EWMA chart for monitoring segments taken from such a Markovian Bernoulli process (see Weiß, 2009d). But if product or service quality, for instance, is classified into more than two categories, then methods for monitoring non-binary but serially dependent categorical processes would be required. See Weiß (2012a) and the references therein for a few first approaches in this direction, while a comprehensive treatment of this area is still pending.