A Review of Control Charts and Exploring Their Utility for Regional Environmental Monitoring Programs

: Industrial control charts are used in manufacturing to quickly and robustly indicate the status of production and to prompt any necessary corrective actions. The library of tools available for these tasks has grown over time and many have been used in other disciplines with similar objectives, including environmental monitoring. While the utility of control charts in environmental monitoring has been recognized, and the tools have already been used in many individual studies, they may be underutilized in some types of programs. For example, control charts may be especially useful for reporting and evaluating data from regional surveillance monitoring programs, but they are not yet routinely used. The purpose of this study was to promote the use of control charts in regional environmental monitoring by surveying the literature for control charting techniques suitable for the various types of data available from large programs measuring multiple indicators at multiple locations across various physical environments. Example datasets were obtained for Canada’s Oil Sands Region, including water quality, air quality, facility production and performance, and bird communities, and were analyzed using univariate (e.g., x-bar) and multivariate (e.g., Hotelling’s T 2 ) control charts. The control charts indicated multiple instances of unexpected observations and highlighted subtle patterns in all of the example data. While control charts are not uniquely able to identify potentially relevant patterns in data and can be challenging to apply in some monitoring analyses, this work emphasizes the broad utility of the tools for straightforwardly presenting the results from standardized and routine surveillance monitoring.


Introduction
Identifying patterns in measurements prompting additional studies are common tasks for scientists and engineers. While statistical hypothesis testing has been the typical approach for this work across disciplines, control charts were introduced in the mid-1900s in industrial manufacturing to quickly alert process engineers and production managers to either existing or potential future deficiencies in the desired qualities of finished products, e.g., [1,2]. The original Shewart chart eventually spread and was later joined by additional procedures, such as the exponentially weighted moving average (EWMA), the cumulative sum (CUSUM), and attribute control charts, to address more nuanced questions and to increase the sensitivity of the techniques to novel sources of variation [2][3][4][5][6][7][8][9]. The suite of control charting techniques has been further broadened to include multivariate and non-parametric approaches and various techniques suitable for non-normal, censored, or auto-correlated data [2].
Since their introduction, control charts have become standard tools in manufacturing and industrial process control [2,8], but these tools are also applicable in other fields of study with similar goals and practices. For example, environmental monitoring also measures indicators to document their current state, applies tools and approaches to detect various types of changes, including subtle differences, trends, or other relevant differences [10][11][12], and uses this information to inform future work [13]. Similar to industrial process control, varying types of data are also collected in environmental monitoring programs [14][15][16][17],

Univariate Control Charts
The first control chart developed was the Shewart univariate x-bar chart [1]. The Shewart chart examines patterns in process means over time and relies on the normality of sample means despite sampling from a non-normal population [31]. The x-bar chart is typically used to flag unusually large (or small) means, but it can also be used to examine individual observations [2]. Commonly, narrow thresholds set at ±2 standard deviations (SD) are called 'warning limits', while wider thresholds, such as ±3 SDs, are called 'control limits', e.g., [37]. Specification limits may also be defined in manufacturing, and probabilitybased limits have also been widely used [2]. Other thresholds, such as the probability limits, may also be used often to extend the sensitivity of the method to additional types of change, including improbable series or runs of observations [38]. While these typical thresholds are often used, there are no restrictions beyond the acceptability of the false-positive, or false negative rates defined by the user.
Other types of univariate charts have also been introduced. For example, cumulative sum of deviations from a long-term mean (CUSUM) charts have been developed [3,39,40]. Several versions of these charts have evolved, but the tabular type tracks two statistics, C + i and C − i , to respectively indicate increases and decreases in deviations from a target value over time [2]. CUSUMs are sensitive to small changes [40] and are also favored when single samples (rather than sample means) are the input variables [2]. CUSUMs can also be more sensitive than x-bar charts. For example, a CUSUM applied to water availability in Australia was triggered 21 years before exceedances were identified with an x-bar chart [41]. CUSUM charts are also typically applied prospectively for Phase II control charting after cleaning the data in a retrospective Phase I analysis [2,42], and in one simulation study, they were robust for up to 50% fewer data points [43]. Potential challenges with the retrospective application of the CUSUM in Phase I analyses have prompted the development of additional techniques, such as the self-starting CUSUM [2,42]. Further details of the CUSUM, including the calculation of its control limits, are available elsewhere, e.g., [2]. The final univariate example examined here, the EWMA chart, is also used routinely in manufacturing [4,6,7]. EWMA charts use exponential weighting, denoted by λ (ranging from 0 to 1), to account for historical measurements [4]. As λ approaches 0, weighting is spread among all previous observations and resembles that with the CUSUM. In contrast, as λ approaches 1, less weight is ascribed to historical observations, and the chart resembles the Shewart chart [4]. The weighting factor λ is typically set between 0.2 and 0.3 [4]. However, a smaller λ can be used to detect smaller shifts, while a larger λ is used to detect larger shifts [2]. Similar to CUSUM, EWMA charts are also typically applied in Phase II control charting and are particularly recommended for analyses of individual observations, including non-normally distributed data [2]. A forecast of a single time-step is, however, also an inherent quality of the EWMA [4], and this property can be used to examine historical data (although the opportunities for dynamic (and contemporaneous) process control [4] will be lost). Since the EWMA is a smoothed and transformed variable, any detected shifts must also be referenced to the raw data [5]. Further details of the EWMA, including the calculation of its control limits, are available elsewhere, e.g., [2].

Control Charts with Raw Concentration Data
Among the literature reviewed for this study, water quality, estimated as chemical concentrations, were commonly examined using control charts [44][45][46][47][48]. The parameters selected for the univariate analyses were vanadium (V) and arsenic (As) measured between 2004 and 2020 at the Lower Muskeg River (site AB07DA0610; Figure 1). These variables were selected for several reasons. All measurements were greater than the detection limits; in bitumen, vanadium is enriched, and arsenic is depleted [49]; and these data are publicly available [50]. While examining these elements alone cannot provide a complete picture of any potential industrial influence on watercourses in the OSR, the changes in them over time and the types of changes present may help to diagnose the causes or highlight the environmental risks. For example, while other drivers, such as natural erosion, also affect these elements, a signal indicating greater industrial contamination in streams in the OSR may be a systematic increase in V over time, coupled with a concurrent decrease in As.
The first (and preliminary analyses) of V and As were performed using an x-bar chart for individual raw observations, an EWMA, and a CUSUM using the 'qcc()' function in the qcc package for R [51] and using the default settings for calculating the control limits ( Figure S1). The raw concentration data were also plotted over time ( Figure S1). These preliminary and supplemental analyses showed a general decrease in the average concentration of V over time and an increase in As over time. The plots also highlight the relative advantages of the chart types. The Shewart charts are most sensitive to large differences and highlight unusual runs, while both the EWMA and CUSUM charts are more sensitive to smaller differences. The EWMA and CUSUM charts also more clearly highlight some of the temporal patterns in the data at shorter durations than the entire set. For example, patterns suggesting seasonality are present in the EWMA and CUSUM charts for As in 2008, 2009, 2012, and 2013. These patterns are also present in the raw data and in the x-bar, but they are more apparent in the EWMA and CUSUM charts. Particular challenges with the water quality datasets available for the OSR, including the lower Muskeg River, include both autocorrelation and unequal intervals between the sampling events. For example, seasonal sampling was performed prior to 2008, monthly sampling was performed between 2010 and 2015, and a mixed design (sub-weekly, submonthly, and monthly sampling) was used after 2017 ( Figure S2). There is also a two-year gap from April 2015 to March 2017 in the water quality dataset examined here ( Figure S2). Discharge in rivers also affects the concentration of many chemical parameters, e.g., [48]. These factors and other limit the potential utility of univariate analyses of these water quality data by increasing the false-positive rate [52,53].

Residual Control Charts
While programs can be designed with additional features to address high false-positive rates [54] or to prospectively plan for other foreseeable challenges [55], statistical tools are also useful for retrospectively addressing some weaknesses. Among some water quality data, using auto-regressive models for data with equal sampling intervals, e.g., [56], and applying the control chart procedure to the residuals are common approaches. , and river discharge location (07DA008; yellow cross); mine project areas shown with white hatching (facilities mentioned in the manuscript text are named in the figure); in situ project areas are shown as white dotted fill; and the Joslyn North mine project area is shown in gray cross-hatching. Particular challenges with the water quality datasets available for the OSR, including the lower Muskeg River, include both autocorrelation and unequal intervals between the sampling events. For example, seasonal sampling was performed prior to 2008, monthly sampling was performed between 2010 and 2015, and a mixed design (sub-weekly, submonthly, and monthly sampling) was used after 2017 ( Figure S2). There is also a two-year gap from April 2015 to March 2017 in the water quality dataset examined here ( Figure S2). Discharge in rivers also affects the concentration of many chemical parameters, e.g., [48]. These factors and other limit the potential utility of univariate analyses of these water quality data by increasing the false-positive rate [52,53].

Residual Control Charts
While programs can be designed with additional features to address high false-positive rates [54] or to prospectively plan for other foreseeable challenges [55], statistical tools are also useful for retrospectively addressing some weaknesses. Among some water quality data, using auto-regressive models for data with equal sampling intervals, e.g., [56], and applying the control chart procedure to the residuals are common approaches. Residual control charts are suggested when data are affected by an external and measurable drivers, or they are auto-correlated [2,57]; they can extend the utility of control charts to additional Environments 2023, 10, 78 5 of 21 data structures, e.g., [58]; and they may be useful for diagnostics in multivariate analyses [59]. However, residual control charts using the x-bar approach may be insensitive to small process shifts, and EWMA and CUSUM processes are recommended for these analyses [2] (and references in [2]). In this analysis, the generalized estimating equation (GEE) was used (appropriate for unequal sampling intervals; [60]) using the AR1 correlation structure and a Gaussian distribution with the identity link function in the geeglm() function in the R package geepack [61]. The data were log 10 -transformed prior to the analysis. Discharge from the 07DA008 location was also included as a covariate [62], but chemical observations after 2019 and in the winter (November-February) for data collected before 2013 were omitted because discharge was not available at the time of analysis, highlighting a potential weakness of this approach for many environmental datasets assembled from multiple data sources.
Response residuals were used as input variables for x-bar, EWMA, and CUSUM charts to examine their relative utility (Table S1). The residual control charts used in this analysis all indicate that neither V nor As is in statistical control after accounting for the effects of discharge and unequal sampling intervals ( Figure 2). Exceedances of the upper control limits (UCLs) and the lower control limits (LCLs) set at ±3 SD of the three example control charts are observed for both parameters. Statistically unlikely runs of data either greater (e.g., As) or less than (e.g., V) mean residual value (>6 sequential observations greater than less than the mean) were also identified in the x-bar chart ( Figure 2). Although EWMA and CUSUM are typically used in Phase II monitoring, changes are especially apparent after~2015 in these two chart types, indicating a lack of statistical control in the datagenerating processes.
While examining these potentially relevant differences can provide some focus for identifying the sources of the exceedances, the patterns apparent in the control charts are also consistent with the advantages of each type of chart. The x-bar is sensitive to discrete exceedances, and EWMA is even more sensitive [2]. CUSUM is also sensitive to long-term patterns in the data (such as greater-than-average vanadium from 2007 to 2015 and lower than average afterward) but less sensitive to larger step-changes [2]. However, the susceptibility of CUSUM to inertia are also apparent, as long-run increases followed by decreases are slow to return to the in-control conditions [63].
Some of these changes in water quality may be associated with industrial activities. Spikes in V in the lower Muskeg River may be associated with local development in 2008 (beginning of Kearl construction; Figure 1), including mine construction and consequent groundwater depressurization (depending on the fate of this water), but earlier construction of the Jackpine mine (started in 2006; Figure 1) is not apparent from the x-bar, EWMA, or CUSUM charts ( Figure 2). However, four of the nine exceedances of the UCL in the V x-bar chart occur in the month of April, suggesting that the changes may still be associated with discharge. Relationships between concentration and discharge in rivers may be poorly defined at extreme values [64], and the performance may also be further diminished when instantaneous discharge information for the moment and location at which a sample is collected is not available. For example, previous work in the Athabasca River has also shown the likely residual influence of discharge despite its inclusion in statistical models describing total metals, such as V [48]. While discharge was used as an opportunistic covariate in this other study, e.g., [48], and it did not benefit from planning for co-located flow adjustments, there are also only weak effects of discharge on both As and V at the lower Muskeg River station ( Figure S3).
Opposite long-term trends (most apparent in the EWMA and CUSUM analyses ( Figure 2) but also shown in the raw concentration data ( Figure S1)) were also observed in V and As. Vanadium was progressively lower than predicted over time, while the concentration of As was progressively greater than predicted over time ( Figure 2). The discordant changes in As and V may suggest an alteration of sources influencing water quality in the Muskeg River from 2004 to 2019. Changes in groundwater interactions with surface waters, bedrock or surficial sediment erosion, seasonal groundwater recharge and discharge ( Figure S4), the fate of mine depressurization water, and activities at individual facilities may also affect these variables, e.g., [65]. While a lack of a clear correlation between As and V ( Figure S5) also suggests the influence of different drivers or sources, the changes in V and As may, however, also be associated with study design changes or alterations in the data-generating process after 2015, despite the continuation of the chemical measurement methodology. Although specifically and clearly diagnosing the causes of these changes is beyond the scope of this work, the control charts have demonstrated their value by clearly highlighting differences after accounting for background sources of variation that may be worth pursuing in further detailed analyses.
Environments 2023, 10, x FOR PEER REVIEW 6 of 21 Muskeg River from 2004 to 2019. Changes in groundwater interactions with surface waters, bedrock or surficial sediment erosion, seasonal groundwater recharge and discharge ( Figure S4), the fate of mine depressurization water, and activities at individual facilities may also affect these variables, e.g., [65]. While a lack of a clear correlation between As and V ( Figure S5) also suggests the influence of different drivers or sources, the changes in V and As may, however, also be associated with study design changes or alterations in the data-generating process after 2015, despite the continuation of the chemical measurement methodology. Although specifically and clearly diagnosing the causes of these changes is beyond the scope of this work, the control charts have demonstrated their value by clearly highlighting differences after accounting for background sources of variation that may be worth pursuing in further detailed analyses.

Multivariate Control Charts
While univariate analyses can be performed using individual parameters, single measurements are rarely performed in most monitoring programs. In contrast, multiple and concurrent measurements are often obtained on various indicators, e.g., [66], and in these cases, multivariate techniques to evaluate all measurements simultaneously may be preferred [2]. Hotelling's T 2 was the first multivariate control chart developed and is analogous to the Shewart x-bar [2]. Other multivariate control charts include the multivariate CUSUM (MCUSUM; [67,68]) and the multivariate EWMA (MEWMA; [69]). While these charts have important advantages over examining many univariate charts, including accounting for correlations among variables, they have weaknesses as well [36]. For example, tools such as MEWMA indicate only the magnitude (e.g., distance) of a shift but not its actual direction [36]. The variable(s) driving the change is/are also obscured [2]. Subsequently, diagnostics are required to determine the origin of the changes and to determine whether the alterations are driven by unusually large or unusually small deviations in the individual input variables ( [69]; although important to multivariate control charting, diagnostics were reserved for the next section in this study on industrial performance and production data (See Section 3)).

Examining PM 2.5 at Three Stations Using Multivariate Control Charts
Along with water quality, another commonly measured attribute of the environment is air quality. Air quality is estimated as chemical concentrations of parameters, such as sulfur dioxide or size fractions of particulate matter measured routinely near industrial facilities or in populated areas [16]. The indicators are compared individually, combined into indices, or compared using multivariate statistics, e.g., [70]. Additionally, although comparisons to guidelines or calculations of air quality indices are common practices for estimating potential health effects [71], other objectives, such as estimating downwind influence and understanding patterns in data, are also common [72][73][74]. The example analyses here were focused on the latter.
Air quality parameters are measured routinely in the OSR by the Wood Buffalo Environmental Association (WBEA) [16]. While there can be unavoidable problems with consistency of a parameter across multiple stations and analyzers going offline occasionally (or data may be otherwise unavailable, such as 24 June-22 September 2016 and 16 September-31 December 2020 at the AMS15 site), the air quality network provides quality-controlled and publicly available data. Unlike water quality sampling, air quality data are typically available with equal sampling intervals. However, as outlined above, unexpected breaks in the dataset may also occur.
In this example analysis, PM 2.5 data were obtained from three stations (Horizon . Prior to analyses with the control charts, weekly means were calculated from the 1-h QA/QC'ed data available from WBEA. The concentrations were used in an autoregressive (AR1) linear model by site for AMS01 and AMS13 with no missing weeks and equal sampling intervals (although some individual 1-h samples were missing). A GEE was used again for the AMS15 station to address the gaps in the dataset at this site (Table S2). The response residuals from each individual model were used in multivariate control charts: T 2 , MEWMA, and MCUSUM. Two MCUSUM procedures, MCUSUM and MCUSUM2, were used and respectively correspond to the Crosier [67] and the Pignatiello and Runger [68] techniques. These multivariate control charts were created using the mult.chart() function in the qcc R package [51] using the default settings. Two sets of control charts were also created. First, all points were included in the initial charts, observations likely associated with known fires were removed, and a second updated analysis was also performed. In the second set of control charts, the GEE was used to account for unequal gaps in the sampling intervals [60].
Similar to the univariate control charts used for individual parameters of water quality, the initial multivariate control charts of mean weekly PM 2.5 from three air stations in the OSR identified some out-of-control conditions ( Figure 3). The Hotelling's T 2 control chart indicated five consecutive out-of-control weeks in 2011 and two in 2016. Additionally, the two largest single deviations for PM 2.5 occurred during these periods. Two well-known forest fires likely affected these patterns. First, the Richardson fire occurred in the spring of 2011 and the Horse River fire occurred in the spring of 2016. The impact of these fires, especially the Horse River fire, have been examined extensively in the peer-reviewed literature, e.g., [76].
Environments 2023, 10, x FOR PEER REVIEW 8 of 21 Similar to the univariate control charts used for individual parameters of water quality, the initial multivariate control charts of mean weekly PM2.5 from three air stations in the OSR identified some out-of-control conditions ( Figure 3). The Hotelling's T 2 control chart indicated five consecutive out-of-control weeks in 2011 and two in 2016. Additionally, the two largest single deviations for PM2.5 occurred during these periods. Two wellknown forest fires likely affected these patterns. First, the Richardson fire occurred in the spring of 2011 and the Horse River fire occurred in the spring of 2016. The impact of these fires, especially the Horse River fire, have been examined extensively in the peer-reviewed literature, e.g., [76]. Other out-of-control (OOC) points were also observed, but they are more apparent in the MEWMA and the two MCUSUM charts. The first clustering occurs in the weeks of 25 June and 2 July 2015 and may be associated with two fires in the Birch Mountains northwest of the Horizon mine (in the Peace basin; Figure S6). A second clustering of OOC  Other out-of-control (OOC) points were also observed, but they are more apparent in the MEWMA and the two MCUSUM charts. The first clustering occurs in the weeks of 25 June and 2 July 2015 and may be associated with two fires in the Birch Mountains northwest of the Horizon mine (in the Peace basin; Figure S6). A second clustering of OOC points in both the MEWMA and both MCUSUM charts occurred in 2014 (weeks of 30 July-20 August 2014). This clustering does not appear related to any proximate forest fires ( Figure S6), but as described next (Section 4), the increased T 2 may be directly or indirectly associated with greater flaring/wasting of sulfur at the Horizon mine (Figure 4). Another anomalous clustering of weeks exceeding the control limits was identified in August 2018 only in the MCUSUM charts and as a single OOC point in the Hotelling's T 2 chart; a spike was also indicated in the MEWMA chart, but no exceedances of the control limits were identified. While more specific analyses may be warranted, OOC points in PM 2.5 , such as those in 2018, may be associated with distant fires hundreds or thousands of kilometers away, e.g., [77]. Importantly, the 'exaggeration' of patterns associated with the inclusion of historical measurements in the MCUSUMs highlights the utility of this approach but also the need for processes to protect against over-interventions. Additionally, the results from the multivariate charts also emphasize the typical avoidance of MCUSUM charts for Phase I analyses [2], but they also suggest how the sensitivity can be beneficial when the tools are used in retrospective analyses. However, the charts can also be used in a pseudo-Phase II approach to illustrate whether and when OOC points were observed, potentially to develop hypotheses to explain changes in the chosen indicators, and to demonstrate opportunities for earlier interventions [41]. A pseudo-Phase II process may be enhanced by comparing the results of the Hotelling's T 2 and other multivariate charts to respectively highlight large and small deviations, but it may also emphasize important distinctions between industrial and ecological control charts. For example, industrial process control occurs in highly controlled, enclosed, and bounded settings, whereas the opposite is often true in natural areas.
From these initial charts, identifying events not associated with oil sands industrial activity suggests that the known fires can be removed from the analyses, and we can progress into the second tier of preliminary analyses. Removing the observations associated with the 2011 and 2016 forest fires reveals many additional OOC points (Figure 3), including three large values in the MEWMA in January 2012, July 2015, and August 2018. The analyses also suggest greater sensitivity of the MCUSUM compared to the MCUSUM2 when the forest fire data are removed.
Several additional points apparent from the analyses of the reduced dataset are worth emphasizing. First, some of these OOC points may still also be associated with other fires, such as the OOC data in July 2015 (Figure 3 and Figure S6), but a large exceedance also occurs in the winter (e.g., 2012). While this difference in the winter may be driven by discrepancies in the PM 2.5 measurements among sampling locations, the combination of unique industrial and weather conditions during these times and any available and detailed chemical profiles of the particulate matter [78] may assist in future diagnostics. These results also show the potential difficulties of prematurely or only using tools such as MCUSUM in Phase I analyses (with ≥68% of observations are OOC (Figure 3)) and suggest the likely need for other analyses of these data using additional tools. Additionally, like other control charts, the results are self-referential and may not have any relevance for objectives, such as implications for human or ecosystem health [30,79] when not calibrated to do so.

Multivariate Control Charts, High Dimensionality, and Diagnostics: Industrial Production and Performance Data for the Horizon Mine
Multivariate control charts are useful for examining datasets comprised of multiple measurements, but two potential issues require special attention. The first is the problem of high dimensionality. Including too many variables in a multivariate control chart can affect the calculation of variance-covariance matrices [80] and reduce their efficiency [2]. To rectify this ISSUE, subsets of variables can be included, or techniques such as Principal Components Analysis (PCA) can be used to reduce the dimensionality of the original dataset [2].
Although multivariate control charts can indicate changes among multiple variables, they can also reduce the interpretability of the patterns requiring diagnostics [36]. Many diagnostic techniques have been proposed, including multiple regression [59], univariate control charts, and decomposing of the control chart statistic by removing one variable at a time [2]. However, some diagnostic approaches can be challenging with charts based on PCA [2] and can erode the straightforwardness typical of control charting. In some cases, straightforward plots of the individual input variables may reveal the measurements likely driving the exceedances [36].
Among the example datasets used here, the production and performance data provided to the Alberta Energy Regulator (AER) by the Horizon mine are closest to the types of data used in industrial control charts [2]. The performance and production data provide general information about the status of a facility and include metrics of mining and primary/secondary separation (e.g., the volume of crude bitumen produced and the mass of mined oil sand), the performance of upgrading equipment (e.g., volume of synthetic crude produced and mass of petroleum coke produced), and environmental performance (e.g., sulfur production, sulfur flared/wasted). These data are provided to the AER as monthly sums. For the analysis here, these data were scaled to daily averages per month and were log 10 transformed prior to analysis.
These analyses of the industrial data were performed using three control charts using the functions and R packages described above. First, Hotelling's T 2 was applied to 15 variables available for the Horizon mine [81]. PCA was also performed to decompose the Horizon dataset and reduce its dimensionality, and two additional control charts were created. From the decomposed data, two additional control charts were produced. First, an x-bar chart was applied to the first PC (capturing~67% of the variability present in the raw data; Table 1) describing mining and synthetic crude production at the Horizon mine (Table 1). A second T 2 chart was also used to examine patterns in the first five PCs (comprising~95% of the variability; Table 1). The second PC was associated with diluent naphtha as fuel and plant use of natural gas, while PC3 to PC5 were associated with flaring/wasting of various substances, such as process gas and sulfur (Table 1). However, the loss of information from using PCA is also apparent from the factor loadings (Table 1).
Multivariate control charts of the industrial production data from the Horizon mine highlight relevant patterns ( Figure 4). First, the exceedances in the three charts produced using the production and performance data correspond, but do not match exactly. For example, the gradual increase in production shown as the exceedances of the LCL and the runs after~2016 in the PC1 x-bar chart are not indicated in the T 2 or the PC T 2 charts ( Figure 4). However, the analyses do show the correspondence between Hotelling's T 2 and the PCA charts for the largest outliers, such as~January to July 2011 ( Figure 4). Overall, the multivariate charts indicate general patterns in the status of input variables and can indicate subtle shifts that are difficult to parse when using multiple univariate charts [2]. Although multivariate options reduce the number of charts needed, they can simultaneously reduce the specificity of the analysis requiring diagnostics. Diagnosing the primary drivers of a multivariate control chart can be performed by viewing plots of the individual input variables or of univariate control charts (Figure 4, Figure S7, and Figure S8). The univariate plots of raw data suggest that the periods of potential warnings in December 2010 (near the UCL) and January 2011 (greater than the UCL) in the PC1 x-bar chart, as an example, are associated with reduced or no production at the facility during the upgrading shutdown from~January to July 2011 ( Figure 4). However, plots of the raw variables compared to the multi-or univariate control charts also show how processing the data alters the sensitivity of a given approach to patterns in the data. For example, the flaring and wasting of sulfur in August 2014 and June 2015 are among the largest measurements in the raw data ( Figure 4) and are flagged in the Shewart x-bar and EWMA charts ( Figure S8), but the 3 multivariate charts do not all highlight both instances (Figure 4). While viewing raw data or univariate control charts can easily suggest the variables driving a multivariate statistic such as Hotelling's T 2 , diagnostics can also be more formal. Omitting individual variables and re-running the analysis can reveal the variables driving the T 2 [2]. Among the OOC points identified with the full T 2 model, most are affected by more than one process/performance variable ( Table 2). For example, the OOC point in January 2011 is affected by multiple variables, including the production of sulfur, crude bitumen, and process gases (Table 2). However, these data may also suggest the role of upgrading in driving unusual plant conditions, consistent with the technological sophistication of this process compared to mining and primary/secondary separation [82]. Table 2. Values of (1 − (T 2 reduced /T 2 full )) used for diagnosing drivers of Hotelling's T 2 statistic at the Horizon mine, where exceedances were observed in the original T 2 control chart from Figure   Utilizing control charts of industrial production and performance data has important implications for interpreting the results of regional environmental monitoring. As discussed in other work [17,83], a production/upgrading shutdown in 2011 may have implications for the deposition of contaminants of concern in snow during 2011 snowpack studies [84,85]. Later reductions in production at Horizon, including February 2012, May 2013, and October 2017, may also have implications for other studies; another slowdown occurred in July 2016 and may be associated with the Horse River fire. The gradual increase in synthetic crude production at the Horizon mine after~2013 and other alterations, such as greater flaring and wasting of sulfur after~2016 (Figure 4 and Figure S8), are also apparent from these analyses, which, along with changes at other facilities e.g., [86], are likely relevant to interpreting the results of regional monitoring studies. Retrospective analyses, such as Phase I control charting applied to industrial data and highlighting periods of changes in production, may focus additional analyses of ambient monitoring using data available from those periods.
Prospectively, designing opportunistic studies and adjusting any ongoing monitoring may also be prompted by routinely examining control charts of the industrial performance data. For example, any observed patterns in the control charts of industrial performance may be tested using indicators that respond quickly. These indicators may be used to document the influence of industrial facilities by identifying any responses to unplanned shutdowns, plant re-starts, or other changes in production.
Longer shutdowns, or slowdowns over a season may also be relevant for designing collections of indicators that respond over longer periods of time, including benthic communities [87]. Similarly, the occurrence of any observed differences in environmental indicators during the 'normal' operational status of a facility or multiple facilities may be used to better contextualize any relevant patterns in the ambient data.
However, there are also some challenges with using routinely reported production and performance data to adjust a monitoring program or contextualize results of field studies. Currently, only monthly production and performance data are available for these facilities. Daily averages (or totals) reported per calendar month are likely not sufficient to identify acute events. For example, known upset events lasting less than one day (e.g., [88]) were not apparent in the control chart of mean daily (calculated from monthly statistics) production and performance data for the Horizon mine. Similarly, longer events spanning the end of one month and the beginning of the next may be obscured by the reporting schedule of the data. Finally, data provided by the AER are also three months out of date. Overcoming these challenges may benefit the detection of the acute or chronic effects of oil sands development and may also improve any state of environment reporting.

Control Charts and Biological Monitoring Data
As already discussed briefly, many previous applications of control charts in environmental monitoring have examined either chemical or physical endpoints. However, biological monitoring is also commonplace, and many of these data may not conform to the assumptions of traditional control charts, e.g., [28]. In these instances, analogs of control charts can be derived from estimating data distributions and their attributes, including gamma, Weibull, lognormal, or other distributions, e.g., [22,24,25]. Others have also developed generic techniques for multivariate data. For example, Anderson and Thompson [30] developed a generic distance-based multivariate control chart to examine benthic communities. The benefit of this approach is its use with any desired dissimilarity measurements, and there is no limit to the number of variables that can be included [30]. While diagnostics are required to determine which variables may be driving any observed exceedances and can have challenges for understanding relevance [89], the approach has been used for community structures in saline lagoons [90] and coastal wetlands [91]. The distances for an age-structured fish population have also been used as an input variable in a CUSUM [92], further highlighting the potential versatility of control charts.
Unlike other types of metrics used to calculate a T 2 chart, for example, chart types proposed for some biological data often use simulations/bootstrapping to calculate the control limits, e.g., [30,93]. The main advantages of using a bootstrapping procedure to calculate a control limit is its versatility and the many versions available, such as the smoothed bootstrapping applicable for small sample sizes [94].

Bird Communities in the Ells River Basin
Exploring the utility of control charts for biological data was performed here using bird community data. The methodology developed by Anderson and Thompson [30] was adopted here to examine changes in bird community data available from Saracco et al. [95] and were collected using methods developed for the Monitoring Avian Productivity and Survivorship (MAPS) program. Data collection for birds occurred in the spring from 2012 to 2019 at the selected example locations. Data for two adjacent sites in the Ells Basin (sites ELBS and ELBN; Figure 1) designated here at the test sites were compared to data from the MAKR reference site. The MAKR location is roughly 4-5 km from the ELBS and ELBN test sites but is the least disturbed from the MAPS data [95]. Relative to the MAKR location, the physical habitat disturbance within 5 km of the ELBS and ELBN sites were 2.6 and 3.03 times greater in 2018 (the year used to gauge disturbance pressure [95]).
Following Anderson and Thompson [30] multivariate (Euclidean) distances for each of the test sites were calculated and used to further calculate two test statistics relative to a two-year baseline period (2012 and 2013; BL; d b t ) and to predict each additional year of data over time (OT; d t ) compared to all previous years [30]. Additionally, following [30], thresholds to interpret the test statistics used simulations from data from the MAKR site. Using kernel smoothers of capture rates of each bird species, simulations of 10,000 eightyear random (and reference) bird communities were generated. The UCL was set to the 90th percentile of the 10,000 iterations. The median was also calculated.
Observed distances compared to the expected range of possible distances for each location suggested no exceedances of the 90th percentile UCL in the comparisons over time (e.g., d t ; Figure 5). However, a run of four d b t values from 2016 to 2019 at the ELBN exceeded the UCL computed for the first two years from the MAKR data (Figures 1 and 5). These exceedances could be related to the changes in the Ells Basin after development of the Joslyn North mine was halted in May 2014 [96].
interpretation of data [28]. However, a standardized and simplified presentation can also obscure the details embedded in the analyses, including the validity of any assumptions made. The approach also requires reference data, which may be challenging to obtain, especially in the OSR, where multiple stressors, such as physical disturbances of habitats, may often be accompanied by atmospheric deposition of contaminants of concern [96].
While often not necessary for identifying some types of differences, including known spills, for data collected as covariates, or for data not intended for a direct comparison among treatments, control charts have desirable attributes for further expanding their use in environmental monitoring [13]. The tools are well suited for quickly identifying subtle step-changes, gradual trends, or other unusual or non-random patterns in surveillance data. The tools can be applied to grouped data or can be used to examine individual observations [2]. Control charts can ingest many different types of data, their versatility can be expanded by pre-processing, and both stock and custom control charts can be used [30,114]. Control charts may also be useful for presenting preliminary results of data collected from instrumented watersheds [41].
While the tools are broadly useful and the use of control charting in surveillance monitoring is possible, there are some types of programs that may benefit from the approach. For example, control charts may be especially useful in large-scale and long-term regional monitoring programs examining many physical environments, in which study questions are tiered (e.g., surveillance and investigations of cause), measurements are routinized and planned, and a filtering mechanism is needed to categorize results [29,54,55]. Control charts (and the concept) may also be well suited when relatively uniform The analyses of the bird community data highlight some of the main advantages for adopting control charts in biological monitoring. The goals of control charting align closely with the goals of environmental monitoring, including identifying unusual states in environmental indicators. Another advantage of incorporating control charts into environmental monitoring is the potential for standardizing the presentation and interpretation of data [28]. However, a standardized and simplified presentation can also obscure the details embedded in the analyses, including the validity of any assumptions made. The approach also requires reference data, which may be challenging to obtain, especially in the OSR, where multiple stressors, such as physical disturbances of habitats, may often be accompanied by atmospheric deposition of contaminants of concern [96].
While often not necessary for identifying some types of differences, including known spills, for data collected as covariates, or for data not intended for a direct comparison among treatments, control charts have desirable attributes for further expanding their use in environmental monitoring [13]. The tools are well suited for quickly identifying subtle step-changes, gradual trends, or other unusual or non-random patterns in surveillance data. The tools can be applied to grouped data or can be used to examine individual observations [2]. Control charts can ingest many different types of data, their versatility can be expanded by pre-processing, and both stock and custom control charts can be used [30,114]. Control charts may also be useful for presenting preliminary results of data collected from instrumented watersheds [41].
While the tools are broadly useful and the use of control charting in surveillance monitoring is possible, there are some types of programs that may benefit from the approach. For example, control charts may be especially useful in large-scale and long-term regional monitoring programs examining many physical environments, in which study questions are tiered (e.g., surveillance and investigations of cause), measurements are routinized and planned, and a filtering mechanism is needed to categorize results [29,54,55]. Control charts (and the concept) may also be well suited when relatively uniform reporting across multiple indicators with varying attributes is desired by managers and stakeholders to assess the state of the environment, e.g., [34]. As practices improve and better programs are planned and implemented in the future, e.g., [33], control charting may also become a standard approach to simplify any initial reporting of the routine data. Additionally, regular, current, and expeditious analyses of industrial performance data with control charts and the widespread availability of those results may also greatly improve the efficiency and design of monitoring programs.
Although there are benefits to adopting control charting within regional monitoring, the approach is neither infallible nor universally applicable. For example, some control charts are susceptible to the directional inertia of serial observations [63], and the causes of changes can be difficult to diagnose, especially with multivariate charts [2]. Additionally, not all activities considered 'monitoring' may require probabilistic control charts, such as comparisons of chemical concentrations to well-established, robust, and relevant guidelines. Furthermore, although activities in which conventional statistical approaches and hypotheses are tested are typically used may be the best targets for gaining the benefits of control charts, the new tools will likely compete against those established methods [2,29,112,[115][116][117][118]. In some or many applications, the conventional tools may also be sufficient. For example, simple linear regressions identified temporal trends in V and As ( Figure S2). Additionally, control charts and the control limits may have limited or no inherent ecological or social relevance [89] and cannot be used to make management decisions, e.g., [41], unless explicitly imbued with these properties [29].
Other challenges are also likely with using control charts. Confidently transitioning from the Phase I (retrospective) to Phase II (prospective) analyses [2] in uncontrolled, natural environments can be challenging. However, Phase I retrospective analyses are intended to eliminate special causes [2] and for instigating deeper studies, including either initiating new focused collections of data or obtaining and analyzing other existing and relevant data. As we have seen, accounting for natural background covariation may simplify these analyses, but complex, arduous, and data-intensive physical models or pre-processing of data may also eventually be required. Despite these difficulties, planning for the use of control charts, e.g., [29], and for the eventual transition to Phase II analyses can greatly improve the utility of environmental monitoring.
Another potential challenge is a discrepancy between the actual and estimated falsepositive rates [9,100,119]. Control charts can also be susceptible to high false-positive rates, especially if autoregressive models are not used where they are likely useful. Similarly, breaks in datasets can also complicate analyses. For example, both missing PM 2.5 data from the Horizon air quality station and time delays in releasing data used as covariates, such as discharge data for the lower Muskeg, can affect the suitability of default control charts for analyzing data. Expanding analyses to multiple stations (as for air) or to multiple semiindependent datasets (as for water quality) may be especially challenging. Additionally, the challenges of combining semi-independent datasets, such as discharge and water quality, will be further affected when these analyses are not planned. The tools and presentation of the results can also obscure the technical details of the underlying analyses, and the tools can require well-defined reference or baseline data [28].
Despite the potential challenges with control charts, the tools have broad applicability within environmental monitoring. However, as with most approaches, understanding their advantages and limitations is necessary. As described above, control charts are well suited for reporting the results of routine surveillance monitoring obtained from long-term programs [13] and could be included as one part of a suite of techniques.
Supplementary Materials: The following supporting information can be downloaded at: https:// www.mdpi.com/article/10.3390/environments10050078/s1. Table S1. Regression table for vanadium and arsenic. Generalized estimating equations used to calculate response residuals used in Figure 2. Table S2. Regression results for PM 2.5 at three stations in the oil sands regions used to calculate response residuals for multivariate control charts in Figure 3; AR1 indicates the correlation coefficient between adjacent observations calculated using the geeglm() function in the geepack R package; Lag1 variable is the lagged PM 2.5 for previous week used as an input variable in the full regression models with no missing weeks. Figure S1. Control charts of raw concentrations of vanadium and arsenic at the lower Muskeg station, plus the raw concentrations plotted over time with a linear model. Figure S2. Time intervals between sampling events at the lower Muskeg River site between 2004 and 2021. Figure S3. Relationships of the concentrations of vanadium and arsenic with discharge in the lower Muskeg River from 2004 to 2019; statistical relationship described with a generalized additive model (with its 95% confidence interval). Figure S4. Mean daily elevation of groundwater at the sampling well 07DAG051 in the Muskeg drainage area (57.237790845 • N, −111.449408386 • W). Figure S5. Relationship between the concentrations of vanadium and arsenic (mg/L) at the lower Muskeg River site between 2004 and 2019. Figure S6. Areas of forest fires from 2010 to 2020 in northern Alberta showing the oil sands administrative area along with all project areas. Figure S7. Scaled (0 to 1) performance and production mean daily values (per month) for the Horizon mine; red symbols show out-of-control measurements using all variables and the T 2 chart (see Figure 4 in main text). Figure S8. Univariate control charts (x-bar, EWMA, CUSUM) for industrial production and performance measurements for the Horizon mine from January 2010 to December 2020; red symbols indicate exceedances of upper control limits (UCLs); blue symbols indicate exceedances of lower control limits (LCLs); yellow symbols indicate runs of >6 serial observations greater or less than the mean in the x-bar chart.
Funding: This research received no external funding.

Data Availability Statement:
All data used in this study are publicly available; websites are provided in the references.