A Study on the 
 
 
 
 X 
 ¯ 
 
 
 
 and S Control Charts with Unequal Sample Sizes

The control charts based on X ¯ and S are widely used to monitor the mean and variability of variables and can help quality engineers identify and investigate causes of the process variation. The usual requirement behind these control charts is that the sample sizes from the process are all equal, whereas this requirement may not be satisfied in practice due to missing observations, cost constraints, etc. To deal with this situation, several conventional methods were proposed. However, some methods based on weighted average approaches and an average sample size often result in degraded performance of the control charts because the adopted estimators are biased towards underestimating the true population parameters. These observations motivate us to investigate the existing methods with rigorous proofs and we provide a guideline to practitioners for the best selection to construct the X ¯ and S control charts when the sample sizes are not equal.


Introduction
Control charts, also known as Shewhart control charts [1][2][3], have been used to determine if a manufacturing process is in a state of control. In particular, theX and S charts have been widely used to monitor or detect the mean and variability of variables. Here, a variable is a quality characteristic measured on a numerical scale. For example, variables include continuous measurement process data such as length, pressure, width, temperature, and volume, in a time-ordered sequence.
Due to their importance and usefulness in real life applications, these traditional types of univariate and control charts have still received much attention in the literature. We observe that these control charts are usually adopted for continuously monitoring numerous data and for solving the problem of process control in the Industry 4.0 framework; see, for example [4][5][6], among others. Of particular note is that [7] developed a nice qcr package in R to generate Shewhart-type charts and obtained numerical results of interest for a process quality control. More recently, based on the concept of data depth, Ref. [8] proposed a novel alternative way for constructing control charts when the critical to quality (CTQ) variables of the process are functional and also developed the Phase I and II control charts for stabilizing and monitoring the processes, respectively.
It deserves mentioning that these traditional Shewhart control charts mentioned above consist of the upper and lower control limits (for short, UCL and LCL) and the center line (CL). It is noteworthy that the American Standard is based on CL ± 3 · SE control limits with an ideal false alarm rate (FAR) of 0.27% while the British Standard is based on CL ± 3.09 · SE with an ideal FAR of 0.20%, where SE denotes the standard error.
The usual requirement behind these control charts is that the sample sizes from the process are all equal. In practice, however, it is often the case that this requirement can not be met due to wrong or missing observations in collecting them. In such setting, the three conventional approaches below are widely used to deal with unequal sample sizes: (i) A weighted average approach in calculatingX andS 2 .
(ii) Control limits based on an average sample size. (iii) A weighted average in calculatingS.
For more details, see Subsection 6.3.2 of Montgomery [9] and Subsection 3.8.B of ASTM [10]. The first approach uses variable-width control limits which are determined by the sample-specific values such as n i , A 3 , B 3 , and B 4 . To estimate the scale parameter, a weighted average of sample variances is calculated first and then its square-root is taken to estimate the scale parameter. The second approach uses fixed-width control limits which is based on the average of the sample sizes. For more details on these two methods, see Subsection 6.3.2 of Montgomery [9]. The third approach is very similar to calculatingS 2 in the first approach. However, it uses a weighted average of sample standard deviations directly to estimate the scale parameter. For more details, see Subsection 3.8.B of ASTM [10]. It is known that these three approaches may be satisfactory when the sample sizes are not very different. Given that the average of the sample sizes is not necessarily an integer in general, a practical alternative to the second approach is the use of a modal sample size.
However, when using these ad hoc approaches above, the parameter estimators are biased and they actually underestimate the true population parameters as will be shown in Remark 2 in Section 2 and Remarks 3 and 4 in Section 3. These underestimating ad hoc approaches could result in degraded performance of the control charts. Nonetheless, these biased methods are widely covered in many popular textbooks. These observations motivate us to clarify these conventional methods and investigate other existing methods, especially when the samples are not equal in size. Through the rigorous proofs, we provide a guideline for the best selection of the methods to construct theX and S control charts. This paper is organized as follows. In Section 2, we provide two location estimators and four scale estimators with unequal sample sizes and show that they are all unbiased. In Section 3, we provide the variances of the estimators considered in this paper and show the inequality relations among them. In Section 4, we provide the relative efficiency of the methods and conduct simulation results to compare the performance of the location and scale estimators. In Section 5, we illustrate how to construct various Shewhart-type control charts (i.e., S, S 2 , andX charts) using the provided estimators. In Section 6, we provide the empirical estimates of the average run length (ARL) and the standard deviation of the run length (SDRL) through using the extensive Monte Carlo simulations. Three real-data examples are presented in Section 7 for illustrative purposes. Some concluding remarks are given in Section 8.

Estimation of Process Parameters with Unequal Sample Sizes
In this section, we provide two location estimators and three scale estimators for the process parameters under the assumption that each sample has different sample sizes. In parametric statistical quality control, the underlying distribution is used to construct the control charts. A quality characteristic is assumed to be normally distributed, which is most widely used in most practical cases. Under this assumption, we show that the estimators provided in this section are all unbiased.
We assume that we have m samples and that each sample has different sample sizes. Let X ij be the ith sample (subgroup) of size n i from a stable manufacturing process, where i = 1, 2, . . . , m and j = 1, 2, . . . , n i . We also assume that X ij are independent and identically distributed as normal with mean µ and variance σ 2 .

Location Parameter
Using the ith sample above, the sample mean and the sample variance are given bȳ where i = 1, 2, . . . , m. Montgomery [9] provides two location estimators of the population mean parameter µ in Equations (6.2) and (6.30) in his book, which are given bȳ where n i ≥ 2 and N = ∑ m i=1 n i . These grand averages can be used as the CL on theX chart. Since E(X i ) = µ for i = 1, 2, . . . , m, it is easily seen that E(X A ) = µ and E(X B ) = µ, showing that these two estimators are unbiased. In addition, the variances ofX A andX B are given by Var( It should be noted that, for the case of an equal sample size, we haveX A =X B =X, whereX = ∑ m i=1X i /m.

Scale Parameter
It is well known that S 2 i is an unbiased estimator of σ 2 . However, S i is not an unbiased estimator of σ as below. Since (n i − 1)S 2 i /σ 2 is distributed as the gamma with shape (n i − 1)/2 and scale 2, we have Then, we have where This shows that S i /c 4 (n i ) is actually an unbiased estimator of σ. Note that c 4 (·) is the normal-consistent unbiasing factor, which is a function of the sample size and this c 4 notation was originally used in ASQC [11] to the best of our knowledge. For more details on c 4 (·), the interested reader is referred to Vardeman [12]. In Appendix A, we also provide an approximate calculation of c 4 (·) which can be used for practically easier calculation.
Thus, we can estimate σ using S i /c 4 (n i ) which are are all unbiased. Analogous to Equation (1), one can useS which is clearly unbiased for σ. (3), the estimator below is also unbiased for σ. In addition, we consider the following unbiased estimator proposed by Burr [13] This estimator is actually the best linear unbiased estimator (BLUE) and we provide the proof below. Theorem 1. The estimatorS C in Equation (7) is the BLUE.
Proof. First, we consider a linear unbiased estimator in the form of ∑ m i=1 w i S i . Then, its variance and expectation are given by To obtain the BLUE, we need to minimize which can be easily solved by using the method of Lagrange multipliers. The auxiliary function with the Lagrange multiplier λ is given by It is immediate from ∂Ψ/∂w k that 2w k {1 − c 4 (n k ) 2 } − λc 4 (n k ) = 0, which results in Multiplying c 4 (n k ) to Equation (8) and then making the sum of the two sides, we have Since ∑ m i=1 w i c 4 (n i ) = 1, we first solve the above for λ and then substitute it into Equation (8), which provides which results inS C = ∑ m i=1 w i S i . This completes the proof.

Remark 1.
It should be noted that, for the case of an equal sample size (n 1 = n 2 = · · · = n m = n), we can easily show thatS One can also incorporate the pooled sample variance in estimating σ given by where N = ∑ m i=1 n i again. However, S p is not unbiased for σ although S 2 p is. This is because Based on this, Burr [13] suggested the following unbiased estimator of σ, Remark 2. The weighted average approach introduced in Subsection 6.3.2 of Montgomery [9] uses the pooled sample standard deviation S p from Equation (9) to estimate σ. However, since c 4 (x) < 1, which will be shown in Lemma 1, it is immediate from Equation (10) that S p clearly underestimates the true parameter σ.
We have introduced the four unbiased scale estimators of σ which are denoted byS A ,S B ,S C , andS D . A natural question appears: which of the four estimators should be recommended for estimating σ in practical applications? In the following section, we clarify this question by providing a guideline in terms of inequalities of their variances of the estimators under consideration.

Inequalities of the Variances of the Scale Estimators
We first obtain the variance ofS A in Equation (5), which is obtained as Using Equation (3) and the unbiasedness property of S 2 i , we have Substituting Equation (13) into (12), we have Similarly, the variance ofS B in Equation (6) is easily obtained as and that ofS C in Equation (7) is also obtained as Finally, for the case ofS D , we have Using Equation (11) and the unbiasedness property ofS D , we can rewrite Equation (17) as Since S 2 p is also unbiased for σ 2 , we have Next, as aforementioned, we here answer the question of how to choose the best one among the four estimators by comparing their variances of these four scale estimators. To be more specific, we prove the inequality relations among the four scale estimators as follows.
Lemma 2 (Chebyshev's sum inequality). If a 1 ≥ a 2 ≥ · · · ≥ a m and b 1 ≥ b 2 ≥ · · · ≥ b m , then we have Similarly, if a 1 ≤ a 2 ≤ · · · ≤ a m and b 1 ≥ b 2 ≥ · · · ≥ b m , then we have Proof. First, we will consider the case where both a i and b i are increasing. In this case, both (a i − a j ) and (b i − b j ) have the same sign, or at least one of them can have the zero value. Then, the value of (a i − a j )(b i − b j ) is positive or zero for any i and j, which results in After the tedious algebra of the above, we have Next, we consider the case when a i is increasing and b i is decreasing. In this case, (a i − a j ) and (b i − b j ) have different signs, or at least one of them can have the zero value. Then, the value of (a i − a j )(b i − b j ) is negative or zero for any i and j. Similar to the above approach, we have This completes the proof.
It should be noted that the above inequality name is coined after Pafnuty Lvovich Chebyshev (1821-1894) who mentioned it in a brief note [17]. He provided it in an integral form and his original proof can be found in Chebyshev [18]. For more details, the readers are also referred to Besenyei [19] and Section 2.17 of Hardy et al. [20].

Theorem 2. We have
Var(S A ) ≥ Var(S B ).
Proof. For convenience, we rearrange the sample sizes so that n 1 ≤ n 2 ≤ · · · ≤ n m . Then, it is immediate from Equations (14) and (15) that it suffices to show Since c 4 (x) is increasing from Lemma 1, it is easily seen that 1/c 4 Applying Lemma 2 again with a i = c 4 (n i ) (increasing) and Comparing Equations (22) and (23), we have This completes the proof.

Theorem 3. We have
Var Proof. We have the variances ofS B andS C from Equations (15) and (16), which are given by For convenience, we let i that the inequality in Equation (24) holds. This completes the proof.
Thus, log θ(y) is convex so − log θ(y) is concave. Since log y is also concave, log(y/θ(y)) = log y + − log θ(y) is concave. This implies that y/θ(y) is log-concave. The log-concavity of y/θ(y) guarantees that it is concave. This completes the proof.

Proof. From Equations
Thus, it suffices to show Using Equation (25) in Lemma 3, we have and where θ(x) is defined in Equation (20). Comparing Equations (27) and (28), we need to show is decreasing from Watson [14], the above inequality in Equation (26) holds. This completes the proof.
In combination with the inequalities in Theorems 2-4, we have the following result: Proof. First, we will show that c 4 (x) is concave. Taking the logarithm of c 4 (x) in Equation (19), we have It is well known that the second derivative of log Γ(x) can be expressed as the sum of the series For more details, see Merkle [27] and Section 11.14 (iv) of Schilling [28]. Using Equation (30), we have This completes the proof.

Remark 3.
It is worth mentioning that one can argue an alternative way based on an average sample sizē n = ∑ m i=1 n i /m. For example, see Section 6.3.2 of Montgomery [9]. In such setting, one may consider the following quotient for the estimator of σS * =S c 4 (n) , However, this estimator is not unbiased because from Lemma 4,S * can underestimate the true parameter σ. However, for the case of an equal sample size (n 1 = n 2 = · · · = n m = n), we haveS * =S/c 4 (n) so thatS A =S B = S C =S * . Thus, in this special case of an equal sample size,S * is unbiased.

Remark 4.
There is another alternative in Subsection 3.8.B of ASTM [10], which is based on the weighted average of sample standard deviations given bȳ Then, we have It deserves mentioning that the estimator above still underestimates the true parameter σ because c 4 (x) < 1 from Lemma 1. In addition, for the case of an equal sample size, we haveS w =S, whereS = 1 m ∑ m i=1 S i .
Then,X andS E = S N /c 4 (N) are the uniform minimum variance unbiased estimators of µ and σ, respectively. Thus, we have Proof. One can obtain the uniform minimum variance unbiased estimator (UMVUE) using complete sufficient statistics as described in Theorem 7.3.23 of Casella and Berger [29]. For more details on the UMVUE, one can refer to Definition 1.6 of Lehmann and Casella [30].
The control charts we have developed are under the assumption that X ij are independent and identically distributed as normal with mean µ and variance σ 2 which leads to the joint complete sufficient statistics for µ and σ 2 given byX = 1 are the UMVUEs of µ and σ, respectively. This completes the proof.

Remark 5.
It is noteworthy that, analogous to Equation (18), we have which also results in

Remark 6.
AlthoughS E can attain the minimum variance, we do not adopt this estimator to construct the control charts. The main reason can be discussed as follows. Consider the case that X ij are independent and identically distributed as normal with different means µ i for j = 1, 2, . . . , n i and variance σ 2 . Then, the pooled sample variance, S 2 p , is a complete sufficient statistic. See Example 2.3 in Section 2.2 of Lehmann and Casella [30]. This implies thatS E is better under the null hypothesis H 0 : µ 1 = µ 2 = · · · = µ m (in control), whereasS D is better under the alternative (out of control). Thus, if the process is out of control, the control charts usingS E could have wider control limits, which may result in an increase in the rate incorrectly signaling that the process is in-control when the process is actually out of control.
In addition, one can think of the case of non-constancy of σ. In this heteroscedasticity case, Burr [13] mentioned thatS C seems preferable toS D . We think that this case should be investigated more thoroughly in a sequel paper.

Comparison of the Performance
In this section, we provide the relative efficiency of the methods to compare their performance. We also carried out Monte Carlo simulations to compare the empirical biases and variances.

Relative Efficiency
When we compare the performance of unbiased estimators (say,θ 1 andθ 0 ), the relative efficiency (RE) is widely used in the statistics literature. See Section 2.2 of Lehmann [31] for more details. The RE ofθ 1 with respect toθ 0 is given by whereθ 0 is a reference estimator. In general, the estimator with the smaller variance of the two estimators is used as a reference estimator so that RE(θ 1 |θ 0 ) ≤ 1.
To estimate the location parameter, we consideredX A in Equation (1) andX B in Equation (2). Then, the RE ofX A with respect toX B is easily obtained as Note that RE(X A |X B ) ≤ 1 where the equality holds if and only if n 1 = n 2 = · · · = n m due to the inequality of the arithmetic mean and the harmonic mean [32].
For the case of the scale parameter, we considered the five estimators. Among them,S E has the smallest variance. Thus, it can be used as a reference to compare the performance of the scale estimators and the RE is then given by It is immediate from Equations (14)- (16), (18), and (34) that we have It can be easily seen from Equations (29) and (34) In particular, when n 1 = n 2 = · · · = n m , we have RE(S A ) = RE(S B ) = RE(S C ) ≤ RE(S D ).
We have considered the RE to compare the performance of the above unbiased estimators. However,S * in Equation (31) andS w in Equation (33) are not unbiased as mentioned in Remarks 2 and 3, respectively. In addition,S in Equation (32) is not unbiased as easily seen from Equation (3). Thus, it is reasonable to consider the mean square error (MSE) to obtain the RE since the MSE can be regarded as a overall measure of bias and dispersion. For other utilization and modification of the RE, one can refer to Park et al. [33,34] and Ouyang et al. [35] which considered the ratio of the determinants of the covariance matrices, that is, the generalized variances [36,37]. Since the MSE is the same as the variance for unbiased estimators (S A ,S B ,S C , andS D ), the RE based on the MSE is the same as that based on the variance in this unbiased case. In addition, the variance ofS E is the same as its MSE. Thus, we consider the RE ofS,S * , andS w as follows. We denote them by We next obtain their biases which are easily obtained using Equation (3) Bias In addition, the variances are also easily obtained by using Var(S i ) = σ 2 1 − c 4 (n i ) 2 so that we have Considering that the MSE is the variance plus the squared bias, we can obtain the RE based on the MSE

Empirical Biases and Variances
The RE is a useful statistical tool to compare the performance of unbiased estimators. However, the conventional methods such as in Equations (31)-(33) are biased. To compare these with the methods provided here, we obtain the empirical biases and variances by carrying out Monte Carlo simulations.
We consider two cases: an equal sample size and unequal sample sizes. We first generated samples of equal size with n 1 = n 2 = n 3 = 3, with n 1 = n 2 = n 3 = 10, and with n 1 = n 2 = n 3 = 20. We next generated samples of unequal sizes with n 1 = 3, n 2 = 5, n 3 = 7, with n 1 = 5, n 2 = 10, n 3 = 15, and with n 1 = 10, n 2 = 20, n 3 = 30. Again, let X ij be the ith sample (subgroup) of size n i . Then, X ij were generated from the normal distribution with mean µ 0 = 100 and standard deviation σ 0 = 10 and we obtained the scale estimates including the unbiased estimators (S A ,S B ,S C ,S D ,S E ) and the conventional methods (S,S * ,S w ). In order to obtain the empirical biases and variances of these estimates, we repeated this simulation ten million times (I = 10 7 ) and the results are summarized in Table 1 for the case of an equal sample size and Table 2 for the case of unequal sample sizes. In addition, the MSEs along with the squared empirical biases (red) and variances (light blue) are plotted in Figure 1 for the case of an equal sample size and in Figure 2 for the case of unequal sample sizes. In the figures, S A ,S B ,S C ,S D ,S,S * , andS w are denoted by A, B, C, D, S, S*, and Sw, respectively. Comparing the simulation results in Table 1, we can observe that the empirical results ofS A ,S B ,S C , andS * are the same for the case of an equal sample size. These results are quite reasonable as pointed out in Remark 3. We can notice that the empirical biases are not exactly zero, although quite negligible, because these biases are due to a random phenomenon of Monte Carlo simulation. In addition,S and S w have the same results as pointed out in Remark 4. On the other hand, for the case of unequal sample sizes, they all have different results.
For the case of an equal sample size, the empirical variances and biases are noticeably different for a smaller sample size, but all of them are getting closer as the sample size is increasing as also shown in Figure 1. For the case of unequal sample sizes, the empirical biases are getting smaller as the sample sizes are increasing. However, the variances are still noticeably different even with large sample sizes. Comparing the empirical variances only,S w performs very well, whereas it is seriously biased. Thus, it is reasonable to compare the MSEs along with the biases and we can conclude thatS D is overall the best. Note that we do not recommend the use ofS E , even thoughS E always has the best results in all the measures. This is because it can lead to degraded performance when the process is out of control, as aforementioned in Remark 6.

Construction of the Control Charts with Unequal Sample Sizes
We briefly introduce how to construct the control charts and then discuss how to implement the estimators provided here to construct the S chart in Section 5.1 and improve the S chart using the probability limits in Section 5.2. We also discuss the construction of theX chart in Section 5.3.
In general, we construct statistical quality control charts based on two phases [9,38], usually denoted by Phase-I and Phase-II. Then, one can establish control limits with a set of stable manufacturing process data in Phase-I. Then, we monitor the process in Phase-II using the control limits obtained in Phase-I. We assume that we have m samples from a stable manufacturing process (Phase-I) and each sample has different sample sizes, denoted by n i where i = 1, 2, . . . , m. Then, we monitor the process with a sample of size n k (Phase-II).

The S Chart
From the statistical asymptotic theory (for example, Corollarys 6-10 of Arnold [39]), we can have an approximate distribution where S k is the sample standard deviation with sample size n k . In order to construct the CL ± 3 · SE control limits, we can set up (S k − E(S k ))/SE(S k ) = ±3. Solving this for S k , we have Since σ is generally unknown, we need to estimate σ. One can useS A in Equation (5),S B in Equation (6),S C in Equation (7), andS D in Equation (11). UsingS A , we can construct the S chart with sample size n k as follows: where we assign zero to LCL if it is negative. Next, usingS B , we construct the S chart as follows: In addition, usingS C , we can construct the S chart as follows: Of particular note is that, for the case of an equal sample size (n 1 = n 2 = · · · = n m = n) in Phase-I, it is easily seen that UCL A = UCL B = UCL C , CL A = CL B = CL C , and LCL A = LCL B = LCL C . Thus, we have the S chart below with sample size n k to use in Phase-II We assign zero to LCL if it is negative. Furthermore, if we assume n k = n, we have . This is a well-known S chart introduced in the quality control literature. For example, see the chart in Equation (6.27) of Montgomery [9]. This indicates that the proposed S chart includes the existing S chart as a special case. UsingS D , we construct the S chart as follows: Unlike the previous cases, this control chart is not the same as the one in Equation (36) even for the case of an equal sample size.

The S and S 2 Charts with Probability Limits
We can improve the above S charts by using probability limits as mentioned in Section 4.7.4 of Ryan [40]. It follows from the following Chi-square distribution result where χ 2 γ,ν is the γth upper quantile of the Chi-square distribution with ν degrees of freedom. Rewriting Equation (37) about S k , we then have Thus, we can construct the S chart with probability limits such that UCL = σ{χ 2 α/2,n k −1 /(n k − 1)} 1/2 , CL = σ, and LCL = σ{χ 2 1−α/2,n k −1 /(n k − 1)} 1/2 . In practice, since σ is unknown, we need to estimate σ. Thus, with the estimatorσ, we can obtain In the above, one construct the S chart with probability limits by usingS A ,S B ,S C orS D instead ofσ. Next, we consider the construction of the S 2 chart. We rewrite Equation (37) about S 2 k and we then have Using the above along withσ 2 = S 2 p where S 2 p is the pooled sample variance denoted in Equation (9), we can also construct the S 2 chart as follows: Note that none ofS 2 A ,S 2 B ,S 2 C , orS 2 D is unbiased for σ 2 , whereas S 2 p is unbiased. Thus, it is not recommended to use any of the four estimators to construct the S 2 chart.

TheX Chart
From the statistical asymptotic theory, we havē whereX k is the sample mean with sample size n k . In order to construct the CL ± 3 · SE control limits, we can set up (X k − E(X k ))/SE(X k ) = ±3. Solving this forX k , we have Since µ and σ are unknown in practice, we need to estimate them. With the estimatesμ andσ, we have To estimate µ, one can useX A defined in Equation (1) orX B defined in Equation (2). To estimate σ, one can use any ofS A ,S B ,S C , andS D .
As an illustration, usingX B andS A , we haveX chart as follows: Similarly, we can construct variousX charts using a total of eight combinations of µ and σ estimators.

Remark 7.
It should be noted that, whenS D is used, theX chart is somewhat different from the above chart and is given by As shown in Section 4,S D performs better thanS A ,S B , andS C . However, to the best of our knowledge, theX chart based onS D has not been widely used probably because the calculation of the normal-consistent unbiasing factor c 4 is difficult especially with a large argument value. Most textbooks provide the values of c 4 (n) only for n ≤ 25. In Appendix A, for an easy calculation, we provide an approximation for c 4 (n) which is highly accurate within one unit in the ninth decimal place for n > 25. Thus, using this, one can easily calculate the LCL and UCL of theX chart based onS D .

Average Run Length and Standard Deviation of Run Length
To compare the performance of the control charts based on the various estimators, we obtained the empirical estimates of the ARL and the SDRL through using the extensive Monte Carlo simulation method. For this simulation, we considered theX chart. In this chart, we only estimated the location withμ =X B because the RE ofX B is better than that ofX A as shown in Section 4. For the scale, we considered seven different estimates (S A ,S B ,S C ,S D ,S,S * ,S w ) and we denoted the charts based on these scale estimates by A, B, C, D, S, S*, and Sw, respectively.
We assume that we have m samples (n 1 , n 2 , . . . , n m ) in Phase-I. Again, let X ij be the ith sample (subgroup) of size n i . Then, X ij 's were generated from the normal distribution with mean µ 0 = 10 and standard deviation σ 0 = 5 and we obtained the location estimate (X B ) and the seven scale estimates. Using these estimates, we constructed the seven control charts based on CL ± 3 · SE control limits with FAR 0.27%. Then, we monitored the process with a new sample of size n k from the same normal distribution and obtained the run length in Phase-II. We repeated this simulation one million times (I = 10 6 ) to obtain the run lengths and then estimated the ARL and SDRL based on these run lengths. Note that the simulation results are the same as the ones under different parameter values of µ 0 and σ 0 . It deserves mentioning that the results are quite reasonable, since the normal distribution is a location-scale family, whereas the results are somewhat dependent on the number of samples and the combination of the sample sizes.
We considered Scenarios I-IV for the cases of unequal sample sizes and Scenario V for the case of an equal sample size as a reference. Upon each of these scenarios, we estimated the ARL and SDRL as described above. The simulation results are provided in Table 3. Note that the ideal ARL under the normal distribution is around 1/0.0027 ≈ 370 with FAR 0.27%. However, this ideal value is obtained when using the true µ 0 and σ 0 . In practice, we need to estimate these parameters in Phase-I. Thus, with such an uncertainty due to estimation, the target ARL can be different from the ideal ARL of 370. We observed that the empirical results of A, B, C, and S* have the same value under the case of an equal sample size (Scenario V) and these results have the same tendency when the RE measure is used in Section 4.2. These results are expected as pointed out in Remark 3. The estimated ARL of D (362.58) is very close to that of A, B, C, and S* (364.36). In addition, the estimated SDRL of D (537.31) is very close to that of A, B, C, and S* (541.45). This minor difference may be due to a random phenomenon of the Monte Carlo simulation. Thus, it is quite reasonable to assume that the target ARL is around 360 and the target SDRL is around 540.
In what follows, we analyze and compare our results based on these target values. Note that we did not consider the values from the control charts based on S and Sw because they have much smaller ARL values than the others, which is mainly due to the fact that they have serious negative bias even with the case of an equal sample size as seen in Table 1.
In Scenario I, there is a serious difference in terms of the sample sizes (3,10,17). The ARL values of the charts based on A and B seriously overshoot the target value and their SDRLs are far above the target value. The ARL values of the charts using S and Sw seriously undershoot the target value while the ARL of the chart using S* has a minor underestimate. These results are closely related to the RE. In Table 2, S andS w have noticeable negative values of the biases whileS * has a decent negative value. This implies that the scale estimates underestimate the true value, which results in a narrower control chart. Thus, using a narrower control chart, one can have a smaller ARL. When it comes to the SDRL, the SDRLs of the charts using A and B seriously overshoot the target and those using S and Sw undershoot the target. The chart using S* overshoots the target. These results are similar to those with the RE.
In Scenarios II-IV, we have similar observations as noticed in Scenario I. Because there is a less serious difference in sample sizes, the observations are not so dramatic as seen in Scenario I, but their tendencies are quite similar. For example, in Scenario IV, the sample sizes are very slightly different, so these results are quite close to those in Scenario V (equal sample size).
In Scenario V, as mentioned earlier, we considered this as a reference. In this case, we can also observe that the charts using S and Sw have the same results and their ARL values seriously undershoot the target value. In Remark 4, we pointed outS w =S for the case of an equal sample size. We can also observe this in Figure 1a. As mentioned earlier, their underperformance is due to a serious negative bias as seen in Table 1. The results show that the charts based on C and D perform very well. However, when the samples are very small (small number of samples with small sample size), we expect that the chart based on D can be slightly better than that based on C as shown in Figure 1. However, in practice, with a decent size of samples, one can use the charts based on either C or D.

Illustrative Examples
Here, we provide three real-data examples to illustrate the applications of the proposed methods into the control charts. All computations were analyzed using the R language [41,42]. The R functions for theX and S charts can be obtained in Appendix B.

Example 1.
We consider the data set presented earlier in Table 30 in Section 3.31 of ASTM [10]. The data sets were obtained from ten shipments whose sample sizes were not equal. Using the R functions provided in Appendix B, we can obtain the following results. In the R function, A, B, C, and D denote the control limits based onS A ,S B ,S C ,S D , respectively. The R function Xbarchart() uses X B as a default for the CL of theX chart sinceX B performs better as seen in Section 4.1.  The control limits are calculated using Methods A-D and the results are summarized in Table 4. The results using Methods A-C are quite close but those using Method D are slightly different from others.

Conclusions
In this paper, we have considered several unbiased location and scale estimators for the process parameters of theX and S control charts when the sample sizes are not necessarily equal. These estimators are essential for constructing the control limits of the Shewhart-type control charts. A natural question is: among these unbiased estimators, which one should be recommended in practical applications? We clarified this question by providing the inequality relations among the variances of these estimators through the rigorous proofs. We also showed that the conventional ad hoc methods could result in degraded performance of the control charts, mainly because the adopted estimators are all biased and they actually tend to underestimate the true scale parameter.
We also provided the relative efficiency of the methods along with the conventional methods and the empirical estimates of the ARL and the SDRL through using the extensive Monte Carlo simulations. We observed that the chart based onS D outperforms the others under consideration from both theoretical and numerical points of view. The only difficulty of usingS D lies in calculating the normal-consistent unbiasing factor c 4 for a large argument value (large sample size) without using a professional software. To resolve this problem, we provided an approximation of the c 4 as a function of a sample size which can be easily calculated with a general calculator for the case of a large argument value and is accurate within one unit in the ninth decimal place.
All of the theoretical results revealed an interesting and useful connection between the two fields of statistics and mathematics. For example, the normal-consistent unbiasing factor c 4 can be expressed as a function through using the Watson representation, which helps one to understand the behavior of the c 4 in depth. We thus expect that the new findings about the c 4 can help quality engineers develop more useful results in various engineering statistics fields.
It is noteworthy to mention that all the theoretical and empirical results of this paper require the assumption of the homoscedasticity case of σ. In an ongoing work, we investigate these estimators more thoroughly for the heteroscedasticity case.
Author Contributions: C.P. developed methodology and R functions; M.W. investigated mathematical formulas; and C.P. and M.W. wrote the manuscript. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: Many textbooks provide the table for the values of c 4 (n). However, to our knowledge, this is very limited to the case for n ≤ 25. To construct the control charts using the proposed methods, it is important to calculate c 4 (n) accurately, especially for n > 25.
The calculation of c 4 (n) needs to calculate the gamma function or the factorial which runs into overflow errors for a large value of n. To avoid this problem, one can use the log-gamma function to calculate c 4 (n), or an approximation technique for c 4 (n) can be used for an easier and simpler calculation. For example, two well-known approximations below are widely used for c 4 (n): For more details, refer to Chapter 3 of ASTM [10]. These approximations are based on the Stirling's formula [43] and accurate within one unit in the fourth and fifth decimal places for n > 25, respectively. However, we can obtain a better approximation using the Wallis' inequality. For more details on the Wallis' inequality, one can refer to Wallis [15], Stedall [44], and Kazarinoff [45]. Here, we provide two approximations such that c 4c (n) ≈ 1 n − 1 · 4 n 2 − 3n + 5 2 and c 4d (n) ≈ 1 n − 1 · 8 n 4 − 6n 3 + 14n 2 − 15n + 6, which are accurate within one unit in the seventh and ninth decimal places, respectively, for n > 25. For more details, see Mortici [16].
To compare the approximations above with the true value, we calculate the relative error times 10 6 of each approximation which is given by where j = a, b, c, d. The relative errors for n = 10, 20, 30, 40, 50 are calculated and summarized in Table A1. As shown in the table, c 4d (n) provides the best approximation and the accuracy gets better for a larger value of n as expected. We can also observe that the approximation is quite accurate even for smaller values of n so that it can be practically used for n ≥ 10, say. This approximation can be useful for field engineers and practitioner because c 4d (n) can be calculated using a regular calculator.