A New Result on Regular Designs under Baseline Parameterization

Abstract

The study of designs under the baseline parameterization has attracted attention in recent years. This paper focuses on two-level regular designs under the baseline parameterization. A general result on the relationship between K-aberration and the word length pattern is developed.

Qin, M. and Zhao, Y. (2024) A New Result on Regular Designs under Baseline Parameterization. Open Journal of Applied Sciences, 14, 441-449. doi: 10.4236/ojapps.2024.142031.

1. Introduction

Regular fractional factorial designs have been studied extensively in recent decades. Most of this work is based on zero-sum constraints on the levels of the experimental factors, known as the orthogonal parameterization (OP). In some situations, however, a more natural constraint on the factor levels is the baseline constraint, which leads to the baseline parameterization (BP). The BP is a suitable option when the experimenter does not want to make extensive changes to the process and aims to identify one or two important factors. The BP keeps most of the factors at their current levels, which can reduce the difficulty and cost of experimentation. Examples include the cDNA microarray experiments in Yang and Speed (2002) [1], Glonek and Solomon (2004) [2], and Banerjee and Mukerjee (2008) [3]. Under the BP, the factorial effects are defined with reference to the baseline level.

Recently, several works have addressed the BP. Mukerjee and Tang (2012) [4] proposed the K-aberration criterion (introduced in Section 2) for choosing two-level designs. Using a complete search algorithm, Mukerjee and Tang (2012) [4] found optimal 8-, 12- and 16-run two-level factorial designs with respect to the K-aberration criterion. Li et al. (2014) [5] proposed an efficient incomplete search algorithm and found optimal or near-optimal 20-run two-level factorial designs. Miller and Tang (2016) [6] established a relationship between the values of $K_2, K_3, \ldots, K_t$ in the K-aberration sequence and the word length pattern (WLP), which is a concept from the OP. Mukerjee and Tang (2016) [7] obtained rank conditions for finding optimal factorial designs. By employing approximate theory together with certain discretization procedures, Mukerjee and Huda (2016) [8] tabulated efficient robust fractional factorial designs for inference on the main effects or some interactions. Lin and Yang (2018) [9] studied multistratum baseline designs under the generalized minimax A-criterion. Karunanayaka and Tang (2017) [10], Chen et al. (2021) [11] and Li et al. (2022) [12] considered a class of compromise designs suited to situations where some interactions are important. Sun and Tang (2022) [13] explored the relationship between the BP and the OP, which is helpful for constructing optimal designs. Yan and Zhao (2023) [14] proposed a minimum aberration criterion for choosing three-level factorial designs and developed an algorithm to find them.

As mentioned above, Miller and Tang (2016) [6] proposed studying two-level regular designs under the BP using the WLP (introduced in Section 2). Miller and Tang (2016) [6] established a relationship between the value of $K_4$ and the WLP for the special case where $A_3^0 = A_3$. The contributions of this work are as follows. We further investigate the relationship between the value of $K_4$ and the WLP, which is helpful for finding good baseline designs under the minimum K-aberration criterion. A general result expressing $K_4$ in terms of the WLP is proposed. The new result has broader applicability than that of Miller and Tang (2016) [6], as it removes the constraint $A_3^0 = A_3$. To demonstrate this point, an illustrative example is provided.

The rest of the paper is organized as follows. In Section 2, some notation and definitions are provided. Section 3 develops the main result. Section 4 gives the concluding remarks.

2. Preliminaries

Suppose $D$ is an $N$-run design with $m$ factors, each at two levels 0 and 1, where 0 represents the baseline level and 1 represents the test level. Then $D$ is a design for the BP. Let $\Omega_s(D)$ denote the collection of all $s$-column subdesigns of $D$. Unless otherwise stated, we write $\Omega_s$ instead of $\Omega_s(D)$ for readability. For $W \in \Omega_s$, let $\alpha(W)$ denote the number of rows of $W$ that consist entirely of 1's. Mukerjee and Tang (2012) [4] developed the following expression (2.1), which quantifies the aliasing caused by $s$-factor interactions when estimating the main effects:

$$K_s = (4/N^2)(s T_1 + T_2), \tag{2.1}$$

where $T_1 = \sum_{W \in \Omega_s} (\alpha(W))^2$ and $T_2 = \sum_{W \in \Omega_{s+1}} \sum_{W^* \in \Omega_s(W)} (2\alpha(W) - \alpha(W^*))^2$. A two-level design that sequentially minimizes the sequence

$$(K_2, K_3, \ldots, K_m)$$

is called a minimum K-aberration design.
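As a rough illustration of expression (2.1), the sketch below evaluates $K_s$ by direct enumeration of column subsets. The function names `alpha` and `k_value` and the toy input (a full $2^3$ factorial) are assumptions made for this example and do not come from the cited papers.

```python
from fractions import Fraction
from itertools import combinations, product


def alpha(design, cols):
    """Number of rows whose entries in the given columns are all 1."""
    return sum(all(row[c] == 1 for c in cols) for row in design)


def k_value(design, s):
    """K_s = (4 / N^2) * (s * T1 + T2), following expression (2.1)."""
    n_runs, m = len(design), len(design[0])
    # T1: squared alpha summed over all s-column subdesigns.
    t1 = sum(alpha(design, w) ** 2 for w in combinations(range(m), s))
    # T2: for each (s+1)-column subdesign W, compare alpha(W) with
    # alpha(W*) for every s-column subset W* of W.
    t2 = sum(
        (2 * alpha(design, w) - alpha(design, w_star)) ** 2
        for w in combinations(range(m), s + 1)
        for w_star in combinations(w, s)
    )
    return Fraction(4 * (s * t1 + t2), n_runs ** 2)


# Toy input (an assumption for illustration): the full 2^3 factorial in 0/1 coding.
full_factorial = [list(row) for row in product([0, 1], repeat=3)]
k2 = k_value(full_factorial, 2)
k3 = k_value(full_factorial, 3)
print(k2, k3)  # 3/2 3/16
```

Exact fractions are used so that values of $K_s$ for different designs can be compared without rounding error. The enumeration grows quickly with $m$, so this is only practical for small designs.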

In this work, $2^{m-p}$ denotes a two-level regular fractional factorial design with $N = 2^{m-p}$ runs and $m$ columns, each at two levels coded as 0 and 1. Table 1 shows a regular $2^{5-2}$ design. This design has the defining contrast subgroup $ABD = 1_8$, $BCE = 1_8$ and $ACDE = 0_8$, where $1_8$ and $0_8$ are the 8-dimensional vectors of ones and zeros, respectively. Such a defining contrast subgroup means that $(A + B + D) \bmod 2 = 1_8$, $(B + C + E) \bmod 2 = 1_8$ and $(A + C + D + E) \bmod 2 = 0_8$. In general, a collection of columns from a regular $2^{m-p}$ design is called a defining word if the sum (mod 2) of these columns equals a vector of ones or a vector of zeros. Recalling the meaning of $\Omega_k(D)$, for any $W \in \Omega_k(D)$, let $\phi(W)$ denote the vector obtained by taking the sum (mod 2) of the columns in $W$, and let $\Psi(W)$ denote the sum of the elements of $\phi(W)$. Define

$$J_k(W) = |2\Psi(W) - N|.$$

For regular $2^{m-p}$ designs, $J_k(W) = 0$ or $N$. The case $J_k(W) = 0$ indicates that $\phi(W)$ contains half zeros and half ones, and $W$ is of strength $k$. The case $J_k(W) = N$ arises when $\phi(W) = 0_N$ or $1_N$, which means that $W$ is a defining word. Where no confusion arises, we hereafter write $\phi$ instead of $\phi(W)$ for conciseness. Let $A_k = \sum_{W \in \Omega_k} J_k(W)/N$; then $A_k$ is the number of defining words of length $k$. Under the OP, for a regular $2^{m-p}$ design of resolution $t \geq 3$, the sequence $(A_3, A_4, \ldots, A_{m-1})$ is called its word length pattern (originally proposed by Fries and Hunter (1980) [15]).
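The counting of defining words through $J_k(W)$ can be sketched in code. The construction below is an assumption: it builds one $2^{5-2}$ design that satisfies $ABD = 1_8$ and $BCE = 1_8$ from a full factorial in $A$, $B$, $C$, and its row order need not match Table 1. The names `j_value` and `word_counts` are hypothetical.

```python
from itertools import combinations, product

# Assumed construction: A, B, C form a full 2^3 factorial, and D, E are
# chosen so that ABD = 1_8 and BCE = 1_8 hold.
design = []
for a, b, c in product([0, 1], repeat=3):
    d = (a + b + 1) % 2  # (A + B + D) mod 2 = 1
    e = (b + c + 1) % 2  # (B + C + E) mod 2 = 1
    design.append((a, b, c, d, e))

n_runs, m = len(design), len(design[0])


def j_value(cols):
    """J_k(W) = |2 * Psi(W) - N|, where Psi(W) sums the entries of phi(W)."""
    phi = [sum(row[c] for c in cols) % 2 for row in design]
    return abs(2 * sum(phi) - n_runs)


# A subset W is a defining word exactly when J_k(W) = N, so counting such
# subsets of size k gives the number of defining words of length k.
word_counts = {
    k: sum(j_value(w) == n_runs for w in combinations(range(m), k))
    for k in range(1, m + 1)
}
print(word_counts)  # {1: 0, 2: 0, 3: 2, 4: 1, 5: 0}
```

The two words of length three are $ABD$ and $BCE$, and the single word of length four is their product $ACDE$.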

Table 1. A regular $2^{5-2}$ design.

Clearly, a regular $2^{m-p}$ design can be regarded as a design of $N = 2^{m-p}$ runs and $m$ columns under the BP. It is worth noting that the interaction columns under the OP differ from those under the BP. As an illustration, consider the $2^{5-2}$ design in Table 1. Under the OP, the interaction column of the main effect columns $A$ and $B$ is generated by taking the sum (mod 2) of columns $A$ and $B$, i.e., $AB = (0,1,1,0,0,1,1,0)$. Under the BP, the interaction column of the main effect columns $A$ and $B$ is the element-wise product of columns $A$ and $B$, i.e., $AB = (0,0,0,1,0,0,0,1)$.
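The difference between the two interaction columns can be reproduced with a few lines of code. The columns `col_a` and `col_b` below are assumed values chosen so that they yield the two vectors quoted above; they are not copied from Table 1.

```python
# Assumed main-effect columns (hypothetical values consistent with the text).
col_a = [0, 0, 1, 1, 0, 0, 1, 1]
col_b = [0, 1, 0, 1, 0, 1, 0, 1]

# OP: the interaction column is the sum (mod 2) of the two columns.
ab_op = [(a + b) % 2 for a, b in zip(col_a, col_b)]
# BP: the interaction column is the element-wise product of the two columns.
ab_bp = [a * b for a, b in zip(col_a, col_b)]

print(ab_op)  # [0, 1, 1, 0, 0, 1, 1, 0]
print(ab_bp)  # [0, 0, 0, 1, 0, 0, 0, 1]
```

Under the BP the interaction column equals 1 only in runs where both factors are at the test level, which is why $\alpha(W)$ counts rows consisting entirely of 1's.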

With the knowledge above, in Section 3 we establish the relationship between the value of $K_4$ and the $A_k$'s.

3. Relationship between the Value of K4 and the WLP

We first introduce a lemma on the number of defining words in a collection of $t + 2$ columns from a regular $2^{m-p}$ design $D$ with resolution $t = 3$.

Lemma 1. Suppose $D$ is a regular $2^{m-p}$ design with resolution $t = 3$, where $m \geq 5$. Let $W \in \Omega_{t+2}(D)$. Then $W$ contains at most two independent defining words.

Proof. Suppose $W = \{g_1, g_2, g_3, g_4, g_5\}$, where $g_1, g_2, \ldots, g_5$ are five columns of $D$. It is easy to check that $W$ contains no defining word, exactly one defining word, or two independent defining words. In the last case, the two independent defining words can be taken as $d_1 = g_1 g_2 g_3$ and $d_2 = g_1 g_4 g_5$, without loss of generality. This completes the proof.

Denote by $A_3^{0,0}$ the number of pairs of length-three defining words that have a common column and both have $\phi = 0_N$; by $A_3^{1,1}$ the number of pairs of length-three defining words that have a common column and both have $\phi = 1_N$; and by $A_3^{0,1}$ the number of pairs of length-three defining words that have a common column, where one of the two defining words has $\phi = 0_N$ and the other has $\phi = 1_N$. Define $A_i^0$ and $A_i^1$ as the numbers of defining words of length $i$ with $\phi = 0_N$ and $\phi = 1_N$, respectively, so that $A_i = A_i^0 + A_i^1$, where $i = 3, 4$ and $5$. The following theorem establishes the relationship between the value of $K_4$ and the WLP for $t = 3$.
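The three pair counts can be obtained mechanically from a list of length-three defining words. The sketch below uses the four length-three words of the $2^{6-3}$ design discussed in Example 1; the data layout (a set of column indices paired with 0 or 1 for $\phi$) and the name `pair_counts` are assumptions made for illustration.

```python
from itertools import combinations

# Length-three defining words of the design in Example 1, stored as
# (set of column indices, phi), with phi = 0 for 0_N and phi = 1 for 1_N.
words3 = [
    (frozenset({1, 2, 4}), 0),
    (frozenset({1, 3, 5}), 1),
    (frozenset({3, 4, 6}), 0),
    (frozenset({2, 5, 6}), 1),
]

# Keys (0, 0), (0, 1), (1, 1) correspond to A_3^{0,0}, A_3^{0,1}, A_3^{1,1}.
pair_counts = {(0, 0): 0, (0, 1): 0, (1, 1): 0}
for (cols1, phi1), (cols2, phi2) in combinations(words3, 2):
    if cols1 & cols2:  # the two words share a common column
        pair_counts[tuple(sorted((phi1, phi2)))] += 1

print(pair_counts)  # {(0, 0): 1, (0, 1): 4, (1, 1): 1}
```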

Theorem 1. For a regular $2^{m-p}$ design $D$ of resolution $t = 3$, we have

$$K_4 = (1/8)^2 \left[ 4\binom{m}{4} - 6A_3^{0,0} - 2A_3^{0,1} + 10A_3^{1,1} + \left[ 3\binom{m-3}{2} - 4(m-3) \right] A_3^0 + \left[ 3\binom{m-3}{2} + 12(m-3) \right] A_3^1 + 4(m-1)A_4^0 + 4(m-5)A_4^1 + 5A_5 \right].$$

Proof. Let $W = \{g_1, g_2, g_3, g_4\}$. There are five scenarios for the columns in $W$:

(a1) $W$ contains a defining word of length three with $\phi = 0_N$;

(a2) $W$ contains a defining word of length three with $\phi = 1_N$;

(a3) $W$ contains a defining word of length four with $\phi = 0_N$;

(a4) $W$ contains a defining word of length four with $\phi = 1_N$;

(a5) $W$ contains four independent columns.

For (a1), it is impossible for $W$ to have a row $(1,1,1,1)$; thus $\alpha(W) = 0$. There are $(m-3)A_3^0$ such $W$'s. For (a2), suppose $(g_1 + g_2 + g_3) \bmod 2 = 1_N$ without loss of generality. The four-tuple $(1,1,1,1)$ appears $N/8$ times in the rows of $\{g_1, g_2, g_3, g_4\}$, so $\alpha(W) = N/8$. There are $(m-3)A_3^1$ such $W$'s. For (a3), $\alpha(W) = N/8$ and there are $A_4^0$ such $W$'s. For (a4), $\alpha(W) = 0$ and there are $A_4^1$ such $W$'s. For (a5), $\alpha(W) = N/16$ and there are $\binom{m}{4} - A_4^0 - A_4^1 - (m-3)(A_3^0 + A_3^1)$ such $W$'s. Recalling the definition of $T_1$ below formula (2.1), we obtain

$$T_1 = (N/16)^2 \left( \binom{m}{4} - A_4^0 - A_4^1 - (m-3)(A_3^0 + A_3^1) \right) + (N/8)^2 \left( (m-3)A_3^1 + A_4^0 \right) = \left( \binom{m}{4} + 3A_4^0 - A_4^1 - (m-3)A_3^0 + 3(m-3)A_3^1 \right) (N/16)^2.$$

Suppose $W = \{g_1, g_2, g_3, g_4, g_5\}$. There are the following possibilities for the columns in $W$:

(b1) $W$ contains two independent defining words of length three, both with $\phi = 0_N$;

(b2) $W$ contains two independent defining words of length three, both with $\phi = 1_N$;

(b3) $W$ contains two defining words of length three, one with $\phi = 0_N$ and the other with $\phi = 1_N$;

(b4) $W$ contains only one defining word, of length three with $\phi = 0_N$;

(b5) $W$ contains only one defining word, of length three with $\phi = 1_N$;

(b6) $W$ contains only one defining word, of length four with $\phi = 0_N$;

(b7) $W$ contains only one defining word, of length four with $\phi = 1_N$;

(b8) $W$ contains a defining word of length five with $\phi = 0_N$;

(b9) $W$ contains a defining word of length five with $\phi = 1_N$;

(b10) $W$ contains five independent columns.

The possibilities (b1), (b2) and (b3) arise for the following reasons. According to the proof of Lemma 1, there are three possibilities for a $W$ that contains a defining word of length four with $\phi = 0_N$:

(c1) $W$ contains two length-three defining words with $\phi = 0_N$ that have a common column. These two words generate a length-four defining word with $\phi = 0_N$;

(c2) $W$ contains two length-three defining words with $\phi = 1_N$ that have a common column. These two words generate a length-four defining word with $\phi = 0_N$;

(c3) $W$ contains only one defining word, of length four with $\phi = 0_N$.

Similarly, there are two possibilities for a $W$ that contains a defining word of length four with $\phi = 1_N$:

(c4) $W$ contains two length-three defining words with a common column, one with $\phi = 0_N$ and the other with $\phi = 1_N$. These two words generate a length-four defining word with $\phi = 1_N$;

(c5) $W$ contains only one defining word, of length four with $\phi = 1_N$.

We now investigate the number of $W$'s in each of the cases (b1)-(b10) and the contribution of each such $W$ to $T_2$. Hereafter, $W^*$ denotes a subset of $W$ with one fewer column than $W$.

For (b1), the number of $W$'s is $A_3^{0,0}$. Since each $W$ in this case contains a length-three defining word with $\phi = 0_N$, the five-tuple $(1,1,1,1,1)$ cannot appear in the rows of $W$. Therefore, $\alpha(W) = 0$. Among the five $W^*$'s, four contain at least one length-three defining word with $\phi = 0_N$, and thus $\alpha(W^*) = 0$ for these four $W^*$'s. The remaining $W^*$ contains no length-three defining word but only one length-four defining word with $\phi = 0_N$, and this $W^*$ has $\alpha(W^*) = N/8$.

For (b2), the number of $W$'s is $A_3^{1,1}$. By an argument similar to that for (b1), we obtain $\alpha(W) = N/8$ and $\alpha(W^*) = N/8$ for all five $W^*$'s.

For (b3), the number of $W$'s is $A_3^{0,1}$. For each $W$ in this case, we have $\alpha(W) = 0$. There are three $W^*$'s with $\alpha(W^*) = 0$ and two with $\alpha(W^*) = N/8$.

For (b4), the number of $W$'s is $\binom{m-3}{2}A_3^0 - 2A_3^{0,0} - A_3^{0,1}$. The term $2A_3^{0,0}$ arises because any pair of length-three defining words with $\phi = 0_N$ that share a column contributes twice to $\binom{m-3}{2}A_3^0$, and the term $A_3^{0,1}$ removes the $W$'s of case (b3), each of which is counted once. For example, suppose $g_1 g_2 g_3 = 0_N$ and $g_1 g_4 g_5 = 0_N$. Then any two columns from $\{g_4, g_5, \ldots, g_m\}$ together with the columns $g_1, g_2, g_3$ form a $W$. There are $\binom{m-3}{2}$ such $W$'s in total, including $\{g_1, g_2, g_3, g_4, g_5\}$, which belongs to case (b1). Likewise, any two columns from $\{g_2, g_3, g_6, \ldots, g_m\}$ together with the columns $g_1, g_4, g_5$ form a $W$. There are $\binom{m-3}{2}$ such $W$'s in total, again including $\{g_1, g_2, g_3, g_4, g_5\}$, which belongs to case (b1). Clearly, $\{g_1, g_2, g_3, g_4, g_5\}$ is counted twice. By an argument similar to that for (b1), we have $\alpha(W) = 0$, $\alpha(W^*) = 0$ for two $W^*$'s and $\alpha(W^*) = N/16$ for three $W^*$'s.

For (b5), the number of $W$'s is $\binom{m-3}{2}A_3^1 - 2A_3^{1,1} - A_3^{0,1}$. Each $W$ in this case has $\alpha(W) = N/16$, with $\alpha(W^*) = N/8$ for two $W^*$'s and $\alpha(W^*) = N/16$ for three $W^*$'s.

For (b6), the number of $W$'s is $(m-4)A_4^0 - A_3^{0,0} - A_3^{1,1}$. Each $W$ in this case has $\alpha(W) = N/16$, with $\alpha(W^*) = N/8$ for one $W^*$ and $\alpha(W^*) = N/16$ for four $W^*$'s.

For (b7), the number of $W$'s is $(m-4)A_4^1 - A_3^{0,1}$. Each $W$ in this case has $\alpha(W) = 0$, with $\alpha(W^*) = 0$ for one $W^*$ and $\alpha(W^*) = N/16$ for four $W^*$'s.

For (b8), the number of $W$'s is $A_5^0$. Each $W$ in this case has $\alpha(W) = 0$ and $\alpha(W^*) = N/16$ for all five $W^*$'s.

For (b9), the number of $W$'s is $A_5^1$. Each $W$ in this case has $\alpha(W) = N/16$ and $\alpha(W^*) = N/16$ for all five $W^*$'s.

For (b10), we have $2\alpha(W) - \alpha(W^*) = 0$.
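The values of $\alpha(W)$ and $\alpha(W^*)$ stated for cases (b1)-(b3) can be checked numerically on a small design. The sketch below uses the $2^{6-3}$ design of Example 1, for which $N = 8$ and every five-column subset falls into one of these three cases. The construction of the columns from a full factorial in $A_1, A_2, A_3$ and the helper names `alpha`, `word_phi` and `cases` are assumptions made for illustration.

```python
from itertools import combinations, product

# Assumed construction: A1, A2, A3 form a full 2^3 factorial and A4, A5, A6
# follow from A1A2A4 = 0_N, A1A3A5 = 1_N and A1A2A3A6 = 0_N.
design = [
    (a1, a2, a3, (a1 + a2) % 2, (a1 + a3 + 1) % 2, (a1 + a2 + a3) % 2)
    for a1, a2, a3 in product([0, 1], repeat=3)
]
m = len(design[0])


def alpha(cols):
    """Number of rows with a 1 in every listed column."""
    return sum(all(row[c] == 1 for c in cols) for row in design)


def word_phi(cols):
    """Return 0 or 1 when the columns sum (mod 2) to 0_N or 1_N, else None."""
    values = {sum(row[c] for c in cols) % 2 for row in design}
    return values.pop() if len(values) == 1 else None


cases = {}
for w in combinations(range(m), 5):
    # The phi values of the length-three words inside W identify the case:
    # (0, 0) is (b1), (1, 1) is (b2) and (0, 1) is (b3).
    phis = [word_phi(t) for t in combinations(w, 3)]
    key = tuple(sorted(p for p in phis if p is not None))
    stars = tuple(sorted(alpha(ws) for ws in combinations(w, 4)))
    cases.setdefault(key, set()).add((alpha(w), stars))

for key, values in sorted(cases.items()):
    print(key, values)
```

With $N/8 = 1$, the output pairs each case with $\alpha(W)$ and the sorted list of the five $\alpha(W^*)$ values: one nonzero $\alpha(W^*)$ for (b1), five for (b2) and two for (b3), in line with the discussion above.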

The discussions above are summarized in Table 2 below.

From Table 2 and the definition of $T_2$ below formula (2.1), a careful calculation gives

$$\begin{aligned} T_2 ={}& (N/8)^2 A_3^{0,0} + 5(N/8)^2 A_3^{1,1} + 2(N/8)^2 A_3^{0,1} \\ &+ 3(N/16)^2 \left( \binom{m-3}{2} A_3^0 - 2A_3^{0,0} - A_3^{0,1} \right) + 3(N/16)^2 \left( \binom{m-3}{2} A_3^1 - 2A_3^{1,1} - A_3^{0,1} \right) \\ &+ 4(N/16)^2 \left[ (m-4)A_4^0 - A_3^{0,0} - A_3^{1,1} \right] + 4(N/16)^2 \left[ (m-4)A_4^1 - A_3^{0,1} \right] \\ &+ 5(N/16)^2 A_5^0 + 5(N/16)^2 A_5^1 \\ ={}& -6(N/16)^2 A_3^{0,0} - 2(N/16)^2 A_3^{0,1} + 10(N/16)^2 A_3^{1,1} + 3\binom{m-3}{2}(N/16)^2 A_3^0 + 3\binom{m-3}{2}(N/16)^2 A_3^1 \\ &+ 4(m-4)(N/16)^2 A_4^0 + 4(m-4)(N/16)^2 A_4^1 + 5(N/16)^2 A_5. \end{aligned}$$

Therefore,

$$4T_1 + T_2 = \left[ 4\binom{m}{4} - 6A_3^{0,0} - 2A_3^{0,1} + 10A_3^{1,1} + \left[ 3\binom{m-3}{2} - 4(m-3) \right] A_3^0 + \left[ 3\binom{m-3}{2} + 12(m-3) \right] A_3^1 + 4(m-1)A_4^0 + 4(m-5)A_4^1 + 5A_5 \right] (N/16)^2.$$

This completes the proof.
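The closed forms in the proof can be cross-checked with a short script. The sketch below encodes $T_1$ and $T_2$ in units of $(N/16)^2$ and the bracket of Theorem 1, then confirms that $4T_1 + T_2$ agrees with the theorem for one set of counts (those of the design in Example 1). The function and parameter names are hypothetical and chosen only for this illustration.

```python
from fractions import Fraction
from math import comb


def t1_units(m, a3_0, a3_1, a4_0, a4_1):
    """T1 divided by (N/16)^2."""
    return comb(m, 4) + 3 * a4_0 - a4_1 - (m - 3) * a3_0 + 3 * (m - 3) * a3_1


def t2_units(m, a3_00, a3_01, a3_11, a3_0, a3_1, a4_0, a4_1, a5):
    """T2 divided by (N/16)^2."""
    return (-6 * a3_00 - 2 * a3_01 + 10 * a3_11
            + 3 * comb(m - 3, 2) * (a3_0 + a3_1)
            + 4 * (m - 4) * (a4_0 + a4_1) + 5 * a5)


def k4_theorem(m, a3_00, a3_01, a3_11, a3_0, a3_1, a4_0, a4_1, a5):
    """K4 as given by Theorem 1, returned as an exact fraction."""
    bracket = (4 * comb(m, 4) - 6 * a3_00 - 2 * a3_01 + 10 * a3_11
               + (3 * comb(m - 3, 2) - 4 * (m - 3)) * a3_0
               + (3 * comb(m - 3, 2) + 12 * (m - 3)) * a3_1
               + 4 * (m - 1) * a4_0 + 4 * (m - 5) * a4_1 + 5 * a5)
    return Fraction(bracket, 64)


# Counts for the 2^(6-3) design of Example 1 (m = 6).
counts = dict(m=6, a3_00=1, a3_01=4, a3_11=1,
              a3_0=2, a3_1=2, a4_0=1, a4_1=2, a5=0)
units = 4 * t1_units(6, 2, 2, 1, 2) + t2_units(**counts)
# K4 = (4 / N^2) * units * (N / 16)^2 = units / 64.
k4 = k4_theorem(**counts)
print(units, k4)  # 168 21/8
```

The factor $1/64$ follows from $(4/N^2)(N/16)^2 = (1/8)^2$, so the result does not depend on $N$ once the counts are fixed.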

Table 2. $\alpha(W)$, $\alpha(W^*)$ and $f_W$ for Theorem 1.

$f_W$ denotes the number of $W$'s in (b1)-(b10); the $W$'s in (b10) do not contribute to $T_2$.

For regular $2^{m-p}$ designs with resolution $t = 3$, Miller and Tang (2016) [6] established a relationship between $K_4$ and the WLP that holds only when $A_3^0 = A_3$. Theorem 1 provides a more general relationship between $K_4$ and the WLP, which holds both when $A_3^0 = A_3$ and when $A_3^0 \neq A_3$. With Theorem 1, one can easily obtain the value of $K_4$ for a regular $2^{m-p}$ design of resolution $t = 3$ from its word length pattern. This point is demonstrated in the example below.

Example 1. Consider the value of $K_4$ for the regular $2^{6-3}$ design with defining contrast subgroup $A_1 A_2 A_4 = 0_N$, $A_1 A_3 A_5 = 1_N$, $A_1 A_2 A_3 A_6 = 0_N$, $A_2 A_3 A_4 A_5 = 1_N$, $A_3 A_4 A_6 = 0_N$, $A_2 A_5 A_6 = 1_N$ and $A_1 A_4 A_5 A_6 = 1_N$. This design has $A_3 = 4$ and $A_3^0 = 2$. Clearly, $A_3^0 \neq A_3$, and thus the result in Miller and Tang (2016) [6] is not applicable here. Using Theorem 1, we obtain $K_4 = 2.625$, noting that $A_3^{0,0} = 1$, $A_3^{1,1} = 1$, $A_3^{0,1} = 4$, $A_3^1 = 2$, $A_4^0 = 1$, $A_4^1 = 2$ and $A_5 = 0$.
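The value in Example 1 can also be checked directly from expression (2.1), without using Theorem 1. The sketch below is a brute-force check under an assumed construction of the design: $A_1, A_2, A_3$ form a full $2^3$ factorial, and the remaining columns are derived from three generators of the defining contrast subgroup. The variable names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

# Assumed construction: A4, A5, A6 follow from A1A2A4 = 0_N, A1A3A5 = 1_N
# and A1A2A3A6 = 0_N; the remaining words are products of these three.
design = []
for a1, a2, a3 in product([0, 1], repeat=3):
    a4 = (a1 + a2) % 2
    a5 = (a1 + a3 + 1) % 2
    a6 = (a1 + a2 + a3) % 2
    design.append((a1, a2, a3, a4, a5, a6))

n_runs, m = len(design), len(design[0])


def alpha(cols):
    """Number of rows with a 1 in every listed column."""
    return sum(all(row[c] == 1 for c in cols) for row in design)


# T1 and T2 for s = 4, taken straight from their definitions.
t1 = sum(alpha(w) ** 2 for w in combinations(range(m), 4))
t2 = sum(
    (2 * alpha(w) - alpha(w_star)) ** 2
    for w in combinations(range(m), 5)
    for w_star in combinations(w, 4)
)
k4 = Fraction(4 * (4 * t1 + t2), n_runs ** 2)
print(t1, t2, float(k4))  # 7 14 2.625
```

The enumeration gives the same value, $K_4 = 2.625$, as the closed-form expression.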

4. Concluding Remarks

Recently, designs under the BP have attracted wide attention. For regular $2^{m-p}$ designs with resolution $t = 3$, Miller and Tang (2016) [6] established the relationship between $K_4$ and the WLP for the special case where $A_3^0 = A_3$. Theorem 1 provides a more general result on the relationship between $K_4$ and the WLP, which holds both when $A_3^0 = A_3$ and when $A_3^0 \neq A_3$. This point is demonstrated in Example 1.

Acknowledgements

This work was supported by the National Natural Science Foundation of China (Grant Nos. 12171277 and 11801331).

Conflicts of Interest

The authors declare no conflicts of interest regarding the publication of this paper.

References

[1] Yang, Y.H. and Speed, T. (2002) Design Issues for cDNA Microarray Experiments. Nature Reviews Genetics, 3, 579-588.
https://doi.org/10.1038/nrg863
[2] Glonek, G.F.V. and Solomon, P.J. (2004) Factorial and Time Course Designs for cDNA Microarray Experiments. Biostatistics, 5, 89-111.
https://doi.org/10.1093/biostatistics/5.1.89
[3] Banerjee, T. and Mukerjee, R. (2008) Optimal Factorial Designs for cDNA Microarray Experiments. Annals of Applied Statistics, 2, 366-385.
https://doi.org/10.1214/07-AOAS144
[4] Mukerjee, R. and Tang, B. (2012) Optimal Fractions of Two-Level Factorials under a Baseline Parameterization. Biometrika, 99, 71-84.
https://doi.org/10.1093/biomet/asr071
[5] Li, P., Miller, A. and Tang, B. (2014) Algorithmic Search for Baseline Minimum Aberration Designs. Journal of Statistical Planning and Inference, 149, 172-182.
https://doi.org/10.1016/j.jspi.2014.02.009
[6] Miller, A. and Tang, B. (2016) Using Regular Fractions of Two-Level Designs to Find Baseline Designs. Statistica Sinica, 26, 745-759.
https://doi.org/10.5705/ss.202014.0099
[7] Mukerjee, R. and Tang, B. (2016) Optimal Two-Level Regular Designs under Baseline Parametrization via Cosets and Minimum Moment Aberration. Statistica Sinica, 26, 1001-1019.
https://doi.org/10.5705/ss.202015.0214
[8] Mukerjee, R. and Huda, S. (2016) Approximate Theory-Aided Robust Efficient Factorial Fractions under Baseline Parametrization. Annals of the Institute of Statistical Mathematics, 68, 787-803.
https://doi.org/10.1007/s10463-015-0509-x
[9] Lin, C.Y. and Yang, P. (2018) Robust Multistratum Baseline Design. Computational Statistics & Data Analysis, 118, 98-111.
https://doi.org/10.1016/j.csda.2017.08.009
[10] Karunanayaka, R.C. and Tang, B. (2017) Compromise Designs under Baseline Parameterization. Journal of Statistical Planning and Inference, 190, 32-38.
https://doi.org/10.1016/j.jspi.2017.04.003
[11] Chen, A., Sun, C.Y. and Tang, B. (2021) Selecting Baseline Designs Using a Minimum Aberration Criterion When Some Two-Factor Interactions Are Important. Statistical Theory and Related Fields, 5, 95-101.
https://doi.org/10.1080/24754269.2020.1867795
[12] Li, W., Liu, M.Q. and Tang, B. (2022) A Systematic Construction of Compromise Designs under Baseline Parameterization. Journal of Statistical Planning and Inference, 219, 33-42.
https://doi.org/10.1016/j.jspi.2021.11.004
[13] Sun, C.Y. and Tang, B. (2022) Relationship between Orthogonal and Baseline Parameterizations and Its Applications to Design Constructions. Statistica Sinica, 32, 239-250.
https://doi.org/10.5705/ss.202020.0032
[14] Yan, Z.H. and Zhao, S.L. (2023) Optimal Fractions of Three-Level Factorials under a Baseline Parameterization. Statistics and Probability Letters, 202, Article ID: 109902.
https://doi.org/10.1016/j.spl.2023.109902
[15] Fries, A. and Hunter, W.G. (1980) Minimum Aberration Designs. Technometrics, 22, 601-608.
https://doi.org/10.1080/00401706.1980.10486210

Copyright © 2024 by authors and Scientific Research Publishing Inc.

This work and the related PDF file are licensed under a Creative Commons Attribution 4.0 International License.