R Package DoE.base for Factorial Experiments

The R package DoE.base can be used for creating full factorial designs and general factorial experiments based on orthogonal arrays. Besides design creation, some analysis functionality is also available, particularly (augmented) half-normal effects plots. In addition to this specific functionality, the package provides convenience features for analysing experimental designs and the infrastructure for a suite of further packages on designing and analyzing experiments. This infrastructure is available for use also by further design of experiments packages.


Introduction
Factorial experiments are very common in industrial experimentation. The most widely spread such experiments use 2-level factors only, but experiments with mixed level factors are also quite common, for example with the 18 run experimental plan proposed by Taguchi (NIST/SEMATECH 2012). The design and execution of such experiments is often done during everyday work without support from a statistical expert -thus it is important to have a software available that can be safely used by non-experts. At the same time, statisticians are often involved in the more important industrial experiments, and there are many facets to construction of such experiments for which a statistician very much appreciates support from a powerful software. The R package DoE.base (Grömping 2015b) targets both nonexperts and statisticians. It is part of a larger package suite containing also the packages FrF2, DoE.wrapper and RcmdrPlugin.DoE, and a fifth supporting package FrF2.catlg128 (Grömping 2014b(Grömping ,c, 2013b(Grömping ,d, 2011b(Grömping , 2013c. All these packages and all packages on which DoE.base depends (Chasalow 2012;Venables 2013;Venables and Ripley 2002;Meyer, Zeileis, and Hornik 2013) are available from the Comprehensive R Archive Network CRAN, which also holds the software R itself (R Development Core Team 2015). The GUI package Rcmdr-Plugin.DoE, which will not be described in this article, provides access to some functionality from the package suite. Grömping (2011b) gives a detailed example-based tutorial for using it. Package DoE.base provides the infrastructure for the entire package suite, in particular the class design, functions for importing and exporting experimental designs, and simple analysis functions for printing, summarizing, plotting, and modeling design data.
Besides providing infrastructure, the main contribution of package DoE.base is to offer features for creating factorial designs: potentially blocked full factorials (function fac.design) and catalog-based general factorials (function oa.design) are available. Functions fac.design and oa.design have taken inspiration from the "white book" (Chambers and Hastie 1984), where these S functions are described that never made it into base R. The most advanced contributions of the package are the features around orthogonal arrays (function oa.design), which are subject to ongoing research. These rely heavily on a catalog of orthogonal arrays, most of which have been taken from Kuhfeld (2010). To the author's knowledge, the package is the only place in R where non-regular orthogonal arrays other than Plackett-Burman designs are provided for experimentation. Non-regularity of an array has been discussed to be beneficial for screening experiments because of their good projectivity properties (see, e.g., Box and Tyssedal 1996;Deng and Tang 1999;Tang and Deng 1999). This discussion has so far focused on 2-level designs, but should analogously apply to more general factorial designs.
There is another R package closely related to the design creation functionality of package DoE.base: the R package planor (Kobilinsky, Bouvier, and Monod 2015a) can create regular fractional factorial designs in a general sense (see also Kobilinsky, Monod, and Bailey 2015b). Package DoE.base is more general than package planor in that it also creates non-regular designs, can calculate various types of quality criteria, and does not require specification of a model but can optimize a design with respect to model robustness criteria. It is less general than planor in that it does not allow to specify a model and estimable effects, i.e., it treats all effects of the same order on an equal footing. Sections 4 and Section 5.3 will illustrate function regular.design of package planor as an alternative to functions from packages DoE.base and FrF2.
The remainder of this article is organized as follows: Section 2 briefly explains and exemplifies full factorial designs and orthogonal arrays and explains the basic principles of experimental design. Sections 3 presents the mathematical background and terminology for general orthogonal arrays and quality criteria from them. Section 4 discusses creation of full factorial designs, in particular also with the possibility of blocking them. Section 5 provides insights into usage and inspection of the orthogonal arrays implemented in package DoE.base. Section 6 discusses design creation and analysis tools, using the example of an experimental design in biotechnology (Vasilev, Schmitz, Grömping, Fischer, and Schillberg 2014). Section 7 describes in more detail the half-normal plotting functionality provided by package DoE.base. Finally, a brief overview of further developments is provided.

Full factorial designs and designs based on orthogonal arrays
A factorial design is an experimental plan in which k "factors" are systematically varied. The jth factor has l j "levels", j = 1...k. If all factors have the same number of levels, i.e., l 1 = ... = l k , the design is called a "fixed level" or "symmetric" design, otherwise it is called "mixed level" or "asymmetric". A "full factorial" design contains (a multiple of) all factor level combinations, i.e., a multiple of l 1 * ... * l k experimental runs. In a full factorial design, all coefficients for an adequately coded linear model with all main effects, 2-factor interactions, ..., up to k-factor interactions are estimable. The number of estimable effects remains the same, regardless of the choice of adequate coding. Section 7 will discuss how the coding affects correlation between coefficient estimates.
Full factorial designs are often not feasible in the real world, if the number of factors or the numbers of factor levels are not very small. For example, a full factorial experiment with one 2-level factor and six 3-level factors requires 1458 runs. There are several possibilities for designs with fewer runs: D-optimal designs require the specification of a model to be estimated; they can be created with R packages AlgDesign or DoE.wrapper (Wheeler 2014;Grömping 2013b), but are not the topic of this article. Here, we consider experimental designs based on orthogonal arrays: these do not require specification of a model but assume that (i) all effects of the same degree (main effect=degree 1, 2-factor interaction=degree 2, etc.) are equally important and (ii) that effects of lower degree are more important than those of higher degree. Orthogonal array designs are often used with the intention of estimating main effects only; they are particularly common for qualitative factors, although they can also be used for quantitative factors. For a design based on an orthogonal array, each factor has each level the same number of times, and each pair of factors has each pair of levels the same number of times. Genizi Taguchi provided various orthogonal arrays for engineering experimentation; one of the most-well-known ones is an 18-run array for up to one 2-level factor and up to seven 3-level factors. This array can for example be found in NIST/SEMATECH (2012), and is also contained in package DoE.base:
Orthogonal arrays may be regular or non-regular: in the regular case, it is possible to describe the array by a few defining relations, similar to the well-known way of doing so for regular fractional factorial 2-level designs: Starting from a full factorial design in some "generating" or "base" factors, additional factors are accommodated by assigning them to interactions between the base factors, which are consequently completely confounded with the new factors' main effects. This is a little more complicated for factors with more than two levels, but the general principle remains the same. One complication arises from base level factors with nonprime numbers of levels; these can be decomposed into full factorials of factors with prime numbers of levels, so-called "pseudo factors". The aforementioned catalog of arrays contains quite a few regular arrays. Regular orthogonal designs can also be created using function regular.design of package planor.
Non-regular orthogonal arrays cannot be described by defining relations. Some of the cataloged arrays in package DoE.base are non-regular. Section 5.1 provides detail on the catalog and its usage. Note that the catalog is by no means complete; in particular, it is much more difficult to completely enumerate all orthogonal arrays than it is to enumerate all regular orthogonal arrays. Partially complete catalogs of orthogonal arrays are available, e.g., from the website by Eendebak and Schoen (2010) based on the algorithm described in Schoen, Eendebak, and Nguyen (2010). In many cases, the web site provides the best arrays only, or does not provide an array at all (in case of very large numbers of arrays). Where a number of arrays is shown, the complete set of arrays can in principle be obtained from Eric Schoen; however, with large numbers of arrays the complete catalogs are so large that it is not easily feasible to work with them.

Principles of experimental design
The most important principle of experimentation is replication: when comparing two different setups, one will usually not rely on a single instance of each setup, but will replicate each setup a specified number of times. This serves the purpose of making sure that differences are only interpreted if they are sufficiently larger than can be expected from experimental variation. Replication is quite different from repeating measurements only: for a proper replication, all experimental settings have to be redone for each replicate. Sometimes, with very variable measurement devices, it may make sense not to include replications but to repeat the measurement process only. This is called "repeated measurements" and has to be treated quite different from proper replication. Several design generation functions of packages DoE.base and FrF2 offer the option replications for specifying the number of replications and repeat.only for indicating whether these are proper replications (default) or repeated measurements only.
One of the very useful aspects of factorial experiments is implicit replication: when experimenting with many factors, one can often expect higher order interactions to be irrelevant. If this is the case, the degrees of freedom that would have to be dedicated to higher order interactions can instead be used for estimating error variation (or for accommodating further experimental factors). Therefore, in factorial experimentation, one will encounter experiments without replicated runs.
A further important principle is blocking, which can be used to control for known influential factors that are not of interest in themselves, like batch-to-batch variation. For an orthogonal array design, one can simply include the block factor as an additional factor and thus has to find an array of the desired structure. A full factorial design can be blocked without increasing the number of runs, by allocating the degrees of freedom for the block factor to portions from interaction effects. This functionality is implemented in function fac.design (see Section 4).
Randomization means that the experimental runs are conducted in random order; it is a safeguard against bias from unknown influences. If the run order is completely randomized, all experimental runs can be treated as independent observations, and there is little risk of systematic bias from experimental order or unknown factors related to experimental order or time. In real life, there are sometimes so-called randomization restrictions; for example, experimental runs may be randomized within each block only. Function fac.design allows randomization within blocks, while designs created with function oa.design have to be rerandomized with function rerandomize.design for using one of the factors as a block factor.
Whenever proper replication is used, package DoE.base separately randomizes each replication as though it were a block; however, it does not include a block factor for the replications. Users who want to include a block factor for replications in the analysis can obtain such a factor using the function getblock. Users who want to change the randomization, i.e., randomize all replications together instead of in separate blocks, can use the function rerandomize.design.
Using the "[" method for the class design, users can also reorder a design according to a randomization scheme that has been worked out outside of R. Of course, whenever the randomization involves non-trivial restrictions like randomizing in meaningful blocks, the analysis has to be conducted accordingly.

General orthogonal arrays
This section provides the mathematical background for general orthogonal arrays, as far as it can be helpful for using the orthogonal arrays available in R package DoE.base.

Terminology for orthogonal arrays
An array in the sense of this article is a rectangular table of numbers with n rows and k columns, like the L18 shown on p. 5. The rows correspond to experimental runs, the columns to experimental factors. In the cataloged arrays in DoE.base, the levels of an llevel factor are denoted by the numbers 1...l. An array becomes an experimental design by allocating numbers to factor levels. The array is orthogonal, i.e., an OA, if for each pair of columns each combination of levels occurs equally often. If this is the case, main effects of all factors can be estimated separately from each other (provided, no higher order effects are in the model).
An OA is said to be of strength s, if each combination of levels occurs equally often for each subset of s columns. Thus, each OA is at least of strength 2. Strength of an OA is directly related to resolution of an array: resolution, denoted by roman numerals, is always one higher than the strength, i.e., strength 2 arrays are of resolution III and so forth. For an array of resolution III, main effects are not aliased with main effects, but can be aliased with 2-factor interactions (three factors involved); for an array of resolution IV, main effects are not aliased with 2-factor interactions, but can be aliased with 3-factor interactions, while 2-factor interactions can be aliased with other 2-factor interactions (four factors involved). This notion is well-known for regular fractional factorial 2-level designs, and is completely analogous for non-regular designs and for designs with factors at more than 2 levels or in mixed level situations. Note that a full factorial in k factors has strength k.
3.2. Generalized word length pattern and refinements Xu and Wu (2001) introduced the generalized word length pattern (GWLP) for general orthogonal arrays. It is an extension of the well-known word length pattern (WLP) for regular fractional factorial 2-level designs: in the latter, one starts out with a set of base factors and allocates additional factors to interactions among these (the generating contrasts). Coding all main effects model matrix columns with "-1" (one level) and "+1" (the other level), this way of design generation causes products of model matrix columns to be either half "-1" and half "+1", or constant columns. Factors, whose product of model matrix columns yields a constant column, form a "word" together. The word length pattern is a frequency table of word lengths. For regular fractional factorial 2-level designs, each group of c factors either does or does not form a word, i.e., contributes one or zero to the count for words of length c. This results in a word length pattern with only integer entries.
In general, partial aliasing is possible. Even if there are only 2-level factors, e.g., in a Plackett-Burman design (Plackett and Burman 1946), a set of factors can contribute a fraction of a word to the GWLP count for the respective word length. Consequently, GWLP entries need not be integers. For example, the GWLP of the L18 is The GWLP is denoted as A 0 , A 1 , A 2 , A 3 , ..., with A c the number of generalized words of length c. The entry "1" for A 0 is generic and does not indicate confounding. The GWLP for orthogonal arrays and designs based on them is usually presented starting with A 3 , since orthogonality implies absence of words of lengths one or two. The GWLP coincides with the WLP for regular fractional factorial 2-level designs; for details, consult Xu and Wu (2001) themselves or Grömping (2011a) for a more accessible account. The concepts of strength and resolution directly relate to the (G)WLP: the shortest word length with a non-zero count is the resolution of the design, the longest word length with a zero count is the strength. Hence, the L18 has resolution III and strength 2. Note that it is not adequate to use the term "generalized resolution" here, because that term is already in use for a different concept that is also implemented in package DoE.base (see below).
The number of generalized words of length c is the sum over contributions from all sets of c factors. For a set of R factors with l 1 ...l R levels in a resolution R design (equivalent to s + 1 factors in a strength s design), Grömping and Xu (2014) have shown an upper bound for the contribution to the count A R to be min((l 1 − 1), ..., (l R − 1)), i.e., the upper bound for the number of generalized words in a set of R factors depends on the pattern of levels in the set and is given by the main effect degrees of freedom for the factor with the fewest levels (the analogous result for symmetric designs was shown earlier by Xu, Cheng, and Wu 2004). Furthermore, Grömping and Xu (2014) provided a statistical rationale for the contributions of sets of R factors to A R in resolution R designs, i.e., the building blocks for the number of shortest words: each R-set contribution can be seen as the sum of R 2 -values from linear models explaining orthogonal contrast columns for any one of the R factors by a full model in the other R − 1 factors (provided the factor to be explained has orthogonally coded model matrix columns; otherwise, R 2 values have to be replaced by squared canonical correlations). Thus, the number of shortest words measures the extent of worst-case confounding in a plausible way. It is therefore particularly instructive to study R factor sets in resolution R designs. Based on these, Figure 1 illustrates the meaning of words in an informal sense, using mosaic plots as proposed in Grömping (2014a). The most severe imbalance is shown in plot (a) of the figure: the factor level combination of any pair of factors completely determines the level of the third factor. This is a resolution III regular array, for which each main effect is confounded by the two-factor interaction of the other two factors. This is reflected in the number of words, which coincides with the aforementioned upper bound. The plot shows a triple from the array L36.2.16.3.4 from package DoE.base; an identical plot can be produced from columns 2, 4 and 5 of the well-known Taguchi L18 shown on p. 5.
The factor set of Figure 1 (a) attains its upper bound for the number of generalized words. For the other factor sets shown in Figure 1, the upper bound is one because of one degree of freedom only for the 2-level factor in the set. However, with a 2-level and two 3-level factors in an orthogonal array, this upper bound cannot be attained. The upper bound can only be attained if the smallest number of levels is a divisor of all the other numbers of levels, which is for example the case in symmetric designs. In summary, the figure illustrates that more words are related to less balance.
The GWLP can be used for selecting a best design by comparing designs with respect to their so-called aberration: a design is better than another one, if it has higher resolution; in case of equal resolution, a design has smaller aberration, if its number of shortest words is smaller, in case of ties, successively considering longer words until a difference is encountered. If this principle is applied to a complete set of possible designs, the best design is said to have "generalized minimum aberration" (GMA).
Over and above the GWLP, package DoE.base allows to look at individual R-factor set confounding through mosaic plots (Grömping 2014a), like the ones shown in Figure 1, and provides an overview of the confounding in R-factor projections through projection frequency tables (functions P3.3 or P4.4), average R 2 frequency tables (ARFT) and squared canonical correlation frequency tables (SCFT); the latter are based on the results by Grömping and Xu (2014) and detailed in Grömping (2013a). Sometimes, several designs that have GMA can be distinguished further by the more detailed criteria. Grömping and Xu (2014) also introduced a generalization of the generalized resolution GR as proposed by Deng and Tang (1999): GR refines the resolution R by indicating the distance from complete confounding.  We have R ≤ GR < R + 1; the larger GR, the less severe the worst case confounding in the design; if GR = R, there is at least one instance of complete confounding in an R-factor set. Furthermore, GR ind is a stricter version of GR which already becomes equal to R if there is a triple of factors for which there is a coding such that at least one degree of freedom of at least one factor is completely confounded. Besides the overall GR and GR ind , individual factor versions GR tot,i and GR ind,i capture the corresponding worst cases for R factor sets involving the ith factor. The GWLP can be obtained with function GWLP (or with the older function lengths, which is usually slower but performs better for designs with many runs); GR can be obtained with the (old) function GR or -together with GR ind , the individual GR tot,i and GR ind,i , ARFT and SCFT -with function GRind.

Full factorial designs with function fac.design
Function fac.design creates full factorial designs. There is also a simple way in base R for creating all combinations of factor levels: function expand.grid with subsequent randomization of the run order will do the job. The benefits of using function fac.design lie in the inclusion into the general framework, and in the possibility of automatically blocking designs. The blocking method makes use of so-called pseudo-factors: whenever the number of levels of a factor is not prime, it can of course be factored into primes, e.g., 6 into 2 and 3; thus, a six-level factor can be obtained by the six different factor level combinations of a full factorial in a two-and a three-level factor; such component factors are called pseudo factors (e.g., F 1 and F 2 below).
Function fac.design uses a method by Collings (1984Collings ( , 1989 for creating the block factor; in case of automatic blocking, the blocking pattern for several 2-or 3-level factors is taken from optimal blocked catalogs (internal objects block.catlg and block.catlg3); if a factor contains several pseudo factors with the same prime, its use in block generators ensures that different pseudo factors are used for different block generators involving the factor (where possible). However, the procedure does not ensure overall optimality of the blocking strategy. Neither does function regular.design from package planor; however, that function may be worth a try if the result from function fac.design is not satisfactory, and it can be used for situations outside fac.design's scope for automatic blocking.
The following two code examples exemplify situations for which automated blocking works without or with confounding of two-factor interactions of experimental factors (with a warning in the latter case).
Blocking this full factorial in four or nine blocks is also possible, but confounds the block factor with a two-factor interaction among experimental factors, which is signaled by a warning message and is also visible from the summary: R>full.factorial.blocked4 <-fac.design(nlevels = c(2, 3, 3, 2, 2, 6), + blocks = 4) R>summary(full.factorial.blocked4) Call: fac.design(nlevels = c(2, 3, 3, 2, 2, 6), blocks = 4) Experimental design of type full factorial.blocked 432 runs blocked design with 4 blocks of size 108 Confounding of 2 -level pseudo-factors with blocks (each row gives one independent confounded effect): Factor settings (scale ends): A B C D E F 1 1 1 1 1 1 1 2 2 2 2 2 2 2 3 3 3 3 4 4 5 5 6 6 The function allows automatic blocking for the most frequent situations, where most prime factors are two and three, and only single prime (pseudo) factors are larger than three; there is also a limit on the number of 2-and 3-level factors (see the package manual).
We now consider an example, for which block generators have to be manually specified: for blocking a full factorial in one 2-level factor, three 5-level factors and one 10-level factor into 25 blocks, the prime 5 is needed twice for creating the block factor; thus, two block generators for the prime 5 need to be specified. These can be given as a matrix with two rows (one for each block generator) and a column for each prime factor (in the order of factors, and within each factor, in increasing order, i.e., 2, 5, 5, 5, 2, 5 for the present design). The following code yields the desired blocked full factorial for the above requirement.
If the user is not able to come up with a satisfactory block structure, function regular.design from package planor can be used to create a design with the required properties (see code below). If that function takes a very long time for a reasonably-sized problem, there is in many cases no solution for the requested situation; it can, however, also mean that an existing solution is difficult to find for regular designs with factors that have non-prime numbers of levels.

Orthogonal arrays with package DoE.base
If more than two levels are required for some factors, but a full factorial cannot be afforded, a regular or non-regular orthogonal array can be used. For the creation of regular designs, also with mixed levels, function regular.design from package planor is very useful. As seen in the code of the previous section, the function can specify a model and separately a (sub) model to be estimable; in this way, it was, e.g., possible to treat the block factor different from the other factors. Note, however, that it is not possible to generate non-regular designs, and that no effort is made at a better design in terms of overall model robustness, whenever the requested estimability requirements are satisfied. Furthermore, creation of some designs takes a very long time, even for relatively small designs (e.g., 32 runs).
Package DoE.base pursues a different route: It contains the previously-mentioned catalog of orthogonal arrays; most of its arrays have been taken from Kuhfeld (2010). Similarly to most experimental design software, the default approach of function oa.design is to use the smallest array possible and to take columns with the requested numbers of levels from left to right until the design has been filled. If the user does not influence the chosen columns with the columns option, a warning is issued for making the user aware of potential improvements. The following sub sections discuss the catalog and ways to select arrays from it, the optimization of column selection from a selected array, and ways to inspect experimental plans regarding their suitability for the experiment at hand.

The data frame oacat and the function show.oas
The arrays available in package DoE.base are documented in the data frame oacat. Since version 0.27, this data frame contains additional columns with information regarding the array properties. oacat contains the following columns: name gives a structured array name, which indicates the number of runs and the frequency of factors with different numbers of levels; for example, the name L18.2.1.3.7 indicates 18 runs with one 2-level factor and seven 3-level factors. nruns directly gives the number of runs. n2 to n72 give the number of factors with 2 to 72 levels. Thus, for L18.2.1.3.7, nruns=18, n2=1, n3=7, and all other nx entries have the value 0.
lineage contains a string variable which indicates how the array was constructed from so-called parent arrays. An empty string indicates that the array itself is a parent array. Parent arrays are stored in the package and are objects of class oa, which are matrices that usually contain an origin attribute and sometimes also a comment attribute. Arrays with a lineage entry are constructed from the parent arrays.
Logical columns indicate a full factorial array (ff), an array with only squared canonical correlations 0 and 1 (regular), and an array with only average R 2 values 0 and 1 (regular.strict; these can only be fixed level arrays). Note that the squared canonical correlations and average R 2 values currently refer to R factor projections, where R is the resolution of the entire array; an array that exhibits regular behavior for all such projections might still be non-regular when considering projections onto larger numbers of factors. Thus, the regularity indicator columns might change with future package versions.
Column SCones contains the number of squared canonical correlations that are one.
Columns GR and GRind contain the GR and GR ind values, respectively.
Columns maxAR and maxSC contain the maximum average R 2 or squared canonical correlation, respectively.
Column dfe provides the number of error degrees of freedom, if the all columns of the array are used.
Columns A3 to A8 provide the numbers of generalized words of lengths 3 to 8. (There are no words of shorter lengths, of course.) It is possible to use the data frame oacat directly for inspecting which arrays of a certain nature are available, for example for finding 32 run arrays which are regular but not strictly regular: The author prefers non-regular arrays for many situations, at least for the creation of screening designs. Looking at regular arrays in package DoE.base may nevertheless be of interest, if function regular.design of package planor runs for a long time without indicating failure for a run size that is in the scope of package DoE.base: if there is a regular array of the desired size in DoE.base, this array can be inspected and perhaps used after column optimization (see next section). Also, in principle, function regular.design can be expected to eventually succeed, unless estimability requirements make the existing array unsuitable. However, note again that the value TRUE in column regular or even regular.strict of oacat refers to projections onto R factors with R the array's resolution and does not guarantee regularity of the entire array. For example, the array L24.2.12.12.1 has only regular three-factor projections (full factorial for triples of 2-level factors, confounded for triples with the 12-level factor), but non-regular four-factor projections (for quadruples of 2-level factors).
The function show.oas allows inspection of the available arrays in a more convenient way. Suppose, for example, that a design for three 2-level factors, two 3-level factors and one 6level factor is to be created, and between 20 and 54 runs are affordable (a full factorial would have 432). The following statement allows to inspect the candidate arrays, displaying also the quality metrics: R>show.oas(nruns = c(20, 54), nlevels = c (2, 3, 3, 2, 2, 6 Only arrays in 36 runs have been found; these are quite different in the available patterns of numbers of levels, and therefore the quality metrics (especially the A k ) are not directly comparable. Nevertheless, the values for GR and GR ind suggest that one might try the array L36.2.10.3.8.6.1, which is the only one without any completely confounded degree of freedom.

Optimization methods for function oa.design
Function oa.design allows to select an array, and to specify columns from that array either manually or by an optimization approach. As was mentioned earlier, if no array is specified, the function picks the first (and thus smallest) array in the catalog that is able to accommodate the requested factors. If no column selection approach is specified, the function simply takes the first available columns (from left to right).
The code below compares three designs: an unoptimized default design (i.e., the left-most suitable columns of the first array encountered, which is the L36.2.13.3.2.6.1), an optimized default design with optimization option columns="min34" (i.e., an optimized choice of columns from the array L36.2.13.3.2.6.1, optimization being with respect to the number of generalized words of length 3, and subsequently length 4)), and an optimized design obtained by selecting columns from the array L36.2.10.3.8.6.1 selected in Section 5.1 because of its GR ind value.
There are various methods for optimizing column allocation (see the manual for function oa.design); columns = "min34" is the most important one among these. Depending on the number of columns on offer and the number of columns to be selected, optimization can take a very long time; in the above example, 10 3 8 2 = 3360 column choices have to be checked out, which is doable in reasonably short time. For larger designs, the resource implications of the numbers of available columns of the required lengths may also be considered in selecting an array to use.

Blocking general orthogonal arrays
Suppose a design with the above factors is to be run, and the 6-level factor is a blocking factor. In that case, the design should be randomized such that runs are randomized within blocks. Function oa.design does not directly allow to block randomization. However, the function rerandomize.design allows a post-hoc randomization within a single design factor declared as the block factor: R>blockedoa1 <-rerandomize.design(oa.manualoptimized, seed = 24652, + block = "F") R>blockedoa1 run.no run.no.std.rp F A B C D E 1 1 22.5.4 5 2 1 1 1 2 2 2 5.5.2 5 1 2 3 2 2 3 3 15.5.3 5 1 3 3 1 1 4 4 1.5.1 5 1 1 2 1 1 5 5 35.5.6 5 2 2 1 2 1 6 6 33.5.5 5 2 3 2 2 2 run.no run.no.std.rp F A B C D E 7 7 32.3.5 3 2 2 1 2 2 8 8 14.3.3 3 1 2 2 1 1 9 9 34.3.6 3 2 1 3 2 1 10 10 4.3.2 3 1 1 2 2 2 11 11 3. 3.1 3 1 3 1 1  For designs that can also be created with function FrF2 (package FrF2) and/or regular.design (package planor), using one of the latter two may be a better choice, since they allow a more direct control over design quality via minimum aberration or estimable effects: The code below (not run) shows creation and inspection of a 16 run design with eight 2-level factors in eight blocks of size 2 with all three methods; in all three cases, the design is resolution IV in terms of the experimental factors, which is systematically requested in both FrF2 (by the minimum aberration approach of the function) and regular.design (by the model and estimate options). For function oa.design, the design quality is not as finely tunable, apart from the overall optimization that may precede usage of an array for which not all columns are used. For the designs below, however, the quality is the same for all three designs: The GWLPs, starting with A 3 , are (28,14,56,0,28,1) including the block factor and (0,14,0,0,0,1) for the experimental factors alone (for class design objects, this is the default, obtained from calling GWLP without the with.block = TRUE option). As an aside, it is worth mentioning that any experimental plan should be carefully checked before using it for experimentation, since experimentation usually involves a lot of effort: adverse consequences from mistakes in design creation may be severe, but can often be prevented without much trouble, if attended to at the design creation stage. A simple check of the GWLP can serve as a first indication whether the design behaves as expected. For example, an earlier version of package package planor (version 0.2.0, when the package website still warned that the package was under test and should be used with caution) had a bug that could under certain circumstances create avoidably bad designs. The code below (results not shown) gives the above blocking example with a less wise design specification that yielded a non-orthogonal design with one word of length 2 (GWLP starting with The plot command in the code above revealed that factor C was aliased with the Block factor for the latter design, which explains the word of length 2. Such a design should of course not be used. Let me emphasize that this example is not meant to imply that package planor is less trustworthy than other design creation packages but rather that a design should always be checked before using it for experimentation!

Inspection methods for factorial designs
This is a good place to exemplify further possibilities for design inspection. The function names for obtaining quality criteria were already mentioned in Section 3.2.
The function GRind calculates the metrics introduced by Grömping and Xu (2014) with the additional detail proposed in Grömping (2013a). For example, the code below shows that the optimized design based on the manually selected array has a clearly better overall behavior regarding factor-specific worst case confounding (GR tot , i and GR ind,i , see Section 3.2) than the optimized design based on the automatically selected first array: R>GRind1 <-GRind(oa.optimized) R>GRind2 <-GRind(oa.manualoptimized) R>print(cbind(rep(c("oa.opimized", "oa.optimized2"), each = 2), + rbind(GRind1$GR.i, GRind2$GR.i)), quote = FALSE) The plot method for class design can be used to visualize the worst case confounding for a design. The optimized design based on the array that was automatically picked severely confounds the interaction of the two 3-level factors with the 6-level factor: of the 54 possible level combinations in this triple of factors, the design contains 18 only (instead of the possible 36). The optimized design after manually picking a better starting array shows a much less severe worst case confounding. A mosaic plot of that projection is created by the first line of the following code and is shown in Figure 2 (a). It shows that the interaction of the two 3-level factors restricts the 6-level factor to a third of its possibilities only, which is the most severe form of aliasing possible for a triple of factors with this level combination.
R>plot(oa.optimized, select = "worst", selprop = 0.05) R>plot(oa.manualoptimized, select = c(2, 3, 6)) R>plot(oa.manualoptimized, select = "worst", selprop = 0.05) In comparison, a mosaic plot of the same projection for the optimized manually selected design shows a much better picture: now, the design has 36 distinct level combinations, the maximum possible (see Figure 2 (b)). The worst case confounding for that design is shown in the Figure 2 (c). It is distinctly less severe than that of Figure 2 (a); however, apart from the one worst case shown in Figure 2 (a), the optimized automatically-selected design is also quite reasonable.
6. An example from plant biotechnology Vasilev et al. (2014) investigated cultivation factors for geraniol production by plant cells. There were four quantitative 2-level factors, two 3-level factors (one quantitative and one qualitative) and one qualitative 4-level factor. The data, including response values, have been published with the paper and are also included in package DoE.base as VSGFS.

Creating and inspecting the design
For this experiment, a full factorial design would have had 576 runs. It would certainly have been necessary to conduct it in blocks, say in eight blocks of size 72 each. Such a design could have been created by function fac.design or by function regular.design of package planor, as shown in Section 4.
A full factorial appeared neither feasible nor appropriate for the screening situation of the experiment. Instead, the design was conducted using a 72 run orthogonal array, which was generated with function oa.design, using automatic optimization (option columns = "min34") of the manually preselected array L72.2.43.3.8.4.1.6.1. The optimization takes quite a long time and results in the selection of columns 4,22,37,41 for the 2-level factors, 46 and 48 for the 3-level factors, and 52 for the 4-level factor. The design is available as VSGFS, and the code for reproducing it can be found in the manual.
When constructing the experimental plan for VSGFS, the array was preselected by intuition and trial and error, using the version of function show.oas available at the time, which only allowed to show all 72 run designs that can accommodate the requested factors. Using the recently added showmetrics option together with the GRgt3="ind" option (i.e., requesting that GR ind > 3 which means absence of any instance of complete confounding), one can pick arrays that are particularly promising for screening purposes. The following command shows the available arrays without any complete confounding for the example situation: R>show.oas(nlevels = c(2, 2, 2, 3, 2, 3, 4), GRgt3 = "ind", + showmetrics = TRUE)   The array used in actual experimentation is the third array in this list; thus, the intuitive approach was successful in this case. It can be expected that all four arrays listed above are reasonably suited for screening the experimental factors. As the designs are far from saturated, optimal column allocation can substantially improve the worst case confounding: for the eventual design, the output below shows GR = GR ind = 3.6, implying that the largest squared correlation and the largest average R 2 are 1/9 only; these are attained in the triples 1,2,7 and 3,5,7. For illustrating the worst case degree of confounding, Figure 3 shows the triple 3,5,7 in quite reasonable balance. The team was happy with the design and used it for collecting the data. Software-wise, data collection happened in Excel, exporting the randomized design with the export.design function and re-importing it after data entry using the exported rda file together with a csv file with response values added to the data rows using function add.response.

Analyzing experimental data
The data frame VSFGS contains the experiment with response data. Package DoE.base offers a few functions for analysis purposes: the plot method for class design, in case of data with responses, creates simple main effects plots, by invoking the plot method from package graphics the lm method for class design runs a linear model with a modifiable default degree. For orthogonal arrays created with function oa.design, the default degree is one, i.e., a model with main effects only. Linear models can of course also be run by the lm function from the core package stats, and analysis of variance functionality can also be used (see below).
the halfnormal method for classes lm or design creates a half-normal effects plot (see Section 7) Of course, other R functionality can also be used, for example interaction plots or functionality for mixed model analysis. The functionality for handling repeated measurement data or replicated data has been described for package FrF2 in Grömping (2014c, Sections 5.3 and 5.9 of ) and is analogous here.
Main effects plots can be obtained by the simple command plot(VSGFS), which creates these plots for all three response variables. Figure 4 shows the plots with default labeling, arranged with three plots on one page and reduced margin sizes. Of course, one would usually adapt the annotation for final reports or publications. The plot shows that the sugar sucrose is very beneficial for the biomass, not very good for the content, but nevertheless, because of the strong effect on biomass, beneficial for the yield. The content apparently can be increased by choosing the sugar mannitol, level one of factor CD and the +-level of Light. Apart from the +-level of Light, the other settings for high content are not helpful for overall yield.
An assessment of significance can be obtained from a linear model analysis. This can be obtained separately for each response, for example for the content. Here, the analysis confirms the findings from the main effects plot: R>summary(lm(VSGFS, response = "Content"))  The model explains less than 50% of the response variability, which is far from perfect. The strong heredity principle -interactions are only active if both their component factors are active -suggests to look at a model with the three active main effect factors and their interactions, which leaves a somewhat confusing picture (not shown).
For the data at hand, there are enough degrees of freedom to run an Anova analysis with the full degree 2 model. Of course, while main effects are orthogonal to each other, 2-factor interactions can be slightly confounded with main effects and severely confounded with other 2-factor interactions. However, at least, the theory tells us that the estimable effects can be estimated without bias (unless effects of order higher than two bias them). As Anova analyses sums of squares, the analysis is invariant with respect to factor coding. In order to obtain an order-invariant assessment of significance, the function Anova from package car (Fox and Weisberg 2011) can be used; contrary to function anova from package stats, Anova avoids order-dependence by using type II sums of squares, which condition on all other effects except for ones that contain the effect under investigation. The results point to the main effects that were also identified before, and a liberal look at p values additionally indicates some marginal 2-factor interactions (Sugar with each of InocSize, FilledVol, CDs, Light, and ShakFreq:CM, which is completely unrelated to the active main effects).

Half-normal effects plots
In (almost) saturated designs, conventional analysis of variance methods are not very successful, because there are too few degrees of freedom for error. If one assumes a screening design, for which most effects are inactive, the inactive effects actually represent experimental error; however, it is not known a-priori, which are the active effects. Daniel (1959) proposed to use half-normal effects plots for diagnosing which effects are active, and Lenth (1989) proposed a numerical activity check for these, known as "Lenth's method". The critical values proposed by Lenth were later found to be conservative, and it was proposed to use simulated ones instead. Half-normal effects plots with Lenth's method and simulated critical values are implemented in function halfnormal of package DoE.base.

The principle
The standard use for such plots is with 2-level factors which are conventionally coded in -1/+1 coding (see, e.g., Grömping 2014c). Function halfnormal from package DoE.base covers not only these standard situations, but also offers half-normal plots for 2-level designs with a few error degrees of freedom. For these, it automatically augments the estimated effects with error effects, distinguishing these into lack-of-fit and pure error. Significance assessment can be done with Lenth's method on the augmented set of estimates, or with other methods proposed in the literature (Larntz and Whitcomb 1998;Edwards and Mee 2008). Note that error effects are not necessarily uniquely determined.
2-level designs with partially confounded effects. For these, it projects out all preceding effects from the remaining ones (thus, the plotting points depend on the model order for such situations). mixed level designs, for which there is no unique coding and the plotting points are coding dependent. Mixed level designs can also have error degrees of freedom or partially confounded effects.
The strategy chosen in function halfnormal seems to be similar to that applied in the JMP software screening platform (see Chapter 8 of SAS Institute, Inc. 2012), both regarding the treatment of error points and the single degree of freedom representations for factors with more than two levels. On the contrary, Design-Expert (Stat-Ease Inc. 2012) does not plot individual degrees of freedom, but scaled Chi-squared values for effects with more than one degree of freedom. This avoids the coding dependence, but has the adverse effect that the number of plotting points is small so that effect sparsity is not easily achieved.
The steps for augmenting the estimated effects with error degrees of freedom are described, e.g., in Grömping (2015a) and are very similar also to the suggestions by Langsrud (2001) in a different context. A coarse overview works as follows: Make sure the model matrix X has orthogonal columns all of which have the same Euclidean length; if this is not the case, X has to be pre-treated (see below).
For N observations, a trivial saturated model matrix is the N -dimensional identity matrix I N . If a distinction between lack-of-fit and pure error is sought, one can replace this matrix by a matrix S of dummy variables for distinct runs, and additionally include appropriately scaled orthogonal contrast matrices for replicated runs. In the following, for simplicity, I N is used.
Residualize the matrix I N by projecting out the model matrix X, i.e., calculate the residual matrix R = I N − X(X X) −1 X .
Create the half-normal effects plot for the augmented model matrix (X|R), which has been created such that it has orthogonal columns of all the same length.
The pre-treatment mentioned in the first bullet is as follows: If X has orthogonal columns of varying Euclidean lengths, one simply has to normalize all columns to a common length. The case of non-orthogonal columns is more demanding and will be discussed using the model matrix X for a full model in an unreplicated full factorial for one 2-level and one 3-level factor, both in dummy coding, as given in Equation 1: . The columns of this matrix are not orthogonal, which can be easily verified from looking at X X. As the intercept estimates the mean for the combination A 1 B 1 and the coefficients for the main effects columns estimate deviations from that level, there is an obvious dependency.
With the interaction coefficients also measuring deviations of particular cells from additivity, there is another clear dependence. It is well-known which effects are estimable in factorial models: those are the overall means and the contrasts. A model matrix formulated to directly estimate these has orthogonal columns. Xu and Wu (2001) showed that it is particularly useful to code all main effects columns using an orthogonal coding normalized to squared length N , where N is the number of runs: such a coding yields orthogonal columns of the model matrix for all effects up to degree s for any projection of s factors from a strength s design, if the interaction columns are created in the usual way as products of the normalized main effects contrast columns; the interaction columns will then also have squared Euclidean length N and will be orthogonal to each other and to the main effects columns. Such coding will be called Xu Wu coding in the sequel. It is available in two versions in package DoE.base: there are contr.XuWu and contr.XuWuPoly contrasts. The model matrix below is obtained by coding both factors A and B with contr.XuWu contrasts.
If the design has strength s, a model matrix in a Xu Wu coding for main effects with up to s-factor interactions can also be obtained post-hoc from a model matrix X based on nonorthogonal coding by sequential orthogonalization and normalization: Project out the first column (intercept column) from the second, which is just subtraction of the column mean. Project out the first two columns from the third and so on. In addition, one also has to normalize all columns to squared length N (or any other common Euclidean length) in order to satisfy the requirements for a half-normal effects plot.
If the model has truly confounded effects that cannot be orthogonalized by simply applying Xu Wu coding, the same orthogonalization strategy can be applied, but the consequence is not only a recoding of in principle the same information, but an order-dependent removal of earlier confounded effects from later confounded effects.

Example application
The example design is an orthogonal array, i.e., main effect contrasts can be estimated independent of each other. However, depending on the coding, the actual estimated coefficients may be correlated, as was discussed above. For example, for 2-level factors, if -1/+1 coding is used, the estimates are uncorrelated, with 0/1 (dummy) coding, however, they are correlated. It is advisible to explicitly choose an orthogonal factor coding, ideally the Xu Wu variant discussed above. The code below creates an orthogonal main effects model matrix with squared Euclidean length N = 72 (output not shown).
R>VSGFS.XuWuPoly <-change.contr(VSGFS, "contr.XuWuPoly") R>round(crossprod(model.matrix(lm(VSGFS.XuWuPoly))), 2) Per default, function halfnormal refuses to work in case of correlated main effects (except in case of the perfect confounding of regular designs). The option ME.partial=TRUE can be used to change that; if the partial aliasing among main effects estimates is due to nonorthogonal coding in an orthogonal array, use of ME.partial=TRUE is acceptable in the light of the previous considerations, although it seems preferable to decide on an orthogonal coding explicitly.
For the example design, a main effects analysis is quite well-protected against bias from two-factor interactions. However, two-factor interactions may be quite heavily confounded with each other. Nevertheless, we will now consider an almost saturated array in order to demonstrate the most general usage of the function halfnormal.
R>hnAuto <-halfnormal(lm(VSGFS, response = "Content", degree = 2), + ME.partial = TRUE, cex.text = 0.9, cex = 0.9, xlim = c(0, 1.1), + ylim = c(0, 2.8), main = "Half-normal effects plot for Content") R>hnXuWuPoly <-halfnormal(lm(VSGFS.XuWuPoly, response = "Content", + degree = 2), cex.text = 0.9, cex = 0.9, xlim = c(0, 1.1), + ylim = c(0, 2.8), main = "Half-normal effects plot for Content") R>hnXuWuPolyreordered <-halfnormal( + lm(Content~(CDs + Sugar + CM + FilledVol + InocSize + ShakFreq + + Light)^2, VSGFS.XuWuPoly), cex.text = 0.9, cex = 0.9, xlim = c(0, 1.1), + ylim = c(0, 2.8), main = "Half-normal effects plot for Content") Figure 5 shows half-normal effects plots from a model with all main effects and two-factor interactions for the response variable "Content". In order to demonstrate the coding dependence of half-normal effects plots in case of factors with more than two levels, plot (a) shows automatical coding, plot (b) contr.XuWuPoly coding. Plot (c) shows the analysis for contr.XuWuPoly coding with the effects in different order. The codings between (a) and (b) differ in the level ordering for the 3-level factors; for the 4-level factor, the coding differences are more complicated. While there are clear visual differences between the differently-coded plots, the message is more or less the same: As already seen in the linear main effects model, light, sugar and CD are active factors. Furthermore, the Sugar by CD interaction and the Inoculum Size by Sugar interaction seem to have active components. However, the interaction effects are order dependent. The plot with different interaction orders brings up three further interactions into the possibly active range and drops the Sugar by CD interaction; thus, clearly, one has to be cautious with statements on these effects from half-normal effects plots. However, all effects that show up in the half-normal effects plots have also been at least marginal in the Anova analysis of the previous section.

Further developments
Package DoE.base tries a balancing act of offering tools both for practitioners with a relatively weak statistical background and for statistical experts. In order to save the former from avoidable grave mistakes, the package takes a cautious strategy issuing many warnings, where designs might be improvable or analyses might be inadequate. In various cases, it would be desirable to avoid unnecessary warnings; for example, there are arrays for which it is known from theoretical work (Butler 2005) that all choices for certain columns are GMA. For these arrays, warnings for array optimization should be eliminated, which requires some slightly tedious work.
The current design catalog covers many situations, but is also limited especially with respect to availability of non-regular designs. Augmenting the catalog with further useful non-regular designs is a difficult task and should not be addressed before studying in more detail the relation between recently developed design quality criteria like GR, GR ind , ARFT and SCFT and a design's usefulness for experimentation.
So far, the design catalog is used for selecting designs and optimizing column choices from them. Kuhfeld (2010) makes much more extensive use of the catalog within a SAS software q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q Note: Some coefficients are order dependent.
(a) Automatic coding, default order.
q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q (b) contr.XuWuPoly coding, default order.
q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q q   macro suite for creating experimental designs from the arrays in the catalog, specifically created with marketing applications in mind (Kuhfeld 2010). Such use can certainly also be combined with the catalog in package DoE.base. The dependent package support.CEs (Aizaki 2012) implements choice experiments in R. However, implementing a functionality as flexible as the one offered in the (Kuhfeld 2010) SAS macros would require substantial additional effort.
Package DoE.base also serves as a tool for supporting further theoretical investigations into aspects like design regularity and new design quality criteria.
Last but not least, the class design and infrastructure for it are also available for use by other packages. Authors of other S3-based packages are invited to use it; the methods within package DoE.base for its class design can be adjusted to accommodate further types of design; please inform the package maintainer, if you would like a new type of design included into the methods offered.
Attribute design.info is a list that contains all the important information on the design and is heavily used by the methods and functions discussed in the following sub section. This attribute will be described in some detail in the rest of this sub section.
The design.info attribute has some mandatory elements that have to be present for all class design objects and many elements that are needed for some types of designs only. The author's website contains a large table that details which types of designs need which elements of the design.info attribute. The function also outputs the split plot related elements and element quantitative described under specific elements for FrF2 above. plot.name character string name of whole plot factor in split plot request digits number number of digits to which quantitative factors are to be rounded formula formula model formula for which D-optimality is to be achieved constraint logical expression constraint applied to the experimental factors optimality Criteria named numeric vector performance of design on several optimality criteria Further elements for designs created by function lhs.design subtype character string type of latin hypercube design The function also outputs digits and optimalityCriteria elements described under specific elements for Dopt.design above. Apart from the elements listed in the table, there are further elements for class design objects created by design combination functions. These are not discussed here. For such designs, the entry types given in the table can also be replaced by lists of several such entries.
Function rerandomize.design allows to re-randomize a class design object without response data; this also allows explicit blocking of designs created with functions oa.design and pb (see also Section 5.3).
Functions for designs that can be both in wide or in long format, i.e., parameter designs and designs with repeated measurements, can change between long and wide format or aggregate wide format designs: function rep2wide brings repeated measurement designs into wide format, reptolong does the opposite; function paramtowide brings a parameter design to wide format (irreversible, they are created in long format); function aggregate.design (method for the generic aggregate) aggregates designs in wide format into designs with a single response.
Functions add.response and col.remove add responses or remove columns; function response.names can also be used to remove response columns.
Function qua.design influences, which design columns are quantitative, function change.contr changes the contrasts of design columns.
Functions undesign and redesign remove the class design properties from an object or reinstate them.
Functions cross.design and param.design combine class design objects into crossed designs or parameter designs, respectively. Function fix.design is a method for the generic fix and has been adapted from package utils; it allows to edit design objects; however, its use is not recommended.
Function export.design can export a design in html or csv format, together with an R workspace for the design. After entering response data in a spreadsheet program, responses can be added to the design itself using the add.response function.