Two Level Factorial Experiments

=Two Level Factorial Experiments=

Introduction
Two level factorial experiments are factorial experiments in which each factor is investigated at only two levels. The early stages of experimentation usually involve the investigation of a large number of potential factors to discover the "vital few" factors. Two level factorial experiments are used during these stages to quickly filter out unwanted effects so that attention can then be focused on the important ones.

$$2^k $$ Designs
Factorial experiments in which all combinations of the levels of the factors are run are usually referred to as full factorial experiments. Full factorial two level experiments are also referred to as $$2^k$$ designs, where $$k$$ denotes the number of factors being investigated in the experiment. (In DOE++, these designs are referred to as 2 Level Full Factorial Designs, as shown in Figure 7.1.) A full factorial two level design with $$k$$ factors requires $$2^k$$ runs for a single replicate. For example, a two level experiment with three factors will require $$2\times 2\times 2=2^3=8$$ runs. The choice of the two levels used in two level experiments depends on the factor; some factors naturally have two levels. For example, if gender is a factor, then male and female are the two levels. For other factors, the limits of the range of interest are usually used. For example, if temperature is a factor that varies from 45 to 90, then the two levels used in the $$2^k$$ design for this factor would be 45 and 90. The two levels of the factor in the $$2^k$$ design are usually represented as $$-1$$ (for the first level) and $$1$$ (for the second level). Note that this representation is reversed from the coding used in Chapter 6 for the indicator variables that represent two level factors in ANOVA models. For ANOVA models, the first level of the factor was represented using a value of $$1$$ for the indicator variable, while the second level was represented using a value of $$-1$$. For details on the notation used for two level experiments refer to Chapter 7, Notation.
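The run count and the coded levels described above can be sketched in a few lines of Python. This is only an illustration; the function names `full_factorial` and `decode` are hypothetical and are not part of DOE++.

```python
from itertools import product

def full_factorial(k):
    """Return the 2^k runs of a two level full factorial design in coded units."""
    # product() varies the last position fastest; reversing each tuple makes
    # the first factor alternate fastest instead.
    return [tuple(reversed(levels)) for levels in product((-1, 1), repeat=k)]

def decode(coded, low, high):
    """Map a coded level (-1 or 1) back to the actual factor setting."""
    center = (high + low) / 2
    half_range = (high - low) / 2
    return center + coded * half_range

runs = full_factorial(3)
print(len(runs))           # number of runs in one replicate of a 2^3 design
print(decode(-1, 45, 90))  # first (low) level of the temperature factor
print(decode(1, 45, 90))   # second (high) level of the temperature factor
```

The coding used here follows the convention of this chapter: $$-1$$ for the first level and $$1$$ for the second level.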



Figure 7.1: Selection of full factorial experiments with two levels in DOE++.

The $$2^2 $$ Design
The simplest of the two level factorial experiments is the $$2^2$$ design, where two factors (say factor $$A$$ and factor $$B$$) are investigated at two levels. A single replicate of this design will require four runs ($$2^2=2\times 2=4$$). The effects investigated by this design are the two main effects, $$A$$ and $$B$$, and the interaction effect $$AB$$. The treatments for this design are shown in Figure 7.2 (a). In the figure, letters are used to represent the treatments. The presence of a letter indicates the high level of the corresponding factor and the absence indicates the low level. For example, $$(1)$$ represents the treatment combination where all factors involved are at the low level, or the level represented by $$-1$$; $$a$$ represents the treatment combination where factor $$A$$ is at the high level, or the level of $$1$$, while the remaining factors (in this case, factor $$B$$) are at the low level, or the level of $$-1$$. Similarly, $$b$$ represents the treatment combination where factor $$B$$ is at the high level, or the level of $$1$$, while factor $$A$$ is at the low level, and $$ab$$ represents the treatment combination where factors $$A$$ and $$B$$ are both at the high level, or the level of $$1$$. Figure 7.2 (b) shows the design matrix for the $$2^2$$ design. It can be noted that the sum of the terms resulting from the product of any two columns of the design matrix is zero. As a result, the $$2^2$$ design is an orthogonal design. In fact, all $$2^k$$ designs are orthogonal designs. This property of the $$2^k$$ designs offers a great advantage in the analysis because of the simplifications that result from orthogonality. These simplifications are explained later on in this chapter.

The $$2^2$$ design can also be represented geometrically using a square with the four treatment combinations lying at the four corners, as shown in Figure 7.2 (c).



Figure 7.2: The $$2^2$$ design - Figure (a) displays the experiment design, (b) displays the design matrix and (c) displays the geometric representation for the design. In Figure (b), the column names $$I$$, $$A$$, $$B$$ and $$AB$$ are used. Column $$I$$ represents the intercept term. Columns $$A$$ and $$B$$ represent the respective factor settings. Column $$AB$$ represents the interaction and is the product of columns $$A$$ and $$B$$.

The $$2^3 $$ Design
The $$2^3$$ design is a two level factorial experiment design with three factors (say factors $$A$$, $$B$$ and $$C$$). This design tests three main effects, $$A$$, $$B$$ and $$C$$; three two factor interaction effects, $$AB$$, $$BC$$ and $$AC$$; and one three factor interaction effect, $$ABC$$. The design requires eight runs per replicate. The eight treatment combinations corresponding to these runs are $$(1)$$, $$a$$, $$b$$, $$ab$$, $$c$$, $$ac$$, $$bc$$ and $$abc$$. Note that the treatment combinations are written in such an order that factors are introduced one by one, with each new factor being combined with the preceding terms. This order of writing the treatments is called the standard order or Yates' order. The $$2^3$$ design is shown in Figure 7.3 (a). The design matrix for the $$2^3$$ design is shown in Figure 7.3 (b). The design matrix can be constructed by following the standard order for the treatment combinations to obtain the columns for the main effects and then multiplying the main effects columns to obtain the interaction columns.
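The construction just described (standard order for the main effect columns, then products of those columns for the interactions) can be sketched as follows. The helper names are hypothetical, and the column ordering chosen here (all main effects first, then interactions) is an assumption that may differ from the ordering shown in Figure 7.3 (b).

```python
from itertools import combinations
from math import prod

def standard_order(k):
    """Coded runs in standard (Yates') order: the first factor alternates fastest."""
    return [tuple(1 if (i >> j) & 1 else -1 for j in range(k))
            for i in range(2 ** k)]

def treatment_label(run, letters="abcdefgh"):
    """Lower-case letters for the factors at the high level, or (1) if none."""
    label = "".join(letters[j] for j, level in enumerate(run) if level == 1)
    return label or "(1)"

def design_matrix(k, letters="ABCDEFGH"):
    """Return (column names, rows): intercept, main effects and all interactions."""
    terms = [t for size in range(1, k + 1) for t in combinations(range(k), size)]
    names = ["I"] + ["".join(letters[j] for j in t) for t in terms]
    rows = [[1] + [prod(run[j] for j in t) for t in terms]
            for run in standard_order(k)]
    return names, rows

names, rows = design_matrix(3)
labels = [treatment_label(run) for run in standard_order(3)]
print(labels)
print(names)

# Orthogonality check: the products of any two distinct columns sum to zero.
columns = list(zip(*rows))
print(all(sum(u * v for u, v in zip(c1, c2)) == 0
          for c1, c2 in combinations(columns, 2)))
```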



Figure 7.3: The $$2^3$$ design - Figure (a) shows the experiment design and (b) shows the design matrix.

The $$2^3$$ design can also be represented geometrically using a cube with the eight treatment combinations lying at the eight corners, as shown in Figure 7.4.



Figure 7.4: Geometric representation of the $$2^3$$ design.

Analysis of $$2^k $$ Designs
The $$2^k$$ designs are a special category of the factorial experiments where all the factors are at two levels. The fact that these designs contain factors at only two levels and are orthogonal greatly simplifies their analysis even when the number of factors is large. The use of $$2^k$$ designs in investigating a large number of factors calls for a revision of the notation used previously for the ANOVA models. The case for revised notation is made stronger by the fact that the ANOVA and multiple linear regression models are identical for $$2^k$$ designs because all factors are only at two levels. Therefore, the notation of the regression models is applied to the ANOVA models for these designs, as explained below.

Notation
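Based on the notation of Chapter 6, the ANOVA model for a two level factorial experiment with three factors would be as follows:

$$Y=\mu +\tau_1\cdot x_1+\delta_1\cdot x_2+(\tau\delta)_{11}\cdot x_1x_2+\gamma_1\cdot x_3+(\tau\gamma)_{11}\cdot x_1x_3+(\delta\gamma)_{11}\cdot x_2x_3+(\tau\delta\gamma)_{111}\cdot x_1x_2x_3+\epsilon \qquad (1)$$

where:

$$\mu$$ represents the overall mean.

$$\tau_1$$ represents the independent effect of the first factor (factor $$A$$) out of the two effects $$\tau_1$$ and $$\tau_2$$.

$$\delta_1$$ represents the independent effect of the second factor (factor $$B$$) out of the two effects $$\delta_1$$ and $$\delta_2$$.

$$(\tau\delta)_{11}$$ represents the independent effect of the interaction $$AB$$ out of the other interaction effects.

$$\gamma_1$$ represents the effect of the third factor (factor $$C$$) out of the two effects $$\gamma_1$$ and $$\gamma_2$$.

$$(\tau\gamma)_{11}$$ represents the effect of the interaction $$AC$$ out of the other interaction effects.

$$(\delta\gamma)_{11}$$ represents the effect of the interaction $$BC$$ out of the other interaction effects.

$$(\tau\delta\gamma)_{111}$$ represents the effect of the interaction $$ABC$$ out of the other interaction effects.

$$\epsilon$$ is the random error term.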

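The notation for a linear regression model having three predictor variables with interactions is:

$$Y=\beta_0+\beta_1\cdot x_1+\beta_2\cdot x_2+\beta_{12}\cdot x_1x_2+\beta_3\cdot x_3+\beta_{13}\cdot x_1x_3+\beta_{23}\cdot x_2x_3+\beta_{123}\cdot x_1x_2x_3+\epsilon \qquad (2)$$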

The notation for the regression model is much more convenient, especially when a large number of higher order interactions are present. In two level experiments, the ANOVA model requires only one indicator variable to represent each factor, for both qualitative and quantitative factors. Therefore, the notation for the multiple linear regression model can be applied to the ANOVA model of an experiment that has all the factors at two levels. For example, for the experiment of Eqn. (1), $$\beta_0$$ can represent the overall mean instead of $$\mu$$, and $$\beta_1$$ can represent the independent effect, $$\tau_1$$, of factor $$A$$. Other main effects can be represented in a similar manner. The notation for the interaction effects is much simpler, e.g. $$\beta_{123}$$ can be used to represent the three factor interaction effect, $$(\tau\delta\gamma)_{111}$$.

As mentioned earlier, it is important to note that the coding for the indicator variables for the ANOVA models of two level factorial experiments is reversed from the coding followed in Chapter 6, Analysis of Experiments. Here $$-1$$ represents the first level of the factor while $$1$$ represents the second level. This is because for a two level factor a single variable is needed to represent the factor, for both qualitative and quantitative factors. For quantitative factors, using $$-1$$ for the first level (which is the low level) and $$1$$ for the second level (which is the high level) keeps the coding consistent with the numerical value of the factors. The change between the two coding schemes does not affect the analysis, except that the signs of the estimated effect coefficients will be reversed (i.e. the numerical values of $$\hat{\tau}_1$$, obtained based on the coding of Chapter 6, and $$\hat{\beta}_1$$, obtained based on the new coding, will be the same but their signs will be opposite).

In summary, the ANOVA model for the experiments with all factors at two levels is different from the ANOVA models for other experiments in terms of the notation in the following two ways:

The notation of the regression models is used for the effect coefficients. The coding of the indicator variables is reversed.

Special Features
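Consider the design matrix, $$X$$, for the $$2^3$$ design shown in Figure 7.3 (b). The $$X^{\prime}X$$ matrix is:

$$X^{\prime}X=\begin{bmatrix} 8 & 0 & \cdots & 0 \\ 0 & 8 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & 8 \end{bmatrix}$$

Notice that, due to the orthogonal design of the $$X$$ matrix, $$X^{\prime}X$$ has been simplified to a diagonal matrix which can be written as:

$$X^{\prime}X=8I$$

where $$I$$ represents the identity matrix of the same order as the design matrix, $$X$$. Since there are eight observations per replicate of the $$2^3$$ design, the $$X^{\prime}X$$ matrix for $$m$$ replicates of this design can be written as:

$$X^{\prime}X=8mI$$

The $$X^{\prime}X$$ matrix for any $$2^k$$ design can now be written as:

$$X^{\prime}X=2^k mI \qquad (3)$$

Then the variance-covariance matrix for the $$2^k$$ design is:

$$C=\hat{\sigma}^2(X^{\prime}X)^{-1}=MS_E\cdot (X^{\prime}X)^{-1} \qquad (4)$$

$$C=\frac{MS_E}{2^k m}I \qquad (5)$$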

Note that the variance-covariance matrix for the $$2^k$$ design is also a diagonal matrix. Therefore, the estimated effect coefficients ($$\hat{\beta}_0$$, $$\hat{\beta}_1$$, etc.) for these designs are uncorrelated. This implies that the terms in the $$2^k$$ design (main effects, interactions) are independent of each other. Consequently, the extra sum of squares for each of the terms in these designs is independent of the sequence of terms in the model, and also independent of the presence of other terms in the model. As a result, the sequential and partial sum of squares for the terms are identical for these designs and will always add up to the model sum of squares. Multicollinearity is also not an issue for these designs.

It can also be noted from Eqn. (5) that, in addition to the $$C$$ matrix being diagonal, all diagonal elements of the $$C$$ matrix are identical. This means that the variance (or its square root, the standard error) of all estimated effect coefficients is the same. The standard error, $$se(\hat{\beta}_j)$$, for all the coefficients is:

$$se(\hat{\beta}_j)=\sqrt{C_{jj}}=\sqrt{\frac{MS_E}{2^k m}} \quad \text{for all } j \qquad (6)$$

This property is used to construct the normal probability plot of effects in $$2^k$$ designs and identify significant effects using graphical techniques. For details on the normal probability plot of effects in DOE++, refer to Chapter 7, Normal Probability Plot of Effects.
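The diagonal structure of $$X^{\prime}X$$ and the common standard error can be checked numerically. The sketch below uses a $$2^2$$ design with two replicates; the helper names and the error mean square value are hypothetical and chosen only for illustration.

```python
from math import sqrt

def xtx(X):
    """Compute X'X for a matrix given as a list of rows."""
    p = len(X[0])
    return [[sum(row[i] * row[j] for row in X) for j in range(p)]
            for i in range(p)]

def coefficient_standard_error(ms_e, k, m):
    """Common standard error of every coefficient: sqrt(MS_E / (2^k * m))."""
    return sqrt(ms_e / (2 ** k * m))

# One replicate of the 2^2 design with columns I, A, B, AB in standard order.
one_replicate = [[1, a, b, a * b] for b in (-1, 1) for a in (-1, 1)]
m = 2                                  # number of replicates
X = one_replicate * m                  # stack the replicates

for row in xtx(X):
    print(row)                         # diagonal entries equal 2^k * m = 8

# Hypothetical error mean square, used only to show the formula.
print(coefficient_standard_error(ms_e=4.0, k=2, m=m))
```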

Example 7.1

To illustrate the analysis of a full factorial $$2^k$$ design, consider a three factor experiment to investigate the effect of honing pressure, number of strokes and cycle time on the surface finish of automobile brake drums. Each of these factors is investigated at two levels. The honing pressure is investigated at levels of 200 and 400, the number of strokes used is 3 and 5, and the two levels of the cycle time are 3 and 5 seconds. The design for this experiment is set up in DOE++ as shown in Figures 7.5 and 7.6. It is decided to run two replicates for this experiment. The surface finish data collected from each run (using randomization) and the complete design are shown in Figure 7.7. The analysis of the experiment data is explained next.



Figure 7.5: Design properties for the experiment in Example 7.1.



Figure 7.6: Factor properties for the experiment in Example 7.1.



Figure 7.7: Experiment design for Example 7.1 to investigate the surface finish of automobile brake drums.

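The applicable ANOVA model using the notation for $$2^k$$ designs is:

$$Y=\beta_0+\beta_1\cdot x_1+\beta_2\cdot x_2+\beta_{12}\cdot x_1x_2+\beta_3\cdot x_3+\beta_{13}\cdot x_1x_3+\beta_{23}\cdot x_2x_3+\beta_{123}\cdot x_1x_2x_3+\epsilon \qquad (7)$$

where the indicator variable, $$x_1$$, represents factor $$A$$ (honing pressure), $$x_1=-1$$ represents the low level of 200 and $$x_1=1$$ represents the high level of 400. Similarly, $$x_2$$ and $$x_3$$ represent factors $$B$$ (number of strokes) and $$C$$ (cycle time), respectively. $$\beta_0$$ is the overall mean, while $$\beta_1$$, $$\beta_2$$ and $$\beta_3$$ are the effect coefficients for the main effects of factors $$A$$, $$B$$ and $$C$$, respectively. $$\beta_{12}$$, $$\beta_{13}$$ and $$\beta_{23}$$ are the effect coefficients for the $$AB$$, $$AC$$ and $$BC$$ interactions, while $$\beta_{123}$$ represents the $$ABC$$ interaction.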

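If the subscripts for the run ($$i$$; $$i=$$ 1 to 8) and replicates ($$j$$; $$j=$$ 1,2) are included, then the model can be written as:

$$Y_{ij}=\beta_0+\beta_1\cdot x_{ij1}+\beta_2\cdot x_{ij2}+\beta_{12}\cdot x_{ij1}x_{ij2}+\beta_3\cdot x_{ij3}+\beta_{13}\cdot x_{ij1}x_{ij3}+\beta_{23}\cdot x_{ij2}x_{ij3}+\beta_{123}\cdot x_{ij1}x_{ij2}x_{ij3}+\epsilon_{ij} \qquad (8)$$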

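To investigate how the given factors affect the response, the following hypothesis tests need to be carried out:

$$H_0:\ \beta_1=0 \qquad H_1:\ \beta_1\ne 0$$

This test investigates the main effect of factor $$A$$ (honing pressure). The statistic for this test is:

$$(F_0)_A=\frac{MS_A}{MS_E}$$

where $$MS_A$$ is the mean square for factor $$A$$ and $$MS_E$$ is the error mean square. Hypotheses for the other main effects, $$B$$ and $$C$$, can be written in a similar manner.

$$H_0:\ \beta_{12}=0 \qquad H_1:\ \beta_{12}\ne 0$$

This test investigates the two factor interaction $$AB$$. The statistic for this test is:

$$(F_0)_{AB}=\frac{MS_{AB}}{MS_E}$$

where $$MS_{AB}$$ is the mean square for the interaction $$AB$$ and $$MS_E$$ is the error mean square. Hypotheses for the other two factor interactions, $$AC$$ and $$BC$$, can be written in a similar manner.

$$H_0:\ \beta_{123}=0 \qquad H_1:\ \beta_{123}\ne 0$$

This test investigates the three factor interaction $$ABC$$. The statistic for this test is:

$$(F_0)_{ABC}=\frac{MS_{ABC}}{MS_E}$$

where $$MS_{ABC}$$ is the mean square for the interaction $$ABC$$ and $$MS_E$$ is the error mean square.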

To calculate the test statistics, it is convenient to express the ANOVA model of Eqn. (7) in the form $$y=X\beta +\epsilon$$.

Expression of the ANOVA Model as $$y=X\beta +\epsilon $$

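In matrix notation, the ANOVA model of Eqn. (7) can be expressed as:

$$y=X\beta +\epsilon$$

where $$y$$ is the vector of the 16 observed responses (eight runs, two replicates), $$X$$ is the corresponding 16 × 8 design matrix with one column for the intercept and one for each of the seven effects, $$\beta$$ is the vector of the eight coefficients of Eqn. (7) and $$\epsilon$$ is the vector of random error terms.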

Calculation of the Extra Sum of Squares for the Factors

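Knowing the matrices $$y$$, $$X$$ and $$\beta$$, the extra sum of squares for the factors can be calculated. These are used to calculate the mean squares that are used to obtain the test statistics. Since the experiment design is orthogonal, the partial and sequential extra sum of squares are identical. The extra sum of squares for each effect can be calculated as shown next. As an example, the extra sum of squares for the main effect of factor $$A$$ is:

$$SS_A=SS_R(\text{model})-SS_R(\sim A)=y^{\prime}[H-(1/16)J]y-y^{\prime}[H_{\sim A}-(1/16)J]y$$

where $$H$$ is the hat matrix and $$J$$ is the matrix of ones. The matrix $$H_{\sim A}$$ can be calculated using $$H_{\sim A}=X_{\sim A}(X_{\sim A}^{\prime}X_{\sim A})^{-1}X_{\sim A}^{\prime}$$, where $$X_{\sim A}$$ is the design matrix, $$X$$, excluding the second column that represents the main effect of factor $$A$$. Evaluating this expression with the data of Figure 7.7 gives the sum of squares for the main effect of factor $$A$$.

Similarly, the extra sum of squares for the interaction effect $$AB$$ is:

$$SS_{AB}=SS_R(\text{model})-SS_R(\sim AB)=y^{\prime}[H-(1/16)J]y-y^{\prime}[H_{\sim AB}-(1/16)J]y$$

The extra sum of squares for other effects can be obtained in a similar manner.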
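The extra sum of squares calculation can be sketched without explicit matrix inversion, because for an orthogonal design with $$n$$ runs $$X^{\prime}X=nI$$ and the hat matrix reduces to $$H=XX^{\prime}/n$$. The example below uses a $$2^2$$ design with two replicates and hypothetical response values (not the data of Example 7.1); the function names are also hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def regression_ss(columns, y):
    """y'[H - (1/n)J]y for orthogonal +/-1 columns (intercept column included).

    With X'X = n I the hat matrix is H = XX'/n, so y'Hy is the sum of
    (column . y)^2 / n over the columns of X. This shortcut is only valid
    for orthogonal designs such as the 2^k designs.
    """
    n = len(y)
    return sum(dot(col, y) ** 2 for col in columns) / n - sum(y) ** 2 / n

def extra_ss(columns, y, drop):
    """Regression SS of the full model minus that of the model without one column."""
    reduced = [col for i, col in enumerate(columns) if i != drop]
    return regression_ss(columns, y) - regression_ss(reduced, y)

# 2^2 design, two replicates, runs in the order (1), a, b, ab.
A = [-1, 1, -1, 1] * 2
B = [-1, -1, 1, 1] * 2
columns = [[1] * 8, A, B, [a * b for a, b in zip(A, B)]]
y = [28, 36, 18, 31, 25, 32, 19, 30]   # hypothetical responses

ss_a = extra_ss(columns, y, drop=1)    # column 1 is the main effect of A
print(ss_a)
print(dot(A, y) ** 2 / len(y))         # same value from the contrast of A alone
```

The agreement of the two printed values reflects the point made in the text: in an orthogonal design the extra sum of squares of a term does not depend on the other terms in the model.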

Calculation of the Test Statistics

Knowing the extra sum of squares, the test statistic for the effects can be calculated. For example, the test statistic for the interaction $$AB$$ is:

$$(f_0)_{AB}=\frac{MS_{AB}}{MS_E}=\frac{SS_{AB}/dof(SS_{AB})}{SS_E/dof(SS_E)}$$

where $$MS_{AB}$$ is the mean square for the $$AB$$ interaction and $$MS_E$$ is the error mean square. The $$p$$ value corresponding to the statistic, $$(f_0)_{AB}$$, based on the $$F$$ distribution with one degree of freedom in the numerator and eight degrees of freedom in the denominator is:

$$p\text{ value}=1-P(F\le (f_0)_{AB})$$

Assuming that the desired significance is 0.1, since $$p$$ value > 0.1, it can be concluded that the interaction between honing pressure and number of strokes does not affect the surface finish of the brake drums. Tests for other effects can be carried out in a similar manner. The results are shown in the ANOVA Table in Figure 7.8. The values S, R-sq and R-sq(adj) in the figure indicate how well the model fits the data. The value of S represents the standard error of the model, R-sq represents the coefficient of multiple determination and R-sq(adj) represents the adjusted coefficient of multiple determination. For details on these values refer to Chapter 5, Multiple Linear Regression Analysis.



Figure 7.8: ANOVA table for the experiment in Example 7.1.

Calculation of Effect Coefficients

The estimate of effect coefficients can also be obtained:

$$\hat{\beta}=(X^{\prime}X)^{-1}X^{\prime}y$$

The coefficients and related results are shown in the Regression Information Table in Figure 7.9. In the table, the Effect column displays the effects, which are simply twice the coefficients. The Standard Error column displays the standard error, $$se(\hat{\beta}_j)$$. The Low CI and High CI columns display the confidence interval on the coefficients. The interval shown is the 90% interval, as the significance is chosen as 0.1. The T Value column displays the $$t$$ statistic, $$t_0$$, corresponding to the coefficients. The P Value column displays the $$p$$ value corresponding to the $$t$$ statistic. (For details on how these results are calculated, refer to Chapter 6, Analysis of Experiments.) Plots of residuals can also be obtained from DOE++ to ensure that the assumptions related to the ANOVA model of Eqn. (7) are not violated.
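The quantities in the Regression Information table can be sketched for a small case. Because $$X^{\prime}X=2^k mI$$ for these designs, each coefficient is simply the product of its column with $$y$$ divided by the number of runs. The example below uses a replicated $$2^2$$ design with hypothetical responses (not the data of Example 7.1), and the function name `analyze` is hypothetical.

```python
from math import sqrt

def analyze(columns, y):
    """Coefficients, effects and t statistics for an orthogonal two level design."""
    n = len(y)
    # With X'X = n I, (X'X)^-1 X'y reduces to (column . y) / n.
    coef = {name: sum(c * v for c, v in zip(col, y)) / n
            for name, col in columns.items()}
    fitted = [sum(coef[name] * col[i] for name, col in columns.items())
              for i in range(n)]
    ss_e = sum((obs - fit) ** 2 for obs, fit in zip(y, fitted))
    ms_e = ss_e / (n - len(columns))   # needs replicates so that n > number of terms
    se = sqrt(ms_e / n)                # identical for every coefficient
    table = {name: {"coefficient": b, "effect": 2 * b, "t": b / se}
             for name, b in coef.items()}
    return table, ms_e, se

# 2^2 design, two replicates, runs in the order (1), a, b, ab.
A = [-1, 1, -1, 1] * 2
B = [-1, -1, 1, 1] * 2
columns = {"I": [1] * 8, "A": A, "B": B, "AB": [a * b for a, b in zip(A, B)]}
y = [28, 36, 18, 31, 25, 32, 19, 30]   # hypothetical responses

table, ms_e, se = analyze(columns, y)
for name, row in table.items():
    print(name, row)
```

The "effect" entry for the intercept column has no physical meaning and is kept only to keep the sketch short.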



Figure 7.9: Regression Information table for the experiment in Example 7.1.

Model Equation

From the analysis results in Figure 7.8, it is seen that effects, and  are significant. In DOE++, the values for the significant effects are displayed in red in the ANOVA Table for easy identification. Using the values of the estimated effect coefficients, the model for the present 2 design in terms of the coded values can be written as:

To make the model hierarchical, the main effect,, needs to be included in the model (because the interaction is included in the model). [Note] The resulting model is:

This equation can be viewed in DOE++, as shown in Figure 7.10, using the Show Analysis Summary icon in the Control Panel. The equation shown in the figure will match the hierarchical model once the required terms are selected using the Select Effects icon.



Figure 7.10: The model equation for the experiment of Example 7.1.

Replicated and Repeated Runs
In the case of replicated experiments, it is important to note the difference between replicated runs and repeated runs. Both repeated and replicated runs are multiple response readings taken at the same factor levels. However, repeated runs are response observations taken at the same time or in succession. Replicated runs are response observations recorded in a random order. Therefore, replicated runs include more variation than repeated runs. For example, a baker who wants to investigate the effect of two factors on the quality of cakes will have to bake four cakes to complete one replicate of a $$2^2$$ design. Assume that the baker bakes eight cakes in all. If, for each of the four treatments of the $$2^2$$ design, the baker selects one treatment at random and then bakes two cakes for this treatment at the same time, then this is a case of two repeated runs. If, however, the baker bakes all the eight cakes randomly, then the eight cakes represent two sets of replicated runs.

For repeated measurements, the average values of the response for each treatment should be entered into DOE++ as shown in Figure 7.11 (a) when the two cakes for a particular treatment are baked together. For replicated measurements, when all the cakes are baked randomly, the data is entered as shown in Figure 7.11 (b).



Figure 7.11: Data entry for repeated and replicated runs. Figure (a) shows repeated runs and (b) shows replicated runs.

Unreplicated $$2^k $$ Designs
Sometimes it is only possible to run a single replicate of the $$2^k$$ design because of constraints on resources and time. As stated in Chapter 6, in an unreplicated experiment the error sum of squares cannot be obtained, as the model fits the data perfectly and no degrees of freedom are available to calculate the error sum of squares. In the absence of the error sum of squares, hypothesis tests to identify significant factors cannot be conducted. A number of methods of analyzing information obtained from unreplicated $$2^k$$ designs are available. These include pooling higher order interactions, using the normal probability plot of effects or including center point replicates in the design.

Pooling Higher Order Interactions
One of the ways to deal with unreplicated $$2^k$$ designs is to use the sum of squares of some of the higher order interactions as the error sum of squares, provided these higher order interactions can be assumed to be insignificant. By dropping some of the higher order interactions from the model, the degrees of freedom corresponding to these interactions can be used to estimate the error mean square. Once the error mean square is known, the test statistics to conduct hypothesis tests on the factors can be calculated.
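A minimal sketch of pooling is shown below for an unreplicated $$2^3$$ design. The response values and the choice of pooled terms are hypothetical and serve only to illustrate the arithmetic; the function names are not part of DOE++. Each effect has one degree of freedom, so the pooled error mean square is the sum of the pooled sums of squares divided by the number of pooled terms.

```python
from itertools import combinations
from math import prod

def effect_sums_of_squares(y, k, letters="ABCDEFGH"):
    """Sum of squares of every effect in an unreplicated 2^k design.

    y must be listed in standard order: (1), a, b, ab, c, ...
    """
    n = 2 ** k
    runs = [[1 if (i >> j) & 1 else -1 for j in range(k)] for i in range(n)]
    ss = {}
    for size in range(1, k + 1):
        for term in combinations(range(k), size):
            contrast = sum(prod(run[j] for j in term) * obs
                           for run, obs in zip(runs, y))
            ss["".join(letters[j] for j in term)] = contrast ** 2 / n
    return ss

def pooled_f_statistics(ss, pooled_terms):
    """Use the pooled terms as error and return (MS_E, F statistics of the rest)."""
    ms_e = sum(ss[t] for t in pooled_terms) / len(pooled_terms)
    return ms_e, {t: v / ms_e for t, v in ss.items() if t not in pooled_terms}

y = [12, 18, 13, 20, 15, 25, 14, 27]   # hypothetical responses in standard order
ss = effect_sums_of_squares(y, 3)
# Assumption for this sketch: AB, BC and ABC are judged to be insignificant.
ms_e, f_stats = pooled_f_statistics(ss, ["AB", "BC", "ABC"])
print(ms_e)
print(f_stats)
```

Each resulting statistic would then be compared with the $$F$$ distribution with one degree of freedom in the numerator and, in this sketch, three in the denominator.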

Normal Probability Plot of Effects
Another way to use unreplicated $$2^k$$ designs to identify significant effects is to construct the normal probability plot of the effects. As mentioned in Chapter 7, Special Features, the standard error for all effect coefficients in the $$2^k$$ designs is the same. Therefore, on a normal probability plot of effect coefficients, all non-significant effect coefficients (with $$\beta=0$$) will fall along the straight line representative of the normal distribution, N($$0,\sigma^2/(2^k\cdot m)$$). Effect coefficients that show large deviations from this line will be significant, since they do not come from this normal distribution. Similarly, since effects $$=2\times$$ effect coefficients, all non-significant effects will also follow a straight line on the normal probability plot of effects. For replicated designs, the Effects Probability plot of DOE++ plots the normalized effect values (or the T Values) on the standard normal probability line, N(0,1). However, in the case of unreplicated $$2^k$$ designs, $$\sigma^2$$ remains unknown since $$MS_E$$ cannot be obtained. Lenth's method is used in this case to estimate the variance of the effects.[11] DOE++ then uses this variance value to plot effects along the N(0, Lenth's effect variance) line. The method is illustrated in the following example.

Example 7.2

Vinyl panels, used as instrument panels in a certain automobile, are seen to develop defects after a certain amount of time. To investigate the issue, it is decided to carry out a two level factorial experiment. Potential factors to be investigated in the experiment are vacuum rate (factor $$A$$), material temperature (factor $$B$$), element intensity (factor $$C$$) and pre-stretch (factor $$D$$). The two levels of the factors used in the experiment are as shown in Table 7.1. The factor properties entered in DOE++ using this table are shown in Figure 7.12. With a $$2^4$$ design requiring 16 runs per replicate, it is only feasible for the manufacturer to run a single replicate.



Table 7.1: Factors to investigate defects in vinyl panels.



Figure 7.12: Factor properties for the experiment in Example 7.2.

The experiment design and data, collected as percent defects, are shown in Figure 7.13. Since the present experiment design contains only a single replicate, it is not possible to obtain an estimate of the error sum of squares, $$SS_E$$. It is decided to use the normal probability plot of effects to identify the significant effects. The effect values for each term are obtained as shown in Figure 7.14. Lenth's method uses these values to estimate the variance. If all effects are arranged in ascending order, using their absolute values, then $$s_0$$ is defined as 1.5 times the median value:[11]

$$s_0=1.5\cdot median(|effect|)$$



Figure 7.13: Experiment design for Example 7.2.

Using $$s_0$$, the "pseudo standard error" ($$PSE$$) is calculated as 1.5 times the median value of all effects that are less than 2.5 $$s_0$$:

$$PSE=1.5\cdot median(|effect|:\ |effect|<2.5\, s_0)$$

Using $$PSE$$ as an estimate of the effect variance, the effect variance is 2.25. Knowing the effect variance, the normal probability plot of effects for the present unreplicated experiment can be constructed as shown in Figure 7.15. The line on this plot is the line N(0, 2.25). The plot shows that the effects $$A$$ and $$D$$ and the interaction $$AD$$ do not follow the distribution represented by this line. Therefore, these effects are significant.

The significant effects can also be identified by comparing individual effect values to the margin of error, or the threshold value, using the pareto chart (see Figure 7.16). If the required significance is 0.1, then:

$$\text{margin of error}=t_{\alpha/2,d}\cdot PSE$$

The $$t$$ statistic, $$t_{\alpha/2,d}$$, is calculated at a significance of $$\alpha/2$$ (for the two-sided hypothesis) and degrees of freedom $$d=$$ (number of effects)/3. Thus, with 15 effects:

$$\text{margin of error}=t_{0.05,5}\cdot PSE=2.015\times 2.25=4.534$$

The value of 4.534 is shown as the critical value line in Figure 7.16. All effects with absolute values greater than the margin of error can be considered to be significant. These effects are $$A$$, $$D$$ and the interaction $$AD$$. Therefore, the vacuum rate, the pre-stretch and their interaction have a significant effect on the defects of the vinyl panels.
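Lenth's calculation can be sketched as follows. The effect values in the demonstration list are hypothetical (the actual values are those of Figure 7.14), the function names are not part of DOE++, and the $$t$$ value is passed in by hand rather than computed.

```python
from statistics import median

def lenth_pse(effects):
    """Return (s0, PSE) following the two-step median rule described above."""
    abs_effects = [abs(e) for e in effects]
    s0 = 1.5 * median(abs_effects)
    trimmed = [e for e in abs_effects if e < 2.5 * s0]
    return s0, 1.5 * median(trimmed)

def margin_of_error(pse, t_crit):
    """Margin of error = t(alpha/2, d) * PSE, with d = (number of effects) / 3."""
    return t_crit * pse

# Hypothetical effect estimates: one large effect among several small ones.
effects = [0.5, -1.0, 1.5, 2.0, -0.5, 1.0, 20.0]
s0, pse = lenth_pse(effects)
print(s0, pse)

# The threshold quoted in the example, assuming PSE = 2.25 and t(0.05, 5) = 2.015.
print(round(margin_of_error(2.25, 2.015), 3))
```

Passing the critical $$t$$ value as an argument keeps the sketch within the Python standard library; a statistics package could be used to look it up instead.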



Figure 7.14: Effect values for the experiment in Example 7.2.



Figure 7.15: Normal probability plot of effects for the experiment in Example 7.2.



Figure 7.16: Pareto chart for the experiment in Example 7.2.

Center Point Replicates
Another method of dealing with unreplicated $$2^k$$ designs that only have quantitative factors is to use replicated runs at the center point. The center point is the response corresponding to the treatment exactly midway between the two levels of all factors. Running multiple replicates at this point provides an estimate of pure error. Although running multiple replicates at any treatment level can provide an estimate of pure error, the other advantage of running center point replicates in the $$2^k$$ design is in checking for the presence of curvature. The test for curvature investigates whether the model between the response and the factors is linear and is discussed below in Using Center Point Replicates to Test Curvature.

Example 7.3

Consider a $$2^2$$ experiment design to investigate the effect of two factors, $$A$$ and $$B$$, on a certain response. The energy consumed when the treatments of the $$2^2$$ design are run is considerably larger than the energy consumed for the center point run (because at the center point the factors are at their middle levels). Therefore, the analyst decides to run only a single replicate of the design and augment the design by five replicated runs at the center point, as shown in Figure 7.17. The design properties for this experiment are shown in Figure 7.18. The complete experiment design is shown in Figure 7.19. The center points can be used in the identification of significant effects as shown next.



Figure 7.17: $$2^2$$ design augmented by five center point runs.



Figure 7.18: Design properties for the experiment in Example 7.3.



Figure 7.19: Experiment design for Example 7.3.

Since the present $$2^2$$ design is unreplicated, there are no degrees of freedom available to calculate the error sum of squares. By augmenting this design with five center points, the response values at the center points, $$y_i^c$$, can be used to obtain an estimate of pure error, $$SS_{PE}$$. Let $$\bar{y}^c$$ represent the average response for the five replicates at the center. Then:

$$SS_{PE}=\sum_{i=1}^{5}(y_i^c-\bar{y}^c)^2$$

Then the corresponding mean square is:

$$MS_{PE}=\frac{SS_{PE}}{dof(SS_{PE})}=\frac{SS_{PE}}{5-1}$$

Alternatively, $$MS_{PE}$$ can be directly obtained by calculating the variance of the response values at the center points:

$$MS_{PE}=s^2=\frac{\sum_{i=1}^{5}(y_i^c-\bar{y}^c)^2}{5-1}$$

Once $$MS_{PE}$$ is known, it can be used as the error mean square, $$MS_E$$, to carry out the test of significance for each effect. For example, to test the significance of the main effect of factor $$A$$, the sum of squares corresponding to this effect, $$SS_A$$, is obtained in the usual manner by considering only the four runs of the original $$2^2$$ design.

Then, the test statistic to test the significance of the main effect of factor $$A$$ is:

$$(f_0)_A=\frac{MS_A}{MS_E}=\frac{SS_A/1}{MS_{PE}}$$

The $$p$$ value corresponding to the statistic, $$(f_0)_A$$, based on the $$F$$ distribution with one degree of freedom in the numerator and four degrees of freedom in the denominator (the degrees of freedom of $$MS_{PE}$$) is:

$$p\text{ value}=1-P(F\le (f_0)_A)$$

Assuming that the desired significance is 0.1, since $$p$$ value < 0.1, it can be concluded that the main effect of factor $$A$$ significantly affects the response. This result is displayed in the ANOVA table as shown in Figure 7.20. Tests for the significance of other factors can be carried out in a similar manner.
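The pure error calculation can be sketched as below. All response values are hypothetical (the actual data is in Figure 7.19), and the function name `pure_error` is an assumption made for this illustration.

```python
from statistics import mean

def pure_error(center_responses):
    """Return (SS_PE, degrees of freedom, MS_PE) from the center point replicates."""
    y_bar = mean(center_responses)
    ss_pe = sum((v - y_bar) ** 2 for v in center_responses)
    dof = len(center_responses) - 1
    return ss_pe, dof, ss_pe / dof

# Hypothetical responses: five center point runs and one replicate of the 2^2 design.
center = [25, 27, 24, 26, 28]
factorial = {"(1)": 20, "a": 30, "b": 22, "ab": 34}

ss_pe, dof_pe, ms_pe = pure_error(center)

# Sum of squares of factor A from the four factorial runs only: contrast^2 / 4.
contrast_a = factorial["a"] + factorial["ab"] - factorial["(1)"] - factorial["b"]
ss_a = contrast_a ** 2 / 4

f_a = (ss_a / 1) / ms_pe   # compare with F(1, dof_pe)
print(ss_pe, dof_pe, ms_pe)
print(ss_a, f_a)
```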



Figure 7.20: Results for the experiment in Example 7.3.

Using Center Point Replicates to Test Curvature
Center point replicates can also be used to check for curvature in replicated or unreplicated $$2^k$$ designs. The test for curvature investigates whether the model between the response and the factors is linear. The way DOE++ handles center point replicates is similar to its handling of blocks. The center point replicates are treated as an additional factor in the model. The factor is labeled as Curvature in the results of DOE++. If Curvature turns out to be a significant factor in the results, then this is an indication of the presence of curvature in the model.

Example 7.4

To illustrate the use of center point replicates in testing for curvature, consider again the data of the single replicate $$2^2$$ experiment from Figure 7.17. Let $$x_1$$ be the indicator variable that indicates whether a run is a center point.

If $$x_2$$ and $$x_3$$ are the indicator variables representing factors $$A$$ and $$B$$, respectively, then the model for this experiment is:

$$Y=\beta_0+\beta_1\cdot x_1+\beta_2\cdot x_2+\beta_3\cdot x_3+\beta_{23}\cdot x_2x_3+\epsilon$$

To investigate the presence of curvature, the following hypotheses need to be tested:

$$H_0:\ \beta_1=0 \qquad H_1:\ \beta_1\ne 0$$

The test statistic to be used for this test is:

$$(F_0)_{Curvature}=\frac{MS_{Curvature}}{MS_E}$$

where $$MS_{Curvature}$$ is the mean square for Curvature and $$MS_E$$ is the error mean square.

Calculation of the Sum of Squares

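The $$X$$ matrix and $$y$$ vector for this experiment are built from the nine runs of the design (four factorial runs and five center point runs).

The sum of squares can now be calculated. For example, the error sum of squares is:

$$SS_E=y^{\prime}[I-H]y$$

where $$I$$ is the identity matrix and $$H$$ is the hat matrix. It can be seen that this is equal to $$SS_{PE}$$ (the sum of squares due to pure error) because of the replicates at the center point, as obtained in Example 7.3. The number of degrees of freedom associated with $$SS_E$$, $$dof(SS_E)$$, is four. The extra sum of squares corresponding to the center point replicates (or Curvature) is:

$$SS_{Curvature}=SS_R(\text{model})-SS_R(\sim Curvature)=y^{\prime}[H-(1/9)J]y-y^{\prime}[H_{\sim Curvature}-(1/9)J]y$$

where $$H$$ is the hat matrix and $$J$$ is the matrix of ones. The matrix $$H_{\sim Curvature}$$ can be calculated using $$H_{\sim Curvature}=X_{\sim Curv}(X_{\sim Curv}^{\prime}X_{\sim Curv})^{-1}X_{\sim Curv}^{\prime}$$, where $$X_{\sim Curv}$$ is the design matrix, $$X$$, excluding the second column that represents the center point. Evaluating this expression gives the extra sum of squares corresponding to Curvature.

This extra sum of squares can be used to test for the significance of curvature. The corresponding mean square is:

$$MS_{Curvature}=\frac{SS_{Curvature}}{dof(SS_{Curvature})}=\frac{SS_{Curvature}}{1}$$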

Calculation of the Test Statistic

Knowing the mean squares, the statistic to check the significance of curvature can be calculated.

$$(f_0)_{Curvature}=\frac{MS_{Curvature}}{MS_E}$$

The $$p$$ value corresponding to the statistic, $$(f_0)_{Curvature}$$, based on the $$F$$ distribution with one degree of freedom in the numerator and four degrees of freedom in the denominator is:

$$p\text{ value}=1-P(F\le (f_0)_{Curvature})$$

Assuming that the desired significance is 0.1, since $$p$$ value > 0.1, it can be concluded that curvature does not exist for this design. This result is shown in the ANOVA table in Figure 7.20. The surface of the fitted model based on these results, along with the observed response values, is shown in Figure 7.21.
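A short sketch of the curvature test is given below. Instead of the hat matrices, it uses the standard single degree of freedom formula based on the difference between the average factorial response and the average center point response, which should give the same extra sum of squares for a design of this kind. The data values are hypothetical and the function name is an assumption.

```python
from statistics import mean

def curvature_test(factorial_responses, center_responses):
    """Return (SS_Curvature, MS_E, F statistic) for a 2^k design with center points.

    Assumes a single replicate of the factorial runs with all effects in the
    model, so that the error mean square is the pure error from the center points.
    """
    n_f, n_c = len(factorial_responses), len(center_responses)
    y_bar_f, y_bar_c = mean(factorial_responses), mean(center_responses)
    ss_curvature = n_f * n_c * (y_bar_f - y_bar_c) ** 2 / (n_f + n_c)
    ms_e = sum((v - y_bar_c) ** 2 for v in center_responses) / (n_c - 1)
    return ss_curvature, ms_e, ss_curvature / ms_e   # Curvature has one dof

# Hypothetical responses: four factorial runs and five center point runs.
factorial = [20, 30, 22, 34]
center = [25, 27, 24, 26, 28]

ss_curv, ms_e, f_curv = curvature_test(factorial, center)
print(ss_curv, ms_e, f_curv)   # compare f_curv with F(1, 4)
```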



Figure 7.21: Model surface and observed response values for the design in Example 7.4.

Blocking in $$2^k $$ Designs
Blocking can be used in the $$2^k$$ designs to deal with cases when replicates cannot be run under identical conditions. Randomized complete block designs that were discussed in Chapter 6 for factorial experiments are also applicable here. At times, even with just two levels per factor, it is not possible to run all treatment combinations for one replicate of the experiment under homogeneous conditions. For example, each replicate of the $$2^2$$ design requires four runs. If each run requires two hours and testing facilities are available for only four hours per day, two days of testing would be required to run one complete replicate. Blocking can be used to separate the treatment runs on the two different days. Blocks that do not contain all treatments of a replicate are called incomplete blocks. In incomplete block designs, the block effect is confounded with certain effect(s) under investigation. For the $$2^2$$ design, assume that treatments $$(1)$$ and $$ab$$ were run on the first day and treatments $$a$$ and $$b$$ were run on the second day. Then, the incomplete block design for this experiment is:

Block 1: $$(1)$$, $$ab$$

Block 2: $$a$$, $$b$$

For this design the block effect may be calculated as:

$$\text{Block effect}=\frac{(1)+ab}{2}-\frac{a+b}{2}=\frac{1}{2}[(1)+ab-a-b] \qquad (9)$$

The $$AB$$ interaction effect is:

$$AB=\frac{ab+(1)}{2}-\frac{b+a}{2}=\frac{1}{2}[(1)+ab-a-b] \qquad (10)$$

Eqns. (9) and (10) show that, in this design, the $$AB$$ interaction effect cannot be distinguished from the block effect because the formulas to calculate these effects are the same. In other words, the $$AB$$ interaction is said to be confounded with the block effect, and it is not possible to say if the effect calculated based on these equations is due to the $$AB$$ interaction effect, the block effect or both. In incomplete block designs some effects are always confounded with the blocks. Therefore, it is important to design these experiments in such a way that the important effects are not confounded with the blocks. In most cases, the experimenter can assume that higher order interactions are unimportant. In this case, it would be better to use incomplete block designs that confound these effects with the blocks.

One way to design incomplete block designs is to use defining contrasts as shown next:

$$L = \alpha_1 q_1 + \alpha_2 q_2 + \cdots + \alpha_k q_k $$ (11)

where the $$\alpha_i $$s are the exponents for the factors in the effect that is to be confounded with the block effect and the $$q_i $$s are values based on the level of the $$i $$th factor (in a treatment that is to be allocated to a block). For $$2^k $$ designs the $$\alpha_i $$s are either 0 or 1 and the $$q_i $$s have a value of 0 for the low level of the $$i $$th factor and a value of 1 for the high level of the factor in the treatment under consideration. As an example, consider the $$2^2 $$ design where the interaction effect $$AB $$ is confounded with the block. Since there are two factors, $$k=2 $$, with $$i=1 $$ representing factor $$A $$ and $$i=2 $$ representing factor $$B $$. Therefore:

$$L = \alpha_1 q_1 + \alpha_2 q_2 $$

The value of $$\alpha_1 $$ is one because the exponent of factor $$A $$ in the confounded interaction $$AB $$ is one. Similarly, the value of $$\alpha_2 $$ is one because the exponent of factor $$B $$ in the confounded interaction $$AB $$ is also one. Therefore, the defining contrast for this design can be written as:

$$L = q_1 + q_2 $$

Once the defining contrast is known, it can be used to allocate treatments to the blocks. For the $$2^2 $$ design, there are four treatments $$(1) $$, $$a $$, $$b $$ and $$ab $$. Assume that $$L=1 $$ represents block 2 and $$L=0 $$ represents block 1. In order to decide which block the treatment $$(1) $$ belongs to, the levels of factors $$A $$ and $$B $$ for this run are used. Since factor $$A $$ is at the low level in this treatment, $$q_1=0 $$. Similarly, since factor $$B $$ is also at the low level in this treatment, $$q_2=0 $$. Therefore:

$$L = q_1 + q_2 = 0 + 0 = 0 $$

Note that the value of $$L $$ used to decide the block allocation is "mod 2" of the original value. This value is obtained by taking the value of 1 for odd numbers and 0 otherwise. Based on the value of $$L $$, treatment $$(1) $$ is assigned to block 1. Other treatments can be assigned using the following calculations:

$$a: \ L = 1 + 0 = 1 $$
$$b: \ L = 0 + 1 = 1 $$
$$ab: \ L = 1 + 1 = 2 = 0 \ (\text{mod } 2) $$

Therefore, to confound the interaction $$AB $$ with the block effect in the $$2^2 $$ incomplete block design, treatments $$a $$ and $$b $$ (with $$L=1 $$) should be assigned to block 2 and treatment combinations $$(1) $$ and $$ab $$ (with $$L=0 $$) should be assigned to block 1.
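The allocation rule can be sketched in a few lines of Python. This is only an illustrative sketch, not part of DOE++: the function names `treatment_label` and `allocate_two_blocks` are hypothetical, and the mapping of $$L=0 $$ and $$L=1 $$ to block numbers is left to the reader.

```python
from itertools import product

def treatment_label(levels, factors):
    """Lowercase letters of the factors at the high level; '(1)' when all are low."""
    label = "".join(f.lower() for f, q in zip(factors, levels) if q == 1)
    return label or "(1)"

def allocate_two_blocks(factors, confounded):
    """Group the 2^k treatments by L = sum(alpha_i * q_i) mod 2."""
    # alpha_i is 1 when factor i appears in the confounded effect, else 0
    alpha = [1 if f in confounded else 0 for f in factors]
    groups = {0: [], 1: []}
    for levels in product((0, 1), repeat=len(factors)):
        L = sum(a * q for a, q in zip(alpha, levels)) % 2
        groups[L].append(treatment_label(levels, factors))
    return groups

two_factor = allocate_two_blocks("AB", "AB")        # confound AB in a 2^2 design
four_factor = allocate_two_blocks("ABCD", "ABCD")   # confound ABCD in a 2^4 design
print("2^2, AB confounded:", two_factor)
print("2^4, ABCD confounded:", four_factor)
```

The same function handles any single confounded effect, because only the factors that appear in the effect contribute to the contrast.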

Example 7.5

This example illustrates how treatments can be allocated to two blocks for an unreplicated $$2^k $$ design. Consider the unreplicated $$2^4 $$ design to investigate the four factors affecting the defects in automobile vinyl panels discussed in Chapter 7, Normal Probability Plot of Effects. Assume that the 16 treatments required for this experiment were run by two different operators with each operator conducting 8 runs. This experiment is an example of an incomplete block design. The analyst in charge of this experiment assumed that the $$ABCD $$ interaction was not significant and decided to allocate treatments to the two operators so that the $$ABCD $$ interaction was confounded with the block effect (the two operators are the blocks). The allocation scheme to assign treatments to the two operators can be obtained as follows.

The defining contrast for the $$2^4 $$ design where the $$ABCD $$ interaction is confounded with the blocks is:

$$L = q_1 + q_2 + q_3 + q_4 $$

The treatments can be allocated to the two operators using the values of the defining contrast. Assume that $$L=1 $$ represents block 2 and $$L=0 $$ represents block 1. Then the value of the defining contrast for treatment $$(1) $$ is:

$$(1): \ L = 0 + 0 + 0 + 0 = 0 $$

Therefore, treatment $$(1) $$ should be assigned to Block 1 or the first operator. Similarly, for treatment $$a $$ we have:

$$a: \ L = 1 + 0 + 0 + 0 = 1 $$

Therefore, $$a $$ should be assigned to Block 2 or the second operator. Other treatments can be allocated to the two operators in a similar manner to arrive at the allocation scheme shown in Figure 7.22.



Figure 7.22: Allocation of treatments to two blocks for the $$2^4 $$ design in Example 7.5 by confounding interaction $$ABCD $$ with the blocks.

In DOE++, to confound the $$ABCD $$ interaction for the $$2^4 $$ design into two blocks, the number of blocks is specified as shown in Figure 7.23. Then the interaction $$ABCD $$ is entered in the Block Generator window (Figure 7.24) which is available using the Block Generator button in Figure 7.23. The design generated by DOE++ is shown in Figure 7.25. This design matches the allocation scheme of Figure 7.22.



Figure 7.23: Adding block properties for the experiment in Example 7.5.



Figure 7.24: Specifying the interaction $$ABCD $$ as the interaction to be confounded with the blocks for Example 7.5.



Figure 7.25: Two block design for the experiment in Example 7.5.

For the analysis of this design, the sums of squares for all effects are calculated assuming no blocking. Then, to account for blocking, the sum of squares corresponding to the $$ABCD $$ interaction is considered as the sum of squares due to blocks. In DOE++ this is done by displaying this sum of squares as the sum of squares due to the blocks. This is shown in Figure 7.26 where the sum of squares in question is obtained as 72.25 and is displayed against Block. The interaction ABCD, which is confounded with the blocks, is not displayed. Since the design is unreplicated, any of the methods to analyze unreplicated designs mentioned in Chapter 7, Unreplicated 2k Designs, have to be used to identify significant effects.



Figure 7.26: ANOVA table for the experiment of Example 7.5.

Unreplicated $$2^k $$ Designs in $$2^p $$ Blocks
A single replicate of the $$2^k $$ design can be run in up to $$2^p $$ blocks where $$p<k $$. The number of effects confounded with the blocks equals the degrees of freedom associated with the block effect. If two blocks are used (the block effect has two levels), then one ($$2-1=1 $$) effect is confounded with the blocks. If four blocks are used, then three ($$4-1=3 $$) effects are confounded with the blocks and so on. For example an unreplicated $$2^4 $$ design may be confounded in $$2^2 $$ (four) blocks using two contrasts, $$L_1 $$ and $$L_2 $$. Let $$AC $$ and $$BD $$ be the effects to be confounded with the blocks. Corresponding to these two effects, the contrasts are respectively:

$$L_1 = q_1 + q_3 $$
$$L_2 = q_2 + q_4 $$

Based on the values of $$L_1 $$ and $$L_2 $$ (each taken mod 2) the treatments can be assigned to the four blocks as follows:

$$L_1=0, \ L_2=0 $$: $$(1) $$, $$ac $$, $$bd $$, $$abcd $$
$$L_1=1, \ L_2=0 $$: $$a $$, $$c $$, $$abd $$, $$bcd $$
$$L_1=0, \ L_2=1 $$: $$b $$, $$d $$, $$abc $$, $$acd $$
$$L_1=1, \ L_2=1 $$: $$ab $$, $$ad $$, $$bc $$, $$cd $$

Since the block effect has three degrees of freedom, three effects are confounded with the block effect. In addition to $$AC $$ and $$BD $$, the third effect confounded with the block effect is their generalized interaction, $$(AC)(BD)=ABCD $$.

In general, when an unreplicated $$2^k $$ design is confounded in $$2^p $$ blocks, $$p $$ contrasts are needed ($$L_1, L_2, \ldots, L_p $$). $$p $$ effects are selected to define these contrasts such that none of these effects is the generalized interaction of the others. The $$2^p $$ blocks can then be assigned the treatments using the $$p $$ contrasts. A further $$2^p-(p+1) $$ effects, that are also confounded with the blocks, are then obtained as the generalized interactions of the $$p $$ effects. In the statistical analysis of these designs, the sums of squares are computed as if no blocking were used. Then the block sum of squares is obtained by adding the sums of squares for all the effects confounded with the blocks.
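The general procedure can be sketched as follows. The helper names are hypothetical and the block generators `AC` and `BD` are used only as an illustrative choice for a $$2^4 $$ design in four blocks; generalized interactions are formed by cancelling letters that appear an even number of times.

```python
from itertools import combinations, product

def generalized_interaction(*effects):
    """Multiply effects with exponents reduced mod 2 (repeated letters cancel)."""
    letters = set()
    for effect in effects:
        letters ^= set(effect)
    return "".join(sorted(letters))

def confounded_effects(generators):
    """The p generators plus their 2^p - (p + 1) generalized interactions."""
    found = set()
    for r in range(1, len(generators) + 1):
        for combo in combinations(generators, r):
            found.add(generalized_interaction(*combo))
    return sorted(found, key=lambda e: (len(e), e))

def allocate_blocks(factors, generators):
    """Group treatments by the tuple (L_1, ..., L_p), each contrast taken mod 2."""
    blocks = {}
    for levels in product((0, 1), repeat=len(factors)):
        q = dict(zip(factors, levels))
        key = tuple(sum(q[f] for f in g) % 2 for g in generators)
        label = "".join(f.lower() for f in factors if q[f]) or "(1)"
        blocks.setdefault(key, []).append(label)
    return blocks

generators = ["AC", "BD"]                      # assumed block generators
confounded = confounded_effects(generators)
blocks = allocate_blocks("ABCD", generators)
print("Confounded with blocks:", confounded)
for key in sorted(blocks):
    print(key, blocks[key])
```

Each key of `blocks` is one combination of contrast values, so the number of keys equals the number of blocks.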

Example 7.6

This example illustrates how DOE++ obtains the sum of squares when treatments for an unreplicated $$2^k $$ design are allocated among four blocks. Consider again the unreplicated $$2^4 $$ design used to investigate the defects in automobile vinyl panels presented in Chapter 7, Normal Probability Plot of Effects. Assume that the 16 treatments needed to complete the experiment were run by four operators. Therefore, there are four blocks. Assume that the treatments were allocated to the blocks using the generators mentioned in the previous section, i.e. treatments were allocated among the four operators by confounding the effects $$AC $$ and $$BD $$ with the blocks. These effects can be specified as Block Generators as shown in Figure 7.27. (The generalized interaction of these two effects, interaction $$ABCD $$, will also get confounded with the blocks.) The resulting design is shown in Figure 7.28 and matches the allocation scheme obtained in the previous section.



Figure 7.27: Specifying the interactions $$AC $$ and $$BD $$ as block generators for Example 7.6.

The sum of squares in this case can be obtained by calculating the sum of squares for each of the effects assuming there is no blocking. Once the individual sums of squares have been obtained, the block sum of squares can be calculated. The block sum of squares is the sum of the sums of squares of effects $$AC $$, $$BD $$ and $$ABCD $$, since these effects are confounded with the block effect. As shown in Figure 7.29, this sum of squares is 92.25 and is displayed against Block. The interactions $$AC $$, $$BD $$ and $$ABCD $$, which are confounded with the blocks, are not displayed. Since the present design is unreplicated, any of the methods to analyze unreplicated designs mentioned in Chapter 7, Unreplicated 2k Designs, have to be used to identify significant effects.



Figure 7.28: Design for the experiment in Example 7.6.



Figure 7.29: ANOVA table for the experiment in Example 7.6.

Variability Analysis
For replicated two level factorial experiments, DOE++ provides the option of conducting variability analysis (using the Variability Analysis icon in the Control Panel). The analysis is used to identify the treatment that results in the least amount of variation in the product or process being investigated. Variability analysis is conducted by treating the standard deviation of the response for each treatment of the experiment as an additional response. The standard deviation for a treatment is obtained by using the replicated response values at that treatment run. As an example, consider the $$2^3 $$ design shown in Figure 7.30 where each run is replicated four times. A variability analysis can be conducted for this design. DOE++ calculates eight standard deviation values, one for each treatment of the design (see Figure 7.31). Then, the design is analyzed as an unreplicated $$2^3 $$ design with the standard deviations (displayed as Y Std. in Figure 7.31) as the response. The normal probability plot of effects identifies a single two factor interaction as the effect that influences variability (see Figure 7.32). Based on the effect coefficients obtained in Figure 7.33, the model for Y Std. contains an intercept and this interaction term.

Based on the model, the experimenter has two choices to minimize variability (by minimizing Y Std.), both of which set the two factors in the interaction at opposite levels. The first choice is to set the first factor of the interaction at the high level (coded value $$1 $$) and the second factor at the low level (coded value $$-1 $$). The second choice is to set the first factor at the low level (coded value $$-1 $$) and the second factor at the high level (coded value $$1 $$). The experimenter can select the most feasible choice.
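The mechanics of a variability analysis can be sketched in Python. All response values below are hypothetical and are generated so that the spread depends on the $$AC $$ interaction (an arbitrary choice made for illustration, not the data of Figure 7.30); the function names are likewise assumptions.

```python
from itertools import product
from statistics import stdev

FACTORS = "ABC"
runs = list(product((-1, 1), repeat=3))   # coded levels (A, B, C) of the 2^3 design

def fake_replicates(a, b, c):
    """Four hypothetical replicates whose spread depends on the product a*c."""
    spread = 1.0 + 0.5 * a * c
    centre = 10 + 2 * a
    return [centre + spread * d for d in (-1.5, -0.5, 0.5, 1.5)]

# Step 1: one standard deviation per treatment (the "Y Std." response)
y_std = [stdev(fake_replicates(*run)) for run in runs]

def column(effect, run):
    """Coded value of an effect column for one run (product of factor levels)."""
    value = 1
    for f in effect:
        value *= run[FACTORS.index(f)]
    return value

# Step 2: analyze Y Std. as the response of an unreplicated 2^3 design
effects = {}
for effect in ["A", "B", "C", "AB", "AC", "BC", "ABC"]:
    contrast = sum(column(effect, run) * s for run, s in zip(runs, y_std))
    effects[effect] = contrast / (len(runs) / 2)

largest = max(effects, key=lambda e: abs(effects[e]))
print(effects)
print("Effect with the largest influence on variability:", largest)
```

Because the interaction effect on Y Std. is positive in this made-up data, the standard deviation is smallest when the two factors are at opposite levels.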



Figure 7.30: A 2³ design with four replicated response values that can be used to conduct a variability analysis.



Figure 7.31: Variability analysis in DOE++.



Figure 7.32: Normal probability plot of effects for the variability analysis example.



Figure 7.33: Effect coefficients for the variability analysis example.

Two Level Fractional Factorial Designs
As the number of factors in a two level factorial design increases, the number of runs for even a single replicate of the $$2^k $$ design becomes very large. For example, a single replicate of an eight factor two level experiment would require 256 runs. Fractional factorial designs can be used in these cases to draw out valuable conclusions from fewer runs. The basis of fractional factorial designs is the sparsity of effects principle. [15] The principle states that, most of the time, responses are affected by a small number of main effects and lower order interactions, while higher order interactions are relatively unimportant. Fractional factorial designs are used as screening experiments during the initial stages of experimentation. At these stages, a large number of factors have to be investigated and the focus is on the main effects and two factor interactions. These designs obtain information about main effects and lower order interactions with fewer experiment runs by confounding these effects with unimportant higher order interactions. As an example, consider a $$2^8 $$ design that requires 256 runs. This design allows for the investigation of 8 main effects and 28 two factor interactions. However, 219 of its 255 degrees of freedom are devoted to three factor or higher order interactions. This full factorial design can prove to be very inefficient when these higher order interactions can be assumed to be unimportant. Instead, a fractional design can be used here to identify the important factors that can then be investigated more thoroughly in subsequent experiments. In unreplicated fractional factorial designs, no degrees of freedom are available to calculate the error sum of squares and the techniques mentioned in Chapter 7, Unreplicated 2k Designs, should be employed for the analysis of these designs.
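The degree of freedom count quoted for the eight factor design can be checked with a short sketch (the function name `effect_counts` is hypothetical). The number of effects of a given order in a $$2^k $$ design is a binomial coefficient.

```python
from math import comb

def effect_counts(k):
    """Number of effects of each order in a 2^k design (order 1 = main effects)."""
    return {order: comb(k, order) for order in range(1, k + 1)}

counts = effect_counts(8)
main_effects = counts[1]
two_factor = counts[2]
higher_order = sum(n for order, n in counts.items() if order >= 3)
total = sum(counts.values())   # 2^8 - 1 degrees of freedom for effects
print(main_effects, two_factor, higher_order, total)
```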

Half-Fraction Designs
A half-fraction of the $$2^k $$ design involves running only half of the treatments of the full factorial design. For example, consider a $$2^3 $$ design that requires eight runs in all. The design matrix for this design is shown in Figure 7.34 (a). A half-fraction of this design is the design in which only four of the eight treatments are run. The fraction is denoted as $$2^{3-1} $$ with the "$$-1 $$" in the index denoting a half-fraction. Assume that the treatments chosen for the half-fraction design are the ones where the interaction $$ABC $$ is at the high level (i.e. only those rows are chosen from Figure 7.34 (a) where the column for $$ABC $$ has entries of 1). The resulting $$2^{3-1} $$ design has a design matrix as shown in Figure 7.34 (b).

In the $$2^{3-1} $$ design of Figure 7.34 (b), since the interaction $$ABC $$ is always included at the same level (the high level represented by 1), it is not possible to measure this interaction effect. The effect, $$ABC $$, is called the generator or word for this design. It can be noted that, in the design matrix of Figure 7.34 (b), the column corresponding to the intercept, $$I $$, and the column corresponding to the interaction $$ABC $$, are identical. The identical columns are written as $$I=ABC $$ and this equation is called the defining relation for the design. In DOE++, the present $$2^{3-1} $$ design can be obtained by specifying the design properties as shown in Figure 7.35. The defining relation, $$I=ABC $$, is entered in the Fraction Generator window (Figure 7.36) using the Fraction Generator button shown in Figure 7.35. Note that in Figure 7.36, the defining relation is specified as $$C=AB $$. This relation is obtained by multiplying the defining relation, $$I=ABC $$, by the last factor, $$C $$, of the design.
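Selecting a half-fraction amounts to filtering the rows of the full design matrix on the sign of the generator column. A minimal sketch, with hypothetical function names and coded levels $$-1 $$ and $$1 $$:

```python
from itertools import product

FACTORS = "ABC"

def label(run):
    """Treatment label: letters of the factors at the high level, '(1)' if none."""
    return "".join(f.lower() for f, x in zip(FACTORS, run) if x == 1) or "(1)"

def half_fraction(sign):
    """Runs (A, B, C) of the 2^3 design whose ABC column equals `sign`."""
    return [run for run in product((-1, 1), repeat=3)
            if run[0] * run[1] * run[2] == sign]

principal = half_fraction(+1)    # ABC column at +1
alternate = half_fraction(-1)    # ABC column at -1
print("ABC = +1:", [label(r) for r in principal])
print("ABC = -1:", [label(r) for r in alternate])

# Within the ABC = +1 fraction the C column equals the AB column
c_equals_ab = all(c == a * b for a, b, c in principal)
print("C = AB in the ABC = +1 fraction:", c_equals_ab)
```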



Figure 7.34: Half-fractions of the $$2^3 $$ design. (a) shows the full factorial $$2^3 $$ design, (b) shows the $$2^{3-1} $$ design with the defining relation $$I=ABC $$ and (c) shows the $$2^{3-1} $$ design with the defining relation $$I=-ABC $$.



Figure 7.35: Design properties for the $$2^{3-1} $$ design.



Figure 7.36: Specifying the defining relation for the $$2^{3-1} $$ design.

Calculation of Effects
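Using the four runs of the $$2^{3-1} $$ design in Figure 7.34 (b), the main effects can be calculated as follows:

$$A = \frac{1}{2}(a-b-c+abc) $$ (12)

$$B = \frac{1}{2}(-a+b-c+abc) $$ (13)

$$C = \frac{1}{2}(-a-b+c+abc) $$ (14)

where $$a $$, $$b $$, $$c $$ and $$abc $$ are the treatments included in the $$2^{3-1} $$ design.

Similarly, the two factor interactions can also be obtained as:

$$BC = \frac{1}{2}(a-b-c+abc) $$ (15)

$$AC = \frac{1}{2}(-a+b-c+abc) $$ (16)

$$AB = \frac{1}{2}(-a-b+c+abc) $$ (17)

Eqns. (12) and (15) result in the same effect values showing that effects $$A $$ and $$BC $$ are confounded in the present $$2^{3-1} $$ design. Thus, the quantity $$\frac{1}{2}(a-b-c+abc) $$ estimates $$A+BC $$ (i.e. both the main effect $$A $$ and the two-factor interaction $$BC $$). The effects $$A $$ and $$BC $$ are called aliases. From Eqns. (13) and (16), and (14) and (17), it can be seen that the other aliases for this design are $$B $$ and $$AC $$, and $$C $$ and $$AB $$. Therefore, the equations to calculate the effects in the present $$2^{3-1} $$ design can be written as follows:

$$A+BC = \frac{1}{2}(a-b-c+abc) $$ (18)

$$B+AC = \frac{1}{2}(-a+b-c+abc) $$ (19)

$$C+AB = \frac{1}{2}(-a-b+c+abc) $$ (20)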

Calculation of Aliases
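Aliases for a fractional factorial design can be obtained using the defining relation for the design. The defining relation for the present $$2^{3-1} $$ design is:

$$I = ABC $$

Multiplying both sides of the previous equation by the main effect, $$A $$, gives the alias effect of $$A $$:

$$A \cdot I = A \cdot ABC = A^2BC = BC, \ \text{i.e.} \ A = BC $$

Note that in calculating the alias effects, any effect multiplied by $$I $$ remains the same, while an effect multiplied by itself results in $$I $$. Other aliases can also be obtained:

$$B \cdot I = B \cdot ABC = AB^2C = AC, \ \text{i.e.} \ B = AC $$

and:

$$C \cdot I = C \cdot ABC = ABC^2 = AB, \ \text{i.e.} \ C = AB $$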
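The multiplication rule (an effect times itself gives $$I $$) is equivalent to cancelling any letter that appears twice, which a symmetric difference of letter sets captures. A minimal sketch, with the hypothetical helper `multiply` and signs ignored:

```python
def letters(effect):
    """Set of factor letters in an effect; the identity 'I' has none."""
    return set() if effect == "I" else set(effect)

def multiply(effect1, effect2):
    """Product of two effects with exponents reduced mod 2."""
    remaining = letters(effect1) ^ letters(effect2)
    return "".join(sorted(remaining)) or "I"

word = "ABC"   # defining relation I = ABC
for effect in ("A", "B", "C"):
    print(effect, "is aliased with", multiply(effect, word))
```

For a defining relation with a negative sign, such as $$I=-ABC $$, the same letters result and the sign carries over to the alias.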

Fold-Over Design
If it can be assumed for this design that the two-factor interactions are unimportant, then in the absence of $$BC $$, $$AC $$ and $$AB $$, Eqns. (18) to (20) can be used to estimate the main effects $$A $$, $$B $$ and $$C $$, respectively. However, if such an assumption is not applicable, then to uncouple the main effects from their two factor aliases, the alternate fraction that contains runs having $$ABC $$ at the lower level should be run. The design matrix for this design is shown in Figure 7.34 (c). The defining relation for this design is $$I=-ABC $$ because the four runs for this design are obtained by selecting the rows of Figure 7.34 (a) for which the value of the $$ABC $$ column is $$-1 $$. The aliases for this fraction can be obtained as explained in Chapter 7, Calculation of Aliases as $$A=-BC $$, $$B=-AC $$ and $$C=-AB $$. The effects for this design can be calculated as:

$$A-BC = \frac{1}{2}(-(1)+ab+ac-bc) $$ (21)

$$B-AC = \frac{1}{2}(-(1)+ab-ac+bc) $$ (22)

$$C-AB = \frac{1}{2}(-(1)-ab+ac+bc) $$ (23)

These equations can be combined with Eqns. (18) to (20) to obtain the de-aliased main effects and two factor interactions. For example, adding Eqns. (18) and (21) gives $$(A+BC)+(A-BC)=2A $$, so half of the sum returns the main effect $$A $$. Similarly, half of the difference of the two equations returns the interaction $$BC $$.

The process of augmenting a fractional factorial design by a second fraction of the same size by simply reversing the signs (of all effect columns except $$I $$) is called folding over. The combined design is referred to as a fold-over design.
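The de-aliasing obtained from a fold-over can be illustrated numerically. The response model below is purely hypothetical (a true $$A $$ effect of 6 and a true $$BC $$ effect of 2, with no noise); it is only meant to show how the two fractions combine.

```python
from itertools import product

def response(a, b, c):
    """Hypothetical noise-free response: A effect = 6, BC effect = 2."""
    return 20 + 3 * a + 1 * b * c   # each coefficient is half the effect

def effect_of_a(runs):
    """Average response at high A minus average response at low A."""
    return sum(a * response(a, b, c) for a, b, c in runs) / (len(runs) / 2)

full = list(product((-1, 1), repeat=3))
principal = [r for r in full if r[0] * r[1] * r[2] == 1]    # I = ABC
alternate = [r for r in full if r[0] * r[1] * r[2] == -1]   # I = -ABC

est_principal = effect_of_a(principal)   # estimates A + BC
est_alternate = effect_of_a(alternate)   # estimates A - BC

a_effect = (est_principal + est_alternate) / 2
bc_effect = (est_principal - est_alternate) / 2
print(est_principal, est_alternate, a_effect, bc_effect)
```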

Quarter and Smaller Fraction Designs
At times, the number of runs even for a half-fraction design is very large. In these cases, smaller fractions are used. A quarter-fraction design, denoted as $$2^{k-2} $$, consists of a fourth of the runs of the full factorial design. Quarter-fraction designs require two defining relations. The first defining relation returns the half-fraction or the $$2^{k-1} $$ design. The second defining relation selects half of the runs of the $$2^{k-1} $$ design to give the quarter-fraction. For example, consider the $$2^4 $$ design. To obtain a $$2^{4-2} $$ design from this design, first a half-fraction of this design is obtained by using a defining relation. Assume that the defining relation used is $$I=ABCD $$. The design matrix for the resulting $$2^{4-1} $$ design is shown in Figure 7.37 (a). Now, a quarter-fraction can be obtained from the $$2^{4-1} $$ design of Figure 7.37 (a) using a second defining relation $$I=AD $$. The resulting $$2^{4-2} $$ design obtained is shown in Figure 7.37 (b). The complete defining relation for this $$2^{4-2} $$ design is:

$$I = ABCD = AD = BC $$ (24)

Figure 7.37: Fractions of the $$2^4 $$ design - Figure (a) shows the $$2^{4-1} $$ design with the defining relation $$I=ABCD $$ and (b) shows the $$2^{4-2} $$ design with the defining relation $$I=ABCD=AD=BC $$.

Note that the effect, $$BC $$, in the defining relation is the generalized interaction of $$ABCD $$ and $$AD $$ and is obtained using $$(ABCD)(AD)=A^2BCD^2=BC $$. In general, a $$2^{k-f} $$ fractional factorial design requires $$f $$ independent generators. The defining relation for the design consists of the $$f $$ independent generators and their $$2^f-(f+1) $$ generalized interactions.

Calculation of Aliases
The alias structure for the present $$2^{4-2} $$ design can be obtained using the defining relation of Eqn. (24) following the procedure explained in Chapter 7, Calculation of Aliases. For example, multiplying the defining relation by $$A $$ returns the effects aliased with the main effect, $$A $$, as follows:

$$A = BCD = D = ABC $$

Therefore, in the present $$2^{4-2} $$ design, it is not possible to distinguish between effects $$A $$, $$D $$, $$BCD $$ and $$ABC $$. Similarly, multiplying the defining relation by $$B $$ and $$AB $$ returns the effects that are aliased with these effects:

$$B = ACD = ABD = C $$
$$AB = CD = BD = AC $$

Other aliases can be obtained in a similar way. It can be seen that each effect in this design has three aliases. In general, each effect in a $$2^{k-f} $$ design has $$2^f-1 $$ aliases.
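The complete defining relation and the aliases of any effect can be generated mechanically from the independent generators. The sketch below uses hypothetical helper names and takes $$ABCD $$ and $$AD $$ as the generators of a $$2^{4-2} $$ design; signs are ignored.

```python
from itertools import combinations

def multiply(*effects):
    """Product of effects with exponents reduced mod 2 (repeated letters cancel)."""
    remaining = set()
    for effect in effects:
        remaining ^= set(effect)
    return "".join(sorted(remaining))

def defining_words(generators):
    """The f generators followed by their 2^f - (f + 1) generalized interactions."""
    words = []
    for r in range(1, len(generators) + 1):
        for combo in combinations(generators, r):
            words.append(multiply(*combo))
    return words

def aliases(effect, generators):
    """Effects aliased with `effect`: its product with every defining word."""
    return [multiply(effect, word) for word in defining_words(generators)]

generators = ["ABCD", "AD"]              # assumed generators for the 2^(4-2) design
words = defining_words(generators)
print("I =", " = ".join(words))
for effect in ("A", "B", "AB"):
    print(effect, "=", " = ".join(aliases(effect, generators)))
```

With $$f $$ generators the function returns $$2^f-1 $$ aliases per effect, matching the count given in the text.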

The aliases for the $$2^{4-2} $$ design show that in this design the main effects are aliased with each other ($$A $$ is aliased with $$D $$ and $$B $$ is aliased with $$C $$). Therefore, this design is not a useful design and is not available in DOE++. It is important to ensure that main effects and lower order interactions of interest are not aliased in a fractional factorial design. This can be checked by looking at the resolution of the fractional factorial design.

Design Resolution
The resolution of a fractional factorial design is defined as the number of factors in the lowest order effect in the defining relation. For example, in the defining relation $$I=ABCD=AD=BC $$ of the previous $$2^{4-2} $$ design, the lowest-order effect is either $$AD $$ or $$BC $$ containing two factors. Therefore, the resolution of this design is equal to two. The resolution of a fractional factorial design is represented using Roman numerals. For example, the previously mentioned $$2^{4-2} $$ design with a resolution of two can be represented as $$2_{\text{II}}^{4-2} $$. The resolution provides information about the confounding in the design as explained next:

1. Resolution III Designs

In these designs, the lowest order effect in the defining relation has three factors, e.g. a $$2_{\text{III}}^{3-1} $$ design with the defining relation $$I=ABC $$. In resolution III designs, no main effects are aliased with any other main effects, but main effects are aliased with two factor interactions. In addition, some two factor interactions are aliased with each other.

2. Resolution IV Designs

In these designs, the lowest order effect in the defining relation has four factors, e.g. a $$2_{\text{IV}}^{4-1} $$ design with the defining relation $$I=ABCD $$. In resolution IV designs, no main effects are aliased with any other main effects or two factor interactions. However, some main effects are aliased with three factor interactions and the two factor interactions are aliased with each other.

3. Resolution V Designs

In these designs the lowest order effect in the defining relation has five factors, e.g. a $$2_{\text{V}}^{5-1} $$ design with the defining relation $$I=ABCDE $$. In resolution V designs, no main effects or two factor interactions are aliased with any other main effects or two factor interactions. However, some main effects are aliased with four factor interactions and the two factor interactions are aliased with three factor interactions.

Fractional factorial designs with the highest resolution possible should be selected because the higher the resolution of the design, the less severe the degree of confounding. In general, designs with a resolution less than III are never used because in these designs some of the main effects are aliased with each other. Table 7.3 shows fractional factorial designs with the highest available resolution for three to ten factor designs along with their defining relations. In DOE++, these designs are shown with a green background in the Available Designs window (Figure 7.38). The window is available using the View Available Designs hyperlink in the Design Wizard.



Table 7.3: Highest resolution designs available for fractional factorial designs with 3 to 10 factors.

Figure 7.38: Two level fractional factorial designs available in DOE++ and their resolutions.

Minimum Aberration Designs
At times, different designs with the same resolution but different aliasing may be available. The best design to select in such a case is the minimum aberration design. For example, all $$2^{7-2} $$ designs in Table 7.4 have a resolution of four (since the generator with the minimum number of factors in each design has four factors). Design 1 has three generators of length four. Design 2 has two generators of length four. Design 3 has one generator of length four. Therefore, design 3 has the least number of generators with the minimum length of four. Design 3 is called the minimum aberration design. It can be seen that the alias structure for design 3 is less involved compared to the other designs. For details refer to [7].
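Resolution and aberration can both be read from the word length pattern of the defining relation. The sketch below compares three $$2^{7-2} $$ designs whose generator words are assumed for illustration (they are not taken from Table 7.4, which is not reproduced here); the helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations

def multiply(*words):
    """Product of words with exponents reduced mod 2."""
    remaining = set()
    for word in words:
        remaining ^= set(word)
    return "".join(sorted(remaining))

def word_length_pattern(generators):
    """Count of defining-relation words by length, e.g. {4: 3}."""
    pattern = Counter()
    for r in range(1, len(generators) + 1):
        for combo in combinations(generators, r):
            pattern[len(multiply(*combo))] += 1
    return dict(sorted(pattern.items()))

def aberration_key(pattern, n_factors):
    """Tuple of word counts for lengths 1..n; smaller tuples mean less aberration."""
    return tuple(pattern.get(length, 0) for length in range(1, n_factors + 1))

designs = {                        # assumed generator words for illustration
    "design 1": ["ABCF", "BCDG"],
    "design 2": ["ABCF", "ADEG"],
    "design 3": ["ABCDF", "ABDEG"],
}
patterns = {name: word_length_pattern(g) for name, g in designs.items()}
resolutions = {name: min(p) for name, p in patterns.items()}
best = min(patterns, key=lambda name: aberration_key(patterns[name], 7))
print(patterns)
print(resolutions)
print("Minimum aberration:", best)
```

Comparing the tuples lexicographically prefers the design with the fewest words of the shortest length, then the fewest words of the next length, and so on.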



Table 7.4: Three $$2^{7-2} $$ designs with different defining relations.

Example 7.7

The design of an automobile fuel cone is thought to be affected by six factors in the manufacturing process: cavity temperature (factor $$A $$), core temperature (factor $$B $$), melt temperature (factor $$C $$), hold pressure (factor $$D $$), injection speed (factor $$E $$) and cool time (factor $$F $$). The manufacturer of the fuel cone is unable to run the $$2^6=64 $$ runs required to complete one replicate for a two level full factorial experiment with six factors. Instead, they decide to run a fractional factorial design. Considering that three factor and higher order interactions are likely to be inactive, the manufacturer selects a $$2^{6-2} $$ design that will require only 16 runs. The manufacturer chooses the resolution IV design which will ensure that all main effects are free from aliasing (assuming three factor and higher order interactions are absent). However, in this design the two factor interactions may be aliased with each other. It is decided that, if important two factor interactions are found to be present, additional experiment trials may be conducted to separate the aliased effects. The performance of the fuel cone is measured on a scale of 1 to 15. In DOE++, the design for this experiment is set up using the properties shown in Figure 7.39. The Fraction Generators for the design, $$E=ABC $$ and $$F=BCD $$, are the same as the defaults used in DOE++. The resulting $$2^{6-2} $$ design and the corresponding response values are shown in Figure 7.40.



Figure 7.39: Design properties for the experiment in Example 7.7.



Figure 7.40: Experiment design for Example 7.7.

The complete alias structure for the $$2_{\text{IV}}^{6-2} $$ design is shown next. (In DOE++, the alias structure is displayed in the Results Panel (Figure 7.41) which is available using the Show Design Summary icon in the Control Panel):



Figure 7.41: Alias structure for the experiment design in Example 7.7.

The normal probability plot of effects for this unreplicated design shows the main effects of factors and  and the interaction effect,, to be significant (see Figure 7.42). From the alias structure, it can be seen that for the present design interaction effect, is confounded with. Therefore, the actual source of this effect cannot be known on the basis of the present experiment. However because neither factor nor  is found to be significant there is an indication that the observed effect is likely due to interaction,. To confirm this, a follow-up 2 experiment is run involving only factors and. The interaction,, is found to be inactive, leading to the conclusion that the interaction effect in the original experiment is effect,. Given these results, the fitted regression model for the fuel cone design as per the coefficients obtained from DOE++ is (see Figure 7.43):



Figure 7.42: Normal probability plot of effects for the experiment in Example 7.7.



Figure 7.43: Effect coefficients for the experiment in Example 7.7.

Projection
Projection refers to the reduction of a fractional factorial design to a full factorial design by dropping out some of the factors of the design. Any fractional factorial design of resolution $$R $$ can be reduced to complete factorial designs in any subset of $$R-1 $$ factors. For example, consider the $$2_{\text{IV}}^{7-3} $$ design. The resolution of this design is four. Therefore, this design can be reduced to full factorial designs in any three ($$4-1=3 $$) of the original seven factors (by dropping the remaining four factors). Further, a $$2^{k-f} $$ fractional factorial design can also be reduced to a full factorial design in any $$k-f $$ of the original factors, as long as these $$k-f $$ factors are not part of the generator in the defining relation. Again consider the $$2_{\text{IV}}^{7-3} $$ design. This design can be reduced to a full factorial design in four factors provided these four factors do not appear together as a generator in the defining relation. The complete defining relation for this design is:

$$I = ABCE = BCDF = ACDG = ADEF = BDEG = ABFG = CEFG $$

Therefore, there are seven four factor combinations out of the 35 possible four-factor combinations that are used as generators in the defining relation. The designs with the remaining 28 four factor combinations would be full factorial 16-run designs. For example, factors $$A $$, $$B $$, $$C $$ and $$D $$ do not occur as a generator in the defining relation of the $$2_{\text{IV}}^{7-3} $$ design. If the remaining factors, $$E $$, $$F $$ and $$G $$, are dropped, the $$2_{\text{IV}}^{7-3} $$ design will reduce to a full factorial design in $$A $$, $$B $$, $$C $$ and $$D $$.
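The projection property can be verified by brute force. The sketch below builds a 16-run seven factor design from the assumed generators $$E=ABC $$, $$F=BCD $$ and $$G=ACD $$ (a commonly tabulated choice, used here as an assumption) and counts the factor subsets whose projection is a full factorial.

```python
from itertools import combinations, product

def build_design():
    """16 runs of a 2^(7-3) design with assumed generators E=ABC, F=BCD, G=ACD."""
    runs = []
    for a, b, c, d in product((-1, 1), repeat=4):
        runs.append({"A": a, "B": b, "C": c, "D": d,
                     "E": a * b * c, "F": b * c * d, "G": a * c * d})
    return runs

def is_full_factorial(runs, subset):
    """True when every level combination of the chosen factors occurs."""
    points = {tuple(run[f] for f in subset) for run in runs}
    return len(points) == 2 ** len(subset)

runs = build_design()
three_ok = [s for s in combinations("ABCDEFG", 3) if is_full_factorial(runs, s)]
four_ok = [s for s in combinations("ABCDEFG", 4) if is_full_factorial(runs, s)]
print(len(three_ok), "of 35 three factor projections are full factorials")
print(len(four_ok), "of 35 four factor projections are full factorials")
```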

Resolution III Designs
At times, the number of factors to be investigated in screening experiments is so large that even running a fractional factorial design is impractical. This can be partially solved by using resolution III fractional factorial designs in the cases where three factor and higher order interactions can be assumed to be unimportant. Resolution III designs, such as the $$2_{\text{III}}^{3-1} $$ design, can be used to estimate $$k $$ main effects using just $$k+1 $$ runs. In these designs, the main effects are aliased with two factor interactions. Once the results from these designs are obtained, and knowing that three factor and higher order interactions are unimportant, the experimenter can decide if there is a need to run a fold-over design to de-alias the main effects from the two factor interactions. Thus, the $$2_{\text{III}}^{3-1} $$ design can be used to investigate three factors in four runs, the $$2_{\text{III}}^{7-4} $$ design can be used to investigate seven factors in eight runs, the $$2_{\text{III}}^{15-11} $$ design can be used to investigate fifteen factors in sixteen runs and so on.

Example 7.8

A baker wants to investigate the factors that most affect the taste of the cakes made in his bakery. He chooses to investigate seven factors, each at two levels: flour type (factor $$A $$), conditioner type (factor $$B $$), sugar quantity (factor $$C $$), egg quantity (factor $$D $$), preservative type (factor $$E $$), bake time (factor $$F $$) and bake temperature (factor $$G $$). The baker expects most of these factors and all higher order interactions to be inactive. On the basis of this, he decides to run a screening experiment using a $$2_{\text{III}}^{7-4} $$ design that requires just 8 runs. The cakes are rated on a scale of 1 to 10. The design properties for the $$2_{\text{III}}^{7-4} $$ design (with generators $$D=AB $$, $$E=AC $$, $$F=BC $$ and $$G=ABC $$) are shown in Figure 7.44. The resulting design along with the rating of the cakes corresponding to each run is shown in Figure 7.45.



Figure 7.44: Design properties for the experiment in Example 7.8.



Figure 7.45: Experiment design for Example 7.8.

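The normal probability plot of effects for the unreplicated design shows main effects $$C $$, $$D $$ and $$G $$ to be significant (see Figure 7.46). However, for this design, the following alias relations exist for the main effects (ignoring three factor and higher order interactions):

$$A = BD = CE = FG $$
$$B = AD = CF = EG $$
$$C = AE = BF = DG $$
$$D = AB = CG = EF $$
$$E = AC = BG = DF $$
$$F = BC = AG = DE $$
$$G = CD = BE = AF $$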



Figure 7.46: Normal probability plot of effects for the experiment in Example 7.8.

Based on the alias structure, three separate possible conclusions can be drawn. It can be concluded that effect $$CD $$ is active instead of $$G $$ so that effects $$C $$, $$D $$ and their interaction, $$CD $$, are the significant effects. Another conclusion can be that effect $$DG $$ is active instead of $$C $$ so that effects $$D $$, $$G $$ and their interaction, $$DG $$, are significant. Yet another conclusion can be that effects $$C $$, $$G $$ and their interaction, $$CG $$, are significant. To accurately discover the active effects, the baker decides to run a fold-over of the present design and base his conclusions on the effect values calculated once results from both the designs are available. Using the alias relations, the effects obtained from DOE++ for the present design (Figure 7.47) can be expressed as:



Figure 7.47: Effect values for the experiment in Example 7.8.

The fold-over design for the experiment is obtained by reversing the signs of the columns $$D $$, $$E $$ and $$F $$. The generators to be used are $$D=-AB $$, $$E=-AC $$, $$F=-BC $$ and $$G=ABC $$. The resulting design and the corresponding response values obtained are shown in Figure 7.48. The effect values obtained from DOE++ for this design (Figure 7.49) can be expressed as:



Figure 7.48: Fold-over design for the experiment in Example 7.8.



Figure 7.49: Effect values for the fold-over design in Example 7.8.

Using the effect values from both the designs, the effects can be separated (using addition and subtraction of the effect equations) as follows:

Comparing the absolute values of the effects, the largest effects are $$C $$, $$D $$ and the interaction $$CD $$. Therefore, the most important factors affecting the taste of the cakes in the present case are sugar quantity, egg quantity and their interaction.

Alias Matrix
In Chapter 7, Calculation of Aliases, the alias structure for fractional factorial designs was obtained using the defining relation. However, this method of obtaining the alias structure is not very efficient when the alias structure is very complex or when partial aliasing is involved. One of the ways to obtain the alias structure for any design, regardless of its complexity, is to use the alias matrix. The alias matrix for a design is calculated using $$(X_1^{\prime}X_1)^{-1}X_1^{\prime}X_2 $$ where $$X_1 $$ is the portion of the design matrix, $$X $$, that contains the effects for which the aliases need to be calculated, and $$X_2 $$ contains the remaining columns of the design matrix, other than those included in $$X_1 $$.

To illustrate the use of the alias matrix, consider the design matrix for the $$2_{\text{IV}}^{4-1} $$ design (using the defining relation $$I=ABCD $$) shown next:

The alias structure for this design can be obtained by defining $$X_1 $$ using eight columns since the $$2_{\text{IV}}^{4-1} $$ design estimates eight effects. If the first eight columns of $$X $$ are used then $$X_1 $$ is:

$$X_2 $$ is obtained using the remaining columns as:

Then the alias matrix $$(X_1^{\prime}X_1)^{-1}X_1^{\prime}X_2 $$ is:

The alias relations can be easily obtained by observing the alias matrix as:

$$I=ABCD $$, $$A=BCD $$, $$B=ACD $$, $$C=ABD $$, $$D=ABC $$, $$AB=CD $$, $$AC=BD $$, $$AD=BC $$
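The alias matrix calculation can be sketched in plain Python with exact fractions. The column ordering (intercept, main effects, then interactions in lexicographic order) and the helper names are assumptions made for this illustration, and the eight runs are generated from $$D=ABC $$, which corresponds to $$I=ABCD $$.

```python
from fractions import Fraction
from itertools import combinations, product

FACTORS = "ABCD"
runs = [(a, b, c, a * b * c) for a, b, c in product((-1, 1), repeat=3)]  # D = ABC

def column(effect):
    """Design matrix column for an effect ('' is the intercept)."""
    col = []
    for run in runs:
        value = 1
        for f in effect:
            value *= run[FACTORS.index(f)]
        col.append(value)
    return col

effects = [""] + ["".join(c) for r in range(1, 5) for c in combinations(FACTORS, r)]
x1_names, x2_names = effects[:8], effects[8:]   # assumed column ordering

def cross(left, right):
    """left' * right, with both matrices given as lists of columns."""
    return [[Fraction(sum(u * v for u, v in zip(lc, rc))) for rc in right]
            for lc in left]

def solve(m, rhs):
    """Gauss-Jordan elimination returning inverse(m) * rhs for non-singular m."""
    n = len(m)
    aug = [list(m[i]) + list(rhs[i]) for i in range(n)]
    for i in range(n):
        pivot = next(r for r in range(i, n) if aug[r][i] != 0)
        aug[i], aug[pivot] = aug[pivot], aug[i]
        p = aug[i][i]
        aug[i] = [v / p for v in aug[i]]
        for r in range(n):
            if r != i and aug[r][i] != 0:
                factor = aug[r][i]
                aug[r] = [vr - factor * vi for vr, vi in zip(aug[r], aug[i])]
    return [row[n:] for row in aug]

X1 = [column(e) for e in x1_names]
X2 = [column(e) for e in x2_names]
alias_matrix = solve(cross(X1, X1), cross(X1, X2))

aliases = {}
for name, row in zip(x1_names, alias_matrix):
    aliases[name or "I"] = [x2_names[j] for j, v in enumerate(row) if v != 0]
print(aliases)
```

A non-zero entry in a row marks an effect in $$X_2 $$ that is aliased with the row's effect in $$X_1 $$. Fractional entries, which do not arise in this regular design, would indicate partial aliasing.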