A factorial experiment often requires so many runs that they cannot all be completed under homogeneous conditions. Nuisance factors may then influence the results. Nuisance factors are factors that affect the response but are not of primary interest to the investigator. For example, two replicates of a two-factor factorial experiment with two levels per factor require eight runs. If only four runs can be completed in a day, the experiment takes two days. Differences in conditions between the two days may introduce effects on the response that are not caused by the two factors under investigation. The day is therefore a nuisance factor for this experiment.
Nuisance factors can be accounted for using blocking. In blocking, experimental runs are separated based on the levels of the nuisance factor. For the two-factor factorial experiment where the day is a nuisance factor, the runs can be separated into two groups, or blocks: runs carried out on the first day belong to block 1, and runs carried out on the second day belong to block 2. Thus, within each block conditions are the same with respect to the nuisance factor. As a result, each block investigates the effects of the factors of interest, while the difference between the blocks measures the effect of the nuisance factor.
For the example of the two-factor factorial experiment, one possible assignment of runs is to place one replicate of the experiment in block 1 and the second replicate in block 2, so that each block contains all possible treatment combinations. Within each block, the runs are randomized (i.e., randomization is now restricted to the runs within a block). Such a design, where each block contains one complete replicate and the treatments within a block are randomized, is called a randomized complete block design.
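The assignment described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `randomized_complete_block_design`, the level labels "low" and "high", and the seed are hypothetical choices, not part of the original example.

```python
import random

def randomized_complete_block_design(treatments, n_blocks, seed=None):
    """Return {block number: run order}; every block gets one full replicate."""
    rng = random.Random(seed)
    design = {}
    for block in range(1, n_blocks + 1):
        runs = list(treatments)   # each block holds one complete replicate
        rng.shuffle(runs)         # randomization is restricted to this block
        design[block] = runs
    return design

# Two factors at two levels each (hypothetical labels), two blocks (days).
treatments = [(a, b) for a in ("low", "high") for b in ("low", "high")]
design = randomized_complete_block_design(treatments, n_blocks=2, seed=1)
for block, runs in design.items():
    print(f"Block {block}: {runs}")
```

Each block is shuffled independently, so the run order differs from day to day while every treatment combination still appears exactly once per block.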
In summary, blocking should always be used to account for the effects of nuisance factors if it is not possible to hold the nuisance factor at a constant level through all of the experimental runs. Randomization should be used within each block to counter the effects of any unknown variability that may still be present.
Example 6.3
Consider the experiment of Table 6.5 where the mileage of a sports utility vehicle was investigated for the effects of speed and fuel additive type. Now assume that the three replicates for this experiment were carried out on three different vehicles. To ensure that the variation from one vehicle to another does not affect the analysis, each vehicle is considered as one block. (See the experiment design in Figure 6.13.) For the purpose of the analysis, the block is treated as a main effect, except that interactions between the block and the other main effects are assumed not to exist. Therefore, this experiment has one block main effect (with three levels: block 1, block 2 and block 3), two main effects (speed, with three levels, and fuel additive type, with two levels) and one interaction effect (the speed-fuel additive interaction). Let $\zeta_i$ represent the block effects. The hypothesis test on the block main effect checks whether there is a significant variation from one vehicle to the other. The statements for the hypothesis test are:
$$H_0: \zeta_1 = \zeta_2 = \zeta_3 = 0 \quad \text{(no main effect of block)}$$
$$H_1: \zeta_i \neq 0 \quad \text{for at least one } i$$
The test statistic for this test is:
$$(F_0)_{Block} = \frac{MS_{Block}}{MS_E}$$
where $MS_{Block}$ represents the mean square for the block main effect and $MS_E$ is the error mean square. The hypothesis statements and test statistics to test the significance of factors $A$ (speed), $B$ (fuel additive) and the interaction $AB$ (speed-fuel additive interaction) can be obtained as explained in Example 6.2. The ANOVA model for this example can be written as:
$$Y_{ijk} = \mu + \zeta_i + \tau_j + \delta_k + (\tau\delta)_{jk} + \epsilon_{ijk} \tag{26}$$
Figure 6.13: Randomized complete block design for the experiment in Table 6.5 using three blocks.
In Eqn. (26):
$\mu$ represents the overall mean effect.
$\zeta_i$ is the effect of the $i$th level of the block ($i = 1, 2, 3$).
$\tau_j$ is the effect of the $j$th level of factor $A$ ($j = 1, 2, 3$).
$\delta_k$ is the effect of the $k$th level of factor $B$ ($k = 1, 2$).
$(\tau\delta)_{jk}$ represents the interaction effect between $A$ and $B$.
$\epsilon_{ijk}$ represents the random error terms (which are assumed to be normally distributed with a mean of zero and variance of $\sigma^2$).
In order to calculate the test statistics, it is convenient to express the ANOVA model of Eqn. (26) in the form $y = X\beta + \epsilon$. This can be done as explained next.
Since the effects $\zeta_i$, $\tau_j$, $\delta_k$ and $(\tau\delta)_{jk}$ are defined as deviations from the overall mean, the following constraints exist.
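Constraints on $\zeta_i$ are:
$$\sum_{i=1}^{3} \zeta_i = \zeta_1 + \zeta_2 + \zeta_3 = 0 \tag{27}$$
Therefore, only two of the $\zeta_i$ effects are independent. Assuming that $\zeta_1$ and $\zeta_2$ are independent, $\zeta_3 = -(\zeta_1 + \zeta_2)$. (The null hypothesis to test the significance of the blocks can be rewritten using only the independent effects as $H_0: \zeta_1 = \zeta_2 = 0$.) In DOE++, the independent block effects, $\zeta_1$ and $\zeta_2$, are displayed as Block[1] and Block[2], respectively.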
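Constraints on $\tau_j$ are:
$$\sum_{j=1}^{3} \tau_j = \tau_1 + \tau_2 + \tau_3 = 0 \tag{28}$$
Therefore, only two of the $\tau_j$ effects are independent. Assuming that $\tau_1$ and $\tau_2$ are independent, $\tau_3 = -(\tau_1 + \tau_2)$. The independent effects, $\tau_1$ and $\tau_2$, are displayed as A[1] and A[2], respectively.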
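Constraints on $\delta_k$ are:
$$\sum_{k=1}^{2} \delta_k = \delta_1 + \delta_2 = 0 \tag{29}$$
Therefore, only one of the $\delta_k$ effects is independent. Assuming that $\delta_1$ is independent, $\delta_2 = -\delta_1$. The independent effect, $\delta_1$, is displayed as B:B.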
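Constraints on $(\tau\delta)_{jk}$ are:
$$(\tau\delta)_{11} + (\tau\delta)_{21} + (\tau\delta)_{31} = 0 \tag{30}$$
$$(\tau\delta)_{12} + (\tau\delta)_{22} + (\tau\delta)_{32} = 0 \tag{31}$$
$$(\tau\delta)_{11} + (\tau\delta)_{12} = 0 \tag{32}$$
$$(\tau\delta)_{21} + (\tau\delta)_{22} = 0 \tag{33}$$
$$(\tau\delta)_{31} + (\tau\delta)_{32} = 0 \tag{34}$$
Equations (30) to (34) represent four constraints, as only four of the five equations are independent. Therefore, only two of the six $(\tau\delta)_{jk}$ effects are independent. Assuming that $(\tau\delta)_{11}$ and $(\tau\delta)_{21}$ are independent, the other four effects can be expressed in terms of these two. The independent effects, $(\tau\delta)_{11}$ and $(\tau\delta)_{21}$, are displayed as A[1]B and A[2]B, respectively.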
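As a worked step, the four dependent interaction effects can be written out explicitly. The notation $(\tau\delta)_{jk}$, for the interaction of level $j$ of factor $A$ with level $k$ of factor $B$, is an assumed symbol choice here, with $(\tau\delta)_{11}$ and $(\tau\delta)_{21}$ taken as the independent effects. Using the sum over the levels of $A$ at the first level of $B$ and the three sums over the levels of $B$:

$$
\begin{aligned}
(\tau\delta)_{31} &= -(\tau\delta)_{11} - (\tau\delta)_{21} \\
(\tau\delta)_{12} &= -(\tau\delta)_{11} \\
(\tau\delta)_{22} &= -(\tau\delta)_{21} \\
(\tau\delta)_{32} &= -(\tau\delta)_{31} = (\tau\delta)_{11} + (\tau\delta)_{21}
\end{aligned}
$$

The remaining constraint, the sum over the levels of $A$ at the second level of $B$, is then satisfied automatically, since $-(\tau\delta)_{11} - (\tau\delta)_{21} + (\tau\delta)_{11} + (\tau\delta)_{21} = 0$. This is why only four of the five constraints are independent.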
Based on the ANOVA model of Eqn. (26) and the constraints of Eqns. (27) to (34), the regression version of the ANOVA model can be obtained using indicator variables. Since the block has three levels, two indicator variables, $x_1$ and $x_2$, are required, which need to be coded as shown next:
Block 1: $x_1 = 1$, $x_2 = 0$
Block 2: $x_1 = 0$, $x_2 = 1$
Block 3: $x_1 = -1$, $x_2 = -1$
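Factor $A$ has three levels and two indicator variables, $x_3$ and $x_4$, are required:
Level 1: $x_3 = 1$, $x_4 = 0$
Level 2: $x_3 = 0$, $x_4 = 1$
Level 3: $x_3 = -1$, $x_4 = -1$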
Factor $B$ has two levels and can be represented using one indicator variable, $x_5$, as follows:
Level 1: $x_5 = 1$
Level 2: $x_5 = -1$
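The $AB$ interaction will be represented by $x_3 x_5$ and $x_4 x_5$. The regression version of the ANOVA model can finally be obtained as:
$$Y = \mu + \zeta_1 x_1 + \zeta_2 x_2 + \tau_1 x_3 + \tau_2 x_4 + \delta_1 x_5 + (\tau\delta)_{11} x_3 x_5 + (\tau\delta)_{21} x_4 x_5 + \epsilon \tag{35}$$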
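The indicator-variable coding can be made concrete by building one design-matrix row per run. The sketch below assumes sum-to-zero (effect) coding, in which the last level of each factor is coded as -1, and lists the runs in a plain nested order. The helper names `effect_code` and `design_row` and the run order are illustrative assumptions, not the order used in Figure 6.13.

```python
from itertools import product

def effect_code(level, n_levels):
    """Sum-to-zero coding: the last level is coded -1 in every column."""
    if level == n_levels:
        return [-1] * (n_levels - 1)
    return [1 if level == col else 0 for col in range(1, n_levels)]

def design_row(block, a, b):
    """Row [1, x1, x2, x3, x4, x5, x3*x5, x4*x5] for one run."""
    x1, x2 = effect_code(block, 3)   # block indicators
    x3, x4 = effect_code(a, 3)       # factor A (three levels)
    (x5,) = effect_code(b, 2)        # factor B (two levels)
    return [1, x1, x2, x3, x4, x5, x3 * x5, x4 * x5]

# All 3 blocks x 3 levels of A x 2 levels of B = 18 runs (assumed order).
runs = list(product((1, 2, 3), (1, 2, 3), (1, 2)))
X = [design_row(*run) for run in runs]

for run, row in zip(runs, X):
    print(run, row)
```

Because the design is balanced, every column other than the intercept sums to zero, which mirrors the sum-to-zero constraints on the effects.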
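In matrix notation this model can be expressed as:
$$y = X\beta + \epsilon$$
where $y$ is the $18 \times 1$ vector of observed responses, $X$ is the $18 \times 8$ design matrix whose columns hold the intercept, $x_1$ to $x_5$, $x_3 x_5$ and $x_4 x_5$ for each run, and $\beta = [\mu \;\; \zeta_1 \;\; \zeta_2 \;\; \tau_1 \;\; \tau_2 \;\; \delta_1 \;\; (\tau\delta)_{11} \;\; (\tau\delta)_{21}]'$.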
Knowing $y$, $X$ and $\beta$, the sum of squares for the ANOVA model and the extra sum of squares for each of the factors can be calculated. These are used to calculate the mean squares that are used to obtain the test statistics.
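The model sum of squares, $SS_{TR}$, for the model of Eqn. (26) can be obtained as:
$$SS_{TR} = y'\left[H - \frac{1}{18}J\right]y$$
where $H = X(X'X)^{-1}X'$ is the hat matrix and $J$ is the $18 \times 18$ matrix of ones.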
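Since seven effect terms ($\zeta_1$, $\zeta_2$, $\tau_1$, $\tau_2$, $\delta_1$, $(\tau\delta)_{11}$ and $(\tau\delta)_{21}$) are used in the model, the number of degrees of freedom associated with $SS_{TR}$ is seven ($dof(SS_{TR}) = 7$).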
The total sum of squares can be calculated as:
$$SS_T = y'\left[I - \frac{1}{18}J\right]y$$
Since there are 18 observed response values, the number of degrees of freedom associated with the total sum of squares is 17 ($dof(SS_T) = 17$). The error sum of squares can now be obtained:
$$SS_E = SS_T - SS_{TR}$$
The number of degrees of freedom associated with the error sum of squares is:
$$dof(SS_E) = dof(SS_T) - dof(SS_{TR}) = 17 - 7 = 10$$
Since there are no true replicates of the treatments (as can be seen from the design of Figure 6.13, where each treatment is run just once within each block), all of the error sum of squares is the sum of squares due to lack of fit. The lack of fit arises because the model used is not a full model: it assumes that there are no interactions between the blocks and the other effects.
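The sequential sum of squares for the blocks can be calculated as:
$$SS_{Block} = y'\left[H_{Block} - \frac{1}{18}J\right]y$$
where $J$ is the $18 \times 18$ matrix of ones, $H_{Block}$ is the hat matrix, which is calculated using $H_{Block} = X_{Block}(X_{Block}'X_{Block})^{-1}X_{Block}'$, and $X_{Block}$ is the matrix containing only the first three columns of the $X$ matrix. Substituting the observed responses into this expression gives the value of $SS_{Block}$.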
Since there are two independent block effects, $\zeta_1$ and $\zeta_2$, the number of degrees of freedom associated with $SS_{Block}$ is two ($dof(SS_{Block}) = 2$).
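Similarly, the sequential sum of squares for factor $A$ can be calculated as:
$$SS_A = y'\left[H_{Block,A} - \frac{1}{18}J\right]y - y'\left[H_{Block} - \frac{1}{18}J\right]y$$
where $H_{Block,A}$ is the hat matrix calculated from the first five columns of the $X$ matrix (the intercept, the block columns and the factor $A$ columns).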
The sequential sums of squares for the other effects, $SS_B$ and $SS_{AB}$, are obtained in the same way.
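The sums of squares described above can be sketched numerically. The example below is a minimal illustration under stated assumptions: the response values are made up (the data of Table 6.5 are not reproduced here), the run order and the helper names `effect_code`, `design_row`, `solve` and `model_ss` are hypothetical, and the quantity $y'Hy$ is computed as $\hat{\beta}'X'y$ by solving the normal equations instead of forming the hat matrix explicitly.

```python
import random
from itertools import product

def effect_code(level, n_levels):
    """Sum-to-zero coding: the last level is coded -1 in every column."""
    if level == n_levels:
        return [-1] * (n_levels - 1)
    return [1 if level == col else 0 for col in range(1, n_levels)]

def design_row(block, a, b):
    x1, x2 = effect_code(block, 3)
    x3, x4 = effect_code(a, 3)
    (x5,) = effect_code(b, 2)
    return [1, x1, x2, x3, x4, x5, x3 * x5, x4 * x5]

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [u - f * v for u, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def model_ss(X, y, n_cols):
    """y'[H - (1/n)J]y for the model that uses the first n_cols columns of X."""
    Xs = [row[:n_cols] for row in X]
    XtX = [[sum(r[i] * r[j] for r in Xs) for j in range(n_cols)]
           for i in range(n_cols)]
    Xty = [sum(r[i] * yi for r, yi in zip(Xs, y)) for i in range(n_cols)]
    beta = solve(XtX, Xty)
    y_H_y = sum(bc * t for bc, t in zip(beta, Xty))   # y'Hy = beta' X'y
    return y_H_y - sum(y) ** 2 / len(y)               # y'(J/n)y = (sum y)^2 / n

runs = list(product((1, 2, 3), (1, 2, 3), (1, 2)))    # (block, A level, B level)
X = [design_row(*run) for run in runs]

# Hypothetical responses, NOT the data of Table 6.5.
rng = random.Random(0)
y = [18 + 0.3 * blk - 1.5 * a + 0.8 * b + rng.gauss(0, 0.5) for blk, a, b in runs]

n = len(y)
ss_total = sum(v * v for v in y) - sum(y) ** 2 / n    # y'[I - (1/n)J]y
ss_block = model_ss(X, y, 3)                          # intercept + blocks
ss_a = model_ss(X, y, 5) - ss_block                   # adds the A columns
ss_b = model_ss(X, y, 6) - model_ss(X, y, 5)          # adds the B column
ss_model = model_ss(X, y, 8)                          # all eight columns
ss_ab = ss_model - model_ss(X, y, 6)                  # adds the AB columns
ss_error = ss_total - ss_model
dof_error = (n - 1) - 7
f0_block = (ss_block / 2) / (ss_error / dof_error)

print(f"SS_Block={ss_block:.4f} SS_A={ss_a:.4f} SS_B={ss_b:.4f} SS_AB={ss_ab:.4f}")
print(f"SS_E={ss_error:.4f} on {dof_error} dof, (F0)_Block={f0_block:.4f}")
```

Each sequential sum of squares is the increase in the model sum of squares when that effect's columns are added to the columns already in the model, which is the same differencing used in the expressions above.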
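Knowing the sum of squares, the test statistics for each of the factors can be calculated. For example, the test statistic for the main effect of the blocks is:
$$(F_0)_{Block} = \frac{MS_{Block}}{MS_E} = \frac{SS_{Block}/dof(SS_{Block})}{SS_E/dof(SS_E)} = \frac{SS_{Block}/2}{SS_E/10}$$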
The $p$ value corresponding to this statistic, based on the $F$ distribution with 2 degrees of freedom in the numerator and 10 degrees of freedom in the denominator, is:
$$p \text{ value} = 1 - P(F \leq (F_0)_{Block})$$
Assuming that the desired significance level is 0.1, since $p$ value > 0.1, we fail to reject $H_0: \zeta_i = 0$ and conclude that there is no significant variation in the mileage from one vehicle to the other. Statistics to test the significance of the other factors can be calculated in a similar manner. The complete analysis results obtained from DOE++ for this experiment are presented in Figure 6.14.
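For the block test, the numerator has 2 degrees of freedom, and in that special case the upper tail of the $F$ distribution has a simple closed form: $P(F > f) = (1 + 2f/d)^{-d/2}$, where $d$ is the number of denominator degrees of freedom. The sketch below uses that identity to carry out the decision rule. The function name and the statistic value 1.5 are hypothetical and are not the result of this example.

```python
def f_pvalue_numerator_df2(f0, den_df):
    """Upper-tail probability P(F > f0) for an F distribution with (2, den_df) dof."""
    return (1.0 + 2.0 * f0 / den_df) ** (-den_df / 2.0)

f0_block = 1.5   # hypothetical test statistic, not the value from this example
alpha = 0.1      # significance level used in the text
p_value = f_pvalue_numerator_df2(f0_block, den_df=10)

if p_value > alpha:
    print(f"p value = {p_value:.4f} > {alpha}: fail to reject H0 (no block effect)")
else:
    print(f"p value = {p_value:.4f} <= {alpha}: reject H0 (blocks differ)")
```

The closed form applies only when the numerator has 2 degrees of freedom. The tests for factor $B$ (1 degree of freedom) and other cases need a general $F$ distribution routine from a statistics library.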
Figure 6.14: Analysis results for the randomized complete block design example.